This software is intended to be useful in planning statistical studies.
It is not intended to be used for analysis of data that have already
been collected.
Each selection provides a graphical interface for studying the power of
one or more tests. The interfaces include sliders (convertible to
number-entry fields) for varying parameters, and a simple provision for
graphing one variable against another.
Each dialog window also
offers a Help
menu (on Macs, the Options and Help menus are added at the top of the
screen). Please read the Help menus before
contacting me with
questions.
The "Balanced ANOVA" selection
provides another dialog with a list of several popular experimental
designs, plus
a provision for specifying your own model.
Note:
The dialogs open in separate windows. If you're running this on an
Apple Macintosh, the applets' menus are added to the screen menubar --
so, for example, you'll have two "Help" menus there!
You may also download this software to run it on your own PC.
Note:
These require a web
browser capable of running Java applets (version 1.3 or higher). If you
do not see a selection list above, chances are that you either have
disabled Java, or you have an outdated implementation of
Java. In
the latter case, you need to download and install the JRE plug-in from java.sun.com.
Due to a compatibility bug, many plug-ins size the applet window before
allowing for an additional strip with a security warning; to compensate,
drag the bottom of the window downward a bit.
Please read this comment
I receive quite a few questions that start with something like this:
"I'm not much of a stats person, but I tried [details...] -- am I doing it right?"
Please compare this with:
"I don't know much about heart surgery, but my wife is suffering from ... and I plan to operate ... can you advise me?"
Folks, just because you can plug numbers into a program doesn't change
the fact that if you don't know what you're doing, you're almost
guaranteed to get meaningless results -- if not dangerously misleading
ones. Statistics really is like rocket science; it isn't easy, even to
us who have studied it for a long time. Anybody who thinks it's
easy surely lacks a deep enough knowledge to understand why it isn't!
If your scientific integrity matters, and statistics is a mystery to
you, then you need expert help. Find a statistician in your company or
at a nearby university, and talk to her face-to-face if possible. It
may well cost money. It's worth it.
New - Discussion group
I have created a discussion group at https://groups.google.com/forum/#!forum/piface-discussion
where users may post questions to one another (and answers too, one
hopes, as well as examples). I will try to look at it occasionally. I
take no responsibility for the level of correctness of whatever is
posted there. I believe that anyone may read the postings, but you must
log in using a Gmail account in order to post material.
Citing this software
If you use this software in preparing a research paper, grant proposal,
or other publication, I would appreciate your acknowledging it by
citing it in the references. Here is a suggested bibliography
entry in APA or "author (date)" style:
Lenth, R. V. (2006-9). Java Applets for Power and Sample Size [Computer
software]. Retrieved month day, year, from
http://www.stat.uiowa.edu/~rlenth/Power.
This form of the citation is appropriate whether you run it online
(give the date you ran it) or the stand-alone version (give the date
you downloaded it).
Download to run locally
The file piface.jar may be downloaded so that you can run these
applications locally. [Note: Some mail software (that thinks it is
smarter than you) renames this file piface.zip. If this happens, simply
rename it piface.jar; do not unzip the file.]
You may also want the icon file piface.ico
if you put the application on your desktop or a toolbar. You
will need to have the Java Runtime Environment (JRE) or the Java
Development Kit (JDK) installed on your system. You probably
already have it; but if not, these are available for free download for
several platforms from Sun.
If
you have JDK or JRE version 1.2 or later, then you can probably run the
application just by double-clicking on piface.jar.
Otherwise,
you may run it from the command line in a terminal or DOS window, using
a command like
java -jar piface.jar
This will bring up a selector list similar to the one in this web
page. A particular dialog can also be run directly from the
command line, if you know its name (which can be discovered by browsing
piface.jar with a zip file utility such as WinZip).
For example, the two-sample t-test dialog may be run using
java -cp piface.jar rvl.piface.apps.TwoTGUI
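As an alternative to a zip utility, the dialog names could be listed with a short script, since a .jar file is a zip archive. This is a minimal Python sketch, not part of the software: the function name `list_dialog_classes` is hypothetical, and the `rvl/piface/apps/` prefix is an assumption inferred from the TwoTGUI example above, not from any documentation of the jar's layout.

```python
import zipfile


def list_dialog_classes(jar_path, prefix="rvl/piface/apps/"):
    """Return dotted class names found under `prefix` in a jar (zip) file.

    Hypothetical helper. The default prefix is an assumption based on the
    example class rvl.piface.apps.TwoTGUI.
    """
    with zipfile.ZipFile(jar_path) as jar:
        names = jar.namelist()
    classes = []
    for name in names:
        # Keep top-level classes only; '$' marks inner or anonymous classes.
        if name.startswith(prefix) and name.endswith(".class") and "$" not in name:
            classes.append(name[: -len(".class")].replace("/", "."))
    return sorted(classes)
```

Calling `list_dialog_classes("piface.jar")` returns names in the dotted form that `java -cp piface.jar <name>` expects. Not every class found this way is necessarily a runnable dialog.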
Frequently Asked Questions
What formula(s) do you use in these calculations?
In most cases, power is an exact calculation based on the
distributional situation in question. Typically it is a
probability associated with a non-central distribution. In a
few
cases, an approximation is used, and is labeled as such.
Sample
sizes are calculated using root-finding methods in conjunction with
power calculations. There are usually not nice neat
formulas. That's why we need this software.
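To make the "power calculation plus root-finding" idea concrete, here is a minimal Python sketch. It is not the code used in the applets: it swaps in a normal (z-test) approximation with known sigma in place of the exact noncentral t calculation, so that only the standard library is needed. The function names and default values are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_sample_z(n, delta, sigma, alpha=0.05):
    """Power of a two-sided, two-sample z test with n subjects per group.

    Normal approximation with known sigma (an assumption of this sketch;
    an exact calculation would use a noncentral t distribution).
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(delta) / (sigma * sqrt(2.0 / n))  # mean of the test statistic
    # Probability of landing in either rejection region.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def sample_size(delta, sigma, target=0.80, alpha=0.05):
    """Smallest n per group whose power reaches `target`.

    There is no closed form being inverted here: the power function is
    simply searched, first by doubling n and then by integer bisection.
    """
    if delta == 0 or sigma <= 0 or not (alpha < target < 1):
        raise ValueError("need delta != 0, sigma > 0, and alpha < target < 1")
    lo, hi = 1, 2
    while power_two_sample_z(hi, delta, sigma, alpha) < target:
        lo, hi = hi, 2 * hi
    # Invariant: power(hi) >= target, and lo is below the answer.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if power_two_sample_z(mid, delta, sigma, alpha) >= target:
            hi = mid
        else:
            lo = mid
    return hi


print(sample_size(delta=0.5, sigma=1.0))
```

Because this sketch uses the normal approximation, its answers can differ slightly from those of an exact t-based calculation, especially for small samples.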
Help on using a particular applet
I am willing to
provide
minimal support if you truly don't understand what inputs are
required. However, each applet has a help menu, and I do
request
that you carefully read that before you e-mail me with
questions.
I need consulting help
I am providing this software for free, but I do not have time to also
answer substantive questions on power/sample size for your research
project. If you need statistical advice on your research
problem,
you should
contact a statistical consultant; and if you want expert advice, you
should expect to pay for it. Most universities with
statistics
departments or statistics programs also offer a consulting
service. If you think your research is important, then it is
also
important to get good advice on the statistical design
and analysis (do this before
you start
collecting data).
How to do... Retrospective power ... Cohen's effect sizes
I recommend against these (see Advice
section
below). I have been asked why the Options menu in every
single
applet has links for retrospective power and Cohen effect sizes.
It seems to some to be placing undue emphasis on methods I
don't
like. The technical answer to the question is that these menu items are
inherited from a base class, along with some other things (e.g., the
graphics capabilities). The other answer is that people ask
me
about this all the time,
in
spite of everything I say on this site. If you follow those
menu
links, you get explanations of why not to do it. I'm
especially
proud of the dialog for retrospective power.
Disclaimer
This software is made available as-is, with no guarantees; use it at
your own risk. I welcome comments on bugs, additional
capabilities you'd like to see, etc.
Other questions
If you have carefully read the above FAQs, and still find it
appropriate to contact me, my e-mail address is russell-lenth@uiowa.edu.
You may also find useful information in the discussion group.
Advice
Here are two
very wrong things that people try to do with my software:
Retrospective
power
(a.k.a. observed power, post hoc power). You've got the data,
did
the analysis, and did not achieve "significance." So you
compute
power retrospectively to see if the test was powerful enough or
not. This is an empty question. Of course it wasn't
powerful enough -- that's why the result isn't significant.
Power
calculations are useful for design, not analysis.
(Note: These comments refer to power computed based on
the
observed effect size and sample size. Considering a different
sample size is obviously prospective in nature. Considering a
different effect size might make sense, but probably what you really
need to do instead is an equivalence test; see Hoenig and Heisey, 2001.)
Specify
T-shirt effect sizes
("small", "medium", and "large"). This is an elaborate way to
arrive at the same sample size that has been used in past social
science studies of large, medium, and small size
(respectively).
The method uses a standardized effect size as the goal. Think
about it: for a "medium" effect size, you'll choose the same n regardless of the
accuracy or
reliability of your instrument, or the narrowness or diversity of your
subjects. Clearly, important considerations are being ignored
here. "Medium" is definitely not the message!
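Both points above can be checked numerically. The following Python sketch is an illustration under simplifying assumptions, not the applets' method: it uses a two-sided z test with known sigma, and the function names are hypothetical. The first function shows that "observed power" is determined entirely by the p-value, so it adds no information. The second shows that fixing a standardized effect size fixes n no matter how noisy the measurements are.

```python
from math import ceil
from statistics import NormalDist

_Z = NormalDist()


def observed_power(p_value, alpha=0.05):
    """'Observed power' of a two-sided z test, computed from the p-value alone."""
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    z_obs = _Z.inv_cdf(1 - p_value / 2)  # |z| that produces this p-value
    return _Z.cdf(z_obs - z_crit) + _Z.cdf(-z_obs - z_crit)


def n_per_group(delta, sigma, power=0.80, alpha=0.05):
    """Approximate n per group for a two-sided, two-sample z test.

    Uses the usual normal-approximation formula, which depends on delta
    and sigma only through the standardized effect size d = delta / sigma.
    """
    d = abs(delta) / sigma
    z_sum = _Z.inv_cdf(1 - alpha / 2) + _Z.inv_cdf(power)
    return ceil(2 * (z_sum / d) ** 2)


# Observed power is a decreasing function of p and nothing else.
for p in (0.05, 0.20, 0.50):
    print(f"p = {p:.2f}  observed power = {observed_power(p):.3f}")

# "Medium" effect (d = 0.5): same n for a precise and a noisy instrument.
print(n_per_group(delta=0.5, sigma=1.0), n_per_group(delta=5.0, sigma=10.0))

# Fixed raw difference of 5 units: n depends strongly on sigma.
print(n_per_group(delta=5.0, sigma=5.0), n_per_group(delta=5.0, sigma=10.0))
```

In this setting a result sitting exactly at p = 0.05 has observed power of about 0.5, and any non-significant result has less. Specifying the difference in real units instead of as a "medium" effect makes the required n respond to the measurement variability, as it should.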
Here are three
very right things you can do:
Use
power
prospectively for planning future studies.
Software such
as is provided on this website is useful for determining an appropriate
sample size, or for evaluating a planned study to see if it is likely
to yield useful information.
Put
science
before statistics. It is easy to get caught up
in
statistical significance and such; but studies should be designed to
meet scientific goals, and you need to keep those in sight at all times
(in planning and
analysis). The appropriate inputs to power/sample-size
calculations are effect sizes that are deemed clinically important,
based on careful considerations of the underlying scientific (not
statistical) goals of the study. Statistical considerations
are
used to identify a plan that is effective in meeting scientific goals
-- not the other way around.
Do pilot
studies.
Investigators tend to try to answer all the world's questions with one
study. However, you usually cannot do a definitive study in
one
step. It is far better to work incrementally. A
pilot study
helps you establish procedures, understand and protect against things
that can go wrong, and obtain variance estimates needed in determining
sample size. A pilot study with 20-30 degrees of freedom for
error is generally quite adequate for obtaining reasonably reliable
sample-size estimates.
Many funding agencies require a power/sample-size section in grant
proposals. Following the above guidelines is good for
improving
your chances of being funded. You will have established that
you
have thought through the scientific issues, that your procedures are
sound, and that you have a defensible sample size based on realistic
variance estimates and scientifically tenable effect-size
goals.
To read more, please see the following references:
Lenth, R. V. (2001), ``Some Practical Guidelines for Effective Sample
Size Determination,'' The American Statistician, 55, 187-193.
Hoenig, J. M. and Heisey, D. M. (2001), ``The Abuse of Power: The
Pervasive Fallacy of Power Calculations for Data Analysis,''
The American Statistician, 55, 19-24.
An earlier draft of the Lenth reference above is _here_,
and a shorter summary of some comments I made in a panel discussion at
the 2000 Joint Statistical Meetings in Indianapolis is _here_.
Additional brief comments, prepared as a handout for my
poster
presentation at the 2001 Joint Statistical Meetings in Atlanta, are _here_.
Most computations are ``exact'' in the sense that they are based on
exact formulas for sample size, power, etc. The exception is
Satterthwaite approximations; see below.
Machine accuracy
Even with exact formulas, computed values are inexact, as are all
double-precision floating-point computations. Many
computations (especially
noncentral distributions) require summing one or more series, and there
is a serious tradeoff between speed and accuracy. The error
bound
set for cdfs is 1E-8 or smaller, and for quantiles the bound is
1E-6.
Actual errors can be much larger due to accumulated errors or other
reasons.
Quantiles, for example, are computed by numerically solving an equation
involving the cdf; thus, in extreme cases, a small error in the cdf can
create a large error in the quantile.
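The sensitivity of extreme quantiles can be illustrated with a small Python sketch. It is not the applets' code: it uses the standard normal distribution as a stand-in for the noncentral distributions, simple bisection as the equation solver, and a deliberately injected cdf error of 1E-8. All names are hypothetical.

```python
from statistics import NormalDist


def quantile_by_bisection(cdf, p, lo=-40.0, hi=40.0, tol=1e-12, max_iter=200):
    """Solve cdf(x) = p for x by bisection on the bracket [lo, hi]."""
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            return 0.5 * (lo + hi)
    raise RuntimeError("too many iterations")


def quantile_shift(p, cdf_error=1e-8):
    """Change in the computed quantile when the cdf is off by `cdf_error`."""
    exact = NormalDist().cdf

    def perturbed(x):
        # Simulate a cdf routine that is wrong by a constant small amount.
        return exact(x) + cdf_error

    return abs(quantile_by_bisection(perturbed, p) - quantile_by_bisection(exact, p))


print("shift at p = 0.5:      ", quantile_shift(0.5))
print("shift at p = 1 - 1e-7: ", quantile_shift(1 - 1e-7))
```

The same cdf error moves the median by an amount of the same tiny order as the error itself, but moves the extreme quantile by several orders of magnitude more, because the cdf is nearly flat in the tail.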
A warning (typically, ``too many iterations'') is generated when the
software cannot confirm that an error bound has been achieved. However,
in the case of quantile computations, no warning message is generated
for extreme quantiles. If you want a power of .9999 at alpha=.0001, you
can expect the computed sample size not to be accurate to the nearest
integer! If you specify reasonable criteria, the answers will be pretty
reliable.
Satterthwaite approximations
Some of the dialogs (two-sample t, mixed ANOVA) implement Satterthwaite
approximations when certain combinations of inputs require an error
term
to be constructed. These are of course not exact, even in their
formulation. Moreover, the Satterthwaite degrees-of-freedom value is
used as-is in computing power from a noncentral t or noncentral F
distribution, and this introduces further errors that could be large in
some cases.
In the two-sample t setting, I'd expect the worst errors to occur when
there is a huge imbalance in sample sizes and/or variances. In the
dialogs for mixed ANOVA models (either F tests or multiple
comparisons/contrasts), I expect these errors to get worse as more
variance components are involved, especially when one or more of them
is given negative weight.
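For the two-sample case, the standard Welch-Satterthwaite degrees-of-freedom formula can be sketched in Python as follows. This is the textbook formula, offered as an illustration. The page does not show the applets' own implementation, and the function name is hypothetical.

```python
def satterthwaite_df(var1, n1, var2, n2):
    """Welch-Satterthwaite approximate degrees of freedom.

    Applies to the error term var1/n1 + var2/n2, where var1 and var2 are
    the two group variances and n1, n2 are the group sample sizes.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    a = var1 / n1
    b = var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Balanced case: the result equals the pooled value n1 + n2 - 2.
print(satterthwaite_df(1.0, 10, 1.0, 10))

# Heavily unbalanced sizes and variances: the result collapses toward
# the degrees of freedom of the small, high-variance group.
print(satterthwaite_df(1.0, 100, 100.0, 5))
```

In the unbalanced example, nearly all of the error term comes from the group of 5, so the approximate degrees of freedom are close to 4 even though 105 observations are involved. Settings like this are where the approximation has the most influence on the computed power.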
Links to other power software
I have removed this section because it can't have even been close to
comprehensive.
This page was last modified Tuesday, 02-Oct-2012 17:25:47 CDT.
The views and opinions expressed in this page are strictly those of the page
author. The contents of this page have not been approved by Mathematical
Sciences, the College of Liberal Arts, or The University of Iowa.