`presize` : precision based sample size calculation

It is sometimes desirable to power a study on the precision of an estimate rather than for a particular hypothesis test. presize provides a range of functions for performing these calculations. presize returns either the confidence interval width that could be expected given a sample size, or the sample size that would be necessary to attain a given confidence interval width.
For instance, suppose we want to estimate the mean of a blood parameter with a confidence interval 5 units wide. Based on published literature, we expect the mean value to be 20 units, with a standard deviation of 3. presize can calculate the number of participants required to achieve the 5 unit confidence interval width. Alternatively, if we know that we have funding to include 50 participants, we can calculate the confidence interval width that we could expect.

The different estimators are grouped according to their type (e.g. mean and proportion are under 'Descriptive statistics', while odds and risk ratios are under 'Relative differences').

Each statistic has a set of fields. Mandatory fields are marked with an asterisk (*). There are also two fields that pertain to the sample size and confidence interval width, indicated by a dagger (†). Only one of these should be entered.

Relevant references are listed on each page.

This site uses the presize R package, which was developed at CTU Bern, the Clinical Trials Unit of the University of Bern and University Hospital Bern, on behalf of the Statistics & Methodology Platform of the Swiss Clinical Trial Organisation. The package can be installed in R from CRAN ( install.packages('presize') ). The R code for running the calculations in this site is shown after the results. The presize package also has its own website.
If you use presize, please cite it in your publication as: Haynes et al. (2021). presize: An R-package for precision-based sample size calculation in clinical research. Journal of Open Source Software, 6(60), 3118, DOI 10.21105/joss.03118

Precision of a mean

Enter the mean and standard deviation you expect. To estimate the confidence interval width for a sample of size X, enter the sample size in 'Number of observations'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'.
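The two directions of this calculation can be illustrated with a short Python sketch. It is not the code this site generates: it assumes a normal quantile rather than a t quantile, so it is slightly optimistic for small samples, and the function names are invented for the example. Note that the expected mean itself does not affect the width.

```python
import math
from statistics import NormalDist


def mean_ci_width(sd, n, conf_level=0.95):
    """Approximate full width of the confidence interval for a mean."""
    # Two-sided standard normal quantile (assumption: normal, not t)
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return 2 * z * sd / math.sqrt(n)


def mean_sample_size(sd, conf_width, conf_level=0.95):
    """Smallest n whose approximate interval is no wider than conf_width."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return math.ceil((2 * z * sd / conf_width) ** 2)


# Expected width with 50 participants and a standard deviation of 3
print(round(mean_ci_width(sd=3, n=50), 2))
# Participants needed for an interval 1 unit wide
print(mean_sample_size(sd=3, conf_width=1))
```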

Please enter both of the following

Mean

Standard deviation

Please enter one of the following

Results

Code to replicate in R:

Precision of a proportion

Enter the proportion you expect. To estimate the confidence interval width for a sample of size X, enter the sample size in 'Number of observations'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'.
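As a rough illustration in Python (not the R code shown by this site), the width of a Wilson score interval, the method recommended on this page, can be computed directly from the expected proportion and sample size. The sample size search below is a simple linear scan chosen for clarity; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_width(p, n, conf_level=0.95):
    """Full width of the Wilson score interval for an expected proportion p."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return 2 * half


def wilson_sample_size(p, conf_width, conf_level=0.95):
    """Smallest integer n whose Wilson interval is no wider than conf_width."""
    n = 1
    while wilson_width(p, n, conf_level) > conf_width:
        n += 1
    return n


print(round(wilson_width(p=0.3, n=100), 3))
print(wilson_sample_size(p=0.3, conf_width=0.1))
```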

Please enter the following

Proportion

Please enter one of the following

Other settings

The Wilson confidence interval is recommended, but others are available.

Confidence interval method

Results

Code to replicate in R:

References

Brown LD, Cai TT, DasGupta A (2001) Interval Estimation for a Binomial Proportion, Statistical Science , 16:2, 101-117, doi:10.1214/ss/1009213286

Precision of a rate

Enter the rate you expect. To estimate the confidence interval width from a number of events, enter the number of events in 'Number of events'. To estimate the number of events required to get a confidence interval width of X, enter the width in 'Confidence interval width'.
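A minimal Python sketch of the idea, assuming the simple Wald interval for a Poisson rate (the site offers several methods, and this sketch does not reproduce them): with x events observed over person-time T, the rate is x / T and its standard error is sqrt(x) / T, which equals rate / sqrt(x). The function names are illustrative only.

```python
import math
from statistics import NormalDist


def rate_ci_width(rate, events, conf_level=0.95):
    """Wald confidence interval width for a rate based on a number of events."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # se(rate) = sqrt(events) / person_time = rate / sqrt(events)
    return 2 * z * rate / math.sqrt(events)


def events_needed(rate, conf_width, conf_level=0.95):
    """Smallest number of events giving a Wald interval no wider than conf_width."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return math.ceil((2 * z * rate / conf_width) ** 2)


print(round(rate_ci_width(rate=2.5, events=20), 2))
print(events_needed(rate=2.5, conf_width=2))
```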

Please enter the following

Rate

Please enter one of the following

Other settings

Confidence interval method

Results

Code to replicate in R:

References

Barker, L. (2002) A Comparison of Nine Confidence Intervals for a Poisson Parameter When the Expected Number of Events is ≤ 5, The American Statistician , 56:2, 85-89, DOI: 10.1198/000313002317572736

Precision of a mean difference

Enter the mean difference and standard deviations you expect. If you intend to use uneven allocation ratios (e.g. 2 allocated to group 1 for each participant allocated to group 2), adjust the allocation ratio accordingly. To estimate the confidence interval width expected with a particular number of observations, enter the number of observations in 'Number of observations'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'. For the difference between paired observations, use the routine for a simple mean.
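The role of the allocation ratio can be seen in a small Python sketch. It assumes a normal quantile and separate group variances, so it approximates rather than reproduces the site's calculation; `r` is taken to be N2 / N1 as labelled on this page, and the function names are made up for the example.

```python
import math
from statistics import NormalDist


def meandiff_ci_width(sd1, sd2, n1, r=1.0, conf_level=0.95):
    """Approximate CI width for a difference in means; group 2 has r * n1 observations."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n2 = r * n1
    return 2 * z * math.sqrt(sd1**2 / n1 + sd2**2 / n2)


def meandiff_n1(sd1, sd2, conf_width, r=1.0, conf_level=0.95):
    """Smallest group 1 size giving an interval no wider than conf_width."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return math.ceil((2 * z / conf_width) ** 2 * (sd1**2 + sd2**2 / r))


print(round(meandiff_ci_width(sd1=2.5, sd2=2.5, n1=20), 2))
print(meandiff_n1(sd1=2.5, sd2=2.5, conf_width=2))
```

The expected mean difference does not appear in either function because, under this approximation, only the standard deviations and group sizes determine the width.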

Please enter the following

Mean difference

Standard deviation of group 1

Group variances unequal

Standard deviation of group 2

Allocation ratio

(N2 / N1)

Please enter one of the following

Results

Code to replicate in R:

Precision of a risk difference

Enter the proportions of events you expect in the groups. If you intend to use uneven allocation ratios (e.g. 2 allocated to group 1 for each participant allocated to group 2), adjust the allocation ratio accordingly. To estimate the confidence interval width from a number of events, enter the number of events in 'Number of events'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'.
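One of the methods in the references for this page, Newcombe's hybrid score interval (Newcombe 1998), combines the Wilson limits of the two proportions. The Python sketch below illustrates that construction; it is not the site's R code, the function names are hypothetical, and the allocation ratio `r` is assumed to mean N2 / N1.

```python
import math
from statistics import NormalDist


def wilson_limits(p, n, z):
    """Lower and upper Wilson score limits for a single proportion."""
    centre = (p + z**2 / (2 * n)) / (1 + z**2 / n)
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return centre - half, centre + half


def newcombe_width(p1, p2, n1, r=1.0, conf_level=0.95):
    """CI width for p1 - p2 using Newcombe's hybrid score method."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n2 = r * n1
    l1, u1 = wilson_limits(p1, n1, z)
    l2, u2 = wilson_limits(p2, n2, z)
    diff = p1 - p2
    lower = diff - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return upper - lower


print(round(newcombe_width(p1=0.5, p2=0.5, n1=100), 3))
```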

Please enter the following

Proportion of events in group 1

Proportion of events in group 2

Allocation ratio

(N2 / N1)

Please enter one of the following

Other settings

Method

Results

Code to replicate in R:

References

Agresti A (2003) Categorical Data Analysis , Second Edition, Wiley Series in Probability and Statistics DOI: 10.1002/0471249688
Agresti A and Caffo B (2000) Simple and Effective Confidence Intervals for Proportions and Differences of Proportions Result from Adding Two Successes and Two Failures, The American Statistician 54(4):280-288
Miettinen O and Nurminen M (1985) Comparative analysis of two rates, Statistics in Medicine , 4:213-226
Newcombe RG (1998) Interval estimation for the difference between independent proportions: comparison of eleven methods, Statistics in Medicine , 17:873-890
Fagerland MW, Lydersen S, and Laake P (2015). Recommended confidence intervals for two independent binomial proportions, Statistical methods in medical research 24(2):224-254.

Precision of an odds ratio

Please enter the following

Proportion of events in group 1*

Proportion of events in group 2*

Allocation ratio*

(N2 / N1)

Please enter one of the following

Other settings

Method

Results

Code to replicate in R:

References

Fagerland MW, Lydersen S, Laake P (2015). Recommended confidence intervals for two independent binomial proportions. Statistical Methods in Medical Research , 24(2):224-254. doi:10.1177/0962280211415469

Precision of a risk ratio

Please enter the following

Proportion of events in group 1

Proportion of events in group 2

Allocation ratio

(N2 / N1)

Please enter one of the following

Other settings

Method

Results

Code to replicate in R:

References

Fagerland MW, Lydersen S, and Laake P (2015). Recommended confidence intervals for two independent binomial proportions, Statistical methods in medical research 24(2):224-254.
Katz D, Baptista J, Azen SP, and Pike MC (1978) Obtaining Confidence Intervals for the Risk Ratio in Cohort Studies. Biometrics 34:469-474
Koopman PAR (1984) Confidence Intervals for the Ratio of Two Binomial Proportions, Biometrics 40:513-517

Precision of a rate ratio

Enter the event rates you expect in the groups. If you intend to use an uneven allocation ratio (e.g. 2 allocated to group 1 for each participant allocated to group 2), adjust the allocation ratio accordingly. To estimate the ratio of the upper confidence interval limit to the lower limit from a number of events, enter the number of events in 'Number of events'. To estimate the number of observations required to get a ratio of the upper confidence interval limit to the lower limit of X, enter the ratio in 'Upper-lower ratio'.
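For ratio measures, precision is expressed as the upper limit divided by the lower limit. A hedged Python sketch, assuming the usual large-sample variance of the log rate ratio (one over the expected events in each group) and treating `n1` as the size of group 1 with `r` = N2 / N1; the function names are invented for illustration and this is not the site's R code:

```python
import math
from statistics import NormalDist


def rate_ratio_precision(rate1, rate2, n1, r=1.0, conf_level=0.95):
    """Ratio of the upper to the lower confidence limit of a rate ratio."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    events1 = rate1 * n1       # expected events in group 1
    events2 = rate2 * r * n1   # expected events in group 2
    return math.exp(2 * z * math.sqrt(1 / events1 + 1 / events2))


def rate_ratio_n1(rate1, rate2, upper_lower_ratio, r=1.0, conf_level=0.95):
    """Smallest group 1 size whose upper/lower ratio does not exceed the target."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n1 = 4 * z**2 * (1 / rate1 + 1 / (r * rate2)) / math.log(upper_lower_ratio) ** 2
    return math.ceil(n1)


print(round(rate_ratio_precision(rate1=0.5, rate2=0.5, n1=100), 2))
print(rate_ratio_n1(rate1=0.5, rate2=0.5, upper_lower_ratio=2))
```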

Please enter the following

Event rate in the exposed group

Event rate in the control group

Allocation ratio

(N2 / N1)

Please enter one of the following

Results

Code to replicate in R:

References

Rothman KJ, Greenland S (2018) Planning Study Size Based on Precision Rather Than Power. Epidemiology 29:599-603 doi:10.1097/EDE.0000000000000876

Precision of a correlation coefficient

Enter the correlation coefficient you expect. To estimate the confidence interval width from a number of events, enter the number of events in 'Number of events'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'.
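The interval for a correlation is usually built on the Fisher z scale and transformed back, which is why its width depends on the expected coefficient as well as on the sample size. The Python sketch below uses the bias and variance constants given by Bonett and Wright (2000) for the three coefficient types; it is an illustration with hypothetical function names, not the code this site generates.

```python
import math
from statistics import NormalDist


def cor_ci_width(r, n, method="pearson", conf_level=0.95):
    """Approximate CI width for a correlation via the Fisher z-transformation."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # (b, c2): sample size offset and variance multiplier per coefficient type
    if method == "pearson":
        b, c2 = 3, 1.0
    elif method == "spearman":
        b, c2 = 3, 1 + r**2 / 2
    elif method == "kendall":
        b, c2 = 4, 0.437
    else:
        raise ValueError("method must be 'pearson', 'spearman' or 'kendall'")
    se = math.sqrt(c2 / (n - b))
    centre = math.atanh(r)  # Fisher z-transform of the expected correlation
    return math.tanh(centre + z * se) - math.tanh(centre - z * se)


for method in ("pearson", "spearman", "kendall"):
    print(method, round(cor_ci_width(r=0.5, n=100, method=method), 3))
```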

Please enter the following

Correlation coefficient

Type of correlation coefficient

Please enter one of the following

Results

Code to replicate in R:

References

Bonett DG, and Wright TA (2000) Sample size requirements for estimating Pearson, Kendall and Spearman correlations. Psychometrika 65:23-28 doi:10.1007/BF02294183

Precision of an intraclass correlation coefficient

Enter the intraclass correlation coefficient you expect. To estimate the confidence interval width from a number of subjects, enter the number of subjects in 'Number of subjects'. To estimate the number of observations required to get a confidence interval width of X, enter the desired width in 'Confidence interval width'.
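The sample size approximation from Bonett (2002), cited in the references for this page, has a closed form that is easy to sketch in Python. The version below implements only that basic formula; any further small-sample adjustment applied by the site is not reproduced here, and the function name is made up for the example.

```python
import math
from statistics import NormalDist


def icc_sample_size(rho, k, conf_width, conf_level=0.95):
    """Approximate number of subjects for a given ICC interval width.

    rho: expected intraclass correlation
    k:   number of observations per subject
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n = (8 * z**2 * (1 - rho) ** 2 * (1 + (k - 1) * rho) ** 2
         / (k * (k - 1) * conf_width**2)) + 1
    return math.ceil(n)


print(icc_sample_size(rho=0.5, k=2, conf_width=0.2))  # 218
```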

Please enter the following

Intraclass correlation coefficient

Number of observations per subject

Please enter one of the following

Results

Code to replicate in R:

References

Bonett DG (2002). Sample size requirements for estimating intraclass correlations with desired precision. Statistics in Medicine 21:1331-1335. doi: 10.1002/sim.1108

Precision of limits of agreement

Bland-Altman (also known as Tukey mean-difference) plots are often used to assess the agreement between two methods of measuring a quantity. A typical plot might look like the following figure. The blue line represents the mean difference between the methods, while the red lines represent the confidence interval of that difference (the limits of agreement). The dotted lines represent the confidence intervals around the limits of agreement.

This page calculates the width of the confidence interval around the limit of agreement (as indicated by the black arrows), the width of which is only a function of sample size. To calculate the width of the confidence interval of the difference itself (e.g. the grey line), a paired mean difference can be used.
Enter the sample size or confidence interval width to calculate the other.
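To see why only the sample size matters, recall that Bland and Altman (1986) give the standard error of a limit of agreement as approximately sqrt(3 s² / n), where s is the standard deviation of the differences. Expressing the width in units of s removes every input except n. The Python sketch below rests on that approximation and on a normal quantile (both assumptions; the site's exact calculation may differ), with invented function names.

```python
import math
from statistics import NormalDist


def loa_ci_width(n, conf_level=0.95):
    """Approximate CI width around a limit of agreement, in units of the SD of differences."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return 2 * z * math.sqrt(3 / n)


def loa_sample_size(conf_width, conf_level=0.95):
    """Smallest n giving a width (in SD units) no larger than conf_width."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return math.ceil(3 * (2 * z / conf_width) ** 2)


print(round(loa_ci_width(n=100), 3))
print(loa_sample_size(conf_width=0.5))
```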

Please enter one of the following

Result

Code to replicate in R:

References

Bland & Altman (1986) Statistical methods for assessing agreement between two methods of clinical measurement. Lancet i(8476):307-310 doi: 10.1016/S0140-6736(86)90837-8

Cohen's kappa

Kappa is used to assess the agreement between multiple raters, each classifying items into mutually exclusive categories. This function supports up to 6 raters and 5 categories.

Please enter all of the following parameters:

Estimated kappa

Number of raters

Number of categories

Expected proportion of each category (must sum to 1, enter values separated by a comma)

Please enter one of the following:

Result

Code to replicate in R:

References

Donner & Rotondi (2010) Sample Size Requirements for Interval Estimation of the Kappa Statistic for Interobserver Agreement Studies with a Binary Outcome and Multiple Raters. International Journal of Biostatistics 6:31 doi: 10.2202/1557-4679.1275
Rotondi & Donner (2012) A Confidence Interval Approach to Sample Size Estimation for Interobserver Agreement Studies with Multiple Raters and Outcomes. Journal of Clinical Epidemiology 65:778-784 doi: 10.1016/j.jclinepi.2011.10.019

Cronbach's alpha

Cronbach's alpha is used to assess the internal consistency of tests and measures.

Please enter all of the following parameters:

Number of measurements/items in test

Desired/expected Cronbach's alpha

Please enter one of the following:

Result

Code to replicate in R:

References

Bonett & Wright (2015) Cronbach's alpha reliability: Interval estimation, hypothesis testing, and sample size planning. Journal of Organizational Behavior 36(1):3-15 DOI: 10.1002/job.1960

Precision of sensitivity

Sensitivity is the proportion of true cases that the test correctly identifies as positive. It is also known as the true positive rate, recall or probability of detection. It is a simple proportion, but because the total sample size, rather than the number of cases, is typically of interest, this function requires an estimate of the prevalence of cases.

Please enter the following

Sensitivity

Please enter one of the following

Optional parameters

Confidence interval method

When rounding the number of cases, round...

The number of cases is calculated as sample size * prev, which can result in fractions so rounding is necessary.
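The effect of the rounding choice can be illustrated with a short Python sketch. It assumes a Wilson interval for the sensitivity and uses hypothetical names (`rounding`, `sens_ci_width`); it is not the R code the site produces.

```python
import math
from statistics import NormalDist


def sens_ci_width(sens, ntot, prev, rounding="ceiling", conf_level=0.95):
    """Return (number of cases, Wilson CI width) for an expected sensitivity."""
    raw_cases = ntot * prev  # may be fractional, e.g. 100 * 0.125 = 12.5
    n_cases = math.ceil(raw_cases) if rounding == "ceiling" else math.floor(raw_cases)
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half = (z * math.sqrt(sens * (1 - sens) / n_cases + z**2 / (4 * n_cases**2))
            / (1 + z**2 / n_cases))
    return n_cases, 2 * half


print(sens_ci_width(sens=0.8, ntot=100, prev=0.125, rounding="ceiling"))
print(sens_ci_width(sens=0.8, ntot=100, prev=0.125, rounding="floor"))
```

Rounding down gives fewer cases and therefore a wider, more conservative interval.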

Code to replicate in R:

References

Brown LD, Cai TT, DasGupta A (2001) Interval Estimation for a Binomial Proportion, Statistical Science , 16:2, 101-117, doi:10.1214/ss/1009213286

Precision of specificity

Specificity is the proportion of true non-cases that the test correctly identifies as negative. It is also known as the true negative rate. It is a simple proportion, but because the total sample size, rather than the number of non-cases, is typically of interest, this function requires an estimate of the prevalence of cases.

Please enter the following

Specificity

Please enter one of the following

Other settings

Confidence interval method

When rounding the number of cases, round...

The number of cases is calculated as sample size * prev, which can result in fractions so rounding is necessary.

Results

Code to replicate in R:

References

Brown LD, Cai TT, DasGupta A (2001) Interval Estimation for a Binomial Proportion, Statistical Science , 16:2, 101-117, doi:10.1214/ss/1009213286

AUC (Area under the curve)

The AUC refers to the area under the Receiver Operating Characteristic (ROC) curve - the grey area in the figure below. The higher the AUC, the better a predictive model performs.
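A sketch of how the interval width for an AUC can be approximated, using the variance formula of Hanley and McNeil (1982) listed in the references for this page. The Python below is illustrative only: it assumes a normal-theory interval, splits the total sample into cases and non-cases using the prevalence, and uses an invented function name.

```python
import math
from statistics import NormalDist


def auc_ci_width(auc, n, prev, conf_level=0.95):
    """Approximate CI width for an AUC from total sample size and prevalence."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n_pos = n * prev         # expected number of cases
    n_neg = n * (1 - prev)   # expected number of non-cases
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc**2)
           + (n_neg - 1) * (q2 - auc**2)) / (n_pos * n_neg)
    return 2 * z * math.sqrt(var)


print(round(auc_ci_width(auc=0.8, n=200, prev=0.5), 3))
```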

Please enter the following

AUC

Prevalence

Please enter one of the following

Code to replicate in R:

References

Hanley, JA and McNeil, BJ (1982) The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve. Radiology 148, 29-36

Positive likelihood ratio

Calculate precision or sample size for the positive likelihood ratio based on sensitivity and specificity. Formula 10 from Simel et al. (1991) is used.
Groups here refer to e.g. the disease status.
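The log-scale interval for a likelihood ratio described by Simel et al. (1991) can be sketched as follows in Python. The two "groups" are those with and without the disease, with sizes derived from the prevalence. Function and argument names are assumptions made for this illustration, which is not the site's R code.

```python
import math
from statistics import NormalDist


def lr_ci(p1, p2, n1, n2, conf_level=0.95):
    """Confidence limits for the ratio p1 / p2, computed on the log scale."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    lr = p1 / p2
    se_log = math.sqrt((1 - p1) / (p1 * n1) + (1 - p2) / (p2 * n2))
    return lr * math.exp(-z * se_log), lr * math.exp(z * se_log)


def pos_lr_ci(sens, spec, n, prev, conf_level=0.95):
    """Limits for the positive likelihood ratio, sens / (1 - spec)."""
    n_cases = n * prev
    n_noncases = n * (1 - prev)
    return lr_ci(sens, 1 - spec, n_cases, n_noncases, conf_level)


lower, upper = pos_lr_ci(sens=0.8, spec=0.9, n=200, prev=0.5)
print(round(lower, 2), round(upper, 2))
```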

Please enter the following

Prevalence

Sensitivity

Specificity

Please enter one of the following

Results

Code to replicate in R:

References

Simel, DL, Samsa, GP and Matchar, DB (1991) Likelihood ratios with confidence: Sample size estimation for diagnostic test studies. J Clin Epidemiol 44(8), 763-770, DOI 10.1016/0895-4356(91)90128-v

Negative likelihood ratio

Calculate precision or sample size for the negative likelihood ratio based on sensitivity and specificity. Formula 10 from Simel et al. (1991) is used.
Groups here refer to e.g. the disease status.

Please enter the following

Prevalence

Sensitivity

Specificity

Please enter one of the following

Results

Code to replicate in R:

References

Simel, DL, Samsa, GP and Matchar, DB (1991) Likelihood ratios with confidence: Sample size estimation for diagnostic test studies. J Clin Epidemiol 44(8), 763-770, DOI 10.1016/0895-4356(91)90128-v
