# `presize`: precision-based sample size calculation

It is sometimes desirable to power a study on the precision of an estimate rather than for a particular hypothesis test. `presize` provides a range of functions for performing these calculations. `presize` returns either the confidence interval width that could be expected given a sample size, or the sample size that would be necessary to attain a given confidence interval width.
For instance, we may want to estimate the mean of a blood parameter with a confidence interval 5 units wide. Based on published literature, we expect the mean value to be 20 units, with a standard deviation of 3. Given these values and the target width of 5 units, `presize` returns the number of participants required. Conversely, if we know that we have funding to include 50 participants, we can calculate the confidence interval width that we could expect.

The different estimators are grouped according to their type (e.g. mean and proportion are under 'Descriptive statistics', while odds and risk ratios are under 'Relative differences').

Each statistic has a set of fields. Mandatory fields are marked with an asterisk (*). There are also two fields that pertain to the sample size and confidence interval width, indicated by a dagger (†). Only one of these should be entered.

Relevant references are listed on each page.

This site uses the `presize` R package, which was developed at CTU Bern, the Clinical Trials Unit of the University of Bern and University Hospital Bern, on behalf of the Statistics & Methodology Platform of the Swiss Clinical Trial Organisation. The package can be installed from CRAN (`install.packages('presize')`). The R code for running the calculations on this site is shown after the results. The `presize` package also has its own website.

## Precision of a mean

Enter the mean and standard deviation you expect. To estimate the confidence interval width from a sample of size X, enter the sample size in 'Number of observations'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'.
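The two directions of this calculation can be sketched in a few lines. The following Python snippet is an illustration only and is not the `presize` implementation: it assumes a normal approximation (a t-based calculation would give somewhat wider intervals for small samples), and the function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_quantile(conf_level=0.95):
    """Two-sided standard normal quantile (about 1.96 for a 95% interval)."""
    return NormalDist().inv_cdf(1 - (1 - conf_level) / 2)


def mean_ci_width(sd, n, conf_level=0.95):
    """Approximate full width of the confidence interval for a mean."""
    return 2 * z_quantile(conf_level) * sd / sqrt(n)


def mean_n_for_width(sd, width, conf_level=0.95):
    """Approximate number of observations needed for a target CI width."""
    return ceil((2 * z_quantile(conf_level) * sd / width) ** 2)


# Example values from the introduction: sd = 3, 50 participants, target width 5.
print(mean_ci_width(sd=3, n=50))
print(mean_n_for_width(sd=3, width=5))
```

Note that the expected mean itself does not enter this approximation; only the standard deviation and the sample size determine the width.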

#### Results

Code to replicate in R:

## Precision of a proportion

Enter the proportion you expect. To estimate the confidence interval width from a sample of size X, enter the sample size in 'Number of observations'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'.

#### Other settings

The Wilson confidence interval is recommended, but others are available.
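As a rough illustration of the Wilson interval, the Python sketch below computes its width for an expected proportion and searches for the smallest sample size that reaches a target width. The function names are hypothetical, the expected proportion is treated as fixed, and the simple linear search is an assumption made for clarity rather than the approach taken by `presize`.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(p, n, conf_level=0.95):
    """Wilson score interval for an expected proportion p with n observations."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z / denom * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half, centre + half


def wilson_width(p, n, conf_level=0.95):
    """Full width of the Wilson interval."""
    lower, upper = wilson_interval(p, n, conf_level)
    return upper - lower


def wilson_n_for_width(p, width, conf_level=0.95):
    """Smallest n whose Wilson interval is no wider than `width`."""
    n = 1
    while wilson_width(p, n, conf_level) > width:
        n += 1
    return n


print(wilson_width(p=0.5, n=100))
print(wilson_n_for_width(p=0.5, width=0.2))
```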

#### Results

Code to replicate in R:

#### References

Brown LD, Cai TT, DasGupta A (2001) Interval Estimation for a Binomial Proportion, Statistical Science, 16:2, 101-117, doi:10.1214/ss/1009213286

## Precision of a rate

Enter the rate you expect. To estimate the confidence interval width from a number of events, enter the number of events in 'Number of events'. To estimate the number of events required to get a confidence interval width of X, enter the width in 'Confidence interval width'.
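To see why the number of events drives the precision of a rate, consider a simple Wald-type approximation: under a Poisson model the standard error of a rate is the rate divided by the square root of the number of events. The Python sketch below uses that approximation only. It is an assumption for illustration, the function names are hypothetical, and other interval methods (such as those compared by Barker) behave differently when few events are expected.

```python
from math import ceil, sqrt
from statistics import NormalDist


def rate_ci_width(rate, events, conf_level=0.95):
    """Wald-type CI width for a Poisson rate, given the number of events."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return 2 * z * rate / sqrt(events)


def events_for_width(rate, width, conf_level=0.95):
    """Number of events needed for a target Wald-type CI width."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil((2 * z * rate / width) ** 2)


print(rate_ci_width(rate=2.5, events=20))
print(events_for_width(rate=2.5, width=1))
```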

#### Results

Code to replicate in R:

#### References

Barker, L. (2002) A Comparison of Nine Confidence Intervals for a Poisson Parameter When the Expected Number of Events is ≤ 5, The American Statistician, 56:2, 85-89, doi:10.1198/000313002317572736

## Precision of a mean difference

Enter the mean difference and standard deviations you expect. If you intend to use uneven allocation ratios (e.g. 2 allocated to group 1 for each participant allocated to group 2), adjust the allocation ratio accordingly. To estimate the confidence interval width from a number of events, enter the number of events in 'Number of events'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'.

Allocation ratio: N2 / N1
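The effect of the allocation ratio can be illustrated with a normal-approximation sketch in Python. This is not the `presize` code: the function names are hypothetical, the variances are allowed to differ between groups, and the allocation ratio `r` is assumed to mean N2 / N1.

```python
from math import ceil, sqrt
from statistics import NormalDist


def meandiff_ci_width(sd1, sd2, n1, r=1.0, conf_level=0.95):
    """Approximate CI width for a mean difference; group 2 has r * n1 observations."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n2 = r * n1
    return 2 * z * sqrt(sd1**2 / n1 + sd2**2 / n2)


def meandiff_n1_for_width(sd1, sd2, width, r=1.0, conf_level=0.95):
    """Approximate size of group 1 needed for a target CI width."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil((2 * z / width) ** 2 * (sd1**2 + sd2**2 / r))


print(meandiff_ci_width(sd1=2.5, sd2=2.5, n1=20))
print(meandiff_n1_for_width(sd1=2.5, sd2=2.5, width=2, r=2))
```

As with a single mean, the expected difference itself does not affect the width under this approximation.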

#### Results

Code to replicate in R:

## Precision of a risk difference

Enter the proportions of events you expect in the groups. If you intend to use uneven allocation ratios (e.g. 2 allocated to group 1 for each participant allocated to group 2), adjust the allocation ratio accordingly. To estimate the confidence interval width from a number of events, enter the number of events in 'Number of events'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'.

Allocation ratio: N1 / N2
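For orientation, the simplest (Wald) interval for a risk difference can be sketched as follows. The better-performing methods listed in the references give different widths, so this Python snippet is only a rough guide. The function name is hypothetical, and the group sizes are passed explicitly to avoid assuming a particular allocation ratio convention.

```python
from math import sqrt
from statistics import NormalDist


def riskdiff_ci_width(p1, p2, n1, n2, conf_level=0.95):
    """Wald CI width for the difference between two independent proportions."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return 2 * z * se


# Expected event proportions of 30% and 20% with 100 participants per group.
print(riskdiff_ci_width(p1=0.3, p2=0.2, n1=100, n2=100))
```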

#### Results

Code to replicate in R:

#### References

Agresti A (2003) Categorical Data Analysis, Second Edition, Wiley Series in Probability and Statistics, doi:10.1002/0471249688
Agresti A and Caffo B (2000) Simple and Effective Confidence Intervals for Proportions and Differences of Proportions Result from Adding Two Successes and Two Failures, The American Statistician, 54(4):280-288
Miettinen O and Nurminen M (1985) Comparative analysis of two rates, Statistics in Medicine, 4:213-226
Newcombe RG (1998) Interval estimation for the difference between independent proportions: comparison of eleven methods, Statistics in Medicine, 17:873-890
Fagerland MW, Lydersen S, and Laake P (2015) Recommended confidence intervals for two independent binomial proportions, Statistical Methods in Medical Research, 24(2):224-254

## Precision of an odds ratio

Enter the proportions of events you expect in the groups. If you intend to use uneven allocation ratios (e.g. 2 allocated to group 1 for each participant allocated to group 2), adjust the allocation ratio accordingly. To estimate the confidence interval width from a number of events, enter the number of events in 'Number of events'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'.

Allocation ratio: N2 / N1
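A rough feel for the precision of an odds ratio can be obtained from the classical logit (Woolf) interval, sketched below in Python. This is an illustrative assumption rather than the method used on this page: the function name is hypothetical, expected (possibly fractional) cell counts are used, and `r` is assumed to be N2 / N1.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_interval(p1, p2, n1, r=1.0, conf_level=0.95):
    """Woolf (logit) interval for the odds ratio from expected cell counts."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n2 = r * n1
    a, b = n1 * p1, n1 * (1 - p1)  # expected events / non-events, group 1
    c, d = n2 * p2, n2 * (1 - p2)  # expected events / non-events, group 2
    log_or = log((p1 / (1 - p1)) / (p2 / (1 - p2)))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return exp(log_or - z * se), exp(log_or + z * se)


lower, upper = odds_ratio_interval(p1=0.4, p2=0.2, n1=100)
print(lower, upper, upper - lower)
```

Because the interval is symmetric on the log scale, it is asymmetric around the odds ratio itself, which is worth keeping in mind when choosing a target width.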

#### Results

Code to replicate in R:

#### References

Fagerland MW, Lydersen S, Laake P (2015) Recommended confidence intervals for two independent binomial proportions, Statistical Methods in Medical Research, 24(2):224-254, doi:10.1177/0962280211415469

## Precision of a risk ratio

Enter the proportions of events you expect in the groups. If you intend to use uneven allocation ratios (e.g. 2 allocated to group 1 for each participant allocated to group 2), adjust the allocation ratio accordingly. To estimate the confidence interval width from a number of events, enter the number of events in 'Number of events'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'.

Allocation ratio: N2 / N1
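The log-scale interval of Katz et al., cited below, is simple enough to sketch directly. The Python snippet is an illustration with a hypothetical function name, assumes that `r` means N2 / N1, and says nothing about which method the page applies by default.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_ratio_interval(p1, p2, n1, r=1.0, conf_level=0.95):
    """Katz log interval for the risk ratio p1 / p2."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n2 = r * n1
    se = sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))
    log_rr = log(p1 / p2)
    return exp(log_rr - z * se), exp(log_rr + z * se)


lower, upper = risk_ratio_interval(p1=0.3, p2=0.15, n1=200)
print(lower, upper, upper - lower)
```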

#### Results

Code to replicate in R:

#### References

Fagerland MW, Lydersen S, and Laake P (2015) Recommended confidence intervals for two independent binomial proportions, Statistical Methods in Medical Research, 24(2):224-254
Katz D, Baptista J, Azen SP, and Pike MC (1978) Obtaining Confidence Intervals for the Risk Ratio in Cohort Studies. Biometrics 34:469-474
Koopman PAR (1984) Confidence Intervals for the Ratio of Two Binomial Proportions, Biometrics 40:513-517

## Precision of a rate ratio

Enter the event rates you expect in the groups. If you intend to use uneven allocation ratios (e.g. 2 allocated to group 1 for each participant allocated to group 2), adjust the allocation ratio accordingly. To estimate the ratio of the upper confidence interval limit to the lower limit from a number of events, enter the number of events in 'Number of events'. To estimate the number of observations required to get a ratio of the upper confidence interval limit to the lower limit of X, enter the ratio in 'Upper-lower ratio'.

Allocation ratio: N2 / N1
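The upper-to-lower limit ratio used here as the precision measure follows from the usual log-scale interval for a rate ratio, in which the standard error depends only on the expected event counts. The Python sketch below is an approximation in the spirit of Rothman and Greenland. The function and argument names are hypothetical, and group sizes are passed explicitly rather than through an allocation ratio.

```python
from math import exp, sqrt
from statistics import NormalDist


def rate_ratio_limit_ratio(rate_exp, rate_control, n_exp, n_control, conf_level=0.95):
    """Ratio of upper to lower CI limit for a rate ratio."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    events_exp = n_exp * rate_exp              # expected events, exposed group
    events_control = n_control * rate_control  # expected events, control group
    se_log_rr = sqrt(1 / events_exp + 1 / events_control)
    return exp(2 * z * se_log_rr)


print(rate_ratio_limit_ratio(rate_exp=0.1, rate_control=0.1, n_exp=500, n_control=500))
```

A ratio close to 1 indicates a narrow interval; the ratio shrinks towards 1 as the expected event counts grow.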

#### Results

Code to replicate in R:

#### References

Rothman KJ, Greenland S (2018) Planning Study Size Based on Precision Rather Than Power. Epidemiology 29:599-603 doi:10.1097/EDE.0000000000000876

## Precision of a correlation coefficient

Enter the correlation coefficient you expect. To estimate the confidence interval width from a number of events, enter the number of events in 'Number of events'. To estimate the number of observations required to get a confidence interval width of X, enter the width in 'Confidence interval width'.
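For a Pearson correlation, the standard route to a confidence interval is the Fisher z-transformation, whose standard error is approximately 1 / sqrt(n - 3). The Python sketch below assumes that approach for the Pearson case only (Kendall and Spearman correlations need adjusted variances, as discussed by Bonett and Wright); the function names are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def cor_ci_width(r, n, conf_level=0.95):
    """CI width for a Pearson correlation via the Fisher z-transformation."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half = z / sqrt(n - 3)
    return tanh(atanh(r) + half) - tanh(atanh(r) - half)


def cor_n_for_width(r, width, conf_level=0.95):
    """Smallest n (at least 4) whose interval is no wider than `width`."""
    n = 4
    while cor_ci_width(r, n, conf_level) > width:
        n += 1
    return n


print(cor_ci_width(r=0.5, n=100))
print(cor_n_for_width(r=0.5, width=0.2))
```

Unlike the mean, the expected correlation matters here: intervals are widest around zero and narrow as the correlation approaches -1 or 1.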

#### Results

Code to replicate in R:

#### References

Bonett DG, and Wright TA (2000) Sample size requirements for estimating Pearson, Kendall and Spearman correlations. Psychometrika 65:23-28 doi:10.1007/BF02294183

## Precision of an intraclass correlation coefficient

Enter the intraclass correlation coefficient you expect. To estimate the confidence interval width from a number of subjects, enter the number of subjects in 'Number of subjects'. To estimate the number of observations required to get a confidence interval width of X, enter the desired width in 'Confidence interval width'.

#### Results

Code to replicate in R:

#### References

Bonett DG (2002). Sample size requirements for estimating intraclass correlations with desired precision. Statistics in Medicine 21:1331-1335. doi: 10.1002/sim.1108

## Precision of limits of agreement

Bland-Altman (also known as Tukey mean-difference) plots are often used to assess the agreement between two methods of measuring a quantity. A typical plot might look like the following figure. The blue line represents the mean difference between the methods, while the red lines represent the limits of agreement, the range within which most differences between the methods are expected to fall. The dotted lines represent the confidence intervals around the limits of agreement.
This page calculates the width of the confidence interval around the limit of agreement (as indicated by the black arrows), the width of which is only a function of sample size. To calculate the width of the confidence interval of the difference itself (e.g. the grey line), a paired mean difference can be used.
Enter the sample size or confidence interval width to calculate the other.
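The dependence on sample size alone can be seen from the approximation given by Bland and Altman, in which the standard error of a limit of agreement is about sqrt(3 / n) standard deviations of the differences. The Python sketch below uses that approximation with a normal quantile, so the width is expressed in units of the standard deviation of the differences. It is an assumption for illustration with hypothetical function names; a t quantile or a more exact variance would give slightly different numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist


def loa_ci_width(n, conf_level=0.95):
    """Approximate CI width around a limit of agreement, in SD units."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return 2 * z * sqrt(3 / n)


def loa_n_for_width(width, conf_level=0.95):
    """Approximate sample size for a target CI width (in SD units)."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil(3 * (2 * z / width) ** 2)


print(loa_ci_width(n=100))
print(loa_n_for_width(width=0.5))
```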

#### Result

Code to replicate in R:

#### References

Bland & Altman (1986) Statistical methods for assessing agreement between two methods of clinical measurement. Lancet i(8476):307-310 doi: 10.1016/S0140-6736(86)90837-8

## Cohen's kappa

Kappa is used to assess the agreement between multiple raters, each classifying items into mutually exclusive categories. This function supports up to 6 raters and 5 categories.

#### Result

Code to replicate in R:

#### References

Donner & Rotondi (2010) Sample Size Requirements for Interval Estimation of the Kappa Statistic for Interobserver Agreement Studies with a Binary Outcome and Multiple Raters. International Journal of Biostatistics 6:31 doi: 10.2202/1557-4679.1275
Rotondi & Donner (2012) A Confidence Interval Approach to Sample Size Estimation for Interobserver Agreement Studies with Multiple Raters and Outcomes. Journal of Clinical Epidemiology 65:778-784 doi: 10.1016/j.jclinepi.2011.10.019

## Precision of sensitivity

Sensitivity is the proportion of true positives (cases) that are identified as such by the test. It is also known as the true positive rate, recall or probability of detection. It is simply a proportion, but because the total sample size, rather than the number of cases, is typically of interest, this function requires an estimate of the prevalence of cases.

#### Optional parameters

The number of cases is calculated as sample size * prev, which can result in fractions so rounding is necessary.
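The role of prevalence and rounding can be illustrated as follows. This Python sketch converts a total sample size into a number of cases and then computes a Wilson interval width for the sensitivity. The function name and the choice of rounding up by default are assumptions made for the example.

```python
from math import ceil, floor, sqrt
from statistics import NormalDist


def sens_ci_width(sens, ntot, prev, conf_level=0.95, rounding=ceil):
    """Return (number of cases, Wilson CI width) for an expected sensitivity."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n_cases = rounding(ntot * prev)  # e.g. 250 * 0.15 = 37.5 must be rounded
    denom = 1 + z**2 / n_cases
    half = z / denom * sqrt(sens * (1 - sens) / n_cases + z**2 / (4 * n_cases**2))
    return n_cases, 2 * half


print(sens_ci_width(sens=0.8, ntot=250, prev=0.15))
print(sens_ci_width(sens=0.8, ntot=250, prev=0.15, rounding=floor))
```

Rounding down gives the more conservative (wider) interval, since fewer cases are assumed to be available.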

Code to replicate in R:

#### References

Brown LD, Cai TT, DasGupta A (2001) Interval Estimation for a Binomial Proportion, Statistical Science, 16:2, 101-117, doi:10.1214/ss/1009213286

## Precision of specificity

Specificity is the proportion of true negatives (non-cases) that are identified as such by the test. It is also known as the true negative rate. It is simply a proportion, but because the total sample size, rather than the number of non-cases, is typically of interest, this function requires an estimate of the prevalence of cases.

#### Other settings

The number of cases is calculated as sample size * prev, which can result in fractions so rounding is necessary.

#### Results

Code to replicate in R:

#### References

Brown LD, Cai TT, DasGupta A (2001) Interval Estimation for a Binomial Proportion, Statistical Science, 16:2, 101-117, doi:10.1214/ss/1009213286

## AUC (Area under the curve)

The AUC refers to the area under the Receiver Operating Characteristic (ROC) curve (the grey area in the figure below). The higher the AUC, the better a predictive model discriminates between cases and non-cases.
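The variance approximation of Hanley and McNeil, cited below, gives a convenient way to relate the AUC, the sample size and the confidence interval width. The Python sketch that follows is an illustration under stated assumptions: the function name is hypothetical, the total sample is split into cases and non-cases using an assumed prevalence, and a normal-approximation interval is used.

```python
from math import sqrt
from statistics import NormalDist


def auc_ci_width(auc, n, prev, conf_level=0.95):
    """Approximate CI width for an AUC using the Hanley-McNeil variance."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n_pos = n * prev        # expected number of cases
    n_neg = n * (1 - prev)  # expected number of non-cases
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc**2)
           + (n_neg - 1) * (q2 - auc**2)) / (n_pos * n_neg)
    return 2 * z * sqrt(var)


print(auc_ci_width(auc=0.8, n=200, prev=0.5))
```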

Please enter one of the following

Code to replicate in R:

#### References

Hanley, JA and McNeil, BJ (1982) The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve. Radiology 148, 29-36

## Positive likelihood ratio

Calculate precision or sample size for the positive likelihood ratio based on sensitivity and specificity. Formula 10 from Simel et al is used.
Groups here refer to e.g. the disease status.
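The log-scale interval for a ratio of two proportions, as described by Simel et al., can be sketched for the positive likelihood ratio, sens / (1 - spec). The Python snippet below is an illustration only. The function and argument names are hypothetical, and the two groups are taken to be the diseased and non-diseased participants.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def pos_lr_interval(sens, spec, n_diseased, n_healthy, conf_level=0.95):
    """Log-scale CI for the positive likelihood ratio sens / (1 - spec)."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    lr = sens / (1 - spec)
    se = sqrt((1 - sens) / (n_diseased * sens) + spec / (n_healthy * (1 - spec)))
    return exp(log(lr) - z * se), exp(log(lr) + z * se)


lower, upper = pos_lr_interval(sens=0.9, spec=0.8, n_diseased=100, n_healthy=100)
print(lower, upper, upper - lower)
```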

#### Results

Code to replicate in R:

#### References

Simel, DL, Samsa, GP and Matchar, DB (1991) Likelihood ratios with confidence: Sample size estimation for diagnostic test studies. J Clin Epidemiol 44(8), 763-770, DOI 10.1016/0895-4356(91)90128-v

## Negative likelihood ratio

Calculate precision or sample size for the negative likelihood ratio based on sensitivity and specificity. Formula 10 from Simel et al is used.
Groups here refer to e.g. the disease status.
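The same log-scale construction applies to the negative likelihood ratio, (1 - sens) / spec. As before, this Python sketch uses hypothetical function and argument names and is meant as a rough illustration rather than a reproduction of the page's calculation.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def neg_lr_interval(sens, spec, n_diseased, n_healthy, conf_level=0.95):
    """Log-scale CI for the negative likelihood ratio (1 - sens) / spec."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    lr = (1 - sens) / spec
    se = sqrt(sens / (n_diseased * (1 - sens)) + (1 - spec) / (n_healthy * spec))
    return exp(log(lr) - z * se), exp(log(lr) + z * se)


lower, upper = neg_lr_interval(sens=0.9, spec=0.8, n_diseased=100, n_healthy=100)
print(lower, upper, upper - lower)
```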

#### Results

Code to replicate in R:

#### References

Simel, DL, Samsa, GP and Matchar, DB (1991) Likelihood ratios with confidence: Sample size estimation for diagnostic test studies. J Clin Epidemiol 44(8), 763-770, DOI 10.1016/0895-4356(91)90128-v