Results
(9 Answers)


Answer Explanations

  • Yes
    Expert 9
    The references and source papers help in this regard - thanks.
  • No (please explain) / I cannot answer
    Expert 3
    The presented calculators are good extensions of other power calculators, but they do not go far enough. This is a question best addressed by an occupational biostatistician. I am responding as an advanced user of statistical tools, but I cannot address all the statistical details comprehensively. In a recent research project, I attempted to estimate the same quantities as those solved by the presented calculators, and we needed to use simulations. I have several concerns with the proposed methodologies. The points below may address issues that are only tangentially related to the assumptions.

    1. Distribution Assumption: Exposures are typically log-normally distributed, but the white paper assumes a normal distribution.

    2. Variability Consideration: Both exposures AND biomarkers vary between and within subjects. The calculators need to address both types of variability.

    3. Methodological Concerns: The proposed calculators present closed-form solutions. However, I understand that simulations are often necessary to estimate sample size accurately, especially since random mixed-effects models best describe exposures. The presented linear and logistic regressions will likely overestimate sample sizes and the number of repeats needed. The key issues include whether variances are homogeneous and simple, whether the design is balanced, the consideration of only random intercepts, the hypothesis test, and the within-subject correlation structure. [Confirmation with an occupational biostatistician required.]

    4. Missing Calculations: The calculations for contrast, attenuation, and bias are missing. These parameters are required in exposure assessment publications and can be estimated from linear mixed-effects models, which are used to model exposures.

    5. Literature and Justification: The justification for the calculators relies on traditional approaches. However, between- and within-subject variance issues were addressed in exposure assessment as early as 1993 by Kromhout et al. in "A comprehensive evaluation of within- and between-worker components of occupational exposure to chemical agents." A rich body of literature, including several textbooks on exposure assessment, addresses these issues. For example, Rappaport and Kupper's textbook "Quantitative Exposure Assessment" discusses sample size calculations, incorporating observed variance data from studies and sampling strategies.

    6. Variance Range: The Lin et al. (2005) paper on observed exposure variance suggests different ranges for variances and ICC than those accounted for by the calculators.

    7. Sampling Strategies: Group strategies can optimize exposure sampling, but the calculators do not address these issues.

    8. Default Values: The default values do not cover all necessary scenarios. For example, power values of 0.7 or even 0.6 may need to be included in the calculators, and the ICCs encountered in practice are often much smaller than the defaults.

    9. Biomarker vs. Exposure Samples: Whether biomarker samples or exposure samples (e.g., air samples) are better suited for an epidemiological study has not been addressed. Again, see Lin et al.'s paper.

    10. Focus on Significance: The emphasis on significance is concerning because significance is a function of sample size. The focus should be on effect size instead.

    11. Time Assumptions: The half-life of biomarkers is not included in the calculators. For example, Preau et al. (2010), in "Variability over 1 Week in the Urinary Concentrations of Metabolites of Diethyl Phthalate and Di(2-Ethylhexyl) Phthalate among Eight Adults: An Observational Study," show that half-life is crucial for calculating the number of repeats and samples needed, as well as when to take the samples.

    12. Need for Pilot Studies: Brunekreef et al. (1987) suggested that variance components should first be estimated in a pilot study. Epidemiological studies cannot simply start without these estimates. The calculators require single point values as inputs, such as the significance level, power, and the variance components, among others. It would be better to emphasize that these values need to be observed in a pilot study or obtained through simulations, such as Monte Carlo methods (see the sketch after this list), to estimate an optimal sample size and the number of repeats needed.
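
    To illustrate the simulation-based approach described in points 3 and 12, a minimal Monte Carlo sketch is shown below. All parameter values (number of subjects, number of repeats, variance components, slope) are purely illustrative assumptions, and the sketch regresses the outcome on subject-mean exposures for simplicity; a fuller version would refit a mixed-effects model (e.g., statsmodels MixedLM) in each iteration.

      # Monte Carlo sketch: power to detect an exposure-outcome association
      # when a log-normal exposure is measured with k repeated samples per
      # subject.  All values below are illustrative assumptions.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(42)

      n_subjects = 100   # assumed number of subjects
      k_repeats = 3      # assumed number of repeated biomarker samples
      sigma_b = 1.0      # between-subject SD of log exposure (assumed)
      sigma_w = 1.5      # within-subject SD of log exposure (assumed)
      beta = 0.3         # assumed true slope of outcome on log exposure
      sigma_y = 1.0      # residual SD of the continuous outcome (assumed)
      alpha = 0.05
      n_sims = 2000

      icc = sigma_b**2 / (sigma_b**2 + sigma_w**2)
      # Expected attenuation of the slope when the mean of k repeats is used:
      attenuation = k_repeats * icc / (1 + (k_repeats - 1) * icc)

      rejections = 0
      for _ in range(n_sims):
          true_log_x = rng.normal(0.0, sigma_b, n_subjects)   # subject means
          repeats = true_log_x[:, None] + rng.normal(
              0.0, sigma_w, (n_subjects, k_repeats))
          observed_x = repeats.mean(axis=1)                   # exposure proxy
          y = beta * true_log_x + rng.normal(0.0, sigma_y, n_subjects)
          rejections += stats.linregress(observed_x, y).pvalue < alpha

      print(f"ICC = {icc:.2f}, expected attenuation = {attenuation:.2f}")
      print(f"empirical power with n={n_subjects}, k={k_repeats}: "
            f"{rejections / n_sims:.2f}")

    Varying n_subjects and k_repeats over a grid and repeating the simulation gives the kind of pilot-informed trade-off between sample size and number of repeats discussed above, rather than a single closed-form answer.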

  • Yes
    Expert 4
    A classical measurement error model is assumed, and this should be appropriate. 
  • Yes
    Expert 7
    However, Berkson error is not included. It can be relevant in biomarker studies, especially when biomarkers are used to assign exposure levels or when group-level or averaged data are used. 
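
    As a brief illustration of why this distinction matters, the following sketch (with purely illustrative values) contrasts the two error types in a simple linear model: classical error attenuates the estimated slope, whereas Berkson error leaves it roughly unbiased in this linear case but inflates its variance.

      # Sketch contrasting classical and Berkson measurement error in a
      # simple linear model; all values are illustrative assumptions.
      import numpy as np

      rng = np.random.default_rng(0)
      n, beta, sd_x, sd_err, sd_y = 5000, 1.0, 1.0, 0.8, 0.5

      # Classical error: the observed exposure is the true exposure plus noise.
      x_true = rng.normal(0, sd_x, n)
      x_obs = x_true + rng.normal(0, sd_err, n)
      y = beta * x_true + rng.normal(0, sd_y, n)
      slope_classical = np.polyfit(x_obs, y, 1)[0]       # attenuated towards 0

      # Berkson error: the true exposure scatters around an assigned value,
      # as when group means or biomarker-based categories assign exposure.
      x_assigned = rng.normal(0, sd_x, n)
      x_true_b = x_assigned + rng.normal(0, sd_err, n)
      y_b = beta * x_true_b + rng.normal(0, sd_y, n)
      slope_berkson = np.polyfit(x_assigned, y_b, 1)[0]  # roughly unbiased

      print(f"classical-error slope: {slope_classical:.2f} (true beta = {beta})")
      print(f"Berkson-error slope:   {slope_berkson:.2f} (true beta = {beta})")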
  • Yes
    Expert 1
    I think the statistical assumptions on which the online calculators are based are sound and come from well-cited publications and literature. 
  • Yes
    Expert 8
    I have some general feedback for the white paper; I am not sure if this is the right place to report it, but I will add some here in the comments and keep more extensive notes for the subsequent discussions and meetings. 

    General, related to the questions below: I am not sure I would make such a clear distinction between logistic and linear regression models. There are many more models, many assuming linearity, so using 'linear' alone may be confusing. It may be clearer to frame the distinction as 'dichotomous' or categorical outcomes versus continuous outcomes, as measurement error in the exposure may indeed impact estimated associations with these types of outcomes differently. For random error in exposure assessment, associations with continuous outcomes are generally attenuated (biased towards the null), while associations with dichotomous outcomes may be biased in either direction (see the brief worked example after this comment). Sorry for the lengthy comment, but to summarize: there are indeed crucial differences; I am just not sure I would refer to them as 'linear' versus 'logistic'. 

    Also: I would not use 'bias analysis', as this seems too simplistic. There are many types of bias in epidemiological study designs, and bias in exposure assessment, with its subsequent consequences for estimating associations with outcomes, is just one of them. 
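
    As a brief worked illustration of the attenuation described above (assuming non-differential, classical random error in the exposure): if the true exposure has variance σ²_X and the measurement error has variance σ²_E, the expected observed slope for a continuous outcome is approximately

      β_observed ≈ λ · β_true,   where λ = σ²_X / (σ²_X + σ²_E),

    so, for example, equal signal and error variances roughly halve the estimated association; for dichotomous outcomes, as noted above, the bias may go in either direction.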
  • Yes
    Expert 5
    I have some edits/suggestions for the manuscript that I can share at some point, but I did not find any issues with the assumptions and statistical models.


  • Yes
    Expert 2
    The assumptions and statistical models detailed in the white paper are generally well-suited for addressing challenges related to exposure biomonitoring variability in epidemiological studies. The emphasis on classical measurement error and the use of mixed-effects models aligns well with many common scenarios in biomonitoring research. However, these assumptions may not fully account for all potential complexities, such as non-classical errors or non-normally distributed data. Therefore, caution should be exercised when applying these models and calculators to ensure they are appropriate for the specific study context.

Debate (2 Comments)

Expert 4
09/01/2024 02:31
I agree with Expert 3's comments about the issues with the calculators. The authors could consider adopting these assumptions and parameter settings in their calculators.
Expert 2
09/03/2024 07:38
I agree that Expert 3 provides valuable insights on improving these statistical tools. However, we must also consider the balance between realism and idealism. Highly complicated tools are not easily adopted by a broader audience, especially those with limited statistical and epidemiological knowledge. In practical situations, for example in sample size or power calculations, we do not expect to give an exact value. Instead, providing an approximate range that informs users of potential uncertainties in their estimations is more helpful. As an experienced statistician, I believe that a simplified, suboptimal, but more user-friendly solution would be more popular, without significantly compromising scientific integrity.