Results
(9 Answers)

  • Expert 3

    Specific recommendations:
    Distribution Assumptions:

    • Incorporate options to account for different distribution types, such as log-normal distributions, rather than assuming a normal distribution for all exposures.
    Variability Consideration:

    • Include mechanisms to address both between-subject and within-subject variability for exposures and biomarkers.
    Incorporation of Simulations:

    • Enable the use of simulations, such as random mixed-effects models, to provide more accurate estimates of sample size and account for complex exposure patterns.
    Calculation of Additional Parameters:

    • Add calculations for contrast, attenuation, and bias, which are essential in exposure assessments and can improve the accuracy and relevance of the results.
    Expanded Range of Default Values:

    • Broaden the default values to cover a wider range of scenarios, such as lower power values (e.g., 0.7 or 0.6) and smaller ICCs.
    Flexibility in Output Selection:

    • Provide options to choose between different statistical measures, such as Relative Risk versus Odds Ratio, to better suit the needs of specific epidemiological studies.
    Consideration of Time-Related Factors:

    • Factor in the half-life of biomarkers to determine appropriate sample timing and the number of repeats required for accurate exposure assessment.
    Inclusion of Pilot Studies and Simulations:

    • Encourage the use of pilot studies and simulations (e.g., Monte Carlo methods) to estimate variance components and refine study designs before full-scale implementation.
    Sampling Strategy Optimization:

    • Introduce group sampling strategies that can optimize exposure sampling and enhance the efficiency and accuracy of the study design.
    Focus on Effect Size Over Significance:

    • Shift the focus from statistical significance, which is dependent on sample size, to effect size, which provides more meaningful insights into a study’s findings.
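    The simulation idea raised above (random mixed-effects models, ICC, attenuation) can be sketched in a few lines. The sketch below is illustrative only, not the white paper's method: it assumes a total exposure variance of 1 split by the ICC, takes the mean of k repeats as the measured exposure, and tests a plain OLS slope rather than fitting a full mixed-effects model.

    ```python
    import numpy as np

    def simulated_power(n, k, icc, beta=0.5, sims=500, seed=0):
        """Monte Carlo power for detecting an exposure-outcome slope when the
        measured exposure is the mean of k noisy repeats and icc is the
        between-subject share of a total exposure variance of 1."""
        rng = np.random.default_rng(seed)
        sigma_b, sigma_w = np.sqrt(icc), np.sqrt(1.0 - icc)
        hits = 0
        for _ in range(sims):
            true_x = rng.normal(0.0, sigma_b, n)          # subjects' usual exposure
            repeats = true_x[:, None] + rng.normal(0.0, sigma_w, (n, k))
            obs_x = repeats.mean(axis=1)                  # measured exposure
            y = beta * true_x + rng.normal(0.0, 1.0, n)   # outcome driven by true exposure
            xc, yc = obs_x - obs_x.mean(), y - y.mean()
            slope = (xc @ yc) / (xc @ xc)                 # OLS slope of y on obs_x
            resid = yc - slope * xc
            se = np.sqrt((resid @ resid) / ((n - 2) * (xc @ xc)))
            hits += abs(slope / se) > 1.96                # two-sided test at alpha = 0.05
        return hits / sims
    ```

    Raising either k or the ICC should raise the simulated power, which is the same qualitative pattern the calculators report.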


  • Expert 7

    The ability to use a different sample (from a pilot or another study) to estimate exposure or biomarker variability.
    Addition of a parameter that indicates how far back the exposure of interest occurred (number of years) relative to when the biomarker was obtained.
  • Expert 4

    Please see the above response. 
  • Expert 6

    A brief explanation of how to convert conventional input variables to standardized units.
    I don't see the utility of Section 2.  The "motivating examples" are separated from the corresponding "data," and no data are shown.
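    Expert 6's request for a conversion explanation could be met with a short formula box. A minimal sketch, assuming (as is common for log-normal biomarkers) that the calculators work with log-transformed exposures; the function names here are illustrative, not taken from the white paper:

    ```python
    import math

    def lognormal_to_log_scale(gm, gsd):
        """Convert a geometric mean / geometric SD reported on the natural
        scale to the mean / SD of log-exposure, the scale ICC formulas
        typically use: mean = ln(GM), SD = ln(GSD)."""
        return math.log(gm), math.log(gsd)

    def icc_from_components(sd_between, sd_within):
        """ICC = between-subject variance / total variance, on the log scale."""
        return sd_between ** 2 / (sd_between ** 2 + sd_within ** 2)
    ```

    For example, a reported GSD of 2.0 corresponds to a log-scale SD of ln 2 ≈ 0.693.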
  • Expert 1

    I can't think of any other factors or variables to include.
  • Expert 9

    The comments below actually pertain more to the presentation of this white paper than specific factors or variables.

    GENERAL COMMENTS
     
    How will this white paper be released?  As a publication in a scientific journal?  As a stand-alone tool on the web, together with links to the calculators?  This question is relevant because I suspect that the typical structure for scientific papers (Introduction, Methods, Results, Discussion) may not work as well for the purpose of this paper.  If the authors have freedom to arrange the paper as they like, I suggest that they consider something more like the following:
    • Introduction (background / statement of problem of variability and determining sample size, tools currently existing to address the problem, purpose of this paper).
    • The tools and examples of using them and how they perform, results of using them.
    • General discussion of how they add to the current literature / comparison to tools that are currently available. 
    Expanding on the last point, how do these calculators advance the field?  Have the authors done their homework to see what is currently available?  If there is nothing available that addresses these study design questions, then state it.  If there is something available, it would help to compare the calculators presented in this paper to what currently exists, and argue why the presented calculators are better.  For example, a VERY cursory online survey of currently available sample size calculators resulted in the following (many of which ignore within-person variability):
     
    SPECIFIC COMMENTS
     
    1. INTRODUCTION
     
    Sections 1.2, 1.3, and 1.4 all seem to say similar things; this seems redundant and could be made more concise.  As a specific example, all background and description of ICC could be combined and presented in one place near the beginning.

    Section 2.3.  Example 3: Longitudinal Studies on Chronic Disease Progression.  “The ICC becomes particularly relevant in these studies, as it reflects the consistency of biomarker measurements within individuals across multiple time points, if we assume that there is no time-trend in the studied exposure.”  As written, this seems odd – it is precisely the within-individual variation and the time trend that is of interest to study in chronic disease progression, so the between-individual variation and hence the ICC is of lower concern.

    SECTION 2.  MOTIVATING EXAMPLES AND DATA

    What is the benefit of presenting Section 2 – Motivating Examples and Data?  It doesn’t seem to relate to the rest of the paper.  If the authors used those data to illustrate the calculators they present, that should be explicitly stated.  Otherwise, there seems to be little reason to include this section at all; I suggest deleting it. 
     
    If the authors feel strongly about including Section 2, perhaps compress it to a short paragraph placed in the Introduction.  Alternatively, if the authors truly want to keep it at this level of detail, the section would be simpler to follow if it were rearranged so that each background example (e.g., section 2.1, Example 1, Assessing the Impact of Air Pollution on Respiratory Health) is immediately followed by its data example (e.g., section 2.5, Data Example 1: EPA Air Quality Monitoring Data).

    SECTION 3.  METHODS

    Are the parameters for the examples in this section drawn from actual data (e.g., EPA data on air pollution)?  If so, it would be helpful to state that clearly for each example.  Otherwise, readers could assume the data are all hypothetical and thus perhaps of less relevance.

    Sections 3 and 4.  It might help to add a paragraph about how the calculators presented in this paper relate to each other, or a suggested order for use by a researcher designing a study from scratch. 

    SECTION 4.  RESULTS

    If the authors are able to arrange the paper as they like, I recommend rearranging the Methods and Results sections to put the Results immediately after the corresponding Methods.  Example:  Start with 3.1 – Calculator #1: The Number of Repeats Calculator, followed immediately by what is currently in 4.1: Number of Repeats Needed for Desired Validity Coefficient.  I would move Figure 2 immediately after 4.1, so the reader doesn’t need to flip back and forth between the Figure and the text that describes it.  

    Many if not most of the Results are intuitively obvious and do not add to the literature; some readers may wonder why their time is being wasted.  Example in 4.1: “The number of repeats needed decrease as the ICC increases” or in 4.2: “When we need more precise estimates and exposure is more variable one can anticipate more resource-intensive research, far more so than if there was not intrinsic variability over time within a person.”  However, this can be acknowledged and used to bolster the validity and hence usefulness of the calculators, by adding something like the following sentence when the first results are presented: “The calculators yielded results that confirm what we would expect, indicating validity and utility in the field.  Several specific results are listed below…” 

    SECTION 5.  DISCUSSION

    Neither section 5.1 nor 5.2 seems to add anything really new to the literature.  I suggest combining and compressing them.

    What would really help is a discussion of how these calculators compare with already existing calculators, and thereby add to what is already known or available.

    In the Discussion, it could be helpful to expand on practical aspects of using these tools, considering additional approaches in study design.  For example:
    • Recommendations to lower intra-individual variability, such as using cumulative 24-hour voids in studies using urine samples, taking spot samples at the same time each day for compounds that show marked circadian patterns, or taking spot samples after an overnight fast.
    • Recommendations to increase between-individual variability, such as selecting subjects with a large contrast in individual exposures (mentioned in Section 6.4).
    • If an exposure of interest results in several metabolic products, choose the one with higher ICC (as long as it is a valid measure of exposure).  Or if an exposure of interest includes several different compounds, focus on those with longer half-lives or other desirable biological characteristics (such as excreted in saliva or urine instead of requiring repeated blood draws).
    Some of these tools require inputting parameter values that researchers might not be sure of for the compound(s) of interest, or that might not be readily available.  Some potential solutions you might suggest in this white paper:
    • Plan for an initial small pilot study such as the one by Preau and Calafat, to generate all necessary parameters for your exposure(s) of interest.  This has the added advantage of using your target study population under your laboratory procedures, as well as further specifically training your lab and clinical staff.
    • Estimate what likely maximum and minimum values might be for a given parameter, and plug those into the tools.  The resulting answers should provide a range of maximum and minimum sample sizes needed to obtain desired results.
    • Encourage researchers to publish results of their biomonitoring studies that include the relevant parameters (between- and within-subject variability, etc.).  If not directly relevant to their study question, the values of these parameters could be in appendices.  Or perhaps a database containing parameter values could be constructed and made publicly available, to which researchers could contribute as studies accumulate.  If such a database already exists in some form, direct readers to it.
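    The min/max-bracketing suggestion above can be sketched with the standard Spearman-Brown reliability inflation; this formula is an illustrative assumption, not necessarily the one the white paper's calculators use:

    ```python
    import math

    def n_adjusted(n_naive, icc, k):
        """Inflate a naive sample size by the reliability of the mean of k
        repeats (Spearman-Brown), reflecting attenuation from within-person
        variability: required n scales as 1 / reliability."""
        reliability = k * icc / (1 + (k - 1) * icc)
        return math.ceil(n_naive / reliability)

    # Plug in plausible min/max ICC values to bracket the required sample size.
    n_best, n_worst = n_adjusted(200, 0.7, 2), n_adjusted(200, 0.3, 2)
    ```

    With a naive target of 200 subjects and 2 repeats, an ICC anywhere between 0.3 and 0.7 brackets the required sample size between 243 and 434, which is exactly the kind of range the bullet above recommends reporting.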
  • Expert 8

    Any models with time (longitudinal data, time to event).
    E.g., Cox models are very common in my field.
  • Expert 5

    The ability to include additional independent variables (that are correlated with the exposure variable of interest) in the calculators would reflect real-world exposure scenarios more appropriately.
  • Expert 2

    1. Consider adding options that allow users to include confounders or covariates. While this may introduce additional complexity, it would enhance the calculators' versatility and applicability to a broader range of study designs.
    2. Additionally, explore the possibility of incorporating options for non-linear exposure-response relationships, which are increasingly relevant in epidemiological research.
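    The cost of adjusting for a correlated covariate, as Experts 5 and 2 suggest, can be conveyed with the usual variance-inflation argument. A minimal sketch: the 1/(1 - r²) factor is the standard result for the exposure coefficient in multiple regression, not something taken from the white paper's calculators.

    ```python
    import math

    def n_with_covariate(n_naive, r):
        """Adjusting for a covariate correlated r with the exposure inflates
        the variance of the exposure coefficient by 1 / (1 - r**2), so the
        required sample size grows by the same factor."""
        return math.ceil(n_naive / (1.0 - r ** 2))
    ```

    For instance, at r = 0.5 a study naively sized at 200 subjects would need 267, so including this option would change the calculators' answers materially.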

Debate (3 Comments)

Expert 3
09/08/2024 16:10
 I really appreciate Expert 9's suggestions and would like to expand on the ideas provided in Expert 9's discussion comment. Could the calculators offer direct, actionable recommendations or additional outputs beyond just sample size? For example, could they suggest increasing between-individual variability as a recommendation? 
Expert 6
09/10/2024 13:57
I suggest greater discussion of options beyond increase in number of exposure measurements per individual or increase in number of individuals when intra-individual exposure variability is high.  Standardizing timing of samples during the day or by season is mentioned, but a simpler approach is to compare outcomes in groups of individuals known to have very high vs very low exposures by virtue of their occupations, residence, etc.
Expert 7
09/13/2024 01:39
Explicitly addressing likely missing data.