Analysis of Biomarker Data: A Practical Guide


A “how to” guide for applying statistical methods to biomarker data analysis

Presenting a solid foundation for the statistical methods that are used to analyze biomarker data, Analysis of Biomarker Data: A Practical Guide features preferred techniques for biomarker validation. The authors provide descriptions of select elementary statistical methods that are traditionally used to analyze biomarker data with a focus on the proper application of each method, including necessary assumptions, software recommendations, and proper interpretation of computer output. In addition, the book discusses frequently encountered challenges in analyzing biomarker data and how to deal with them, methods for the quality assessment of biomarkers, and biomarker study designs.

Covering a broad range of statistical methods that have been used to analyze biomarker data in published research studies, Analysis of Biomarker Data: A Practical Guide also features:

  • A greater emphasis on the application of methods as opposed to the underlying statistical and mathematical theory
  • The use of SAS, R, and other software throughout to illustrate the presented calculations for each example
  • Numerous exercises based on real-world data as well as solutions to the problems to aid in reader comprehension
  • The principles of good research study design and the methods for assessing the quality of a newly proposed biomarker
  • A companion website that includes a software appendix with multiple types of software and complete data sets from the book’s examples

Analysis of Biomarker Data: A Practical Guide is an ideal upper-undergraduate and graduate-level textbook for courses in the biological or environmental sciences. An excellent reference for statisticians who routinely analyze and interpret biomarker data, the book is also useful for researchers who wish to perform their own analyses of biomarker data, such as toxicologists, pharmacologists, epidemiologists, environmental and clinical laboratory scientists, and other professionals in the health and environmental sciences.


Preface xiii

Acknowledgements xvii

1 Introduction 1

1.1 What is a Biomarker? 1

1.2 Biomarkers Versus Surrogate Endpoints 2

1.3 Organization of This Book 3

2 Designing Biomarker Studies 5

2.1 Introduction 5

2.2 Designing the Study 6

2.2.1 The Exposure–Disease Association 6

2.2.2 Cross-sectional Studies 7

2.2.3 Case–Control Studies 7

2.2.4 Retrospective Cohort Studies 9

2.2.5 Prospective Cohort Studies 9

2.2.6 Observational Studies 10

2.2.7 Randomized Controlled Trials 11

2.3 Designing the Analysis 13

2.3.1 Choosing the Appropriate Measure of Association 15

Odds Ratio versus Risk Ratio 15

Consequences of Not Choosing the Appropriate Measure of Association 16

2.3.2 Choosing the Appropriate Statistical Analysis 16

2.3.3 Choosing the Appropriate Sample Size 17

2.4 Presenting Statistical Results 18

Problems 20

3 Elementary Statistical Methods for Analyzing Biomarker Data 21

3.1 Introduction 21

3.2 Graphical and Tabular Summaries 21

3.3 Descriptive Statistics 26

3.4 Describing the Shape of Distributions 31

3.5 Sampling Distributions 33

3.6 Introduction to Statistical Inference 34

3.6.1 Point Estimation and Confidence Interval Estimation 34

3.6.2 Hypothesis Testing 38

3.7 Comparing Means Across Groups 43

3.7.1 Two Group Comparisons 44

3.7.2 Multiple-Group Comparisons 45

3.8 Correlation Analysis 50

3.9 Regression Analysis 52

3.9.1 Simple Linear Regression 52

3.9.2 Multiple Regression 55

3.9.3 Analysis of Covariance 58

3.10 Analyzing Cross-Classified Data 61

3.10.1 Testing for Independence 61

3.10.2 Comparison of Proportions 65

Problems 69

4 Frequently Encountered Challenges in Analyzing Biomarker Data and How to Deal with Them 72

4.1 Introduction 72

4.2 Non-Normally Distributed Data 73

4.2.1 The Effects of Non-Normality 73

4.2.2 Testing Distributional Assumptions 74

Graphical Methods for Assessing Normality 74

Measures of Skewness and Kurtosis 81

Formal Hypothesis Tests of the Normality Assumption 83

4.2.3 Remedial Measures for Violation of a Distributional Assumption 86

Choosing a Transformation 86

Using a Robust Statistical Procedure 92

Distribution-Free Alternatives 93

4.3 Heterogeneity of Variance 113

4.3.1 The Effects of Heterogeneity 113

4.3.2 The Importance of Heterogeneity in the Comparison of Means 113

Comparisons of Two Groups 113

Comparisons of More Than Two Groups 116

Multiple Comparisons 118

4.4 Dependent Groups 122

4.4.1 The Consequences of Ignoring Dependence Among Groups 122

4.4.2 Comparing Two Dependent Means 124

Paired t-test 124

Wilcoxon Signed Ranks Test 127

Sign Test 128

4.4.3 Tests of Dependent Proportions 134

McNemar’s Test 134

Cochran’s Q test 138

Sample Size and Power Considerations 142

4.5 Correlated Outcomes 144

4.5.1 Choosing the Appropriate Measure of Association 144

Spearman’s rho 144

Kendall’s tau-b 146

4.5.2 Recommended Methods of Statistical Analysis for Correlation Coefficients 148

4.5.3 Recommended Methods for Interpreting Correlation Coefficient Results 156

4.5.4 Sample Size Issues in Correlation Analysis 157

4.5.5 Comparison of Correlation Coefficients 171

Comparison of Independent Correlation Coefficients 172

Comparison of Dependent Correlation Coefficients 174

4.5.6 Sample Size Issues When Comparing Two Correlation Coefficients 181

Sample Size Issues When Comparing Independent Correlation Coefficients 181

Sample Size Issues When Comparing Dependent Correlation Coefficients 183

4.6 Clustered Data 184

4.7 Outliers 199

4.7.1 The Effects of Outliers 199

4.7.2 Detection of Outliers 199

4.7.3 Methods for Accommodating Outliers 207

4.8 Limits of Detection and Non-Detected Observations 208

4.8.1 Statistical Inference When NDs Are Present 210

4.8.2 Maximum Likelihood Estimation of a Correlation Coefficient When Both X and Y Are Subject to Non-Detects 210

4.8.3 Comparison of Confidence Interval Methods for Correlation Coefficients When Both Variables Are Subject to Limits of Detection 212

4.9 The Analysis of Cross-Classified Categorical Data 221

4.9.1 Choosing the Appropriate Measure of Association 221

The Odds Ratio 221

Risk Ratio 223

Risk Difference 224

Odds Ratio for Paired Data 225

4.9.2 Choosing the Appropriate Statistical Analysis 225

4.9.3 Choosing the Appropriate Sample Size 226

4.9.4 Choosing a Statistical Method When Both the Predictor and the Outcome Are Dichotomous 226

Comparing Two Independent Groups in Terms of a Binomial Proportion 226

Exact Test for Independence of Rows and Columns in a 2 × 2 Table 230

Exact Inference for Odds Ratios 232

Inference for the Odds Ratio for Paired Data 234

4.9.5 Choice of a Statistical Method When the Predictor Is Ordinal and the Outcome Is Dichotomous 237

Tests for a Significant Trend in Proportions 237

4.9.6 Choice of a Statistical Method When Both the Predictor and the Outcome Are Ordinal 240

Test for Linear-by-Linear Association 240

4.9.7 Choice of a Statistical Method When Both the Predictor and the Outcome Are Nominal 243

Fisher–Freeman–Halton Test 243

Problems 246

5 Validation of Biomarkers 255

5.1 Overview of Methods for Assessing Characteristics of Biomarkers 255

5.2 General Description of Measures of Agreement 257

5.2.1 Discrete Variables 257

Cohen’s Kappa 257

Extensions of Coefficient Kappa 265

Weighted Kappa 273

5.2.2 Continuous Variables 275

Pearson’s Correlation Coefficient 275

Alternatives to Pearson’s Correlation Coefficient 277

5.3 Assessing Reliability of a Biomarker 287

5.3.1 General Considerations 287

5.3.2 Assessing Reliability of a Dichotomous Biomarker 287

Dichotomous Biomarker More Than Two Raters 289

5.3.3 Assessing Reliability of a Continuous Biomarker 291

5.3.4 Assessing Inter-Subject, Intra-Subject, and Analytical Measurement Variability 292

5.4 Assessing Validity 294

5.4.1 General Considerations 294

5.4.2 Assessing Validity When a Gold Standard Is Available 295

Dichotomous Biomarkers 295

Comparing Several Dichotomous Biomarkers 302

Continuous Biomarkers 304

5.4.3 Assessing Validity When a Gold Standard Is Not Available 314

Dichotomous Biomarkers 315

Continuous Biomarkers 319

5.4.4 Assessing Criterion Validity in Method Comparison Studies 328

5.4.5 Assessing Construct Validity in Method Comparison Studies 329

Problems 329

References 332

Solutions to Problems 348

Index 391



This website is provided by John Wiley & Sons Limited, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ (Company No: 00641132, VAT No: 376766987)
