Statistics for Censored Environmental Data Using Minitab and R, 2nd Edition


Praise for the First Edition

" . . . an excellent addition to an upper-level undergraduate course on environmental statistics, and . . . a 'must-have' desk reference for environmental practitioners dealing with censored datasets."
—Vadose Zone Journal

Statistics for Censored Environmental Data Using Minitab® and R, Second Edition introduces and explains methods for analyzing and interpreting censored data in the environmental sciences. Adapting survival analysis techniques from other disciplines, the book translates well-established methods into new solutions for environmental studies.

This new edition applies methods of survival analysis, including methods for interval-censored data, to the interpretation of low-level contaminants in environmental sciences and occupational health. Now incorporating the freely available R software as well as Minitab® into the analyses discussed, the book features newly developed and updated material including:

  • A new chapter on multivariate methods for censored data

  • Use of interval-censored methods for treating true nondetects as lower than and separate from values between the detection and quantitation limits ("remarked data")

  • A section on summing data with nondetects

  • A newly written introduction that discusses invasive data, showing why substitution methods fail

  • Expanded coverage of graphical methods for censored data

The author writes in a style that focuses on applications rather than derivations, with chapters organized by key objectives such as computing intervals, comparing groups, and correlation. Examples accompany each procedure, utilizing real-world data that can be analyzed using the Minitab® and R software macros available on the book's related website, and extensive references direct readers to authoritative literature from the environmental sciences.

Statistics for Censored Environmental Data Using Minitab® and R, Second Edition is an excellent book for courses on environmental statistics at the upper-undergraduate and graduate levels. The book also serves as a valuable reference for environmental professionals, biologists, and ecologists who focus on the water sciences, air quality, and soil science.

Preface

Acknowledgments

Introduction to the First Edition: An Accident Waiting To Happen

Introduction to the Second Edition: Invasive Data

1 Things People Do with Censored Data that Are Just Wrong

Why Not Substitute—Missing the Signals that Are Present in the Data

Why Not Substitute?—Finding Signals that Are Not There

So Why Not Substitute?

Other Common Misuses of Censored Data

2 Three Approaches for Censored Data

Approach 1: Nonparametric Methods after Censoring at the Highest Reporting Limit

Approach 2: Maximum Likelihood Estimation

Approach 3: Nonparametric Survival Analysis Methods

Application of Survival Analysis Methods to Environmental Data

Parallels to Uncensored Methods

3 Reporting Limits

Limits When the Standard Deviation is Considered Constant

Insider Censoring–Biasing Interpretations

Reporting the Machine Readings of all Measurements

Limits When the Standard Deviation Changes with Concentration

For Further Study

4 Reporting, Storing, and Using Censored Data

Reporting and Storing Censored Data

Using Interval-Censored Data

Exercises

5 Plotting Censored Data

Boxplots

Histograms

Empirical Distribution Function

Survival Function Plots

Probability Plot

X–Y Scatterplots

Exercises

6 Computing Summary Statistics and Totals

Nonparametric Methods after Censoring at the Highest Reporting Limit

Maximum Likelihood Estimation

The Nonparametric Kaplan–Meier and Turnbull Methods

ROS: A “Robust” Imputation Method

Methods in Excel

Handling Data with High Reporting Limits

A Review of Comparison Studies

Summing Data with Censored Observations

Exercises

7 Computing Interval Estimates

Parametric Intervals

Nonparametric Intervals

Intervals for Censored Data by Substitution

Intervals for Censored Data by Maximum Likelihood

Intervals for the Lognormal Distribution

Intervals Using “Robust” Parametric Methods

Nonparametric Intervals for Censored Data

Bootstrapped Intervals

For Further Study

Exercises

8 What Can be Done When All Data Are Below the Reporting Limit?

Point Estimates

Probability of Exceeding the Reporting Limit

Exceedance Probability for a Standard Higher than the Reporting Limit

Hypothesis Tests Between Groups

Summary

Exercises

9 Comparing Two Groups

Why Not Use Substitution?

Simple Nonparametric Methods After Censoring at the Highest Reporting Limit

Maximum Likelihood Estimation

Nonparametric Methods

Value of the Information in Censored Observations

Interval-Censored Score Tests: Testing Data that Include (DL to RL) Values

Paired Observations

Summary of Two-Sample Tests for Censored Data

Exercises

10 Comparing Three or More Groups

Substitution Does Not Work—Invasive Data

Nonparametric Methods after Censoring at the Highest Reporting Limit

Maximum Likelihood Estimation

Nonparametric Method—The Generalized Wilcoxon Test

Summary

Exercises

11 Correlation

Types of Correlation Coefficients

Nonparametric Methods after Censoring at the Highest Reporting Limit

Maximum Likelihood Correlation Coefficient

Nonparametric Correlation Coefficient—Kendall’s Tau

Interval-Censored Score Tests: Testing Correlation with (DL to RL) Values

Summary: A Comparison Among Methods

For Further Study

Exercises

12 Regression and Trends

Why Not Substitute?

Nonparametric Methods After Censoring at the Highest Reporting Limit

Maximum Likelihood Estimation

Akritas–Theil–Sen Nonparametric Regression

Additional Methods for Censored Regression

Exercises

13 Multivariate Methods for Censored Data

A Brief Overview of Multivariate Procedures

Nonparametric Methods After Censoring at the Highest Reporting Limit

Multivariate Methods for Data with Multiple Reporting Limits

Summary of Multivariate Methods for Censored Data

14 The NADA for R Software

A Brief Overview of R and the NADA Software

Summary of the Commands Available in NADA

Appendix: Datasets

References

Index

This website is provided by John Wiley & Sons Limited, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ (Company No: 00641132, VAT No: 376766987)
