# Measuring Agreement: Models, Methods, and Applications

## Books

Measuring Agreement: Models, Methods, and Applications blends the currently available statistical methodologies for agreement evaluation in a unified and coherent manner.
This comprehensive book describes the theoretical underpinnings of the methodologies and presents case studies that use several real data sets to illustrate their application. It is a reference for statisticians, biostatisticians, clinical chemists, and biomedical scientists.

Preface

1 Introduction

1.1 Preview

1.2 Notational Conventions

1.3 Basic Characteristics of a Measurement Method

1.3.1 A Statistical Model for Measurements

1.3.2 Quality Characteristics

1.4 Method Comparison Studies

1.5 Meaning of Agreement

1.6 A Measurement Error Model

1.6.1 Identifiability Issues

1.6.2 Model-Based Moments

1.6.3 Conditions for Perfect Agreement

1.6.4 Link to Test Theory

1.7 Similarity versus Agreement

1.7.1 Evaluation of Similarity

1.7.2 Evaluation of Agreement

1.8 A Toy Example

1.9 Controversies and Our View

1.10 Concepts Related to Agreement

1.11 Role of Confidence Intervals and Hypotheses Testing

1.11.1 Formulating the Agreement Hypotheses

1.11.2 Testing Hypotheses Using Confidence Bounds

1.11.3 Evaluation of Agreement Using Confidence Bounds

1.11.4 Evaluation of Similarity Using Confidence Intervals

1.12 Common Models for Paired Measurements Data

1.12.1 A Measurement Error Model

1.12.2 A Mixed-Effects Model

1.12.3 A Bivariate Normal Model

1.12.4 Limitations of the Paired Measurements Design

1.13 The Bland-Altman Plot

1.13.1 The Ideal Plot

1.13.2 A Linear Trend in the Bland-Altman Plot

1.13.3 Heteroscedasticity in the Bland-Altman Plot

1.13.4 Variations of the Bland-Altman Plot

1.14 Common Regression Approaches

1.14.1 Ordinary Linear Regression

1.14.2 Deming Regression

1.15 Inappropriate Use of Common Tests in Method Comparison Studies

1.15.1 Test of Zero Correlation

1.15.2 Paired t-test

1.15.4 Test of Zero Intercept and Unit Slope

1.16 Key Steps in the Analysis of Method Comparison Data

1.17 Chapter Summary

1.18 Bibliographic Note

Exercises

2 Common Approaches for Measuring Agreement

2.1 Preview

2.2 Introduction

2.3 Mean Squared Deviation

2.4 Concordance Correlation Coefficient

2.5 A Digression: Tolerance and Prediction Intervals

2.5.1 Definitions

2.5.2 Normally Distributed Data

2.6 Lin’s Probability Criterion and Bland-Altman Criterion

2.7 Limits of Agreement

2.7.1 The Approach

2.7.2 Why Ignore the Variability?

2.7.3 Limits of Agreement versus Prediction and Tolerance Intervals

2.8 Total Deviation Index and Coverage Probability

2.8.1 The Approaches

2.8.2 Normally Distributed Differences

2.9 Inference on Agreement Measures

2.10 Chapter Summary

2.11 Bibliographic Note

Exercises

3 A General Approach for Modeling and Inference

3.1 Preview

3.2 Mixed-Effects Models

3.2.1 The Model

3.2.2 Prediction

3.2.3 Model Fitting

3.2.4 Model Diagnostics

3.3 A Large-Sample Approach to Inference

3.3.1 Approximate Distributions

3.3.2 Confidence Intervals

3.3.3 Parameter Transformation

3.3.4 Bootstrap Confidence Intervals

3.3.5 Confidence Bands

3.3.6 Test of Homogeneity

3.3.7 Model Comparison

3.4 Modeling and Analysis of Method Comparison Data

3.5 Chapter Summary

3.6 Bibliographic Note

Exercises

4 Paired Measurements Data

4.1 Preview

4.2 Modeling of Data

4.2.1 Mixed-Effects Model

4.2.2 Bivariate Normal Model

4.3 Evaluation of Similarity and Agreement

4.4 Case Studies

4.4.1 Oxygen Saturation Data

4.4.2 Plasma Volume Data

4.4.3 Vitamin D Data

4.5 Chapter Summary

4.6 Technical Details

4.6.1 Mixed-Effects Model

4.6.2 Bivariate Normal Model

4.7 Bibliographic Note

Exercises

5 Repeated Measurements Data

5.1 Preview

5.2 Introduction

5.2.1 Types of Data

5.2.2 Individual versus Average Measurement

5.2.3 Example Datasets

5.3 Displaying Data

5.3.1 Basic Plots

5.3.2 Interaction Plots

5.4 Modeling of Data

5.4.3 Model Fitting and Evaluation

5.5 Evaluation of Similarity and Agreement

5.6 Evaluation of Repeatability

5.7 Case Studies

5.7.1 Kiwi Data

5.7.2 Oximetry Data

5.8 Chapter Summary

5.9 Technical Details

5.10 Bibliographic Note

Exercises

6 Heteroscedastic Data

6.1 Preview

6.2 Introduction

6.2.1 Diagnosing Heteroscedasticity

6.2.2 Example Datasets

6.3 Variance Function Models

6.4 Repeated Measurements Data

6.4.1 A Heteroscedastic Mixed-Effects Model

6.4.2 Specifying the Variance Function

6.4.3 Model Fitting and Evaluation

6.4.4 Testing for Homoscedasticity

6.4.5 Evaluation of Similarity, Agreement, and Repeatability

6.4.6 Case Study: Cholesterol Data

6.5 Paired Measurements Data

6.5.1 A Heteroscedastic Bivariate Normal Model

6.5.2 Specifying the Variance Function

6.5.3 Model Fitting and Evaluation

6.5.4 Testing for Homoscedasticity

6.5.5 Evaluation of Similarity and Agreement

6.5.6 Case Study: Cyclosporin Data

6.6 Chapter Summary

6.7 Technical Details

6.7.1 Repeated Measurements Data

6.7.2 Paired Measurements Data

6.8 Bibliographic Note

Exercises

7 Data from Multiple Methods

7.1 Preview

7.2 Introduction

7.3 Displaying Data

7.4 Example Datasets

7.4.1 Systolic Blood Pressure Data

7.4.2 Tumor Size Data

7.5 Modeling Unreplicated Data

7.6 Modeling Repeated Measurements Data

7.7 Model Fitting and Evaluation

7.8 Evaluation of Similarity and Agreement

7.9 Evaluation of Repeatability

7.10 Case Studies

7.10.1 Systolic Blood Pressure Data

7.10.2 Tumor Size Data

7.11 Chapter Summary

7.12 Technical Details

7.13 Bibliographic Note

Exercises

8 Data with Covariates

8.1 Preview

8.2 Introduction

8.3 Modeling of Data

8.3.1 Modeling Means of Methods

8.3.2 Modeling Variances of Methods

8.3.3 Data Models

8.3.4 Model Fitting and Evaluation

8.4 Evaluation of Similarity, Agreement, and Repeatability

8.4.1 Measures of Agreement for Two Methods

8.4.2 Measures of Agreement for More Than Two Methods

8.4.3 Measures of Repeatability

8.4.4 Inference on Measures

8.5 Case Study

8.6 Chapter Summary

8.7 Technical Details

8.8 Bibliographic Note

Exercises

9 Longitudinal Data

9.1 Preview

9.2 Introduction

9.2.1 Displaying Data

9.2.2 Percentage Body Fat Data

9.3 Modeling of Data

9.3.1 The Longitudinal Data Model

9.3.2 Specifying the Mean Functions

9.3.3 Specifying the Correlation Function

9.3.4 Model Fitting and Evaluation

9.4 Evaluation of Similarity and Agreement

9.5 Case Study

9.6 Chapter Summary

9.7 Technical Details

9.8 Bibliographic Note

Exercises

10 A Nonparametric Approach

10.1 Preview

10.2 Introduction

10.3 The Statistical Functional Approach

10.3.1 A Weighted Empirical CDF

10.3.2 Distributions Induced by Empirical CDF

10.4 Evaluation of Similarity and Agreement

10.5 Case Studies

10.5.1 Unreplicated Blood Pressure Data

10.5.2 Replicated Blood Pressure Data

10.6 Chapter Summary

10.7 Technical Details

10.7.1 The  Matrix

10.7.2 Estimation of 

10.7.3 Influence Functions for the Measures

10.7.4 TDI Confidence Bounds

10.7.5 Summary of Steps

10.8 Bibliographic Note

Exercises

11 Sample Size Determination

11.1 Preview

11.2 Introduction

11.3 The Sample Size Methodology

11.3.1 Paired Measurements Design

11.3.2 Repeated Measurements Design

11.4 Case Study

11.5 Chapter Summary

11.6 Bibliographic Note

Exercises

12 Categorical Data

12.1 Preview

12.2 Introduction

12.3 Experimental Setups and Examples

12.3.1 Types of Data

12.3.2 Illustrative Examples

12.3.3 A Graphical Approach

12.4 Cohen’s Kappa Coefficient for Dichotomous Data

12.4.1 Definition and Basic Properties: Two Raters

12.4.2 Sample Kappa Coefficient

12.4.3 Agreement with a Gold Standard

12.4.4 Unbiased Raters: Intraclass Kappa

12.4.5 Multiple Raters

12.4.6 Combining and Comparing Kappa Coefficients

12.4.7 Sample Size Calculations

12.5 Kappa Type Measures for More Than Two Categories

12.5.1 Two Fixed Raters with Nominal Categories

12.5.2 Two Raters with Ordinal Categories: Weighted Kappa

12.5.3 Multiple Raters

12.6 Case Studies

12.6.1 Two Raters with Two Categories

12.6.2 Weighted Kappa: Multiple Categories

12.7 Models for Exploring Agreement

12.7.1 Conditional Logistic Regression Models

12.7.2 Log-Linear Models

12.7.3 A Generalized Linear Mixed-Effects Model

12.8 Discussion

12.9 Chapter Summary

12.10 Bibliographic Note

Exercises

References

Dataset List

Index
