# Introductory Statistics and Analytics: A Resampling Perspective

## Concise, thoroughly class-tested primer that features basic statistical concepts in the context of analytics, resampling, and the bootstrap

A uniquely developed presentation of key statistical topics, Introductory Statistics and Analytics: A Resampling Perspective provides an accessible approach to statistical analytics, resampling, and the bootstrap for readers with various levels of exposure to basic probability and statistics. Originally class-tested at one of the first online learning companies in the discipline, www.statistics.com, the book primarily focuses on applications of statistical concepts developed via resampling, with a background discussion of mathematical theory. This approach stresses statistical literacy and understanding, demonstrating the fundamental basis for statistical inference and demystifying traditional formulas.

The book begins with illustrations that interweave the essential statistical topics, then moves on to demonstrate the proper design of studies. Meeting all of the Guidelines for Assessment and Instruction in Statistics Education (GAISE) requirements for an introductory statistics course, Introductory Statistics and Analytics: A Resampling Perspective also includes:

• Over 300 “Try It Yourself” exercises and intermittent practice questions, which challenge readers at multiple levels to investigate and explore key statistical concepts
• Numerous interactive links designed to provide solutions to exercises and further information on crucial concepts
• Linkages that connect statistics to the rapidly growing field of data science
• Multiple discussions of various software systems, such as Microsoft Office Excel®, StatCrunch, and R, to develop and analyze data
• Areas of concern and/or contrasting points-of-view indicated through the use of “Caution” icons

Introductory Statistics and Analytics: A Resampling Perspective is an excellent primary textbook for courses in preliminary statistics as well as a supplement for courses in upper-level statistics and related fields, such as biostatistics and econometrics. The book is also a general reference for readers interested in revisiting the value of statistics.

Preface

Acknowledgments

Introduction

1 Designing and Carrying Out a Statistical Study

1.1 A Small Example

1.2 Is Chance Responsible? The Foundation of Hypothesis Testing

1.3 A Major Example

1.4 Designing an Experiment

1.5 What to Measure—Central Location

1.6 What to Measure—Variability

1.7 What to Measure—Distance (Nearness)

1.8 Test Statistic

1.9 The Data

1.10 Variables and Their Flavors

1.11 Examining and Displaying the Data

1.12 Are we Sure we Made a Difference?

Appendix: Historical Note

1.13 Exercises

2 Statistical Inference

2.1 Repeating the Experiment

2.2 How Many Reshuffles?

2.3 How Odd is Odd?

2.4 Statistical and Practical Significance

2.5 When to use Hypothesis Tests

2.6 Exercises

3 Displaying and Exploring Data

3.1 Bar Charts

3.2 Pie Charts

3.3 Misuse of Graphs

3.4 Indexing

3.5 Exercises

4 Probability

4.1 Mendel’s Peas

4.2 Simple Probability

4.3 Random Variables and their Probability Distributions

4.4 The Normal Distribution

4.5 Exercises

5 Relationship between Two Categorical Variables

5.1 Two-Way Tables

5.2 Comparing Proportions

5.3 More Probability

5.4 From Conditional Probabilities to Bayesian Estimates

5.5 Independence

5.6 Exploratory Data Analysis (EDA)

5.7 Exercises

6 Surveys and Sampling

6.1 Simple Random Samples

6.2 Margin of Error: Sampling Distribution for a Proportion

6.3 Sampling Distribution for a Mean

6.4 A Shortcut—the Bootstrap

6.5 Beyond Simple Random Sampling

6.6 Absolute Versus Relative Sample Size

6.7 Exercises

7 Confidence Intervals

7.1 Point Estimates

7.2 Interval Estimates (Confidence Intervals)

7.3 Confidence Interval for a Mean

7.4 Formula-Based Counterparts to the Bootstrap

7.5 Standard Error

7.6 Confidence Intervals for a Single Proportion

7.7 Confidence Interval for a Difference in Means

7.8 Confidence Interval for a Difference in Proportions

7.9 Recapping

Appendix A: More on the Bootstrap

Resampling Procedure—Parametric Bootstrap

Formulas and the Parametric Bootstrap

Appendix B: Alternative Populations

Appendix C: Binomial Formula Procedure

7.10 Exercises

8 Hypothesis Tests

8.1 Review of Terminology

8.2 A–B Tests: The Two Sample Comparison

8.3 Comparing Two Means

8.4 Comparing Two Proportions

8.5 Formula-Based Alternative—t-Test for Means

8.6 The Null and Alternative Hypotheses

8.7 Paired Comparisons

Appendix A: Confidence Intervals Versus Hypothesis Tests

Confidence Interval

Relationship Between the Hypothesis Test and the Confidence Interval

Comment

Appendix B: Formula-Based Variations of Two-Sample Tests

Z-Test With Known Population Variance

Pooled Versus Separate Variances

Formula-Based Alternative: Z-Test for Proportions

8.8 Exercises

9 Hypothesis Testing—2

9.1 A Single Proportion

9.2 A Single Mean

9.3 More Than Two Categories or Samples

9.4 Continuous Data

9.5 Goodness-of-Fit

Appendix: Normal Approximation; Hypothesis Test of a Single Proportion

Confidence Interval for a Mean

9.6 Exercises

10 Correlation

10.1 Example: Delta Wire

10.2 Example: Cotton Dust and Lung Disease

10.3 The Vector Product and Sum Test

10.4 Correlation Coefficient

10.5 Other Forms of Association

10.6 Correlation is not Causation

10.7 Exercises

11 Regression

11.1 Finding the Regression Line by Eye

11.2 Finding the Regression Line by Minimizing Residuals

11.3 Linear Relationships

11.4 Inference for Regression

11.5 Exercises

12 Analysis of Variance—ANOVA

12.1 Comparing More Than Two Groups: ANOVA

12.2 The Problem of Multiple Inference

12.3 A Single Test

12.4 Components of Variance

12.5 Two-Way ANOVA

12.6 Factorial Design

12.7 Exercises

13 Multiple Regression

13.1 Regression as Explanation

13.2 Simple Linear Regression—Explore the Data First

13.3 More Independent Variables

13.4 Model Assessment and Inference

13.5 Assumptions

13.6 Interaction Again

13.7 Regression for Prediction

13.8 Exercises

Index
