Analysis of Biomarker Data: A Practical Guide – An interview with the authors

In April of this year, Wiley was proud to publish Analysis of Biomarker Data: A Practical Guide, a “how to” guide for applying statistical methods to biomarker data analysis.

Presenting a solid foundation for the statistical methods that are used to analyze biomarker data, Analysis of Biomarker Data: A Practical Guide features preferred techniques for biomarker validation. The authors, Professor Stephen W. Looney of Georgia Regents University and Professor Joseph L. Hagan of Baylor College of Medicine, provide descriptions of select elementary statistical methods that are traditionally used to analyze biomarker data, with a focus on the proper application of each method, including necessary assumptions, software recommendations, and proper interpretation of computer output. In addition, the book discusses frequently encountered challenges in analyzing biomarker data and how to deal with them, methods for the quality assessment of biomarkers, and biomarker study designs.

Covering a broad range of statistical methods that have been used to analyze biomarker data in published research studies, Analysis of Biomarker Data: A Practical Guide also features:

  • A greater emphasis on the application of methods as opposed to the underlying statistical and mathematical theory
  • The use of SAS, R, and other software throughout to illustrate the presented calculations for each example
  • Numerous exercises based on real-world data as well as solutions to the problems to aid in reader comprehension
  • The principles of good research study design and the methods for assessing the quality of a newly proposed biomarker
  • A companion website that includes a software appendix with multiple types of software and complete data sets from the book’s examples

Analysis of Biomarker Data: A Practical Guide is an ideal upper-undergraduate and graduate-level textbook for courses in the biological or environmental sciences. An excellent reference for statisticians who routinely analyze and interpret biomarker data, the book is also useful for researchers who wish to perform their own analyses of biomarker data, such as toxicologists, pharmacologists, epidemiologists, environmental and clinical laboratory scientists, and other professionals in the health and environmental sciences.

 

1. Congratulations on the publication of Analysis of Biomarker Data: A Practical Guide, which presents a solid foundation for the statistical methods that are used to analyse biomarker data, featuring preferred techniques for biomarker validation. How did the book come about in the first place?

We felt that there was an obvious gap in the statistical literature: A textbook devoted exclusively to the analysis of biomarker data. Biomarkers are ever-present in health sciences research and in our personal healthcare (blood pressure, serum cholesterol, DNA testing, etc.), so it seemed that there should be a book that described preferred methods for analyzing such data.

2. What were the main objectives during the writing process?

We wanted to make the book accessible to professionals outside of statistics who are tasked with analyzing biomarker data. We tried to keep the writing at a level that someone with a reasonable background in statistics (for example, two semesters of graduate-level statistics courses) could understand. We also provided numerous exercises with complete solutions to facilitate self-study, as well as the software code (primarily SAS) needed to work all of the examples and exercises in the book.

3. In the book, you have provided descriptions of select elementary statistical methods that are traditionally used to analyse biomarker data with a focus on the proper application of each method, including necessary assumptions, software recommendations, and proper interpretation of computer output. In addition, the book discusses frequently encountered challenges in analysing biomarker data and how to deal with them, methods for the quality assessment of biomarkers, and biomarker study designs. If there is one message you wish to get across to your reader, the one idea he/she will take away, what would that be?

Looney: My primary message would be that someone who wishes to analyze biomarker data should not necessarily rely on statistical techniques that have been used in previously published articles in the biomarker literature. As illustrated in our book, we found many instances of published analyses of biomarker data where the authors applied the wrong method, or applied the right method incorrectly. Those who analyze biomarker data should pay close attention to underlying assumptions and other aspects of the proper application of statistical methods before they begin their analysis.

Hagan: I agree. Inappropriate application of statistical methods is fairly common in the biomarker research literature. I hope we do not sound overly critical by saying this. In my opinion, misapplication of statistical methods is really not surprising because it is often the case that the person analysing the biomarker data is an expert in a clinical area, but he or she has much less training and experience in statistical analysis.

4. Who should read the book and why?

We consider the primary audience for this textbook to be investigators who wish to perform their own analyses of biomarker data; this includes toxicologists, pharmacologists, epidemiologists, environmental and clinical laboratory scientists and other professionals in the health and environmental sciences. Statisticians who routinely analyze and interpret biomarker data should also find it to be of interest. We think they will find “nuggets” in the book that will help facilitate their efficient use of statistical tools in their analyses of biomarker data, which can often be quite challenging. The software code we have provided should also be of value to those who routinely analyze such data; it can be downloaded free of charge from the companion website for the book.

5. There is also a companion website that includes a software appendix with multiple types of software and complete data sets from the book’s examples. SAS, R, and other software are used throughout the book itself to illustrate the presented calculations for each example, so this is very much a ‘how to’ guide on applying statistical methods to biomarker data analysis?

It was our intention to provide our readership with most of the tools they would need to analyze biomarker data. Out of necessity, however, we had to omit some techniques that we felt were too technical or too controversial for a non-statistical audience (for example, variable selection in multiple linear regression).

6. Were there any areas that you found more challenging to write? If so, why?

Looney: For me, the most challenging part was finding “real-world” data sets that illustrate the difficulties one often encounters when analyzing biomarker data. Dr. Hagan is a master at this, so he deserves all the credit for finding such stimulating examples.

Hagan: Well, I simply would have been unable to write some of the more technical parts of the book without Dr. Looney’s guidance. Dr. Looney has been an exemplary statistical consultant and teacher of statistics for a long time. In addition to being recognized as an expert in statistical methodology, he really understands the practical aspects and the “art” of applied statistics. And I think the reader will find his practical advice to be very useful.

7. What is it about this particular area that fascinates you both?

Looney: I find the ongoing research that attempts to find surrogate endpoints (which are often referred to as surrogate markers) particularly fascinating. Being able to accurately stage a tumour without performing a biopsy, for example, represents a tremendous breakthrough to my way of thinking.

Hagan: I was originally trained as a biologist and employed in the clinical laboratory before becoming a biostatistician. So this book really is the product of a synthesis of my two loves. To me, it is very exciting to be able to use statistics and the scientific method to extract information from biomarker data that can be used to improve our health and livelihood.

8. What is your current research focussing on? What are your main objectives and what do you hope to achieve through the results?

Looney: My methodological research in statistics is currently focused on analysis of what is commonly referred to as “messy data”; that is, data for which there is good reason to believe that the assumptions underlying commonly used statistical tests have been violated. This includes the area of “incomplete” data, in which the investigators were unable to obtain the desired measurements for all of their subjects. “Messy data” situations frequently occur with biomarkers, so my methodological research is directly applicable to the applied problems I commonly work on. My best research ideas have come about because of an applied problem where the standard statistical methods did not apply and I had to develop a new method or adapt an existing one in order to analyze the data at hand.

Hagan: My “day job” involves multi-disciplinary collaborative healthcare research. I really enjoy the variety of fascinating fields that I have the opportunity to work in. Statistical Process Control is an area of methodological research in which I have recently become interested. Similar to Dr. Looney, my research interest was stimulated by issues in real-world applications of Statistical Process Control to healthcare data analysis.

9. What will be your next book-length undertaking?

We are thinking of writing a SAS manual devoted exclusively to the analysis of biomarker data.

10. Please could you tell us more about your educational backgrounds and what it was that brought you to recognise statistics as a discipline in the first place?

Looney: I have a traditional arts and sciences background in mathematics – I started out with a Bachelor of Science in mathematics and then moved on to a Master of Science in math. Halfway through my M.S. program of study, I decided that pure mathematics was not for me. My former college roommate had gotten his undergraduate degree in computer science, so I decided I would try “comp sci” after completing my M.S. in math. At the time, many graduate degree programs in computer science also required graduate-level training in statistics. Once I took my first graduate course in statistics, I was hooked, and I never did any formal study in graduate-level computer science. I eventually got my Ph.D. in statistics in the same department where I had planned to study computer science.

Hagan: After getting a Bachelor of Arts in Biology, I became a certified Medical Technologist. I became very interested in working in healthcare research after being exposed to it while employed in the clinical laboratory setting. The more I learned about research, the more I came to see that a thorough knowledge of biostatistics is needed to be able to design a study and analyse the data to properly address a research question. So I obtained a master’s degree and doctorate in Biostatistics. My passion is using the scientific method to answer a research question and biostatistics is the tool we use to do that.