Gerda Claeskens is Professor of Statistics at KU Leuven, where she has been teaching since 2004. She gained her M.S. in Biostatistics and PhD in Mathematics at Limburgs Universitair Centrum. She has also taught at Eindhoven University of Technology, Texas A&M University and Université Catholique de Louvain.
Her research interests include lack-of-fit and goodness-of-fit tests, model selection and model averaging, and nonparametric estimation (kernel smoothers, local polynomials, regression splines).
Alison Oliver, Editor of Statistics Views, talked to Professor Claeskens about her work during JSM 2016, where she gave one of the IMS Medallion lectures.
1. You gained your licentiate in mathematics from the University of Antwerp, moving on to the doctoral program at Hasselt University, Belgium, where you then also studied for an M.S. in Biostatistics. What inspired this move to biostatistics?
I think it is still quite common to start from a bachelor’s degree in mathematics. I find it useful to have a mathematics background, or at least a solid knowledge of mathematics, when you want to pursue a statistics career.
So I started mathematics with a strong emphasis on statistics and probability, and then I decided for myself that statistics was what I wanted to go for. When I started working on my PhD research, I had the opportunity to also follow courses in a Master’s in Biostatistics program, which I did, and I eventually obtained that degree. Biostatistics is a very interesting area because you can apply many statistical techniques to it. It is so broad and has many applications. Several techniques find their origin in applications in biostatistics. Many studies get so complicated that they require such statistical tools. Even though I would not call myself a biostatistician today, and I am definitely not doing biostatistics in the applied sense, I often use situations and studies from biostatistics to apply mathematical statistical techniques.
2. You have taught at Texas A&M University and Université Catholique de Louvain but have remained at KU Leuven since 2004. What is it about KU Leuven that you love that has kept you there?
It was interesting for me to join KU Leuven because they offered me a research position. That was very good because I could focus on research for ten years in a row, with only a few teaching tasks and not many administrative duties. They attracted me with that offer. Now those ten years have passed, so I am taking a full professorship there with all duties. While it’s a big university, it does not have a statistics department, maybe for historical reasons, I don’t know why, but it has a large group of statisticians. We have the Leuven Statistics Research Centre (LStat) and there is also a joint venture with the University of Hasselt, the Interuniversity Institute for Biostatistics and statistical Bioinformatics (I-BioStat). There are so many opportunities for statisticians because it is a very large university, so there are also many statisticians you can contact and interact with.
Also, I like Leuven because it is a nice city, with many historic buildings, amongst them several university buildings.
3. How do you feel the teaching of statistics has evolved and adapted to meet the changing needs of students since you began teaching?
I’m now teaching very big groups of students, rooms of about 400 students (if they all attend), so it’s quite a challenge to keep them interested. I think university access has become a lot easier for many people, so the groups are getting bigger and bigger every year, which makes it even more challenging. Obviously I prefer smaller groups! Teaching is easier with smaller groups of students, though I still try to give the students situations that they recognize and show the relevance of what they’re doing.
But actually, what did not change for me is that I find it important in teaching to explain why students should use the methods they learn. That holds even for the business students I’m teaching now. I want them to understand why they need to use a certain formula, why a confidence interval is sometimes constructed this way and sometimes another way. So I don’t want to just give them the formulas, but I also give the background explanation, which I hope will help them to get a better understanding.
Something that has changed is the ease with which computer programs are available, like free programs such as R, and students are very fast in picking them up. I remember several years ago that I needed to organize several sessions to familiarize them with the programs, but now they almost know it before they see it, which makes teaching easier.
4. What are your current research interests? Are you working on anything in particular at the moment?
I’m working on several projects. I have several PhD students and postdocs working with me, and I try to work on separate topics with separate people. Some of the people I’m working with are focusing on graphical models and high-dimensional models, for example analyzing fMRI array data and trying to estimate which connections are present. That’s fascinating because you can analyze such data at different levels of accuracy and look only at big areas in the brain or at really tiny ones. There are still many things to learn there.
This connects with other research that many people are working on now: high-dimensional data. The situations we encounter nowadays are often high dimensional because it is easier to collect a lot of data than it was many years ago. Such data come with their own challenges, especially when investigating hypothesis tests and performing proper inference that takes all uncertainties into account.
I am also working on functional data, which also arise naturally when you keep studying aspects over time. There are, in fact, several other interesting research projects ongoing as well.
5. Your lecture at JSM 2016 was one of the Medallion lectures and your chosen title is Model Averaging and Post-Model Selection. Please could you tell us more about this topic? What is the one thing that you would like your audience to take away from your lecture?
The main message I wanted to pass on is that there is a step after model selection. That step has often been neglected in the past, but nowadays people are getting more aware of it: if you intervene, if you do a selection step, that changes your inference afterwards. For example, suppose that I started out from five models and that I have used a model selection criterion which led me to take one of them. I cannot pretend that this one model is the one I would have had if I had not done the selection I did. The act of selection changes things, it changes inference, and it might change the conclusions from the analysis. I think that’s also a challenging aspect for the future. Not everything is certain about such selections or about other inspections or pretesting of the data, and not everything is yet understood about it. There are some separate cases that people understand and can handle, but the general picture is much broader, and I’m hoping to see and understand more of that in the future.
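To make the point about selection concrete, here is a minimal simulation sketch in Python. It is not taken from the lecture: the two nested regression models, the use of AIC as the selection criterion, the normal critical value 1.96, and all parameter values (`n`, `rho`, `beta1`, `beta2`) are assumptions chosen purely for illustration. It compares how often a nominal 95% interval for one coefficient covers the true value when the interval is computed naively from the AIC-selected model, as opposed to always from the larger model.

```python
import numpy as np


def ols(X, y):
    """Least-squares fit without intercept: coefficients, standard errors, AIC."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)
    sigma2 = rss / (n - p)                     # unbiased error-variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    aic = n * np.log(rss / n) + 2 * p          # Gaussian AIC up to a constant
    return beta, se, aic


def simulate(n_rep=2000, n=50, rho=0.8, beta1=1.0, beta2=0.3, seed=0):
    """Coverage of nominal 95% intervals for beta1 (all settings are assumptions)."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])   # correlated predictors x1, x2
    z = 1.96                                   # normal approximation to the critical value
    hits_selected = 0
    hits_full = 0
    for _ in range(n_rep):
        X = rng.multivariate_normal(np.zeros(2), cov, size=n)
        y = beta1 * X[:, 0] + beta2 * X[:, 1] + rng.normal(size=n)

        b_small, se_small, aic_small = ols(X[:, :1], y)  # model with x1 only
        b_full, se_full, aic_full = ols(X, y)            # model with x1 and x2

        # Selection step: keep whichever model has the smaller AIC, then
        # report its interval as if that model had been fixed in advance.
        if aic_small < aic_full:
            b, se = b_small[0], se_small[0]
        else:
            b, se = b_full[0], se_full[0]

        hits_selected += int(abs(b - beta1) <= z * se)
        hits_full += int(abs(b_full[0] - beta1) <= z * se_full[0])
    return hits_selected / n_rep, hits_full / n_rep


naive_coverage, full_coverage = simulate()
print(f"coverage after AIC selection: {naive_coverage:.3f}")
print(f"coverage, always full model:  {full_coverage:.3f}")
```

Under these assumed settings the interval reported after selection tends to cover the true coefficient less often than the nominal level, because its standard error ignores the selection step. The sketch only illustrates that one effect and does not implement any particular post-selection correction.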
6. You have authored many publications. What is the article or book that you are most proud of?
I’ve written one book together with Nils Lid Hjort from the University of Oslo. That’s probably the biggest publication I wrote. That one I am proud of. Writing a book was a very good experience for me, because I cannot write about something if I don’t fully understand it. I read dozens of papers to prepare for writing the book. I wanted not only to read them but to really understand them as well. So it was very good for getting the full picture in my head before writing it down. I also learned a lot about basic methods that are explained in the book, methods I knew about but only fully understood after making the effort to understand everything. That book grew out of two papers I wrote with Nils Hjort in 2003 that were published in the Journal of the American Statistical Association, on frequentist model averaging and the focused information criterion.
It turned out that the book was the start of many other research projects that followed on from it, so once you get a good picture of what’s available in the literature, you also start to see the interesting other areas where you can contribute. I think writing a book was a very good experience.
8. What do you see as the greatest challenges facing the profession of statistics in the coming years?
I think statistics will somehow need to find its place in all these new areas that are evolving under different names, such as data science and big data. Some people think that those areas mainly require computational tools, such that if we find algorithms, we are fine, we can come up with answers and that’s it. But I think statistics has a real place in these new areas, not just from the side, but in the centre, because there’s so much uncertainty involved in all these aspects. So I think defining the role of statistics and placing it in the new areas is one important challenge.
Another aspect, which is maybe more important for Europeans now and maybe not so much in the US, is that statistics is often still seen as an applied branch of mathematics. For example, at grant agencies, there is no separate statistics community or statistics branch. You always need to go, for example, via mathematics, where the majority of the committee members are still mathematicians. In contrast, physics also relates to mathematics, yet it is considered its own discipline. I think statistics encompasses a big enough area to also be considered its own discipline, and not just a sub-branch of something else. I think that this is still a challenge for the statistics community.
9. What has been the best book on statistics that you have ever read?
This is a very difficult question! I write a lot in my research on something called focused model selection, which means you first need to say what you want to do, and then you find the best model to do that thing. I think it applies very well to this question. If you want to learn something about a topic, then I prefer a book that gives a broad introduction, not too difficult, with examples linking it to other areas. I’m afraid I can’t name a particular one. I read a lot, so I have many books on my bookshelves both at home and in the office.
I do like broad introductory books to start learning about a new sub-area or something new, but then when I get more familiar with the topic, I feel a need to go into more theoretical books containing more advanced topics.
10. Who are the people that have been influential in your career?
There is actually one person, and that’s Jeffrey Hart from Texas A&M University. When I was still a PhD student at the University of Hasselt he came there for a one-year sabbatical, and at that time he wrote a book which I loved reading. I started talking to him, and we ended up working together. After that year, he left to go back to Texas A&M, and I finished my PhD. He encouraged me to apply for an assistant professor position at Texas A&M, and I was very lucky to be offered the job. I started out at a very small university in a very small country, and making the big step to the US, to such a very big statistics department, is something I really owe to him, as he encouraged me to look further than the very small place that I was living in. We’ve authored several papers together, so he’s been really very stimulating for me and got my career started.
11. If you had not got involved in the field of statistics, what do you think you would have done? (Is there another field that you could have seen yourself making an impact on?)
Not now anymore, but when I was a student, I couldn’t decide whether I wanted to study mathematics or physics. At Hasselt University there was the option, in the first year of undergraduate study, to combine the two, and that is what I did. So I postponed the choice until the second year. After having had courses in statistics, I knew that was what I wanted to do. But I’m still interested in physics and astronomy. If I had not had the chance to study statistics, I might have seen myself end up there, but not now anymore. Now I’m much too hooked on statistics to make such a change.