“I’m a statistical engineer”: An interview with Professor Diego Kuonen

Diego Kuonen is a statistical consultant with his own successful software-vendor independent company, Statoo Consulting, which he launched in 2001. He is also a professor at the University of Geneva, where he primarily teaches business analytics.

He obtained his MSc in mathematics from the Swiss Federal Institute of Technology (EPFL) Lausanne, followed by a PhD in statistics, also from EPFL. Both degrees were awarded for outstanding work in applied statistics, mainly at the interface of statistics and computer science.

He has extensive experience in applying statistical engineering, statistical thinking, statistics, data science – a rebranding of data mining – and big data analytics within large and small companies in Switzerland and throughout Europe. In January 2016 Professor Kuonen was ranked 22nd within Maptive’s global ‘Top 100 Big Data Experts to Follow in 2016’ list, and in February 2016 he was ranked 12th within Onalytica’s global ‘Big Data 2016: Top 100 Influencers and Brands’ list.

Alison Oliver spoke to Professor Kuonen at ISCB 2016, where he was President David Warne’s invited speaker, about his consulting experience and career in statistics.

1. What was it that first introduced you to statistics as a discipline and what was it that led you to pursue statistics as a career rather than mathematics?

I didn’t know what to do after high school, but back in 1992 I chose mathematics as it seemed there were plenty of opportunities. I went to EPFL in Lausanne and started basic courses in mathematics. I always wanted to study more applied topics. During the first two years, I was introduced to statistics and I was really fascinated by the idea of exploratory data analysis. At that point I still had the impression that statistics was a subtopic of mathematics, which it is not, of course, but that was how it was structured at the time (and unfortunately still is) in most European universities.

During my third and fourth years of studies, I took all the statistics courses I possibly could. I also studied operations research and took programming courses. For several semesters, I worked on models for soccer predictions, which was such fun. I completed my diploma with Anthony C. Davison in 1998. I think I was his first diploma student at EPFL and that led to doing my PhD in statistics with him.

In 2001, once I had completed my PhD, I came to the conclusion that everyone needs statistical consulting, data analysis and data mining (now rebranded as data science) services. I first taught semester courses for engineers at the EPFL but at the same time, I started my company Statoo Consulting. I then started teaching at the University of Geneva and in the beginning of 2016 I was nominated as Adjunct Professor of Data Science at the Research Center for Statistics (RCS) of the Geneva School of Economics and Management (GSEM) at the University of Geneva.

2. Topics of interest for you include business analytics, big data analytics, data science and statistical engineering. What is it about these topics that appeals to you?

For the past four or five years, I have been interested in statistical engineering. If somebody asks me what my best job description is, I tell him/her: “I’m a statistical engineer”. I approach a client, consider the problems he/she faces and help him/her decide, at a tactical level, on the best strategy for utilising statistical concepts, principles, methods and tools, and for integrating them with information technology and other relevant sciences to generate improved results and to enable continuous improvement. The key is to help the client to augment (using, and along with, existing operational methods, tools and technologies) the strengths of his/her people, processes and services.

I have picked up over the past 15 years the idea of continuous improvement using statistical thinking, which is based on the following fundamental principles: all work occurs in a system of interconnected processes; variation, which gives rise to uncertainty, exists in all processes; and understanding and reducing (unintended) variation are keys to success. So how can this be done in order to continuously improve? This is really a philosophical approach which you can really apply everywhere. This idea of statistical thinking, augmented by the usage of statistical engineering at the tactical level, is a driving force in what I do in respect to business analytics and big data analytics.

From an academic point of view, I am currently Professor of Data Science at the Geneva School of Economics and Management (GSEM). In autumn of 2017, we will start a new master program in business analytics. The GSEM will be one of the first European schools to offer a complete 120-credit master program in business analytics. I am proud to be its founding Director in these exciting times of the digital transformation and to bridge the gap with this new master program between university education and professional needs. The students will learn about the collection, management, analysis and utilisation of data in strategic, tactical and operational decision making under uncertainties to enable continuous improvement.

This is really something in which I have a particular interest: the wish to fill the gap between academic research and professional business management practice to enable continuous improvement. I believe I know how the “real world” (i.e. the world outside academia) works, as well as how academia works and what the needs of each are. With my appointment at the University of Geneva I try to bridge the gap between the two. I bring my professional experience back in order to educate students so that they are better prepared for the future ahead.

Back in 2014, I convinced GSEM’s Dean to offer a new mandatory bachelor course “Business Analytics” for students making their major in management. The objectives of this course are to provide a solid introduction to business analytics and to statistical thinking. The course should prepare students for business leadership by developing their capacities to apply statistical thinking to improve business processes. As such, the course provides students with an overall problem solving and process improvement strategy. The making of this course has been an immense challenge. It took three months to create the slides but I had the feeling that this is the type of course that students need. I introduce older ideas such as those of Deming which continue to work very well now and I have the support of the Dean to try out new ideas. I want to give something back to the students and here I am doing so.

With my consulting, I am doing more and more strategic consulting where you approach decision makers and major companies who need to create value from their greatest asset – data. In your first interview with National Statistician John Pullinger, he said to you that “Statisticians need to have a seat at decision-making tables” and he is absolutely right. It depends on the background of the decision maker as to whether they have an idea of the scientific process or not. The issue is that most of the “older generation” believe that they just buy the software (and as such the operational technology) and then the issue is fixed, but you have to set up a culture of data-driven decision making where people are able to communicate with each other. Currently I do a lot of workshops with decision makers where I try to demystify big data (along with analytics, data science, statistics, machine intelligence and machine learning) as to what they are really about and what the problems are.

One of the main problems is the data quality because the data collection is not under their control – as most of their data have been collected (and designed) for other reasons. National Statistical Institutes have, for example, with surveys data collection under their control and have the best data to answer their (primary) questions, but they are now also interested in additional (secondary) data sources nowadays available (e.g. to augment their existing statistical information and statistics). Pharmaceutical companies have the same issue and the success of companies such as Google, Facebook and LinkedIn is due to the fact that from the beginning, the idea was the data would be their main asset. The company culture is inherent in the company. Then you work with other companies who were established a hundred or more years ago and they discover that they need analytics and no analytical and statistical mindsets exist. It is really a big change of paradigm! W. Edwards Deming once said that, “Survival is not compulsory. Improvement is not compulsory. But improvement is necessary for survival.”

3. Your lecture at ISCB 2016 was the Swiss Statistician’s ‘Big Tent’ Overview of Big Data and Data Science in Pharmaceutical Development. Please could you tell us more about this topic? What was the one thing that you would like your audience to have taken away from your lecture?

The idea was to demystify all this hype on big data and data science and their connections to statistics. I gave the first similar lecture to this in late 2013 and then someone advised me that I needed to make my slides available to the public. I put them on SlideShare and up to now, there have been over 200,000 views of the presentation (and its updates). It quickly became listed in the top 10 list of SlideShare presentations on data science and Big Data (as 8th in September 2014) and in the top 24 list on data mining (as 16th in November 2014).

Three years ago, someone advised me to use Twitter and communicate. At the time, I replied that this was a waste of time but then I started to use it (https://twitter.com/DiegoKuonen) and tried to really communicate the beauty of statistics. It was and is not self-promotion. I comment when there are statistical principles missing or some further education is needed when someone has treated some analysis in a bizarre way. Out of this, because I am quite active and am an idealistic person, I want above all for statistics to be seen for what it is actually about.

I got on a lot of top 100 lists. For example, in January this year I was ranked 22nd within Maptive’s global “Top 100 Big Data Experts to Follow” list, and in February 2016 I was ranked 12th within Onalytica’s global “Big Data: Top 100 Influencers and Brands” list. I got incredible visibility that I never thought possible. Apparently the messages and views I am sharing are not so wrong! Social media is also a wonderful marketing machine – also for the statistics profession! Since I started the company 15 years ago, I have never used any marketing budget.

4. From your experience, what are the key principles of success that you mentioned within the lecture?

It is like a summary of my professional experience I gained over the last 15 years. It is about not neglecting the following four principles that ensure successful outcomes:

– use of sequential (iterative) approaches to problem solving and improvement, as studies are rarely completed with a single data set but typically require the sequential analysis of several data sets over time;
– having a strategy for the project and for the conduct of the data analysis; including thought about the “business” objectives (“strategic thinking”);
– carefully considering data quality and assessing the “data pedigree” before, during and after the data analysis; and
– applying sound subject matter knowledge (“domain knowledge” or “business knowledge”, i.e. knowing the “business” context, process and problem to which analytics will be applied), which should be used to help define the problem, to assess the data pedigree, to guide data analysis and to interpret the results.

In addition, the key elements for a successful big data analytics and data science future are statistical principles and rigour of humans! And we should never forget that statistics, (big) data analytics and data science are aids to thinking and not replacements for it!

Many statisticians are beginning to criticize big data and data scientists but this is not the way to go. We should advise instead that a wonderful job is being done but next time, we can help you to do it even better. We have to educate about data and how to handle them. Data science is moving very fast and statisticians are not the fast guys but we have the chance to jump on the train and to put it in the right direction. Collaborating is very important as there is a lot of denial going on. Statistics is key to data collection and some only see it in terms of its use when it comes to confirmatory or explanatory analysis using p-values for example. There are plenty of opportunities ahead. This really is a golden age for statistics.

5. What do you feel has been the key in your approach when introducing the beauty of statistics in your consulting?

In the beginning, it was very hard because I was not taught how to communicate to a non-statistician. I needed to adapt and explain. Now after 15 years, I can give a course on statistics to people having no formal mathematical or statistical training. Nevertheless, if there is someone who wishes for more technical detail, I am able to help him/her as well. With respect to communication skills, many professors tell you that you should learn on the job after your degree but I don’t think this is right. This education needs to be offered at university right up to presenting your ideas to an audience who do not have the same background as you; how to write an article as an executive summary to a reader who has no clue. Communication skills are vital and this is the missing element in education.

When I was asked, around seven years ago, by the professors responsible for the master program in statistics at the University of Geneva to give a course in their master program, I told them that I would teach a course on communication skills – a course that I always wanted to have as a student. In my course “Statistical Consulting” I offer case studies and encourage the students to ask me (playing the role of a passive client with no mathematical or statistical skills) questions about my data and needs, and from what they learn, they have to write a report and give an oral presentation. I give them a mark for how well they communicated. I am not interested in whether they did it right or wrong from a statistical point of view. I want them to be able to tell me a story! Moreover, the students need to work in consulting teams, so they gain experience of working together efficiently in a group. The feedback I have received over the years is great!

6. In your lecture, you mentioned SMART machines. Do you think SMART machines will replace the statistician? There is quite a lot of fear in the profession about machine learning and artificial intelligence replacing the work of the statistician.

Automation is needed but it is very important to augment the strength of statisticians. This is illustrated in a cartoon by my friend Enrico Chavez (who also holds a PhD in Statistics from the EPFL).

 

You augment the strength of the statistician to become even more powerful. In the past, we statisticians used to have only one job: “to do our work”. Nowadays, with the digital transformation and the related data revolution, we must accept a second job: “to improve how we do our work”. As such, we need to augment our strengths by automating any routinizable work and by focusing on our core competences.

In summary, big data need statistics as much as statistics needs big data!

7. You were President of the Swiss Statistical Society from 2009 to 2015. When you look back on those years now, what do you feel were your main achievements?

I am still quite young, only 43 years old. But I was a member of the Society’s committee for twelve years and was later its President for six years.

I had plenty of ideas about making the profession more visible and it was not easy at the beginning. There is a big future for statistics ahead so we have to move. I started by creating a section for statistics in business and industry. Moreover, from the beginning, I had the feeling that we could have a big statistical family in Switzerland to join together in discourse. We had an annual Swiss Statistics meeting. I presided over a number of committees and the idea was always to find a mix between academia, business and industry, and official statistics.

I think this worked very well and the feedback was wonderful and we did a lot for the profession in Switzerland. We contacted Wiley and obtained a special rate for Significance magazine. Every year it was difficult to find people to join committees and I always tried to persuade people that statistics is the profession of the future. Statistics is at the heart of everything these days so it should be a big chance for us to do something for our profession.

8. What has been the best book on statistics that you have ever read?

Statistical Thinking: Improving Business Performance by Roger W. Hoerl and Ron D. Snee was very influential as was their book Six Sigma Beyond the Factory Floor. These both influenced how I see things.

9. Who are the people that have been influential in your career?

Anthony C. Davison from when I did my diploma and PhD. He supported me when I started my consultancy so that I could simultaneously lecture. He taught me about statistics. He is an incredible guy and mentor. Anthony was definitely the most influential person in my career. We keep in touch regularly and he continues to offer help and guidance.

Another influence is Roger W. Hoerl whom I came across through my interest in statistical thinking. I bought the first edition of his book on statistical thinking in 2001 and then in 2006, I invited Roger to be a keynote speaker at the Swiss Statistics Meeting in Lugano. He opened my world to statistical thinking and statistical engineering and introduced me to Deming’s works.

Several other professors also helped me a great deal.

10. If you had not got involved in the field of statistics and consultancy, what do you think you would have done? (Is there another field that you could have seen yourself making an impact on?)

That’s a good question. I’ve never thought of that. During my first year at EPFL, I was not sure whether I would make it to the second year. I had an idea to start producing wine, in keeping with my family background, but nowadays you can use analytics even in the wine-making process to end up with a satisfactory result and to enable continuous improvement! I would have been drawn to the analytics side of whatever I did, as it touches so many domains (if not all).

 

Copyright: Image of Professor Kuonen copyright of Statistics Views. Cartoon copyright of Enrico Chavez.