Teaching Statistics is pleased to announce that the article entitled “Bare bones, or a rich feast? Taking care with context in a data rich world” by Sue Finch and Ian Gordon has been awarded the C. Oswald George prize for 2023.
For many decades, those teaching statistics have had a great thirst for data for teaching, student learning experiences and assessment, both formative and summative. Desirable features of such datasets have typically been potential interest for students, types of variables and data for exploration and analysis, any interesting results and potential for diagnostics, that is, for critiquing analysis assumptions. In past eras, finding a sufficient range and number of such datasets was always a challenge. As commented in this paper, datasets which “demonstrated important pedagogical principles and supported applied practice of statistical methods” were regarded as “gold”. Textbooks and other publications were some sources of data. Statistical computing packages such as Minitab, SPSS and R provide extensive libraries of datasets. Such datasets tended to be clean and already organised by variables and subject in worksheets and spreadsheets, reinforcing the emphasis on the available variables and their types.
As the emphasis on authentic student experience of the whole statistical investigation process increased, many programs included students collecting, observing or sourcing their own data, with instructors then able to use such data in current and subsequent teaching, with the permission and acknowledgement of the students who collected the data. The what, how, when, where, who and why of such datasets were, or should have been, an inherent part of the student reporting. With the increasing proliferation of data available and the growing emphasis on data science and students considering available data, there is now much attention on the provenance and contextual issues of data, as well as on the original data in all its complexity and messiness.
This paper reports on an overview of the “contextual information provided in 41 data sets suitable for introductory tertiary statistics teaching, available in the R datasets package, and investigates the source information for four data sets” which on the surface appear relatively simple, with a continuous response and just one or two categorical explanatory variables. The work involved in fully investigating the provenance of data demonstrates the importance of full disclosure of context and origin of all datasets. The investigation shows how what the authors call “sanitisation” leads to impoverished data and potentially misleading or incorrect context for analysis and/or interpretation, diminishing “integral parts of the statistical investigation process”. In contrast, full contextual information and data provenance provide a “rich feast” for teaching and learning of authentic statistical thinking, investigating, analysing and interpreting.
The investigation, presentation and discussion in this article are themselves rich with points that build bridges between the variety of desirable criteria of datasets for teaching. Perhaps there are also current messages of the ever-present dangers of “sanitisation” of contexts, issues and data, and what Stark calls “quantifauxcation”.
Congratulations to the authors for a thoughtful and challenging investigation and discussion.
The article is Open Access and can be read via the DOI given below.
S. Finch and I. Gordon, Bare bones, or a rich feast? Taking care with context in a data rich world, Teach. Stat. 45 (2023), 4–13. https://doi.org/10.1111/test.12322
HISTORY OF THE PRIZE
Dr C. Oswald George was an eminent government statistician in the UK and one of the founders of the UK’s Institute of Statisticians, which he served as Chairman and President. He donated a sum of money for the ‘best paper, especially submitted by younger authors, in the field of applied statistics’. The prize was subsequently attached to the Institute’s own professional exams. After the formation of Teaching Statistics in 1979, the Institute made the prize money available for the best article in Teaching Statistics, and this prize has continued to be made available following the merger of the Institute with the Royal Statistical Society. Dr C. Oswald George died on 6 January 1974, but we are pleased to be able to honour his legacy each year through the award of this prize to the ‘best’ article in Teaching Statistics.