50 free eprints are available at this link. If the free eprints run out, please contact me at email@example.com.
The goal of this paper was to review the variables that computing education researchers measure and how they measure them. The particular aim of this review was to highlight areas for improving standardization in the field so that we can more easily make comparisons among projects when appropriate. The review favors quantitative data analysis (as standardization is antithetical to the goals of much qualitative data analysis) but considers the important contribution that qualitative data makes.
Measurement versus data
The first section of the paper is a short primer on often misunderstood concepts in measurement. It is intended only for readers who have never had formal measurement training or who want to check their understanding. The section explains common mistakes and questionable data analysis methods that the authors have seen while reviewing, like using the split-mean method or difference/gain scores. For the purpose of a summary, I'll focus on only the most fundamental point: measurement is not always the same as data. A researcher can use a qualitative measurement to create quantitative data, e.g., by asking students to write programs (qual measurement) and giving them numeric grades (quant data). Similarly, a researcher can measure continuous data, such as a numeric grade from 0-100, and record ordinal-level data, such as a letter grade from A to F. This difference is important because researchers need to consider the data transformations that occur after measurement to use the correct analysis tools/tests and draw valid conclusions.
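To make the continuous-versus-ordinal point concrete, here is a minimal Python sketch. It is not from the paper: the scores, the letter-grade cutoffs, and the helper name `to_letter` are all assumptions made up for illustration. It shows the same measurement recorded two ways, and how the recorded form changes which summary statistic makes sense.

```python
from statistics import mean, median_low

# Hypothetical numeric grades (0-100); illustrative only, not data from the paper.
scores = [93, 88, 71, 65, 58, 79, 84]

# Assumed cutoffs for converting a numeric grade to a letter grade.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D"), (0, "F")]


def to_letter(score):
    """Map a 0-100 score to an ordinal letter grade using CUTOFFS."""
    for floor, letter in CUTOFFS:
        if score >= floor:
            return letter
    raise ValueError(f"score out of range: {score}")


# The same measurement, recorded as ordinal-level data.
letters = [to_letter(s) for s in scores]

# Continuous data supports arithmetic summaries such as the mean.
mean_score = mean(scores)

# Ordinal data has order but no fixed spacing between levels, so an
# order-based summary such as the median is the safer choice.
ORDER = ["F", "D", "C", "B", "A"]  # lowest to highest
ranks = [ORDER.index(letter) for letter in letters]
median_letter = ORDER[median_low(ranks)]

print(letters)
print(round(mean_score, 1), median_letter)
```

Once the grades are stored as letters, the distances between levels are gone, so averaging them would require inventing a numeric coding that the data does not actually carry. That is the kind of post-measurement transformation the primer asks researchers to keep track of.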
We examined all 287 papers published from 2013-2017 in Computer Science Education (CSE), Transactions on Computing Education (TOCE), and the International Computing Education Research (ICER) conference. We excluded papers that were not empirical, human-subjects research (e.g., review papers, evaluation papers, validation papers). The remaining 197 papers were coded for which variables they measured, number and type of measurements used, use of standardized instruments, number of participants in the study, and the type of data analysis used (i.e., quantitative, qualitative, or mixed).
Here are some of the results by the numbers.
- Papers using only quantitative data analysis: 92
  - Average number of measurements: 2.2
  - Median number of participants: 100
- Papers using only qualitative data analysis: 41
  - Average number of measurements: 1.2
  - Median number of participants: 17
- Papers using mixed data analysis: 64
  - Average number of measurements: 2.3
  - Median number of participants: 58
- Number of standardized instruments used: 37 (see Table 3)
- Number of direct measurements used (e.g., number of submissions or character count): 9
The majority of the analysis examines the variables that were measured to identify similarities across the field. Based on content analysis, the variables were categorized as product data, process data, or other data. Product data included variables that focused on student performance or understanding, and they were well-represented in papers that used quantitative or mixed data analysis. Process data included variables like progress/formative measurements, student experience, time on task, and collaboration. Progress/formative measurements were proportionally used across the different types of analyses; experience was disproportionately more common in qualitative and mixed data analysis; and time and collaboration were not common in any type of analysis. The last category, other data, included perceptions of computing and expert knowledge (e.g., pedagogical content knowledge, or PCK). Perceptions were twice as likely to be measured for a mixed analysis as for a quant or qual analysis, and expert knowledge was primarily used in qualitative analysis.
Based on the review of variables that were measured and existing standardized instruments, we make recommendations for new standardized instruments that the community would benefit from. Perhaps more usefully, we make recommendations for semi-standardized reporting of measurements that cannot be standardized (e.g., exams used in classes or student interviews).
Why this is important
In the conclusion, we compare the research and reporting practices that we found in the papers to best practices in human-subjects research. We highlight the practices that we already consistently adhere to and the practices that we should more consistently implement and enforce through reviewing to increase the rigor of the field. As with many literature reviews, this review of measurement helps take stock of practices in the field, providing evidence to support or refute anecdotal perceptions. All of the authors and reviewers were surprised by at least one thing in this paper. With a more complete perspective on the state of the field, we can make better informed decisions about how to improve it.
Margulieux, L. E., Ketenci, T. A., & Decker, A. (2019). Review of measurements used in computing education research and suggestions for increasing standardization. Computer Science Education. doi: 10.1080/08993408.2018.1562145
For more information about the article summary series or more article summary posts, visit the article summary series introduction.