The Link Between Self-Assessment and Examination Performance


Self-assessment is important for effective learning. Students who are skilled at examining their own thought processes can use the resulting information to learn and perform well in testing situations. In addition, those who effectively use feedback from exams can raise their level of learning. I report on two activities from an introductory psychology class: one that measures students’ ability to predict their exam performance and a second that gives students feedback on their test performance before they take a retest. The feedback was specifically designed to show students which concepts they had not completely mastered.

Before each of three in-class exams, I presented the class with a list of the critical concepts for each unit to be covered in the exam. Students were asked to rate on a scale of 1–5 how well they thought they would do on each unit, with 5 signaling high performance and 1 low performance. The exams included both short-answer and essay questions. After grading the exams, I returned them to students with scores for each short-answer question and for the essays. I then gave each student the option of taking a follow-up exam consisting of new questions on only those concepts that were not answered completely accurately on the short-answer portion of the first exam. Note that this means each student has a personalized retake exam and, in theory, that exam might be only one question long. I gave students one-half of any improvement they showed on the retake questions as an add-on to their original score. In effect, this is a mastery system of grading and feedback in which students are encouraged to focus their learning efforts on areas of weakness.
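The half-credit rule for retakes can be written out as a short calculation. The following is a minimal sketch rather than the actual gradebook procedure: the function name `adjusted_score` and its parameters are hypothetical, and it assumes improvement is counted in exam points on the retaken concepts and that a weaker retake never lowers the score.

```python
def adjusted_score(original_score, points_before, points_after):
    """Return the exam score after half credit for retake improvement.

    original_score: score on the first exam
    points_before:  points earned on the missed concepts on the first exam
    points_after:   points earned on those same concepts on the retake
    All names here are illustrative assumptions.
    """
    improvement = points_after - points_before
    # Assumption: a retake that goes worse is ignored, not penalized.
    credited = 0.5 * max(improvement, 0)
    return original_score + credited


print(adjusted_score(80, 4, 14))   # 10 points of improvement -> 85.0
print(adjusted_score(80, 10, 6))   # weaker retake -> 80.0
```

Under these assumptions, a student who scored 80 and gained 10 points on the retaken concepts would finish with 85.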

To analyze self-assessment, I classified the student results for each exam into four groups based on original exam performance. The groups ranged from the top 25 percent of exam scores (first quartile) to the bottom 25 percent of exam scores (fourth quartile). I then computed the average self-rating and average exam score separately for each of the four quartile groups. The critical question is the relationship between self-ratings and performance. Table 1 shows the average self-rating and average exam score for each of the four groups on each of the three exams.

Table 1
Mean Self-Ratings and Mean Exam Scores

Quartile  Exam 1            Exam 2            Exam 3            Residual
          Rating   Score    Rating   Score    Rating   Score
First     4        93.3     3.9      94.2     4.1      87.6     3.9
Second    4.1      87.8     3.8      86.7     3.9      80.3     0.01
Third     3.6      79.8     3.5      76.6     3.6      78.6     5.6
Fourth    3.4      59.2     3.8      62.5     3.4      61.3     -10.5

I computed the correlation between average self-ratings and average exam scores across the quartile groups, using a separate data point for each group on each exam, as shown in Table 1. What is immediately obvious is a positive relationship between self-ratings and performance. The correlation between the two was r = .73, p < .01. This suggests that students generally do have a good idea of how well they know the material for purposes of exam performance; generally, the higher the self-assessment, the higher the performance. This is potentially useful for helping students strengthen their learning; if we can get students to use their self-knowledge to engage in further study of new and unfamiliar material, they should see positive results.
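For readers who want to retrace this step, here is a minimal sketch of a Pearson correlation computed over the twelve rating and score pairs printed in Table 1. The helper `pearson_r` is my own illustrative name, and the inputs are the rounded table values, so the result should land near the reported correlation without necessarily matching it exactly.

```python
import math

# Rating and score pairs copied from Table 1, in row order
# (First, Second, Third, Fourth quartile; Exams 1-3 within each).
ratings = [4.0, 3.9, 4.1, 4.1, 3.8, 3.9, 3.6, 3.5, 3.6, 3.4, 3.8, 3.4]
scores = [93.3, 94.2, 87.6, 87.8, 86.7, 80.3, 79.8, 76.6, 78.6, 59.2, 62.5, 61.3]


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(ratings, scores)
print(f"r = {r:.2f} across {len(ratings)} group means")
```

Because each point is a group average rather than an individual student, a correlation of this kind describes the trend across quartiles, not how well any single student predicts their own score.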

There is, however, an important twist to the relationship. The ratings for students in the bottom quartile in terms of performance substantially overpredicted exam performance. This can be seen in the last column of the table, which is labeled “residual”. This shows the average departure of the obtained exam score from the score predicted on the basis of the self-rating. A positive residual number means that the obtained score was higher than the predicted score. A negative residual number means that the obtained exam score was lower than the predicted score. On average, the students in the lowest performing category overpredicted their exam scores by 10.5 points. The ratings for students in the other three groups either slightly underpredicted performance or were very close to the regression line that relates prediction to performance. The conclusion is twofold: Students do have an ability to predict how well they will do on an exam, but that ability is not nearly as strong for poorer performing students as it is for higher performers. The weaker students have less awareness than the stronger students as to how well they will perform on an exam.
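The residual idea can be illustrated with a least-squares line fitted to the values in Table 1. This is a sketch under stated assumptions: it fits the line to the twelve rounded group means, whereas the original analysis may have used different inputs, so the figures it prints will not necessarily reproduce the table's Residual column. The names `quartiles` and `mean_residual` are hypothetical.

```python
# (rating, score) pairs from Table 1, grouped by quartile (Exams 1-3).
quartiles = {
    "First":  [(4.0, 93.3), (3.9, 94.2), (4.1, 87.6)],
    "Second": [(4.1, 87.8), (3.8, 86.7), (3.9, 80.3)],
    "Third":  [(3.6, 79.8), (3.5, 76.6), (3.6, 78.6)],
    "Fourth": [(3.4, 59.2), (3.8, 62.5), (3.4, 61.3)],
}

# Fit score = intercept + slope * rating by ordinary least squares.
points = [pair for group in quartiles.values() for pair in group]
n = len(points)
mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
sxx = sum((x - mean_x) ** 2 for x, _ in points)
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def mean_residual(group):
    """Average of (obtained score - predicted score) for one quartile."""
    return sum(y - (intercept + slope * x) for x, y in group) / len(group)


residuals = {name: mean_residual(group) for name, group in quartiles.items()}
for name, value in residuals.items():
    # Negative values mean the self-rating overpredicted the exam score.
    print(f"{name:<6} {value:+.1f}")
```

A negative mean residual marks a group whose obtained scores fell below what their self-ratings predicted, which is the pattern described for the fourth quartile.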

After I evaluated the original exams and created a retake exam for each student who wanted one, I was able to calculate the improvement for each student on the critical concepts for that student. I did tell students that I would not subtract points if their retake performance was lower than their original performance. For each of the four classifications of performance on the original exam, I computed the percent opting for a retake and the average improvement. The results are shown in Table 2. The change scores reflect average improvement for students in each of the quartiles.

Table 2
Percent Attempting Retake and Mean Change in Score

Quartile  Exam 1            Exam 2            Exam 3            All
          % Taken  Change   % Taken  Change   % Taken  Change   % Taken  Change
First     33       3        33       2.9      42       2.9      36       2.9
Second    67       4.1      50       4.5      50       4.7      56       4.4
Third     46       9.1      92       7.1      58       6.9      65       7.5
Fourth    54       10.9     75       9        42       14.5     57       11

Almost all students who took a retake exam showed improvement. Improvement was greatest for students in the lowest quartile and diminished with each step up the performance categories. On average, 57 percent of the students in the lowest quartile took a retake and improved by an average of 11 points. In the highest group, 36 percent took a retake and improved by an average of 2.9 points. This is not surprising; the lower a person’s original score, the more room there is for improvement. These results are clear and encouraging: when given feedback about areas of weakness, students can apply it successfully and achieve much higher scores in those areas.

Taken together, these two sets of results suggest ripe areas for helping weaker students improve. The students with lower exam scores also appear to have weaker self-awareness skills, the skills that are labeled metacognitive. However, when given explicit information about their level of knowledge, they can use that information effectively. The most obvious conclusion is that a mastery system of retakes on specified areas can be effective. When students are shown where their performance needs improvement, they can make those improvements.

A second implication of these data is that we should explore ways of improving metacognitive skills, particularly for students who are struggling. I plan to continue using self-ratings before exams and also to provide incentives for extra study of target materials before the exam is given. Given that students can make use of feedback about their knowledge, this should have positive effects. It may help sharpen their self-awareness. Stopping and thinking about what you do not know may help with general abilities for self-assessment. If we can show that improved self-knowledge is a key to learning, that would be an important advance.

David Burrows is a professor of psychology and director of inclusive pedagogy at Lawrence University.
