What Counts as Evidence of Effective Teaching?

Psychology programs at large, research-focused universities often ask me to provide an external evaluation for a teaching-track faculty member who is being considered for tenure or promotion. I try to accept these invitations because I think it is vital that faculty who focus on teaching be evaluated by people who understand the complexities of teaching. Faculty and administrators at universities that prioritize research for tenure and promotion may not value or appreciate teaching. Some programs state specifically that they are asking me because they realize they don’t have the requisite expertise to evaluate teaching. Lecture is still the dominant method of teaching (Cerbin, 2018), and while lecture can be done well, there is no evidence of a widespread effort by college faculty to use what is known about learning science to improve their lecturing.

I see my task as both evaluating the candidate and informing the evaluators about how effective certain pedagogical practices are and how challenging they are to implement. Group work, for example, can range from simply dividing a class into groups and giving them an assignment to complete, to structuring assignments so that each group member has a defined role, having the group develop a contract, and creating an assessment scheme that holds each student accountable for the group outcome. Teaching-focused faculty may see the latter type of group work as critical to student learning. Faculty who do not focus on student learning in their teaching may not appreciate the effort involved.

When I agree to do an evaluation, the program provides me with a portfolio to review and assess. Its contents reflect what the institution believes is necessary to make an evaluation of teaching effectiveness. Typically, there is a faculty CV, some sort of teaching statement, and a copy of the university guidelines for tenure and promotion. There may be course syllabi and sample publications, especially if they relate to teaching. I might get access to a recorded lecture or two. I rarely receive student evaluations these days, likely out of concern for their validity (Carpenter et al., 2020) and biases (Adams et al., 2022).

Even though every college and university says it values teaching, the criteria for what constitutes teaching effectiveness for promotion vary considerably. Some schools provide no criteria at all and simply instruct me to evaluate the quality of the candidate’s teaching using the materials provided. I guess we know good teaching when we see it. Some institutions state that the candidate must demonstrate a high level of teaching expertise without actually defining it. They say that the course syllabi, presentations, exams, and grade distributions should reflect high standards. Other institutions define teaching effectiveness in terms of things they can count, such as the number of different course preparations taught, the number of honors theses supervised, and the number of independent studies taught. I guess the assumption is that the more of these you are asked to do, the better you must be. Some schools, though, have extensive criteria; usually these schools have embraced the scholarship of teaching and learning. Such criteria include

  • evidence of sustained excellence in teaching;
  • evidence of professional growth within the rank;
  • scholarly contributions to the teaching mission of the university; and
  • evidence of effective student advising and mentoring.

These schools then outline some of the forms that evidence of fulfilling these criteria can take.

With only a CV, a teaching statement, and course syllabi, it is easy to determine the candidate’s intention to teach well, but it can be difficult to ascertain their actual success. The CV can tell me whether they engaged in professional development, conducted and presented any teaching-related research, or won any teaching awards. I can glean a lot about their approach to teaching from the teaching statement. A good statement reflects an arc of teaching improvement through reflection and innovation. The statement is where teachers can discuss issues they’ve identified in their teaching, how they’ve addressed them, and how they’ve assessed whether the modifications worked. A strong statement focuses on student learning, not just on using the latest technology or following what is considered a best practice. A good statement also reflects an understanding of pedagogical research and theory. The same holds true for course syllabi: the syllabus should reflect an emphasis on student learning and student-centered course design (Richmond et al., 2016).

If you are a junior faculty member or are evaluating junior faculty members, here are some ways to demonstrate teaching effectiveness. The evidence involves identifying a teaching goal or problem, designing a way of addressing it based on pedagogical research, implementing the innovation, and assessing its impact. Ideally, you can document and publicly share what you have done so that others can replicate and build on the results. For a set of teaching behaviors you might target, see Chew and Cerbin (2021) or Keeley et al. (2006).

  • Student evaluations of teaching have drawn a lot of criticism as evidence of teaching effectiveness. They should be used with caution and should never be the sole means of evaluating teaching, but I believe they can still be informative if used properly. I recommend using an instrument with behaviorally focused questions that has established reliability and validity, such as the Students’ Evaluation of Educational Quality (SEEQ; Marsh, 1982). The SEEQ is comprehensive, and a teacher can focus on a subset of its questions. I also recommend using the SEEQ to document an arc of improvement over semesters rather than as a one-time snapshot. For example, say you are interested in increasing group discussion in your classes. You can use the Group Interaction items on the SEEQ to establish a baseline, introduce an activity to promote class discussion, and then use the same items to assess any changes.
  • Instead of student evaluations, faculty can use other standardized assessments of behaviors related to student learning, such as rapport (Lammers & Gillaspy, 2013), belongingness (Slaten et al., 2018), growth mindset (Stanford SPARQ, n.d.), and use of effective learning strategies (Pintrich et al., 1991).
  • You do not have to use a standardized set of questions to assess the impact of a class activity or pedagogical innovation. You can simply ask students whether the activity helped them understand a concept or whether it was enjoyable or easy to do. For assessment, you can use a Likert scale for numerical data or an open-ended question to gather qualitative data.
  • You can assess student understanding using low-stakes formative assessments. In class, have students solve a problem that represents what they will have to do on an exam. That will give both you and the students feedback about their level of understanding. If you wonder whether students have overcome a common misconception in your field, then assess their understanding with a clicker question in which the misconception is one of the distractor answers. Clickers make it easy to gather data.
  • You can select a small set of questions to use on summative exams to assess student understanding of a particular concept. Using the same subset allows you to compare understanding across semesters: teach with one method in one semester and a different method the next. If you are teaching two sections of the same course, you can implement the innovation in one section and assess both sections with the same subset of questions.
  • Instead of using end-of-semester student evaluations of teaching, use continuous group feedback about how the class is going. You can set up Small Group Instructional Diagnosis (SGID), which is described on the University of Northern Iowa’s Center for Excellence in Teaching and Learning website. Handelsman (2012) proposes student management teams to give the teacher continuous feedback during the semester.

Ideally, all institutions should specify what they mean by effective teaching and academic success. They should provide teachers with the tools and training they need. There should be established assessments that teachers can use to gauge their progress and resources to help them improve. When faculty are considered for promotion or tenure, it should be clear what counts as evidence of effective teaching.  

References

Adams, S., Bekker, S., Fan, Y., Gordon, T., Shepherd, L. J., Slavich, E., & Waters, D. (2022). Gender bias in student evaluations of teaching: “Punish[ing] those who fail to do their gender right.” Higher Education, 83, 787–807. https://doi.org/10.1007/s10734-021-00704-9

Carpenter, S. K., Witherby, A. E., & Tauber, S. K. (2020). On students’ (mis)judgments of learning and teaching effectiveness: Where we stand and how to move forward. Journal of Applied Research in Memory and Cognition, 9(2), 181–185. https://doi.org/10.1016/j.jarmac.2020.04.003

Cerbin, W. (2018). Improving student learning from lectures. Scholarship of Teaching and Learning in Psychology, 4(3), 151–163. https://doi.org/10.1037/stl0000113

Chew, S. L., & Cerbin, W. J. (2021). The cognitive challenges of effective teaching. The Journal of Economic Education, 52(1), 17–40. https://doi.org/10.1080/00220485.2020.1845266

Handelsman, M. M. (2012). Course evaluation for fun and profit: Student management teams. In J. Holmes, S. C. Baker, & J. R. Stowell (Eds.), Essays from e-xcellence in teaching (Vol. 11, pp. 8–11). Society for the Teaching of Psychology. http://teachpsych.org/ebooks/eit2011/index.php

Keeley, J., Smith, D., & Buskist, W. (2006). The Teacher Behaviors Checklist: Factor analysis of its utility for evaluating teaching. Teaching of Psychology, 33(2), 84–91. https://doi.org/10.1207/s15328023top3302_1

Lammers, W., & Gillaspy, A. (2013). Brief measure of student-instructor rapport predicts student success in online courses. International Journal for the Scholarship of Teaching and Learning, 7. https://doi.org/10.20429/ijsotl.2013.070216

Marsh, H. (1982). SEEQ: A reliable, valid, and useful instrument for collecting students’ evaluations of university teaching. British Journal of Educational Psychology, 52(1), 77–95. https://doi.org/10.1111/j.2044-8279.1982.tb02505.x

Pintrich, P., Smith, D. A. F., Garcia, T., & McKeachie, W. J. (1991). A manual for the use of the Motivated Strategies for Learning Questionnaire (MSLQ). The National Center for Research to Improve Postsecondary Teaching and Learning Project on Instructional Processes and Educational Outcomes. https://files.eric.ed.gov/fulltext/ED338122.pdf

Richmond, A. S., Slattery, J. M., Mitchell, N., Morgan, R. K., & Becknell, J. (2016). Can a learner-centered syllabus change students’ perceptions of student–professor rapport and master teacher behaviors? Scholarship of Teaching and Learning in Psychology, 2(3), 159–168. https://doi.org/10.1037/stl0000066

Slaten, C. D., Elison, Z. M., Deemer, E. D., Hughes, H. A., & Shemwell, D. A. (2018). The development and validation of the university belonging questionnaire. The Journal of Experimental Education, 86(4), 633–651. https://doi.org/10.1080/00220973.2017.1339009

Stanford SPARQ. (n.d.). Growth mindset scale. http://sparqtools.org/mobility-measure/growth-mindset-scale


Stephen L. Chew, PhD, is a professor of psychology at Samford University. Trained as a cognitive psychologist, he endeavors to translate cognitive research into forms that are useful for teachers and students. He is the recipient of multiple awards for his teaching and research. Author contact: slchew@samford.edu.
