Student evaluations of teaching (SETs) are one of the most widely used tools for assessing teaching effectiveness and play a significant role in decisions related to promotion, tenure, contract renewal, and merit-based pay. Despite their widespread use, a growing body of research suggests that SETs often measure students’ perceptions and emotional responses rather than the quality of course design or instructional effectiveness. These findings question the appropriateness of treating SETs as objective measures of teaching.
This article argues that peer evaluation, particularly a more immersive semester-long model of peer review, offers a more meaningful and constructive way to assess university teaching. Rather than relying on isolated classroom observations or end-of-semester student opinions, a peer reviewer who engages with the course over time can provide richer insight into course design, instructional decisions, and student learning. The goal of this article is to propose a practical framework for implementing peer evaluation in a way that provides actionable feedback and supports faculty development.
The Current Evaluation Methodology
Students occupy an important position in educational dynamics, and their feedback can provide useful insights into classroom climate, instructor accessibility, and course organization. However, students are rarely equipped to evaluate key aspects of effective teaching such as the alignment of learning outcomes with assessments, the rigor of discipline-specific content, or the pedagogical reasoning behind instructional choices. Furthermore, SETs are influenced by factors unrelated to teaching quality, including course difficulty, expected grades, and implicit biases related to gender, age, or race (Boring et al., 2016; Spooren et al., 2013; Stark & Freishtat, 2014; Uttl et al., 2017).
What Student Evaluations Measure
SETs can be informative when students comment on instructional clarity, instructor responsiveness, course organization, and overall class climate. Here, student perspectives may help instructors identify areas for improvement. However, many SET questions ask students to evaluate teaching expertise or effectiveness beyond their frame of reference. As a result, their evaluations often reflect how they feel about the course rather than how effectively the course supports learning.
Research suggests that SETs are susceptible to emotional influences, including student performance. SETs can also be affected by a variety of biases unrelated to teaching effectiveness. These limitations raise significant concerns regarding fairness, particularly when SETs are used as high-stakes measures of teaching quality.
Limitations of Traditional Peer Observations
In response to recognized shortcomings of SETs, many institutions incorporate some form of peer observation into their evaluation system. However, traditional peer observation models often consist of a single scheduled classroom visit, providing only a snapshot of teaching. Classroom dynamics vary throughout the semester, and instructors may adjust their teaching when they know an observation is scheduled, thus reducing authenticity. The broader course context must be included in peer evaluation to provide meaningful insight.
What We Propose – A Semester-Long Model of Peer Evaluation
A more effective approach to peer evaluation would involve a modest but sustained level of engagement by a peer reviewer throughout the semester. The goal is not to create an intensive or burdensome process but rather to allow the reviewer to gain a broader understanding of how the course is designed and implemented. A semester-long peer evaluation could include several key components: access to the learning management system (LMS), multiple classroom visits, and structured conversations.
Access to the Learning Management System
With direct access to course materials on the institution’s LMS, the peer reviewer can examine how learning outcomes, readings, and assignments are organized. This perspective allows the reviewer to evaluate whether the course materials align with stated learning goals and whether the structure of the course supports student learning. Access to the LMS also allows reviewers to see how instructors communicate with students, how assignments are distributed across the semester, and how feedback is provided. These components are central to teaching effectiveness and are considered in formal course-review frameworks such as Quality Matters, but they are rarely visible during a brief classroom observation.
Multiple Classroom Visits
Rather than observing a single class session, the peer reviewer should attend multiple classes throughout the semester on a regular or semi-regular basis. These visits would provide the opportunity to observe different modalities of instruction, since effective pedagogy rarely consists of a single mode of content delivery. For example, one observation might take place during a lecture-focused class while another might occur during a discussion or applied learning activity. Multiple visits provide a more balanced view of the instructor’s teaching approach and reduce the pressures associated with a single high-stakes observation.
Structured Conversations
Including opportunities for discussion between the instructor and peer reviewer throughout the semester can foster a collaborative effort in establishing an effective course. An early conversation can establish the context of the course, address the instructor’s goals, and identify any specific feedback the instructor would like to receive. A follow-up conversation at the end of the semester allows the reviewer to share observations from the multiple classroom visits, including what went well and areas for improvement. When the review process is framed as a professional exchange among colleagues, peer evaluations can support a culture of shared teaching improvement rather than one of surveillance. Moving peer evaluation from a purely evaluative process to a collaborative professional development opportunity benefits both faculty members in the exchange.
Perspective From Early Career Faculty
Early career faculty often feel pressure to prioritize student satisfaction due to the weight SETs carry in tenure, promotion, and contract decisions. This can discourage the use of rigorous or innovative instructional practices. An experienced colleague as an evaluator can recognize the complexity of teaching and can provide developmental feedback that SETs alone cannot. For example, peer reviewers may provide guidance on course design, assignment structure, or discipline-specific expectations that students may not fully understand. Peer evaluations can create opportunities for mentorship and professional dialogue regarding teaching. Instead of receiving feedback only through Likert-scale surveys or short student comments, early career faculty can engage in conversations with colleagues who understand the challenges of teaching in the discipline.
Considerations for Implementation
Institutions interested in adopting an immersive model of peer evaluation need to consider several practical concerns. First, expectations for peer review should be clearly defined. Reviewing a course should not require extensive time commitments. Often, meaningful feedback can be provided through limited but strategic engagement such as reviewing course materials, attending a few class sessions, and participating in a short end-of-semester discussion.
Second, institutions may wish to provide brief training or guidelines for peer reviewers to ensure consistency. A simple rubric or checklist may help reviewers focus on key aspects of course design, instructional clarity, and assessment alignment.
Third, peer evaluation should emphasize developmental feedback rather than punitive judgment. Faculty are more likely to engage openly with the process when the goal is improvement rather than evaluation for administrative purposes.
Finally, institutions should continue to collect student feedback but interpret it alongside meaningful, informative peer evaluation and other evidence of teaching effectiveness as part of a multi-source evaluation.
Conclusion
Student opinions, perceptions, and ideologies are not without value. However, they should not be treated as fact, and, in our opinion, they should not be viewed only by the course instructor, who is left trying to distinguish emotional responses from substantive feedback. A better approach allows peer evaluators to work alongside faculty to review the surveys in concert with the peer evaluation, identifying trends and discussing the potential meaning behind comments. Additionally, peer-to-peer course evaluations should be conducted regularly to help address issues while simultaneously providing learning opportunities for colleagues. Used in this manner, holistic course evaluations provide a more meaningful assessment while protecting the mental health of faculty. Imagine if faculty could continue to improve without ever having to read a comment like “this class sucks” stripped of context.
Hannah B. Lovins, PhD, is an assistant professor of molecular biology at the University of Findlay. Her research focuses on how nutrition impacts the immune system, specifically studying n-3 polyunsaturated fatty acids and antioxidants following air pollution inhalation. She is also interested in pedagogical approaches to foster student intrinsic motivation.
Justin L. Rheubert is an assistant professor of teaching in biology and co-director of the honors program at the University of Findlay. His research interests include anatomy and physiology, comparative anatomy, histology, herpetology, and pedagogical approaches to enhance student learning.
Abby L. Kalkstein, PhD, is an associate professor of biology and co-director of the honors program at the University of Findlay. Her work focuses on the evolution of pathogens infecting both invertebrates and vertebrates, as well as research that enhances understanding of the learning process and student outcomes in higher education.
References
Boring, Anne, Ottoboni, Kellie, and Stark, Philip B. “Student evaluations of teaching (mostly) do not measure teaching effectiveness”. ScienceOpen Research. Vol. 0(0) (2016) 1–11. https://doi.org/10.14293/S2199-1006.1.SOR-EDU.AETBZC.v1
Spooren, Pieter, Brockx, Bert, and Mortelmans, Dimitri. “On the validity of student evaluation of teaching: The state of the art”. Review of Educational Research. Vol. 83(4) (2013) 598–642. https://doi.org/10.3102/0034654313496870
Stark, Philip B. and Freishtat, Richard. “An evaluation of course evaluations”. ScienceOpen Research. Vol. 0(0) (2014) 1–7. https://doi.org/10.14293/S2199-1006.1.SOR-EDU.AOFRQA.v1
Uttl, Bob, White, Carmela A., and Wong Gonzalez, Daniela. “Meta-analysis of faculty’s teaching effectiveness: Student evaluation of teaching ratings and student learning are not related”. Studies in Educational Evaluation. Vol. 54 (2017) 22–42. https://doi.org/10.1016/j.stueduc.2016.08.007
