Abstract
One key mechanism through which evaluation facilitates teachers’ growth is principals’ feedback during teacher evaluation. In this study, we examined the relationship between teachers’ perceptions of feedback quality and improvement in their instructional practices. We used student report—a meaningful but rarely used measure—to assess the quality of two important instructional practices: promotion of cognitive engagement and promotion of critical thinking. We used teacher report to measure five characteristics of feedback. We found that feedback focused on strengths was associated with higher quality in both instructional practices, whereas feedback focused on ways to improve teaching was associated only with the promotion of critical thinking. Whether feedback was immediate and face-to-face did not predict higher quality instructional practices.
The Every Student Succeeds Act (ESSA) was signed into law in 2015, replacing No Child Left Behind (NCLB) in the USA. A major change in the new act was that states were given more autonomy and responsibility in building their own accountability systems and identifying evidence-based interventions to improve student learning (Darling-Hammond et al., 2016; Duff & Wohlstetter, 2019; Egalite et al., 2017). To address this responsibility, many states sought ways to improve principals’ capacity as instructional leaders (Reid et al., 2020; Cherasaro et al., 2016). One area of focus is principals’ feedback to teachers during teacher evaluation because this is a key mechanism that facilitates teachers’ growth (Donaldson & Papay, 2015).
In a variety of settings, performance feedback is an effective form of intervention with large effect sizes, yet the quality of feedback is uneven (Hattie & Timperley, 2007; Kluger & DeNisi, 1996; Sleiman et al., 2020). In teacher evaluation settings, such variable quality of feedback may be one explanation for why the effects of teacher evaluation on various educational outcomes are still inconclusive (Feeney, 2007; Stecher et al., 2018). This study seeks to advance understanding of the characteristics of principal feedback associated with high-quality instruction, which have rarely been studied (Lavigne & Good, 2015). The field needs such understanding in order to inform principals’ instructional leadership. We first briefly review literature on how performance feedback in general shapes the actions of feedback recipients. We then review literature specific to teacher evaluation and principal feedback, with a focus on psychological aspects of teacher evaluation.
1 Performance feedback in general settings
Feedback is considered an important tool for improving individuals’ performance in many contexts. Indeed, feedback is an integral element of several theories of learning and motivation, such as goal-setting theory (Locke & Latham, 1990), social cognitive theory (Bandura, 1991), and expectancy-value theory (Eccles et al., 1983). Yet few theories specifically explain how feedback during a performance evaluation improves performance, and therefore how to optimize feedback. This matters because feedback is not always effective: its success depends on the characteristics of the feedback, tasks, feedback providers and receivers, and other factors (Alvero et al., 2001; Kluger & DeNisi, 1996; Lechermeier & Fassnacht, 2018; Thurlings et al., 2012, 2013).
In one common view, feedback is regarded as a motivated learning process that aims to reduce discrepancies between current performance and a desired performance goal (Hattie & Timperley, 2007; Kluger & DeNisi, 1996). From this perspective, feedback quality is influenced by how well three questions (i.e., “where am I going,” “how am I going,” and “where to next”) are answered (Hattie & Timperley, 2007). The first question, “where am I going,” is related to goal setting, which provides information regarding the criteria for success (Locke & Latham, 1990). The second question, “how am I going,” is related to strategies to achieve goals and progress toward goals. The third question, “where to next,” provides information regarding next steps after the current task (Hattie & Timperley, 2007).
Feedback is processed in four steps: perceived feedback, acceptance of feedback, desire to respond to feedback, and the intended response. Characteristics of feedback can influence effectiveness at each step. Perceived feedback is influenced by timing in that feedback delay reduces effectiveness in behavior-feedback chains due to the limited memory of one’s behavior (Ammons, 1956; Ilgen et al., 1979). Feedback is more readily perceived and accepted if it is positive and matches one’s self-concept (Ilgen et al., 1979; Lechermeier & Fassnacht, 2018). This positive reinforcement increases recipients’ desire to respond to feedback (Thorndike, 1913) and leads recipients to set specific and challenging goals that improve performance (Locke, 1968; Locke & Latham, 1990). Recipients’ response to feedback is also influenced by their expectation for success (Eccles et al., 1983). Feedback from a trusted other may shape recipients’ expectancies for success, as does their own self-efficacy; this, in turn, affects the likelihood of taking action based on the feedback.
We combine these perspectives in a framework (see Fig. 1) to suggest how feedback from principals may improve teachers’ practices, which we discuss next. In the present study, we do not test the framework per se but rather present it to plausibly explain the effect of feedback on teaching quality.
1.1 Performance feedback in teacher evaluation
Many teacher evaluation systems in the U.S. and other countries use classroom observation by principals as a primary source of data and require post-observation conferences wherein principals provide feedback to teachers (e.g., Doherty & Jacobs, 2013; Shaked, 2018). Principal observations are a potentially powerful tool for improving teaching effectiveness when used to inform teacher professional development through feedback (Marzano & Toth, 2013; Phipps & Wiseman, 2021; Toch & Rothman, 2008). Empirically, rich, face-to-face feedback given to teachers after detailed classroom observations is regarded as an important component of effective teacher evaluation (Taylor & Tyler, 2012). It can increase teachers’ knowledge of effective teaching (Lavigne & Good, 2015). Indeed, teachers report that feedback following classroom observations is the most valuable process for their professional development (Reinhorn et al., 2017).
In most cases, the format of feedback sessions, such as the number of meetings between evaluator and teachers and when the sessions must happen, is determined by district- or state-level policies (Reid et al., 2020). Observers generally discuss what they observed and provide suggestions for improvement based on a given observation rubric. Although the content of the feedback sessions may be shaped by the observation rubric (Halverson et al., 2004), the substance of sessions (i.e., what to focus on during the feedback session) largely depends on the evaluator, which leaves room for substantial variation in effectiveness.
Accordingly, there is a growing interest in understanding what makes feedback effective in teacher evaluation. Previous studies suggest that focused and frequent feedback promotes improved instruction quality and/or student achievement (Donaldson, 2021; Garet et al., 2017; Steinberg & Sartain, 2015). Previous studies also suggest that feedback should be based on descriptive and observable data (Danielson & McGreal, 2000; Hunter & Springer, 2022), promote reflective inquiry supported by evidence (Glickman, 2002), clarify attributes of high-quality instruction (Danielson, 1996; Marzano et al., 2001), and be immediate, specific, actionable, and non-penalizing (Curtis & Wiener, 2012; Delvaux et al., 2013; Hunter & Springer, 2022; Kraft & Christian, 2019; Tuma et al., 2018).
However, the extant literature generally relies on teacher perception data to measure both characteristics of feedback and teaching improvement, which could involve source bias (e.g., Cherasaro et al., 2016; Delvaux et al., 2013). That is, teachers who feel positive about their evaluation may perceive both the quality of feedback and the impact of the evaluation positively, regardless of objective feedback quality or improvement in instruction quality. An exception is Hunter and Springer (2022) who analyzed the association between written feedback that teachers received, their classroom observation scores, and their value-added measures. Unlike prior studies, they found little association between characteristics of feedback and teachers’ improvement in instruction quality. Furthermore, the only feedback characteristic that had significant associations with instruction quality—goal-setting—was negatively associated with observation scores and positively associated with value-added measures. This suggests that the relationship between feedback characteristics and improvement in instruction quality found in previous studies may not be straightforward. A plausible explanation is that previous studies used limited indicators of both feedback quality and teaching practices. Both are multi-faceted constructs which necessitate various measures (Hill & Grossman, 2013; Kraft et al., 2018). The field needs more empirical studies that focus on multiple fine-grained characteristics of feedback quality and use different measures of instruction quality.
In the present study, we measured the quality of feedback through teacher self-report, as most previous studies have done. We chose self-report because feedback quality may not be a fixed attribute of the feedback itself, but may depend instead on characteristics of feedback providers and recipients (Shute, 2008). That is, teachers actively assess feedback from their evaluators and decide whether to accept the feedback and whether they desire to respond (Donaldson, 2021; see Fig. 1). Empirically, for example, Kinicki et al. (2004) found that, if a feedback provider is viewed as trustworthy and competent, the recipients are likely to perceive the feedback as more accurate. This is important because perceiving feedback as accurate is the first step in a cognitive process that determines recipients’ acceptance of feedback and, in turn, their future performance (Kinicki et al., 2004). Cherasaro et al. (2016) reported that accuracy and evaluator credibility predicted teachers’ perceived usefulness of the feedback, and evaluator credibility most powerfully shaped teachers’ responses to feedback. Credible evaluators have rich experience in teaching, subject-matter knowledge, and ample opportunity to observe teachers (Delvaux et al., 2013). Similarly, Kraft and Christian (2019) found that teachers’ perceptions of mutual respect and trust between themselves and their evaluators, as well as how much they enjoyed working with their evaluators, were predictive of teachers’ perceptions of feedback quality. In addition, source credibility predicts recipients’ motivation, desire, and intent to respond to the feedback and improve performance (Roberson & Stewart, 2006). Furthermore, feedback quality can be fluid, as it is based on the unfolding discussion between provider and receiver during feedback sessions. For example, Thurlings et al. (2012) found that teachers were capable of transforming ineffective feedback patterns into effective ones as they gave feedback to one another in face-to-face peer groups. Taken together, teachers’ perceived quality of feedback is important for understanding what constitutes effective feedback.
Based on the extant literature and the theoretical framework in Fig. 1, our study focused on five characteristics of feedback: (1) face-to-face and immediate, (2) specific regarding ways to improve teaching, (3) useful and relevant regarding ways to improve teaching, (4) specific regarding strengths, and (5) useful and relevant regarding strengths. We hypothesized that face-to-face and immediate feedback would lead to greater improvement of instruction because it occurs while memory is fresh and because in-person communication facilitates clarifying questions and explanations. We hypothesized that characteristics 2 and 3, which reflect feedback on ways to improve teaching, would also lead to greater improvement of instruction because they address the questions “where am I going,” “how am I going,” and “where to next.” However, they may be less effective than characteristics 4 and 5, which reflect feedback on a teacher’s existing strengths. A focus on strengths may increase acceptance of feedback and boost individuals’ expectancies for success. Exploring these hypotheses is important because principals need guidance on whether it is more effective to point out what teachers need to do to improve or to point out what they are already doing well. Characteristics 2 and 4, compared to 3 and 5, reflect teachers’ evaluation of feedback’s specificity versus its relevance and usefulness. Such nuances in teachers’ perceptions of feedback have not previously been studied as they co-occur, yet they are relevant to teachers’ acceptance of feedback and expectations for success, and understanding them should help improve teacher evaluation practice.
1.2 Student surveys as a measure of instruction quality
We measured the quality of instructional practices through student report. In contrast, previous studies typically used teacher self-report (potentially resulting in source bias) but sometimes used student achievement or classroom observation data (Garet et al., 2017; Steinberg & Sartain, 2015). It is important to note that various measures of teaching quality, such as classroom observation, value-added scores, and student survey measures often assess different aspects of teaching. However, the use of achievement data such as student test scores or teachers’ value-added scores to measure teaching effectiveness has well-documented problems. Among these are technical issues of how estimates are derived, the non-random assignment of students to teachers, the limited availability of tests in all subjects and grades, and the inability to provide useful feedback to teachers to guide instructional improvement (Briggs & Domingue, 2011; Hallinger et al., 2014; Hill et al., 2011; Rothstein, 2010). As a result, organizations such as the National Association of Secondary School Principals and the American Statistical Association have taken a stand against value-added measures.
Student ratings have several advantages. They are based on more data points over time and aggregated across many students, which helps attenuate measurement error. In contrast, classroom observations are necessarily limited to brief snapshots of teaching, and the presence of an observer can alter classroom dynamics. Students have first-hand experience with teachers’ instructional practices and can provide information not available to other observers. As a result of these advantages, student ratings are generally reliable and predictive of student learning (Tsai et al., 2022; Downer et al., 2015; Feldlaufer et al., 1988; Fraser & O'Brien, 1985; Kane & Staiger, 2012; Little et al., 2009; Marsh et al., 2019; Polikoff, 2015). For example, the Measures of Effective Teaching study found that student ratings of teachers predicted students’ growth in test scores and that they are often more reliable than achievement data or classroom observation data (Kane & Staiger, 2012). Recognizing these advantages, many states and districts have included student surveys as a component of their teacher evaluation systems (e.g., Missouri, Memphis, Chicago, Pittsburgh, and the New Teacher Project). In the coming years, we expect to see an increase in educational research that uses student survey data to measure instruction quality.
2 Study context: network for educator effectiveness
Our data came from a teacher evaluation system called the Network for Educator Effectiveness (NEE; www.neeadvantage.com), which is used in 295 districts in Missouri enrolling a total of 35,683 teachers and 360,056 students. NEE is a comprehensive teacher evaluation system with the mission of promoting growth in teaching effectiveness. NEE meets the recommendations for an optimal evaluation system, according to Darling-Hammond et al. (2012), because it collects multiple sources of data through multiple observations by trained evaluators who provide meaningful and timely feedback (Jones & Bergin, 2019).
2.1 Classroom observations and feedback sessions
Principals score teachers based on the NEE Classroom Observation (NEE-CO) rubric, which was designed for use in preK-12 classrooms across all subjects. The NEE-CO rubric is based on the nationally recognized InTASC Standards for Teaching (Council of Chief State School Officers, 2011) as adapted by the Missouri Department of Elementary and Secondary Education. It includes 27 observable teaching practices, such as promoting students’ cognitive engagement (CE) and problem-solving and critical thinking (PCT). Districts select only four to six teaching practices to focus on, based on their priorities. The NEE-CO rubric is analytic, rather than holistic, which means principals award separate scores for each teaching practice, so a teacher may score a “2” on Problem-Solving and Critical Thinking but a “6” on Cognitive Engagement. Each teaching practice is scored on a scale ranging from “0” (not present) to “7” (perfect exemplar) based on evidence the principal observes (Li & Baker, 2018). Higher scores on the rubric reflect a greater frequency of high-quality instructional practices involving more students (i.e., “almost all” versus “only a few” students) when observed during a teaching session. NEE also provides principals with language to use when providing feedback to teachers. For example, for Cognitive Engagement, examples of feedback include the teacher “reviews frequently and spirals content” and “consistently encourages extension of discovery/play.”
NEE recommends that principals make several brief (approximately 10 min) unannounced visits every year for every teacher, scoring the focal four to six teaching practices per visit. The principal is expected to have a feedback conversation with the teacher within a few days of each visit. The scores are used formatively by principals and teachers to discuss each targeted teaching practice. A summative report is generated for every teacher at the end of each year, where a summative classroom observation score, a student survey score (if available), and written comments from the principal are included.
In this study, we focus on two teaching practices: promotion of cognitive engagement (CE) and promotion of problem solving and critical thinking (PCT). We chose CE and PCT because (1) they are high-impact teaching practices that predict student learning, and (2) they were the most frequently assessed teaching practices by NEE school districts, meaning that many districts prioritized improving these teaching practices.
2.1.1 Cognitive engagement (CE)
In the NEE system, Cognitive Engagement refers to active mental involvement by students in learning activities, such as meaningful processing, strategy use, concentration, and metacognition (Wang et al., 2014; Fredricks et al., 2004; Wang & Degol, 2014). It predicts students’ well-being and academic achievement (Li & Baker, 2018; Fredricks et al., 2004; Metallidou & Vlachou, 2007; Pietarinen et al., 2014; Reyes et al., 2012). Teachers can promote CE by using advance organizers, KWL (Know, Want, Learned) charts, share-outs, shoulder-partner talk, and authentic examples, as well as by connecting instruction/activities with students’ lives, showing relevance, presenting a puzzling problem, and inviting responses from all students.
2.1.2 Problem solving and critical thinking (PCT)
In the NEE system, Problem Solving and Critical Thinking refer to students skillfully applying, analyzing, synthesizing, and evaluating information to reach a conclusion or solve a problem (McCormick et al., 2015). PCT predicts students’ achievement across age groups and subject areas (Fong et al., 2017; Giancarlo et al., 2004; Von Secker & Lissitz, 1999; Yu et al., 2010). Teachers can promote PCT by having students explain or justify their thinking, evaluate others’ thinking, formulate challenging questions, make predictions, develop creative solutions, determine what makes an argument valid, assess possible solutions, categorize problems, and map concepts (Wirkala & Kuhn, 2011).
2.2 Principal training
Principals are required to attend a 3-day face-to-face certification training before they are given access to the NEE system. Principals’ training focuses on how to use the NEE-CO rubric for accurate and consistent scoring as well as how to provide constructive feedback based on classroom observations. Principals return annually for a one-day, face-to-face re-certification training. Principals are given an end-of-training video-based test of scoring accuracy. Those who receive low scores can still conduct in-field observations, but they are required to use additional training materials and are provided with in-field support.
The training follows three approaches to increase accuracy of observation ratings (Chafouleas, 2011; Woehr & Huffcutt, 1994). First, in the “rater error” training approach, raters learn to recognize and avoid leniency and halo errors and to use the full scale. Raters are trained to begin with a rating of “3” and then only move up or down the scale if the evidence clearly justifies doing so. Second, in the “performance dimension” training approach, raters learn to understand specific teaching practices through didactic research review and discussion. Third, in the “practice-with-feedback” training approach, raters watch and rate selected videos of authentic classes. They then share their ratings and evidence in small-group and whole-group discussions. Discrepancies are then discussed with trainer guidance. A previous study found that trained principals using the NEE-CO rubric successfully differentiate among teachers (Jones & Bergin, 2019).
3 Method
3.1 Participants
We focused on districts where students rated teachers on CE and/or PCT in the 2017–2018 and 2018–2019 school years. We used two separate samples based on the student survey outcome (CE or PCT), with 617 overlapping cases. The CE sample included 81 schools, 1010 teachers, and 43,318 student surveys. The PCT sample included 103 schools, 1214 teachers, and 50,132 student surveys. Teachers’ scores are the mean aggregation of their students’ ratings. In smaller buildings, teachers were typically rated by all of their students; due to logistical issues, not all students in larger buildings rated their teachers. In those cases, student raters were randomly selected from a set of students determined by building leaders.
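The mean aggregation of student ratings to teacher-level scores is a simple groupby-mean step; the sketch below illustrates it with hypothetical column names, not NEE's actual data schema:

```python
import pandas as pd

# Hypothetical long-format survey data: one row per student rating
ratings = pd.DataFrame({
    "teacher_id": [1, 1, 1, 2, 2],
    "ce_score":   [2.5, 3.0, 2.0, 1.5, 2.5],
})

# Teacher-level score = mean of that teacher's student ratings
teacher_scores = ratings.groupby("teacher_id")["ce_score"].mean()
print(teacher_scores)
```

Averaging over many raters in this way attenuates the idiosyncratic error in any single student's rating.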
Participating districts in Missouri were diverse, serving both high- and low-income students in urban, suburban, and rural areas. Across the state, 73.0% of students were non-Hispanic White; 50.0% were eligible for free/reduced-price lunch; and the mean proficiency rate on the English language arts (ELA) and math state tests was 45.3%. Our sample schools were representative of Missouri schools, with 78.6% of students being non-Hispanic White, 37.4% eligible for free/reduced-price lunch, and an average ELA and mathematics proficiency rate of 50.6%, slightly better than the state average.
3.2 Procedures and measures
3.2.1 Feedback characteristics—teacher survey
NEE administers the teacher surveys (NEE-TS) online at the end of each school year. Teachers were able to access the survey when they signed into the online NEE portal. They were notified at the beginning of the survey that their responses were “completely anonymous and no one in the district can view individual responses from any teacher.”
NEE-TS asks teachers about their perceptions of school leadership. Items are based on the ten Professional Standards for Educational Leaders (PSEL; Reston, 2015) and the research literature on the characteristics of effective feedback (e.g., Cherasaro et al., 2015; Cherasaro et al., 2016; Feeney, 2007; Kraft et al., 2018; Ovando, 2005; Reinhorn et al., 2017; Wayne et al., 2016).
Feedback characteristics were measured by five items on NEE-TS (Cronbach’s alpha = 0.92) and rated on a four-point scale from “0” (strongly disagree) to “3” (strongly agree). Teachers’ responses to these five items were included in the models separately. The items include “This principal typically provides me with face-to-face feedback within two working days of observing my classroom;” “This principal provides specific feedback to me regarding ways my teaching can improve (i.e., focused, detailed, concrete);” “This principal provides useful and relevant feedback to me regarding ways my teaching can improve;” “This principal provides specific feedback to me regarding areas of strength in my teaching (i.e., focused, detailed, concrete);” and “This principal provides useful and relevant feedback to me regarding areas of strength in my teaching.”
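For reference, the internal consistency reported for these items (Cronbach's alpha) can be computed from an item-response matrix as follows. This is a generic sketch with simulated 0–3 responses, not the authors' data or code:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)         # variance of total score
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)

# Illustrative: 200 teachers answering five 0-3 Likert items that share
# a common component, mimicking correlated feedback-quality items
rng = np.random.default_rng(0)
base = rng.integers(0, 4, size=(200, 1))
noise = rng.integers(-1, 2, size=(200, 5))
responses = np.clip(base + noise, 0, 3)
print(cronbach_alpha(responses))
```

When all items are perfectly correlated, the formula returns exactly 1.0, which is a convenient sanity check.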
3.2.2 Quality of instructional practices—student survey
NEE administers the student surveys (TESS) online at the end of each school year to students in 4th through 12th grades. The online survey interface was designed to be intuitive and easy for students to use. Students complete the survey on internet-enabled devices during a specified time window. An adult other than the evaluated teacher administers the survey using standard scripts provided by NEE. Students are assured that their responses are anonymous and voluntary. Although NEE is not able to collect information on incomplete surveys, according to principals very few students refuse to evaluate their teachers. Students were encouraged to ask questions or request definitions of difficult words, but to avoid influencing students’ responses the proctors were instructed not to interpret any survey items for the students.
The TESS is based on the same InTASC standards as the NEE-CO rubric and includes 25 teaching practices observable by students. Teaching effectiveness for cognitive engagement (CE) was measured by four items (Cronbach’s alpha = 0.91) on a 4-point scale from 0 (not at all true) to 3 (very true). Higher scores indicate that students report higher-quality teaching. The items include “this teacher expects us to think a lot and concentrate in this class,” “this teacher’s lessons make us think deeply,” “this teacher’s lessons make us think the whole class time,” and “this teacher wants us to ask questions during lessons.” Teaching effectiveness for PCT was also measured by four items (Cronbach’s alpha = 0.90). The items include “this teacher waits a while before letting us answer questions, so we have time to think,” “this teacher makes us compare different ideas or things,” “this teacher makes us use what we learn to come up with ways to solve problems,” and “this teacher asks ‘how?’ and ‘why?’ questions to make us think more.” Previous research shows that CE and PCT can be distinguished with TESS by 4th graders and older (Tsai et al., 2022) and that both predict students’ performance on state proficiency tests (Li, 2022). Moreover, neither measure has shown bias from illusory halo effects or from teacher-student relationships (Li et al., 2022).
3.2.3 Covariates
Teachers’ general teaching effectiveness and personal attributes could be correlated with both feedback quality and their CE and PCT ratings, which could introduce bias into our estimation (Frank, 2000). Hunter and Springer (2022) noted that it is plausible that principals adjust their feedback quality for teachers with different backgrounds. It is also plausible that teachers perceive the feedback differently based on their teaching experiences. To mitigate this concern, we included TESS scores in the 2017–2018 school year (i.e., pre-measure) as a control for such potential confounders to reduce potential bias. Pre-measures are regarded as one of the most powerful ways to control for unobservables in observational studies (Cook et al., 2008). Additional covariates include teachers’ years of experience, grade level taught, and a binary indicator of whether a teacher taught a subject tested by state-standardized tests (i.e., mathematics, English language arts, science, or social studies).
An additional covariate is principals’ support for professional development (PD). Support for PD may play a role in improving teachers’ instructional practices and shaping their perceptions of feedback quality (Kim et al., 2019). We used a latent factor score calculated based on four items from the teacher survey that measure principals’ support for comprehensive professional development for staff using a four-point scale from “0” (strongly disagree) to “3” (strongly agree). The survey items include “This principal provides or locates resources I need for my teaching;” “This principal provides me with valid and meaningful professional development opportunities;” “This principal monitors the application of professional development in my instruction;” and “This principal knows what professional development I need” (Cronbach’s alpha = 0.89).
Finally, we included school characteristics because teacher evaluation implementation can be affected by school context (Cohen et al., 2020; Marsh et al., 2017). Specifically, we included the percentage of students eligible for free or reduced-price lunch, total student enrollment, the percentage of White students, and the percentage of students with Individualized Education Programs (IEPs) at each school. We approximated students’ overall achievement levels using their proficiency rate on the Missouri Assessment Program (i.e., the average percentage of students at or above the proficient level at the school). These data are school-level because the student survey is anonymous; thus, we did not have student- or classroom-level demographics.
3.3 Analysis
We used two-level hierarchical linear modeling (HLM) in which teachers are nested in school buildings. The student level was not included because CE and PCT measures were aggregated at the teacher level. We ran the analysis for the CE and PCT samples separately, beginning with an unconditional model that included only the outcome variable and the residuals. The second set of models then added the five feedback quality items, entered one at a time at the individual teacher level.
Level 1 (teacher \(i\) in school \(j\)):

\(\mathrm{TeachingPractice\_19}_{ij}=\beta_{0j}+\beta_{1j}\,\mathrm{TeachingPractice\_18}_{ij}+\beta_{2j}\,\mathrm{FeedbackQuality}_{ij}+\beta_{3j}\,\mathrm{PD}_{ij}+\beta_{4j}\,\mathrm{TeachingGrade}_{ij}+\beta_{5j}\,\mathrm{TestedSub}_{ij}+\beta_{6j}\,\mathrm{Exp}_{ij}+r_{ij}\)

Level 2 (school \(j\)):

\(\beta_{0j}=\gamma_{00}+\gamma_{01}\,\mathrm{FRL}_{j}+\gamma_{02}\,\mathrm{Enrollment}_{j}+\gamma_{03}\,\mathrm{WhitePCT}_{j}+\gamma_{04}\,\mathrm{ProfRate}_{j}+\gamma_{05}\,\mathrm{IEPRate}_{j}+\gamma_{06}\,\mathrm{FeedbackQuality}_{j}+u_{0j}\)

\(\beta_{kj}=\gamma_{k0}\) for \(k=1,\dots,6\) (slopes fixed across schools)

where
- TeachingPractice_19ij: TESS teaching practice scores in CE or PCT of teacher i in school j in the 2018–2019 school year; factor scores based on the means of students’ responses
- TeachingPractice_18ij: TESS teaching practice scores in CE or PCT of teacher i in school j in the 2017–2018 school year; factor scores based on the means of students’ responses (i.e., pre-measure)
- FeedbackQualityij: Teacher i’s responses to the five survey items regarding feedback quality in 2017–2018 (entered into the models separately)
- PDij: NEE-TS factor scores of the four professional development support items
- TeachingGradeij: The mean grade level reported by students, minus four, so that 4th grade (the lowest grade in the samples) is recoded as zero and the intercepts represent teachers rated by 4th graders
- TestedSubij: A binary variable indicating whether the teacher taught one of the four subjects tested by the state-standardized tests (i.e., ELA, mathematics, social studies, or science)
- Expij: Teacher i’s reported years of teaching experience
- FRLj: The school-level percentage of students eligible for free or reduced-price lunch in 2019, provided by the Missouri Department of Elementary and Secondary Education (DESE); missing values in the 2019 DESE report were imputed using the most recent available data from previous years
- Enrollmentj: The school’s K-12 enrollment, obtained from DESE
- WhitePCTj: The school-level percentage of White students, obtained from DESE; missing values in the 2019 DESE report were imputed using the most recent available data from previous years
- ProfRatej: The average of the school-level ELA and mathematics proficiency rates on the state test (Missouri Assessment Program), obtained from DESE
- IEPRatej: The school-level incidence rate of IEPs, obtained from DESE
- FeedbackQualityj: School means of teachers’ responses to the five principal feedback quality items
- rij: Level 1 residuals
- u0j: Level 2 residuals
All Level 2 variables are grand mean centered. Parameter tests were based on cluster-robust standard errors computed in HLM 8, which adjust for bias due to the dependence of observations within clusters (Esarey & Menger, 2019). Table 1 summarizes the descriptive statistics of the variables included in the analysis for the CE and PCT samples, respectively. The two samples of teachers are largely comparable, although teachers in the PCT sample tended to work at slightly smaller schools with slightly more White students and students with IEPs.
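As a concrete (and entirely hypothetical) illustration of this model structure, the sketch below fits a two-level random-intercept model with a grand-mean-centered school-level covariate to synthetic teacher-in-school data using `statsmodels`. The variable names, effect sizes, and sample sizes are invented, and statsmodels’ default model-based standard errors stand in for HLM 8’s cluster-robust ones:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Hypothetical data: 400 teachers (level 1) nested in 40 schools (level 2).
n_schools, per_school = 40, 10
school = np.repeat(np.arange(n_schools), per_school)
u0 = rng.normal(0, 0.3, n_schools)[school]  # school-level random intercepts

df = pd.DataFrame({
    "school": school,
    "practice_18": rng.normal(0, 1, n_schools * per_school),  # pre-measure
    "feedback": rng.normal(0, 1, n_schools * per_school),     # perceived feedback quality
    "frl": rng.uniform(0, 100, n_schools)[school],            # school-level % FRL
})
# True (made-up) effects: 0.5 for the pre-measure, 0.1 for feedback quality.
df["practice_19"] = (0.5 * df["practice_18"] + 0.1 * df["feedback"]
                     + u0 + rng.normal(0, 1, len(df)))

# Grand-mean center the level 2 covariate, as the paper does.
df["frl_c"] = df["frl"] - df["frl"].mean()

# Random-intercept model: teachers (rows) grouped by school.
m = smf.mixedlm("practice_19 ~ practice_18 + feedback + frl_c",
                data=df, groups=df["school"]).fit()
print(m.params[["practice_18", "feedback", "frl_c"]])
```

With these simulated data the estimated fixed effects recover the generating values (about 0.5 and 0.1) within sampling error.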
4 Results
Before moving to our main HLM models, we fit unconditional models for the outcome variable—teaching effectiveness measured at the end of the 2018–2019 school year—for the CE and PCT samples (Table 2). For both samples, the level 2 (school-level) random effects were significant, and the ICCs (intraclass correlation coefficients) were 0.120 and 0.141, respectively, which justifies modeling the school level when analyzing the outcomes.
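For reference, the ICC from an unconditional model is the proportion of outcome variance lying between schools, \(ICC=\tau_{00}/(\tau_{00}+\sigma^{2})\), where \(\tau_{00}\) is the level 2 intercept variance and \(\sigma^{2}\) the level 1 residual variance. A minimal sketch, with variance components invented so that they reproduce the reported ICCs (they are not the study’s actual estimates):

```python
def icc(tau00: float, sigma2: float) -> float:
    """Intraclass correlation: between-school variance over total variance."""
    return tau00 / (tau00 + sigma2)

# Hypothetical variance components scaled so total variance is 1.0,
# chosen to reproduce the reported ICCs for the CE and PCT samples.
print(icc(0.120, 0.880))  # 0.12  (CE)
print(icc(0.141, 0.859))  # 0.141 (PCT)
```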
Tables 3 and 4 present the findings from the CE and PCT samples, respectively. In each table, models 1 through 5 enter the individual survey items one at a time.
For both the CE and PCT samples, some aspects of teachers’ perceived feedback quality were related to teachers’ instructional practices after controlling for their pre-measures and other individual teacher- and school-level characteristics. Specifically, for both outcomes, feedback focused on strengths had positive, significant associations with students’ ratings of teachers’ instructional practices. For the CE sample, when a teacher perceived that the principal’s feedback was specific regarding their strengths in teaching, the teacher tended to involve more students in learning activities. This association \(\left(\beta =.12\right)\) was stronger than the one between teachers’ perceptions of feedback being useful and relevant regarding strengths and their CE practice \(\left(\beta =.09\right).\) Similarly, there were significant associations between teachers’ perceptions of feedback being specific, useful, and relevant regarding strengths in teaching and how actively students were involved in solving problems using various strategies and critical thinking during class. Again, teachers’ perceptions of feedback being specific had a stronger relationship with the practice \(\left(\beta =.15\right)\) than their perceptions of feedback being useful and relevant regarding strengths in teaching \(\left(\beta =.08\right)\). Another important finding is that for the PCT sample, specific feedback focused on improving teaching was also significantly associated with instructional practice \(\left(\beta =.10\right)\). Contrary to our expectations based on previous studies, face-to-face and immediate feedback and useful and relevant feedback regarding improving teaching were not associated with teachers’ instructional practices.
As expected, the pre-measure of teaching effectiveness had a strong association with the outcome measures. Other covariates at the individual teacher level also had significant associations with the outcome variables: teachers who taught higher grades or a subject tested on the state exam, and those with more teaching experience, tended to receive higher student ratings of instruction quality. In contrast, none of the school-level covariates (the percentages of students who are White, have an IEP, or are eligible for free or reduced-price lunch) were significant in any of the models.
5 Discussion
Teacher evaluation is a potentially powerful tool for improving teaching effectiveness. Therefore, it is logical that evaluation policies have been at the center of educational reform efforts. However, research suggests that the effects of implementing teacher evaluation systems have been mixed (Goldhaber, 2015; Stecher et al., 2018; Taylor & Tyler, 2012). There are likely multiple explanations for the uneven effects of teacher evaluation on teaching effectiveness, but our findings suggest that variation in the quality of principals’ feedback following classroom observations may be one of them. We found that some characteristics of feedback are associated with instruction quality. Based on our theoretical framework, we focused on five characteristics of feedback that are likely to shape recipients’ responses: (1) specific regarding strengths, (2) specific regarding ways to improve teaching, (3) useful and relevant regarding strengths, (4) useful and relevant regarding ways to improve teaching, and (5) face-to-face and immediate.
We found that specific (i.e., focused, detailed, concrete) feedback that focused on existing strengths of teachers was more strongly associated with instruction quality than specific feedback that focused on ways to improve teaching. In addition, teachers’ perceptions of whether feedback was relevant and useful when focused on strengths were more strongly associated with instruction quality than when focused on ways to improve teaching. These findings are consistent with previous studies that support the effects of positive feedback (e.g., Kinicki et al., 2004; Scheeler et al., 2004; Sleiman et al., 2020; Thurlings et al., 2013). That is, teachers were more likely to engage in high-quality instruction after receiving confirmation that they were doing a good job, as it raised their expectancies for success.
Another key finding is that specific feedback regarding ways to improve teaching was significantly associated with higher ratings of the use of problem-solving and critical-thinking instructional practices, although there was no association with cognitive engagement instructional practices. That is, the effect of one characteristic of effective feedback varied by teaching practice. While it is logical that specific suggestions from principals on how to improve in any teaching practice might be effective, it is most likely that it would have an effect on difficult instructional practices. Promotion of critical thinking is a complex, advanced instructional practice that is challenging to implement consistently in typical classrooms (Fox, 1962; Van der Lans et al., 2018). Of course, the promotion of critical thinking is not always appropriate because there are times when students should be practicing and over-learning skills that are foundational to critical thinking, yet this challenging teaching practice is too rare in classrooms (Willingham, 2008).
Teachers’ perceptions of whether feedback was useful and relevant regarding ways to improve were not related to the quality of instruction in any aspect. A plausible explanation is that even if a teacher believes feedback is useful and relevant, feedback related to their weakness is more challenging to translate into practice. This obstacle to benefiting from negative feedback, compared to positive, has been documented in feedback research in other non-educational settings (Audia & Locke, 2003). Another plausible explanation, aligned with our hypotheses and theoretical framework (Fig. 1), is that useful and relevant feedback regarding ways to improve may not have been as effective in terms of answering the questions “where am I going, how am I doing, where do I go next,” compared to specific feedback regarding ways to improve. Certainly, we need more research on these nuanced aspects of feedback.
Our findings provide important practical implications for training principals as evaluators. First, teachers’ perceptions of feedback quality do predict how well they teach based on aggregated student perceptions. Thus, providing high-quality feedback should be a priority among principals. Second, principals’ focus on teachers’ strengths is more likely to improve their instruction quality than principals’ focus on ways to improve teaching. Thus, principals may want to prioritize feedback on teachers’ strengths, but also add specific feedback on ways to improve teaching when evaluating especially complex, challenging-to-implement instructional practices.
An unexpected finding was that face-to-face and immediate feedback was not associated with higher ratings of either teaching practice, although previous research suggested these are useful characteristics of feedback (Hunter & Springer, 2022; Kraft & Christian, 2019). One explanation for these conflicting results is that our study used student ratings to measure instruction quality, whereas most previous studies used teacher report, which may be influenced by source bias. This is important because (as discussed above) teachers who have positive views of their evaluation may rate both the quality of feedback and the impact of the evaluation positively, regardless of the objective quality of the feedback or actual improvement in instruction.
Another explanation is that in the Network for Educator Effectiveness, every teacher receives feedback multiple times per year. When the evaluation context is so normative and familiar, having feedback be face-to-face and immediate may be less important. This finding has practical implications because it may be more cost-effective to communicate feedback via written notes and it gives busy principals more flexibility for scheduling feedback sessions. Further research is needed, but our study expands the current discourse around feedback quality by adding a critical source of data—student ratings—about instruction quality.
5.1 Limitations
The findings from this study have some limitations. First, we focused on teachers’ perceptions of feedback but did not have access to the content of the feedback, which presumably also influences teaching improvement. However, given that the perceptions of the recipients of feedback affect whether they take actions on that feedback, we think this analysis was an important first step.
Second, we cannot fully rule out the possibility of a potential selection bias because the districts in the samples are self-selected based on their simultaneous use of both the TESS and the NEE-TS. These districts may have more resources and/or place a higher priority on teacher growth and student achievement compared to other districts.
A third limitation of the study is that we focused on two instructional practices: CE and PCT. This decision was made based on previous literature that found these teaching practices to affect student learning and based on the fact that many districts focused on these instructional practices. However, it is possible that feedback quality affects other dimensions of instruction differently, such as teacher-student relationships, classroom management, and the use of motivational strategies. Examining the effects of feedback on different aspects of instruction quality will deepen our understanding of the teacher evaluation feedback process.
6 Conclusion
In the current study, we found that three characteristics of principals’ evaluative feedback were associated with teachers’ increased instruction quality, as measured by student ratings, after controlling for previous instruction quality. Focusing on strengths, but not on ways to improve teaching, predicted higher quality of one instructional practice—promoting cognitive engagement. However, focusing on both strengths and ways to improve teaching predicted higher quality of another instructional practice—promoting critical thinking. Promotion of critical thinking is an infrequent, complex teaching practice that may benefit more from information on how to improve. Finally, we found that feedback did not need to be immediate and face-to-face to predict improvement in either instructional practice. Both evaluating teachers and training principals to be effective evaluators are resource-intensive activities. Thus, it is important for educational researchers and policymakers to better understand what constitutes effective evaluative feedback and to refine the feedback process so as to maximize benefits for teachers and their students.
Data availability
The datasets analyzed during the current study are not publicly available because the data were collected from students and teachers under the agreement between the Network for Educator Effectiveness (neeadvantage.com) and its member districts. The authors of the study can answer questions about the data or provide further analysis of the data on reasonable request.
References
Alvero, A. M., Bucklin, B. R., & Austin, J. (2001). An objective review of the effectiveness and essential characteristics of performance feedback in organizational settings (1985–1998). Journal of Organizational Behavior Management, 21(1), 3–29.
Ammons, R. B. (1956). Effects of knowledge of performance: A survey and tentative theoretical formulation. The Journal of General Psychology, 54(2), 279–299.
Audia, P. G., & Locke, E. A. (2003). Benefiting from negative feedback. Human Resource Management Review, 13(4), 631–646.
Bandura, A. (1991). Social cognitive theory of self-regulation. Organizational Behavior and Human Decision Processes, 50, 248–287.
Briggs, D., & Domingue, B. (2011). A review of the value-added analysis underlying the effectiveness rankings of Los Angeles Unified School District teachers by the Los Angeles Times. National Education Policy Center, University of Colorado.
Chafouleas, S. M. (2011). Direct behavior rating: A review of the issues and research in its development. Education & Treatment of Children, 34(4), 575–591.
Cherasaro, T. L., Brodersen, R. M., Yanoski, D. C., Welp, L. C., & Reale, M. L. (2015). The examining evaluator feedback survey. REL 2016–100. Regional Educational Laboratory Central.
Cherasaro, T. L., Brodersen, R. M., Reale, M. L., & Yanoski, D. C. (2016). Teachers’ responses to feedback from evaluators: What feedback characteristics matter? (REL 2017–190). U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Central. Retrieved from http://ies.ed.gov/ncee/edlabs
Cohen, J., Loeb, S., Miller, L. C., & Wyckoff, J. H. (2020). Policy implementation, principal agency, and strategic action: Improving teaching effectiveness in New York City middle schools. Educational Evaluation and Policy Analysis, 42(1), 134–160.
Cook, T. D., Shadish, W. R., & Wong, V. C. (2008). Three conditions under which experiments and observational studies produce comparable causal estimates: New findings from within-study comparisons. Journal of Policy Analysis and Management: The Journal of the Association for Public Policy Analysis and Management, 27(4), 724–750.
Council of Chief State School Officers. (2011, April). Interstate teacher assessment and support consortium (InTASC) model core teaching standards: A resource for state dialogue. Author.
Curtis, R., & Wiener, R. (2012). Means to an end: A guide to developing teacher evaluation systems that support growth and development. Aspen Institute.
Danielson, C. (1996). Enhancing professional practice: A framework for teaching. Association for Supervision and Curriculum Development.
Danielson, C., & McGreal, T. L. (2000). Teacher evaluation to enhance professional practice. Association for Supervision and Curriculum Development.
Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher evaluation. Phi Delta Kappan, 93(6), 8–15.
Darling-Hammond, L., Bae, S., Cook-Harvey, C. M., Lam, L., Mercer, C., Podolsky, A., & Stosich, E. L. (2016). Pathways to new accountability through the Every Student Succeeds Act. Learning Policy Institute.
Delvaux, E., Vanhoof, J., Tuytens, M., Vekeman, E., Devos, G., & Van Petegem, P. (2013). How may teacher evaluations have an impact on professional development? A multilevel analysis. Teaching and Teacher Education, 36, 1–11.
Doherty, K. M., & Jacobs, S. (2013). State of the states 2013: Connect the dots--using evaluations of teacher effectiveness to inform policy and practice. National Council on Teacher Quality.
Donaldson, M. (2021). Multidisciplinary perspectives on teacher evaluation: Understanding the research and theory. Routledge.
Donaldson, M. L., & Papay, J. P. (2015). Teacher evaluation for accountability and development. In H. F. Ladd, & M. E. Goertz (Eds.), Handbook of research in education, finance and, policy (2nd ed., pp. 174–193). Routledge.
Downer, J. T., Stuhlman, M., Schweig, J., Martínez, J. F., & Ruzek, E. (2015). Measuring effective teacher-student interactions from a student perspective: A multi-level analysis. The Journal of Early Adolescence, 35(5–6), 722–758.
Duff, M., & Wohlstetter, P. (2019). Negotiating intergovernmental relations under ESSA. Educational Researcher, 48(5), 296–308.
Eccles (Parsons), J. S., Adler, T. F., Futterman, R., Goff, S. B., Kaczala, C. M., Meece, J. L., et al. (1983). Expectancies, values, and academic behaviors. In J. T. Spence (Ed.). Achievement and achievement motivation (pp. 75–146). W. H. Freeman.
Egalite, A. J., Fusarelli, L. D., & Fusarelli, B. C. (2017). Will decentralization affect educational inequity? The Every Student Succeeds Act. Educational Administration Quarterly, 53(5), 757–781.
Esarey, J., & Menger, A. (2019). Practical and effective approaches to dealing with clustered data. Political Science Research and Methods, 7(3), 541–559.
Feeney, E. J. (2007). Quality feedback: The essential ingredient for teacher success. The Clearing House: A Journal of Educational Strategies, Issues and Ideas, 80(4), 191–198.
Feldlaufer, H., Midgley, C., & Eccles, J. S. (1988). Student, teacher, and observer perceptions of the classroom environment before and after the transition to junior high school. The Journal of Early Adolescence, 8(2), 133–156.
Fong, C. J., Kim, Y., Davis, C. W., Hoang, T., & Kim, Y. W. (2017). A meta-analysis on critical thinking and community college student achievement. Thinking Skills and Creativity, 26, 71–83.
Fox, R. B. (1962). Difficulties in developing skill in critical thinking. The Journal of Educational Research, 55(7), 335–337.
Frank, K. A. (2000). Impact of a confounding variable on a regression coefficient. Sociological Methods & Research, 29(2), 147–194.
Fraser, B. J., & O’Brien, P. (1985). Student and teacher perceptions of the environment of elementary school classrooms. The Elementary School Journal, 85(5), 567–580.
Fredricks, J. A., Blumenfeld, P. C., & Paris, A. H. (2004). School engagement: Potential of the concept, state of the evidence. Review of Educational Research, 74(1), 59–109.
Garet, M. S., Wayne, A. J., Brown, S., Rickles, J., Song, M., & Manzeske, D. (2017). The impact of providing performance feedback to teachers and principals. NCEE 2018–4001. National Center for Education Evaluation and Regional Assistance.
Giancarlo, C. A., Blohm, S. W., & Urdan, T. (2004). Assessing secondary students’ disposition toward critical thinking: Development of the California measure of mental motivation. Educational and Psychological Measurement, 64(2), 347–364.
Glickman, C. D. (2002). Leadership for learning: How to help teachers succeed. ASCD.
Goldhaber, D. (2015). Exploring the potential of value-added performance measures to affect the quality of the teacher workforce. Educational Researcher, 44(2), 87–95.
Hallinger, P., Heck, R. H., & Murphy, J. (2014). Teacher evaluation and school improvement: An analysis of the evidence. Educational Assessment, Evaluation and Accountability, 26(1), 1–24.
Halverson, R., Kelley, C., & Kimball, S. (2004). Implementing teacher evaluation systems: How principals make sense of complex artifacts to shape local instructional practice. Educational administration, policy, and reform: Research and measurement (pp. 153–188).
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81–112.
Hill, H., & Grossman, P. (2013). Learning from teacher observations: Challenges and opportunities posed by new teacher evaluation systems. Harvard Educational Review, 83(2), 371–384.
Hill, H. C., Kapitula, L., & Umland, K. (2011). A validity argument approach to evaluating teacher value-added scores. American Educational Research Journal, 48(3), 794–831.
Hunter, S. B., & Springer, M. G. (2022). Critical feedback characteristics, teacher human capital, and early-career teacher performance: A mixed-methods analysis. Educational Evaluation and Policy Analysis, 44(3), 380–403.
Ilgen, D. R., Fisher, C. D., & Taylor, M. S. (1979). Consequences of individual feedback on behavior in organizations. Journal of Applied Psychology, 64(4), 349.
Jones, E., & Bergin, C. (2019). Evaluating teacher effectiveness using classroom observations: A Rasch analysis of the rater effects of principals. Educational Assessment, 24(2), 91–118. https://doi.org/10.1080/10627197.2018.1564272
Kane, T. J., & Staiger, D. O. (2012). Gathering feedback for teaching: Combining high-quality observations with student surveys and achievement gains. Research Paper. Measures of Effective Teaching Project. Bill & Melinda Gates Foundation.
Kim, J., Sun, M., & Youngs, P. (2019). Developing the “will”: The relationship between teachers’ perceived policy legitimacy and instructional improvement. Teachers College Record, 121(3), 1–44.
Kinicki, A. J., Prussia, G. E., Wu, B. J., & McKee-Ryan, F. M. (2004). A covariance structure analysis of employees’ response to performance feedback. Journal of Applied Psychology, 89(6), 1057–1069.
Kluger, A. N., & DeNisi, A. (1996). The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin, 119(2), 254.
Kraft, M. A., Blazar, D., & Hogan, D. (2018). The effect of teacher coaching on instruction and achievement: A meta-analysis of the causal evidence. Review of Educational Research, 88(4), 547–588.
Kraft, M. A. & Christian, A. (2019). In search of high-quality evaluation feedback: An administrator training field experiment (EdWorkingPaper No.19–62). Retrieved from Annenberg Institute at Brown University: http://edworkingpapers.com/ai19-62
Lavigne, A. L., & Good, T. L. (2015). Improving teaching through observation and feedback: Beyond state and federal mandates. Routledge.
Lechermeier, J., & Fassnacht, M. (2018). How do performance feedback characteristics influence recipients’ reactions? A state-of-the-art review on feedback source, timing, and valence effects. Management Review Quarterly, 68(2), 145–193.
Li, Q., & Baker, R. (2018). The different relationships between engagement and outcomes across participant subgroups in Massive Open Online Courses. Computers & Education, 127, 41–65.
Li, X. (2022). Authentic teacher evaluation scores and student achievement: Comparing principals’ and students’ ratings. AEFP 47th Annual Conference, Denver, Colorado.
Li, X., et al. (2022). Positive teacher-student relationships may lead to better teaching. Learning and Instruction, 80, 101581. https://doi.org/10.1016/j.learninstruc.2022.101581
Little, O., Goe, L., & Bell, C. (2009). A practical guide to evaluating teacher effectiveness. National Comprehensive Center for Teacher Quality.
Locke, E. A. (1968). Toward a theory of task motivation and incentives. Organizational Behavior and Human Performance, 3(2), 157–189.
Locke, E. A., & Latham, G. P. (1990). A theory of goal setting & task performance. Prentice Hall.
Marsh, H. W., Dicke, T., & Pfeiffer, M. (2019). A tale of two quests: The (almost) non-overlapping research literatures on students’ evaluations of secondary-school and university teachers. Contemporary Educational Psychology, 58, 1–18.
Marsh, J. A., Bush-Mecenas, S., Strunk, K. O., Lincove, J. A., & Huguet, A. (2017). Evaluating teachers in the Big Easy: How organizational context shapes policy responses in New Orleans. Educational Evaluation and Policy Analysis, 39(4), 539–570.
Marzano, R. J., Pickering, D., & Pollock, J. E. (2001). Classroom instruction that works: Research-based strategies for increasing student achievement. Association for Supervision and Curriculum Development.
Marzano, R. J., & Toth, M. D. (2013). Teacher evaluation that makes a difference: A new model for teacher growth and student achievement. ASCD.
McCormick, N. J., Clark, L. M., & Raines, J. M. (2015). Engaging students in critical thinking and problem solving: A brief review of the literature. Journal of Studies in Education, 5(4), 100–113.
Metallidou, P., & Vlachou, A. (2007). Motivational beliefs, cognitive engagement, and achievement in language and mathematics in elementary school children. International Journal of Psychology, 42(1), 2–15.
Ovando, M. N. (2005). Building instructional leaders’ capacity to deliver constructive feedback to teachers. Journal of Personnel Evaluation in Education, 18, 171–183.
Phipps, A. R., & Wiseman, E. A. (2021). Enacting the rubric: Teacher improvements in windows of high-stakes observation. Education Finance and Policy, 16(2), 283–312.
Pietarinen, J., Soini, T., & Pyhältö, K. (2014). Students’ emotional and cognitive engagement as the determinants of well-being and achievement in school. International Journal of Educational Research, 67, 40–51.
Polikoff, M. S. (2015). The stability of observational and student survey measures of teaching effectiveness. American Journal of Education, 121(2), 183–212.
Reid, D., Galey-Horn, S., & Kim, J. (2020). How states use ESSA to support principal preparation, development, and quality. In P. Youngs, J. Kim, & M. Mavrogordato (Eds.), Exploring principal development and teacher outcomes: How principals can strengthen instruction, teacher retention, and student achievement (pp. 207–220). Routledge.
Reinhorn, S. K., Johnson, S. M., & Simon, N. S. (2017). Investing in development: Six high-performing, high-poverty schools implement the Massachusetts teacher evaluation policy. Educational Evaluation and Policy Analysis, 39(3), 383–406.
National Policy Board for Educational Administration. (2015). Professional standards for educational leaders. Reston, VA: Author.
Reyes, M. R., Brackett, M. A., Rivers, S. E., White, M., & Salovey, P. (2012). Classroom emotional climate, student engagement, and academic achievement. Journal of Educational Psychology, 104(3), 700–712.
Roberson, Q. M., & Stewart, M. M. (2006). Understanding the motivational effects of procedural and informational justice in feedback processes. British Journal of Psychology, 97(3), 281–298.
Rothstein, J. (2010). Teacher quality in educational production: Tracking, decay, and student achievement. Quarterly Journal of Economics, 125(1), 175–214.
Scheeler, M. C., Ruhl, K. L., & McAfee, J. K. (2004). Providing performance feedback to teachers: A review. Teacher Education and Special Education, 27(4), 396–407.
Shaked, H. (2018). Why principals often give overly high ratings on teacher evaluations. Studies in Educational Evaluation, 59, 150–157.
Shute, V. J. (2008). Focus on formative feedback. Review of Educational Research, 78(1), 153–189.
Sleiman, A. A., Sigurjonsdottir, S., Elnes, A., Gage, N. A., & Gravina, N. E. (2020). A quantitative review of performance feedback in organizational settings (1998–2018). Journal of Organizational Behavior Management, 40(3–4), 303–332.
Stecher, B. M., Holtzman, D. J., Garet, M. S., Hamilton, L. S., Engberg, J., Steiner, E. D., Robyn, A., Baird, M. D., Gutierrez, I. A., Peet, E. D., Brodziak de los Reyes, I., Fronberg, K., Weinberger, G., Hunter, G. P., & Chambers, J. (2018). Improving teaching effectiveness: Final report: The intensive partnerships for effective teaching through 2015–2016. RAND Corporation.
Steinberg, M. P., & Sartain, L. (2015). Does teacher evaluation improve school performance? Experimental evidence from Chicago’s Excellence in Teaching project. Education Finance and Policy, 10(4), 535–572.
Taylor, E. S., & Tyler, J. H. (2012). The effect of evaluation on teacher performance. American Economic Review, 102(7), 3628–3651.
Thorndike, E. L. (1913). Educational psychology: Vol. 1. The original nature of man. Teachers College, Columbia University.
Thurlings, M. C. G., Kreijns, K., Vermeulen, M., Bastiaens, T. J., & Stijnen, P. J. J. (2012). Development of the teacher feedback observation scheme: Evaluating the quality of feedback in peer groups. Journal of Education for Teaching, 38(2), 193–208.
Thurlings, M., Vermeulen, M., Bastiaens, T., & Stijnen, S. (2013). Understanding feedback: A learning theory perspective. Educational Research Review, 9, 1–15.
Toch, T., & Rothman, R. (2008). Rush to judgment: Teacher evaluation in public education. Education Sector Reports. Education Sector.
Tsai, C. L., Bergin, C., & Jones, E. (2022). Students in 4th to 12th grade can distinguish dimensions of teaching when evaluating their teachers: A multilevel analysis of the TESS survey. Educational Studies, 1–16.
Tuma, A. P., Hamilton, L. S., & Tsai, T. (2018). A nationwide look at teacher perceptions of feedback and evaluation systems. RAND Corporation. Retrieved from https://www.rand.org/content/dam/rand/pubs/research_reports/RR2500/RR2558/RAND_RR2558.pdf
Van der Lans, R. M., Van de Grift, W. J., & Van Veen, K. (2018). Developing an instrument for teacher feedback: Using the Rasch model to explore teachers’ development of effective teaching strategies and behaviors. The Journal of Experimental Education, 86(2), 247–264.
Von Secker, C. E., & Lissitz, R. W. (1999). Estimating the impact of instructional practices on student achievement in science. Journal of Research in Science Teaching, 36(10), 1110–1126.
Wang, M.-T., & Degol, J. (2014). Staying engaged: Knowledge and research needs in student engagement. Child Development Perspectives, 8(3), 137–143.
Wang, Z., Bergin, C., & Bergin, D. A. (2014). Measuring engagement in fourth to twelfth grade classrooms: The Classroom Engagement Inventory. School Psychology Quarterly, 29(4), 517–535.
Wayne, A. J., Garet, M. S., Brown, S., Rickles, J., Song, M., Manzeske, D., & Ali, M. (2016). Early implementation findings from a study of teacher and principal performance measurement and feedback year 1 report. Technical report, American Institutes of Research.
Willingham, D. T. (2008). Critical thinking: Why is it so hard to teach? Arts Education Policy Review, 109(4), 21–32.
Wirkala, C., & Kuhn, D. (2011). Problem-based learning in K–12 education: Is it effective and how does it achieve its effects? American Educational Research Journal, 48(5), 1157–1186.
Woehr, D. J., & Huffcutt, A. I. (1994). Rater training for performance appraisal: A quantitative review. Journal of Occupational and Organizational Psychology, 67(3), 189–205.
Yu, W. F., She, H. C., & Lee, Y. M. (2010). The effects of web-based/non-web-based problem-solving instruction and high/low achievement on students’ problem-solving ability and biology achievement. Innovations in Education and Teaching International, 47(2), 187–199.
Ethics declarations
Conflicts of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
NEE-CO and TESS are proprietary, which precludes the disclosure of all the items in the observation rubric or surveys.
Jihyun Kim and Xintong Li are co-equal first authors.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Kim, J., Li, X. & Bergin, C. Characteristics of effective feedback in teacher evaluation. Educ Asse Eval Acc 36, 201–223 (2024). https://doi.org/10.1007/s11092-024-09434-9