In this article in Educational Researcher, Angela Duckworth (University of Pennsylvania) and David Yeager (University of Texas at Austin) affirm the importance of noncognitive attributes, including:
- Goal-oriented effort through grit, self-control, and growth mindset;
- Healthy social relationships via gratitude, emotional intelligence, and social belonging;
- Sound judgment and decision-making marked by curiosity and open-mindedness.
“Longitudinal research has confirmed such qualities powerfully predict academic, economic, social, psychological, and physical well-being,” say Duckworth and Yeager.
However, they believe the ways of measuring these important attributes are not ready for prime time, and should not be used for consequential decisions about students or schools.
What’s the problem? Duckworth and Yeager have found that each of three methods currently used to measure noncognitive qualities – student self-reports, teacher questionnaires, and performance tasks – has strengths but also significant disadvantages. For self-reporting and questionnaires:
- Teachers and students may read or interpret an item in a way that differs from researcher intent.
- Students or teachers may not be astute or accurate reporters of behaviors, emotions, or motivation.
- Questionnaire scores may not reflect subtle changes over short periods of time.
- The frame of reference (i.e., implicit standards) used when making judgments may differ across students or teachers.
- Faking – students or teachers may provide answers that are desirable but not accurate.
With performance tasks, there’s a different set of problems:
- Researchers may make inaccurate assumptions about underlying reasons for student behavior.
- Tasks that optimize motivation to perform well may not reflect behavior in everyday situations.
- Task performance may be influenced by unrelated competencies (e.g., hand-eye coordination).
- Performance tasks may put students into situations (e.g., doing academic work with distracting video games in view) that they might avoid in real life.
- Scores on sequential administrations may be less accurate (e.g., because of increased familiarity with the task or boredom).
- Task performance may be influenced by aspects of the environment in which it is performed or by physiological state (e.g., time of day, noise in classrooms, hunger, fatigue).
- Scores may be influenced by purely random errors (e.g., a respondent marking the wrong answer).
Duckworth and Yeager give a vivid example of how a teacher’s and a student’s answer to a question might differ. The question: In the last month, how often does this student come to class prepared? The teacher’s thought process:
- Let’s see… I think he didn’t have his homework most days last week. He keeps making excuses. And he almost never brings a pencil.
- Okay, so overall, I would say that he comes to class prepared much less than most fifth graders I’ve taught.
- “Rarely” or “Sometimes” makes sense.
- Hmm… Nobody will see this but researchers. I’ll put down “Rarely.”
The student’s thought process:
- Let’s see… I think I didn’t have my homework a few times last week. But there were reasons why I couldn’t get it done.
- Okay, so overall, I guess I’m pretty good at coming to class prepared. Compared to my friends, I’m pretty good.
- “Sometimes” or “Often” makes sense.
- Hmmm… I guess it’s not too embarrassing to say “Sometimes.”
So the same student’s level of classroom preparation is rated “Rarely” on this question by the teacher and “Sometimes” by the student. Imprecise!
Duckworth and Yeager conclude that current tools for measuring students’ noncognitive attributes are useful for in-school reflection and improving educator practices, but are not precise and reliable enough to be used for individual student diagnosis, program evaluation, school accountability, or comparisons between schools or within schools over time. They close with a call for further research and refinement of measurement tools.