Monday, August 17, 2015

Educational Assessment: A Huge Waste of Time and Money?


An educational road trip
Imagine it’s 1980 - no World Wide Web, no cell phones, no GPS. Your child is learning to drive a car. They have to drive from Los Angeles to New York in time to attend an important event that could well influence the course of their future life. How would you help them do it? They’d need a long-range plan, of course – a map with a route marked out on it. But this plan alone wouldn’t get them there – they’d need to actively interpret the directions in the real world – identifying which of the many small streets is the right one to turn on, looking for signs and landmarks to know when to change lanes and prepare to exit the highway, constantly checking to make sure they didn’t take a wrong turn, and figuring out how to get back on track when they inevitably do. They must, in other words, constantly be assessing the situation – determining where they are on the map, where that puts them in relation to the route, and what to do at each moment to stay on track and on schedule.

This driving scenario is analogous to formal education. In this case, the subject matter (arithmetic, world history, etc.) is the map. The curriculum is the route marked out on the map. The student is the driver.  The assessment is the process of tracking location and progress in relation to the route, destination, and schedule.

What’s missing from this picture?

If you are a parent, this scenario might make you feel uneasy. Would you really be ok having your child learn to drive while also following a complex and unfamiliar route across thousands of miles over a number of days with important consequences riding on their timely arrival? (Analogously, would you expect that your child would buckle down and successfully learn to read books or master algebra on their own by June, given that they want to be a writer, carpenter, engineer, doctor, or architect when they grow up?) Probably not. If they had to make the trip by car and they had to do the driving, you’d probably want to send someone along with them – a navigator and guide who knows the route well, can coach them on how to drive safely and skillfully, and looks after their well-being during the trip - making sure they leave on time each morning, get plenty of sleep, and don’t get lost or sidetracked visiting roadside attractions along the way.

In the educational analogy, the navigator is the educational guide.  But not a classroom teacher – this navigator is a personal tutor working with one student.

Imagine that we cannot afford to provide a navigator (personal tutor) for each driver, but that we can allocate one navigator for each fleet of twenty-five cars. These cars are all leaving from different starting cities, at different times, moving at different speeds, with drivers who have different levels of driving experience and skill, and different levels of familiarity with their route.  Nonetheless, the fleet navigator is responsible for seeing that all drivers arrive in New York within the same hour.

In the educational analogy, the fleet navigator is the classroom teacher.  The cities the students start in are their prior knowledge of the subject matter (arithmetic, history, and so on). New York represents the destination – the set of learning objectives that the teacher is expected to help all students achieve by a specific calendar date (such as the end of the school year). And the diverse speeds and routes represent the fact that students come to any class with diverse levels of prior knowledge about the subject matter, different capabilities and limitations with respect to learning, different levels of interest in the topic, and so on. And yet the teacher is still expected to get them all to New York within the same hour.

What does any of this have to do with assessment?

I frequently hear people make statements like this:
“I feel that all this effort on assessment stuff is mostly a huge waste of time and money.”

To borrow a line from the film The Princess Bride:
You keep using that word ["assessment"].  I do not think it means what you think it means. 

When people talk about assessment, they typically seem to be thinking of written tests, and may even have in mind one specific “high-stakes” test. And that is indeed one form of assessment. But assessment, in an educational context, simply means gathering data to figure out where a student is on the map, evaluating where that puts them in relation to the route and schedule, and answering specific questions such as what adjustments to make to keep them on track and on time. 

Assessment can be done with the eyes and ears as well as with a paper test or an electronic GPS-like dashboard. The personal navigator sitting in the car with the student-driver, for example, is constantly assessing the situation using her five senses – looking for road signs, watching what the driver is doing, feeling the acceleration and deceleration of the car, comparing the car’s location against the marked route, and so on. Believe it or not, that’s assessment.  (More specifically, that’s formative assessment.) Another form of assessment is the determination of whether the trip was a success or failure overall – if the child arrives in New York in time for the event, the trip was a success, and otherwise it was a failure. (This is an example of summative assessment – in this case, we might call it a “high-stakes” assessment because the outcome carries big consequences, for better or worse.)

The fleet navigator (classroom teacher) obviously can’t be in the car with any of the drivers – she has to manage all twenty-five cars for the duration of the trip. But this is 1980, remember – before GPS and cell phones.  So the fleet navigator not only can’t see what any driver is doing inside their car at any given moment, but she also has no way of tracking precisely where any driver’s car is at any given time.  She can’t do anything to help the drivers reach their destination without information about their location and progress – she would effectively be flying blind. Classroom teachers face a very similar challenge - they can't directly observe what's going on in students' heads, and they simply can't teach effectively without good information about where each student is and how they are progressing.

What might we do?

One reasonable strategy would be to set up a series of checkpoints along the main routes.  Drivers check in when they arrive at these checkpoints and that way the fleet navigator can update the map with their approximate locations. If someone fails to check in at the expected time, or if they check in from an alternate location because they cannot find the checkpoint, then the fleet navigator can investigate the problem and decide how to take corrective action to get them back on track.

These checkpoints are analogous to formal educational assessments – including (but certainly not limited to) written tests. The location of a student’s car is analogous to their state of understanding of the subject matter – their progress in the class relative to the curriculum (route) and learning objectives (destination). The checkpoints (formal assessments or tests) help the fleet navigator (classroom teacher) to know much more precisely where each driver (student) is. Importantly, these checkpoints provide early warning – if we have to wait for the child to miss the event in New York (or fail to achieve the learning objectives by the end of the year) to find out if they were on track all along, by then it’s way too late to do anything about it.

The effectiveness of a classroom teacher – like the effectiveness of our fleet navigator – depends critically on the availability of data about individual students.  In addition to the informal assessments teachers are doing constantly using their eyes and ears, formal assessments (including tests) are the checkpoints that provide much of the detailed data about how students are progressing, whether they are on track, and what corrective actions the teacher needs to take.

But why can't teachers just give Friday quizzes and find out all they need to know?

An assessment (quiz, exam, standardized test, etc.) is a measurement instrument - like a ruler, weight scale, or thermometer.  Unlike a ruler, however, which measures things that one can actually see, an assessment is a psychometric ruler - it measures knowledge and skills and other intangible entities of the mind that we can't actually see and that are, in fact, much harder to define than an attribute like length or width. 


Let's ask roughly the same question but in a different domain: "Why do we need to provide engineers and medical doctors with rulers, weight scales, and thermometers to do their work?  Why can't they just create their own to find out all they need to know to do their jobs?"  There are a number of reasons.  Consider calibration, for example. Back in the day people did make and use their own rulers and weights, and they came up with very different measures for the same thing - a major problem if you are paying by the ounce for something, or if you are building a bridge from two ends that should meet in the middle, or if a medical diagnosis depends on the value being measured (body temperature, for instance).


That's not quite the same as the educational scenario, though. Since we can't see the invisible knowledge constructs we are trying to measure in education, we'd have to actually ask "Why can't engineers and medical doctors just create their own measurement instruments while blindfolded and wearing heavy gloves so they can neither see nor feel the thing they are trying to measure?"


Imagine two math teachers in adjacent classrooms each make up their own 10-question math quiz for the same instructional unit.  I've drawn a couple of homemade rulers below to illustrate what that might look like. Obviously, there are major problems with these measurement instruments. Let's consider just a few of the more glaring ones.



Problems with consistency of measurements
Looking at the first ruler, for example, the difference between a score of 1 and a score of 2 is small compared to the difference between a score of 2 and a score of 3.  The evenness of the numbers masks underlying unevenness in student understanding, which can lead to invalid educational conclusions and actions.

Problems with interpreting scores
The second ruler is measuring two different dimensions and adding them together. That would be like adding someone's height in feet to their hair length in inches and reporting the resulting number as a score.  How are we to interpret such a score? As a common educational example: when we include printed word problems in our math quiz, a child who struggles with reading may be unable to complete any of them - not because they don't understand the math but because they can't fluently read the problems.  Their score doesn't reflect their math competency - it's a combined math plus reading score. 
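To make the confound concrete, here is a minimal sketch – with a hypothetical quiz and an intentionally simplified student model – of how a single raw score can mix two dimensions. A student who knows the math but reads haltingly loses points on the word problems for reasons that have nothing to do with math.

```python
# Hypothetical 10-item quiz: 6 pure computation items plus 4 printed word
# problems. In this toy model, word problems require reading fluency in
# addition to math skill.
quiz = ["computation"] * 6 + ["word_problem"] * 4

def raw_score(knows_math, reads_fluently):
    """Count items answered correctly under the simplified student model."""
    score = 0
    for item in quiz:
        if item == "computation":
            score += int(knows_math)
        else:  # a word problem requires reading AND math
            score += int(knows_math and reads_fluently)
    return score

print(raw_score(knows_math=True, reads_fluently=True))   # 10
print(raw_score(knows_math=True, reads_fluently=False))  # 6 -- same math skill
```

The two students have identical math skill, yet their scores differ by four points; the gap measures reading, not math, and nothing in the single number tells you which dimension it reflects.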

Problems with comparing performance across students
Now compare the two rulers.  How are we to compare the performance of students across the two math classes? For example, imagine a student in each class scores a 4 on their version of the quiz.  What can we say about the performance of the two students? They earned the same score - do they have the same math competency? Certainly not. If you look at the length marked by the 4's, then evidently the second student scored about twice as much as the first student. The numbers are not comparable, but they invite interpretation, evaluation, and decision-making as if they mean something specific and comparable.  This is a very real problem that colleges face, for example, when looking at student transcripts.  Looking at two applicants from different states, both having a high school GPA of 3.3, how are the admissions officers to compare them? They really can't.  Love it or hate it, that's one reason the SAT is so widely used - unlike GPA, standardized tests like the SAT provide a common ruler for measuring student competency in specific domains like math and language so the scores can be compared in meaningful ways across students, classes, and schools.
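As a rough illustration of why the two raw 4s are not comparable, here is a minimal sketch – using made-up score distributions, not real data – that expresses each raw score as a z-score relative to its own class. The identical raw number lands in very different places in the two distributions.

```python
from statistics import mean, stdev

# Hypothetical raw scores on two different teacher-made 10-question quizzes.
class_a = [2, 3, 4, 4, 5, 6, 7]
class_b = [4, 5, 6, 7, 8, 8, 9]

def z_score(score, scores):
    """Express a raw score in standard deviations above/below the class mean."""
    return (score - mean(scores)) / stdev(scores)

# Two students each earn a raw score of 4 -- the same number on paper.
print(round(z_score(4, class_a), 2))  # -0.25: just below average in class A
print(round(z_score(4, class_b), 2))  # -1.51: far below average in class B
```

This is the sense in which the same raw score can "mean" different things on different homemade rulers – and why a shared, calibrated instrument saves us from guessing at each teacher's scale.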

So, is investment in educational assessments a huge waste of time and money?

There is certainly room for healthy debate about whether any particular assessment is valid and fair, how assessments should be administered to students, and how the assessment data should be used. But is it really reasonable to ask whether we can do entirely without educational assessment in schools? Or whether we should really care about the quality and validity of assessment data? Only if it doesn’t really matter what students are learning or when they are actually learning it. But if that’s the case then we have to ask ourselves this: why do we bother sending our children to formal schools with highly trained teachers in the first place? If we really don’t care what they are learning or when, wouldn’t it be better to send them to day care or adventure camp five days each week instead?

In fact, assessment is not a huge waste of time and money.  But without high quality assessment in place to inform effective instruction, large parts of the rest of the educational system might well be.

Postscript: A peek at the future of educational assessment


Now fast-forward from 1980. Imagine a world where teachers have the equivalent of GPS in the classroom - that is, continuous, detailed data on student learning plotted in relation to the curriculum goals, delivered in real-time, and actionable at a glance. Yet students never have to take tests. 


It may sound far-fetched, but it already exists. It's called "embedded assessment" and we've built such a system over at Native Brain to demonstrate conclusively that it's not only technically possible but that it can be made to work at scale in typical public school classrooms - today. (See the screenshot below.)


As I've said before in this blog, we have the know-how right now to make mainstream public school education much, much better than it currently is.  The same way that GPS suddenly transformed the way we drive, technology in the classroom can transform the way teachers teach and the way students learn. There is definitely a way. The question is, do we have the will to make it happen?

(Note: As of the date of this posting, the Native Numbers iPad math curriculum and accompanying GPS-like instructional dashboard are available at no cost to parents and teachers.)

Check it out. Send us your thoughts. Share.

2 comments:

  1. “Assessment a huge waste of time and money” – This philosophy is inconceivable! (SOMEONE had to use that word, right?!) Joking aside, this statement is quite disturbing. Knowing what a student understands is extremely complex. If I give a student a test/quiz on Friday, and they get a decent percent correct, let’s say 85%, am I willing to assume they have a decent enough level of understanding to move forward? What if within that 15% (incorrect) they are missing critical scaffolds; and/or perhaps a certain percentage of their answers were simply correct guesses?

    Formative assessments are crucial! But I am afraid we use formative assessment as summative. We teach, we test, we move on. But don’t place the full blame on teachers. Often we are given a scope and sequence, even daily pacing guides and/or scripted lessons. We all use high-stakes tests, but those are given too late to benefit students immediately. A colleague of mine calls these scores “coroner’s reports.” I’ve heard them called “autopsies.” I don’t think our current high-stakes tests are autopsies because an autopsy gives a detailed, explicit, reasonable cause of death. We don’t have that knowledge, just scores. So…how do we know what a student does/doesn’t know?

    To get a peek at the difficulty, check out one or both of these videos:
    http://tinyurl.com/q8v2n3k
    http://tinyurl.com/q6kbw8a

    OR check out any of these student “mistakes” on this website (Warning, you might just spend hours contemplating these!) http://tinyurl.com/nglw2wj

    If you checked out those links, were you able to determine exactly what these students “knew” and did not “know”? Do teachers really spend this much time on every single student, for every single “Friday Quiz”? Should they? Can they? (Hint, the answer is NO...not efficiently!) So, yeah, I am hopeful, indeed excited, about the prospect of adaptive instructional software with embedded assessments. We do have some adaptive assessments, but these do not include the instructional piece; or if they do, these haven’t been validated. See this article explaining the need for evidence:
    http://tinyurl.com/nmmj7rp

    I’m not claiming any of this will be easy, nor that one size fits all! I don’t think we will ever replace teachers; but we ARE there with technology. I agree the better question to ask is, “Are we willing to invest in curriculum/assessments that lead a student to mastery?” - NOT assessing is a non-player. We have to assess. How we assess and what we do with the results are key. Shouldn’t we try, immediately, to get to the bottom of what it is a student does/doesn’t understand, revise teaching, get feedback, test, revise teaching, repeat.... Isn’t that the “scientific method” we teach our students? Isn’t that what an educational scientist does?

    1. Thanks for your thoughtful response, Rene.

      The video links you included are excellent examples of some of the hidden complexity here. The kids seem to understand place value when they are working with the teacher at the chalkboard, but when that knowledge is probed a bit below the surface it can completely fall apart - it turns out some of these students don't really understand place value at all.

      I encourage readers to view those videos.
      http://tinyurl.com/q8v2n3k
      http://tinyurl.com/q6kbw8a
      http://tinyurl.com/nmmj7rp

      Getting good information about what students know is absolutely critical for supporting good learning, but it's also much harder than it might seem.
