Last winter, Lucy Calkins (via the Teachers College Reading and Writing Project, or TCRWP) issued a rebuttal to the Achieve, Inc.-sponsored report asserting that her Units of Study fails to provide the sort of instruction that would result in all children learning to read, particularly those who do not already arrive in school with strong language skills. Entitled “A Defense of Balanced Literacy,” it is, in keeping with Calkins’s previous major statement regarding phonics, a veritable masterpiece of half-truths, evasions, contradictions, misunderstandings, and distortions.
I don’t want to get pulled down into the weeds trying to offer a point-by-point analysis, but there is one major idea that I think is worth tackling head-on because it involves such a widespread classroom tool: running records.
For anyone reading this who might be unfamiliar with the term, running records are an assessment/monitoring tool generally associated with Reader’s Workshop and the leveled reading system developed by Irene Fountas and Gay Su Pinnell. They are based on children’s short read-alouds and are designed to check their accuracy and types of “miscues” (word substitutions, omissions, additions, etc.) as well as their application of reading strategies covered in class. They come in many varieties, but for obvious reasons, I’m going to focus here on the ones used by TCRWP. They revolve around three components: Meaning (M), Syntax (S), and Visual (V). In other words, the three-cueing system.

Before I get going, I’d like to point out that in the myriad social media posts I’ve read about the science of reading, one of the most frequently recurring questions is whether various components of different reading programs can be combined, in a sort of DIY approach—most of these questions are of course asked in good faith, but one senses that beneath them, there is often a nagging desire to keep things basically the same while just adding a sprinkling of something new.
The problem is that this approach is what already characterizes much of Balanced Literacy: the issue in many schools’ programs is not so much that any one element (ahem, phonics) is entirely absent, but rather that it is combined with methods that are either ineffective or directly harmful, and that those methods undermine the phonics that does get taught. This point is starting to become more widely appreciated, but I think it bears emphasizing: an effective literacy program cannot include discredited approaches simply for the sake of being “fair” to all parties. The point is to teach children to read, not to mediate between competing academic theories.
As for the problems with the three-cueing system, I’ve already covered them extensively in this piece.
So that said, I’d like to take a hard look at the implications of using running records for diagnostic and assessment purposes only. According to the TCRWP website, New York City teachers are now offered just two sets of running records: one for the Beginning-of-Year (BOY) assessment and one for the End-of-Year (EOY) assessment. Obviously their official relegation to a mere biannual diagnostic/assessment, along with the addition of phonics materials, is intended as an attempt to pacify Calkins’s critics (although there is certainly nothing stopping teachers from using their own running records more often if they so wish).
I think, however, that this is also a cautionary tale of the dangers involved in trying to patch together approaches that not only rely on contradictory methods but are also based on fundamentally different conceptions of what reading is, one of which is backed by decades of scientific research and the other of which has been thoroughly debunked in the mainstream academic world. When it comes to reading, you just don’t get to have it both ways.
The first very serious problem involves this statement from Calkins:
For us, as the report acknowledges, the three-cueing system is part and parcel of running records, and the purpose is to learn more about readers. We can’t discern the evidence behind the report’s objection to this. We don’t agree that there is settled research that says that when teachers use running records to assess readers this is damaging to learners (see, for example, Clay 1985; Ross 2004, Goetze and Burkett 2010, Shea 2012).
As both Mark Seidenberg of the University of Wisconsin (author of Language at the Speed of Sight) and Claude Goldenberg of the Stanford Graduate School of Education have discussed, Calkins largely relies on playing semantic games in order to wriggle out of addressing the extent of her program’s shortcomings. Presumably, she is fully aware of the substantial body of research detailing the failings of the three-cueing method as an instructional tool. However, by insisting that her program only advocates it as a means of assessment, she can play innocent and claim to be unaware of any research assailing it.
Technically, yes, there may be a dearth of publications on this particular application of the three-cueing system, but that is presumably because the topic does not merit independent treatment. Why on earth would a teacher use diagnostics/assessments explicitly aligned with the three-cueing system if they did not intend to teach in a way consistent with that approach?
Are we seriously to believe that teachers would diagnose students based on one system (going through the trouble of learning its intricacies) and then teach according to an entirely different system, only to ultimately reassess students based on the original approach?
Even if we accept this outlandish possibility, an even more fundamental problem is that the entire premise of the three-cueing system/MSV is directly opposed to how beginning readers become skilled decoders. Consequently, there is no good reason to include it in a program at all.
To be clear: experienced readers do make use of meaning, structural, and visual (spelling) cues to support comprehension. This is what the three-cueing system was originally intended to describe. Novice readers, however, need to focus on matching sounds to letters and sequences of letters so that they can literally identify words and begin to store them in their brain for instantaneous retrieval, i.e., “map” them orthographically. When children “solve” words by using various contextual features, as Calkins is known for advocating, this process is severely impeded.
As the 2014 version of the TCRWP Running Records Teacher Resources and Guidebook makes abundantly clear, the TC program is very much reliant on Ken Goodman’s long-discredited “reading is a psycholinguistic guessing game” theory (1967). And that theory is taken quite literally: everything is directed toward helping children make better “predictions” about unknown words by using meaning, structure, and visual cues. There is no acknowledgment of any mainstream scientific research done within the last 50 years—research clearly demonstrating that skilled readers 1) do not guess based on context, and 2) process text automatically and subconsciously by focusing on each letter of each word, in sequence, for a fraction of a second.
Even if we give Calkins the benefit of the doubt and assume that three-cueing strategies are not explicitly taught in classrooms using her program, running records are still an irrelevant distraction. Yes, they might sometimes provide interesting insights into a child’s thought processes (at a particular moment, at least), but truly effective applications for that knowledge are severely limited. Children do not become better readers by analyzing the features of their guesses, or by thinking harder about “what would make sense,” or by integrating multiple cues more effectively, because reading isn’t guessing, and decoding isn’t based on integrating context cues. The entire premise of the exercise is misaligned with how skilled reading develops.
When beginning readers struggle to decode phonetically regular words, it is virtually always because they have not yet mastered the necessary sound-letter patterns. The fact that they might plug in a word that does not make sense, or even the wrong part of speech, provides little real insight into their understanding of language—native speakers of standard English do not, as a rule, make these kinds of egregious errors in speech. When they appear in reading, they are almost invariably a sign that a child is expending so much mental energy trying to figure out what words literally say that they have nothing left over to think about meaning in a logical manner.
The only permanent way to resolve this problem is to ensure that letter-sound correspondences are learned and practiced to the point of automatic recognition so that sufficient mental room is freed up for considering meaning; investigating the precise nature of a child’s workarounds for lack of this skill does nothing to facilitate its development. Running records, at least as they are conceived of by TCRWP, can provide at most a trivial amount of information about which sound/letter patterns have and have not been fully learned. It makes no sense whatsoever to use them as diagnostics or assessments in the context of a program that actually emphasizes phonics.
Another major issue with running records is that one of their main purposes is to assess whether children are using the cues in the proper proportions. This is a point I really haven’t seen discussed, but I think that it’s actually quite important. Since the National Reading Panel report unequivocally came down on the side of phonics instruction in 2000, “Balanced Literacy” has—at least in theory—been viewed as a compromise designed to cover all the major components of reading: phonemic awareness, phonics, fluency, vocabulary, and comprehension. For Calkins, however, this term appears to have a somewhat different meaning: “balanced” refers more to the belief that children must balance their use of meaning, structural, and visual cues and not let any one overshadow the others. Because the three-cueing graphic includes three equal-sized circles, decoding is (falsely) understood to be the result of an equilibrium between the three.

Thus, if one component dominates excessively, e.g., the visual system (phonetic decoding), the student must be brought back into line and taught to base their predictions more on meaning (e.g., pictures or the general scenario of the story) and syntax (e.g., “what would sound right in a book?”). Running records are explicitly intended to reveal these imbalances. So when Calkins defends “Balanced Literacy,” this is presumably the concept of reading she is defending. This approach is, of course, the exact opposite of what children require to become strong decoders, but why even bother to include running records in a program if not to address reading from that perspective?
What makes Calkins’s assertions about the importance of phonics most suspect, however, is the fact that her program is directly integrated with the Fountas & Pinnell leveled reading system. It is certainly an improvement that some phonological awareness exercises and phonics assessments are now included along with the other materials, but given the program’s admitted use of the three-cueing system and Calkins’s continued game-playing with her critics, one must seriously wonder whether this is not just window dressing. (Both the current Guidebook and the phonics materials are located in a password-protected section of the TCRWP website, so unfortunately there is no way to tell firsthand; however, it is hard to imagine that the central premises of the program have been radically overhauled since 2014.)
To reiterate: the real question isn’t if phonics is included in the general sense, but whether it’s 1) presented in an explicit, systematic way; and 2) reinforced rather than undermined by the other elements of a program.
That is why legitimate structured literacy programs rely on decodable texts—books that include only sound-letter patterns that have already been taught, so that children can practice what they’ve learned without getting distracted by things they haven’t. But as Karen Vaites of Eduvaites points out:
Neither Reading Workshop Units of Study nor the new [Calkins] phonics program come with a set of decodable readers. Recent calls to Heinemann suggest that this remains unchanged; Heinemann representatives are neither aware of decodables coming soon, nor do they recommend purchase of Ready Readers. So schools buying Calkins’s program for 2019-20 will not get a component that everyone agrees is important, including Calkins.
Regardless of what Calkins purports to agree with, the reality is that her program is still explicitly aligned with leveled readers, which are by definition not decodable. And children who attempt to read words with sound-letter patterns they have not yet learned will almost invariably fall back on other textual features, like pictures, or make semi-random associations based on their personal experiences. Essentially, then, they are coerced into using something that looks an awful lot like the three-cueing system, regardless of whether their teachers explicitly endorse it or whether they are being given phonics instruction as well.
And that is the real problem here: when ineffective methods are tacked onto effective ones (or vice versa) for the sake of “balance,” the effective ones tend to get canceled out. It’s obviously a positive development that the recent surge of interest in the science of reading has put serious pressure on Calkins and TCRWP for what is presumably the first time, but real change isn’t just a matter of adding a few worksheets or sticking on a new label. Decodable readers would be a good start—but given that a truly phonetic approach runs counter to Calkins’s belief that “excessive” reliance on any one aspect of the three-cueing system should be discouraged, that is unlikely to happen anytime soon. It will take much, much more pressure to bring about this degree of change. It is up to the people who have done the research to try to explain to school administrators, and school boards, and districts exactly what is involved. But if nothing else, scrapping running records entirely and replacing them with more productive forms of diagnosis and assessment would be a good first step.