S7 E5 - What Does the Research Actually Say About Grouping Students for Phonics?

Hello, hello, welcome to this episode of the Structured Literacy Podcast recorded here in Tasmania on the lands of the Palawa people. I'm Jocelyn and I want to start today by being upfront about exactly why this episode exists. We have been told, and are still being told fairly consistently, that grouping students for phonics instruction doesn't get you better outcomes than teaching the whole class and then giving struggling students some additional repetitions or extra sessions to fill their gaps. I have another podcast episode called Is Whole Class Instruction Really Equitable? that you may be interested in listening to.
Because the thing is that whole class instruction with a bit extra getting better outcomes has not been my experience. Not when I was teaching in my own classroom, not when I was leading a school, and not in the many schools I now support as a coach. Consistently, the schools who achieve the greatest gains in their data are those who provide matched and targeted instruction for students at the Tier 1 core level.
What Does the Research Say?
So I want to take a really careful look at what the research actually says, not what people say the research says, but what the studies themselves really show. And in diving into this topic, what I found was interesting. When you go looking for research that compares grouped instruction with whole class teaching, what you find is that the studies making the comparisons were doing one of two things. Either they were keeping everyone on the same content, just delivering the lessons in small groups instead of to everyone, or the instruction was described as targeted, but the model of instruction being used was balanced literacy. So of course the research doesn't show strong benefits from grouping. We wouldn't expect it to in either of those cases.
So here's the tension that I'm trying to reconcile and support schools to reconcile. On one side, we have the current advice: teach every Year 1 student the same content and give the ones who are struggling a bit of extra time on top to have extra repetitions and fill gaps. On the other side, we have what we're actually seeing in many schools who are following that advice. They are seeing 20, 30, or even 40% of students arriving in Year 3, still needing foundational phonics instruction.
Now, teachers have always known that different children have different instructional needs, particularly in the areas of nuts and bolts for literacy and numeracy. That's why group rotations existed in the first place. It wasn't because teachers like to do craft, it was because we knew that that student who is streaks ahead of everybody else has different needs from the one who can barely read four graphemes. I've asked myself the uncomfortable questions here, including:
Is it that teachers don't know how to teach?
Is it that they're not that committed?
That they're not following the programs as written?
Well, those things could be true, but it just doesn't make sense to me that so many schools and so many students are struggling because teachers across the country have decided collectively to go rogue or they can't master their craft. We absolutely need more coaching and proper professional development support for teachers, and I will never stop saying that. But are large numbers of educators really so underprepared that they can't teach an early years student some phonics and decoding? I just don't buy that.
I've Seen This in Practice
Because here's what I've seen in practice. If content is pitched at the right level for the student, as in, they can access it and it hits them where they need it to, and lessons are clear and well structured, then delivery that sits at about 70 or 80% effectiveness will get the job done for all students who don't have significant additional challenges. When that isn't happening for a significant proportion of students, it's worth asking whether the structure is the issue, not the people. And I think that what we're looking at is a structural issue. So where does that leave us in our quest to be genuinely evidence-informed? That's what the rest of this episode is going to unpack.
Let's firstly clarify the question on the table
When people talk about grouping for phonics instruction, they often frame it as a debate: should we be grouping students or teaching whole class? And look, I get why the conversation lands there, but I'm not talking about the difference between 6 students and 26 students. I think a better question is, what does the research say about what happens when phonics instruction is matched and targeted to a student's current point of need, and what happens when it isn't? And that shift in framing matters. Let's see why.
What the Research is Very Clear About
We can begin with what we can say with real confidence in terms of the research about reading instruction, because there's actually quite a lot. Across a significant body of intervention research, one thing is consistent: when students who are behind in reading receive explicit, systematic, code-focused instruction that is targeted to their needs, they make much better progress than if they stay in the regular classroom program alone. Let's look at a couple of examples.
A large national implementation study in New Zealand evaluated an approach that was a Tier 2 structured literacy intervention across 936 schools. Five-year-old children who received the Tier 2 intervention, that was explicit phonics, spelling, and word-level work in small groups, scored significantly higher in decoding and spelling than matched peers who had similar pre-intervention reading scores but did not have the Tier 2 support. Importantly, following that work, the Tier 2 group scores didn't differ from everybody else who didn't test as needing it (McNeill & Gillon, 2025).
A meta-analysis of Tier 2 response to intervention studies found that Foundation to 2 students at risk for reading difficulties consistently improved their decoding when they received structured Tier 2 reading and decoding interventions compared with typical classroom instruction alone (Wanzek et al., 2016).
So the first thing we can say with confidence is this: when we have targeted, explicit, code-focused instruction that is pitched to where students actually are, struggling students make stronger progress than if we don't. Full stop.
The research is also pretty clear about when and how much matters. Students who receive intensive early support make better progress than students who wait. A small but interesting experimental study by Bouton and colleagues in 2018 explored what they called an upside-down RTI model, where the most at-risk first graders were immediately placed in intensive one-on-one Tier 3 instruction rather than working through Tier 1 and Tier 2 instruction first. Students who were fast-tracked showed stronger gains in word identification and sight word reading than matched students who followed that traditional RTI pathway. The authors of this study are careful to note the study's small sample size and call for larger replication work. So we should treat this as promising early evidence rather than a definitive finding. But the direction it points in is consistent with everything else we know. Don't wait, give substantial, well-matched word-level support early and intensively.
And there's one more finding worth highlighting here. Research into the Targeted Reading Instruction model showed that when teachers used student data to tailor their lessons and implemented those targeted strategies with high fidelity, particularly in their second year of using the model, their at-risk students made greater gains in word reading and vocabulary (Vernon-Feagans et al., 2018). So using assessment to guide what you teach and actually sticking to that plan is linked to better outcomes. These teachers didn't just take the Year 1 program and start at lesson one.
Now, here's what the research does not tell us
Most of the intervention studies I've just described are built around a fairly standard model, whole class or core instruction, plus the intervention for students who are identified at risk. Usually identification happens through a broad reading screen, and maybe the bottom X percentage of students are included. No research has been done looking at the impact of high-quality, explicit instruction in phonics and decoding that compares a business-as-usual whole-class Tier 1 phonics block (everyone in the grade together, with some add-on support) with what happens when we fully differentiate every day, with content and focus matched to what each group of students is actually ready to learn. That specific comparison has not been run yet. So I want to be really transparent: we cannot honestly say from the existing research that whole class and a bit extra is proven to be more effective than a differentiated Tier 1 model with high-quality systematic synthetic phonics. But we also can't say that the reverse has been directly proven. The research just hasn't tested that specific structural question.
Now I know what some of you are thinking. There's no evidence for grouping, they told us so. And yes, some people have used exactly this gap in the literature to argue that the status quo is fine. But that's not quite right either. The absence of a direct comparative trial is not the same thing as evidence that one approach is equal to another. It means we need to think carefully. We have to turn to adjacent research wisely and be really honest about what we do and we don't know.
There's another dimension worth naming
The intervention research that I've been discussing shows that additional targeted instruction really can work for struggling students. But let's think carefully about what that model actually means in practice. That additional instruction happens on top of the regular classroom day.
So a student will sit in a lesson, being asked to participate and do things that are way out of their reach. So that's at best a waste of instructional time, and at worst, a huge load on that student. And then after that, those students are being pulled out, they're being pulled out during maths or writing or science or whatever else is happening at the time that the learning support person happens to be available. They are absent from that instruction all the time. And in many schools, the reality is that the wait list for intervention support is long, much longer than the school has resources to cater for. So students sit in classrooms for weeks or months, falling further behind before a spot opens up. Nobody should have to wait like that.
And here's what I want us to consider: the features that make intervention effective, which are high-intensity, explicit teaching, content matched to where the student actually is, with frequent repetition, with immediate feedback, careful progress monitoring, none of those features are exclusive to a pull-out model. They're features of good instructional design. Instructional grouping at Tier 1 during core phonics time means that students are getting exactly that kind of precise, high-intensity, monitored, targeted work right when it should be happening, which is when they're learning that content. Nobody has to miss maths, no one has to miss writing, no one has to sit on a wait list. I'm not talking about taking away additional support on top of that for students who have significant difficulties and need that. But so many students would not be on the intervention list for the extra if their Tier 1 instruction, their core instruction, gave them what they need. So what I'm talking about is actually increasing the dosage of well-matched instruction by building it into the heart of what we already do.
So what can we reasonably conclude?
Cognitive load theory becomes really useful to us here as a lens through which to examine this issue. We know working memory is limited, and we've known this since the 60s, and the more recent modeling of Sweller and colleagues supports that (Sweller et al., 2011). When a task involves too many unfamiliar, interacting elements, too many unknown graphemes, irregular patterns and unpredictable words, working memory gets overloaded and learning suffers. This is particularly significant for students who are already behind because every element of the tasks they are being asked to perform cannot be processed automatically, and that adds to cognitive load.
In contrast, what do effective interventions do? They manage that load carefully. They introduce a small number of new patterns at a time. They use decodable text where students are actually practicing and applying what they know. They give lots of opportunity for repetition and retrieval with the pacing and the dosage adjusted to meet their needs. Let's think about what happens when a student who's still consolidating short vowel patterns and basic digraphs sits through a whole class lesson pitched at alternative spellings of long vowels. They can't process most of the content because there are too many unfamiliar elements. Their working memory is at capacity from the start. So they're either guessing, switching off, compliantly following direction, or spending all of that cognitive energy on things that bear no relationship to what they actually need to learn. That is not effective instruction.
My Goal
I want to close this episode by coming back to the reason that I made it. My goal, as always, is to help you, teachers and leaders, embrace what I think of as common sense: evidence-informed practice. That model where we look honestly at what the research says, bring in what we know from our own proven experience where we know how to get outcomes, and line all of that up with what is actually happening for students. The research tells us very clearly that targeted instruction works, particularly for the students who struggle. That's the entire premise of intervention programs. They look where students are up to and they give them what they need. Every good intervention is built on that. So my question is this: why can't we make that the experience of every student every day? Why can't we optimise working memory for every single one of our children? Why are we insisting on attempting to teach a whole class of students the same phonics lesson and then expecting that we can magically adjust for a group whose starting point might range anywhere from the basic code to reading chapter books, when we know that's not how learning works? If the answer is because the research says so, then I think we need a new conversation about what research actually says.
As I've discussed today, the studies most often cited to show that this sort of differentiation doesn't work were studies that used balanced literacy and whole language practices. The studies that prove that whole class instruction plus a bit extra gets you better outcomes than high-quality targeted instruction at the Tier 1 level don't actually exist.
Schools I've taught in, schools I've led, and now schools that I support as a coach, the ones that have made the shift to grouping for phonics instruction consistently report better growth for students than they saw before they grouped. Teachers say they no longer feel guilty because they aren't meeting the needs of their lowest or their highest readers. I haven't even gone anywhere near those capable students who don't need what's being offered in that whole class lesson in this episode.
Teachers who have done the hard work of building consensus with colleagues, aligning timetables and resources, and genuinely treating every student in the school as their student and every piece of data as "our data" describe what happens next in one word: relief. They talk about fewer behavioural issues almost immediately. They talk about finally feeling effective for all students in their lessons. They're often surprised about how quickly gaps can be filled when the instruction is actually pitched where the students need it. And most importantly, they talk about how proud their students are of their own progress and how much that means to families.
And before anyone says, that's nice, Jocelyn, but anecdotes are not research, I agree with you. Anecdotes do not trump research. But I'm going to say this again: in this case, the research to refute what I'm suggesting does not exist.
I also want to say this clearly. If you work in a school where whole class with a little extra support is genuinely working, where your screening data shows very small numbers of students sitting in the at-risk category and those students are making pleasing, appropriate growth, and if your teachers are working in flow, they're not stressed, kids are happy and they're progressing, then please don't change a thing. Keep doing what you're doing. Because there is no "everyone shall always" answer to this question. Schools are different, demographics are different, resources and staffing are different. What works brilliantly in one context needs adaptation in another.
But if your school is following the explicit phonics program, whatever it happens to be, implementing the advice, doing the PL, taking action, and you are still seeing 20, 30, 40% of students flagged as at risk in your screening data, it might be time to have a conversation about what comes next. Because that situation does not have to be inevitable, regardless of students' socioeconomic backgrounds. Those results are not sustainable. They're not sustainable for you as the adult. They're not sustainable for the children.
Conclusion
I know that landing on "the data is inconclusive and absent" is not exactly a comforting place to finish, but I'd like to offer a different way of thinking about it. When there isn't a definitive research-based answer, that's not a dead end. That's an invitation. It's an invitation for us to have high expectations for our students, to think about our professional judgement and what we have proven we can do to meet the needs of students using available evidence where it is robust. And sometimes we don't need more smarts, we don't need more PL. We need permission to embrace what is largely common sense. Always remember that when we're in this murky land of not really having research to guide us fully, as always, the evidence that we've made wise choices is in the well-being of our students. It's in the well-being of our teachers, it's in our data. If we make adjustments and things get worse or they don't move, then it needs attending to. We don't keep on an unrealistic path for years because we're still waiting for it to work.
Next week, I'm bringing you something really practical as opposed to this kind of murky research conversation. We're going to get into the real world lessons that I've learned about grouping from my own classrooms, from my own time leading schools, from working alongside so many of you in yours. The things that work, the things that trip schools up, and how to navigate the messy middle of making a structural change like this in amongst the reality of busy schools. I can't wait to share all of my top tips with you so that you can take shortcuts if this is a direction that you want to go in. Until then, happy teaching, everyone. Bye.
Show Notes:
S5 E16 - Is Whole Class Phonics Instruction Really Equitable?
References:
Gillon, G., McNeill, B., Scott, A., Denston, A., Wilson, L., Carson, K., & Macfarlane, A. (2019). A better start to literacy learning: Findings from a teacher-implemented intervention in children’s first year at school. Reading and Writing, 32(8), 1989–2013.
Sweller, J., Ayres, P., & Kalyuga, S. (2011). Cognitive load theory. Springer.
Vernon-Feagans, L., Bratsch-Hines, M., & Targeted Reading Instruction Collaborative. (2018). Improving struggling readers’ early literacy skills through a tier 2 professional development program for rural classroom teachers. The Elementary School Journal, 118(4), 525–548.
Wanzek, J., Vaughn, S., Scammacca, N., Gatlin, B., Walker, M. A., & Capin, P. (2016). Meta-analyses of the effects of Tier 2 type reading interventions in grades K–3. Educational Psychology Review, 28(3), 551–576.
Jocelyn Seamer Education