S7 E10 - How to Evaluate Spelling Instruction

Direct, No-Nonsense, Challenging
Hello, hello. Welcome to this episode of the Structured Literacy Podcast, recorded right here in Burnie, Tasmania, the lands of the Palawa people. I'm Jocelyn, and I want to start this episode with a small warning. What I'm going to talk about today might feel a little bit confronting. I do not mean the messaging in this episode to be harsh, I promise you that. But I do mean to be direct. I do mean to be no-nonsense. And I do mean to challenge some of what is happening in our classrooms, because the students I keep thinking about, the ones in Years 3, 4, 5, 6, 7, and 8 who are still struggling with spelling despite previous years of what is claimed to be robust phonics instruction, do not have time for us to skirt around the edges of what is actually getting in the way of their learning. So with that in mind, let's get into it.
Is Your Program Working?
Over the last couple of episodes of this podcast, I have been talking about spelling instruction in Years 3-8, what the research tells us it should look like, what can get in the way, and the instructional design decisions that either serve or shortchange students. Today I want to take that a step further and give teachers and leaders a tool that you can use to audit what is happening in your school in this space. The reality is that most schools have a spelling program, and most schools would claim that it's working. But most schools have not sat down and systematically asked whether it actually is. I think part of the reason for that is the way we tend to evaluate impact in education. So much of what we count as evidence is about what teachers, leaders, and systems do. We measure the inputs: how many professional learning hours were delivered, whether a program is being used, whether the curriculum scope and sequence has been mapped. All of these things matter, but they are not the same as asking what students are actually learning. We want to be really careful that we're not using the inputs as a proxy for the outcomes we're looking for. Whenever we make a claim that something is "working", in inverted commas, I really want us to get into the habit of immediately asking two questions: how do we know that, and what is our evidence?
F-2? Attend Elsewhere
Before I go any further, I want to be clear about something for our Foundation to Year 2 teachers and leaders. In the early years, your spelling program is your phonics program. As long as you are giving encoding the same time and attention as decoding, and as long as every student is fully engaged in both, not copying off the board, not copying from someone next to them or watching someone do something at the front, as long as they're fully engaged, then reading and spelling genuinely enhance each other. So if you're in the early years, the audit that I'm about to walk through is not fully where your attention needs to go. You can use the steps of the process that I'm about to describe to consider and evaluate your phonics instruction for sure. But the specifics that I'm talking about today are for Years 3-8. So this episode is for the leaders, curriculum coordinators, and teachers who are working with older students and asking, is our current approach actually achieving what we think it is?
Victoria Bernhardt: Measuring Programs and Processes
The tool that I'm going to introduce you to today comes from the work of Victoria Bernhardt, specifically from her book Data Analysis for Continuous School Improvement. I have the third edition of that book. Bernhardt has spent decades developing frameworks and tools that help schools move from vague intentions about improvement to structured, evidence-informed action. The tool of hers that we're going to talk through today is called Measuring Programs and Processes. It involves a simple but powerful matrix, and Bernhardt suggests giving your team about two hours to work through it. That might sound like a lot, but when we consider that most of us spend years implementing programs without ever systematically evaluating them, two hours is a small investment to help us recognise whether we are on the right track or the wrong track. The matrix asks seven questions organised across three areas: purpose, participants, and implementation. And then it asks about results. What I love about this framework is that it forces us to be specific. We're not asking, do we think it's going well? We're asking, what's the evidence that we are making the impact we think we are? I've used this exact matrix in schools that I have led myself, and it's the sort of thinking that I help school leaders do when working with them on their school improvement processes and journey.
What this sort of thinking reveals is that there's often a significant gap between what we believe is happening and what is actually happening. In today's episode, we're going to apply this specifically to spelling, but as I said earlier, you can use it for any area of curriculum you like. So I'm going to walk you through each of the questions in Bernhardt's matrix with a spelling focus. I also want to let you know that we've developed a free Spelling Instruction Evaluation tool, drawing on the research I've covered in the last few episodes and on the kind of thinking in the process we're exploring today. You can download this tool from the show notes of this episode or from the Free Resources tab at jocelynseamereducation.com.
Bernhardt's First Question
So the first question that is asked is: what is the purpose of the program or process we're evaluating? It's important to know that this framework for thinking and reflection isn't only about curriculum; it also works for things like our attendance strategy or how we organise playground duty. It works for everything, and having a tool that we can use over and over, getting used to that thinking, is really powerful. So this question of what is the purpose of the program or process is the starting point. What is our spelling approach or our spelling program for? If the answer is something like "to teach students to spell the words on the list", then that is a program purpose, but it's not really a learning purpose. The learning purpose of spelling instruction is to build students' deep understanding of how words work so that they can apply that knowledge to words they have never seen before, in writing they care about. Spelling instruction is about building automaticity and understanding so that cognitive energy can be devoted to those macro, text-level elements that we know are so critical. If your program is producing students who can spell the words on the Friday test but cannot transfer that knowledge to their own writing on Monday, then the learning purpose is not being met. And this is a common gap that we see.
Bernhardt's Second Question
The second question is: how will you know the purpose is being met? For spelling, the honest answer for most schools is either "well, we don't know", or "we look at our NAPLAN results", which are very limited in what they assess in spelling, or "we have a weekly test". Now, full disclosure, in our spelling program, Spelling Success in Action, there is a weekly test. We need to figure out whether students can recall specific words and use patterns immediately after studying them. But this on its own does not tell us whether they can apply the pattern to unfamiliar words after formal instruction has ceased. It absolutely does not tell us whether their spelling is independent in the writing that they do, or whether their writing overall is improving. So when you sit down with the team and ask what our data actually tell us, this is one of the critical questions. Spelling data is a little messy because there isn't really a spelling version of DIBELS. The questions we're asking are really about whether students have retained what we've taught and whether they can use it for their own independent purposes. So the first thing we have to acknowledge is that we might not have the evidence to tell us what is happening, and we need to get on top of that.
So this is probably the right time to mention that we have a set of free diagnostic spelling assessments on our website that cover Phonics, Orthographic Conventions, and Suffixing Conventions (the ones related to morphology). Every school that has Spelling Success in Action 1 has access to the training specifically designed to help you unpack that data and use it to make active instructional decisions. But even if you're not using our programs, you can download the tools, and there are spreadsheets and explanations there to help you make some decisions. Assessment, though, is only useful if we use it to determine whether we are changing things for the better in our classrooms. And that's where the training bit comes in.
Bernhardt's Third Question
The next question is: who is the program intended to serve? In theory, the answer should be: all students. In practice, the answer for many programs is basically the students who are already mostly fine, because most programs simply begin all students at the same point of instruction in the year, regardless of the diagnostic data. Instruction is then provided at a single pace based on what's on the page in front of us, with little or no adjustment for students with gaps. If that's the case, the program tends to serve the middle: it doesn't stretch the students who would really benefit from applying their knowledge to more complex, robust vocabulary, and it doesn't support those who are coming into the learning process with significant gaps. So at best, programs extend students who already have solid foundations, and they leave the other students treading water.
Bernhardt's Fourth Question
The next question, and it's one that is hard to really look at, is: who is being served and who is not? Answering this question requires us to look at the data and be honest. Who is making progress? Who is not making progress? Is there a pattern that we're seeing? Do the students who are not being served tend to share something in common? Have they come from a similar year level? Do they have the same gaps? Do they have English as an additional language? Is there a diagnosis of some sort? And none of these things are excuses that make a failure to learn acceptable, but they do help us understand the context in which we're working.
And I have to be direct here: if we do not have the data, we cannot answer the question about who is being served and who is not, full stop. Saying that students are doing really well, or that a program is working really well, without any data to back that up really means we feel like we are delivering instruction well. And those are two different things. Feeling like instruction is going well is not the same as knowing what students are learning. We have to be honest about the distinction, because students' time is not something we can get back. This discussion is not about blame. Every teacher I have ever worked with cares about their students, and every leader cares about their school. But caring is not a substitute for ensuring that learning is happening. If we want to know whether our students are being served, we have to move beyond an "I feel it in my heart" version of measuring outcomes.
Bernhardt's Fifth Question
Next question: what will it look like when the approach or program is fully implemented? This question asks us to describe concretely what strong implementation looks like, not what teachers self-report is happening, but what it actually looks like in a classroom when our approach is working. Teachers don't claim that they are doing the program well because they're trying to deceive leadership or because they're lazy. When someone who is a novice at a particular approach or program, regardless of how experienced they are as a teacher, says, "no, no, I'm doing it, I'm doing it", that claim comes from evaluating their performance against their own version of instruction. So when people say, "no, no, we're doing it", they're not lying to you. They are doing their version of instruction. But whether that instruction is robust enough to meet the needs of students may be a different thing.
And this is where an evaluation tool becomes useful because this whole process is not about judging teachers. It's not performance management. It's asking, have we done what is needed to create these systems and structures within our school that set everybody up for success? When you have a set of specific evidence markers, when you know what you're looking for, then everything becomes easier.
So we're looking for things like:
- Is there one clear concept that's being explored per unit?
- Are students working with enough different words to genuinely build schema, not just one base word and its family?
- Are students reading and spelling the target pattern in every lesson?
- Is instruction reaching sentence and text level in a way that is optimised for the cognitive load of the students?
- Is the teacher explaining, modelling, and checking for understanding?
- Do we really have a shared vision for what strong, robust, effective instruction looks like in our school?
- How much of what we do in classrooms is based on opinion and a best guess?
- And how much is aligned to evidence and directly linked to using student data to evaluate impact?
If we cannot describe what full, effective implementation looks like in really concrete terms, we cannot evaluate what we're doing. And there's a difference here between evaluating teacher performance, if we want to call it that, and evaluating the instructional design components of a program. Don't get those two things mixed up.
Bernhardt's Sixth Question
The next question is: how is implementation being measured, and should it be measured differently? For most schools, implementation is not really measured and evaluated at all. There's often an assumption that because the program was purchased and teachers are using the materials, the program is being implemented, and self-reporting about how well we think it's going is taken at face value by leadership. But implementation is not the same thing as effectiveness. What would it look like to actually measure implementation? Would it involve classroom walkthroughs with a specific focus? Would it involve collaborative review of lesson materials and re-watching snippets of training to compare what we're doing with what was actually in the training? If you're a Spelling Success school, you have access to the teacher resources, the evaluation templates, and the meeting agendas that help you do this. And yes, this process will include data showing that learning is occurring and that students are consolidating what is covered in class and learning to use it independently.
Bernhardt's Seventh Question
So the next question is: to what degree is this process or program being implemented? We have to have an honest reckoning here, on a scale that runs from "we have the resources and they sit on the shelf" to "every classroom is delivering this consistently and with genuine understanding". We have to acknowledge that we may not be where we think we are. And again, this is not about blaming teachers. It is a question about how effective and embedded our systems of instruction really are. If implementation is patchy, the answer is rarely that teachers are not trying hard enough. You may be surprised at how often we hear from schools, "Oh, we didn't have time to do the training." If we want implementation to be strong, we have to devote the time, and that's about creating the systems that set people up for success.
Bernhardt's Results
Then what are the results? We don't only ask that question once; the data is part of everything we're doing. And we're not just looking at basic test scores, we're looking at the full picture, all the way through to application. Getting this full picture is about evaluating instruction against evidence and asking ourselves: is our approach fully aligned with evidence? Have we successfully connected the science of reading and writing with the science of learning? Or have we ticked boxes on the literacy elements but failed to deliver in a way that reflects cognition and what we know about how human brains learn? Because there are two different questions here: are we implementing the program well, and is the program setting us up for success?
Bernhardt's Next Steps
Bernhardt's framework asks one final thing, and that is next steps. A specific plan, not to be perfect by next week, but to know where we're up to and how we're going to take steps to move us along to that next point in our development. So if we identify that we're at about a 6 out of 10, perhaps, when we look at evidence-based instruction, the goal isn't to get to 10 by next week, that's unrealistic, but it is to say, ok, this term, our goal is to get to 7, what is the practice that's going to get us there? And then do that well. Remember, evidence-based practice that is universally successful is not about doing one big thing, it's about 25 small things that are done consistently well and that are embedded in strong structures and processes so that everybody in the school community is set up for success.
Spelling Instruction Evaluation
If you would like a practical tool to support exactly this kind of conversation around spelling, the Spelling Instruction Evaluation tool is available for free download in the show notes of this episode and in the Free Resources tab at jocelynseamereducation.com. It covers the key elements of strong spelling instruction drawn from research, both the science of reading and writing, if we want to call it that, and from the science of learning. And it's designed to be used by teams, not just by leaders sitting alone in a room with a spreadsheet. You can use this tool to evaluate what's happening between classrooms, you can use it to evaluate different programs you might be considering purchasing or adopting, or you can use it for what's happening in general. Make it work for you.
I said at the start of this episode that this discussion may feel a little confronting. I really hope it wasn't. I hope it was useful. The students in Years 3-8 who are still carrying spelling gaps, who are choosing simpler words in their writing, who are exhausted by the cognitive load of basic encoding, those students deserve instruction that meets them where they are. And teachers deserve the tools and the clarity to make that happen. We will only achieve universal success when we work collectively to make instruction reflect how humans learn and what the available research tells us about effective literacy instruction.
Until I see you in the next episode, happy teaching, everyone. Bye.
Show Notes:
Looking for lessons and resources on spelling? Join us inside the Resource Room.


Jocelyn Seamer Education