Every classroom has a hidden bottleneck. It is not behaviour, motivation, or even ability. It is the finite capacity of working memory, and most lessons push against it constantly without the teacher realising. John Sweller's cognitive load theory (1988) starts from the finding that working memory can hold only a handful of items at once (estimates range from about four to seven), and argues that poorly designed instruction wastes this capacity on things that have nothing to do with learning. The question for teachers in 2026 is whether AI tools can take on that wasteful work so pupils' working memory is freed for the thinking that matters.
Key Takeaways
- Extraneous load is the target. AI cannot reduce the intrinsic complexity of your subject, but it can strip away the unnecessary cognitive cost of badly formatted resources, cluttered slides, and one-size-fits-all instructions.
- Worked examples are the single highest-value use of AI for cognitive load. Sweller's worked example effect is one of the most replicated findings in educational psychology, and AI makes generating a faded sequence fast enough to do every lesson.
- The split-attention effect kills comprehension. When text and diagrams are separated, pupils waste working memory integrating them. AI can produce annotated, integrated materials in seconds.
- AI increases cognitive load when used carelessly. Outputs that are too long, too complex, or redundant with what the teacher has already said double the load rather than halving it. The redundancy effect is real.
- The Thinking Framework bounds the task. Giving pupils a specific cognitive operation (Compare, Sequence, Cause and Effect) reduces the open-ended cognitive demand that overwhelms working memory.
The Cognitive Load Problem in Every Classroom
Sweller distinguished three types of cognitive load that operate simultaneously in any learning episode. Intrinsic load is the inherent complexity of the content: photosynthesis is more complex than counting. Extraneous load is the cognitive cost imposed by poor instructional design: a cluttered worksheet, a six-part verbal instruction delivered once, text separated from its accompanying diagram. Germane load is the productive cognitive effort involved in forming new schema and connecting new knowledge to existing knowledge (Sweller, 1988; Paas, Renkl and Sweller, 2003).
While extraneous load is the most directly reducible through AI, intrinsic and germane load also respond to intelligent tool use.
Managing intrinsic load with AI. Intrinsic load comes from the inherent complexity of the material and cannot be eliminated, but it can be sequenced. AI can break a complex topic into a learning progression: "Arrange these 8 concepts about photosynthesis from simplest to most complex, then generate a lesson that introduces one concept per session." This is Sweller's element interactivity principle in action: by isolating elements before combining them, the teacher reduces the number of interacting elements pupils must process simultaneously.
Building germane load with AI. Germane load is the productive effort of constructing and automating schemas. Teachers should not try to reduce this, because it is the learning. AI can increase germane load by generating tasks that require pupils to connect new information to existing knowledge: "Create 5 questions that ask pupils to link today's lesson on fractions to what they already know about sharing equally." The Thinking Framework's Analogy operation is particularly effective here: "What is this new concept LIKE something we already understand?"
Means-ends analysis describes the cognitive process where a learner compares their current state to the goal state and works backward to close the gap. This is one of the most cognitively demanding processes in problem-solving because it requires holding both states in working memory simultaneously. Sweller (1988) showed that means-ends analysis consumes so much working memory that little capacity remains for schema construction. AI can reduce this burden by providing the goal state explicitly: instead of "solve this problem," provide "the answer is 42, and show how you would get there." This worked-example approach eliminates means-ends processing and redirects cognitive resources toward understanding the method.
The goal of instructional design is to minimise extraneous load so that available working memory capacity goes toward germane load. Most teachers do this instinctively by simplifying language, chunking tasks, and using visuals alongside explanations. The problem is that doing this well, for every pupil, every lesson, takes enormous preparation time. This is exactly where AI can help.
Research consistently shows that extraneous load is the most tractable type. You cannot make photosynthesis simpler than it is, but you can present it in a way that stops pupils wasting cognitive resources on decoding instructions, integrating separated information sources, or searching a cluttered slide for the relevant content. Working memory is the bottleneck; AI can widen the pipe by eliminating the friction that should never have been there.
How AI Reduces Extraneous Load
The mechanism is straightforward. AI handles the reformatting so the teacher's cognitive effort stays on pedagogy. A teacher who would have spent forty minutes simplifying a dense Year 9 science text into three reading levels can now do it in three minutes. The time saved is not the main benefit: the main benefit is that the simplified version actually gets made, gets used, and reduces extraneous load for the pupils who needed it.
Specific AI operations that directly reduce extraneous load include: rewriting complex instructions into numbered steps with shorter sentences; chunking multi-step tasks so pupils see only one step at a time; generating audio-ready text that reduces transient information load for pupils who struggle with reading; and producing visual alternatives to text-heavy explanations. Each of these maps directly onto a documented cognitive load effect.
The key principle for teachers is to be explicit in the prompt about what you want the AI to do and why. "Simplify this" is too vague. "Rewrite this for a reading age of nine, keeping the key vocabulary but using sentences of no more than twelve words and numbered steps of no more than one action each" produces something useful. The prompt is the instructional design decision; the AI is the execution.
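A constraint like "sentences of no more than twelve words" is easy to state in a prompt and easy to forget to check in the output. The snippet below is a minimal sketch of such a check, not part of any AI tool: the function name `sentences_over_limit` and the sample text are hypothetical, and the sentence splitting is deliberately naive.

```python
import re


def sentences_over_limit(text: str, max_words: int = 12) -> list[str]:
    """Return the sentences in `text` that contain more than `max_words` words."""
    # Naive split: a sentence ends at . ! or ? followed by whitespace.
    # Abbreviations such as "e.g." will confuse it; good enough for a quick check.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    return [s for s in sentences if len(s.split()) > max_words]


# Hypothetical AI output to check against the twelve-word limit from the prompt.
sample = (
    "Plants make food in their leaves. "
    "They use light energy from the sun to turn carbon dioxide and water into glucose and oxygen."
)

for sentence in sentences_over_limit(sample):
    print("Too long:", sentence)
```

Any sentence it flags can be pasted back into the AI tool with a request to split it in two.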
Linking this to dual coding is straightforward. Mayer's cognitive theory of multimedia learning (2009) established that learners process verbal and visual information through separate channels. AI can generate dual-channel materials quickly: a brief text explanation alongside a described visual, or a verbal analogy for an abstract diagram. This reduces extraneous load by distributing information across both channels rather than overwhelming either one.
Faded Worked Examples with AI
The worked example effect is one of the most robustly replicated findings in educational psychology. Sweller and Cooper (1985) showed that studying worked examples was more effective than solving equivalent problems for novice learners, because problem-solving imposes high extraneous load through means-ends analysis while worked examples direct attention to the solution structure itself. Renkl (2014) later demonstrated that fading the scaffolding progressively, as learners develop expertise, is the optimal approach.
AI makes faded worked example sequences fast enough to be practical for classroom use. A teacher preparing a maths lesson can generate a complete four-step sequence in under two minutes. The prompt template below works across subjects and age groups:
Prompt: "Create 4 versions of this problem at decreasing levels of scaffolding: (1) fully worked with every step shown and annotated, (2) steps 1 and 2 shown but steps 3 and 4 blank with prompts, (3) only the first step shown with a brief prompt for each subsequent step, (4) the problem statement only with no scaffolding. Keep the mathematical notation consistent across all four versions."
This sequence exploits the expertise reversal effect identified by Kalyuga et al. (2003): scaffolding that helps novices becomes redundant noise for more experienced learners, and can actually increase cognitive load. By having all four versions available, teachers can match the level of scaffolding to where each pupil is in their learning, rather than guessing at a single version that will overload some and bore others.
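The fading pattern itself is mechanical, which is why it delegates well. As an illustration only, the sketch below applies the four levels from the prompt (all steps shown, first two shown, first step only, none shown) to a list of solution steps. The function names and the example equation are hypothetical, and the sketch assumes the worked solution has already been broken into steps.

```python
def fade(steps: list[str], shown: int) -> list[str]:
    """Show the first `shown` steps in full and replace the rest with blanks."""
    return [
        f"Step {i}: {step}" if i <= shown else f"Step {i}: ______"
        for i, step in enumerate(steps, start=1)
    ]


def faded_sequence(steps: list[str]) -> list[list[str]]:
    """Build four versions: fully worked, first two steps, first step only, no steps."""
    levels = [len(steps), 2, 1, 0]
    return [fade(steps, shown) for shown in levels]


# Hypothetical worked solution for: solve 2x + 3 = 11
steps = [
    "Subtract 3 from both sides",
    "Simplify to get 2x = 8",
    "Divide both sides by 2",
    "Simplify to get x = 4",
]

for number, version in enumerate(faded_sequence(steps), start=1):
    print(f"Version {number}")
    print("\n".join(version))
    print()
```

The sketch only blanks steps out. The annotations and per-step prompts that the template asks for still need to come from the teacher or the AI tool.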
The same principle applies in English, history, science, and languages. A worked essay paragraph with annotations explaining each rhetorical move, followed by a partially completed paragraph, followed by a planning frame only, is the literacy equivalent of the maths sequence above. AI generates all three versions from a single master example in minutes. This is one of the clearest cases where AI saves teacher preparation time and improves scaffolding quality simultaneously.
AI for Split-Attention Reduction
Chandler and Sweller (1991) identified the split-attention effect: when learners must mentally integrate two separate sources of information (a diagram and its accompanying text, for instance) the integration process itself consumes working memory. The cognitive cost is not in understanding either source; it is in the act of holding one in mind while cross-referencing the other.
The instructional fix is physical integration: labels on the diagram rather than in a separate key, annotations directly on the visual rather than in a text box beside it. The problem is that producing integrated materials takes significantly longer than producing text and a diagram separately. AI eliminates this constraint.
Prompt template for split-attention reduction: "Combine this explanation with the description of this diagram. All labels and explanatory text should appear directly on or immediately adjacent to the relevant part of the visual. Describe the integrated result in enough detail for me to create it, noting exactly where each label should sit."
For teachers working with maps, scientific diagrams, flowcharts, or annotated source documents, this is immediately practical. The AI output describes an integrated layout that the teacher can produce in a slide tool or simply have pupils draw while the teacher narrates. Either way, the split-attention burden is removed. Where differentiation is required, AI can produce integrated versions at different complexity levels from the same source material.
Differentiation Without Cognitive Overload
Traditional differentiation creates the worst possible conditions for the teacher's own cognitive load. Preparing three or four versions of every resource, across every lesson, across every class, is unsustainable. The result is that differentiation either does not happen, or happens inconsistently, or relies on in-the-moment adaptation that demands so much teacher attention that it reduces the quality of whole-class instruction.
AI changes the economics of differentiation. A single source text can be transformed into three reading levels in three minutes. The teacher makes the pedagogical decisions (which vocabulary to preserve at each level, what scaffolding to add, which challenge questions to append) and the AI does the textual work. This is a clean division of labour.
Prompt template: "Take this Year 9 science passage and produce three versions. Version 1: age-appropriate as written. Version 2: simplified to a reading age of nine, keeping all key scientific terms but using shorter sentences and replacing complex connectives with simple ones. Version 3: an extended version for pupils who have already encountered this content, with two additional challenge questions requiring them to apply the concept to an unfamiliar example. Format all three clearly."
The cognitive load benefit for pupils is direct. A pupil reading at age nine who is given the age-appropriate Year 9 text is spending most of their working memory on decoding, leaving almost none for understanding the science. The same pupil given the simplified version can actually engage with the concept. Cognitive load theory predicts this; AI makes the solution feasible.
It is worth noting that differentiation by text level is only one application. AI can also generate task versions with different levels of working memory demand: a version with sentence starters and vocabulary banks for pupils who need them, and a version with no supports for pupils who do not. The key is that the teacher specifies the cognitive design, not just the difficulty.
AI Scaffolding for SEND Learners
Pupils with identified SEND often face compounding cognitive load challenges that standard classroom materials do not address. Gathercole and Alloway (2008) demonstrated that working memory difficulties are more prevalent in SEND populations than previously recognised, and that they have a stronger relationship with academic attainment than IQ. The implication is that for many SEND learners, the barrier is not intellectual capacity but the cognitive architecture of the materials they are given.
AI can address specific SEND-related cognitive load problems with targeted prompts. For pupils with working memory difficulties, sentence starters reduce the retrieval demand of writing; AI generates these from any task description in seconds. For pupils with slow processing speed, chunked instructions presented one at a time reduce the transient information load from multi-step verbal instructions. For pupils with dyslexia, AI can simplify text and suggest audio-format alternatives. For autistic pupils, visual schedules generated from lesson plans reduce the uncertainty that consumes working memory at lesson transitions.
The PDA (pathological demand avoidance) application deserves specific mention. PDA-aware teaching reframes demands as choices. AI can rewrite a directive task into an autonomous one: "Complete questions 1-5" becomes "Choose any four of these five questions and complete them in any order." This is a single textual transformation that meaningfully changes the cognitive and emotional experience for a PDA pupil. AI makes it fast enough to do routinely rather than occasionally.
For SENCOs planning provision, the connection to AI for SEND administration is clear. The same tools that generate differentiated resources can also assist with documentation and planning, reducing the administrative cognitive load that detracts from the SENCO's capacity to support learners directly. See also our guide on reducing cognitive overload for SEND learners with AI.
The Thinking Framework as a Cognitive Load Manager
Open-ended thinking tasks carry some of the heaviest cognitive load because the pupil must generate both the content and the structure simultaneously. "Write about the causes of the First World War" requires a pupil to decide what counts as a cause, how to order them, what detail to include, and how to structure a written response, all while managing their existing knowledge. The cognitive demand is unbounded.
The Thinking Framework solves this by providing the cognitive structure in advance. When a teacher says "Compare the causes of the First World War using the Compare operation," the pupil knows they need to identify attributes, find similarities and differences, and draw a conclusion. The structure is given; only the content is open. This is a principled reduction in cognitive demand.
AI combined with the Thinking Framework produces bounded cognitive tasks at scale. A teacher can prompt: "Create five Compare tasks on the causes of the First World War at different complexity levels. Each task should specify what is being compared and what attributes to consider. Use the Compare thinking operation structure: identify attributes, find similarities and differences, draw a conclusion." The result is a set of tasks that are cognitively well-designed, not just superficially differentiated.
The same principle applies across all eight operations: Compare, Classify, Sequence, Cause and Effect, Part-Whole, Analogy, Perspective, and Systems Thinking. Each operation gives the pupil a cognitive procedure to follow, which reduces the extraneous load of working out how to think about the content. AI can generate multiple instances of each operation on any topic, meaning teachers can provide variety without cognitive design work for every individual task.
This is also the argument for metacognitive framing in AI-generated tasks. When pupils understand what cognitive operation they are performing, they can monitor their own thinking more effectively. The Thinking Framework provides that language; AI can embed it into task instructions consistently.
Common Mistakes: When AI Increases Cognitive Load
The redundancy effect, identified by Chandler and Sweller (1991), is the most frequently violated principle when teachers use AI. It states that presenting the same information in multiple formats simultaneously can actually increase cognitive load rather than reduce it, because learners process both formats and then must reconcile them. If a teacher delivers a verbal explanation and then projects an AI-generated text explanation covering the same ground, the two sources compete for working memory rather than supplementing each other.
The practical rule is: use AI to replace a less effective format, not to add to it. If the teacher is explaining verbally, the slide should show a diagram, not the same explanation in text. If the AI has generated a simplified text version, the teacher does not also need to read it aloud word for word. This sounds obvious, but the temptation to use AI-generated content as a supplement to existing resources, rather than as a replacement, is strong, and it produces redundant presentations.
Mayer's coherence principle (2009) provides a related warning. AI tends to generate comprehensive outputs, and comprehensive is often the enemy of learnable. A 600-word AI-generated explanation of osmosis is not more useful than a 150-word explanation; it is likely less useful, because the additional detail increases extraneous load without adding to understanding for most pupils. Teachers should explicitly prompt for brevity: "Explain osmosis in no more than 100 words for a Year 8 pupil with no prior knowledge of the topic."
A third failure mode is complexity mismatch. AI generates at the level you specify, but if you do not specify, it defaults to a generic middle level that may be too complex for struggling learners and too simple for strong ones, while feeling appropriate to neither. The expertise reversal effect means that the same AI-generated scaffold that helps a novice can actively impede a more experienced learner. Prompt specifically, or generate multiple versions and select the appropriate one. Barriers to learning are often design problems rather than pupil problems, and AI can introduce new design problems if the teacher is not deliberate about the prompt.
Connecting to Rosenshine's Principles
Rosenshine's Principles of Instruction (2012) align closely with cognitive load theory and provide a practical framework for thinking about where AI can add value. Several principles map directly onto cognitive load reduction: presenting new material in small steps (reduces intrinsic load presented at once), providing models (worked examples that reduce extraneous load), and checking for understanding before moving on (prevents extraneous load from accumulated misconceptions).
AI can support each of these. Small steps can be generated from a larger learning objective by asking AI to sequence the component knowledge elements. Models can be generated as worked examples with the fading approach described above. Rosenshine's emphasis on high success rate also has a cognitive load interpretation: tasks that are pitched correctly do not impose unnecessary extraneous load from confusion, and AI can generate formative checkpoints that help teachers pitch tasks accurately.
The connection to retrieval practice is also worth making explicit. Retrieval practice works partly by strengthening schema in long-term memory, which in turn reduces future cognitive load when related content is encountered. AI can generate retrieval practice questions quickly: low-stakes, frequent, and varied. A teacher who uses AI to produce five retrieval questions at the start of each lesson is investing in reducing the cognitive load of future lessons, not just the current one.
What to Try Next Lesson
Take the most complex written instruction you give pupils this week. It might be a task brief, a set of exam questions, or a multi-step investigation protocol. Paste it into an AI tool with this prompt:
"Simplify this instruction for a pupil with a reading age of nine. Keep all the key subject vocabulary but use sentences of no more than twelve words. Present each action as a separate numbered step. Where there are more than five steps, break them into two labelled sections."
Compare the original and the AI output. The simplified version is almost certainly clearer for your whole class, not just your struggling learners. Use it for everyone. The pupils who would have found the original easy will not be disadvantaged by clearer instructions; the pupils who would have been confused by the original will now be able to engage with the actual content. That is the extraneous load reduction in practice.
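The layout rule in that prompt (one numbered step per action, split into two labelled sections once there are more than five steps) can also be applied by hand or with a few lines of code. This is a minimal sketch under stated assumptions: the function name `layout_steps`, the labels "Part A" and "Part B", and the example steps are all hypothetical.

```python
def layout_steps(steps: list[str], max_per_section: int = 5) -> str:
    """Number each step; split into two labelled sections if there are too many."""
    numbered = [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    if len(numbered) <= max_per_section:
        return "\n".join(numbered)
    # Split as evenly as possible; numbering continues across both sections.
    half = (len(numbered) + 1) // 2
    return "\n".join(["Part A", *numbered[:half], "", "Part B", *numbered[half:]])


# Hypothetical seven-step investigation protocol.
protocol = [
    "Collect a beaker",
    "Measure 50 ml of water",
    "Pour the water into the beaker",
    "Record the starting temperature",
    "Add one spatula of salt",
    "Stir for thirty seconds",
    "Record the final temperature",
]

print(layout_steps(protocol))
```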
The second thing to try is a faded worked example. Take one problem or task you plan to set this week. Ask AI to generate a fully worked version with annotations, then a partially worked version with the final two steps removed, then a planning frame only. Use all three in the same lesson, matching each to the appropriate group of pupils. Notice whether the pupils who receive the fully worked example engage more successfully with the independent practice than usual, and whether the planning-frame group are stretched without being overwhelmed.
If you want to go further, look at your most complex resource from this term, whether a reading, a diagram, or a data table, and ask AI to integrate the text with the visual so that labels appear directly on the relevant part of the image rather than in a separate key. The time cost of reducing split-attention in that one resource is minutes. The benefit is that pupils no longer spend working memory cross-referencing a key against the image. That is the cognitive load case for AI in teaching: not a transformation, but a removal of friction that should never have been there. See our guide to AI for teachers and AI in lesson planning for the broader toolkit.
Further Reading: Key Research Papers
The following papers provide the academic foundation for using cognitive load principles in instructional design with AI.
Cognitive Load Theory: Instructional Implications of the Interaction between Information Structures and Cognitive Architecture
2,100+ citations
Sweller, J., Ayres, P., & Kalyuga, S. (2011)
This comprehensive treatment of cognitive load theory covers intrinsic, extraneous, and germane load in depth, with instructional design implications directly applicable to AI-generated materials. Essential reading for teachers who want to move beyond surface-level AI prompting.
The Worked Example Effect and Human Cognition
850+ citations
Sweller, J. (2006)
Sweller reviews the evidence base for worked examples and explains why studying examples is more effective than problem-solving for novice learners. The paper establishes the theoretical case for using AI to generate faded example sequences at scale.
Principles for Reducing Extraneous Processing in Multimedia Learning
Mayer, R. E. (2009)
This chapter from the Cambridge Handbook of Multimedia Learning covers the coherence, signalling, redundancy, spatial contiguity, and temporal contiguity principles. Each maps onto a specific AI misuse risk, making this essential reading for teachers designing AI-assisted materials.
Working Memory Capacity and Implications for Learning
1,400+ citations
Gathercole, S. E., & Alloway, T. P. (2008)
Gathercole and Alloway's work on working memory and learning demonstrates the prevalence of working memory difficulties in school-age children and their relationship to academic outcomes. This provides the evidence base for why reducing extraneous load has outsized benefit for SEND learners.
Cognitive Architecture and Instructional Design: 20 Years Later
900+ citations
Sweller, J., van Merriënboer, J., & Paas, F. (2019)
A landmark review updating cognitive load theory with twenty years of subsequent research. The paper addresses how the theory applies to technology-enhanced learning environments, making it directly relevant to AI-assisted instruction. Covers element interactivity, the expertise reversal effect, and implications for adaptive learning systems.
Written by the Structural Learning Research Team
This article draws on peer-reviewed research in cognitive load theory, educational psychology, and AI-assisted instruction. References: Sweller and Cooper (1985); Sweller (1988); Chandler and Sweller (1991); Kalyuga et al. (2003); Paas, Renkl and Sweller (2003); Gathercole and Alloway (2008); Mayer (2009); Rosenshine (2012); Renkl (2014).