Why methodology matters more than speed. Cognitive load theory, retrieval practice, Bloom's Taxonomy, and Rosenshine's principles applied to AI lesson planning tools.
Most AI lesson planners generate content. The best ones generate learning. The difference is learning science. Right now, thousands of teachers are downloading AI-generated worksheets that look professional but build little durable learning. They contain questions, yes, but no retrieval practice. They present information, yes, but no scaffolding. They aim at learning, but without a framework for how learning actually happens in the brain. This article is about the difference between a faster photocopier and a thinking partner. It is about why the methodology built into your AI tool matters far more than the speed at which it generates.
Key Takeaways
AI tools without a learning science foundation are sophisticated photocopiers, not teaching partners.
Cognitive load theory, retrieval practice, scaffolding, and Rosenshine's principles should be engineering specifications for any AI lesson planner.
The best AI tools adapt to the zone of proximal development, build in spacing and retrieval, and progress through cognitive levels within a single lesson.
Before adopting an AI tool, ask: what learning science is this built on? If the answer is unclear, you are buying speed, not learning.
What Learning Science Means for AI
Learning science is the intersection of cognitive psychology and educational research applied to how people actually learn. It answers questions like: how does the brain encode information? What strengthens memory? How do pupils move from novice to expert? When your AI tool is built on learning science, every feature has a reason. When it is not, you get a feature machine.
Consider two Year 4 maths lessons, both generated by AI on the topic of fractions. The first tool generates a worksheet with 15 questions: "What is 1/2 of 8? What is 1/4 of 12?" The pupils complete them, get feedback, and move on. The second tool generates a lesson sequence built on cognitive load theory. It starts with a worked example: a diagram showing a chocolate bar divided into quarters, with the teacher narrating: "Each piece is one quarter. If we have four pieces, we have four quarters, which is the whole." Then it provides a partly-worked problem with a sentence stem: "If one quarter is __, then two quarters is __." Only then, once extraneous load is reduced and germane load is activated, does it ask an independent question. Which lesson teaches fractions better? The science is clear: scaffolding with worked examples builds understanding more reliably than unsupported independent practice (Rosenshine, 2012).
An AI without learning science is a risk in schools. It will generate plausible-looking activities that feel productive but build shallow learning. Over time, pupils fall further behind because they were taught efficiently but not effectively.
Cognitive Load Theory and AI Design
John Sweller's cognitive load theory (1988) divides the mental effort required to learn into three types. Intrinsic load is the inherent difficulty of the task (learning fractions is harder than learning whole numbers). Extraneous load is the unnecessary cognitive burden imposed by poor design (confusing instructions, cluttered layouts, irrelevant information). Germane load is the productive mental effort directed at building understanding and creating schema.
A learning-science-aware AI tool manages these loads ruthlessly. It reduces extraneous load by presenting instructions clearly, using clean layouts, and removing distractions. It maximizes germane load by asking questions that force schema-building, providing models of expert thinking, and gradually increasing complexity. A tool built without this understanding generates the opposite: it piles extraneous load onto pupils by asking 20 unscaffolded questions on a new topic, leaving no mental space for actually learning.
Here is a concrete example. A Year 5 teacher is teaching long multiplication for the first time. A poor AI tool generates a worksheet with 10 problems: "14 × 23 = ? 16 × 31 = ?" with no worked example, no sentence stems, no visual support. The pupils spend their limited working memory decoding the format and recalling the algorithm, leaving nothing for understanding why it works. A learning-science-aware AI tool generates: a visual representation of 14 × 23 as an area model (base ten grids), a fully worked example with narration ("We multiply 14 by the 20 first, then by the 3, then add them"), a partly-worked problem with a sentence stem ("Following this example, 16 × 31 = 16 × __ plus 16 × __ = __ plus __ = __"), and only then an independent problem. Same topic, wildly different cognitive load management.
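To make that contrast concrete in engineering terms, here is a minimal sketch of a scaffolded sequence represented as data. The `LessonStep` class, its fields, and the step types are assumptions invented for illustration, not any particular tool's format.

```python
from dataclasses import dataclass, field

@dataclass
class LessonStep:
    """One step in a scaffolded sequence, ordered from most to least support."""
    step_type: str                       # "worked_example", "partly_worked", "independent"
    prompt: str
    supports: list = field(default_factory=list)   # stems, visuals, hints still available

# A hypothetical sequence for 14 x 23, following the fading pattern described above.
long_multiplication = [
    LessonStep("worked_example",
               "14 x 23: multiply 14 by 20, then by 3, then add the two parts.",
               ["area model on base-ten grids", "teacher narration"]),
    LessonStep("partly_worked",
               "16 x 31 = 16 x __ plus 16 x __ = __ plus __ = __",
               ["sentence stem", "previous worked example still visible"]),
    LessonStep("independent",
               "Now try 18 x 24 on your own."),
]

# Extraneous load falls step by step: support is removed only after earlier steps are secure.
for step in long_multiplication:
    print(f"{step.step_type}: {len(step.supports)} supports")
```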
One more insight: Kalyuga's expertise reversal effect (2003) shows that worked examples that are helpful to novices are actually distracting to experts. A truly intelligent AI tool adapts. It recognizes when a pupil has mastered the first problem type and removes scaffolding. It does not bore experts with examples; it does not overwhelm novices with independence.
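In code, that adaptation can be as simple as a rule that fades or restores support based on recent answers. A sketch, assuming an illustrative mastery rule (two consecutive correct answers) rather than any published algorithm:

```python
def scaffold_level(recent_results: list, current_level: int) -> int:
    """
    Choose how much support the next task should carry.
    Levels: 2 = full worked example, 1 = partly worked with stems, 0 = independent.
    Assumption (illustrative only): two consecutive correct answers fade support
    by one level; two consecutive errors restore one level of support.
    """
    if len(recent_results) >= 2 and all(recent_results[-2:]):
        return max(0, current_level - 1)    # fade scaffolding for the emerging expert
    if len(recent_results) >= 2 and not any(recent_results[-2:]):
        return min(2, current_level + 1)    # restore scaffolding for the struggling novice
    return current_level

print(scaffold_level([True, True], current_level=1))    # 0: move to independent practice
print(scaffold_level([False, False], current_level=0))  # 1: bring back the sentence stems
```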
Bloom's Taxonomy in AI-Generated Lessons
Most AI tools default to lower-order thinking. Ask an AI to generate a history lesson on the Norman Conquest, and it will generate recall questions: "When was the Battle of Hastings?" "Who was William of Normandy?" These are useful, but they are not learning. They are memorization.
Bloom's revised taxonomy (Anderson and Krathwohl, 2001) describes six levels of cognitive complexity: Remember (recall facts), Understand (explain ideas), Apply (use in new situations), Analyse (break into parts), Evaluate (make judgments), Create (produce new work). A learning-science-informed AI tool deliberately engineers progression through these levels within a single lesson. It does not ask all questions at the Remember level. It builds a staircase.
A Year 8 history teacher generates a lesson on the Norman Conquest using an AI built on learning science. The tool produces: a Remember question ("In what year did the Battle of Hastings occur?"), an Understand question ("Why did William believe he had a claim to the English throne?"), an Apply question ("If you were an English noble in 1066, would you have supported Harold or William, and why?"), an Analyse question ("What were the short-term and long-term consequences of the Norman Conquest for England?"), and an Evaluate question ("Was the Norman Conquest beneficial or harmful to England?"). Notice the progression. A pupil who can only recall facts moves toward understanding. A pupil who understands moves toward application and analysis. The lesson does not leave anyone behind; it pulls everyone forward.
Most AI tools generate Remember and Understand questions because they are easiest to produce automatically. A tool built on learning science includes the extra logic needed to climb the taxonomy within a single lesson.
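That extra logic can start very small: tag each generated question with a level and check that the set climbs. A sketch, with the question set drawn from the example above; the level tags and the tagging step itself are assumed to happen upstream.

```python
BLOOM_ORDER = ["remember", "understand", "apply", "analyse", "evaluate", "create"]

def climbs_taxonomy(tagged_questions: list) -> bool:
    """
    Check that a question set never steps back down the taxonomy and reaches
    at least the 'analyse' level. Each item is a (question_text, level) pair.
    """
    levels = [BLOOM_ORDER.index(level) for _, level in tagged_questions]
    non_decreasing = all(a <= b for a, b in zip(levels, levels[1:]))
    return non_decreasing and max(levels) >= BLOOM_ORDER.index("analyse")

norman_conquest = [
    ("In what year did the Battle of Hastings occur?", "remember"),
    ("Why did William believe he had a claim to the English throne?", "understand"),
    ("Would you have supported Harold or William in 1066, and why?", "apply"),
    ("What were the short-term and long-term consequences of the Conquest?", "analyse"),
    ("Was the Norman Conquest beneficial or harmful to England?", "evaluate"),
]

print(climbs_taxonomy(norman_conquest))  # True: the lesson builds the staircase
```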
Zone of Proximal Development and Adaptive AI
Lev Vygotsky's (1978) concept of the zone of proximal development (ZPD) describes the space between what a pupil can do alone and what they can do with support. Learning happens in this zone. A task that is too easy is boring; a task that is too hard is demoralizing. The learning sweet spot is just beyond current competence, with scaffolding.
Most AI tools generate one-size-fits-all content. All pupils get the same lesson, the same questions, the same difficulty. A learning-science-aware tool recognizes that the ZPD is different for every pupil and adapts accordingly. It provides more scaffolding to struggling pupils, less to confident ones, and gradually reduces scaffolding as competence grows.
A Year 3 reading comprehension lesson illustrates this. The teacher has just introduced a story about a lost dog. A struggling reader needs more support to engage with the text. The AI provides sentence stems: "The dog was lost because __. The owner felt __ because __. I think the dog will be found because __." A confident reader needs less scaffolding and more challenge. The AI provides an open prompt: "Why do you think the author described the owner's feelings at the end of the story?" A pupil with high comprehension gets an analysis question: "How would the story change if it were told from the dog's perspective?" All three pupils are in their ZPD. All three are learning. A one-size-fits-all tool leaves two of them behind.
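Expressed as a sketch, the adaptation is a lookup keyed by current competence. The tier names ("emerging", "secure", "exceeding") are invented for illustration; the prompts are those from the example above.

```python
# Hypothetical support tiers for the same Year 3 comprehension objective.
ZPD_PROMPTS = {
    "emerging": [
        "The dog was lost because __.",
        "The owner felt __ because __.",
        "I think the dog will be found because __.",
    ],
    "secure": [
        "Why do you think the author described the owner's feelings at the end of the story?",
    ],
    "exceeding": [
        "How would the story change if it were told from the dog's perspective?",
    ],
}

def prompts_for(pupil_tier: str) -> list:
    """Return prompts pitched just beyond the pupil's current competence."""
    return ZPD_PROMPTS.get(pupil_tier, ZPD_PROMPTS["secure"])

print(prompts_for("emerging")[0])   # a sentence stem for the struggling reader
print(prompts_for("exceeding")[0])  # an analysis question for the confident reader
```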
Adaptive AI is still emerging. But the learning science principle is clear: if your AI tool does not adapt to individual pupils, it is not serving individual learning. It is broadcasting.
Retrieval Practice and Spaced Repetition
The single most powerful learning technique in cognitive psychology is retrieval practice: the act of pulling information from memory (Roediger and Butler, 2011). It is more effective than re-reading, summarizing, or even focused study. Every time a pupil retrieves information from memory, the memory strengthens. Every time they fail to retrieve it, the retrieval attempt itself provides a learning opportunity.
Retrieval practice is not the same as answering questions. A question is only retrieval practice if the pupil must access long-term memory to answer it. A question they can answer by reading the previous sentence is not retrieval practice. A question about information presented three weeks ago, when it has partially faded, is powerful retrieval practice.
Spaced repetition amplifies this. Memories become more durable when retrieval attempts are spaced over time. A pupil who answers a question about photosynthesis today, then again in three days, then again in two weeks, will retain the concept far longer than a pupil who answers five questions in a row and never sees it again. This is not intuitive; pupils feel more confident after massed practice. But spaced practice produces durable learning (Karpicke, 2012).
How many AI lesson planners build in retrieval practice? Very few. How many automatically space retrieval attempts over days and weeks? Almost none. An AI tool built on learning science would work like this: a Year 7 science teacher generates a lesson on photosynthesis. The AI produces learning activities for that lesson. But it also tracks what the teacher covered (the tool knows photosynthesis was taught three weeks ago). When the teacher generates a lesson on a new topic next week, the AI embeds a retrieval question from photosynthesis in the starter activity, spaced according to cognitive science principles. By term's end, pupils have retrieved the concept multiple times, spaced across weeks, and retained it durably.
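As a sketch, a scheduler could decide which previously taught topics are due for a retrieval question in today's starter. The expanding intervals (3 days, 1 week, 3 weeks, 6 weeks) are a common heuristic rather than a prescription, and the data structures are invented for illustration.

```python
from datetime import date, timedelta

# Expanding review intervals after first teaching (illustrative heuristic, not a prescription).
REVIEW_INTERVALS = [timedelta(days=3), timedelta(weeks=1), timedelta(weeks=3), timedelta(weeks=6)]

def topics_due_for_retrieval(teaching_log: dict, today: date, tolerance_days: int = 2) -> list:
    """Return topics whose next spaced review falls within `tolerance_days` of today."""
    due = []
    for topic, taught_on in teaching_log.items():
        for interval in REVIEW_INTERVALS:
            if abs((taught_on + interval - today).days) <= tolerance_days:
                due.append(topic)
                break
    return due

# Photosynthesis was taught three weeks before today's lesson; cells more recently.
log = {"photosynthesis": date(2024, 9, 10), "cells": date(2024, 9, 20)}
print(topics_due_for_retrieval(log, today=date(2024, 10, 1)))
# ['photosynthesis']: due for a retrieval question in today's starter activity
```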
This is not magic. It is elementary learning science. Yet almost no current tools do it because it requires data infrastructure and pedagogical intent. The tools optimized purely for speed skip this step.
Dual Coding in AI-Generated Resources
Allan Paivio's dual coding theory (1986) shows that combining verbal and visual information produces stronger encoding than either alone. When you encode information through two channels, you build two memory routes to the same concept. Retrieve one route, and the other comes with it. The brain is built for both words and images.
Most current AI tools generate text. A prompt like "Explain photosynthesis" produces a paragraph. A paragraph is a single channel. A learning-science-aware tool generates both words and images simultaneously. The same prompt produces: a written explanation ("Photosynthesis is the process by which plants convert light energy into chemical energy stored in glucose") and a fully labelled diagram (sunlight arrow into the leaf, water arrow up from the roots, glucose arrow out to the rest of the plant). The diagram is not decorative; it is essential to encoding.
A Year 6 science lesson on the water cycle illustrates this. One AI tool generates a paragraph: "Water evaporates from the ocean, forms clouds, falls as rain, and flows back to the ocean." A learning-science-informed tool generates that same paragraph plus an annotated diagram showing the ocean, arrows for evaporation rising upward, clouds forming, rain falling, and water running downhill back to the sea, with a label for each process. Now pupils encode through words and spatial relationships. Now dual coding is active.
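A sketch of what generating both channels might mean at the output level: the verbal explanation is paired with a machine-readable diagram specification that a renderer could draw. The field names and spec format are assumptions for illustration.

```python
# Hypothetical dual-coded output: a verbal channel plus a diagram specification.
water_cycle = {
    "explanation": (
        "Water evaporates from the ocean, forms clouds, falls as rain, "
        "and flows back to the ocean."
    ),
    "diagram_spec": {
        "nodes": ["ocean", "clouds", "land"],
        "arrows": [
            {"from": "ocean", "to": "clouds", "label": "evaporation"},
            {"from": "clouds", "to": "land", "label": "precipitation (rain)"},
            {"from": "land", "to": "ocean", "label": "collection and runoff"},
        ],
    },
}

# Both channels describe the same concept, giving pupils two memory routes to it.
for arrow in water_cycle["diagram_spec"]["arrows"]:
    print(f'{arrow["from"]} -> {arrow["to"]}: {arrow["label"]}')
```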
This seems obvious once stated. Yet hundreds of AI tools prioritize speed and brevity over encoding strength and generate text-only output.
Rosenshine's Principles as an AI Framework
Barak Rosenshine spent decades analyzing the most effective teachers and extracted ten principles of instruction grounded in cognitive science (2012). These principles should be the engineering specification for any AI lesson planner. If your AI tool is not built to implement Rosenshine's principles, it is not built on learning science.
Begin lessons with a review of prior learning. This is retrieval practice. Every lesson should start with questions about what was taught yesterday, last week, or last term. An AI tool should automatically generate review questions at appropriately spaced intervals.
Present new material in small steps. This is cognitive load management. Do not introduce ten new concepts at once. Break them into chunks. Present one chunk, ask questions about it, then move to the next. An AI tool should segment lessons into small steps and check understanding between each.
Ask many questions and check for understanding. Do not present information and assume pupils understood. Stop and question. Use questioning to reveal gaps in understanding and address them immediately. An AI tool should embed frequent checking questions, not just end-of-lesson assessments.
Provide models and worked examples. When teaching a new procedure or strategy, show your thinking. Work through an example aloud while pupils watch. Then work through another with pupil participation. Only then ask pupils to work independently. An AI tool should generate worked examples as standard, with fading scaffolding.
Guide practice with scaffolding. When pupils practise, they should not practise mistakes. Provide support, hints, and feedback. Gradually reduce support as competence grows. An AI tool should scaffold practice using sentence stems, models, and hints, not throw pupils into deep water.
These five principles (of ten) are non-negotiable for effective learning. Any AI tool claiming to support teaching should implement them. Most do not. Read the full Rosenshine principles and check your current AI tool against them. How many does it actually fulfil?
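Treating the principles as an engineering specification could look like the sketch below, in which a generated lesson is checked against the five principles discussed here. The lesson structure and the checks are illustrative assumptions, not a published standard.

```python
# Five of the principles expressed as checks on a generated lesson. The lesson format
# (keys and values) is a hypothetical structure for illustration, not any tool's schema.
def rosenshine_report(lesson: dict) -> dict:
    chunks = lesson.get("chunks", [])
    return {
        "daily_review":        bool(lesson.get("starter_retrieval_questions")),
        "small_steps":         bool(chunks) and all(len(c["new_ideas"]) <= 2 for c in chunks),
        "check_understanding": bool(chunks) and all(c["checking_questions"] for c in chunks),
        "worked_examples":     any(c.get("worked_example") for c in chunks),
        "guided_practice":     bool(lesson.get("guided_practice", {}).get("scaffolds")),
    }

lesson = {
    "starter_retrieval_questions": ["What does a plant need for photosynthesis?"],
    "chunks": [
        {"new_ideas": ["light energy becomes chemical energy"],
         "checking_questions": ["Where does the energy come from?"],
         "worked_example": "Annotated diagram narrated by the teacher."},
    ],
    "guided_practice": {"scaffolds": ["sentence stems", "hints"]},
}

print(rosenshine_report(lesson))   # every check is True for this lesson
```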
Comparing AI Tools Through a Learning Science Lens
Not all AI lesson planning tools are built equally. Some prioritize speed; others prioritize learning science. Here is how six tools compare when evaluated through a learning science lens.
AI Tool: Structural Learning
Scaffolding: Yes. Built on Thinking Framework colour coding; adjusts support based on pupil response. Interactive live lessons provide response-based support.
Retrieval & Spacing: Emerging. Live feedback loop; spaced retrieval not automated.
Cognitive Progression: Yes. Interactive design allows cognitive progression; live adjustment possible.
Pedagogy Framework: 8 published impact studies; pragmatic pedagogical framework.
What does this comparison reveal? First, tools designed by or in partnership with educational researchers (Structural Learning, Oak, Curipod) embed learning science more explicitly. Second, tools optimized purely for speed and convenience (MagicSchool) leave pedagogical depth to the user. Third, even the best current tools have gaps; none fully implement spaced retrieval practice at scale.
This is not judgment. It is observation. If you need speed, a general-purpose tool may serve you. If you need durability, growth, and learning science, you need a tool designed around pedagogy.
Questions to Ask Before Choosing an AI Tool
When your school is evaluating an AI lesson planning tool, use this checklist to assess whether it is built on learning science or just built to be fast.
1. Does it scaffold learning? Ask the vendor: how does the tool reduce extraneous cognitive load? What scaffolding features are built-in? If the answer is "it depends on how you prompt it," that is not enough. The tool should have scaffolding as a default, not an option.
2. Does it build in retrieval practice? Ask: does the tool track what pupils learned and embed review questions in future lessons? If the answer is no, the tool is not managing long-term retention.
3. Does it progress through cognitive levels? Generate a sample lesson and map the questions to Bloom's taxonomy. Do they climb from Remember toward Evaluate? Or do they stay at Remember?
4. Does it adapt to different learner needs? Ask: if one pupil struggles on a practice question, does the tool provide additional scaffolding? If one pupil excels, does it increase challenge? If the answer is "the teacher decides," that is passing responsibility rather than sharing it.
5. Does it cite its pedagogical framework? A good tool can explain which learning science principles it implements. It references Rosenshine, Sweller, Hattie, or other researchers. If the tool cannot explain its pedagogy, it does not have one.
6. Can you see the learning science behind the output? Generate a lesson and ask: why are the questions in this order? Why is this example included? If the answer is "the AI thought it would be helpful," that is not learning science. It is guessing.
7. Does it generate activities or just worksheets? A worksheet is a document. An activity is a learning sequence. A good tool generates the latter. Does the output include worked examples, scaffolding, checking questions, and fading support? Or does it just ask questions?
Use these questions in a staff meeting. Compare two or three tools your school is considering. The tool that can answer all seven clearly is the one built on learning science.
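If it helps to structure the comparison, the seven questions can be recorded as a simple rubric, as in the sketch below; the answer values are placeholders, not evaluations of any real tool.

```python
# The seven questions recorded as a rubric. Answers: True, False, or None (unclear).
CHECKLIST = [
    "scaffolds_learning", "builds_in_retrieval", "climbs_bloom", "adapts_to_learners",
    "cites_pedagogy", "explains_output", "generates_activities",
]

def summarise(answers: dict) -> str:
    yes = sum(1 for q in CHECKLIST if answers.get(q) is True)
    unclear = sum(1 for q in CHECKLIST if answers.get(q) is None)
    return f"{yes}/7 clear yes, {unclear} unclear"

# Placeholder answers for a hypothetical tool under evaluation, not a real rating.
print(summarise({
    "scaffolds_learning": True, "builds_in_retrieval": None, "climbs_bloom": True,
    "adapts_to_learners": False, "cites_pedagogy": True, "explains_output": None,
    "generates_activities": True,
}))
# 4/7 clear yes, 2 unclear
```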
Before your next staff meeting on AI tools, ask one question: what learning science is this built on? If the answer is "none" or "we just use a large language model," you are buying a faster photocopier. Your pupils deserve a thinking partner. That thinking partner should be grounded in decades of cognitive science and educational research, not just in the engineering of language models. Choose deliberately. Your pupils' learning depends on it.
References
Anderson, L.W., and Krathwohl, D.R. (Eds.). (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom's taxonomy of educational objectives. Longman.
Hattie, J. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. Routledge.
Kalyuga, S., Ayres, P., Chandler, P., and Sweller, J. (2003). The expertise reversal effect. Educational Psychologist, 38(1), 23-31.
Karpicke, J.D. (2012). Retrieval-based learning: Active retrieval promotes meaningful learning. Current Directions in Psychological Science, 21(3), 157-163.
Paivio, A. (1986). Mental representations: A dual coding approach. Oxford University Press.
Roediger, H.L., and Butler, A.C. (2011). The critical role of retrieval practice in long-term retention. Trends in Cognitive Sciences, 15(1), 20-27.
Rosenshine, B. (2012). Principles of instruction: Research-based strategies that all teachers should know. American Educator, 36(1), 12-19.
Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257-285.
Vygotsky, L.S. (1978). Mind in society: The development of higher psychological processes. Harvard University Press.