Long-Term Memory: How Knowledge Sticks and Why It Matters for Teaching

Updated on March 10, 2026


March 7, 2026

When a pupil walks into your classroom in September, they bring everything they have ever learned. That prior knowledge, stored in long-term memory, is the raw material your teaching builds on. Understanding how long-term memory works, and how to strengthen it, is one of the most useful things a teacher can know.

Key Takeaways

  1. Long-term memory has no known capacity limit: Unlike working memory (which holds only a handful of items at a time, roughly four to seven on common estimates), long-term memory can store vast amounts of knowledge indefinitely when information is encoded effectively.
  2. Schemas organise knowledge: Information stored in interconnected schemas is easier to retrieve and apply. Teaching that builds on existing schemas produces stronger learning than isolated facts.
  3. Forgetting is natural but manageable: Ebbinghaus showed that most forgetting occurs within 24 hours. Spacing, retrieval practice, and interleaving counteract this decay curve.
  4. Automaticity frees working memory: When foundational knowledge becomes automatic (like times tables or phonics), pupils can devote working memory to higher-order thinking.

What Is Long-Term Memory?

Long-term memory is the brain's permanent storage system. Unlike working memory, which can hold only a small number of items for a short time, long-term memory has no known upper limit on capacity or duration. A Year 10 pupil who learned the water cycle in Year 5 can still retrieve that knowledge years later, provided it was encoded well and revisited periodically.

Psychologists distinguish between two broad types of long-term memory: declarative memory (conscious knowledge of facts and events) and non-declarative memory (skills, habits, and conditioned responses). Declarative memory breaks further into semantic memory (general knowledge: "Paris is the capital of France") and episodic memory (personal events: "I remember my first day of teaching"). Both subtypes are relevant to classroom learning, though semantic memory carries more weight in academic contexts.

For teachers, the practical significance is this: what pupils know now shapes what they can learn next. A pupil with a rich store of prior knowledge in science can connect new material to existing concepts, encode it faster, and retain it longer. A pupil with thin prior knowledge faces a much steeper climb. Building long-term memory is not a luxury; it is the central task of teaching.

Types of Long-Term Memory: A Classroom Overview

Memory Type | Subtype | What It Stores | Classroom Example
Declarative (Explicit) | Semantic | Facts, concepts, and general knowledge | A pupil knows that photosynthesis uses light energy to make glucose from carbon dioxide and water
Declarative (Explicit) | Episodic | Personal memories and experiences | A pupil remembers doing a leaf chromatography experiment in Year 7
Non-Declarative (Implicit) | Procedural | Motor skills and practised routines | A pupil writes cursive script without consciously thinking about letter formation
Non-Declarative (Implicit) | Priming | Prior exposure that influences later responses | Seeing the word "river" earlier makes a pupil faster to identify "delta" on a test
Non-Declarative (Implicit) | Conditioned Responses | Learned associations formed through repetition | A pupil automatically pauses at a capital letter when reading aloud

How Memories Move from Short-Term to Long-Term Storage

Information does not simply transfer from short-term to long-term memory by itself. The process of encoding requires effort, repetition, and meaningful connection. Baddeley (2000) described how the episodic buffer in working memory acts as a temporary workspace, linking new information to existing long-term memories before consolidation occurs.

Consolidation happens through two processes. Synaptic consolidation occurs within hours of learning. Systems consolidation unfolds far more slowly, over days, weeks, and longer, as memories that at first depend on the hippocampus come to be supported by the cortex. This is why a single lesson is rarely enough. A pupil who hears about osmosis once, writes it down, and never revisits it will have very little of that knowledge available the following term.

The practical implication for teachers is significant. If you want something to move into long-term memory, you need to create the conditions for consolidation: spaced repetition, effortful retrieval, and connection to prior knowledge. A good example is the "Do Now" starter activity, where pupils spend the first five minutes of class retrieving facts from a previous lesson. This is not warm-up fluff; it is active consolidation at work.

Elaborative encoding strengthens the process further. When pupils explain why something is true, create examples, or link new ideas to existing knowledge, they form stronger memories that are easier to recall later. Compare two approaches to teaching natural selection. You could tell pupils "favourable traits are passed on" or ask them to explain why a faster cheetah is more likely to survive and reproduce. The second approach demands elaboration and produces a more durable memory.

Schema Theory and Knowledge Organisation

Cognitive psychologists use the term 'schema' to describe the mental frameworks through which we organise knowledge. Bartlett (1932) demonstrated in landmark studies that people do not store memories like photographs; instead, they reconstruct them using existing schemas, filling gaps with prior expectations. This is why a pupil with rich knowledge about World War II will absorb a new lesson about the Blitz more easily than someone with no background knowledge.

Schemas are interconnected networks of related knowledge. When a new fact connects to an existing schema, it becomes part of that network and benefits from the connections already established there. When a fact has no schema to attach to, it sits in isolation and is far more vulnerable to forgetting. This is why subject-specific vocabulary instruction matters so much: giving pupils the words for concepts builds the scaffolding on which further knowledge can hang.

The relevance for cognitive load theory is direct. Expert teachers do not experience a lesson as a collection of separate facts. Their well-developed schemas allow them to process large chunks of information as single units, freeing up mental bandwidth for the unfamiliar. For pupils still building those schemas, each element requires separate processing and places a greater demand on working memory. Teaching that builds schemas incrementally, connecting each new piece to what pupils already know, is teaching that works with the architecture of memory rather than against it. Reading more about schemas in education gives a fuller picture of how this works in practice.

Encoding Strategies That Build Durable Memory

Not all study methods are equal. Research on memory consistently shows that the strategies most pupils default to, such as re-reading and highlighting, produce weak encoding. The strategies that produce the strongest, most durable memories are those that require active mental effort.

Retrieval practice is among the best-evidenced strategies for long-term retention. Roediger and Butler (2011) reviewed research showing that testing produces greater long-term retention than restudying the same material, even when the tests are low-stakes. In practical terms, this means asking pupils to write down everything they remember about a topic before you revisit it. You can use quizzes at the start of lessons, or have pupils answer questions without looking at their notes. The act of retrieval strengthens the memory trace. Full guidance on implementing this is available in the guide to retrieval practice for teachers.

Spaced practice distributes learning across time rather than concentrating it in one session. A topic introduced in September, revisited in October, and tested in November will be remembered much better than one studied intensively for three days then dropped. This is the spacing effect, one of the most replicated findings in cognitive psychology. Teachers can build spacing into their planning by including brief retrieval activities on older material at the start of every lesson. See the article on spaced practice for ready-to-use classroom structures.

Elaborative interrogation asks pupils to explain why facts are true, not just what they are. A geography pupil might memorise that coastal erosion forms headlands and bays. But when they explain why harder rock resists erosion while softer rock recedes, they build a richer, more retrievable memory. This connects to the broader principle that understanding supports retention: pupils who understand tend to remember because understanding creates a dense network of connections.

Dual coding pairs verbal and visual representations of the same information. A timeline that shows the sequence of events in the French Revolution with written dates gives memory two routes for retrieval. The same applies to a diagram that maps the water cycle with key vocabulary. Research by Paivio showed that combining verbal and visual channels, without overloading either, improves encoding significantly. Practical classroom applications are covered in the guide to dual coding.

Why Pupils Forget (and What You Can Do About It)

Ebbinghaus (1885) mapped what is now known as the 'forgetting curve': a steep decline in retention that begins within minutes of learning and flattens out over time. His data, collected through self-experimentation with nonsense syllables, showed that on his relearning ('savings') measure more than half of new material was lost within the first hour and roughly two-thirds within a day without review. After that the decline slowed: most of what survived the first day was still retained about a week later.
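The shape of such a curve can be sketched numerically. The snippet below is a minimal illustration, assuming a simple exponential model R = exp(-t / S). This is a common simplification, not the function Ebbinghaus himself fitted, and the `stability_hours` values are hypothetical, chosen only to show how a better-consolidated memory decays more slowly.

```python
import math


def retention(hours_elapsed: float, stability_hours: float) -> float:
    """Estimated fraction retained under the assumed model R = exp(-t / S)."""
    if hours_elapsed < 0:
        raise ValueError("hours_elapsed must be non-negative")
    if stability_hours <= 0:
        raise ValueError("stability_hours must be positive")
    return math.exp(-hours_elapsed / stability_hours)


if __name__ == "__main__":
    # Hypothetical stability values: the larger one stands in for a memory
    # that has been strengthened by spaced review.
    for label, stability in [("no review", 24.0), ("after reviews", 96.0)]:
        row = ", ".join(
            f"{hours}h: {retention(hours, stability):.0%}"
            for hours in (1, 24, 168)
        )
        print(f"{label:>13} -> {row}")
```

Running it prints estimated retention after one hour, one day, and one week for each case. The numbers are not predictions about real pupils; the point is only the contrast between a steep curve and a flattened one.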

There are several mechanisms behind forgetting. Interference occurs when similar material competes in memory, making both harder to retrieve. Decay describes the natural weakening of memory traces that are not reactivated. Retrieval failure happens when information is stored but cannot be accessed. This often occurs because retrieval cues at test time do not match those present during learning. A pupil who revised in a quiet room using colour-coded notes may struggle to retrieve the same knowledge during a noisy exam without those cues.

What can teachers do? Three approaches have the strongest evidence base.

  1. Interleaving: Rather than blocking all practice on one topic before moving to the next, mix topics within a practice session. This feels harder for pupils and slower in the short term, but it produces better long-term retention. Full guidance on this is available in the article on interleaving.
  2. Spaced retrieval: Returning to material at gradually lengthening intervals strengthens memory. This works precisely because some forgetting has happened in between: retrieving something you have nearly forgotten is more powerful than retrieving information that is still fresh.
  3. Varied practice: Presenting the same concept in different contexts, through different examples and question types, builds flexible memory that transfers to novel situations.
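For spaced retrieval, the planning step amounts to turning a teaching date into a series of review dates with growing gaps. The sketch below is one hedged way to do that; the function name `expanding_review_dates` and the default gaps of 2, 7, 21, and 60 days are assumptions made for illustration, not intervals recommended by the research cited here.

```python
from datetime import date, timedelta


def expanding_review_dates(taught_on: date, gaps_days=(2, 7, 21, 60)) -> list[date]:
    """Return review dates, each gap counted from the previous review.

    The gaps must be positive and must not shrink, so that every review
    comes after a longer (or equal) wait than the one before it.
    """
    gaps = list(gaps_days)
    if any(gap <= 0 for gap in gaps):
        raise ValueError("every gap must be a positive number of days")
    if gaps != sorted(gaps):
        raise ValueError("gaps should stay the same or grow, never shrink")

    dates = []
    current = taught_on
    for gap in gaps:
        current = current + timedelta(days=gap)
        dates.append(current)
    return dates


if __name__ == "__main__":
    # Hypothetical example: a topic first taught on 7 September 2026.
    for review in expanding_review_dates(date(2026, 9, 7)):
        print(review.isoformat())
```

The dates it produces are calendar dates, so in practice they would still need shifting around weekends and holidays.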

Metacognition also matters here. Pupils who understand why they forget, and who can identify when they do not actually know something (as opposed to merely feeling familiar with it), make better study decisions. Teaching pupils about the forgetting curve helps build self-awareness for independent study. Show them that re-reading feels productive but creates poor retention. Demonstrate that effortful retrieval works much better. Resources on developing metacognition cover this in more depth.

Automaticity: When Knowledge Becomes Effortless

Automaticity describes the point at which a skill or piece of knowledge can be retrieved and applied without conscious effort. A fluent reader does not decode individual letters; they recognise words and phrases as whole units, freeing all available cognitive resources for comprehension. A pupil who has automatic recall of times tables does not have to calculate while solving a word problem. They retrieve the answer instantly, leaving working memory free to handle the mathematical reasoning.

This matters enormously because working memory is the bottleneck of learning. Sweller (1988) argued that when the mechanics of a task consume too much working memory, little remains for the learning the task is meant to develop. A pupil who has not automated basic number facts will struggle with algebra, not because they cannot reason mathematically, but because the arithmetic itself consumes most of the available capacity. The path to higher-order thinking runs through, not around, the memorisation of foundational knowledge.

Building automaticity requires large amounts of distributed practice, which is why drilling times tables, high-frequency words, and key scientific vocabulary is pedagogically justified even when it looks simple. The goal is not to limit pupils to low-level recall; the goal is to free up the cognitive resources that make high-level thinking possible. Teachers who feel uncomfortable with drilling should reframe it as infrastructure work: you are building the cognitive pipes through which complex thought will later flow.

A secondary benefit of automaticity is reduced cognitive load during transfer. When pupils apply a known concept to a new context, the automatic retrieval of that concept leaves them free to focus on what is unfamiliar in the new situation. This is why Rosenshine's Principles emphasise guided practice to the point of fluency before moving on. Moving too quickly to new material before existing material is secure undermines both the new learning and the old.

Connecting New Learning to Existing Schemas

The most efficient way to build durable memory is to attach new knowledge to what pupils already know. Ausubel (1968) put this as a principle that still holds: "If I had to reduce all of educational psychology to just one principle, I would say this: the most important single factor influencing learning is what the learner already knows." Prior knowledge is not just background. It is the scaffolding on which new learning is built.

In practice, this means starting every new topic with an activation of relevant prior knowledge. Before introducing plate tectonics, ask pupils what they already know about earthquakes and volcanoes. Before a poetry unit, ask what pupils remember about metaphor and rhythm from previous study. These are not just engagement activities; they are memory operations. Activating prior knowledge brings relevant schemas into working memory, making it far easier to connect the new material that follows.

The technique of using analogies is particularly powerful for this reason. When a chemistry teacher compares electron shells to theatre seats (front row fills first), they link existing knowledge to new ideas. The pupil who has no analogy to hold the abstract concept against must store it in isolation; the pupil with the theatre analogy has somewhere to file it. Scaffolding in teaching more broadly works on this principle: provide the structure that connects the familiar to the unfamiliar, then gradually withdraw support as the new schema grows.

It is also worth noting that incorrect prior knowledge can interfere with new learning. A pupil who enters a physics lesson believing that heavier objects fall faster will not easily update that belief unless the misconception is directly addressed. Simply presenting correct information is often insufficient; teachers need to create cognitive conflict, make the misconception visible, and guide pupils to reconstruct their understanding. Direct instruction that incorporates deliberate misconception-busting is a well-evidenced approach for this situation.

Classroom Strategies for Long-Term Retention

The strategies below are drawn from cognitive science research and have direct classroom applications. None require elaborate resources; most can be built into existing lesson structures.

Spaced starters. Begin each lesson with a short retrieval activity covering material from one week ago, one month ago, and one term ago. A five-minute 'Do Now' asks pupils to answer three to five questions from memory without notes. This activates prior knowledge, identifies gaps, and strengthens memories of older material. Keep a record of which topics have been revisited so you can ensure spacing is genuinely distributed rather than clustered around the topics you find easiest to re-test.
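The record-keeping described above can be as simple as a dated list of topics. As a rough sketch, the function below picks one topic for each look-back window from such a log. The name `pick_starter_topics`, the log format, and the choice of 7, 30, and 90 days to stand for "one week, one month, one term" are all assumptions made for illustration.

```python
from datetime import date, timedelta


def pick_starter_topics(
    teaching_log: dict[str, date],
    today: date,
    lookbacks_days=(7, 30, 90),
) -> list[str]:
    """For each look-back, pick the unused topic taught closest to that date."""
    chosen: list[str] = []
    for days in lookbacks_days:
        target = today - timedelta(days=days)
        # Only topics already taught, and not already picked for this starter.
        candidates = [
            topic
            for topic, taught_on in teaching_log.items()
            if taught_on < today and topic not in chosen
        ]
        if not candidates:
            break
        closest = min(
            candidates,
            key=lambda topic: abs((teaching_log[topic] - target).days),
        )
        chosen.append(closest)
    return chosen


if __name__ == "__main__":
    # Hypothetical teaching log for a science class.
    log = {
        "osmosis": date(2026, 1, 5),
        "cells": date(2026, 2, 27),
        "enzymes": date(2026, 3, 20),
        "diffusion": date(2026, 3, 26),
    }
    print(pick_starter_topics(log, today=date(2026, 3, 27)))
```

Choosing by date rather than by preference is the design point: it keeps the starter from drifting toward whichever topics are easiest to re-test.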

Interleaved practice sets. When setting practice tasks, mix topics rather than grouping all practice on one topic in a single block. A maths teacher might set ten questions covering algebra, fractions, and geometry rather than ten algebra questions followed by ten fractions questions. The short-term effect is that pupils find it harder and may feel less confident. The long-term effect is significantly stronger retention and better transfer. Explain this to pupils so they understand why the practice feels difficult.

The brain dump. At any point in a lesson, ask pupils to close their notes and write down everything they can remember about the topic. This is freeform retrieval: no structure, no prompts, just recall. After two minutes, pupils compare their lists with a partner and add anything they missed. The combination of individual retrieval and peer comparison reinforces memory, surfaces misconceptions, and gives you rapid formative information about what has and has not been retained. A deeper look at memorisation techniques shows how this fits into a wider toolkit.

Concept mapping. Asking pupils to draw a diagram showing how key concepts in a unit relate to each other produces a visible representation of their schema. Gaps in the map reveal where connections have not formed. Mistakes in the connections reveal misconceptions. The act of creating the map, deciding what connects to what and how, is itself a powerful elaboration activity that strengthens encoding. This pairs naturally with formative assessment because the map gives you detailed diagnostic information about what pupils actually understand rather than what they can copy.

Cumulative review. Rather than treating each unit as self-contained, build cumulative assessments that require pupils to draw on knowledge from across the year. End-of-term tests should include material from every previous unit, not just the most recent topic. This forces pupils to keep all their knowledge accessible rather than letting it decay after the unit test. This feels demanding, but it models the kind of connected, flexible knowledge that high-stakes examinations and real-world application both require.

Further Reading: Key Papers on Memory and Learning

These five works form the core of the evidence base for the strategies described in this article. Each is widely cited and has direct implications for classroom practice.

The Episodic Buffer: A New Component of Working Memory?
Baddeley, A. (2000). Trends in Cognitive Sciences, 4(11).

Baddeley extended his influential model of working memory to include an episodic buffer, a temporary store that integrates information across working memory subsystems and links to long-term memory. This paper clarifies how new information is held in mind long enough to be connected to existing knowledge, and why that linking process is essential for durable encoding.

Remembering: A Study in Experimental and Social Psychology
Bartlett, F.C. (1932). Cambridge University Press.

Bartlett's foundational work showed that memory is reconstructive, not reproductive. Participants who recalled stories across time consistently altered them to fit their existing cultural schemas. For teachers, this means that what pupils already know actively shapes what they will remember from new instruction, making prior knowledge activation a critical step in every lesson.

Memory: A Contribution to Experimental Psychology
Ebbinghaus, H. (1885/1913). Teachers College, Columbia University.

Ebbinghaus documented the forgetting curve through meticulous self-experimentation, showing that memory declines steeply within 24 hours of learning and that spaced repetition substantially reduces this decay. His work established the empirical foundation for spacing and retrieval practice, strategies that remain among the most powerful tools available to teachers over a century later.

The Critical Role of Retrieval Practice in Long-Term Retention
Roediger, H.L. & Butler, A.C. (2011). Trends in Cognitive Sciences, 15(1).

Roediger and Butler synthesised decades of research showing that the act of retrieval, not just the number of exposures to material, is the key driver of long-term retention. Their review demonstrated that low-stakes testing outperforms restudying, that feedback after retrieval enhances learning, and that retrieval practice transfers to real-world academic performance. This paper is essential reading for anyone designing revision and formative assessment.

Cognitive Load During Problem Solving: Effects on Learning
Sweller, J. (1988). Cognitive Science, 12(2).

Sweller's original paper on cognitive load theory demonstrated that when problem-solving demands exceed working memory capacity, learning suffers even when performance looks adequate. For long-term memory, this matters because material that overloads working memory is poorly encoded into long-term storage. The paper makes a strong case for worked examples and guided practice as tools that manage cognitive load while supporting the transfer of knowledge into long-term memory.


A secondary benefit of automaticity is reduced cognitive load during transfer. When pupils apply a known concept to a new context, the automatic retrieval of that concept leaves them free to focus on what is unfamiliar in the new situation. This is why Rosenshine's Principles emphasise guided practice to the point of fluency before moving on. Moving too quickly to new material before existing material is secure undermines both the new learning and the old.

Connecting New Learning to Existing Schemas

The most efficient way to build durable memory is to attach new knowledge to what pupils already know. Ausubel (1968) put this as a principle that still holds: "If I had to reduce all of educational psychology to just one principle, I would say this: the most important single factor influencing learning is what the learner already knows." Prior knowledge is not just background. It is the scaffolding on which new learning is built.

In practice, this means starting every new topic with an activation of relevant prior knowledge. Before introducing plate tectonics, ask pupils what they already know about earthquakes and volcanoes. Before a poetry unit, ask what pupils remember about metaphor and rhythm from previous study. These are not just engagement activities; they are memory operations. Activating prior knowledge brings relevant schemas into working memory, making it far easier to connect the new material that follows.

The technique of using analogies is particularly powerful for this reason. When a chemistry teacher compares electron shells to theatre seats (front row fills first), they link existing knowledge to new ideas. The pupil who has no analogy to hold the abstract concept against must store it in isolation; the pupil with the theatre analogy has somewhere to file it. Scaffolding in teaching more broadly works on this principle: provide the structure that connects the familiar to the unfamiliar, then gradually withdraw support as the new schema grows.

It is also worth noting that incorrect prior knowledge can interfere with new learning. A pupil who enters a physics lesson believing that heavier objects fall faster will not easily update that belief unless the misconception is directly addressed. Simply presenting correct information is often insufficient; teachers need to create cognitive conflict, make the misconception visible, and guide pupils to reconstruct their understanding. Direct instruction that incorporates deliberate misconception-busting is a well-evidenced approach for this situation.

Classroom Strategies for Long-Term Retention

The strategies below are drawn from cognitive science research and have direct classroom applications. None require elaborate resources; most can be built into existing lesson structures.

Spaced starters. Begin each lesson with a short retrieval activity covering material from one week ago, one month ago, and one term ago. A five-minute 'Do Now' asks pupils to answer three to five questions from memory without notes. This activates prior knowledge, identifies gaps, and strengthens memories of older material. Keep a record of which topics have been revisited so you can ensure spacing is genuinely distributed rather than clustered around the topics you find easiest to re-test.
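
For teachers who keep their teaching record in a spreadsheet or script, the selection step for a spaced starter can be sketched in a few lines. This is a minimal sketch under stated assumptions: the lags of 7, 30, and 90 days stand in for "one week, one month, one term", and the function name, topic names, and dates are all hypothetical.

```python
from datetime import date, timedelta

# Assumed lags in days, approximating one week, one month, and one term.
REVIEW_LAGS = {"one week ago": 7, "one month ago": 30, "one term ago": 90}


def pick_starter_topics(taught_on, today, lags=REVIEW_LAGS):
    """For each lag, pick the topic taught closest to (today - lag)."""
    # Only topics that have already been taught are eligible.
    candidates = {topic: day for topic, day in taught_on.items() if day <= today}
    picks = {}
    for label, days in lags.items():
        target = today - timedelta(days=days)
        picks[label] = min(
            candidates, key=lambda topic: abs((candidates[topic] - target).days)
        )
    return picks


# Hypothetical teaching record: topic -> date it was taught.
taught_on = {
    "cell structure": date(2025, 12, 8),
    "photosynthesis": date(2026, 1, 12),
    "respiration": date(2026, 2, 9),
    "enzymes": date(2026, 3, 2),
}
starter = pick_starter_topics(taught_on, today=date(2026, 3, 9))
print(starter)
```

With a short teaching record the same topic can be picked for more than one lag, which is a prompt to check the log of revisited topics rather than a fault in the approach.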

Interleaved practice sets. When setting practice tasks, mix topics rather than grouping all practice on one topic in a single block. A maths teacher might set ten questions covering algebra, fractions, and geometry rather than ten algebra questions followed by ten fractions questions. The short-term effect is that pupils find it harder and may feel less confident. The long-term effect is significantly stronger retention and better transfer. Explain this to pupils so they understand why the practice feels difficult.
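
One simple way to assemble such a set from a question bank is to sample a few questions per topic and deal them out in turn. The sketch below assumes a hypothetical question bank and function name; round-robin ordering is just one reasonable choice, and the research does not prescribe a particular sequence.

```python
import random
from itertools import zip_longest


def interleaved_set(questions_by_topic, per_topic, seed=None):
    """Return (topic, question) pairs with topics mixed in round-robin order."""
    rng = random.Random(seed)
    # Sample questions for each topic independently.
    samples = [
        [(topic, q) for q in rng.sample(questions, min(per_topic, len(questions)))]
        for topic, questions in questions_by_topic.items()
    ]
    rng.shuffle(samples)  # vary which topic comes first
    mixed = []
    # Take one question from each topic in turn. If topics have unequal
    # numbers of questions, the tail of the set may repeat a topic.
    for round_ in zip_longest(*samples):
        mixed.extend(item for item in round_ if item is not None)
    return mixed


# Hypothetical question bank.
bank = {
    "algebra": ["Solve 3x + 2 = 11", "Expand 2(x + 5)", "Factorise x^2 + 3x"],
    "fractions": ["1/2 + 1/3", "3/4 of 20", "Simplify 12/18"],
    "geometry": ["Area of a 3 by 4 rectangle", "Angles in a triangle", "Perimeter of a square of side 5"],
}
practice = interleaved_set(bank, per_topic=2, seed=1)
for topic, question in practice:
    print(f"[{topic}] {question}")
```

When every topic contributes the same number of questions, no two neighbouring questions share a topic, which is the contrast with a blocked set that the paragraph above describes.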

The brain dump. At any point in a lesson, ask pupils to close their notes and write down everything they can remember about the topic. This is freeform retrieval: no structure, no prompts, just recall. After two minutes, pupils compare their lists with a partner and add anything they missed. The combination of individual retrieval and peer comparison reinforces memory, surfaces misconceptions, and gives you rapid formative information about what has and has not been retained. A deeper look at memorisation techniques shows how this fits into a wider toolkit.

Concept mapping. Asking pupils to draw a diagram showing how key concepts in a unit relate to each other produces a visible representation of their schema. Gaps in the map reveal where connections have not formed. Mistakes in the connections reveal misconceptions. The act of creating the map, deciding what connects to what and how, is itself a powerful elaboration activity that strengthens encoding. This pairs naturally with formative assessment because the map gives you detailed diagnostic information about what pupils actually understand rather than what they can copy.

Cumulative review. Rather than treating each unit as self-contained, build cumulative assessments that require pupils to draw on knowledge from across the year. End-of-term tests should include material from every previous unit, not just the most recent topic. This forces pupils to keep all their knowledge accessible rather than letting it decay after the unit test. It feels demanding, but it models the kind of connected, flexible knowledge that high-stakes examinations and real-world application both require.

Further Reading: Key Papers on Memory and Learning

These five works form the evidence base for the strategies described in this article. Each is cited by thousands of researchers and has direct implications for classroom practice.

The Episodic Buffer: A New Component of Working Memory?
Baddeley, A. (2000). Trends in Cognitive Sciences, 4(11).

Baddeley extended his influential model of working memory to include an episodic buffer, a temporary store that integrates information across working memory subsystems and links to long-term memory. This paper clarifies how new information is held in mind long enough to be connected to existing knowledge, and why that linking process is essential for durable encoding.

Remembering: A Study in Experimental and Social Psychology
Bartlett, F.C. (1932). Cambridge University Press.

Bartlett's foundational work showed that memory is reconstructive, not reproductive. Participants who recalled stories across time consistently altered them to fit their existing cultural schemas. For teachers, this means that what pupils already know actively shapes what they will remember from new instruction, making prior knowledge activation a critical step in every lesson.

Memory: A Contribution to Experimental Psychology
Ebbinghaus, H. (1885/1913). Teachers College, Columbia University.

Ebbinghaus documented the forgetting curve through meticulous self-experimentation, showing that memory declines steeply within 24 hours of learning and that spaced repetition substantially reduces this decay. His work established the empirical foundation for spacing and retrieval practice, strategies that remain among the most powerful tools available to teachers over a century later.

The Critical Role of Retrieval Practice in Long-Term Retention
Roediger, H.L. & Butler, A.C. (2011). Trends in Cognitive Sciences, 15(1).

Roediger and Butler synthesised decades of research showing that the act of retrieval, not just the number of exposures to material, is the key driver of long-term retention. Their review demonstrated that low-stakes testing outperforms restudying, that feedback after retrieval enhances learning, and that retrieval practice transfers to real-world academic performance. This paper is essential reading for anyone designing revision and formative assessment.

Cognitive Load During Problem Solving: Effects on Learning
Sweller, J. (1988). Cognitive Science, 12(2).

Sweller's original paper on cognitive load theory demonstrated that when problem-solving demands exceed working memory capacity, learning suffers even when performance looks adequate. For long-term memory, this matters because material that overloads working memory is poorly encoded into long-term storage. The paper makes a strong case for worked examples and guided practice as tools that manage cognitive load while supporting the transfer of knowledge into long-term memory.
