Stop Grading Facts: Assessing Conceptual Understanding in the MYP

Updated on March 24, 2026

Stop assessing recall and start assessing understanding. A practical rubric for MYP teachers using the Thinking Framework to map depth of knowledge across achievement levels 1-8.

Most MYP rubrics ask teachers to assess "conceptual understanding" at levels 5 to 8. Most MYP teachers end up assessing whether students can describe things clearly. The gap between those two outcomes is not a marking problem. It is a task design problem.

Key Takeaways

  1. Recall is not understanding: A student who can name the parts of a cell is not demonstrating conceptual understanding. A student who can explain why selective permeability matters is.
  2. The Thinking Framework maps directly to MYP levels: Each of the eight cognitive operations corresponds to a specific achievement level band, giving teachers a practical tool for task design.
  3. Three-level assessment architecture: Every MYP summative should have tasks at factual, conceptual, and transfer levels. Most only have the first.
  4. Erickson's generalisation level is the target: The IB asks students to work at the level of generalisations and theories, not facts and topics. Most assessments stop short of this.
  5. Report comments write themselves: When you assess with the depth mapping, you can cite the exact cognitive operation a student is using, making written feedback specific and actionable.

The Problem: Teachers Grade What Is Easy to Measure

A Year 9 history teacher asks students to "explain the causes of the First World War." Forty responses arrive. They all list four or five causes from the lesson notes. They vary slightly in phrasing and order. Most teachers award level 5 or 6 because the descriptions are detailed and accurate.

But detailed description is not analysis. And accurate recall is not conceptual understanding.

The IB's MYP criterion descriptors explicitly use words like "analyse", "evaluate", and "synthesise" at the higher achievement levels. Yet when the task only asks students to describe, teachers have no choice but to mark what they receive. The rubric has been rendered meaningless at levels 7 and 8 because the task never required that depth of thinking.

This is not a teacher failure. The IB's own guidance on task design is less prescriptive than its guidance on rubric descriptors. Teachers understand that level 7 means something more rigorous than level 3, but the training rarely goes deep enough on how to design a task that actually elicits the difference. The result is what Dylan Wiliam (2011) calls "assessment validity drift": the descriptors claim to assess one thing, but the task actually measures another.

What the IB Actually Means by Conceptual Understanding

Lynn Erickson developed the most useful model for understanding what the IB means here. In Concept-Based Curriculum and Instruction for the Thinking Classroom, Erickson and Lanning (2014) describe a Structure of Knowledge with five layers:

| Layer | Example (Biology) | Assessment Implication |
| --- | --- | --- |
| Facts | The cell membrane is made of phospholipids | Level 1-2 assessment: recall and identification |
| Topics | Cell membrane structure | Topic-level description: names components |
| Concepts | Selective permeability | Level 3-4: describes the concept with detail |
| Generalisations | Cells maintain internal conditions by controlling what moves across membranes | Level 5-8 target: explains relationships |
| Theory | Cell theory: living organisms are composed of cells that maintain homeostasis | Level 7-8 synthesis: evaluates across systems |

The IB's stated aim is that students work at the generalisation and theory levels, not the fact and topic levels. Most MYP summative tasks stop at the concept level at best. A student who "describes selective permeability" has reached level 3 or 4 in Erickson's model. The task that asks them to "explain why cells would cease to function if membrane permeability became non-selective" is pushing towards the generalisation level, which is where levels 5 to 8 live.

Grant Wiggins and Jay McTighe (2005) make the same point in Understanding by Design. Their "facets of understanding" model identifies six dimensions of genuine understanding, the highest of which is transfer: can the student apply their understanding to a new context they have never encountered before? If your summative task is one the student has practised, you are not assessing transfer. You are assessing familiarity.

The Thinking Framework Depth Mapping

The Thinking Framework's eight cognitive operations map directly onto MYP achievement levels. This is Paul Main's core insight: the operations are not just thinking tools. They are depth indicators. Each operation requires a specific kind of cognitive work that aligns with what the IB expects at a given level band.

| MYP Level | Cognitive Demand | Thinking Framework Operations | What Student Work Looks Like |
| --- | --- | --- | --- |
| 1-2 (Limited) | Recall, identify | Classify (sort into given categories), Sequence (put in given order) | Lists facts, follows a template, copies structure |
| 3-4 (Adequate) | Describe, outline | Compare (similarities and differences), Part-Whole (break down into components) | Describes with some detail, identifies components, makes basic comparisons |
| 5-6 (Substantial) | Analyse, explain | Cause and Effect (explain why), Analogy (connect to other contexts) | Explains relationships, identifies causes, transfers to new contexts |
| 7-8 (Excellent) | Evaluate, synthesise | Perspective (multiple viewpoints), Systems Thinking (interconnections) | Evaluates competing arguments, synthesises across sources, identifies systemic patterns |

This mapping does something that standard Bloom's Taxonomy guidance does not do for the MYP context: it gives you a concrete cognitive operation to design around, not just a verb to look for in student writing. "Analyse" is an instruction to students. "Cause and Effect" is the thinking structure you build the task around.

The distinction matters because Bloom's verbs describe what you want students to do in their response. The Thinking Framework operations describe the mental structure that makes that response possible. When you design a task that requires Cause and Effect thinking, you are not just hoping students will analyse. You are building the question so that analysis is the only way to answer it.

Norman Webb (1997) made a similar argument with his Depth of Knowledge framework: the cognitive demand is a property of the task, not the student's response. A Webb's Depth of Knowledge level 1 task cannot produce level 4 responses regardless of how capable the student is. The same logic applies here. If your MYP task only requires Classify thinking, you cannot award level 7 even if a student's response is beautifully written.
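The ceiling logic can be made concrete with a minimal sketch. This is purely illustrative: the dictionary and function below are hypothetical, not any official IB or Thinking Framework tool, and the band numbers simply restate the mapping table above.

```python
# Illustrative sketch only: operation names and level bands paraphrase the
# depth mapping table above; this is not an official IB or TF API.
OPERATION_CEILING = {
    "classify": 2, "sequence": 2,           # levels 1-2: recall, identify
    "compare": 4, "part-whole": 4,          # levels 3-4: describe, outline
    "cause-and-effect": 6, "analogy": 6,    # levels 5-6: analyse, explain
    "perspective": 8, "systems": 8,         # levels 7-8: evaluate, synthesise
}

def max_awardable_level(task_operations):
    """The highest operation a task demands sets its ceiling: a task that
    only requires Classify cannot earn level 7, however polished the
    student's response."""
    if not task_operations:
        return 0
    return max(OPERATION_CEILING[op] for op in task_operations)

print(max_awardable_level({"classify", "sequence"}))         # → 2
print(max_awardable_level({"compare", "cause-and-effect"}))  # → 6
```

The point of the sketch is directional: cognitive demand is an input you design, not an output you hope to find in the marking pile.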

Designing Assessment Tasks at Each Level

Here are worked examples showing how each Thinking Framework operation translates into an MYP task. The subject examples are drawn from different disciplines to show that the mapping is not subject-specific.

| Level | Operation | Example Task |
| --- | --- | --- |
| 1-2 | Classify | Sort these ten historical events into the categories 'political', 'economic', and 'social'. |
| 1-2 | Sequence | Arrange these stages of mitosis in the correct order and name each one. |
| 3-4 | Compare | Identify three similarities and three differences between the causes of the First and Second World Wars. |
| 3-4 | Part-Whole | Break down a short story into its narrative components: setting, protagonist, conflict, resolution. Describe what each contributes. |
| 5-6 | Cause and Effect | Explain why industrialisation caused urbanisation in 19th-century Britain, then predict where you would expect to see similar patterns emerging in a developing economy today. |
| 5-6 | Analogy | A student says that a cell is like a factory. Evaluate this analogy: which aspects of cellular function does it explain well, and which aspects does it fail to capture? |
| 7-8 | Perspective | Evaluate the claim that globalisation benefits everyone by examining it from the perspectives of a multinational company, a rural farmer in a developing country, and an environmental scientist. |
| 7-8 | Systems Thinking | A government introduces a sugar tax. Map all the likely effects across health outcomes, food industry behaviour, household economics, and public attitudes. Identify which effects might be self-reinforcing. |

Notice how the Cause and Effect task at level 5-6 is designed so that the student cannot answer it by describing. They have to explain a mechanism and then transfer that mechanism to a new context. The Perspective task at level 7-8 is designed so that a student who simply lists facts from three viewpoints will produce a level 5-6 response. To reach level 7-8, they have to evaluate the competing claims, which requires synthesising across the perspectives rather than merely reporting them.

This is what metacognition research consistently shows: students can only demonstrate the cognitive complexity that the task demands. Building tasks that constrain the thinking operation is the precondition for accurate assessment.

The Three-Level Assessment Architecture

Julie Stern, Krista Ferraro, and Juliet Mohnkern (2017) describe what they call "transfer task design" in Tools for Teaching Conceptual Understanding. Their argument is that every strong summative assessment needs tasks at three distinct levels, which they call factual, conceptual, and transfer. The MYP achievement bands make this structure particularly legible.

| Level | Question Type | MYP Achievement Band | Thinking Framework Operation |
| --- | --- | --- | --- |
| 1 — Factual | Do they know the content? | 1-4 | Classify, Sequence, Compare, Part-Whole |
| 2 — Conceptual | Can they explain the relationships? | 5-6 | Cause and Effect, Analogy |
| 3 — Transfer | Can they apply it to a new context? | 7-8 | Perspective, Systems Thinking |

A properly designed MYP summative has all three levels present. A student who completes only the factual level questions is capped at level 4. A student who also completes the conceptual level questions demonstrates level 5-6 capability. A student who reaches the transfer level and demonstrates they can evaluate from multiple perspectives or map systemic effects earns level 7-8.
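A quick audit of a planned summative against this architecture can be sketched as follows. Again, this is a hypothetical helper, not part of Stern et al.'s materials: it just encodes the capping rule described above.

```python
# Hypothetical audit helper: checks a summative's task list against the
# three-level architecture (factual / conceptual / transfer) above.
BAND = {"factual": "1-4", "conceptual": "5-6", "transfer": "7-8"}

def audit_summative(task_levels):
    """Return the highest band students can demonstrate, plus any
    missing levels. No transfer task means levels 7-8 are unreachable."""
    present = set(task_levels)
    missing = [lvl for lvl in ("factual", "conceptual", "transfer")
               if lvl not in present]
    for level in ("transfer", "conceptual", "factual"):
        if level in present:
            return BAND[level], missing
    return "none", missing

ceiling, missing = audit_summative(["factual", "conceptual"])
print(ceiling)  # → 5-6  (no transfer task, so 7-8 is unreachable)
print(missing)  # → ['transfer']
```

Running this against a draft assessment before moderation makes the gap visible in seconds: if "transfer" appears in the missing list, no student can earn level 7-8 regardless of ability.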

Most MYP summatives that teachers share at moderation sessions have level 1 and sometimes level 2 tasks. The transfer task is either absent or disguised as a writing task ("write a report that evaluates...") without scaffolding the specific cognitive operation being assessed.

Wiggins and McTighe (2005) are direct on this point: if you only assess what students can do with familiar material, you are assessing memory, not understanding. The transfer task does not need to be an entirely new topic. It needs to present the concept in a context the student has not previously studied. An economics student who has studied supply and demand in the context of global commodity markets should be assessed on their ability to apply those concepts to a local housing shortage they have not analysed before. That is a transfer task.

Common Assessment Mistakes in the MYP

Three patterns appear so regularly in MYP moderation that they deserve naming directly.

Using the verb "analyse" in the task but accepting "describe" in the marking. This is the most common problem. The task says "analyse the impact of..." but the level 5-6 descriptor has been written so broadly that detailed description earns the band. The solution is to build the analysis operation into the task design rather than hoping students will spontaneously perform it.

Over-scaffolding so the rubric does the thinking. Some teachers provide such detailed rubric criteria that the student only needs to match their response to the descriptor rather than construct an original response. At level 7-8, rubric descriptors should describe the quality of thinking, not the content of the answer. If your descriptor for level 8 includes the specific arguments the student should make, you have described a level 4 task with level 8 packaging.

Assessing the product, not the process. In subjects like Design and Drama, teachers sometimes mark the finished artefact rather than the thinking that produced it. A beautiful model that was built by following a template demonstrates Sequence thinking at level 1-2. A model that was designed to solve a novel constraint, tested against criteria, and revised based on feedback evidence demonstrates Systems Thinking at level 7-8. The product looks similar. The cognitive process is entirely different.

Dylan Wiliam's (2011) research on formative assessment is useful here. He argues that the most effective assessment practice involves teachers making the learning intention visible at two levels: the surface level (what students will do) and the deep level (what cognitive structure they will use to do it). MYP rubric descriptors typically only specify the surface level. The Thinking Framework operations specify the deep level.

Rubric Design Using the Thinking Framework

The following example shows how to upgrade a standard MYP-style rubric by overlaying the Thinking Framework operations. The subject is Year 10 History, Criterion B (Investigating).

Before (standard MYP-style descriptor):

| Level | Standard Descriptor |
| --- | --- |
| 7-8 | The student consistently analyses and evaluates a range of sources and demonstrates a thorough understanding of historical significance. |
| 5-6 | The student analyses some sources and demonstrates a substantial understanding of historical significance. |
| 3-4 | The student describes some sources and demonstrates an adequate understanding of historical significance. |

After (Thinking Framework-informed descriptor):

| Level | TF Operation | Thinking Framework-Informed Descriptor |
| --- | --- | --- |
| 7-8 | Perspective + Systems Thinking | The student evaluates sources by examining competing interpretations and identifying how the historian's standpoint, context, or purpose shapes the argument. They identify patterns across sources that reveal systemic factors in historical change. |
| 5-6 | Cause and Effect | The student explains how specific sources provide evidence for causal claims. They make explicit connections between historical evidence and the factors they identify as significant, rather than treating evidence as illustration. |
| 3-4 | Compare + Part-Whole | The student describes what sources show and identifies similarities or differences between them. They can break a source down into its component claims but do not yet explain why those claims are significant. |

The after-version does something the before-version does not: it tells the teacher what the student is cognitively doing, not just how much they are doing it. The difference between level 5-6 and level 7-8 is not quantity ("analyses some" versus "analyses a range of"). It is the cognitive operation: explaining causes versus evaluating perspectives.

This also makes the rubric usable as a formative assessment tool. Partway through a unit, you can ask students: "Which level is your current draft at?" A student can identify whether they are writing at the Compare level or the Cause and Effect level. That is a meaningful metacognitive question. "Is this adequately detailed?" is not.

The SOLO Taxonomy Connection

Teachers already familiar with SOLO Taxonomy (Biggs and Collis, 1982) will recognise the parallel. SOLO's five levels move from prestructural through unistructural and multistructural to relational and extended abstract. The Thinking Framework operations map onto this progression in a way that makes SOLO practically useful in the IB context.

Where SOLO provides a description of the structure of student responses, the Thinking Framework provides the operation that produces that structure. A student at SOLO's "relational" level is performing Cause and Effect or Compare thinking. A student at "extended abstract" is performing Perspective or Systems Thinking. The two frameworks are complementary: SOLO tells you what you see in the work, the Thinking Framework tells you what to design for.

This connection also clarifies a frequent confusion in MYP marking: teachers sometimes award level 7-8 for a very detailed response that demonstrates extensive factual knowledge. Extensive factual knowledge, however well organised, is SOLO "multistructural": it lists many related pieces of information without connecting them into a relational or causal structure. Multistructural responses, however detailed, belong in the level 3-4 band. The IB's rubric language implies this, but the Thinking Framework makes it explicit.

What This Means for Report Writing

One of the most practical consequences of using the Thinking Framework depth mapping is that it makes your written feedback specific enough to be acted upon. Hattie and Timperley (2007), in their influential review of feedback research, identify three conditions that make feedback effective: it must address where the student currently is, where they need to go, and how to get there. Generic level descriptions fail all three conditions.

Consider the difference between these two report comments for a Year 9 student in Geography:

Generic: "Ahmed demonstrates a good understanding of migration patterns and writes with clear structure. To improve, he should develop his analysis further and consider a wider range of perspectives."

Thinking Framework-informed: "Ahmed consistently explains cause-and-effect relationships in migration patterns, which places his work solidly at level 5-6. His responses identify push and pull factors and connect them to outcomes. To reach level 7-8, Ahmed needs to evaluate competing perspectives: for instance, examining why the same migration pattern might be described differently by the destination government, the migrants themselves, and environmental scientists. That is the Perspective operation, and his next summative task will be designed to give him the opportunity to practise it."

The second comment tells Ahmed exactly which cognitive operation he is currently performing, which one he needs to develop, and what that will look like in his next task. That is actionable feedback in the sense that metacognition research defines it: the student knows what to do differently, not just that something needs to improve.

This approach also makes moderation conversations more productive. Instead of debating whether a response "analyses" sufficiently, departments can ask: "Is this student performing Cause and Effect thinking or Perspective thinking?" That is an observable, discussable question with a trainable answer.

What to Try With Your Next Summative

Take one MYP summative task you are currently planning. Before you write the criteria descriptors, identify the highest Thinking Framework operation you want to assess at level 7-8. Design the task so that a student cannot reach full marks without performing that operation.

If you are assessing level 7-8 and the operation is Perspective, the task needs to present a claim or situation that genuinely has multiple legitimate interpretations. The student's job is to evaluate those interpretations, not describe them. A task that asks students to "consider different viewpoints" does not automatically require Perspective thinking. A task that asks students to "evaluate which viewpoint is better supported by the evidence you have gathered, and explain what evidence would be needed to change your conclusion" does.

Then write your rubric level descriptions starting from the highest band and working downwards. Level 7-8 describes Perspective or Systems Thinking. Level 5-6 describes what Cause and Effect thinking looks like in this task. Level 3-4 describes Compare or Part-Whole thinking. Level 1-2 describes Classify or Sequence thinking. The criteria now form a cognitive ladder, not just a detail scale.

Run the first summative using this approach. Compare the quality of student responses with your previous cohort's work on a similar topic. The shift is usually visible in the first marking session: fewer responses that are long but shallow, more responses that are shorter but genuinely analytical.

For further support with IB assessment design and applying the Thinking Framework across your department, the resources linked below offer practical starting points.

Key Takeaways

  1. Task design determines ceiling: The cognitive operation in the task sets the maximum level a student can demonstrate. No matter how capable the student, a Classify task cannot produce Perspective thinking.
  2. Erickson's generalisation level is the MYP target: Assessing at the fact or topic level is not what the MYP rubric is designed to reward. The upper bands require generalisation and theory-level responses.
  3. The depth mapping is a design tool, not just a marking tool: Use the Thinking Framework operations before you write the task, not after you have received the student responses.
  4. Three-level architecture: Every MYP summative should contain factual tasks (levels 1-4), conceptual tasks (levels 5-6), and transfer tasks (levels 7-8). Most only have the first level.
  5. Feedback becomes specific: Naming the cognitive operation a student is using or needs to develop transforms generic level descriptions into actionable next steps.

Further Reading: Key Papers on MYP Assessment and Conceptual Understanding

These papers provide the theoretical and empirical foundations for the assessment approach described in this guide.

Erickson, H. L., & Lanning, L. A. (2014). Concept-Based Curriculum and Instruction for the Thinking Classroom.

The foundational text for understanding the Structure of Knowledge model that underpins IB curriculum design. Erickson's distinction between facts, topics, concepts, generalisations, and theory is essential for understanding why most MYP assessments assess at the wrong level. Directly applicable to any MYP teacher designing summative tasks.

Stern, J., Ferraro, K., & Mohnkern, J. (2017). Tools for Teaching Conceptual Understanding, Secondary.

Stern et al.'s three-level assessment architecture (factual, conceptual, transfer) maps directly onto the MYP achievement bands. The book provides worked examples across multiple subjects showing exactly how to design transfer tasks. Every MYP coordinator should have this on their professional reading list.

Wiggins, G., & McTighe, J. (2005). Understanding by Design (Expanded 2nd Edition).

Wiggins and McTighe's "backward design" principle is the most widely adopted framework for designing assessments before planning instruction. Their six facets of understanding provide a complementary lens to the Thinking Framework depth mapping, with particular strength in defining what genuine transfer looks like across disciplines.

Hattie, J., & Timperley, H. (2007). The Power of Feedback.

The definitive empirical review of feedback research, identifying three essential elements: where the student is now, where they need to go, and how to bridge the gap. Hattie and Timperley's model explains precisely why generic rubric level descriptions fail to improve student performance, and provides the theoretical basis for the Thinking Framework-informed report comments described in this guide.

Webb, N. L. (1997). Alignment of Assessment, Curriculum, and Instruction.

Webb's original articulation of Depth of Knowledge as a property of tasks rather than students remains one of the clearest arguments in assessment research for designing cognitive demand into questions rather than hoping for it in responses. His framework is the direct precursor to the level-to-operation mapping described in this guide.

Loading audit...

Most MYP rubrics ask teachers to assess "conceptual understanding" at levels 5 to 8. Most MYP teachers end up assessing whether students can describe things clearly. The gap between those two outcomes is not a marking problem. It is a task design problem.

Key Takeaways

  1. Recall is not understanding: A student who can name the parts of a cell is not demonstrating conceptual understanding. A student who can explain why selective permeability matters is.
  2. The Thinking Framework maps directly to MYP levels: Each of the eight cognitive operations corresponds to a specific achievement level band, giving teachers a practical tool for task design.
  3. Three-level assessment architecture: Every MYP summative should have tasks at factual, conceptual, and transfer levels. Most only have the first.
  4. Erickson's generalisation level is the target: The IB asks students to work at the level of generalisations and theories, not facts and topics. Most assessments stop short of this.
  5. Report comments write themselves: When you assess with the depth mapping, you can cite the exact cognitive operation a student is using, making written feedback specific and actionable.

The Problem: Teachers Grade What Is Easy to Measure

A Year 9 history teacher asks students to "explain the causes of the First World War." Forty responses arrive. They all list four or five causes from the lesson notes. They vary slightly in phrasing and order. Most teachers award level 5 or 6 because the descriptions are detailed and accurate.

But detailed description is not analysis. And accurate recall is not conceptual understanding.

The IB's MYP criterion descriptors explicitly use words like "analyse", "evaluate", and "synthesise" at the higher achievement levels. Yet when the task only asks students to describe, teachers have no choice but to mark what they receive. The rubric has been rendered meaningless at levels 7 and 8 because the task never required that depth of thinking.

This is not a teacher failure. The IB's own guidance on task design is less prescriptive than its guidance on rubric descriptors. Teachers understand that level 7 means something more rigorous than level 3, but the training rarely goes deep enough on how to design a task that actually elicits the difference. The result is what Dylan Wiliam (2011) calls "assessment validity drift": the descriptors claim to assess one thing, but the task actually measures another.

What the IB Actually Means by Conceptual Understanding

Lynn Erickson developed the most useful model for understanding what the IB means here. In Concept-Based Curriculum and Instruction for the Thinking Classroom, Erickson and Lanning (2014) describe a Structure of Knowledge with five layers:

Layer Example (Biology) Assessment Implication
Facts The cell membrane is made of phospholipids Level 1-2 assessment: recall and identification
Topics Cell membrane structure Topic-level description: names components
Concepts Selective permeability Level 3-4: describes the concept with detail
Generalisations Cells maintain internal conditions by controlling what moves across membranes Level 5-8 target: explains relationships
Theory Cell theory: living organisms are composed of cells that maintain homeostasis Level 7-8 synthesis: evaluates across systems

The IB's stated aim is that students work at the generalisation and theory levels, not the fact and topic levels. Most MYP summative tasks stop at the concept level at best. A student who "describes selective permeability" has reached level 3 or 4 in Erickson's model. The task that asks them to "explain why cells would cease to function if membrane permeability became non-selective" is pushing towards the generalisation level, which is where levels 5 to 8 live.

Grant Wiggins and Jay McTighe (2005) make the same point in Understanding by Design. Their "facets of understanding" model identifies six dimensions of genuine understanding, the highest of which is transfer: can the student apply their understanding to a new context they have never encountered before? If your summative task is one the student has practised, you are not assessing transfer. You are assessing familiarity.

The Thinking Framework Depth Mapping

The Thinking Framework's eight cognitive operations map directly onto MYP achievement levels. This is Paul Main's core insight: the operations are not just thinking tools. They are depth indicators. Each operation requires a specific kind of cognitive work that aligns with what the IB expects at a given level band.

MYP Level Cognitive Demand Thinking Framework Operations What Student Work Looks Like
1-2 (Limited) Recall, identify Classify (sort into given categories), Sequence (put in given order) Lists facts, follows a template, copies structure
3-4 (Adequate) Describe, outline Compare (similarities and differences), Part-Whole (break down into components) Describes with some detail, identifies components, makes basic comparisons
5-6 (Substantial) Analyse, explain Cause and Effect (explain why), Analogy (connect to other contexts) Explains relationships, identifies causes, transfers to new contexts
7-8 (Excellent) Evaluate, synthesise Perspective (multiple viewpoints), Systems Thinking (interconnections) Evaluates competing arguments, synthesises across sources, identifies systemic patterns

This mapping does something that standard Bloom's Taxonomy guidance does not do for the MYP context: it gives you a concrete cognitive operation to design around, not just a verb to look for in student writing. "Analyse" is an instruction to students. "Cause and Effect" is the thinking structure you build the task around.

The distinction matters because Bloom's verbs describe what you want students to do in their response. The Thinking Framework operations describe the mental structure that makes that response possible. When you design a task that requires Cause and Effect thinking, you are not just hoping students will analyse. You are building the question so that analysis is the only way to answer it.

Norman Webb (1997) made a similar argument with his Depth of Knowledge framework: the cognitive demand is a property of the task, not the student's response. A Webb's Depth of Knowledge level 1 task cannot produce level 4 responses regardless of how capable the student is. The same logic applies here. If your MYP task only requires Classify thinking, you cannot award level 7 even if a student's response is beautifully written.

Designing Assessment Tasks at Each Level

Here are worked examples showing how each Thinking Framework operation translates into an MYP task. The subject examples are drawn from different disciplines to show that the mapping is not subject-specific.

Level Operation Example Task
1-2 Classify Sort these ten historical events into the categories 'political', 'economic', and 'social'.
1-2 Sequence Arrange these stages of mitosis in the correct order and name each one.
3-4 Compare Identify three similarities and three differences between the causes of the First and Second World Wars.
3-4 Part-Whole Break down a short story into its narrative components: setting, protagonist, conflict, resolution. Describe what each contributes.
5-6 Cause and Effect Explain why industrialisation caused urbanisation in 19th-century Britain, then predict where you would expect to see similar patterns emerging in a developing economy today.
5-6 Analogy A student says that a cell is like a factory. Evaluate this analogy: which aspects of cellular function does it explain well, and which aspects does it fail to capture?
7-8 Perspective Evaluate the claim that globalisation benefits everyone by examining it from the perspectives of a multinational company, a rural farmer in a developing country, and an environmental scientist.
7-8 Systems Thinking A government introduces a sugar tax. Map all the likely effects across health outcomes, food industry behaviour, household economics, and public attitudes. Identify which effects might be self-reinforcing.

Notice how the Cause and Effect task at level 5-6 is designed so that the student cannot answer it by describing. They have to explain a mechanism and then transfer that mechanism to a new context. The Perspective task at level 7-8 is designed so that a student who simply lists facts from three viewpoints will produce a level 5-6 response. To reach level 7-8, they have to evaluate the competing claims, which requires synthesising across the perspectives rather than merely reporting them.

This is what metacognition research consistently shows: students can only demonstrate the cognitive complexity that the task demands. Building tasks that constrain the thinking operation is the precondition for accurate assessment.

The Three-Level Assessment Architecture

Julie Stern, Krista Ferraro, and Juliet Mohnkern (2017) describe what they call "transfer task design" in Tools for Teaching Conceptual Understanding. Their argument is that every strong summative assessment needs tasks at three distinct levels, which they call factual, conceptual, and transfer. The MYP achievement bands make this structure particularly legible.

| Level | Question Type | MYP Achievement Band | Thinking Framework Operation |
|-------|---------------|----------------------|------------------------------|
| 1 — Factual | Do they know the content? | 1-4 | Classify, Sequence, Compare, Part-Whole |
| 2 — Conceptual | Can they explain the relationships? | 5-6 | Cause and Effect, Analogy |
| 3 — Transfer | Can they apply it to a new context? | 7-8 | Perspective, Systems Thinking |

A properly designed MYP summative has all three levels present. A student who completes only the factual level questions is capped at level 4. A student who also completes the conceptual level questions demonstrates level 5-6 capability. A student who reaches the transfer level and demonstrates they can evaluate from multiple perspectives or map systemic effects earns level 7-8.
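The band-capping logic above can be expressed as a simple lookup. The sketch below is purely illustrative (the operation names, band boundaries, and function are my own encoding of the table, not an official IB tool), but it makes the core claim concrete: the ceiling of a task is set by the highest operation it demands, not by anything the student does.

```python
# Illustrative sketch: map each Thinking Framework operation to its MYP
# achievement band, then derive the ceiling a task imposes from the set of
# operations it actually requires. The mapping follows the table above;
# nothing here is an official IB specification.

OPERATION_BAND = {
    "classify": (1, 2),
    "sequence": (1, 2),
    "compare": (3, 4),
    "part-whole": (3, 4),
    "cause-effect": (5, 6),
    "analogy": (5, 6),
    "perspective": (7, 8),
    "systems-thinking": (7, 8),
}

def task_ceiling(operations):
    """Highest achievement level a task can evidence, given the cognitive
    operations it demands of the student. An empty task evidences nothing."""
    if not operations:
        return 0
    return max(OPERATION_BAND[op][1] for op in operations)

# A task that only asks students to sequence and compare caps at level 4,
# however detailed and accurate the responses are.
print(task_ceiling(["sequence", "compare"]))                      # 4
print(task_ceiling(["compare", "cause-effect", "perspective"]))   # 8
```

The point of the sketch is the asymmetry: adding a higher operation raises the ceiling, but adding more low-level operations never does.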

Most MYP summatives that teachers share at moderation sessions have level 1 and sometimes level 2 tasks. The transfer task is either absent or disguised as a writing task ("write a report that evaluates...") without scaffolding the specific cognitive operation being assessed.

Wiggins and McTighe (2005) are direct on this point: if you only assess what students can do with familiar material, you are assessing memory, not understanding. The transfer task does not need to be an entirely new topic. It needs to present the concept in a context the student has not previously studied. An economics student who has studied supply and demand in the context of global commodity markets should be assessed on their ability to apply those concepts to a local housing shortage they have not analysed before. That is a transfer task.

Common Assessment Mistakes in the MYP

Three patterns appear so regularly in MYP moderation that they deserve naming directly.

Using the verb "analyse" in the task but accepting "describe" in the marking. This is the most common problem. The task says "analyse the impact of..." but the level 5-6 descriptor has been written so broadly that detailed description earns the band. The solution is to build the analysis operation into the task design rather than hoping students will spontaneously perform it.

Over-scaffolding so the rubric does the thinking. Some teachers provide such detailed rubric criteria that the student only needs to match their response to the descriptor rather than construct an original response. At level 7-8, rubric descriptors should describe the quality of thinking, not the content of the answer. If your descriptor for level 8 includes the specific arguments the student should make, you have described a level 4 task with level 8 packaging.

Assessing the product, not the process. In subjects like Design and Drama, teachers sometimes mark the finished artefact rather than the thinking that produced it. A beautiful model that was built by following a template demonstrates Sequence thinking at level 1-2. A model that was designed to solve a novel constraint, tested against criteria, and revised in response to feedback demonstrates Systems Thinking at level 7-8. The products look similar. The cognitive processes behind them are entirely different.

Dylan Wiliam's (2011) research on formative assessment is useful here. He argues that the most effective assessment practice involves teachers making the learning intention visible at two levels: the surface level (what students will do) and the deep level (what cognitive structure they will use to do it). MYP rubric descriptors typically only specify the surface level. The Thinking Framework operations specify the deep level.

Rubric Design Using the Thinking Framework

The following example shows how to upgrade a standard MYP-style rubric by overlaying the Thinking Framework operations. The subject is Year 10 History, Criterion B (Investigating).

Before (standard MYP-style descriptor):

| Level | Standard Descriptor |
|-------|---------------------|
| 7-8 | The student consistently analyses and evaluates a range of sources and demonstrates a thorough understanding of historical significance. |
| 5-6 | The student analyses some sources and demonstrates a substantial understanding of historical significance. |
| 3-4 | The student describes some sources and demonstrates an adequate understanding of historical significance. |

After (Thinking Framework-informed descriptor):

| Level | TF Operation | Thinking Framework-Informed Descriptor |
|-------|--------------|----------------------------------------|
| 7-8 | Perspective + Systems Thinking | The student evaluates sources by examining competing interpretations and identifying how the historian's standpoint, context, or purpose shapes the argument. They identify patterns across sources that reveal systemic factors in historical change. |
| 5-6 | Cause and Effect | The student explains how specific sources provide evidence for causal claims. They make explicit connections between historical evidence and the factors they identify as significant, rather than treating evidence as illustration. |
| 3-4 | Compare + Part-Whole | The student describes what sources show and identifies similarities or differences between them. They can break a source down into its component claims but do not yet explain why those claims are significant. |

The after-version does something the before-version does not: it tells the teacher what the student is cognitively doing, not just how much they are doing it. The difference between level 5-6 and level 7-8 is not quantity ("analyses some" versus "analyses a range of"). It is the cognitive operation: explaining causes versus evaluating perspectives.

This also makes the rubric usable as a formative assessment tool. Partway through a unit, you can ask students: "Which level is your current draft at?" A student can identify whether they are writing at the Compare level or the Cause and Effect level. That is a meaningful metacognitive question. "Is this adequately detailed?" is not.

The SOLO Taxonomy Connection

Teachers already familiar with SOLO Taxonomy (Biggs and Collis, 1982) will recognise the parallel. SOLO's five levels move from prestructural through unistructural and multistructural to relational and extended abstract. The Thinking Framework operations map onto this progression in a way that makes SOLO practically useful in the IB context.

Where SOLO provides a description of the structure of student responses, the Thinking Framework provides the operation that produces that structure. A student at SOLO's "relational" level is performing Cause and Effect or Compare thinking. A student at "extended abstract" is performing Perspective or Systems Thinking. The two frameworks are complementary: SOLO tells you what you see in the work, the Thinking Framework tells you what to design for.

This connection also clarifies a frequent confusion in MYP marking: teachers sometimes award level 7-8 for a very detailed response that demonstrates extensive factual knowledge. Extensive factual knowledge, however well organised, is SOLO "multistructural": it lists many related pieces of information without connecting them into a relational or causal structure. Multistructural responses, however detailed, belong in the level 3-4 band. The IB's rubric language implies this, but the Thinking Framework makes it explicit.

What This Means for Report Writing

One of the most practical consequences of using the Thinking Framework depth mapping is that it makes your written feedback specific enough to be acted upon. Hattie and Timperley (2007), in their influential review of feedback research, identify three conditions that make feedback effective: it must address where the student currently is, where they need to go, and how to get there. Generic level descriptions fail all three conditions.

Consider the difference between these two report comments for a Year 9 student in Geography:

Generic: "Ahmed demonstrates a good understanding of migration patterns and writes with clear structure. To improve, he should develop his analysis further and consider a wider range of perspectives."

Thinking Framework-informed: "Ahmed consistently explains cause-and-effect relationships in migration patterns, which places his work solidly at level 5-6. His responses identify push and pull factors and connect them to outcomes. To reach level 7-8, Ahmed needs to evaluate competing perspectives: for instance, examining why the same migration pattern might be described differently by the destination government, the migrants themselves, and environmental scientists. That is the Perspective operation, and his next summative task will be designed to give him the opportunity to practise it."

The second comment tells Ahmed exactly which cognitive operation he is currently performing, which one he needs to develop, and what that will look like in his next task. That is actionable feedback in the sense that metacognition research defines it: the student knows what to do differently, not just that something needs to improve.

This approach also makes moderation conversations more productive. Instead of debating whether a response "analyses" sufficiently, departments can ask: "Is this student performing Cause and Effect thinking or Perspective thinking?" That is an observable, discussable question with a trainable answer.
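The structure behind the Ahmed comment (name the current operation, name the next one up the ladder, say how the next task will elicit it) can be templated. The sketch below is a hypothetical illustration: the ladder, band labels, and wording are my own encoding of the pattern described above, not a tool this guide or the cited research prescribes.

```python
# Illustrative sketch of the feedback structure described above: identify the
# operation the student currently performs, the band it maps to, and the next
# operation up the cognitive ladder. All names and templates are hypothetical.

LADDER = ["Classify/Sequence", "Compare/Part-Whole",
          "Cause and Effect/Analogy", "Perspective/Systems Thinking"]
BANDS = ["1-2", "3-4", "5-6", "7-8"]

def feedback(student, current_step):
    """Build a comment covering Hattie and Timperley's three feedback
    questions: where the student is, where they need to go, how to get there."""
    now, band = LADDER[current_step], BANDS[current_step]
    if current_step == len(LADDER) - 1:
        return f"{student} is working with {now} at level {band}."
    nxt, nxt_band = LADDER[current_step + 1], BANDS[current_step + 1]
    return (f"{student} consistently performs the {now} operation, placing "
            f"the work at level {band}. To reach level {nxt_band}, the next "
            f"step is the {nxt} operation; the next summative task will be "
            f"designed to give the opportunity to practise it.")

print(feedback("Ahmed", 2))
```

A template like this is a starting point for departmental consistency, not a replacement for the subject-specific example ("examining why the same migration pattern might be described differently by...") that makes the comment genuinely actionable.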

What to Try With Your Next Summative

Take one MYP summative task you are currently planning. Before you write the criteria descriptors, identify the highest Thinking Framework operation you want to assess at level 7-8. Design the task so that a student cannot reach full marks without performing that operation.

If you are assessing level 7-8 and the operation is Perspective, the task needs to present a claim or situation that genuinely has multiple legitimate interpretations. The student's job is to evaluate those interpretations, not describe them. A task that asks students to "consider different viewpoints" does not automatically require Perspective thinking. A task that asks students to "evaluate which viewpoint is better supported by the evidence you have gathered, and explain what evidence would be needed to change your conclusion" does.

Then write your rubric level descriptions starting from the highest band and working downwards. Level 7-8 describes Perspective or Systems Thinking. Level 5-6 describes what Cause and Effect thinking looks like in this task. Level 3-4 describes Compare or Part-Whole thinking. Level 1-2 describes Classify or Sequence thinking. The criteria now form a cognitive ladder, not just a detail scale.

Run the first summative using this approach. Compare the quality of student responses with your previous cohort's work on a similar topic. The shift is usually visible in the first marking session: fewer responses that are long but shallow, more responses that are shorter but genuinely analytical.

For further support with IB assessment design and applying the Thinking Framework across your department, the resources linked below offer practical starting points.

Key Takeaways

  1. Task design determines ceiling: The cognitive operation in the task sets the maximum level a student can demonstrate. No matter how capable the student, a Classify task cannot produce Perspective thinking.
  2. Erickson's generalisation level is the MYP target: Assessing at the fact or topic level is not what the MYP rubric is designed to reward. The upper bands require generalisation and theory-level responses.
  3. The depth mapping is a design tool, not just a marking tool: Use the Thinking Framework operations before you write the task, not after you have received the student responses.
  4. Three-level architecture: Every MYP summative should contain factual tasks (levels 1-4), conceptual tasks (levels 5-6), and transfer tasks (levels 7-8). Most only have the first level.
  5. Feedback becomes specific: Naming the cognitive operation a student is using or needs to develop transforms generic level descriptions into actionable next steps.

Further Reading: Key Papers on MYP Assessment and Conceptual Understanding

These papers provide the theoretical and empirical foundations for the assessment approach described in this guide.

Erickson, H. L., & Lanning, L. A. (2014). Concept-Based Curriculum and Instruction for the Thinking Classroom.

The foundational text for understanding the Structure of Knowledge model that underpins IB curriculum design. Erickson's distinction between facts, topics, concepts, generalisations, and theory is essential for understanding why most MYP assessments assess at the wrong level. Directly applicable to any MYP teacher designing summative tasks.

Stern, J., Ferraro, K., & Mohnkern, J. (2017). Tools for Teaching Conceptual Understanding, Secondary.

Stern et al.'s three-level assessment architecture (factual, conceptual, transfer) maps directly onto the MYP achievement bands. The book provides worked examples across multiple subjects showing exactly how to design transfer tasks. Every MYP coordinator should have this on their professional reading list.

Wiggins, G., & McTighe, J. (2005). Understanding by Design (Expanded 2nd Edition).

Wiggins and McTighe's "backward design" principle is the most widely adopted framework for designing assessments before planning instruction. Their six facets of understanding provide a complementary lens to the Thinking Framework depth mapping, with particular strength in defining what genuine transfer looks like across disciplines.

Hattie, J., & Timperley, H. (2007). The Power of Feedback.

The definitive empirical review of feedback research, identifying three essential elements: where the student is now, where they need to go, and how to bridge the gap. Hattie and Timperley's model explains precisely why generic rubric level descriptions fail to improve student performance, and provides the theoretical basis for the Thinking Framework-informed report comments described in this guide.

Webb, N. L. (1997). Alignment of Assessment, Curriculum, and Instruction.

Webb's original articulation of Depth of Knowledge as a property of tasks rather than students remains one of the clearest arguments in assessment research for designing cognitive demand into questions rather than hoping for it in responses. His framework is the direct precursor to the level-to-operation mapping described in this guide.
