MEGA: Mathematics Explanations through Games by AI LLMs
Background (The Problem): Many students struggle to understand Mathematics (Math) because existing methods for explaining some concepts fall short. This discourages some students from pursuing Math-related disciplines at higher education institutions (HEIs). Notably, in a relatively recent PISA survey, only 13% of 15-year-old Swedish students scored above level 4 in Math, and boys across the OECD outperformed girls by five score points.
Aim:
MEGA aims to improve students' learning of Math through a novel combination of (1) the Socratic method and (2) simplified gamification, delivered by large language models (LLMs).
Methodology:
MEGA Flowchart (which describes the instruction used):

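The full MEGA instruction is linked further down (see the MEGA Instruction link). Purely as an illustration of the idea, the following is a minimal Python sketch of what a Socratic, gamified tutoring loop around an LLM could look like. The system prompt, the call_llm placeholder, and the point-per-step scoring rule are assumptions made for this sketch, not MEGA's actual instruction.

# Minimal sketch of a Socratic, gamified tutoring loop around an LLM.
# NOTE: call_llm() is a hypothetical placeholder for any chat-completion API;
# the prompt and point scheme are illustrative assumptions, not the MEGA instruction.

SYSTEM_PROMPT = (
    "You are a math tutor. Never give the full solution at once. "
    "Break the problem into small sub-questions and ask them one at a time. "
    "Confirm or gently correct each student answer before the next step, "
    "award 1 point per correct step, and say 'final answer' when done."
)

def call_llm(messages: list[dict]) -> str:
    """Hypothetical wrapper around an LLM chat API; plug in your provider here."""
    raise NotImplementedError

def tutor(problem: str, max_turns: int = 10) -> int:
    """Run one interactive Socratic session and return the student's score."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Problem: {problem}"},
    ]
    score = 0
    for _ in range(max_turns):
        reply = call_llm(messages)           # feedback on the last answer + next sub-question
        print(reply)
        messages.append({"role": "assistant", "content": reply})
        if "correct" in reply.lower():       # crude gamification: 1 point per correct step
            score += 1
        if "final answer" in reply.lower():  # tutor signals that the problem is solved
            break
        answer = input("Your answer: ")
        messages.append({"role": "user", "content": answer})
    return score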
For evaluation, we sampled 120 questions (from the GSM8K and MATH datasets) and assigned them in sets of 15, with each set judged by 2 students who compared explanation methods A (CoT) and B (MEGA). After each question, the students were asked: which explanation was easier to understand, A or B?
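The exact aggregation behind the agreement figures is not spelled out here. As one plausible reading, the short Python sketch below (with made-up votes, not the study's data) counts a question as a unanimous win for a method only when both students judging it prefer that method, then reports the percentage per dataset.

from collections import defaultdict

# Illustrative aggregation of pairwise A/B judgements (made-up votes, not study data).
# A question counts as a unanimous win for B (MEGA) only when both students prefer B.
votes = [
    # (dataset, question_id, student_1_choice, student_2_choice)
    ("GSM8K", 1, "B", "B"),
    ("GSM8K", 2, "A", "B"),
    ("MATH",  1, "B", "B"),
    ("MATH",  2, "B", "B"),
]

unanimous_b = defaultdict(int)
total = defaultdict(int)
for dataset, _qid, s1, s2 in votes:
    total[dataset] += 1
    if s1 == s2 == "B":
        unanimous_b[dataset] += 1

for dataset, n in total.items():
    print(f"{dataset}: {100 * unanimous_b[dataset] / n:.1f}% unanimous preference for B (MEGA)")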
Results
According to the bar graph, MEGA's explanations are preferred by a clear margin for the MATH dataset (which contains more difficult, university-level questions), while its advantage is smaller for the GSM8K dataset (grade-school math with simple word problems).
Figure: % of unanimous student agreement (MEGA is much better for hard MATH problems).

Students' feedback:
- B (MEGA) is more engaging and extremely easy to follow. Even though A (CoT) is broken down into smaller parts, it is hard to follow. B lets me go through the problem step by step in an engaging way, and I appreciate that it makes sure you get each step correct before moving on; this helps you know which step you may need help with.
- In some cases, I preferred option B because it was more interactive and helped keep me engaged with the problem-solving process. It encouraged me to think and work through the steps, especially in questions like 1 and 7, where I was asked to perform calculations myself.
- When A is better than B, it is generally for simpler math questions, such as 1, 9, 10, 12, 13, and 14. When the question is more challenging/complex, the interactive option (B) is generally better because it is broken down into steps and also asks us to give input. To give input, we first have to read and understand the explanation it gives.
Key Takeaways:
- Overall, MEGA is better than CoT (step-by-step) based on student agreement.
- MEGA is slightly better than CoT on the GSM8K dataset but much better on the MATH dataset, i.e. MEGA's advantage grows as the math problems get harder.
- 79% of the students evaluated prefer interactive learning. Hence, MEGA's winning score of 47.5% on the MATH dataset, although superior, still leaves room for improvement.
- MEGA exposes the limitations of LLMs (e.g. inconsistencies across sub-questions) more clearly than CoT does.
Demo: www.megamath.se (ID: mgone@ltu.se, Password: demoMEGA)
(MEGA Instruction: MEGA_files - Google Drive.)
Special thanks to the students of the Text Mining course (D7058E) 2024/25 and the Blended Intensive Programme in Germany, 2024 for their evaluation.
Contact
Oluwatosin Adewumi
- Postdoctoral researcher
- 0920-49
- oluwatosin.adewumi@ltu.se