Can an AI agent tutor students through math — teaching them to find the answer rather than handing it over — without hallucinating?



Fibo is our internal R&D project exploring how AI can support education. It is an AI math tutor for secondary and high-school students that helps them understand fundamental concepts and practice them through a teaching-first chat, learning videos, and quizzes. The goal was to show that an AI agent can guide learning, not just spit out answers.
Fibo — AI Math Tutor
2024
We wanted to test whether an AI agent could stand in for a tutor, making one-to-one help more accessible and affordable. The central risk was obvious: AI hallucinations. If the tutor confidently teaches the wrong method, it does more harm than good.
A from-scratch proof of concept: a cross-platform Flutter app, built by a team of two, backed by a NestJS/PostgreSQL/Redis service, with LangChain and OpenAI driving a teaching-first chat.
The problem space
The promise of AI in education is one-to-one tutoring at near-zero marginal cost — something only wealthier families can buy today. But education is exactly where hallucinations are most dangerous: a tutor that's confidently wrong teaches bad methods. So the real R&D question wasn't "can AI answer math questions" (ChatGPT can, roughly) — it was "can AI *teach* math reliably enough to trust with a student," which means controlling accuracy and resisting hallucination on exam-level problems.
Two developers built the full cross-platform MVP (Flutter)
Three ways to learn in one app: AI chat tutor, learning videos, and quizzes
Zero answers handed over without teaching: Fibo guides students to find them
Technology choices
What we evaluated, what we chose, and why.
Flutter: One codebase let just two developers ship a high-quality MVP for Android and iOS, the lean-team economics that make internal R&D viable.
NestJS, PostgreSQL, and Redis: NestJS, a progressive Node.js framework, serves the API; PostgreSQL stores data reliably; and Redis moves performance-heavy work off the request path into background processing, keeping response times low.
OpenAI: The NLP/generation core, enabling context-aware tutoring interactions.
LangChain: A chain of prompts based on the Tree-of-Thought method, which (per our tests) dramatically improved accuracy and relevance. This was the key lever against hallucination.
TeX: A typesetting standard for rendering mathematical formulas precisely on screen, a genuine front-end challenge for a math app.
Rejected as insufficient alone. It performed adequately on basic exercises but hallucinated on advanced problems — which is exactly why the Tree-of-Thought chain was needed.
Not needed at PoC stage. Prompt engineering plus Tree-of-Thought met the accuracy bar far faster.
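The source does not describe the exact shape of the Tree-of-Thought chain, so the following is only a minimal sketch of the general idea: at each step several candidate "thoughts" are proposed, each partial solution is scored, and only the best branches are kept. The names `treeOfThought`, `Propose`, and `Evaluate` are hypothetical, and both callbacks are stubs standing in for model calls. They are kept synchronous here for brevity; a real model call would be asynchronous.

```typescript
// Hypothetical types for illustration; none of these names come from the Fibo codebase.
type Thought = { steps: string[]; score: number };

// Proposes candidate next steps for a partial solution (stand-in for a model call).
type Propose = (problem: string, steps: string[]) => string[];
// Rates a partial solution from 0 (wrong) to 1 (promising) (stand-in for a model call).
type Evaluate = (problem: string, steps: string[]) => number;

// Breadth-first Tree-of-Thought search with a fixed beam width.
function treeOfThought(
  problem: string,
  propose: Propose,
  evaluate: Evaluate,
  depth: number,
  beamWidth: number,
): Thought {
  let frontier: Thought[] = [{ steps: [], score: 0 }];
  for (let level = 0; level < depth; level++) {
    const candidates: Thought[] = [];
    for (const thought of frontier) {
      for (const next of propose(problem, thought.steps)) {
        const steps = [...thought.steps, next];
        candidates.push({ steps, score: evaluate(problem, steps) });
      }
    }
    if (candidates.length === 0) break; // no branch could be extended
    // Keep only the best-scoring branches; weak reasoning paths are pruned here.
    candidates.sort((a, b) => b.score - a.score);
    frontier = candidates.slice(0, Math.max(1, beamWidth));
  }
  const best = frontier[0];
  if (best === undefined) throw new Error("search produced no thoughts");
  return best;
}

// Stubbed model behaviour for the problem "Solve 2x + 3 = 11".
const demoPropose: Propose = (_problem, steps) => {
  if (steps.length === 0) {
    return ["Subtract 3 from both sides: 2x = 8", "Add 3 to both sides: 2x = 14"];
  }
  const last = steps[steps.length - 1] ?? "";
  return last.includes("2x = 8")
    ? ["Divide both sides by 2: x = 4"]
    : ["Divide both sides by 2: x = 7"];
};

// Stubbed evaluator: any path containing the wrong first move scores low.
const demoEvaluate: Evaluate = (_problem, steps) =>
  steps.some((s) => s.startsWith("Add 3")) ? 0.1 : 0.9;

const bestPath = treeOfThought("Solve 2x + 3 = 11", demoPropose, demoEvaluate, 2, 2);
console.log(bestPath.steps.join(" -> "));
```

With the stubs above, the branch that starts with the incorrect "Add 3" move is explored but ranked below the correct one, which is the pruning behaviour a plain single prompt does not have.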
The PoC in action
The working thing — capabilities, not a scope list.
The tutor is tuned to help students understand *how* to reach an answer rather than just providing it — the difference between a tutor and an answer key.
A Tree-of-Thought prompt chain improves answer accuracy, and TeX renders formulas cleanly — we iterated specifically on getting AI-generated formulas to display correctly.
Learning videos and quizzes let students study and practice alongside the chat.
A two-person team delivered the MVP on both Android and iOS via Flutter.
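Getting AI-generated formulas to display correctly usually starts with separating math from prose before anything reaches the renderer. The sketch below is one possible way to do that, not a description of Fibo's implementation. It assumes the model marks inline math with single `$...$` delimiters (an assumption; the source does not say which delimiters are used), and the names `splitMath` and `hasBalancedBraces` are hypothetical.

```typescript
// Hypothetical helper types; the delimiter convention is an assumption.
type Segment = { kind: "text" | "math"; content: string };

// Splits a tutor reply into prose and inline-math segments.
// Limitation: handles only single "$...$" pairs, not "$$...$$" or escaped "\$".
function splitMath(reply: string): Segment[] {
  const parts = reply.split("$");
  // An even number of "$" yields an odd number of parts.
  // Otherwise a delimiter is unclosed, so fall back to showing plain text.
  if (parts.length % 2 === 0) {
    return [{ kind: "text", content: reply }];
  }
  const segments: Segment[] = [];
  parts.forEach((part, index) => {
    if (part.length > 0) {
      // Odd-indexed parts sit between a pair of delimiters.
      segments.push({ kind: index % 2 === 1 ? "math" : "text", content: part });
    }
  });
  return segments;
}

// Cheap sanity check before handing a formula to a TeX renderer.
// Limitation: ignores escaped braces such as \{ and \}.
function hasBalancedBraces(tex: string): boolean {
  let depth = 0;
  for (let i = 0; i < tex.length; i++) {
    if (tex[i] === "{") depth++;
    if (tex[i] === "}") depth--;
    if (depth < 0) return false; // a closing brace appeared before its opener
  }
  return depth === 0;
}

const demoSegments = splitMath("Since $a^2 + b^2 = c^2$, we get $c = 5$.");
const mathOnly = demoSegments
  .filter((segment) => segment.kind === "math")
  .map((segment) => segment.content);
console.log(mathOnly);
```

A formula that fails the brace check could be shown as plain text or sent back to the model for a retry, rather than handed to the renderer in a broken state.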
Results & takeaways
Honest feasibility findings.
We proved an AI agent can act as a teaching-first tutor — guiding students to find answers themselves, not just supplying them.
Chaining prompts via the Tree-of-Thought method dramatically improved accuracy on exam-level problems versus naive prompting — the crux of making AI trustworthy enough for education.
Quality splits into AI-independent testing (deterministic, like normal QA) and AI-dependent testing (non-deterministic — small prompt changes swing results). The latter demands continuous iteration; it's an ongoing loop, not a one-time sign-off.
The same Tree-of-Thought, guide-don't-answer approach could extend beyond foundational math into other STEM subjects, exam and test prep, and language learning, and it could be licensed to schools, tutoring platforms, and corporate upskilling programs. Demonstrated hallucination control is what makes it sellable into education, not just demoable.
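The AI-dependent half of the testing split described above cannot rely on a single pass/fail assertion, because the same prompt can produce different replies. One common way to handle this is to run each case several times and compare the pass rate against a threshold. The sketch below illustrates that idea under stated assumptions: `passRate`, `Generate`, and `Check` are hypothetical names, the model is replaced by a deterministic stub, and the threshold of 0.7 is an arbitrary example value, not a figure from the project.

```typescript
// Hypothetical types for illustration; a real Generate would call the model.
type Generate = (prompt: string) => string;
type Check = (reply: string) => boolean;

// Runs the same prompt several times and returns the fraction of replies that pass.
function passRate(generate: Generate, check: Check, prompt: string, runs: number): number {
  if (runs <= 0) throw new Error("runs must be positive");
  let passed = 0;
  for (let i = 0; i < runs; i++) {
    if (check(generate(prompt))) passed++;
  }
  return passed / runs;
}

// Deterministic stub that imitates a flaky model: one reply in four is wrong.
const cannedReplies = ["x = 4", "x = 4", "x = 7", "x = 4"];
let callCount = 0;
const fakeModel: Generate = () => cannedReplies[callCount++ % cannedReplies.length] ?? "";

const isCorrect: Check = (reply) => reply.trim() === "x = 4";

const rate = passRate(fakeModel, isCorrect, "Solve 2x + 3 = 11", 8);
const threshold = 0.7; // example value only
console.log(`pass rate ${rate}, meets threshold: ${rate >= threshold}`);
```

Tracking this rate per prompt version makes it visible when a small prompt change swings results, which is the reason this kind of testing is a continuous loop rather than a one-time sign-off.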