How AI Transforms PDF Learning into Adaptive Quizzes
Discover how RinKuzu's ALSS system, which combines SAINT-Bloom, Dueling DQN, and Gemini, transforms any PDF lecture into a personalized adaptive quiz journey.
The Static Learning Problem
Every student has a folder of lecture notes, textbooks, and handouts — rich content that contains the knowledge they need to master a subject. Yet most of that material sits passive, read once and forgotten. The challenge isn't access to information; it's that static documents can't adapt to the reader. A 300-page textbook treats every student the same way, whether they're seeing the material for the first time or reviewing for an exam.
Research in cognitive science consistently shows that spaced repetition, active recall, and adaptive difficulty are among the most effective learning strategies. But applying these strategies manually requires significant expertise and time — time most students and teachers simply don't have. RinKuzu automates this entire pipeline: from any PDF document to a personalized, continuously adapting practice session.
From PDF to Knowledge Graph: The Content Processor Pipeline
Before a single practice question is generated, RinKuzu's Content Processor transforms the raw PDF into a structured, machine-readable knowledge representation. This pipeline runs once per document and consists of seven steps:
- PDF Loading: Text is extracted from the document using PyPDF2 with caching to avoid reprocessing previously seen files.
- Chunking: The text is split into overlapping chunks (default: 4,000 characters, 400-character overlap) using a sliding window approach, ensuring context isn't lost at boundaries.
- LLM Concept Extraction: Each chunk is processed by Google Gemini (gemini-2.0-flash) to extract discrete knowledge concepts. For each concept, the LLM identifies: a unique concept_id, a name, a detailed definition, related LaTeX formulas, examples, and prerequisite relationships with other concepts. Only newly introduced or defined concepts are extracted — not ones merely listed in headers or table of contents.
- Embedding Generation: Each extracted concept is encoded using SentenceTransformer (all-MiniLM-L6-v2), producing a 384-dimensional embedding vector for semantic similarity comparison.
- Concept Merging: Two rounds of deduplication run: (1) name-based merge groups concepts with identical or near-identical names; (2) embedding-based merge uses cosine similarity (threshold: 0.85) to detect semantic duplicates. In practice, this reduces raw concept counts by ~27%.
- Prerequisite Relation Verification (PRS): Raw prerequisite relations from step 3 (LLM Concept Extraction) are verified using the Prerequisite Ranking Score (PRS) algorithm, which is based on the Reference Distance hypothesis: if a concept's definition has high semantic similarity to another concept's name, a prerequisite relationship is likely. Pairs with PRS > τ (default τ = 0.65) are passed to the LLM for final validation; pairs below the threshold are rejected. This two-stage approach achieves a 50% acceptance rate on the LectureBank benchmark while maintaining reasonable precision.
- Knowledge Graph Construction: Verified prerequisite pairs form directed edges in a DAG. Transitive reduction removes redundant edges, ensuring the graph has minimal edges while preserving all prerequisite relationships. The result: a directed, acyclic prerequisite graph ready for adaptive curriculum planning.
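Three of the steps above (chunking, embedding-based merging, and transitive reduction) are simple enough to sketch in plain Python. This is a minimal illustration, not RinKuzu's code: the function names, the greedy merge strategy, and the use of `>=` at the 0.85 threshold are all assumptions, and the embeddings are passed in as plain lists rather than produced by SentenceTransformer.

```python
from math import sqrt


def chunk_text(text, size=4000, overlap=400):
    """Split text into overlapping chunks with a sliding window."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):  # the last window reached the end
            break
    return chunks


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_duplicates(embeddings, threshold=0.85):
    """Greedy merge (assumed strategy): map each concept id to the first
    previously kept concept whose embedding is similar enough."""
    canonical, kept = {}, []
    for cid, vec in embeddings.items():
        match = next(
            (k for k in kept if cosine(vec, embeddings[k]) >= threshold), None
        )
        canonical[cid] = cid if match is None else match
        if match is None:
            kept.append(cid)
    return canonical


def transitive_reduction(edges):
    """For a DAG, drop edge (u, v) when v stays reachable from u without it."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)

    def reachable(src, dst, skip):
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            for nxt in succ.get(node, ()):
                if (node, nxt) == skip or nxt in seen:
                    continue
                if nxt == dst:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    return {(u, v) for u, v in edges if not reachable(u, v, skip=(u, v))}
```

For example, given the edges limits → derivatives, derivatives → integrals, and limits → integrals, the reduction drops the direct limits → integrals edge because the same ordering is already implied by the two-hop path.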
SAINT-Bloom: Multi-Level Knowledge Tracing
Once the knowledge graph is built, RinKuzu needs to track each learner's mastery state across all concepts and cognitive levels. This is handled by SAINT-Bloom, a variant of the Separated Self-AttentIve Neural Knowledge Tracing model (SAINT; Choi et al., 2020) extended to model mastery at each Bloom's Taxonomy level independently.
Unlike standard knowledge tracing models that produce a single mastery probability per concept, SAINT-Bloom outputs: P(correct | concept, Bloom_level) — a separate probability for each of the six cognitive levels: Remember, Understand, Apply, Analyze, Evaluate, and Create.
This is critical because "knowing" a concept at the Remember level (memorizing a formula) is fundamentally different from "knowing" it at the Create level (designing a novel problem that uses it). Standard models collapse this distinction, leading to the illusion of mastery — students who score well on recall questions fail at application because the model never distinguished between the two levels.
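The effect is easy to see even without a neural model. The sketch below computes plain empirical accuracy over a made-up interaction log, once as a single score and once per Bloom level. It is not SAINT-Bloom itself; the log and the helper names are hypothetical and only illustrate why a collapsed score can hide a weak level.

```python
from collections import defaultdict


def accuracy_overall(log):
    """Single collapsed score: fraction of all answers that were correct."""
    return sum(int(correct) for _, correct in log) / len(log)


def accuracy_by_level(log):
    """Separate score per Bloom level."""
    totals = defaultdict(lambda: [0, 0])  # level -> [hits, attempts]
    for level, correct in log:
        totals[level][0] += int(correct)
        totals[level][1] += 1
    return {level: hits / n for level, (hits, n) in totals.items()}


# Hypothetical log for one concept: eight recall questions answered
# correctly, two application questions answered incorrectly.
log = [("Remember", True)] * 8 + [("Apply", False)] * 2

print(accuracy_overall(log))   # 0.8
print(accuracy_by_level(log))  # {'Remember': 1.0, 'Apply': 0.0}
```

The collapsed score of 0.8 suggests the concept is nearly mastered, while the per-level view shows that every application question was missed.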
SAINT-Bloom achieves AUC 0.8010 on the Junyi Academy benchmark. Its 128-dimensional hidden state captures the learner's full interaction history and serves as the primary input to the curriculum planning agent.
Dueling DQN: Curriculum Sequencing with Hard Action Masking
With a knowledge graph and a learner's current mastery state in hand, the next question is what the learner should practice next. This is the curriculum sequencing problem, and RinKuzu solves it with a Dueling Double Deep Q-Network (D3QN) agent trained in a Gymnasium environment.
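The article names the architecture without giving its equations. As a reminder of what "dueling" and "double" usually mean, the sketch below shows the textbook forms: the dueling head combines a state value with mean-centered advantages, and the Double DQN target lets the online network pick the next action while the target network scores it. Whether RinKuzu's agent uses exactly these forms is an assumption, and the function names are illustrative.

```python
def dueling_q_values(state_value, advantages):
    """Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(reward, gamma, next_q_online, next_q_target, done):
    """y = r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]


print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))                    # [0.0, 1.0, 2.0]
print(double_dqn_target(1.0, 0.5, [0.1, 0.9], [4.0, 2.0], False))  # 2.0
```

Decoupling action selection from action evaluation is what reduces the overestimation bias of a plain DQN target.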
Action Space: Each action represents a (concept, Bloom_level) pair. With N concepts and 6 Bloom levels, the action space contains N×6 discrete actions. RinKuzu uses a concept-agnostic architecture: every concept is processed by the same shared backbone network, enabling the agent to generalize to any number of concepts without retraining.
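One simple way to flatten (concept, Bloom_level) pairs into a single discrete action index is a row-major layout. The article does not say which layout RinKuzu uses, so the ordering and the helper names below are assumptions.

```python
NUM_BLOOM_LEVELS = 6


def encode_action(concept_index, bloom_level):
    """Map (concept_index >= 0, bloom_level in 1..6) to a flat action id."""
    if not 1 <= bloom_level <= NUM_BLOOM_LEVELS:
        raise ValueError("bloom_level must be between 1 and 6")
    return concept_index * NUM_BLOOM_LEVELS + (bloom_level - 1)


def decode_action(action):
    """Inverse of encode_action."""
    concept_index, offset = divmod(action, NUM_BLOOM_LEVELS)
    return concept_index, offset + 1


print(encode_action(2, 3))  # 14
print(decode_action(14))    # (2, 3)
```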
State Representation: The 2,082-dimensional state vector encodes:
- Global features (130-dim): SAINT-Bloom hidden state + session progress scalars
- Per-concept features (N×8): bloom mastery per level, visited flag, prerequisite satisfaction flag
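The dimensions above can be checked with a little arithmetic: 130 + N×8 = 2,082 gives N = 244 concepts, and a 128-dimensional hidden state leaves room for 2 progress scalars in the 130 global features. The sketch below assembles such a vector. The feature ordering and the function names are assumptions; only the sizes come from the text.

```python
HIDDEN_DIM = 128        # SAINT-Bloom hidden state
PROGRESS_DIM = 2        # inferred: 130 global features - 128 hidden
PER_CONCEPT_DIM = 8     # 6 Bloom mastery values + visited + prerequisite flag


def state_dim(num_concepts):
    return HIDDEN_DIM + PROGRESS_DIM + num_concepts * PER_CONCEPT_DIM


def build_state(hidden_state, progress, bloom_mastery, visited, prereq_ok):
    """Concatenate global features, then 8 features per concept."""
    state = list(hidden_state) + list(progress)
    for levels, seen, ok in zip(bloom_mastery, visited, prereq_ok):
        state.extend(levels)                   # 6 per-level mastery values
        state.append(1.0 if seen else 0.0)     # visited flag
        state.append(1.0 if ok else 0.0)       # prerequisite satisfaction flag
    return state


n = 244
state = build_state(
    hidden_state=[0.0] * HIDDEN_DIM,
    progress=[0.0] * PROGRESS_DIM,
    bloom_mastery=[[0.5] * 6 for _ in range(n)],
    visited=[False] * n,
    prereq_ok=[True] * n,
)
print(len(state), state_dim(n))  # 2082 2082
```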
Four-Layer Action Masking
Before action selection, four filters block invalid actions:
- Prerequisite filter: Concept is blocked if any prerequisite has max Bloom mastery < 0.75
- Bloom validity filter: Actions not present in the dataset for that concept are blocked
- Sequential Bloom unlock: Bloom level b+1 cannot be selected unless Bloom b has been attempted
- Mastery filter: Actions where P(correct) ≥ 0.75 are blocked (already mastered)
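The four filters above can be sketched as a single function that returns the actions left unblocked. The data structures (dictionaries and sets keyed by concept name) and the example values are hypothetical; the thresholds and rules follow the list.

```python
MASTERY_THRESHOLD = 0.75
NUM_LEVELS = 6


def valid_actions(mastery, prerequisites, available, attempted):
    """mastery: concept -> list of 6 P(correct) values (levels 1..6)
    prerequisites: concept -> set of prerequisite concepts
    available: set of (concept, level) pairs present in the dataset
    attempted: set of (concept, level) pairs the learner has tried"""
    valid = []
    for concept, levels in mastery.items():
        # 1. Prerequisite filter: every prerequisite needs max mastery >= 0.75.
        if any(max(mastery[p]) < MASTERY_THRESHOLD
               for p in prerequisites.get(concept, ())):
            continue
        for level in range(1, NUM_LEVELS + 1):
            # 2. Bloom validity filter.
            if (concept, level) not in available:
                continue
            # 3. Sequential unlock: the previous level must have been attempted.
            if level > 1 and (concept, level - 1) not in attempted:
                continue
            # 4. Mastery filter: skip what is already mastered.
            if levels[level - 1] >= MASTERY_THRESHOLD:
                continue
            valid.append((concept, level))
    return valid


mastery = {
    "limits": [0.9, 0.8, 0.4, 0.2, 0.1, 0.1],
    "derivatives": [0.3] * 6,
    "integrals": [0.1] * 6,
}
prerequisites = {"derivatives": {"limits"}, "integrals": {"derivatives"}}
available = {(c, lvl) for c in mastery for lvl in range(1, NUM_LEVELS + 1)}
attempted = {("limits", 1), ("limits", 2)}

print(valid_actions(mastery, prerequisites, available, attempted))
# [('limits', 3), ('derivatives', 1)]
```

In this example "integrals" is blocked entirely because its prerequisite "derivatives" has not reached 0.75 at any level, and only the next unattempted level of each open concept remains selectable.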
The agent was trained for 500,000 timesteps on a SAINT-based student simulator, achieving a reward of 58.45 and a Normalized Learning Gain of 0.774, outperforming the Random, Sequential, Greedy, and Q-matrix baselines.
Gemini 2.0 Flash: Adaptive Exercise Generation & Evaluation
When D3QN selects action a* = (concept, Bloom_level), RinKuzu retrieves the concept's metadata (name, definition, formulas, examples) from the Concept Registry and calls Gemini 2.0 Flash to generate a tailored practice question.
| Level | Focus |
|---|---|
| 1. Remember | Recall definitions, formulas, facts |
| 2. Understand | Explain in own words, summarize, distinguish |
| 3. Apply | Use formulas in new situations, calculate |
| 4. Analyze | Compare methods, find patterns, decompose structure |
| 5. Evaluate | Judge correctness, prove/disprove, justify choices |
| 6. Create | Design new problems, compose multi-step solutions |
Gemini outputs structured exercises with five fields: QUESTION, TYPE (multiple_choice or free_form), OPTIONS (A/B/C/D), ANSWER, and EXPLANATION. Temperature is set to 0.7 — diverse enough to prevent question repetition, controlled enough to avoid factual errors.
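A structured response like this still has to be parsed and validated before it reaches the learner. The sketch below assumes a `FIELD: value` text layout with one label per line, which the article does not specify; the sample response and the `parse_exercise` helper are hypothetical.

```python
import re

FIELDS = ("QUESTION", "TYPE", "OPTIONS", "ANSWER", "EXPLANATION")
_LABEL = re.compile(r"^(%s):\s*(.*)$" % "|".join(FIELDS))


def parse_exercise(raw):
    """Split a labeled response into its five fields and validate TYPE."""
    parsed, current = {}, None
    for line in raw.strip().splitlines():
        line = line.strip()
        match = _LABEL.match(line)
        if match:
            current = match.group(1)
            parsed[current] = match.group(2)
        elif current is not None:
            # Continuation line (for example, one option per line).
            parsed[current] = (parsed[current] + "\n" + line).strip()
    missing = [field for field in FIELDS if field not in parsed]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if parsed["TYPE"] not in ("multiple_choice", "free_form"):
        raise ValueError(f"unexpected TYPE: {parsed['TYPE']!r}")
    return parsed


SAMPLE = """
QUESTION: What is the derivative of x^2?
TYPE: multiple_choice
OPTIONS:
A) x
B) 2x
C) x^2
D) 2
ANSWER: B
EXPLANATION: Apply the power rule.
"""

exercise = parse_exercise(SAMPLE)
print(exercise["ANSWER"])                  # B
print(exercise["OPTIONS"].splitlines()[1])  # B) 2x
```

Rejecting malformed responses up front means a generation failure can be retried instead of surfacing as a broken question.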
Mastery-adjusted difficulty: Gemini also receives the learner's current P(correct) for this (concept, Bloom) pair. Low mastery (<50%) prompts the model to generate easier variants; high mastery (>70%) pushes toward harder applications.
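In code, the mastery-to-difficulty rule reduces to two thresholds. The article only describes the low and high bands; the "standard" label for the middle band and the function name are assumptions.

```python
def difficulty_hint(p_correct):
    """Translate P(correct) for a (concept, Bloom) pair into a prompt hint."""
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be a probability")
    if p_correct < 0.5:
        return "easier"    # low mastery: ask for an easier variant
    if p_correct > 0.7:
        return "harder"    # high mastery: push toward harder applications
    return "standard"      # assumed behavior for the band in between


print(difficulty_hint(0.30))  # easier
print(difficulty_hint(0.60))  # standard
print(difficulty_hint(0.72))  # harder
```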
Answer evaluation: After the learner submits a response, a separate Gemini call evaluates correctness and provides targeted Vietnamese-language feedback. If Gemini is unavailable, a deterministic string-match fallback ensures the session never stalls.
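A fallback of this kind might look like the sketch below. The normalization rules (Unicode NFKC, case folding, whitespace collapsing) and the `llm_judge` callable are assumptions; the article only states that a deterministic string match takes over when Gemini is unavailable.

```python
import unicodedata


def normalize(answer):
    """Canonicalize Unicode, ignore case, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", answer).casefold()
    return " ".join(text.split())


def evaluate_with_fallback(submitted, expected, llm_judge=None):
    """Try the (hypothetical) LLM judge first; on any failure, or when no
    judge is supplied, fall back to a deterministic string match."""
    if llm_judge is not None:
        try:
            return bool(llm_judge(submitted, expected))
        except Exception:
            pass  # judge unavailable: continue with the fallback
    return normalize(submitted) == normalize(expected)


print(evaluate_with_fallback("  b ", "B"))  # True
```

String matching works well for multiple-choice answers but is strict for free-form ones, so a production fallback would likely need more forgiving rules for that question type.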
The Complete Adaptive Loop: How It All Connects
The seven-step adaptive loop runs continuously throughout each practice session:
- State Representation: Current learner's SAINT-Bloom hidden state + mastery vector + progress scalars
- D3QN Decision: ε-greedy action selection with four-layer masking → outputs (concept_id, Bloom_level)
- Metadata Retrieval: Concept definition, formulas, examples fetched from Concept Registry
- Exercise Generation: Gemini 2.0 Flash generates a Bloom-specific question tailored to current mastery
- Learner Response: Student submits an answer
- Evaluation: Gemini judges correctness and delivers feedback in Vietnamese
- State Update: env.step() triggers SAINT forward pass → new hidden state, updated mastery matrix → loop returns to step 1
This closed loop means the system gets smarter with every interaction. Correct answers increase P(correct) for that (concept, Bloom) pair; incorrect answers decrease it. D3QN observes the updated mastery and dynamically rebalances — either staying to deepen understanding or moving to address weaker areas.
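The decision step of the loop, ε-greedy selection restricted to unmasked actions, can be sketched in a few lines. The function name and the flat list representation of Q-values and the mask are assumptions; a real agent would obtain both from the network and the masking filters.

```python
import random


def select_action(q_values, mask, epsilon, rng=random):
    """Pick a random valid action with probability epsilon, otherwise the
    valid action with the highest Q-value. mask[i] is True when action i
    passed all masking filters."""
    valid = [i for i, allowed in enumerate(mask) if allowed]
    if not valid:
        raise ValueError("no valid action available")
    if rng.random() < epsilon:
        return rng.choice(valid)                    # explore
    return max(valid, key=lambda i: q_values[i])    # exploit


# Action 0 has the highest Q-value but is masked, so action 2 wins.
print(select_action([5.0, 1.0, 3.0], [False, True, True], epsilon=0.0))  # 2
```

Applying the mask before both the random and the greedy branch guarantees that exploration never proposes a blocked (concept, Bloom_level) pair.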
Experience the Adaptive Loop Yourself
The best way to understand this architecture is to experience it. Upload any lecture PDF — calculus, biology, economics, or physics — and watch RinKuzu build the knowledge graph, initialize your mastery state, and generate the first personalized practice question.
Your session will be shaped by your own knowledge graph, your own SAINT-Bloom state, and your own pace. No two learners take the same path through the same material.
Start with the free plan — no credit card required. Your first personalized quiz is ready in under a minute.
References
- Choi, Y., Lee, Y., Shin, D., & Kim, J. (2020). SAINT: Integrating Temporal Reasoning and Knowledge Tracing Models for Adaptive Learning on Codeforces Problems. arXiv preprint arXiv:2008.10056. https://arxiv.org/abs/2008.10056
- Corbett, A. T., & Anderson, J. R. (1995). Knowledge Tracing: Modeling the Acquisition of Procedural Knowledge. User Modeling and User-Adapted Interaction, 4(4), 253–278. https://doi.org/10.1007/BF01099821
- Liang, W., Zhang, Y., Wu, H., Nie, Y., & Yang, F. (2022). LectureBank: A Benchmark for Prerequisite Discovery from Educational Texts. Findings of ACL 2022. https://aclanthology.org/2022.findings-acl.1/
- Wang, Z., Lan, A. S., & Baraniuk, R. (2016). Mathematical Language Processing: Automatic Understanding and Generation of Mathematical Reasoning. arXiv preprint arXiv:1605.08286. https://arxiv.org/abs/1605.08286
- Clement, B., Roy, D., Oudeyer, P.-Y., & Lopes, M. (2015). Multi-Armed Bandits for Intelligent Tutoring Systems. arXiv preprint arXiv:1310.3174. https://arxiv.org/abs/1310.3174
- Van de Broeck, G., & Gybels, J. (2019). Dueling Buffer: A Practical Dueling Bandits Algorithm for Online Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 330–336. https://doi.org/10.1609/aaai.v33i01.3301330