Knowledge Hub | Serevion Lab

The idea

Most learning apps wait to be opened, and most are built for exactly one subject. Knowledge Hub inverts both. It is a guided-learning format that comes to you in a chat you already use, and treats the subject as a plugin — so the same engine that teaches German vocabulary can, in principle, teach Python, SQL, or anything else expressible as items. The achievement is not the German tutor; it is the format underneath it.

This is an ongoing project. German is shipped as the first subject; the work now is proving the engine is genuinely subject-agnostic, and growing the surfaces it reaches.

What is being tested

Three bets, each covered in the notes below:

One format, any subject. The engine knows only about items and their mastery (seen → learned → mastered, with promotion to learned after one correct answer and to mastered after three); the subject plugin supplies the content and the way to quiz it. Quiz type is routed by status: recognition for new items, production (gap-fill) for known ones.
The chat does half the work. Delivery lives in Telegram: push notifications, distribution, and identity come for free, and a scheduling strategy decides per learner and local time what to send — a word batch sized by intensity (1–5 → 5/10/15/20/25 a day), a quiz "gate," a conversation invite, or an evening recap.
Affordability is an architecture, not an afterthought. Every word, quiz, and explanation is LLM-generated, so cost is controlled by a content cache keyed by word, type, level, version, model, and variant — generate once, reuse forever — and by defaulting to a cheap model (Gemini Flash) behind a multi-provider factory.
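
The first bet can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the names Status, ItemProgress, SubjectPlugin, next_quiz, and GermanDemo are hypothetical, and it assumes that "new" means the seen status and that correct answers are counted cumulatively rather than consecutively.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Protocol


class Status(Enum):
    SEEN = "seen"
    LEARNED = "learned"
    MASTERED = "mastered"


# Thresholds taken from the text: learned at one correct answer, mastered at three.
LEARNED_AT = 1
MASTERED_AT = 3


@dataclass
class ItemProgress:
    """Engine-side state for one item; it knows nothing about the subject."""
    item_id: str
    correct: int = 0  # assumption: a cumulative count, not a streak
    status: Status = Status.SEEN

    def record_correct(self) -> Status:
        self.correct += 1
        if self.correct >= MASTERED_AT:
            self.status = Status.MASTERED
        elif self.correct >= LEARNED_AT:
            self.status = Status.LEARNED
        return self.status


class SubjectPlugin(Protocol):
    """Hypothetical plugin surface: the subject decides how to quiz an item."""
    def recognition_quiz(self, item_id: str) -> str: ...
    def production_quiz(self, item_id: str) -> str: ...


def next_quiz(plugin: SubjectPlugin, progress: ItemProgress) -> str:
    # Route by status: recognition for new items, production for known ones.
    if progress.status is Status.SEEN:
        return plugin.recognition_quiz(progress.item_id)
    return plugin.production_quiz(progress.item_id)


class GermanDemo:
    """Toy plugin with canned prompts, for illustration only."""
    def recognition_quiz(self, item_id: str) -> str:
        return f"What does '{item_id}' mean?"

    def production_quiz(self, item_id: str) -> str:
        return f"Fill the gap using '{item_id}': Das ___ ist alt."
```

Under these assumptions a second subject would only need another class with the same two methods; next_quiz and ItemProgress would stay unchanged.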

Technology

Built as a webhook service, not a polling script.

Aiogram 3.x for the Telegram surface, over FastAPI + Granian (a Rust-based ASGI runtime).
Pydantic-AI for typed LLM orchestration behind a multi-provider model factory (OpenAI, Anthropic, Google, DeepSeek, OpenRouter); Gemini Flash is the default.
SQLAlchemy 2.0 + asyncpg over PostgreSQL for durable state; Redis backs the conversation finite-state machine and caches; APScheduler drives timezone-aware pushes.
A generic learning engine with the subject behind a plugin interface; a shared registration / chat engine reused across the platform (the same plumbing also powers the Pulse experiment); cost made observable with token tracking, a credit ledger, and Langfuse traces.
Python 3.13.

What works today

The full German loop in chat: email-code login, a new word (wizz) with inline Know it / Got it / More / Explain, translation and gap-fill quizzes (including a multi-question push quiz with a daily-goal bar), conversation roleplay (talk), a daily recap, plus profile, stats, settings, timezone, and push schedule.
Scheduled pushes — word batches, the quiz gate, conversation invites, and an evening "Today's Vocabulary" report — across a CEFR-leveled (A1–C2) set.
A web app sharing the same engine and database via a runtime split (KNOWLEDGE_HUB_APP_ROLE=web): a no-build-step browser UI with a timeline-first shell, plus real-time voice roleplay through a LiveKit + Gemini native-audio agent. The Telegram bot stays stable while the web surface iterates.
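
The scheduled pushes listed above depend on two decisions: how many words a learner gets per day and which kind of push fits their local time. The sketch below shows one way this could look. The batch sizes come from the intensity mapping described earlier (1–5 → 5/10/15/20/25); the hour windows, the push names, and the functions daily_batch_size and pick_push are assumptions made for illustration, not the project's actual strategy.

```python
from datetime import datetime, tzinfo


def daily_batch_size(intensity: int) -> int:
    """Map intensity 1-5 to 5/10/15/20/25 words a day."""
    if not 1 <= intensity <= 5:
        raise ValueError("intensity must be between 1 and 5")
    return intensity * 5


def pick_push(now_utc: datetime, learner_tz: tzinfo) -> str:
    """Choose a push kind from the learner's local hour.

    The windows below are hypothetical; the source only says the
    decision is made per learner and local time.
    """
    hour = now_utc.astimezone(learner_tz).hour
    if hour < 8 or hour >= 22:
        return "none"  # quiet hours (assumed)
    if hour < 12:
        return "word_batch"
    if hour < 17:
        return "quiz_gate"
    if hour < 20:
        return "conversation_invite"
    return "evening_recap"
```

In practice learner_tz could be a zoneinfo.ZoneInfo("Europe/Berlin") built from the learner's stored timezone, so that daylight-saving changes are handled by the standard library rather than by the scheduler.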

What I learned

Routing quiz type by mastery status is simpler and more defensible than one generic quiz format — adaptivity falls out of a three-value enum.
Typed LLM output plus a content cache is what makes generation cheap and repeatable; caching is the part of the design that makes an always-on LLM product affordable for one person.
Putting the subject behind a plugin and the engine behind its own boundary is what let a whole second surface (web + voice) cost a shell instead of a rewrite.
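
The caching lesson is easiest to see in code. The sketch below keys generated content by the six fields named earlier (word, type, level, version, model, variant) and calls the generator only on a miss. ContentKey, ContentCache, and the in-memory dict are illustrative assumptions; the source does not say where the cache is stored or how entries are written.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ContentKey:
    """The six cache dimensions named in the text; frozen so it is hashable."""
    word: str
    content_type: str  # e.g. "explanation" or "quiz" (example values)
    level: str         # CEFR level, A1 to C2
    version: int       # bump to invalidate old content
    model: str         # which LLM produced the text
    variant: int       # allows several texts for the same word


class ContentCache:
    """Generate once, reuse afterwards. A dict stands in for the real store."""

    def __init__(self, generate: Callable[[ContentKey], str]) -> None:
        self._generate = generate
        self._store: dict[ContentKey, str] = {}
        self.misses = 0  # each miss corresponds to one paid LLM call

    def get(self, key: ContentKey) -> str:
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._generate(key)
        return self._store[key]
```

Including model and version in the key means that switching the default model or changing the generation approach produces fresh content instead of serving stale text, while every repeated request for the same key costs nothing.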

Limitations and what's next

The engine is designed for many subjects but has exactly one shipped, so "any subject" is still a hypothesis until the second plugin — Python — proves it; that is where "mastered" shifts from recall to application. There is no first-class event log yet (progress is derived from item status), web quiz and voice sessions are not persisted, and review forecasting is planned but not built. Live voice is also the one place the caching strategy stops helping — real-time audio is paid for every minute, so it is treated as a deliberate, metered feature.

See the two notes behind this experiment for the full story, from the original idea and proof of concept to the voice-and-web advancement.