Everything EQBook can do, where its boundaries lie, and how to get the most from it.
EQBook combines a local LLM, a RAG pipeline, and a focused research UI — all running offline on your Apple Silicon Mac.
Organise research into workspaces. Each workspace holds unlimited PDFs, DOCX, TXT, MD, audio, and video files.
Chat against all your sources at once. Answers stream in real time with inline citations you can click to view the exact source passage.
Search across every chunk in the workspace using vector similarity. Finds conceptually related passages, not just keyword matches.
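Conceptually, this kind of search ranks every chunk's embedding by cosine similarity against the query embedding. EQBook's internal index isn't documented, so the Swift sketch below uses a hypothetical `Chunk` type and a brute-force scan purely to illustrate the idea.

```swift
import Foundation

/// Hypothetical indexed chunk; EQBook's real storage type isn't public.
struct Chunk {
    let text: String
    let embedding: [Float]   // 384-dimensional, per the pipeline table below
}

/// Cosine similarity between two equal-length vectors.
func cosine(_ a: [Float], _ b: [Float]) -> Float {
    var dot: Float = 0, na: Float = 0, nb: Float = 0
    for i in a.indices {
        dot += a[i] * b[i]
        na  += a[i] * a[i]
        nb  += b[i] * b[i]
    }
    return dot / (na.squareRoot() * nb.squareRoot() + .ulpOfOne)
}

/// Score every chunk against the query embedding and keep the best k.
func topChunks(for query: [Float], in index: [Chunk], k: Int = 8) -> [Chunk] {
    index
        .map { (chunk: $0, score: cosine(query, $0.embedding)) }
        .sorted { $0.score > $1.score }
        .prefix(k)
        .map { $0.chunk }
}
```

A linear scan like this is perfectly adequate at workspace scale: with 384-dimensional vectors and top-8 retrieval, scoring a few thousand chunks takes milliseconds.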
Drop in MP3 or MP4 files. EQBook transcribes them on-device using CoreML Whisper Small, then indexes the transcript for Q&A.
Every AI answer includes dotted-underline citations. Click any citation to see the exact source paragraph it was drawn from.
Generate a podcast-style two-host conversation from your workspace sources. Absorb research on the go — produced entirely on-device.
Hardware is auto-detected at launch. The right Gemma 4 variant is recommended based on your Mac's unified memory.
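On macOS, unified memory size is readable from `ProcessInfo`, which is one plausible way an auto-detection step like this could work. The variant names and the 16 GB threshold in the sketch below are assumptions borrowed from the tips section, not EQBook's actual logic.

```swift
import Foundation

// Hypothetical identifiers; EQBook's real variant names may differ.
enum GemmaVariant: String {
    case e4b   = "Gemma 4 E4B"     // smaller variant, 128K context
    case moe26 = "Gemma 4 26B MoE" // larger variant, 256K context
}

/// Recommend a model variant from the Mac's unified memory.
func recommendedVariant() -> GemmaVariant {
    // physicalMemory reports bytes; convert to GiB.
    let gib = Double(ProcessInfo.processInfo.physicalMemory) / Double(1 << 30)
    // 16 GB matches the cut-off suggested in the tips section (assumption).
    return gib >= 16 ? .moe26 : .e4b
}
```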
Sources are grouped by type (PDF, Audio, Video, Text) in a collapsible sidebar. Rename, delete, or view chunk counts per source.
All conversations are saved locally. Sessions persist across restarts, giving you a running record of every research question you've asked.
Native macOS app with a collapsible source sidebar, keyboard shortcuts, and Spotlight integration — built for focused, distraction-free research.
Audio Overview — a podcast-style summary generated entirely on-device from your sources
EQBook's Retrieval-Augmented Generation pipeline runs entirely on-device in five stages.
| Stage | What happens | Technology |
|---|---|---|
| 1 — Ingest | Text extracted from PDF, DOCX, TXT, MD, or Whisper transcript | PDFKit, XMLCoder |
| 2 — Chunk | Text split into 512-token chunks with 50-token overlap, respecting paragraph breaks | Custom Swift chunker |
| 3 — Embed | Each chunk converted to a 384-dimensional vector | MLX all-MiniLM-L6-v2 |
| 4 — Retrieve | Top-8 semantically closest chunks fetched for the user's query | On-device vector search (cosine) |
| 5 — Generate | Retrieved chunks + question passed to Gemma 4; response streamed with citations extracted | MLX-LM Gemma 4 |
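Stage 2 is the easiest to picture in code. The sketch below approximates a paragraph-aware chunker, using whitespace-separated words as a stand-in for tokens; EQBook's actual chunker presumably counts real model tokens.

```swift
import Foundation

/// Rough paragraph-aware chunker in the spirit of stage 2.
/// Words approximate tokens here; a single paragraph longer than
/// `maxTokens` is left whole for brevity, which a real chunker would split.
func chunk(_ text: String, maxTokens: Int = 512, overlap: Int = 50) -> [String] {
    let paragraphs = text
        .components(separatedBy: "\n\n")
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
        .filter { !$0.isEmpty }

    var chunks: [String] = []
    var current: [String] = []   // words in the chunk being built

    for paragraph in paragraphs {
        let words = paragraph.split(whereSeparator: \.isWhitespace).map(String.init)
        if current.count + words.count > maxTokens, !current.isEmpty {
            chunks.append(current.joined(separator: " "))
            // Carry the last `overlap` words forward so neighbouring
            // chunks share context across the boundary.
            current = Array(current.suffix(overlap))
        }
        current.append(contentsOf: words)
    }
    if !current.isEmpty { chunks.append(current.joined(separator: " ")) }
    return chunks
}
```

The 50-token overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, so stage 4's retrieval doesn't miss it. And stage 5, in spirit: retrieved chunks are numbered into the prompt so the model can cite them, and citation markers are parsed back out of the streamed answer. The prompt wording and the `[n]`-style markers below are guesses, not EQBook's actual format.

```swift
/// Build a grounded prompt from the top-k chunks (format is a guess).
func buildPrompt(question: String, chunks: [String]) -> String {
    let numbered = chunks.enumerated()
        .map { "[\($0.offset + 1)] \($0.element)" }
        .joined(separator: "\n\n")
    return """
    Answer using only the sources below. Cite them as [n].

    \(numbered)

    Question: \(question)
    """
}

/// Pull citation indices like [2] out of a generated answer.
func citationIndices(in answer: String) -> [Int] {
    answer.matches(of: /\[(\d+)\]/).compactMap { Int($0.1) }
}
```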
Quiz mode — test your comprehension of any workspace with AI-generated questions
EQBook is powerful but not without trade-offs. Here's what to expect.
All inference uses Apple's MLX framework, which requires Apple Silicon (M1 or later). Intel Macs are not supported.
The E4B model tops out at 128K tokens; the 26B MoE model at 256K. Very large document collections may exceed the context window — use focused workspaces to stay within limits.
Documents larger than 200 MB prompt you to choose between copying the file into the workspace or referencing it in place. Referenced files stop resolving if the original is moved or deleted.
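A common way macOS apps reference a file in place is a security-scoped bookmark. Whether EQBook uses this exact mechanism isn't stated, but the sketch below shows the failure mode either way: a deleted original stops resolving, and a moved one comes back stale.

```swift
import Foundation

/// Resolve a previously saved bookmark to a referenced source file.
/// Security-scoped bookmarks are an assumption here; the failure mode
/// is the same for any reference-in-place scheme.
func resolveReferencedSource(from bookmark: Data) -> URL? {
    var isStale = false
    guard let url = try? URL(
        resolvingBookmarkData: bookmark,
        options: .withSecurityScope,
        relativeTo: nil,
        bookmarkDataIsStale: &isStale
    ) else {
        return nil   // original deleted, or the bookmark is invalid
    }
    if isStale {
        // File was moved: it still resolved, but the bookmark should be re-saved.
        print("Source moved; refresh its bookmark: \(url.path)")
    }
    guard url.startAccessingSecurityScopedResource() else { return nil }
    return url       // caller must later call stopAccessingSecurityScopedResource()
}
```

If you expect to reorganise your source folders, copying into the workspace is the safer choice.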
Practical tips from the development team and early access users.
Don't dump all your documents into one workspace. Use separate workspaces per project or topic to keep the retrieval context clean and relevant.
Audio and video transcription happens asynchronously. Wait for a source's status badge to show "Ready" before running queries against it.
RAG retrieval works best with specific, focused questions. Vague prompts retrieve scattered chunks and produce weaker answers.
Open the Insights tab after ingesting sources to get a quick summary. This helps you know what's in the corpus before writing chat prompts.
Always click inline citations to verify the source passage. LLMs can occasionally misrepresent or conflate information — check primary sources.
If your Mac has ≥16 GB RAM, download the 26B MoE variant. It has a 256K context window and significantly better reasoning on complex queries.
Scanned PDFs without embedded text extract poorly. Run them through an OCR tool (e.g. Apple's Preview or Adobe Acrobat) before importing.