---
type: manual
title: Zettelkern
description: A method for living knowledge. Plain-markdown wikis that LLM agents build, cross-reference, and keep current, with every claim traced to its source.
resource: https://zettelkern.com
tags: [zettelkern, llm-wiki, knowledge-base, markdown, method]
updated: 2026-07-04
---

# Zettelkern

Zettelkern is a method for building knowledge bases that LLM agents can read,
extend, and maintain. It is practiced and published by Quellkern e.U.
(https://quellkern.com), the company that builds source-grounded systems.

This file is the manual. It is written so you can hand it to your LLM agent
(Claude Code, Codex, or any other harness) and say: **instantiate this for my
domain**. The agent builds the specifics in collaboration with you.

## Not our invention

Zettelkern puts published, open ideas into practice. Read the originals; they
are short and good:

1. Andrej Karpathy, "LLM Wiki: a pattern for building personal knowledge bases
   using LLMs". The founding pattern this method instantiates.
   https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
2. The Open Knowledge Format (OKF), an open, vendor-neutral spec by the Google
   Cloud Data Cloud team that formalizes the same pattern: a directory of
   markdown files with YAML frontmatter.
   https://cloud.google.com/blog/products/data-analytics/how-the-open-knowledge-format-can-improve-data-sharing
   https://github.com/GoogleCloudPlatform/knowledge-catalog/tree/main/okf
3. Ancestors worth knowing: Vannevar Bush's Memex (1945) and Niklas Luhmann's
   Zettelkasten (one thought per slip, links between slips). Both solved the
   structure; neither solved who does the maintenance. Agents solve the
   maintenance.

What Zettelkern adds is operational discipline learned from running the pattern
in production: typed frontmatter with confidence markers, provenance rules,
sensitivity boundaries, and git conventions. The rest is faithfully Karpathy's
pattern in OKF-compatible files.

## The three layers

1. `raw/`: immutable sources. Articles, transcripts, papers, mail, exports.
   The agent reads them and never edits them. Filenames are date-prefixed:
   `YYYY-MM-DD-source-slug.ext`. If a source cannot be copied in, keep a
   `sources.md` that points to it by stable path or URL.
2. `wiki/`: the agent-owned layer. Small typed pages in folders such as
   `entities/`, `concepts/`, `topics/`, `decisions/`. Add a new category only
   when content genuinely does not fit, and document the addition in the
   schema.
3. The schema: `CLAUDE.md` (or `AGENTS.md`), the contract that makes an agent
   a disciplined wiki maintainer instead of a generic chatbot. Whichever file
   is not the schema exists as a one-line pointer to it, so every harness lands
   on the same rules.
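
The date-prefix convention for `raw/` can be checked mechanically. A minimal sketch, assuming slugs are lowercase kebab-case ASCII (the manual only requires that for wiki pages) and using a hypothetical helper name `parse_raw_name`:

```python
import re
from datetime import date

# YYYY-MM-DD-source-slug.ext; the lowercase kebab-case slug is an assumption.
RAW_NAME = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)\.([A-Za-z0-9]+)$"
)


def parse_raw_name(filename: str):
    """Return (date, slug, ext) for a well-formed raw filename, else None."""
    match = RAW_NAME.match(filename)
    if match is None:
        return None
    year, month, day, slug, ext = match.groups()
    try:
        # Rejects impossible dates such as month 13 or February 30.
        parsed = date(int(year), int(month), int(day))
    except ValueError:
        return None
    return parsed, slug, ext
```

An agent could run this over `raw/` during lint and report every filename that returns `None`.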

## The spine files

- `README.md`: five to twelve lines for humans (start here, schema there).
- `index.md`: the catalog. Every page listed with a link and a one-line
  summary, grouped by category. Updated on every ingest. Agents read this
  first; at up to a few hundred pages it replaces search infrastructure.
- `log.md`: append-only chronology. One entry per operation with a
  grep-parseable prefix: `## [YYYY-MM-DD] ingest | source-slug`. Newest at the
  bottom. `grep "^## \[" log.md | tail -n 5` shows the five most recent entries.
- `overview.md`: the one-page synthesis of the whole vault. A stub is fine at
  first; the method tolerates stubs that are labeled as stubs.
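
The same log prefix is easy to parse from code. A minimal sketch of reading recent history out of `log.md`; the function name `recent_entries` and the returned tuple shape are illustrative choices, not part of the method:

```python
import re

# Matches entry headers such as: ## [2026-07-04] ingest | some-source
LOG_HEADER = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\w+) \| (.+)$")


def recent_entries(log_text: str, n: int = 5):
    """Return the last n (date, operation, subject) tuples, oldest first."""
    entries = []
    for line in log_text.splitlines():
        match = LOG_HEADER.match(line)
        if match:
            entries.append(match.groups())
    # The log is append-only with newest at the bottom, so slice from the end.
    return entries[-n:] if n > 0 else []
```

Because the log is append-only, the last entries in the file are the most recent ones; no sorting is needed.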

## Page rules

- One topic, one file. Kebab-case, ASCII (transliterate umlauts), basename
  unique across the vault so `[[wikilinks]]` resolve from any folder.
- Fixed anatomy: one H1 matching the title, a one-to-three-sentence lead, H2
  body sections, a "See also" section of links, a "Sources" section.
- 20 to 150 lines per page. Long material belongs in `raw/`; wiki pages
  stay short and dense.
- YAML frontmatter on every page, controlled vocabulary, OKF-compatible:

```yaml
---
title: Some Topic
type: concept        # entity | concept | topic | decision | source | overview
status: draft        # stub | draft | maintained | superseded
confidence: verified # verified | partly-uncertain
updated: 2026-07-04
tags: [example]
sources: [raw/2026-07-04-some-source.md]
---
```
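
A controlled vocabulary is only useful if something enforces it. The sketch below checks a page against the vocabulary shown above. Two assumptions: every field in the example is treated as required, and the frontmatter is read with a deliberately naive flat `key: value` parser to stay dependency-free. A real vault would likely use a proper YAML parser instead.

```python
ALLOWED = {
    "type": {"entity", "concept", "topic", "decision", "source", "overview"},
    "status": {"stub", "draft", "maintained", "superseded"},
    "confidence": {"verified", "partly-uncertain"},
}
# Assumption: all fields from the example block are mandatory.
REQUIRED = ("title", "type", "status", "confidence", "updated", "tags", "sources")


def read_frontmatter(page_text: str) -> dict:
    """Naive reader for flat `key: value` frontmatter; not a YAML parser."""
    lines = page_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        # Drop a trailing inline comment such as "# stub | draft | ...".
        fields[key.strip()] = value.split(" #", 1)[0].strip()
    return fields


def frontmatter_problems(page_text: str) -> list:
    """Return human-readable problems; an empty list means the page passes."""
    fields = read_frontmatter(page_text)
    problems = [f"missing field: {key}" for key in REQUIRED if key not in fields]
    for key, allowed in ALLOWED.items():
        if key in fields and fields[key] not in allowed:
            problems.append(f"{key}: {fields[key]!r} not in vocabulary")
    return problems
```

Running this over every page during lint catches vocabulary drift early, before a stray value such as `status: wip` spreads by imitation.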

## The three workflows

**Ingest.** A new source lands in `raw/`. The agent reads it, discusses key
takeaways with you, writes a summary page, updates every touched entity and
concept page (five to fifteen pages is normal), refreshes `index.md`, appends
to `log.md`. One ingest, one commit.

**Query.** Questions run against the vault: read `index.md` first, drill into
pages, answer with citations. Durable answers (comparisons, analyses,
discovered connections) are filed back into the wiki as pages. Chat history is
where knowledge goes to die.

**Lint.** Periodic health check: contradictions between pages, stale claims
superseded by newer sources, orphan pages, missing pages for concepts that are
mentioned often, broken links, gaps worth a web search. File the findings,
fix what is mechanical, queue what needs judgment.
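
Two of the lint checks, broken links and orphan pages, are fully mechanical. A minimal sketch, assuming pages are passed in as a mapping from basename to text and that links may carry Obsidian-style `|alias` or `#heading` suffixes (the manual itself only shows plain `[[wikilinks]]`):

```python
import re

# Captures the target basename; ignores an optional #heading or |alias part.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def lint_links(pages: dict):
    """Return (broken, orphans) for a mapping of basename -> page text.

    broken:  (source page, missing target) pairs
    orphans: pages that no other page links to
    """
    broken = []
    linked = set()
    for name, text in sorted(pages.items()):
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target not in pages:
                broken.append((name, target))
            elif target != name:  # a self-link does not rescue an orphan
                linked.add(target)
    orphans = sorted(set(pages) - linked)
    return broken, orphans
```

Pass only the pages under `wiki/` here. Since `index.md` links to everything by design, including it would hide every orphan.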

## Provenance and confidence

- Every load-bearing claim carries a citation: a raw file path, a document
  name, a dated message, or a URL.
- Distinguish evidence (from sources) from synthesis (model output). Never
  silently promote synthesis to evidence.
- Mark unverified facts explicitly (a VERIFY tag or `confidence:
  partly-uncertain`). Flip the marker when the fact is grounded.
- Flag contradictions in place ("Contradiction: ...") instead of overwriting
  the older claim.
- Volatile facts (prices, fees, dates, availability, processing times) are
  never asserted as fixed. Write "check current" and say what to check.
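
Explicit markers pay off because they can be listed on demand. A minimal sketch, assuming the marker is the bare uppercase word `VERIFY` somewhere in a line; the exact tag syntax is a per-vault choice that belongs in the schema:

```python
import re

# Assumption: the marker is the standalone uppercase word VERIFY.
VERIFY = re.compile(r"\bVERIFY\b")


def unverified_lines(page_text: str):
    """Return (line number, text) for every line still carrying a VERIFY tag."""
    return [
        (number, line.strip())
        for number, line in enumerate(page_text.splitlines(), start=1)
        if VERIFY.search(line)
    ]
```

The resulting list is a ready-made work queue: each hit is a claim that still needs grounding in a source.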

## Git conventions

- The vault is a git repo. One operation, one commit.
- Typed commit prefixes: `ingest:`, `query:`, `lint:`, `decision:`, `docs:`,
  `chore:`. `git log` then reads like a lab notebook.
- Decide the privacy posture at bootstrap and write it into the schema: no
  remote, private remote only, or public. For sensitive vaults, forbid public
  remotes outright.
- Renaming a page includes sweeping its inbound links, in the same commit.
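
The prefix convention can be enforced rather than remembered. A minimal sketch of a subject-line check; the function name `commit_type` is hypothetical, and wiring it into a git `commit-msg` hook is one possible way to use it:

```python
COMMIT_PREFIXES = ("ingest", "query", "lint", "decision", "docs", "chore")


def commit_type(subject: str):
    """Return the typed prefix of a commit subject, or None if it has none."""
    prefix, separator, rest = subject.partition(":")
    # Require a known prefix, the colon, and a non-empty description after it.
    if separator and prefix in COMMIT_PREFIXES and rest.strip():
        return prefix
    return None
```

For example, `commit_type("chore: bootstrap vault")` returns `"chore"`, while a subject like `"Update page"` returns `None` and can be rejected.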

## Sensitivity boundaries

If any content must never reach a given audience, engineer that instead of
hoping: give it its own folder with a marker page, set a `sensitivity:` field
in frontmatter, and write the never-share rule into the schema. Across repos,
reference by pointer, never copy. Privacy discipline is part of the method, not
an afterthought.

## Bootstrap: instructions for the agent

You are an agent and a human just handed you this file. Do this now:

1. Ask for the domain, the language of the vault, and the privacy posture.
2. Create the skeleton: `README.md`, `index.md`, `log.md`, `overview.md`
   (stub), `raw/`, `wiki/` with two or three typed folders that fit the
   domain.
3. Vendor the founding pattern: download Karpathy's `llm-wiki.md` into the
   repo root or `raw/`, marked immutable.
4. Write the schema (`CLAUDE.md`, plus `AGENTS.md` as pointer): layers, page
   anatomy, frontmatter vocabulary, the three workflows, git rules, style
   rules, privacy posture, and a "current status and next steps" section you
   keep rewriting so any future session resumes cold.
5. Log the bootstrap as the first entry, commit as `chore: bootstrap vault`.
6. Ingest the first source properly. Resist bulk imports until the schema has
   survived three or four real ingests.
7. Co-evolve: when a convention proves wrong, change the schema, log the
   change, move on.
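
Step 2 can be sketched as a small script. Everything specific here is an assumption for illustration: the three folder names, the placeholder file contents, and the `chore | bootstrap vault` log entry. The schema file `CLAUDE.md` is deliberately left out, because step 4 has the agent write it with the human.

```python
from datetime import date
from pathlib import Path


def bootstrap_vault(root: Path, wiki_folders=("entities", "concepts", "decisions"),
                    today=None):
    """Create the vault skeleton under root; return the names of files written."""
    today = today or date.today()
    root.mkdir(parents=True, exist_ok=True)
    (root / "raw").mkdir(exist_ok=True)
    for folder in wiki_folders:
        (root / "wiki" / folder).mkdir(parents=True, exist_ok=True)

    # Placeholder contents only; the agent rewrites these for the domain.
    files = {
        "README.md": "# Vault\n\nStart with index.md. The rules live in CLAUDE.md.\n",
        "index.md": "# Index\n",
        "log.md": f"# Log\n\n## [{today.isoformat()}] chore | bootstrap vault\n",
        "overview.md": "# Overview\n\nStub: no synthesis yet.\n",
        "AGENTS.md": "See CLAUDE.md for the schema.\n",
    }
    created = []
    for name, body in files.items():
        path = root / name
        if not path.exists():  # never overwrite existing work
            path.write_text(body, encoding="utf-8")
            created.append(name)
    return created
```

The function is safe to run twice: existing files are left untouched, so a second call returns an empty list. Git does not track empty directories, so `raw/` and the `wiki/` folders only appear in the repo once they contain a file.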

## When to call humans who have done this before

Bootstrapping a fresh vault is genuinely easy; the list above is enough.
Adapting an existing knowledge base (a Confluence space, a SharePoint tree, a
wiki nobody maintains, years of PDFs) is where experience pays: what to
migrate, what to leave, how to cut layers, how to keep provenance during the
move. That is the service Quellkern sells: hello@quellkern.com, subject
"Zettelkern".

Newsletter: occasional plain-text field notes on the method. Mail
hello@quellkern.com with subject "Zettelpost: subscribe".

---

© 2026 Quellkern e.U., Tirol, Austria. This manual is free to use, share, and
adapt with attribution. The linked originals carry their own licenses.