Zettelkasten - Making a Second Brain

Overview

The Zettelkasten Method is a holistic approach to working with knowledge. It is defined as a personal tool for thinking and writing whose hypertextual features enable the creation of a web of thoughts.

The method emphasizes connection, not collection. It is considered highly effective and acts as an amplifier of endeavors in knowledge work. With consistent effort and practice, it can produce “gems of knowledge”. The method helps streamline workflow, decrease friction, and make writing easier and more coherent by keeping thoughts alive over long periods.

The method is based on the work of the highly productive social scientist, Niklas Luhmann, who published 50 books and over 600 articles. Luhmann himself attributed his output to working in a partnership with his Zettelkasten.

Core Principles of the Zettelkasten Method

The Zettelkasten Method is founded on three primary traits:

1. Hyper-textual Structure

The Zettelkasten is a type of hypertext where notes refer to, explain, expand upon, and use each other’s information. This structure is organic and non-linear.

2. Principle of Atomicity

This principle dictates that knowledge is composed of discrete building blocks. The guiding compass for note-taking is to capture one knowledge building block, or precisely one thought, per note.

3. Personal Focus

The Zettelkasten is intended to be a personal thinking tool; therefore, the guideline is one Zettelkasten per person. Writing in the Zettelkasten should be done for oneself, unlike writing for the public.

The Anatomy of a Note (Zettel)

An individual note, or Zettel (German for “paper slip”), is the smallest building block of the system. Every Zettel must have three components:

  1. A Unique Identifier (ID): This is the unambiguous address of the note, mandatory for creating the hypertext.

    • Luhmann-ID: For paper systems, Luhmann used a clever branching numbering system (e.g., 1, 2, 1a, 1b, 1a1) to allow organic growth by interspersing or continuing trains of thought.

    • Digital IDs: Digital systems often use a time-based ID (e.g., 202006110955) or an arbitrary unique string.

  2. The Body of the Zettel: This contains the piece of knowledge to be captured. The most important aspect is that the content must be written in the note taker’s own words to increase understanding and recall.

  3. References: Located at the bottom of the Zettel, this section references the sources of the knowledge.

    • References to external sources (like books or articles) typically use citekeys from reference management software.

    • If the note is based on material already processed, it references other Zettels by linking to their IDs.

    • If no reference is provided, the content is considered the note taker’s own thought by default.
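The three components above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed format: the function names `make_zettel_id` and `render_zettel`, the plain-text layout, and the "(own thought)" placeholder are assumptions made for this example. Only the time-based ID format (as in 202006110955) comes from the text.

```python
from datetime import datetime


def make_zettel_id(now: datetime) -> str:
    """Return a time-based ID in the form YYYYMMDDHHMM, e.g. 202006110955."""
    return now.strftime("%Y%m%d%H%M")


def render_zettel(zettel_id: str, title: str, body: str, references: list[str]) -> str:
    """Assemble the three components of a Zettel: ID, body, and references."""
    lines = [f"{zettel_id} {title}", "", body.strip(), "", "References:"]
    # With no references, the content counts as the note taker's own thought.
    lines += [f"- {ref}" for ref in references] or ["- (own thought)"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    zid = make_zettel_id(datetime(2020, 6, 11, 9, 55))
    # The reference below points to another (hypothetical) Zettel by its ID.
    print(render_zettel(zid, "Atomicity", "One thought per note.", ["[[202006101230]]"]))
```

A time-based ID needs no central counter, which is one reason digital systems tend to prefer it over a branching Luhmann-style ID.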

Adding Structure to the Zettelkasten

While the method emphasizes bottom-up creation through individual Zettels and connections, hierarchical organization is also useful via Structure Notes.

Creating a Second Brain

The process is sometimes referred to as creating a “second brain” because the core methods involved, particularly the Zettelkasten Method and modern AI-driven tools, function as an external amplifier for human thinking, knowledge storage, and connectivity.

Zettelkasten Method as a Personal Thinking Tool

The Zettelkasten Method is explicitly defined as a personal tool for thinking and writing. It is designed to act as an amplifier of endeavors in knowledge work.

The term “second brain” relates to how the Zettelkasten handles functions typically associated with the human mind:

Modern AI Tools as “Second Brains”

The concept extends to modern tools like NotebookLM, which similarly manage and synthesize information to aid deeper understanding:

Obsidian

Obsidian is fundamentally a note-making application (or “note app”) that users download and install on their computer. It allows users to create notes within a designated folder called a vault.

Key characteristics and functions of Obsidian include:

Getting Started

The Zettelkasten Method, implemented within the Markdown-based environment of Obsidian, provides a powerful structure for tracking the iterative progress, technical methodologies, and core findings of a data science project. By emphasizing atomicity and connection, this process transforms raw notes into a dynamic “idea verse”.

Here is a process for getting started with Obsidian to track the elements of a data science project:

Phase 1: Setup and Foundation (The Vault)

The initial setup focuses on ensuring that the knowledge base is durable, flexible, and ready for growth.

  1. Create an Obsidian Vault: Start by creating a new vault, which is simply a folder that Obsidian monitors.

  2. Adopt Plain Text Philosophy: Ensure all documentation is written in Markdown (.md) files. This plain text approach is considered the most versatile and durable file format, making the information “future-proof” and easily manageable by other literate programming tools like Git.

  3. Prioritize Linking over Folders: When first starting, focus on connecting ideas rather than rigidly organizing notes into hierarchical folders. Complex organization tends to make the system fragile.
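Because a vault is just a folder of Markdown files, the setup can be illustrated without Obsidian itself. The sketch below is an assumption-laden example: `create_note` and `list_notes` are hypothetical helpers, and the vault name `project-vault` is made up. It keeps every note in one flat folder, in line with the advice to prioritize links over hierarchy.

```python
import tempfile
from pathlib import Path


def create_note(vault: Path, title: str, body: str) -> Path:
    """Write a note as a Markdown file directly in the vault folder (no subfolders)."""
    vault.mkdir(parents=True, exist_ok=True)
    path = vault / f"{title}.md"
    path.write_text(body, encoding="utf-8")
    return path


def list_notes(vault: Path) -> list[str]:
    """Return the titles of all notes in the vault, sorted alphabetically."""
    return sorted(p.stem for p in vault.glob("*.md"))


if __name__ == "__main__":
    # A temporary directory stands in for a real vault location.
    with tempfile.TemporaryDirectory() as tmp:
        vault = Path(tmp) / "project-vault"
        create_note(vault, "Active Learning", "Improves abstract screening efficiency.\n")
        create_note(vault, "Data Annotation", "This reminds me of [[Active Learning]].\n")
        print(list_notes(vault))
```

Opening the same folder as a vault in Obsidian would show these files as notes, since Obsidian reads the Markdown files it finds there.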

Phase 2: Atomic Capture of Project Components

The core principle applied here is the Principle of Atomicity, where each note (or Zettel) captures one distinct building block of knowledge. For a data science project, this means capturing tools, methods, and findings separately.

| Component Type | Action | Example Content |
| --- | --- | --- |
| Tools & Libraries | Create a note for every tool, library, or platform used (e.g., Ollama, LangChain, PubMed Meta Analyzer). Capture its purpose in your own words. | [[PubMed Meta Analyzer]]: A Python tool designed to automate literature reviews by extracting metadata from PubMed via the Entrez API. |
| Methods & Concepts | Create atomic notes for technical concepts or modeling approaches, such as Retrieval-Augmented Generation (RAG), Transfer Learning, or Vector Similarity. | [[Hashing-Based Similarity Search]]: A technique used to enhance search capabilities in evidence retrieval, like topology-preserving hashing. |
| Findings & Claims | Create specific notes capturing key project results, crucial data insights, or notable limitations (e.g., observations of Hallucination Rate or Reference Accuracy). | [[LLM Context Length Limitations]]: RAG models struggle with context length in extended queries and have difficulty maintaining context for precise vector similarity searches. |
| Source Tracking | Always include References at the bottom of the Zettel (note) identifying the source of external knowledge. For academic work, use citekeys from reference management software (e.g., BibDesk or JabRef). | References: [#Brown_et_al_2020] |
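The source-tracking rule lends itself to a small automated check. The following sketch assumes each note marks its sources with a line beginning `References:`, as in the example above. The function name `notes_missing_references` is hypothetical. A flagged note is not necessarily wrong, since a note without references may simply record the note taker's own thought; the list is only a prompt for review.

```python
import tempfile
from pathlib import Path


def notes_missing_references(vault: Path) -> list[str]:
    """Return file names of notes that have no line starting with 'References:'."""
    missing = []
    for path in sorted(vault.glob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        if not any(line.strip().startswith("References:") for line in lines):
            missing.append(path.name)
    return missing


if __name__ == "__main__":
    # A temporary directory stands in for a real vault.
    with tempfile.TemporaryDirectory() as tmp:
        vault = Path(tmp)
        (vault / "Few-Shot Learning.md").write_text(
            "Large models can learn tasks from a few examples.\n\n"
            "References: [#Brown_et_al_2020]\n",
            encoding="utf-8",
        )
        (vault / "Open Question.md").write_text("Does this hold for small models?\n", encoding="utf-8")
        print(notes_missing_references(vault))
```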

Phase 3: Connecting and Contextualizing Knowledge

The goal of the Zettelkasten is to build an organic web of thoughts that improves recall and fosters new insights.

  1. Use Internal Hyperlinks for Strong Connections:

    • Whenever one idea informs another, create a link using the double bracket syntax: [[Note ID or Title]].

    • Capture the Link Context: When creating a link, explicitly state why the connection was made, as this is how new knowledge is created. Use the note-making prompt: “This reminds me of...”.

    • Example: In a note on Data Annotation, use the prompt “This reminds me of...” and link it to [[Active Learning]] because active learning improves biomedical abstract screening efficiency.

  2. Use Tags for Categorization and Metadata:

    • Use hashtags (#) to create non-hierarchical groups that describe the state, component, or category of the information.

    • Examples: #data_cleaning, #model_training, #evaluation_metrics, #python_code. You can easily check all notes containing a tag using the enabled Tag Pane feature in Obsidian.

  3. Manage Progress and Iterations with Git:

    • Obsidian is not a version control system, but because the notes are plain Markdown files, the entire vault can be stored in a Git repository. This preserves the notes and provides a full version history for every iteration and insight captured.
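Links and tags are plain text, so the web of connections can be read straight from the Markdown files. The sketch below is an illustration under stated assumptions, not Obsidian's own parser: the regular expressions cover only the common forms `[[Note]]`, `[[Note|alias]]`, `[[Note#Heading]]`, and simple `#tags`, and the function names are hypothetical. Citekeys such as `[#Brown_et_al_2020]` are deliberately not treated as tags.

```python
import re
from collections import defaultdict

# Captures the target of [[Target]], [[Target|alias]], or [[Target#Heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")
# Simplified tag pattern; skips '#' preceded by a word character, '#', or '['.
TAG = re.compile(r"(?<![\w#\[])#([A-Za-z_][\w/-]*)")


def extract_links(text: str) -> list[str]:
    """Return link targets in order of appearance."""
    return [target.strip() for target in WIKILINK.findall(text)]


def extract_tags(text: str) -> set[str]:
    """Return the set of tags (without the leading '#')."""
    return set(TAG.findall(text))


def build_backlinks(notes: dict[str, str]) -> dict[str, set[str]]:
    """Map each note title to the titles of the notes that link to it."""
    backlinks: dict[str, set[str]] = defaultdict(set)
    for title, text in notes.items():
        for target in extract_links(text):
            backlinks[target].add(title)
    return dict(backlinks)


if __name__ == "__main__":
    notes = {
        "Data Annotation": "This reminds me of [[Active Learning]] because it "
                           "improves screening efficiency. #data_cleaning",
        "Active Learning": "Applied in [[Data Annotation|annotation]]. #model_training",
    }
    print(build_backlinks(notes))
    print(extract_tags(notes["Data Annotation"]))
```

Note that the sentence around each link carries the link context; the code only recovers the structure, not the reason for the connection.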

Phase 4: Structuring the Project (Structure Notes)

To manage complexity and facilitate project writing, organize related atomic notes into “hub notes”.

  1. Create Project Structure Notes: Make new notes that act as tables of contents for major project phases or domains. Use a Markdown list structure to link related Zettels:

    • Example Structure Note: [[Evaluation Metrics for LLMs]]

      • Metrics for quality: [[Overall Quality Score (OQS)]]

      • Metrics for retrieval: [[BM25 and TF-IDF]]

      • Metrics for accuracy: [[Hallucination Rate]]

  2. Use Structure Notes for Reasoning Chains: For tracking methodology or logical arguments (like systematic review steps), create a sequential Structure Note to capture the argument flow (e.g., a → b → c) and link to the Zettel that supports each step. This provides a panoramic view of complex problems.
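A Structure Note is an ordinary Markdown file whose body is a list of links, so it can also be generated from data. The sketch below rebuilds the example hub note from above. The function name `render_structure_note` and the use of a `#` heading for the title are assumptions for illustration.

```python
def render_structure_note(title: str, sections: dict[str, list[str]]) -> str:
    """Render a hub note: one Markdown list item per section, linking its Zettels."""
    lines = [f"# {title}", ""]
    for label, targets in sections.items():
        links = ", ".join(f"[[{target}]]" for target in targets)
        lines.append(f"- {label}: {links}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    note = render_structure_note(
        "Evaluation Metrics for LLMs",
        {
            "Metrics for quality": ["Overall Quality Score (OQS)"],
            "Metrics for retrieval": ["BM25 and TF-IDF"],
            "Metrics for accuracy": ["Hallucination Rate"],
        },
    )
    print(note)
```

Since dictionaries preserve insertion order in Python, the same function can render a sequential reasoning chain by listing the steps in order.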

Phase 5: Technical Maintenance and Integration

Ensure the platform supports the fluid workflow of a data science project:

  1. Ensure Link Integrity: In Obsidian’s settings, enable the most important setting: “Automatically update internal links”. This ensures that if you rename a note (like changing Note Star to Note Star 2), all links referring to it automatically update, maintaining the functionality of the hypertext.

  2. Integrate with External Tools (Implicitly): Use the plain text notes as context for advanced LLM analysis. Obsidian notes can be used as a source for tools like LlamaIndex when building Agentic Workflows using Retrieval-Augmented Generation (RAG). These agents can then synthesize drafts or outlines using the connected thoughts recorded in the Zettelkasten.

  3. Focus on Creation: When working, remember the final goal is not organizational perfection but continuous thinking, writing, and connecting ideas.
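One way to picture the hand-off to an LLM tool is to gather a note together with the notes it links to and pass the result along as context. The sketch below uses plain Python only and does not call LlamaIndex or any other library's API. The function `collect_context`, the `max_hops` parameter, and the `## Title` separator are assumptions made for this example.

```python
import re
from collections import deque

# Captures the target of [[Target]], [[Target|alias]], or [[Target#Heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def collect_context(notes: dict[str, str], start: str, max_hops: int = 1) -> str:
    """Concatenate a note and the notes reachable from it within max_hops links."""
    seen = {start}
    queue = deque([(start, 0)])
    parts = []
    while queue:
        title, hops = queue.popleft()
        text = notes.get(title)
        if text is None:
            continue  # the link points to a note that does not exist yet
        parts.append(f"## {title}\n{text.strip()}")
        if hops == max_hops:
            continue  # do not follow links beyond the hop limit
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target not in seen:
                seen.add(target)
                queue.append((target, hops + 1))
    return "\n\n".join(parts)


if __name__ == "__main__":
    notes = {
        "Evaluation Metrics for LLMs": "Accuracy: [[Hallucination Rate]]",
        "Hallucination Rate": "Share of generated claims not supported by sources.",
    }
    print(collect_context(notes, "Evaluation Metrics for LLMs"))
```

Following links breadth-first keeps the closest notes first, which matters when the receiving model has a limited context length.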

Benefits to AI Data Science

The Zettelkasten Method, applied alongside literate programming tools like Markdown and Git, creates a synergistic research workflow that enhances knowledge creation, ensures data portability and persistence, and allows for the seamless integration of modern AI-driven analysis.

Here is how the Zettelkasten Method can be applied with these tools to improve research products, drawing upon the principles of connection, atomicity, and persistent, plain-text formats:

1. Utilizing Markdown and Atomicity for Note Creation

The core philosophy of the Zettelkasten is based on the Principle of Atomicity, which guides the note taker to capture precisely one thought or one knowledge building block per note. Markdown is the ideal format for capturing these atomic notes and connecting them effectively within a literate programming environment:

2. Utilizing Git for Version Control and Persistence

Git, a version control system common in literate programming, is invaluable for managing the iterative nature of research and the organic growth of the Zettelkasten structure:

3. Improving Research Products through Structure and AI Augmentation

The combined methodology directly supports the systematic processes required for high-quality research products, such as systematic reviews:

References

https://zettelkasten.de/introduction/

https://help.obsidian.md/