The Three-Pass Pipeline
The compiler lives in crates/knot-core/src/compiler/. Understanding the three
passes is the prerequisite for almost any change to how Knot executes code.
Pass 1 — Planning (pipeline.rs)
Input: parsed Vec<Node> + cache
Output: Vec<PlannedNode> — every node annotated with ExecutionNeed
Planning does four things:
-
Resolve options for each chunk: merge global defaults, language defaults, and per-chunk
#|options into aResolvedChunkOptions. -
Compute a chained SHA-256 hash for each chunk in each language chain. The hash of chunk N covers:
- the chunk's source code
- its resolved options
- the hash of chunk N-1 in the same language
Because hashes chain, editing chunk 3 changes the hash of chunk 4, 5, 6, … even if their code is unchanged. This guarantees downstream re-execution.
-
Classify each chunk:
Skip—eval: falseoptionCacheHit(attempt)— hash matched a cache entryMustExecute— hash not in cache (new or changed)
-
Apply Phase0Mode when assembling the partial document (for live preview):
Phase0Mode::Pending(duringdo_compile): allMustExecutechunks getChunkExecutionState::Pending(orange border).Phase0Mode::Modified(duringdo_phase0_only): the firstMustExecuteper language chain getsModified(amber thick), subsequent ones getModifiedCascade(amber thin) — distinguishing direct edits from hash-cascade invalidations.
Pass 2 — Execution (execution.rs)
Input: Vec<PlannedNode> (only MustExecute nodes are processed)
Output: Vec<ExecutedNode> — results written to cache
Execution uses std::thread::scope to run R and Python chains in parallel:
group_by_language(planned_nodes)
├── R chain → thread A → run_language_chain(r_nodes)
└── Python chain → thread B → run_language_chain(python_nodes)
Within each run_language_chain:
- Iterate over
MustExecutenodes in document order. - For each node, call the executor (
RExecutororPythonExecutor). - Write the result to cache (success or error).
- If the result is an error, all subsequent nodes in the chain become
Inert(interpreter state is uncertain). - If
freezeobjects are declared,check_freeze_contractis called after each subsequentMustExecutenode — a hash mismatch also cascadesInert.
The ExecutorManager uses a take/put-back pattern so executors can be moved
into threads without lifetime issues.
SnapshotManager saves and restores interpreter state (the R/Python environment
just before each chunk). This allows re-executing chunk 5 in a 20-chunk document
without re-running chunks 1-4.
Pass 3 — Assembly (mod.rs)
Input: original Vec<Node> + execution results + cache
Output: a .typ string
Assembly interleaves prose and code-chunk outputs in document order. For each
node it calls format_node() in backend.rs, which:
- For prose: emits the text as-is (it is already valid Typst).
- For code chunks: calls
format_chunk()to wrap the output in the appropriate#code-chunk(...)call with the right state flags and options. - Embeds
#KNOT-SYNCmarkers for bidirectional source ↔ PDF navigation.
Streaming (two-phase API)
The compiler exposes a two-phase API on the Compiler struct for live preview:
#![allow(unused)] fn main() { // Phase 0: planning only (no code executed), returns partial .typ immediately fn plan_and_partial( &self, nodes: Vec<Node>, mode: Phase0Mode, ) -> Result<(Vec<PlannedNode>, Arc<Mutex<Cache>>, String)> // Phase 1: execution + streaming fn execute_and_assemble_streaming( &self, planned: Vec<PlannedNode>, cache: Arc<Mutex<Cache>>, progress: Option<Sender<ProgressEvent>>, ) -> Result<String> }
ProgressEvent carries the doc_idx of the completed node plus its
ExecutedNode. The LSP uses these to rebuild the .typ string and push
incremental updates to Tinymist after each chunk completes.
Entry points
For most purposes you will use the project-level API in project.rs:
#![allow(unused)] fn main() { // One-shot compilation (used by knot build / knot watch) pub fn compile_project_full( root: &Path, on_progress: Option<Box<dyn Fn(String) + Send>>, ) -> Result<ProjectOutput> // Phase 0 only (instant, used by LSP on didChange) pub fn compile_project_phase0(root: &Path, mode: Phase0Mode) -> Result<ProjectOutput> // Phase 0 with unsaved buffer (typing-time updates) pub fn compile_project_phase0_unsaved( root: &Path, unsaved_path: &Path, content: &str, mode: Phase0Mode, ) -> Result<ProjectOutput> }