mdwright spec deviations
The mdwright formatter targets the GFM 0.29-gfm spec (crates/mdwright/tests/gfm-spec/spec.txt, vendored from cmark-gfm). Every example
is exercised by crates/mdwright/tests/gfm_spec.rs as a parse → format → parse → format round-trip and compared against the source
HTML and the normalised event stream.
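The round-trip shape can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the harness in gfm_spec.rs: the formatter is passed in as a closure with parsing folded into it, and `toy_format` is a hypothetical stand-in that only strips trailing whitespace.

```rust
/// Outcome of running a formatter twice over the same input.
#[derive(Debug, PartialEq)]
enum RoundTrip {
    /// The second pass reproduced the first pass byte-for-byte.
    Idempotent,
    /// The second pass changed the output; both passes are kept for diffing.
    Drifted { first: String, second: String },
}

/// Runs `formatter` on `source`, then on its own output, and compares the two.
fn round_trip<F: Fn(&str) -> String>(formatter: F, source: &str) -> RoundTrip {
    let first = formatter(source);
    let second = formatter(&first);
    if first == second {
        RoundTrip::Idempotent
    } else {
        RoundTrip::Drifted { first, second }
    }
}

/// Hypothetical stand-in formatter: strips trailing whitespace from each line.
fn toy_format(source: &str) -> String {
    source
        .lines()
        .map(|line| line.trim_end())
        .collect::<Vec<_>>()
        .join("\n")
}
```

A formatter that keeps appending to its own output would come back as `Drifted`, which is the kind of failure the second format pass exists to catch.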
This document is the user-facing index of where mdwright currently does not byte-for-byte round-trip the spec. It is split into two parts, mirroring the underlying mechanism:
- Editorial deviations: choices we have made and intend to keep. Curated in crates/mdwright/tests/gfm-spec/allowlist.toml. Each entry has a one-line rationale and a pointer to where the decision is documented.
- Tracked regressions: known divergences that we intend to fix. Recorded in crates/mdwright/tests/gfm-spec/snapshot.txt. The snapshot is asserted byte-for-byte, so any drift, whether regression or improvement, fails CI and forces a deliberate update.
The gfm_spec_coverage test prints the live count for both groups; the numbers below are a snapshot of the current main
branch.
Coverage
| Bucket | Examples |
|---|---|
| Spec examples total | 672 |
| Matching | 637 |
| Editorial deviations | 35 |
| Tracked regressions | 0 |
A case may fail more than one comparison kind (semantic, idempotence); the snapshot file is keyed by
(case, kind) and currently lists no tracked regressions.
Parser Backend Drift
The formatter round-trip gate is not the same as cmark-gfm renderer equivalence. cargo xtask parser-audit compares
mdwright's current pulldown-cmark backend with cmark-gfm and renders mdwright through the opt-in cmark-gfm render
profile. The current GFM-spec parser audit has 15 classified HTML differences, 0 source-position differences, and 0
unclassified differences.
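The remaining differences are accepted constraints of the current backend. The three drift classes sum to the 15 HTML differences above; the contained parser panic is an additional row: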
| Class | Count | Status |
|---|---|---|
| Emphasis delimiter-stack resolution | 9 | accepted parser-backend drift |
| Raw HTML block indentation/newline spelling | 4 | accepted render drift with stable source facts |
| Task-list examples marked disabled by the spec | 2 | accepted spec-fixture drift |
| Contained upstream parser panic | 1 | converted to ParseError |
[render] profile = "cmark-gfm" changes only HTML spelling for mdwright render: quote escaping, link-destination
escaping, ordinary GFM table layout, task-list checkbox spelling, and one raw-HTML newline case where the parser already
exposes enough structure. It does not change emphasis resolution or parser tree semantics. Full cmark-gfm parser
equivalence would require upstream pulldown-cmark changes, a maintained fork, or a parser backend switch.
Editorial deviations
Pulldown text-chunking deviations
35 spec examples currently fail the AST-event comparison only; HTML matches byte-for-byte and round-trip is idempotent.
The mismatch reflects pulldown-cmark's text-run chunking: pulldown splits long runs of text into events at points
cmark-gfm does not, so the normalised Event::Text(…) stream differs even though every other event lines up and every
rendered HTML byte agrees.
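The effect of chunking can be illustrated with a toy event type. The `Ev` enum below is hypothetical and much simpler than pulldown-cmark's `Event`; the sketch only shows why two streams that carry the same text can compare unequal event-by-event, and it does not describe how mdwright's normaliser works.

```rust
/// Hypothetical, simplified event type; not pulldown-cmark's `Event`.
#[derive(Debug, Clone, PartialEq)]
enum Ev {
    Start(&'static str),
    Text(String),
    End(&'static str),
}

/// Merges each run of adjacent `Text` events into a single `Text` event.
fn coalesce_text(events: &[Ev]) -> Vec<Ev> {
    let mut out: Vec<Ev> = Vec::new();
    for ev in events {
        // If the previous output event and the current event are both text,
        // append to the previous one instead of pushing a new event.
        if let (Some(Ev::Text(prev)), Ev::Text(next)) = (out.last_mut(), ev) {
            prev.push_str(next);
            continue;
        }
        out.push(ev.clone());
    }
    out
}
```

With a stream chunked as `Text("foo")`, `Text(" bar")` and another carrying a single `Text("foo bar")`, direct comparison fails while the coalesced streams are equal, which matches the pattern described above: same text, same HTML, different event boundaries.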
The triage rule, applied at the snapshot level, is:
    For each (case, kinds) in snapshot.txt:
        if kinds == {"ast"} and case has no other entry:
            -> allowlist.toml  (bucket = "pulldown-text-chunking")
        else:
            -> stays in snapshot.txt  (tracked regression)
Affected cases: 5, 6, 7 (Tabs, CM §2.2); 16, 19 (Thematic breaks, CM §4.1); 61 (Setext headings, CM §4.3); 102, 103 (Fenced code blocks, CM §4.5); 214, 230 (Block quotes, CM §5.1); 232, 242, 248, 249, 251, 252, 256, 264, 265, 266, 268 (List items, CM §5.2); 320 (Backslash escapes, CM §2.4); 321, 324, 330, 333 (Entity refs, CM §2.5); 393, 411 (Emphasis, CM §6.2); 499, 500, 503, 520, 528, 536 (Links, CM §6.3); 640 (Raw HTML, CM §6.8).
The bucket name is load-bearing: if a future per-case investigation disproves the chunking explanation for one of the
cases above, remove its entry from allowlist.toml and let it re-enter the snapshot as a tracked regression.
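The triage rule for this bucket can be written out as a small function. This is a sketch under assumptions: the input is a list of `(case, kind)` pairs using the kind names mentioned in this document ("ast", "semantic", "idempotence"), and the `Triage` type and `triage` function are hypothetical names, not part of the test suite.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Where a failing case ends up after triage.
#[derive(Debug, PartialEq)]
enum Triage {
    /// Only the AST comparison failed: editorial deviation.
    Allowlist { bucket: &'static str },
    /// Anything else stays in snapshot.txt as a tracked regression.
    Snapshot,
}

/// Groups `(case, kind)` pairs by case and applies the triage rule.
fn triage(entries: &[(u32, &str)]) -> BTreeMap<u32, Triage> {
    let mut kinds_by_case: BTreeMap<u32, BTreeSet<&str>> = BTreeMap::new();
    for &(case, kind) in entries {
        kinds_by_case.entry(case).or_default().insert(kind);
    }
    kinds_by_case
        .into_iter()
        .map(|(case, kinds)| {
            // "No other entry" means the set of failing kinds is exactly {"ast"}.
            let ast_only = kinds.len() == 1 && kinds.contains("ast");
            let verdict = if ast_only {
                Triage::Allowlist { bucket: "pulldown-text-chunking" }
            } else {
                Triage::Snapshot
            };
            (case, verdict)
        })
        .collect()
}
```

A case that fails the AST comparison and any second comparison never reaches the allowlist, so the bucket cannot absorb a real formatter bug by accident.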
Tracked regressions
There are currently no tracked GFM-spec formatter regressions. Any future non-allowlisted failure appears in
crates/mdwright/tests/gfm-spec/snapshot.txt and fails the snapshot test until it is fixed or deliberately classified.
mdformat-mkdocs parity deviations
mdwright matches mdformat-mkdocs byte-for-byte for the four Markdown extensions covered in
Markdown extensions. The parity test at crates/mdwright/tests/extension_parity.rs enforces this against five
committed reference fixtures. Known divergences are listed below; each row exists because the upstream pulldown-cmark parser
doesn't surface enough information for mdwright to round-trip the source faithfully.
| Construct | Source pattern that diverges | Why |
|---|---|---|
| Heading attribute, quoted value | # H {title="hello world"} | pulldown-cmark 0.13's heading-attribute parser splits the trailer on whitespace and ignores "…" quoting. Pulldown surfaces two attrs (title="hello and world") instead of one. mdformat-mkdocs (python-markdown's attr_list) handles the quoted form correctly. Tracked upstream; will resolve when pulldown lands the fix. |
The parity test refuses to silently accept new divergences: any byte-for-byte mismatch fails the test and forces a deliberate add to this table (with a rationale and an upstream pointer) or a fix in mdwright's emit path.
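The quoted-value row comes down to how the attribute trailer is tokenised. The two functions below are an illustrative sketch only: neither is pulldown-cmark's or python-markdown's actual code, and both names are hypothetical. They contrast a whitespace split with a split that respects double quotes.

```rust
/// Naive split: breaks the trailer on whitespace and ignores quoting.
fn split_naive(trailer: &str) -> Vec<String> {
    trailer.split_whitespace().map(String::from).collect()
}

/// Quote-aware split: whitespace inside a double-quoted value is kept.
fn split_quoted(trailer: &str) -> Vec<String> {
    let mut attrs = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for ch in trailer.chars() {
        match ch {
            '"' => {
                // Toggle quoting and keep the quote as part of the attribute.
                in_quotes = !in_quotes;
                current.push(ch);
            }
            c if c.is_whitespace() && !in_quotes => {
                // Unquoted whitespace ends the current attribute, if any.
                if !current.is_empty() {
                    attrs.push(std::mem::take(&mut current));
                }
            }
            c => current.push(c),
        }
    }
    if !current.is_empty() {
        attrs.push(current);
    }
    attrs
}
```

For the trailer `title="hello world"`, the naive split yields two tokens and the quote-aware split yields one, which is the difference the table row records.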
MyST + Pandoc directive parity
mdwright preserves MyST directive containers, Pandoc fenced divs, inline roles, MyST substitutions, Pandoc inline
attribute spans, and MyST % line comments byte-verbatim. See MyST + Pandoc directives for
the full scope. The bar is idempotence-on-mode, not byte-equal round-trip with mdformat-mkdocs: mdformat-mkdocs does
not implement these constructs at all, so there is no upstream reference to diff against. The vendored jupyter-book demo
at crates/mdwright/tests/external/jupyter_book_minimal/ plus the per-construct regressions at
crates/mdwright/tests/regressions/{directive_*,inline_role_*,myst_*}.in are the safety net.
| Construct | Source pattern that diverges | Why |
|---|---|---|
| Malformed :::{name} source | Bare :::{warning} Experimental with no closer | Pulldown parses the opener as part of a definition-list or paragraph; mdwright's directive overlay matches on byte-range overlap and emits the union of the tree-node range and the directive region, so the bytes survive, but the surrounding misclassified bytes flow through pulldown's normal path. Fix the source by closing the directive. |
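The overlap-and-union behaviour in the row above can be sketched with plain byte ranges. This is an assumption-laden illustration: the function names are hypothetical, ranges are half-open, and the real overlay works on tree nodes rather than bare ranges.

```rust
use std::ops::Range;

/// True when two half-open byte ranges share at least one byte.
fn overlaps(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start < b.end && b.start < a.end
}

/// Smallest range covering both inputs.
fn cover(a: &Range<usize>, b: &Range<usize>) -> Range<usize> {
    a.start.min(b.start)..a.end.max(b.end)
}

/// Span to emit verbatim for a tree node: the union with the directive
/// region when the two overlap, otherwise `None`.
fn verbatim_span(node: &Range<usize>, directive: &Range<usize>) -> Option<Range<usize>> {
    overlaps(node, directive).then(|| cover(node, directive))
}
```

Emitting the union rather than the intersection is what lets the directive bytes survive even when the parser's node boundaries do not line up with the directive region.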
How to read the live numbers
    cargo test --release --test gfm_spec gfm_spec_coverage -- --nocapture

prints, at the top of its output:

    gfm spec coverage:
    total cases: <n>
    fully matching: <n>
    intentional dev: <n>
    tracked regression: <n>
    unexpected: <n>
These are the source of truth; the table above is a snapshot for the release notes.
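If the live numbers need to be checked mechanically, the summary can be parsed with a few lines of std-only Rust. This is a sketch, not part of the test suite: it assumes the `label: number` shape shown above, and it assumes every case lands in exactly one bucket, which this document does not state explicitly. The `Coverage` type and `parse_coverage` function are hypothetical.

```rust
use std::collections::HashMap;

/// Counts parsed from the coverage summary.
#[derive(Debug, PartialEq)]
struct Coverage {
    total: u32,
    matching: u32,
    intentional: u32,
    tracked: u32,
    unexpected: u32,
}

/// Parses `label: number` lines. Lines without a numeric value are skipped;
/// returns `None` if any expected label is absent.
fn parse_coverage(output: &str) -> Option<Coverage> {
    let mut fields: HashMap<&str, u32> = HashMap::new();
    for line in output.lines() {
        if let Some((label, value)) = line.split_once(':') {
            if let Ok(n) = value.trim().parse::<u32>() {
                fields.insert(label.trim(), n);
            }
        }
    }
    Some(Coverage {
        total: *fields.get("total cases")?,
        matching: *fields.get("fully matching")?,
        intentional: *fields.get("intentional dev")?,
        tracked: *fields.get("tracked regression")?,
        unexpected: *fields.get("unexpected")?,
    })
}

impl Coverage {
    /// Assumption: the four buckets partition the total.
    fn is_consistent(&self) -> bool {
        self.matching + self.intentional + self.tracked + self.unexpected == self.total
    }
}
```

Fed the figures from the Coverage table (672, 637, 35, 0, and an assumed 0 for the unexpected bucket), the consistency check holds.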
Updating the snapshot
After a deliberate fix (or an accepted editorial deviation):
    # A fix that removes (case, kind) entries from snapshot.txt:
    MDWRIGHT_UPDATE_SNAPSHOT=1 cargo test --release --test gfm_spec gfm_spec_snapshot

    # An editorial deviation: add a row to crates/mdwright/tests/gfm-spec/allowlist.toml
    # *before* regenerating the snapshot, then run the same command.
The snapshot test fails on any drift; CI will not silently accept a regression that happens to look like an improvement, and an improvement that isn't reflected in the snapshot fails just as loudly.
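The fail-on-any-drift policy can be sketched as a pure decision function. Everything here is illustrative: the `SnapshotAction` type and both functions are hypothetical, the snapshot contents are treated as opaque strings, and the sketch assumes the update switch counts as enabled whenever the variable is present.

```rust
/// What the snapshot test should do for one run.
#[derive(Debug, PartialEq)]
enum SnapshotAction {
    /// Recorded and live failures agree byte-for-byte.
    Pass,
    /// Update mode: overwrite the snapshot with the live failures.
    Rewrite(String),
    /// Any drift without update mode fails, improvement or not.
    Fail { recorded: String, live: String },
}

/// Compares the recorded snapshot with the live result and picks an action.
fn check_snapshot(recorded: &str, live: &str, update: bool) -> SnapshotAction {
    if recorded == live {
        SnapshotAction::Pass
    } else if update {
        SnapshotAction::Rewrite(live.to_string())
    } else {
        SnapshotAction::Fail {
            recorded: recorded.to_string(),
            live: live.to_string(),
        }
    }
}

/// Assumption: the switch is on whenever the variable is present at all.
fn update_requested() -> bool {
    std::env::var_os("MDWRIGHT_UPDATE_SNAPSHOT").is_some()
}
```

Note that a live result with fewer failures than the recorded snapshot still returns `Fail` unless update mode is on, so an improvement has to be committed as deliberately as a regression.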