Wiki Lint Daily — Project Review (2026-06-22)
What it is
Daily wiki hygiene tool. A Python scanner (lint_scan.py) walks the SilverBullet wiki tree checking for frontmatter validity, required fields, type compliance, and tag taxonomy violations. Runs as part of autopilot Mode D rotation. Goal: keep metadata consistent so the wiki stays navigable.
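To make the checks concrete, here is a minimal sketch of what such a scan could look like. It is not the code in lint_scan.py: the function names, the contents of REQUIRED_FIELDS and VALID_TYPES, and the flat key: value frontmatter parsing are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical values for illustration; the real lists live in lint_scan.py.
REQUIRED_FIELDS = ("type", "created", "updated")
VALID_TYPES = {"session", "concept", "project"}


def parse_frontmatter(text):
    """Return frontmatter as a dict, or None if there is no well-formed block.

    Only flat `key: value` lines are handled here; a real scanner would
    more likely hand the block to a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return None  # opening delimiter without a closing one


def check_page(text):
    """Return a list of issue strings for one page's text."""
    fields = parse_frontmatter(text)
    if fields is None:
        return ["missing or unterminated frontmatter"]
    issues = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if not fields.get(name)
    ]
    page_type = fields.get("type")
    if page_type and page_type not in VALID_TYPES:
        issues.append(f"invalid type: {page_type}")
    return issues


def scan_tree(root):
    """Map each Markdown file under root to its issues, skipping clean pages."""
    report = {}
    for path in sorted(Path(root).rglob("*.md")):
        issues = check_page(path.read_text(encoding="utf-8"))
        if issues:
            report[str(path)] = issues
    return report
```

Keeping check_page separate from the directory walk means the rules can be tested on plain strings without touching the wiki tree.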
Current state
- Script exists at scripts/lint_scan.py: functional, loads tags dynamically from SCHEMA.md
- Last meaningful scan: Jun 22 (321 pages, 387 issues). Stable since Jun 17 (~369-387 issues across runs)
- 30+ session logs spanning Jun 5–Jun 22 showing repeated scans with similar results
- Key past work done: fixed broken wikilinks in concepts/, added frontmatter to 278 files, created missing root index.md, resolved smart-groceries dead session links
Gaps and risks
- No regression tracking: Jun 22 scan noted “baseline not found on disk” — no persistent baseline file means delta comparison is impossible
- Issues plateauing at around 369-387: the same structural issues keep appearing (missing created/updated, bad types in archived dirs), but nothing fixes them systematically
- Schema drift risk: VALID_TYPES and REQUIRED_FIELDS are hardcoded in the script. SCHEMA.md changes won’t propagate unless someone manually updates lint_scan.py
- No remediation automation: Scanner only reports; doesn’t fix. The same missing-fields issues appear every scan because nothing auto-generates the missing created/updated fields
- Duplicate project dirs: t-002 in todos.json flagged “wiki-lint-daily vs wikilint-daily duplicate dir”; still open, status=doing since Jun 18
- Tag loading fragile: _load_allowed_tags() extracts backtick-wrapped words from SCHEMA.md using a broad regex. Any formatting change to SCHEMA.md could break it silently
Recommended approach
Build automated remediation on top of the existing scanner. Fix what’s fixable (missing dates → inject today’s date, bad types → map to closest valid type), then focus effort only on structural/schema gaps that require human decisions.
Keep the “daily scan” name but split into two phases:
- Auto-fix: inject missing fields, normalize types; a no-op when nothing needs fixing
- Report-only: flag what requires schema updates or human judgment
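The auto-fix phase could be sketched roughly as below. This is an assumption about how it might work, not existing code: inject_missing_dates and fix_file are hypothetical names, and the sketch only handles flat key: value frontmatter. It leaves pages without well-formed frontmatter alone, so those still reach the report-only phase.

```python
from datetime import date, datetime
from pathlib import Path


def inject_missing_dates(text, fallback):
    """Add created/updated to an existing frontmatter block when absent.

    Returns the text unchanged (a no-op) when both keys are already there
    or when the page has no well-formed frontmatter block. A key that is
    present with an empty value is left alone.
    """
    lines = text.split("\n")
    if lines[0].strip() != "---":
        return text
    end = None
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            end = index
            break
    if end is None:
        return text  # unterminated frontmatter: leave for a human
    present = {
        line.partition(":")[0].strip()
        for line in lines[1:end]
        if ":" in line
    }
    additions = [
        f"{key}: {fallback.isoformat()}"
        for key in ("created", "updated")
        if key not in present
    ]
    if not additions:
        return text
    return "\n".join(lines[:end] + additions + lines[end:])


def fix_file(path, use_mtime=True):
    """Rewrite the file only if something changed; return True when it did."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    if use_mtime:
        # Filesystem mtime as the best available guess at the real date.
        fallback = datetime.fromtimestamp(path.stat().st_mtime).date()
    else:
        fallback = date.today()
    fixed = inject_missing_dates(original, fallback)
    if fixed == original:
        return False
    path.write_text(fixed, encoding="utf-8")
    return True
```

Because a second run finds both keys present and returns the text untouched, the fix is idempotent, which is what makes it safe to run on every scan.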
Phased plan
Phase 1 — Stabilise (t-001 to t-004)
- Write baseline after each scan to data/baseline.json for regression tracking
- Add auto-fix mode: inject created/updated from filesystem mtime or today’s date when missing
- Clean up the duplicate dir issue (wikilint-daily vs wiki-lint-daily)
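The baseline step might look something like the sketch below. Only the path data/baseline.json comes from the plan. The summary shape (an issues_by_kind mapping of issue kind to count) and the function names are assumptions for illustration.

```python
import json
from pathlib import Path

BASELINE_PATH = Path("data/baseline.json")  # path named in the plan


def load_baseline(path=BASELINE_PATH):
    """Return the previous scan summary, or None when no baseline exists."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return None


def write_baseline(summary, path=BASELINE_PATH):
    """Persist the summary, writing to a temp file first.

    The rename at the end means an interrupted run does not leave a
    half-written baseline behind.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(summary, indent=2, sort_keys=True), encoding="utf-8")
    tmp.replace(path)


def issue_delta(previous, current):
    """Per-kind change in issue counts; positive numbers are regressions.

    Assumes both summaries carry a hypothetical "issues_by_kind" mapping.
    Returns None when there is no previous baseline to compare against.
    """
    if previous is None:
        return None
    old = previous["issues_by_kind"]
    new = current["issues_by_kind"]
    return {
        kind: new.get(kind, 0) - old.get(kind, 0)
        for kind in sorted(set(old) | set(new))
    }
```

A scan would call load_baseline first, compute issue_delta against the fresh summary, and only then call write_baseline, so the comparison always uses the previous run.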
Phase 2 — Schema-aware linting (t-005 to t-007)
- Parse SCHEMA.md VALID_TYPES programmatically instead of hardcoding in script
- Add type-mapping table: known invalid types → their closest valid equivalent (e.g., goal-reflection → session)
- Tag loading: tighten regex or parse structured sections
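One way to approach Phase 2 is sketched below. The layout of SCHEMA.md is not described in this review, so the sketch assumes each list sits under its own Markdown heading as bullet items with a backticked name. That layout, the heading names, and the function names are all hypothetical. Only the goal-reflection → session mapping comes from the plan.

```python
import re

# Only goal-reflection -> session is taken from the plan; extend as needed.
TYPE_MAP = {"goal-reflection": "session"}

_HEADING = re.compile(r"^#{1,6}\s+(.*)$")
_ITEM = re.compile(r"^\s*[-*]\s+`([a-z0-9][a-z0-9-]*)`")


def parse_section_items(schema_text, heading):
    """Collect backticked bullet items listed under one heading.

    Raises ValueError instead of returning an empty set, so a layout
    change in the schema fails loudly rather than silently.
    """
    items = set()
    in_section = False
    for line in schema_text.splitlines():
        heading_match = _HEADING.match(line)
        if heading_match:
            title = heading_match.group(1).strip().lower()
            in_section = title == heading.lower()
            continue
        if in_section:
            item_match = _ITEM.match(line)
            if item_match:
                items.add(item_match.group(1))
    if not items:
        raise ValueError(f"no items found under heading {heading!r}")
    return items


def normalize_type(page_type, valid_types):
    """Return (type, status) where status is 'ok', 'mapped', or 'unknown'."""
    if page_type in valid_types:
        return page_type, "ok"
    mapped = TYPE_MAP.get(page_type)
    if mapped in valid_types:
        return mapped, "mapped"
    return page_type, "unknown"
```

Restricting the match to bullet items inside a named section is narrower than picking up every backticked word in the file, and the 'unknown' status keeps unmapped types in the report-only phase instead of guessing.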
Phase 3 — Continuous enforcement (t-008+)
- Wire auto-fix into autopilot cron so scans both detect and repair on each tick
- Add a weekly summary that shows issue delta (up/down/stable) instead of raw counts
- Consider archiving old session logs (>30 days) to reduce scan noise
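The weekly summary could be as small as the sketch below, which labels a week from its daily issue totals. The function names and the tolerance of 5 issues are assumptions. The threshold for what counts as "stable" is a judgment call this review does not settle.

```python
def weekly_trend(totals, tolerance=5):
    """Compare the last daily total with the first.

    Returns (label, delta) where label is 'up', 'down', or 'stable'.
    Changes within +/- tolerance count as stable.
    """
    if len(totals) < 2:
        return "stable", 0
    delta = totals[-1] - totals[0]
    if delta > tolerance:
        return "up", delta
    if delta < -tolerance:
        return "down", delta
    return "stable", delta


def format_summary(totals, tolerance=5):
    """One-line summary suitable for a weekly note."""
    if not totals:
        return "Issues this week: no scans recorded"
    label, delta = weekly_trend(totals, tolerance)
    return f"Issues this week: {label} ({delta:+d}, now {totals[-1]})"
```

Feeding it the totals stored with each baseline would replace the raw counts in the report with a direction and a signed delta.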