Session goal: Run bill scanner, fix PyMuPDF venv issue, process all attachments.
Progress log:
- 14:00 - Ran `bill-scanner.py --scan`: found 2 potential bills (Unitywater May, Policy 77512971)
- 14:01 - Ran `--process-attachments`: all 11 PDFs failed with `No module named 'fitz'` (system Python missing PyMuPDF)
- 14:03 - Installed PyMuPDF v1.27.2.3 via `uv pip install` into `/opt/data/.venv/`
- 14:04 - Fixed scanner shebang from `#!/usr/bin/env python3` → `#!/opt/data/.venv/bin/python3`
- 14:04 - Removed stale `sys.path.insert(0, '/opt/hermes/...')` hack in `pdf_to_images()` function
- 14:05 - Re-ran `--process-attachments`: SUCCESS, 11 PDFs → 15 pages converted → all OCR'd via GPU node (carnice-v2-27b, 192.168.100.106:8080)
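
The `No module named 'fitz'` failure at 14:01 came from running under an interpreter that did not have PyMuPDF installed. A minimal preflight sketch, assuming the scanner could check its imports before touching any attachments; the helper names `missing_modules` and `preflight_report` are hypothetical and not part of `bill-scanner.py`:

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def preflight_report(names=("fitz",)):
    """Describe whether the running interpreter has the required modules.

    "fitz" is the import name PyMuPDF provides. Including sys.executable
    in the message shows which interpreter the shebang actually resolved to.
    """
    missing = missing_modules(names)
    if missing:
        return f"{sys.executable} is missing: {', '.join(missing)}"
    return f"{sys.executable}: all required modules found"
```

Calling `preflight_report()` at startup would have pointed at the system Python in one line instead of 11 per-file failures.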
Outputs:
- Fixed `/opt/data/bin/bill-scanner.py` shebang + removed broken `sys.path` hack
- PyMuPDF v1.27.2.3 installed in `/opt/data/.venv/`
- 15 OCR results saved to `/opt/data/bills/processed/`:
  - `111_Unitywater Bill 27 May 2026_page1.txt` + `_page2.txt` (Unitywater quarterly bill)
  - `514_PowerCo_Bill_MAR2026*.txt` (3 PowerCo test files; NZ address, skip)
  - `55_PowerCo_Bill_MAR2026_page1.txt` (PowerCo test file; NZ address, skip)
  - `825_Unitywater Bill 27 May 2026_page*.txt` (duplicate of email 111, same files)
  - `98_JB-25702383-5008679994-146_page1.txt` + `99_*` (JB Hi-Fi AirPods, $360.99 paid)
  - `test_421_Invoice_page*.txt` (Superloop $119 broadband; past due, needs verification)
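
The duplicate noted above (email 825 repeating email 111) was spotted by hand. A rough sketch of how such duplicates might be flagged automatically, assuming the OCR outputs are byte-identical when the source PDF is the same; that may not hold if the OCR model is nondeterministic. The function name `find_duplicate_texts` is hypothetical:

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_texts(directory):
    """Group .txt files by the SHA-256 of their bytes.

    Returns a list of groups (lists of file names) where two or more
    files have identical content. Files are visited in sorted name order.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(directory).glob("*.txt")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        by_digest[digest].append(path.name)
    return [names for names in by_digest.values() if len(names) > 1]
```

Running it against `/opt/data/bills/processed/` would list candidate duplicate groups for review; it only reads files and never deletes anything.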
Issues / Questions:
- PowerCo files are clearly test data (NZ Auckland address). Leave as-is or clean up?
- Superloop “test_421” account ($119/month, past due since 5 May): likely dummy. Needs pvs confirmation to delete.
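
If cleanup is approved, a dry-run listing is a safe first step. A minimal sketch, assuming the suspected test files can be matched by glob patterns taken from the file names recorded under Outputs; `cleanup_candidates` is a hypothetical helper and deliberately deletes nothing:

```python
from pathlib import Path

# Patterns taken from the file names listed under Outputs (assumed to
# cover only the suspected test data; review before acting on them).
TEST_DATA_PATTERNS = ("*PowerCo_Bill_MAR2026*.txt", "test_421_Invoice_page*.txt")


def cleanup_candidates(directory, patterns=TEST_DATA_PATTERNS):
    """Return sorted names of files matching any pattern, without deleting them."""
    matches = set()
    for pattern in patterns:
        matches.update(path.name for path in Path(directory).glob(pattern))
    return sorted(matches)
```

The returned list can be reviewed and confirmed before any file is actually removed.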
Status: done