Refactor smart-groceries importers to use camofox
Goal
Refactor smart-groceries scraping to route through camofox so we evade bot detection on Coles/Woolworths and never reveal our gateway IP.
Why this is broken
The smart-groceries-catalogue-scrape cronjob has been failing for days. The last manual run returned 0 categories from both Coles and Woolworths because requests.Session() is detectable as a bot. Switching to camofox addresses both goals: it presents a real Firefox fingerprint and sends traffic through a dedicated NordVPN sidecar.
Files to change
- /opt/data/smart-groceries/app/importers/coles.py
- /opt/data/smart-groceries/app/importers/woolworths.py
- New: /opt/data/smart-groceries/app/importers/camofox_client.py
- /opt/data/smart-groceries/requirements.txt: add httpx
camofox API (probed in production)
- Service: http://camofox-browser-service.ai-agents.svc.cluster.local:9377
- GET / returns engine status
- POST /start boots the Firefox engine if not running
- POST /tabs/open with body {"url":"...","wait":"networkidle"} returns {tabId, ...}
- POST /tabs/{tabId}/evaluate with body {"script":"document.documentElement.outerHTML"} returns {result: "..."}
- DELETE /tabs/{tabId} closes the tab
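The endpoints above could be wrapped in camofox_client.py roughly as follows. This is a minimal sketch, not a confirmed implementation: the function names make_client and fetch_html are hypothetical, the 60-second timeout is a guess, and it assumes POST /start is safe to call when the engine is already running. The tabId and result fields follow the probed responses listed above.

```python
BASE_URL = "http://camofox-browser-service.ai-agents.svc.cluster.local:9377"


def make_client(timeout: float = 60.0):
    """Build an httpx.Client bound to the camofox service (timeout is an assumption)."""
    import httpx  # imported lazily so fetch_html can be exercised with a stub client

    return httpx.Client(base_url=BASE_URL, timeout=timeout)


def fetch_html(client, url: str) -> str:
    """Open url in a camofox tab, return the rendered HTML, and always close the tab."""
    # Assumption: /start is a no-op when the engine is already up.
    client.post("/start").raise_for_status()

    opened = client.post("/tabs/open", json={"url": url, "wait": "networkidle"})
    opened.raise_for_status()
    tab_id = opened.json()["tabId"]

    try:
        evaluated = client.post(
            f"/tabs/{tab_id}/evaluate",
            json={"script": "document.documentElement.outerHTML"},
        )
        evaluated.raise_for_status()
        return evaluated.json()["result"]
    finally:
        # Close the tab even if evaluation failed, so tabs do not pile up.
        client.delete(f"/tabs/{tab_id}")
```

An importer would then call something like fetch_html(make_client(), category_url) where it previously used requests.Session().get(...).text, with category_url standing in for whatever URL the importer already builds. Taking the client as a parameter keeps the HTTP layer swappable for tests.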
After changes
- cd /opt/data/smart-groceries && git add app/importers/ requirements.txt
- git -c user.email=hermes@paralla.org -c user.name=hermes commit -m "refactor: route Coles/Woolworths importers through camofox"
- git push origin main (the token is the $GITLAB_TOKEN env var, from hermes-credentials.gitlab-token)
- Watch GitLab CI: the image rebuilds on the main branch.
After CI rebuilds
Tell Claude. Claude will then update the cronjob to use the new image and drop both the nordvpn-sidecar and the wait-for-VPN logic.
Validation
Once the cronjob is updated, a manual job should yield ~15-20 categories per store and hundreds of products. If anything is unclear, ask Claude.
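The expectation above can be turned into a quick sanity check on a manual run. The sketch below is illustrative only: check_run is a hypothetical helper, the floor of 15 categories is the low end of the stated range, and the floor of 100 products per store is one reading of "hundreds of products"; adjust both if the real numbers differ.

```python
STORES = ("coles", "woolworths")


def check_run(categories, products, min_categories=15, min_products=100):
    """Return a list of problems for a scrape run; an empty list means it looks healthy.

    categories and products map a store name to the count scraped for it.
    The default thresholds are assumptions, not values from the scraper itself.
    """
    problems = []
    for store in STORES:
        n_categories = categories.get(store, 0)
        n_products = products.get(store, 0)
        if n_categories < min_categories:
            problems.append(
                f"{store}: {n_categories} categories (expected at least {min_categories})"
            )
        if n_products < min_products:
            problems.append(
                f"{store}: {n_products} products (expected at least {min_products})"
            )
    return problems
```

A run like the failing one, with 0 categories from both stores, would report problems for each store, while a run inside the expected range returns an empty list.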
Result (failed, completed by hermes at 2026-04-30T20:45:03Z)
poller failed: timed out