Report Format

The final report is plain text written to --output (default: stdout). It contains 11 sections in the order described below. Each major section is separated by two blank lines for readability.
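Because major sections are separated by two blank lines, a script that consumes the report can split on that boundary. The following is a minimal sketch, not part of the tool; the function name split_report_sections is hypothetical, and it assumes that two consecutive blank lines never occur inside a section.

```python
import re


def split_report_sections(report_text):
    """Split report text into major sections.

    Two blank lines between sections show up as three or more
    consecutive newline characters in the raw text.
    """
    chunks = re.split(r"\n{3,}", report_text.strip("\n"))
    # Drop any chunk that is empty or whitespace-only.
    return [chunk for chunk in chunks if chunk.strip()]
```

A single blank line inside a section is left intact, so sub-groups within a section stay together.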

Section 1: Configuration Summary

Shows the effective configuration used for the crawl, including all URL classification lists.

Section 2: Statistics Summary

Overall crawl statistics including elapsed time, total requests, bytes downloaded, and per-domain request breakdown.

Section 4: Broken Anchors

Fragment references (e.g. #section) where the target anchor ID does not exist in the target page. Grouped by the target URL and fragment.
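To illustrate what a broken anchor is, the sketch below checks whether a fragment has a matching anchor in a page's HTML. It is an assumption-laden illustration using only the Python standard library, not the crawler's actual implementation; AnchorCollector and fragment_exists are hypothetical names, and treating the legacy name attribute on a elements as an anchor is an assumption.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect every value that could serve as a fragment target."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue
            # Any element's id is a valid target; <a name="..."> is
            # included here as an assumption about legacy pages.
            if name == "id" or (tag == "a" and name == "name"):
                self.anchors.add(value)


def fragment_exists(html_text, fragment):
    """Return True if the fragment (with or without '#') has a target."""
    collector = AnchorCollector()
    collector.feed(html_text)
    return fragment.lstrip("#") in collector.anchors
```

A reference such as page.html#missing would be reported in this section when fragment_exists returns False for the target page.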

Section 5: Non-200 Responses

All URLs whose final status, after following redirects, was not 200. This includes broken links (4xx/5xx), among them access-denied responses such as 403 Forbidden, and any other non-200 status. Grouped by HTTP status code.
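The grouping can be pictured with a short sketch. It assumes crawl results are available as (url, final_status) pairs, which is a hypothetical data shape chosen for illustration; group_by_status is likewise a hypothetical name, and the sorting is an assumption about presentation.

```python
from collections import defaultdict


def group_by_status(results):
    """Group URLs by final HTTP status, skipping 200 responses.

    `results` is an iterable of (url, final_status) pairs.
    """
    grouped = defaultdict(list)
    for url, status in results:
        if status != 200:
            grouped[status].append(url)
    # Sort by status code, then by URL, for stable output.
    return {status: sorted(urls) for status, urls in sorted(grouped.items())}
```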

Section 6: Redirects

URLs where the server issued a 3xx redirect to a different final URL. Shows the original URL, the final URL, and the HTTP status code of the first redirect hop (e.g. 301, 302). Redirects that ultimately resolve to 200 do not cause a non-zero exit code; they are informational only.

Note

Redirects are recorded using the status code of the first redirect response (e.g. 301), not the final 200. Only genuine server-side redirects to different URLs are listed; same-URL redirects caused by URL normalization are suppressed.

When ignore_http_to_https_redirects is enabled, redirects where only the scheme changes from http to https (same host, path, and query) are also suppressed. The section header will note [http→https upgrades suppressed] when this option is active.
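The scheme-only rule can be sketched as a comparison of URL components. This is an illustration of the rule as described above, not the tool's code; is_http_to_https_upgrade is a hypothetical name, and ignoring the port when comparing hosts is an assumption.

```python
from urllib.parse import urlsplit


def is_http_to_https_upgrade(original_url, final_url):
    """Return True when the only change is the scheme, http to https."""
    src = urlsplit(original_url)
    dst = urlsplit(final_url)
    return (
        src.scheme == "http"
        and dst.scheme == "https"
        # hostname is lowercased and excludes the port (assumption:
        # a default-port change does not count as a different host).
        and src.hostname == dst.hostname
        and src.path == dst.path
        and src.query == dst.query
    )
```

Under this sketch, a redirect from http://example.com/a?x=1 to https://example.com/a?x=1 would be suppressed, while a redirect that also changes the host or path would still be listed.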

Section 7: Misplaced Assets

Only present when asset_urls is configured. Assets found outside their expected locations, grouped by asset type (Image, Document, Data, Infrastructure, Other).

Section 8: Ignore URL Matches

URLs that matched an ignore_urls prefix and were skipped entirely. Listed so site owners know which ignored URLs are still being referenced.
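Prefix matching of this kind can be sketched in a few lines. The helper name find_ignore_prefix is hypothetical, and the sketch assumes plain, case-sensitive string prefixes with no URL normalization, which may differ from how the tool compares URLs.

```python
def find_ignore_prefix(url, ignore_urls):
    """Return the first ignore_urls prefix that `url` starts with, or None."""
    for prefix in ignore_urls:
        if url.startswith(prefix):
            return prefix
    return None
```

For example, with ignore_urls set to ["https://example.com/private/"], the URL https://example.com/private/report.html would be skipped and listed in this section.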

Section 10: SSL Warnings

SSL certificate errors encountered during the crawl, grouped by domain. Crawling continues after an SSL error.

Section 11: Unvalidated Anchors

Fragment references that could not be validated because the target page’s HTML was not parsed (the page is no-crawl, beyond the depth limit, or external).

Referencing Page Truncation

In every section that lists referencing pages, at most --max-referencing-pages (default 10) pages are shown. When there are more, a note is appended:

... and N more referencing pages
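The truncation behaviour can be sketched as follows. The function name format_referencing_pages and the one-line-per-page layout are assumptions for illustration; only the limit and the wording of the trailing note come from the description above.

```python
def format_referencing_pages(pages, max_referencing_pages=10):
    """Return the lines to print for a list of referencing pages.

    At most `max_referencing_pages` entries are kept; if any were
    dropped, a trailing note states how many.
    """
    shown = list(pages[:max_referencing_pages])
    remaining = len(pages) - len(shown)
    if remaining > 0:
        shown.append(f"... and {remaining} more referencing pages")
    return shown
```

With 13 referencing pages and the default limit, the first 10 pages are listed, followed by "... and 3 more referencing pages".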