
FRAMEWORK · 14 SUB-SYSTEMS

The full audit framework we run on every engagement.

Every check, every target state, every red flag we’ve seen blow up in production. This is the working reference. Bookmark it, send it to your dev team, audit yourself against it before we even start.


§ 5.1 · 01 / 14

Crawlability & Indexation

Can search engines and AI crawlers reach every page that should be indexed, and nothing that shouldn’t?

Checks

robots.txt present, valid, not blocking critical paths
sitemap.xml referenced in robots.txt and contains only indexable URLs
Single canonical domain: all four variants (http/https × www/non-www) 301 → one canonical
Self-referencing canonical tags on indexable pages
noindex directives audited, present only on pages we want excluded
Indexation coverage (submitted vs indexed): any gap explained, not accidental
Soft 404s: zero tolerated on important pages
Orphan pages identified, then either linked or deleted
Crawl budget concentrated on valuable URLs (log-file analysis)
Rendered HTML === raw HTML for content (JavaScript rendering parity)

Red flags

Paginated archive URLs + faceted navigation eating crawl budget
Staging / preview URLs indexed by accident
noindex left over from a migration
Crawlers blocked at the WAF or CDN (common with Cloudflare defaults)
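
The robots.txt checks above can be sketched with Python's standard-library parser. This is a minimal sketch, not a full crawler: the robots.txt body, the example.com URLs, and the audit_robots helper are all hypothetical, and urllib.robotparser applies simple prefix matching, so its verdicts may differ from how individual crawlers handle wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real audit would fetch this from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /blog/

Sitemap: https://example.com/sitemap.xml
"""


def audit_robots(robots_txt: str, critical_urls: list[str], agent: str = "Googlebot") -> dict:
    """Report critical URLs the given agent may not fetch, plus declared sitemaps."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    blocked = [url for url in critical_urls if not parser.can_fetch(agent, url)]
    # site_maps() returns None when no Sitemap: line is present (Python 3.8+).
    return {"blocked_critical": blocked, "sitemaps": parser.site_maps() or []}


report = audit_robots(
    ROBOTS_TXT,
    [
        "https://example.com/",
        "https://example.com/pricing",
        "https://example.com/blog/launch-post",
    ],
)
print(report)
```

In this sample the leftover Disallow: /blog/ rule is what gets flagged, which is the kind of accidental block the first check is meant to catch.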

§ 5.2 · 02 / 14

Site Architecture & Internal Linking

Does the structure communicate topical authority and let humans + machines discover important content in ≤3 clicks?

Checks

Click depth ≤ 3 from homepage to any important page
Zero orphan pages
Zero broken internal links
Max 1-hop redirect chains
Descriptive anchor text, no ‘click here’ on important links
Breadcrumbs present + marked up with BreadcrumbList schema
Crawlable pagination, no pure infinite scroll
Evidence of coherent pillar-cluster topic structure
Mega-menu link dilution avoided, no pages with 300+ internal links
Pillar pages receive many contextual links from clusters
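
Click depth and orphan detection both fall out of one breadth-first search over the internal link graph. The sketch below assumes the graph has already been exported from a crawl; the LINKS data and the click_depths helper are hypothetical.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str = "/") -> dict[str, int]:
    """Breadth-first search from the homepage; depth = minimum clicks to reach a page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical crawl export: page -> internal links found on that page.
LINKS = {
    "/": ["/services", "/blog"],
    "/services": ["/services/audit"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/blog/post-2"],
    "/blog/post-2": ["/blog/post-3"],
    "/blog/post-3": [],
    "/old-landing": ["/"],  # links out, but nothing links to it
}

depths = click_depths(LINKS)
all_pages = set(LINKS) | {t for targets in LINKS.values() for t in targets}

orphans = sorted(all_pages - set(depths))               # unreachable from the homepage
too_deep = sorted(p for p, d in depths.items() if d > 3)  # violates the 3-click target

print("orphans:", orphans)
print("too deep:", too_deep)
```

Pages known only from the sitemap or analytics should be added to all_pages before the comparison, otherwise true orphans that link nowhere never enter the graph.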

§ 5.3 · 03 / 14

Performance & Core Web Vitals

Do pages load fast enough that humans stay and crawlers allocate budget generously?

Checks

LCP ≤ 2.5s (field data, not lab)
INP ≤ 200ms
CLS ≤ 0.1
TTFB ≤ 400ms
TBT ≤ 200ms
Field data (CrUX / GSC) used as source of truth, with lab data only for diagnosis

Red flags

Unoptimized hero images (not WebP/AVIF, not responsive, no width/height)
Render-blocking third-party scripts (chat widgets, GTM, analytics)
Font swap causing CLS
Ads or embeds injected after load pushing content down
Bloated JS bundles from drag-and-drop page builders
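
The targets in the checks above are easy to encode as a pass/fail gate. A minimal sketch, assuming the measurements have already been pulled from a field-data source; the metric key names and the sample values are hypothetical.

```python
# Targets copied from the checks above; the key names are an assumption.
THRESHOLDS = {
    "lcp_ms": 2500,
    "inp_ms": 200,
    "cls": 0.1,
    "ttfb_ms": 400,
    "tbt_ms": 200,
}


def failing_metrics(measured: dict[str, float]) -> dict[str, float]:
    """Return only the metrics that exceed their target (equal to target passes)."""
    return {
        name: value
        for name, value in measured.items()
        if name in THRESHOLDS and value > THRESHOLDS[name]
    }


# Hypothetical measurements for one URL.
sample = {"lcp_ms": 3100, "inp_ms": 180, "cls": 0.1, "ttfb_ms": 650, "tbt_ms": 150}
failures = failing_metrics(sample)
print(failures)
```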

§ 5.4 · 04 / 14

Security, HTTPS & Status Codes

Is the site served securely, consistently, and without silent errors?

Checks

SSL valid, covers all subdomains, not about to expire
All HTTP → HTTPS via 301
No mixed content warnings
HSTS enabled (Strict-Transport-Security header)
Content-Security-Policy header present and sensible
No 5xx errors in status-code crawl
No accidental X-Robots-Tag: noindex in production headers
Security headers graded A or A+ at securityheaders.com
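WAF / bot rules don’t block legitimate AI crawlers (GPTBot, ClaudeBot, PerplexityBot). Google-Extended is a robots.txt control token with no separate user agent, so it is governed in robots.txt rather than at the WAF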
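
Three of the header checks above can be automated against any captured response. This is a sketch under stated assumptions: header_findings is a hypothetical helper, it only tests for presence (not whether a policy is sensible), and the sample headers are made up.

```python
def header_findings(headers: dict[str, str]) -> list[str]:
    """Flag missing HSTS/CSP headers and an accidental noindex in X-Robots-Tag."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    findings = []
    if "strict-transport-security" not in normalised:
        findings.append("missing Strict-Transport-Security")
    if "content-security-policy" not in normalised:
        findings.append("missing Content-Security-Policy")
    if "noindex" in normalised.get("x-robots-tag", "").lower():
        findings.append("X-Robots-Tag contains noindex")
    return findings


# Hypothetical production response headers.
sample_headers = {
    "Content-Type": "text/html; charset=utf-8",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Robots-Tag": "noindex, nofollow",
}
findings = header_findings(sample_headers)
print(findings)
```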

§ 5.5 · 05 / 14

On-Page Technical & Semantic HTML

Does every page’s markup clearly communicate what it is and what it’s about?

Checks

Title tag: unique, 50–60 chars, primary keyword near front
Meta description: unique, 140–160 chars, CTA intent included
Exactly one H1 per page, describes the page topic
H2–H6: logical hierarchy, no skipped levels
Semantic HTML5 used correctly: article, section, nav, main, aside, header, footer
Descriptive alt attributes on all non-decorative images
lang attribute set on <html>
Open Graph + Twitter Card complete on shareable pages
hreflang (if multi-locale): reciprocal, self-referencing, valid codes
Canonical: self-referencing on indexable pages

Red flags

div-soup styled to look like headings, invisible to screen readers and LLM parsers (the most common failure on modern React/Vue sites)
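
Several of these checks can be run with nothing more than the standard-library HTML parser. The sketch below covers the lang attribute, the 50–60 character title target, the single-H1 rule, and skipped heading levels; the OnPageAudit class, the audit function, and the sample page are hypothetical, and it inspects raw markup only, so JavaScript-rendered pages would need their rendered HTML passed in.

```python
from html.parser import HTMLParser


class OnPageAudit(HTMLParser):
    """Collects the lang attribute, title text, and heading levels in document order."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.title = ""
        self.headings = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "html":
            self.lang = dict(attrs).get("lang")
        elif tag == "title":
            self._in_title = True
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.headings.append(int(tag[1]))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html: str) -> list[str]:
    parser = OnPageAudit()
    parser.feed(html)
    findings = []
    if not parser.lang:
        findings.append("missing lang attribute on <html>")
    title_length = len(parser.title.strip())
    if not 50 <= title_length <= 60:
        findings.append(f"title length {title_length} outside 50-60")
    if parser.headings.count(1) != 1:
        findings.append(f"expected exactly one h1, found {parser.headings.count(1)}")
    for previous, current in zip(parser.headings, parser.headings[1:]):
        if current > previous + 1:
            findings.append(f"heading level skips from h{previous} to h{current}")
    return findings


# Hypothetical page: no lang, short title, h1 -> h3 skip, and a div posing as a heading.
PAGE = (
    "<html><head><title>Short title</title></head><body>"
    "<h1>Audit</h1><h3>Details</h3><div class='heading'>Looks like a heading</div>"
    "</body></html>"
)
findings = audit(PAGE)
print(findings)
```

Note that the styled div never appears in the heading list at all, which is exactly why div-soup is invisible to parsers.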

§ 5.6 · 06 / 14

Structured Data & AI-Readiness

Can LLMs, answer engines, and rich-result pipelines parse our site into structured facts they can cite correctly?

Checks

Homepage: Organization + WebSite (with SearchAction)
Articles: Article / BlogPosting + Author (Person) + BreadcrumbList
Products: Product + Offer + AggregateRating + Review
Services: Service + Organization
FAQs (real ones only): FAQPage
HTML fully rendered server-side or statically generated
llms.txt published at root
robots.txt explicitly allows GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot (unless strategically blocked)
WAF rules don’t shadowban the above
Structured data validates cleanly in Google Rich Results Test + Schema.org validator
Entity consistency: org name, founding date, founders, product names match across schema + copy
Key facts in answer-ready formats: definitions near the top, short paragraphs, H2 questions
Monthly citation spot-check across Perplexity, ChatGPT, Claude, Gemini
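
The per-page-type schema requirements above can be turned into a coverage check over a page's JSON-LD blocks. A minimal sketch: the REQUIRED_TYPES mapping mirrors the list above, while the page-type keys, helper names, and sample markup are hypothetical. It confirms that the types are present, not that each one validates.

```python
import json

# Required schema.org types per page type, taken from the checks above.
REQUIRED_TYPES = {
    "homepage": {"Organization", "WebSite"},
    "article": {"Article", "Person", "BreadcrumbList"},
    "product": {"Product", "Offer", "AggregateRating", "Review"},
    "service": {"Service", "Organization"},
    "faq": {"FAQPage"},
}


def collect_types(node, found: set) -> set:
    """Walk a parsed JSON-LD structure and gather every @type value, nested or not."""
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):
            found.update(t for t in declared if isinstance(t, str))
        for value in node.values():
            collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)
    return found


def missing_types(page_type: str, jsonld_blocks: list[str]) -> set:
    found = set()
    for block in jsonld_blocks:
        collect_types(json.loads(block), found)
    if "BlogPosting" in found:
        found.add("Article")  # the checklist accepts either Article or BlogPosting
    return REQUIRED_TYPES[page_type] - found


# Hypothetical JSON-LD from a blog post that has an author but no breadcrumbs.
BLOCK = json.dumps({
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": "Example post",
    "author": {"@type": "Person", "name": "Example Author"},
})
missing = missing_types("article", [BLOCK])
print(missing)
```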

§ 5.7 · 07 / 14

Mobile & Responsive

Does the mobile experience match or exceed desktop, since Google indexes mobile-first?

Checks

Viewport meta tag present and correct
Tap targets ≥ 48×48px with adequate spacing
Text legible without zoom (≥ 16px body)
No horizontal scroll on any viewport ≥ 320px
Mobile serves identical content to desktop: no truncated articles, no hidden sections
Forms work on mobile keyboards (inputmode, autocomplete attrs)
Mobile CWV within range of desktop
Tested on real devices, not just Chrome DevTools

§ 5.8 · 08 / 14

Accessibility (WCAG 2.2 AA)

Can users with disabilities accomplish everything sighted/hearing/mouse users can?

Checks

Perceivable: alt text, captions, contrast ≥ 4.5:1 body / 3:1 large, text resize to 200% without loss
Operable: keyboard navigation, visible focus, skip-to-content link, no keyboard traps
Understandable: consistent navigation, clear errors, form labels correctly associated
Robust: valid HTML, ARIA used correctly and sparingly, works across assistive tech
Automated scan (axe, WAVE, Lighthouse) + manual keyboard and screen-reader testing

Red flags

Relying only on automated tools. They catch ~40% of issues. Manual keyboard + screen-reader testing is non-negotiable.
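
The contrast thresholds in the Perceivable check come from the WCAG contrast-ratio formula, which is short enough to compute directly. The sketch below follows the WCAG definitions of relative luminance and contrast ratio; the function names and the sample colours are illustrative choices.

```python
def relative_luminance(hex_colour: str) -> float:
    """WCAG relative luminance of an sRGB colour given as '#rrggbb'."""
    hex_colour = hex_colour.lstrip("#")
    channels = [int(hex_colour[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Linearise each channel. Older copies of the definition use 0.03928 as the
    # cut-off; for 8-bit colours both cut-offs give identical results.
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(foreground: str, background: str) -> float:
    """(lighter + 0.05) / (darker + 0.05), ranging from 1 to 21."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """AA targets from the check above: 4.5:1 for body text, 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


print(round(contrast_ratio("#000000", "#ffffff"), 2))
print(passes_aa("#767676", "#ffffff"))
print(passes_aa("#999999", "#ffffff"))
```

A calculation like this covers only one success criterion; it does not replace the manual keyboard and screen-reader passes.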

§ 5.9 · 09 / 14

Content Audit by Type

Different content types fail differently. Auditing them with the same rubric is malpractice.

Checks

13 content-type rubrics applied per URL (homepage, pillar, cluster, case study, service, product, pricing, blog index, landing, docs, about, contact, legal)
Per-type structured data audited
Per-type quality signals scored (E-E-A-T, last-updated, author bylines)
Inventory of which type each URL is (many sites can’t answer this)

§ 5.10 · 10 / 14

UX & Conversion Audit

At every decision point in the funnel, are we reducing friction and reinforcing intent?

Checks

Information scent: can users predict what’s behind each link?
CTA clarity: one dominant action per page?
Form friction: minimum fields, progressive disclosure?
Trust signals placed at the right decision points?
Error states clear and recoverable?
Empty states useful, not dead ends?
Loading states visible and honest?
Dark patterns absent (third-party reviewer with ethics brief)?
Session replay sample (20–50 sessions, mix desktop/mobile)
Heatmaps on top 5 conversion pages

§ 5.11 · 11 / 14

Backlink & Off-Page Audit

Who links to us, and do those links help or hurt us?

Checks

Total referring domains + DR/DA distribution
Top 20 linking domains (manual quality review)
Anchor text distribution (over-optimized anchors are a risk signal)
Toxic / spammy backlinks: disavow only when necessary and a manual action exists
Link velocity, with sudden spikes investigated
Unlinked brand mentions treated as outreach opportunities
Competitor backlink gap analysis
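
The anchor text distribution check can be approximated from any backlink export. A rough sketch: the 30% threshold, the brand term, and the sample anchors are all hypothetical, and what counts as over-optimized is a judgment call rather than a fixed rule.

```python
from collections import Counter


def anchor_shares(anchors: list[str]) -> dict[str, float]:
    """Share of all backlinks using each (case-normalised) anchor text."""
    counts = Counter(anchor.strip().lower() for anchor in anchors)
    total = sum(counts.values())
    return {text: count / total for text, count in counts.most_common()}


def over_optimised(anchors: list[str], brand_terms: set[str], max_share: float = 0.30) -> list[str]:
    """Non-brand anchors whose share exceeds max_share (an assumed threshold)."""
    return [
        text
        for text, share in anchor_shares(anchors).items()
        if share > max_share and not any(term in text for term in brand_terms)
    ]


# Hypothetical backlink export for a brand called "Acme".
ANCHORS = (
    ["Acme"] * 4
    + ["best seo audit"] * 4
    + ["https://acme.example", "this report"]
)
flagged = over_optimised(ANCHORS, brand_terms={"acme"})
print(flagged)
```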

§ 5.12 · 12 / 14

Analytics, Event & Observability

Can we see what’s happening, attribute it to a source, and detect when something breaks?

Checks

GA4 (or alt) installed on 100% of pages, verified via crawl
Event schema documented (20–40 named events, typed properties, no PII)
Key conversion events tested via DebugView
Attribution model agreed and stored with every lead
UTM parameters captured and persisted into CRM + analytics
Uptime monitoring with 99.9% SLO minimum
Client-side JS error monitoring (Sentry or equivalent)
Form submission success rate monitored with daily threshold alert
Core Web Vitals tracked over time (RUM or PageSpeed API cron)
Privacy / consent compliance: banner, DPA, PII handling

Red flags

If a lead form silently fails for 48 hours, how do you find out? If the answer is ‘a salesperson notices the drought,’ fix that before anything else.
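
The daily threshold alert on form submissions can be as small as the function below. It is a sketch, assuming attempt and success counts are already available from analytics or server logs; the 90% success floor, the minimum-attempts floor, and the check_form_health name are hypothetical.

```python
from typing import Optional


def check_form_health(
    attempts: int,
    successes: int,
    min_rate: float = 0.9,
    min_attempts: int = 10,
) -> Optional[str]:
    """Return an alert message for today's form traffic, or None if it looks healthy."""
    # Very low volume is itself a signal: a broken form may record no attempts at all.
    if attempts < min_attempts:
        return f"only {attempts} attempts today; form may be unreachable"
    rate = successes / attempts
    if rate < min_rate:
        return f"success rate {rate:.0%} below {min_rate:.0%}"
    return None


print(check_form_health(50, 48))
print(check_form_health(50, 30))
print(check_form_health(3, 3))
```

Wiring the returned message into whatever paging or chat channel the team already watches is what closes the 48-hour gap described above.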

§ 5.13 · 13 / 14

Competitive & Content Gap

Where are competitors winning that we’re not, and is it worth catching up?

Checks

Top 3–5 competitors identified (business-defined, not tool-defined)
Their top-100 organic keywords vs yours (gap report)
Their content velocity (posts / month)
Their backlink sources you don’t share
Their schema coverage vs yours
Their AI citation presence (same brand-query spot check)
Topic clusters they own that you don’t
Their site speed + UX benchmarks vs yours

§ 5.14 · 14 / 14

Log File Analysis

What are crawlers actually doing, not what we hope they’re doing?

Checks

Googlebot hit volume per day (trend)
Top 20 most-crawled URLs should be the most valuable
URLs crawled repeatedly that return errors
URLs never crawled that should be
Crawl budget wasted on parameter URLs, filters, infinite pagination
Bot distribution: Googlebot vs Bingbot vs AI crawlers (GPTBot, ClaudeBot, PerplexityBot)
Unusual user agents flagged for investigation

Red flags

Most mid-market sites skip this. Every enterprise site must do it.
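
A first pass at the checks above needs only a regular expression and a few counters. The sketch assumes access logs in the common "combined" format; the sample lines use made-up documentation IP addresses, and matching on the user-agent string alone does not prove a request really came from that crawler, so suspicious hits still need verification.

```python
import re
from collections import Counter

# Combined log format: ip ident user [time] "method path proto" status bytes "referer" "ua"
LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"$'
)
BOTS = ("Googlebot", "bingbot", "GPTBot", "ClaudeBot", "PerplexityBot")


def summarize(lines):
    """Count bot hits, bot-crawled error URLs, and bot hits on parameter URLs."""
    hits_by_bot, error_urls, parameter_hits = Counter(), Counter(), Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue
        agent = match["ua"].lower()
        bot = next((name for name in BOTS if name.lower() in agent), None)
        if bot is None:
            continue  # not one of the tracked crawlers
        hits_by_bot[bot] += 1
        if int(match["status"]) >= 400:
            error_urls[match["path"]] += 1
        if "?" in match["path"]:
            parameter_hits[bot] += 1
    return hits_by_bot, error_urls, parameter_hits


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
GPTBOT = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0)"
SAMPLE = [
    f'192.0.2.10 - - [10/Mar/2025:06:12:01 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'192.0.2.10 - - [10/Mar/2025:06:12:05 +0000] "GET /search?color=red&page=9 HTTP/1.1" 200 812 "-" "{GOOGLEBOT}"',
    f'192.0.2.10 - - [10/Mar/2025:06:12:09 +0000] "GET /old-page HTTP/1.1" 404 0 "-" "{GOOGLEBOT}"',
    f'198.51.100.7 - - [10/Mar/2025:06:13:00 +0000] "GET /blog/guide HTTP/1.1" 200 9000 "-" "{GPTBOT}"',
    '203.0.113.9 - - [10/Mar/2025:06:14:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]
hits_by_bot, error_urls, parameter_hits = summarize(SAMPLE)
print(hits_by_bot, error_urls, parameter_hits)
```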

Want this scored against your site?

Send the URL. We’ll run the framework and come back with a ranked, shippable backlog.

Book your audit →

[DEEPER · RELATED REFERENCES]

Go deeper on the sections that matter most to you.

13 content-type rubrics

AI-readiness deep dive

5-day process + prerequisites