The SaaS Website Audit Checklist: Fix These 37 Issues Before You Build a Single Link

Building links to a broken website is like pouring fuel into a car with no engine. You’ll spend the money, smell the petrol, and go absolutely nowhere.

I’ve seen this play out dozens of times. SaaS companies come to us after months of link building with another agency, frustrated that nothing’s moved. Rankings flat. Traffic stuck. Pipeline dry. And when we dig in, the answer is almost always the same: the site itself was the problem all along.

At EMGI, we believe link building sits at the end of a roadmap, not the beginning. The full buyer journey matters – from the moment someone lands on your site to the moment they sign up. If your technical foundation is cracked, even the best DR 70+ backlinks in the world won’t fix it.

This is the exact checklist we run through before we build a single link for any SaaS client.

Key Takeaways

  • Technical SEO issues silently block link building ROI — 96% of websites fail at least one Core Web Vitals assessment (Semrush, 2025)
  • A full SaaS website audit covers 37 items across 6 categories: crawlability, indexation, speed, on-page, content, and AI readiness
  • In our experience, clients who fix technical issues first see dramatically better results from the same link investment
  • AI crawler accessibility is the new audit frontier – most SaaS sites haven’t addressed it at all
  • EMGI runs a baseline audit for every client before campaign kickoff, but this checklist goes deeper for teams who want to do it themselves

Why Should You Audit Your SaaS Site Before Building Links?

Sites that meet Google’s Core Web Vitals thresholds see 24% fewer page abandonments (Google, 2025). Technical health directly determines whether your backlinks translate into rankings, traffic, and revenue – or vanish into the void.

Here’s the thing most agencies won’t tell you: link building is only as effective as the site receiving the links. I’ve watched SaaS companies spend $5,000 a month on backlinks while their site had JavaScript rendering issues that made half their pages invisible to Google. That’s not link building. That’s charity.

The worst case I’ve ever seen? A SaaS company had been paying an agency for links for six months. Good links, actually. DR 60+, relevant placements, solid anchor text diversity. The problem? Their entire website was set to noindex. Every single page. Six months of link building, zero pages in Google’s index, and precisely zero organic traffic. The agency never checked. Nobody checked. They were essentially building links to a site that, as far as Google was concerned, didn’t exist.

That’s an extreme example, but milder versions happen constantly. Canonical tag loops. Accidental robots.txt blocks on key directories. Redirect chains that bleed link equity at every hop. Your links are doing their job; your site just isn’t letting them work.

The EMGI Roadmap Approach

When a SaaS company comes to us for link building, we don’t just start firing off outreach emails. We follow a roadmap:

  1. Baseline audit — We check for the showstoppers (indexation, crawlability, major technical issues)
  2. Page-level assessment — We identify which pages are worth building links to and which need fixing first
  3. Link strategy — Only then do we design the campaign, targeting pages that are technically ready to absorb link equity
  4. Build and measure — Links go live, and because the foundation is solid, results actually show up

Most of our SaaS clients are well beyond the basics. They have competent dev teams, decent site architecture, and a working product. But even sophisticated SaaS sites have blind spots. This checklist is designed to catch them.


Category 1: Crawlability (Issues 1-7)

Google can’t rank what it can’t find. A Semrush study of 50,000+ domains found that 52% of websites have broken internal or external links, and 50% have duplicate content issues (Semrush, 2025). For SaaS sites with dynamic content, gated features, and JavaScript-heavy frontends, the risk is even higher.

1. Is Your robots.txt Configured Correctly?

Your robots.txt file is the first thing any crawler reads. If it’s misconfigured, you might be blocking Google from your most important pages without knowing it.

Check for: Accidental disallow rules on key directories, missing sitemap declarations, overly broad wildcard rules.

SaaS-specific risk: Dev or staging environments leaking into production robots.txt. We’ve seen /app/, /dashboard/, and even /pricing/ accidentally blocked.
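A quick way to test this is to run your live robots.txt through a parser and ask whether your money pages are fetchable. The sketch below uses Python's standard-library `urllib.robotparser`; the sample rules and the `KEY_PATHS` list are made up for illustration. One caveat: Python's parser applies the first matching rule, while Google uses the most specific match, so treat the result as a first pass rather than a verdict.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, agent="Googlebot"):
    """Return the paths that `agent` may not fetch under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


# Hypothetical robots.txt where a staging rule leaked into production.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /app/
Disallow: /pricing/
Sitemap: https://example.com/sitemap.xml
"""

# Hypothetical list of paths that should always be crawlable.
KEY_PATHS = ["/", "/pricing/", "/features/crm-pipeline"]

if __name__ == "__main__":
    print(blocked_paths(SAMPLE_ROBOTS, KEY_PATHS))
```

Anything this returns from your list of key pages deserves a look before a single link goes live.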

2. Do You Have a Valid XML Sitemap?

Your sitemap tells Google which pages matter and when they were last updated. No sitemap means Google is guessing.

Check for: Sitemap exists at /sitemap.xml, all important pages are included, lastmod dates are accurate (not just the date the sitemap was generated), no 404 URLs listed.
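The lastmod check is easy to script. Here is a minimal sketch, assuming a standard `<urlset>` sitemap in the sitemaps.org namespace (sitemap index files are not handled). It flags the case where every URL shares one lastmod value, which usually means the date is the generation date rather than a real modification date.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_entries(xml_text):
    """Return a list of (loc, lastmod) pairs; lastmod is None when absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod.strip() if lastmod else None))
    return entries


def suspicious_lastmod(entries):
    """True when every URL carries the same lastmod value."""
    dates = [d for _, d in entries if d]
    return len(entries) > 1 and len(dates) == len(entries) and len(set(dates)) == 1


# Hypothetical sitemap used only to demonstrate the check.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2025-11-02</lastmod></url>
  <url><loc>https://example.com/pricing/</loc><lastmod>2025-11-02</lastmod></url>
</urlset>"""

if __name__ == "__main__":
    print(suspicious_lastmod(sitemap_entries(SAMPLE_SITEMAP)))
```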

3. Are All Key Pages Crawlable?

Use Screaming Frog or Sitebulb to crawl your site. Compare the pages found against the pages you expect to be found. The gap is your crawlability problem.

SaaS-specific risk: Feature pages rendered via JavaScript frameworks (React, Next.js, Vue) may appear blank to Googlebot if they’re not server-side rendered.

4. Are There Orphan Pages?

Orphan pages have no internal links pointing to them. Google might not discover them at all, and even if it does, the lack of internal links signals low importance.

Look for: Pages that exist in your CMS but aren’t linked from anywhere on the site. Case studies and integration pages are the usual suspects for SaaS.
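Finding orphans comes down to a set difference: every URL you know about, minus every URL that receives at least one internal link. A rough sketch, assuming you have exported both lists from your CMS (or sitemap) and your crawler; the function name and sample data are illustrative only.

```python
def find_orphans(known_urls, internal_links):
    """Return known URLs that no other page links to.

    known_urls: every URL in the CMS or sitemap.
    internal_links: {source_url: [target_url, ...]} from a crawl export.
    """
    linked = {
        target
        for source, targets in internal_links.items()
        for target in targets
        if target != source  # a page linking to itself does not count
    }
    return sorted(set(known_urls) - linked)


# Hypothetical data for illustration.
KNOWN = ["/", "/pricing/", "/case-studies/acme/"]
LINKS = {"/": ["/pricing/"], "/pricing/": ["/"]}

if __name__ == "__main__":
    print(find_orphans(KNOWN, LINKS))
```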

5. Are Redirect Chains Under Control?

Each 301 redirect in a chain dilutes some link equity. A chain of three redirects can lose 10-15% of the original link value before it reaches the final URL.

Red flags: Chains of more than one redirect, 302 (temporary) redirects that should be 301 (permanent), redirect loops.
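Those three red flags can be checked mechanically once you have a redirect export from your crawler. The sketch below works on a plain dictionary rather than live HTTP requests, so the `{url: (status, target)}` structure is an assumption about how you would shape that export.

```python
def trace_redirects(start, redirects, max_hops=10):
    """Follow `start` through a {url: (status, target)} map.

    Returns (hops, final_url, problems); hops is a list of
    (status, from_url, to_url) tuples in the order they were followed.
    """
    hops, problems = [], []
    seen, url = {start}, start
    while url in redirects and len(hops) < max_hops:
        status, target = redirects[url]
        hops.append((status, url, target))
        if status == 302:
            problems.append(f"temporary redirect at {url}")
        if target in seen:
            problems.append(f"redirect loop back to {target}")
            break
        seen.add(target)
        url = target
    if len(hops) > 1:
        problems.append(f"chain of {len(hops)} redirects")
    return hops, url, problems


# Hypothetical export: an old HTTP URL that hops twice before landing.
SAMPLE_REDIRECTS = {
    "http://example.com/old": (301, "https://example.com/old"),
    "https://example.com/old": (302, "https://example.com/new"),
}

if __name__ == "__main__":
    print(trace_redirects("http://example.com/old", SAMPLE_REDIRECTS))
```

The usual fix is to point the first URL straight at the final destination with a single 301.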

6. Is Your Crawl Budget Being Wasted?

SaaS sites often have thousands of parameter-driven URLs, filtered views, or paginated listing pages that consume crawl budget without adding SEO value.

Check for: Excessive URL parameters, infinite scroll pagination without crawlable page URLs or canonical handling, faceted navigation creating duplicate URLs.

7. Are Internal Links Passing Equity Effectively?

Internal links distribute authority across your site. If your most important pages are buried four clicks deep, they’re not getting the equity they deserve.

Check for: Key pages (pricing, features, comparison pages) within three clicks of the homepage. Use a flat, hub-and-spoke architecture. Link from high-authority blog content to conversion pages.
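Click depth is a breadth-first search over your internal link graph. Here is a small sketch, assuming the graph comes from a crawl export shaped as `{page: [linked pages]}`; the three-click limit mirrors the guideline above, and pages the search never reaches are treated as too deep.

```python
from collections import deque


def click_depths(links, home="/"):
    """Return {page: minimum clicks from `home`} for every reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links, key_pages, home="/", max_clicks=3):
    """Return key pages that sit more than `max_clicks` from the homepage."""
    depths = click_depths(links, home)
    return [p for p in key_pages if depths.get(p, float("inf")) > max_clicks]


# Hypothetical link graph: pricing is only reachable four clicks in.
SAMPLE_LINKS = {
    "/": ["/blog/"],
    "/blog/": ["/blog/post/"],
    "/blog/post/": ["/resources/"],
    "/resources/": ["/pricing/"],
}

if __name__ == "__main__":
    print(too_deep(SAMPLE_LINKS, ["/pricing/", "/blog/"]))
```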


Category 2: Indexation (Issues 8-13)

Even if Google can crawl your pages, it doesn’t mean they’re indexed. Ahrefs data shows that 66.31% of published content gets zero search traffic — and a huge chunk of that is down to poor indexing, not poor content (Ahrefs, 2025). Meanwhile, 35% of sites have duplicate title tags and 25% have no meta descriptions at all (Semrush, 2025).

8. Are Your Pages Actually Indexed?

Run a site:yourdomain.com search in Google. Compare the count to the pages in your sitemap. A big gap means indexation problems.

Pro tip: Check Google Search Console’s “Pages” report. It will tell you exactly which pages are indexed and which aren’t, and why.

9. Is Anything Accidentally Set to Noindex?

This is the silent killer. A <meta name="robots" content="noindex"> tag on the wrong page will keep it out of Google entirely, no matter how many links point to it.

SaaS-specific risk: CMS platforms sometimes apply noindex to new pages by default. If your marketing team publishes a page and forgets to flip the switch, it’s invisible to search. I’ve also seen this happen after site migrations — staging environment noindex tags getting pushed to production.

Remember that story I mentioned earlier about the company paying for links to a fully noindexed site? Always check this first. It takes 30 seconds and could save you months of wasted budget.
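If you would rather script the 30-second check, the sketch below scans a page's HTML for a robots meta tag using Python's built-in `html.parser`. It only looks at meta tags; a noindex sent through the `X-Robots-Tag` HTTP header would need a separate check of the response headers, which this example does not cover.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def is_noindexed(html):
    """True if the page carries a noindex (or the equivalent 'none') directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives or "none" in parser.directives


if __name__ == "__main__":
    print(is_noindexed('<head><meta name="robots" content="noindex, follow"></head>'))
```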

10. Are Canonical Tags Correct?

Canonical tags tell Google which version of a page is the “real” one. Wrong canonicals can deindex pages you actually want ranked.

Verify: Self-referencing canonicals on all important pages, no pages canonicalising to a different URL by mistake, HTTP/HTTPS and www/non-www consistency.
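The self-referencing check is a strict string comparison between the URL you crawled and the canonical it declares. A minimal sketch, assuming you already have the page HTML in hand; the exact-match rule is deliberate, because it also surfaces HTTP/HTTPS and www/non-www mismatches.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collect the href of every <link rel="canonical"> on the page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attr = {k: (v or "") for k, v in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonicals.append(attr.get("href", ""))


def canonical_issue(page_url, html):
    """Return a description of the problem, or None if the canonical is self-referencing."""
    parser = CanonicalParser()
    parser.feed(html)
    if not parser.canonicals:
        return "missing canonical"
    if len(parser.canonicals) > 1:
        return "multiple canonicals"
    if parser.canonicals[0] != page_url:
        return f"canonical points to {parser.canonicals[0]}"
    return None


if __name__ == "__main__":
    html = '<link rel="canonical" href="http://example.com/pricing">'
    print(canonical_issue("https://example.com/pricing", html))
```

A canonical that points elsewhere is not always a mistake, so review each flagged page rather than bulk-fixing.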

11. Is Duplicate Content Under Control?

SaaS sites often generate duplicates through parameter variations, session IDs, and tracking URLs. Google picks one version and ignores the rest — hopefully the right one.

Test with: Siteliner or Screaming Frog’s duplicate content check. Look for URL parameter variations creating multiple versions of the same page, and HTTP/HTTPS versions both accessible.
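To see how many crawled URLs collapse into the same page, normalise each one and group the results. The sketch below strips a hand-picked list of tracking parameters and treats HTTP/HTTPS and www/non-www as the same page; both choices are assumptions you should adjust to your own site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that never change page content.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "gclid", "fbclid", "sessionid",
}


def normalise(url):
    """Collapse scheme, www prefix, and tracking parameters into one form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(("https", host, parts.path or "/", urlencode(sorted(query)), ""))


def duplicate_groups(urls):
    """Return {normalised_url: [original urls]} for groups with more than one member."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalise(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical crawl output.
SAMPLE_URLS = [
    "http://www.example.com/pricing?utm_source=newsletter",
    "https://example.com/pricing",
    "https://example.com/features",
]

if __name__ == "__main__":
    print(duplicate_groups(SAMPLE_URLS))
```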

12. Are Pagination and Filtered Views Handled?

If your SaaS has listing pages, resource libraries, or directories, pagination can create indexation bloat.

Check for: Proper canonical tags on paginated series, noindex on filtered/sorted variations, clean URL structure without excessive parameters.

13. Is Your Hreflang Implementation Correct (If Applicable)?

If your SaaS serves multiple markets or languages, hreflang tells Google which version to show to which audience. Incorrect implementation is one of the most common international SEO errors.

Check for: Valid language and region codes, reciprocal hreflang tags (page A points to page B and vice versa), self-referencing hreflang on every page.
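Reciprocity and self-reference are simple to verify once the hreflang annotations are in one structure. A sketch under the assumption that you have extracted them into `{page_url: {language_code: alternate_url}}`; it does not validate the language or region codes themselves.

```python
def hreflang_problems(hreflang):
    """Return (page, problem) pairs for missing self-references and return tags.

    hreflang: {page_url: {language_code: alternate_url}}
    """
    problems = []
    for page, alternates in hreflang.items():
        if page not in alternates.values():
            problems.append((page, "no self-referencing hreflang"))
        for lang, alt in alternates.items():
            if alt == page:
                continue
            # The alternate must list this page among its own annotations.
            if page not in hreflang.get(alt, {}).values():
                problems.append((page, f"{alt} ({lang}) does not link back"))
    return problems


# Hypothetical annotations: the German page forgets to point back to English.
EN = "https://example.com/"
DE = "https://example.com/de/"
SAMPLE_HREFLANG = {
    EN: {"en": EN, "de": DE},
    DE: {"de": DE},
}

if __name__ == "__main__":
    print(hreflang_problems(SAMPLE_HREFLANG))
```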


Category 3: Site Speed and Core Web Vitals (Issues 14-19)

If you’ve made it through the first 13 items and found a few issues, don’t panic. That’s normal. Every SaaS site we’ve ever audited has had at least a handful of crawlability or indexation problems — the question is whether they’re dealbreakers or just inefficiencies. Now let’s talk about speed, because this is where things start affecting your bottom line directly.

Only 53% of websites pass all three Core Web Vitals metrics as of September 2025 (Chrome UX Report, 2025). Google’s own data shows that a 1-second delay in mobile load time can reduce conversions by up to 20% (Think with Google, 2025). For SaaS companies where every signup matters, this section is non-negotiable.

14. Do You Pass Core Web Vitals?

The three metrics that matter:

| Metric | Good | Needs Improvement | Poor |
| --- | --- | --- | --- |
| LCP (Largest Contentful Paint) | < 2.5s | 2.5-4s | > 4s |
| INP (Interaction to Next Paint) | < 200ms | 200-500ms | > 500ms |
| CLS (Cumulative Layout Shift) | < 0.1 | 0.1-0.25 | > 0.25 |

Check with: Google PageSpeed Insights (lab + field data), Chrome UX Report, or Google Search Console’s Core Web Vitals report.
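If you pull field data for many URLs, it helps to rate each value automatically. The sketch below encodes the thresholds from the table above; treating a value that sits exactly on a boundary as the better rating is an assumption on our part.

```python
# (good_limit, poor_limit) per metric: LCP in seconds, INP in milliseconds, CLS unitless.
THRESHOLDS = {
    "LCP": (2.5, 4.0),
    "INP": (200, 500),
    "CLS": (0.1, 0.25),
}


def rate(metric, value):
    """Return 'Good', 'Needs Improvement', or 'Poor' for one metric value."""
    good_limit, poor_limit = THRESHOLDS[metric]
    if value <= good_limit:
        return "Good"
    if value <= poor_limit:
        return "Needs Improvement"
    return "Poor"


def passes_core_web_vitals(values):
    """True only when all three metrics rate as Good."""
    return all(rate(metric, values[metric]) == "Good" for metric in THRESHOLDS)


if __name__ == "__main__":
    page = {"LCP": 2.1, "INP": 350, "CLS": 0.05}  # hypothetical field data
    print({metric: rate(metric, value) for metric, value in page.items()})
    print(passes_core_web_vitals(page))
```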

15. Is Your Largest Contentful Paint Under 2.5 Seconds?

LCP measures how fast your main content loads. For SaaS homepages with hero images, product screenshots, or embedded videos, LCP is often the hardest metric to pass.

Google’s data shows the probability of bounce increases 32% as page load goes from 1 to 3 seconds, and a staggering 123% from 1 to 10 seconds (Think with Google, 2025).

Quick wins: Compress hero images, use WebP/AVIF format, preload critical assets, implement a CDN.

16. Is JavaScript Blocking Rendering?

SaaS sites love JavaScript. Login flows, interactive demos, pricing calculators — all great for users, potentially disastrous for SEO if they block rendering. And it gets worse: 33.9% of SEOs say that none of the LLM crawlers (ChatGPT, Perplexity, etc.) render JavaScript at all (Sitebulb, 2025). If your content depends on JS to load, AI search can’t see it.

Check for: Render-blocking scripts in the <head>, main content behind JavaScript execution, time to first meaningful paint over 3 seconds. Use Lighthouse to identify specific blocking resources.

17. Are Images Optimised?

Large, uncompressed images are the most common page speed killer. Every SaaS site has product screenshots and feature images that could be half the file size.

Red flags: Images over 200KB, missing width/height attributes (causes CLS), images served in PNG when WebP/AVIF would do, missing lazy loading on below-the-fold images.
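Two of those red flags live in the HTML itself, so they can be caught without downloading a single image. A rough sketch using the standard-library parser; it cannot tell which images are above the fold, so a hero image flagged as "not lazy-loaded" is usually fine and should be left alone.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Record <img> tags missing dimensions or lazy loading."""

    def __init__(self):
        super().__init__()
        self.issues = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src") or "(no src)"
        if "width" not in attr or "height" not in attr:
            self.issues.append((src, "missing width/height"))
        if attr.get("loading") != "lazy":
            # Expected for above-the-fold images; review manually.
            self.issues.append((src, "not lazy-loaded"))


def audit_images(html):
    parser = ImageAudit()
    parser.feed(html)
    return parser.issues


if __name__ == "__main__":
    sample = (
        '<img src="/hero.png" width="1200" height="600" loading="lazy">'
        '<img src="/screenshot.png">'
    )
    print(audit_images(sample))
```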

18. Is Your Hosting Up to the Job?

Cheap shared hosting creates a speed ceiling you can’t optimise past. SaaS companies should be on dedicated, VPS, or cloud hosting with edge caching.

Check for: Time to First Byte (TTFB) under 200ms, consistent uptime (99.9%+), CDN enabled, HTTP/2 or HTTP/3 support.

19. Is Your CSS and JS Minified?

Unminified code adds unnecessary kilobytes to every page load. This is the lowest-hanging fruit in speed optimisation.

Check for: Minified CSS and JS in production, unused CSS removed (especially from theme or framework defaults), critical CSS inlined in the <head>.


Category 4: On-Page SEO (Issues 20-27)

On-page SEO is where your content meets Google’s understanding of what it’s about. With 35% of sites running duplicate title tags and 25% missing meta descriptions entirely (Semrush, 2025), the basics still trip up even experienced SaaS teams. These are free wins that compound the value of every link you build.

20. Does Every Page Have a Unique Title Tag?

Title tags should be 50-60 characters, include the primary keyword, and be unique across the entire site. Duplicate titles confuse Google about which page to rank.

SaaS-specific tip: Feature pages often get generic titles like “Features – [Brand].” Be specific: “CRM Pipeline Management | [Brand]” is far better.
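Both checks, length and uniqueness, are quick to run against a crawl export. A minimal sketch, assuming a `{url: title}` mapping; the 50-60 character window comes from the guideline above and counts characters, not pixels.

```python
from collections import defaultdict


def title_issues(titles, min_len=50, max_len=60):
    """Return (url_or_urls, problem) pairs for length and duplication issues.

    titles: {url: title text}
    """
    issues = []
    by_title = defaultdict(list)
    for url, title in titles.items():
        text = title.strip()
        by_title[text.lower()].append(url)
        if not min_len <= len(text) <= max_len:
            issues.append((url, f"length {len(text)}"))
    for urls in by_title.values():
        if len(urls) > 1:
            issues.append((tuple(sorted(urls)), "duplicate title"))
    return issues


# Hypothetical crawl export with a generic title reused on two pages.
SAMPLE_TITLES = {
    "/features/pipeline": "Features - Brand",
    "/features/reporting": "Features - Brand",
}

if __name__ == "__main__":
    for issue in title_issues(SAMPLE_TITLES):
        print(issue)
```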

21. Are Meta Descriptions Written for Humans?

Meta descriptions don’t directly affect rankings, but they affect click-through rate — which does. A compelling meta description is your ad copy in the SERPs.

Check for: 150-160 characters, includes primary keyword naturally, has a clear value proposition or call to action, not just a repeat of the title tag.

22. Is Your Heading Hierarchy Clean?

One H1 per page. H2s for main sections. H3s for subsections. Never skip levels (no H2 to H4 jumps). This structure helps both Google and screen readers understand your content.

Watch out for: Multiple H1s on the same page (common in template-driven SaaS sites), headings used for styling rather than structure, missing H1s entirely.
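The hierarchy rules translate directly into code: count the H1s, then look for any heading that jumps more than one level deeper than the one before it. A sketch using the standard-library HTML parser; it checks structure only and cannot tell whether a heading was chosen for styling.

```python
from html.parser import HTMLParser


class HeadingParser(HTMLParser):
    """Collect heading levels (1-6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_issues(html):
    parser = HeadingParser()
    parser.feed(html)
    levels = parser.levels
    issues = []
    h1_count = levels.count(1)
    if h1_count == 0:
        issues.append("missing H1")
    elif h1_count > 1:
        issues.append(f"{h1_count} H1s")
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            issues.append(f"skips from H{previous} to H{current}")
    return issues


if __name__ == "__main__":
    print(heading_issues("<h1>A</h1><h2>B</h2><h4>C</h4><h1>D</h1>"))
```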

23. Are Images Using Descriptive Alt Text?

Alt text helps Google understand image content and improves accessibility. Every image should have descriptive alt text that naturally includes relevant keywords where appropriate.

Check for: Missing alt text on any image, alt text that’s just the filename (“IMG_4521.jpg”), keyword-stuffed alt text.
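These three patterns can be screened with a few lines of code. The heuristics below are our own rough assumptions (a filename regex and a "same long word three times" rule for stuffing), so expect some false positives and review the output by hand.

```python
import re

FILENAME_RE = re.compile(r"^[\w\-]+\.(jpe?g|png|gif|webp|avif|svg)$", re.IGNORECASE)


def alt_text_issue(alt):
    """Return a problem description for one alt value, or None if it looks fine.

    alt: the attribute value, or None when the attribute is absent.
    """
    if alt is None:
        return "missing alt attribute"
    text = alt.strip()
    if not text:
        return "empty alt text (fine only for decorative images)"
    if FILENAME_RE.match(text):
        return "alt text is a filename"
    words = re.findall(r"[a-z0-9]+", text.lower())
    if any(len(w) > 3 and words.count(w) >= 3 for w in set(words)):
        return "possible keyword stuffing"
    return None


if __name__ == "__main__":
    for sample in [None, "IMG_4521.jpg", "Pipeline view showing deals by stage"]:
        print(repr(sample), "->", alt_text_issue(sample))
```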

24. Is Structured Data Implemented?

Schema markup helps Google understand your content type and can trigger rich results. For SaaS, the most valuable types are Organization, Product, FAQ, Review, and BreadcrumbList.

Verify: JSON-LD schema on the homepage (Organization), FAQ schema on resource pages, BreadcrumbList on all pages, no validation errors in Google’s Rich Results Test.
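For reference, here is roughly what Organization markup looks like when generated from code. Every value in the example call (company name, URLs, profile links) is a placeholder, and the property set is a minimal starting point rather than a complete schema; run the output through Google's Rich Results Test before shipping it.

```python
import json


def organization_jsonld(name, url, logo, same_as):
    """Return a <script> tag containing Organization JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
        "sameAs": list(same_as),  # official profiles on other sites
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(
        organization_jsonld(
            name="Example SaaS",
            url="https://example.com/",
            logo="https://example.com/logo.png",
            same_as=["https://www.linkedin.com/company/example-saas"],
        )
    )
```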

25. Are Internal Links Using Descriptive Anchor Text?

“Click here” and “learn more” tell Google nothing. Internal link anchor text should describe the destination page’s topic.

Look for: Generic anchor text on key internal links, orphan pages with no internal links, over-optimised exact-match anchor text (natural variation is better).

26. Is Your URL Structure Clean and Logical?

URLs should be short, descriptive, and follow a logical hierarchy. /features/crm-pipeline is better than /page?id=4521&category=features.

SaaS-specific risk: Dynamic URLs from SaaS dashboards or app pages leaking into search results. Ensure non-public pages are blocked via robots.txt or noindex.

27. Are External Links Functional and Relevant?

Broken outbound links are a trust signal problem. They tell Google (and users) that your content isn’t maintained.

Check for: Broken external links (404s, timeouts), links to low-quality or spammy sites, nofollow on sponsored or affiliate links.


Category 5: Content Quality and Gaps (Issues 28-33)

Here’s where the audit shifts from “can Google see your site?” to “is what Google sees actually worth ranking?” You’d be surprised how many SaaS companies pass all the technical checks and still can’t rank because their content doesn’t match what their buyers are actually searching for.

Content quality is the engine that turns link equity into rankings. Google’s December 2025 Core Update doubled down on E-E-A-T signals, and 76.4% of top-cited pages in AI systems were updated within the last 30 days (Digitaloft, 2026). Stale content doesn’t just underperform — it actively holds back your other pages.

28. Is Your Content Matching Search Intent?

Every page should target a specific keyword with clear intent alignment. A page targeting “CRM pricing” should show pricing, not a blog post about pricing strategy.

Check for: Mismatch between keyword intent (informational vs. commercial vs. transactional) and page content, pages trying to rank for multiple unrelated keywords.

29. Are Bottom-of-Funnel Pages Strong Enough?

This is where EMGI’s revenue-first philosophy comes in. Most SaaS companies over-invest in top-of-funnel blog content and neglect the pages that actually convert: comparison pages, feature pages, pricing pages, and use-case pages.

The full buyer journey matters. A prospect who finds your comparison page via organic search is far more likely to convert than someone reading a generic “What is [category]?” blog post. Build links to these pages, not just your blog.

Check for: Missing comparison pages (“Your Product vs. Competitor”), thin feature pages, no use-case specific landing pages, pricing page not optimised for search.

30. Is There Duplicate or Cannibalising Content?

If two pages target the same keyword, they compete against each other. Google picks one, and it might not be the one you want.

Check for: Multiple pages targeting the same primary keyword, blog posts that overlap with service pages, old content that was never consolidated during a rebrand or migration.

31. Is Content Fresh and Up to Date?

Pages with dates from 2023 or 2024 in their titles signal to both Google and users that the information might be outdated. SaaS moves fast — your content should too.

Warning signs: Year references in titles or content that need updating, statistics citing pre-2025 data, product screenshots showing old UI.

32. Are Author Bios and E-E-A-T Signals Present?

Google wants to know who wrote your content and why they’re qualified. A named author with credentials beats “Admin” or “Team” every time.

Check for: Named author on every post, author bio with relevant experience, author schema markup, links to author’s LinkedIn or professional profiles.

33. Are Case Studies and Social Proof Optimised?

Case studies are conversion powerhouses AND link magnets. But they need to be findable in search, not buried behind a form gate.

Quick wins: Make sure case studies are indexed and optimised for search (not gated behind a form), include specific metrics in titles and meta descriptions, feature client logos and named results on service pages.


Category 6: AI Crawler Accessibility (Issues 34-37)

This is the section no one else is talking about yet. In 2026, 73.2% of SEO experts believe backlinks influence whether a brand appears in AI Search Overviews (Editorial.Link, 2025). But here’s the catch: if AI crawlers can’t access your content, you’re invisible to ChatGPT, Perplexity, Google AI Overviews, and every other AI-powered search tool.

Brand mentions now have a stronger correlation (0.664) with AI Overview appearances than backlinks alone (0.218) according to Position.digital (2026). But those mentions only matter if the AI can actually read your pages.

This is the new frontier of website auditing, and most SaaS companies haven’t even started thinking about it.

34. Does Your robots.txt Address AI Crawlers?

Beyond Googlebot, you now need rules for GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot, Google-Extended (Gemini), and CCBot (Common Crawl). GPTBot alone is blocked on approximately 5.6 million websites — a 70% increase from July 2025 (The Register / Tollbit, Dec 2025). ClaudeBot is blocked on 5.8 million sites.

Check for: Explicit allow/disallow rules for AI crawlers in robots.txt. Most SaaS sites have no rules at all, which means they’re leaving AI visibility entirely to chance. Only 21% of the top 1,000 websites even have GPTBot rules in their robots.txt (Paul Calvano, 2025). Decide whether you want AI systems citing your content (hint: for SaaS companies, you almost certainly do) and set rules accordingly.
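To see where you stand today, check two things: which AI crawlers can currently fetch your homepage, and which ones your robots.txt never mentions by name. A sketch using Python's `urllib.robotparser`; the sample rules are invented, and the agent list simply mirrors the crawlers named above.

```python
from urllib.robotparser import RobotFileParser

AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"]


def ai_crawler_access(robots_txt, path="/"):
    """Return {agent: True/False} for whether each AI crawler may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in AI_AGENTS}


def agents_without_rules(robots_txt):
    """Return AI crawlers that have no User-agent group of their own."""
    named = {
        line.split(":", 1)[1].strip().lower()
        for line in robots_txt.splitlines()
        if line.strip().lower().startswith("user-agent:")
    }
    return [agent for agent in AI_AGENTS if agent.lower() not in named]


# Hypothetical robots.txt: one AI crawler blocked, one allowed, the rest unaddressed.
SAMPLE_AI_ROBOTS = """\
User-agent: *
Disallow: /app/

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Allow: /
"""

if __name__ == "__main__":
    print(ai_crawler_access(SAMPLE_AI_ROBOTS))
    print(agents_without_rules(SAMPLE_AI_ROBOTS))
```

Crawlers with no group of their own fall back to the `*` rules, which is the "left to chance" scenario described above.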

35. Do You Have an llms.txt File?

llms.txt is an emerging standard — think of it as sitemap.xml but specifically for large language models. It tells AI systems what your site is about, what services you offer, and which pages are most important to understand and cite.

Ask yourself: Does /llms.txt exist? Does it describe your business clearly? Does it list your key pages? This is first-mover territory — very few SaaS sites have implemented it, but it’s a quick win for AI visibility.
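As a starting point, the sketch below renders a small llms.txt in the Markdown layout used by the public llms.txt proposal as we understand it: a title, a one-line summary in a blockquote, then sections of annotated links. The standard is still emerging, so treat the layout as an assumption, and note that the company name, URLs, and descriptions are placeholders.

```python
def render_llms_txt(name, summary, sections):
    """Render llms.txt content.

    sections: {heading: [(link_title, url, note), ...]}
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        lines += [f"- [{title}]({url}): {note}" for title, url, note in links]
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Placeholder content for illustration only.
    print(
        render_llms_txt(
            name="Example SaaS",
            summary="CRM pipeline management software for B2B sales teams.",
            sections={
                "Product": [
                    ("Pricing", "https://example.com/pricing/", "Plans and costs"),
                    ("Features", "https://example.com/features/", "Core capabilities"),
                ],
            },
        )
    )
```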

36. Is Your Content Server-Side Rendered for AI Bots?

AI crawlers are even less capable of executing JavaScript than Googlebot. If your content loads via client-side JavaScript, most AI systems will see a blank page.

How to check: View your page source (right-click > View Source, not Inspect Element). If the main content is missing from the raw HTML, AI crawlers can’t see it. Server-side rendering (SSR) or static site generation (SSG) is essential for AI discoverability.
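The same view-source test can be automated: take the raw HTML exactly as the server returns it, ignore anything inside script tags, and confirm your key phrases are still there. A minimal sketch; fetching the HTML is left out, and the phrases you test for are your own choice.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that exists in the raw HTML outside script-like tags."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def phrases_missing_from_raw_html(html, phrases):
    """Return the phrases that a crawler without JavaScript would not see."""
    parser = VisibleText()
    parser.feed(html)
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [p for p in phrases if p.lower() not in text]


if __name__ == "__main__":
    client_rendered = (
        '<div id="root"></div>'
        '<script>render("Pipeline management for sales teams")</script>'
    )
    print(phrases_missing_from_raw_html(client_rendered, ["Pipeline management"]))
```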

37. Is Your Content Structured for AI Citation?

AI systems extract answers from structured, well-formatted content. If your answers are buried in the middle of long paragraphs, they won’t be cited.

Check for: Answer-first formatting (key information at the start of each section), clear H2/H3 hierarchy, comparison tables with proper <thead> markup, FAQ sections with question-and-answer format, statistics with inline source attribution.

“Best X” listicles make up 43.8% of all page types cited in ChatGPT responses (Position.digital, 2026). Structured, citable content isn’t optional anymore; it’s a ranking factor in a new kind of search.


How Does This Checklist Fit Into the Link Building Process?

At EMGI, every link building engagement starts with a version of this audit. We’re not going to take your money and start blasting outreach emails while your site has fundamental problems that would waste the investment.

That said, most of the SaaS companies we work with are well past the basics. They have good dev teams, solid products, and generally competent websites. They don’t need someone to tell them to install an SSL certificate.

What they do need is someone to catch the blind spots. The canonical tag that’s been pointing to the wrong URL for six months. The blog section that’s accidentally set to noindex after a CMS update. The JavaScript-rendered feature page that Googlebot can’t see. The complete absence of AI crawler rules at a time when AI search is reshaping how buyers discover SaaS products.

That’s where this checklist earns its keep. Run through it before your next link building campaign — whether with us or anyone else. The links will work harder, rank faster, and deliver actual revenue.

Because at the end of the day, link building isn’t about building links. It’s about building revenue. And revenue starts with a site that’s technically ready to receive it.


The Quick-Reference Checklist

Crawlability (1-7)

  • [ ] robots.txt configured correctly
  • [ ] Valid XML sitemap submitted
  • [ ] All key pages crawlable
  • [ ] No orphan pages
  • [ ] Redirect chains resolved
  • [ ] Crawl budget not wasted on low-value URLs
  • [ ] Internal links distributing equity to key pages

Indexation (8-13)

  • [ ] Key pages confirmed indexed in Google
  • [ ] No accidental noindex tags
  • [ ] Canonical tags pointing correctly
  • [ ] Duplicate content resolved
  • [ ] Pagination and filtered views handled
  • [ ] Hreflang correct (if multi-language)

Speed and Core Web Vitals (14-19)

  • [ ] Core Web Vitals passing (LCP < 2.5s, INP < 200ms, CLS < 0.1)
  • [ ] LCP specifically under threshold
  • [ ] JavaScript not blocking rendering
  • [ ] Images optimised (WebP/AVIF, lazy loading, width/height set)
  • [ ] Hosting adequate (TTFB < 200ms)
  • [ ] CSS and JS minified

On-Page SEO (20-27)

  • [ ] Unique title tags on every page (50-60 chars)
  • [ ] Meta descriptions written for CTR (150-160 chars)
  • [ ] Clean heading hierarchy (one H1, logical H2/H3)
  • [ ] Descriptive image alt text
  • [ ] Structured data implemented (Organization, FAQ, Breadcrumb)
  • [ ] Internal links using descriptive anchor text
  • [ ] Clean, logical URL structure
  • [ ] External links functional and relevant

Content Quality (28-33)

  • [ ] Content matches search intent per page
  • [ ] Bottom-of-funnel pages strong and link-worthy
  • [ ] No keyword cannibalisation
  • [ ] Content fresh and up to date
  • [ ] Author bios and E-E-A-T signals present
  • [ ] Case studies optimised and indexable

AI Crawler Accessibility (34-37)

  • [ ] robots.txt addresses AI crawlers (GPTBot, ClaudeBot, etc.)
  • [ ] llms.txt file created and deployed
  • [ ] Content server-side rendered for AI bots
  • [ ] Content structured for AI citation (answer-first, tables, FAQs)

FAQs

What Is a SaaS Website Audit?

A SaaS website audit is a systematic review of your site’s technical health, on-page SEO, content quality, and (increasingly) AI readiness. It identifies issues that prevent your pages from ranking, converting, or being cited by AI search tools. For SaaS companies that rely on organic acquisition for pipeline, an audit should happen before any link building investment.

How Long Does a Website Audit Take?

A thorough SaaS website audit typically takes 2-5 days depending on site size. A basic crawl-and-check can be done in a few hours, but the deeper analysis — content gap assessment, competitor benchmarking, AI readiness review — requires more time. At EMGI, our baseline audit is included as part of every link building engagement.

Should I Audit Before or After Building Links?

Before. Always before. Building links to a site with technical problems is like advertising a restaurant with a broken front door. The links might bring people to your doorstep, but they can’t get in. Fix the foundation first, then invest in amplification.

What Tools Do I Need for a SaaS Website Audit?

The core toolkit: Screaming Frog (crawling), Google Search Console (indexation), Google PageSpeed Insights (speed), Ahrefs or Semrush (backlinks and keywords), and Chrome DevTools (JavaScript rendering). For AI readiness, you’ll need to manually check robots.txt rules and test server-side rendering by viewing page source.

How Often Should I Re-Audit?

Quarterly for the full checklist. Monthly spot-checks on indexation and Core Web Vitals using Search Console. After every major site update, migration, or CMS change. SaaS sites change constantly — your audit cadence should match.


Matt Shirley is the founder of EMGI Group, a SaaS link building agency based in the UK. He’s built links for 30+ SaaS companies across Europe, North America, Asia, and Australia, and believes that every link should drive revenue, not just rankings. Connect on LinkedIn.