
AI Visibility Audit for SaaS: The 12-Point Framework We Use on Every Client

Editorial note. This is an opinion piece based on running AI visibility audits for SaaS clients at EMGI Group. The framework is the one I use live on strategy calls. Numbers are from EMGI proposals and third-party sources (cited inline). Nothing here is gatekept. If you want to run the audit yourself without engaging an agency, you can, and the instructions are in the post.

A SaaS marketing lead messaged me last month. Good Series A product, $7 million raised, decent SEO numbers on paper. 22,000 monthly organic visits, DR 52, the usual dashboard wins. She was pleased until a board member asked her, in the nicest way possible, why the company didn’t come up when he asked ChatGPT “what’s the best tool for X” at the weekend. She checked for herself. She wasn’t in the answer. Neither was her top competitor. Three enterprise incumbents were, two of which had weaker products than hers and one of which was honestly just a bigger brand with a dusty blog.

That conversation is why this post exists. Your buyer has moved part of the evaluation stage into ChatGPT, Perplexity, Claude, and Google AI Overviews, and most SaaS brands haven’t audited whether they show up there. Ranking on Google isn’t the same as being cited by an LLM. Both matter. You need to know where you stand on both.

I run EMGI Group, a SaaS-only growth agency, and every new client starts with the same 12-point AI visibility audit. This post gives you the complete framework: four phases, twelve check-points, the comparison matrix, the honest take on the tools, and the bits most vendor content leaves out. I’m giving the whole thing away because the methodology is the product. The dashboard is not.

Key takeaways

  • A proper AI visibility audit runs across four phases (ground truth, buyer prompt mapping, baseline measurement, roadmap) and 12 specific check-points.
  • Reddit drives roughly 40% of AI citation share for SaaS queries (Semrush, June 2025), which reshapes your content distribution, not just your SEO.
  • The real deliverable isn’t a visibility score. It’s the citation source map (where LLMs pull answers from for your category) and the semantic framing audit (what LLMs say about you versus competitors).
  • The new crop of AI visibility SaaS tools is overpriced for what it delivers. The framework, not the dashboard, is where the value lives.


What an AI Visibility Audit Actually Measures (and What It Doesn’t)

An AI visibility audit measures whether your brand is cited, mentioned, or completely absent when ideal-customer prompts are run through ChatGPT, Claude, Perplexity, and Google AI Overviews. That’s the whole job. Everything else is implementation detail.

Said more simply, it answers four questions: are we visible when our ideal buyer asks LLMs about our category, who shows up instead of us, how are we framed when we do show up, and what do we fix first.

An AI visibility audit is closer to a blood test than a vanity scorecard. You’re not asking the model to rate your looks. You’re asking whether the category has your brand in its bloodstream at all, and if so, at what concentration. That framing changes what you measure and what you do about the results.

What it isn’t

It isn’t a rank-tracking report with ChatGPT swapped in for Google. Rank tracking is positional. AI visibility is about whether you’re cited at all, and if so, in what context. It isn’t a long-tail prompt harvesting exercise either. You can rank for “best SaaS tool for left-handed Norwegian dentists”, but nobody is typing that into Perplexity, so it doesn’t convert. Long-tail prompt wins are the lonely mountaintop problem: you’re alone up there because nobody else bothered to climb.

It isn’t a one-off score either. If you run the audit once and file the PDF, you’ve wasted the effort. The point is the monitoring cadence. Buyer prompts evolve faster on LLMs than they do on Google, and your competitors are actively trying to reframe the category in their favour. Measuring once and filing the report is like weighing yourself naked on January 1st and then never stepping on the scale again. The number was real; the number is also useless now.

Why authority is doing most of the work right now

Here’s a pattern I see across nearly every EMGI proposal: the AI SERP for ideal SaaS buyer queries is dominated by enterprise brands, even when their topical relevancy is weak. Big brand, weak topical fit, still cited first. That tells you something important. Authority, not keyword targeting, is doing more work in AI answers than most LLM SEO content acknowledges. Which means the roadmap this audit produces can’t just say “write more content”. It has to address where authority actually comes from: editorial mentions, Reddit threads, YouTube coverage, directory listings, trusted third-party citations. For a deeper view on the citation gap, I published the SaaS AI citation gap report with the aggregate numbers.

Phase 1: Ground Truth (Points 1 to 3)

Before you measure anything, audit what LLMs have to work with. Language models ground their answers in whatever is indexed, citable, and coherent about your brand. If your own site is vague about who you serve, what you do, and what evidence exists, no visibility tool in the world will save you.

Think of it this way: you can’t ask an LLM to frame you well if you haven’t given it anything to frame with. Garbage in, garbage cited.

Point 1: Brand narrative audit

Open your homepage. Read the first screen without scrolling. Can you tell, in one sentence, who this is for, what it does, and why it’s different? The most common failure I see is something like “the all-in-one platform for modern teams”. That’s vague enough that no LLM can ground on it, and nor can a buyer. The fix is a specific, repeated claim: “X for [specific buyer type] who need [specific outcome] without [specific pain]”. Then that same claim, with minor rewording, shows up on landing pages, in the About section, on blog bylines, and in meta descriptions. Consistency is what LLMs pattern-match on. Repetition is a feature here, not a bug.

Point 2: Case study and proof-point inventory

Count them. How many published case studies do you have with verifiable outcomes? How many have a named customer, a quantified result, an attribution method, and ideally a third-party confirmation? If the answer is “two” or “none”, you’ve found a big part of your problem. LLMs cite what they can extract in paragraphs. Long PDFs, anonymised “Company X” stories, and quote-only testimonials are hard to extract. Structured case studies with clear numbers are easy. The web scraping SaaS GEO case study we published is a worked example of how to structure this for a developer-API product: named client, quantified traffic and citation gains, and the methodology in the prose.

Point 3: Positioning clarity check

Is your positioning a category-leader claim (“we are the X”), a best-for-niche claim (“the X for Y customers”), or an alternative-to claim (“the X for teams tired of Z”)? All three work. Mixing them confuses the model. Here’s a practitioner test that takes 30 seconds: ask ChatGPT “what is [your brand]?” and grade the first sentence of the answer. If it matches your positioning, you’ve grounded well. If it describes you as something adjacent or wrong, you’ve got a framing problem before you’ve even started measuring prompts.

Phase 2: Mapping Buyer Prompts Across the Funnel (Points 4 to 6)

Your buyer’s journey on LLMs runs the same three stages as every other channel: top of funnel, middle of funnel, bottom of funnel. The work is translating those stages into actual prompts a real human would type into ChatGPT while they’re evaluating you. If you skip this and just track “brand name mentions”, you’ll measure vanity and miss the money.

Point 4: Keyword-to-prompt translation

Start from keywords you already rank for, plus keywords you want to rank for. For each keyword, rewrite it as a natural prompt. “SaaS link building agency” becomes “what’s the best link building agency for a Series A SaaS on a $5K/month budget?” Notice the difference. The keyword is a noun phrase. The prompt is a question with a budget, a stage, and a specific buyer context. That’s how your actual customer talks to ChatGPT, because they’ve already figured out that the more specific their prompt, the better the answer.

Add buyer-stage modifiers as you go:

  • Awareness: “what is”, “how does X work”, “why do teams use X”
  • Consideration: “how do I”, “best practices for”, “what should I look for when”
  • Evaluation: “best X for Y”, “compare X and Y”, “top alternatives to X”
  • Decision: “is X worth it”, “X vs Y”, “should I use X for [specific situation]”

Point 5: Building the prompt library

Target 30 to 50 prompts. Weight them by buyer stage. BOFU prompts have lower volume but convert. TOFU prompts build category authority. A reasonable starting split is roughly 30% TOFU, 40% MOFU, 30% BOFU. Refresh the library monthly. This is the part most people skip. Buyer language drifts faster on LLMs than on Google because the feedback loop is tighter. A prompt that surfaced your competitor last quarter might frame them differently this quarter.

Recommended 40-prompt library distribution:

  • TOFU (awareness): 30%, 12 prompts
  • MOFU (consideration + evaluation): 40%, 16 prompts
  • BOFU (decision): 30%, 12 prompts

Source: EMGI Group AI Visibility Audit methodology, 2026.
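
If you’d rather keep the library in code than in a spreadsheet, here’s a minimal sketch of the structure. The field names and example entries are illustrative assumptions, not a prescribed schema; the only things that matter are one row per prompt, a buyer-stage tag, and a quick check against the target split.

# Minimal prompt library sketch: one row per prompt, weighted by buyer stage.
# Field names and the example entries are illustrative assumptions.
PROMPT_LIBRARY = [
    {"stage": "TOFU", "keyword": "llm seo",
     "prompt": "what is LLM SEO and why does it matter for a SaaS marketing team?"},
    {"stage": "MOFU", "keyword": "saas link building best practices",
     "prompt": "what should I look for when hiring a SaaS link building agency?"},
    {"stage": "BOFU", "keyword": "saas link building agency",
     "prompt": "what's the best link building agency for a Series A SaaS on a $5K/month budget?"},
    # ...grow this towards 30-50 prompts and refresh it monthly
]

def stage_split(library):
    """Compare the library against the target 30/40/30 TOFU/MOFU/BOFU split."""
    counts = {}
    for row in library:
        counts[row["stage"]] = counts.get(row["stage"], 0) + 1
    total = len(library)
    return {stage: round(100 * n / total) for stage, n in counts.items()}

print(stage_split(PROMPT_LIBRARY))  # e.g. {'TOFU': 33, 'MOFU': 33, 'BOFU': 33}

Spreadsheet or script is a matter of taste. The library itself is the asset.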

Point 6: Platform selection

Default pair: ChatGPT and Google AI Overviews. Between them, they cover roughly 80% of commercial SaaS queries we see. Add Perplexity if your buyer skews technical or research-heavy, because Perplexity’s citation style favours editorial publishers and Reddit threads, which matters for evaluation prompts. Add Claude if your product is developer-facing; technical buyers lean on Claude for implementation questions, and the citation patterns are different enough to matter. AI Mode is worth tracking separately from AI Overviews even though they both live inside Google. Their citation logic diverges, and we’ve seen brands show up in AIO but not in AI Mode (and vice versa) for the same query. The allied health SaaS GEO case study shows the platform selection logic in a regulated clinical market, where the weightings flip compared to a developer-API product.

Practitioner note from the trenches: when I built the first version of EMGI’s prompt tracking for a web scraping client, we started with 12 prompts. By week 4 we had 47. The library compounds because every cited competitor response surfaces three more prompts you didn’t think of. Build small, let it grow.

Phase 3: Measuring Your Baseline Visibility (Points 7 to 10)

This is where most audits stop and most agencies charge you $500 to generate a dashboard. It’s also the least interesting part. The measurement is simple. The interpretation is where the value lives.

Point 7: Visibility scoring per prompt

Use a three-state score for each prompt, on each platform:

  • Cited: you’re named and linked in the answer
  • Mentioned: you’re named but not linked
  • Absent: you don’t appear

Run each prompt three to five times per platform. LLM outputs are stochastic, and a single run tells you almost nothing. The three-to-five-run average tells you something real. Crucially, record the citation source every time. If the answer cites you via a Reddit thread, note that. If it cites a G2 review, note that. This data feeds Phase 4 and it’s where the real insight lives.
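
Here’s a minimal scoring sketch for one API-accessible platform, assuming the OpenAI Python SDK, a naive string-match heuristic for citation detection, and three runs per prompt. The model name, the domain, and the heuristic are all assumptions to swap for your own setup; AI Overviews has no equivalent API and needs its own collection approach.

# Minimal visibility scoring sketch: run each prompt several times,
# score cited / mentioned / absent, and keep the answers for source logging.
from collections import Counter
from openai import OpenAI

client = OpenAI()            # reads OPENAI_API_KEY from the environment
BRAND = "YourBrand"          # swap for your brand name
DOMAIN = "yourbrand.com"     # hypothetical domain, swap for your own
RUNS = 3                     # 3-5 runs smooths the stochastic output

def score_answer(text: str) -> str:
    """Three-state score: cited (named + linked), mentioned (named only), absent."""
    named = BRAND.lower() in text.lower()
    linked = DOMAIN.lower() in text.lower()
    if named and linked:
        return "cited"
    if named:
        return "mentioned"
    return "absent"

def run_prompt(prompt: str) -> Counter:
    scores = Counter()
    for _ in range(RUNS):
        response = client.chat.completions.create(
            model="gpt-4o",  # assumption: use whichever model your buyers actually use
            messages=[{"role": "user", "content": prompt}],
        )
        answer = response.choices[0].message.content or ""
        scores[score_answer(answer)] += 1
        # Also log the full answer here so you can record the citation source
        # (Reddit thread, G2 page, editorial post) for the Phase 4 source map.
    return scores

print(run_prompt("what's the best link building agency for a Series A SaaS on a $5K/month budget?"))

In practice you’d loop this over the whole prompt library and write every run, score, and cited source to a sheet or a small database, because the longitudinal record is what makes the quarterly reruns meaningful.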

Point 8: SERP competitor identification

These are the brands ranking for your target keywords on Google. Usually the obvious ones. Ahrefs or Semrush will show you the top 10 in about 30 seconds. Why it matters for an AI audit: SERP competitors define how LLMs understand your category. If the same five brands dominate the SERP for your top 20 keywords, those five brands are what LLMs implicitly grade you against, even if you don’t see them as real competitors.

Point 9: Positioning competitor identification

These are the brands you actually compete with on product, regardless of whether they rank for your keywords. The distinction matters more than most people give it credit for. Here’s why: LLMs group brands by product similarity, not keyword overlap. You might not share a single keyword with a competitor, but if your products solve the same problem for the same buyer, the LLM will put you in the same answer. That’s a competitor for this purpose.

Practitioner tip: ask ChatGPT “who competes with [your brand]?” The list is often wrong. That wrongness is diagnostic. It tells you how the LLM currently understands your positioning, which might be miles from how you understand it.

Point 10: Semantic framing extraction

This is the highest-value output of the whole audit. For each competitor (both SERP and positioning), pull the adjectives and feature claims LLMs attribute to them. Build a perception matrix: brand x feature x sentiment. For each row, you get a sentence or two about how the LLM talks about that brand. Do this for your top 5 competitors plus yourself. Look at the matrix side by side.

This is where you discover that LLMs describe your competitor as “enterprise-grade and trusted by Fortune 500s”, while describing you as “a newer tool popular with startups”, even when your customer base is comparable. That’s a framing gap. Framing gaps don’t close through rank tracking. They close through content, PR, case study distribution, and deliberate brand narrative work.

The semantic framing audit is the single highest-value output of this whole process. Measurement tells you where you rank. Framing tells you what to write next, which editorial outlet to pitch, which Reddit thread to show up in. That’s the whole roadmap right there.
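
If you want to structure the extraction rather than eyeball it, a minimal sketch looks like the one below. The brand names and prompt wording are illustrative assumptions, and in practice I’d also pull framing language out of the answers already logged in Point 7 rather than relying only on a direct “describe this brand” prompt.

# Minimal perception-matrix sketch: ask the model how it frames each brand,
# then read the one-paragraph framings side by side.
from openai import OpenAI

client = OpenAI()
BRANDS = ["YourBrand", "Competitor A", "Competitor B"]  # hypothetical names

def framing_for(brand: str) -> str:
    """Ask the model for a two-sentence framing of a brand, as a buyer would read it."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption
        messages=[{
            "role": "user",
            "content": f"In two sentences, how would you describe {brand} to a SaaS buyer "
                       "evaluating tools in its category? Say who it is best for.",
        }],
    )
    return (response.choices[0].message.content or "").strip()

matrix = {brand: framing_for(brand) for brand in BRANDS}
for brand, framing in matrix.items():
    print(f"{brand}: {framing}\n")
# "Enterprise-grade, trusted by Fortune 500s" sitting next to "a newer tool
# popular with startups" is exactly the framing gap you're hunting for.

The output is deliberately qualitative. You’re reading the rows next to each other and hunting for the adjectives that don’t match reality.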

Phase 4: Building the Roadmap (Points 11 to 12)

The roadmap isn’t a list of fixes; it’s a map of where you need to show up. Once you know which sources LLMs cite for your category, the action list writes itself. YouTube channels, Reddit threads, directories, editorial publishers, community wikis, conference talk archives. The roadmap is a distribution strategy dressed up as an audit output.

Point 11: Perception-versus-intent gap analysis

Put your desired narrative next to the LLM-measured narrative and identify the three largest gaps. Usually they fall into three categories:

  • Feature framing gap: LLMs miss or mischaracterise your core feature. Fix: more explicit feature content, better docs structure, repeated mention in high-authority third-party coverage.
  • Target-customer framing gap: LLMs think you’re for a different buyer segment than you actually serve. Fix: case studies naming the real buyer type, landing pages for that segment, editorial coverage in that segment’s publications.
  • Category framing gap: LLMs slot you in a category adjacent to, but not the same as, the one you want. Fix: explicit category definition content, repeated category framing on-site, links from category-defining publishers.

Point 12: Citation source map and prioritised roadmap

This is where the audit stops being diagnostic and becomes operational. For every prompt you tracked, you recorded the citation source. Aggregate those sources. What’s the distribution? For a typical SaaS evaluation prompt (“best X for Y”), we usually see something like the breakdown below. Reddit dominates, YouTube punches above its weight, G2 and editorial blogs share the middle ground, and company sites barely feature.

Citation source breakdown for a typical SaaS "best X for Y" prompt:

  • Reddit: 35%
  • YouTube: 18%
  • G2 / Capterra: 12%
  • Editorial blogs: 12%
  • Other sources: 9%
  • Company site: 8%
  • Product Hunt: 6%

Typical citation source distribution for a SaaS evaluation prompt. Source: EMGI Group aggregated audits, 2026.

Those percentages are your content calendar. If Reddit drives 40% of citations in your category and you have zero Reddit presence, your entire roadmap pivots. If YouTube is 20% of citations and you haven’t recorded a single video, that’s what you do next. Semrush found Reddit drives roughly 40% of AI citation share across B2B categories (Semrush, June 2025), and for SaaS specifically it’s often higher. AIO-cited brands see 35% more organic clicks and 91% more paid clicks than uncited competitors (Seer Interactive, September 2025), which is why this stops being a branding exercise and starts being a pipeline one.
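
Aggregating the sources is the easy part, assuming you logged one source label per cited run in Phase 3. A minimal sketch, with an illustrative log:

# Minimal citation source map: aggregate the sources logged in Point 7
# into a percentage distribution. The sample log below is illustrative.
from collections import Counter

citation_log = [
    "reddit", "reddit", "youtube", "g2_capterra", "editorial", "reddit",
    "company_site", "youtube", "reddit", "product_hunt",
]

counts = Counter(citation_log)
total = sum(counts.values())
for source, n in counts.most_common():
    print(f"{source:>14}: {100 * n / total:5.1f}%")
# If Reddit tops this list and you have no Reddit presence, the roadmap
# priority writes itself.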

The deliverable at this stage is an action list, not a report. Each row: source type, current presence, priority, effort estimate, expected impact on citation frequency. That’s the document that justifies spend. For the authority-building side of the roadmap, my guide to high-authority backlinks for SaaS covers the link-building piece, and the off-page SEO checklist covers the broader actions that compound in LLM citations.

The Prompt x Platform x Priority Matrix

Not every prompt belongs on every platform. BOFU decision prompts convert on ChatGPT. Awareness-stage prompts live on Perplexity and AIO. If you monitor every platform equally, you waste half your budget. Here’s the matrix I run on every EMGI audit.

  • TOFU awareness ("what is X"): ChatGPT Medium, AI Overviews High, Perplexity High, Claude Low
  • MOFU consideration ("how do I"): ChatGPT High, AI Overviews Medium, Perplexity Medium, Claude Medium
  • MOFU evaluation ("best X for Y"): ChatGPT High, AI Overviews Medium, Perplexity High, Claude Medium
  • BOFU decision ("X vs Y" / "is X worth it"): ChatGPT Highest, AI Overviews Medium, Perplexity High, Claude Medium

Priority rating for each prompt type and platform pairing. Source: EMGI Group AI Visibility Audit methodology, 2026.

How to use it: your priority score for each cell drives where you spend monitoring and content effort. A “High” cell for a BOFU evaluation prompt justifies Reddit presence, a G2 listing refresh, and a dedicated comparison page. A “Low” cell for TOFU on Claude tells you not to lose sleep over it. Two things to watch. First, the priority ratings shift every quarter as platform behaviour evolves. AI Mode, for example, is climbing the MOFU evaluation priority faster than AIO. Second, these ratings are category-agnostic; a developer-tool SaaS will push Claude higher across the board, and a design-tool SaaS will push YouTube-adjacent prompts up. Calibrate to your own buyer, not to the matrix.

An Honest Take on the New Crop of AI Visibility Tools

The new crop of AI visibility tools is overpriced for what it delivers. Most of these tools are a dashboard sitting over an OpenAI API call. That’s fine if you want a pretty interface. It’s also completely replicable for the price of a monthly API budget and a spreadsheet.

Running a visibility tool without a framework is like buying a fitness tracker and never getting off the sofa. The numbers are lovely. The outcomes are identical.

What they actually do

Under the hood, the pattern is consistent. Let you import or suggest keywords. Translate those to prompts. Query LLMs on a schedule via API. Surface results in a dashboard with some filtering and charts. That’s it. A few of them layer on competitor benchmarking and a maturity score, which is useful but not hard.

Why the pricing doesn’t match the value

Starter tiers for the better-known tools hit $300 to $500 a month. The API cost to replicate the same queries yourself is roughly $30 to $60 a month. The difference is UI polish and vendor hosting. That’s a 5x-to-10x markup on tooling. The bigger issue is that the framework, not the tool, is the hard part. Tools get you monitoring. They don’t get you a roadmap. A dashboard without a framework is an IKEA instruction sheet with no Allen key. You can see where all the bits are meant to go. You just can’t actually assemble anything.
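
For the cost claim, a rough back-of-the-envelope, with every figure an assumption to adjust for your own library size, cadence, and model choice:

# Back-of-the-envelope monthly API cost for DIY monitoring.
# Every figure is an assumption: adjust to your library, cadence, and model.
prompts = 40            # prompt library size
platforms = 2           # API-accessible models queried directly
runs = 4                # repeat runs per prompt to smooth stochastic output
weeks = 4               # weekly monitoring cadence
cost_per_query = 0.03   # assumed $ per query for a mid-tier model and a short answer

queries = prompts * platforms * runs * weeks
print(queries, "queries/month ->", f"${queries * cost_per_query:.0f}/month")
# 40 x 2 x 4 x 4 = 1,280 queries; at roughly $0.01-$0.05 per query that's
# somewhere in the $15-$65/month range, which is where $30-$60 comes from.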

I’ve seen SaaS teams subscribe to a visibility tool for a year, generate a lovely dashboard every month, and still have no idea which two actions would actually move the needle. The dashboard is not the strategy.

When they make sense

  • Marketing team with no technical capacity at all, and no agency to run it
  • You need a clean quarterly board report with no engineering involvement
  • Enterprise context where you need SSO, audit logs, and shared dashboards across a 20-person marketing org

When they don’t

  • Series A to B SaaS with any technical capacity on the team. You can vibe-code this in a weekend.
  • Anyone who’d rather spend the monitoring budget on content, links, or Reddit presence
  • If you’re already working with an agency on LLM SEO. They should be running this for you. If they charge for “tool access” without a methodology behind it, ask what you’re actually paying for.

The tool market is currently doing for AI visibility what rank-tracking tools did for SEO back in 2014. Output is commoditising fast. The framework and the interpretation are what’s actually billable, and always were.

How We Run This at EMGI

Every SaaS client starts with this 12-point audit. We’ve built our own internal prompt-tracking stack so we can run it consistently at client scale without paying the vendor tax, but the framework is the same one you’ve just read. The tool is implementation detail.

Two quick case study callouts. The web scraping SaaS GEO case study shows the framework applied to a developer-API product, where AI visibility became the primary growth channel because the technical buyer lives in Claude and ChatGPT more than on Google. The allied health SaaS GEO case study shows how the framework adapts for a regulated clinical market, where trust signals and editorial citations carry more weight than Reddit threads.

Both follow the same 12 points. What changes is the priority weighting in Phase 4 and the kind of citation sources that dominate the roadmap output. That’s the beauty of a proper framework: the bones stay the same, the flesh adapts to the category. If you want the broader LLM SEO context this audit fits inside, our LLM SEO service page and the LLM SEO for SaaS pillar piece cover the strategy layer above the audit.

Frequently Asked Questions

How long does an AI visibility audit take?

Two to four weeks for a first pass. One week to build the prompt library, one to two weeks to collect baseline data across platforms (with repeat runs to handle stochastic output), and about a week for semantic framing analysis and roadmap generation. Ongoing monitoring becomes a 30-minute weekly review once the system is in place.

How often should I re-run the audit?

Quarterly full reruns, with weekly monitoring on your top 10 BOFU prompts. Buyer language drifts faster on LLMs than on Google because the feedback loop between buyer and model is tighter. A prompt that converted you last quarter may surface a competitor this quarter, and you need to know within weeks, not months.

What’s the difference between AI visibility and GEO?

GEO (Generative Engine Optimisation) is the broader practice of optimising for LLM-mediated discovery. The AI visibility audit is the measurement layer inside GEO. You run the audit to understand where you stand, then you do GEO work to improve it. My LLM SEO for SaaS pillar covers the full GEO strategy.

Which platforms matter most for B2B SaaS?

ChatGPT and Google AI Overviews cover about 80% of commercial queries in our experience. Add Perplexity for research-heavy buyer segments and Claude for developer-tool categories. AI Mode is worth separating from AI Overviews because its citation logic diverges, and we regularly see brands appearing in one but not the other.

Can I run this audit myself, or do I need an agency?

You can absolutely run it yourself. The framework is portable, the API costs sit under $100 a month, and the prompt library fits in a spreadsheet. What agencies add is the semantic framing interpretation and the roadmap translation. If you have a strategic marketing lead and about 10 hours a week to spare, DIY works. If you don’t, pay someone to interpret the output and turn it into action.

What’s a “good” AI visibility score?

There’s no absolute benchmark, and anyone who tells you there is has something to sell. The useful question is: are you cited in the prompts your ideal buyer actually types? For BOFU evaluation prompts on ChatGPT and AI Overviews, target 60% or higher citation rate. For TOFU awareness prompts, 20% to 30% is strong. Below that, the roadmap has work to do.

Do AI visibility tools replace this framework?

No. They automate the measurement layer but offer no methodology and no interpretation. Useful if you want a dashboard and no engineering work. Insufficient if you want strategy. The question to ask any vendor or agency is: what framework sits behind this dashboard? If the answer is vague, the dashboard is the product, which means you’re paying for UI.

How does AI visibility impact revenue?

Indirectly, by compounding your organic pipeline. B2B SaaS SEO returns around 702% ROI with a 7-month breakeven point (First Page Sage, 2026), and the AI visibility layer extends the same logic to LLM-mediated discovery. If you’re cited in the answers your buyer reads during evaluation, you’re on the shortlist. If you’re not, you’re not. My SaaS link building ROI piece breaks down the full ROI maths, and the same compounding logic applies here.

Should I use the new AI visibility SaaS tools or stick with Ahrefs and Semrush?

The established SEO suites (Ahrefs, Semrush) have bolted AI visibility modules onto existing products and they are, in my view, the more honest version of this market. You’re paying for tools you already use, and the AI module is an add-on. The standalone new-crop visibility SaaS tools are typically priced at a premium for the methodology you don’t get. If you already pay for Semrush or Ahrefs, their AI visibility modules are a fine starting point. If you don’t, vibe-code it yourself or have your agency run it.

What should I expect from an AI visibility audit in the first 90 days?

Month one: framework built, prompt library live, baseline measured across ChatGPT and AI Overviews. Month two: semantic framing matrix complete, citation source map finished, roadmap prioritised with effort and impact estimates. Month three: first content and link actions shipping against the roadmap, weekly monitoring in place, first signs of citation share movement on your top 10 BOFU prompts. If an agency promises “AI visibility” without explaining the framework, ask them to walk you through their version of these 12 points.

Book a Free AI Visibility Strategy Call

If you’d rather not run all twelve points yourself, I run these audits every week for SaaS clients. The strategy call is genuinely free. You walk away with actionable findings even if we don’t work together. On the call, we run 5 to 10 of your top buyer prompts live across ChatGPT and AI Overviews, pull your baseline, and give you 3 to 5 prioritised actions. No presentation, no slide deck, just the work. If the results make sense for a retainer, we build it into the strategy. If they don’t, you’ve got the framework and a baseline to run it in-house.

There’s an analogy I use with clients that I think captures why this work compounds: AI visibility is the pipes, not the water. You can pour all the content in the world into a brand that isn’t plumbed into the surfaces LLMs cite from, and the pressure never reaches the tap. Fix the pipes first, then the content starts arriving where the buyer is actually looking. Book a call here if you’d like me to take a look at your plumbing.