Most Marketing Metrics Are Misleading. Here’s What Leaders Measure Instead

Key Takeaways

  1. Traditional marketing metrics like traffic, search rankings, and ROAS were designed for a more trackable internet. They still have uses, but they no longer tell the full story.
  2. Marketing attribution assigns credit to touchpoints but cannot prove that marketing caused the outcome. It typically rewards demand capture over demand creation.
  3. ROAS averages compress marginal return curves into a single number, hiding where spend becomes inefficient.
  4. Executives want to know whether marketing caused growth, not just whether activity occurred. Those are different questions with different answers.
  5. Modern measurement tracks incremental signals, branded demand growth, and customer value metrics to give a more complete picture of what is actually working.

Your marketing reports probably look fine. Traffic is up. Engagement is solid. Return on ad spend (ROAS) hits the benchmarks your team set last quarter. But here is the problem: the numbers that look best are often the ones least connected to actual business growth.

Marketing dashboards were built for a version of the internet that no longer exists. When clicks were cheap and user journeys were predictable, tracking activity was a reasonable proxy for impact. That is no longer the case. Discovery now happens in AI summaries, social feeds, and private conversations that never show up in analytics. Attribution systems reward the last touchpoint, not the one that created demand. And ROAS averages can hide the fact that the last dollar spent barely broke even.

The shift underway is significant. Measurement is moving from tracking activity to proving impact. Marketing leaders who recognize this will make better budget decisions and communicate more credibly with leadership.

This is the first part of a three-part series examining how modern organizations measure marketing performance in a way that actually connects to growth.

The Old Marketing Scoreboard Was Built for a Different Internet

For most of the last decade, marketing teams built their reporting around a stable set of marketing metrics: organic traffic, search rankings, click-through rates, and ROAS. These became the dominant performance indicators not because they were perfect, but because they were easy to track and easy to report.

The logic made sense at the time. More organic traffic meant more potential customers. Higher rankings meant greater visibility. Click-through rate measured whether ads were relevant. ROAS connected spend to revenue in a single ratio. These gave teams something concrete to optimize and executives something simple to evaluate.

The problem was that teams began equating activity with impact. A spike in sessions became evidence of a successful campaign. A high ROAS figure became justification for more spend. 

But these metrics measured what happened on a screen, not what drove a purchase decision. Many of them are what marketers now call vanity metrics: numbers that look meaningful but don’t connect reliably to revenue.

Analytics dashboards were built to track what they could see, and teams made decisions based on what was visible. That created a structural bias toward channels that were easy to measure, even when harder-to-measure channels were doing more of the actual work.

Three-panel infographic from NP Digital showing why the old marketing playbook is breaking: declining traffic relevance, attribution noise, and growing executive demand for proof of business impact.

Why Many Marketing Metrics Are Becoming Misleading

The way people discover brands has changed substantially, and many standard marketing KPIs were not built to account for that shift. Three changes in particular are making traditional metrics less reliable.

Zero-Click Discovery Is Increasing

AI-generated answers, featured snippets, and knowledge panels now resolve many queries without requiring a click. According to Pew Research, when users encounter an AI summary in search results, they click through to websites at roughly half the rate they do with standard results. Around 26 percent end their session after viewing an AI summary, compared to 16 percent for standard search results.

For marketing teams, this creates an invisible influence problem. A brand can shape a buyer’s thinking through AI-cited content without that interaction ever appearing in a traffic report. Organic search may be doing more work than the data suggests, and session counts alone cannot tell you how much.

Discovery Happens Inside Platforms

Buyers increasingly research and evaluate brands inside closed ecosystems: social platforms, marketplaces, YouTube, and AI-driven interfaces. These platforms have their own algorithms, their own ad systems, and limited data sharing with external analytics tools.

According to NP Digital research, 82 percent of marketing engagement now happens through video, while SERP and AI answers account for 79 percent of engagement. Only 12 percent happens on-site. Website analytics captures a fraction of where influence actually occurs. 

Brands get evaluated across Google, YouTube, LinkedIn, review sites, and AI engines, often before a customer ever visits a website. NP Digital data also shows that the average customer journey has grown from 8.5 touchpoints in 2021 to 11.1 touchpoints in 2025. What looks like a direct visit or a branded search conversion often reflects influence that originated somewhere else entirely.

Traffic No Longer Reflects Influence

Even when traffic increases, the quality of that traffic has become harder to assess. NP Digital research tracking 602 websites found that 51 percent of traffic came from bots and 21 percent were short sessions. Only 16 percent could be classified as genuinely engaged sessions.

An NP Digital infographic with a traffic quality breakdown.

More sessions do not equal more intent. Traffic can grow while real engagement shrinks, particularly as bots, low-intent visits, and passive content consumption inflate session counts. Optimizing for traffic volume in this environment can mean more spend for fewer qualified outcomes.

The Attribution Problem Most Teams Ignore

Marketing attribution became central to reporting because it appeared to solve a hard problem: connecting activity to conversions. For direct-response channels with short feedback loops, it worked reasonably well. But attribution has a structural limitation that deserves more attention. For a deeper look at where these systems break down, see this overview of marketing attribution blind spots.

Attribution models credit the touchpoints that preceded a conversion. They track what happened, and they do it well. They are not built to determine whether marketing caused the outcome.

That distinction matters more than it might seem. Algorithmic platforms optimize toward users who are already likely to convert. 

Last-click models, and many of their more sophisticated variants, inherit this bias. They reward demand capture over demand creation, which means the channels that appear most efficient are often the ones intercepting customers who would have converted regardless.

The evidence from major advertisers is instructive. When Airbnb paused its performance marketing budget, there was no significant drop in bookings. When Uber reduced spend in certain channels, rider acquisition was largely unaffected. In both cases, attribution had been crediting spend for outcomes that would have occurred without it.

Privacy changes have made this harder to ignore. Third-party cookie deprecation, cross-device behavior, and private sharing channels all reduce the fidelity of attribution data. According to NP Digital research, nearly 47 percent of marketers lack confidence in their attribution model. Yet most teams still use attribution reports as the primary input for budget decisions. Data-driven attribution improves on last-click models in some respects, but it still cannot fully separate demand creation from demand capture.

Attribution remains useful for day-to-day campaign optimization. The problem is treating it as strategic truth, as proof that marketing caused growth.

Why ROAS Can Hide the Real Economics of Marketing

ROAS is the most widely used efficiency metric in paid marketing, and for good reason. It is simple, ties spend to revenue, and is easy to compare across campaigns and channels. The problem is that ROAS compresses a marginal return curve into a single number, and that compression hides where spending stops being productive.

Consider a channel running at an overall 3x ROAS. That number looks strong. But if the first $100,000 spent generated 8x returns and the last $200,000 generated 0.5x returns, the blended average conceals a significant amount of wasted spend. Optimizing toward the average means continuing to invest in the tail of a diminishing curve.
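The gap between blended and marginal return can be checked with a few lines of arithmetic. This is a minimal sketch that assumes spend can be split into tranches with known revenue; the function names are illustrative, and in practice marginal curves are usually estimated rather than observed directly.

```python
def blended_roas(tranches):
    """Total revenue divided by total spend across all tranches."""
    total_spend = sum(spend for spend, _ in tranches)
    total_revenue = sum(revenue for _, revenue in tranches)
    return total_revenue / total_spend


def marginal_roas(tranches):
    """Return on each tranche of spend taken on its own."""
    return [revenue / spend for spend, revenue in tranches]


# (spend, revenue) pairs taken from the example above:
# $100,000 returning 8x and $200,000 returning 0.5x.
tranches = [(100_000, 800_000), (200_000, 100_000)]

print(f"Blended ROAS: {blended_roas(tranches):.1f}x")
for (spend, _), roas in zip(tranches, marginal_roas(tranches)):
    # A ROAS below 1 means the tranche returned less revenue than it cost
    # (margin is ignored here for simplicity).
    status = "covers its cost" if roas >= 1 else "returns less than it cost"
    print(f"${spend:,} tranche: {roas:.1f}x ({status})")
```

Reporting the per-tranche figures next to the blended one makes it visible when the most recent spend is no longer paying for itself.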

ROAS also ignores what created the demand being captured. Branded search conversions frequently get credited to paid search, but the intent behind that search often originated from a video campaign, a piece of organic content, or a recommendation that happened in a private channel. The channel capturing the intent gets the credit. The channel that generated it does not. This dynamic is especially relevant for ecommerce metrics, where brands often over-invest in bottom-funnel capture while underfunding the upper-funnel activity that makes conversion possible.

The question ROAS does not answer is: how much of this revenue was incremental?

Separating captured demand from created demand requires different tools, which is why leading organizations are increasingly pairing ROAS with incrementality testing and marketing mix modeling.

A chart comparing Organic Traffic Trends vs. Revenue Growth.

The Question Executives Actually Care About

The metrics most marketing teams optimize are not the ones most executives prioritize. According to NP Digital research, 92 percent of marketers say profit is a primary metric, and 87 percent prioritize pipeline. Search rankings rank near the bottom at 18 percent, and ROAS comes in at 16 percent.

That gap reflects a real tension. Marketing teams spend considerable time reporting on activity and efficiency. Leadership wants to know whether marketing is actually changing the economics of the business.

The core question executives ask is whether marketing caused growth, or whether it captured demand that already existed. These are different outcomes. A campaign can generate strong attribution numbers while producing no incremental growth. A brand investment can create lasting demand without generating a single directly trackable conversion.

The questions that matter most at the leadership level are:

  1. Did this campaign create new demand, or intercept demand that already existed?
  2. Would revenue have changed if this marketing activity had not occurred?
  3. Which investments change the underlying economics of the business?

These are questions about causality, not efficiency. They cannot be answered by ROAS or click-through rates. They require measurement methods designed to isolate actual marketing impact from demand that would have existed regardless. This is the gap that is pushing high-growth organizations toward a different approach.

What Modern Marketing Leaders Measure Instead

The most important marketing metrics for growth-focused organizations look different from the ones that dominate standard dashboards. The shift is away from activity-based signals and toward measures tied directly to business outcomes.

Rather than optimizing for total traffic, leading teams track branded demand growth, which captures whether the brand is generating more direct interest over time. Rather than reporting on attributed conversions, they measure incremental conversions: the outcomes that would not have happened without the marketing. Understanding the most important marketing metrics for your business means asking which numbers reflect whether marketing is creating demand, not just capturing it.
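One common way to estimate incremental conversions is a holdout test: withhold the campaign from a comparable group and use its conversion rate as the baseline. The sketch below assumes randomly assigned groups and uses made-up numbers; the function name is hypothetical, and a real test would also need a significance check.

```python
def incremental_conversions(exposed_users, exposed_conversions,
                            holdout_users, holdout_conversions):
    """Estimate conversions that would not have happened without the campaign."""
    # Conversions the exposed group would be expected to produce anyway,
    # based on the holdout group's conversion rate.
    expected_without_campaign = holdout_conversions * exposed_users / holdout_users
    return exposed_conversions - expected_without_campaign


# Hypothetical figures for illustration only.
exposed_users, exposed_conversions = 100_000, 2_400
holdout_users, holdout_conversions = 20_000, 400

incremental = incremental_conversions(
    exposed_users, exposed_conversions, holdout_users, holdout_conversions
)
share = incremental / exposed_conversions

print(f"Attributed conversions:  {exposed_conversions}")
print(f"Incremental conversions: {incremental:.0f} ({share:.0%} of attributed)")
```

In this made-up case, most attributed conversions would have happened anyway, which is the pattern attribution reports cannot reveal on their own.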

Customer value metrics have become more prominent as well. Lifetime value (LTV), customer acquisition cost (CAC) adjusted for margin, and payback periods give a more accurate picture of whether growth is sustainable. For teams managing ecommerce KPIs, this means looking past add-to-cart rates and conversion percentages toward cohort retention, repeat purchase rates, and revenue per customer over time.
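These customer value metrics reduce to simple formulas. The sketch below uses one common simplification (flat monthly revenue, constant churn) and hypothetical inputs; the function names are illustrative, and real models often work from cohort data instead.

```python
def payback_months(cac, monthly_revenue, gross_margin):
    """Months of gross margin needed to recover the acquisition cost."""
    return cac / (monthly_revenue * gross_margin)


def simple_ltv(monthly_revenue, gross_margin, monthly_churn):
    """Monthly gross margin times expected lifetime in months (1 / churn)."""
    return monthly_revenue * gross_margin / monthly_churn


# Hypothetical inputs for illustration only.
cac = 300.0             # cost to acquire one customer
monthly_revenue = 50.0  # average revenue per customer per month
gross_margin = 0.6      # share of revenue left after cost of goods
monthly_churn = 0.05    # share of customers lost each month

payback = payback_months(cac, monthly_revenue, gross_margin)
ltv = simple_ltv(monthly_revenue, gross_margin, monthly_churn)

print(f"Margin-adjusted payback: {payback:.1f} months")
print(f"LTV: ${ltv:,.0f}  LTV:CAC ratio: {ltv / cac:.1f}")
```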

Revenue per session, lead-to-close rates by channel, and downstream conversion quality provide a fuller picture of marketing performance than surface metrics can. A channel that generates high traffic but low-quality leads may look better on a standard dashboard than one generating fewer, higher-value conversions.

The shift does not mean abandoning familiar metrics entirely. Traffic, rankings, and ROAS still provide useful context. The change is in treating them as diagnostics rather than goals. The next piece in this series examines how high-growth organizations build the measurement systems that track these signals, combining marketing mix modeling, incrementality testing, and attribution into a layered approach that answers different questions at different levels of the business.

A chart comparing new and old KPIs for marketing organizations.

FAQs

What Are KPIs in Marketing?

Marketing key performance indicators (KPIs) are the metrics teams use to evaluate performance against business goals. Common marketing KPIs include traffic, leads, conversion rates, ROAS, and customer acquisition cost. The most useful KPIs are ones tied directly to business outcomes rather than activity alone.

What Are Marketing Metrics?

Marketing metrics are the data points used to evaluate marketing performance. These range from top-of-funnel measures like impressions and traffic to bottom-of-funnel measures like conversion rate and revenue. Not all marketing metrics reflect real business impact equally, which is why understanding which metrics to prioritize matters as much as tracking them.

How Do You Make a Marketing Report?

A strong marketing report connects activity data to business outcomes. Start by identifying the decisions the report needs to support, then select metrics that reflect progress toward those outcomes. Include both leading indicators, such as branded search volume and engaged session rates, and lagging indicators like revenue and customer acquisition cost.

Conclusion

Marketing measurement has not failed. The environment around it changed, and the metrics that once served as reliable proxies for growth have become less accurate as discovery, attribution, and buyer behavior grew more complex.

The organizations gaining ground are the ones questioning which metrics actually reflect growth, rather than which ones look best in a dashboard. That means looking past traffic and attribution toward signals tied to incremental outcomes, customer value, and causal impact.

This is the foundation the rest of this series builds on. The next installment covers how high-growth companies structure their measurement systems, combining multiple methods to get directional confidence across different levels of the business. If you want to start reviewing your current approach, this guide to website performance metrics is a useful starting point, as is this breakdown of which marketing KPIs are worth keeping and which may be leading your team in the wrong direction.

The SEO Update by Yoast – April 2026

Don’t miss the next SEO Update by Yoast

Search is changing fast – make sure you’re not falling behind.

Sign up for the next SEO Update by Yoast and get expert-led clarity on what’s happening in SEO right now and what it means for your strategy.

Join Carolyn Shelby and Alex Moss as they unpack the most important SEO news, algorithm shifts, and industry developments – so you can focus on what actually moves the needle.

Who should sign up?

This update is ideal if you:

  • Want expert insight into recent SEO changes and trends
  • Need help refining or validating your SEO strategy
  • Have SEO questions you’d like answered live

Event details

  • Level: Intermediate
  • Duration: 1 hour
  • Live Q&A with our SEO experts
  • Free registration
  • Recording available after the session

The post The SEO Update by Yoast – April 2026 appeared first on Yoast.

YouTube adds AI creator matching and ad formats to its partnerships platform

YouTube used its NewFront presentation to unveil a significant upgrade to its Creator Partnerships platform, adding Gemini-powered creator matching, stronger measurement tools, and new ways to run creator content as paid ads.

Why we care. Influencer marketing has become a core part of many brands’ strategies, but finding the right creators at scale and proving ROI remain its two biggest friction points. This update tackles both.

Gemini-powered matching cuts through the noise of three million creators, while the ability to run creator content as paid Shorts and in-stream ads makes performance measurable like any standard campaign, backed by a reported 30% conversion lift.

How it works. The updated platform uses Gemini to recommend creators from a pool of more than three million YouTube Partner Program members, filtered by campaign goals. Advertisers get more control over who they work with and better visibility into how those partnerships perform.

The big new feature. A revamped Creator Partnerships boost lets brands run creator-made content directly as Shorts and in-stream ads — formats YouTube says deliver an average 30% lift in conversions.

The big picture. The announcement builds on BrandConnect, YouTube’s existing creator monetization infrastructure, showing that the platform is doubling down on the creator economy as a growth lever for advertisers — not just a content strategy.

What’s next. Brands interested in the updated tools can watch the full NewFront presentation on YouTube for more details.

AI search engines cite Reddit, YouTube, and LinkedIn most: Study

AI citations

Reddit ranks as the most-cited domain in AI-generated answers, followed by YouTube and LinkedIn, based on a new analysis of 30 million sources by Peec AI, an AI search analytics tool.

The findings. Reddit was the most-cited source across ChatGPT, Google AI Mode, Gemini, Perplexity, and AI Overviews. YouTube, LinkedIn, Wikipedia, and Forbes also ranked in the top five. Review platforms like Yelp and G2 appeared often in recommendation queries.

The research showed which domains models rely on:

  • ChatGPT favored Wikipedia, Reddit, and editorial sites like Forbes.
  • Google leaned toward platforms like Facebook and Yelp.
  • Perplexity emphasized Reddit, LinkedIn, and G2 for B2B queries.

Why we care. To win in AI search, you need authority beyond your site. Brands that appear consistently across trusted third-party platforms are more likely to be cited.

Why these sources? AI systems prioritize perceived authority plus authentic user input:

  • Reddit leads because it captures real user discussions.
  • YouTube dominates video citations via transcripts and descriptions.
  • Wikipedia serves as both a live source and a training dataset.

About the data. The analysis covered 30 million sources across ChatGPT, Google AI Mode, Gemini, Perplexity, and AI Overviews, measuring domains directly cited in answers to isolate what shapes responses.

The study. Top domains cited by AI search: Analysis based on 30M sources

Google Gemini may adapt AI answers to match user tone: Report

Google Gemini positive vs negative framing

A newly published, unverified report claims Google’s Gemini AI is instructed to mirror user tone and validate emotions while grounding its responses in fact and reality.

Why we care. If accurate, AI-generated search responses may vary based on how a query is phrased — not just the information available.

What’s new. The report centers on the inherent tension in the system-level instructions guiding how Gemini responds. The report, published by Elie Berreby, head of SEO and AI search at Adorama, suggested that Gemini is instructed to:

  • Match the user’s tone, energy, and intent.
  • Validate emotions before responding.
  • Deliver answers aligned with the user’s perspective.

What it means. The “overly supportive mandate frequently overrides the factual grounding,” Berreby wrote. So instead of acting as a neutral aggregator, AI answers may:

  • Reinforce negative framing (“Why is X bad?”).
  • Reinforce positive framing (“Why is X great?”).

If public perception is negative, AI may amplify it. As the report suggests:

  • AI reflects existing sentiment signals.
  • It doesn’t “balance” them the way blue links often do.

Query framing. The emotional framing of a query affects:

  • Which sources get cited.
  • How summaries are written.
  • The overall tone of the answer.

Google’s AI Overviews already show tone shifts, often aligning with query intent beyond keywords. This report offers a possible explanation.

Unverified. Google hasn’t confirmed the leak. As Berreby noted in his report: “I’ve decided to share only a fraction of the leaked internal system information with the general public. I’m not sharing any sensitive data. This isn’t a zero-day exploit. This is a tiny leak.”

The report. This Gemini Leak Means You Can’t Outrank a Feeling

Google expands Merchant Center loyalty features to 14 countries and AI surfaces

Google Shopping Ads - Google Ads

Google is giving retailers more firepower to promote loyalty program benefits directly within product listings — expanding the program internationally and into its newest AI-powered shopping experiences.

What’s new. Merchants can now highlight member pricing and exclusive shipping options directly on listings. Loyalty annotations have also expanded to local inventory ads and regional Shopping ads — making it easier to promote in-store or geography-specific perks.

Why we care. The more you can personalize an offer for a shopper, the better. Embedding member perks into the moment of purchase discovery — rather than requiring a separate loyalty app or webpage — makes programs more visible and more likely to drive sign-ups.

By the numbers. According to Google, some retailers have reported up to a 20% lift in click-through rates when showing tailored offers to existing loyalty members.

The big picture. Loyalty benefits will now appear on Google’s AI-first surfaces, including AI Mode and Gemini, putting member offers in front of shoppers at an entirely new layer of the search experience.

Where it’s available. The expansion covers 14 countries — Australia, Brazil, Canada, France, Germany, India, Italy, Japan, Mexico, Netherlands, South Korea, Spain, the UK, and the US.

How to get started. Merchants activate the loyalty add-on in Merchant Center, configure member tiers, and set up pricing and shipping attributes. Connecting Customer Match lists in Google Ads is required to display strikethrough pricing and shipping perks to known members.

Don’t miss. US merchants can apply to join a pilot that uses Customer Match as a relationship data source for free listings — potentially expanding loyalty reach without additional ad spend.

Google explains how crawling works in 2026

Gary Illyes from Google shared some more details on Googlebot, Google’s crawling ecosystem, fetching and how it processes bytes.

The article is named Inside Googlebot: demystifying crawling, fetching, and the bytes we process.

Googlebot. Google does not have one singular crawler; it has many crawlers for many purposes. So referring to Googlebot as a single crawler may no longer be accurate. Google has documented many of its crawlers and user agents.

Limits. Google recently spoke about its crawling limits, and Gary Illyes has now gone into more detail. He said:

  • Googlebot currently fetches up to 2MB for any individual URL (excluding PDFs).
  • This means it crawls only the first 2MB of a resource, including the HTTP header.
  • For PDF files, the limit is 64MB.
  • Image and video crawlers typically have a wide range of threshold values, and it largely depends on the product that they’re fetching for.
  • For any other crawlers that don’t specify a limit, the default is 15MB regardless of content type.

Then what happens when Google crawls?

  1. Partial fetching: If your HTML file is larger than 2MB, Googlebot doesn’t reject the page. Instead, it stops the fetch exactly at the 2MB cutoff. Note that the limit includes HTTP request headers.
  2. Processing the cutoff: That downloaded portion (the first 2MB of bytes) is passed along to our indexing systems and the Web Rendering Service (WRS) as if it were the complete file.
  3. The unseen bytes: Any bytes that exist after that 2MB threshold are entirely ignored. They aren’t fetched, they aren’t rendered, and they aren’t indexed.
  4. Bringing in resources: Every referenced resource in the HTML (excluding media, fonts, and a few exotic files) will be fetched by WRS with Googlebot like the parent HTML. They have their own, separate, per-URL byte counter and don’t count towards the size of the parent page.
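The cutoff behavior described above can be approximated locally to see which parts of a page would survive. This is a rough sketch, not Google's implementation: it reads "2MB" as 2 MiB and ignores HTTP headers, both of which are simplifying assumptions, and the helper names are hypothetical.

```python
LIMIT_BYTES = 2 * 1024 * 1024  # assumption: "2MB" read as 2 MiB


def fetched_portion(html_bytes, limit=LIMIT_BYTES):
    """Return the bytes that would be kept if fetching stops at the limit."""
    return html_bytes[:limit]


def markers_before_cutoff(html_bytes, markers, limit=LIMIT_BYTES):
    """Report which markers appear entirely within the kept bytes."""
    kept = fetched_portion(html_bytes, limit)
    return {marker: marker.encode("utf-8") in kept for marker in markers}


# Hypothetical oversized page: the title comes first, but the canonical
# link sits after more than 2 MiB of inline padding.
page = (
    b"<html><head><title>Example</title>"
    + b"<!-- padding -->" * 150_000
    + b'<link rel="canonical" href="https://example.com/"></head></html>'
)

report = markers_before_cutoff(page, ["<title>", 'rel="canonical"'])
for marker, found in report.items():
    print(f"{marker}: {'within cutoff' if found else 'after cutoff'}")
```

Running a check like this against rendered HTML is one way to confirm that critical tags sit near the top of unusually large pages.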

How Google renders these bytes. When the crawler accesses these bytes, it passes them to WRS, the Web Rendering Service. “The WRS processes JavaScript and executes client-side code similar to a modern browser to understand the final visual and textual state of the page. Rendering pulls in and executes JavaScript and CSS files, and processes XHR requests to better understand the page’s textual content and structure (it doesn’t request images or videos). For each requested resource, the 2MB limit also applies,” Google explained.

Best practices. Google listed these best practices:

  • Keep your HTML lean: Move heavy CSS and JavaScript to external files. While the initial HTML document is capped at 2MB, external scripts, and stylesheets are fetched separately (subject to their own limits).
  • Order matters: Place your most critical elements — like meta tags, <title> elements, <link> elements, canonicals, and essential structured data — higher up in the HTML document. This ensures they are unlikely to be found below the cutoff.
  • Monitor your server logs: Keep an eye on your server response times. If your server is struggling to serve bytes, our fetchers will automatically back off to avoid overloading your infrastructure, which will drop your crawl frequency.

Podcast. Google also released a podcast episode on the topic.

59% of SEO jobs are now senior-level roles: Study

SEO command center

SEO hiring is shifting toward senior, strategy-led roles as AI reshapes search and expands the scope of the job. A new Semrush analysis of 3,900 listings shows companies now prioritize leadership, experimentation, and cross-channel visibility over pure technical execution.

Why we care. SEO hiring, career paths, and required skills are changing. Entry roles focus on execution, while most demand sits at the leadership level — owning strategy across search, AI assistants, and paid channels, with clear revenue impact.

What changed. Senior roles dominated, accounting for 59% of listings. Mid-level roles, such as specialists (15%) and managers (10%), trailed far behind.

  • Companies are shifting budget toward strategy as AI tools absorb more execution work.

The skills shift. In-demand capabilities extend beyond traditional SEO into coordination, testing, and decision-making:

  • Project management appeared in more than 30% of listings.
  • Communication led non-senior roles at 39.4%.
  • Experimentation appeared in 23.9% of senior roles compared with 14% of other roles.
  • Technical SEO appeared in about 6% of listings.

Tools and channels. The SEO tech stack now spans analytics, paid media, and data.

  • Google Analytics appeared in up to 47.7% of listings.
  • Google Ads appeared in 29% of listings.
  • SQL demand grew at the senior level.
  • AI tools like ChatGPT were increasingly listed.

AI expectations. AI literacy is moving from optional to expected:

  • 31% of senior roles mentioned AI.
  • Nearly 10% referenced LLM familiarity.
  • Concepts like AI search and answer engine optimization (AEO) appeared more often.

Pay and positioning. SEO is increasingly treated as a business function.

  • The median salary for senior roles reached $130,000, compared to $71,630 for others. Some listings were much higher.
  • Degree preferences skewed toward business and marketing.

Remote work is now standard. More than 40% of listings offered remote options, with little difference by seniority.

About the data: Semrush analyzed 3,900 U.S.-based SEO job listings from Indeed as of Nov. 25. Roles were deduplicated, segmented by seniority, and analyzed using semantic keyword extraction.

The study. What 3,900 SEO Job Listings Reveal for 2026: Experiments, AI, and Six-Figure Salaries

Technical SEO for generative search: Optimizing for AI agents

Technical SEO extends beyond indexing to how content is discovered and used, especially as AI systems generate answers instead of listing pages.

For generative engine optimization (GEO), the underlying tools and frameworks remain largely the same, but how you implement them determines whether your content gets surfaced — or overlooked.

That means focusing on how AI agents access your site, how content is structured for extraction, and how reliably it can be interpreted and reused in generated responses.

Agentic access control: Managing the bot frontier

From a technical standpoint, robots.txt is a tool you already have in your SEO arsenal. To give AI crawlers their own permissions, add a rule group for each bot's user agent.

For example, you may want a training crawler like GPTBot to have access to your /public/ folder, but not your /private/ folder, and would need to do something like this:

User-agent: GPTBot
Allow: /public/
Disallow: /private/

You’ll also need to decide between model training and real-time search and citations. You might consider disallowing GPTBot and allowing OAI-SearchBot.
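Before deploying rules like these, it can help to test them. The sketch below uses Python's standard-library `urllib.robotparser` to check both policies discussed above; `can_crawl` and the `example.com` origin are illustrative, and real crawlers may resolve conflicting rules differently than this parser does.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, user_agent, path, origin="https://example.com"):
    """Parse robots.txt text and check one user agent against one path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, origin + path)


# Policy 1: the folder-level rules shown above.
FOLDER_RULES = """\
User-agent: GPTBot
Allow: /public/
Disallow: /private/
"""

# Policy 2: block the training crawler, allow the search crawler.
SEARCH_ONLY = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

print(can_crawl(FOLDER_RULES, "GPTBot", "/public/page.html"))
print(can_crawl(FOLDER_RULES, "GPTBot", "/private/page.html"))
print(can_crawl(SEARCH_ONLY, "GPTBot", "/blog/"))
print(can_crawl(SEARCH_ONLY, "OAI-SearchBot", "/blog/"))
```

A check like this only confirms what the file says. Whether a given bot honors it is up to the bot's operator.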

Within your robots.txt, you also need to account for the crawlers that Claude and Perplexity use:

Claude

  • ClaudeBot (Training)
  • Claude-User (Retrieval/Search)
  • Claude-SearchBot

Perplexity 

  • PerplexityBot (Crawler)
  • Perplexity-User (Searcher)

Adding to your agentic access is another new protocol — llms.txt, a markdown-based standard that provides a structured way for AI agents to access and understand your content.

While it’s not integrated into every agent’s algorithm or design, it’s a protocol worth paying attention to. For example, Perplexity publishes an llms.txt that you can use as a reference. You’ll come across two flavors of llms.txt:

  • llms.txt: A concise map of links.
  • llms-full.txt: An aggregate of text content that makes it so that agents don’t have to crawl your entire site.
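For the concise flavor, the file is a short markdown document. The sketch below generates one following the layout I understand the llms.txt proposal to use (a title, a one-line summary in a blockquote, then sections of links); the site name, URLs, and `build_llms_txt` helper are all hypothetical.

```python
def build_llms_txt(site_name, summary, sections):
    """Build llms.txt content: title, summary, then sections of links."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site structure for illustration only.
content = build_llms_txt(
    "Example Co",
    "Example Co sells project tracking software for small teams.",
    {
        "Docs": [
            ("Getting started", "https://example.com/docs/start.md", "Setup guide"),
            ("Pricing", "https://example.com/pricing.md", "Plans and costs"),
        ],
    },
)
print(content)
```

The output would be saved as /llms.txt at the site root. Agent support for the file varies, so treat it as a low-cost addition rather than a guarantee.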

Even if Google and other AI tools aren’t reading llms.txt, it’s worth adopting for future use. Google’s John Mueller has also weighed in on the topic.

Extractability: Making content ‘fragment-ready’

GEO focuses more on chunks of information, or fragments, to provide precise answers. Bloat hurts extractability, and AI retrieval tends to struggle with:

  • JavaScript execution.
  • Keyword-optimized content rather than entity-optimized content.
  • Weak content structures that fail to provide clear, concise answers.

You want your core content visible to users, bots, and agents. Achieving this goal is easier when you use semantic HTML, such as:

  • <article>
  • <section>
  • <aside>

The goal? Separate core facts from boilerplate content so your site shows up in answer blocks. Keep pages lean so AI agents can fit them into their context window without truncation. Creating content fragments will feed both search engines and agentic bots.
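To see why semantic HTML helps, consider how simply a parser can separate core content from boilerplate when the tags are in place. The sketch below uses Python's standard `html.parser` to keep text inside `<article>` and skip `<aside>`; the sample markup is invented, and real AI crawlers may extract content quite differently.

```python
from html.parser import HTMLParser


class ArticleText(HTMLParser):
    """Collect text inside <article>, skipping anything inside <aside>."""

    def __init__(self) -> None:
        super().__init__()
        self.article_depth = 0
        self.aside_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.article_depth += 1
        elif tag == "aside":
            self.aside_depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.article_depth:
            self.article_depth -= 1
        elif tag == "aside" and self.aside_depth:
            self.aside_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.article_depth and not self.aside_depth:
            self.chunks.append(text)


def extract_core_text(html: str) -> str:
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Invented sample page: navigation, footer, and aside are boilerplate.
SAMPLE = """
<nav>Home | Blog | Contact</nav>
<article>
  <h1>What is GEO?</h1>
  <section><p>GEO structures content so AI systems can extract it.</p></section>
  <aside>Subscribe to our newsletter.</aside>
</article>
<footer>Copyright notice</footer>
"""
print(extract_core_text(SAMPLE))
```

Without the `<article>` and `<aside>` boundaries, the same extractor would have no cheap way to tell the answer from the newsletter prompt.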

Dig deeper: How to chunk content and when it’s worth it

Structured data: The knowledge graph connective tissue

Schema.org has been a go-to for rich snippets, but it’s also evolving into a way to connect your entities online. What do I mean by this? In 2026, you can (and should) consider making these schemas a priority:

  • Organization and sameAs: A way to link your site to verified profiles of your entity, such as Wikipedia, LinkedIn, or Crunchbase.
  • FAQPage and HowTo: Low-hanging fruit for content you already have, such as your FAQs or how-to guides.
  • significantLink: A WebPage property that tells agents, “Hey, this link is an authoritative pillar of information.”
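As an illustration of the first item, this sketch builds Organization markup with sameAs links and wraps it in a JSON-LD script tag. The organization name and all URLs are placeholders; swap in the profiles that actually exist for your entity.

```python
import json


def organization_jsonld(name: str, url: str, same_as: list[str]) -> str:
    """Serialize schema.org Organization markup with sameAs profile links."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": same_as,
    }
    return json.dumps(data, indent=2)


# Placeholder entity and profile URLs, for illustration only.
markup = organization_jsonld(
    "Example Co",
    "https://example.com",
    [
        "https://www.linkedin.com/company/example-co",
        "https://www.crunchbase.com/organization/example-co",
    ],
)
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```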

Connecting information and data for agents makes it easier for your site or business to be presented on these platforms. Once you have the basics down, you can then focus on performance and freshness.

Performance and freshness: The latency of truth

AI is constantly scouring the internet to maintain a fresh dataset. If the information goes stale, the platform becomes less valuable to users, which is why retrieval-augmented generation (RAG) must become a focal point for you.

RAG allows AI models, like ChatGPT, to retrieve external content at runtime and inject it into the prompt as context for a response. You want your site to be part of an AI’s live search, which means following the recommendations from the previous sections. Additionally, focus on factors such as page speed, server response time, and errors.

In addition to RAG, add “last updated” signals to your content. A <time datetime=""> element is one way to achieve this, along with schema markup such as dateModified. These signals are critical for:

  • News queries.
  • Technical queries.
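Both freshness signals can be generated from a single timestamp so they never disagree. This is a minimal sketch; the visible "Last updated" wording and the choice of the Article type are assumptions.

```python
from datetime import datetime, timezone


def last_updated_signals(updated: datetime) -> tuple[str, dict]:
    """Return a <time> element and matching JSON-LD data for one timestamp."""
    iso = updated.isoformat()
    label = f"{updated:%B} {updated.day}, {updated.year}"
    html = f'<time datetime="{iso}">Last updated {label}</time>'
    jsonld = {
        "@context": "https://schema.org",
        "@type": "Article",  # assumed type; use whatever fits the page
        "dateModified": iso,
    }
    return html, jsonld


html, jsonld = last_updated_signals(datetime(2026, 1, 15, 9, 30, tzinfo=timezone.utc))
print(html)
print(jsonld["dateModified"])
```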

You can now start measuring your success through audits to see how your efforts are translating into real results for your clients.

Dig deeper: How to keep your content fresh in the age of AI

Measuring success: The GEO technical audit

You have everything in place and ready to go, but without audits, there’s no way to benchmark your success. A few audit areas to focus on are:

  • Citation share: Rankings still exist, but it’s time to focus on mentions as well. You can do this manually, but for larger sites you’ll want to use tools like Semrush.
  • Log file analysis: Are agents hitting your site? If so, which agents are where? You can do this through log analysis and even use AI to help parse all of the data for you.
  • The zero-click referral: Custom tracking parameters can help you identify traffic origins and “read more” links, but they only paint part of the picture. You also need to be aware that agents may append your parameters, which can impact your true referral figures.
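For the log file analysis item, a first pass can be as simple as counting which known AI user agents requested which paths. The sketch below assumes an access log in the common combined format; the sample lines and user-agent strings are invented, and matching on a substring of the whole line is a deliberate simplification.

```python
import re
from collections import Counter

# Crawler names from this article; extend the tuple as new agents appear.
AI_AGENTS = (
    "GPTBot", "OAI-SearchBot", "ClaudeBot", "Claude-User",
    "Claude-SearchBot", "PerplexityBot", "Perplexity-User",
)
# Matches the request part of a combined-format log line, e.g. "GET /x HTTP/1.1".
REQUEST = re.compile(r'"[A-Z]+ (\S+) HTTP/[\d.]+"')


def count_agent_hits(log_lines) -> Counter:
    """Count hits per (agent, path) for lines that mention a known AI agent."""
    hits = Counter()
    for line in log_lines:
        request = REQUEST.search(line)
        if not request:
            continue
        lowered = line.lower()
        for agent in AI_AGENTS:
            if agent.lower() in lowered:
                hits[(agent, request.group(1))] += 1
                break
    return hits


# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Jan/2026:10:00:00 +0000] "GET /public/guide HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '203.0.113.9 - - [10/Jan/2026:10:01:00 +0000] "GET /pricing HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '198.51.100.7 - - [10/Jan/2026:10:02:00 +0000] "GET /pricing HTTP/1.1" 200 900 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    '203.0.113.5 - - [10/Jan/2026:10:03:00 +0000] "GET /public/guide HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
]
for (agent, path), total in count_agent_hits(SAMPLE_LOG).most_common():
    print(f"{agent:16} {path:16} {total}")
```

User-agent strings can be spoofed, so counts like these are a starting point for an audit, not proof of which agent visited.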

Measuring success shows you the validity of your efforts and ensures you have KPIs you can share with clients or management.

Scaling GEO into 2027

Preparing your GEO strategy for 2027 requires changes in how you approach technical SEO, but it still builds on your current efforts. You’ll want to automate as much as you can, especially in a world with millions of custom GPTs.

Manual optimization? Ditch it for something that scales without requiring endless man-hours.

Technical SEO was long the core of ranking a site and ensuring you provided search bots and crawlers with an asset that was easy to crawl and index.

Now? It’s shifting.

Your site must become the de facto source of truth for the world’s models, and this is only possible by using the tools at your disposal.

Start with your robots.txt and work your way up to structure, fragmented data, and extractability. Audit your success over time and keep tweaking your efforts until you see positive results. Then, scale with automation.

The push layer returns: Why ‘publish and wait’ is half a strategy

In 1998, submitting a website to search engines was manual, methodical, and genuinely tedious. I remember 17 of them: AltaVista, Yahoo Directory, Excite, Infoseek, Lycos, WebCrawler, HotBot, Northern Light, Ask Jeeves, DMOZ, Snap, LookSmart, GoTo.com, AllTheWeb, Inktomi, iWon, and About.com.

Each had its own form, process, and wait time, and its own quiet judgment about whether your URL was worth including. We submitted manually, 18,000 pages in all. Yawn.

Google was barely a year old when we were doing this. But they were already building the thing that would make submission irrelevant.

PageRank meant Google followed links, and a site that other sites linked to would be found whether it submitted or not. The other 17 engines waited to be told about content. Google went looking, and within a few years, they got so good at finding content that manual submission became the exception rather than the norm.

You published, you waited, the bots arrived. For 20 years, that was the deal, and SEO optimized for a crawler that would show up sooner or later.

The irony is that we’re now shifting back. Not because Google got worse at finding things, but because the game has expanded in ways that pull alone can’t cover, and the revenue flowing through assistive and agentic channels doesn’t wait for a bot.

Your opportunities to skip gates

Pull isn’t the only entry mode

The pull model (bot discovers, selects, and fetches) remains the dominant entry mode for the web index. What’s changed is that pull is now one of five entry modes into the AI engine pipeline (the 10-gate sequence through which content passes before any AI system can recommend it), not the only one. 

The new modes sit alongside the existing model rather than replacing it: the single entry mode that was the norm for 20 years has become five.

What follows is my taxonomy of those five modes, with an explanation of the advantages each one gives you at the two gates that determine whether content can compete: indexing and annotation.

The five entry modes differ by gates skipped, signal preserved, and revenue reached

Mode 1: Pull model

Traditional crawl-based discovery where all 10 pipeline gates apply and the bot decides everything. You start at gate zero and have no structural advantage by the time your content gets to annotation (which is where that content starts to contribute to your AI assistive agent/engine strategy). You’re entirely dependent on the bot’s schedule and the quality of what it finds when it arrives.

Mode 2: Push discovery

The brand proactively notifies the system that content exists or has changed, through IndexNow or manual submission. 

Fabrice Canel built IndexNow at Bing for exactly this purpose: “IndexNow is all about knowing ‘now.’” It skips discovery, improves the chances of selection, and gets you straight to crawl. The content still needs to be crawled, rendered, and indexed, because IndexNow is a hint, not a guarantee. 

You win speed and priority queue position, which means your content is eligible for recommendation days or weeks earlier than a competitor who waited for the bot. In fast-moving categories, that window is the difference between being in the answer and being absent from it.
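For a sense of what push discovery looks like in practice, here is a sketch that prepares an IndexNow submission without sending it. The JSON fields (host, key, keyLocation, urlList) and the shared endpoint follow the public IndexNow documentation as I understand it; the host, key, and URLs are placeholders, and the key file must already be hosted on your site.

```python
import json
from urllib.request import Request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host: str, key: str, urls: list[str]) -> Request:
    """Prepare (but do not send) an IndexNow POST for changed URLs."""
    payload = {
        "host": host,
        "key": key,
        # Assumes the key file sits at the site root.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Placeholder host, key, and URL, for illustration only.
request = build_indexnow_request(
    "example.com",
    "0123456789abcdef0123456789abcdef",
    ["https://example.com/new-product"],
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` would submit it. As noted above, a successful submission is still a hint, not a guarantee of indexing.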

Note: WebMCP helps with Modes 1 and 2 by making crawling, rendering, and indexing more reliable, retaining signal and confidence that would otherwise be lost through those three gates. 

Because confidence is multiplicative across the pipeline, a higher passage rate at crawling, rendering, and indexing means your content arrives at annotation with significantly more surviving signal than a standard crawl delivers. The structural advantage compounds from there.
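The arithmetic behind "multiplicative" is worth seeing once. In this sketch the per-gate pass rates are made-up numbers chosen only to show the compounding effect; they are not measurements from any system.

```python
from math import prod


def surviving_signal(pass_rates: list[float]) -> float:
    """Multiply per-gate pass rates to get the share of signal that survives."""
    return prod(pass_rates)


# Hypothetical pass rates at crawling, rendering, and indexing.
standard_crawl = [0.8, 0.8, 0.8]
improved_crawl = [0.9, 0.9, 0.9]

standard = surviving_signal(standard_crawl)
improved = surviving_signal(improved_crawl)
print(f"standard: {standard:.3f}")
print(f"improved: {improved:.3f}")
print(f"relative gain: {improved / standard:.2f}x")
```

A modest improvement at each gate produces a larger gap by the time content reaches annotation, because every gate scales what the previous one passed on.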

Mode 3: Push data 

Structured data goes directly into the system’s index, bypassing the entire bot phase. Google Merchant Center pushes product data with GTINs, prices, availability, and structured attributes. OpenAI’s Product Feed Specification, which powers ChatGPT Shopping, supports 15-minute refresh cycles.

Discovery, selection, crawling, and rendering don’t exist for this content, and the “translation” at the indexing phase is seamless: it arrives at indexing already in machine-readable format, four gates skipped and one improved. That means the annotation advantage is significant.

This is where the money is for product-led businesses: crawled content arrives as unstructured prose the system has to interpret, whereas feed content arrives pre-labeled with explicit machine-readable entity type, category, and attributes. By structuring the data and injecting it directly into indexing, you’re solving a huge chunk of the classification problem at annotation, which, as you’ll see in the next article, is the single most important step in the 10-gate sequence.
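"Pre-labeled" only helps if the labels are valid, so it is worth checking records before they go into a feed. The sketch below validates a GTIN with the standard GS1 check-digit rule and emits a labeled record. The field names are hypothetical and do not match either Google's or OpenAI's feed specification.

```python
from dataclasses import dataclass, asdict


def gtin_is_valid(gtin: str) -> bool:
    """Validate a GTIN-8/12/13/14 using the GS1 mod-10 check digit."""
    if not (gtin.isascii() and gtin.isdigit()) or len(gtin) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in gtin]
    body, check = digits[:-1], digits[-1]
    # Weights alternate 3, 1, 3, ... starting from the rightmost body digit.
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


@dataclass
class FeedItem:
    # Hypothetical field names, for illustration only.
    gtin: str
    title: str
    price: str
    availability: str

    def to_record(self) -> dict:
        """Return the item as a labeled record, refusing invalid GTINs."""
        if not gtin_is_valid(self.gtin):
            raise ValueError(f"invalid GTIN: {self.gtin}")
        return asdict(self)


item = FeedItem("4006381333931", "Example pen", "2.49 EUR", "in_stock")
print(item.to_record())
```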

As the confidence pipeline shows, each gate that passes at higher confidence compounds multiplicatively, so this is where you can get the “3x surviving-signal advantage” I outline in “The five infrastructure gates behind crawl, render, and index.”

Mode 4: Push via MCP 

Model Context Protocol (MCP) — a standard that lets AI agents query a brand’s live data during response generation — allows agents to retrieve data from brand systems on demand. 
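Under the hood, an agent's query to an MCP server is a JSON-RPC 2.0 message. The sketch below builds a `tools/call` request in the shape the public MCP specification describes, as I understand it; the tool name `check_inventory` and its arguments are hypothetical, since each server defines its own tools.

```python
import json
from itertools import count

_request_ids = count(1)


def tools_call_request(tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 message asking an MCP server to run one tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# Hypothetical tool and arguments exposed by a brand's MCP server.
message = tools_call_request("check_inventory", {"sku": "PEN-001", "quantity": 2})
print(message)
```

In a real integration an MCP client library would handle transport and session setup; this only shows the kind of structured question your data needs to be ready to answer.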

In February 2026, four infrastructure companies shipped agent commerce systems simultaneously. Stripe, Coinbase, Cloudflare, and OpenAI collectively wired a real-time transactional layer into the agent pipeline, live with Etsy and 1 million Shopify merchants. 

Agentic commerce is key. MCP skips the entire DSCRI pipeline (discovery, selection, crawling, rendering, and indexing) and then operates at three levels, each entering the pipeline at a different gate:

  • As a data source at recruitment.
  • As a grounding source at grounding.
  • As an action capability at won, where the transaction completes without a human in the loop. 

The revenue consequences are already real: brands without MCP-ready data are losing transactions to those with it, because the agent can’t access their inventory, pricing, or availability in real time when it needs to make a decision. This is where you see multi-hundred percent gains in the surviving signal.

MCP is already simultaneously push and pull, depending on context. 

There’s a dimension to Mode 4 that most people don’t think about much: the agent querying your MCP connection isn’t always a Big Tech recommendation system. It’s increasingly the customer’s own AI, acting as their purchasing agent, evaluating your inventory and pricing in real time, with their credit card behind the query, completing the transaction without them opening a browser.

When your customer’s agent (let’s say OpenClaw-driven) comes knocking, agent-readable is the entry requirement. Agent-writable — the capacity for an agent to act, not just retrieve — is where you’ll make the conversion. The brands without writable infrastructure will be losing transactions to competitors whose systems answered the query and handled the action.

Mode 5: Ambient

This is structurally different from the other four. Where Modes 1 through 4 change how content enters the pipeline, ambient research changes what triggers execution of the final gates. 

The AI proactively pushes a recommendation into the user’s workflow without any query: Gemini suggesting a consultant in Google Sheets, a meeting summary in Microsoft Teams surfacing an expert, and autocomplete recommending your brand. 

Ambient is the reward for reaching recruitment with accumulated confidence high enough that the system fires the execution gates on the user’s behalf, without being asked. You can’t optimize for ambient directly. You earn it — and the brands that earn it capture the 95% of the market that isn’t actively searching.

Several people have told me my obsession with ambient is misplaced, theoretical, and not a real thing in 2026. I’ve experienced it myself already, but the clearest demonstration came at an Entrepreneurs’ Organization event where I was co-presenting with a French Microsoft AI specialist. 

He demonstrated on Teams an unprompted push recommendation: a provider identified as the best solution to a problem his team had been discussing in the meeting. Nobody explicitly asked. Copilot listened, understood the problem, evaluated options, and push-recommended a supplier right after the meeting. Ambient isn’t theoretical. It’s running on Teams, Gmail, and other tools we all use daily, right now.

Every mode converges at annotation

Five entry modes, each with a different starting point, and they all converge at annotation. Annotation is the key to the entire pipeline. No algorithm in the algorithmic trinity (LLM + knowledge graph + search) uses the content itself to recruit. Each uses the annotations on your chunked content, and nothing reaches a user without being recruited.

Why is that important? Because accurate, complete, and confident annotation drives recruitment, and recruitment is competitive regardless of how content entered. A product feed arriving at indexing with zero lost signal enters recruitment with a huge advantage, but it still competes against every crawled page, every other feed, and every MCP-connected competitor that entered by a different door.

You control more of this competition than most practitioners assume. Skipping gates gives you a structural advantage in surviving signal, but it doesn’t exempt you from the competition itself.

That distinction matters here because annotation sits at the boundary. It’s the last absolute gate: the system classifies your content based on your signals, independently of what any competitor has done. Nobody else’s data changes how your entity is annotated. That makes annotation the last moment in the pipeline where you have the field entirely to yourself.

From recruitment onward, everything is relative. The field opens, every brand that passed annotation enters the same competitive pool, and the advantage you carried through the absolute phase becomes your starting position in a winner-takes-all race. Get annotation right, and you have a significant head start. Get it wrong, and no matter how much work you do to improve recruitment, grounding, or display, it will not catch up, because the misclassification and loss of confidence compound through every gate downstream.

Nobody in the industry was talking about this in 2020. I started making the point then, after a conversation on the record with Canel, and it still isn’t getting the attention it deserves.

Annotation is the key

Annotation is your last chance before competition arrives.

Search is one of three ways users encounter brands — and it’s the least valuable

The research modes on the user’s side have expanded, too. The SEO industry has traditionally focused on just one: implicit, when the user types a query. There was always one more: explicit brand queries, and now we have a third. Each research mode is defined by who initiates and what the user already knows.

Explicit research is the deliberate query, where the user asks for a specific brand, person, or product, and the system returns a full entity response (the AI résumé that replaces the brand SERP). 

This mode demands the least algorithmic confidence of the three, because the user has already signaled very explicit intent: you’re only reaching people who already know your name. Bottom of the funnel, decision. Algorithmic confidence still matters here, to remove hedging (“they say on their website,” “they claim to be…”) and replace it with absolute enthusiasm (“world leader in…,” “renowned for…”).

Implicit research removes the explicit query. The AI introduces the brand as a recommendation (or advocates for you) within a broader answer, and the user discovers the brand because the system considers it relevant to the conversation, staking its own credibility on the inclusion. Top- and mid-funnel, awareness and consideration. Algorithmic confidence is vital here to beat the competition and get onto the list when a user asks “best X in Y market” or be cited when a user asks “explain topic X.”

Ambient research requires the highest confidence of all. The system pushes the brand into the user’s workflow with no query and no explicit request: the algorithm makes a unilateral decision that this user, in this context, at this moment, needs to see your brand. That requires very significant levels of algorithmic confidence.

The format is small: a sentence, a credential, a contextual mention. The audience reached is the largest: people not yet in-market, not yet actively looking, who encounter your brand because the AI decided they should. And the kicker is that your brand gets the sale before the competition even starts.

For me, this is the structural insight that inverts how most brands prioritize, and where the real money is hiding. They optimize for implicit research, where competition is highest, the target you need to hit is widest, and the work is hardest. 

Most SEOs underestimate explicit research (where profitability is highest) and completely ignore ambient, which reaches the 95% who aren’t yet looking and requires the deepest entity foundation to trigger. I call this the confidence inversion, first documented in May 2025: the smallest format requires the highest investment, and it reaches the most valuable audience.

How algorithmic confidence affects the three research modes in AI

The entity home website is the single source that feeds every mode

In 2019, AI engineers spent 80% to 90% of their time collecting, cleaning, and labeling data, and the remaining 10% to 20% on the work they actually wanted to do. They wryly called themselves data janitors. Today, Gartner estimates 60% of enterprises are still effectively stuck in the 2019 model, manually scrubbing data, while the companies that got organized early compound their advantage.

The same split is happening with brand content and entity management, for the same reason. Every push mode described in this article draws on data: product attributes for merchant feeds, structured entity data for MCP connections, and corroborated identity claims for ambient triggering. 

If that data lives in scattered, inconsistent, contradictory sources, every push attempt is expensive to implement, structurally weak on arrival, and liable to contradict the previous one. Inconsistency is the annotation killer: the system encounters two different versions of who you are from two different push moments, and confidence drops accordingly.

The framing gap, where your proof exists but the algorithm can’t connect it to a coherent entity model, is a direct consequence of disorganized data, and it costs you in recommendation frequency every day it persists.

The entity home website — the full site structured as an education hub for algorithms, bots, and humans simultaneously, built around entity pillar pages that declare specific identity facets — becomes the single source that feeds every mode simultaneously.

Pull, push discovery, push data, MCP, and ambient all draw from the same clean, consistent, non-contradictory data. You build the structure once, maintain it in one place, and you’re ready for push and pull modes today, and any to come that don’t yet exist.

Using your entity home website to feed the bots

AI handles 80%, humans protect the other 20%

That foundation is only as strong as the corrections made to it. How this works in practice depends on where you’re starting from. For enterprises, the website typically mirrors an internal data structure that already exists: 

  • Product catalogs. 
  • CRM records.
  • Service definitions.
  • Organizational hierarchies. 

The website becomes the public representation of structured data that lives inside the business, and the primary challenge is integration and maintenance.

For smaller businesses and personal brands, the direction often runs the other way: building the entity home website well is what forces you to figure out how your business is actually structured, what you genuinely offer, who you serve, and how everything connects. The website imposes discipline. 

We’re doing exactly this: centralizing everything as the structured data representation of the entire brand (personal or corporate). Getting the foundation right (who we are, what we offer, who we serve) is generally the heaviest lift. Building N-E-E-A-T-T credibility on top of that foundation is now comparatively straightforward, and every new push mode draws from the same organized source.

Here’s where using AI fits into this work. It can handle roughly 80% of the organization: extracting structure from existing content, proposing taxonomies, drafting entity descriptions, mapping relationships, and flagging gaps. What it does poorly, and what humans need to correct, are the three failure modes that propagate silently through every downstream gate:

  • Factual errors, where something is simply wrong.
  • Inaccuracies, where something is approximately right but imprecise enough to mislead.
  • Confusions, where two different concepts are conflated, or an entity is ambiguous between interpretations.

Confusion is the sneakiest because it looks like data, passes automated quality checks, enters the pipeline with apparent confidence, and then causes annotation to misclassify in ways that compound through every gate downstream.

Alongside the errors sit the missed opportunities, which are equally costly and considerably less obvious:

  • Lost N-E-E-A-T-T credibility opportunities, where the systems underestimate or undervalue the entity because credibility signals exist but aren’t structured, corroborated, or framed in a way the algorithmic trinity can read. The authority exists, but the machine doesn’t understand it.
  • Annotation misclassification, where the entity is indexed coherently but placed in the wrong category, meaning it competes for the wrong queries entirely and never appears in the contexts where it should win. Correctly classified competitors take the recommendation: your brand is present in the pipeline, but absent from the competition that matters to your business.
  • Untriggered deliverability, where understandability is solid and credibility has crossed the trust threshold, but topical authority signals haven’t accumulated densely enough to push the entity across the deliverability threshold for proactive recommendation. The machine knows who you are and trusts you. It just doesn’t advocate for you yet.

The human doing the correction and optimization work is the competitive advantage. Because the errors are surreptitious and the opportunities non-obvious, the trick is finding where both actually are, fixing one, and acting on the other.

The errors are surreptitious. The opportunities are non-obvious. Finding both is the work that compounds.

Organize once, feed every mode that exists and every mode to come

The push layer is expanding. The brands that organize their data now — not perfectly, but consistently, and with a system for maintaining it — are building the infrastructure from which every current and future entry mode draws.

The brands still publishing and waiting for the bot (Mode 1) are optimizing for the least advantageous mode in a five-mode landscape, and that disadvantage gap widens with every passing cycle.

This is the seventh piece in my AI authority series. 
