Intro: The Operating System

The Marketing OS

A working framework for running marketing when most of the execution is done by AI agents. Built from a year of working on it in the open, six figures of pipeline through a small studio, a Lovable hackathon win against 5,800 builders, and reading roughly four hundred primary sources so that future readers don't have to. Released CC-BY 4.0 because I'd rather it be useful than mine.

The OS is meant to be readable by people who don't work in marketing. The interesting part of modern marketing isn't the marketing, it's the operating model, and that model travels into any function where one person has to direct a fleet of agents instead of a team of humans.


How to read this

The OS is split across thirteen files. This intro is the orientation. Eight domain files cover the actual workstreams. AgentOps is the cross-cutting plumbing that makes the rest trustworthy. There's a deeper file on the six agent shapes, a resource library if you want to go further, and a research plan documenting what's deferred to the next version.

Read the intro front to back if you want the full picture. Drop into a domain file when that workstream is the bottleneck. Update them as the field shifts, because the field is shifting fast.


What changed, and what didn't

The work itself didn't change as much as it looks. You're still doing positioning, content, distribution, lead generation, measurement. You're still trying to find people who'd buy from you and make sure they hear from you at the right moment, in the right way, in language they recognize.

What changed is who does the doing. Most of the execution layer now runs on AI agents. A single operator can produce more than a six-person team produced in 2022. Production stopped being the bottleneck. The new bottleneck is judgment, taste, and the discipline to delete 80% of what the agents make.

That discipline is what the third layer of the operating model exists to enforce.


The three layers

The work decomposes into three layers, distinguished by who leads and how fast the cadence runs.

Strategy is positioning, narrative, ideal customer profile, value proposition, and pricing. Humans decide here; agents draft, stress-test, and audit for consistency. The cadence is roughly quarterly, with weekly recalibration. The fundamental rule is that agents should never make material strategy decisions: they can synthesize a hundred competitor moves and pressure-test a narrative against simulated buyers, but the call about what to be in the market is human work.

Execution is the daily output of the function: content production, distribution, optimization, and outreach. Humans set the boundaries; agents work within them on a continuous cadence, and this is where the leverage lives. McKinsey estimates agentic systems accelerate campaign creation roughly 10 to 15 times compared to traditional workflows, and that organizations doing this well see 10 to 30 percent revenue growth from hyper-personalized marketing. But velocity without governance produces brand damage at scale, which is why the third layer exists.

Operations, or AgentOps, is the substrate underneath: observability, evaluation, governance, drift detection, cost control, and prompt versioning. It plays the same role DevOps played for software in the 2010s and MLOps played for machine learning models more recently. Without this layer, the other two break silently. With it, autonomy becomes observable and controllable.


The eight workstreams

The execution layer decomposes into eight functional domains. Each is its own vertical workstream with its own cadence, its own KPI, and its own set of agents. The horizontal layer running across all of them is AgentOps.

| # | Domain | What it does | Cadence |
|---|--------|--------------|---------|
| 1 | Sensing & Intelligence | Convert signals from the outside world (accounts, competitors, communities) into structured feeds the rest of the system reasons against. | Continuous |
| 2 | Strategy & Positioning | Decide what to be in the market. The most agent-resistant domain. | Quarterly + weekly recalibration |
| 3 | Content & Creative Production | Make the things. Highest velocity, highest brand risk. | Daily |
| 4 | Distribution & Channel Operations | Get the things in front of people, including the 83% of B2B buying time that happens in private channels (Slack, podcasts, DMs) where traditional analytics can't see. | Continuous |
| 5 | AI Search & Answer Visibility | Be cited when someone asks ChatGPT, Claude, Perplexity, or Gemini about your category. New domain, structurally different mechanics from regular search. | Weekly |
| 6 | Demand & Conversational Pipeline | Turn attention into revenue. The conversion engine: lead capture, qualification, routing, sales handoff. | Continuous |
| 7 | Customer Intelligence & Synthetic Testing | Talk to actual customers, model them, and pre-test concepts on synthetic versions of them before committing budget. | Per-launch + ongoing |
| 8 | Measurement, Attribution & Closed-Loop Learning | Know what's actually working and feed that back into everything else. Without this, the rest is theater. | Continuous |

Each domain has its own file: definition, why it matters, sub-domains, current best practices, the tools that win, the practitioners worth following, and the failure modes that swallow most teams.
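
If you want the eight workstreams in a form an agent or a script can reason against, a small registry is enough. The sketch below is one possible encoding, not part of the OS itself: the `Domain` class and the `domains_with_cadence` helper are names made up for illustration, while the domain names and cadences are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """One execution-layer workstream (hypothetical structure)."""
    number: int
    name: str
    cadence: str


# Names and cadences as listed in the workstream table.
DOMAINS = (
    Domain(1, "Sensing & Intelligence", "Continuous"),
    Domain(2, "Strategy & Positioning", "Quarterly + weekly recalibration"),
    Domain(3, "Content & Creative Production", "Daily"),
    Domain(4, "Distribution & Channel Operations", "Continuous"),
    Domain(5, "AI Search & Answer Visibility", "Weekly"),
    Domain(6, "Demand & Conversational Pipeline", "Continuous"),
    Domain(7, "Customer Intelligence & Synthetic Testing", "Per-launch + ongoing"),
    Domain(8, "Measurement, Attribution & Closed-Loop Learning", "Continuous"),
)


def domains_with_cadence(cadence: str) -> list[str]:
    """Return the names of all domains that run on the given cadence."""
    return [d.name for d in DOMAINS if d.cadence == cadence]


if __name__ == "__main__":
    for name in domains_with_cadence("Continuous"):
        print(name)
```

Grouping by cadence is a quick way to see which workstreams need always-on monitoring and which can be reviewed on a schedule.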


The six agent shapes

McKinsey identified six functional patterns that recur across every marketing agent build. The trick isn't to pick one, it's to build agents around the shape and reuse them across domains.

  1. Content Generator produces text, images, video, and code variants.
  2. Knowledge retrieves, synthesizes, and cites from internal and external sources.
  3. Localization adapts content across languages, regions, segments, and regulatory contexts.
  4. Analyzer examines data and produces insight.
  5. Planner decomposes a goal into a sequence of actions.
  6. Operator executes the plan, calling APIs, shipping content, booking meetings, adjusting spend.

A consumer brand reportedly identified almost a hundred individual modular agents within the Content Generator shape alone, reusable across creative production, sales collateral, e-commerce optimization, and co-marketing, which is the point of organizing around shapes in the first place.
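
Organizing around shapes can be made concrete with a tagged inventory. This is a minimal sketch under stated assumptions: the six shape names come from the list above, but the `Agent` record, the three example agents, and their domain assignments are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class AgentShape(Enum):
    """The six shapes named in the list above."""
    CONTENT_GENERATOR = "Content Generator"
    KNOWLEDGE = "Knowledge"
    LOCALIZATION = "Localization"
    ANALYZER = "Analyzer"
    PLANNER = "Planner"
    OPERATOR = "Operator"


@dataclass(frozen=True)
class Agent:
    name: str                 # hypothetical agent name
    shape: AgentShape
    domains: tuple[int, ...]  # workstream numbers where the agent is used


# Illustrative inventory only; none of these agents come from the source.
AGENTS = [
    Agent("headline-variants", AgentShape.CONTENT_GENERATOR, (3, 4)),
    Agent("competitor-brief", AgentShape.KNOWLEDGE, (1,)),
    Agent("spend-adjuster", AgentShape.OPERATOR, (4, 6)),
]


def group_by_shape(agents):
    """Map each shape to the names of the agents built around it."""
    grouped = defaultdict(list)
    for agent in agents:
        grouped[agent.shape].append(agent.name)
    return dict(grouped)


def reused_agents(agents):
    """Names of agents that serve more than one workstream."""
    return [a.name for a in agents if len(a.domains) > 1]
```

An inventory like this makes reuse visible: an agent that appears in only one domain is a candidate for generalizing, and a shape with no agents is a gap.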


The maturity model

Different teams are at different stages. Be honest about where you are.

Crawl: Reactive Marketing. Manual content production, slow inbound response, no observability, last-touch attribution in Google Analytics, ChatGPT in browser tabs that nobody version-controls. Most leads are ignored or mishandled. The right move at this stage is to solidify positioning, get one channel working consistently, and not deploy agents yet, because agents amplify your inputs and the inputs aren't ready.

Walk: Automated Inbound. A marketing automation platform handles email and lead routing, with some content cadence, form-based lead capture, basic lead scoring, and sales development reps doing most outbound manually. AI shows up as copilots rather than autonomous agents. Right move: get measurement infrastructure right, pilot one autonomous agent in a low-risk area, and build the Brand Governance Agent before scaling content.

Run: Multi-Agent Execution. Multiple agents in production with observability. Conversational AI handles inbound qualification. Outreach is multi-channel and context-aware. Measurement is triangulated, combining marketing mix modeling, incrementality testing, and attribution. Real-time signal monitoring across hundreds of accounts. Right move: push deeper into reusability, audit which workflows are still human-led that should be agent-led, invest in synthetic testing for pre-launch validation.

Fly: Fully Integrated Agentic GTM. The agentic stack drives top-line growth measurably. Sales, marketing, and customer success share an agent fleet against unified data. Synthetic audience testing is routine pre-launch, real-time MMM updates daily decisions, and one operator manages the equivalent output of a 20-person team. Demandbase reportedly doubled pipeline running this stack; Greenhouse hit 50% chat-to-meeting conversion, and Crunchbase ran 67,000 AI conversations through it.

To know where you stand, the honest test is to ask whether you can answer "what is our brand voice?" with a written, version-controlled document an LLM can follow. Then ask whether you have observability into every public-facing AI output, a defensible attribution model that survives a CFO audit, multi-source signal monitoring on at least a hundred accounts, and a Brand Governance Agent your team trusts. If you answered "no" to most of those, you're at Crawl, regardless of what your stack looks like.
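
The self-test above reduces to five yes/no checks. Here is a rough sketch of it as a function. Two assumptions: "most" is read as a strict majority (three or more "no" answers out of five), and the function name and check labels are invented for this example. The test only tells you whether you are at Crawl; it does not place you among Walk, Run, and Fly.

```python
# The five checks from the paragraph above, paraphrased as labels.
CHECKS = (
    "brand voice is a written, version-controlled document an LLM can follow",
    "observability into every public-facing AI output",
    "defensible attribution model that survives a CFO audit",
    "multi-source signal monitoring on at least a hundred accounts",
    "Brand Governance Agent the team trusts",
)


def is_crawl(answers: dict[str, bool]) -> bool:
    """Return True when most checks are answered 'no'.

    'Most' is interpreted as a strict majority, which is an assumption.
    """
    missing = [check for check in CHECKS if check not in answers]
    if missing:
        raise ValueError(f"unanswered checks: {missing}")
    no_count = sum(1 for check in CHECKS if not answers[check])
    return no_count > len(CHECKS) / 2


if __name__ == "__main__":
    # Example: only the brand voice document exists.
    example = {check: False for check in CHECKS}
    example[CHECKS[0]] = True
    print(is_crawl(example))
```

Requiring an explicit answer for every check is deliberate: an unanswered question is usually a "no" nobody wanted to write down.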


Who does what

In an agentic stack, the human roles don't disappear. They change.

The supervisors are broad generalists fluent in AI, orchestrating agents and the hybrid workforce across domains. They own strategy, system design, cross-domain orchestration, hiring, vendor selection, and the relationship layer that doesn't agent well: analysts, journalists, partners, key customers. VPs of marketing, fractional CMOs, founder-operators all sit here. The Team-of-One executive in a BenchSci-style mandate is one of these.

The deep specialists are people who can do one of the eight domains very well, and who handle the exceptions agents fail on: a Director of Content, a Head of Demand Gen, a Senior Brand Strategist. In a Team-of-One structure these often show up as fractional or contract specialists for high-stakes domains, such as scientific content for a biotech or regulatory compliance for a fintech.

The frontline is the customer-facing layer (sales, customer success, support) where humans handle the moments that humans handle better: executive briefings, key customer meetings, edge-case escalation.

For a true Team-of-One operator, you're playing all three roles simultaneously. Agents cover the execution. You own strategy, narrative, brand judgment, and the brand-integrity call: agents will produce ten times the volume, and 80% of it should die. If you reach a point where no agent can credibly write about CRISPR base editing for an immunology audience, you hire a director of scientific content. The CEO's philosophy is talent density, not headcount minimization. Read that distinction correctly.


What's still moving

A few open questions the whole field is figuring out together. Each one matters because the answer changes how you operate this quarter.

Agent-to-agent commerce. As of mid-2026, Anthropic's Model Context Protocol has moved to the Linux Foundation, Google has launched the Universal Commerce Protocol, and OpenAI and Stripe have shipped agent-native checkout, all stacking together rather than competing. The practical move today is to stand up a read-only MCP server exposing your product catalog, pricing, availability, and policies, even if you don't sell direct yet. This is the new robots.txt: you publish or you're invisible to agents.
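
Before choosing any server framework, it helps to settle what the read-only payload looks like. The sketch below is not an MCP server and uses no MCP library. It only shows one possible shape for the data such a server might expose, using the standard library. Every field name, SKU, price, and policy string is a placeholder.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)  # frozen: items cannot be mutated after creation
class CatalogItem:
    sku: str
    name: str
    price_usd: float
    available: bool


# Placeholder catalog and policies, purely illustrative.
CATALOG = (
    CatalogItem("starter", "Starter plan", 49.0, True),
    CatalogItem("team", "Team plan", 199.0, False),
)
POLICIES = {
    "refunds": "30 days, no questions asked",
    "support": "email, business hours",
}


def catalog_document() -> str:
    """Serialize catalog, pricing, availability, and policies as JSON."""
    payload = {
        "items": [asdict(item) for item in CATALOG],
        "policies": POLICIES,
    }
    return json.dumps(payload, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(catalog_document())
```

Keeping the payload as plain, stable JSON means the same document can sit behind an MCP resource, a static file, or an API endpoint without changing shape.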

Brand-to-LLM communication. PR Newswire launched an AEO/GEO Brand Report. OpenAI started testing ads in ChatGPT Free in January 2026. The practical move is to audit your top fifty commercial-intent prompts in ChatGPT, Claude, Perplexity, and Gemini monthly, document the entities and sources cited, and optimize the cited sources, not just your own site. That's where the lever sits.
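
The monthly audit is mostly bookkeeping: for each prompt and engine, record which sources were cited, then count. A minimal sketch follows, assuming you log observations by hand or with your own tooling. The observation rows, the `.example` domains, and the `top_cited_sources` helper are all invented for illustration, and no engine API is called.

```python
from collections import Counter

# Hypothetical audit log: (engine, prompt, cited source domain).
OBSERVATIONS = [
    ("ChatGPT", "best tool for category X", "reviews.example"),
    ("Claude", "best tool for category X", "reviews.example"),
    ("Perplexity", "best tool for category X", "forum.example"),
    ("Gemini", "best tool for category X", "yourbrand.example"),
    ("ChatGPT", "category X pricing comparison", "forum.example"),
    ("Perplexity", "category X pricing comparison", "reviews.example"),
]


def top_cited_sources(observations, own_domain, limit=3):
    """Rank third-party sources by how often the engines cite them.

    The operator's own domain is excluded, because the point of the audit
    is to find the cited sources other than your own site.
    """
    counts = Counter(
        source for _engine, _prompt, source in observations
        if source != own_domain
    )
    return counts.most_common(limit)


if __name__ == "__main__":
    for source, hits in top_cited_sources(OBSERVATIONS, "yourbrand.example"):
        print(source, hits)
```

The ranked list is the work queue: the sources at the top are the ones worth earning a mention on.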

The ghost workforce. A Content Marketing Institute study found 76% of marketers doing the work of more than one job, while only 11% of organizations actually replaced workers with AI. Stanford reported a 16% relative employment decline for ages 22 to 25, while experienced workers held stable. Junior tasks were the apprenticeship; cutting them creates a senior-shortage problem in three to five years. The practical move is to publish an explicit AI velocity ceiling for your team and hire one paid intern per quarter as a deliberate pipeline investment.

Buyers using AI agents. Two-thirds of B2B buyers now use AI agents for vendor research, with 45% using AI as their primary method for identifying new suppliers. Gartner forecasts 90% of B2B purchases will be mediated by AI agents by 2028, roughly $15T in spend. The practical move is small: ask Claude, ChatGPT, and Perplexity to "shortlist three vendors for [your category]." Whatever they cite is your real competitive set. Optimize there with structured data, comparison content, third-party reviews, and a public capability matrix that an agent can parse without rendering JavaScript.

Accountability and insurance. EU AI Act high-risk obligations become enforceable August 2, 2026. Moffatt v. Air Canada is now the global "you own what your bot says" precedent. Insurance moved fast: HSB launched AI liability insurance for SMBs in March 2026. But Verisk released endorsements allowing carriers to exclude generative-AI claims from commercial general liability policies. The practical move is to audit every customer-facing AI surface, what it can promise, and whether your insurance still covers it, because the AI exclusion may have appeared at last renewal without anyone noticing.

The creative quality ceiling. AI-generated video still degrades after 20 to 25 seconds, struggles with emotional dialogue, and garbles on-screen text. Hybrid AI plus human delivers roughly five times faster at 40 percent lower cost, but pure AI hits a quality ceiling for top-tier brand work where the audience is meant to believe a human cared. The practical move is to define your "human-required floor" explicitly: which assets must have a named human author? CEO posts, investor letters, founder essays, brand films. Below that floor, AI-led production is fine. Document it, because your team will keep crossing it accidentally without an explicit line.
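
An explicit line is easier to hold when it is written as a rule a publishing workflow can check. The sketch below is one hedged way to encode the human-required floor. The four asset types come from the paragraph above; the identifiers, the function name, and the idea of gating on a named author field are assumptions.

```python
from typing import Optional

# Asset types that must carry a named human author (from the list above).
HUMAN_REQUIRED = frozenset({
    "ceo_post",
    "investor_letter",
    "founder_essay",
    "brand_film",
})


def may_publish(asset_type: str, human_author: Optional[str]) -> bool:
    """Allow publication unless a human-required asset lacks a named author."""
    if asset_type in HUMAN_REQUIRED:
        # Blank or whitespace-only names do not count as a named author.
        return bool(human_author and human_author.strip())
    # Below the floor, AI-led production is fine.
    return True


if __name__ == "__main__":
    print(may_publish("founder_essay", None))
    print(may_publish("product_description", None))
```

Keeping the set in one place means changing the floor is a one-line edit that shows up in version control, which is the documentation the paragraph asks for.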

A few more frontiers (saturation in the model layer, sentiment risk in AI search, Wikipedia editorial wars in the AI age) sit further out and update less often. They're covered in the domain files where they show up.


How to get good at this

Three things separate people who run agentic marketing from people who follow the field.

Operational depth in two or three domains. You can't be deeply expert in all eight. Pick the two or three that matter most for what you're doing, and be operationally fluent in the others. For most operators that's some combination of Content (Domain 3), AI Search (Domain 5), and one of Demand (Domain 6) or Measurement (Domain 8).

Personal experimentation, not just consumption. Reading about agentic stacks doesn't make you good at them; building one does. Build a Brand Governance Agent, run a synthetic audience test, set up triangulated measurement on a real client. The framework in this document is a map; the territory is where the learning happens.

Public synthesis. Pick a sub-domain, go ten times deeper than what's here, and publish what you find. Other people will find you when they search for those terms. The act of teaching forces clarity, which is the same loop that produced this document.

The compounding loop is the simple version: build, learn what doesn't work, write up what you learned, get feedback, build the next thing better, eventually become someone people quote. This document is the floor of the knowledge, not the ceiling. The ceiling moves up every quarter.


v3, updated 2026-Q2.


Frequently asked questions

What is the Marketing OS?

A structured framework for running an AI-native marketing function. Three layers (strategy, execution, operations), eight functional domains (sensing, strategy, content, distribution, AI search, demand, customer intelligence, measurement), and six reusable agent shapes. AgentOps is the substrate that makes the rest trustworthy at scale.

Who is this built for?

Operators building or running an AI-native marketing function. The Team-of-One framing fits founder-operators, fractional CMOs, and VPs of marketing in early-to-mid-stage companies who direct an agent fleet rather than manage a large team. The framework is product-agnostic, with industry overlays for B2B SaaS, biopharma, consumer DTC, and developer tools.

How is this different from a generic AI marketing playbook?

Three things. First, primary-source rigor: every statistic traces to a named source with a date. Second, named case studies with before/after numbers (Anthropic, Vercel, SaaStr, Refine Labs, CVS Health, Brand.ai). Third, an interactive maturity matrix that places your team in 32 cells (eight domains by four stages), each cell with its own tools list, case study, advance criterion, and pitfall.
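
The 32-cell count is just eight domains crossed with four stages. As a rough sketch of how such a matrix could be laid out, the code below builds the empty grid. The short domain labels and the four per-cell fields follow the descriptions in this document, but the dictionary layout is an assumption and the cell contents are left blank rather than invented.

```python
from itertools import product

DOMAINS = (
    "sensing", "strategy", "content", "distribution",
    "AI search", "demand", "customer intelligence", "measurement",
)
STAGES = ("Crawl", "Walk", "Run", "Fly")

# Each cell holds the four fields the matrix is described as having.
EMPTY_CELL = {"tools": [], "case_study": None, "advance_criterion": None, "pitfall": None}

# Copy the template per cell so that cells never share the same list object.
MATRIX = {
    (domain, stage): {**EMPTY_CELL, "tools": []}
    for domain, stage in product(DOMAINS, STAGES)
}

if __name__ == "__main__":
    print(len(MATRIX))
```

Keying cells by a (domain, stage) pair keeps lookups direct: placing a team is one key per domain.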

How often is this updated?

v3 shipped April 2026. The research-plan file documents what's deferred to v4 plus a quarterly stat-refresh cadence. The frontiers section gets the most active updates because the field moves fastest there: agent-to-agent commerce protocols, EU AI Act enforcement, Wikipedia editorial decisions, OpenAI ad placements, FDA digital-twin guidance.

Can I use this for my own work?

Yes. It's released under CC-BY 4.0, which means you can quote it, fork it, build on it, and ship it, with attribution. Send a note if it ends up useful (mahmoudhalat.com).
