How We Built the First AI Agent That Generates Growth Tickets
Most AI marketing tools are chatbots with analytics access.
You ask a question. AI answers with data.
Maybe generates a chart.
That's not an agent.
That's a smart search interface.
An agent is different.
An agent:
- Monitors continuously
- Identifies opportunities autonomously
- Generates specific action items
- Prioritizes by impact
- Explains reasoning
No human prompt required.
That's what we built at Cogny.
And it's harder than it looks.
Let me show you why.
---
The Problem with Marketing AI Today
Most "AI marketing tools" work like this:
- Connect data sources
- Build dashboards
- Add ChatGPT integration
- User asks: "Why did ROAS drop?"
- AI reads data and answers
Useful? Sure.
Revolutionary? No.
Why not:
You still need to know what questions to ask.
Real marketers don't have time for 20 questions.
They want: "Here's what's wrong." "Here's what to do." "Here's the expected impact."
Proactive. Not reactive.
That requires a different architecture.
---
What Makes an Agent Different
Traditional AI assistant:
- Waits for questions
- Responds when prompted
- No memory between sessions
- No initiative
AI agent:
- Monitors continuously
- Identifies issues autonomously
- Remembers context
- Takes initiative
The technical challenge:
How do you build AI that knows what to look for?
You can't just:
- Point AI at data
- Say "find problems"
- Hope it works
Doesn't work.
AI needs:
- Clear objectives
- Evaluation criteria
- Pattern templates
- Domain knowledge
- Decision frameworks
That's the hard part.
---
Our Architecture (Simplified)
Layer 1: Data Ingestion
Connect to marketing platforms:
- Google Ads API
- Meta Marketing API
- GA4 via BigQuery
- Custom data sources
Challenge: Each platform structures data differently.
Google Ads: campaigns → ad groups → ads → keywords
Meta: campaigns → ad sets → ads
GA4: events → users → sessions
Solution: Unified data model that normalizes across platforms.
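To make that concrete, here's a minimal sketch of what a normalized model can look like. Field names are illustrative, not our actual schema; the Google Ads keys follow the API's report naming conventions:

```python
# Minimal sketch of a unified data model (illustrative, not the
# production schema). Each platform's hierarchy is mapped into the
# same flat structure before analysis.
from dataclasses import dataclass


@dataclass
class AdEntity:
    platform: str        # "google_ads", "meta", "ga4"
    campaign_id: str
    group_id: str        # ad group (Google) or ad set (Meta)
    entity_id: str       # ad or keyword
    entity_type: str     # "ad", "keyword", "placement"
    spend: float
    clicks: int
    impressions: int
    conversions: float


def from_google_keyword(row: dict) -> AdEntity:
    """Map one Google Ads keyword report row into the unified model."""
    return AdEntity(
        platform="google_ads",
        campaign_id=str(row["campaign.id"]),
        group_id=str(row["ad_group.id"]),
        entity_id=str(row["ad_group_criterion.criterion_id"]),
        entity_type="keyword",
        spend=row["metrics.cost_micros"] / 1_000_000,  # micros -> currency
        clicks=row["metrics.clicks"],
        impressions=row["metrics.impressions"],
        conversions=row["metrics.conversions"],
    )
```

Once every platform maps into `AdEntity`, the analysis layer never has to care where a row came from.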
Layer 2: Continuous Analysis Engine
This is where most tools stop.
They show you the data.
We go further:
Run 40+ analysis algorithms continuously:
- Conversion rate variance detection
- Budget allocation efficiency
- Audience overlap analysis
- Creative fatigue patterns
- Geographic performance
- Time-based optimization
- Keyword effectiveness
- Placement efficiency
- Device performance
- And 30+ more
Each algorithm:
- Scans all data every 6 hours
- Compares to baselines
- Identifies anomalies
- Calculates statistical significance
- Estimates impact
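In code terms, each algorithm follows roughly this shape. A simplified sketch with made-up thresholds, not the production logic:

```python
# Sketch of one analysis pass: compare the recent window to a trailing
# baseline and emit a candidate issue only when the deviation is large.
# Thresholds here are illustrative.
import statistics


def detect_conversion_rate_drop(daily_rates: list[float],
                                window: int = 7) -> dict | None:
    """Flag when the recent window underperforms the prior baseline."""
    if len(daily_rates) < 2 * window:
        return None  # not enough history to form a baseline
    baseline = daily_rates[:-window]
    recent = daily_rates[-window:]
    base_mean = statistics.mean(baseline)
    base_sd = statistics.stdev(baseline)
    if base_sd == 0:
        return None
    z = (statistics.mean(recent) - base_mean) / base_sd
    if z < -2:  # roughly two standard deviations below baseline
        return {
            "metric": "conversion_rate",
            "baseline": base_mean,
            "recent": statistics.mean(recent),
            "z_score": z,
        }
    return None
```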
Layer 3: Opportunity Scoring
Found 200 potential issues.
Which matter?
Scoring algorithm considers:
- Monetary impact (€ saved or gained)
- Statistical confidence (is this real?)
- Implementation ease (how hard to fix?)
- Time sensitivity (how urgent?)
- Historical success rate (did similar tickets work before?)
Output: Ranked list of opportunities.
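A stripped-down version of the idea, as shown below. The weights are invented for illustration; the real ones are tuned against historical ticket outcomes:

```python
# Illustrative scoring function. Monetary impact dominates; the other
# signals scale it up or down. Weights here are made up for the sketch.
def score_opportunity(monetary_impact: float,    # expected EUR/month
                      confidence: float,         # 0..1, statistical confidence
                      ease: float,               # 0..1, 1 = one-click fix
                      urgency: float,            # 0..1, 1 = act today
                      historical_success: float  # 0..1, hit rate of similar tickets
                      ) -> float:
    """Combine the five signals into a single ranking score."""
    return (monetary_impact
            * confidence
            * (0.5 + 0.5 * ease)
            * (0.5 + 0.5 * urgency)
            * (0.5 + 0.5 * historical_success))


# Rank all candidates; only the top of the list becomes tickets.
# ranked = sorted(issues, key=lambda i: score_opportunity(**i.signals), reverse=True)
```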
Layer 4: Ticket Generation
For each high-score opportunity:
AI generates a "growth ticket":
- Issue: What's wrong (specific)
- Impact: Expected result in € or %
- Action: Exactly what to do
- Reasoning: Why AI recommends this
- Confidence: How sure we are
Example ticket:
ISSUE:
47 keywords in "Healthcare Services" campaign have spent €3,200
over last 90 days with zero attributed conversions.
IMPACT:
Pause these keywords → Save €1,067/month (~€12,800/year)
ACTION:
1. Review keyword list (attached)
2. Pause in Google Ads
3. Or add to excluded keywords
REASONING:
These keywords attract clicks (3.2% CTR) but never convert.
Likely wrong intent match. Spending budget that could go to
proven keywords.
CONFIDENCE: 95%
(Sufficient data: 90 days, 1,847 clicks, statistically significant)
This is actionable.
Not "here's some data." But "do this, expect this."
Layer 5: Learning Loop
When user implements ticket:
- Track outcome
- Compare to prediction
- Update models
- Improve future recommendations
AI gets smarter over time.
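Conceptually, the feedback step looks something like this. Names and the 50% hit threshold are illustrative:

```python
# Sketch of the learning loop: compare realized impact to the ticket's
# prediction and fold the result back into the per-type success rate
# that the opportunity scorer uses.
from dataclasses import dataclass


@dataclass
class TicketOutcome:
    ticket_type: str          # e.g. "zero_conversion_keywords"
    predicted_impact: float   # EUR/month promised in the ticket
    realized_impact: float    # EUR/month measured after implementation


def update_success_rate(prior_rate: float, n_prior: int,
                        outcome: TicketOutcome) -> float:
    """Running success rate per ticket type. A 'hit' is a realized
    impact within 50% of the prediction (threshold is illustrative)."""
    hit = outcome.realized_impact >= 0.5 * outcome.predicted_impact
    return (prior_rate * n_prior + (1.0 if hit else 0.0)) / (n_prior + 1)
```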
---
The LLM Decision
We use Claude (Anthropic) as core AI.
Not GPT-4.
Why?
1. Context Window
Claude handles 200K tokens.
That's:
- Entire campaign data
- Historical performance
- All tickets generated
- User feedback
- Domain knowledge
In one context.
GPT-4's smaller window forced us to chunk the data, which lost connections between data points.
2. Reasoning Quality
We tested both on marketing analysis.
Example task: "Find wasted spend in this Google Ads account"
GPT-4:
- Found obvious issues (zero-conversion keywords)
- Missed subtle patterns (audience overlap, time-of-day inefficiency)

Claude:
- Found obvious issues AND subtle patterns
- Better at multi-step reasoning
- More nuanced understanding of marketing context
For our use case: Claude wins.
3. Instruction Following
AI agent needs to follow complex instructions:
- Analyze data
- Apply marketing principles
- Consider business context
- Generate specific recommendations
- Format as structured tickets
Claude excels at instruction following.
4. Cost Efficiency
Claude's pricing is competitive with GPT-4's. But the results are better. So it's effectively cheaper per unit of quality output.
---
Technical Challenges We Solved
Challenge 1: Real-Time vs Batch Processing
Problem:
Marketing data changes constantly.
Run analysis every minute? Expensive, unnecessary.
Run once per day? Miss urgent issues.
Solution:
Hybrid approach:
- Batch analysis every 6 hours (comprehensive)
- Real-time monitors for critical alerts (spend spikes, tracking breaks)
- Event-triggered analysis (campaign launches, budget changes)
Result: Balance between coverage and cost.
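With Celery (which we use for task queuing; see the stack below), the schedule can be sketched like this. Task names are illustrative:

```python
# Minimal sketch of the hybrid schedule using Celery beat.
# Comprehensive analysis runs every 6 hours; cheap critical-alert
# checks run every few minutes.
from celery import Celery
from celery.schedules import crontab

app = Celery("analysis", broker="redis://localhost:6379/0")

app.conf.beat_schedule = {
    "comprehensive-analysis": {
        "task": "analysis.run_all_algorithms",      # the 40+ algorithm batch
        "schedule": crontab(minute=0, hour="*/6"),
    },
    "critical-monitors": {
        "task": "analysis.check_critical_alerts",   # spend spikes, tracking breaks
        "schedule": 300.0,                          # every 5 minutes, in seconds
    },
}
# Event-triggered analysis (campaign launch, budget change) is kicked
# off separately, e.g. via task.delay() from a webhook handler.
```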
Challenge 2: Statistical Significance
Problem:
Small campaigns have high variance.
Day 1: 10% conversion rate
Day 2: 2% conversion rate
Is this a problem or noise?
Bad AI: Alerts on every fluctuation.
Good AI: Only alerts when statistically significant.
Solution:
Bayesian confidence intervals.
Calculate probability that change is real.
Only generate ticket if >85% confidence.
Result: Dramatically fewer false alarms.
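Plugging in the 10% vs 2% example above: a simplified sketch of the check using Beta posteriors. This is a standard Bayesian approach with flat priors, not our exact model:

```python
# Sketch of the significance check: sample from Beta posteriors for
# both periods and estimate the probability the drop is real.
import numpy as np


def prob_rate_dropped(conv_a: int, clicks_a: int,
                      conv_b: int, clicks_b: int,
                      samples: int = 100_000) -> float:
    """P(conversion rate in period B < period A), via Monte Carlo on
    Beta(conversions + 1, non-conversions + 1) posteriors."""
    rng = np.random.default_rng(0)
    rate_a = rng.beta(conv_a + 1, clicks_a - conv_a + 1, samples)
    rate_b = rng.beta(conv_b + 1, clicks_b - conv_b + 1, samples)
    return float((rate_b < rate_a).mean())


# Only open a ticket when the drop is probably real.
if prob_rate_dropped(conv_a=40, clicks_a=400, conv_b=8, clicks_b=400) > 0.85:
    print("Generate ticket: conversion rate drop is statistically credible")
```

With 400 clicks per period the 10% → 2% drop is credible. With 10 clicks per period, it isn't, and no ticket fires.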
Challenge 3: Domain Knowledge Injection
Problem:
Claude knows marketing in general. It doesn't know operational specifics.
It needs to be taught:
- What's a good CTR?
- What's normal ROAS variance?
- When is creative fatigued?
- What's audience overlap?
Solution:
20 years of marketing knowledge encoded as:
- Benchmark databases
- Pattern templates
- Decision trees
- Heuristics
- Domain-specific prompts
Example prompt snippet:
You are analyzing Google Ads performance.
Normal CTR ranges by campaign type:
- Search brand: 8-15%
- Search non-brand: 2-5%
- Display: 0.3-0.8%
- Shopping: 0.5-1.2%
Creative fatigue typically occurs after:
- 10,000-15,000 impressions (display)
- 5,000-8,000 impressions (social)
Audience overlap above 60% between ad sets indicates
potential auction competition...
Thousands of lines of domain knowledge.
This is what makes AI useful for marketing. Not just raw LLM intelligence.
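The benchmarks live as data, not prose. Roughly like this, with values copied from the snippet above; the structure is illustrative:

```python
# Sketch: benchmarks stored as data and rendered into the system
# prompt at analysis time. Values match the snippet above.
CTR_BENCHMARKS = {
    "search_brand": (0.08, 0.15),
    "search_non_brand": (0.02, 0.05),
    "display": (0.003, 0.008),
    "shopping": (0.005, 0.012),
}


def render_benchmarks() -> str:
    """Turn the benchmark table into prompt text."""
    lines = ["Normal CTR ranges by campaign type:"]
    for campaign_type, (low, high) in CTR_BENCHMARKS.items():
        lines.append(f"- {campaign_type}: {low:.1%}-{high:.1%}")
    return "\n".join(lines)
```

Keeping knowledge as data means we can update benchmarks without touching prompts, and reuse them in the statistical checks too.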
Challenge 4: Action Specificity
Problem:
Early versions generated vague recommendations: "Consider optimizing keyword performance"
Useless.
Solution:
Constrain AI outputs to specific formats:
- Must include exact data (keyword IDs, campaign names)
- Must provide step-by-step instructions
- Must quantify expected impact
- Must explain reasoning
Enforcement: Structured JSON output format. AI must fill every field.
Result: Every ticket is actionable.
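A sketch of what that enforcement can look like with Pydantic v2. Field names and minimums are illustrative:

```python
# Sketch of the enforced ticket schema. The LLM must return JSON that
# validates against this model; output missing any field is rejected
# and regenerated.
from pydantic import BaseModel, Field


class GrowthTicket(BaseModel):
    issue: str = Field(min_length=20)             # specific problem statement
    impact_eur_per_month: float                   # quantified expected impact
    actions: list[str] = Field(min_length=1)      # step-by-step instructions
    entity_ids: list[str] = Field(min_length=1)   # exact keyword/campaign IDs
    reasoning: str = Field(min_length=20)         # why the AI recommends this
    confidence: float = Field(ge=0.0, le=1.0)


# raw = llm_response_text                           # JSON string from the model
# ticket = GrowthTicket.model_validate_json(raw)    # raises on vague output
```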
Challenge 5: Handling Multiple Platforms
Problem:
Google Ads optimization is different from Meta optimization.
Google: keywords, search terms, Quality Score.
Meta: creative, audiences, placements.
Different principles.
Solution:
Platform-specific analysis modules:
- Google Ads analyzer (search focus)
- Meta analyzer (creative + audience focus)
- GA4 analyzer (journey focus)
- Cross-platform analyzer (attribution)
Each with domain-specific logic.
Result: Deep expertise per platform.
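Structurally, that's one shared interface with platform-specific logic behind it. A bare-bones sketch; class names are illustrative:

```python
# Sketch of the per-platform module layout: a common interface,
# dispatched by platform key.
from abc import ABC, abstractmethod


class PlatformAnalyzer(ABC):
    @abstractmethod
    def analyze(self, entities: list) -> list[dict]:
        """Return candidate issues for this platform's data."""


class GoogleAdsAnalyzer(PlatformAnalyzer):
    def analyze(self, entities: list) -> list[dict]:
        # Search focus: keywords, search terms, Quality Score checks.
        return []


class MetaAnalyzer(PlatformAnalyzer):
    def analyze(self, entities: list) -> list[dict]:
        # Creative + audience focus: fatigue, overlap, placements.
        return []


ANALYZERS = {"google_ads": GoogleAdsAnalyzer(), "meta": MetaAnalyzer()}
```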
---
What We Learned
Learning 1: Accuracy > Coverage
Early version:
Analyzed everything. Generated 200+ tickets per account.
Users overwhelmed.
Now:
Analyze everything. Generate 10-20 high-confidence tickets.
Quality over quantity.
Better to find top 10 opportunities than surface 200 maybes.
Learning 2: Explanation Matters
Early version:
"Pause these 47 keywords."
Users: "Why?"
Now:
"Pause these 47 keywords because:
- Zero conversions in 90 days
- €3,200 spent
- Good CTR but wrong intent match
- Budget better used on proven keywords"
Users need reasoning to trust recommendations.
Learning 3: Users Want Control
Early version:
"AI recommends this. Execute? Yes/No"
Felt like black box.
Now:
"AI recommends this because [reasoning]. You can:
- Execute (one-click)
- Modify and execute
- Dismiss
- Snooze for later
- Ask AI to explain more"
Users want to understand and control.
Learning 4: Context is Everything
Without context:
"Campaign X has 2.1 ROAS"
Is that good or bad?
With context:
- Your average: 4.2 ROAS
- Industry benchmark: 3.8 ROAS
- This campaign's historical: 5.1 ROAS
"Campaign X has 2.1 ROAS. → Underperforming. Investigate."
Absolute numbers mean nothing without context.
Learning 5: Continuous Monitoring > Periodic Reports
Old way:
Generate report weekly. Review on Monday. Implement changes Tuesday.
Problem happened Friday. Not caught until Monday. €10K wasted over weekend.
New way:
AI monitors 24/7. Alerts in real time. Problem Friday morning. Fixed Friday afternoon. €9K saved.
Continuous monitoring catches issues before they're expensive.
---
The Tech Stack
For those curious:
Backend:
- Python (FastAPI)
- PostgreSQL (data storage)
- Redis (caching)
- Celery (task queue)
AI:
- Anthropic Claude 3.5 Sonnet (primary)
- Custom embeddings for pattern matching
- scikit-learn (statistical analysis)
Data Pipeline:
- Apache Airflow (orchestration)
- dbt (transformations)
- BigQuery (data warehouse)
Infrastructure:
- Google Cloud Platform
- Kubernetes (container orchestration)
- Terraform (infrastructure as code)
Monitoring:
- Prometheus + Grafana
- Sentry (error tracking)
- Custom dashboards
Not reinventing the wheel.
Using proven tools. Focus on AI logic, not infrastructure.
---
What's Next
We're working on:
1. Predictive Tickets
Not just "this is wrong." But "this will be wrong in 3 days."
Prevent problems before they happen.
2. Automated Execution
User approves once: "Auto-pause keywords with zero conversions after 30 days"
AI executes automatically.
3. Strategic Recommendations
Today: tactical optimization.
Tomorrow: strategic insights.
"Your creative performs best with testimonials. Consider brand campaign focused on customer stories."
4. Multi-Account Learning
Learn from patterns across all accounts. "This type of campaign works 2.3x better in fintech vs e-commerce."
Cross-account intelligence.
5. Natural Language Queries
"Why did CAC increase last week?"
AI investigates and explains.
Already works. Making it better.
---
Advice for AI Builders
If you're building AI agents:
1. Domain knowledge is 80% of value
Raw LLM = 20% of the solution.
Domain expertise = 80%.
Encode your expertise.
2. Start narrow
Don't try to build "AI for everything."
Pick one problem. Solve it well. Expand.
3. Action > insights
Users don't want reports. They want "do this."
Focus on actionability.
4. Trust through transparency
Show reasoning. Explain logic. Let users verify.
Black boxes fail.
5. Continuous learning
Track outcomes. Improve models. Get better over time.
Static AI becomes stale.
---
The Future is Agentic
Chatbots are 2023.
Agents are 2025.
The difference:
- Chatbots: reactive
- Agents: proactive
Next evolution:
Not "AI that answers questions" But "AI that solves problems"
Autonomously.
That's what we're building.
---
See It in Action
Want to see how the AI agent works?
I'll show you:
- How AI analyzes campaigns
- How tickets are generated
- The reasoning process
- Technical architecture (if you want details)
I love talking about this stuff.
---
About Berner Setterwall
Berner is the Co-Founder and CTO of Cogny. He previously built the AI optimization platform at Campanja that worked with Netflix, Zalando, and Momondo before its acquisition. He has 15 years of experience building machine learning systems for digital advertising, including real-time bidding systems, attribution models, and predictive analytics platforms. He's particularly passionate about making AI practical and actionable for marketing teams.
Connect:
- LinkedIn: linkedin.com/in/bernersetterwall
- Email: berner@cogny.com
- GitHub: github.com/bernersetterwall
Last Updated: January 8, 2025