Real-Time BigQuery Analysis: Lessons from Production
Three months ago, a user asked our AI agent: "Show me campaign performance for the last 24 hours."
The AI generated a perfectly valid query, sent it to BigQuery, and... the user waited. And waited. 43 seconds later, they got their answer.
They closed the app and never came back.
That's when I realized that "real-time analysis" isn't about how fast you can query the database. It's about managing expectations, optimizing for the 90th percentile, and building systems that fail gracefully when BigQuery decides to be slow.
Here's what we've learned running AI-powered BigQuery analysis in production, processing thousands of queries daily from users who expect instant answers.
The Promise vs The Reality
The promise: Users ask questions in natural language, AI generates SQL, BigQuery returns results in seconds, everyone's happy.
The reality:
- Some queries return in 2 seconds
- Some take 30 seconds
- Some timeout after 60 seconds
- Some cost $0.001
- Some cost $15
- Some return 10 rows
- Some try to return 10 million rows
- Some queries are perfect
- Some have subtle bugs that produce wrong results
The challenge isn't making it work once. It's making it work reliably, fast, and affordably for arbitrary user questions against arbitrary schemas.
Lesson 1: BigQuery Performance is Wildly Unpredictable
I came from traditional SQL databases where query performance was relatively predictable. Optimize your indexes, understand your execution plan, and you could estimate query time pretty accurately.
BigQuery doesn't work that way.
Case 1: Query scans 100GB, completes in 3 seconds.
Case 2: Query scans 5GB, takes 28 seconds.
Why? BigQuery's performance depends on:
- Data layout and clustering
- Slot availability (shared resources)
- Query complexity (joins, aggregations)
- Partition pruning effectiveness
- Time of day (contention)
- Table size vs scanned data
- Whether data is in cache
The implication: You can't reliably predict query time from query size or complexity.
What We Did About It
1. Tiered timeout strategy
Instead of one timeout, we use three:
- Fast queries (< 5s): Interactive responses
- Medium queries (5-15s): Show progress indicator
- Slow queries (15-30s): Warn user, offer to run async
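Concretely, the routing is just a threshold check on an estimated duration. Here's a minimal sketch, assuming a hypothetical estimate_query_time() helper (not our actual implementation):

def route_query(query: str) -> str:
    # Thresholds mirror the tiers above; estimate_query_time() is hypothetical
    estimated_seconds = estimate_query_time(query)
    if estimated_seconds < 5:
        return "interactive"   # answer inline, no fuss
    elif estimated_seconds < 15:
        return "progress"      # run inline, show a progress indicator
    elif estimated_seconds < 30:
        return "warn"          # warn the user, offer async execution
    return "async"             # hand off to a background job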
2. Query cost estimation
Before running expensive queries, we estimate cost:
-- Get approximate scan size
SELECT
  table_id,
  size_bytes
FROM `project.dataset.__TABLES__`
WHERE table_id IN (...)
If estimated cost > $1, we ask for user confirmation.
3. Progressive disclosure
Instead of running the full query immediately, we:
- Run a `COUNT(*)` first (fast, cheap)
- If count > 100k rows, suggest sampling
- If user confirms, run the full query
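Roughly, the flow looks like this. A simplified sketch, assuming a hypothetical run_query() helper that executes SQL against BigQuery and returns rows as dicts:

def progressive_run(full_query: str, table: str, where_clause: str):
    # Cheap pre-flight: count matching rows before pulling them all
    count_sql = f"SELECT COUNT(*) AS n FROM `{table}` WHERE {where_clause}"
    row_count = run_query(count_sql)[0]["n"]

    if row_count > 100_000:
        # Don't run the expensive query yet; ask the user about sampling
        return {"action": "suggest_sampling", "row_count": row_count}

    return {"action": "full_result", "rows": run_query(full_query)}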
4. Caching layer
We cache query results for identical queries. Hit rate: ~30% (better than expected).
Implementation: Redis with TTL based on data freshness requirements.
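The wrapper itself is small. A minimal sketch using redis-py, with the TTL passed in from the freshness rules and details like serialization and key namespacing simplified:

import hashlib
import json
import redis

r = redis.Redis()

def cached_query(sql: str, ttl_seconds: int, run_query):
    # Key the cache on a hash of the exact SQL text
    key = "bq:" + hashlib.sha256(sql.encode()).hexdigest()
    hit = r.get(key)
    if hit is not None:
        return json.loads(hit)

    rows = run_query(sql)  # fall through to BigQuery
    r.setex(key, ttl_seconds, json.dumps(rows))  # TTL comes from freshness rules
    return rows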
Lesson 2: AI-Generated Queries Optimize for Correctness, Not Performance
Claude is excellent at generating syntactically correct SQL. It's decent at generating semantically correct SQL. It's terrible at generating performant SQL.
Example user question: "Show me daily spend by campaign for the last 30 days"
Claude's first attempt:
SELECT
  DATE(timestamp) as date,
  campaign_name,
  SUM(cost) as spend
FROM `project.dataset.ad_impressions`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY date, campaign_name
ORDER BY date DESC
Problems:
- Scans the entire `ad_impressions` table (potentially billions of rows)
- Filters after scanning (no partition pruning)
- No partition or clustering key usage
Optimized version:
SELECT
  DATE(_PARTITIONTIME) as date,
  campaign_name,
  SUM(cost) as spend
FROM `project.dataset.ad_impressions`
WHERE _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY date, campaign_name
ORDER BY date DESC
Impact:
- Original: 2.3TB scanned, 47 seconds, $11.50
- Optimized: 90GB scanned, 4 seconds, $0.45
The challenge: How do you teach an AI to generate the optimized version?
What We Did About It
1. Schema enrichment
We augment table schemas with metadata:
{
  "table": "ad_impressions",
  "partition_field": "_PARTITIONTIME",
  "cluster_fields": ["campaign_id", "date"],
  "common_filters": ["timestamp >= ...", "campaign_id IN ..."],
  "performance_tips": [
    "Always filter on _PARTITIONTIME for date ranges",
    "Use campaign_id in WHERE clause when possible"
  ]
}
This metadata goes into the AI's context when generating queries.
2. Query rewriting rules
Before execution, we apply transformation rules:
- Replace `WHERE timestamp >=` with `WHERE _PARTITIONTIME >=` if partition exists
- Add clustering key filters when mentioned in WHERE clause
- Push down projections to reduce data scanned
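As a rough illustration of the first rule, a regex-based rewrite might look like the sketch below. Real partition handling has more edge cases (aliases, qualified column names, multiple predicates), so treat this as a simplification:

import re

def rewrite_partition_filter(sql: str, partition_field: str | None) -> str:
    # Swap a raw timestamp filter for the partition column so BigQuery can
    # prune partitions before scanning. Assumes the filter is written as
    # "WHERE timestamp >= ..." -- a simplification for illustration.
    if partition_field is None:
        return sql
    return re.sub(r"WHERE\s+timestamp\s*>=",
                  f"WHERE {partition_field} >=",
                  sql, flags=re.IGNORECASE)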
3. Query analysis and suggestions
After generating a query, we analyze it:
def analyze_query(query, schema):
    issues = []

    # Check for full table scans
    if not uses_partition_filter(query, schema):
        issues.append("Warning: Full table scan. Consider filtering on partition key.")

    # Check for expensive operations
    if has_cross_join(query):
        issues.append("Warning: Cross join detected. This may be slow.")

    # Estimate cost
    estimated_bytes = estimate_bytes_scanned(query, schema)
    if estimated_bytes > 1_000_000_000_000:  # 1TB
        issues.append(f"Warning: Will scan ~{estimated_bytes / 1e12:.2f}TB (${estimated_bytes / 1e12 * 5:.2f})")

    return issues
If issues are found, we ask Claude to revise the query with the analysis as context.
4. Example-driven prompting
We include performance examples in the system prompt:
When querying partitioned tables, always filter on the partition key:
GOOD:
WHERE _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
BAD:
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
The GOOD version prunes partitions before scanning. The BAD version scans all partitions then filters.
Results: Average query cost dropped from $1.20 to $0.18 per query.
Lesson 3: Users Ask Questions Your Schema Can't Answer
This surprised me: About 30% of user questions can't be answered from their data warehouse alone.
Examples:
User: "Compare our Facebook ROAS to industry average" Problem: Industry data isn't in their warehouse
User: "Show me campaigns targeting women 25-34" Problem: Demographic targeting data isn't stored in impressions table
User: "Why did traffic drop yesterday?" Problem: External factors (competitors, seasonality, news) aren't in the data
Initial approach: Tell the user we can't answer.
User response: Frustration and churn.
What We Did About It
1. Augmented data sources
For common external data needs, we maintain reference datasets:
- Industry benchmarks (by vertical and channel)
- Seasonality indices
- Holiday calendars
- Platform changes/updates
These are separate BigQuery tables users can optionally enable.
2. Partial answers with context
Instead of "I can't answer that," we say:
"I can show you your campaign performance for that audience segment, but I don't have industry benchmark data to compare against. Would you like to see your historical performance instead?"
3. Multi-source reasoning
When the data warehouse is insufficient, Claude can:
- Reference external knowledge ("Facebook launched iOS 14 targeting restrictions in April 2021, which might explain the drop")
- Suggest what additional data would help
- Provide qualified insights based on available data
Example:
User: "Why did our Instagram campaign suddenly get expensive?"
Claude: "Your Instagram CPM increased 45% starting March 15th. While I can't see Instagram's platform changes directly, I notice this coincides with when Meta typically adjusts ad auction dynamics for Q2. Your competitor spend in the competitor_analysis table also increased 60% in that period, which could be driving auction pressure. I'd recommend checking Meta's advertiser updates for mid-March."
This is way more helpful than "I don't have that data."
Lesson 4: Result Sets Are Often Unusably Large
When a user asks "show me all campaigns," they don't realize they have 50,000 campaigns.
Naive approach: Return all 50,000 rows.
Problems:
- Takes forever to query
- Takes forever to transmit
- Crashes the browser trying to render
- User can't meaningfully review that much data
Better approach: Intelligence in limiting and summarizing.
What We Did About It
1. Smart defaults
We automatically limit queries unless explicitly asked for more:
-- User asks: "Show me campaigns"
-- We actually run:
SELECT * FROM campaigns
ORDER BY spend DESC -- or most relevant order
LIMIT 100
Then we tell the user: "Showing top 100 campaigns by spend. There are 47,342 total. Would you like to filter or see more?"
2. Automatic summarization
For large result sets, we ask Claude to summarize:
I found 12,347 campaigns. Here are the key patterns:
- Top 10 campaigns represent 67% of total spend
- 8,234 campaigns (67%) have zero conversions in the last 30 days
- Campaign performance varies widely by category:
- Branded: avg ROAS 8.2
- Non-branded: avg ROAS 3.1
Would you like me to:
1. Show top-performing campaigns
2. Show underperforming campaigns needing attention
3. Break down by specific category
3. Progressive loading
For medium-sized results (100-1000 rows), we:
- Load first 50 rows immediately
- Stream remaining rows
- Allow user to stop loading if they found what they need
4. Aggregation suggestions
If a query would return >10k rows, we suggest aggregation:
"This query would return 45,000 individual ad impressions. Would you like me to aggregate by day/campaign/creative instead?"
Lesson 5: Error Handling is Half the Product
Queries fail. A lot. Users need to understand why and what to do about it.
Categories of failures:
- Syntax errors (AI generated invalid SQL)
- Schema errors (referenced non-existent columns)
- Permission errors (can't access certain tables)
- Timeout errors (query too slow)
- Cost errors (query too expensive)
- Resource errors (BigQuery quota exceeded)
Naive error handling:
Error: Query failed
Details: [BigQuery error message]
User interpretation: "This thing is broken."
What We Did About It
1. Error translation
We intercept BigQuery errors and translate them:
def translate_error(bigquery_error):
    if "Exceeded rate limits" in bigquery_error:
        return {
            "user_message": "Your BigQuery project is currently at its query limit. This usually resolves in a few minutes.",
            "suggestion": "Wait 2-3 minutes and try again, or consider upgrading your BigQuery quota.",
            "can_retry": True
        }
    if "Not found: Table" in bigquery_error:
        table_name = extract_table_name(bigquery_error)
        return {
            "user_message": f"I tried to query table '{table_name}' but it doesn't exist in your dataset.",
            "suggestion": "Check if the table name is correct or if you need to connect additional datasets.",
            "can_retry": False
        }
    # ... more cases
2. Automatic retry with fixes
For certain errors, we automatically attempt fixes:
Syntax errors: Ask Claude to fix the SQL with the error message as context.
Schema errors: Refresh schema and regenerate query.
Timeout errors: Suggest sampling or narrower date range.
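The retry loop for the first two cases is straightforward. A simplified sketch, where generate_sql() stands in for the call to Claude, and run_query(), SyntaxOrSchemaError, and QueryRepairFailed are hypothetical placeholders:

def run_with_repair(question: str, schema: dict, max_attempts: int = 3):
    sql = generate_sql(question, schema)
    for _ in range(max_attempts):
        try:
            return run_query(sql)
        except SyntaxOrSchemaError as err:
            # Feed the raw BigQuery error back into the prompt and regenerate
            sql = generate_sql(question, schema, previous_sql=sql, error=str(err))
    raise QueryRepairFailed(question)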
3. Graceful degradation
When a query fails, we try to provide partial value:
"I couldn't complete the full analysis due to query timeout, but based on the schema, I can tell you that your campaigns table has 12,347 unique campaigns. Would you like me to try a simpler query, or would you prefer to run this async?"
4. User education
After errors, we explain what happened and how to avoid it:
"This query scanned 5TB of data, which exceeded the 1TB limit we set for interactive queries. For large analyses like this, you can:
- Narrow the date range (e.g., last 7 days instead of all time)
- Sample the data (e.g., 10% of impressions)
- Run as a background job and get notified when complete"
Impact: User retry rate after errors increased from 15% to 68%.
Lesson 6: Cost Control is Not Optional
Early in production, a user asked: "Analyze all impression data."
Our AI generated a query that scanned 47TB.
Cost: $235.
We found out when the user complained about their BigQuery bill.
What We Did About It
1. Query cost estimation before execution
from google.cloud import bigquery

def estimate_query_cost(client: bigquery.Client, query: str):
    # Dry run: validates the query and reports bytes scanned without running it
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(query, job_config=job_config)
    bytes_scanned = job.total_bytes_processed
    # BigQuery on-demand pricing: $5 per TB scanned
    estimated_cost = (bytes_scanned / 1e12) * 5
    return estimated_cost, bytes_scanned
2. Tiered confirmation
- < $0.10: Run automatically
- $0.10 - $1.00: Show estimate, auto-run
- $1.00 - $10: Require user confirmation
- > $10: Require explicit confirmation + suggest optimization
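In code, the tiers are just a threshold lookup, roughly like this:

def confirmation_tier(estimated_cost_usd: float) -> str:
    # Thresholds match the tiers above; tune them to your own risk tolerance
    if estimated_cost_usd < 0.10:
        return "auto_run"
    elif estimated_cost_usd < 1.00:
        return "show_estimate_and_run"
    elif estimated_cost_usd < 10.00:
        return "require_confirmation"
    return "require_confirmation_and_suggest_optimization"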
3. Cost budgets
Users can set daily/monthly query budgets. When approached:
- 80% of budget: Warning notification
- 100% of budget: Require admin approval for queries
4. Cost-aware query optimization
When generating queries, we include cost context:
"I can answer this two ways:
Option A (fast, expensive):
- Scan all 5TB of impression data
- Complete in ~8 seconds
- Cost: ~$25
Option B (slower, cheap):
- Use pre-aggregated daily summary table
- Complete in ~3 seconds
- Cost: ~$0.01
Which would you prefer?"
5. Optimization suggestions
After expensive queries, we suggest improvements:
"This query cost $5.50. You could reduce cost by:
- Using the aggregated table instead of raw impressions (95% cost reduction)
- Creating a materialized view for this analysis
- Partitioning by date and querying shorter time ranges"
Lesson 7: Schema Complexity Matters More Than Data Size
One client has 200TB of data across 5 simple tables. Queries work great.
Another client has 5TB of data across 200 tables with complex relationships. Queries are a mess.
The problem: AI needs to understand relationships, not just tables.
What We Did About It
1. Schema graph representation
We build a relationship graph:
{
  "tables": {
    "campaigns": {
      "columns": [...],
      "relationships": [
        {
          "table": "ad_groups",
          "type": "one_to_many",
          "join_on": "campaigns.id = ad_groups.campaign_id"
        }
      ]
    }
  }
}
2. Smart context pruning
Instead of sending all 200 tables to the AI, we:
- Identify which tables are likely relevant based on the question
- Send only those tables + their direct relationships
- Keep total context under 40k tokens
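A crude sketch of the pruning step, using keyword overlap as the relevance signal (our real scoring is more involved) and a hypothetical count_tokens() helper; it also assumes each table entry lists its column names as strings:

def prune_schema(question: str, schema_graph: dict, token_budget: int = 40_000):
    words = set(question.lower().split())

    def relevance(table_name: str, table: dict) -> int:
        # Naive signal: overlap between question words and table/column names
        names = {table_name.lower()} | {c.lower() for c in table["columns"]}
        return len(words & names)

    ranked = sorted(schema_graph["tables"].items(),
                    key=lambda kv: relevance(*kv), reverse=True)

    context, used = [], 0
    for name, table in ranked:
        cost = count_tokens(table)  # hypothetical token counter
        if used + cost > token_budget:
            break
        context.append((name, table))
        used += cost
    return context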
3. Schema documentation
We encourage users to document their schemas:
{
  "table": "campaigns",
  "description": "Marketing campaigns across all channels",
  "columns": {
    "campaign_type": {
      "description": "Type of campaign: 'brand', 'perf', 'retarget'",
      "common_values": ["brand", "perf", "retarget"]
    }
  }
}
This helps the AI understand domain-specific terminology.
4. Query templates
For common analysis patterns, we maintain templates:
"To calculate campaign ROAS, join campaigns + conversions + costs, group by campaign, calculate revenue / cost."
When a question matches a pattern, we use the template as a starting point.
Lesson 8: Streaming Improves Perceived Performance
Even when queries take 15 seconds, users are more patient if they see progress.
What We Did About It
1. Thinking indicators
We show what the AI is doing:
- "Understanding your question..."
- "Analyzing schema..."
- "Generating query..."
- "Running query (this may take 10-15 seconds)..."
- "Analyzing results..."
2. Partial results
For queries with multiple steps:
- Show results from step 1 while step 2 runs
- Stream result rows as they arrive
- Update summary statistics in real-time
3. Progress estimation
BigQuery doesn't provide progress updates, but we fake it:
// Show fake progress for UX
const estimatedTime = estimateQueryTime(query);
let progress = 0;
const progressInterval = setInterval(() => {
  progress = Math.min(progress + 10, 90);
  updateUI(progress);
}, estimatedTime / 10);
Not accurate, but better than a blank screen.
Lesson 9: Caching is Harder Than It Looks
Caching BigQuery results seems simple: hash the query, store the result.
Problems:
- Data freshness: Cached result might be stale
- Schema changes: Cached query might be invalid after schema update
- Parameterization: Similar queries with different dates aren't cache hits
- Storage costs: Caching 1M rows * 100 queries = expensive
What We Did About It
1. TTL based on data freshness
We ask users about data update frequency:
- Real-time data (e.g., live campaigns): 5-minute TTL
- Daily data (e.g., GA4): 24-hour TTL
- Historical data: 7-day TTL
2. Semantic caching
Instead of exact query matching, we use semantic similarity:
"Show me campaign performance last 7 days" and "What's my campaign ROAS for the last week?" are semantically similar → cache hit.
3. Partial caching
For queries with date ranges, we cache by date:
- User asks for last 30 days
- We have cache for days 1-28
- Query only last 2 days, merge with cache
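Sketched out, that means splitting the requested range into cached and missing days. Another simplified sketch with hypothetical helpers (metric_sql_for_day builds the per-day SQL, run_query executes it):

from datetime import date, timedelta

def fetch_daily(metric_sql_for_day, start: date, end: date, cache: dict):
    results, missing = {}, []
    day = start
    while day <= end:
        if day in cache:
            results[day] = cache[day]  # reuse the previously cached day
        else:
            missing.append(day)
        day += timedelta(days=1)

    for day in missing:
        # Query BigQuery only for uncovered days, then backfill the cache
        results[day] = run_query(metric_sql_for_day(day))
        cache[day] = results[day]
    return results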
4. Smart invalidation
When schema changes, we invalidate affected cached queries instead of all cache.
Lesson 10: BigQuery Isn't Always the Answer
Sometimes, the best solution is not querying BigQuery at all.
Example: User asks "What's my total spend today?"
Option A: Query BigQuery
- Generate SQL
- Run query (3-5 seconds)
- Return result
Option B: Maintain real-time aggregates
- Update running totals on data insert
- Return from cache (<100ms)
For frequently-asked questions, precomputation beats querying.
What We Did About It
1. Identify hot queries
We log all questions and identify the most common:
- "Total spend today"
- "Campaign count"
- "Top 10 campaigns by spend"
- etc.
2. Materialized views
For hot queries, we create materialized views that auto-update:
CREATE MATERIALIZED VIEW `project.dataset.daily_campaign_summary` AS
SELECT
  DATE(_PARTITIONTIME) as date,
  campaign_id,
  SUM(cost) as total_cost,
  SUM(conversions) as total_conversions
FROM `project.dataset.ad_impressions`
GROUP BY date, campaign_id
3. Query rewriting
When a question matches a hot pattern, we rewrite to use the materialized view:
User query: "total spend today"
AI generates: `SELECT SUM(cost) FROM ad_impressions WHERE ...`
We rewrite: `SELECT SUM(total_cost) FROM daily_campaign_summary WHERE date = CURRENT_DATE()`
Result: 15 seconds → 0.8 seconds, $0.50 → $0.001.
What I'd Tell My Past Self
If I could go back to when we started building this, here's what I'd say:
1. Build cost controls from day one
We added them after the $235 query incident. Should have been there from the start.
2. Instrument everything
We didn't track query performance metrics early. Now we track:
- Query time distribution
- Cost per query
- Cache hit rates
- Error rates by type
- Most common queries
This data drives optimization priorities.
3. Design for failure
Users will ask impossible questions. Queries will timeout. BigQuery will have outages.
Build graceful degradation into the product from the beginning.
4. Performance perception matters more than actual performance
A 10-second query with progress indicators feels faster than a 7-second query with a blank screen.
5. Invest in query optimization infrastructure
We spent weeks building query rewriting, cost estimation, and performance analysis tools. Worth every hour.
6. Users don't think in SQL
Teaching the AI to generate performant SQL is necessary but not sufficient. You need to translate user intent to optimal data access patterns.
The Honest Truth
Running real-time BigQuery analysis in production is hard.
Not "challenging" hard. Genuinely difficult engineering work that requires:
- Deep BigQuery knowledge
- AI prompt engineering
- Cost management
- Performance optimization
- Error handling
- UX design
But when it works—when a user asks a complex question and gets an insightful answer in 3 seconds—it feels like magic.
We're processing thousands of queries daily now. The system works. But it took months of optimization, iteration, and learning from production issues.
If you're building something similar, expect to spend more time on performance, cost, and reliability than on the AI itself.
The AI is the easy part. Making it work at scale is the challenge.
---
About Berner Setterwall
Berner is CTO and co-founder of Cogny, where he's building AI-powered marketing automation on top of BigQuery. He's spent years optimizing large-scale data systems, from Campanja's AI platforms to Cogny's real-time analytics engine. He specializes in making complex data systems fast, reliable, and cost-effective.
Want to see our BigQuery integration in action?
Cogny connects directly to your BigQuery warehouse and provides real-time AI-powered analysis. We've spent months optimizing for performance, cost, and reliability so you don't have to. Book a demo to see how we make BigQuery analysis feel instant.