Athena Pipeline β Warehouse-First Architecture
Migrate all data reads to athena_warehouse. Products, brands, dev groups, SKUs and stages stop calling Airtable directly β everything flows through the warehouse. Data gaps become visible. Write-back goes to warehouse first, then queues to Airtable/Asana.
The Core Principle
athena_warehouse. Data is fresh as of last nightly sync. Data gaps are visible. Write-back queues to Airtable/Asana via PDOS Bridge.Warehouse Data Already Available
All data needed for this migration is already in athena_warehouse:
| Table | Rows | Source | Sync Frequency |
|---|---|---|---|
airtable_raw.product | 160 | Airtable | Nightly |
airtable_raw.dev_group | 29 | Airtable | Nightly |
airtable_raw.sku | 741 | Airtable | Nightly |
airtable_raw.product_line | 13 | Airtable | Nightly |
airtable_raw.client | 17 | Airtable (brands) | Nightly |
asana_raw.project | 42 | Asana | Nightly |
asana_raw.task | 462 | Asana | Nightly |
ns.invoices | 325K | NetSuite | Nightly |
pdos.pending_sync_queue | 2 pending | App writes | PDOS Bridge drains every 5min |
netsuite_raw.sync_state shows last sync April 20 (3 weeks ago) despite nightly crons reporting success. This is a separate issue to diagnose β Gluon must answer this in the review phase. Revenue data may be stale.
Architecture After This Build
Phase 1 β New FastAPI Endpoints
Add to athena-bi-mvp/backend/ as a new router: routers/pipeline.py.
| Endpoint | Source | Key fields |
|---|---|---|
GET /api/pipeline/products |
airtable_raw.product + product_line + client | id, name, status, brand_name, product_line_name, sku_count, has_data_gaps, data_gaps[] |
GET /api/pipeline/brands |
airtable_raw.client + product + ns.invoices | id, name, product_count, sku_count, dev_group_count, revenue_ytd |
GET /api/pipeline/product-lines |
airtable_raw.product_line + product | id, name, product_count, active_product_count, dev_group_count |
GET /api/pipeline/skus |
airtable_raw.sku + product | id, sku_name, product_name, size_name, regions, photo_url, has_data_gaps |
GET /api/pipeline/dev-groups |
airtable_raw.dev_group + client + product_line | All stage dates, asana_gids, brand_name, has_data_gaps |
PATCH /api/pipeline/dev-groups/{id} |
airtable_raw.dev_group β pdos.pending_sync_queue | Writes warehouse first, enqueues Airtable + Asana sync. Returns 200 immediately. |
GET /api/pipeline/line-badges |
airtable_raw.product_line | id, name, logo_url β replaces CF Worker Airtable call |
GET /api/pipeline/revenue |
ns.invoices (existing) | ytd_total, monthly[], by_brand[], by_product_line[], data_freshness.is_stale |
has_data_gaps: true when:Product: brand_id IS NULL or product_line IS NULL or netsuite_item IS NULL
Dev group: brand IS NULL or product_line IS NULL or launch_date IS NULL
SKU: product_id IS NULL (orphaned)
Phase 2 β CF Worker Cleanup
Remove β proxy to FastAPI instead
GET /api/productsβ proxypulse.athenaag.com/api/pipeline/productsGET /api/line-badgesβ proxypulse.athenaag.com/api/pipeline/line-badgesGET /api/revenueβ update topulse.athenaag.com/api/pipeline/revenue
Keep β no warehouse equivalent
POST /api/tasksβ Asana task creationGET/POST /api/queue/*β action queuePOST /api/askβ Groq AIGET /api/modal/{id}β Notion modalPOST /api/activityβ activity log
Phase 3 β Frontend Updates
Minimal frontend changes required β the read path changes are transparent since CF Worker proxies the same URLs.
- Data gap badges β when
has_data_gaps: true, show β οΈ badge on product/brand/dev group card with tooltip listingdata_gaps[] - Revenue staleness banner β if
data_freshness.is_stale: true, show top banner: "Revenue data may be outdated β last sync [date]" - Dev group stage editing β existing edit UI sends
PATCHto CF Worker, which proxies to FastAPIPATCH /api/pipeline/dev-groups/{id}
Sync Frequency Decisions
| Source | Direction | Frequency | Rationale |
|---|---|---|---|
| NetSuite | Inbound β warehouse | Nightly | Revenue doesn't change intraday in ways that matter for pipeline view |
| Airtable | Inbound β warehouse | Nightly | App is authoritative writer; inbound only catches external edits |
| Airtable | Outbound (write-back) | Every 5min via PDOS queue | PDOS Bridge already runs every 5min |
| Asana | Outbound (stage dates) | Every 5min via PDOS queue | Same queue, same cron |
| Asana | Inbound β warehouse | Nightly | Task status is informational on pipeline view |
Open Questions β Gluon Must Answer Before Build
netsuite_raw.sync_state shows last sync April 20 despite nightly crons reporting success. Why the discrepancy? Is the overnight sync writing to the wrong table? Is revenue data actually 3 weeks stale?
Pull exact column definitions for pdos.pending_sync_queue. What shape does a dev_group write-back entry need? PATCH endpoint inserts here β must match the schema exactly.
Is there currently a CF Access policy on pulse.athenaag.com? If the pipeline frontend calls it directly (not via CF Worker), will unauthenticated requests be blocked? Confirm before routing frontend there.
Does the existing BRAND_MAP in main.py match cleanly against airtable_raw.client.name? Any mismatches that would break the /api/pipeline/brands revenue join?