# FP&A Test Data Platform

Python tooling to generate, validate, and load realistic FP&A data into your Go API.

## Structure

```
testing/
├── generators/
│   └── generate_data.py     # Creates all CSV files
├── loaders/
│   └── api_loader.py        # POSTs CSVs to your Go API
├── tests/
│   └── test_fpa.py          # Data integrity + API tests
├── data/
│   └── csv/                 # Generated CSV files land here
└── requirements.txt
```

## Quick Start

```bash
pip install -r requirements.txt

# 1. Generate all CSV data
python generators/generate_data.py

# 2. Validate data integrity (no API needed)
pytest tests/test_fpa.py -v

# 3. Load into your Go API (dry-run first)
python loaders/api_loader.py --dry-run

# 4. Load for real
python loaders/api_loader.py --url http://localhost:8080
```

## Generated Datasets

| File | Rows | Description |
|------|------|-------------|
| `revenue_budget_vs_actuals.csv` | 48 | Product & Service revenue — budget vs actuals, 24 months |
| `opex_budget_vs_actuals.csv` | ~2,688 | Dept × category opex — budget vs actuals |
| `pl_income_statement.csv` | 24 | Monthly P&L: revenue, COGS, gross profit, EBITDA, net income |
| `cash_flow.csv` | 24 | Operating / investing / financing cash flows, rolling balance |
| `headcount_workforce.csv` | 1,000+ | Monthly employee snapshots with hire/term dates and salaries |

## Key Concepts

**Budget vs Actuals** — Every financial row has a `budget_amount` (what was planned) and
`actual_amount` (what really happened). The `variance` is `actual_amount - budget_amount`.
A positive variance on revenue is favorable; a positive variance on spend means the line is over budget.
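
As a minimal sketch of the sign convention, the snippet below computes the variance and whether it is favorable. The `Line` class and the `kind` labels (`"revenue"`, `"spend"`) are hypothetical names for illustration; only `budget_amount`, `actual_amount`, and `variance` come from the datasets.

```python
from dataclasses import dataclass


@dataclass
class Line:
    kind: str             # hypothetical label: "revenue" or "spend"
    budget_amount: float  # what was planned
    actual_amount: float  # what really happened

    @property
    def variance(self) -> float:
        # variance = actual - budget
        return self.actual_amount - self.budget_amount

    @property
    def favorable(self) -> bool:
        # Revenue above plan is favorable; spend above plan is over budget.
        if self.kind == "revenue":
            return self.variance >= 0
        return self.variance <= 0


revenue = Line("revenue", budget_amount=100_000.0, actual_amount=108_000.0)
opex = Line("spend", budget_amount=40_000.0, actual_amount=43_500.0)

print(revenue.variance, revenue.favorable)  # 8000.0 True
print(opex.variance, opex.favorable)        # 3500.0 False
```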

**Product vs Service revenue** — Product is recurring SaaS subscriptions (~70%, higher margin).
Service is consulting/support (~30%, lower margin). Both grow monthly at different rates.

**Cash Flow** — Separate from revenue. Collections can lag invoicing (DSO effect). Includes
a simulated Series A raise in June 2023.
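
To illustrate the DSO effect, here is a rough sketch in which every invoice is collected a fixed number of months after it is issued. The `collections` helper, the one-month lag, the sample amounts, and the opening balance are all assumptions for illustration; the generator may model the lag differently.

```python
from itertools import accumulate


def collections(invoiced: list[float], lag_months: int = 1) -> list[float]:
    """Cash collected per month when each invoice is paid lag_months later."""
    if lag_months < 0:
        raise ValueError("lag_months must be non-negative")
    shifted = [0.0] * lag_months + list(invoiced)
    return shifted[: len(invoiced)]


invoiced = [100.0, 110.0, 120.0, 130.0]   # made-up monthly invoicing
collected = collections(invoiced, lag_months=1)

# Rolling balance from a made-up opening balance of 50.0 (inflows only).
balance = list(accumulate(collected, initial=50.0))[1:]

print(collected)  # [0.0, 100.0, 110.0, 120.0]
print(balance)    # [50.0, 150.0, 260.0, 380.0]
```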

## API Endpoints (from main.go)

```
POST   /api/v1/budgets          ← create one budget line
PUT    /api/v1/budgets/{id}     ← update a budget line
DELETE /api/v1/budgets/{id}     ← delete a budget line
POST   /api/v1/actuals/ingest   ← bulk ingest actuals { "records": [...] }
GET    /api/v1/variance         ← variance report (budget vs actuals)
GET    /api/v1/variance/alerts  ← over/under budget alerts
GET    /api/v1/health           ← db health check
```
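
For orientation, a request to one of these endpoints could be built with the standard library as sketched below. `build_request` is a hypothetical helper, not part of the loader, and the `Authorization: Bearer ...` header format is an assumption about how the API expects the token. Nothing is sent here.

```python
import json
import urllib.request
from typing import Optional


def build_request(base_url: str, method: str, path: str,
                  payload: Optional[dict] = None,
                  token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON request for the API."""
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token:
        # Assumed header format for bearer auth.
        headers["Authorization"] = f"Bearer {token}"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


health = build_request("http://localhost:8080", "GET", "/api/v1/health")
ingest = build_request("http://localhost:8080", "POST", "/api/v1/actuals/ingest",
                       payload={"records": []}, token="example-token")

print(health.get_method(), health.full_url)
print(ingest.get_header("Authorization"))
```

Passing either object to `urllib.request.urlopen` would perform the call against a running API.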

## Load Order

The loader (`loaders/api_loader.py`) always loads in this order, because **budgets must exist before actuals**:

1. **Budgets** — `POST /api/v1/budgets` individually (revenue + opex lines)
2. **Actuals** — `POST /api/v1/actuals/ingest` in batches of 50
3. **Variance check** — `GET /api/v1/variance` to confirm data landed
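
The order above can be sketched as a request plan that is built without sending anything, similar in spirit to a dry run. `batched` and `plan_requests` are hypothetical helpers, and the record contents are placeholders; the real payload fields are defined by the CSVs and the Go API.

```python
import json
from typing import Iterator


def batched(records: list[dict], size: int = 50) -> Iterator[list[dict]]:
    """Yield consecutive chunks of at most `size` records."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(records), size):
        yield records[start:start + size]


def plan_requests(budgets: list[dict], actuals: list[dict],
                  batch_size: int = 50) -> list[tuple[str, str, str]]:
    """Return (method, path, body) tuples in load order."""
    plan = []
    # 1. Budgets: one POST per budget line.
    for budget in budgets:
        plan.append(("POST", "/api/v1/budgets", json.dumps(budget)))
    # 2. Actuals: bulk ingest in batches, wrapped as {"records": [...]}.
    for chunk in batched(actuals, batch_size):
        body = json.dumps({"records": chunk})
        plan.append(("POST", "/api/v1/actuals/ingest", body))
    # 3. Variance check: confirm the data landed.
    plan.append(("GET", "/api/v1/variance", ""))
    return plan


budgets = [{"line": i} for i in range(2)]      # placeholder records
actuals = [{"row": i} for i in range(120)]     # placeholder records
plan = plan_requests(budgets, actuals)

print(len(plan))  # 6: two budget POSTs, three ingest batches, one GET
```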

## Loader Options

```bash
python loaders/api_loader.py --help

  --url       API base URL (default: http://localhost:8080)
  --batch     Records per actuals/ingest request (default: 50)
  --dry-run   Print payloads without sending
  --token     Bearer token for auth header
  --only      Run one step only: budgets | actuals | variance | alerts
```
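
For readers extending the loader, the flags above could be declared with `argparse` roughly as follows. This is a sketch based only on the help text, not the actual contents of `api_loader.py`; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="api_loader.py")
    parser.add_argument("--url", default="http://localhost:8080",
                        help="API base URL")
    parser.add_argument("--batch", type=int, default=50,
                        help="Records per actuals/ingest request")
    parser.add_argument("--dry-run", action="store_true",
                        help="Print payloads without sending")
    parser.add_argument("--token", default=None,
                        help="Bearer token for auth header")
    parser.add_argument("--only",
                        choices=["budgets", "actuals", "variance", "alerts"],
                        help="Run one step only")
    return parser


args = build_parser().parse_args(["--dry-run", "--only", "budgets"])
print(args.url, args.batch, args.dry_run, args.only)
# http://localhost:8080 50 True budgets
```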