Notes on ML engineering, data platforms, and the developer tools I build along the way.
Hourly probes across 15 frontier models from OpenAI, Anthropic, Google, DeepSeek, and xAI. Median time to first token (TTFT) ranges from 321 ms to 4,226 ms. Raw data included.
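At its core, a TTFT probe just times the gap between issuing a streaming request and receiving the first chunk. A minimal sketch of that timing loop, with the provider client setup omitted (`stream` stands in for any iterator of streamed chunks):

```python
import time


def measure_ttft_ms(stream):
    """Milliseconds from now until the first chunk of `stream` arrives.

    `stream` is any iterator of response chunks; in a real probe it
    would be the streaming response from a provider SDK.
    """
    start = time.monotonic()
    first_chunk = next(iter(stream))  # blocks until the first token lands
    return (time.monotonic() - start) * 1000.0, first_chunk


# Usage with a stand-in stream; a real probe records the result
# alongside model name and timestamp for later aggregation.
ttft, chunk = measure_ttft_ms(iter(["Hello", ", world"]))
```

Using `time.monotonic()` rather than `time.time()` avoids skew from wall-clock adjustments during long-running hourly probes.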
I scanned OpenAI's cookbook repository for LLM API call sites and estimated monthly cost assuming 1,000 calls per site. Four gpt-5 call sites account for 68% of the total spend.
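The per-site estimate reduces to simple token arithmetic. A back-of-the-envelope sketch, where the token counts and per-million-token prices are illustrative placeholders, not actual gpt-5 rates:

```python
def monthly_cost(calls_per_month, input_tokens, output_tokens,
                 price_in_per_mtok, price_out_per_mtok):
    """Estimated monthly spend in dollars for one call site.

    Prices are dollars per million tokens; token counts are
    per-call averages.
    """
    per_call = (input_tokens * price_in_per_mtok
                + output_tokens * price_out_per_mtok) / 1_000_000
    return calls_per_month * per_call


# A site averaging ~2k input / ~500 output tokens, at 1,000 calls
# per month and hypothetical $10 / $30 per million tokens:
print(monthly_cost(1_000, 2_000, 500, 10.0, 30.0))  # → 35.0
```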
How I built an ELT data platform for 100k+ IoT devices: Dagster for orchestration, dbt for transforms, Sqitch for migrations, ArgoCD for GitOps deployment, and PII-safe extraction from five API shards.
How I built an evaluation pipeline for battery prediction algorithms serving 100k+ IoT devices: Dagster-orchestrated dataset creation from field data, human-in-the-loop review, isolated venv testing across algorithm versions, MLflow tracking, and fleet-wide rollout.
60,000 probes across GPT-4o-mini, Claude 3.5 Haiku, Gemini 2.0 Flash, Llama 3.3 70B, DeepSeek Chat, and Mistral Small. Real latency numbers from continuous monitoring.
Building tokentoll, an Infracost-style cost-impact tool for LLM API spend, in a single day. Architecture, model-name resolution, multi-pass constant propagation, and validation across twenty real codebases.
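The model-name resolution step is the interesting part: many call sites pass `model=SOME_CONSTANT` rather than a string literal, so the tool has to propagate constants before it can price the call. A simplified single-scope sketch of the idea using Python's `ast` module (the real tool needs multiple passes and scope handling, and all names below are hypothetical):

```python
import ast


def resolve_model_names(source):
    """Resolve `model=` arguments in a source file to string values.

    Pass 1 collects module-level NAME = "literal" assignments into a
    constant table; pass 2 looks up `model=` keyword arguments through
    that table. Unresolvable names are reported as "<unresolved>".
    """
    tree = ast.parse(source)

    # Pass 1: NAME = "string literal" assignments.
    consts = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Constant):
            if isinstance(node.value.value, str):
                for target in node.targets:
                    if isinstance(target, ast.Name):
                        consts[target.id] = node.value.value

    # Pass 2: resolve model= keywords at every call site.
    models = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            for kw in node.keywords:
                if kw.arg != "model":
                    continue
                if isinstance(kw.value, ast.Constant):
                    models.append(kw.value.value)
                elif isinstance(kw.value, ast.Name):
                    models.append(consts.get(kw.value.id, "<unresolved>"))
    return models


src = '''
MODEL = "gpt-4o-mini"
client.chat.completions.create(model=MODEL, messages=[])
'''
print(resolve_model_names(src))  # → ['gpt-4o-mini']
```

Parsing rather than executing the target code is what makes this safe to run against twenty arbitrary codebases.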