Documentation

Methodology, Assumptions & Calculations

Contents

  1. Overview
  2. Data Sources
  3. Core Calculations
  4. Data Center Load Profiles
  5. Chart-by-Chart Guide
    1. Supply vs Demand
    2. Effective Headroom
    3. LMP vs Net Load Scatter
    4. Headroom Impact (DC Profiles)
    5. Grid Utilization Scorecard
    6. Electricity Cost Comparison
    7. Stress Day Deep-Dive
    8. Overgeneration Surplus Absorption
    9. Marginal Emissions
    10. Optimized Load Profile Calculator
  6. Custom Profile Editor
  7. Assumptions & Limitations
  8. Validation Reference Points
  9. Future Data Sources

1. Overview

This dashboard analyzes California (CAISO) grid conditions to evaluate how flexible data center load profiles affect grid utilization, reliability, cost, and emissions. The core thesis:

Data centers that shift load to follow solar generation ("camel" profiles) improve grid utilization, reduce electricity costs, absorb renewable surplus, and lower marginal emissions — compared to flat or night-heavy profiles.

Virginia passed a bill incorporating a grid utilization factor to encourage better use of existing capacity before building new infrastructure. Data center operators seeking faster interconnection can demonstrate they improve grid utilization through flexible load profiles. This dashboard quantifies that value proposition using real CAISO data.


2. Data Sources

2.1 Demand Forecast

Property | Details
Source | CAISO day-ahead demand forecast
File | demand_forecast.json
Format | Daily entries keyed by date ("YYYY-MM-DD"), each containing a 24-element array of hourly demand in MW
Coverage | Jan 1, 2020 – Dec 31, 2025 (2,192 days)
Missing data | Null values treated as 0 MW

2.2 Available Capacity

Property | Details
Source | Derived from CAISO resource adequacy filings and installed capacity data
File | available_capacity.json
Format | Nested by year → month → resource type → 24-hour array (MW)
Resource types | Nuclear, Geothermal, Biomass, Other Thermal, Gas CCGT, Gas Peaker, Gas Steam, Gas ICE, Hydro, Wind, Solar, Battery Storage

Key assumption: Monthly average hourly profiles — the same 24-hour profile is used for every day within a given month. This does NOT capture day-to-day weather variability. Solar profiles reflect monthly capacity factors (zero at night, peak at solar noon). Battery Storage is excluded from total available capacity (treated as load modifier, not supply).

2.3 CAISO LMP Prices

Property | Details
Source | CAISO Day-Ahead Market, DLAP (Default Load Aggregation Point) nodes
File | caiso_prices.json
Weighting | Load-weighted across 3 utility zones: PG&E (43.5%), SCE (43.5%), SDG&E (13%)
Components | LMP = MEC (Marginal Energy) + MCC (Congestion) + Loss + GHG
Coverage | ~2,170 of 2,192 days have price data

2.4 CA Natural Gas Citygate Prices

Property | Details
Source | EIA (U.S. Energy Information Administration)
Resolution | Monthly ($/MCF)
Usage | Normalizing LMP and MEC to remove gas price variation, creating an implied heat-rate proxy
Coverage | Jan 2020 – Dec 2025

3. Core Calculations

3.1 Reserve Requirements

Reserves model California's operating reserve obligations. The requirement is the greater of a percentage of demand or a floor value:

reserve[h] = max(3,500 MW, demand[h] × reserve_percentage[h])

Period | Hours | Reserve %
Peak | 4–9 PM (16 ≤ h < 21) | 8.4%
Shoulder | 6–9 AM, 9–11 PM (6 ≤ h < 9, 21 ≤ h < 23) | 7.7%
Off-peak | All other hours | 7.0%

Simplification: Actual CAISO procurement uses Regulation Up/Down, Spinning Reserve, and Non-Spinning Reserve with different requirements. The 3,500 MW floor approximates California's typical minimum reserve margin.
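The reserve rule can be sketched in Python as follows. This is a minimal sketch, not the dashboard's code: the function and constant names are illustrative, and the hour ranges are assumed to be end-exclusive (4–9 PM covers h = 16..20).

```python
RESERVE_FLOOR_MW = 3500.0  # floor from the formula above

def reserve_pct(h):
    """Reserve percentage for hour index h (0-23).

    Assumption: ranges are end-exclusive, so 4-9 PM means h = 16..20.
    """
    if 16 <= h < 21:                  # peak
        return 0.084
    if 6 <= h < 9 or 21 <= h < 23:    # shoulder
        return 0.077
    return 0.070                      # off-peak

def reserve_mw(demand_mw, h):
    """Greater of the 3,500 MW floor and the percentage-of-demand term."""
    return max(RESERVE_FLOOR_MW, demand_mw * reserve_pct(h))

print(reserve_mw(30_000, 3))   # 7.0% of 30 GW is 2,100 MW, so the floor binds
print(reserve_mw(50_000, 18))  # 8.4% of 50 GW is about 4,200 MW
```

With these percentages the floor binds whenever demand is below roughly 41.7 GW at peak and 50 GW off-peak, so the percentage term only matters on high-demand days.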

3.2 Effective Headroom

The key reliability metric — how much spare capacity exists above demand and reserves:

headroom[h] = available_capacity[h] − demand[h] − reserve[h]
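A short worked example of the headroom formula, written as a Python sketch. The dictionary layout and function names are assumptions for illustration, and the numbers are toy values rather than CAISO data. Battery Storage is left out of the capacity total, as stated in section 2.2.

```python
def available_capacity(capacity, h):
    """Total available MW at hour h, excluding Battery Storage."""
    return sum(series[h] for name, series in capacity.items()
               if name != "Battery Storage")

def effective_headroom(capacity, demand, reserve, h):
    """headroom[h] = available_capacity[h] - demand[h] - reserve[h]"""
    return available_capacity(capacity, h) - demand[h] - reserve[h]

# Toy single-hour data (illustrative values only)
capacity = {
    "Solar": [12_000],
    "Gas CCGT": [15_000],
    "Hydro": [6_000],
    "Battery Storage": [4_000],   # ignored by available_capacity
}
print(effective_headroom(capacity, demand=[25_000], reserve=[3_500], h=0))  # 4500
```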

3.3 Net Load

Demand minus non-gas generation — the residual load that gas plants must serve:

net_load[h] = demand[h] − (solar + wind + hydro + nuclear + geothermal + biomass + other_thermal)[h]

Used in the LMP vs Net Load scatter chart to show how electricity prices respond to the gas fleet's workload.
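As a rough Python sketch of the net load formula (names and data layout are illustrative assumptions): because the dataset holds available capacity rather than metered generation, the non-gas terms here are capacity values used as a proxy.

```python
NON_GAS = ("Solar", "Wind", "Hydro", "Nuclear",
           "Geothermal", "Biomass", "Other Thermal")

def net_load(demand, capacity):
    """Hourly demand minus non-gas capacity; negative values indicate surplus."""
    hours = len(demand)
    return [
        d - sum(capacity.get(r, [0.0] * hours)[h] for r in NON_GAS)
        for h, d in enumerate(demand)
    ]

# Two toy hours: a night hour and a sunny midday hour
capacity = {
    "Solar": [0, 18_000],
    "Nuclear": [2_000, 2_000],
    "Hydro": [3_000, 3_000],
    "Gas CCGT": [10_000, 10_000],   # gas is not subtracted
}
print(net_load([20_000, 22_000], capacity))  # [15000, -1000]
```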


4. Data Center Load Profiles

All profiles deliver the same total daily energy (avgMw × 24 MWh) for fair comparison. This is critical: no profile "cheats" by consuming less total energy.

4.1 Flat

flat[h] = avgMw    for all h ∈ [0, 23]

Constant load at all hours. The simplest profile and the baseline for comparison.

4.2 Camel (Solar-Following)

raw[h] = max(0, avgMw × (1 + flex × SOLAR_NORM[month][h] / 0.3))

Where:

4.3 Night-Heavy (Inverse Camel)

raw[h] = max(0, avgMw × (1 − flex × SOLAR_NORM[month][h] / 0.3))

The opposite of camel — shifts load away from solar hours into evening and night. Represents a traditional data center pattern that ramps up when electricity demand is highest.

4.4 Energy Preservation

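The raw camel and night-heavy profiles generally do not sum to avgMw × 24: the solar term raises (camel) or lowers (night-heavy) the total, and the max(0, ...) clipping shifts it further. A rescaling step applied after clipping corrects this: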

target = avgMw × 24
actual_sum = sum(raw[0..23])
scale = target / actual_sum
profile[h] = raw[h] × scale

This is particularly important for the night-heavy profile in summer months, where max(0, ...) clips solar-hour values to zero.
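The raw profile formulas and the rescaling step can be combined in one small Python function. This is a sketch under stated assumptions: the `sign` argument, the function name, and the toy solar shape are illustrative, and the guard against an all-zero profile is an addition not described in the text.

```python
def build_profile(avg_mw, flex, solar_norm, sign=+1):
    """Camel (sign=+1) or night-heavy (sign=-1) profile for one month.

    solar_norm is a 24-value normalized solar shape (SOLAR_NORM[month]).
    """
    raw = [max(0.0, avg_mw * (1 + sign * flex * s / 0.3)) for s in solar_norm]
    total = sum(raw)
    if total == 0:
        raise ValueError("profile was clipped to zero at every hour")
    scale = avg_mw * 24 / total          # energy preservation
    return [r * scale for r in raw]

# Toy solar shape: zero at night, peaking at midday (not real CAISO data)
solar = [0] * 6 + [0.1, 0.3, 0.5, 0.7, 0.9, 1.0,
                   1.0, 0.9, 0.7, 0.5, 0.3, 0.1] + [0] * 6

camel = build_profile(500, 0.2, solar, sign=+1)
night = build_profile(500, 0.5, solar, sign=-1)
print(round(sum(camel)), round(sum(night)))  # both equal 500 * 24 = 12000
print(night[11])                             # clipped to 0.0 at solar noon
```

In the night-heavy case with flex = 0.5, the midday hours are clipped to zero and the remaining hours are scaled up so that the daily total is still 12,000 MWh.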

4.5 Controls

4.6 Monthly Solar Shape

The SOLAR_CF profiles (12 months × 24 hours) are derived from California solar capacity factor data:


5. Chart-by-Chart Guide

Supply vs Demand (Stacked Area)
Top-level view of the grid for any given day

What it shows: Stacked area chart of each generation resource's available capacity (24 hours) overlaid with a demand line.

How to read it:

Key insight: On sunny spring days with low demand, solar pushes total capacity far above demand — this is where overgeneration and negative prices occur.

Effective Headroom (Confidence Bands)
Statistical view of reliability across all 2,192 days

What it shows: For each hour of the day, the distribution of effective headroom across all days in the dataset.

Bands:

Key insight: Headroom is tightest during evening peak hours (5–8 PM) when solar drops off but demand stays high. The p5 line dipping toward or below zero means 5% of days have dangerously low spare capacity at that hour.

LMP vs Net Load Scatter
How electricity prices respond to the gas fleet's workload

What it shows: Each dot is one hour from the dataset. X-axis = net load (demand minus all non-gas generation), Y-axis = LMP (wholesale electricity price).

Color: Time of day (yellow = daytime solar hours, blue/purple = night hours)

Dot size: Scaled by CA natural gas citygate price — larger dots = more expensive gas months

Key insight: When net load is negative (surplus), LMP drops to zero or negative — the grid is paying to offload excess generation. When net load is high, LMP rises sharply as expensive peaker plants come online. Gas price shifts the entire curve vertically.

Headroom Impact: DC Load Profiles
How each DC profile shifts reliability margins

What it shows: The baseline headroom percentile bands (shaded, from the chart above) with additional dashed lines showing the median headroom after adding each data center profile.

Profile lines:

Key insight: The camel profile's line stays closest to the baseline during evening peak hours (4–9 PM) because it consumes less at those times, preserving reliability when it matters most.

Grid Utilization Scorecard
Side-by-side comparison of key metrics

What it shows: Color-coded stat cards comparing Baseline, Flat, Camel, Night-Heavy, and Custom profiles across three metrics:

Metric | Formula | What it means
Avg Utilization | mean((demand + dc_load) / available) | How efficiently grid capacity is used. Higher = better utilization of existing infrastructure.
Min Headroom (p5) | 5th percentile of all hourly headroom values | Worst-case reliability. The headroom level that only 5% of hours fall below.
Stress Hours | Count of hours where headroom < 3,500 MW | How often the grid is dangerously tight. Fewer = better.

Arrows: Green up-arrows indicate improvement vs baseline; red down-arrows indicate degradation.

Key insight: The camel profile increases utilization (good) while adding fewer stress hours than the flat or night-heavy profiles.
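The three scorecard metrics can be sketched in Python as below. The function names are illustrative, the reserve series is passed in rather than recomputed, and the percentile uses linear interpolation, which is an assumption since the text does not say which percentile definition the dashboard uses.

```python
def percentile(values, q):
    """Linear-interpolated percentile, q in [0, 100]."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def scorecard(demand, dc_load, available, reserve):
    """All inputs are equal-length lists of hourly MW values."""
    total = [d + l for d, l in zip(demand, dc_load)]
    headroom = [a - t - r for a, t, r in zip(available, total, reserve)]
    return {
        "avg_utilization": sum(t / a for t, a in zip(total, available)) / len(total),
        "min_headroom_p5": percentile(headroom, 5),
        "stress_hours": sum(1 for x in headroom if x < 3500),
    }

# Two toy hours (illustrative values only)
card = scorecard(demand=[20_000, 30_000], dc_load=[1_000, 1_000],
                 available=[40_000, 37_000], reserve=[3_500, 3_500])
print(card["stress_hours"])  # 1: the second hour has 2,500 MW of headroom
```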

Electricity Cost Comparison by Load Profile
How each profile shifts grid headroom and affects wholesale prices

Methodology: Empirical LMP-Headroom Model

Unlike a simple weighted average that assumes prices don't change when load is added, this analysis uses an empirical relationship learned from 6 years of CAISO data to model how adding DC load shifts grid headroom and thereby affects LMP.

Why Not Use Historical LMP Directly?

A naive approach would compute: cost = Σ(dc_load[h] × historical_LMP[h]) / Σ(dc_load[h]). This produces misleading results because:

Result: All DC profiles would show nearly identical costs (~$35/MWh), with no price signal for flexibility value.

Building the LMP-Headroom Lookup Table

We extracted the empirical relationship between effective headroom and LMP from ~52,080 hourly observations (2020-2025):

Step 1: Define 6 headroom bins (based on scatter plot analysis)

Bin | Headroom Range (GW) | Grid Condition
0 | < 0 | Scarcity (load shedding risk)
1 | 0–5 | Very Tight (peakers running)
2 | 5–10 | Tight (high stress)
3 | 10–20 | Moderate (normal operations)
4 | 20–30 | Comfortable (baseload only)
5 | ≥ 30 | Abundant (possible curtailment)

Step 2: Normalize LMP by gas price

normalized_LMP = LMP / gas_price_CA_citygate

Removes gas price volatility (2020-2021: ~$2-3/MCF vs 2022: ~$7-9/MCF) to extract the structural relationship between headroom and pricing.

Step 3: Build 24×6 lookup matrix

For each hour-of-day (0-23) and headroom bin (0-5):

  1. Collect all historical observations that fall into that (hour, bin) combination
  2. Filter outliers (normalized LMP > 200 MCF/MWh)
  3. Compute median normalized LMP for the bin

lmpHeadroomLookup[hour][bin] = median(normalized_LMP_observations)
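Steps 1 to 3 can be sketched in Python as follows. The observation tuple layout and the `build_lookup` name are assumptions for illustration; bin edges are treated as lower-inclusive (exactly 5 GW falls in bin 2), and empty cells are returned as None, neither of which is specified in the text.

```python
from statistics import median

BIN_EDGES_GW = (0, 5, 10, 20, 30)

def get_headroom_bin(headroom_mw):
    """Map headroom in MW to bins 0-5 (bin 0: < 0 GW, bin 5: >= 30 GW)."""
    gw = headroom_mw / 1000
    return sum(1 for edge in BIN_EDGES_GW if gw >= edge)

def build_lookup(observations, max_normalized=200.0):
    """observations: iterable of (hour, headroom_mw, lmp, gas_price).

    Returns a 24x6 table of median gas-normalized LMP per (hour, bin).
    """
    cells = [[[] for _ in range(6)] for _ in range(24)]
    for hour, headroom_mw, lmp, gas_price in observations:
        normalized = lmp / gas_price          # Step 2
        if normalized > max_normalized:       # outlier filter
            continue
        cells[hour][get_headroom_bin(headroom_mw)].append(normalized)
    return [[median(c) if c else None for c in row] for row in cells]

# Three toy observations at 2 PM; the third is dropped as an outlier
table = build_lookup([(14, 25_000, 20.0, 4.0),
                      (14, 26_000, 40.0, 4.0),
                      (14, 27_000, 5_000.0, 4.0)])
print(table[14][4])  # 7.5
```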

This captures:

Applying the Model

For each day with price data and each hour:

Baseline (No DC):

headroom_baseline = avail[h] - demand[h] - reserves[h]
bin = get_headroom_bin(headroom_baseline)
LMP_baseline = lmpHeadroomLookup[h][bin] × gas_price[month]

With DC Profile:

headroom_dc = avail[h] - (demand[h] + dc_load[h]) - reserves_dc[h]
bin_dc = get_headroom_bin(headroom_dc)
LMP_adjusted = lmpHeadroomLookup[h][bin_dc] × gas_price[month]
cost[profile] = Σ(dc_load[h] × LMP_adjusted[h]) / Σ(dc_load[h])

Note: Reserves are recomputed with DC load since they scale with total demand.
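A compact Python sketch of the "With DC Profile" calculation for one day. To stay self-contained it takes the reserve rule and the binning rule as caller-supplied functions (`reserve_fn`, `bin_fn`); those parameter names and the toy two-bin lookup are hypothetical and only serve the illustration.

```python
def profile_cost(avail, demand, dc_load, lookup, gas_price, reserve_fn, bin_fn):
    """Load-weighted average adjusted LMP ($/MWh) for one day.

    lookup[h][bin] holds gas-normalized LMP; reserve_fn(demand_mw, h) and
    bin_fn(headroom_mw) are supplied by the caller.
    """
    cost = energy = 0.0
    for h in range(24):
        total = demand[h] + dc_load[h]
        # Reserves are recomputed on demand plus DC load
        headroom_dc = avail[h] - total - reserve_fn(total, h)
        lmp_adjusted = lookup[h][bin_fn(headroom_dc)] * gas_price
        cost += dc_load[h] * lmp_adjusted
        energy += dc_load[h]
    return cost / energy

# Toy setup: two bins only, a tight bin (index 0) and a comfortable bin (index 1)
lookup = [[20.0, 5.0]] * 24
args = dict(avail=[40_000] * 24, demand=[25_000] * 24, lookup=lookup,
            gas_price=4.0, reserve_fn=lambda d, h: 3_500,
            bin_fn=lambda mw: 0 if mw < 10_000 else 1)

print(profile_cost(dc_load=[1_000] * 24, **args))  # 20.0 (headroom 10.5 GW)
print(profile_cost(dc_load=[2_000] * 24, **args))  # 80.0 (headroom 9.5 GW)
```

The second call shows the mechanism the model relies on: the larger load pushes every hour into a tighter bin, so the same lookup table returns a higher price.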

Chart Display

Bar chart: Single bar per profile showing adjusted LMP ($/MWh). No component breakdown (MEC/MCC/Loss/GHG) because the lookup table provides total LMP only.

Baseline bar: Simple average LMP under historical grid conditions (no DC load). Provides reference for what grid prices were before any DC is added.

Stat cards below: Show per-profile $/MWh, annual cost ($M/year), and savings vs Flat (not Baseline). Color borders match profile colors.

Key Results

Profile | Typical Cost | Why?
Baseline | $38-42/MWh | Average grid LMP with no DC (reference point)
Flat | $40-45/MWh | Reduces headroom uniformly at all hours
Camel | $28-35/MWh | Loads during surplus hours (high headroom → low LMP)
Night-Heavy | $55-75/MWh | Loads during tight evening hours (low headroom → high LMP)

Economic justification for flexible load: A 500 MW solar-following DC saves $10-15M/year compared to flat profile, purely from avoiding high-LMP hours. At 5,000 MW, night-heavy profiles spike to $100-150/MWh as they push the grid into scarcity bins.

Limitations:

  • Model is empirical, not a full dispatch simulation (no unit commitment, transmission constraints, or offer-based pricing)
  • Lookup table provides total LMP only (no component breakdown available)
  • Assumes bin-level granularity (6 bins) is sufficient to capture price curve
  • Large DCs (5+ GW) may exceed the range of historical observations

Stress Day Deep-Dive (Heatmap)
The 100 most-stressed days, hour by hour

How it works:

  1. For each of 2,192 days, compute the minimum hourly headroom (baseline)
  2. Sort all days by minimum headroom (ascending — worst days first)
  3. Take the 100 most-stressed days
  4. Display a heatmap: rows = days, columns = 24 hours, color = headroom
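Steps 1 to 3 amount to ranking days by their minimum hourly headroom. A minimal Python sketch, assuming headroom has already been computed per day and stored in a dictionary keyed by date (the name `headroom_by_day` and the sample values are illustrative):

```python
def most_stressed_days(headroom_by_day, n=100):
    """Return the n dates with the lowest daily minimum headroom, worst first.

    headroom_by_day maps "YYYY-MM-DD" to a list of hourly headroom values (MW).
    """
    ranked = sorted(headroom_by_day, key=lambda day: min(headroom_by_day[day]))
    return ranked[:n]

# Toy data with two hours per day (illustrative values only)
sample = {
    "2022-09-06": [-500, 2_000],
    "2021-04-10": [9_000, 15_000],
    "2020-08-14": [100, 3_000],
}
print(most_stressed_days(sample, n=2))  # ['2022-09-06', '2020-08-14']
```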

Color scale:

Tabs: Switch between Baseline, +Flat DC, +Camel DC, +Night DC, and +Custom DC to see how each profile changes the heatmap pattern. Hover over any cell for exact headroom values.

Summary cards: Show average minimum headroom, worst single hour, and stress hour count for the selected profile — all computed on the 100 stress days only.

Overgeneration Surplus Absorption
How much excess renewable energy each profile absorbs

What is surplus? Hours when all non-gas generation exceeds demand:

surplus[h] = max(0, non_gas_gen[h] − demand[h])

Where non_gas_gen = Solar + Wind + Hydro + Nuclear + Geothermal + Biomass + Other Thermal.

When surplus > 0, even with ALL gas plants shut off, the grid overproduces. This energy is typically curtailed (mostly solar).

Chart: Monthly bar chart showing surplus (gray line) and how much each DC profile absorbs (colored bars). Values are annualized (divided by years of data).

Absorption formula:

absorbed[h] = min(dc_load[h], surplus[h])

Camel DCs absorb the most because they peak during solar hours — exactly when surplus is highest.
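The surplus and absorption formulas can be checked with a few lines of Python. This is an illustrative sketch with toy numbers; the function name is an assumption.

```python
def absorbed_energy(non_gas_gen, demand, dc_load):
    """Return (total surplus, total absorbed) in MWh over the given hours."""
    surplus = [max(0.0, g - d) for g, d in zip(non_gas_gen, demand)]
    absorbed = [min(load, s) for load, s in zip(dc_load, surplus)]
    return sum(surplus), sum(absorbed)

# Three toy hours: no surplus, large surplus, small surplus
surplus, absorbed = absorbed_energy(
    non_gas_gen=[10_000, 30_000, 28_000],
    demand=[20_000, 25_000, 27_500],
    dc_load=[1_000, 1_000, 1_000],
)
print(surplus, absorbed)  # 5500.0 1500.0
```

In the third hour the data center draws 1,000 MW but only 500 MW of surplus exists, so only 500 MWh counts as absorbed.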

Why include hydro in non-gas generation?

Limitations: Available capacity ≠ actual generation. CAISO's 2025 curtailment was ~92% transmission-constrained (3.12 of 3.4 TWh) — our model captures only oversupply-driven surplus. No imports/exports or battery dispatch are modeled.

Marginal Emissions by Load Profile
CO₂ impact of each profile based on which gas plants are on the margin

Methodology: Dispatch-Stack Proxy

For each hour, we determine the marginal generator by walking the gas merit order:

  1. Compute gas residual: gasResidual = max(0, demand - nonGasGen)
  2. Walk merit order (cheapest first):
    • Gas CCGT capacity: if gasResidual fits → marginal rate = 0.41 tCO₂/MWh
    • Gas Steam → 0.55 tCO₂/MWh
    • Gas ICE → 0.50 tCO₂/MWh
    • Gas Peaker → 0.63 tCO₂/MWh
  3. If gasResidual = 0 (surplus): marginal rate = 0 tCO₂/MWh, because adding load avoids RE curtailment rather than causing gas combustion

Gas Type | Heat Rate (MMBtu/MWh) | CO₂ Rate (tCO₂/MWh)
CCGT | 7.0 | 0.41
Steam | 9.5 | 0.55
ICE | 8.5 | 0.50
Peaker | 10.8 | 0.63

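Rates are based on standard heat rates and the EPA emission factor for natural gas (53.06 kg CO₂/MMBtu). The direct product heat rate × 53.06 gives 0.37, 0.50, 0.45, and 0.57 tCO₂/MWh for CCGT, Steam, ICE, and Peaker respectively, so the tabulated rates sit about 10% above it.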
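The merit-order walk can be sketched in Python as below. Two points are assumptions rather than statements from the text: capacity is stacked cumulatively in the listed order, and a residual larger than all gas capacity is assigned the peaker rate. Function and variable names are illustrative.

```python
# (resource, tCO2/MWh) in the merit order listed above
MERIT_ORDER = [("Gas CCGT", 0.41), ("Gas Steam", 0.55),
               ("Gas ICE", 0.50), ("Gas Peaker", 0.63)]

def marginal_rate(demand_mw, non_gas_gen_mw, gas_capacity_mw):
    """Emission rate of the gas type serving the last MW of residual load."""
    gas_residual = max(0.0, demand_mw - non_gas_gen_mw)
    if gas_residual == 0:
        return 0.0                      # surplus hour: no gas on the margin
    cumulative = 0.0
    for name, rate in MERIT_ORDER:
        cumulative += gas_capacity_mw.get(name, 0.0)
        if gas_residual <= cumulative:
            return rate
    return MERIT_ORDER[-1][1]           # assumption: beyond the stack, use peaker rate

# Toy gas fleet (illustrative capacities in MW)
gas = {"Gas CCGT": 10_000, "Gas Steam": 2_000, "Gas ICE": 500, "Gas Peaker": 5_000}
print(marginal_rate(28_000, 20_000, gas))  # 0.41, residual 8 GW fits in CCGT
print(marginal_rate(31_000, 20_000, gas))  # 0.55, residual 11 GW reaches Steam
print(marginal_rate(18_000, 20_000, gas))  # 0.0, surplus hour
```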

Chart (dual axis):

Key insight: During solar hours, marginal emissions are near zero because the next MW of load displaces RE curtailment rather than causing additional gas combustion. Camel DCs consume most energy during these low-emission hours, making them the cleanest option per MWh consumed.

Stat cards: Show per-profile weighted average emission rate (tCO₂/MWh), annual CO₂ (ktCO₂/yr), and savings vs Flat.

Optimized Load Profile Calculator
Computes the optimal hourly DC load allocation that minimizes electricity cost while meeting daily energy requirements

Purpose: While preset profiles (Flat, Camel, Night-Heavy) demonstrate general patterns, this calculator finds the theoretically best load allocation across all 24 hours for each day, given real grid conditions and empirical LMP-headroom relationships.

Methodology: Iterative Greedy Optimization with Constraints

For a data center of size avgMw, the optimizer allocates avgMw × 24 MWh/day of energy across 24 hours to minimize cost:

  1. Initialization: Set optLoad[h] = 0 for all hours (h = 0..23)
  2. Increment sizing:
    • avgMw ≤ 1,000 MW → 50 MW increments
    • avgMw ≤ 2,500 MW → 100 MW increments
    • avgMw > 2,500 MW → 200 MW increments

    Larger increments for bigger DCs reduce computation time while maintaining accuracy.

  3. Iterative allocation: Repeat until sum(optLoad) ≈ avgMw × 24:
    1. For each hour h:
      • Skip if optLoad[h] + INCREMENT_MW > 3 × avgMw (hourly capacity constraint)
      • Compute test demand: testDemand[h] = demand[h] + optLoad[h] + INCREMENT_MW
      • Compute test reserves: testReserves = max(0.06 × max(testDemand), 3500)
      • Compute headroom: headroom = available[h] - testDemand[h] - testReserves
      • Lookup marginal LMP from empirical 24×6 lookup table: LMP[h][bin(headroom)]
      • Compute marginal cost: cost = INCREMENT_MW × LMP
    2. Find hour h* with lowest marginal cost
    3. Allocate: optLoad[h*] += INCREMENT_MW
  4. Rescaling: After allocation, rescale to exactly match target:
    scale = (avgMw × 24) / sum(optLoad)
    optLoad[h] = optLoad[h] × scale    for all h
  5. Cost computation: For each day's optimized profile, recompute actual cost using the adjusted headroom-based LMP (same method as other profiles)

Hourly capacity constraint: No single hour can exceed 3× avgMw. This prevents unrealistic concentration (e.g., allocating all 24h of energy to a single hour). For a 4,500 MW DC, max hourly load = 13,500 MW.
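The allocation loop for a single day can be sketched in Python as follows. This is one reading of the steps above, not the dashboard's JavaScript: the test demand adds the increment only at the candidate hour, ties go to the earliest hour, the lookup is assumed to hold $/MWh directly (gas price scaling is omitted), and the flat fallback for a zero allocation is an added guard.

```python
BIN_EDGES_GW = (0, 5, 10, 20, 30)

def headroom_bin(headroom_mw):
    """Bins 0-5 as in the lookup table (bin 0: < 0 GW, bin 5: >= 30 GW)."""
    return sum(1 for edge in BIN_EDGES_GW if headroom_mw / 1000 >= edge)

def optimize_day(avg_mw, demand, available, lookup):
    """Greedy hourly allocation of avg_mw * 24 MWh for one day."""
    increment = 50 if avg_mw <= 1000 else 100 if avg_mw <= 2500 else 200
    target = avg_mw * 24
    opt = [0.0] * 24
    while sum(opt) + increment <= target:
        best_h, best_cost = None, float("inf")
        for h in range(24):
            if opt[h] + increment > 3 * avg_mw:
                continue                          # hourly capacity constraint
            test_demand = [d + o for d, o in zip(demand, opt)]
            test_demand[h] += increment
            reserves = max(0.06 * max(test_demand), 3500)
            headroom = available[h] - test_demand[h] - reserves
            cost = increment * lookup[h][headroom_bin(headroom)]
            if cost < best_cost:
                best_h, best_cost = h, cost
        if best_h is None:
            break                                 # every hour is at its cap
        opt[best_h] += increment
    total = sum(opt)
    if total == 0:
        return [float(avg_mw)] * 24               # assumption: fall back to flat
    return [x * target / total for x in opt]      # rescale to the exact target

# Toy day: extra capacity from 10 AM to 4 PM, price falls as headroom rises
lookup = [[100, 60, 40, 20, 10, 2]] * 24
available = [45_000.0 if 10 <= h < 16 else 30_000.0 for h in range(24)]
plan = optimize_day(500, [25_000.0] * 24, available, lookup)
print(plan[10:16])  # midday hours filled to the 3x cap of 1,500 MW
```

In this toy case the six midday hours fill to the cap first, and the remaining 3,000 MWh spills into the earliest night hours because all night hours tie on cost.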

Why Greedy Works

The greedy approach (always choosing the lowest-cost hour for the next increment) is near-optimal because:

Results Display

Chart: Dual-bar chart showing:

The chart's y-axis is fixed across all dates (set to 1.2× the maximum hourly load across all optimized days) to make day-to-day variation visible.

Missing data warning: If a date is missing demand, capacity, or price data (~22 of 2,192 days), the purple bars show the average profile and a warning appears. This ensures consistent visualization.

Scorecard: Four cost comparison cards:

Each card shows:

Key Insights

  • Camel ≈ Optimized: The preset Camel profile typically comes within 2-5% of the optimized cost, validating that solar-following is near-optimal
  • Hourly variation: Optimized profiles vary significantly day-to-day based on actual grid conditions (solar output, demand peaks, headroom), unlike preset profiles which use fixed monthly shapes
  • Marginal value of flexibility: The savings (Flat cost - Optimized cost) quantify the economic value of perfect flexibility

Performance

Typical computation times (client-side JavaScript):

DC Size | Increments per Day | Total Iterations | Time
500 MW | 240 | ~520k | 3-5 sec
2,000 MW | 480 | ~1.0M | 8-12 sec
5,000 MW | 600 | ~1.3M | 15-20 sec

Computation is opt-in (triggered by "Calculate" button) to avoid slowing down the main dashboard.


6. Custom Profile Editor

Click the "Custom" preset button to open the interactive profile editor. This adds a 4th profile alongside Flat, Camel, and Night-Heavy, allowing direct comparison of any arbitrary load shape against the presets.

How it works

Data model

customMultipliers[24]: relative weights, sum = 24
actual_load[h] = avgMw × customMultipliers[h]

Since sum(customMultipliers) = 24, total daily energy is always avgMw × 24 MWh.

Normalization

After each draw stroke (mouseup):

scale = 24 / sum(customMultipliers)
customMultipliers[h] = customMultipliers[h] × scale    for all h

This ensures energy preservation regardless of the shape you draw. The "Reset to Flat" button restores all multipliers to 1.0.
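The normalization step is small enough to show in full. The Python sketch below mirrors the formula above; the function name and the flat fallback for an all-zero drawing are assumptions, since the text does not say how that case is handled.

```python
def normalize_multipliers(multipliers):
    """Rescale 24 relative weights so that they sum to 24."""
    total = sum(multipliers)
    if total <= 0:
        return [1.0] * 24          # assumption: fall back to the flat shape
    return [m * 24 / total for m in multipliers]

# A drawn shape with doubled weight from 10 AM to 4 PM
drawn = [1.0] * 10 + [2.0] * 6 + [1.0] * 8
weights = normalize_multipliers(drawn)
avg_mw = 500
load = [avg_mw * w for w in weights]
print(round(sum(weights), 6), round(sum(load), 6))  # 24.0 12000.0
```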


7. Assumptions & Limitations

# | Assumption | Impact
1 | Monthly average capacity profiles applied to all days | Overestimates surplus on cloudy days, underestimates on clear days
2 | No battery/storage dispatch modeled | Underestimates real-world surplus absorption
3 | No import/export flows | CAISO exports surplus to neighbors; our model assumes autarky
4 | Simplified reserve model (%-of-demand + floor) | Approximation of CAISO's actual procurement
5 | DLAP load-weighted LMP (PG&E 43.5%, SCE 43.5%, SDG&E 13%) | Smooths out locational price extremes
6 | DC is price-taker (no market impact) | Reasonable up to ~500 MW; large DCs would affect prices
7 | Solar CF profiles are fixed monthly averages | No cloud/weather variability captured
8 | Analysis is CAISO-specific | Virginia/PJM would need different data, different generation mix
9 | Time period: 2020–2025 | Includes COVID demand anomaly (2020), Texas freeze (Feb 2021), summer heat events
10 | Hydro treated as inflexible in surplus calculation | Partially justified by spring snowmelt; overstates surplus in dry months
11 | Simplified gas merit order for emissions | Real CAISO dispatch uses offer-based optimization, not a fixed stack
12 | No imported emissions modeled | CAISO imports significant energy from neighboring regions
13 | Static emission rates per gas type | Real rates vary with plant efficiency curves and ambient conditions

8. Validation Reference Points

Metric | Our Model | Real-World Reference | Notes
Annual surplus | ~3.1 TWh/yr | CAISO curtailed ~3.4 TWh (2025) | Close match, but different mechanism mix
Peak demand | ~45–52 GW | CAISO record: 52.1 GW (Sep 2022) | Data captures this
Min headroom | Can go negative | CAISO issues Flex Alerts when tight | Model captures stress periods
LMP range | -$100 to $1,000+/MWh | CAISO prices have gone negative and spiked >$1,000 | Verified for specific dates

Verification Checklist

To verify any specific calculation from the dashboard:

  1. Pick a known date (e.g., 2023-08-15, a summer peak day)
  2. Check demand: demandData["2023-08-15"] → 24-element array
  3. Check capacity: capacityData["2023"]["08"] → resource arrays
  4. Compute headroom by hand: sum(all resources except Battery)[h] - demand[h] - max(3500, demand[h] × reserve_percentage), where reserve_percentage depends on the hour (0.084 for peak hours; see section 3.1)
  5. Check LMP: priceData["2023-08-15"]["14"] → {LMP, MEC, MCC, Loss, GHG}
  6. Compute DC cost: dc_load[h] × LMP[h] for several hours, compare to displayed average

The dashboard pre-computes these across all 2,192 days. Any individual data point can be verified against the raw JSON files.


9. Future Data Sources (Not Yet Integrated)

Source | What It Provides | Why It Matters
EIA Hourly Generation (EIA-930) | Actual hourly generation by fuel type | Replaces capacity-based proxy with real generation data
CAISO Curtailment Data | Actual hourly curtailment by resource | Ground truth for surplus/curtailment analysis
WattTime MOER | Marginal Operating Emissions Rate | Real marginal emission rates by grid region and hour
PJM Data (Virginia) | PJM hourly prices, generation, capacity | Enables analysis for Virginia data centers (where the utilization bill passed)