Improve car load subtraction logic#3406
Conversation
Increase hold back to 48 hours if we have enough data
There was a problem hiding this comment.
Pull request overview
Updates Predbat’s ML load-forecasting pipeline to (a) subtract EV/car charging energy from household load more robustly and (b) increase the validation holdout window to 48h when sufficient history is available.
Changes:
- Add a new
LoadMLComponent.car_subtraction()path that interpolates gaps and smooths/matches car deltas before subtracting from load. - Make
LoadPredictor.train()acceptvalidation_holdout_hoursand wire it through to dataset creation. - Update/extend ML load tests to reflect the “backwards cumulative” minute-series convention and to use a 48h holdout in the real-data training helper.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| apps/predbat/load_ml_component.py | Adds improved car charging subtraction logic and dynamically sets validation holdout to 24/48h based on available history. |
| apps/predbat/load_predictor.py | Threads a configurable validation holdout window into dataset creation/training and adds dataset split logging. |
| apps/predbat/tests/test_load_ml.py | Adjusts synthetic cumulative series construction and updates the (optional) real-data training helper to use a 48h holdout. |
| def car_subtraction(self, load_cumulative, car_cumulative, step=PREDICT_STEP, interpolate_gaps=True, max_gap_minutes=60, smoothing_window=3): | ||
| """ | ||
| Improved car charging subtraction that handles timing misalignment and gaps. | ||
|
|
||
| Args: | ||
| load_cumulative: Dict of {minute: cumulative_kwh} for load (backwards format) | ||
| car_cumulative: Dict of {minute: cumulative_kwh} for car charging (backwards format) | ||
| step: Step size in minutes (default 5) | ||
| interpolate_gaps: Whether to interpolate missing car data | ||
| max_gap_minutes: Maximum gap to interpolate (minutes) | ||
| smoothing_window: Number of steps for smoothing car data | ||
|
|
There was a problem hiding this comment.
car_subtraction() introduces new behavior (gap interpolation, median smoothing, and ±window matching), but current tests only cover perfectly aligned 5‑minute series. Please add a unit test that includes (a) missing car data points/gaps and (b) a few-minute offset between load and car series to verify interpolation/smoothing works and never over-subtracts beyond load.
| def train(self, load_minutes, now_utc, pv_minutes=None, temp_minutes=None, import_rates=None, export_rates=None, is_initial=True, epochs=100, time_decay_days=7, patience=5, validation_holdout_hours=24): | ||
| """ | ||
| Train or fine-tune the model. | ||
|
|
There was a problem hiding this comment.
train() docstring still states validation uses the most recent 24 hours, but behavior is now configurable via validation_holdout_hours. Please update the docstring wording so it reflects the parameter-driven holdout (e.g., 24h vs 48h).
| if i == 0: | ||
| energy_kwh = val_predictions[minute] | ||
| # First cumulative value IS the energy for the first step (0-5 min) | ||
| energy_kwh = 0 | ||
| else: | ||
| # Subsequent steps: calculate delta from previous cumulative | ||
| prev_minute = val_pred_keys[i - 1] | ||
| energy_kwh = max(0, val_predictions[minute] - val_predictions[prev_minute]) | ||
| val_pred_minutes.append(minute) | ||
| val_pred_energy.append(energy_kwh) | ||
| val_pred_energy.append(dp4(energy_kwh)) |
There was a problem hiding this comment.
In _test_real_data_training(), the first prediction step’s energy is being set to 0, but LoadPredictor.predict() returns a cumulative series where the value at minute 0 already equals the first step energy (it starts cumulative from 0 and adds step 0). Use the first cumulative value for i == 0 instead of forcing it to zero; otherwise the plotted/compared energy series is shifted and undercounts by one step.
| if i == 0: | ||
| # First step - use the value directly as energy | ||
| energy_kwh = predictions[minute] | ||
| # First cumulative value IS the energy for the first step (0-5 min) | ||
| energy_kwh = 0 | ||
| else: |
There was a problem hiding this comment.
Same issue for the main predictions conversion: the first step’s energy should come from predictions[0] (first cumulative value), not 0. As written, the per-step series drops the first interval and makes downstream comparisons/charts inaccurate.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Increase hold back to 48 hours if we have enough data