Changes to model normalization to allow them to align over time to the data #3408

Merged

springfall2008 merged 4 commits into main from fixes24 on Feb 21, 2026
Conversation

@springfall2008 (Owner)

No description provided.

Copilot AI review requested due to automatic review settings February 21, 2026 19:50
Copilot AI (Contributor) left a comment
Pull request overview

This pull request implements exponential moving average (EMA) updates for normalization parameters in the ML load predictor to allow the model to gradually adapt to distribution drift over time (e.g., seasonal changes, new appliances, tariff changes). The feature normalization statistics (mean/std) are blended with new data during each fine-tuning cycle using a configurable alpha parameter (default 0.1 for slow drift tracking).

Changes:

  • Added EMA blending logic to _normalize_features method for tracking feature distribution drift during fine-tuning
  • Introduced norm_ema_alpha parameter to control drift adaptation rate (0=frozen, 0.1=slow drift)
  • Added normalization statistics logging to track drift in feature groups over time
  • Updated documentation to explain EMA drift tracking, normalization logging, and model persistence behavior

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File Description
docs/load-ml.md Documents EMA normalization drift tracking feature, explains alpha parameter, describes normalization stats logging, and clarifies model persistence behavior
apps/predbat/load_predictor.py Implements EMA blending for feature normalization, adds drift tracking logs, introduces norm_ema_alpha parameter, and refactors normalization logic to support gradual adaptation

Comment on lines 697 to 707
```python
# Apply same min-std clamping to new stats before blending
n_features = len(new_std)
min_std = np.ones(n_features) * 1e-8
if n_features == TOTAL_FEATURES:
    min_std[0:LOOKBACK_STEPS] = 0.01
    min_std[LOOKBACK_STEPS : 2 * LOOKBACK_STEPS] = 0.01
    min_std[2 * LOOKBACK_STEPS : 3 * LOOKBACK_STEPS] = 0.5
    min_std[3 * LOOKBACK_STEPS : 4 * LOOKBACK_STEPS] = 1.0
    min_std[4 * LOOKBACK_STEPS : 5 * LOOKBACK_STEPS] = 1.0
    min_std[5 * LOOKBACK_STEPS :] = 0.01
new_std = np.maximum(new_std, min_std)
```
Copilot AI commented on Feb 21, 2026
The min-std clamping logic is duplicated between the initial fit path (lines 671-686) and the EMA update path (lines 697-707). This duplication makes the code harder to maintain - if these values need to change, they must be updated in multiple places.

Consider extracting this logic into a helper method like _get_min_std_array(n_features) that returns the appropriate min-std array for the given number of features.
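Such a helper could look like the sketch below, which reuses the clamping values from the excerpt above. The name `get_min_std_array` and the `LOOKBACK_STEPS` value are illustrative; the real constants live in `apps/predbat/load_predictor.py`.

```python
import numpy as np

# Illustrative constants; the real values are defined in load_predictor.py.
LOOKBACK_STEPS = 48
TOTAL_FEATURES = 6 * LOOKBACK_STEPS


def get_min_std_array(n_features):
    """Return the per-feature-group minimum std floor used before blending.

    Mirrors the clamping logic duplicated in the initial-fit and EMA paths,
    so both call sites can share a single definition.
    """
    min_std = np.ones(n_features) * 1e-8
    if n_features == TOTAL_FEATURES:
        min_std[0:LOOKBACK_STEPS] = 0.01
        min_std[LOOKBACK_STEPS : 2 * LOOKBACK_STEPS] = 0.01
        min_std[2 * LOOKBACK_STEPS : 3 * LOOKBACK_STEPS] = 0.5
        min_std[3 * LOOKBACK_STEPS : 4 * LOOKBACK_STEPS] = 1.0
        min_std[4 * LOOKBACK_STEPS : 5 * LOOKBACK_STEPS] = 1.0
        min_std[5 * LOOKBACK_STEPS :] = 0.01
    return min_std
```

Both paths would then call `new_std = np.maximum(new_std, get_min_std_array(len(new_std)))`.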

Comment on lines 692 to 712
```python
elif ema_alpha > 0 and self.feature_mean is not None and self.feature_std is not None:
    # EMA update: blend new statistics with existing to track distribution drift
    new_mean = np.mean(X, axis=0)
    new_std = np.std(X, axis=0)

    # Apply same min-std clamping to new stats before blending
    n_features = len(new_std)
    min_std = np.ones(n_features) * 1e-8
    if n_features == TOTAL_FEATURES:
        min_std[0:LOOKBACK_STEPS] = 0.01
        min_std[LOOKBACK_STEPS : 2 * LOOKBACK_STEPS] = 0.01
        min_std[2 * LOOKBACK_STEPS : 3 * LOOKBACK_STEPS] = 0.5
        min_std[3 * LOOKBACK_STEPS : 4 * LOOKBACK_STEPS] = 1.0
        min_std[4 * LOOKBACK_STEPS : 5 * LOOKBACK_STEPS] = 1.0
        min_std[5 * LOOKBACK_STEPS :] = 0.01
    new_std = np.maximum(new_std, min_std)

    # Blend: small alpha = slow drift tracking, large alpha = fast adaptation
    self.feature_mean = ema_alpha * new_mean + (1 - ema_alpha) * self.feature_mean
    self.feature_std = ema_alpha * new_std + (1 - ema_alpha) * self.feature_std
    self._log_normalization_stats(label="ema-update alpha={}".format(ema_alpha))
```
Copilot AI commented on Feb 21, 2026
The new EMA normalization drift tracking functionality (norm_ema_alpha parameter and EMA update logic) is not covered by tests. The existing test_load_ml.py has a _test_normalization function but it only tests basic z-score normalization, not the EMA blending behavior during fine-tuning.

Consider adding a test that:

  1. Trains a model on initial data
  2. Captures the initial feature_mean and feature_std
  3. Fine-tunes with data from a shifted distribution
  4. Verifies that feature_mean and feature_std have moved toward the new distribution proportional to the EMA alpha value
  5. Confirms predictions still work correctly with the updated normalization
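The core of steps 2–4 can be checked without the full predictor by exercising the blend formula directly. The sketch below assumes only the EMA arithmetic quoted in the excerpt above, not the real `train()` signature:

```python
import numpy as np

alpha = 0.1
# Stored statistics from the initial fit (step 2)
old_mean, old_std = np.array([1.0]), np.array([0.5])

# Fine-tune batch drawn from a shifted distribution (step 3)
rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=1.0, size=(1000, 1))
new_mean, new_std = X.mean(axis=0), X.std(axis=0)

# The blend from the PR: small alpha = slow drift tracking
blended_mean = alpha * new_mean + (1 - alpha) * old_mean
blended_std = alpha * new_std + (1 - alpha) * old_std

# Step 4: stats moved toward the new distribution...
assert old_mean[0] < blended_mean[0] < new_mean[0]
# ...by exactly alpha of the remaining gap
assert np.isclose(blended_mean[0] - old_mean[0], alpha * (new_mean[0] - old_mean[0]))
```

A full test would wrap this around actual `train()`/fine-tune calls and add the prediction check in step 5.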

```
time_decay_days: Time constant for sample weighting
patience: Early stopping patience
validation_holdout_hours: Hours of most recent data to hold out for validation
norm_ema_alpha: EMA alpha for normalization drift tracking during fine-tuning (0=frozen, 0.1=slow drift)
```
Copilot AI commented on Feb 21, 2026
The norm_ema_alpha parameter is not validated. If a user passes a negative value or a value greater than 1.0, the EMA blending formula could produce unexpected results (e.g., negative variance, or extrapolation beyond the new statistics).

Consider adding parameter validation at the start of the train method to ensure 0.0 <= norm_ema_alpha <= 1.0, raising a ValueError if the constraint is violated.
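A minimal sketch of such a guard is shown below; the function name is hypothetical, and in practice the check would sit inline at the top of `train`:

```python
def validate_norm_ema_alpha(norm_ema_alpha):
    """Reject alpha values outside [0, 1], which would make the EMA blend
    extrapolate (alpha > 1) or invert toward negative variance (alpha < 0)."""
    if not 0.0 <= norm_ema_alpha <= 1.0:
        raise ValueError("norm_ema_alpha must be in [0.0, 1.0], got {}".format(norm_ema_alpha))
```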

- Preserves learned patterns while adapting to new ones
- Same regularization techniques applied as initial training
- Each fine-tune cycle blends the current data's feature statistics (mean/std) with the stored normalization parameters via an exponential moving average (alpha=0.1).
This lets the model slowly track long-term shifts in feature distributions (e.g. seasonal load changes, new tariff rates) without sudden jumps that could destabilise existing weights.
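The "slow tracking" behaviour at alpha=0.1 can be made concrete with a small numeric sketch (the numbers below are illustrative, not from the codebase): each fine-tune cycle closes 10% of the remaining gap between the stored mean and the new data's mean, so a step change in the distribution is absorbed gradually.

```python
# A step change in a feature mean from 0.0 to 10.0, absorbed over repeated
# fine-tune cycles with the EMA blend stored = alpha*new + (1-alpha)*stored.
alpha = 0.1
stored, target = 0.0, 10.0
history = []
for cycle in range(22):
    stored = alpha * target + (1 - alpha) * stored
    history.append(stored)
# After n cycles the stored mean has covered a fraction 1 - 0.9**n of the
# step; ~22 cycles are needed to cover about 90% of it.
```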
Copilot AI commented on Feb 21, 2026
The word "destabilise" uses British spelling, but the codebase generally uses American English spelling (e.g., "optimize" not "optimise" in the same file). For consistency, change "destabilise" to "destabilize".

Suggested change

```diff
-This lets the model slowly track long-term shifts in feature distributions (e.g. seasonal load changes, new tariff rates) without sudden jumps that could destabilise existing weights.
+This lets the model slowly track long-term shifts in feature distributions (e.g. seasonal load changes, new tariff rates) without sudden jumps that could destabilize existing weights.
```

Comment on lines +840 to +841
```python
y_train_norm = self._normalize_targets(y_train, fit=False)
self.log("ML Predictor: Applied EMA normalization update (alpha={}) to track feature drift".format(norm_ema_alpha))
```
Copilot AI commented on Feb 21, 2026
Target normalization parameters are not updated during fine-tuning, only feature normalization parameters receive EMA updates. This inconsistency means that if the target distribution (load values) drifts over time (e.g., seasonal changes, new appliances), the target normalization will become stale while feature normalization adapts. This could lead to prediction issues since the model's output layer expects targets normalized with the original statistics.

Consider also applying EMA updates to target_mean and target_std during fine-tuning to maintain consistency with the feature normalization drift tracking approach.

Suggested change

```diff
-y_train_norm = self._normalize_targets(y_train, fit=False)
-self.log("ML Predictor: Applied EMA normalization update (alpha={}) to track feature drift".format(norm_ema_alpha))
+y_train_norm = self._normalize_targets(y_train, fit=False, ema_alpha=norm_ema_alpha)
+self.log("ML Predictor: Applied EMA normalization update (alpha={}) to track feature and target drift".format(norm_ema_alpha))
```
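A standalone sketch of what an `ema_alpha`-aware target normalization could look like is given below. This mirrors only the idea of the reviewer's suggestion; the real method is `LoadPredictor._normalize_targets` and its actual body is not shown in this PR excerpt:

```python
import numpy as np


def normalize_targets(y, target_mean, target_std, ema_alpha=0.0):
    """Hypothetical free-function version: optionally EMA-blend the stored
    target statistics with the current batch's before z-scoring, matching
    the feature-side drift tracking."""
    if ema_alpha > 0 and target_mean is not None and target_std is not None:
        target_mean = ema_alpha * float(np.mean(y)) + (1 - ema_alpha) * target_mean
        target_std = ema_alpha * max(float(np.std(y)), 1e-8) + (1 - ema_alpha) * target_std
    return (np.asarray(y) - target_mean) / target_std, target_mean, target_std
```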

```python
    grp_std = float(np.mean(self.feature_std[start:end]))
    parts.append("{}(mean={:.4f} std={:.4f})".format(name, grp_mean, grp_std))

self.log("ML Predictor: Normalization stats [{}] target(mean={:.4f} std={:.4f}) {}".format(label, self.target_mean if self.target_mean is not None else 0, self.target_std if self.target_std is not None else 0, " ".join(parts)))
```
Copilot AI commented on Feb 21, 2026
The logging statement includes target mean/std in the normalization stats output, but these values won't actually change during EMA updates since only feature normalization receives EMA updates. While this doesn't cause incorrect behavior, it could be misleading when monitoring drift - users might expect to see target stats change during fine-tuning cycles when they actually remain frozen.

Consider either: (1) updating the log format to clarify that target stats are from initial fit only, or (2) implementing EMA updates for target normalization as well for consistency.

Suggested change

```diff
-self.log("ML Predictor: Normalization stats [{}] target(mean={:.4f} std={:.4f}) {}".format(label, self.target_mean if self.target_mean is not None else 0, self.target_std if self.target_std is not None else 0, " ".join(parts)))
+self.log("ML Predictor: Normalization stats [{}] target_init(mean={:.4f} std={:.4f}, initial-fit only) {}".format(label, self.target_mean if self.target_mean is not None else 0, self.target_std if self.target_std is not None else 0, " ".join(parts)))
```

- **Network weights and biases**: All 4 hidden layers plus output layer
- **Optimizer state**: Adam momentum terms for continuing fine-tuning
- **Normalization parameters**: Feature and target mean/standard deviation
- **Normalization parameters**: Feature and target mean/standard deviation (updated via EMA each fine-tune cycle to track distribution drift)
Copilot AI commented on Feb 21, 2026
The documentation states "normalization parameters" are updated via EMA, but the implementation only applies EMA updates to feature normalization (feature_mean, feature_std), not target normalization (target_mean, target_std). The documentation should either clarify that only feature normalization receives EMA updates, or the code should be updated to also apply EMA to target normalization for consistency.

Suggest updating this line to: "Feature normalization parameters (mean/standard deviation) are updated via EMA each fine-tune cycle to track distribution drift"

- **Training metadata**: Epochs trained, timestamp, model version, architecture details

The model is automatically loaded on Predbat restart, allowing predictions to continue immediately without retraining.
The model is automatically loaded on Predbat restart, allowing predictions to continue immediately without retraining. The EMA-updated normalization parameters are saved and restored with the model, so drift tracking is preserved across restarts.
Copilot AI commented on Feb 21, 2026
Similar to the issue at line 415, this statement is imprecise - only feature normalization parameters receive EMA updates, not target normalization parameters. For accuracy, consider updating to: "The EMA-updated feature normalization parameters are saved and restored with the model, so feature drift tracking is preserved across restarts. Target normalization parameters remain fixed from initial training."

@springfall2008 springfall2008 merged commit 0f0b502 into main Feb 21, 2026
1 check passed
@springfall2008 springfall2008 deleted the fixes24 branch February 21, 2026 20:25