Changes to model normalization to allow them to align over time to the data #3408
springfall2008 merged 4 commits into main
Conversation
Pull request overview
This pull request implements exponential moving average (EMA) updates for normalization parameters in the ML load predictor to allow the model to gradually adapt to distribution drift over time (e.g., seasonal changes, new appliances, tariff changes). The feature normalization statistics (mean/std) are blended with new data during each fine-tuning cycle using a configurable alpha parameter (default 0.1 for slow drift tracking).
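For reference, the blend is a standard exponential moving average applied element-wise to the stored mean and std. A minimal sketch of the update with the default alpha of 0.1 (variable names here are illustrative, not taken from the code):

```python
import numpy as np

# Illustrative EMA blend with the default alpha of 0.1.
alpha = 0.1
stored_mean, stored_std = np.array([0.45, 250.0]), np.array([0.20, 40.0])  # existing stats
batch_mean, batch_std = np.array([0.60, 260.0]), np.array([0.25, 45.0])    # stats from new fine-tune data

stored_mean = alpha * batch_mean + (1 - alpha) * stored_mean  # moves 10% of the way toward the new data
stored_std = alpha * batch_std + (1 - alpha) * stored_std
```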
Changes:
- Added EMA blending logic to the `_normalize_features` method for tracking feature distribution drift during fine-tuning
- Introduced the `norm_ema_alpha` parameter to control drift adaptation rate (0=frozen, 0.1=slow drift)
- Added normalization statistics logging to track drift in feature groups over time
- Updated documentation to explain EMA drift tracking, normalization logging, and model persistence behavior
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| docs/load-ml.md | Documents EMA normalization drift tracking feature, explains alpha parameter, describes normalization stats logging, and clarifies model persistence behavior |
| apps/predbat/load_predictor.py | Implements EMA blending for feature normalization, adds drift tracking logs, introduces norm_ema_alpha parameter, and refactors normalization logic to support gradual adaptation |
apps/predbat/load_predictor.py (Outdated)

```python
# Apply same min-std clamping to new stats before blending
n_features = len(new_std)
min_std = np.ones(n_features) * 1e-8
if n_features == TOTAL_FEATURES:
    min_std[0:LOOKBACK_STEPS] = 0.01
    min_std[LOOKBACK_STEPS : 2 * LOOKBACK_STEPS] = 0.01
    min_std[2 * LOOKBACK_STEPS : 3 * LOOKBACK_STEPS] = 0.5
    min_std[3 * LOOKBACK_STEPS : 4 * LOOKBACK_STEPS] = 1.0
    min_std[4 * LOOKBACK_STEPS : 5 * LOOKBACK_STEPS] = 1.0
    min_std[5 * LOOKBACK_STEPS :] = 0.01
new_std = np.maximum(new_std, min_std)
```
The min-std clamping logic is duplicated between the initial fit path (lines 671-686) and the EMA update path (lines 697-707). This duplication makes the code harder to maintain - if these values need to change, they must be updated in multiple places.
Consider extracting this logic into a helper method like _get_min_std_array(n_features) that returns the appropriate min-std array for the given number of features.
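A minimal sketch of that extraction, assuming the helper lives on the predictor class alongside the two call sites and uses the module's existing numpy import and constants (LOOKBACK_STEPS, TOTAL_FEATURES):

```python
def _get_min_std_array(self, n_features):
    """Return the per-feature-group minimum std used to clamp normalization stats."""
    min_std = np.ones(n_features) * 1e-8
    if n_features == TOTAL_FEATURES:
        min_std[0:LOOKBACK_STEPS] = 0.01
        min_std[LOOKBACK_STEPS : 2 * LOOKBACK_STEPS] = 0.01
        min_std[2 * LOOKBACK_STEPS : 3 * LOOKBACK_STEPS] = 0.5
        min_std[3 * LOOKBACK_STEPS : 4 * LOOKBACK_STEPS] = 1.0
        min_std[4 * LOOKBACK_STEPS : 5 * LOOKBACK_STEPS] = 1.0
        min_std[5 * LOOKBACK_STEPS :] = 0.01
    return min_std
```

Both the initial-fit and EMA paths would then reduce to `new_std = np.maximum(new_std, self._get_min_std_array(len(new_std)))`.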
apps/predbat/load_predictor.py

```python
elif ema_alpha > 0 and self.feature_mean is not None and self.feature_std is not None:
    # EMA update: blend new statistics with existing to track distribution drift
    new_mean = np.mean(X, axis=0)
    new_std = np.std(X, axis=0)

    # Apply same min-std clamping to new stats before blending
    n_features = len(new_std)
    min_std = np.ones(n_features) * 1e-8
    if n_features == TOTAL_FEATURES:
        min_std[0:LOOKBACK_STEPS] = 0.01
        min_std[LOOKBACK_STEPS : 2 * LOOKBACK_STEPS] = 0.01
        min_std[2 * LOOKBACK_STEPS : 3 * LOOKBACK_STEPS] = 0.5
        min_std[3 * LOOKBACK_STEPS : 4 * LOOKBACK_STEPS] = 1.0
        min_std[4 * LOOKBACK_STEPS : 5 * LOOKBACK_STEPS] = 1.0
        min_std[5 * LOOKBACK_STEPS :] = 0.01
    new_std = np.maximum(new_std, min_std)

    # Blend: small alpha = slow drift tracking, large alpha = fast adaptation
    self.feature_mean = ema_alpha * new_mean + (1 - ema_alpha) * self.feature_mean
    self.feature_std = ema_alpha * new_std + (1 - ema_alpha) * self.feature_std
    self._log_normalization_stats(label="ema-update alpha={}".format(ema_alpha))
```
The new EMA normalization drift tracking functionality (norm_ema_alpha parameter and EMA update logic) is not covered by tests. The existing test_load_ml.py has a _test_normalization function but it only tests basic z-score normalization, not the EMA blending behavior during fine-tuning.
Consider adding a test (a rough sketch follows this list) that:
- Trains a model on initial data
- Captures the initial feature_mean and feature_std
- Fine-tunes with data from a shifted distribution
- Verifies that feature_mean and feature_std have moved toward the new distribution proportional to the EMA alpha value
- Confirms predictions still work correctly with the updated normalization
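A rough sketch of such a test; the fixture helpers and the exact train()/predict() arguments below are placeholders and would need to match the real LoadPredictor API and the conventions in test_load_ml.py:

```python
import numpy as np

def test_normalization_ema_drift(make_predictor, make_history):
    # make_predictor / make_history are hypothetical helpers standing in for however
    # test_load_ml.py builds a predictor and synthetic load history.
    predictor = make_predictor()

    # Train on initial data and capture the fitted normalization stats.
    predictor.train(make_history(mean_kw=0.5, days=14), norm_ema_alpha=0.1)
    mean_before = predictor.feature_mean.copy()
    std_before = predictor.feature_std.copy()

    # Fine-tune on data drawn from a shifted distribution.
    predictor.train(make_history(mean_kw=1.5, days=2), norm_ema_alpha=0.1)

    # Stats should have moved toward the new distribution (by roughly alpha),
    # not stayed frozen and not jumped all the way.
    assert not np.allclose(predictor.feature_mean, mean_before)
    assert not np.allclose(predictor.feature_std, std_before)

    # Predictions should still be finite with the updated normalization.
    forecast = predictor.predict(make_history(mean_kw=1.5, days=2))
    assert np.all(np.isfinite(forecast))
```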
apps/predbat/load_predictor.py

```
time_decay_days: Time constant for sample weighting
patience: Early stopping patience
validation_holdout_hours: Hours of most recent data to hold out for validation
norm_ema_alpha: EMA alpha for normalization drift tracking during fine-tuning (0=frozen, 0.1=slow drift)
```
The norm_ema_alpha parameter is not validated. If a user passes a negative value or a value greater than 1.0, the EMA blending formula could produce unexpected results (e.g., negative variance, or extrapolation beyond the new statistics).
Consider adding parameter validation at the start of the train method to ensure 0.0 <= norm_ema_alpha <= 1.0, raising a ValueError if the constraint is violated.
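A guard of that shape at the top of train() could look like the following (the error message wording is illustrative):

```python
if not 0.0 <= norm_ema_alpha <= 1.0:
    raise ValueError("norm_ema_alpha must be between 0.0 and 1.0, got {}".format(norm_ema_alpha))
```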
docs/load-ml.md

```
- Preserves learned patterns while adapting to new ones
- Same regularization techniques applied as initial training
- Each fine-tune cycle blends the current data's feature statistics (mean/std) with the stored normalization parameters via an exponential moving average (alpha=0.1).
  This lets the model slowly track long-term shifts in feature distributions (e.g. seasonal load changes, new tariff rates) without sudden jumps that could destabilise existing weights.
```
The word "destabilise" uses British spelling, but the codebase generally uses American English spelling (e.g., "optimize" not "optimise" in the same file). For consistency, change "destabilise" to "destabilize".
Suggested change:

```diff
- This lets the model slowly track long-term shifts in feature distributions (e.g. seasonal load changes, new tariff rates) without sudden jumps that could destabilise existing weights.
+ This lets the model slowly track long-term shifts in feature distributions (e.g. seasonal load changes, new tariff rates) without sudden jumps that could destabilize existing weights.
```
apps/predbat/load_predictor.py

```python
y_train_norm = self._normalize_targets(y_train, fit=False)
self.log("ML Predictor: Applied EMA normalization update (alpha={}) to track feature drift".format(norm_ema_alpha))
```
Target normalization parameters are not updated during fine-tuning, only feature normalization parameters receive EMA updates. This inconsistency means that if the target distribution (load values) drifts over time (e.g., seasonal changes, new appliances), the target normalization will become stale while feature normalization adapts. This could lead to prediction issues since the model's output layer expects targets normalized with the original statistics.
Consider also applying EMA updates to target_mean and target_std during fine-tuning to maintain consistency with the feature normalization drift tracking approach.
Suggested change:

```diff
- y_train_norm = self._normalize_targets(y_train, fit=False)
- self.log("ML Predictor: Applied EMA normalization update (alpha={}) to track feature drift".format(norm_ema_alpha))
+ y_train_norm = self._normalize_targets(y_train, fit=False, ema_alpha=norm_ema_alpha)
+ self.log("ML Predictor: Applied EMA normalization update (alpha={}) to track feature and target drift".format(norm_ema_alpha))
```
apps/predbat/load_predictor.py

```python
grp_std = float(np.mean(self.feature_std[start:end]))
parts.append("{}(mean={:.4f} std={:.4f})".format(name, grp_mean, grp_std))

self.log("ML Predictor: Normalization stats [{}] target(mean={:.4f} std={:.4f}) {}".format(label, self.target_mean if self.target_mean is not None else 0, self.target_std if self.target_std is not None else 0, " ".join(parts)))
```
The logging statement includes target mean/std in the normalization stats output, but these values won't actually change during EMA updates since only feature normalization receives EMA updates. While this doesn't cause incorrect behavior, it could be misleading when monitoring drift - users might expect to see target stats change during fine-tuning cycles when they actually remain frozen.
Consider either: (1) updating the log format to clarify that target stats are from initial fit only, or (2) implementing EMA updates for target normalization as well for consistency.
| self.log("ML Predictor: Normalization stats [{}] target(mean={:.4f} std={:.4f}) {}".format(label, self.target_mean if self.target_mean is not None else 0, self.target_std if self.target_std is not None else 0, " ".join(parts))) | |
| self.log("ML Predictor: Normalization stats [{}] target_init(mean={:.4f} std={:.4f}, initial-fit only) {}".format(label, self.target_mean if self.target_mean is not None else 0, self.target_std if self.target_std is not None else 0, " ".join(parts))) |
docs/load-ml.md

```diff
  - **Network weights and biases**: All 4 hidden layers plus output layer
  - **Optimizer state**: Adam momentum terms for continuing fine-tuning
- - **Normalization parameters**: Feature and target mean/standard deviation
+ - **Normalization parameters**: Feature and target mean/standard deviation (updated via EMA each fine-tune cycle to track distribution drift)
```
The documentation states "normalization parameters" are updated via EMA, but the implementation only applies EMA updates to feature normalization (feature_mean, feature_std), not target normalization (target_mean, target_std). The documentation should either clarify that only feature normalization receives EMA updates, or the code should be updated to also apply EMA to target normalization for consistency.
Suggest updating this line to: "Feature normalization parameters (mean/standard deviation) are updated via EMA each fine-tune cycle to track distribution drift"
docs/load-ml.md

```diff
  - **Training metadata**: Epochs trained, timestamp, model version, architecture details

- The model is automatically loaded on Predbat restart, allowing predictions to continue immediately without retraining.
+ The model is automatically loaded on Predbat restart, allowing predictions to continue immediately without retraining. The EMA-updated normalization parameters are saved and restored with the model, so drift tracking is preserved across restarts.
```
Similar to the issue at line 415, this statement is imprecise - only feature normalization parameters receive EMA updates, not target normalization parameters. For accuracy, consider updating to: "The EMA-updated feature normalization parameters are saved and restored with the model, so feature drift tracking is preserved across restarts. Target normalization parameters remain fixed from initial training."