
Hybrid Deep Learning for Solar Energy Holding Capacity in Bangladesh

A stacked ensemble combining Random Forest, XGBoost, Gradient Boosting, AdaBoost and a Neural Network meta-learner reaches ~95% accuracy on Bangladesh solar-irradiance data.

Bangladesh sits in a high-irradiance belt, yet most regional planners still size solar deployments from yearly-average heuristics. The gap between installed and holding capacity — the share of nameplate kW the grid can reliably absorb under local weather — is large and seasonal. This work replaces those heuristics with a stacked ensemble that ingests environmental and weather signals and predicts holding capacity for every region.

Motivation

Classical methods (regression on monthly mean irradiance, or physics-based clear-sky models) break down for two reasons. First, the underlying signal is strongly non-linear: humidity, cloud type, and aerosol load interact in ways that linear models can’t capture. Second, the regional variance is huge — coastal cyclone seasons behave nothing like the dry north.

We needed a model that is:

  1. Non-linear and robust to missingness.
  2. Strong on both stable inland regions and high-variance coastal ones.
  3. Cheap to retrain as new sensor data arrives.

Data

We assembled a panel of region-month observations covering 2014 – 2024, with the following features per row:

Feature | Source | Notes
--- | --- | ---
Solar irradiance | NASA POWER, BMD ground stations | Daily GHI, monthly averages
Temperature | BMD | Mean, max, min
Relative humidity | BMD | Affects atmospheric scattering
Wind speed | BMD | Cools panels, raises real-world output
Cloud cover | NASA MERRA-2 | Total + low-cloud fractions
Rainfall | BMD | Proxy for monsoon intensity
Cyclone incidents | BMD storm catalog | Binary per region-month
Drought intensity | BAMIS SPI | Standardised Precipitation Index

The target is holding capacity ($C_h$) in MW, derived from utility-side curtailment logs and SCADA dispatch records.

Methodology

Stacked generalisation

We chose stacked generalisation because it combines diverse hypothesis classes through a learned aggregator — exactly the setup that works when no single model dominates across regions. Five base learners feed their out-of-fold predictions into a small neural network meta-learner:

$$\hat{y} = f_{\text{meta}}\bigl(f_1(x),\, f_2(x),\, f_3(x),\, f_4(x),\, f_5(x)\bigr)$$

where $f_1 \dots f_5$ are the five base learners (Random Forest, XGBoost, Gradient Boosting, AdaBoost, and a feedforward NN) and $f_{\text{meta}}$ is a two-layer MLP that learns the optimal blend weights per region.

Loss function

We use a region-aware mean-squared error that down-weights the noisier coastal panel:

$$\mathcal{L}(\theta) = \frac{1}{N} \sum_{i=1}^{N} w_{r(i)} \bigl(\hat{y}_i - y_i\bigr)^2 + \lambda \lVert \theta \rVert_2^2$$

with $w_r \in [0.5, 1.0]$ per region $r$, and $\lambda = 10^{-4}$ for L2 regularisation. The weights are learned end-to-end from a small validation set, so the model adapts to the noise structure of each region instead of us hand-tuning it.
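To make the loss concrete, here is a minimal NumPy sketch that evaluates it for fixed region weights. The function name, the dictionary of weights, and the toy values are assumptions for illustration only; in the pipeline described here the weights are learned rather than set by hand.

```python
import numpy as np

def region_weighted_mse(y_true, y_pred, region_ids, region_weights, theta, lam=1e-4):
    """Region-weighted MSE plus an L2 penalty on the parameters theta."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Look up the weight w_{r(i)} of the region each sample belongs to.
    w = np.array([region_weights[r] for r in region_ids], dtype=float)
    data_term = np.mean(w * (y_pred - y_true) ** 2)
    penalty = lam * np.sum(np.asarray(theta, dtype=float) ** 2)
    return data_term + penalty

# Hypothetical weights: a stable inland region at 1.0, a noisy coastal one at 0.5.
region_weights = {"Rangpur": 1.0, "Chittagong": 0.5}
loss = region_weighted_mse(
    y_true=[10.0, 20.0],
    y_pred=[12.0, 16.0],
    region_ids=["Rangpur", "Chittagong"],
    region_weights=region_weights,
    theta=[1.0, -2.0],
)
```

The coastal sample has the larger squared error (16 against 4), but its weight of 0.5 halves its contribution, which is the down-weighting the loss is designed to apply.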

Feature engineering

The biggest accuracy gains came not from the model but from features:

  • Humidity-corrected irradiance — multiply GHI by an empirical attenuation curve in relative humidity ($RH$).
  • Diurnal phase — sine/cosine of the hour-of-day so models can express sunrise/sunset cleanly.
  • Seasonal one-hot — explicit monsoon / pre-monsoon / dry markers.
  • Rolling cyclone intensity — three-month EWMA of storm counts.
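The four features above could be computed roughly as follows. This is a NumPy sketch, not the pipeline's own code: the function names, the month-to-season mapping, and the linear attenuation form with coefficient `k` are all assumptions made for illustration. The post only says the attenuation curve is empirical.

```python
import numpy as np

def humidity_corrected_ghi(ghi, rh_percent, k=0.3):
    """Scale GHI by a hypothetical linear attenuation 1 - k * RH / 100."""
    ghi = np.asarray(ghi, dtype=float)
    rh_percent = np.asarray(rh_percent, dtype=float)
    return ghi * (1.0 - k * rh_percent / 100.0)

def cyclical_hour(hour):
    """Encode hour-of-day (0-23) as sine and cosine so 23:00 sits next to 00:00."""
    angle = 2.0 * np.pi * np.asarray(hour, dtype=float) / 24.0
    return np.sin(angle), np.cos(angle)

def season_one_hot(month):
    """Columns: pre-monsoon, monsoon, dry. The month boundaries are assumed."""
    month = np.asarray(month)
    pre_monsoon = np.isin(month, [3, 4, 5])
    monsoon = np.isin(month, [6, 7, 8, 9])
    dry = ~(pre_monsoon | monsoon)
    return np.column_stack([pre_monsoon, monsoon, dry]).astype(int)

def ewma(values, span=3):
    """Recursive exponentially weighted mean with alpha = 2 / (span + 1)."""
    values = np.asarray(values, dtype=float)
    alpha = 2.0 / (span + 1.0)
    out = np.empty_like(values)
    out[0] = values[0]
    for t in range(1, len(values)):
        out[t] = alpha * values[t] + (1.0 - alpha) * out[t - 1]
    return out

# Toy inputs, chosen only to exercise each function.
corrected = humidity_corrected_ghi(5.0, 80.0)
hour_sin, hour_cos = cyclical_hour(6)
seasons = season_one_hot([4, 7, 12])
storm_ewma = ewma([0, 2, 0], span=3)
```

The recursion in `ewma` corresponds to pandas' `Series.ewm(span=3, adjust=False).mean()`, so a pandas-based feature layer could use that call directly.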

Results

Per-model accuracy

Test accuracy by model

Model | Test accuracy
--- | ---
Random Forest | 92%
XGBoost | 92.8%
Gradient Boost | 91.4%
AdaBoost | 88.6%
Neural Net | 90.5%
Stacked (ours) | 95.1%

Five base learners individually reach 88 – 93%. The stacked meta-learner clears 95% by routing each region to the model that handles its noise best.

The meta-learner consistently picks XGBoost for stable inland regions (Rangpur, Rajshahi) and shifts weight toward the NN for coastal regions (Chittagong, Khulna) where the noise distribution is heavier-tailed.

Regional error breakdown

Mean absolute error by region (MW)

Region | MAE (MW)
--- | ---
Rangpur | 1.4
Rajshahi | 1.6
Dhaka | 2.1
Sylhet | 2.3
Khulna | 3.2
Barishal | 3.5
Chittagong | 4.1

Inland regions are easy; the coastal belt — with cyclones and salt-spray-driven panel degradation — drives the bulk of the residual.

Learning curve

[Figure: validation accuracy (%) over training epochs 1 to 100, comparing the stacked model (ours) with the best base learner (XGBoost).]

The stacked model gains the most in the first 40 epochs as the meta-learner discovers per-region blend weights, then settles.

Headline numbers

Metric | Value
--- | ---
Test accuracy (R²-style) | 95.1%
Mean absolute error | 2.4 MW
Root mean squared error | 3.1 MW
Training time (1 GPU) | 18 min
Inference per region | < 2 ms
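For reference, the three error metrics in the table can be computed as below. The sketch assumes the "R²-style" accuracy is the coefficient of determination expressed as a percentage, which the table label suggests but does not spell out. The function name and toy values are illustrative.

```python
import numpy as np

def regression_report(y_true, y_pred):
    """Return MAE, RMSE (in the units of y) and R² as a percentage."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # R² = 1 - residual sum of squares / total sum of squares.
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2_percent = 100.0 * (1.0 - ss_res / ss_tot)
    return {"mae": mae, "rmse": rmse, "r2_percent": r2_percent}

# Toy holding-capacity values in MW.
report = regression_report(
    y_true=[10.0, 20.0, 30.0, 40.0],
    y_pred=[12.0, 18.0, 33.0, 39.0],
)
```

RMSE is never smaller than MAE, and a wide gap between the two points to a few large errors dominating the residual.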

Implementation

The pipeline is written in Python: scikit-learn for the base learners (XGBoost comes from its own package), TensorFlow for the meta-learner, and a thin pandas layer for feature engineering.

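from sklearn.base import clone
from sklearn.ensemble import (
    RandomForestRegressor,
    GradientBoostingRegressor,
    AdaBoostRegressor,
)
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor
from xgboost import XGBRegressor
import numpy as np
import tensorflow as tf

# X_train and y_train are assumed to be NumPy arrays produced by the feature layer.
base = {
    "rf": RandomForestRegressor(n_estimators=200, max_depth=15, n_jobs=-1),
    "xgb": XGBRegressor(n_estimators=300, max_depth=6, learning_rate=0.05),
    "gbr": GradientBoostingRegressor(n_estimators=200, max_depth=5),
    "ada": AdaBoostRegressor(n_estimators=150, learning_rate=0.8),
    "nn":  MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=400),
}

# Out-of-fold (OOF) predictions: every row is predicted by a model that never saw it.
# The fold count and seed are illustrative choices.
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
oof = {name: np.zeros(len(X_train)) for name in base}
for name, model in base.items():
    for tr, va in kfold.split(X_train):
        fold_model = clone(model)  # fresh, unfitted copy for each fold
        fold_model.fit(X_train[tr], y_train[tr])
        oof[name][va] = fold_model.predict(X_train[va])

def build_meta_nn():
    # 2-layer MLP, 64 → 32 → 1. Activations, optimizer and loss are assumed here.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Meta-learner ingests the OOF predictions, one column per base learner.
meta_X = np.column_stack(list(oof.values()))
meta = build_meta_nn()
meta.fit(meta_X, y_train, epochs=100, batch_size=64, validation_split=0.2)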

The meta-network is intentionally small. We don’t need it to learn the data again — we only need it to learn which base learner to trust where.
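The out-of-fold loop covers training only. At prediction time the base learners have to be refit on the full training set before their outputs are stacked. scikit-learn's `StackingRegressor` packages both steps, so a compact cross-check of the idea might look like the sketch below. It is not the pipeline described here: it uses two base learners instead of five, swaps in a linear `Ridge` meta-learner for the TensorFlow MLP, and runs on synthetic data generated purely for illustration.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic non-linear regression problem (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X[:, 0] ** 2 + np.sin(X[:, 1]) + 0.1 * rng.normal(size=300)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("gbr", GradientBoostingRegressor(n_estimators=50, random_state=0)),
    ],
    final_estimator=Ridge(alpha=1.0),  # linear stand-in for the neural meta-learner
    cv=5,  # out-of-fold predictions train the final estimator
)
stack.fit(X_train, y_train)    # base learners are also refit on all of X_train
preds = stack.predict(X_test)  # base predictions are stacked, then blended
```

With a linear meta-learner the blend weights can be read from `stack.final_estimator_.coef_`, which gives a rough view of which base learner the ensemble trusts.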

What we’d do next

  • Replace the meta-MLP with a gating network conditioned explicitly on region embeddings. Anecdotally, the current MLP already learns this, but a gating formulation would be more interpretable.
  • Add satellite cloud-top imagery through a small CNN feature extractor — the current cloud features are coarse averages.
  • Quantile regression heads so planners get an interval, not a point estimate.
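On the last item: a quantile head is typically trained with the pinball (quantile) loss, which penalises under- and over-prediction asymmetrically. A minimal sketch follows; the function name and toy values are illustrative, and nothing like this is part of the current pipeline.

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss for quantile level q in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction (diff > 0) costs q per unit; over-prediction costs 1 - q.
    return np.mean(np.maximum(q * diff, (q - 1.0) * diff))

# One over-prediction (by 2 MW) and one under-prediction (by 4 MW).
loss_q90 = pinball_loss([10.0, 20.0], [12.0, 16.0], q=0.9)
loss_q10 = pinball_loss([10.0, 20.0], [12.0, 16.0], q=0.1)
```

Training one head at a low quantile and another at a high one (for example 0.1 and 0.9) would give planners the interval mentioned above rather than a single number.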

The full paper appeared at IEEE QPAIN 2026. If you’re working on adjacent problems — utility-scale forecasting, microgrid sizing, curtailment optimisation — I’d love to compare notes.