## 🌟 Overview
Accurate climate projections are critical in an era of accelerating climate change. Traditional climate models face challenges in representing small-scale atmospheric processes—such as clouds, storms, turbulence, and precipitation—because these processes occur at scales smaller than the model grid and are computationally expensive to resolve explicitly. This project develops a machine learning emulator that replicates the subgrid-scale physics within the E3SM-MMF climate model. By replacing high-resolution physical parameterizations with an efficient neural network, the solution offers a scalable, fast, and physically credible approach to long-term climate prediction.
This project was created as part of the Kaggle LEAP - Atmospheric Physics using AI (ClimSim) competition. The final submission achieved a bronze medal with a public leaderboard (lb) score of 0.73575 and a private leaderboard (pb) score of 0.73955.
The dataset for this competition is generated by the state-of-the-art E3SM-MMF climate model. Its multi-scale framework explicitly resolves small-scale processes (e.g., clouds, storms, turbulence) that influence large-scale climate patterns. However, the computational cost of explicit resolution is extremely high. The task is to emulate the effects of these processes with a machine learning model that is far less computationally expensive.
Each row in the training set corresponds to the inputs and outputs of a cloud-resolving model (CRM) in E3SM-MMF at a given location and timestep. The dataset includes:
- **Inputs:** 556 columns representing 25 input variables. Some variables are scalars while others are vertically resolved over 60 levels. For vertically resolved variables, an underscore and level number (ranging from 0 to 59) are appended to the variable name, with lower numbers representing higher positions in the atmosphere.
- **Targets:** 368 columns representing 14 target variables. These include both vertically resolved variables (e.g., heating tendency across 60 levels) and scalars (e.g., surface fluxes).
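Given this naming convention, the flat column layout of a vertically resolved variable can be expanded programmatically; a purely illustrative one-liner:

```python
# state_t_0 is the topmost atmospheric level, state_t_59 the lowest.
state_t_cols = [f"state_t_{lvl}" for lvl in range(60)]
```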
**Input variables**

| Name | Description | Dimension | Units |
|---|---|---|---|
| `state_t` | Air temperature | 60 levels | K |
| `state_q0001` | Specific humidity | 60 levels | kg/kg |
| `state_q0002` | Cloud liquid mixing ratio | 60 levels | kg/kg |
| `state_q0003` | Cloud ice mixing ratio | 60 levels | kg/kg |
| `state_u` | Zonal wind speed | 60 levels | m/s |
| `state_v` | Meridional wind speed | 60 levels | m/s |
| `state_ps` | Surface pressure | Scalar | Pa |
| `pbuf_SOLIN` | Solar insolation | Scalar | W/m² |
| `pbuf_LHFLX` | Surface latent heat flux | Scalar | W/m² |
| `pbuf_SHFLX` | Surface sensible heat flux | Scalar | W/m² |
| `pbuf_TAUX` | Zonal surface stress | Scalar | N/m² |
| `pbuf_TAUY` | Meridional surface stress | Scalar | N/m² |
| `pbuf_COSZRS` | Cosine of solar zenith angle | Scalar | — |
| `cam_in_ALDIF` | Albedo for diffuse longwave radiation | Scalar | — |
| `cam_in_ALDIR` | Albedo for direct longwave radiation | Scalar | — |
| `cam_in_ASDIF` | Albedo for diffuse shortwave radiation | Scalar | — |
| `cam_in_ASDIR` | Albedo for direct shortwave radiation | Scalar | — |
| `cam_in_LWUP` | Upward longwave flux | Scalar | W/m² |
| `cam_in_ICEFRAC` | Sea-ice areal fraction | Scalar | — |
| `cam_in_LANDFRAC` | Land areal fraction | Scalar | — |
| `cam_in_OCNFRAC` | Ocean areal fraction | Scalar | — |
| `cam_in_SNOWHLAND` | Snow depth over land | Scalar | m |
| `pbuf_ozone` | Ozone volume mixing ratio | 60 levels | mol/mol |
| `pbuf_CH4` | Methane volume mixing ratio | 60 levels | mol/mol |
| `pbuf_N2O` | Nitrous oxide volume mixing ratio | 60 levels | mol/mol |
**Target variables**

| Name | Description | Dimension | Units |
|---|---|---|---|
| `ptend_t` | Heating tendency | 60 levels | K/s |
| `ptend_q0001` | Moistening tendency | 60 levels | kg/kg/s |
| `ptend_q0002` | Cloud liquid mixing ratio tendency | 60 levels | kg/kg/s |
| `ptend_q0003` | Cloud ice mixing ratio tendency | 60 levels | kg/kg/s |
| `ptend_u` | Zonal wind acceleration | 60 levels | m/s² |
| `ptend_v` | Meridional wind acceleration | 60 levels | m/s² |
| `cam_out_NETSW` | Net shortwave flux at surface | Scalar | W/m² |
| `cam_out_FLWDS` | Downward longwave flux at surface | Scalar | W/m² |
| `cam_out_PRECSC` | Snow rate (liquid water equivalent) | Scalar | m/s |
| `cam_out_PRECC` | Rain rate | Scalar | m/s |
| `cam_out_SOLS` | Downward visible direct solar flux to surface | Scalar | W/m² |
| `cam_out_SOLL` | Downward near-infrared direct solar flux to surface | Scalar | W/m² |
| `cam_out_SOLSD` | Downward diffuse visible solar flux to surface | Scalar | W/m² |
| `cam_out_SOLLD` | Downward diffuse near-infrared solar flux to surface | Scalar | W/m² |
- **Efficient Data Handling:** CSV files are converted into more efficient formats (such as TFRecord, Parquet, or NumPy arrays) to improve input/output performance.
- **Normalization:** Both inputs and targets are normalized using TensorFlow's `Normalization` layer to ensure stable and effective training (a minimal pipeline sketch follows this list).
- **Deterministic Data Pipeline:** A reproducible data pipeline is built using fixed random seeds and deterministic shuffling. A batch size of 512 is used for training.
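A minimal sketch of this preprocessing setup, assuming the inputs are already loaded as a float tensor; the sample data, seed, and buffer sizes here are illustrative placeholders, not the competition values:

```python
import tensorflow as tf

tf.keras.utils.set_random_seed(42)  # fixed seed for reproducibility

# Placeholder for real training rows; shape (n_rows, 556 input columns).
x_train = tf.random.normal((4096, 556))

# Adapt a Normalization layer to the training statistics, then build a
# deterministic, batched input pipeline.
norm = tf.keras.layers.Normalization(axis=-1)
norm.adapt(x_train)

ds = (
    tf.data.Dataset.from_tensor_slices(x_train)
    .shuffle(4096, seed=42, reshuffle_each_iteration=False)  # deterministic shuffle
    .batch(512)                                              # batch size from the text
    .map(norm, num_parallel_calls=tf.data.AUTOTUNE)
    .prefetch(tf.data.AUTOTUNE)
)
```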
The model is designed to capture multi-scale atmospheric dynamics by incorporating several advanced components:
- **Input Reformatting:** A custom function (`x_to_seq`) converts the raw 556-dimensional input vector into a sequence format that separates vertically resolved data from scalar variables (see the sketch after this list).
- **U-Net Style Architecture with Transformer and Residual Blocks:**
  - **Encoder & Decoder Blocks:** The encoder progressively downsamples the input through repeated residual blocks to extract high-level features, while the decoder upsamples these features to reconstruct the target variables.
  - **Transformer Bottleneck:** A transformer block with multi-head attention layers captures long-range dependencies and complex interactions between vertical levels.
  - **Residual Connections:** Skip connections preserve fine-scale details throughout the network.
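A sketch of what `x_to_seq` could look like; the column ordering assumed here (6 vertically resolved state variables, then 16 scalars, then 3 vertically resolved pbuf variables) is inferred from the dataset tables above and is not guaranteed to match the exact competition layout:

```python
import tensorflow as tf

def x_to_seq(x):
    # x: (batch, 556) flat input vector.
    state = tf.reshape(x[:, :360], (-1, 6, 60))     # 6 state profiles x 60 levels
    scalars = x[:, 360:376]                         # 16 scalar inputs
    pbuf = tf.reshape(x[:, 376:556], (-1, 3, 60))   # 3 pbuf profiles x 60 levels

    # Stack the 9 profiles as per-level features: (batch, 60, 9).
    profiles = tf.transpose(tf.concat([state, pbuf], axis=1), (0, 2, 1))
    # Broadcast the scalars to every vertical level: (batch, 60, 16).
    tiled = tf.tile(scalars[:, None, :], (1, 60, 1))
    # Final sequence of 60 levels with 25 features each: (batch, 60, 25).
    return tf.concat([profiles, tiled], axis=-1)
```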
Below is a snippet illustrating the custom transformer encoder layer:
```python
import tensorflow as tf
import keras
from tensorflow.keras.layers import (
    Dense,
    Dropout,
    LayerNormalization,
    MultiHeadAttention,
)


@keras.saving.register_keras_serializable()
class TransformerEncoderLayer(tf.keras.layers.Layer):
    def __init__(self, head_size, num_heads, ff_dim, dropout=0.1, **kwargs):
        super().__init__(**kwargs)
        # Self-attention across the 60 vertical levels of the sequence.
        self.att = MultiHeadAttention(key_dim=head_size, num_heads=num_heads,
                                      dropout=dropout)
        # Position-wise feed-forward network; the final width of 25 matches
        # the per-level feature count of the input sequence.
        self.ffn = tf.keras.Sequential([
            Dense(ff_dim, activation='gelu'),
            Dense(25),
        ])
        self.layernorm1 = LayerNormalization(epsilon=1e-6)
        self.layernorm2 = LayerNormalization(epsilon=1e-6)
        self.dropout1 = Dropout(dropout)
        self.dropout2 = Dropout(dropout)

    def call(self, inputs, training=False):
        # Post-norm residual block: attention, then feed-forward.
        attn_output = self.att(inputs, inputs, training=training)
        attn_output = self.dropout1(attn_output, training=training)
        out1 = self.layernorm1(inputs + attn_output)
        ffn_output = self.ffn(out1, training=training)
        ffn_output = self.dropout2(ffn_output, training=training)
        return self.layernorm2(out1 + ffn_output)
```
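Note that the feed-forward block ends in `Dense(25)`, keeping each level's feature width at 25 so that the residual additions in `call` stay shape-compatible with the `(60, 25)` sequences produced by `x_to_seq` (assuming the sequence layout sketched above).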
- **Ensemble Modeling:** Multiple model variants (including different U-Net and transformer configurations) are trained, and their predictions are averaged. This ensemble strategy increases robustness and overall performance.
- **Training Strategy:**
  - **Loss Function & Metrics:** The model is trained with Mean Squared Error (MSE) loss. Performance is evaluated with a custom weighted R² metric (sketched after this list):

    $$ R^2 = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}} $$

    where residuals are weighted element-wise by the values from `sample_submission.csv`.
  - **Callbacks:**
    - **Early Stopping:** monitors validation loss to prevent overfitting.
    - **Model Checkpointing:** saves the best model during training based on validation performance.
    - **Learning Rate Scheduling:** a cosine annealing schedule with warm-up phases stabilizes and speeds up training.
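A minimal sketch of the weighted R² metric, assuming `weights` is a per-target weight vector taken from `sample_submission.csv` that broadcasts over the batch axis:

```python
import tensorflow as tf

def weighted_r2(y_true, y_pred, weights):
    # Element-wise weighting of the squared residuals and of the total
    # variance around each target's batch mean.
    ss_res = tf.reduce_sum(weights * tf.square(y_true - y_pred))
    ss_tot = tf.reduce_sum(weights * tf.square(y_true - tf.reduce_mean(y_true, axis=0)))
    return 1.0 - ss_res / ss_tot
```

And a sketch of a cosine annealing schedule with a linear warm-up; the step counts and base learning rate are illustrative placeholders, not the values used in the competition runs:

```python
import math
import tensorflow as tf

class WarmupCosine(tf.keras.optimizers.schedules.LearningRateSchedule):
    def __init__(self, base_lr=1e-3, warmup_steps=1_000, total_steps=100_000):
        self.base_lr = base_lr
        self.warmup_steps = warmup_steps
        self.total_steps = total_steps

    def __call__(self, step):
        step = tf.cast(step, tf.float32)
        warmup = self.base_lr * step / self.warmup_steps  # linear ramp-up
        progress = (step - self.warmup_steps) / (self.total_steps - self.warmup_steps)
        cosine = 0.5 * self.base_lr * (1.0 + tf.cos(math.pi * progress))  # anneal to 0
        return tf.where(step < self.warmup_steps, warmup, cosine)
```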
- **Post-Processing:** After predictions are generated, they are scaled back to the original target scale using the stored mean and standard deviation values.
- **Weighting Predictions:** Final predictions are multiplied element-wise by the sample submission weights.
- **Ensemble Averaging:** Predictions from multiple models are averaged before generating the final submission file (see the combined sketch below).
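A combined sketch of these three steps, assuming `models` is the list of trained ensemble members and that `target_mean`, `target_std`, and `weights` were stored during preprocessing (all names here are illustrative):

```python
import numpy as np

# Average the ensemble's normalized predictions, undo target normalization,
# then apply the element-wise sample-submission weights.
preds = np.mean([m.predict(x_test, batch_size=512) for m in models], axis=0)
preds = preds * target_std + target_mean  # back to the original target scale
preds = preds * weights                   # element-wise submission weights
```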
- **Evaluation & Results:** Evaluation is performed using the custom weighted R² metric, with predictions weighted by the values in `sample_submission.csv`. Final leaderboard results:
  - Public score (lb): 0.73575
  - Private score (pb): 0.73955