Predict Extreme Weather in Minutes Without a Supercomputer: Huge Ensembles (HENS)

TL;DR

NVIDIA, with Lawrence Berkeley National Laboratory, released Huge Ensembles (HENS), a machine-learning tool for extreme-weather forecasting available as open-source code or a ready-to-run model.
HENS generates about 27,000 simulated years of data from 7,424 ensemble members initialized from summer 2023 conditions, roughly 150x more members than traditional ensembles use.
It forecasts from six hours to 14 days ahead at a 15-mile (25 km) resolution, delivering faster results at lower energy and compute cost.
Built on the open-source NVIDIA PhysicsNeMo and Makani frameworks and trained on 40 years of ERA5 data, HENS offers smaller uncertainties and higher recall for rare events.

Context and background

Weather and climate forecasts often rely on physics-based numerical models that simulate atmospheric processes. These models produce ensembles (multiple simulations with slightly varied initial conditions) to capture uncertainty and estimate the likelihood of different outcomes. Traditional ensembles are computationally expensive and typically require supercomputers, which limits the number of ensemble members. The new HENS approach combines AI with physics-based modeling to generate massive ensembles with substantially lower computational demands, enabling exploration of low-likelihood, high-impact events over extended timescales.

The two-part study introducing HENS was published in Geoscientific Model Development and focuses on creating a large, high-fidelity dataset of weather trajectories. The work demonstrates that downscaled, high-resolution forecasts can be produced more efficiently while maintaining robust uncertainty estimates. As part of the HENS pipeline, the researchers trained global weather models using NVIDIA PhysicsNeMo, an open-source Python framework for physics-informed AI, and used the open-source Makani framework to support scalable experimentation.

“Twenty-seven thousand years of simulations is a goldmine for studying the statistics and drivers of extreme weather events,” said Ankur Mahesh, co-author and Berkeley Lab researcher. The project also emphasizes retraining on new data to sustain accuracy while reducing energy consumption.
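
To make the ensemble idea concrete, here is a minimal toy sketch, not the HENS model: it perturbs the initial state of the classic Lorenz-63 system and tracks how far the members drift apart. The function names, noise level, and step size are illustrative assumptions, and the forward-Euler integrator is chosen for brevity rather than accuracy.

```python
import random
import statistics


def lorenz63_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz-63 toy system by one forward-Euler step."""
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dt * dx, y + dt * dy, z + dt * dz)


def run_ensemble(n_members, n_steps, base_state=(1.0, 1.0, 1.0), noise=1e-3, seed=0):
    """Run members that differ only by small Gaussian noise in the initial state.

    Returns the final member states and the ensemble spread (population
    standard deviation of x) after every step.
    """
    rng = random.Random(seed)
    members = [
        tuple(v + rng.gauss(0.0, noise) for v in base_state)
        for _ in range(n_members)
    ]
    spread = []
    for _ in range(n_steps):
        members = [lorenz63_step(m) for m in members]
        spread.append(statistics.pstdev([m[0] for m in members]))
    return members, spread


members, spread = run_ensemble(n_members=20, n_steps=1500)
print(f"spread after first step: {spread[0]:.2e}")
print(f"spread after last step:  {spread[-1]:.2e}")
```

The growth in spread is the quantity an ensemble is meant to expose: tiny differences in the starting state grow into a range of plausible outcomes, and more members sample that range more densely.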

What’s new

Introduction of Huge Ensembles (HENS), an AI-assisted approach to extreme-weather prediction that reduces the need for traditional supercomputing resources.
Availability as open-source code or a ready-to-run model, enabling rapid experimentation and deployment.
Capacity to generate 27,000 years of climate data with 7,424 ensemble members based on daily initial conditions from summer 2023.
Forecast lead times ranging from six hours to 14 days on a high-resolution 15-mile (25 km) grid.
Demonstrated uncertainties over 10x smaller than those of traditional models and the ability to catch about 96% of rare but severe events.
A dataset totaling about 27,000 years of climate data (approximately 20 petabytes) created and validated at NERSC (DOE’s National Energy Research Scientific Computing Center).
Training and evaluation workflows built on ERA5 data (40 years) and implemented with NVIDIA PhysicsNeMo and Makani open-source frameworks.
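
One way to relate the "over 10x smaller" uncertainty figure to ensemble size is ordinary sampling error, which shrinks with the square root of the number of members. The sketch below is a back-of-the-envelope check under that assumption, not the authors' validation method; the event probability and the baseline ensemble size (derived from the "roughly 150x" ratio) are illustrative.

```python
import math


def standard_error(p_event, n_members):
    """Standard error of an event probability estimated as a fraction of members."""
    return math.sqrt(p_event * (1.0 - p_event) / n_members)


hens_members = 7424
members_ratio = 150  # "roughly 150x more members", as stated in the article
baseline_members = round(hens_members / members_ratio)  # illustrative baseline size
p_event = 0.01  # hypothetical event probability, chosen only for illustration

error_ratio = standard_error(p_event, baseline_members) / standard_error(
    p_event, hens_members
)
print(f"baseline members: {baseline_members}")
print(f"sampling error shrinks by a factor of about {error_ratio:.1f}")
```

Under these assumptions the factor comes out at about 12, the same order as the reported improvement. The factor depends only on the ratio of ensemble sizes, not on the event probability chosen.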

Metric	Value
Ensemble members generated	7,424
Data span	~27,000 years (20 PB)
Forecast window	6 hours to 14 days
Spatial resolution	15 miles / 25 km
Training data	ERA5 data (40 years)
Open source / ready-to-run	Yes
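
The figures in the table can be cross-checked with simple arithmetic. The sketch below assumes one initialization per day from 1 June to 31 August 2023 (92 dates), a full 14-day run per member, and a 6-hour model time step; none of these three details is stated in the source, so treat the result as a rough consistency check.

```python
DAYS_PER_YEAR = 365.25


def simulated_years(n_members, n_init_dates, forecast_days):
    """Total simulated time, in years, across all members and start dates."""
    return n_members * n_init_dates * forecast_days / DAYS_PER_YEAR


n_members = 7424
n_init_dates = 92   # ASSUMPTION: daily starts, 1 June to 31 August 2023
forecast_days = 14  # longest lead time quoted in the article
step_hours = 6      # ASSUMPTION: suggested by the six-hour minimum lead time

years = simulated_years(n_members, n_init_dates, forecast_days)
steps_per_forecast = forecast_days * 24 // step_hours

# Storage per simulated year, using decimal units (1 PB = 1e15 B, 1 TB = 1e12 B).
tb_per_simulated_year = 20e15 / 27_000 / 1e12

print(f"simulated years: {years:,.0f}")
print(f"steps per 14-day forecast: {steps_per_forecast}")
print(f"storage per simulated year: {tb_per_simulated_year:.2f} TB")
```

With these assumptions the total lands at roughly 26,000 simulated years, in the same range as the reported ~27,000; the gap would be explained by a slightly longer run length or a different count of start dates.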

Why it matters (impact for developers/enterprises)

HENS enables climate scientists, city officials, and emergency managers to rapidly test scenarios and update response plans with far less computing power and cost than traditional methods require. By generating massive ensembles and counterfactuals for events like heat waves and hurricanes, HENS makes it feasible to explore tail risks and to understand the drivers of extreme weather over years and decades rather than for single near-term events. The approach also promises energy savings: models can be retrained on new data as conditions evolve, which could speed up forecast updates.

Technical details and implementation

HENS relies on a physics-informed AI model trained with PhysicsNeMo on 40 years of ERA5 data, its primary source of historical atmospheric states. After training, the model serves as a computationally cheaper substitute for conventional numerical simulations when creating large ensembles and scenario tests. The workflow combines AI-driven emulation with physics-based constraints to maintain fidelity while drastically reducing compute time and energy use.

Validation at NERSC showed that HENS predictions align closely with established gold-standard metrics, with uncertainties more than an order of magnitude smaller than those of traditional methods. HENS produces 7,424 ensemble members for the summer 2023 conditions, allowing a richer characterization of the tail of the distribution and more reliable assessments of low-likelihood, high-impact events. The resulting dataset offers a large, high-resolution record suitable for studying long-term patterns and drivers of extremes such as heat waves, hurricanes, and atmospheric rivers.
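
Why a larger ensemble characterizes the tail better can be illustrated with a simple probability calculation. The sketch below assumes ensemble members are independent draws, which real ensemble members only approximate, and uses a hypothetical 1-in-1,000 event and a hypothetical 50-member baseline. It is not how the 96% figure quoted elsewhere in this article was obtained.

```python
def prob_at_least_one(p_event, n_members):
    """Chance that at least one of n independent members contains the event."""
    return 1.0 - (1.0 - p_event) ** n_members


def expected_hits(p_event, n_members):
    """Expected number of members that contain the event."""
    return p_event * n_members


p_event = 0.001  # hypothetical 1-in-1,000 event per member
for n_members in (50, 7424):  # 50 is an illustrative traditional ensemble size
    print(
        f"{n_members:>5} members: "
        f"P(at least one hit) = {prob_at_least_one(p_event, n_members):.4f}, "
        f"expected hits = {expected_hits(p_event, n_members):.2f}"
    )
```

Under these assumptions a 50-member ensemble will usually contain no example of such an event at all, while a 7,424-member ensemble is expected to contain several, which is what makes statistics on the tail possible.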

Key technical components

PhysicsNeMo: open-source Python framework for building and refining physics-informed AI models at scale.
Makani: open-source framework used to support scalable modeling and experimentation.
ERA5: historical atmospheric state data used for training (40 years).
NERSC: validation environment where ensemble predictions were evaluated against multiple diagnostics.
Open-source / ready-to-run availability: enables broad adoption and testing across institutions.

Key takeaways

HENS is a scalable AI-assisted method for extreme-weather forecasting that reduces dependence on supercomputers.
The approach delivers massive ensembles (7,424 members), enabling detailed tail-risk analysis.
Forecasts extend from hours to days ahead (6 hours–14 days) with high spatial resolution (15 miles / 25 km).
Uncertainty in HENS predictions is more than 10x smaller than in traditional models, and HENS catches about 96% of rare but severe events.
The generated dataset (27,000 years, ~20 PB) supports long-range climate insights and future methodological improvements.

FAQ

What is HENS?

A machine-learning tool released by NVIDIA and Berkeley Lab to produce large ensembles for extreme-weather forecasting, available as open-source code or a ready-to-run model.

How many ensemble members were created, and on what data are they based?

7,424 ensemble members, initialized from summer 2023 conditions and generated by an AI model trained on ERA5 data.

What is the forecast range and resolution?

Forecasts cover 6 hours to 14 days ahead at a 15-mile (25 km) resolution.

Why is this significant for practitioners?

It enables faster scenario testing with lower uncertainty and reduced energy usage, supporting decision-making for extreme events.