How We Build Forecasts You Can Trust: Multi-Model Ensemble Processing at Meteo centar

Weather forecasting has a fundamental problem: no single model gets it right all the time. Global models like ECMWF’s IFS excel at synoptic-scale patterns but miss local convective events. High-resolution models like WRF capture terrain-driven phenomena but can drift at longer lead times. Regional models such as DWD’s ICON or AROME offer excellent short-range skill but limited forecast horizons.

At Meteo centar, we don’t pick one model and hope for the best. We run a production pipeline that ingests data from multiple numerical weather prediction systems — including our own custom-built WRF configurations — and synthesizes them into a single, coherent forecast for thousands of locations. The result is a forecast that is more skillful, more reliable, and more physically consistent than any individual source alone.

This post walks through how our system works, from raw model output to the hourly JSON and CSV data that powers our mobile application and API services.

The Input: A Diverse Model Portfolio

Our system ingests forecast data from a deliberately diverse set of sources, which we call providers. Each provider contributes a different perspective on the atmosphere:

Custom in-house WRF systems. We operate multiple configurations of the Weather Research and Forecasting (WRF) model, each driven by initial and boundary conditions from a different global model: ECMWF’s IFS, DWD’s ICON-Global, or NCEP’s GFS. Running the same mesoscale model with different large-scale forcings gives us a controlled measure of initialization uncertainty, which is one of the largest sources of forecast error in the 12–72 hour range.

Operational NWP centres. We also ingest direct output from established operational models such as DWD’s ICON suite (ICON-EU, ICON-D2) and Météo-France’s AROME. These models bring independent dynamical cores, data assimilation systems, and physics parameterizations — diversity that a single-model ensemble simply cannot provide.

Each provider is assigned a weight reflecting our confidence in its skill for our target domain, and a lead-time constraint that determines when its data becomes authoritative. A high-resolution WRF run initialized by fresh ECMWF analysis might be trusted from hour zero, while a coarser model might only be used for its later forecast hours where its longer spin-up time is less of a concern. The system evaluates provider freshness on every execution cycle, automatically excluding any source whose data is older than a configurable staleness threshold. This means the blend adapts dynamically: if a provider’s latest run is delayed or missing, the system gracefully proceeds with whatever sources are available.
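As a rough illustration of the selection logic described above, here is a minimal Python sketch. The Provider fields, the provider names, the weights, and the 12-hour staleness threshold are all hypothetical and chosen only for the example; the production configuration is not shown in this post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Provider:
    name: str
    weight: float        # relative confidence in the provider (illustrative value)
    min_lead_hours: int  # first forecast hour at which the provider is used
    run_time: datetime   # initialization time of its latest available run


def active_providers(providers, now, max_age=timedelta(hours=12)):
    """Drop providers whose latest run is older than the staleness threshold."""
    return [p for p in providers if now - p.run_time <= max_age]


def providers_for_hour(providers, lead_hour):
    """Keep providers whose lead-time constraint admits this forecast hour."""
    return [p for p in providers if lead_hour >= p.min_lead_hours]


now = datetime(2024, 1, 1, 12, tzinfo=timezone.utc)
providers = [
    Provider("wrf_ifs", 1.0, 0, now - timedelta(hours=3)),
    Provider("icon_eu", 0.8, 6, now - timedelta(hours=5)),
    Provider("wrf_gfs", 0.6, 0, now - timedelta(hours=20)),  # stale run
]

fresh = active_providers(providers, now)
early = providers_for_hour(fresh, lead_hour=3)
print([p.name for p in fresh])
print([p.name for p in early])
```

In this toy setup the stale run is dropped on every cycle, and the provider with a six-hour lead-time constraint only joins the blend from forecast hour six onward.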

The Aggregation: Weighted Multi-Model Blending

When the pipeline runs — triggered automatically every time any input source refreshes — it loads data for each of our several thousand forecast locations and performs a weighted statistical aggregation across all active providers.

For every forecast hour and every meteorological variable, the system computes the weighted mean, median, maximum, minimum, and standard deviation across the provider ensemble. Each of these metrics drives a different part of the downstream processing.
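The statistics for a single variable at a single forecast hour could be computed along the following lines. This is a sketch under assumptions: the function name is hypothetical, the median is taken unweighted, and the standard deviation is the weighted population form around the weighted mean. The post does not specify either choice.

```python
import math
import statistics


def aggregate(values, weights):
    """Summarize one variable at one forecast hour across providers."""
    total_w = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total_w
    # Weighted population variance around the weighted mean (assumed form).
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total_w
    return {
        "mean": mean,
        "median": statistics.median(values),  # unweighted median (assumption)
        "min": min(values),
        "max": max(values),
        "std": math.sqrt(var),
    }


# Three providers reporting 2 m temperature in deg C, the last one trusted most.
stats = aggregate([10.0, 12.0, 14.0], [1.0, 1.0, 2.0])
print(stats)
```

Here the weighted mean (12.5) is pulled toward the most trusted provider, while the median (12.0) is not.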

The aggregation also implements a gap-filling fallback: because different providers have different lead-time offsets, the earliest hours of the blended forecast might only have data from a subset of sources. Rather than leaving gaps, the system fills initial hours with the full unfiltered ensemble average, ensuring seamless temporal continuity from hour one through the end of the forecast horizon.
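One way to picture the fallback is shown below. This is a sketch only: the function and argument names are hypothetical, and it assumes the fallback applies whenever no provider passes the lead-time filter for a given hour.

```python
def blend_hour(values, weights, min_lead, lead_hour):
    """Weighted mean over admitted providers, else an unfiltered plain average."""
    admitted = [name for name in values if lead_hour >= min_lead[name]]
    if admitted:
        total_w = sum(weights[name] for name in admitted)
        return sum(values[name] * weights[name] for name in admitted) / total_w
    # Fallback: no provider is admitted yet, so average every available source.
    return sum(values.values()) / len(values)


values = {"a": 4.0, "b": 8.0}    # one variable, one forecast hour
weights = {"a": 1.0, "b": 3.0}
min_lead = {"a": 3, "b": 6}      # hypothetical lead-time constraints

series = [blend_hour(values, weights, min_lead, h) for h in (1, 4, 8)]
print(series)
```

Hour 1 uses the unfiltered average of both sources, hour 4 uses only the first provider, and hour 8 uses the full weighted blend.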

The result is not just a simple average. It is a statistically characterized distribution of the atmosphere’s future state at every location and every hour, built from several independent model perspectives.

Derived Intelligence: Where the Physics Meets the Statistics

Raw model output — temperature, humidity, wind vectors, cloud cover, precipitation — is only the starting point. Our pipeline computes a suite of derived parameters that turn numbers into actionable forecasts.

Precipitation Processing

Averaging precipitation across multiple models is notoriously problematic. Ensemble means tend to smooth out temporal variability: after simple averaging of multiple sources, a sharp two-hour thunderstorm becomes a gentle six-hour drizzle. This is physically wrong and operationally misleading.

We address this with a multi-stage processing chain:

Median selection. For precipitation amounts, we use the ensemble median rather than the mean as our baseline. The median is inherently more resistant to outlier contamination from a single model producing unrealistic rainfall.
Spatial enhancement. When the ensemble’s spatial maximum (the heaviest rain within any cell around the location) substantially exceeds the point average, it signals convective activity that the point average underrepresents. We apply a calibrated cube-root enhancement to capture this signal without overcorrecting.
Temporal compression. This is the core innovation. We identify continuous blocks of precipitation and apply a power-based contrast-stretching algorithm that sharpens peaks and suppresses lulls, restoring the temporal structure that ensemble averaging erased. The stretch factor is calculated dynamically from the ensemble’s own standard deviation, CAPE (Convective Available Potential Energy), and block duration. High instability with large inter-model disagreement triggers aggressive compression; stable, uniform events are left largely untouched. The total precipitation volume within each block is conserved: we redistribute intensity, but we never invent or destroy rain.

The end result is an hourly precipitation time series that preserves the ensemble’s total accumulation while recovering physically realistic temporal gradients.
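The volume-conserving idea behind temporal compression can be sketched in a few lines of Python. The function names and the fixed exponent gamma are hypothetical. In the pipeline described above the stretch factor is derived from ensemble spread, CAPE, and block duration, and that formula is not reproduced here.

```python
def precip_blocks(series, threshold=0.0):
    """Yield (start, end) index pairs of contiguous wet runs; end is exclusive."""
    start = None
    for i, value in enumerate(series):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            yield start, i
            start = None
    if start is not None:
        yield start, len(series)


def compress_block(block, gamma):
    """Raise each hour to the power gamma, then rescale to the original total."""
    stretched = [v ** gamma for v in block]
    scale = sum(block) / sum(stretched)
    return [v * scale for v in stretched]


def sharpen(series, gamma=1.5):
    """Apply the power stretch to every wet block, leaving dry hours untouched."""
    out = list(series)
    for start, end in precip_blocks(series):
        out[start:end] = compress_block(series[start:end], gamma)
    return out


hourly = [0.0, 1.0, 2.0, 4.0, 2.0, 1.0, 0.0, 0.0, 0.5]  # mm per hour
sharp = sharpen(hourly)
print([round(v, 2) for v in sharp])
```

With gamma above 1 the peak hour gains intensity and the shoulders lose it, while the rescaling step keeps each block's total unchanged. A gamma of exactly 1 leaves the series as it was.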

Snow Probability: A Two-Factor Survival Model

Determining whether precipitation falls as rain or snow is one of the most impactful binary decisions in weather forecasting, and one of the most frequently oversimplified. Many systems use a single-threshold approach — “below 1°C, it snows” — which fails in transitional situations.

Our system uses a two-factor survival framework. For snow to reach the ground, it must survive two independent melting challenges:

Surface melting, governed by the wet-bulb temperature (Tw) at 2 metres. Tw, calculated via Stull’s (2011) formula, is a better discriminator than dry-bulb temperature because it accounts for evaporative cooling. Below 0°C Tw, snow always survives at the surface. Above 3°C Tw, it always melts. Between these bounds, survival probability decreases linearly.
Transit melting, governed by the depth of any warm layer aloft (the freezing level height, h0). Even if the surface is cold, snow falling through a deep warm layer will melt before it reaches the ground. With a warm layer depth of 0 m, snow always survives transit. With a warm layer deeper than 750 m, it always melts.

The overall snow probability is the product of the two survival probabilities. This joint probability handles marginal cases far better than any single-variable threshold.
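The two-factor calculation can be sketched as follows. The wet-bulb function uses the published Stull (2011) approximation, with temperature in °C and relative humidity in percent. The function names are hypothetical, and the linear decay of transit survival between 0 m and 750 m is an assumption, since the text above states only the two bounds.

```python
import math


def wet_bulb_stull(t_c, rh_pct):
    """Stull (2011) wet-bulb approximation; t_c in deg C, rh_pct in percent."""
    return (
        t_c * math.atan(0.151977 * math.sqrt(rh_pct + 8.313659))
        + math.atan(t_c + rh_pct)
        - math.atan(rh_pct - 1.676331)
        + 0.00391838 * rh_pct ** 1.5 * math.atan(0.023101 * rh_pct)
        - 4.686035
    )


def linear_survival(x, always_survives, always_melts):
    """1 at or below always_survives, 0 at or above always_melts, linear between."""
    frac = (always_melts - x) / (always_melts - always_survives)
    return min(1.0, max(0.0, frac))


def snow_probability(t_c, rh_pct, warm_layer_depth_m):
    """Joint probability that snow survives both surface and transit melting."""
    surface = linear_survival(wet_bulb_stull(t_c, rh_pct), 0.0, 3.0)
    # Linear decay between 0 m and 750 m is an assumption, not from the text.
    transit = linear_survival(warm_layer_depth_m, 0.0, 750.0)
    return surface * transit


print(round(wet_bulb_stull(20.0, 50.0), 1))   # Stull's reference case, about 13.7
print(snow_probability(-5.0, 80.0, 0.0))      # cold surface, no warm layer
print(snow_probability(-5.0, 80.0, 800.0))    # cold surface, deep warm layer
```

Because the two factors are multiplied, a marginal wet-bulb temperature combined with a marginal warm layer yields a lower probability than either factor alone would suggest.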

Thunderstorm Probability

Our thunderstorm probability algorithm fuses thermodynamic and kinematic signals across multiple time scales:

CAPE-based instability is evaluated using a rolling 4-hour maximum. This prevents convective potential from being missed because of slight timing offsets between models, and it reflects the fact that the zone of maximum instability almost always lies in the pre-frontal warm air mass.
Deep-layer wind shear (surface–500 hPa) is combined with CAPE to produce a composite wind-maximum shear parameter, identifying environments capable of producing organized severe convection.
Cold-front detection flags hours where 850 hPa temperature drops sharply coincident with precipitation and elevated CAPE, triggering an enhancement bonus that captures frontal thunderstorm potential.
Precipitation consistency checks ensure that thunderstorm probability stays grounded in the actual precipitation forecast — high instability alone does not generate a storm forecast if no precipitation mechanism is present.
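Two of the ingredients above, the rolling CAPE maximum and the precipitation consistency check, can be sketched as follows. Several things here are assumptions: the window is taken as trailing (the current hour plus the three before it), the composite is computed as sqrt(2 × CAPE) multiplied by bulk shear, which is one common definition of a WMAXSHEAR-style parameter, and the 0.1 mm precipitation floor is illustrative. The cold-front bonus and the final mapping to a probability are omitted.

```python
import math


def rolling_max(series, window=4):
    """Trailing maximum over the current hour and the window - 1 hours before it."""
    return [max(series[max(0, i - window + 1): i + 1]) for i in range(len(series))]


def wmaxshear(cape_jkg, shear_ms):
    """sqrt(2 * CAPE) times bulk shear, in m^2/s^2."""
    return math.sqrt(2.0 * max(cape_jkg, 0.0)) * shear_ms


def storm_signal(cape, shear, precip_mm, min_precip=0.1):
    """Composite per hour, forced to zero where no precipitation is forecast."""
    smoothed = rolling_max(cape)
    return [
        wmaxshear(c, s) if p >= min_precip else 0.0
        for c, s, p in zip(smoothed, shear, precip_mm)
    ]


cape = [0, 200, 1250, 800, 100, 0, 0, 0]           # J/kg
shear = [10.0] * 8                                  # m/s
precip = [0.0, 0.0, 0.0, 0.0, 2.0, 1.0, 0.0, 0.0]   # mm per hour
signal = storm_signal(cape, shear, precip)
print(signal)
```

In this example the CAPE peak occurs two hours before the rain arrives. The rolling maximum carries that instability forward to the wet hours, while the dry hours at peak CAPE produce no signal.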

Weather Symbology: 50+ Distinct Conditions

The pipeline maps the processed numerical data onto a comprehensive set of over 50 weather symbols covering clear skies through heavy snow with thunderstorms. The symbol logic evaluates cloud cover, precipitation intensity, precipitation type (rain, snow, or mixed), thunderstorm flags, fog conditions, and wind speed/direction in a prioritized decision cascade.

Every symbol has an astronomically correct day/night variant, computed using the Skyfield library with JPL ephemeris data for each location’s exact latitude and longitude. Sunrise and sunset times are calculated to the minute, ensuring that a 5 AM forecast in Split doesn’t show a sun icon when it’s still dark.

Wind symbols encode both speed category (calm, light, moderate, strong, storm) and compass direction into a single glyph, giving users an instant visual read of wind conditions.
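A minimal sketch of such an encoding is shown below. The speed thresholds, the eight-point compass, and the symbol naming scheme are all illustrative assumptions rather than the production values.

```python
SPEED_CLASSES = [  # (upper bound in m/s, label); bounds are illustrative only
    (1.5, "calm"),
    (5.5, "light"),
    (10.5, "moderate"),
    (17.0, "strong"),
]
COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def speed_class(speed_ms):
    """Return the first class whose upper bound exceeds the speed, else storm."""
    for upper, label in SPEED_CLASSES:
        if speed_ms < upper:
            return label
    return "storm"


def compass_point(direction_deg):
    """Map a direction in degrees (where the wind blows from) to 8 sectors."""
    return COMPASS[int((direction_deg % 360) / 45.0 + 0.5) % 8]


def wind_symbol(speed_ms, direction_deg):
    """Combine speed category and compass sector into one symbol name."""
    return f"{speed_class(speed_ms)}_{compass_point(direction_deg)}"


print(wind_symbol(8.0, 225))
print(wind_symbol(25.0, 350))
```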

The Output: Two Formats, Many Applications

Every processing cycle produces two parallel outputs for each location:

JSON — optimized for our Meteocentar mobile application. Grouped by calendar date with localized weekday names, human-readable probability strings, and integer-rounded values for clean UI rendering. Low-probability precipitation (below 20%) is zeroed out to prevent visual noise in the app — we’d rather show “dry” than “maybe 0.1 mm.”
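The shaping step for the app output might look roughly like this. The field names and row layout are hypothetical. The sketch assumes that "zeroed out" applies to the precipitation amount, and it uses English weekday names from strftime in place of localized ones.

```python
import json
from datetime import datetime


def to_app_json(rows, min_prob=20):
    """Group hourly rows by calendar date and round values for display."""
    days = {}
    for row in rows:
        when = datetime.fromisoformat(row["time"])
        key = when.date().isoformat()
        day = days.setdefault(
            key, {"date": key, "weekday": when.strftime("%A"), "hours": []}
        )
        likely = row["precip_prob"] >= min_prob
        day["hours"].append({
            "hour": when.hour,
            "temp": round(row["temp_c"]),
            # Amounts under the probability cut-off are shown as dry.
            "precip_mm": round(row["precip_mm"], 1) if likely else 0,
            "precip_prob": f"{round(row['precip_prob'])}%",
        })
    return json.dumps(list(days.values()))


rows = [
    {"time": "2024-01-15T05:00", "temp_c": 3.6, "precip_mm": 0.1, "precip_prob": 12},
    {"time": "2024-01-15T06:00", "temp_c": 4.4, "precip_mm": 1.24, "precip_prob": 65},
    {"time": "2024-01-16T00:00", "temp_c": -0.4, "precip_mm": 0.0, "precip_prob": 0},
]
payload = to_app_json(rows)
print(payload)
```

The first row carries 0.1 mm at 12% probability, so its amount is reported as 0. The second row passes the cut-off and keeps its rounded amount.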

CSV — the full-resolution scientific output, inserted into our forecast database. This includes all intermediate variables: wet-bulb temperature, ensemble standard deviation, CAPE, wind shear components, freezing level, 850 hPa temperature spreads, and raw precipitation statistics. This data is available to clients via API calls or can be further processed for specialized applications such as agricultural forecasting, energy production planning, construction scheduling, or marine operations.

Why This Matters for Your Business

Single-model forecasts are a gamble. Our multi-model pipeline is designed to deliver:

Reduced forecast error — ensemble blending consistently outperforms any individual model, particularly for precipitation timing and intensity.
Quantified uncertainty — standard deviations and min/max spreads across the ensemble give you a measure of forecast confidence, not just a single number.
Precipitation type discrimination — our physics-based snow model provides far more reliable rain/snow boundaries than simple temperature thresholds, critical for logistics, road maintenance, and winter operations.
Severe weather intelligence — CAPE-shear composites and cold-front detection flag thunderstorm risk hours before convective initiation, giving operations teams actionable lead time.
Thousands of locations, updated continuously — the pipeline runs every time any input model refreshes, meaning your forecast data is never more than a few hours old.
Flexible delivery — JSON for lightweight mobile and web integration, CSV/database for analytical workflows, API access for automated systems.

Whether you’re managing a fleet, scheduling outdoor construction, planning agricultural operations, or building a weather-dependent application of your own, our data provides the foundation. We handle the model diversity, the physics, and the statistical processing. You get a clean, reliable forecast feed.