ERA5 hourly atmospheric data on single- and pressure-levels

This walkthrough covers downloading ERA5 reanalysis fields from the Copernicus Climate Data Store (CDS), with the Rain in Shallow Cumulus Over the Ocean (RICO) trade-wind cumulus campaign (Rauber et al., 2007) as a unifying case study. We consider both single-level (2-D) and pressure-level (3-D) fields with two subsetting approaches (bounding box and column) that restrict the amount of data requested through the CDS API.

Our focus is on the first four days of the undisturbed period (Dec 27 2004 – Jan 2 2005) by van Zanten et al. (2011), during which a mean precipitation was observed. Here, we briefly analyze and present the ERA5 data, referring to published material where appropriate.

Three scales are demonstrated:

  1. Global scale - surface winds and Stokes drift over the entire globe
  2. Synoptic scale — surface precipitation over an Atlantic-centered region covering the tropics
  3. Microscale — single- and pressure-level $u$, $v$, $T$, $qᵛ$ over the RICO study box; pressure-level $qᶜˡ$, $qʳ$ in a single column.

Install dependencies

using Pkg
pkg"add NumericalEarth CDSAPI Oceananigans CairoMakie"

You also need CDS API credentials in ~/.cdsapirc. See https://cds.climate.copernicus.eu/how-to-api for setup instructions.

using Downloads: download
using NumericalEarth
using NumericalEarth.DataWrangling.ERA5
using CDSAPI
using Dates
using Oceananigans
using Oceananigans.Fields: interpolate!
using Oceananigans.Grids: x_domain
using Statistics
using CairoMakie
using Suppressor
[ Info: Oceananigans will use 8 threads

Study definition

For demonstration purposes, we select four days within van Zanten et al. (2011)'s undisturbed period, giving us 96 hourly snapshots. This is used by both sections below.

dates = DateTime(2004, 12, 27):Hour(1):DateTime(2004, 12, 30, 23)

To subset the ERA5 data, we define two types of regions. We introduce two BoundingBoxes:

# Synoptic-scale region, cf. Fig. 1 by [rauber2007rain](@citet)
synoptic_region = BoundingBox(latitude=(-25, 35), longitude=(-110, 30))

# RICO study area near Antigua and Barbuda
rico_region = BoundingBox(latitude=(17.5, 18.5), longitude=(-62, -61))

And a single Column, which has no lateral dimensionality:

rico_column = Column(-61.5, 18) # longitude, latitude

§1 Global conditions

This part of the analysis is based on ERA5 hourly data on single levels, available from 1940 to present. In this section, we evaluate ocean surface conditions based on the near-surface wind velocity at 10-m ASL and Stokes drift.

dataset = ERA5HourlySingleLevel()

Note that ERA5 atmospheric variables (wind) live on a 0.25° grid (1440×721), whereas ocean wave variables (Stokes drift) live on a 0.5° grid (720×361).

Metadata definition

We first define metadata for each variable at a single date.

Metadatum vs Metadata

Metadatum is used for single dates while Metadata handles multiple dates.

No region kwarg is specified because we want to download global fields.

date = first(dates)
u_stokes_meta = Metadatum(:eastward_stokes_drift;  dataset, date)
v_stokes_meta = Metadatum(:northward_stokes_drift; dataset, date)
u_wind_meta   = Metadatum(:eastward_velocity;      dataset, date)
v_wind_meta   = Metadatum(:northward_velocity;     dataset, date)
Metadatum{NumericalEarth.DataWrangling.ERA5.ERA5HourlySingleLevel, Dates.DateTime}:
├── name: northward_velocity
├── dataset: NumericalEarth.DataWrangling.ERA5.ERA5HourlySingleLevel
├── dates: 2004-12-27 00:00:00
├── filename: 10m_v_component_of_wind_ERA5HourlySingleLevel_2004-12-27T00.nc
└── dir: /storage5/github-action-runners/julia-depot/scratchspaces/904d977b-046a-4731-8b86-9235c0d1ef02/ERA5

Build a grid and create fields

We build a single LatitudeLongitudeGrid and use set! to download and interpolate all four variables onto it.

grid = LatitudeLongitudeGrid(size = (1440, 720, 1),
                             longitude = (0, 360),
                             latitude = (-90, 90),
                             z = (0, 1))

u_stokes = CenterField(grid)
v_stokes = CenterField(grid)
u_wind   = CenterField(grid)
v_wind   = CenterField(grid)

set!(u_stokes, u_stokes_meta)
set!(v_stokes, v_stokes_meta)
set!(u_wind,   u_wind_meta)
set!(v_wind,   v_wind_meta)
1440×720×1 Field{Oceananigans.Grids.Center, Oceananigans.Grids.Center, Oceananigans.Grids.Center} on Oceananigans.Grids.LatitudeLongitudeGrid on CPU
├── grid: 1440×720×1 LatitudeLongitudeGrid{Float64, Oceananigans.Grids.Periodic, Oceananigans.Grids.Bounded, Oceananigans.Grids.Bounded} on CPU with 3×3×1 halo
├── boundary conditions: FieldBoundaryConditions
│   └── west: Periodic, east: Periodic, south: Value, north: Value, bottom: ZeroFlux, top: ZeroFlux, immersed: Nothing
└── data: 1446×726×3 OffsetArray(::Array{Float64, 3}, -2:1443, -2:723, 0:2) with eltype Float64 with indices -2:1443×-2:723×0:2
    └── max=18.7217, min=-20.9516, mean=0.0774534

Compute speeds and plot

We use Oceananigans abstract operations to compute the speed fields, then plot them directly as heatmaps on latitude–longitude axes.

stokes_speed = sqrt(u_stokes^2 + v_stokes^2)
wind_speed   = sqrt(u_wind^2   + v_wind^2)

fig = Figure(size=(1200, 600))

ax1 = Axis(fig[1, 1]; title="Stokes drift speed (m/s)",
           xlabel="Longitude (°)", ylabel="Latitude (°)")
ax2 = Axis(fig[1, 2]; title="10-m wind speed (m/s)",
           xlabel="Longitude (°)", ylabel="Latitude (°)")

hm1 = heatmap!(ax1, stokes_speed; colormap=:speed, colorrange=(0, 0.3))
hm2 = heatmap!(ax2, wind_speed;   colormap=:speed, colorrange=(0, 20))

Colorbar(fig[2, 1], hm1; vertical=false, width=Relative(0.8), label="m/s")
Colorbar(fig[2, 2], hm2; vertical=false, width=Relative(0.8), label="m/s")

Label(fig[0, :],
      "ERA5 Stokes Drift and Surface Wind — $(Dates.format(date, "yyyy-mm-dd HH:MM")) UTC";
      fontsize=20)

fig

§2 Synoptic conditions

New in this section:

  • A BoundingBox defined by latitude and longitude ranges is introduced for region restriction.
  • We work with a FieldTimeSeries, constructed from metadata, to construct an ERA5 time series. Field data are downloaded on the fly.
  • We illustrate the per-column-z interpolation by dropping specific humidity from its native pressure-level grid onto a fixed altitude (~800 m, RICO cloud base).

We download two fields over the same synoptic-scale box: surface precipitation (single-level, 2-D) and specific humidity at all pressure levels (pressure-level, 3-D). The pressure-level data is bound to a PressureLevelGrid whose z-coordinate is built from the instantaneous geopotential, and Oceananigans.interpolate! bisects each column's actual heights inside the kernel — see issue #236.

precip_meta = Metadata(:total_precipitation; dataset, dates, region = synoptic_region)
precip_series = @suppress_out FieldTimeSeries(precip_meta)

Pressure-level qᵛ over the same region. We restrict the data retrieval to the lower troposphere (≥ 250 hPa, surface up to ~10 km), giving 21 vertical levels rather than the 37 standard pressure levels.

selected_levels = filter(≥(250hPa), ERA5_all_pressure_levels)
ds_pl = ERA5HourlyPressureLevels(selected_levels)

qv_meta = Metadata(:specific_humidity; dataset = ds_pl, dates, region = synoptic_region)
qv_series = @suppress_out FieldTimeSeries(qv_meta)

A flat target grid at z = 800 m (RICO cloud-base altitude) that we'll interpolate the pressure-level qᵛ onto each frame.

qv_longitude = x_domain(qv_series.grid)

qv_target_grid = LatitudeLongitudeGrid(CPU();
                                       size = (140, 60, 1),
                                       longitude = qv_longitude,
                                       latitude  = synoptic_region.latitude,
                                       z = (800.0, 801.0),
                                       halo = (2, 2, 1),
                                       topology = (Bounded, Bounded, Bounded))
qv_2d = CenterField(qv_target_grid)
140×60×1 Field{Oceananigans.Grids.Center, Oceananigans.Grids.Center, Oceananigans.Grids.Center} on Oceananigans.Grids.LatitudeLongitudeGrid on CPU
├── grid: 140×60×1 LatitudeLongitudeGrid{Float64, Oceananigans.Grids.Bounded, Oceananigans.Grids.Bounded, Oceananigans.Grids.Bounded} on CPU with 2×2×1 halo
├── boundary conditions: FieldBoundaryConditions
│   └── west: ZeroFlux, east: ZeroFlux, south: ZeroFlux, north: ZeroFlux, bottom: ZeroFlux, top: ZeroFlux, immersed: Nothing
└── data: 144×64×3 OffsetArray(::Array{Float64, 3}, -1:142, -1:62, 0:2) with eltype Float64 with indices -1:142×-1:62×0:2
    └── max=0.0, min=0.0, mean=0.0

Build a two-panel animation: precipitation (mm/day) on top, qᵛ at z = 800 m below. The precip panel reads its (λ, φ) axes from the source grid; the qᵛ panel reads them from qv_target_grid via the Oceananigans Makie ext.

Nt = length(dates)
λ, φ, _ = nodes(precip_series[1])
λ_plot = @. ifelse(λ > 180, λ - 360, λ)

# ERA5 `total_precipitation` is in m/hour; convert to mm/day.
to_mm_day = 1000 * 24
to_g_per_kg = 1000

n = Observable(1)

precip_obs = @lift interior(precip_series[$n], :, :, 1) .* to_mm_day
qv_obs = @lift begin
    interpolate!(qv_2d, qv_series[$n])
    interior(qv_2d, :, :, 1) .* to_g_per_kg
end

λ_qv, φ_qv, _ = nodes(qv_2d)
λ_qv_plot = @. ifelse(λ_qv > 180, λ_qv - 360, λ_qv)

fig1 = Figure(size=(900, 700))
title = @lift "ERA5 synoptic conditions — " * Dates.format(dates[$n], dateformat"u d HH:MM") * " UTC"
Label(fig1[0, 1:2], title; fontsize=14, font=:bold, tellwidth=false)

ax_p = Axis(fig1[1, 1]; title="Total precipitation",
            xlabel="Longitude (°)", ylabel="Latitude (°)",
            xticks=-90:30:30)
ax_q = Axis(fig1[2, 1]; title="Specific humidity at z = 800 m",
            xlabel="Longitude (°)", ylabel="Latitude (°)",
            xticks=-90:30:30)

hm_p = heatmap!(ax_p, λ_plot, φ, precip_obs; colormap=:rain, colorrange=(0, 12))
hm_q = heatmap!(ax_q, λ_qv_plot, φ_qv, qv_obs; colormap=:viridis, colorrange=(0, 20))

Colorbar(fig1[1, 2], hm_p, label="mm/day")
Colorbar(fig1[2, 2], hm_q, label="qᵛ [g/kg]")

linkaxes!(ax_p, ax_q)

record(fig1, "synoptic_animation.mp4", 1:Nt; framerate=12) do nn
    n[] = nn
end

§3 Microscale conditions

Time history of precipitation at the RICO location

New in this section:

  • Column replaces BoundingBox as the region restriction. This issues a smaller CDS request that downloads only the cells needed to linearly interpolate (by default) to the requested (longitude, latitude) coordinate. A Column(...; interpolation = Nearest()) option is also available.
  • We explicitly load all fields in the timeseries into memory through FieldTimeSeries...; time_indices_in_memory = Nt) whereas the default is to load only two snapshots at a time. Having the full timeseries in memory facilitates any operations we want to perform on the data — in this case, a units conversion.
Slicing

We could have sliced the precip_series from above, but we illustrate here a seperate data retrieval path.

precip_col_meta = Metadata(:total_precipitation; dataset, dates, region = rico_column)
precip_col_series = @suppress_out FieldTimeSeries(precip_col_meta; time_indices_in_memory = Nt)

ERA5 total_precipitation is in m (liquid-water-equivalent depth). Note that this is an accumulated rather than instantaneous quantity (more discussion here) with an accumulation period of 1 hour. We convert to a latent-heat-equivalent flux in W/m² to compare with van Zanten et al. (2011)'s reported 21 W m⁻² mean.

ρᴸ = 1000   # kg/m³
Lᵛ = 2.5e6  # J/kg, latent heat of vaporization

to_W_per_m2 = ρᴸ * Lᵛ / 3600  # m/hr → W/m²

precip_W_m2 = interior(precip_col_series, 1, 1, 1, :)
precip_W_m2 .*= to_W_per_m2

Now, plot the precipitation time history.

fig2 = Figure(size=(900, 300))

# Tick at each day boundary (00:00 of each day in the window).
day_dts    = first(dates):Day(1):last(dates)
day_ticks  = (0:length(day_dts)-1) .* 86400.0   # seconds since first(dates)
day_labels = Dates.format.(day_dts, dateformat"u d")

ax2 = Axis(fig2[1, 1],
           title  = "Precipitation at $(rico_column.longitude)°E, $(rico_column.latitude)°N",
           ylabel = "Precipitation [W m⁻²]",
           xticks = (day_ticks, day_labels))

lines!(ax2, precip_col_series.times, precip_W_m2; color=:steelblue, label="ERA5 hourly data")
hlines!(ax2, [21.0]; color=:black, linestyle=:dash, label="van Zanten mean (21 W m⁻²)")

axislegend(ax2; position=:rt)

fig2

In comparison with observations (van Zanten et al. (2011)'s Fig. 1), we see different day-to-day variability in the reanalysis, with a wider range of values and greater mean precipitation.

@show mean(precip_W_m2)
31.73395f0

Time-height of cloud liquid and rain water content at the RICO location

This part of the analysis is based on ERA5 hourly data on pressure levels, also available from 1940 to present. What's new:

  • We restrict the data retrieval to the lower troposphere (≥ 250 hPa, surface up to ~10 km). This returns data with 21 vertical levels instead of all 37 standard pressure levels — the full list is given by ERA5_all_pressure_levels.
  • download(variables, dataset, dates; region) bundles multi-variable requests into a single CDS API call — fewer round trips than calling download per variable, which is what FieldTimeSeries does automatically on demand.
# Selected pressure levels [hPa] (filtered to the lower troposphere in §2).
ds_pl.pressure_levels' / hPa
1×21 adjoint(::Vector{Float64}) with eltype Float64:
 1000.0  975.0  950.0  925.0  900.0  875.0  850.0  825.0  800.0  775.0  750.0  700.0  650.0  600.0  550.0  500.0  450.0  400.0  350.0  300.0  250.0

3-D data are downloaded in a Column region, resulting in one 1-D field per snapshot.

The download list also includes:geopotential because a pressure-level FieldTimeSeries grid derives its z-coordinate from the time-mean geopotential height. FieldTimeSeries would also have automatically downloaded this field on demand, but pre-downloading saves API calls.

Geopotential

If the geopotential field isn't available, the fallback is to estimate geopotential heights from the international standard atmosphere.

variables = [:specific_cloud_liquid_water_content,
             :specific_rain_water_content,
             :geopotential]
@suppress_out download(variables, ds_pl, dates; region = rico_column)

Load the downloaded data and stack the column profile from each time into a (Nt × Nz) matrix.

cloud_set = MetadataSet(:specific_cloud_liquid_water_content,
                        :specific_rain_water_content;
                        dataset = ds_pl, dates, region = rico_column)
cloud_series  = FieldTimeSeries(cloud_set)
qᶜ_col_series = cloud_series.specific_cloud_liquid_water_content
qʳ_col_series = cloud_series.specific_rain_water_content
1×1×21×96 FieldTimeSeries{NumericalEarth.DataWrangling.DatasetBackend} located at (⋅, ⋅, Center) on Oceananigans.Architectures.CPU
├── grid: 1×1×21 RectilinearGrid{Float32, Flat, Flat, Bounded} on CPU with 0×0×3 halo
├── indices: (:, :, :)
├── time_indexing: Cyclical(period=345600.0)
├── backend: DatasetBackend(1, 2)
└── data: 1×1×27×2 OffsetArray(::Array{Float32, 4}, 1:1, 1:1, -2:24, 1:2) with eltype Float32 with indices 1:1×1:1×-2:24×1:2
    └── max=0.0, min=0.0, mean=0.0

The column FieldTimeSeries lives on a Flat-Flat-Bounded grid, so it has no horizontal extent — znodes on it returns the 1-D column-mean axis.

z_col  = znodes(qᶜ_col_series[1])
Nz_col = length(z_col)

qᶜ_data = zeros(Nt, Nz_col)
qʳ_data = zeros(Nt, Nz_col)
for n in 1:Nt
    qᶜ_data[n, :] = vec(interior(qᶜ_col_series[n]))
    qʳ_data[n, :] = vec(interior(qʳ_col_series[n]))
end
qᶜ_data .*= 1000   # kg/kg → g/kg
qʳ_data .*= 1000   # kg/kg → g/kg

Render the Hovmöller diagram using heatmap, with the same x-ticks as fig2.

fig3 = Figure(size=(900, 600))

ax_qc = Axis(fig3[1, 1],
             title  = "Specific cloud liquid water content at $(rico_column.longitude)°E, $(rico_column.latitude)°N",
             ylabel = "Height [m]",
             xticks = (day_ticks, day_labels))
ax_qr = Axis(fig3[2, 1],
             title  = "Specific rain water content at $(rico_column.longitude)°E, $(rico_column.latitude)°N",
             ylabel = "Height [m]",
             xticks = (day_ticks, day_labels))

hm_qc = heatmap!(ax_qc, qᶜ_col_series.times, z_col, qᶜ_data; colormap=:Blues)
hm_qr = heatmap!(ax_qr, qʳ_col_series.times, z_col, qʳ_data; colormap=:Blues)

Colorbar(fig3[1, 2], hm_qc, label="qᶜˡ [g kg⁻¹]")
Colorbar(fig3[2, 2], hm_qr, label="qʳ [g kg⁻¹]")

linkaxes!(ax_qc, ax_qr)
ylims!(ax_qc, 0, 4000)
hidexdecorations!(ax_qc, grid=false)

fig3

This figure tells the same story as fig2 in a different way.

Profiles at the RICO location

We use the filtered pressure-levels dataset from before, as well as the previously defined BoundingBox. As before, we bundle API requests to expedite the data retrieval.

variables = [:temperature, :specific_humidity,
             :eastward_velocity, :northward_velocity,
             :geopotential]
download(variables, ds_pl, dates; region = rico_region)

rico_set = MetadataSet(:temperature, :specific_humidity,
                       :eastward_velocity, :northward_velocity;
                       dataset = ds_pl, dates, region = rico_region)

rico_series = @suppress_out FieldTimeSeries(rico_set)
T_series = rico_series.temperature
q_series = rico_series.specific_humidity
u_series = rico_series.eastward_velocity
v_series = rico_series.northward_velocity

Calculate mean profiles and quantities of interest.

z  = znodes(T_series.grid, nothing, nothing, Center())
Nz = length(z)
p_levs  = sort(selected_levels, rev=true) ./ hPa   # Pa → hPa, from bottom-to-top

function horizontal_mean_profiles(series)
    profiles = zeros(Nz, Nt)
    for n in 1:Nt
        profiles[:, n] = mean(interior(series[n], :, :, :), dims=(1, 2))
    end
    return profiles
end

T_profiles = horizontal_mean_profiles(T_series)
q_profiles = horizontal_mean_profiles(q_series)
u_profiles = horizontal_mean_profiles(u_series)
v_profiles = horizontal_mean_profiles(v_series)

# T → potential temperature: θ = T (p₀/p)^(R/cₚ)
Rᵈ_over_cᵖ = 0.286
pˢᵗ = 1000
θ_profiles = T_profiles .* (pˢᵗ ./ p_levs) .^ Rᵈ_over_cᵖ

# specific humidity: kg/kg → g/kg
q_profiles .*= 1000

Lastly, plot the profiles (cf. van Zanten et al. (2011), Fig. 2).

fig4 = Figure(size=(900, 540), fontsize=12)

fig4_title = string("Mean ± IQR vertical profiles over the RICO box, ",
                    Dates.format(first(dates), dateformat"u d yyyy"), " – ",
                    Dates.format(last(dates),  dateformat"u d yyyy"))
Label(fig4[0, 1:4], fig4_title;
      fontsize=14, font=:bold, halign=:center, tellwidth=false)

ax_θ = Axis(fig4[1, 1], xlabel="θ [K]",       ylabel="Height [m]")
ax_q = Axis(fig4[1, 2], xlabel="qᵛ [g kg⁻¹]", ylabel="Height [m]")
ax_u = Axis(fig4[1, 3], xlabel="u [m s⁻¹]",   ylabel="Height [m]", xticks=-10:2:2)
ax_v = Axis(fig4[1, 4], xlabel="v [m s⁻¹]",   ylabel="Height [m]", xticks=-10:2:0)

for (ax, profiles) in [(ax_θ, θ_profiles), (ax_q, q_profiles),
                       (ax_u, u_profiles), (ax_v, v_profiles)]
    μ  = vec(mean(profiles, dims=2))
    lo = [quantile(r, 0.25) for r in eachrow(profiles)]
    hi = [quantile(r, 0.75) for r in eachrow(profiles)]
    band!(ax, z, lo, hi; direction=:y, color=(:gray, 0.4))
    lines!(ax, μ, z; color=:black, linewidth=2)
end

xlims!(ax_θ, 295, 320)
xlims!(ax_q,   0,  15)
xlims!(ax_u, -10,   2)
xlims!(ax_v,  -9,  -1)
linkyaxes!(ax_θ, ax_q, ax_u, ax_v)
ylims!(ax_θ, 0, 4000)
hideydecorations!(ax_q, grid=false)
hideydecorations!(ax_u, grid=false)
hideydecorations!(ax_v, grid=false)

fig4

This page was generated using Literate.jl.