netCDF-SCM¶
netCDF-SCM is a Python package for processing netCDF files. It focusses on metrics which are relevant to simple climate models and is built on top of the Iris package.
netCDF-SCM is free software under a BSD 3-Clause License, see LICENSE. If you make any use of netCDF-SCM, please cite the Geoscience Data Journal (GDJ) paper (Nicholls et al., GDJ 2021) as well as the relevant Zenodo release.
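In a nutshell, a typical workflow looks something like the minimal sketch below (the path is purely illustrative; full, runnable demos follow in the Usage section).
from netcdf_scm.iris_cube_wrappers import CMIP6OutputCube

tas = CMIP6OutputCube()
# hypothetical path, for illustration only
tas.load_data_from_path("/path/to/tas_Amon_CanESM2_1pctCO2_r1i1p1_gr_190001-190912.nc")
# region-mean timeseries (e.g. World, World|Land, ...) as an scmdata.ScmRun
tas_scm_timeseries = tas.get_scm_timeseries()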
Disclaimer¶
One of the most common uses of netCDF-SCM is for processing Coupled Model Intercomparison Project data. If this is your use case, please note that you must abide by the terms of use of the data, in particular the required acknowledgement statements (see the CMIP5 terms of use, CMIP6 terms of use and CMIP6 GMD Special Issue).
To make it easier to do this, we have developed some basic tools which simplify the process of checking model license terms and creating the tables required in publications to cite CMIP data (check them out here). However, we provide no guarantees that these tools are up to date so all users should double check that they do in fact produce output consistent with the terms of use referenced above (and if there are issues, please raise an issue at our issue tracker :) ).
Installation¶
The easiest way to install netCDF-SCM is with conda:
# if you're using a conda environment, make sure you're in it
conda install -c conda-forge netcdf-scm
It is also possible to install it with pip:
# if you're using a virtual environment, make sure you're in it
pip install netcdf-scm
However, installing with pip requires installing all of Iris’s dependencies yourself, which is not trivial. As far as we know, Iris cannot be installed with pip alone.
Usage¶
Here we provide various examples of netCDF-SCM’s behaviour and usage. The source code of these usage examples is available in the folder docs/source/usage of the GitLab repository.
Basic demos¶
Handling netCDF files for simple climate models¶
In this notebook we give a brief introduction to iris, the library we use for our analysis, before giving a demonstration of some of the key functionality of netCDF-SCM.
[1]:
# NBVAL_IGNORE_OUTPUT
from os.path import join
import numpy as np
import iris
import iris.coord_categorisation
import iris.quickplot as qplt
import iris.analysis.cartography
import matplotlib
import matplotlib.pyplot as plt
from netcdf_scm.iris_cube_wrappers import ScmCube, MarbleCMIP5Cube
[2]:
plt.style.use("bmh")
%matplotlib inline
[3]:
DATA_PATH_TEST = join("..", "..", "..", "tests", "test-data")
DATA_PATH_TEST_MARBLE_CMIP5_ROOT = join(DATA_PATH_TEST, "marble-cmip5")
Loading a cube¶
Here we show how to load a cube directly using iris.
[4]:
tas_file = join(
DATA_PATH_TEST_MARBLE_CMIP5_ROOT,
"cmip5",
"1pctCO2",
"Amon",
"tas",
"CanESM2",
"r1i1p1",
"tas_Amon_CanESM2_1pctCO2_r1i1p1_189201-190312.nc",
)
[5]:
# NBVAL_IGNORE_OUTPUT
# Ignore output as the warnings are likely to change with
# new iris versions
tas_iris_load = ScmCube()
# you need this in case your cube has multiple variables
variable_constraint = iris.Constraint(
    cube_func=(lambda c: c.var_name == "tas")
)
tas_iris_load.cube = iris.load_cube(tas_file, variable_constraint)
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.8/site-packages/iris/fileformats/cf.py:803: UserWarning: Missing CF-netCDF measure variable 'areacella', referenced by netCDF variable 'tas'
warnings.warn(message % (variable_name, nc_var_name))
The warning tells us that we need to add the areacella data as a cell measure to our cube. Doing this manually every time involves finding the areacella file, loading it, turning it into a cell measure and then adding it to the cube. This is a pain and involves about 100 lines of code. To make life easier, we wrap all of that away using netcdf_scm, which we will introduce in the next section.
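For reference, the manual route looks roughly like the sketch below (the areacella path is hypothetical; in practice, robustly locating and validating that file is where most of those 100 lines go).
import iris.coords

# hypothetical path, for illustration only
areacella = iris.load_cube("/path/to/areacella_fx_CanESM2_1pctCO2_r0i0p0.nc")
cell_area = iris.coords.CellMeasure(
    areacella.data,
    standard_name="cell_area",
    units=areacella.units,
    measure="area",
)
# attach the measure along the latitude-longitude dimensions of the tas cube
tas_iris_load.cube.add_cell_measure(
    cell_area,
    data_dims=[
        tas_iris_load.cube.coord_dims("latitude")[0],
        tas_iris_load.cube.coord_dims("longitude")[0],
    ],
)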
netcdf_scm¶
There are two particularly annoying things involved in processing netCDF data. Firstly, the data is often stored in a folder hierarchy which can be fiddly to navigate. Secondly, the metadata is often stored separately from the variable cubes.
Hence, in netcdf_scm we abstract away the code which solves these two problems to make life a bit easier. This involves defining a cube in netcdf_scm.iris_cube_wrappers. The details can be read there; for now we just give an example.
Our example uses MarbleCMIP5Cube. This cube is designed to work with the CMIP5 data on our server at the University of Melbourne, which has been organised into a number of folders which are similar, but not quite identical, to the CMOR directory structure described in section 3.1 of the CMIP5 Data Reference Syntax. To facilitate our example, the test data in DATA_PATH_TEST_MARBLE_CMIP5_ROOT is organised in the same way.
With our MarbleCMIP5Cube, we can simply pass in the information about the data we want (experiment, model, ensemble member etc.) and it will load our desired cube using the load_data_from_identifiers method.
[6]:
tas = MarbleCMIP5Cube()
tas.load_data_from_identifiers(
root_dir=DATA_PATH_TEST_MARBLE_CMIP5_ROOT,
activity="cmip5",
experiment="1pctCO2",
mip_table="Amon",
variable_name="tas",
model="CanESM2",
ensemble_member="r1i1p1",
time_period="189201-190312",
file_ext=".nc",
)
We can verify that the loaded cube is exactly the same as the cube we loaded in the previous section (where we provided the full path).
[7]:
# NBVAL_IGNORE_OUTPUT
assert tas.cube == tas_iris_load.cube
We can have a look at our cube too (note that the broken cell measures representation is intended to be fixed in https://github.com/SciTools/iris/pull/3173).
[8]:
# NBVAL_IGNORE_OUTPUT
tas.cube
[8]:
Air Temperature (K) | time | latitude | longitude |
---|---|---|---|
Shape | 144 | 64 | 128 |
Dimension coordinates | |||
time | x | - | - |
latitude | - | x | - |
longitude | - | - | x |
Cell measures | |||
cell_area | - | x | x |
Scalar coordinates | |||
height | 2.0 m | ||
Attributes | |||
CCCma_data_licence | 1) GRANT OF LICENCE - The Government of Canada (Environment Canada) is... | ||
CCCma_parent_runid | IGA | ||
CCCma_runid | IDK | ||
CDI | Climate Data Interface version 1.9.7.1 (http://mpimet.mpg.de/cdi) | ||
CDO | Climate Data Operators version 1.9.7.1 (http://mpimet.mpg.de/cdo) | ||
Conventions | CF-1.4 | ||
associated_files | baseURL: http://cmip-pcmdi.llnl.gov/CMIP5/dataLocation gridspecFile: gridspec_atmos_fx_CanESM2_1pctCO2_r0i0p0.nc... | ||
branch_time | 171915.0 | ||
branch_time_YMDH | 2321:01:01:00 | ||
cmor_version | 2.5.4 | ||
contact | cccma_info@ec.gc.ca | ||
creation_date | 2011-03-10T12:09:13Z | ||
experiment | 1 percent per year CO2 | ||
experiment_id | 1pctCO2 | ||
forcing | GHG (GHG includes CO2 only) | ||
frequency | mon | ||
history | 2011-03-10T12:09:13Z altered by CMOR: Treated scalar dimension: 'height'.... | ||
initialization_method | 1 | ||
institute_id | CCCma | ||
institution | CCCma (Canadian Centre for Climate Modelling and Analysis, Victoria, BC,... | ||
model_id | CanESM2 | ||
modeling_realm | atmos | ||
original_name | ST | ||
parent_experiment | pre-industrial control | ||
parent_experiment_id | piControl | ||
parent_experiment_rip | r1i1p1 | ||
physics_version | 1 | ||
product | output | ||
project_id | CMIP5 | ||
realization | 1 | ||
references | http://www.cccma.ec.gc.ca/models | ||
source | CanESM2 2010 atmosphere: CanAM4 (AGCM15i, T63L35) ocean: CanOM4 (OGCM4.0,... | ||
table_id | Table Amon (31 January 2011) 53b766a395ac41696af40aab76a49ae5 | ||
title | CanESM2 model output prepared for CMIP5 1 percent per year CO2 | ||
tracking_id | 36b6de63-cce5-4a7a-a3f4-69e5b4056fde | ||
Cell methods | |||
mean | time (15 minutes) |
With our MarbleCMIP5Cube, we can also pass in the filepath and the cube will determine the relevant attributes for us, as well as loading the other required cubes.
[9]:
example_path = join(
DATA_PATH_TEST_MARBLE_CMIP5_ROOT,
"cmip5",
"1pctCO2",
"Amon",
"tas",
"CanESM2",
"r1i1p1",
"tas_Amon_CanESM2_1pctCO2_r1i1p1_189201-190312.nc",
)
example_path
[9]:
'../../../tests/test-data/marble-cmip5/cmip5/1pctCO2/Amon/tas/CanESM2/r1i1p1/tas_Amon_CanESM2_1pctCO2_r1i1p1_189201-190312.nc'
[10]:
tas_from_path = MarbleCMIP5Cube()
tas_from_path.load_data_from_path(example_path)
tas_from_path.model
[10]:
'CanESM2'
We can confirm that this cube, too, is identical to the cube we loaded directly with iris.
[11]:
# NBVAL_IGNORE_OUTPUT
assert tas_from_path.cube == tas_iris_load.cube
Acting on the cube¶
Once we have loaded our ScmCube, we can act on its cube attribute like any other Iris Cube. For example, we can add a year categorisation, take an annual mean and then plot the timeseries.
[12]:
# NBVAL_IGNORE_OUTPUT
year_cube = tas.cube.copy()
iris.coord_categorisation.add_year(year_cube, "time")
annual_mean = year_cube.aggregated_by(
["year"], iris.analysis.MEAN # Do the averaging
)
global_annual_mean = annual_mean.collapsed(
["latitude", "longitude"],
iris.analysis.MEAN,
weights=iris.analysis.cartography.area_weights(annual_mean),
)
qplt.plot(global_annual_mean);
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.8/site-packages/iris/analysis/cartography.py:394: UserWarning: Using DEFAULT_SPHERICAL_EARTH_RADIUS.
warnings.warn("Using DEFAULT_SPHERICAL_EARTH_RADIUS.")

We can also take a time average and make a spatial plot.
[13]:
# NBVAL_IGNORE_OUTPUT
time_mean = tas.cube.collapsed("time", iris.analysis.MEAN)
qplt.pcolormesh(time_mean);

Add a spatial plot with coastlines.
[14]:
# NBVAL_IGNORE_OUTPUT
# we ignore output here as CI sometimes has to
# download the map file
qplt.pcolormesh(time_mean,)
plt.gca().coastlines();

SCM specifics¶
Finally, we present the key functions of this package. These are directly related to processing netCDF files for simple climate models.
The major one is get_scm_timeseries. This function wraps a number of steps:
load the land surface fraction data
combine the land surface fraction and latitude data to determine the hemisphere and land/ocean boxes
cut the data into the relevant boxes
take an area-weighted spatial mean in each box
return it all as an ScmRun instance
As you can imagine, we find it very useful to have all of these otherwise fiddly steps abstracted away.
[15]:
tas_scm_timeseries = tas.get_scm_timeseries()
type(tas_scm_timeseries)
[15]:
scmdata.run.ScmRun
[16]:
tas_scm_timeseries.tail()
[16]:
time | 1892-01-16 12:00:00 | 1892-02-15 00:00:00 | 1892-03-16 12:00:00 | 1892-04-16 00:00:00 | 1892-05-16 12:00:00 | 1892-06-16 00:00:00 | 1892-07-16 12:00:00 | 1892-08-16 12:00:00 | 1892-09-16 00:00:00 | 1892-10-16 12:00:00 | ... | 1903-03-16 12:00:00 | 1903-04-16 00:00:00 | 1903-05-16 12:00:00 | 1903-06-16 00:00:00 | 1903-07-16 12:00:00 | 1903-08-16 12:00:00 | 1903-09-16 00:00:00 | 1903-10-16 12:00:00 | 1903-11-16 00:00:00 | 1903-12-16 12:00:00 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
activity_id | climate_model | member_id | mip_era | model | region | scenario | unit | variable | variable_standard_name | |||||||||||||||||||||
cmip5 | CanESM2 | r1i1p1 | CMIP5 | unspecified | World|Southern Hemisphere | 1pctCO2 | K | tas | air_temperature | 289.995973 | 290.172797 | 289.714070 | 288.535888 | 286.995946 | 285.540528 | 284.223201 | 284.026061 | 284.104412 | 285.718940 | ... | 289.463490 | 288.250221 | 287.143196 | 285.560893 | 284.483592 | 284.028544 | 284.407818 | 285.688775 | 287.605868 | 289.166832 |
World|Northern Hemisphere|Land | 1pctCO2 | K | tas | air_temperature | 273.761214 | 274.977433 | 279.443819 | 285.065822 | 290.592045 | 295.710179 | 298.054659 | 296.513476 | 291.840867 | 285.930817 | ... | 280.092979 | 285.314135 | 290.661860 | 295.907703 | 298.376198 | 296.972201 | 292.059519 | 286.135415 | 280.426803 | 275.705298 | |||||
World|Southern Hemisphere|Land | 1pctCO2 | K | tas | air_temperature | 285.622468 | 284.234503 | 282.236887 | 279.420272 | 277.156861 | 275.346434 | 273.870726 | 276.048329 | 276.954356 | 280.633229 | ... | 281.224552 | 278.331160 | 277.761742 | 275.291770 | 274.902389 | 275.396653 | 277.198751 | 280.436202 | 283.624803 | 285.681420 | |||||
World|Northern Hemisphere|Ocean | 1pctCO2 | K | tas | air_temperature | 289.822557 | 289.047582 | 289.343519 | 290.817539 | 292.586042 | 294.197028 | 295.573417 | 296.190904 | 295.730899 | 294.419906 | ... | 289.406973 | 290.787364 | 292.588619 | 294.351466 | 295.773334 | 296.487474 | 296.136925 | 294.893012 | 293.140459 | 291.236986 | |||||
World|Southern Hemisphere|Ocean | 1pctCO2 | K | tas | air_temperature | 291.079561 | 291.644080 | 291.566631 | 290.794390 | 289.433696 | 288.066236 | 286.788149 | 286.002639 | 285.875923 | 286.978986 | ... | 291.504785 | 290.707786 | 289.467563 | 288.105189 | 286.857449 | 286.167197 | 286.193950 | 286.990162 | 288.592224 | 290.030384 |
5 rows × 144 columns
Having the data as an ScmRun makes it trivial to plot and work with.
[17]:
# NBVAL_IGNORE_OUTPUT
restricted_time_df = tas_scm_timeseries.filter(
year=range(1895, 1901),
region="*Ocean*", # Try e.g. "*", "World", "*Land", "*Southern Hemisphere*" here
)
restricted_time_df.line_plot(
x="time", hue="region",
);

[18]:
# NBVAL_IGNORE_OUTPUT
tas_scm_timeseries_annual_mean = (
tas_scm_timeseries.filter(region="World").timeseries().T
)
tas_scm_timeseries_annual_mean = tas_scm_timeseries_annual_mean.groupby(
tas_scm_timeseries_annual_mean.index.map(lambda x: x.year)
).mean()
tas_scm_timeseries_annual_mean.head()
[18]:
activity_id | cmip5 |
---|---|
climate_model | CanESM2 |
member_id | r1i1p1 |
mip_era | CMIP5 |
model | unspecified |
region | World |
scenario | 1pctCO2 |
unit | K |
variable | tas |
variable_standard_name | air_temperature |
time | |
1892 | 288.398356 |
1893 | 288.119690 |
1894 | 288.089608 |
1895 | 288.392593 |
1896 | 288.339564 |
[19]:
tas_scm_timeseries_annual_mean.plot(figsize=(16, 9));

As part of the process above, we calculate all the timeseries as iris.cube.Cube objects. These intermediate cubes can be extracted with get_scm_timeseries_cubes. They are useful as they contain all the metadata from the source cube in a slightly friendlier format than ScmRun’s metadata attribute [Note: ScmRun’s metadata handling is a work in progress].
[20]:
tas_scm_ts_cubes = tas.get_scm_timeseries_cubes()
[21]:
# NBVAL_IGNORE_OUTPUT
print(tas_scm_ts_cubes["World"].cube)
air_temperature (time: 144)
Dimension coordinates:
time x
Scalar coordinates:
area_world: 510099672793088.0 m**2
area_world_land: 154606610000000.0 m**2
area_world_northern_hemisphere: 255049836396544.0 m**2
area_world_northern_hemisphere_land: 103962633261056.0 m**2
area_world_northern_hemisphere_ocean: 151087203135488.0 m**2
area_world_ocean: 355493070000000.0 m**2
area_world_southern_hemisphere: 255049836396544.0 m**2
area_world_southern_hemisphere_land: 50643977650176.0 m**2
area_world_southern_hemisphere_ocean: 204405858746368.0 m**2
height: 2.0 m
land_fraction: 0.30309097827134884
land_fraction_northern_hemisphere: 0.4076169376537766
land_fraction_southern_hemisphere: 0.19856502699902237
latitude: 0.0 degrees, bound=(-90.0, 90.0) degrees
longitude: 178.59375 degrees, bound=(-1.40625, 358.59375) degrees
Scalar cell measures:
cell_area
Attributes:
CCCma_data_licence: 1) GRANT OF LICENCE - The Government of Canada (Environment Canada) is...
CCCma_parent_runid: IGA
CCCma_runid: IDK
CDI: Climate Data Interface version 1.9.7.1 (http://mpimet.mpg.de/cdi)
CDO: Climate Data Operators version 1.9.7.1 (http://mpimet.mpg.de/cdo)
Conventions: CF-1.4
activity_id: cmip5
associated_files: baseURL: http://cmip-pcmdi.llnl.gov/CMIP5/dataLocation gridspecFile: gridspec_atmos_fx_CanESM2_1pctCO2_r0i0p0.nc...
branch_time: 171915.0
branch_time_YMDH: 2321:01:01:00
climate_model: CanESM2
cmor_version: 2.5.4
contact: cccma_info@ec.gc.ca
creation_date: 2011-03-10T12:09:13Z
crunch_netcdf_scm_version: 2.0.0rc5+3.gc7d2d42.dirty (more info at gitlab.com/netcdf-scm/netcdf-s...
crunch_netcdf_scm_weight_kwargs: {}
crunch_source_files: Files: ['/cmip5/1pctCO2/Amon/tas/CanESM2/r1i1p1/tas_Amon_CanESM2_1pctCO2_r1i1p1_189201-190312.nc'];...
experiment: 1 percent per year CO2
experiment_id: 1pctCO2
forcing: GHG (GHG includes CO2 only)
frequency: mon
history: 2011-03-10T12:09:13Z altered by CMOR: Treated scalar dimension: 'height'....
initialization_method: 1
institute_id: CCCma
institution: CCCma (Canadian Centre for Climate Modelling and Analysis, Victoria, BC,...
member_id: r1i1p1
mip_era: CMIP5
model_id: CanESM2
modeling_realm: atmos
original_name: ST
parent_experiment: pre-industrial control
parent_experiment_id: piControl
parent_experiment_rip: r1i1p1
physics_version: 1
product: output
project_id: CMIP5
realization: 1
references: http://www.cccma.ec.gc.ca/models
region: World
scenario: 1pctCO2
source: CanESM2 2010 atmosphere: CanAM4 (AGCM15i, T63L35) ocean: CanOM4 (OGCM4.0,...
table_id: Table Amon (31 January 2011) 53b766a395ac41696af40aab76a49ae5
title: CanESM2 model output prepared for CMIP5 1 percent per year CO2
tracking_id: 36b6de63-cce5-4a7a-a3f4-69e5b4056fde
variable: tas
variable_standard_name: air_temperature
Cell methods:
mean: time (15 minutes)
mean: latitude, longitude
In particular, the land_fraction* auxiliary coordinates provide useful information about the fraction of the area that was assumed to be land in the crunching.
[22]:
tas_scm_ts_cubes["World"].cube.coords("land_fraction")
[22]:
[AuxCoord(array([0.30309098]), standard_name=None, units=Unit('1'), long_name='land_fraction')]
[23]:
tas_scm_ts_cubes["World"].cube.coords("land_fraction_northern_hemisphere")
[23]:
[AuxCoord(array([0.40761694]), standard_name=None, units=Unit('1'), long_name='land_fraction_northern_hemisphere')]
Another utility function is get_scm_timeseries_weights. It is very similar to get_scm_timeseries but returns just the weights rather than an ScmRun. These weights are also area weighted.
[24]:
tas_scm_weights = tas.get_scm_timeseries_weights()
[25]:
# NBVAL_IGNORE_OUTPUT
plt.figure(figsize=(18, 18))
# 3x4 panel grid: "World" gets the middle row to itself,
# the other regions fill the surrounding panels
no_rows = 3
no_cols = 4
for i, (label, weights) in enumerate(tas_scm_weights.items()):
if label == "World":
index = int((no_rows + 1) / 2)
plt.subplot(no_rows, 1, index)
else:
if label == "World|Northern Hemisphere":
index = 1
elif label == "World|Southern Hemisphere":
index = 1 + (no_rows - 1) * no_cols
elif label == "World|Land":
index = 2
elif label == "World|Ocean":
index = 2 + (no_rows - 1) * no_cols
else:
index = 3
if "Ocean" in label:
index += 1
if "Southern Hemisphere" in label:
index += (no_rows - 1) * no_cols
plt.subplot(no_rows, no_cols, index)
weight_cube = tas.cube.collapsed("time", iris.analysis.MEAN)
weight_cube.data = weights
qplt.pcolormesh(weight_cube)
plt.title(label)
plt.gca().coastlines()
plt.tight_layout()
<ipython-input-25-462c0f3bdde1>:38: UserWarning: Tight layout not applied. tight_layout cannot make axes width small enough to accommodate all axes decorations
plt.tight_layout()

More detail¶
Atmospheric, oceanic and land data handling¶
In this notebook we discuss the subtleties of how netCDF-SCM handles different data ‘realms’ and why these choices are made. The realms of interest are atmosphere, ocean and land and the distinction between the realms follows the CMIP6 realm controlled vocabulary.
[1]:
# NBVAL_IGNORE_OUTPUT
import traceback
from os.path import join
import iris
import iris.quickplot as qplt
import matplotlib.pyplot as plt
import numpy as np
from netcdf_scm.iris_cube_wrappers import CMIP6OutputCube
from netcdf_scm.utils import broadcast_onto_lat_lon_grid
[2]:
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
plt.style.use("bmh")
[3]:
import logging
logging.captureWarnings(True)
root_logger = logging.getLogger()
root_logger.setLevel(logging.WARNING)
fmt = logging.Formatter("{levelname}:{name}:{message}", style="{")
stream_handler = logging.StreamHandler()
stream_handler.setFormatter(fmt)
root_logger.addHandler(stream_handler)
[4]:
DATA_PATH_TEST = join("..", "..", "..", "tests", "test-data")
Note that all of our data here is on a regular grid; we show an example of using native model grid data in the ocean section.
[5]:
tas_file = join(
DATA_PATH_TEST,
"cmip6output",
"CMIP6",
"CMIP",
"IPSL",
"IPSL-CM6A-LR",
"historical",
"r1i1p1f1",
"Amon",
"tas",
"gr",
"v20180803",
"tas_Amon_IPSL-CM6A-LR_historical_r1i1p1f1_gr_191001-191003.nc",
)
gpp_file = tas_file.replace("Amon", "Lmon").replace("tas", "gpp")
csoilfast_file = gpp_file.replace("gpp", "cSoilFast")
hfds_file = join(
DATA_PATH_TEST,
"cmip6output",
"CMIP6",
"CMIP",
"NOAA-GFDL",
"GFDL-CM4",
"piControl",
"r1i1p1f1",
"Omon",
"hfds",
"gr",
"v20180701",
"hfds_Omon_GFDL-CM4_piControl_r1i1p1f1_gr_015101-015103.nc",
)
Oceans¶
We start by loading our data.
[6]:
# NBVAL_IGNORE_OUTPUT
hfds = CMIP6OutputCube()
hfds.load_data_from_path(hfds_file)
netCDF-SCM will infer whether the data is “ocean”, “land” or “atmosphere”. The inferred realm can be checked by examining an ScmCube’s netcdf_scm_realm property.
In our case we have “ocean” data.
[7]:
hfds.netcdf_scm_realm
[7]:
'ocean'
If we have ocean data, then there is no data which will go in a “land” box. Hence, if we request e.g. World|Land data, we will get a warning and land data will not be returned.
[8]:
out = hfds.get_scm_timeseries(regions=["World", "World|Land"])
out["region"].unique()
WARNING:py.warnings:/Users/znicholls/Documents/AGCEC/netCDF-SCM/netcdf-scm/src/netcdf_scm/weights/__init__.py:869: UserWarning: Failed to create 'World|Land' weights: All weights are zero for region: `World|Land`
warnings.warn(warn_str)
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
WARNING:netcdf_scm.iris_cube_wrappers:Performing lazy conversion to datetime for calendar: 365_day. This may cause subtle errors in operations that depend on the length of time between dates
[8]:
array(['World'], dtype=object)
As there is no land data, the World mean is equal to the World|Ocean mean.
[9]:
# NBVAL_IGNORE_OUTPUT
hfds_scm_ts = hfds.get_scm_timeseries(regions=["World", "World|Ocean"])
hfds_scm_ts.line_plot(style="region")
np.testing.assert_allclose(
hfds_scm_ts.filter(region="World").values,
hfds_scm_ts.filter(region="World|Ocean").values,
);
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
WARNING:netcdf_scm.iris_cube_wrappers:Performing lazy conversion to datetime for calendar: 365_day. This may cause subtle errors in operations that depend on the length of time between dates

When taking averages, there are three obvious options:
unweighted average
area weighted average
area and surface fraction weighted average
In netCDF-SCM, we provide the latter two (if you want an unweighted average, please raise an issue on our issue tracker). Depending on the context, one will likely make more sense than the other. The user can specify the choice to ScmCube.get_scm_timeseries_weights via the cell_weights argument. If the user doesn’t supply a value, ScmCube will guess depending on what is most appropriate; see the docstring below for more details.
[10]:
print(hfds.get_scm_timeseries_weights.__doc__)
Get the scm timeseries weights
Parameters
----------
surface_fraction_cube : :obj:`ScmCube`, optional
land surface fraction data which is used to determine whether a given
gridbox is land or ocean. If ``None``, we try to load the land surface fraction automatically.
areacell_scmcube : :obj:`ScmCube`, optional
cell area data which is used to take the latitude-longitude mean of the
cube's data. If ``None``, we try to load this data automatically and if
that fails we fall back onto ``iris.analysis.cartography.area_weights``.
regions : list[str]
List of regions to use. If ``None`` then
``netcdf_scm.regions.DEFAULT_REGIONS`` is used.
cell_weights : {'area-only', 'area-surface-fraction'}
How cell weights should be calculated. If ``'area-surface-fraction'``, both cell area and its
surface fraction will be used to weight the cell. If ``'area-only'``, only the cell's area
will be used to weight the cell (cells which do not belong to the region are nonetheless
excluded). If ``None``, netCDF-SCM will guess whether land surface fraction weights should
be included or not based on the data being processed. When guessing, for ocean data,
netCDF-SCM will weight cells only by the horizontal area of the cell i.e. no land fraction
(see Section L5 of Griffies et al., *GMD*, 2016, `<https://doi.org/10.5194/gmd-9-3231-2016>`_).
For land variables, netCDF-SCM will weight cells by both their horizontal area and their land
surface fraction. “Yes, you do need to weight the output by land frac (sftlf is the CMIP
variable name).” (Chris Jones, *personal communication*, 18 April 2020). For land variables,
note that there seems to be nothing in Jones et al., *GMD*, 2016
(`<https://doi.org/10.5194/gmd-9-2853-2016>`_).
log_failure : bool
Should regions which fail be logged? If not, failures are raised as
warnings.
Returns
-------
dict of str: :obj:`np.ndarray`
Dictionary of 'region name': weights, key: value pairs
Notes
-----
Only regions which can be calculated are returned. If no regions can be calculated, an empty
dictionary will be returned.
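For example, to force area-only weighting, one could do the following (a sketch using the hfds cube loaded above):
# explicit choice for the weights themselves ...
weights_area_only = hfds.get_scm_timeseries_weights(cell_weights="area-only")
# ... or passed straight through to get_scm_timeseries
hfds_area_only = hfds.get_scm_timeseries(regions=["World"], cell_weights="area-only")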
In the cells below, we show the difference the choice of cell weighting makes.
[11]:
def compare_weighting_options(input_scm_cube):
unweighted_mean = input_scm_cube.cube.collapsed(
["latitude", "longitude"], iris.analysis.MEAN
)
area_cell = input_scm_cube.get_metadata_cube(
input_scm_cube.areacell_var
).cube
area_weights = broadcast_onto_lat_lon_grid(input_scm_cube, area_cell.data)
area_weighted_mean = input_scm_cube.cube.collapsed(
["latitude", "longitude"], iris.analysis.MEAN, weights=area_weights
)
surface_frac = input_scm_cube.get_metadata_cube(
input_scm_cube.surface_fraction_var
).cube
area_sf = area_cell * surface_frac
area_sf_weights = broadcast_onto_lat_lon_grid(input_scm_cube, area_sf.data)
area_sf_weighted_mean = input_scm_cube.cube.collapsed(
["latitude", "longitude"], iris.analysis.MEAN, weights=area_sf_weights
)
plt.figure(figsize=(8, 4.5))
qplt.plot(unweighted_mean, label="unweighted")
qplt.plot(area_weighted_mean, label="area weighted")
qplt.plot(
area_sf_weighted_mean,
label="area-surface fraction weighted",
linestyle="--",
dashes=(10, 10),
linewidth=4,
)
plt.legend();
[12]:
# NBVAL_IGNORE_OUTPUT
compare_weighting_options(hfds)
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/cube.py:3218: UserWarning: Collapsing spatial coordinate 'latitude' without weighting
warnings.warn(msg.format(coord.name()))

We go to the trouble of taking these area-surface fraction weightings because they matter. In particular, the area weight is required to not overweight the poles (on whatever grid we’re working) whilst the surface fraction allows the user to ensure that the cells’ contribution to an average reflects how much they belong in a given ‘SCM box’.
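A toy example with hypothetical numbers makes this concrete: area weighting pulls the mean towards a large tropical cell, while the land fraction determines how much each cell counts towards the land box.
# two cells: a small, all-land polar cell and a large, mostly-ocean tropical cell
areas = np.array([1.0, 4.0])       # relative horizontal cell areas
land_frac = np.array([1.0, 0.25])  # land surface fraction of each cell
values = np.array([260.0, 300.0])  # e.g. near-surface temperatures in K

print(values.mean())                                  # unweighted: 280.0
print(np.average(values, weights=areas))              # area weighted: 292.0
print(np.average(values, weights=areas * land_frac))  # land-box weighted: 280.0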
We can check which variable is being used for the cell areas by looking at ScmCube.areacell_var. For ocean data this is areacello.
[13]:
hfds.areacell_var
[13]:
'areacello'
[14]:
hfds_area_cell = hfds.get_metadata_cube(hfds.areacell_var).cube
qplt.pcolormesh(hfds_area_cell);

We can check which variable is being used for the surface fraction by looking at ScmCube.surface_fraction_var. For ocean data this is sftof.
[15]:
hfds.surface_fraction_var
[15]:
'sftof'
[16]:
hfds_surface_frac = hfds.get_metadata_cube(hfds.surface_fraction_var).cube
qplt.pcolormesh(hfds_surface_frac);

The product of the area of the cells and the surface fraction gives us the area-surface fraction weights. The addition of the surface fraction only really matters near the coastlines where cells are neither entirely land nor entirely ocean.
[17]:
hfds_area_sf = hfds_area_cell * hfds_surface_frac
plt.figure(figsize=(16, 9))
plt.subplot(121)
qplt.pcolormesh(hfds_area_sf,)
plt.subplot(122)
lat_con = iris.Constraint(latitude=lambda cell: -50 < cell < -20)
lon_con = iris.Constraint(longitude=lambda cell: 140 < cell < 160)
qplt.pcolormesh(hfds_area_sf.extract(lat_con & lon_con),);

For ocean data, by default netCDF-SCM will only use the area weights. If we turn the logging up, we can see the decisions being made internally (look at the line following the line containing cell_weights).
[18]:
# NBVAL_IGNORE_OUTPUT
root_logger.setLevel(logging.DEBUG)
# also load the cube again so the caching doesn't hide the logging messages
hfds = CMIP6OutputCube()
hfds.load_data_from_path(hfds_file)
DEBUG:netcdf_scm.iris_cube_wrappers:loading cube ../../../tests/test-data/cmip6output/CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/piControl/r1i1p1f1/Omon/hfds/gr/v20180701/hfds_Omon_GFDL-CM4_piControl_r1i1p1f1_gr_015101-015103.nc
DEBUG:netcdf_scm.iris_cube_wrappers:loading cube ../../../tests/test-data/cmip6output/CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/piControl/r1i1p1f1/Ofx/areacello/gr/v20180701/areacello_Ofx_GFDL-CM4_piControl_r1i1p1f1_gr.nc
[19]:
# NBVAL_IGNORE_OUTPUT
hfds_area_weights = broadcast_onto_lat_lon_grid(hfds, hfds_area_cell.data)
hfds_area_weighted_mean = hfds.cube.collapsed(
["latitude", "longitude"], iris.analysis.MEAN, weights=hfds_area_weights
)
netcdf_scm_calculated = hfds.get_scm_timeseries(regions=["World"]).timeseries()
np.testing.assert_allclose(
hfds_area_weighted_mean.data,
netcdf_scm_calculated.values.squeeze(),
rtol=1e-6,
)
netcdf_scm_calculated.T
DEBUG:netcdf_scm.iris_cube_wrappers:cell_weights: None
DEBUG:netcdf_scm.iris_cube_wrappers:self.netcdf_scm_realm: ocean
DEBUG:netcdf_scm.iris_cube_wrappers:Using: <class 'netcdf_scm.weights.AreaWeightCalculator'>
DEBUG:netcdf_scm.iris_cube_wrappers:loading cube ../../../tests/test-data/cmip6output/CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/piControl/r1i1p1f1/Ofx/sftof/gr/v20180701/sftof_Ofx_GFDL-CM4_piControl_r1i1p1f1_gr.nc
DEBUG:netcdf_scm.weights:sftof data max is 100.0, dividing by 100.0 to convert units to fraction
DEBUG:netcdf_scm.iris_cube_wrappers:Crunching SCM timeseries in memory
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
WARNING:netcdf_scm.iris_cube_wrappers:Performing lazy conversion to datetime for calendar: 365_day. This may cause subtle errors in operations that depend on the length of time between dates
[19]:
activity_id | CMIP |
---|---|
climate_model | GFDL-CM4 |
member_id | r1i1p1f1 |
mip_era | CMIP6 |
model | unspecified |
region | World |
scenario | piControl |
unit | W m^-2 |
variable | hfds |
variable_standard_name | surface_downward_heat_flux_in_sea_water |
time | |
0151-01-16 12:00:00 | 12.899261 |
0151-02-15 00:00:00 | 12.346571 |
0151-03-16 12:00:00 | 7.410532 |
If we specify that surface fractions should be included, the timeseries calculated by netCDF-SCM is the same as the timeseries calculated using the surface fraction and area weights.
[20]:
# NBVAL_IGNORE_OUTPUT
hfds_area_sf_weights = broadcast_onto_lat_lon_grid(hfds, hfds_area_sf.data)
hfds_area_sf_weighted_mean = hfds.cube.collapsed(
["latitude", "longitude"], iris.analysis.MEAN, weights=hfds_area_sf_weights
)
netcdf_scm_calculated = hfds.get_scm_timeseries(
regions=["World"], cell_weights="area-surface-fraction"
).timeseries()
np.testing.assert_allclose(
hfds_area_sf_weighted_mean.data,
netcdf_scm_calculated.values.squeeze(),
rtol=1e-6,
)
netcdf_scm_calculated.T
DEBUG:netcdf_scm.iris_cube_wrappers:cell_weights: area-surface-fraction
DEBUG:netcdf_scm.iris_cube_wrappers:Using: <class 'netcdf_scm.weights.AreaSurfaceFractionWeightCalculator'>
DEBUG:netcdf_scm.weights:sftof data max is 100.0, dividing by 100.0 to convert units to fraction
DEBUG:netcdf_scm.iris_cube_wrappers:Crunching SCM timeseries in memory
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
WARNING:netcdf_scm.iris_cube_wrappers:Performing lazy conversion to datetime for calendar: 365_day. This may cause subtle errors in operations that depend on the length of time between dates
[20]:
activity_id | CMIP |
---|---|
climate_model | GFDL-CM4 |
member_id | r1i1p1f1 |
mip_era | CMIP6 |
model | unspecified |
region | World |
scenario | piControl |
unit | W m^-2 |
variable | hfds |
variable_standard_name | surface_downward_heat_flux_in_sea_water |
time | |
0151-01-16 12:00:00 | 13.440214 |
0151-02-15 00:00:00 | 12.608150 |
0151-03-16 12:00:00 | 7.226662 |
[21]:
root_logger.setLevel(logging.WARNING)
Land¶
Next we look at land data.
[22]:
gpp = CMIP6OutputCube()
gpp.load_data_from_path(gpp_file)
csoilfast = CMIP6OutputCube()
csoilfast.load_data_from_path(csoilfast_file)
[23]:
gpp.netcdf_scm_realm
[23]:
'land'
[24]:
csoilfast.netcdf_scm_realm
[24]:
'land'
If we have land data, then there is no data which will go in an “ocean” box. Hence, if we request e.g. World|Ocean data, we will get a warning and ocean data will not be returned.
[25]:
out = gpp.get_scm_timeseries(regions=["World", "World|Ocean"])
out["region"].unique()
WARNING:py.warnings:/Users/znicholls/Documents/AGCEC/netCDF-SCM/netcdf-scm/src/netcdf_scm/weights/__init__.py:869: UserWarning: Failed to create 'World|Ocean' weights: All weights are zero for region: `World|Ocean`
warnings.warn(warn_str)
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
[25]:
array(['World'], dtype=object)
As there is no ocean data, the World mean is equal to the World|Land mean.
[26]:
# NBVAL_IGNORE_OUTPUT
gpp_scm_ts = gpp.get_scm_timeseries(regions=["World", "World|Land"])
gpp_scm_ts.line_plot(style="region")
np.testing.assert_allclose(
gpp_scm_ts.filter(region="World").values,
gpp_scm_ts.filter(region="World|Land").values,
);
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available

[27]:
# NBVAL_IGNORE_OUTPUT
compare_weighting_options(gpp)
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/cube.py:3218: UserWarning: Collapsing spatial coordinate 'latitude' without weighting
warnings.warn(msg.format(coord.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))

[28]:
# NBVAL_IGNORE_OUTPUT
compare_weighting_options(csoilfast)
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/cube.py:3218: UserWarning: Collapsing spatial coordinate 'latitude' without weighting
warnings.warn(msg.format(coord.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))

For land data, by default netCDF-SCM will use the area and surface fraction weights. Once again, if we turn the logging up, we can see the decisions being made internally.
[29]:
# NBVAL_IGNORE_OUTPUT
root_logger.setLevel(logging.DEBUG)
csoilfast = CMIP6OutputCube()
csoilfast.load_data_from_path(csoilfast_file)
DEBUG:netcdf_scm.iris_cube_wrappers:loading cube ../../../tests/test-data/cmip6output/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/Lmon/cSoilFast/gr/v20180803/cSoilFast_Lmon_IPSL-CM6A-LR_historical_r1i1p1f1_gr_191001-191003.nc
DEBUG:netcdf_scm.iris_cube_wrappers:loading cube ../../../tests/test-data/cmip6output/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/fx/areacella/gr/v20180803/areacella_fx_IPSL-CM6A-LR_historical_r1i1p1f1_gr.nc
[30]:
# NBVAL_IGNORE_OUTPUT
netcdf_scm_calculated = csoilfast.get_scm_timeseries(
regions=["World"]
).timeseries()
netcdf_scm_calculated.T
DEBUG:netcdf_scm.iris_cube_wrappers:cell_weights: None
DEBUG:netcdf_scm.iris_cube_wrappers:self.netcdf_scm_realm: land
DEBUG:netcdf_scm.iris_cube_wrappers:Using: <class 'netcdf_scm.weights.AreaSurfaceFractionWeightCalculator'>
DEBUG:netcdf_scm.iris_cube_wrappers:loading cube ../../../tests/test-data/cmip6output/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/fx/sftlf/gr/v20180803/sftlf_fx_IPSL-CM6A-LR_historical_r1i1p1f1_gr.nc
DEBUG:netcdf_scm.weights:sftlf data max is 100.0, dividing by 100.0 to convert units to fraction
DEBUG:netcdf_scm.iris_cube_wrappers:Crunching SCM timeseries in memory
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
[30]:
activity_id | CMIP |
---|---|
climate_model | IPSL-CM6A-LR |
member_id | r1i1p1f1 |
mip_era | CMIP6 |
model | unspecified |
region | World |
scenario | historical |
unit | kg m^-2 |
variable | cSoilFast |
variable_standard_name | fast_soil_pool_carbon_content |
time | |
1910-01-16 12:00:00 | 0.058512 |
1910-02-15 00:00:00 | 0.058663 |
1910-03-16 12:00:00 | 0.059181 |
Atmosphere¶
Finally we look at atmospheric data.
[31]:
tas = CMIP6OutputCube()
tas.load_data_from_path(tas_file)
DEBUG:netcdf_scm.iris_cube_wrappers:loading cube ../../../tests/test-data/cmip6output/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/Amon/tas/gr/v20180803/tas_Amon_IPSL-CM6A-LR_historical_r1i1p1f1_gr_191001-191003.nc
DEBUG:netcdf_scm.iris_cube_wrappers:loading cube ../../../tests/test-data/cmip6output/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/fx/areacella/gr/v20180803/areacella_fx_IPSL-CM6A-LR_historical_r1i1p1f1_gr.nc
[32]:
tas.netcdf_scm_realm
[32]:
'atmosphere'
If we have atmosphere data, then we have global coverage and so can split the data into both the land and ocean boxes.
[33]:
# NBVAL_IGNORE_OUTPUT
fig = plt.figure(figsize=(16, 14))
ax1 = fig.add_subplot(311)
tas.get_scm_timeseries(
regions=[
"World",
"World|Land",
"World|Ocean",
"World|Northern Hemisphere",
"World|Southern Hemisphere",
]
).lineplot(hue="region", ax=ax1)
ax2 = fig.add_subplot(312, sharey=ax1, sharex=ax1)
tas.get_scm_timeseries(
regions=[
"World",
"World|Northern Hemisphere|Land",
"World|Southern Hemisphere|Land",
"World|Northern Hemisphere|Ocean",
"World|Southern Hemisphere|Ocean",
]
).lineplot(hue="region", ax=ax2)
ax3 = fig.add_subplot(313, sharey=ax1, sharex=ax1)
tas.get_scm_timeseries(
regions=[
"World",
"World|Ocean",
"World|North Atlantic Ocean",
"World|El Nino N3.4",
]
).lineplot(hue="region", ax=ax3);
DEBUG:netcdf_scm.iris_cube_wrappers:cell_weights: None
DEBUG:netcdf_scm.iris_cube_wrappers:self.netcdf_scm_realm: atmosphere
DEBUG:netcdf_scm.iris_cube_wrappers:Using: <class 'netcdf_scm.weights.AreaSurfaceFractionWeightCalculator'>
DEBUG:netcdf_scm.iris_cube_wrappers:loading cube ../../../tests/test-data/cmip6output/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/fx/sftlf/gr/v20180803/sftlf_fx_IPSL-CM6A-LR_historical_r1i1p1f1_gr.nc
DEBUG:netcdf_scm.weights:sftlf data max is 100.0, dividing by 100.0 to convert units to fraction
DEBUG:netcdf_scm.iris_cube_wrappers:Crunching SCM timeseries in memory
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
DEBUG:netcdf_scm.iris_cube_wrappers:Crunching SCM timeseries in memory
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
DEBUG:netcdf_scm.iris_cube_wrappers:Crunching SCM timeseries in memory
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available

[34]:
# NBVAL_IGNORE_OUTPUT
compare_weighting_options(tas)
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/cube.py:3218: UserWarning: Collapsing spatial coordinate 'latitude' without weighting
warnings.warn(msg.format(coord.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))

As our data is global, the “World” data is simply an area-weighted mean.
[35]:
# NBVAL_IGNORE_OUTPUT
tas_area = tas.get_metadata_cube(tas.areacell_var).cube
tas_area_weights = broadcast_onto_lat_lon_grid(tas, tas_area.data)
tas_area_weighted_mean = tas.cube.collapsed(
["latitude", "longitude"], iris.analysis.MEAN, weights=tas_area_weights
)
netcdf_scm_calculated = tas.get_scm_timeseries(regions=["World"]).timeseries()
np.testing.assert_allclose(
tas_area_weighted_mean.data, netcdf_scm_calculated.values.squeeze()
)
netcdf_scm_calculated.T
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
DEBUG:netcdf_scm.iris_cube_wrappers:Crunching SCM timeseries in memory
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
[35]:
activity_id | CMIP |
---|---|
climate_model | IPSL-CM6A-LR |
member_id | r1i1p1f1 |
mip_era | CMIP6 |
model | unspecified |
region | World |
scenario | historical |
unit | K |
variable | tas |
variable_standard_name | air_temperature |
time | |
1910-01-16 12:00:00 | 284.148122 |
1910-02-15 00:00:00 | 284.196805 |
1910-03-16 12:00:00 | 284.876555 |
The “World|Land” data is weighted by both cell area and land surface fraction.
[36]:
# NBVAL_IGNORE_OUTPUT
tas_sf = tas.get_metadata_cube(tas.surface_fraction_var).cube
# netcdf-scm normalises weights to 1 internally so we do so here too
tas_sf = tas_sf / tas_sf.data.max()
tas_area_sf = tas_area * tas_sf
tas_area_sf_weights = broadcast_onto_lat_lon_grid(tas, tas_area_sf.data)
tas_area_sf_weighted_mean = tas.cube.collapsed(
["latitude", "longitude"], iris.analysis.MEAN, weights=tas_area_sf_weights
)
netcdf_scm_calculated = tas.get_scm_timeseries(
regions=["World|Land"]
).timeseries()
np.testing.assert_allclose(
tas_area_sf_weighted_mean.data, netcdf_scm_calculated.values.squeeze()
)
netcdf_scm_calculated.T
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
DEBUG:netcdf_scm.iris_cube_wrappers:Crunching SCM timeseries in memory
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
[36]:
activity_id | CMIP |
---|---|
climate_model | IPSL-CM6A-LR |
member_id | r1i1p1f1 |
mip_era | CMIP6 |
model | unspecified |
region | World|Land |
scenario | historical |
unit | K |
variable | tas |
variable_standard_name | air_temperature |
time | |
1910-01-16 12:00:00 | 273.530365 |
1910-02-15 00:00:00 | 273.393341 |
1910-03-16 12:00:00 | 275.527954 |
The “World|Ocean” data is also surface fraction weighted (calculated as 100 minus land surface fraction).
[37]:
# NBVAL_IGNORE_OUTPUT
tas_sf_ocean = tas.get_metadata_cube(tas.surface_fraction_var).cube
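# the land surface fraction is given as a percentage, so the ocean fraction is 100 minus it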
tas_sf_ocean.data = 100 - tas_sf_ocean.data
# netcdf-scm normalises weights to 1 internally so we do so here too
tas_sf_ocean = tas_sf_ocean / tas_sf_ocean.data.max()
tas_area_sf_ocean = tas_area.data * tas_sf_ocean.data
tas_area_sf_ocean_weights = broadcast_onto_lat_lon_grid(tas, tas_area_sf_ocean)
tas_area_sf_ocean_weighted_mean = tas.cube.collapsed(
["latitude", "longitude"],
iris.analysis.MEAN,
weights=tas_area_sf_ocean_weights,
)
netcdf_scm_calculated = tas.get_scm_timeseries(
regions=["World|Ocean"]
).timeseries()
np.testing.assert_allclose(
tas_area_sf_ocean_weighted_mean.data,
netcdf_scm_calculated.values.squeeze(),
)
netcdf_scm_calculated.T
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
DEBUG:netcdf_scm.iris_cube_wrappers:Crunching SCM timeseries in memory
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
[37]:
activity_id | CMIP |
---|---|
climate_model | IPSL-CM6A-LR |
member_id | r1i1p1f1 |
mip_era | CMIP6 |
model | unspecified |
region | World|Ocean |
scenario | historical |
unit | K |
variable | tas |
variable_standard_name | air_temperature |
time | |
1910-01-16 12:00:00 | 288.427979 |
1910-02-15 00:00:00 | 288.551514 |
1910-03-16 12:00:00 | 288.644806 |
For atmosphere data, by default netCDF-SCM will use the area and surface fraction weights. Once again, if we turn the logging up, we can see the decisions being made internally.
[38]:
# NBVAL_IGNORE_OUTPUT
root_logger.setLevel(logging.DEBUG)
tas = CMIP6OutputCube()
tas.load_data_from_path(tas_file)
DEBUG:netcdf_scm.iris_cube_wrappers:loading cube ../../../tests/test-data/cmip6output/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/Amon/tas/gr/v20180803/tas_Amon_IPSL-CM6A-LR_historical_r1i1p1f1_gr_191001-191003.nc
DEBUG:netcdf_scm.iris_cube_wrappers:loading cube ../../../tests/test-data/cmip6output/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/fx/areacella/gr/v20180803/areacella_fx_IPSL-CM6A-LR_historical_r1i1p1f1_gr.nc
[39]:
# NBVAL_IGNORE_OUTPUT
netcdf_scm_calculated = tas.get_scm_timeseries(regions=["World"]).timeseries()
netcdf_scm_calculated.T
DEBUG:netcdf_scm.iris_cube_wrappers:cell_weights: None
DEBUG:netcdf_scm.iris_cube_wrappers:self.netcdf_scm_realm: atmosphere
DEBUG:netcdf_scm.iris_cube_wrappers:Using: <class 'netcdf_scm.weights.AreaSurfaceFractionWeightCalculator'>
DEBUG:netcdf_scm.iris_cube_wrappers:Crunching SCM timeseries in memory
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
WARNING:py.warnings:/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1410: UserWarning: Collapsing a non-contiguous coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
WARNING:netcdf_scm.iris_cube_wrappers:Not calculating land fractions as all required cubes are not available
[39]:
activity_id | CMIP |
---|---|
climate_model | IPSL-CM6A-LR |
member_id | r1i1p1f1 |
mip_era | CMIP6 |
model | unspecified |
region | World |
scenario | historical |
unit | K |
variable | tas |
variable_standard_name | air_temperature |
time | |
1910-01-16 12:00:00 | 284.148122 |
1910-02-15 00:00:00 | 284.196805 |
1910-03-16 12:00:00 | 284.876555 |
Ocean data handling¶
In this notebook we show how ocean data is handled.
[1]:
# NBVAL_IGNORE_OUTPUT
import traceback
from os.path import join
import numpy as np
import iris
import iris.quickplot as qplt
import matplotlib
import matplotlib.pyplot as plt
from scmdata import ScmRun
from netcdf_scm.iris_cube_wrappers import CMIP6OutputCube
[2]:
# make all logs appear
import logging
root_logger = logging.getLogger()
root_logger.addHandler(logging.StreamHandler())
[3]:
plt.style.use("bmh")
%matplotlib inline
[4]:
DATA_PATH_TEST = join("..", "..", "..", "tests", "test-data")
DATA_PATH_TEST_CMIP6_OUTPUT_ROOT = join(DATA_PATH_TEST, "cmip6output")
Test data¶
For this notebook’s test data we use CMIP6 output data from NCAR’s CESM2 model.
Some ocean data is 2D. Here we use surface downward heat flux in sea water.
Firstly we use data which has been regridded by the modelling group.
[5]:
hfds_file = join(
DATA_PATH_TEST,
"cmip6output",
"CMIP6",
"CMIP",
"NCAR",
"CESM2",
"historical",
"r7i1p1f1",
"Omon",
"hfds",
"gr",
"v20190311",
"hfds_Omon_CESM2_historical_r7i1p1f1_gr_195701-195703.nc",
)
We also examine how iris handles data which is provided on the native model grid.
[6]:
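# swap the grid label "gr" (regridded) for "gn" (native grid) in both the directory and the file name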
hfds_file_gn = hfds_file.replace("gr", "gn")
Some ocean data is 3D. netCDF-SCM currently supports crunching this to iris cubes but will not convert those cubes to SCM timeseries.
[7]:
thetao_file = join(
DATA_PATH_TEST,
"cmip6output",
"CMIP6",
"CMIP",
"NCAR",
"CESM2",
"historical",
"r10i1p1f1",
"Omon",
"thetao",
"gn",
"v20190313",
"thetao_Omon_CESM2_historical_r10i1p1f1_gn_195310-195312.nc",
)
2D data handling¶
[8]:
# NBVAL_IGNORE_OUTPUT
hfds_cube = CMIP6OutputCube()
hfds_cube.load_data_from_path(hfds_file)
[9]:
print(hfds_cube.cube)
surface_downward_heat_flux_in_sea_water / (W m-2) (time: 3; latitude: 180; longitude: 360)
Dimension coordinates:
time x - -
latitude - x -
longitude - - x
Cell Measures:
cell_area - x x
Attributes:
CDI: Climate Data Interface version 1.8.2 (http://mpimet.mpg.de/cdi)
CDO: Climate Data Operators version 1.8.2 (http://mpimet.mpg.de/cdo)
Conventions: CF-1.7 CMIP-6.2
activity_id: CMIP
branch_method: standard
branch_time_in_child: 674885.0
branch_time_in_parent: 273750.0
case_id: 21
cesm_casename: b.e21.BHIST.f09_g17.CMIP6-historical.007
comment: Model data on the 1x1 grid includes values in all cells for which ocean...
contact: cesm_cmip6@ucar.edu
creation_date: 2019-01-19T03:13:13Z
data_specs_version: 01.00.29
description: This is the net flux of heat entering the liquid water column through its...
experiment: all-forcing simulation of the recent past
experiment_id: historical
external_variables: areacello
frequency: mon
further_info_url: https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2.historical.none.r7i1p1...
grid: ocean data regridded from native gx1v7 displaced pole grid (384x320 latxlon)...
grid_label: gr
history: Sun Aug 18 22:57:15 2019: cdo -selmonth,1/3 tmp.nc hfds_Omon_CESM2_historical_r7i1p1f1_gr_195701-195703.nc
Sun...
id: hfds
institution: National Center for Atmospheric Research
institution_id: NCAR
license: CMIP6 model data produced by <The National Center for Atmospheric Research>...
mipTable: Omon
mip_era: CMIP6
model_doi_url: https://doi.org/10.5065/D67H1H0V
nominal_resolution: 1x1 degree
out_name: hfds
parent_activity_id: CMIP
parent_experiment_id: piControl
parent_mip_era: CMIP6
parent_source_id: CESM2
parent_time_units: days since 0001-01-01 00:00:00
parent_variant_label: r1i1p1f1
product: model-output
prov: Omon ((isd.003))
realm: ocean
source: CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite volume grid; 288 x 192...
source_id: CESM2
source_type: AOGCM BGC
sub_experiment: none
sub_experiment_id: none
table_id: Omon
time: time
time_label: time-mean
time_title: Temporal mean
title: Downward Heat Flux at Sea Water Surface
tracking_id: hdl:21.14100/18907361-7d4d-4a3c-b355-4450472ab458
type: real
variable_id: hfds
variant_info: CMIP6 20th century experiments (1850-2014) with CAM6, interactive land...
variant_label: r7i1p1f1
Cell methods:
mean where sea: area
mean: time
[10]:
# NBVAL_IGNORE_OUTPUT
time_mean = hfds_cube.cube.collapsed("time", iris.analysis.MEAN)
qplt.pcolormesh(time_mean)
plt.gca().coastlines();

Iris’ handling of data on the native model grid is mostly workable, but not yet perfect.
[11]:
# NBVAL_IGNORE_OUTPUT
hfds_cube_gn = CMIP6OutputCube()
hfds_cube_gn.load_data_from_path(hfds_file_gn)
print(hfds_cube_gn.cube)
WARNING: missing_value not used since it
cannot be safely cast to variable data type
surface_downward_heat_flux_in_sea_water / (W m-2) (time: 3; -- : 384; -- : 320)
Dimension coordinates:
time x - -
Auxiliary coordinates:
latitude - x x
longitude - x x
Cell Measures:
cell_area - x x
Attributes:
CDI: Climate Data Interface version 1.8.2 (http://mpimet.mpg.de/cdi)
CDO: Climate Data Operators version 1.8.2 (http://mpimet.mpg.de/cdo)
Conventions: CF-1.7 CMIP-6.2
activity_id: CMIP
branch_method: standard
branch_time_in_child: 674885.0
branch_time_in_parent: 273750.0
case_id: 21
cesm_casename: b.e21.BHIST.f09_g17.CMIP6-historical.007
comment: This is the net flux of heat entering the liquid water column through its...
contact: cesm_cmip6@ucar.edu
creation_date: 2019-01-19T03:13:13Z
data_specs_version: 01.00.29
description: This is the net flux of heat entering the liquid water column through its...
experiment: all-forcing simulation of the recent past
experiment_id: historical
external_variables: areacello
frequency: mon
further_info_url: https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2.historical.none.r7i1p1...
grid: native gx1v7 displaced pole grid (384x320 latxlon)
grid_label: gn
history: Sun Aug 18 22:57:16 2019: cdo -selmonth,1/3 tmp.nc hfds_Omon_CESM2_historical_r7i1p1f1_gn_195701-195703.nc
Sun...
id: hfds
institution: National Center for Atmospheric Research
institution_id: NCAR
license: CMIP6 model data produced by <The National Center for Atmospheric Research>...
mipTable: Omon
mip_era: CMIP6
model_doi_url: https://doi.org/10.5065/D67H1H0V
nominal_resolution: 100 km
out_name: hfds
parent_activity_id: CMIP
parent_experiment_id: piControl
parent_mip_era: CMIP6
parent_source_id: CESM2
parent_time_units: days since 0001-01-01 00:00:00
parent_variant_label: r1i1p1f1
product: model-output
prov: Omon ((isd.003))
realm: ocean
source: CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite volume grid; 288 x 192...
source_id: CESM2
source_type: AOGCM BGC
sub_experiment: none
sub_experiment_id: none
table_id: Omon
time: time
time_label: time-mean
time_title: Temporal mean
title: Downward Heat Flux at Sea Water Surface
tracking_id: hdl:21.14100/f92a6db7-e8ea-44f1-882c-076226f8a62b
type: real
variable_id: hfds
variant_info: CMIP6 20th century experiments (1850-2014) with CAM6, interactive land...
variant_label: r7i1p1f1
Cell methods:
mean where sea: area
mean: time
[12]:
# NBVAL_IGNORE_OUTPUT
time_mean = hfds_cube_gn.cube.collapsed("time", iris.analysis.MEAN)
qplt.pcolormesh(time_mean)
plt.gca().coastlines();

Getting SCM Timeseries¶
We cut down to SCM timeseries in the standard way.
[13]:
# NBVAL_IGNORE_OUTPUT
regions_to_get = [
"World",
"World|Northern Hemisphere",
"World|Northern Hemisphere|Ocean",
"World|Ocean",
"World|Southern Hemisphere",
"World|Southern Hemisphere|Ocean",
"World|North Atlantic Ocean",
"World|El Nino N3.4",
]
hfds_ts = hfds_cube.get_scm_timeseries(regions=regions_to_get)
hfds_gn_ts = hfds_cube_gn.get_scm_timeseries(regions=regions_to_get)
ax = plt.figure(figsize=(16, 9)).add_subplot(111)
ax = hfds_ts.lineplot(hue="region", style="variable", dashes=[(3, 3)], ax=ax)
hfds_gn_ts.lineplot(
hue="region", style="variable", dashes=[(10, 30)], ax=ax, legend=False
);
Not calculating land fractions as all required cubes are not available
Performing lazy conversion to datetime for calendar: 365_day. This may cause subtle errors in operations that depend on the length of time between dates
WARNING: missing_value not used since it
cannot be safely cast to variable data type
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.8/site-packages/iris/fileformats/netcdf.py:395: UserWarning: WARNING: missing_value not used since it
cannot be safely cast to variable data type
var = variable[keys]
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.8/site-packages/iris/coords.py:1406: UserWarning: Collapsing a multi-dimensional coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.8/site-packages/iris/coords.py:1406: UserWarning: Collapsing a multi-dimensional coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))

Comparing the results of collapsing the native-grid data and the regridded data reveals a small difference (approximately 1%), most noticeable in the small El Nino N3.4 region.
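For a quick numeric view of the same comparison, a minimal sketch along the following lines would print the largest relative difference per region (reusing the hfds_ts and hfds_gn_ts objects from above; pct_diff is just an illustrative name):

# illustrative sketch: largest relative difference (in %) per region
pct_diff = (
    (hfds_ts.timeseries() - hfds_gn_ts.timeseries()) / hfds_ts.timeseries()
).abs() * 100
print(pct_diff.max(axis=1))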
[14]:
ax1, ax2 = plt.figure(figsize=(16, 9)).subplots(nrows=1, ncols=2)
ScmRun(hfds_ts.timeseries() - hfds_gn_ts.timeseries()).line_plot(
hue="region", ax=ax1, legend=False
)
ax1.set_title("Absolute difference")
ScmRun(
(
(hfds_ts.timeseries() - hfds_gn_ts.timeseries()) / hfds_ts.timeseries()
).abs()
* 100
).line_plot(hue="region", ax=ax2)
ax2.set_title("Percentage difference");

3D Data Handling¶
[15]:
# NBVAL_IGNORE_OUTPUT
thetao_cube = CMIP6OutputCube()
thetao_cube.load_data_from_path(thetao_file)
WARNING: missing_value not used since it
cannot be safely cast to variable data type
Missing CF-netCDF measure variable 'volcello', referenced by netCDF variable 'thetao'
[16]:
print(thetao_cube.cube)
sea_water_potential_temperature / (degC) (time: 3; generic: 60; -- : 384; -- : 320)
Dimension coordinates:
time x - - -
generic - x - -
Auxiliary coordinates:
latitude - - x x
longitude - - x x
Cell Measures:
cell_area - - x x
Attributes:
CDI: Climate Data Interface version 1.8.2 (http://mpimet.mpg.de/cdi)
CDO: Climate Data Operators version 1.8.2 (http://mpimet.mpg.de/cdo)
Conventions: CF-1.7 CMIP-6.2
activity_id: CMIP
branch_method: standard
branch_time_in_child: 674885.0
branch_time_in_parent: 306600.0
case_id: 24
cesm_casename: b.e21.BHIST.f09_g17.CMIP6-historical.010
comment: Diagnostic should be contributed even for models using conservative temperature...
contact: cesm_cmip6@ucar.edu
creation_date: 2019-03-12T02:46:53Z
data_specs_version: 01.00.29
description: Diagnostic should be contributed even for models using conservative temperature...
experiment: Simulation of recent past (1850 to 2014). Impose changing conditions (consistent...
experiment_id: historical
external_variables: areacello volcello
frequency: mon
further_info_url: https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2.historical.none.r10i1p...
grid: native gx1v7 displaced pole grid (384x320 latxlon)
grid_label: gn
history: Mon Aug 19 17:25:30 2019: cdo -selmonth,10/12 tmp.nc thetao_Omon_CESM2_historical_r10i1p1f1_gn_195310-195312.nc
Mon...
id: thetao
institution: National Center for Atmospheric Research
institution_id: NCAR
license: CMIP6 model data produced by <The National Center for Atmospheric Research>...
mipTable: Omon
mip_era: CMIP6
model_doi_url: https://doi.org/10.5065/D67H1H0V
nominal_resolution: 100 km
out_name: thetao
parent_activity_id: CMIP
parent_experiment_id: piControl
parent_mip_era: CMIP6
parent_source_id: CESM2
parent_time_units: days since 0001-01-01 00:00:00
parent_variant_label: r1i1p1f1
product: model-output
prov: Omon ((isd.003))
realm: ocean
source: CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite volume grid; 288 x 192...
source_id: CESM2
source_type: AOGCM BGC
sub_experiment: none
sub_experiment_id: none
table_id: Omon
time: time
time_label: time-mean
time_title: Temporal mean
title: Sea Water Potential Temperature
tracking_id: hdl:21.14100/19f9ed4d-daf4-4a51-8563-fe32b9c2a0cd
type: real
variable_id: thetao
variant_info: CMIP6 20th century experiments (1850-2014) with CAM6, interactive land...
variant_label: r10i1p1f1
Cell methods:
mean where sea: area
mean: time
If we take a time mean of a cube with 3D spatial data, we end up with a 3D cube, which cannot be plotted on a 2D plot.
[17]:
# NBVAL_IGNORE_OUTPUT
time_mean = thetao_cube.cube.collapsed("time", iris.analysis.MEAN)
try:
qplt.pcolormesh(time_mean,)
except ValueError as e:
traceback.print_exc(limit=0, chain=False)
Traceback (most recent call last):
ValueError: Cube must be 2-dimensional. Got 3 dimensions.
If we also take e.g. a depth mean, then we can plot the result (although, as this data is on the model’s native grid, iris doesn’t do a great job of plotting it).
[18]:
# NBVAL_IGNORE_OUTPUT
# the depth co-ordinate is labelled as 'generic' for some reason
time_depth_mean = time_mean.collapsed("generic", iris.analysis.MEAN)
qplt.pcolormesh(time_depth_mean);

We can crunch into SCM timeseries cubes.
[19]:
# NBVAL_IGNORE_OUTPUT
thetao_ts_cubes = thetao_cube.get_scm_timeseries_cubes(regions=regions_to_get)
WARNING: missing_value not used since it
cannot be safely cast to variable data type
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.8/site-packages/iris/fileformats/netcdf.py:395: UserWarning: WARNING: missing_value not used since it
cannot be safely cast to variable data type
var = variable[keys]
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.8/site-packages/iris/coords.py:1406: UserWarning: Collapsing a multi-dimensional coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.8/site-packages/iris/coords.py:1406: UserWarning: Collapsing a multi-dimensional coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
Not calculating land fractions as all required cubes are not available
These cubes now have dimensions of time and depth (labelled as ‘generic’ here). Hence we can plot them.
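As a quick check of the remaining dimensions, one could print a shortened summary of one of the collapsed cubes, e.g.:

print(thetao_ts_cubes["World"].cube.summary(shorten=True))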
[20]:
plt.figure(figsize=(12, 15))
plt.subplot(311)
qplt.pcolormesh(thetao_ts_cubes["World"].cube,)
plt.title("World")
plt.subplot(323)
qplt.pcolormesh(thetao_ts_cubes["World|Northern Hemisphere|Ocean"].cube,)
plt.title("World|Northern Hemisphere|Ocean")
plt.subplot(324)
qplt.pcolormesh(thetao_ts_cubes["World|Southern Hemisphere|Ocean"].cube,)
plt.title("World|Southern Hemisphere|Ocean")
plt.subplot(325)
qplt.pcolormesh(thetao_ts_cubes["World|El Nino N3.4"].cube,)
plt.title("World|El Nino N3.4")
plt.subplot(326)
qplt.pcolormesh(thetao_ts_cubes["World|North Atlantic Ocean"].cube,)
plt.title("World|North Atlantic Ocean")
plt.tight_layout()

We have also not yet decided on our convention for handling the depth information in ScmRun objects, hence attempting to retrieve SCM timeseries will result in an error.
[21]:
# NBVAL_IGNORE_OUTPUT
try:
thetao_cube.get_scm_timeseries(regions=regions_to_get)
except NotImplementedError as e:
traceback.print_exc(limit=0, chain=False)
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.8/site-packages/iris/fileformats/netcdf.py:395: UserWarning: WARNING: missing_value not used since it
cannot be safely cast to variable data type
var = variable[keys]
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.8/site-packages/iris/coords.py:1406: UserWarning: Collapsing a multi-dimensional coordinate. Metadata may not be fully descriptive for 'latitude'.
warnings.warn(msg.format(self.name()))
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.8/site-packages/iris/coords.py:1406: UserWarning: Collapsing a multi-dimensional coordinate. Metadata may not be fully descriptive for 'longitude'.
warnings.warn(msg.format(self.name()))
Not calculating land fractions as all required cubes are not available
Traceback (most recent call last):
NotImplementedError: Cannot yet get SCM timeseries for data with dimensions other than time, latitude and longitude
Wranglers¶
In this notebook we give a brief overview of wrangling with netCDF-SCM.
[1]:
# NBVAL_IGNORE_OUTPUT
import glob
from pathlib import Path
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymagicc
[2]:
plt.style.use("bmh")
%matplotlib inline
Wrangling help¶
The wrangling help can be accessed via our command line interface.
[3]:
# NBVAL_IGNORE_OUTPUT
!netcdf-scm wrangle -h
Usage: netcdf-scm wrangle [OPTIONS] SRC DST WRANGLE_CONTACT
Wrangle netCDF-SCM ``.nc`` files into other formats and directory
structures.
``src`` is searched recursively and netcdf-scm will attempt to wrangle all
the files found.
``wrangle_contact`` is written into the header of the output files.
Options:
--regexp TEXT Regular expression to apply to file
directory (only wrangles matches). Be
careful, if you use a very complex regexp
directory sorting can be extremely slow (see
e.g. discussion at
https://stackoverflow.com/a/5428712)!
[default: ^(?!.*(fx)).*$]
--prefix TEXT Prefix to apply to output file names (not
paths).
--out-format [mag-files|mag-files-average-year-start-year|mag-files-average-year-mid-year|mag-files-average-year-end-year|mag-files-point-start-year|mag-files-point-mid-year|mag-files-point-end-year|magicc-input-files|magicc-input-files-average-year-start-year|magicc-input-files-average-year-mid-year|magicc-input-files-average-year-end-year|magicc-input-files-point-start-year|magicc-input-files-point-mid-year|magicc-input-files-point-end-year|tuningstrucs-blend-model]
Format to re-write crunched data into. The
time operation conventions follow those in
`Pymagicc <https://pymagicc.readthedocs.io/e
n/latest/file_conventions.html#namelists>`_.
[default: mag-files]
--drs [None|MarbleCMIP5|CMIP6Input4MIPs|CMIP6Output]
Data reference syntax to use to decipher
paths. This is required to ensure the output
folders match the input data reference
syntax. [default: None]
-f, --force / --do-not-force Overwrite any existing files. [default:
False]
--number-workers INTEGER Number of workers (threads) to use when
wrangling. [default: 4]
--target-units-specs PATH csv containing target units for wrangled
variables.
-h, --help Show this message and exit.
MAG file wrangling¶
The most common format to wrangle to is the .MAG format. This is a custom MAGICC format (see https://pymagicc.readthedocs.io/en/latest/file_conventions.html#the-future). We can wrangle data which has already been crunched to this format as shown below.
[4]:
# NBVAL_IGNORE_OUTPUT
!netcdf-scm wrangle \
"../../../tests/test-data/expected-crunching-output/cmip6output/Lmon/CMIP6/CMIP/NCAR" \
"../../../output-examples/wrangled-files" "notebook example <email address>" \
--force \
--drs "CMIP6Output" \
--out-format "mag-files" \
--regexp ".*cSoilFast.*"
87973 2021-03-18 13:05:52,227 INFO:netcdf_scm:netcdf-scm: 2.0.2+15.g74db9d85.dirty
87973 2021-03-18 13:05:52,228 INFO:netcdf_scm:wrangle_contact: notebook example <email address>
87973 2021-03-18 13:05:52,228 INFO:netcdf_scm:source: /Users/znicholls/Documents/AGCEC/netCDF-SCM/netcdf-scm/tests/test-data/expected-crunching-output/cmip6output/Lmon/CMIP6/CMIP/NCAR
87973 2021-03-18 13:05:52,228 INFO:netcdf_scm:destination: /Users/znicholls/Documents/AGCEC/netCDF-SCM/netcdf-scm/output-examples/wrangled-files
87973 2021-03-18 13:05:52,228 INFO:netcdf_scm:regexp: .*cSoilFast.*
87973 2021-03-18 13:05:52,228 INFO:netcdf_scm:prefix: None
87973 2021-03-18 13:05:52,228 INFO:netcdf_scm:drs: CMIP6Output
87973 2021-03-18 13:05:52,228 INFO:netcdf_scm:out_format: mag-files
87973 2021-03-18 13:05:52,228 INFO:netcdf_scm:force: True
87973 2021-03-18 13:05:52,230 INFO:netcdf_scm:Finding directories with files
Walking through directories and applying `check_func`: 11it [00:00, 9394.69it/s]
87973 2021-03-18 13:05:52,238 INFO:netcdf_scm:Found 1 directories with files
87973 2021-03-18 13:05:52,239 INFO:netcdf_scm.cli_parallel:Processing in parallel with 4 workers
87973 2021-03-18 13:05:52,239 INFO:netcdf_scm.cli_parallel:Forcing dask to use a single thread when reading
100%|████████████████████████████████████████| 1.00/1.00 [00:04<00:00, 4.32s/it]
We can then load the .MAG files using Pymagicc.
[5]:
written_files = [
f for f in Path("../../../output-examples/wrangled-files").rglob("*.MAG")
]
written_files
[5]:
[PosixPath('../../../output-examples/wrangled-files/CMIP6/CMIP/NCAR/CESM2/historical/r7i1p1f1/Lmon/cSoilFast/gn/v20190311/netcdf-scm_cSoilFast_Lmon_CESM2_historical_r7i1p1f1_gn_195701-195703.MAG')]
[6]:
wrangled = pymagicc.io.MAGICCData(str(written_files[0]))
[7]:
# NBVAL_IGNORE_OUTPUT
wrangled.timeseries()
[7]:
climate_model | model | region | scenario | todo | unit | variable | 1957-01-15 12:00:00 | 1957-02-14 00:00:00 | 1957-03-15 12:00:00
---|---|---|---|---|---|---|---|---|---
unspecified | unspecified | World | unspecified | SET | kg m^-2 | cSoilFast | 0.085600 | 0.085547 | 0.085422
unspecified | unspecified | World|Northern Hemisphere | unspecified | SET | kg m^-2 | cSoilFast | 0.097727 | 0.097910 | 0.098135
unspecified | unspecified | World|Southern Hemisphere | unspecified | SET | kg m^-2 | cSoilFast | 0.060421 | 0.059879 | 0.059024
unspecified | unspecified | World|Land | unspecified | SET | kg m^-2 | cSoilFast | 0.085600 | 0.085547 | 0.085422
unspecified | unspecified | World|Northern Hemisphere|Land | unspecified | SET | kg m^-2 | cSoilFast | 0.097727 | 0.097910 | 0.098135
unspecified | unspecified | World|Southern Hemisphere|Land | unspecified | SET | kg m^-2 | cSoilFast | 0.060421 | 0.059879 | 0.059024
[8]:
# NBVAL_IGNORE_OUTPUT
wrangled.lineplot(hue="region")
[8]:
<AxesSubplot:xlabel='time', ylabel='kg m^-2'>

The units of the wrangled data are kg m^-2. This might not be super helpful. As such, netcdf-scm wrangle allows users to specify a csv which defines the target units to use for variables when wrangling.
The conversion csv should look like the below.
[9]:
conv_csv = pd.DataFrame(
[["cSoilFast", "t / m**2"], ["tos", "K"]], columns=["variable", "unit"]
)
conv_csv_path = "../../../output-examples/conversion-new-units.csv"
conv_csv.to_csv(conv_csv_path, index=False)
with open(conv_csv_path) as f:
conv_csv_content = f.read()
print(conv_csv_content)
variable,unit
cSoilFast,t / m**2
tos,K
With such a csv, we can now wrangle to our desired units.
[10]:
# NBVAL_IGNORE_OUTPUT
!netcdf-scm wrangle \
"../../../tests/test-data/expected-crunching-output/cmip6output/Lmon/CMIP6/CMIP/NCAR" \
"../../../output-examples/wrangled-files-new-units" \
"notebook example <email address>" \
--force --drs "CMIP6Output" \
--out-format "mag-files" \
--regexp ".*cSoilFast.*" \
--target-units-specs "../../../output-examples/conversion-new-units.csv"
87988 2021-03-18 13:06:01,020 INFO:netcdf_scm:netcdf-scm: 2.0.2+15.g74db9d85.dirty
87988 2021-03-18 13:06:01,020 INFO:netcdf_scm:wrangle_contact: notebook example <email address>
87988 2021-03-18 13:06:01,020 INFO:netcdf_scm:source: /Users/znicholls/Documents/AGCEC/netCDF-SCM/netcdf-scm/tests/test-data/expected-crunching-output/cmip6output/Lmon/CMIP6/CMIP/NCAR
87988 2021-03-18 13:06:01,020 INFO:netcdf_scm:destination: /Users/znicholls/Documents/AGCEC/netCDF-SCM/netcdf-scm/output-examples/wrangled-files-new-units
87988 2021-03-18 13:06:01,020 INFO:netcdf_scm:regexp: .*cSoilFast.*
87988 2021-03-18 13:06:01,021 INFO:netcdf_scm:prefix: None
87988 2021-03-18 13:06:01,021 INFO:netcdf_scm:drs: CMIP6Output
87988 2021-03-18 13:06:01,021 INFO:netcdf_scm:out_format: mag-files
87988 2021-03-18 13:06:01,021 INFO:netcdf_scm:force: True
87988 2021-03-18 13:06:01,022 INFO:netcdf_scm:Finding directories with files
Walking through directories and applying `check_func`: 11it [00:00, 9150.60it/s]
87988 2021-03-18 13:06:01,030 INFO:netcdf_scm:Found 1 directories with files
87988 2021-03-18 13:06:01,031 INFO:netcdf_scm.cli_parallel:Processing in parallel with 4 workers
87988 2021-03-18 13:06:01,031 INFO:netcdf_scm.cli_parallel:Forcing dask to use a single thread when reading
100%|████████████████████████████████████████| 1.00/1.00 [00:04<00:00, 4.28s/it]
[11]:
# NBVAL_IGNORE_OUTPUT
written_files = [
f
for f in Path("../../../output-examples/wrangled-files-new-units").rglob(
"*.MAG"
)
]
wrangled_new_units = pymagicc.io.MAGICCData(str(written_files[0]))
wrangled_new_units.timeseries()
[11]:
climate_model | model | region | scenario | todo | unit | variable | 1957-01-15 12:00:00 | 1957-02-14 00:00:00 | 1957-03-15 12:00:00
---|---|---|---|---|---|---|---|---|---
unspecified | unspecified | World | unspecified | SET | t / m^2 | cSoilFast | 0.000086 | 0.000086 | 0.000085
unspecified | unspecified | World|Northern Hemisphere | unspecified | SET | t / m^2 | cSoilFast | 0.000098 | 0.000098 | 0.000098
unspecified | unspecified | World|Southern Hemisphere | unspecified | SET | t / m^2 | cSoilFast | 0.000060 | 0.000060 | 0.000059
unspecified | unspecified | World|Land | unspecified | SET | t / m^2 | cSoilFast | 0.000086 | 0.000086 | 0.000085
unspecified | unspecified | World|Northern Hemisphere|Land | unspecified | SET | t / m^2 | cSoilFast | 0.000098 | 0.000098 | 0.000098
unspecified | unspecified | World|Southern Hemisphere|Land | unspecified | SET | t / m^2 | cSoilFast | 0.000060 | 0.000060 | 0.000059
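As a quick cross-check of the conversion: 1 t = 1000 kg, so the World value of 0.085600 kg m^-2 corresponds to 8.56e-5 t m^-2, consistent with the rounded 0.000086 shown above.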
[12]:
# NBVAL_IGNORE_OUTPUT
wrangled_new_units.lineplot(hue="region")
[12]:
<AxesSubplot:xlabel='time', ylabel='t / m^2'>

We can also set the units to include an area sum. For example, if we set our units to Gt / yr rather than Gt / m**2 / yr, then the wrangler will automatically take an area sum of the data (weighted by the effective area used in the crunching) before returning the data.
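To illustrate the principle, here is a minimal sketch with made-up numbers (this is not netCDF-SCM's actual implementation):

import numpy as np

# intensive data (t / m^2) on a tiny made-up grid
values = np.array([[1.0e-4, 2.0e-4], [1.5e-4, 0.5e-4]])
# made-up effective cell areas in m^2
cell_areas = np.array([[1.0e12, 1.0e12], [0.8e12, 0.8e12]])

total_t = (values * cell_areas).sum()  # area sum, giving t
total_gt = total_t / 1e9  # 1 Gt = 1e9 t
print(total_gt)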
[13]:
conv_csv = pd.DataFrame(
[["cSoilFast", "Gt"], ["tos", "K"]], columns=["variable", "unit"]
)
conv_csv_path = "../../../output-examples/conversion-area-sum-units.csv"
conv_csv.to_csv(conv_csv_path, index=False)
with open(conv_csv_path) as f:
conv_csv_content = f.read()
print(conv_csv_content)
variable,unit
cSoilFast,Gt
tos,K
[14]:
# NBVAL_IGNORE_OUTPUT
!netcdf-scm wrangle \
"../../../tests/test-data/expected-crunching-output/cmip6output/Lmon/CMIP6/CMIP/NCAR" \
"../../../output-examples/wrangled-files-area-sum-units" \
"notebook example <email address>" \
--force \
--drs "CMIP6Output" \
--out-format "mag-files" \
--regexp ".*cSoilFast.*" \
--target-units-specs "../../../output-examples/conversion-area-sum-units.csv"
88003 2021-03-18 13:06:10,165 INFO:netcdf_scm:netcdf-scm: 2.0.2+15.g74db9d85.dirty
88003 2021-03-18 13:06:10,166 INFO:netcdf_scm:wrangle_contact: notebook example <email address>
88003 2021-03-18 13:06:10,166 INFO:netcdf_scm:source: /Users/znicholls/Documents/AGCEC/netCDF-SCM/netcdf-scm/tests/test-data/expected-crunching-output/cmip6output/Lmon/CMIP6/CMIP/NCAR
88003 2021-03-18 13:06:10,166 INFO:netcdf_scm:destination: /Users/znicholls/Documents/AGCEC/netCDF-SCM/netcdf-scm/output-examples/wrangled-files-area-sum-units
88003 2021-03-18 13:06:10,166 INFO:netcdf_scm:regexp: .*cSoilFast.*
88003 2021-03-18 13:06:10,166 INFO:netcdf_scm:prefix: None
88003 2021-03-18 13:06:10,166 INFO:netcdf_scm:drs: CMIP6Output
88003 2021-03-18 13:06:10,166 INFO:netcdf_scm:out_format: mag-files
88003 2021-03-18 13:06:10,166 INFO:netcdf_scm:force: True
88003 2021-03-18 13:06:10,168 INFO:netcdf_scm:Finding directories with files
Walking through directories and applying `check_func`: 11it [00:00, 9617.96it/s]
88003 2021-03-18 13:06:10,176 INFO:netcdf_scm:Found 1 directories with files
88003 2021-03-18 13:06:10,177 INFO:netcdf_scm.cli_parallel:Processing in parallel with 4 workers
88003 2021-03-18 13:06:10,177 INFO:netcdf_scm.cli_parallel:Forcing dask to use a single thread when reading
100%|████████████████████████████████████████| 1.00/1.00 [00:04<00:00, 4.44s/it]
[15]:
# NBVAL_IGNORE_OUTPUT
written_files = [
f
for f in Path(
"../../../output-examples/wrangled-files-area-sum-units"
).rglob("*.MAG")
]
wrangled_area_sum_units = pymagicc.io.MAGICCData(str(written_files[0]))
wrangled_area_sum_units.timeseries()
[15]:
climate_model | model | region | scenario | todo | unit | variable | 1957-01-15 12:00:00 | 1957-02-14 00:00:00 | 1957-03-15 12:00:00
---|---|---|---|---|---|---|---|---|---
unspecified | unspecified | World | unspecified | SET | Gt | cSoilFast | 12.79290 | 12.7849 | 12.76610
unspecified | unspecified | World|Land | unspecified | SET | Gt | cSoilFast | 12.79290 | 12.7849 | 12.76610
unspecified | unspecified | World|Northern Hemisphere | unspecified | SET | Gt | cSoilFast | 9.85760 | 9.8760 | 9.89873
unspecified | unspecified | World|Northern Hemisphere|Land | unspecified | SET | Gt | cSoilFast | 9.85760 | 9.8760 | 9.89873
unspecified | unspecified | World|Southern Hemisphere | unspecified | SET | Gt | cSoilFast | 2.93526 | 2.9089 | 2.86740
unspecified | unspecified | World|Southern Hemisphere|Land | unspecified | SET | Gt | cSoilFast | 2.93526 | 2.9089 | 2.86740
[16]:
# NBVAL_IGNORE_OUTPUT
solid_regions = [
"World",
"World|Northern Hemisphere",
"World|Southern Hemisphere",
]
ax = wrangled_area_sum_units.filter(region=solid_regions).lineplot(
hue="region", linestyle="-"
)
wrangled_area_sum_units.filter(region=solid_regions, keep=False).lineplot(
hue="region", linestyle="--", dashes=(5, 7.5), ax=ax
)
[16]:
<AxesSubplot:xlabel='time', ylabel='Gt'>

As one last sanity check, we can make sure that the World total equals the sum of the hemispheric totals to within rounding errors.
[17]:
np.testing.assert_allclose(
wrangled_area_sum_units.filter(region="World")
.timeseries()
.values.squeeze(),
wrangled_area_sum_units.filter(
region=["World|Northern Hemisphere", "World|Southern Hemisphere"]
)
.timeseries()
.sum()
.values.squeeze(),
rtol=1e-5,
)
The wrangling can also include a few basic time operations, e.g. annual means or interpolation onto different grids. The different out-format codes follow the time operation conventions in Pymagicc (see https://pymagicc.readthedocs.io/en/latest/file_conventions.html#namelists). Here we show one example where we take the annual mean as part of the wrangling process.
[18]:
# NBVAL_IGNORE_OUTPUT
!netcdf-scm wrangle \
"../../../tests/test-data/expected-crunching-output/cmip6output/Amon/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/piControl" \
"../../../output-examples/wrangled-files-average-year" \
"notebook example <email address>" \
--force \
--drs "CMIP6Output" \
--out-format "mag-files-average-year-mid-year"
88024 2021-03-18 13:06:19,120 INFO:netcdf_scm:netcdf-scm: 2.0.2+15.g74db9d85.dirty
88024 2021-03-18 13:06:19,120 INFO:netcdf_scm:wrangle_contact: notebook example <email address>
88024 2021-03-18 13:06:19,120 INFO:netcdf_scm:source: /Users/znicholls/Documents/AGCEC/netCDF-SCM/netcdf-scm/tests/test-data/expected-crunching-output/cmip6output/Amon/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/piControl
88024 2021-03-18 13:06:19,120 INFO:netcdf_scm:destination: /Users/znicholls/Documents/AGCEC/netCDF-SCM/netcdf-scm/output-examples/wrangled-files-average-year
88024 2021-03-18 13:06:19,120 INFO:netcdf_scm:regexp: ^(?!.*(fx)).*$
88024 2021-03-18 13:06:19,120 INFO:netcdf_scm:prefix: None
88024 2021-03-18 13:06:19,120 INFO:netcdf_scm:drs: CMIP6Output
88024 2021-03-18 13:06:19,120 INFO:netcdf_scm:out_format: mag-files-average-year-mid-year
88024 2021-03-18 13:06:19,120 INFO:netcdf_scm:force: True
88024 2021-03-18 13:06:19,120 INFO:netcdf_scm:Finding directories with files
Walking through directories and applying `check_func`: 6it [00:00, 9062.23it/s]
88024 2021-03-18 13:06:19,127 INFO:netcdf_scm:Found 1 directories with files
88024 2021-03-18 13:06:19,128 INFO:netcdf_scm.cli_parallel:Processing in parallel with 4 workers
88024 2021-03-18 13:06:19,128 INFO:netcdf_scm.cli_parallel:Forcing dask to use a single thread when reading
0%| | 0.00/1.00 [00:00<?, ?it/s]
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/xarray/coding/times.py:463: SerializationWarning: Unable to decode time axis into full numpy.datetime64 objects, continuing using cftime.datetime objects instead, reason: dates out of range
dtype = _decode_cf_datetime_dtype(data, units, calendar, self.use_cftime)
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/numpy/core/_asarray.py:102: SerializationWarning: Unable to decode time axis into full numpy.datetime64 objects, continuing using cftime.datetime objects instead, reason: dates out of range
return array(a, dtype, copy=False, order=order)
100%|████████████████████████████████████████| 1.00/1.00 [00:04<00:00, 4.82s/it]
[19]:
# NBVAL_IGNORE_OUTPUT
written_files = [
f
for f in Path(
"../../../output-examples/wrangled-files-average-year"
).rglob("*.MAG")
]
wrangled_annual_mean = pymagicc.io.MAGICCData(str(written_files[0]))
wrangled_annual_mean.timeseries()
[19]:
climate_model | model | region | scenario | todo | unit | variable | 2840-07-01 00:00:00 | 2841-07-01 00:00:00 | 2842-07-01 00:00:00 | 2843-07-01 00:00:00 | 2844-07-01 00:00:00 | 2845-07-01 00:00:00 | 2846-07-01 00:00:00 | 2847-07-01 00:00:00 | 2848-07-01 00:00:00 | 2849-07-01 00:00:00 | 2850-07-01 00:00:00 | 2851-07-01 00:00:00 | 2852-07-01 00:00:00 | 2853-07-01 00:00:00 | 2854-07-01 00:00:00 | 2855-07-01 00:00:00 | 2856-07-01 00:00:00 | 2857-07-01 00:00:00 | 2858-07-01 00:00:00 | 2859-07-01 00:00:00
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
unspecified | unspecified | World | unspecified | SET | K | tas | 285.883 | 285.841 | 285.847 | 285.860 | 286.026 | 285.880 | 285.612 | 285.717 | 285.700 | 285.913 | 285.862 | 285.964 | 286.154 | 285.959 | 286.096 | 286.110 | 286.074 | 285.771 | 285.768 | 285.969
unspecified | unspecified | World|Northern Hemisphere | unspecified | SET | K | tas | 286.566 | 286.482 | 286.411 | 286.538 | 286.717 | 286.539 | 286.240 | 286.387 | 286.297 | 286.521 | 286.558 | 286.596 | 286.700 | 286.628 | 286.699 | 286.866 | 286.802 | 286.469 | 286.387 | 286.682
unspecified | unspecified | World|Southern Hemisphere | unspecified | SET | K | tas | 285.184 | 285.185 | 285.270 | 285.167 | 285.320 | 285.206 | 284.970 | 285.033 | 285.090 | 285.292 | 285.150 | 285.318 | 285.596 | 285.275 | 285.479 | 285.338 | 285.331 | 285.057 | 285.135 | 285.240
unspecified | unspecified | World|Land | unspecified | SET | K | tas | 279.502 | 279.494 | 279.590 | 279.505 | 279.732 | 279.534 | 279.048 | 279.332 | 279.280 | 279.630 | 279.474 | 279.452 | 279.771 | 279.458 | 279.714 | 279.814 | 279.724 | 279.463 | 279.369 | 279.686
unspecified | unspecified | World|Ocean | unspecified | SET | K | tas | 288.453 | 288.397 | 288.366 | 288.419 | 288.561 | 288.435 | 288.255 | 288.289 | 288.286 | 288.444 | 288.434 | 288.587 | 288.725 | 288.577 | 288.666 | 288.646 | 288.632 | 288.311 | 288.344 | 288.499
unspecified | unspecified | World|Northern Hemisphere|Land | unspecified | SET | K | tas | 281.049 | 281.014 | 281.052 | 281.100 | 281.308 | 281.099 | 280.672 | 280.947 | 280.782 | 281.140 | 281.171 | 280.932 | 281.135 | 281.040 | 281.087 | 281.490 | 281.301 | 281.107 | 280.946 | 281.283
unspecified | unspecified | World|Southern Hemisphere|Land | unspecified | SET | K | tas | 276.230 | 276.279 | 276.497 | 276.131 | 276.400 | 276.223 | 275.613 | 275.917 | 276.103 | 276.438 | 275.885 | 276.322 | 276.885 | 276.113 | 276.811 | 276.270 | 276.389 | 275.986 | 276.034 | 276.309
unspecified | unspecified | World|Northern Hemisphere|Ocean | unspecified | SET | K | tas | 290.029 | 289.914 | 289.774 | 289.950 | 290.112 | 289.952 | 289.734 | 289.801 | 289.759 | 289.899 | 289.939 | 290.151 | 290.192 | 290.134 | 290.221 | 290.239 | 290.254 | 289.833 | 289.801 | 290.070
unspecified | unspecified | World|Southern Hemisphere|Ocean | unspecified | SET | K | tas | 287.236 | 287.225 | 287.279 | 287.237 | 287.363 | 287.264 | 287.113 | 287.121 | 287.149 | 287.320 | 287.273 | 287.379 | 287.592 | 287.374 | 287.465 | 287.416 | 287.379 | 287.135 | 287.220 | 287.286
unspecified | unspecified | World|North Atlantic Ocean | unspecified | SET | K | tas | 291.030 | 291.114 | 290.932 | 290.894 | 291.015 | 290.987 | 290.787 | 290.591 | 290.833 | 290.852 | 290.752 | 290.677 | 291.057 | 291.026 | 291.205 | 291.117 | 291.371 | 291.050 | 290.847 | 291.153
unspecified | unspecified | World|El Nino N3.4 | unspecified | SET | K | tas | 297.656 | 296.947 | 296.951 | 297.666 | 297.818 | 297.051 | 296.019 | 296.918 | 296.887 | 297.067 | 297.055 | 298.488 | 297.646 | 297.392 | 298.025 | 297.780 | 296.800 | 295.766 | 297.053 | 298.179
[20]:
# NBVAL_IGNORE_OUTPUT
wrangled_annual_mean.lineplot(hue="region")
[20]:
<AxesSubplot:xlabel='time', ylabel='K'>

[21]:
# NBVAL_IGNORE_OUTPUT
fig = plt.figure(figsize=(16, 9))
ax = fig.add_subplot(221)
wrangled_annual_mean.filter(region=["World", "World|*Hemisphere"]).lineplot(
hue="region", ax=ax
)
ax = fig.add_subplot(222, sharey=ax, sharex=ax)
wrangled_annual_mean.filter(
region=["World", "World|Land", "World|Ocean"]
).lineplot(hue="region", ax=ax)
ax = fig.add_subplot(223, sharey=ax, sharex=ax)
wrangled_annual_mean.filter(region=["World", "World|*Hemis*|*"]).lineplot(
hue="region", ax=ax
)
ax = fig.add_subplot(224, sharey=ax, sharex=ax)
wrangled_annual_mean.filter(
region=["World", "World|*El*", "World|*Ocean*"]
).lineplot(hue="region", ax=ax)
[21]:
<AxesSubplot:xlabel='time', ylabel='K'>

Weights¶
In this notebook we demonstrate all of netCDF-SCM’s known weightings. These weights are used when taking area averages for different SCM boxes, e.g. the ocean/land boxes or the El Nino box.
Note: here we use the “last resort” land surface fraction values. However, if land surface fraction data is available then that is used to do land/ocean weighting rather than the “last resort” values.
This notebook is set out as follows:
we show the default weights
we show the different available options for combining area and surface fraction information
we show all our inbuilt weights
we show how the user can define their own custom weights.
Imports¶
[1]:
# NBVAL_IGNORE_OUTPUT
from os.path import join
import iris
import iris.quickplot as qplt
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import regionmask
from netcdf_scm.iris_cube_wrappers import CMIP6OutputCube
from netcdf_scm.weights import (
AreaSurfaceFractionWeightCalculator,
AreaWeightCalculator,
get_weights_for_area,
WEIGHTS_FUNCTIONS_WITHOUT_AREA_WEIGHTING,
)
[2]:
plt.style.use("bmh")
%matplotlib inline
Data path¶
Here we use our test data.
[3]:
DATA_PATH_TEST = join("..", "..", "..", "tests", "test-data")
DATA_PATH_TEST_CMIP6_ROOT = join(DATA_PATH_TEST, "cmip6output")
Load the cube¶
[4]:
example = CMIP6OutputCube()
example.load_data_in_directory(
join(
DATA_PATH_TEST_CMIP6_ROOT,
# "CMIP6/ScenarioMIP/BCC/BCC-CSM2-MR/ssp126/r1i1p1f1/Amon/example/gn/v20190314",
"CMIP6/CMIP/NCAR/CESM2/historical/r10i1p1f1/Amon/tas/gn/v20190313",
)
)
Interpolate the cube to get higher resolution data.
[5]:
sample_points = [
("longitude", np.arange(0, 360, 2)),
("latitude", np.arange(-90, 90 + 1, 2)),
]
example.cube = example.cube.interpolate(sample_points, iris.analysis.Linear())
Weights¶
By default, only land/ocean and hemispheric weights are considered.
[6]:
# NBVAL_IGNORE_OUTPUT
default_weights = example.get_scm_timeseries_weights()
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/analysis/cartography.py:394: UserWarning: Using DEFAULT_SPHERICAL_EARTH_RADIUS.
warnings.warn("Using DEFAULT_SPHERICAL_EARTH_RADIUS.")
[7]:
# NBVAL_IGNORE_OUTPUT
def plot_weights(weights_to_plot, constraint=None, axes=None, **kwargs):
for i, (label, weights) in enumerate(weights_to_plot.items()):
if axes is None:
ax = plt.figure().add_subplot(111)
else:
ax = axes[i]
weight_cube = example.cube.collapsed("time", iris.analysis.MEAN)
weight_cube.data = weights
weight_cube.units = ""
if constraint is not None:
weight_cube = weight_cube.extract(constraint)
plt.sca(ax)
qplt.pcolormesh(
weight_cube, **kwargs,
)
plt.gca().set_title(label)
plt.gca().coastlines()
plot_weights(default_weights)
By default, the weights are calculated as the combination of area and surface fractions using netcdf_scm.weights.AreaSurfaceFractionWeightCalculator.
[8]:
# NBVAL_IGNORE_OUTPUT
print(AreaSurfaceFractionWeightCalculator.__doc__)
Calculates weights which are both area and surface fraction weighted
.. math::
w(lat, lon) = a(lat, lon) \\times s(lat, lon)
where :math:`w(lat, lon)` is the weight of the cell at given latitude and
longitude, :math:`a` is area of the cell and :math:`s` is the surface
fraction of the cell (e.g. fraction of ocean area for ocean based regions).
For land/ocean weights, this causes cells on coastlines to have weights less than their area weight, because they are not fully land or ocean.
The user can instead use netcdf_scm.weights.AreaWeightCalculator, which focusses on area weights but removes any cells that have a surface fraction of zero.
[9]:
# NBVAL_IGNORE_OUTPUT
print(AreaWeightCalculator.__doc__)
Calculates weights which are area weighted but surface fraction aware.
This means that any cells which have a surface fraction of zero will
receive zero weight, otherwise cells are purely area weighted.
.. math::
w(lat, lon) = \\begin{cases}
a(lat, lon), & s(lat, lon) > 0 \\\\
0, & s(lat, lon) = 0
\\end{cases}
where :math:`w(lat, lon)` is the weight of the cell at given latitude and
longitude, :math:`a` is area of the cell and :math:`s` is the surface
fraction of the cell (e.g. fraction of ocean area for ocean based regions).
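To make the difference concrete, here is a minimal sketch of the two weighting schemes using plain NumPy. The grid, areas and land fractions are invented for illustration; this is not netCDF-SCM output.
import numpy as np

# hypothetical cell areas (m^2) and land surface fractions for a 2x2 grid
area = np.array([[1.0e10, 1.1e10], [1.2e10, 1.3e10]])
land_fraction = np.array([[1.0, 0.4], [0.0, 0.7]])  # 0.4 and 0.7 represent coastal cells

# AreaSurfaceFractionWeightCalculator-style weights: w = a * s
area_surface_fraction_weights = area * land_fraction

# AreaWeightCalculator-style weights: full area wherever s > 0, zero elsewhere
area_only_weights = np.where(land_fraction > 0, area, 0.0)
The coastal cells receive reduced weight under the first scheme but their full area weight under the second.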
[10]:
# NBVAL_IGNORE_OUTPUT
area_weights = example.get_scm_timeseries_weights(cell_weights="area-only")
/Users/znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/analysis/cartography.py:394: UserWarning: Using DEFAULT_SPHERICAL_EARTH_RADIUS.
warnings.warn("Using DEFAULT_SPHERICAL_EARTH_RADIUS.")
[11]:
# NBVAL_IGNORE_OUTPUT
fig, axes = plt.subplots(figsize=(16, 9), nrows=2, ncols=2)
for i, (w, title) in enumerate(
((default_weights, "Default"), (area_weights, "No land fraction"))
):
plt_weights = {k: w[k] for k in ["World|Ocean", "World|Land"]}
zoom_constraint = iris.Constraint(
latitude=lambda cell: -45 < cell < -25
) & iris.Constraint(longitude=lambda cell: 120 < cell < 160)
plot_weights(
plt_weights, constraint=zoom_constraint, axes=[axes[0][i], axes[1][i]],
)
cf = plt.gcf()
for i, (w, title) in enumerate(
((default_weights, "Default"), (area_weights, "Area only"))
):
title_ax = cf.axes[i * 4]
title_ax.set_title("{}\n{}".format(title, title_ax.get_title()))

The default weights do not include all of the inbuilt weights. We also provide weights for the IPCC AR6 regions, as defined in Iturbide et al. (2020), as well as for countries (at the 50m scale) as defined by Natural Earth. For both of these, we use the regionmask implementation.
The regionmask names can be inspected as shown below. Note that the abbreviations for the countries are not unique.
[12]:
regionmask_countries = (
pd.DataFrame(
{
"name": regionmask.defined_regions.natural_earth.countries_50.names,
"abbreviation": regionmask.defined_regions.natural_earth.countries_50.abbrevs,
}
)
.sort_values(by="name")
.reset_index(drop=True)
)
regionmask_countries
[12]:
name | abbreviation | |
---|---|---|
0 | Afghanistan | AF |
1 | Albania | AL |
2 | Algeria | DZ |
3 | American Samoa | AS |
4 | Andorra | AND |
... | ... | ... |
236 | Yemen | YE |
237 | Zambia | ZM |
238 | Zimbabwe | ZW |
239 | eSwatini | SW |
240 | Åland | AI |
241 rows × 2 columns
Below we show a selection of plots for the regions we include.
[13]:
selection_inbuilt_weights = example.get_scm_timeseries_weights(
regions=[
"World",
"World|Northern Hemisphere",
"World|Southern Hemisphere",
"World|Land",
"World|Ocean",
"World|Northern Hemisphere|Land",
"World|Southern Hemisphere|Land",
"World|Northern Hemisphere|Ocean",
"World|Southern Hemisphere|Ocean",
"World|North Atlantic Ocean",
"World|El Nino N3.4",
"World|AR6|GIC",
"World|AR6|NWN",
"World|AR6|NEN",
"World|AR6|WNA",
"World|AR6|SSA",
"World|AR6|NEU",
]
+ [
"World|Natural Earth 50m|{}".format(c)
for c in [
"Australia",
"Austria",
"China",
"New Zealand",
"United States of America",
# this fails as the region is tiny and
# our data is not high-resolution enough to capture it
"Vatican",
"Vietnam",
]
]
)
/Users/znicholls/Documents/AGCEC/netCDF-SCM/netcdf-scm/src/netcdf_scm/weights/__init__.py:869: UserWarning: Failed to create 'World|Natural Earth 50m|Vatican' weights: All weights are zero for region: `World|Natural Earth 50m|Vatican`
warnings.warn(warn_str)
[14]:
# NBVAL_IGNORE_OUTPUT
plot_weights(selection_inbuilt_weights)
<ipython-input-7-e10806626a7c>:5: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`).
ax = plt.figure().add_subplot(111)
[15]:
# full list of available regions
sorted(list(WEIGHTS_FUNCTIONS_WITHOUT_AREA_WEIGHTING.keys()))
[15]:
['World',
'World|AR6|ARO',
'World|AR6|ARP',
'World|AR6|ARS',
'World|AR6|BOB',
'World|AR6|CAF',
'World|AR6|CAR',
'World|AR6|CAU',
'World|AR6|CNA',
'World|AR6|EAN',
'World|AR6|EAO',
'World|AR6|EAS',
'World|AR6|EAU',
'World|AR6|ECA',
'World|AR6|EEU',
'World|AR6|EIO',
'World|AR6|ENA',
'World|AR6|EPO',
'World|AR6|ESAF',
'World|AR6|ESB',
'World|AR6|GIC',
'World|AR6|MDG',
'World|AR6|MED',
'World|AR6|NAO',
'World|AR6|NAU',
'World|AR6|NCA',
'World|AR6|NEAF',
'World|AR6|NEN',
'World|AR6|NES',
'World|AR6|NEU',
'World|AR6|NPO',
'World|AR6|NSA',
'World|AR6|NWN',
'World|AR6|NWS',
'World|AR6|NZ',
'World|AR6|RAR',
'World|AR6|RFE',
'World|AR6|SAH',
'World|AR6|SAM',
'World|AR6|SAO',
'World|AR6|SAS',
'World|AR6|SAU',
'World|AR6|SCA',
'World|AR6|SEA',
'World|AR6|SEAF',
'World|AR6|SES',
'World|AR6|SIO',
'World|AR6|SOO',
'World|AR6|SPO',
'World|AR6|SSA',
'World|AR6|SWS',
'World|AR6|TIB',
'World|AR6|WAF',
'World|AR6|WAN',
'World|AR6|WCA',
'World|AR6|WCE',
'World|AR6|WNA',
'World|AR6|WSAF',
'World|AR6|WSB',
'World|El Nino N3.4',
'World|Land',
'World|Natural Earth 50m|Afghanistan',
'World|Natural Earth 50m|Albania',
'World|Natural Earth 50m|Algeria',
'World|Natural Earth 50m|American Samoa',
'World|Natural Earth 50m|Andorra',
'World|Natural Earth 50m|Angola',
'World|Natural Earth 50m|Anguilla',
'World|Natural Earth 50m|Antarctica',
'World|Natural Earth 50m|Antigua and Barb.',
'World|Natural Earth 50m|Argentina',
'World|Natural Earth 50m|Armenia',
'World|Natural Earth 50m|Aruba',
'World|Natural Earth 50m|Ashmore and Cartier Is.',
'World|Natural Earth 50m|Australia',
'World|Natural Earth 50m|Austria',
'World|Natural Earth 50m|Azerbaijan',
'World|Natural Earth 50m|Bahamas',
'World|Natural Earth 50m|Bahrain',
'World|Natural Earth 50m|Bangladesh',
'World|Natural Earth 50m|Barbados',
'World|Natural Earth 50m|Belarus',
'World|Natural Earth 50m|Belgium',
'World|Natural Earth 50m|Belize',
'World|Natural Earth 50m|Benin',
'World|Natural Earth 50m|Bermuda',
'World|Natural Earth 50m|Bhutan',
'World|Natural Earth 50m|Bolivia',
'World|Natural Earth 50m|Bosnia and Herz.',
'World|Natural Earth 50m|Botswana',
'World|Natural Earth 50m|Br. Indian Ocean Ter.',
'World|Natural Earth 50m|Brazil',
'World|Natural Earth 50m|British Virgin Is.',
'World|Natural Earth 50m|Brunei',
'World|Natural Earth 50m|Bulgaria',
'World|Natural Earth 50m|Burkina Faso',
'World|Natural Earth 50m|Burundi',
'World|Natural Earth 50m|Cabo Verde',
'World|Natural Earth 50m|Cambodia',
'World|Natural Earth 50m|Cameroon',
'World|Natural Earth 50m|Canada',
'World|Natural Earth 50m|Cayman Is.',
'World|Natural Earth 50m|Central African Rep.',
'World|Natural Earth 50m|Chad',
'World|Natural Earth 50m|Chile',
'World|Natural Earth 50m|China',
'World|Natural Earth 50m|Colombia',
'World|Natural Earth 50m|Comoros',
'World|Natural Earth 50m|Congo',
'World|Natural Earth 50m|Cook Is.',
'World|Natural Earth 50m|Costa Rica',
'World|Natural Earth 50m|Croatia',
'World|Natural Earth 50m|Cuba',
'World|Natural Earth 50m|Curaçao',
'World|Natural Earth 50m|Cyprus',
'World|Natural Earth 50m|Czechia',
"World|Natural Earth 50m|Côte d'Ivoire",
'World|Natural Earth 50m|Dem. Rep. Congo',
'World|Natural Earth 50m|Denmark',
'World|Natural Earth 50m|Djibouti',
'World|Natural Earth 50m|Dominica',
'World|Natural Earth 50m|Dominican Rep.',
'World|Natural Earth 50m|Ecuador',
'World|Natural Earth 50m|Egypt',
'World|Natural Earth 50m|El Salvador',
'World|Natural Earth 50m|Eq. Guinea',
'World|Natural Earth 50m|Eritrea',
'World|Natural Earth 50m|Estonia',
'World|Natural Earth 50m|Ethiopia',
'World|Natural Earth 50m|Faeroe Is.',
'World|Natural Earth 50m|Falkland Is.',
'World|Natural Earth 50m|Fiji',
'World|Natural Earth 50m|Finland',
'World|Natural Earth 50m|Fr. Polynesia',
'World|Natural Earth 50m|Fr. S. Antarctic Lands',
'World|Natural Earth 50m|France',
'World|Natural Earth 50m|Gabon',
'World|Natural Earth 50m|Gambia',
'World|Natural Earth 50m|Georgia',
'World|Natural Earth 50m|Germany',
'World|Natural Earth 50m|Ghana',
'World|Natural Earth 50m|Greece',
'World|Natural Earth 50m|Greenland',
'World|Natural Earth 50m|Grenada',
'World|Natural Earth 50m|Guam',
'World|Natural Earth 50m|Guatemala',
'World|Natural Earth 50m|Guernsey',
'World|Natural Earth 50m|Guinea',
'World|Natural Earth 50m|Guinea-Bissau',
'World|Natural Earth 50m|Guyana',
'World|Natural Earth 50m|Haiti',
'World|Natural Earth 50m|Heard I. and McDonald Is.',
'World|Natural Earth 50m|Honduras',
'World|Natural Earth 50m|Hong Kong',
'World|Natural Earth 50m|Hungary',
'World|Natural Earth 50m|Iceland',
'World|Natural Earth 50m|India',
'World|Natural Earth 50m|Indian Ocean Ter.',
'World|Natural Earth 50m|Indonesia',
'World|Natural Earth 50m|Iran',
'World|Natural Earth 50m|Iraq',
'World|Natural Earth 50m|Ireland',
'World|Natural Earth 50m|Isle of Man',
'World|Natural Earth 50m|Israel',
'World|Natural Earth 50m|Italy',
'World|Natural Earth 50m|Jamaica',
'World|Natural Earth 50m|Japan',
'World|Natural Earth 50m|Jersey',
'World|Natural Earth 50m|Jordan',
'World|Natural Earth 50m|Kazakhstan',
'World|Natural Earth 50m|Kenya',
'World|Natural Earth 50m|Kiribati',
'World|Natural Earth 50m|Kosovo',
'World|Natural Earth 50m|Kuwait',
'World|Natural Earth 50m|Kyrgyzstan',
'World|Natural Earth 50m|Laos',
'World|Natural Earth 50m|Latvia',
'World|Natural Earth 50m|Lebanon',
'World|Natural Earth 50m|Lesotho',
'World|Natural Earth 50m|Liberia',
'World|Natural Earth 50m|Libya',
'World|Natural Earth 50m|Liechtenstein',
'World|Natural Earth 50m|Lithuania',
'World|Natural Earth 50m|Luxembourg',
'World|Natural Earth 50m|Macao',
'World|Natural Earth 50m|Macedonia',
'World|Natural Earth 50m|Madagascar',
'World|Natural Earth 50m|Malawi',
'World|Natural Earth 50m|Malaysia',
'World|Natural Earth 50m|Maldives',
'World|Natural Earth 50m|Mali',
'World|Natural Earth 50m|Malta',
'World|Natural Earth 50m|Marshall Is.',
'World|Natural Earth 50m|Mauritania',
'World|Natural Earth 50m|Mauritius',
'World|Natural Earth 50m|Mexico',
'World|Natural Earth 50m|Micronesia',
'World|Natural Earth 50m|Moldova',
'World|Natural Earth 50m|Monaco',
'World|Natural Earth 50m|Mongolia',
'World|Natural Earth 50m|Montenegro',
'World|Natural Earth 50m|Montserrat',
'World|Natural Earth 50m|Morocco',
'World|Natural Earth 50m|Mozambique',
'World|Natural Earth 50m|Myanmar',
'World|Natural Earth 50m|N. Cyprus',
'World|Natural Earth 50m|N. Mariana Is.',
'World|Natural Earth 50m|Namibia',
'World|Natural Earth 50m|Nauru',
'World|Natural Earth 50m|Nepal',
'World|Natural Earth 50m|Netherlands',
'World|Natural Earth 50m|New Caledonia',
'World|Natural Earth 50m|New Zealand',
'World|Natural Earth 50m|Nicaragua',
'World|Natural Earth 50m|Niger',
'World|Natural Earth 50m|Nigeria',
'World|Natural Earth 50m|Niue',
'World|Natural Earth 50m|Norfolk Island',
'World|Natural Earth 50m|North Korea',
'World|Natural Earth 50m|Norway',
'World|Natural Earth 50m|Oman',
'World|Natural Earth 50m|Pakistan',
'World|Natural Earth 50m|Palau',
'World|Natural Earth 50m|Palestine',
'World|Natural Earth 50m|Panama',
'World|Natural Earth 50m|Papua New Guinea',
'World|Natural Earth 50m|Paraguay',
'World|Natural Earth 50m|Peru',
'World|Natural Earth 50m|Philippines',
'World|Natural Earth 50m|Pitcairn Is.',
'World|Natural Earth 50m|Poland',
'World|Natural Earth 50m|Portugal',
'World|Natural Earth 50m|Puerto Rico',
'World|Natural Earth 50m|Qatar',
'World|Natural Earth 50m|Romania',
'World|Natural Earth 50m|Russia',
'World|Natural Earth 50m|Rwanda',
'World|Natural Earth 50m|S. Geo. and the Is.',
'World|Natural Earth 50m|S. Sudan',
'World|Natural Earth 50m|Saint Helena',
'World|Natural Earth 50m|Saint Lucia',
'World|Natural Earth 50m|Samoa',
'World|Natural Earth 50m|San Marino',
'World|Natural Earth 50m|Saudi Arabia',
'World|Natural Earth 50m|Senegal',
'World|Natural Earth 50m|Serbia',
'World|Natural Earth 50m|Seychelles',
'World|Natural Earth 50m|Siachen Glacier',
'World|Natural Earth 50m|Sierra Leone',
'World|Natural Earth 50m|Singapore',
'World|Natural Earth 50m|Sint Maarten',
'World|Natural Earth 50m|Slovakia',
'World|Natural Earth 50m|Slovenia',
'World|Natural Earth 50m|Solomon Is.',
'World|Natural Earth 50m|Somalia',
'World|Natural Earth 50m|Somaliland',
'World|Natural Earth 50m|South Africa',
'World|Natural Earth 50m|South Korea',
'World|Natural Earth 50m|Spain',
'World|Natural Earth 50m|Sri Lanka',
'World|Natural Earth 50m|St-Barthélemy',
'World|Natural Earth 50m|St-Martin',
'World|Natural Earth 50m|St. Kitts and Nevis',
'World|Natural Earth 50m|St. Pierre and Miquelon',
'World|Natural Earth 50m|St. Vin. and Gren.',
'World|Natural Earth 50m|Sudan',
'World|Natural Earth 50m|Suriname',
'World|Natural Earth 50m|Sweden',
'World|Natural Earth 50m|Switzerland',
'World|Natural Earth 50m|Syria',
'World|Natural Earth 50m|São Tomé and Principe',
'World|Natural Earth 50m|Taiwan',
'World|Natural Earth 50m|Tajikistan',
'World|Natural Earth 50m|Tanzania',
'World|Natural Earth 50m|Thailand',
'World|Natural Earth 50m|Timor-Leste',
'World|Natural Earth 50m|Togo',
'World|Natural Earth 50m|Tonga',
'World|Natural Earth 50m|Trinidad and Tobago',
'World|Natural Earth 50m|Tunisia',
'World|Natural Earth 50m|Turkey',
'World|Natural Earth 50m|Turkmenistan',
'World|Natural Earth 50m|Turks and Caicos Is.',
'World|Natural Earth 50m|U.S. Virgin Is.',
'World|Natural Earth 50m|Uganda',
'World|Natural Earth 50m|Ukraine',
'World|Natural Earth 50m|United Arab Emirates',
'World|Natural Earth 50m|United Kingdom',
'World|Natural Earth 50m|United States of America',
'World|Natural Earth 50m|Uruguay',
'World|Natural Earth 50m|Uzbekistan',
'World|Natural Earth 50m|Vanuatu',
'World|Natural Earth 50m|Vatican',
'World|Natural Earth 50m|Venezuela',
'World|Natural Earth 50m|Vietnam',
'World|Natural Earth 50m|W. Sahara',
'World|Natural Earth 50m|Wallis and Futuna Is.',
'World|Natural Earth 50m|Yemen',
'World|Natural Earth 50m|Zambia',
'World|Natural Earth 50m|Zimbabwe',
'World|Natural Earth 50m|eSwatini',
'World|Natural Earth 50m|Åland',
'World|North Atlantic Ocean',
'World|Northern Hemisphere',
'World|Northern Hemisphere|Land',
'World|Northern Hemisphere|Ocean',
'World|Ocean',
'World|Southern Hemisphere',
'World|Southern Hemisphere|Land',
'World|Southern Hemisphere|Ocean']
As a user, you can also define your own weights. Simply add them to netcdf_scm.weights.WEIGHTS_FUNCTIONS_WITHOUT_AREA_WEIGHTING and then use them in your get_scm_timeseries_weights call, as shown below.
[16]:
WEIGHTS_FUNCTIONS_WITHOUT_AREA_WEIGHTING["custom mask"] = get_weights_for_area(
-60, 100, -10, 330
)
WEIGHTS_FUNCTIONS_WITHOUT_AREA_WEIGHTING[
"Northern Atlantic area bounds"
] = get_weights_for_area(0, -80, 65, 0)
[17]:
custom_weights = example.get_scm_timeseries_weights(
regions=[
"World|El Nino N3.4",
"custom mask",
"World|Land",
"Northern Atlantic area bounds",
]
)
[18]:
plot_weights(custom_weights)
Default land/ocean mask¶
When crunching data with netCDF-SCM, we want to cut files into (at least) Northern/Southern Hemisphere, land/ocean boxes. However, we don’t always have access to land-surface fraction information from the raw model output. In these cases, we simply apply a default land/ocean mask instead. In this notebook, we show how this mask looks and how it was derived.
Imports¶
[1]:
import iris
import numpy as np
[2]:
from matplotlib import pyplot as plt
import iris.plot as iplt
import iris.quickplot as qplt
Default mask¶
Our default mask lives in netcdf_scm.weights. We can access it using netcdf_scm.weights.get_default_sftlf_cube.
[3]:
from netcdf_scm.weights import get_default_sftlf_cube
[4]:
default_sftlf = get_default_sftlf_cube()
[5]:
# NBVAL_IGNORE_OUTPUT
fig = plt.figure(figsize=(16, 9))
qplt.pcolormesh(default_sftlf,);
/data/ubuntu-znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1192: UserWarning: Coordinate 'longitude' is not bounded, guessing contiguous bounds.
warnings.warn('Coordinate {!r} is not bounded, guessing '
/data/ubuntu-znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1192: UserWarning: Coordinate 'latitude' is not bounded, guessing contiguous bounds.
warnings.warn('Coordinate {!r} is not bounded, guessing '

[6]:
zoomed = default_sftlf.extract(
iris.Constraint(latitude=lambda cell: -45 < cell < -25)
& iris.Constraint(longitude=lambda cell: 120 < cell < 160)
)
[7]:
# NBVAL_IGNORE_OUTPUT
fig = plt.figure(figsize=(16, 9))
qplt.pcolormesh(zoomed,);
/data/ubuntu-znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1192: UserWarning: Coordinate 'longitude' is not bounded, guessing contiguous bounds.
warnings.warn('Coordinate {!r} is not bounded, guessing '
/data/ubuntu-znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1192: UserWarning: Coordinate 'latitude' is not bounded, guessing contiguous bounds.
warnings.warn('Coordinate {!r} is not bounded, guessing '

Deriving the mask¶
To derive the mask, we simply use the mask from the IPSL-CM6A-LR model in CMIP6.
[8]:
source_file = "../../../tests/test-data/cmip6output/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/fx/sftlf/gr/v20180803/sftlf_fx_IPSL-CM6A-LR_historical_r1i1p1f1_gr.nc"
[9]:
comp_cube = iris.load_cube(source_file)
/data/ubuntu-znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/fileformats/cf.py:803: UserWarning: Missing CF-netCDF measure variable 'areacella', referenced by netCDF variable 'sftlf'
warnings.warn(message % (variable_name, nc_var_name))
[10]:
# NBVAL_IGNORE_OUTPUT
fig = plt.figure(figsize=(16, 9))
qplt.pcolormesh(comp_cube);
/data/ubuntu-znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1192: UserWarning: Coordinate 'longitude' is not bounded, guessing contiguous bounds.
warnings.warn('Coordinate {!r} is not bounded, guessing '
/data/ubuntu-znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1192: UserWarning: Coordinate 'latitude' is not bounded, guessing contiguous bounds.
warnings.warn('Coordinate {!r} is not bounded, guessing '

[11]:
sample_points = [
("longitude", np.arange(0.5, 360, 1)),
("latitude", np.arange(-89.5, 90, 1)),
]
[12]:
comp_cube_interp = comp_cube.interpolate(sample_points, iris.analysis.Linear())
comp_cube_interp.attributes[
"history"
] = "Interpolated to a 1deg x 1deg grid using iris.interpolate with linear interpolation"
comp_cube_interp.attributes[
"title"
] = "Default land area fraction assumption in netcdf-scm. Base on {}".format(
comp_cube_interp.attributes["title"]
)
[13]:
iris.save(comp_cube_interp, "default_weights.nc")
!ncdump -h default_weights.nc
netcdf default_weights {
dimensions:
lat = 180 ;
lon = 360 ;
string8 = 8 ;
variables:
float sftlf(lat, lon) ;
sftlf:standard_name = "land_area_fraction" ;
sftlf:long_name = "Land Area Fraction" ;
sftlf:units = "%" ;
sftlf:cell_methods = "area: mean" ;
sftlf:coordinates = "type" ;
double lat(lat) ;
lat:axis = "Y" ;
lat:units = "degrees_north" ;
lat:standard_name = "latitude" ;
lat:long_name = "Latitude" ;
double lon(lon) ;
lon:axis = "X" ;
lon:units = "degrees_east" ;
lon:standard_name = "longitude" ;
lon:long_name = "Longitude" ;
char type(string8) ;
type:units = "1" ;
type:standard_name = "area_type" ;
type:long_name = "Land area type" ;
// global attributes:
:CMIP6_CV_version = "cv=6.2.3.5-2-g63b123e" ;
:EXPID = "historical" ;
:NCO = "\"4.6.0\"" ;
:activity_id = "CMIP" ;
:branch_method = "standard" ;
:branch_time_in_child = 0. ;
:branch_time_in_parent = 21914. ;
:contact = "ipsl-cmip6@listes.ipsl.fr" ;
:creation_date = "2018-07-11T07:27:04Z" ;
:data_specs_version = "01.00.21" ;
:description = "Land Area Fraction" ;
:dr2xml_md5sum = "f1e40c1fc5d8281f865f72fbf4e38f9d" ;
:dr2xml_version = "1.11" ;
:experiment = "all-forcing simulation of the recent past" ;
:experiment_id = "historical" ;
:forcing_index = 1 ;
:frequency = "fx" ;
:further_info_url = "https://furtherinfo.es-doc.org/CMIP6.IPSL.IPSL-CM6A-LR.historical.none.r1i1p1f1" ;
:grid = "LMDZ grid" ;
:grid_label = "gr" ;
:history = "Interpolated to a 1deg x 1deg grid using iris.interpolate with linear interpolation" ;
:initialization_index = 1 ;
:institution = "Institut Pierre Simon Laplace, Paris 75252, France" ;
:institution_id = "IPSL" ;
:license = "CMIP6 model data produced by IPSL is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (https://creativecommons.org/licenses). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file) and at https://cmc.ipsl.fr/. The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law." ;
:mip_era = "CMIP6" ;
:model_version = "6.1.5" ;
:name = "/ccc/work/cont003/gencmip6/p86caub/IGCM_OUT/IPSLCM6/PROD/historical/CM61-LR-hist-03.1910/CMIP6/ATM/sftlf_fx_IPSL-CM6A-LR_historical_r1i1p1f1_gr" ;
:nominal_resolution = "250 km" ;
:online_operation = "once" ;
:parent_activity_id = "CMIP" ;
:parent_experiment_id = "piControl" ;
:parent_mip_era = "CMIP6" ;
:parent_source_id = "IPSL-CM6A-LR" ;
:parent_time_units = "days since 1850-01-01 00:00:00" ;
:parent_variant_label = "r1i1p1f1" ;
:physics_index = 1 ;
:product = "model-output" ;
:realization_index = 1 ;
:realm = "atmos" ;
:source = "IPSL-CM6A-LR (2017): atmos: LMDZ (NPv6, N96; 144 x 143 longitude/latitude; 79 levels; top level 40000 m) land: ORCHIDEE (v2.0, Water/Carbon/Energy mode) ocean: NEMO-OPA (eORCA1.3, tripolar primarily 1deg; 362 x 332 longitude/latitude; 75 levels; top grid cell 0-2 m) ocnBgchem: NEMO-PISCES seaIce: NEMO-LIM3" ;
:source_id = "IPSL-CM6A-LR" ;
:source_type = "AOGCM BGC" ;
:sub_experiment = "none" ;
:sub_experiment_id = "none" ;
:table_id = "fx" ;
:title = "Default land area fraction assumption in netcdf-scm. Based on IPSL-CM6A-LR model output prepared for CMIP6 / CMIP historical" ;
:tracking_id = "hdl:21.14100/cc6c4852-271d-4c5a-adc3-42530ef19550" ;
:variable_id = "sftlf" ;
:variant_label = "r1i1p1f1" ;
:Conventions = "CF-1.7" ;
}
[14]:
# NBVAL_IGNORE_OUTPUT
fig = plt.figure(figsize=(16, 9))
qplt.pcolormesh(comp_cube_interp);
/data/ubuntu-znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1192: UserWarning: Coordinate 'longitude' is not bounded, guessing contiguous bounds.
warnings.warn('Coordinate {!r} is not bounded, guessing '
/data/ubuntu-znicholls/miniconda3/envs/netcdf-scm/lib/python3.9/site-packages/iris/coords.py:1192: UserWarning: Coordinate 'latitude' is not bounded, guessing contiguous bounds.
warnings.warn('Coordinate {!r} is not bounded, guessing '

[15]:
comp_cube_regrid = comp_cube.regrid(default_sftlf, iris.analysis.Linear())
As expected, the default mask is more or less identical to the IPSL mask, even with regridding.
[16]:
# NBVAL_IGNORE_OUTPUT
fig = plt.figure(figsize=(16, 9))
qplt.pcolormesh((default_sftlf - comp_cube_regrid));

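If you prefer a number to a plot, a quick check of the largest remaining difference can be done as below (a rough sketch; the exact value will depend on the interpolation):
import numpy as np

# maximum absolute difference between the default mask and the regridded IPSL mask, in %
max_abs_diff = np.abs((default_sftlf - comp_cube_regrid).data).max()
print(max_abs_diff)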
Miscellaneous¶
Year zero handling¶
The CMIP6 historical concentration data files use a Gregorian calendar which has a reference year of zero. There is no year zero in a Gregorian calendar, so this case cannot be handled by iris. As a result, we provide a simple wrapper to handle this edge case. Note that, because the entire data file has to be read in, it can be slow.
[1]:
# NBVAL_IGNORE_OUTPUT
import datetime
import iris
import iris.coord_categorisation
import iris.plot as iplt
from netcdf_scm.iris_cube_wrappers import CMIP6Input4MIPsCube
import matplotlib.pyplot as plt
plt.style.use("bmh")
[2]:
# NBVAL_IGNORE_OUTPUT
cmip6_hist_concs = CMIP6Input4MIPsCube()
cmip6_hist_concs.load_data_from_identifiers(
root_dir="../../../tests/test-data/cmip6input4mips",
activity_id="input4MIPs",
mip_era="CMIP6",
target_mip="CMIP",
institution_id="UoM",
source_id="UoM-CMIP-1-2-0",
realm="atmos",
frequency="mon",
variable_id="mole-fraction-of-carbon-dioxide-in-air",
grid_label="gr1-GMNHSH",
version="v20100304",
dataset_category="GHGConcentrations",
time_range="000001-201412",
file_ext=".nc",
)
[3]:
# NBVAL_IGNORE_OUTPUT
cmip6_hist_concs.cube
[3]:
Mole (1.e-6) | time | sector |
---|---|---|
Shape | 24168 | 3 |
Dimension coordinates | ||
time | x | - |
sector | - | x |
Attributes | ||
Conventions | CF-1.6 | |
activity_id | input4MIPs | |
comment | Data provided are global and hemispheric area-weighted means. Zonal means... | |
contact | malte.meinshausen@unimelb.edu.au | |
creation_date | 2016-08-30T18:22:16Z | |
dataset_category | GHGConcentrations | |
dataset_version_number | 1.2.0 | |
frequency | mon | |
further_info_url | http://climatecollege.unimelb.edu.au/cmip6 | |
grid | global and hemispheric means - area-averages from the original latitudinal... | |
grid_label | gr1-GMNHSH | |
institution | Australian-German Climate & Energy College, The University of Melbourne... | |
institution_id | UoM | |
license | GHG concentrations produced by UoM are licensed under a Creative Commons... | |
mip_era | CMIP6 | |
nominal_resolution | 10000 km | |
product | assimilated observations | |
realm | atmos | |
references | Malte Meinshausen, Elisabeth Vogel, Alexander Nauels, Katja Lorbacher,... | |
source | UoM-CMIP-1-2-0: Historical GHG mole fractions from NOAA & AGAGE networks... | |
source_id | UoM-CMIP-1-2-0 | |
table_id | input4MIPs | |
target_mip | CMIP | |
title | UoM-CMIP-1-2-0: historical GHG concentrations: global and hemispheric means... | |
tracking_id | hdl:21.14100/3ef0a11f-2ed2-4004-9234-4087c2d41cee | |
variable_id | mole_fraction_of_carbon_dioxide_in_air | |
Cell methods | ||
mean | time | |
mean | area |
We also make a plot to show how the underlying data looks.
[4]:
# NBVAL_IGNORE_OUTPUT
yearmin = 1700
cube = cmip6_hist_concs.cube.extract(
iris.Constraint(time=lambda t: t[0].year > yearmin)
)
fig = plt.figure(figsize=(16, 9))
for i in range(3):
region = cube.extract(iris.Constraint(sector=i))
for title in region.coord("sector").attributes["ids"].split(";"):
if title.strip().startswith(str(i)):
title = title.split(":")[1].strip()
break
if "Global" in title:
plt.subplot(322)
elif "Northern" in title:
plt.subplot(324)
elif "Southern" in title:
plt.subplot(326)
iplt.plot(region, lw=2.0)
xlabel = "Time"
plt.title(title)
plt.xlabel(xlabel)
plt.xlim(
[datetime.date(1965, 1, 1), datetime.date(2015, 1, 1),]
)
if "Global" in title:
plt.subplot(121)
iris.coord_categorisation.add_year(region, "time", name="year")
region_annual_mean = region.aggregated_by(["year"], iris.analysis.MEAN)
iplt.plot(region_annual_mean, lw=2.0)
var_name = region.var_name.replace("_", " ")
var_name = var_name.replace("in", "\nin")
plt.ylabel("{} ({})".format(var_name, region.units))
plt.title(title + "-annual mean")
plt.xlabel("Time")
plt.tight_layout();

Miscellaneous reading¶
There are some files which don’t fit within the standard CMIP6 output but which we would nonetheless like to read. For this purpose, we have netcdf_scm.misc_readers. At the moment it only helps us to read hemispheric-mean data for CMIP6 input concentrations, but more options can be added as needed (pull requests welcome :)).
CMIP6 concentrations input¶
The CMIP6 input concentrations are provided on a grid. However, hemispheric-mean data was also provided, which can be read as shown below.
[1]:
# NBVAL_IGNORE_OUTPUT
import os.path
import matplotlib.pyplot as plt
from netcdf_scm.misc_readers import read_cmip6_concs_gmnhsh
[2]:
TEST_DATA_DIR = os.path.join("..", "..", "..", "tests", "test-data")
TEST_HISTORICAL_FILE = os.path.join(
TEST_DATA_DIR,
"mole-fraction-of-carbon-dioxide-in-air_input4MIPs_GHGConcentrations_CMIP_UoM-CMIP-1-2-0_gr1-GMNHSH_000001-201412.nc",
)
TEST_PROJECTION_FILE = os.path.join(
TEST_DATA_DIR,
"mole-fraction-of-carbon-dioxide-in-air_input4MIPs_GHGConcentrations_ScenarioMIP_UoM-MESSAGE-GLOBIOM-ssp245-1-2-1_gr1-GMNHSH_201501-250012.nc",
)
[3]:
# NBVAL_IGNORE_OUTPUT
historical_concs = read_cmip6_concs_gmnhsh(TEST_HISTORICAL_FILE)
historical_concs.head()
[3]:
time | 0001-01-17 12:00:00 | 0001-02-16 00:00:00 | 0001-03-17 12:00:00 | 0001-04-17 00:00:00 | 0001-05-17 12:00:00 | 0001-06-17 00:00:00 | 0001-07-17 12:00:00 | 0001-08-17 12:00:00 | 0001-09-17 00:00:00 | 0001-10-17 12:00:00 | ... | 2014-03-17 12:00:00 | 2014-04-17 00:00:00 | 2014-05-17 12:00:00 | 2014-06-17 00:00:00 | 2014-07-17 12:00:00 | 2014-08-17 12:00:00 | 2014-09-17 00:00:00 | 2014-10-17 12:00:00 | 2014-11-17 00:00:00 | 2014-12-17 12:00:00 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
model | scenario | region | variable | unit | variable_standard_name | mip_era | climate_model | activity_id | member_id | |||||||||||||||||||||
unspecified | historical | World | mole_fraction_of_carbon_dioxide_in_air | ppm | NaN | CMIP6 | MAGICC7 | input4MIPs | unspecified | 277.876678 | 278.231598 | 278.551178 | 278.774658 | 278.706543 | 277.966461 | 276.322845 | 274.719147 | 274.680359 | 275.719604 | ... | 399.020050 | 399.094604 | 398.623932 | 397.337616 | 395.648834 | 394.573456 | 395.026825 | 396.668762 | 398.189087 | 399.179688 |
World|Northern Hemisphere | mole_fraction_of_carbon_dioxide_in_air | ppm | NaN | CMIP6 | MAGICC7 | input4MIPs | unspecified | 278.555908 | 279.183929 | 279.804108 | 280.321655 | 280.213593 | 278.691864 | 275.406738 | 272.345093 | 272.443939 | 274.555481 | ... | 402.959412 | 403.127563 | 402.125000 | 399.311371 | 395.616882 | 393.376556 | 394.318665 | 397.456665 | 400.321228 | 402.195099 | ||
World|Southern Hemisphere | mole_fraction_of_carbon_dioxide_in_air | ppm | NaN | CMIP6 | MAGICC7 | input4MIPs | unspecified | 277.197449 | 277.279266 | 277.298248 | 277.227661 | 277.199493 | 277.241058 | 277.238922 | 277.093170 | 276.916809 | 276.883728 | ... | 395.080719 | 395.061676 | 395.122864 | 395.363861 | 395.680786 | 395.770386 | 395.734955 | 395.880859 | 396.056915 | 396.164307 |
3 rows × 24168 columns
[4]:
# NBVAL_IGNORE_OUTPUT
projection_concs = read_cmip6_concs_gmnhsh(TEST_PROJECTION_FILE)
projection_concs.head()
[4]:
time | 2015-01-16 12:00:00 | 2015-02-15 00:00:00 | 2015-03-16 12:00:00 | 2015-04-16 00:00:00 | 2015-05-16 12:00:00 | 2015-06-16 00:00:00 | 2015-07-16 12:00:00 | 2015-08-16 12:00:00 | 2015-09-16 00:00:00 | 2015-10-16 12:00:00 | ... | 2500-03-16 12:00:00 | 2500-04-16 00:00:00 | 2500-05-16 12:00:00 | 2500-06-16 00:00:00 | 2500-07-16 12:00:00 | 2500-08-16 12:00:00 | 2500-09-16 00:00:00 | 2500-10-16 12:00:00 | 2500-11-16 00:00:00 | 2500-12-16 12:00:00 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
model | scenario | region | variable | unit | variable_standard_name | climate_model | activity_id | mip_era | member_id | |||||||||||||||||||||
MESSAGE-GLOBIOM | ssp245 | World | mole_fraction_of_carbon_dioxide_in_air | ppm | NaN | MAGICC7 | input4MIPs | CMIP6 | unspecified | 399.985443 | 400.471680 | 400.829407 | 401.061829 | 400.765961 | 399.643005 | 398.118835 | 397.217529 | 397.855652 | 399.668549 | ... | 581.471252 | 581.205750 | 580.177002 | 578.151550 | 576.184753 | 575.389526 | 576.097717 | 578.050537 | 579.750366 | 580.699402 |
World|Northern Hemisphere | mole_fraction_of_carbon_dioxide_in_air | ppm | NaN | MAGICC7 | input4MIPs | CMIP6 | unspecified | 403.364502 | 404.067444 | 404.587128 | 404.826294 | 403.919159 | 401.186096 | 397.617981 | 395.575470 | 396.728790 | 400.064789 | ... | 583.975220 | 583.604553 | 581.599060 | 577.361206 | 573.039368 | 571.351440 | 572.867310 | 576.690613 | 580.148682 | 582.227295 | ||
World|Southern Hemisphere | mole_fraction_of_carbon_dioxide_in_air | ppm | NaN | MAGICC7 | input4MIPs | CMIP6 | unspecified | 396.606384 | 396.875916 | 397.071716 | 397.297333 | 397.612732 | 398.099915 | 398.619690 | 398.859619 | 398.982513 | 399.272278 | ... | 578.967224 | 578.806885 | 578.754883 | 578.941895 | 579.330139 | 579.427612 | 579.328125 | 579.410461 | 579.352112 | 579.171448 |
3 rows × 5832 columns
[5]:
combined_concs = historical_concs.append(projection_concs)
# hack around Pyam's inability to handle NaN for now
combined_concs = combined_concs.timeseries().reset_index()
combined_concs = combined_concs.drop("variable_standard_name", axis="columns")
combined_concs = type(historical_concs)(combined_concs)
[6]:
# NBVAL_IGNORE_OUTPUT
combined_concs.head()
[6]:
time | 0001-01-17 12:00:00 | 0001-02-16 00:00:00 | 0001-03-17 12:00:00 | 0001-04-17 00:00:00 | 0001-05-17 12:00:00 | 0001-06-17 00:00:00 | 0001-07-17 12:00:00 | 0001-08-17 12:00:00 | 0001-09-17 00:00:00 | 0001-10-17 12:00:00 | ... | 2500-03-16 12:00:00 | 2500-04-16 00:00:00 | 2500-05-16 12:00:00 | 2500-06-16 00:00:00 | 2500-07-16 12:00:00 | 2500-08-16 12:00:00 | 2500-09-16 00:00:00 | 2500-10-16 12:00:00 | 2500-11-16 00:00:00 | 2500-12-16 12:00:00 | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
model | scenario | region | variable | unit | mip_era | climate_model | activity_id | member_id | |||||||||||||||||||||
unspecified | historical | World | mole_fraction_of_carbon_dioxide_in_air | ppm | CMIP6 | MAGICC7 | input4MIPs | unspecified | 277.876678 | 278.231598 | 278.551178 | 278.774658 | 278.706543 | 277.966461 | 276.322845 | 274.719147 | 274.680359 | 275.719604 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
World|Northern Hemisphere | mole_fraction_of_carbon_dioxide_in_air | ppm | CMIP6 | MAGICC7 | input4MIPs | unspecified | 278.555908 | 279.183929 | 279.804108 | 280.321655 | 280.213593 | 278.691864 | 275.406738 | 272.345093 | 272.443939 | 274.555481 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ||
World|Southern Hemisphere | mole_fraction_of_carbon_dioxide_in_air | ppm | CMIP6 | MAGICC7 | input4MIPs | unspecified | 277.197449 | 277.279266 | 277.298248 | 277.227661 | 277.199493 | 277.241058 | 277.238922 | 277.093170 | 276.916809 | 276.883728 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ||
MESSAGE-GLOBIOM | ssp245 | World | mole_fraction_of_carbon_dioxide_in_air | ppm | CMIP6 | MAGICC7 | input4MIPs | unspecified | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 581.471252 | 581.205750 | 580.177002 | 578.151550 | 576.184753 | 575.389526 | 576.097717 | 578.050537 | 579.750366 | 580.699402 |
World|Northern Hemisphere | mole_fraction_of_carbon_dioxide_in_air | ppm | CMIP6 | MAGICC7 | input4MIPs | unspecified | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 583.975220 | 583.604553 | 581.599060 | 577.361206 | 573.039368 | 571.351440 | 572.867310 | 576.690613 | 580.148682 | 582.227295 |
5 rows × 30000 columns
[7]:
# NBVAL_IGNORE_OUTPUT
fig = plt.figure(figsize=(16, 9))
ax = fig.add_subplot(121)
combined_concs.filter(year=range(2010, 2021)).line_plot(
hue="scenario", style="region", ax=ax
)
ax = fig.add_subplot(122)
combined_concs.filter(year=range(1500, 2300)).line_plot(
hue="scenario", style="region", ax=ax
);

Development¶
If you’re interested in contributing to netCDF-SCM, we’d love to have you on board! This section of the docs details how to get setup to contribute and how best to communicate.
Contributing¶
All contributions are welcome, some possible suggestions include:
tutorials (or support questions which, once solved, result in a new tutorial :D)
blog posts
improving the documentation
bug reports
feature requests
pull requests
Please report issues or discuss feature requests in the netCDF-SCM issue tracker. If your issue is a feature request or a bug, please use the templates available; otherwise, simply open a normal issue :)
As a contributor, please follow a couple of conventions:
Create issues in the netCDF-SCM issue tracker for changes and enhancements; this ensures that everyone in the community has a chance to comment
Be welcoming to newcomers and encourage diverse new contributors from all backgrounds: see the Python Community Code of Conduct
Getting setup¶
To get setup as a developer, we recommend the following steps (if any of these tools are unfamiliar, please see the resources we recommend in Development tools):
Install conda and make
Run make conda-environment; if that fails, you can try doing it manually by reading the commands from the Makefile
Make sure the tests pass by running make test; as above, if that fails, you can try doing it manually by reading the commands from the Makefile
Getting help¶
Whilst developing, unexpected things can go wrong (that’s why it’s called ‘developing’; if we knew what we were doing, it would already be ‘developed’). Normally, the fastest way to solve an issue is to contact us via the issue tracker. The other option is to debug yourself. For this purpose, we provide a list of the tools we use during our development as starting points for your search to find what has gone wrong.
Development tools¶
This list of development tools is what we rely on to develop netCDF-SCM reliably and reproducibly. It gives you a few starting points in case things do go inexplicably wrong and you want to work out why. We include links with each of these tools to starting points that we think are useful, in case you want to learn more.
- Conda virtual environments
note the common gotcha that source activate has now changed to conda activate
we use conda instead of pure pip environments because they help us deal with Iris’ dependencies: if you want to learn more about pip and pip virtual environments, check out this introduction
- Continuous integration (CI)
we use GitLab CI for our CI but there are a number of good providers
- Jupyter Notebooks
we’d recommend simply installing jupyter (conda install jupyter) in your virtual environment
Other tools¶
We also use some other tools which aren’t necessarily the most familiar. Here we provide a list of these along with useful resources.
- Regular expressions
we use regex101.com to help us write and check our regular expressions, make sure the language is set to Python to make your life easy!
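For example, a pattern can also be sanity-checked directly in Python; the pattern below is purely illustrative (it is not netCDF-SCM’s actual data reference syntax):
import re

# illustrative pattern for a CMIP-style monthly filename
pattern = re.compile(
    r"^(?P<variable>[^_]+)_Amon_(?P<model>[^_]+)_(?P<experiment>[^_]+)"
    r"_(?P<member>r\d+i\d+p\d+)_(?P<time_range>\d{6}-\d{6})\.nc$"
)
match = pattern.match("tas_Amon_CanESM2_1pctCO2_r1i1p1_189201-190312.nc")
print(match.groupdict()["model"])  # CanESM2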
Formatting¶
To help us focus on what the code does, not how it looks, we use a couple of automatic formatting tools.
These automatically format the code for us and tell us where the errors are.
To use them, after setting yourself up (see Getting setup), simply run make black and make flake8.
Note that make black can only be run if you have committed all your work, i.e. your working directory is ‘clean’.
This restriction is made to ensure that you don’t format code without being able to undo it, just in case something goes wrong.
Building the docs¶
After setting yourself up (see Getting setup), building the docs is as simple as running make docs (note, run make -B docs to force the docs to rebuild when make says ‘… index.html is up to date’).
You can preview them by opening docs/build/html/index.html in a browser.
For documentation we use Sphinx. To get ourselves started with Sphinx, we started with this example then used Sphinx’s getting started guide.
Gotchas¶
To get Sphinx to generate pdfs (rarely worth the hassle), you require Latexmk. On a Mac this can be installed with sudo tlmgr install latexmk.
You will most likely also need to install some other packages (if you don’t have the full distribution).
You can check which package contains any missing files with tlmgr search --global --file [filename].
You can then install the packages with sudo tlmgr install [package].
Docstring style¶
For our docstrings we use numpy style docstrings. For more information on these, here is the full guide and the quick reference we also use.
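As a minimal illustration (the function here is hypothetical, not part of netCDF-SCM), a numpy style docstring looks like this:
def kelvin_to_celsius(temperature):
    """
    Convert a temperature from Kelvin to degrees Celsius.

    Parameters
    ----------
    temperature : float
        Temperature in Kelvin

    Returns
    -------
    float
        Temperature in degrees Celsius
    """
    return temperature - 273.15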
Releasing¶
The steps to release a new version of netCDF-SCM are shown below. Please do all the steps below and all the steps for both release platforms.
First step¶
Test installation with dependencies: make test-install
Update CHANGELOG.rst: add a header for the new version between master and the latest bullet point; this should leave the section underneath the master header empty
git add .
git commit -m "Prepare for release of vX.Y.Z"
git tag vX.Y.Z
Test that the version updated as intended with make test-install
PyPI¶
If uploading to PyPI, do the following (otherwise skip these steps)
make publish-on-testpypi
Go to test PyPI and check that the new release is as intended. If it isn’t, stop and debug.
Test the install with make test-testpypi-install (this doesn’t test all the imports as most required packages are not on test PyPI).
Assuming test PyPI worked, now upload to the main repository: make publish-on-pypi
Go to netCDF-SCM’s PyPI and check that the new release is as intended.
Test the install with make test-pypi-install (a pip-only install will throw warnings about Iris not being installed; that’s fine).
Conda¶
If you haven’t already, fork the netCDF-SCM conda feedstock. In your fork, add the feedstock upstream with git remote add upstream https://github.com/conda-forge/netcdf-scm-feedstock (upstream should now appear in the output of git remote -v)
Update your fork’s master to the upstream master with:
git checkout master
git fetch upstream
git reset --hard upstream/master
Create a new branch in the feedstock for the version you want to bump to.
Edit recipe/meta.yaml and update:
the version number in line 1 (don’t include the ‘v’ in the version tag)
the build number to zero (you should only be here if releasing a new version)
the sha256 in line 9 (you can get the sha from netCDF-SCM’s PyPI by clicking on ‘Download files’ on the left and then clicking on ‘SHA256’ of the .tar.gz file to copy it to the clipboard)
git add .
git commit -m "Update to vX.Y.Z"
git push
Make a PR into the netCDF-SCM conda feedstock
If the PR passes (give it at least 10 minutes to run all the CI), merge
Check https://anaconda.org/conda-forge/netcdf-scm to double check that the version has increased (this can take a few minutes to update)
Archiving on zenodo¶
Create a clean version of the repo (note, this deletes all files not tracked by git, use with care!): git clean -xdf (a dry run can be done with git clean -ndf)
Tar the repo:
VERSION=`python -c 'import netcdf_scm; print(netcdf_scm.__version__)'` \ && tar --exclude='./.git' -czvf "netcdf-scm-${VERSION}.tar.gz" .
Run the zenodo script to get the curl command for the file to upload: python scripts/prepare_zenodo_upload.py <file-to-upload>
The above script spits out a curl command; run this command (having set the ZENODO_TOKEN environment variable first) to upload your archive
Go to zenodo.org, read through and finalise the upload by pushing publish
Why is there a Makefile in a pure Python repository?¶
Whilst it may not be standard practice, a Makefile is a simple way to automate general setup (environment setup in particular).
Hence we have one here which basically acts as a notes file for how to do all those little jobs which we often forget, e.g. setting up environments, running tests (and making sure we’re in the right environment), building docs, and setting up auxiliary bits and pieces.
Why did we choose a BSD 3-Clause License?¶
We want to ensure that our code can be used and shared as easily as possible. Whilst we love transparency, we didn’t want to force all future users to also comply with a stronger license such as AGPL. Hence the choice we made.
We recommend Morin et al. 2012 for more information for scientists about open-source software licenses.
Iris cube wrappers API¶
Wrappers of the iris cube.
These classes automate handling of a number of netCDF processing steps, for example finding land surface fraction files, applying regions to data and returning timeseries in key regions for simple climate models.
-
class netcdf_scm.iris_cube_wrappers.CMIP6Input4MIPsCube[source]¶
Bases: netcdf_scm.iris_cube_wrappers._CMIPCube
Cube which can be used with CMIP6 input4MIPs data
The data must match the CMIP6 Forcing Datasets Summary, specifically the Forcing Dataset Specifications.
-
activity_id
= None¶ The activity_id for which we want to load data.
For these cubes, this will almost always be
input4MIPs
.- Type
-
areacell_var
¶ The name of the variable associated with the area of each gridbox.
If required, this is used to determine the area of each cell in a data file. For example, if our data file is
tas_Amon_HadCM3_rcp45_r1i1p1_200601.nc
thenareacell_var
can be used to work out the name of the associated cell area file. In some cases, it might be as simple as replacingtas
with the value ofareacell_var
.- Type
-
convert_scm_timeseries_cubes_to_openscmdata
(scm_timeseries_cubes, out_calendar=None)¶ Convert dictionary of SCM timeseries cubes to an
scmdata.ScmRun
- Parameters
- Returns
scmdata.ScmRun
containing the data from the SCM timeseries cubes- Return type
scmdata.ScmRun
- Raises
NotImplementedError – The (original) input data has dimensions other than time, latitude and longitude (so the data to convert has dimensions other than time).
-
dataset_category
= None¶ The dataset_category for which we want to load data e.g.
GHGConcentrations
- Type
-
dim_names
¶ Names of the dimensions in this cube
Here the names are the
standard_names
which means there can beNone
in the output.- Type
-
get_area_weights
(areacell_scmcube=None)¶ Get area weights for this cube
- Parameters
areacell_scmcube (
ScmCube
) –ScmCube
containing areacell data. IfNone
, we calculate the weights using iris.- Returns
Weights on the cube’s latitude-longitude grid.
- Return type
np.ndarray
- Raises
iris.exceptions.CoordinateMultiDimError – The cube’s co-ordinates are multi-dimensional and we don’t have cell area data.
ValueError – Area weights units are not as expected (i.e. they contradict self._area_weights_units).
-
get_data_directory
()¶ Get the path to a data file from self’s attributes.
This can take multiple forms, it may just return a previously set filepath attribute or it could combine a number of different metadata elements (e.g. model name, experiment name) to create the data path.
-
get_data_filename
()¶ Get the name of a data file from self’s attributes.
This can take multiple forms, it may just return a previously set filename attribute or it could combine a number of different metadata elements (e.g. model name, experiment name) to create the data name.
-
classmethod
get_data_reference_syntax
(**kwargs)¶ Get data reference syntax for this cube
-
get_filepath_from_load_data_from_identifiers_args
(**kwargs)[source]¶ Get the full filepath of the data to load from the arguments passed to
self.load_data_from_identifiers
.Full details about the meaning of the identifiers are given in the Forcing Dataset Specifications.
- Parameters
kwargs (str) – Identifiers to use to load the data
- Returns
The full filepath (path and name) of the file to load.
- Return type
- Raises
AttributeError – An input argument does not match with the cube’s data reference syntax
-
get_load_data_from_identifiers_args_from_filepath
(filepath)¶ Get the set of identifiers to use to load data from a filepath.
- Parameters
filepath (str) – The filepath from which to load the data.
- Returns
Set of arguments which can be passed to
self.load_data_from_identifiers
to load the data in the filepath.- Return type
- Raises
ValueError – Path and filename contradict each other
-
get_metadata_cube
(metadata_variable, cube=None)¶ Load a metadata cube from self’s attributes.
- Parameters
- Returns
instance of self which has been loaded from the file containing the metadata variable of interest.
- Return type
type(self)
- Raises
-
get_scm_timeseries
(**kwargs)¶ Get SCM relevant timeseries from
self
.- Parameters
**kwargs – Passed to
get_scm_timeseries_cubes()
- Returns
scmdata.ScmRun
instance with the data in thedata
attribute and metadata in themetadata
attribute.- Return type
scmdata.ScmRun
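For orientation, a minimal usage sketch is shown below, assuming a CMIP6 output directory laid out as in the usage notebooks above (the path is hypothetical and, as in those notebooks, the same methods are available on the sibling CMIP6OutputCube class):
from netcdf_scm.iris_cube_wrappers import CMIP6OutputCube

cube = CMIP6OutputCube()
# hypothetical directory following the CMIP6 data reference syntax
cube.load_data_in_directory(
    "/path/to/cmip6output/CMIP6/CMIP/NCAR/CESM2/historical/r10i1p1f1/Amon/tas/gn/v20190313"
)
scm_run = cube.get_scm_timeseries(regions=["World", "World|Land", "World|Ocean"])
scm_run.timeseries()  # pandas DataFrame of the regional means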
-
get_scm_timeseries_cubes
(lazy=False, **kwargs)¶ Get SCM relevant cubes
The effective areas used for each of the regions are added as auxiliary co-ordinates of each timeseries cube.
If global, Northern Hemisphere and Southern Hemisphere land cubes are calculated, then three auxiliary co-ordinates are also added to each cube:
land_fraction
,land_fraction_northern_hemisphere
andland_fraction_southern_hemisphere
. These co-ordinates document the area fraction that was considered to be land when the cubes were crunched i.e.land_fraction
is the fraction of the entire globe which was considered to be land,land_fraction_northern_hemisphere
is the fraction of the Northern Hemisphere which was considered to be land andland_fraction_southern_hemisphere
is the fraction of the Southern Hemisphere which was considered to be land.- Parameters
lazy (bool) – Should I process the data lazily? This can be slow as data has to be read off disk multiple times.
kwargs (any) – Passed to
get_scm_timeseries_weights()
- Returns
dict of str – Dictionary of cubes (region: cube key: value pairs), with latitude-longitude mean data as appropriate for each of the requested regions.
- Return type
- Raises
InvalidWeightsError – No valid weights are found for the requested regions
-
get_scm_timeseries_weights
(surface_fraction_cube=None, areacell_scmcube=None, regions=None, cell_weights=None, log_failure=False)¶ Get the scm timeseries weights
- Parameters
surface_fraction_cube (
ScmCube
, optional) – land surface fraction data which is used to determine whether a given gridbox is land or ocean. IfNone
, we try to load the land surface fraction automatically.areacell_scmcube (
ScmCube
, optional) – cell area data which is used to take the latitude-longitude mean of the cube’s data. IfNone
, we try to load this data automatically and if that fails we fall back ontoiris.analysis.cartography.area_weights
.regions (list[str]) – List of regions to use. If
None
thennetcdf_scm.regions.DEFAULT_REGIONS
is used.cell_weights ({'area-only', 'area-surface-fraction'}) – How cell weights should be calculated. If
'area-surface-fraction'
, both cell area and its surface fraction will be used to weight the cell. If'area-only'
, only the cell’s area will be used to weight the cell (cells which do not belong to the region are nonetheless excluded). IfNone
, netCDF-SCM will guess whether land surface fraction weights should be included or not based on the data being processed. When guessing, for ocean data, netCDF-SCM will weight cells only by the horizontal area of the cell i.e. no land fraction (see Section L5 of Griffies et al., GMD, 2016, https://doi.org/10.5194/gmd-9-3231-2016). For land variables, netCDF-SCM will weight cells by both thier horizontal area and their land surface fraction. “Yes, you do need to weight the output by land frac (sftlf is the CMIP variable name).” (Chris Jones, personal communication, 18 April 2020). For land variables, note that there seems to be nothing in Jones et al., GMD, 2016 (https://doi.org/10.5194/gmd-9-2853-2016).log_failure (bool) – Should regions which fail be logged? If no, failures are raised as warnings.
- Returns
dict of str – Dictionary of ‘region name’: weights, key: value pairs
- Return type
np.ndarray
Notes
Only regions which can be calculated are returned. If no regions can be calculated, an empty dictionary will be returned.
-
get_variable_constraint
()¶ Get the iris variable constraint to use when loading data with
self.load_data_from_identifiers
.- Returns
constraint to use which ensures that only the variable of interest is loaded.
- Return type
iris.Constraint
-
info
¶ Information about the cube’s source files
res["files"] contains the files used to load the data in this cube.
res["metadata"] contains information for each of the metadata cubes used to load the data in this cube.
- Returns
- Return type
-
lat_dim
¶ iris.coords.DimCoord
The latitude dimension of the data.
-
lat_dim_number
¶ The index which corresponds to the latitude dimension.
e.g. if latitude is the first dimension of the data, then
self.lat_dim_number
will be0
(Python is zero-indexed).- Type
-
lat_lon_shape
¶ 2D Tuple of
int
which gives shape of a lat-lon slice of the datae.g. if the cube’s shape is (4, 3, 5, 4) and its dimensions are (time, lat, depth, lon) then
cube.lat_lon_shape
will be(3, 4)
- Raises
AssertionError – No lat lon slice can be deduced (if this happens, please raise an issue at https://gitlab.com/netcdf-scm/netcdf-scm/issues so we can address your use case).
- Type
-
load_data_from_identifiers
(process_warnings=True, **kwargs)¶ Load data using key identifiers.
The identifiers are used to determine the path of the file to load. The file is then loaded into an iris cube which can be accessed through
self.cube
.- Parameters
process_warnings (bool) – Should I process warnings to add e.g. missing metadata information?
kwargs (any) – Arguments which can then be processed by
self.get_filepath_from_load_data_from_identifiers_args
andself.get_variable_constraint
to determine the full filepath of the file to load and the variable constraint to use.
-
load_data_from_path
(filepath, process_warnings=True)¶ Load data from a path.
-
load_data_in_directory
(directory=None, process_warnings=True)¶ Load data in a directory.
The data is loaded into an iris cube which can be accessed through
self.cube
.Initially, this method is intended to only be used to load data when it is saved in a number of different timeslice files e.g.:
tas_Amon_HadCM3_rcp45_r1i1p1_200601-203012.nc
tas_Amon_HadCM3_rcp45_r1i1p1_203101-203512.nc
tas_Amon_HadCM3_rcp45_r1i1p1_203601-203812.nc
It is not intended to be used to load multiple different variables or non-continuous timeseries. These use cases could be added in future, but are not required yet so have not been included.
Note that this function removes any attributes which aren’t common between the loaded cubes. In general, we have found that this mainly means
creation_date
,tracking_id
andhistory
are deleted. If unsure, please check.- Parameters
- Raises
ValueError – If the files in the directory are not from the same run (i.e. their filenames are not identical except for the timestamp) or if the files don’t form a continuous timeseries.
-
lon_dim
¶ iris.coords.DimCoord
The longitude dimension of the data.
-
lon_dim_number
¶ The index which corresponds to the longitude dimension.
e.g. if longitude is the third dimension of the data, then
self.lon_dim_number
will be2
(Python is zero-indexed).- Type
-
netcdf_scm_realm
¶ The realm in which netCDF-SCM thinks the data belongs.
This is used to make decisions about how to take averages of the data and where to find metadata variables.
If it is not sure, netCDF-SCM will guess that the data belongs to the ‘atmosphere’ realm.
- Type
-
root_dir
= None¶ - The root directory of the database i.e. where the cube should start its
path
e.g.
/home/users/usertim/cmip6input
.- Type
-
source_id
= None¶ The source_id for which we want to load data e.g.
UoM-REMIND-MAGPIE-ssp585-1-2-0
This must include the institution_id.
- Type
-
surface_fraction_var
¶ The name of the variable associated with the surface fraction in each gridbox.
If required, this is used when looking for the surface fraction file which belongs to a given data file. For example, if our data file is
tas_Amon_HadCM3_rcp45_r1i1p1_200601.nc
thensurface_fraction_var
can be used to work out the name of the associated surface fraction file. In some cases, it might be as simple as replacingtas
with the value ofsurface_fraction_var
.- Type
-
table_name_for_metadata_vars
¶ The name of the ‘table’ in which metadata variables can be found.
For example,
fx
orOfx
.We wrap this as a property as table typically means
table_id
but is sometimes referred to in other ways e.g. asmip_table
in CMIP5.- Type
-
time_dim
¶ iris.coords.DimCoord
The time dimension of the data.
-
time_dim_number
¶ The index which corresponds to the time dimension.
e.g. if time is the first dimension of the data, then
self.time_dim_number
will be0
(Python is zero-indexed).- Type
-
time_period_regex
¶ Regular expression which captures the timeseries identifier in input data files.
For help on regular expressions, see the documentation of Python’s re module.
- Type
_sre.SRE_Pattern
-
time_range
= None¶ The time range for which we want to load data e.g.
2005-2100
If
None
, this information isn’t included in the filename which is useful for loading metadata files which don’t have a relevant time period.- Type
-
timestamp_definitions
¶ Definition of valid timestamp information and corresponding key values.
This follows the CMIP standards where time strings must be one of the following: YYYY, YYYYMM, YYYYMMDD, YYYYMMDDHH or one of the previous combined with a hyphen e.g. YYYY-YYYY.
Each key in the definitions dictionary is the length of the timestamp. Each value is itself a dictionary, with keys:
datetime_str: the string required to convert a timestamp of this length into a datetime using
datetime.datetime.strptime
generic_regexp: a regular expression which will match timestamps in this format
expected_timestep: a
dateutil.relativedelta.relativedelta
object which contains the expected timestep in files with this timestamp
- Returns
- Return type
Examples
>>> self.timestamp_definitions[len("2012")]["datetime_str"]
'%Y'
-
variable_id
= None¶ The variable_id for which we want to load data e.g.
mole-fraction-of-carbon-dioxide-in-air
- Type
-
-
class
netcdf_scm.iris_cube_wrappers.
CMIP6OutputCube
[source]¶ Bases:
netcdf_scm.iris_cube_wrappers._CMIPCube
Cube which can be used with CMIP6 model output data
The data must match the CMIP6 data reference syntax as specified in the ‘File name template’ and ‘Directory structure template’ sections of the CMIP6 Data Reference Syntax.
-
activity_id
= None¶ The activity_id for which we want to load data.
In CMIP6, this denotes the responsible MIP e.g.
DCPP
.- Type
-
areacell_var
¶ The name of the variable associated with the area of each gridbox.
If required, this is used to determine the area of each cell in a data file. For example, if our data file is
tas_Amon_HadCM3_rcp45_r1i1p1_200601.nc
thenareacell_var
can be used to work out the name of the associated cell area file. In some cases, it might be as simple as replacingtas
with the value ofareacell_var
.- Type
-
convert_scm_timeseries_cubes_to_openscmdata
(scm_timeseries_cubes, out_calendar=None)¶ Convert dictionary of SCM timeseries cubes to an
scmdata.ScmRun
- Parameters
- Returns
scmdata.ScmRun
containing the data from the SCM timeseries cubes- Return type
scmdata.ScmRun
- Raises
NotImplementedError – The (original) input data has dimensions other than time, latitude and longitude (so the data to convert has dimensions other than time).
-
dim_names
¶ Names of the dimensions in this cube
Here the names are the
standard_names
which means there can beNone
in the output.- Type
-
get_area_weights
(areacell_scmcube=None)¶ Get area weights for this cube
- Parameters
areacell_scmcube (
ScmCube
) –ScmCube
containing areacell data. IfNone
, we calculate the weights using iris.- Returns
Weights on the cube’s latitude-longitude grid.
- Return type
np.ndarray
- Raises
iris.exceptions.CoordinateMultiDimError – The cube’s co-ordinates are multi-dimensional and we don’t have cell area data.
ValueError – Area weights units are not as expected (contradict with
self._area_weights_units
).
-
get_data_directory
()¶ Get the path to a data file from self’s attributes.
This can take multiple forms, it may just return a previously set filepath attribute or it could combine a number of different metadata elements (e.g. model name, experiment name) to create the data path.
-
get_data_filename
()¶ Get the name of a data file from self’s attributes.
This can take multiple forms, it may just return a previously set filename attribute or it could combine a number of different metadata elements (e.g. model name, experiment name) to create the data name.
-
classmethod
get_data_reference_syntax
(**kwargs)¶ Get data reference syntax for this cube
-
get_filepath_from_load_data_from_identifiers_args
(**kwargs)[source]¶ Get the full filepath of the data to load from the arguments passed to
self.load_data_from_identifiers
.Full details about the meaning of each identifier is given in Table 1 of the CMIP6 Data Reference Syntax.
- Parameters
kwargs (str) – Identifiers to use to load the data
- Returns
The full filepath (path and name) of the file to load.
- Return type
- Raises
AttributeError – An input argument does not match with the cube’s data reference syntax
-
classmethod
get_instance_id
(filepath)[source]¶ Get the instance_id from a given path
This is used as a unique identifier for datasets on the ESGF.
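A sketch with an illustrative path (the returned instance_id follows the ESGF convention of dot-separated CMIP6 data reference syntax components):
>>> CMIP6OutputCube.get_instance_id(
...     "/data/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1/historical/r1i1p1f2"
...     "/Amon/tas/gr/v20180917"
...     "/tas_Amon_CNRM-CM6-1_historical_r1i1p1f2_gr_185001-201412.nc"
... )  # expected to resemble 'CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1.historical.r1i1p1f2.Amon.tas.gr.v20180917'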
-
get_load_data_from_identifiers_args_from_filepath
(filepath)¶ Get the set of identifiers to use to load data from a filepath.
- Parameters
filepath (str) – The filepath from which to load the data.
- Returns
Set of arguments which can be passed to
self.load_data_from_identifiers
to load the data in the filepath.- Return type
- Raises
ValueError – Path and filename contradict each other
-
get_metadata_cube
(metadata_variable, cube=None)¶ Load a metadata cube from self’s attributes.
- Parameters
- Returns
instance of self which has been loaded from the file containing the metadata variable of interest.
- Return type
type(self)
- Raises
-
get_scm_timeseries
(**kwargs)¶ Get SCM relevant timeseries from
self
.- Parameters
**kwargs – Passed to
get_scm_timeseries_cubes()
- Returns
scmdata.ScmRun
instance with the data in thedata
attribute and metadata in themetadata
attribute.- Return type
scmdata.ScmRun
-
get_scm_timeseries_cubes
(lazy=False, **kwargs)¶ Get SCM relevant cubes
The effective areas used for each of the regions are added as auxiliary co-ordinates of each timeseries cube.
If global, Northern Hemisphere and Southern Hemisphere land cubes are calculated, then three auxiliary co-ordinates are also added to each cube:
land_fraction
,land_fraction_northern_hemisphere
andland_fraction_southern_hemisphere
. These co-ordinates document the area fraction that was considered to be land when the cubes were crunched i.e.land_fraction
is the fraction of the entire globe which was considered to be land,land_fraction_northern_hemisphere
is the fraction of the Northern Hemisphere which was considered to be land andland_fraction_southern_hemisphere
is the fraction of the Southern Hemisphere which was considered to be land.- Parameters
lazy (bool) – Should I process the data lazily? This can be slow as data has to be read off disk multiple times.
kwargs (any) – Passed to
get_scm_timeseries_weights()
- Returns
dict of str – Dictionary of cubes (region: cube key: value pairs), with latitude-longitude mean data as appropriate for each of the requested regions.
- Return type
- Raises
InvalidWeightsError – No valid weights are found for the requested regions
-
get_scm_timeseries_weights
(surface_fraction_cube=None, areacell_scmcube=None, regions=None, cell_weights=None, log_failure=False)¶ Get the scm timeseries weights
- Parameters
surface_fraction_cube (
ScmCube
, optional) – land surface fraction data which is used to determine whether a given gridbox is land or ocean. IfNone
, we try to load the land surface fraction automatically.areacell_scmcube (
ScmCube
, optional) – cell area data which is used to take the latitude-longitude mean of the cube’s data. IfNone
, we try to load this data automatically and if that fails we fall back ontoiris.analysis.cartography.area_weights
.regions (list[str]) – List of regions to use. If
None
thennetcdf_scm.regions.DEFAULT_REGIONS
is used.cell_weights ({'area-only', 'area-surface-fraction'}) –
How cell weights should be calculated. If
'area-surface-fraction'
, both cell area and its surface fraction will be used to weight the cell. If'area-only'
, only the cell’s area will be used to weight the cell (cells which do not belong to the region are nonetheless excluded). IfNone
, netCDF-SCM will guess whether land surface fraction weights should be included or not based on the data being processed. When guessing, for ocean data, netCDF-SCM will weight cells only by the horizontal area of the cell i.e. no land fraction (see Section L5 of Griffies et al., GMD, 2016, https://doi.org/10.5194/gmd-9-3231-2016). For land variables, netCDF-SCM will weight cells by both their horizontal area and their land surface fraction. “Yes, you do need to weight the output by land frac (sftlf is the CMIP variable name).” (Chris Jones, personal communication, 18 April 2020). For land variables, note that there seems to be no explicit guidance on this in Jones et al., GMD, 2016 (https://doi.org/10.5194/gmd-9-2853-2016).log_failure (bool) – Should regions which fail be logged? If not, failures are raised as warnings.
- Returns
dict of str – Dictionary of ‘region name’: weights, key: value pairs
- Return type
np.ndarray
Notes
Only regions which can be calculated are returned. If no regions can be calculated, an empty dictionary will be returned.
-
get_variable_constraint
()¶ Get the iris variable constraint to use when loading data with
self.load_data_from_identifiers
.- Returns
constraint to use which ensures that only the variable of interest is loaded.
- Return type
iris.Constraint
-
info
¶ Information about the cube’s source files
res["files"]
contains the files used to load the data in this cube.res["metadata"]
contains information for each of the metadata cubes used to load the data in this cube.- Returns
- Return type
-
lat_dim
¶ iris.coords.DimCoord
The latitude dimension of the data.
-
lat_dim_number
¶ The index which corresponds to the latitude dimension.
e.g. if latitude is the first dimension of the data, then
self.lat_dim_number
will be0
(Python is zero-indexed).- Type
-
lat_lon_shape
¶ 2D Tuple of
int
which gives shape of a lat-lon slice of the datae.g. if the cube’s shape is (4, 3, 5, 4) and its dimensions are (time, lat, depth, lon) then
cube.lat_lon_shape
will be(3, 4)
- Raises
AssertionError – No lat lon slice can be deduced (if this happens, please raise an issue at https://gitlab.com/netcdf-scm/netcdf-scm/issues so we can address your use case).
- Type
-
load_data_from_identifiers
(process_warnings=True, **kwargs)¶ Load data using key identifiers.
The identifiers are used to determine the path of the file to load. The file is then loaded into an iris cube which can be accessed through
self.cube
.- Parameters
process_warnings (bool) – Should I process warnings to add e.g. missing metadata information?
kwargs (any) – Arguments which can then be processed by
self.get_filepath_from_load_data_from_identifiers_args
andself.get_variable_constraint
to determine the full filepath of the file to load and the variable constraint to use.
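For orientation, a minimal sketch of loading via identifiers, assuming a CMIP6-style database. The keyword names below follow the CMIP6 data reference syntax but are illustrative; check get_data_reference_syntax() for the exact set expected (it may also include keys such as mip_era or file_ext):
>>> from netcdf_scm.iris_cube_wrappers import CMIP6OutputCube
>>> print(CMIP6OutputCube.get_data_reference_syntax())  # inspect the expected identifiers
>>> cube = CMIP6OutputCube()
>>> cube.load_data_from_identifiers(
...     root_dir="/data/cmip6",  # hypothetical database root
...     activity_id="CMIP",
...     institution_id="CNRM-CERFACS",
...     source_id="CNRM-CM6-1",
...     experiment_id="historical",
...     member_id="r1i1p1f2",
...     table_id="Amon",
...     variable_id="tas",
...     grid_label="gr",
...     version="v20180917",
...     time_range="185001-201412",
... )
>>> cube.cube  # the loaded iris cube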
-
load_data_from_path
(filepath, process_warnings=True)¶ Load data from a path.
-
load_data_in_directory
(directory=None, process_warnings=True)¶ Load data in a directory.
The data is loaded into an iris cube which can be accessed through
self.cube
.Initially, this method is intended to only be used to load data when it is saved in a number of different timeslice files e.g.:
tas_Amon_HadCM3_rcp45_r1i1p1_200601-203012.nc
tas_Amon_HadCM3_rcp45_r1i1p1_203101-203512.nc
tas_Amon_HadCM3_rcp45_r1i1p1_203601-203812.nc
It is not intended to be used to load multiple different variables or non-continuous timeseries. These use cases could be added in future, but are not required yet so have not been included.
Note that this function removes any attributes which aren’t common between the loaded cubes. In general, we have found that this mainly means
creation_date
,tracking_id
andhistory
are deleted. If unsure, please check.- Parameters
- Raises
ValueError – If the files in the directory are not from the same run (i.e. their filenames are not identical except for the timestamp) or if the files don’t form a continuous timeseries.
-
lon_dim
¶ iris.coords.DimCoord
The longitude dimension of the data.
-
lon_dim_number
¶ The index which corresponds to the longitude dimension.
e.g. if longitude is the third dimension of the data, then
self.lon_dim_number
will be2
(Python is zero-indexed).- Type
-
netcdf_scm_realm
¶ The realm in which netCDF-SCM thinks the data belongs.
This is used to make decisions about how to take averages of the data and where to find metadata variables.
If it is not sure, netCDF-SCM will guess that the data belongs to the ‘atmosphere’ realm.
- Type
-
root_dir
= None¶ - The root directory of the database i.e. where the cube should start its
path
e.g.
/home/users/usertim/cmip6_data
.- Type
-
source_id
= None¶ The source_id for which we want to load data e.g.
CNRM-CM6-1
This was known as model in CMIP5.
- Type
-
surface_fraction_var
¶ The name of the variable associated with the surface fraction in each gridbox.
If required, this is used when looking for the surface fraction file which belongs to a given data file. For example, if our data file is
tas_Amon_HadCM3_rcp45_r1i1p1_200601.nc
thensurface_fraction_var
can be used to work out the name of the associated surface fraction file. In some cases, it might be as simple as replacingtas
with the value ofsurface_fraction_var
.- Type
-
table_name_for_metadata_vars
¶ The name of the ‘table’ in which metadata variables can be found.
For example,
fx
orOfx
.We wrap this as a property as table typically means
table_id
but is sometimes referred to in other ways e.g. asmip_table
in CMIP5.- Type
-
time_dim
¶ iris.coords.DimCoord
The time dimension of the data.
-
time_dim_number
¶ The index which corresponds to the time dimension.
e.g. if time is the first dimension of the data, then
self.time_dim_number
will be0
(Python is zero-indexed).- Type
-
time_period_regex
¶ Regular expression which captures the timeseries identifier in input data files.
For help on regular expressions, see the documentation of Python’s re module.
- Type
_sre.SRE_Pattern
-
time_range
= None¶ The time range for which we want to load data e.g.
198001-198412
If
None
, this information isn’t included in the filename which is useful for loading metadata files which don’t have a relevant time period.- Type
-
timestamp_definitions
¶ Definition of valid timestamp information and corresponding key values.
This follows the CMIP standards where time strings must be one of the following: YYYY, YYYYMM, YYYYMMDD, YYYYMMDDHH or one of the previous combined with a hyphen e.g. YYYY-YYYY.
Each key in the definitions dictionary is the length of the timestamp. Each value is itself a dictionary, with keys:
datetime_str: the string required to convert a timestamp of this length into a datetime using
datetime.datetime.strptime
generic_regexp: a regular expression which will match timestamps in this format
expected_timestep: a
dateutil.relativedelta.relativedelta
object which contains the expected timestep in files with this timestamp
- Returns
- Return type
Examples
>>> self.timestamp_definitions[len("2012")]["datetime_str"]
'%Y'
-
-
class
netcdf_scm.iris_cube_wrappers.
MarbleCMIP5Cube
[source]¶ Bases:
netcdf_scm.iris_cube_wrappers._CMIPCube
Cube which can be used with the
cmip5
directory on marble (identical to ETH Zurich’s archive).This directory structure is very similar, but not quite identical, to the recommended CMIP5 directory structure described in section 3.1 of the CMIP5 Data Reference Syntax.
-
areacell_var
¶ The name of the variable associated with the area of each gridbox.
If required, this is used to determine the area of each cell in a data file. For example, if our data file is
tas_Amon_HadCM3_rcp45_r1i1p1_200601.nc
thenareacell_var
can be used to work out the name of the associated cell area file. In some cases, it might be as simple as replacingtas
with the value ofareacell_var
.- Type
-
convert_scm_timeseries_cubes_to_openscmdata
(scm_timeseries_cubes, out_calendar=None)¶ Convert dictionary of SCM timeseries cubes to an
scmdata.ScmRun
- Parameters
- Returns
scmdata.ScmRun
containing the data from the SCM timeseries cubes- Return type
scmdata.ScmRun
- Raises
NotImplementedError – The (original) input data has dimensions other than time, latitude and longitude (so the data to convert has dimensions other than time).
-
dim_names
¶ Names of the dimensions in this cube
Here the names are the
standard_names
which means there can beNone
in the output.- Type
-
get_area_weights
(areacell_scmcube=None)¶ Get area weights for this cube
- Parameters
areacell_scmcube (
ScmCube
) –ScmCube
containing areacell data. IfNone
, we calculate the weights using iris.- Returns
Weights on the cube’s latitude-longitude grid.
- Return type
np.ndarray
- Raises
iris.exceptions.CoordinateMultiDimError – The cube’s co-ordinates are multi-dimensional and we don’t have cell area data.
ValueError – Area weights units are not as expected (contradict with
self._area_weights_units
).
-
get_data_directory
()¶ Get the path to a data file from self’s attributes.
This can take multiple forms, it may just return a previously set filepath attribute or it could combine a number of different metadata elements (e.g. model name, experiment name) to create the data path.
-
get_data_filename
()¶ Get the name of a data file from self’s attributes.
This can take multiple forms, it may just return a previously set filename attribute or it could combine a number of different metadata elements (e.g. model name, experiment name) to create the data name.
-
classmethod
get_data_reference_syntax
(**kwargs)¶ Get data reference syntax for this cube
-
get_filepath_from_load_data_from_identifiers_args
(**kwargs)[source]¶ Get the full filepath of the data to load from the arguments passed to
self.load_data_from_identifiers
.Full details about the identifiers are given in Section 2 of the CMIP5 Data Reference Syntax.
- Parameters
kwargs (str) – Identifiers to use to load the data
- Returns
The full filepath (path and name) of the file to load.
- Return type
- Raises
AttributeError – An input argument does not match with the cube’s data reference syntax
-
get_load_data_from_identifiers_args_from_filepath
(filepath)¶ Get the set of identifiers to use to load data from a filepath.
- Parameters
filepath (str) – The filepath from which to load the data.
- Returns
Set of arguments which can be passed to
self.load_data_from_identifiers
to load the data in the filepath.- Return type
- Raises
ValueError – Path and filename contradict each other
-
get_metadata_cube
(metadata_variable, cube=None)¶ Load a metadata cube from self’s attributes.
- Parameters
- Returns
instance of self which has been loaded from the file containing the metadata variable of interest.
- Return type
type(self)
- Raises
-
get_scm_timeseries
(**kwargs)¶ Get SCM relevant timeseries from
self
.- Parameters
**kwargs – Passed to
get_scm_timeseries_cubes()
- Returns
scmdata.ScmRun
instance with the data in thedata
attribute and metadata in themetadata
attribute.- Return type
scmdata.ScmRun
-
get_scm_timeseries_cubes
(lazy=False, **kwargs)¶ Get SCM relevant cubes
The effective areas used for each of the regions are added as auxiliary co-ordinates of each timeseries cube.
If global, Northern Hemisphere and Southern Hemisphere land cubes are calculated, then three auxiliary co-ordinates are also added to each cube:
land_fraction
,land_fraction_northern_hemisphere
andland_fraction_southern_hemisphere
. These co-ordinates document the area fraction that was considered to be land when the cubes were crunched i.e.land_fraction
is the fraction of the entire globe which was considered to be land,land_fraction_northern_hemisphere
is the fraction of the Northern Hemisphere which was considered to be land andland_fraction_southern_hemisphere
is the fraction of the Southern Hemisphere which was considered to be land.- Parameters
lazy (bool) – Should I process the data lazily? This can be slow as data has to be read off disk multiple times.
kwargs (any) – Passed to
get_scm_timeseries_weights()
- Returns
dict of str – Dictionary of cubes (region: cube key: value pairs), with latitude-longitude mean data as appropriate for each of the requested regions.
- Return type
- Raises
InvalidWeightsError – No valid weights are found for the requested regions
-
get_scm_timeseries_weights
(surface_fraction_cube=None, areacell_scmcube=None, regions=None, cell_weights=None, log_failure=False)¶ Get the scm timeseries weights
- Parameters
surface_fraction_cube (
ScmCube
, optional) – land surface fraction data which is used to determine whether a given gridbox is land or ocean. IfNone
, we try to load the land surface fraction automatically.areacell_scmcube (
ScmCube
, optional) – cell area data which is used to take the latitude-longitude mean of the cube’s data. IfNone
, we try to load this data automatically and if that fails we fall back ontoiris.analysis.cartography.area_weights
.regions (list[str]) – List of regions to use. If
None
thennetcdf_scm.regions.DEFAULT_REGIONS
is used.cell_weights ({'area-only', 'area-surface-fraction'}) –
How cell weights should be calculated. If
'area-surface-fraction'
, both cell area and its surface fraction will be used to weight the cell. If'area-only'
, only the cell’s area will be used to weight the cell (cells which do not belong to the region are nonetheless excluded). IfNone
, netCDF-SCM will guess whether land surface fraction weights should be included or not based on the data being processed. When guessing, for ocean data, netCDF-SCM will weight cells only by the horizontal area of the cell i.e. no land fraction (see Section L5 of Griffies et al., GMD, 2016, https://doi.org/10.5194/gmd-9-3231-2016). For land variables, netCDF-SCM will weight cells by both their horizontal area and their land surface fraction. “Yes, you do need to weight the output by land frac (sftlf is the CMIP variable name).” (Chris Jones, personal communication, 18 April 2020). For land variables, note that there seems to be no explicit guidance on this in Jones et al., GMD, 2016 (https://doi.org/10.5194/gmd-9-2853-2016).log_failure (bool) – Should regions which fail be logged? If not, failures are raised as warnings.
- Returns
dict of str – Dictionary of ‘region name’: weights, key: value pairs
- Return type
np.ndarray
Notes
Only regions which can be calculated are returned. If no regions can be calculated, an empty dictionary will be returned.
-
get_variable_constraint
()¶ Get the iris variable constraint to use when loading data with
self.load_data_from_identifiers
.- Returns
constraint to use which ensures that only the variable of interest is loaded.
- Return type
iris.Constraint
-
info
¶ Information about the cube’s source files
res["files"]
contains the files used to load the data in this cube.res["metadata"]
contains information for each of the metadata cubes used to load the data in this cube.- Returns
- Return type
-
lat_dim
¶ iris.coords.DimCoord
The latitude dimension of the data.
-
lat_dim_number
¶ The index which corresponds to the latitude dimension.
e.g. if latitude is the first dimension of the data, then
self.lat_dim_number
will be0
(Python is zero-indexed).- Type
-
lat_lon_shape
¶ 2D Tuple of
int
which gives shape of a lat-lon slice of the datae.g. if the cube’s shape is (4, 3, 5, 4) and its dimensions are (time, lat, depth, lon) then
cube.lat_lon_shape
will be(3, 4)
- Raises
AssertionError – No lat lon slice can be deduced (if this happens, please raise an issue at https://gitlab.com/netcdf-scm/netcdf-scm/issues so we can address your use case).
- Type
-
load_data_from_identifiers
(process_warnings=True, **kwargs)¶ Load data using key identifiers.
The identifiers are used to determine the path of the file to load. The file is then loaded into an iris cube which can be accessed through
self.cube
.- Parameters
process_warnings (bool) – Should I process warnings to add e.g. missing metadata information?
kwargs (any) – Arguments which can then be processed by
self.get_filepath_from_load_data_from_identifiers_args
andself.get_variable_constraint
to determine the full filepath of the file to load and the variable constraint to use.
-
load_data_from_path
(filepath, process_warnings=True)¶ Load data from a path.
-
load_data_in_directory
(directory=None, process_warnings=True)¶ Load data in a directory.
The data is loaded into an iris cube which can be accessed through
self.cube
.Initially, this method is intended to only be used to load data when it is saved in a number of different timeslice files e.g.:
tas_Amon_HadCM3_rcp45_r1i1p1_200601-203012.nc
tas_Amon_HadCM3_rcp45_r1i1p1_203101-203512.nc
tas_Amon_HadCM3_rcp45_r1i1p1_203601-203812.nc
It is not intended to be used to load multiple different variables or non-continuous timeseries. These use cases could be added in future, but are not required yet so have not been included.
Note that this function removes any attributes which aren’t common between the loaded cubes. In general, we have found that this mainly means
creation_date
,tracking_id
andhistory
are deleted. If unsure, please check.- Parameters
- Raises
ValueError – If the files in the directory are not from the same run (i.e. their filenames are not identical except for the timestamp) or if the files don’t form a continuous timeseries.
-
lon_dim
¶ iris.coords.DimCoord
The longitude dimension of the data.
-
lon_dim_number
¶ The index which corresponds to the longitude dimension.
e.g. if longitude is the third dimension of the data, then
self.lon_dim_number
will be2
(Python is zero-indexed).- Type
-
netcdf_scm_realm
¶ The realm in which netCDF-SCM thinks the data belongs.
This is used to make decisions about how to take averages of the data and where to find metadata variables.
If it is not sure, netCDF-SCM will guess that the data belongs to the ‘atmosphere’ realm.
- Type
-
root_dir
= None¶ The root directory of the database i.e. where the cube should start its path
e.g.
/home/users/usertim/cmip5_25x25
- Type
-
surface_fraction_var
¶ The name of the variable associated with the surface fraction in each gridbox.
If required, this is used when looking for the surface fraction file which belongs to a given data file. For example, if our data file is
tas_Amon_HadCM3_rcp45_r1i1p1_200601.nc
thensurface_fraction_var
can be used to work out the name of the associated surface fraction file. In some cases, it might be as simple as replacingtas
with the value ofsurface_fraction_var
.- Type
-
table_name_for_metadata_vars
¶ The name of the ‘table’ in which metadata variables can be found.
For example,
fx
orOfx
.We wrap this as a property as table typically means
table_id
but is sometimes referred to in other ways e.g. asmip_table
in CMIP5.- Type
-
time_dim
¶ iris.coords.DimCoord
The time dimension of the data.
-
time_dim_number
¶ The index which corresponds to the time dimension.
e.g. if time is the first dimension of the data, then
self.time_dim_number
will be0
(Python is zero-indexed).- Type
-
time_period
= None¶ The time period for which we want to load data
If
None
, this information isn’t included in the filename which is useful for loading metadata files which don’t have a relevant time period.- Type
-
time_period_regex
¶ Regular expression which captures the timeseries identifier in input data files.
For help on regular expressions, see the documentation of Python’s re module.
- Type
_sre.SRE_Pattern
-
timestamp_definitions
¶ Definition of valid timestamp information and corresponding key values.
This follows the CMIP standards where time strings must be one of the following: YYYY, YYYYMM, YYYYMMDD, YYYYMMDDHH or one of the previous combined with a hyphen e.g. YYYY-YYYY.
Each key in the definitions dictionary is the length of the timestamp. Each value is itself a dictionary, with keys:
datetime_str: the string required to convert a timestamp of this length into a datetime using
datetime.datetime.strptime
generic_regexp: a regular expression which will match timestamps in this format
expected_timestep: a
dateutil.relativedelta.relativedelta
object which contains the expected timestep in files with this timestamp
- Returns
- Return type
Examples
>>> self.timestamp_definitions[len("2012")]["datetime_str"]
'%Y'
-
-
class
netcdf_scm.iris_cube_wrappers.
ScmCube
[source]¶ Bases:
object
Class for processing netCDF files for use with simple climate models.
-
areacell_var
¶ The name of the variable associated with the area of each gridbox.
If required, this is used to determine the area of each cell in a data file. For example, if our data file is
tas_Amon_HadCM3_rcp45_r1i1p1_200601.nc
thenareacell_var
can be used to work out the name of the associated cell area file. In some cases, it might be as simple as replacingtas
with the value ofareacell_var
.- Type
-
convert_scm_timeseries_cubes_to_openscmdata
(scm_timeseries_cubes, out_calendar=None)[source]¶ Convert dictionary of SCM timeseries cubes to an
scmdata.ScmRun
- Parameters
- Returns
scmdata.ScmRun
containing the data from the SCM timeseries cubes- Return type
scmdata.ScmRun
- Raises
NotImplementedError – The (original) input data has dimensions other than time, latitude and longitude (so the data to convert has dimensions other than time).
-
dim_names
¶ Names of the dimensions in this cube
Here the names are the
standard_names
which means there can beNone
in the output.- Type
-
get_area_weights
(areacell_scmcube=None)[source]¶ Get area weights for this cube
- Parameters
areacell_scmcube (
ScmCube
) –ScmCube
containing areacell data. IfNone
, we calculate the weights using iris.- Returns
Weights on the cube’s latitude-longitude grid.
- Return type
np.ndarray
- Raises
iris.exceptions.CoordinateMultiDimError – The cube’s co-ordinates are multi-dimensional and we don’t have cell area data.
ValueError – Area weights units are not as expected (contradict with
self._area_weights_units
).
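A sketch, assuming cube is an ScmCube with atmospheric data already loaded; 'areacella' is the usual CMIP name for the atmosphere cell-area variable:
>>> areacella_scmcube = cube.get_metadata_cube("areacella")
>>> area_weights = cube.get_area_weights(areacell_scmcube=areacella_scmcube)
>>> area_weights.shape  # matches the cube's latitude-longitude grid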
-
get_metadata_cube
(metadata_variable, cube=None)[source]¶ Load a metadata cube from self’s attributes.
- Parameters
- Returns
instance of self which has been loaded from the file containing the metadata variable of interest.
- Return type
type(self)
- Raises
-
get_scm_timeseries
(**kwargs)[source]¶ Get SCM relevant timeseries from
self
.- Parameters
**kwargs – Passed to
get_scm_timeseries_cubes()
- Returns
scmdata.ScmRun
instance with the data in thedata
attribute and metadata in themetadata
attribute.- Return type
scmdata.ScmRun
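A minimal usage sketch, assuming cube is an ScmCube (or subclass) with data already loaded:
>>> run = cube.get_scm_timeseries(regions=["World", "World|Land"])
>>> run.metadata  # metadata, including source file information
>>> run.timeseries()  # the data as a pandas DataFrame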
-
get_scm_timeseries_cubes
(lazy=False, **kwargs)[source]¶ Get SCM relevant cubes
The effective areas used for each of the regions are added as auxiliary co-ordinates of each timeseries cube.
If global, Northern Hemisphere and Southern Hemisphere land cubes are calculated, then three auxiliary co-ordinates are also added to each cube:
land_fraction
,land_fraction_northern_hemisphere
andland_fraction_southern_hemisphere
. These co-ordinates document the area fraction that was considered to be land when the cubes were crunched i.e.land_fraction
is the fraction of the entire globe which was considered to be land,land_fraction_northern_hemisphere
is the fraction of the Northern Hemisphere which was considered to be land andland_fraction_southern_hemisphere
is the fraction of the Southern Hemisphere which was considered to be land.- Parameters
lazy (bool) – Should I process the data lazily? This can be slow as data has to be read off disk multiple times.
kwargs (any) – Passed to
get_scm_timeseries_weights()
- Returns
dict of str – Dictionary of cubes (region: cube key: value pairs), with latitude-longitude mean data as appropriate for each of the requested regions.
- Return type
- Raises
InvalidWeightsError – No valid weights are found for the requested regions
-
get_scm_timeseries_weights
(surface_fraction_cube=None, areacell_scmcube=None, regions=None, cell_weights=None, log_failure=False)[source]¶ Get the scm timeseries weights
- Parameters
surface_fraction_cube (
ScmCube
, optional) – land surface fraction data which is used to determine whether a given gridbox is land or ocean. IfNone
, we try to load the land surface fraction automatically.areacell_scmcube (
ScmCube
, optional) – cell area data which is used to take the latitude-longitude mean of the cube’s data. IfNone
, we try to load this data automatically and if that fails we fall back ontoiris.analysis.cartography.area_weights
.regions (list[str]) – List of regions to use. If
None
thennetcdf_scm.regions.DEFAULT_REGIONS
is used.cell_weights ({'area-only', 'area-surface-fraction'}) –
How cell weights should be calculated. If
'area-surface-fraction'
, both cell area and its surface fraction will be used to weight the cell. If'area-only'
, only the cell’s area will be used to weight the cell (cells which do not belong to the region are nonetheless excluded). IfNone
, netCDF-SCM will guess whether land surface fraction weights should be included or not based on the data being processed. When guessing, for ocean data, netCDF-SCM will weight cells only by the horizontal area of the cell i.e. no land fraction (see Section L5 of Griffies et al., GMD, 2016, https://doi.org/10.5194/gmd-9-3231-2016). For land variables, netCDF-SCM will weight cells by both their horizontal area and their land surface fraction. “Yes, you do need to weight the output by land frac (sftlf is the CMIP variable name).” (Chris Jones, personal communication, 18 April 2020). For land variables, note that there seems to be no explicit guidance on this in Jones et al., GMD, 2016 (https://doi.org/10.5194/gmd-9-2853-2016).log_failure (bool) – Should regions which fail be logged? If not, failures are raised as warnings.
- Returns
dict of str – Dictionary of ‘region name’: weights, key: value pairs
- Return type
np.ndarray
Notes
Only regions which can be calculated are returned. If no regions can be calculated, an empty dictionary will be returned.
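A sketch, again assuming cube has data loaded; note that only regions whose weights can be calculated appear in the result:
>>> weights = cube.get_scm_timeseries_weights(
...     regions=["World", "World|Land", "World|Ocean"],
...     cell_weights="area-surface-fraction",
... )
>>> sorted(weights.keys())  # regions for which weights could be calculated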
-
info
¶ Information about the cube’s source files
res["files"]
contains the files used to load the data in this cube.res["metadata"]
contains information for each of the metadata cubes used to load the data in this cube.- Returns
- Return type
-
lat_dim
¶ iris.coords.DimCoord
The latitude dimension of the data.
-
lat_dim_number
¶ The index which corresponds to the latitude dimension.
e.g. if latitude is the first dimension of the data, then
self.lat_dim_number
will be0
(Python is zero-indexed).- Type
-
lat_lon_shape
¶ 2D Tuple of
int
which gives shape of a lat-lon slice of the datae.g. if the cube’s shape is (4, 3, 5, 4) and its dimensions are (time, lat, depth, lon) then
cube.lat_lon_shape
will be(3, 4)
- Raises
AssertionError – No lat lon slice can be deduced (if this happens, please raise an issue at https://gitlab.com/netcdf-scm/netcdf-scm/issues so we can address your use case).
- Type
-
load_data_from_path
(filepath, process_warnings=True)[source]¶ Load data from a path.
If you are using the
ScmCube
class directly, this method simply loads the path into an iris cube which can be accessed throughself.cube
.If implemented on a subclass of
ScmCube
, this method should:use
self.get_load_data_from_identifiers_args_from_filepath
to determine the suitable set of arguments to pass toself.load_data_from_identifiers
from the filepathload the data using
self.load_data_from_identifiers
as this method contains much better checks and helper components
-
load_data_in_directory
(directory=None, process_warnings=True)[source]¶ Load data in a directory.
The data is loaded into an iris cube which can be accessed through
self.cube
.Initially, this method is intended to only be used to load data when it is saved in a number of different timeslice files e.g.:
tas_Amon_HadCM3_rcp45_r1i1p1_200601-203012.nc
tas_Amon_HadCM3_rcp45_r1i1p1_203101-203512.nc
tas_Amon_HadCM3_rcp45_r1i1p1_203601-203812.nc
It is not intended to be used to load multiple different variables or non-continuous timeseries. These use cases could be added in future, but are not required yet so have not been included.
Note that this function removes any attributes which aren’t common between the loaded cubes. In general, we have found that this mainly means
creation_date
,tracking_id
andhistory
are deleted. If unsure, please check.- Parameters
- Raises
ValueError – If the files in the directory are not from the same run (i.e. their filenames are not identical except for the timestamp) or if the files don’t form a continuous timeseries.
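A sketch with a hypothetical directory holding timeslice files like those above:
>>> cube = ScmCube()
>>> cube.load_data_in_directory("/data/HadCM3/rcp45/Amon/tas/r1i1p1")  # hypothetical path
>>> cube.cube  # a single iris cube concatenated from the timeslice files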
-
lon_dim
¶ iris.coords.DimCoord
The longitude dimension of the data.
-
lon_dim_number
¶ The index which corresponds to the longitude dimension.
e.g. if longitude is the third dimension of the data, then
self.lon_dim_number
will be2
(Python is zero-indexed).- Type
-
netcdf_scm_realm
¶ The realm in which netCDF-SCM thinks the data belongs.
This is used to make decisions about how to take averages of the data and where to find metadata variables.
If it is not sure, netCDF-SCM will guess that the data belongs to the ‘atmosphere’ realm.
- Type
-
surface_fraction_var
¶ The name of the variable associated with the surface fraction in each gridbox.
If required, this is used when looking for the surface fraction file which belongs to a given data file. For example, if our data file is
tas_Amon_HadCM3_rcp45_r1i1p1_200601.nc
thensurface_fraction_var
can be used to work out the name of the associated surface fraction file. In some cases, it might be as simple as replacingtas
with the value ofsurface_fraction_var
.- Type
-
table_name_for_metadata_vars
¶ The name of the ‘table’ in which metadata variables can be found.
For example,
fx
orOfx
.We wrap this as a property as table typically means
table_id
but is sometimes referred to in other ways e.g. asmip_table
in CMIP5.- Type
-
time_dim
¶ iris.coords.DimCoord
The time dimension of the data.
-
time_dim_number
¶ The index which corresponds to the time dimension.
e.g. if time is the first dimension of the data, then
self.time_dim_number
will be0
(Python is zero-indexed).- Type
-
time_period_regex
¶ Regular expression which captures the timeseries identifier in input data files.
For help on regular expressions, see the documentation of Python’s re module.
- Type
_sre.SRE_Pattern
-
time_period_separator
= '-'¶ Character used to separate time period strings in the time period indicator in filenames.
e.g.
-
is the ‘time period separator’ in “2015-2030”.- Type
-
timestamp_definitions
¶ Definition of valid timestamp information and corresponding key values.
This follows the CMIP standards where time strings must be one of the following: YYYY, YYYYMM, YYYYMMDD, YYYYMMDDHH or one of the previous combined with a hyphen e.g. YYYY-YYYY.
Each key in the definitions dictionary is the length of the timestamp. Each value is itself a dictionary, with keys:
datetime_str: the string required to convert a timestamp of this length into a datetime using
datetime.datetime.strptime
generic_regexp: a regular expression which will match timestamps in this format
expected_timestep: a
dateutil.relativedelta.relativedelta
object which contains the expected timestep in files with this timestamp
- Returns
- Return type
Examples
>>> self.timestamp_definitions[len("2012")]["datetime_str"]
'%Y'
-
Weights API¶
Module which calculates the weights to be used when taking SCM-box averages
This typically requires considering both the fraction of each cell which is of the desired type (e.g. land or ocean) and the area of each cell. The combination of these two pieces of information creates the weights for each cell which are used when taking area-weighted means.
-
class
netcdf_scm.weights.
AreaSurfaceFractionWeightCalculator
(cube, **kwargs)[source]¶ Bases:
netcdf_scm.weights.CubeWeightCalculator
Calculates weights which are both area and surface fraction weighted
\[\begin{split}w(lat, lon) = a(lat, lon) \\times s(lat, lon)\end{split}\]where \(w(lat, lon)\) is the weight of the cell at given latitude and longitude, \(a\) is area of the cell and \(s\) is the surface fraction of the cell (e.g. fraction of ocean area for ocean based regions).
-
get_weights
(weights_names, log_failure=False)¶ Get a number of weights
- Parameters
weights_names (list of str) – List of weights to attempt to load/calculate.
log_failure (bool) – Should failures be logged? If not, failures are raised as warnings.
- Returns
dict of str – Dictionary where keys are weights names and values are
np.ndarray
of bool. The result only contains valid weights. Any invalid weights are dropped.- Return type
np.ndarray
Notes
This method handles all exceptions and will only return weights which can actually be calculated. If no weights could be calculated, an empty dictionary will be returned.
-
get_weights_array
(weights_name)¶ Get a single weights array
If the weights have previously been calculated the precalculated result is returned from the cache. Otherwise the appropriate WeightFunc is called with any kwargs specified in the constructor.
- Parameters
weights_name (str) – Region to get weights for
- Returns
Weights for the region specified by
weights_name
- Return type
ndarray[bool]
- Raises
InvalidWeightsError – If the cube has no data which matches the input weights or is invalid in any other way
-
get_weights_array_without_area_weighting
(weights_name)¶ Get a single normalised weights array without any consideration of area weighting
The weights are normalised to be in the range [0, 1]
- Parameters
weights_name (str) – Region to get weights for
- Returns
Weights, normalised to be in the range [0, 1]
- Return type
np.ndarray
- Raises
InvalidWeightsError – If the requested weights cannot be found or evaluated
ValueError – The retrieved weights are not normalised to the range [0, 1]
-
-
class
netcdf_scm.weights.
AreaWeightCalculator
(cube, **kwargs)[source]¶ Bases:
netcdf_scm.weights.CubeWeightCalculator
Calculates weights which are area weighted but surface fraction aware.
This means that any cells which have a surface fraction of zero will receive zero weight, otherwise cells are purely area weighted.
\[w(lat, lon) = \begin{cases} a(lat, lon), & s(lat, lon) > 0 \\ 0, & s(lat, lon) = 0 \end{cases}\]
where \(w(lat, lon)\) is the weight of the cell at the given latitude and longitude, \(a\) is the area of the cell and \(s\) is the surface fraction of the cell (e.g. fraction of ocean area for ocean-based regions).
-
get_weights
(weights_names, log_failure=False)¶ Get a number of weights
- Parameters
weights_names (list of str) – List of weights to attempt to load/calculate.
log_failure (bool) – Should failures be logged? If not, failures are raised as warnings.
- Returns
dict of str – Dictionary where keys are weights names and values are
np.ndarray
of bool. The result only contains valid weights. Any invalid weights are dropped.- Return type
np.ndarray
Notes
This method handles all exceptions and will only return weights which can actually be calculated. If no weights could be calculated, an empty dictionary will be returned.
-
get_weights_array
(weights_name)¶ Get a single weights array
If the weights have previously been calculated the precalculated result is returned from the cache. Otherwise the appropriate WeightFunc is called with any kwargs specified in the constructor.
- Parameters
weights_name (str) – Region to get weights for
- Returns
Weights for the region specified by
weights_name
- Return type
ndarray[bool]
- Raises
InvalidWeightsError – If the cube has no data which matches the input weights or is invalid in any other way
-
get_weights_array_without_area_weighting
(weights_name)¶ Get a single normalised weights array without any consideration of area weighting
The weights are normalised to be in the range [0, 1]
- Parameters
weights_name (str) – Region to get weights for
- Returns
Weights, normalised to be in the range [0, 1]
- Return type
np.ndarray
- Raises
InvalidWeightsError – If the requested weights cannot be found or evaluated
ValueError – The retrieved weights are not normalised to the range [0, 1]
-
-
class
netcdf_scm.weights.
CubeWeightCalculator
(cube, **kwargs)[source]¶ Bases:
abc.ABC
Computes weights for a given cube in a somewhat efficient manner.
Previously calculated weights are cached so each set of weights is only calculated once. This implementation trades off some additional memory overhead for the ability to generate arbitrary weights.
Adding new weights
Additional weights can be added to
netcdf_scm.weights.weights
. The values inweights
should be WeightFunc’s. A WeightFunc is a function which takes a ScmCube, CubeWeightCalculator and any additional keyword arguments. The function should return a numpy array with the same dimensionality as the ScmCube.These WeightFunc’s can be composed together to create more complex functionality using e.g.
multiply_weights
.
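A sketch of a custom WeightFunc registered under a hypothetical region name (the function body, the region name and the lat/lon handling are ours; only the signature and the registration point follow the description above):
>>> import numpy as np
>>> from netcdf_scm import weights as nsw
>>> def get_tropics_weights(weight_calculator, cube, **kwargs):
...     # hypothetical WeightFunc: 1 between 30S and 30N, 0 elsewhere
...     lat = cube.lat_dim.points
...     binary = ((lat >= -30) & (lat <= 30)).astype(float)
...     # broadcast along longitude to match a lat-lon slice of the data
...     # (assumes latitude comes first in lat_lon_shape, as in the example above)
...     return np.broadcast_to(binary[:, np.newaxis], cube.lat_lon_shape)
...
>>> nsw.weights["World|Tropics"] = get_tropics_weights  # hypothetical region name
-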
get_weights
(weights_names, log_failure=False)[source]¶ Get a number of weights
- Parameters
weights_names (list of str) – List of weights to attempt to load/calculate.
log_failure (bool) – Should failures be logged? If not, failures are raised as warnings.
- Returns
dict of str – Dictionary where keys are weights names and values are
np.ndarray
of bool. The result only contains valid weights. Any invalid weights are dropped.- Return type
np.ndarray
Notes
This method handles all exceptions and will only return weights which can actually be calculated. If no weights could be calculated, an empty dictionary will be returned.
-
get_weights_array
(weights_name)[source]¶ Get a single weights array
If the weights have previously been calculated the precalculated result is returned from the cache. Otherwise the appropriate WeightFunc is called with any kwargs specified in the constructor.
- Parameters
weights_name (str) – Region to get weights for
- Returns
Weights for the region specified by
weights_name
- Return type
ndarray[bool]
- Raises
InvalidWeightsError – If the cube has no data which matches the input weights or is invalid in any other way
-
get_weights_array_without_area_weighting
(weights_name)[source]¶ Get a single normalised weights array without any consideration of area weighting
The weights are normalised to be in the range [0, 1]
- Parameters
weights_name (str) – Region to get weights for
- Returns
Weights, normalised to be in the range [0, 1]
- Return type
np.ndarray
- Raises
InvalidWeightsError – If the requested weights cannot be found or evaluated
ValueError – The retrieved weights are not normalised to the range [0, 1]
-
-
exception
netcdf_scm.weights.
InvalidWeightsError
[source]¶ Bases:
Exception
Raised when a weight cannot be calculated.
This error usually propagates. For example, if a child weight used in the calculation of a parent weight fails, then the parent weight should also raise an InvalidWeightsError exception (unless it can be satisfactorily handled).
-
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
netcdf_scm.weights.
get_ar6_region_weights
(region)[source]¶ Get a function to calculate the weights for a given AR6 region
AR6 regions defined in Iturbide et al., 2020 https://essd.copernicus.org/preprints/essd-2019-258/
- Parameters
region (str) – AR6 region to extract
- Returns
WeightFunc which weights out everything except the specified area
- Return type
WeightFunc()
-
netcdf_scm.weights.
get_binary_nh_weights
(weight_calculator, cube, **kwargs)[source]¶ Get binary weights to only include the Northern Hemisphere
- Parameters
weight_calculator (
CubeWeightCalculator
) – Cube weight calculator from which to retrieve the weightscube (
ScmCube
) – Cube to create weights forkwargs (Any) – Ignored (required for compatibility with
CubeWeightCalculator
)
- Returns
Binary northern hemisphere weights
- Return type
np.ndarray
-
netcdf_scm.weights.
get_default_sftlf_cube
[source]¶ Load netCDF-SCM’s default (last resort) surface land fraction cube
-
netcdf_scm.weights.
get_land_weights
(weight_calculator, cube, sftlf_cube=None, **kwargs)[source]¶ Get the land weights
The weights are always adjusted to have units of percentage. If the units are detected to be fraction rather than percentage, they will be adjusted automatically and a warning will be raised.
If the default sftlf cube is used, it is regridded onto
cube
’s grid using a linear interpolation. We hope to use an area-weighted regridding in future but at the moment its performance is not good enough to be put into production (approximately 100x slower than the linear interpolation regridding).- Parameters
weight_calculator (
CubeWeightCalculator
) – Cube weight calculator from which to retrieve the weightscube (
ScmCube
) – Cube to create weights forsftlf_cube (
ScmCube
) – Cube containing the surface land-fraction datakwargs (Any) – Ignored (required for compatibility with
CubeWeightCalculator
)
- Returns
Land weights
- Return type
np.ndarray
- Raises
AssertionError – The land weights are incompatible with the cube’s lat-lon grid
-
netcdf_scm.weights.
get_natural_earth_50m_scale_region_weights
(region)[source]¶ Get a function to calculate the weights for a given Natural Earth defined region
We use the 50m scale from Natural Earth and the implementation provided by regionmask.
- Parameters
region (str) – Natural Earth region to extract
- Returns
WeightFunc which weights out everything except the specified area
- Return type
WeightFunc()
-
netcdf_scm.weights.
get_nh_weights
(weight_calculator, cube, **kwargs)[source]¶ Get weights to only include the Northern Hemisphere
- Parameters
weight_calculator (
CubeWeightCalculator
) – Cube weight calculator from which to retrieve the weightscube (
ScmCube
) – Cube to create weights forkwargs (Any) – Ignored (required for compatibility with
CubeWeightCalculator
)
- Returns
Northern hemisphere weights
- Return type
np.ndarray
-
netcdf_scm.weights.
get_ocean_weights
(weight_calculator, cube, sftof_cube=None, **kwargs)[source]¶ Get the ocean weights
The weights are always adjusted to have units of percentage.
- Parameters
weight_calculator (
CubeWeightCalculator
) – Cube weight calculator from which to retrieve the weightscube (
ScmCube
) – Cube to create weights forsftof_cube (
ScmCube
) – Cube containing the surface ocean-fraction datakwargs (Any) – Ignored (required for compatibility with
CubeWeightCalculator
)
- Returns
Ocean weights
- Return type
np.ndarray
- Raises
AssertionError – The ocean weights are incompatible with the cube’s lat-lon grid
-
netcdf_scm.weights.
get_sh_weights
(weight_calculator, cube, **kwargs)[source]¶ Get weights to only include the Southern Hemisphere
- Parameters
weight_calculator (
CubeWeightCalculator
) – Cube weight calculator from which to retrieve the weightscube (
ScmCube
) – Cube to create weights forkwargs (Any) – Ignored (required for compatibility with
CubeWeightCalculator
)
- Returns
Southern hemisphere weights
- Return type
np.ndarray
-
netcdf_scm.weights.
get_weights_for_area
(lower_lat, left_lon, upper_lat, right_lon)[source]¶ Weights a subset of the globe using latitudes and longitudes (in degrees East)
Iris’ standard behaviour is to include any point whose bounds overlap with the given ranges e.g. if the range is (0, 130) then a cell whose bounds were (-90, 5) would be included even if its point were -42.5.
This can be altered with the
ignore_bounds
keyword argument tocube.intersection
. In this case only cells whose points lie within the range are included so if the range is (0, 130) then a cell whose bounds were (-90, 5) would be excluded if its point were -42.5.Here we follow the
ignore_bounds=True
behaviour (i.e. a cell is only included if its point lies within the specified range). If we wanted to include a cell only when its entire bounding box lay within the range, we would need to tweak things; given this behaviour isn't available in iris, it seems to be an unusual way to do intersection, so we haven't implemented it. Circular coordinates (longitude) can cross 0°E.
- Parameters
- Returns
WeightFunc which weights out everything except the specified area
- Return type
WeightFunc()
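A sketch of a box weight over an arbitrary region (arguments are in degrees, longitudes in degrees East):
>>> from netcdf_scm.weights import get_weights_for_area
>>> box = get_weights_for_area(-10, 90, 10, 150)  # lower_lat, left_lon, upper_lat, right_lon
>>> # box is a WeightFunc; it is evaluated against a cube when weights are requested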
-
netcdf_scm.weights.
get_world_weights
(weight_calculator, cube, **kwargs)[source]¶ Get weights for the world
- Parameters
weight_calculator (
CubeWeightCalculator
) – Cube weight calculator from which to retrieve the weightscube (
ScmCube
) – Cube to create weights forkwargs (Any) – Ignored (required for compatibility with
CubeWeightCalculator
)
- Returns
Weights which can be used for the world mean calculation
- Return type
np.ndarray
-
netcdf_scm.weights.
multiply_weights
(weight_a, weight_b)[source]¶ Take the product of two weights
- Parameters
weight_a (str or WeightFunc) – If a string is provided, the weights specified by the string are retrieved. Otherwise the WeightFunc is evaluated at runtime
weight_b (str or WeightFunc) – If a string is provided, the weights specified by the string are retrieved. Otherwise the WeightFunc is evaluated at runtime
- Returns
WeightFunc which multiplies the input weights
- Return type
WeightFunc()
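For example, a Northern Hemisphere land weight can be composed from two existing weights (a sketch; the region names are those used elsewhere in this documentation):
>>> from netcdf_scm.weights import multiply_weights
>>> nh_land = multiply_weights("World|Northern Hemisphere", "World|Land")
>>> # nh_land is itself a WeightFunc, evaluated when the weights are requested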
-
netcdf_scm.weights.
subtract_weights
(weights_to_subtract, subtract_from)[source]¶ Subtract weights from some other number
Useful e.g. to convert from fraction of land to fraction of ocean (where ocean fractions are 1 - land fractions)
- Parameters
- Returns
WeightFunc which subtracts the input weights from
subtract_from
- Return type
WeightFunc()
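A sketch of the land-to-ocean conversion mentioned above. We subtract from 100 on the assumption that the land weights are expressed as percentages (as described for get_land_weights); with fractional weights the value would be 1:
>>> from netcdf_scm.weights import subtract_weights
>>> ocean = subtract_weights("World|Land", 100)  # assumes percentage units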
Citing API¶
Helper tools for citing Coupled Model Intercomparison Project data
-
netcdf_scm.citing.
check_licenses
(scmruns)[source]¶ Check datasets for non-standard licenses
Non-standard licenses result in a warning
- Parameters
scmruns (list of
scmdata.ScmRun
) – Datasets to check the licenses of
- Returns
Datasets with non-standard licenses
- Return type
list of
scmdata.ScmRun
-
netcdf_scm.citing.
get_citation_tables
(database)[source]¶ Get citation tables for a given set of CMIP data
- Parameters
database (list of
ScmRun
) – Set of CMIP data for which we want to create citation tables
- Returns
dict of str – Dictionary containing the citation table and bibtex references for each MIP era used in
database
- Return type
Union[List, pd.DataFrame]
- Raises
ValueError – Any
ScmRun
in
database
has a
mip_era
other than “CMIP5” or “CMIP6”
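A minimal sketch of how these helpers might be used together (here crunched_runs, a list of scmdata.ScmRun instances loaded from crunched output, is assumed to exist already):
from netcdf_scm.citing import check_licenses, get_citation_tables

# warns about, and returns, any datasets with non-standard licenses
non_standard = check_licenses(crunched_runs)
# citation table and bibtex references for each MIP era in the data
citation_tables = get_citation_tables(crunched_runs)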
Command-line interface¶
netcdf-scm¶
NetCDF-SCM’s command-line interface
netcdf-scm [OPTIONS] COMMAND [ARGS]...
Options
-
--log-level
<log_level>
¶ - Options
DEBUG | INFO | WARNING | ERROR | EXCEPTION | CRITICAL
crunch¶
Crunch data in src
to netCDF-SCM .nc
files in dst
.
src
is searched recursively and netcdf-scm will attempt to crunch all the files
found. The directory structure in src
will be mirrored in dst
.
Failures and warnings are recorded and written into a text file in dst
. We
recommend examining this file using a file analysis tool such as grep
. We
often use the command grep "WARNING\|INFO\|ERROR" <log-file>
.
crunch_contact
is written into the output .nc
files’ crunch_contact
attribute.
netcdf-scm crunch [OPTIONS] SRC DST CRUNCH_CONTACT
Options
-
--drs
<drs>
¶ Data reference syntax to use for crunching.
- Default
Scm
- Options
Scm | MarbleCMIP5 | CMIP6Input4MIPs | CMIP6Output
-
--regexp
<regexp>
Regular expression to apply to file directory (only crunches matches). Be careful: if you use a very complex regexp, directory sorting can be extremely slow (see e.g. discussion at https://stackoverflow.com/a/5428712)!
- Default
^(?!.*(fx)).*$
-
--regions
<regions>
¶ Comma-separated regions to crunch.
- Default
World,World|Northern Hemisphere,World|Southern Hemisphere,World|Land,World|Ocean,World|Northern Hemisphere|Land,World|Southern Hemisphere|Land,World|Northern Hemisphere|Ocean,World|Southern Hemisphere|Ocean
-
--data-sub-dir
<data_sub_dir>
¶ Sub-directory of
dst
to save data in.
- Default
netcdf-scm-crunched
-
-f
,
--force
,
--do-not-force
¶
Overwrite any existing files.
- Default
False
-
--small-number-workers
<small_number_workers>
¶ Maximum number of workers to use when crunching files.
- Default
10
-
--small-threshold
<small_threshold>
¶ Maximum number of data points (in millions) in a file for it to be processed in parallel with
small-number-workers
- Default
50.0
-
--medium-number-workers
<medium_number_workers>
¶ Maximum number of workers to use when crunching files.
- Default
3
-
--medium-threshold
<medium_threshold>
¶ Maximum number of data points (in millions) in a file for it to be processed in parallel with
medium-number-workers
- Default
120.0
-
--force-lazy-threshold
<force_lazy_threshold>
¶ Maximum number of data points (in millions) in a file for it to be processed in memory
- Default
1000.0
-
--cell-weights
<cell_weights>
¶ How to weight cells when calculating aggregates. If ‘area-surface-fraction’, land surface fraction weights will be included when taking cell means. If ‘area-only’, land surface fraction weights will not be included when taking cell means, hence cells will only be weighted by their area. If nothing is provided, netCDF-SCM will guess whether land surface fraction weights should be included or not based on the data being processed. See
netcdf_scm.iris_cube_wrappers.ScmCube.get_scm_timeseries_weights()
for more details.
- Options
area-only | area-surface-fraction
Arguments
-
SRC
¶
Required argument
-
DST
¶
Required argument
-
CRUNCH_CONTACT
¶
Required argument
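For example, a hypothetical invocation (the paths and contact details are placeholders):
netcdf-scm crunch ~/cmip6-raw ~/processed "Jane Doe <jane.doe@example.com>" --drs CMIP6Output --regions "World,World|Ocean" -f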
stitch¶
Stitch netCDF-SCM .nc
files together and write out in the specified format.
SRC
is searched recursively and netcdf-scm will attempt to stitch all the
files found. Output is written in DST
.
STITCH_CONTACT
is written into the header of the output files.
netcdf-scm stitch [OPTIONS] SRC DST STITCH_CONTACT
Options
-
--regexp
<regexp>
Regular expression to apply to file directory (only stitches matches). Be careful: if you use a very complex regexp, directory sorting can be extremely slow (see e.g. discussion at https://stackoverflow.com/a/5428712)!
- Default
^(?!.*(fx)).*$
-
--prefix
<prefix>
¶ Prefix to apply to output file names (not paths).
-
--out-format
<out_format>
Format to re-write crunched data into. The time operation conventions follow those in Pymagicc.
- Default
mag-files
- Options
mag-files | mag-files-average-year-start-year | mag-files-average-year-mid-year | mag-files-average-year-end-year | mag-files-point-start-year | mag-files-point-mid-year | mag-files-point-end-year | magicc-input-files | magicc-input-files-average-year-start-year | magicc-input-files-average-year-mid-year | magicc-input-files-average-year-end-year | magicc-input-files-point-start-year | magicc-input-files-point-mid-year | magicc-input-files-point-end-year | tuningstrucs-blend-model
-
--drs
<drs>
¶ Data reference syntax to use to decipher paths. This is required to ensure the output folders match the input data reference syntax.
- Default
None
- Options
None | MarbleCMIP5 | CMIP6Input4MIPs | CMIP6Output
-
-f
,
--force
,
--do-not-force
¶
Overwrite any existing files.
- Default
False
-
--number-workers
<number_workers>
Number of workers (threads) to use when stitching.
- Default
4
-
--target-units-specs
<target_units_specs>
CSV file containing target units for stitched variables.
-
--normalise
<normalise>
¶ How to normalise the data relative to piControl (if not provided, no normalisation is performed).
- Options
31-yr-mean-after-branch-time | 21-yr-running-mean | 21-yr-running-mean-dedrift | 30-yr-running-mean | 30-yr-running-mean-dedrift
Arguments
-
SRC
¶
Required argument
-
DST
¶
Required argument
-
STITCH_CONTACT
¶
Required argument
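For example, a hypothetical invocation which stitches previously crunched output and normalises it against piControl (paths and contact details are placeholders):
netcdf-scm stitch ~/processed/netcdf-scm-crunched ~/stitched "Jane Doe <jane.doe@example.com>" --drs CMIP6Output --out-format mag-files --normalise 21-yr-running-mean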
wrangle¶
Wrangle netCDF-SCM .nc
files into other formats and directory structures.
src
is searched recursively and netcdf-scm will attempt to wrangle all the
files found.
wrangle_contact
is written into the header of the output files.
netcdf-scm wrangle [OPTIONS] SRC DST WRANGLE_CONTACT
Options
-
--regexp
<regexp>
Regular expression to apply to file directory (only wrangles matches). Be careful: if you use a very complex regexp, directory sorting can be extremely slow (see e.g. discussion at https://stackoverflow.com/a/5428712)!
- Default
^(?!.*(fx)).*$
-
--prefix
<prefix>
¶ Prefix to apply to output file names (not paths).
-
--out-format
<out_format>
¶ Format to re-write crunched data into. The time operation conventions follow those in Pymagicc.
- Default
mag-files
- Options
mag-files | mag-files-average-year-start-year | mag-files-average-year-mid-year | mag-files-average-year-end-year | mag-files-point-start-year | mag-files-point-mid-year | mag-files-point-end-year | magicc-input-files | magicc-input-files-average-year-start-year | magicc-input-files-average-year-mid-year | magicc-input-files-average-year-end-year | magicc-input-files-point-start-year | magicc-input-files-point-mid-year | magicc-input-files-point-end-year | tuningstrucs-blend-model
-
--drs
<drs>
¶ Data reference syntax to use to decipher paths. This is required to ensure the output folders match the input data reference syntax.
- Default
None
- Options
None | MarbleCMIP5 | CMIP6Input4MIPs | CMIP6Output
-
-f
,
--force
,
--do-not-force
¶
Overwrite any existing files.
- Default
False
-
--number-workers
<number_workers>
Number of workers (threads) to use when wrangling.
- Default
4
-
--target-units-specs
<target_units_specs>
CSV file containing target units for wrangled variables.
Arguments
-
SRC
¶
Required argument
-
DST
¶
Required argument
-
WRANGLE_CONTACT
¶
Required argument
Crunching API¶
Module for crunching raw netCDF data into netCDF-SCM netCDF files
Definitions API¶
Miscellaneous definitions used in netCDF-SCM
-
netcdf_scm.definitions.
NAME_COMPONENTS_SEPARATOR
= '_'¶ Character assumed to separate different components within a filename
For example, if we come across a filename like ‘tas_r1i1p1f1_UoM-Fancy’ then we assume that ‘tas’, ‘r1i1p1f1’ and ‘UoM-Fancy’ all refer to different bits of metadata which are encoded within the filename.
- Type
str
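For example:
from netcdf_scm.definitions import NAME_COMPONENTS_SEPARATOR

"tas_r1i1p1f1_UoM-Fancy".split(NAME_COMPONENTS_SEPARATOR)
# ['tas', 'r1i1p1f1', 'UoM-Fancy']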
Errors API¶
netCDF-SCM’s custom error handling
-
exception
netcdf_scm.errors.
NoLicenseInformationError
[source]¶ Bases:
AttributeError
Exception raised when a dataset contains no license information
-
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
exception
netcdf_scm.errors.
NonStandardLicenseError
[source]¶ Bases:
ValueError
Exception raised when a dataset contains a non-standard license
For example, if a CMIP6 dataset does not contain a Creative Commons Attribution ShareAlike 4.0 International License
-
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
IO API¶
Input and output from netCDF-SCM’s netCDF format
-
netcdf_scm.io.
get_scmcube_helper
(drs)[source]¶ Get ScmCube helper for a given data reference syntax
- Parameters
drs (str) – Data reference syntax to get the helper cube for
- Returns
Instance of sub-class of
netcdf_scm.iris_cube_wrappers.ScmCube
which matches the input data reference syntax
- Return type
- Raises
NotImplementedError –
drs
is equal to "None"
KeyError –
drs
is unrecognised
-
netcdf_scm.io.
load_mag_file
(infile, drs)[source]¶ Load
.MAG
file with automatic infilling of metadata if possible
- Parameters
infile (str) – Filepath from which to load the data
drs (str) – Data reference syntax which applies to this file
- Returns
pymagicc.io.MAGICCData
with the data and metadata contained in the file.
- Return type
pymagicc.io.MAGICCData
- Warns
UserWarning – Some or all of the metadata couldn’t be determined from
infile
with the given
drs
.
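A minimal sketch (the filename is hypothetical; drs must match the data reference syntax the file was written with):
from netcdf_scm.io import load_mag_file

mag = load_mag_file("tas_Amon_CanESM2_1pctCO2_r1i1p1.MAG", "MarbleCMIP5")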
Miscellaneous Readers API¶
Miscellaneous readers for files which can’t otherwise be read
-
netcdf_scm.misc_readers.
read_cmip6_concs_gmnhsh
(filepath, region_coord_name='sector')[source]¶ Read CMIP6 concentrations global and hemispheric mean data
- Parameters
filepath (str) – Filepath from which to load the data
region_coord_name (str) – Name of the co-ordinate which contains the region information
- Returns
scmdata.ScmRun
containing the global and hemispheric mean data
- Return type
scmdata.ScmRun
- Raises
AssertionError – Defensive assertion: the code is being used in an unexpected way
Normalisation API¶
Normalisation handling
Within netCDF-SCM, ‘normalisation’ refers to taking anomalies from some set of reference values. For example, subtracting a pre-industrial control experiment’s 21-year running mean from the results of a projection experiment.
-
netcdf_scm.normalisation.
get_normaliser
(key)[source]¶ Get the appropriate normaliser for a given key
- Parameters
key (str) – Key which specifies the type of normaliser to get
- Returns
Normaliser appropriate for
key
- Return type
- Raises
ValueError –
key
cannot be mapped to a known normaliser
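For example (the keys mirror the --normalise options of the stitch command):
from netcdf_scm.normalisation import get_normaliser

normaliser = get_normaliser("21-yr-running-mean")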
Base API¶
Base class for normalisation operations
-
class
netcdf_scm.normalisation.base.
Normaliser
[source]¶ Bases:
abc.ABC
Base class for normalising operations
-
get_reference_values
(indata, picontrol, picontrol_branching_time)[source]¶ Get reference values for an experiment from its equivalent piControl experiment
- Parameters
indata (
scmdata.ScmRun
) – Experiment to calculate reference values for
picontrol (
scmdata.ScmRun
) – Pre-industrial control run data
picontrol_branching_time (
datetime.datetime
) – The branching time in the pre-industrial experiment. It is assumed that the first timepoint in
indata
follows immediately from this branching time.
- Returns
Reference values with the same index and columns as
indata
- Return type
pd.DataFrame
- Raises
ValueError – The branching time data is not in
picontrol
data
NotImplementedError – The normalisation method is not recognised
-
method_name
¶ Name of the method used for normalisation
This string is included in the metadata of normalised data/files.
- Type
str
-
normalise_against_picontrol
(indata, picontrol, picontrol_branching_time)[source]¶ Normalise data against picontrol
- Parameters
indata (
scmdata.ScmRun
) – Data to normalise
picontrol (
scmdata.ScmRun
) – Pre-industrial control run data
picontrol_branching_time (
datetime.datetime
) – The branching time in the pre-industrial experiment. It is assumed that the first timepoint in
indata
follows immediately from this branching time.
- Returns
Normalised data including metadata about the file which was used for normalisation and the normalisation method
- Return type
scmdata.ScmRun
- Raises
NotImplementedError – Normalisation is being done against a timeseries other than piControl
ValueError – The branching time data is not in
picontrol
data
NotImplementedError – The normalisation method is not recognised
-
After branch time mean API¶
Module for the normaliser which calculates anomalies from a mean of a fixed number of years in the pre-industrial control run
-
class
netcdf_scm.normalisation.after_branch_time_mean.
AfterBranchTimeMean
[source]¶ Bases:
netcdf_scm.normalisation.base.Normaliser
Normaliser which calculates anomalies from a mean of a fixed number of years after the branch time in the pre-industrial control run
At present, only a 31-year mean after the branch time is implemented.
-
get_reference_values
(indata, picontrol, picontrol_branching_time)¶ Get reference values for an experiment from its equivalent piControl experiment
- Parameters
indata (
scmdata.ScmRun
) – Experiment to calculate reference values for
picontrol (
scmdata.ScmRun
) – Pre-industrial control run data
picontrol_branching_time (
datetime.datetime
) – The branching time in the pre-industrial experiment. It is assumed that the first timepoint in
indata
follows immediately from this branching time.
- Returns
Reference values with the same index and columns as
indata
- Return type
pd.DataFrame
- Raises
ValueError – The branching time data is not in
picontrol
data
NotImplementedError – The normalisation method is not recognised
-
method_name
¶ Name of the method used for normalisation
This string is included in the metadata of normalised data/files.
- Type
str
-
normalise_against_picontrol
(indata, picontrol, picontrol_branching_time)¶ Normalise data against picontrol
- Parameters
indata (
scmdata.ScmRun
) – Data to normalise
picontrol (
scmdata.ScmRun
) – Pre-industrial control run data
picontrol_branching_time (
datetime.datetime
) – The branching time in the pre-industrial experiment. It is assumed that the first timepoint in
indata
follows immediately from this branching time.
- Returns
Normalised data including metadata about the file which was used for normalisation and the normalisation method
- Return type
scmdata.ScmRun
- Raises
NotImplementedError – Normalisation is being done against a timeseries other than piControl
ValueError – The branching time data is not in
picontrol
data
NotImplementedError – The normalisation method is not recognised
-
Running mean API¶
Module for the normaliser which calculates anomalies from a running mean in the pre-industrial control run
-
class
netcdf_scm.normalisation.running_mean.
NormaliserRunningMean
(nyears=21)[source]¶ Bases:
netcdf_scm.normalisation.base.Normaliser
Normaliser which calculates anomalies from a running mean in the pre-industrial control run
Each normalisation value is an n-year mean, centred on the equivalent point in the pre-industrial control simulation. If there is insufficient data to create a full n-year window at the edge of the simulation then a linear extrapolation of the running-mean is used to extend the normalisation values to cover the required full range.
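As an illustration only (this is not the package’s implementation, which operates on scmdata.ScmRun objects; here annual-mean values and a single global linear fit for the edge extrapolation are assumed):
import numpy as np

def running_mean_reference(values, nyears=21):
    # centred n-year running mean over all full windows
    half = nyears // 2
    core = np.convolve(values, np.ones(nyears) / nyears, mode="valid")
    years = np.arange(len(values))
    core_years = years[half:len(values) - half]
    # linear fit to the running mean, used to extend it beyond the edges
    slope, intercept = np.polyfit(core_years, core, 1)
    out = np.interp(years, core_years, core)
    edges = (years < core_years[0]) | (years > core_years[-1])
    out[edges] = slope * years[edges] + intercept
    return out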
-
get_reference_values
(indata, picontrol, picontrol_branching_time)¶ Get reference values for an experiment from its equivalent piControl experiment
- Parameters
indata (
scmdata.ScmRun
) – Experiment to calculate reference values for
picontrol (
scmdata.ScmRun
) – Pre-industrial control run data
picontrol_branching_time (
datetime.datetime
) – The branching time in the pre-industrial experiment. It is assumed that the first timepoint in
indata
follows immediately from this branching time.
- Returns
Reference values with the same index and columns as
indata
- Return type
pd.DataFrame
- Raises
ValueError – The branching time data is not in
picontrol
data
NotImplementedError – The normalisation method is not recognised
-
method_name
¶ Name of the method used for normalisation
This string is included in the metadata of normalised data/files.
- Type
str
-
normalise_against_picontrol
(indata, picontrol, picontrol_branching_time)¶ Normalise data against picontrol
- Parameters
indata (
scmdata.ScmRun
) – Data to normalise
picontrol (
scmdata.ScmRun
) – Pre-industrial control run data
picontrol_branching_time (
datetime.datetime
) – The branching time in the pre-industrial experiment. It is assumed that the first timepoint in
indata
follows immediately from this branching time.
- Returns
Normalised data including metadata about the file which was used for normalisation and the normalisation method
- Return type
scmdata.ScmRun
- Raises
NotImplementedError – Normalisation is being done against a timeseries other than piControl
ValueError – The branching time data is not in
picontrol
data
NotImplementedError – The normalisation method is not recognised
-
Running mean de-drift API¶
Module for the normaliser which only removes drift in the pre-industrial control run (drift is calculated using a running-mean)
-
class
netcdf_scm.normalisation.running_mean_dedrift.
NormaliserRunningMeanDedrift
(nyears=21)[source]¶ Bases:
netcdf_scm.normalisation.running_mean.NormaliserRunningMean
Normaliser which calculates drift in the pre-industrial control using a running mean
Each normalisation value is the change in an n-year mean with respect to the running mean at the branch point. This means that the reference values are always zero in their first timestep. Each point is centred on the equivalent point in the pre-industrial control simulation.
If there is insufficient data to create a full n-year window at the edge of the simulation then a linear extrapolation of the running-mean is used to extend the normalisation values to cover the required full range.
-
get_reference_values
(indata, picontrol, picontrol_branching_time)¶ Get reference values for an experiment from its equivalent piControl experiment
- Parameters
indata (
scmdata.ScmRun
) – Experiment to calculate reference values for
picontrol (
scmdata.ScmRun
) – Pre-industrial control run data
picontrol_branching_time (
datetime.datetime
) – The branching time in the pre-industrial experiment. It is assumed that the first timepoint in
indata
follows immediately from this branching time.
- Returns
Reference values with the same index and columns as
indata
- Return type
pd.DataFrame
- Raises
ValueError – The branching time data is not in
picontrol
data
NotImplementedError – The normalisation method is not recognised
-
method_name
¶ Name of the method used for normalisation
This string is included in the metadata of normalised data/files.
- Type
str
-
normalise_against_picontrol
(indata, picontrol, picontrol_branching_time)¶ Normalise data against picontrol
- Parameters
indata (
scmdata.ScmRun
) – Data to normalise
picontrol (
scmdata.ScmRun
) – Pre-industrial control run data
picontrol_branching_time (
datetime.datetime
) – The branching time in the pre-industrial experiment. It is assumed that the first timepoint in
indata
follows immediately from this branching time.
- Returns
Normalised data including metadata about the file which was used for normalisation and the normalisation method
- Return type
scmdata.ScmRun
- Raises
NotImplementedError – Normalisation is being done against a timeseries other than piControl
ValueError – The branching time data is not in
picontrol
data
NotImplementedError – The normalisation method is not recognised
-
Output API¶
Module for handling crunching output tracking
This module handles checking whether a file has already been crunched and if its source files have been updated since it was last crunched.
-
class
netcdf_scm.output.
OutputFileDatabase
(out_dir)[source]¶ Bases:
object
Holds a list of output files which have been written.
Also keeps track of the source files used to create each output file.
-
load_from_file
()[source]¶ Load database from
self.out_dir
- Returns
Handle to the loaded filepath
- Return type
- Raises
ValueError – The loaded file contains more than one entry for a given filename
-
Retractions API¶
Utilities for checking for retracted datasets
-
netcdf_scm.retractions.
check_depends_on_retracted
(mag_files, raise_on_mismatch=True, **kwargs)[source]¶ Check if a
.MAG
file was calculated from now retracted data
Notes
This queries external ESGF servers. Please limit the number of parallel requests.
- Parameters
mag_files (list of str) – List of
.MAG
files to check
raise_on_mismatch (bool) – If a file cannot be processed, should an error be raised? If
False
, an error message is logged instead.
**kwargs (any) – Passed to
check_retractions()
- Returns
Dataframe which describes the retracted status of each file in
mag_files
. The columns are:
”mag_file”: the files in
mag_files
”dependency_file”: file which the file in the “mag_file” column depends on (note that the
.MAG
files may have more than one dependency so they may appear more than once in the “mag_file” column)
”dependency_instance_id”: instance id (i.e. unique ESGF identifier) of the dependency file
”dependency_retracted”: whether the dependency file has been retracted or not (
True
if the file has been retracted)
The list of retracted
.MAG
files can then be accessed with e.g.
res.loc[res["dependency_retracted"], "mag_file"].unique()
- Return type
pd.DataFrame
- Raises
ValueError – The
.MAG
file is not based on CMIP6 data (retractions cannot be checked automatically for CMIP5 data with netCDF-SCM).
ValueError – Metadata about a
.MAG
file’s source is not included in the
.MAG
file.
-
netcdf_scm.retractions.
check_retracted_files
(filenames_or_dir, filename_filter='*.nc', **kwargs)[source]¶ Check if any files are retracted
Notes
This queries external ESGF servers. Please limit the number of parallel requests.
- Parameters
filenames_or_dir (list of str or str) – A list of filenames or a directory to check for any retractions. If a string is provided, it is assumed to reference a directory and any files within that directory matching the filename_filter will be checked.
filename_filter (str) – If a directory is passed all files matching the filter will be checked.
**kwargs (any) – Passed to
check_retracted()
- Returns
List of the retracted files
- Return type
list of str
-
netcdf_scm.retractions.
check_retractions
(instance_ids, esgf_query_batch_size=100, nworkers=8)[source]¶ Check a list of
instance_ids
for any retracted datasets
Notes
This queries external ESGF servers. Please limit the number of parallel requests.
- Parameters
instance_ids (list of str) – Datasets to check.
instance_id
is the unique identifier for a dataset, for example CMIP6.CMIP.CSIRO.ACCESS-ESM1-5.esm-hist.r1i1p1f1.Amon.rsut.gn.v20191128
esgf_query_batch_size (int) – Maximum number of ids to include in each query.
nworkers (int) – Number of workers to use for parallel queries to ESGF.
- Returns
A list of retracted
instance_ids
- Return type
list of str
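For example (a sketch; this sends queries to ESGF, so it should be used sparingly):
from netcdf_scm.retractions import check_retractions

retracted = check_retractions(
    ["CMIP6.CMIP.CSIRO.ACCESS-ESM1-5.esm-hist.r1i1p1f1.Amon.rsut.gn.v20191128"]
)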
Stitching API¶
Module for stitching netCDF-SCM netCDF files together
‘Stitching’ here means combining results from multiple experiments e.g. combining historical and scenario experiments. This relies on the ‘parent’ conventions within CMIP experiments which define the experiment from which a given set of output started (in CMIP language, the experiment from which a given experiment ‘branched’).
-
netcdf_scm.stitching.
get_branch_time
(openscmrun, parent=True, source_path=None, parent_path=None)[source]¶ Get branch time of an experiment
- Parameters
openscmrun (
scmdata.ScmRun
) – Data of which to get the branch time
parent (bool) – Should the branch time be returned in the parent experiment’s time co-ordinates? If
False
, return the branch time in the child (i.e.
openscmrun
’s) time co-ordinates.
source_path (str) – Path to the data file from which
openscmrun
is derived. This is only required if
parent
is
False
. It is needed because information about the time calendar and units of the data in
openscmrun
is only available in the source file.
parent_path (str) – Path to the data file containing the parent data of
openscmrun
openscmrun
. This is only required if the data is from CMIP5 because CMIP5 data does not store information about the parent experiment’s time calendar and units.
- Returns
The branch time, rounded to the nearest year, month and day. netCDF-SCM is not designed for very precise calculations; if you need to keep finer information, please raise an issue on our issue tracker to discuss.
- Return type
- Raises
ValueError –
parent is not True
and the data is CMIP5 data. It is impossible to determine the branch time in the child time co-ordinates from CMIP5 data because of a lack of information.ValueError –
parent_path is None
and the data is CMIP5 data. You must supply the parent path if the data is CMIP5 data because the parent file is the only place the parent experiment’s time units and calendar information is available.
-
netcdf_scm.stitching.
get_continuous_timeseries_with_meta
(infile, drs, return_picontrol_info=True, log_warning=False)[source]¶ Load a continuous timeseries with metadata
Continuous here means including all parent experiments up to (but not including) piControl
- Parameters
infile (str) – netCDF-SCM crunched file to load
drs (str) – Data reference syntax which applies to this file
return_picontrol_info (bool) – If supplied, piControl information will be returned in the second and third outputs if available (rather than
None
). A caveat is that if the experiment itself is a piControl experiment,
None
will be returned in the second and third outputs.
log_warning (bool) – Should warnings be logged? If
False
, warnings are raised with
warnings.warn
instead.
- Returns
scmdata.ScmRun
– Loaded timeseries, including metadata
dt.datetime
– Branch time from piControl. If
infile
points to a piControl or piControl-spinup experiment then this will be
None
.
str – Path from which the piControl data was loaded. If
infile
points to a piControl or piControl-spinup experiment then this will be
None
-
netcdf_scm.stitching.
get_parent_file_path
(infile, parent_replacements, drs, log_warning=False)[source]¶ Get parent file path for a given file
If multiple versions are available the latest version is chosen.
- Parameters
infile (str) – File path of which to get the parent
parent_replacements (dict of str : str) – Replacements to insert in
infile
to determine the parent filepath
drs (str) – Data reference syntax which is applicable to these filepaths
log_warning (bool) – Should a warning be logged? If not, the warning is raised using
warnings.warn
.
- Returns
Path of the parent file
- Return type
- Raises
IOError – Parent data cannot be found
AssertionError – Parent files vary by more than just version
-
netcdf_scm.stitching.
get_parent_replacements
(scmdf)[source]¶ Get changes in metadata required to identify a dataset’s parent file
-
netcdf_scm.stitching.
step_up_family_tree
(in_level)[source]¶ Step name up the family tree
- Parameters
in_level (str) – Level from which to step up
- Returns
Level one up from
in_level
- Return type
str
Examples
>>> step_up_family_tree("(child)")
"(parent)"
>>> step_up_family_tree("(parent)")
"(grandparent)"
>>> step_up_family_tree("(grandparent)")
"(greatgrandparent)"
>>> step_up_family_tree("(greatgreatgrandparent)")
"(greatgreatgreatgrandparent)"
Utils API¶
Utils contains a number of helpful functions for doing common cube operations.
For example, applying masks to cubes, taking latitude-longitude means and getting timeseries from a cube as datetime values.
-
netcdf_scm.utils.
apply_mask
(in_scmcube, in_mask)[source]¶ Apply a mask to an scm cube’s data
- Parameters
in_scmcube (
ScmCube
) – An
ScmCube
instance.
in_mask (np.ndarray) – The mask to apply
- Returns
A copy of the input cube with the mask applied to its data
- Return type
ScmCube
-
netcdf_scm.utils.
assert_all_time_axes_same
(time_axes)[source]¶ Assert all time axes in a set are the same.
- Parameters
time_axes (list_like of array_like) – List of time axes to compare.
- Raises
AssertionError – If not all time axes are the same.
-
netcdf_scm.utils.
broadcast_onto_lat_lon_grid
(cube, array_in)[source]¶ Broadcast an array onto the latitude-longitude grid of
cube
.
Here, broadcasting means taking the array and ‘duplicating’ it so that it has the same number of dimensions as the cube’s underlying data.
For example, given a cube with a time dimension of length 3, a latitude dimension of length 4 and a longitude dimension of length 2 (shape 3x4x2) and
array_in
of shape 4x2, results in a 3x4x2 array where each slice in the broadcast array’s time dimension is identical to
array_in
.
- Parameters
cube (
ScmCube
) –
ScmCube
instance whose lat-lon grid we want to check against
array_in (np.ndarray) – The array we want to broadcast
- Returns
The original array, broadcast onto the cube’s lat-lon grid (i.e. duplicated along all dimensions except for latitude and longitude). Note: If the cube has lazy data, we return a
da.Array
, otherwise we return an
np.ndarray
.
- Return type
array_out
- Raises
AssertionError –
array_in
cannot be broadcast onto the cube’s lat-lon grid because their shapes are not compatible
ValueError –
array_in
cannot be broadcast onto the cube’s lat-lon grid by
iris.util.broadcast_to_shape
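A minimal sketch (scm_cube, an ScmCube, and lat_lon_weights, a 2D array matching its lat-lon grid, are assumed to exist already):
from netcdf_scm.utils import broadcast_onto_lat_lon_grid

# duplicate the 2D weights along the cube's other dimensions (e.g. time)
full_weights = broadcast_onto_lat_lon_grid(scm_cube, lat_lon_weights)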
-
netcdf_scm.utils.
cube_lat_lon_grid_compatible_with_array
(cube, array_in)[source]¶ Assert that an array can be broadcast onto the cube’s lat-lon grid
- Parameters
cube (
ScmCube
) –
ScmCube
instance whose lat-lon grid we want to check against
array_in (np.ndarray) – The array we want to ensure is able to be broadcast
- Returns
True
if the cube’s lat-lon grid is compatible with
array_in
, otherwise
False
- Return type
- Raises
AssertionError – The array cannot be broadcast onto the cube’s lat-lon grid
-
netcdf_scm.utils.
get_cube_timeseries_data
(scm_cube, realise_data=False)[source]¶ Get a timeseries from a cube.
This function only works on cubes which are on a time grid only i.e. have no other dimension coordinates.
- Parameters
scm_cube (
ScmCube
) – An
ScmCube
instance with only a ‘time’ dimension.
realise_data (bool) – If
True
, force the data to be realised before returning
- Returns
The cube’s timeseries data. If
realise_data
is
False
then a
da.Array
will be returned if the data is lazy.
- Return type
np.ndarray
-
netcdf_scm.utils.
get_scm_cube_time_axis_in_calendar
(scm_cube, calendar)[source]¶ Get a cube’s time axis in a given calendar
- Parameters
scm_cube (
ScmCube
) – An
ScmCube
instance.
calendar (str) – The calendar to return the time axis in e.g. ‘365_day’, ‘gregorian’.
- Returns
Array of datetimes containing the cube’s time axis in the requested calendar.
- Return type
np.ndarray
-
netcdf_scm.utils.
take_lat_lon_mean
(in_scmcube, in_weights)[source]¶ Take the latitude longitude mean of a cube with given weights
- Parameters
in_scmcube (
ScmCube
) – An
ScmCube
instance.
in_weights (np.ndarray) – Weights to use when taking the mean.
- Returns
First output is a copy of the input cube in which the data is now the latitude-longitude mean of the input cube’s data. Second output is the sum of weights i.e. normalisation used in the weighted mean.
- Return type
ScmCube
, float
-
netcdf_scm.utils.
unify_lat_lon
(cubes, rtol=1e-06)[source]¶ Unify latitude and longitude co-ordinates of cubes in place.
The co-ordinates will only be unified if they already match to within a given tolerance.
- Parameters
cubes (
iris.cube.CubeList
) – List of iris cubes whose latitude and longitude co-ordinates should be unified.
rtol (float) – Maximum relative difference which can be accepted between co-ordinate values.
- Raises
ValueError – If the co-ordinates differ by more than relative tolerance or are not compatible (e.g. different shape).
Wranglers API¶
Functions used to ‘wrangle’ netCDF-SCM netCDF files into other formats
-
netcdf_scm.wranglers.
convert_scmdf_to_tuningstruc
(scmdf, outdir, prefix=None, force=False)[source]¶ Convert an
scmdata.ScmRun
to a matlab tuningstruc
One tuningstruc file will be created for each unique [“model”, “scenario”, “variable”, “region”, “unit”] combination in the input
scmdata.ScmRun
.
- Parameters
scmdf (
scmdata.ScmRun
) –
scmdata.ScmRun
to convert to a tuningstruc
outdir (str) – Directory in which to save the tuningstruc
prefix (str) – Prefix for the filename. The rest of the filename is generated from the metadata. .mat is also appended automatically. If
None
, no prefix is used.
force (bool) – If True, overwrite any existing files
- Returns
List of files which were not re-written as they already exist
- Return type
- Raises
AssertionError – If timeseries are not unique for a given [“climate_model”, “model”, “scenario”, “variable”, “region”, “unit”] combination.
-
netcdf_scm.wranglers.
convert_tuningstruc_to_scmdf
(filepath, variable=None, region=None, unit=None, scenario=None, model=None)[source]¶ Convert a matlab tuningstruc to an
scmdata.ScmRun
- Parameters
filepath (str) – Filepath from which to load the data
variable (str) – Name of the variable contained in the tuningstruc. If None, convert_tuningstruc_to_scmdf will attempt to determine it from the input file.
region (str) – Region to which the data in the tuningstruc applies. If None, convert_tuningstruc_to_scmdf will attempt to determine it from the input file.
unit (str) – Units of the data in the tuningstruc. If None, convert_tuningstruc_to_scmdf will attempt to determine it from the input file.
scenario (str) – Scenario to which the data in the tuningstruc applies. If None, convert_tuningstruc_to_scmdf will attempt to determine it from the input file.
model (str) – The (integrated assessment) model which generated the emissions scenario associated with the data in the tuningstruc. If None, convert_tuningstruc_to_scmdf will attempt to determine it from the input file and if it cannot, it will be set to “unspecified”.
- Raises
KeyError – If a metadata variable is not supplied and it cannot be determined from the tuningstruc.
- Returns
scmdata.ScmRun
with the tuningstruc data
- Return type
scmdata.ScmRun
-
netcdf_scm.wranglers.
get_tuningstruc_name_from_df
(df, outdir, prefix)[source]¶ Get the name of a tuningstruc from a
pd.DataFrame
- Parameters
df (pd.DataFrame) – Data for which to determine the tuningstruc name
outdir (str) – Directory in which the tuningstruc will be saved
prefix (str) – Prefix for the filename
- Returns
tuningstruc name
- Return type
- Raises
ValueError – A name cannot be determined because e.g. more than one scenario is contained in the dataframe
Wrangling API¶
Module for wrangling netCDF-SCM netCDF files into other formats
Changelog¶
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
The changes listed in this file are categorised as follows:
Added: new features
Changed: changes in existing functionality
Deprecated: soon-to-be removed features
Removed: now removed features
Fixed: any bug fixes
Security: in case of vulnerabilities.
v2.1.0 - 2021-03-31¶
Added¶
(!82) Safely convert branch time strings to integers (closes #62)
(!80) Country masks based on Natural Earth using regionmask’s implementation
(!80)
netcdf_scm.weights.CubeWeightCalculator.get_weights()
now takes an extra argument,
log_failure
, which controls whether failed retrievals are logged or raised as warnings
Changed¶
(!83) Raise warning or log warning if the branch times cannot be verified rather than raising a
NotImplementedError
(partially addresses #61)
(!80) Require
xarray<0.17
until xarray #5050 is resolved
v2.0.0 - 2021-01-19¶
Added¶
(!76) Added missing modules to documentation
(!72) v2 paper revisions round 2
(!67) v2 paper revisions
(!69) Added AR6 reference regions
(!68) “30-yr-running-mean” and “30-yr-running-mean-dedrift” normalisation options when stitching
(!68)
nyears
keyword argument when initialising
netcdf_scm.normalisation.NormaliserRunningMean
and
netcdf_scm.normalisation.NormaliserRunningMeanDedrift
so that the number of years to use when calculating the running-mean is now arbitrary (default value is 21 so there is no change to the default behaviour)
(!32) First submission to Earth System Science Data (ESSD)
(!56) Instructions and scripts for doing zenodo releases
(!51) Add normalisation module to docs
(!49) Add progress bar to directory sorting so it’s obvious when things are going very slowly
(!43) Add normalisation method
21-yr-running-mean-dedrift
(!39) Put basic license checking tools in new module:
netcdf_scm.citing
(closes #30)
(!34) Add convenience
.MAG
reader (netcdf_scm.io.load_mag_file
) which automatically fills in metadata. Also adds
netcdf_scm.io.get_scmcube_helper
to the ‘public’ API.
(!25) Add regular test of conda installation
(!30) Added scipy to dependencies so that pip install works
(!26) Added 21-year running mean normalisation option
(!22) Allow user to choose weighting scheme in CLI
(!16) Add CMIP5 stitching support
(!1) Add
netcdf-scm-stitch
so e.g. historical and scenario files can be joined and also normalised against e.g. piControl
(#108 (github)) Optimise wranglers and add regression tests
(#107 (github)) Add wrangling options for average/point start/mid/end year time manipulations for
.MAG
and
.IN
files(#104 (github)) Allow wranglers to also handle unit conversions (see #101 (github))
(#102 (github)) Keep effective area as metadata when calculating SCM timeseries (see #100 (github))
(#98 (github)) Add support for reading CMIP6 concentration GMNHSH data
(#95 (github)) Add support for CO2 flux data (fgco2) reading, in the process simplifying crunching and improving lazy weights
(#87 (github)) Add support for crunching data with a height co-ordinate
(#84 (github)) Add ability to crunch land, ocean and atmosphere data separately (and sensibly)
(#75 (github)) Check
land_mask_threshold
is sensible when retrieving land mask (automatically update if not)(#69 (github)) Add El Nino 3.4 mask
(#66 (github)) Add devops tools and refactor to pass new standards
(#62 (github)) Add netcdf-scm format and crunch to this by default
(#61 (github)) Add land fraction when crunching scm timeseries cubes
Changed¶
(!73) Handling of invalid regions while crunching. If crunching requests regions which aren’t compatible with a file, a warning will be raised but the crunching will continue with all the valid regions it can. Previously, if invalid regions were requested, the crunch would fail and no regions would be crunched for that file.
(!73) Renamed
netcdf_scm.weights.InvalidWeights
to
netcdf_scm.weights.InvalidWeightsError
and ensured that all weights-related errors are now raised as
netcdf_scm.weights.InvalidWeightsError
rather than being a mix of
netcdf_scm.weights.InvalidWeightsError
and
ValueError
as was previously the case.
(!73)
netcdf_scm.iris_cube_wrappers.ScmCube.get_scm_timeseries_cubes()
will now raise a
netcdf_scm.weights.InvalidWeightsError
if none of the requested regions have valid weights.
(!73) Improved logging handling so only netCDF-SCM’s logger is used by netCDF-SCM, with the root logger never being used.
(!71) Rename prefix for AR6 regions from
World|AR6 regions
to
World|AR6
(!70) Update default land-fraction cube,
netcdf_scm.weights.default_land_ocean_weights.nc
, so they’re based on CMIP6 data and treat e.g. the Caspian Sea and Great Lakes not as purely land(!5) Use xarray to load crunched netCDF files in
netcdf_scm.io.load_scmrun()
, reducing load time by about a factor of 3(!64) Upgraded to pymagicc 2.0.0rc5 and changed all use of
scmdata.ScmDataFrame
to
scmdata.ScmRun
(!64) Renamed netcdf_scm.io.load_scmdataframe to netcdf_scm.io.load_scmrun; this function now automatically drops the “todo” column on reading
(!62) Changed command-line interface to use groups rather than hyphens. Change in commands is
netcdf-scm-crunch
–>netcdf-scm crunch
,netcdf-scm-stitch
–>netcdf-scm stitch
,netcdf-scm-wrangle
–>netcdf-scm wrangle
.(!60) Target journal for v2 paper
(!55) Added check that region areas are sensible when calculating SCM timeseries cubes (see
ScmCube._sanity_check_area()
, closes #34)(!52) Put notebooks into documentation henced moved them from
notebooks
to
docs/source/usage
(!48) Workaround erroneous whitespace in parent metadata when stitching (closes #36)
(!47) Rework CHANGELOG to follow Keep a Changelog (closes #27)
(!45) Move from https://gitlab.com/znicholls/netcdf-scm to https://gitlab.com/netcdf-scm/netcdf-scm
(!38) Split out normalisation module:
netcdf_scm.normalisation
(closes #31)
(!37) Do not duplicate files into a
flat
directory when wrangling and stitching (closes #33)(!31) Rename
SCMCube
, it is now
ScmCube
. Also use “netCDF” rather than “NetCDF” throughout.(!28) Move multiple stitching utility functions into the ‘public’ API
(!29) Parallelise directory sorting when crunching
(!27) Refactored stitching to module to make room for new normalisation method
(!24) Parallelise unit, integration and regression tests in CI to reduce run time
(!23) Split
netcdf_scm.cli
into smaller parts(!21) Remove use of
contourf
in notebooks as it can give odd results(!20) Update weight retrieval so that non-area weights are normalised (fixes #11)
(!19) Update notebooks and refactor so cubes can have multiple weights calculators
(#106 (github)) Upgrade to new Pymagicc release
(#105 (github)) Upgrade to new Pylint release
(#99 (github)) Switch to BSD-3-Clause license
(#92 (github)) Shrink test files (having moved entire repository to use git lfs properly)
(#90 (github)) Rely on iris for lazy crunching
(#89 (github)) Change crunching thresholds to be based on data size rather than number of years
(#82 (github)) Prepare to add land data handling
(#81 (github)) Refactor masks to use weighting instead of masking, doing all the renaming in the process
(#80 (github)) Refactor to avoid
import conftest
in tests(#77 (github)) Refactor
netcdf_scm.masks.get_area_mask
logic to make multi-dimensional co-ordinate support easier(#72 (github)) Monkey patch iris to speed up crunching and go back to linear regridding of default sftlf mask
(#70 (github)) Dynamically decide whether to handle data lazily (fix regression tests in process)
(#64 (github)) Update logging to make post analysis easier and output clearer
(#63 (github)) Switch to using cmor name for variable in SCM timeseries output and put standard name in standard_variable_name
(#58 (github)) Lock tuningstruc wrangling so it can only wrangle to flat tuningstrucs, also includes:
turning off all wrangling in preparation for re-doing crunching format
adding default sftlf cube
(#50 (github)) Make pyam-iamc a core dependency
Fixed¶
(!75) Check
regionmask
version before trying to accessregionmask
’s AR6 region definitions(!66) Upgraded to scmdata 0.7
(!59) Updated
SCMCube.lat_lon_shape
so it is better able to handle non-standard datasets(!58) Upgraded to pymagicc>=2.0.0rc3 to ensure pint compatible unit handling when writing
.MAG
files(!36) Ensure areas are only calculated based on non-masked data (fixes bugs identified in #35 and #37)
(!33) Fix bug in
stitching.get_branch_time
where wrong time units were used when converting raw time to datetime(!18) Hotfix tests
(!15) Fixed bug in unit conversion which caused it to fail for
hfds
(!13) Make cube concatenation work around small errors in raw data metadata
(!10) Add support for
esm*
experiments when stitching (fixes #2)(!11) Add ability to read CanESM5 ocean data with depth and ‘extra’ co-ordinates. Also:
split regression testing into smaller pieces so memory requirements aren’t so high
(!9) Add ability to read CanESM5 ocean data, making handling of ‘extra’ co-ordinates more robust
(!6) Allow hfds crunching to work by handling extra ocean data coordinates properly
(#114 (github)) Ensure that default sftlf file is included in wheel
(#111 (github)) Write tuningstrucs with data in columns rather than rows
(#97 (github)) Add support for tuningstruc data which has been transposed
(#88 (github)) Fix bug when reading more than one multi-dimensional file in a directory
(#74 (github)) Fix bug in mask generation
(#67 (github)) Fix crunching filenaming, tidy up more and add catch for IPSL
time_origin
time variable attribute(#55 (github)) Hotfix docs so they build properly
v1.0.0 - 2019-05-21¶
Changed¶
(#49 (github)) Make bandit only check
src
(#45 (github)) Refactor the masking of regions into a module allowing for more regions to be added as needed
Added¶
(#48 (github)) Add
isort
to checks(#47 (github)) Add regression tests on crunching output to ensure stability. Also:
fixes minor docs bug
updates default regexp option in crunch and wrangle to avoid
fx
files
refactors
cli.py
a touch to reduce duplication
avoids
collections
deprecation warning in
mat4py
Fixed¶
(#46 (github)) Fix a number of bugs in
netcdf-scm-wrangle
’s data handling when converting to tuningstrucs
v0.7.3 - 2019-05-16¶
Changed¶
(#44 (github)) Speed up crunching by forcing data to load before applying masks, not each time a mask is applied
v0.7.2 - 2019-05-16¶
Changed¶
(#43 (github)) Speed up crunching, in particular remove string parsing to convert cftime to python datetime
v0.7.1 - 2019-05-15¶
Added¶
(#42 (github)) Add
netcdf-scm-wrangle
command line interface
Fixed¶
(#41 (github)) Fixed bug in path handling of
CMIP6OutputCube
v0.6.2 - 2019-05-14¶
Added¶
(#39 (github)) Add
netcdf-scm-crunch
command line interface
v0.6.1 - 2019-05-13¶
Added¶
(#29 (github)) Put crunching script into formal testsuite which confirms results against KNMI data available here, however no docs or formal example until #6 (github) is closed
(#28 (github)) Added cmip5 crunching script example, not tested so use with caution until #6 (github) is closed
Changed¶
(#40 (github)) Upgrade to pyam v0.2.0
(#38 (github)) Update to using openscm releases and hence drop Python3.6 support
(#37 (github)) Adjusted read in of gregorian with 0 reference to give all data from year 1 back
(#34 (github)) Move to new openscm naming i.e. returning ScmDataFrame rather than OpenSCMDataFrameBase
(#32 (github)) Move to returning OpenSCMDataFrameBase rather than pandas DataFrame when crunching to scm format
Fixed¶
(#35 (github)) Fixed bug which prevented SCMCube from crunching to scm timeseries with default earth radius when areacella cube was missing
(#29 (github)) Fixed bug identified in #30 (github)
v0.5.1 - 2018-11-12¶
Changed¶
(#26 (github)) Expose directory and filename parsers directly
v0.4.2 - 2018-11-12¶
Changed¶
Update
setup.py
to install dependencies so that non-Iris dependent functionality can be run from a pip install
v0.4.1 - 2018-11-12¶
Added¶
(#23 (github)) Added ability to handle cubes with invalid calendar (e.g. CMIP6 historical concentrations cubes)
(#20 (github)) Added
CMIP6Input4MIPsCube
andCMIP6OutputCube
which add compatibility with CMIP6 data
v0.3.1 - 2018-11-05¶
Added¶
(#15 (github)) Add ability to load from a directory with data that is saved in multiple timeslice files, also adds:
adds regular expressions section to development part of docs
adds an example script of how to crunch netCDF files into SCM csvs
(#13 (github)) Add
load_from_path
method toSCMCube
(#10 (github)) Add land/ocean and hemisphere splits to
_get_scm_masks
outputs
Changed¶
(#17 (github)) Update to crunch global and hemispheric means even if land-surface fraction data is missing
(#16 (github)) Tidy up experimental crunching script
(#14 (github)) Streamline install process
(#12 (github)) Update to use output format that is compatible with pyam
Update
netcdftime
tocftime
to track name change
v0.2.0 - 2018-10-14¶
Added¶
- (#4 (github)) Add work done elsewhere previously
SCMCube
base class for handling netCDF files
reading, cutting and manipulating files for SCM use
MarbleCMIP5Cube
for handling CMIP5 netCDF files within a particular directory structure
automatic loading and use of surface land fraction and cell area files
returns timeseries data, once processed, in pandas DataFrames rather than netCDF format for easier use
demonstration notebook of how this first step works
CI for entire repository including notebooks
automatic documentation with Sphinx