Find files on the local filesystem

Find files on the local filesystem.

Example configuration to find CMIP6 data on a personal computer:

projects:
  CMIP6:
    data:
      local-data:
        type: "esmvalcore.local.LocalDataSource"
        rootpath: ~/climate_data
        dirname_template: "{project}/{activity}/{institute}/{dataset}/{exp}/{ensemble}/{mip}/{short_name}/{grid}/{version}"
        filename_template: "{short_name}_{mip}_{dataset}_{exp}_{ensemble}_{grid}*.nc"

The module will find files matching the glob.glob() pattern formed by rootpath/dirname_template/filename_template, where the facets defined inside the curly braces of the templates are replaced by their values from the Dataset or the recipe. Note that the name of the data source, local-data in the example above, must be unique within each project but can otherwise be chosen freely.
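The substitution described above can be sketched with the standard library alone. The helper `build_glob_pattern` below is hypothetical and not part of ESMValCore. It ignores details such as list-valued facets and version selection, and only shows how the three configured parts might combine into one glob pattern.

```python
import glob
import tempfile
from pathlib import Path


def build_glob_pattern(rootpath, dirname_template, filename_template, facets):
    """Fill both templates with facet values and join them under rootpath.

    Hypothetical helper for illustration; not an ESMValCore function.
    """
    dirname = dirname_template.format(**facets)
    filename = filename_template.format(**facets)
    return str(Path(rootpath).expanduser() / dirname / filename)


dirname_template = (
    "{project}/{activity}/{institute}/{dataset}/{exp}/{ensemble}"
    "/{mip}/{short_name}/{grid}/{version}"
)
filename_template = "{short_name}_{mip}_{dataset}_{exp}_{ensemble}_{grid}*.nc"
facets = {
    "project": "CMIP6", "activity": "CMIP", "institute": "BCC",
    "dataset": "BCC-ESM1", "exp": "historical", "ensemble": "r1i1p1f1",
    "mip": "Amon", "short_name": "tas", "grid": "gn",
    "version": "*",  # wildcard: accept any version directory
}

# Use a throwaway directory tree instead of ~/climate_data for the demo.
with tempfile.TemporaryDirectory() as rootpath:
    data_dir = Path(
        rootpath,
        "CMIP6/CMIP/BCC/BCC-ESM1/historical/r1i1p1f1/Amon/tas/gn/v20181214",
    )
    data_dir.mkdir(parents=True)
    (data_dir / "tas_Amon_BCC-ESM1_historical_r1i1p1f1_gn_185001-201412.nc").touch()

    pattern = build_glob_pattern(
        rootpath, dirname_template, filename_template, facets
    )
    matches = [Path(p).name for p in glob.glob(pattern)]

print(matches)
```

The trailing `*` in the filename template is what lets the pattern match the time range part of the file name, which is not a facet in the template.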

To start using this module, download the complete example configuration file for personal computers, copy it to the directory ~/.config/esmvaltool/, and tailor it to your own system if needed.

Example configuration files for popular HPC systems and for supported climate models are also available.

Classes:

  • DataSource(*args, **kwargs) – Data source for finding files on a local filesystem.

  • LocalDataSource(name, project, priority, ...) – Data source for finding files on a local filesystem.

  • LocalFile(*args, **kwargs) – File on the local filesystem.

Data:

  • GRIB_FORMATS – GRIB file extensions.

Functions:

  • find_files(*[, debug]) – Find files on the local filesystem.

class esmvalcore.local.DataSource(*args, **kwargs)

Bases: LocalDataSource

Data source for finding files on a local filesystem.

Deprecated since version 2.13.0: This class is deprecated and will be removed in version 2.16.0. Please use ‘esmvalcore.local.LocalDataSource’ instead.

esmvalcore.local.GRIB_FORMATS = ('.grib2', '.grib', '.grb2', '.grb', '.gb2', '.gb')

GRIB file extensions.
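A tuple of extensions like this lends itself to a simple suffix check. The sketch below copies the documented value so that it runs without ESMValCore installed. The helper `has_grib_extension` is hypothetical, and the case-sensitive comparison is an assumption rather than documented library behaviour.

```python
from pathlib import Path

# Copied from the value documented above so the sketch is self-contained.
GRIB_FORMATS = (".grib2", ".grib", ".grb2", ".grb", ".gb2", ".gb")


def has_grib_extension(path):
    """Return True if the file name ends in one of the GRIB extensions.

    Hypothetical helper for illustration; not an ESMValCore function.
    """
    return Path(path).suffix in GRIB_FORMATS


grib_check = has_grib_extension("era5_t2m_2020.grib2")
netcdf_check = has_grib_extension("tas_Amon_BCC-ESM1_historical_r1i1p1f1_gn.nc")
print(grib_check, netcdf_check)
```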

class esmvalcore.local.LocalDataSource(name: str, project: str, priority: int, rootpath: Path, dirname_template: str, filename_template: str)

Bases: DataSource (not the deprecated esmvalcore.local.DataSource, which itself derives from this class)

Data source for finding files on a local filesystem.

Attributes:

  • debug_info – A string containing debug information when no data is found.

  • dirname_template – The template for the directory names.

  • filename_template – The template for the file names.

  • name – A name identifying the data source.

  • priority – The priority of the data source.

  • project – The project that the data source provides data for.

  • regex_pattern – Get regex pattern that can be used to extract facets from paths.

  • rootpath – The path where the directories are located.

Methods:

  • find_data(**facets) – Find data locally.

  • find_files(**facets) – Find files.

  • get_glob_patterns(**facets) – Compose the globs that will be used to look for files.

  • path2facets(path, add_timerange) – Extract facets from path.

Parameters:
  • name (str)

  • project (str)

  • priority (int)

  • rootpath (Path)

  • dirname_template (str)

  • filename_template (str)

debug_info: str = ''

A string containing debug information when no data is found.

dirname_template: str

The template for the directory names.

filename_template: str

The template for the file names.

find_data(**facets) → list[LocalFile]

Find data locally.

Return type:

list[LocalFile]

find_files(**facets) → list[LocalFile]

Find files.

Return type:

list[LocalFile]

get_glob_patterns(**facets) → list[Path]

Compose the globs that will be used to look for files.

Return type:

list[Path]

name: str

A name identifying the data source.

path2facets(path: Path, add_timerange: bool) → dict[str, str]

Extract facets from path.

Parameters:
  • path (Path)

  • add_timerange (bool)

Return type:

dict[str, str]

priority: int

The priority of the data source. Lower values have priority.

project: str

The project that the data source provides data for.

property regex_pattern: str

Get regex pattern that can be used to extract facets from paths.
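One way to picture how a pattern can extract facets from a path is to turn each `{facet}` placeholder of a directory template into a named regex group. The helper `dirname_template_to_regex` below is hypothetical, and the pattern ESMValCore actually builds may differ. File name templates are harder to invert because facet values and separators can both contain underscores, so the sketch is limited to directory names.

```python
import re


def dirname_template_to_regex(template):
    """Turn a '{facet}/{facet}/...' template into a regex with named groups.

    Hypothetical helper for illustration; ESMValCore's own pattern may differ.
    """
    pieces = []
    seen = set()
    # re.split with a capturing group keeps the '{facet}' placeholders.
    for piece in re.split(r"(\{\w+\})", template):
        if re.fullmatch(r"\{\w+\}", piece):
            name = piece[1:-1]
            if name in seen:
                # A repeated facet must match the same text as before.
                pieces.append(f"(?P={name})")
            else:
                seen.add(name)
                # Assumption: a facet value never contains a path separator.
                pieces.append(f"(?P<{name}>[^/]+)")
        else:
            # Literal text between placeholders is matched verbatim.
            pieces.append(re.escape(piece))
    return "".join(pieces)


dirname_template = (
    "{project}/{activity}/{institute}/{dataset}/{exp}/{ensemble}"
    "/{mip}/{short_name}/{grid}/{version}"
)
relative_dir = "CMIP6/CMIP/BCC/BCC-ESM1/historical/r1i1p1f1/Amon/tas/gn/v20181214"

match = re.fullmatch(dirname_template_to_regex(dirname_template), relative_dir)
extracted = match.groupdict() if match else {}
print(extracted)
```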

rootpath: Path

The path where the directories are located.

class esmvalcore.local.LocalFile(*args, **kwargs)

Bases: PosixPath, DataElement

File on the local filesystem.

Attributes:

  • attributes – Attributes are key-value pairs describing the data.

  • facets – Facets are key-value pairs that were used to find this data.

Methods:

  • prepare() – Prepare the data for access.

  • to_iris([ignore_warnings]) – Load the data as Iris cubes.

property attributes: dict[str, Any]

Attributes are key-value pairs describing the data.

property facets: Facets

Facets are key-value pairs that were used to find this data.

prepare() → None

Prepare the data for access.

Return type:

None

to_iris(ignore_warnings: list[dict[str, Any]] | None = None) → CubeList

Load the data as Iris cubes.

Returns:

The loaded data.

Return type:

iris.cube.CubeList

Parameters:

ignore_warnings (list[dict[str, Any]] | None)

esmvalcore.local.find_files(*, debug: bool = False, **facets: FacetValue) → list[LocalFile] | tuple[list[LocalFile], list[Path]]

Find files on the local filesystem.

The directories that are searched for files are defined in esmvalcore.config.CFG under the 'rootpath' key using the directory structure defined under the 'drs' key. If esmvalcore.config.CFG['rootpath'] contains a key that matches the value of the project facet, those paths will be used. If there is no project specific key, the directories in esmvalcore.config.CFG['rootpath']['default'] will be searched.

See Input data for extensive instructions on configuring ESMValCore so it can find files locally.

Parameters:
  • debug (bool) – When debug is set to True, the function will return a tuple with the first element containing the files that were found and the second element containing the glob.glob() patterns that were used to search for files.

  • **facets (FacetValue) – Facets used to search for files. An '*' can be used to match any value. By default, only the latest version of a file will be returned. To select all versions use version='*'. It is also possible to specify multiple values for a facet, e.g. exp=['historical', 'ssp585'] will match any file that belongs to either the historical or ssp585 experiment. The timerange facet can be specified in ISO 8601 format.
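The handling of list-valued facets can be pictured as expanding every combination of values into its own search pattern. The sketch below assumes that reading. `expand_facets` is a hypothetical helper, not an ESMValCore function, and the library may compose its patterns differently.

```python
from itertools import product


def expand_facets(facets):
    """Yield one dict per combination of list-valued facet values.

    Hypothetical helper for illustration; not an ESMValCore function.
    """
    keys = list(facets)
    # Wrap scalar values so that every facet contributes a list of options.
    options = [
        value if isinstance(value, list) else [value]
        for value in facets.values()
    ]
    for combination in product(*options):
        yield dict(zip(keys, combination))


filename_template = "{short_name}_{mip}_{dataset}_{exp}_{ensemble}_{grid}*.nc"
facets = {
    "short_name": "tas",
    "mip": "Amon",
    "dataset": "*",                    # wildcard: any model
    "exp": ["historical", "ssp585"],   # either experiment
    "ensemble": "*",
    "grid": "gn",
}

patterns = [
    filename_template.format(**combination)
    for combination in expand_facets(facets)
]
print(patterns)
```

Each resulting string is an ordinary glob pattern, so a '*' facet value needs no special treatment in this sketch.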

Return type:

list[LocalFile] | tuple[list[LocalFile], list[Path]]

Note

A value of timerange='*' is supported. Combining a '*' with a time or period, as is possible in the recipe, is not currently supported and returns all files that were found.

Examples

Search for files containing surface air temperature from any CMIP6 model for the historical experiment:

>>> import esmvalcore.local
>>> esmvalcore.local.find_files(
...     project='CMIP6',
...     activity='CMIP',
...     mip='Amon',
...     short_name='tas',
...     exp='historical',
...     dataset='*',
...     ensemble='*',
...     grid='*',
...     institute='*',
... )
[LocalFile('/home/bandela/climate_data/CMIP6/CMIP/BCC/BCC-ESM1/historical/r1i1p1f1/Amon/tas/gn/v20181214/tas_Amon_BCC-ESM1_historical_r1i1p1f1_gn_185001-201412.nc')]

Returns:

The files that were found.