PyNIO

PyNIO is a Python package that provides read and/or write access to a variety of data formats through an interface modelled on netCDF. It consists of a C library called libnio and a Python module whose interface is based on, and similar to, that of the Scientific.IO.NetCDF module written by Konrad Hinsen. The C library contains the same data I/O code used in NCL, a scripting language developed for analysis and visualization of geo-scientific data. Currently supported formats include:

More detailed information about PyNIO's support for these formats is available at Data Formats Supported by the PyNIO module.

By default, as of version 1.2.0, PyNIO interfaces with the Python environment using the multi-dimensional array module, NumPy. Earlier versions of PyNIO default to its predecessor, Numeric. While the default has changed, version 1.2.0 still supports the Numeric interface, and it is expected that this support will continue for a reasonable transition period. Except for a few cases, noted in the documentation below, PyNIO's behavior is the same when used with either package. References to NumPy alone can be assumed to apply equally when Numeric is used.

To aid with the transition from Numeric to NumPy, one can use the conversion routines:

  import Numeric
  import numpy

  # NumPy array -> Numeric array
  xnumeric = Numeric.asarray(numpy_array)   # needs Numeric 24.2 to work
  # Numeric array -> NumPy array
  xnumpy = numpy.asarray(numeric_array)

Note that while the package name is PyNIO, the module name that is imported into a Python script is Nio.


Class NioFile: Access supported-format files for reading and/or writing

Constructor: open_file(filepath, mode='r', options=None, history='')

filepath

The full or relative path of a file with data in a supported format. The file path may contain a home directory indicator starting with "~". A filepath extension (a suffix consisting of the final characters following the last dot) indicates the expected format. The extension is required in the filepath argument, regardless of whether it is part of the actual filename. Valid extensions include:

  • .nc, .cdf, .netcdf (NetCDF)
  • .grb, .grib, .grb1, .grib1, .grb2, .grib2 (GRIB 1 and GRIB 2; GRIB 2 support is available in version 1.2.0 or later)
  • .hd, .hdf (HDF)
  • .he2, .he4, .hdfeos (HDFEOS 2)
  • .ccm (CCM history files)

PyNIO handles these extensions in a case-insensitive manner: .grib, .Grib, and .GRIB all indicate a GRIB file.
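
The extension rule described above can be mirrored in a few lines of Python. This is only an illustrative sketch: `guess_format` and `EXTENSION_FORMATS` are hypothetical names, not part of PyNIO, and the table simply restates the extension list given here.

```python
import os

# Restates the extension list above; not an official or exhaustive table.
EXTENSION_FORMATS = {
    "NetCDF": (".nc", ".cdf", ".netcdf"),
    "GRIB": (".grb", ".grib", ".grb1", ".grib1", ".grb2", ".grib2"),
    "HDF": (".hd", ".hdf"),
    "HDFEOS 2": (".he2", ".he4", ".hdfeos"),
    "CCM": (".ccm",),
}


def guess_format(filepath):
    """Return the format implied by the final extension, or None if unrecognized."""
    # splitext keeps only the text after the last dot: "a.b.GRIB" -> ".GRIB"
    ext = os.path.splitext(filepath)[1].lower()  # lower() gives case-insensitivity
    for fmt, extensions in EXTENSION_FORMATS.items():
        if ext in extensions:
            return fmt
    return None
```

A helper like this can be used to reject an unsupported path before it is handed to open_file.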

mode

This optional argument specifies the access mode. If not specified, it defaults to 'r' (read-only). PyNIO has three access modes, some of which can be specified in more than one way:

'r'
Open an existing file for reading. PyNIO returns an error if the file does not exist. Attempts to modify the data will fail.
'w','r+','rw','a'
Open an existing file for modification or, if the file does not exist, create it. PyNIO returns an error if the file cannot be created because of file system access issues, or if a non-writable format is specified. Note that PyNIO never overwrites an existing file with an entirely new file.
'c'
Create a file open for writing. PyNIO returns an error if the file exists, if the file cannot be created because of file system access issues, or if a non-writable format is specified.

options

This optional argument must be an instance of the NioOptions class created prior to calling the open_file constructor. It is used to set a number of file-format-specific options.

history

This optional argument is a string specifying text to be appended to the global "history" attribute of a file open for writing. If no "history" attribute currently exists in the file, it is created.
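
As a rough sketch of how the mode rules above interact with the file system, the code below checks a path against the documented behavior before calling open_file. The helpers `mode_conflict` and `open_checked` are hypothetical, not part of PyNIO. The check assumes the extension is part of the actual filename, and it ignores file system permissions and non-writable formats, which PyNIO also reports as errors.

```python
import os

try:
    import Nio  # the PyNIO package is imported under the module name Nio
except ImportError:
    Nio = None  # PyNIO not installed; only the pre-check below is usable

MODIFY_MODES = ("w", "r+", "rw", "a")  # all four spellings mean "modify or create"


def mode_conflict(filepath, mode="r"):
    """Return the reason the documented rules predict an error, or None."""
    # Assumption: the extension given in filepath is part of the real filename.
    exists = os.path.exists(os.path.expanduser(filepath))
    if mode == "r":
        return None if exists else "file does not exist"
    if mode == "c":
        return "file already exists" if exists else None
    if mode in MODIFY_MODES:
        return None  # opens an existing file or creates a missing one
    return "unrecognized mode"


def open_checked(filepath, mode="r", history=""):
    """Open filepath with PyNIO, raising early on a predicted conflict."""
    problem = mode_conflict(filepath, mode)
    if problem:
        raise IOError("%s: %s" % (filepath, problem))
    if Nio is None:
        raise RuntimeError("PyNIO (module Nio) is not installed")
    return Nio.open_file(filepath, mode=mode, history=history)
```

With PyNIO installed, a call such as `open_checked("out.nc", "c", history="created by my script")` would create the file and record the text in its global "history" attribute; the file name and history text here are placeholders.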

An NioFile object has three standard attributes:

You should never modify these dictionaries interactively or in a script. Even though no error is raised if you do change them, the data structures will not be updated correctly and your changes will not be propagated to the file.

Methods:

Class NioOptions: Specify format-specific options

Constructor: options()

The NioOptions class enables you to specify a number of format-specific options. No options need to be set for basic access to any of the supported file formats.

In order to set any of the available format-specific options, you must create an instance of the NioOptions class prior to calling the NioFile constructor. Add attributes with the names of the options to be set along with their desired values, and then pass the NioOptions instance as an argument to the NioFile constructor. All option names and string-typed values are handled in a case-insensitive manner. Only options where the value is different from the default need to be set. The same NioOptions instance can be used for multiple calls to the NioFile constructor. Currently, there are valid options only for the NetCDF and GRIB formats.
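
The workflow described above (create an options instance, add attributes, pass it to the constructor) might look like the following sketch. The helper `set_options` is hypothetical, and `SimpleNamespace` is only a stand-in so the example runs where PyNIO is not installed; the file names in the comments are placeholders.

```python
from types import SimpleNamespace

try:
    import Nio  # the PyNIO package is imported under the module name Nio
except ImportError:
    Nio = None


def set_options(opt, **settings):
    """Attach each keyword as an attribute on opt and return opt."""
    for name, value in settings.items():
        setattr(opt, name, value)
    return opt


# Use a real NioOptions instance when PyNIO is available, else the stand-in.
opt = Nio.options() if Nio is not None else SimpleNamespace()

# Only options that differ from their defaults need to be set.
set_options(opt, Format="NetCDF4Classic", CompressionLevel=5)

# The same instance can then be reused for several files, for example:
#   f1 = Nio.open_file("out1.nc", "c", options=opt)
#   f2 = Nio.open_file("out2.nc", "c", options=opt)
```

Setting the attributes directly (`opt.Format = "NetCDF4Classic"`) is equivalent; the helper only groups several assignments into one call.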

NetCDF file format options

CompressionLevel (available in version 1.2.0 or later)
Specify the level of data compression as an integer in the range 0 through 9. Increasing values indicate greater compression. Compression is lossless. There is a tradeoff between the time spent compressing the file and the amount of compression achieved. Informal tests show that compression level 9 results in a file only a few percent smaller than a compression level 5 file, but takes 4 to 5 times as long to create. (This option is ignored unless the Format option is set to NetCDF4Classic.)
Format
This option has an effect only for files opened in "create" mode. It currently has four valid values, two of which are synonyms. The default value, "Classic", indicates that a standard NetCDF file should be created. Standard NetCDF files are more limited with respect to file size. Assuming the underlying file system has support for large files, the total size can exceed 2 GB, but there are severe restrictions regarding the number of large variables and the order in which they are written. In general, because it is more universal, the "classic" format is recommended if the total file size will be less than 2 GB.

Specifying either "LargeFile" or "64BitOffset" results in the creation of a NetCDF file with support for larger variables and a theoretically much larger total size (about 9.22e+18 bytes). Each fixed-size variable, or each 'record' (element of the first dimension) of a variable with an unlimited dimension, can have a size of up to 4 GB. Assuming the underlying file system has support for large files, PyNIO reads NetCDF files in either the classic or the 64-bit offset format. For more detailed information about large file support in NetCDF see http://www.unidata.ucar.edu/software/netcdf/docs/netcdf/Large-File-Support.html.

In version 1.2.0 or later, you can specify "NetCDF4Classic" to create a file using the NetCDF 4 classic model format. The classic model constrains the interface to the constructs provided by NetCDF 3 and earlier. However, the underlying file format, like that of all NetCDF 4 files, is HDF 5. Files written in this format can take advantage of the built-in file compression available in HDF 5. Use the CompressionLevel option to enable compression. Also, the HDF 5 format removes virtually all restrictions on file and individual variable size. PyNIO version 1.2.0 provides beta-level support for this format because NetCDF 4 and the release of HDF 5 that it depends on are both still in the beta-testing phase of development. It should probably not be used for mission-critical file creation and it is not yet available on every system that PyNIO runs on.
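
The size guidance for the Format option can be condensed into a small decision rule. The sketch below is one possible reading, not PyNIO behavior: `suggest_format` is a hypothetical helper, "GB" is taken to mean 2**30 bytes, and the 4 GB per-variable limit is applied conservatively.

```python
GB = 2 ** 30  # assumption: the documentation's "GB" means 2**30 bytes


def suggest_format(total_bytes, largest_variable_bytes=0):
    """Suggest a value for the Format option from the expected sizes.

    largest_variable_bytes is the size of the largest fixed-size variable,
    or of the largest single record of a variable with an unlimited dimension.
    """
    if total_bytes < 2 * GB:
        return "Classic"  # most universal; recommended below 2 GB
    if largest_variable_bytes < 4 * GB:
        return "64BitOffset"  # synonym: "LargeFile"
    # Beta-level in version 1.2.0 and not available on every system.
    return "NetCDF4Classic"
```

For example, `suggest_format(3 * GB, 1 * GB)` returns "64BitOffset", since the total exceeds 2 GB but no single variable approaches the 4 GB limit.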

HeaderReserveSpace
This option has an effect only for files opened for writing. This option reserves extra space within the header of a NetCDF file. Its value is an integer that specifies the number of bytes to reserve in addition to the bytes used for the currently defined dimensions, variables, and attributes. This option can improve performance when it is likely that new dimensions, variables, or attributes will be added to an already large file.

MissingToFillValue
If set to its default value, True, this option causes a "virtual" _FillValue attribute to be created for any variable that has the attribute missing_value but not _FillValue. The purpose is to more gracefully handle files that use the COARDS-compliant missing_value instead of _FillValue to indicate missing data. Note that if a variable in a file has both a missing_value and a _FillValue, or if it has neither, the option does nothing. The virtual _FillValue attribute is not actually part of the NetCDF file, but only appears to be from within PyNIO. However, if the file is opened for writing and you assign to the attribute, it becomes an actual attribute.

PreFill
This option has an effect only when a file is opened for writing. It is logical-valued with a default value of True. If set to False, PyNIO alters the standard behavior of the NetCDF library such that variable element locations in the file are not "pre-filled" with the missing (fill) value associated with the variable. This can noticeably improve performance when writing large datasets. However, if you set this option to False, you are responsible for ensuring that all the elements of the variables you have defined are assigned a valid value.

SafeMode
This logical-valued option may be set for any NetCDF file. Its default value is False, meaning that PyNIO only closes a NetCDF file when the close method is invoked. If set to True, PyNIO closes the file after each operation it performs, including defining a dimension or variable, adding or modifying an attribute, or reading or writing data from any variable. This helps ensure the file's integrity for writable files if the close method does not get called for some reason. However, it may result in loss of performance, particularly when adding new variables, dimensions, or attributes to files that already have large variables defined. This is because each time a new element is defined, all existing data in the file must be moved to make room for the metadata of the new element in the header. One way to mitigate the performance loss is to use the HeaderReserveSpace option when first creating the file to make room in the header for subsequently defined NetCDF elements.

GRIB file format options

DefaultNCEPPtable (ignored unless file is in GRIB 1 format)
This option has two valid values: "Operational", the default, or "Reanalysis". It specifies whether to default to the use of the NCEP operational parameter table (http://www.ncl.ucar.edu/Document/Manuals/Ref_Manual/ncep_opn.htm) or the NCEP reanalysis parameter table (http://www.ncl.ucar.edu/Document/Manuals/Ref_Manual/ncep_reanal.htm). The option only applies in cases where PyNIO, on its own, cannot definitively determine which of these tables to use because of historical ambiguities in NCEP usage.

InitialTimeCoordinateType
This string-valued option has two valid values: "Numeric", the default, or "String". Note that in PyNIO's representation of a GRIB file the initial time dimension is distinguished from the forecast time dimension, whose coordinate values are numerical offsets from a particular initial time. The default value results in initial time coordinates that are COARDS and CF compliant, with the time represented in units of hours since 1800-01-01. Setting the option to "String" results in human-readable time coordinates, but with the disadvantage that they are not compliant with standard conventions and are likely not to be understood by many processing and visualization software packages. Note that in either case both the string and numerical coordinates are available as variables -- the only difference is which is considered to be the coordinate dimension.
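
Numeric initial time coordinates, expressed in hours since 1800-01-01, can be converted to calendar dates with the standard library. This is a minimal sketch with hypothetical helper names; it assumes the reference time is midnight on 1800-01-01, ignores time zones, and relies on the Gregorian calendar used by Python's datetime.

```python
from datetime import datetime, timedelta

# Assumption: the reference time is 1800-01-01 00:00 with no time zone.
EPOCH = datetime(1800, 1, 1)


def hours_to_datetime(hours):
    """Convert an 'hours since 1800-01-01' coordinate value to a datetime."""
    return EPOCH + timedelta(hours=float(hours))


def datetime_to_hours(when):
    """Convert a datetime to hours since 1800-01-01."""
    return (when - EPOCH).total_seconds() / 3600.0
```

The two functions are inverses, so a value read from a numeric time coordinate can be displayed as a date and converted back without loss for whole-hour times.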

SingleElementDimensions (available in version 1.2.0 or later)
This option allows the user to specify that variables with only a single initial time, forecast time, level, ensemble or probability value, usually handled as attributes, be treated as containing single element dimensions. It is a string-valued option whose default value "None" means that no single-element dimensions will appear in PyNIO's representation of the GRIB file. Conversely, if the option is given the value "All", then all possible dimensions will be created for each variable. Otherwise, the desired single element dimensions may be specified individually. The valid choices are "Initial_time", "Forecast_time", "Level", "Ensemble", and "Probability".

Note that dimensions are not created if the variable does not have an actual value associated with the dimension type, regardless of the value given to this option. For example, variables that are not part of an ensemble forecast will never have an ensemble dimension, and variables whose level type (e.g. Tropopause) does not have a numerical value will never have a level dimension. In the case of level types, it may depend on who wrote the record: files written by some centers may give no value for certain level types where others may use a numerical value such as 0. The intent of this option is to make it easier to concatenate conforming variables from multiple files together.

ThinnedGridInterpolation
This string-valued option has two valid values: "Linear", the default, or "Cubic". It has an effect only for GRIB files that contain data on a thinned grid. The GRIB documentation refers to these grids as "quasi-regular". The option controls the interpolation performed in converting variable data on the grid to the standard rectangular form that is returned by PyNIO.
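
Because option names and string values are matched case-insensitively, a script may want to normalize user input to the documented spellings before setting it on an NioOptions instance. The sketch below covers only the three GRIB options whose valid values are a fixed pair; `GRIB_CHOICES` and `canonical_grib_option` are hypothetical names, not part of PyNIO.

```python
# Valid values as listed above for the two-valued GRIB options.
GRIB_CHOICES = {
    "DefaultNCEPPtable": ("Operational", "Reanalysis"),
    "InitialTimeCoordinateType": ("Numeric", "String"),
    "ThinnedGridInterpolation": ("Linear", "Cubic"),
}


def canonical_grib_option(name, value):
    """Return (name, value) in the documented spelling, or raise ValueError."""
    for known, choices in GRIB_CHOICES.items():
        if known.lower() != name.lower():
            continue
        for choice in choices:
            if choice.lower() == value.lower():
                return known, choice
        raise ValueError("%r is not a valid value for %s" % (value, known))
    raise ValueError("unknown GRIB option %r" % (name,))
```

The returned pair can be applied with `setattr(opt, name, value)` on an options instance.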

Class NioVariable: Variable contained in a NioFile object

PyNIO creates an NioVariable object for each variable in the file when the open_file constructor is called. PyNIO also creates an NioVariable object when you call the create_variable method on an NioFile object opened for writing.

NioVariable objects behave much like array objects defined in the NumPy module, except that their data resides in a file. Data is read by indexing and written by assigning to an indexed subset. You can access the complete contents of the variable using the [:] index notation or using the methods get_value and assign_value.
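
The read-by-indexing and write-by-assignment pattern can be sketched with a small helper. `copy_subset` is a hypothetical name, and the example uses plain Python lists as stand-ins so that it runs without PyNIO; with PyNIO, `source` and `target` would be NioVariable objects of compatible shape, the target belonging to a file opened for writing.

```python
def copy_subset(source, target, index=slice(None)):
    """Read source[index] and write it to the same positions in target."""
    data = source[index]   # indexing reads the selected elements
    target[index] = data   # assigning to an indexed subset writes them
    return data


# Lists stand in for NioVariable objects here.
src = [1.0, 2.0, 3.0, 4.0]
dst = [0.0, 0.0, 0.0, 0.0]
copy_subset(src, dst, slice(1, 3))  # copy elements 1 and 2 only
copy_subset(src, src)               # default index is [:], the whole variable
```

The default `slice(None)` corresponds to the [:] notation for the complete contents of a variable.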

NioVariable objects have 4 standard attributes:

Like the NioFile object attributes, these attributes should be considered read-only.

Methods:


Usage guide

This guide uses small snippets of sample code to document the basic capabilities of the PyNIO module. Operations that can be performed in read-only mode are listed first.
The following operations require the file to be opened for writing.

Note: the preceding usage information was modelled after this guide to the Python NetCDF module