PyNIO
PyNIO is a Python package that allows read and/or write access to a variety of data formats using an interface modelled on netCDF. PyNIO is composed of a C library called libnio along with a Python module based on, and with an interface similar to, the Scientific.IO.NetCDF module written by Konrad Hinsen. The C library contains the same data I/O code used in NCL, a scripting language developed for analysis and visualization of geo-scientific data. Currently supported formats include:
- netCDF (read/write)
- GRIB 1 and GRIB 2 (read-only) (GRIB2 support available in version 1.2.0 or later.)
- HDF 4 (read/write for Scientific DataSets only)
- HDFEOS 2 (read-only for Grid and Swath data only)
- CCM history files (read-only)
More detailed information about PyNIO's support for these formats is available at Data Formats Supported by the PyNIO module.
By default, as of version 1.2.0, PyNIO interfaces with the Python environment using the multi-dimensional array module, NumPy. Earlier versions of PyNIO default to its predecessor, Numeric. While the default has changed, version 1.2.0 still supports the Numeric interface, and it is expected that this support will continue for a reasonable transition period. Except for a few cases, noted in the documentation below, PyNIO's behavior is the same when used with either package. References to NumPy alone can be assumed to apply equally when Numeric is used.
To aid with the transition from Numeric to NumPy, one can use the conversion routines:
xnumeric = Numeric.asarray(numpy_array) # needs Numeric 24.2 to work
xnumpy = numpy.asarray(numeric_array)
Note that while the package name is PyNIO, the module name that is imported into a Python script is Nio.
Class NioFile: Access supported-format files for reading and/or writing
Constructor: open_file(filepath, mode='r', options=None, history='')
- filepath
The full or relative path of a file with data in a supported format. The file path may contain a home directory indicator starting with "~". A filepath extension (a suffix consisting of the final characters following the last dot) indicates the expected format. The extension is required in the filepath argument, regardless of whether it is part of the actual filename. Valid extensions include:
.nc, .cdf, .netcdf (NetCDF)
.grb, .grib, .grb1, .grib1, .grb2, .grib2 (GRIB 1 and GRIB 2) (GRIB2 support available in version 1.2.0 or later.)
.hd, .hdf (HDF)
.he2, .he4, .hdfeos (HDFEOS 2)
.ccm (CCM history files)
Extensions are matched without regard to case: .grib, .Grib, and .GRIB all indicate a GRIB file.
- mode
This optional argument specifies the access mode. If not specified it defaults to 'r' -- read-only. PyNIO has three access modes, some of which can be specified in more than one way:
'r' - Open an existing file for reading. PyNIO returns an error if the file does not exist. Attempts to modify the data will fail.
'w', 'r+', 'rw', 'a' - Open an existing file for modification or, if the file does not exist, create it. PyNIO returns an error if the file cannot be created because of file system access issues, or if a non-writable format is specified. Note that PyNIO never overwrites an existing file with an entirely new file.
'c' - Create a file open for writing. PyNIO returns an error if the file exists, if the file cannot be created because of file system access issues, or if a non-writable format is specified.
- options
This optional argument must be an instance of the NioOptions class created prior to calling the open_file constructor. It is used to set a number of file-format-specific options.
- history
This optional argument is a string specifying text to be appended to the global "history" attribute of a file open for writing. If no "history" attribute currently exists in the file, it is created.
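The filepath and mode rules above can be sketched in plain Python. This is only an illustration of the documented behavior, not PyNIO code: the names EXTENSION_FORMATS, MODE_BEHAVIOUR, READ_ONLY_FORMATS, and describe_open are hypothetical, and the case-insensitive lookup is an assumption based on the .grib/.Grib/.GRIB example.

```python
import os

# Extension table copied from the list above (hypothetical helper data).
EXTENSION_FORMATS = {
    ".nc": "NetCDF", ".cdf": "NetCDF", ".netcdf": "NetCDF",
    ".grb": "GRIB", ".grib": "GRIB", ".grb1": "GRIB", ".grib1": "GRIB",
    ".grb2": "GRIB", ".grib2": "GRIB",
    ".hd": "HDF", ".hdf": "HDF",
    ".he2": "HDFEOS", ".he4": "HDFEOS", ".hdfeos": "HDFEOS",
    ".ccm": "CCM",
}

# The five mode spellings collapse to three behaviours.
MODE_BEHAVIOUR = {
    "r": "read",
    "w": "modify-or-create", "r+": "modify-or-create",
    "rw": "modify-or-create", "a": "modify-or-create",
    "c": "create",
}

# Formats listed as read-only in the overview.
READ_ONLY_FORMATS = ("GRIB", "HDFEOS", "CCM")


def describe_open(filepath, mode="r"):
    """Return (format, behaviour) or raise ValueError, mirroring the rules above."""
    extension = os.path.splitext(filepath)[1].lower()  # assumed case-insensitive
    if extension not in EXTENSION_FORMATS:
        raise ValueError("unrecognized extension: %r" % extension)
    if mode not in MODE_BEHAVIOUR:
        raise ValueError("unrecognized mode: %r" % mode)
    file_format = EXTENSION_FORMATS[extension]
    behaviour = MODE_BEHAVIOUR[mode]
    if behaviour != "read" and file_format in READ_ONLY_FORMATS:
        raise ValueError("%s files can only be opened read-only" % file_format)
    return file_format, behaviour
```

For instance, describe_open("gribfile.GRIB") classifies the file as GRIB opened for reading, while asking for mode "w" on the same path raises an error, as the text says PyNIO does for non-writable formats.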
An NioFile object has three standard attributes:
- dimensions
- This attribute is a dictionary with dimension names as keys and dimension lengths as values.
- variables
- This attribute is a dictionary with variable names as keys and the variable reference objects as values.
- __dict__
- This attribute is a dictionary that contains the global attributes
associated with the file. It has the special property that its
contents are managed using the Python
getattr and setattr methods.
Methods:
- close(history='')
Closes the file. If the file was opened for writing, all buffers are flushed, ensuring the file is updated with all modifications. Any read or write access to the file or one of its variables after closing raises an exception. If the file is opened for writing, the optional argument history may be set with a string containing text to be appended to the global "history" attribute prior to closing the file. The "history" attribute will be created if it does not yet exist.
- create_dimension(name, length)
Creates a new dimension with the given name and length. length must be a positive integer or None, which stands for the unlimited dimension. Note that there can be only one unlimited dimension in a file.
- create_variable(name, type, dimensions)
Creates a new variable with the given name, type, and dimensions. The type is a 1 or 2 character string representing the type to be created. Note that only a subset of possible NumPy types can be created using the writable file formats currently supported by PyNIO. The valid type codes for the NumPy version of PyNIO are slightly different from the valid type codes for the Numeric version. See the usage guide for further details. The dimensions argument must be a tuple containing dimension names that have been previously defined. A scalar variable is specified using an empty tuple.
The return value is the NioVariable object representing the new variable.
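The constraints on create_dimension and on the dimensions argument of create_variable can be summarized with a small stand-in class. This is a sketch of the rules as stated above, not PyNIO's implementation; DimensionTable and check_variable_dimensions are hypothetical names.

```python
class DimensionTable(object):
    """Stand-in for the dimension bookkeeping described above (not PyNIO code)."""

    def __init__(self):
        self.dimensions = {}  # name -> length; None marks the unlimited dimension

    def create_dimension(self, name, length):
        if length is None:
            # Only one unlimited dimension is allowed per file.
            if None in self.dimensions.values():
                raise ValueError("only one unlimited dimension is allowed")
        elif not isinstance(length, int) or length <= 0:
            raise ValueError("length must be a positive integer or None")
        self.dimensions[name] = length

    def check_variable_dimensions(self, dimensions):
        """Validate a dimensions tuple and return the resulting rank."""
        if not isinstance(dimensions, tuple):
            raise TypeError("dimensions must be a tuple")
        missing = [name for name in dimensions if name not in self.dimensions]
        if missing:
            raise KeyError("undefined dimensions: %s" % ", ".join(missing))
        return len(dimensions)  # 0 for a scalar variable (empty tuple)


table = DimensionTable()
table.create_dimension("time", None)   # the unlimited dimension
table.create_dimension("lat", 180)
rank = table.check_variable_dimensions(("time", "lat"))
scalar_rank = table.check_variable_dimensions(())
```

A second call such as table.create_dimension("level", None) would be rejected, because the table already holds an unlimited dimension.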
Class NioOptions: Specify format-specific options
Constructor: options()
The NioOptions class enables you to specify a number of format-specific options. No options need to be set for basic access to any of the supported file formats.
In order to set any of the available format-specific options, you must create an instance of the NioOptions class prior to calling the NioFile constructor. Add attributes with the names of the options to be set along with their desired values, and then pass the NioOptions instance as an argument to the NioFile constructor. All option names and string-typed values are handled in a case-insensitive manner. Only options where the value is different from the default need to be set. The same NioOptions instance can be used for multiple calls to the NioFile constructor. Currently, there are valid options only for the NetCDF and GRIB formats.
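To illustrate what case-insensitive handling of option names and string values means in practice, the following sketch normalizes two differently spelled option sets and shows that they compare equal. The helper normalize_options is hypothetical and operates on plain dictionaries; it is not how PyNIO stores options internally.

```python
def normalize_options(options):
    """Lower-case option names and string values so spelling variants compare equal."""
    normalized = {}
    for name, value in options.items():
        if isinstance(value, str):
            value = value.lower()
        normalized[name.lower()] = value
    return normalized


first = normalize_options({"Format": "LargeFile", "PreFill": False})
second = normalize_options({"FORMAT": "largefile", "prefill": False})
same_settings = (first == second)
```

Non-string values such as the logical PreFill setting pass through unchanged.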
NetCDF file format options
- CompressionLevel (available in version 1.2.0 or later)
- Specify the level of data compression as an integer in the range 0 through 9. Increasing values indicate greater compression. Compression is lossless. There are tradeoffs between the time spent compressing the file, versus the amount of compression achieved. Informal tests show that compression level 9 results in a file only a few percent smaller than a compression level 5 file, but it requires 4 or 5 times the amount of time to create it. (This option is ignored unless the Format option is set to NetCDF4Classic.)
- Format
- This option has an effect only for files opened in
"create" mode. It currently has four valid values, two of which are synonyms.
The default value, "Classic", indicates
that a standard NetCDF file should be created. Standard NetCDF files
are more limited with respect to file size. Assuming the underlying file
system has support for large files, the total size can exceed 2 GB,
but there are severe restrictions regarding the number of large
variables and the order in which they are written. In general, because
it is more universal, the "classic" format is recommended if the
total file size will be less than 2 GB.
Specifying either "LargeFile" or "64BitOffset" results in the creation of a NetCDF file with support for larger variables and a theoretically much larger total size (about 9.22e+18 bytes). Each fixed-size variable, or each 'record' (element of the first dimension) of a variable with an unlimited dimension, can have a size of up to 4 GB. Assuming the underlying file system has support for large files, PyNIO reads NetCDF files in either the classic or the 64-bit offset format. For more detailed information about large file support in NetCDF see http://www.unidata.ucar.edu/software/netcdf/docs/netcdf/Large-File-Support.html.
In version 1.2.0 or later, you can specify "NetCDF4Classic" to create a file using the NetCDF 4 classic model format. The classic model constrains the interface to the constructs provided by NetCDF 3 and earlier. However, the underlying file format, like that of all NetCDF 4 files, is HDF 5. Files written in this format can take advantage of the built-in file compression available in HDF 5. Use the CompressionLevel option to enable compression. Also the HDF 5 format removes virtually all restrictions on file and individual variable size. PyNIO version 1.2.0 provides beta-level support for this format because NetCDF 4 and the release of HDF 5 that it depends on are both still in the beta-testing phase of development. It should probably not be used for mission-critical file creation and it is not yet available on every system that PyNIO runs on.
- HeaderReserveSpace
- This option has an effect only for files opened for writing. This
option reserves extra space within the header of a NetCDF file. Its value
is an integer that specifies the number of bytes to reserve
in addition to the bytes used for the
currently defined dimensions, variables, and attributes. This option
can improve performance when it is likely that new dimensions,
variables, or attributes will be added to an already large file.
- MissingToFillValue
- If set to its default value, True, this option causes a
"virtual" _FillValue attribute to be created for any variable
that has the attribute missing_value but not
_FillValue. The purpose is to more gracefully handle files that
use the COARDS-compliant missing_value instead of
_FillValue to indicate missing data. Note that if a variable
in a file has both a missing_value and a _FillValue, or
if it has neither, the option does nothing. The virtual
_FillValue attribute is not actually part of the NetCDF file,
but only appears to be from within PyNIO.
However, if the file is opened for writing and you assign to the
attribute, it becomes an actual attribute.
- PreFill
- This option has an effect only when a file is opened for
writing. It is logical-valued with a default value of True. If
set False, PyNIO alters the standard behavior
of the NetCDF library such that variable element locations in the file
are not "pre-filled" with the missing (fill) value associated with the
variable. This can noticeably improve performance when writing large
datasets. However, if you set this option False, you are
responsible for ensuring that all the elements of the variables you
have defined are assigned a valid value.
- SafeMode
- This logical-valued option may be set for any NetCDF file. Its default value is False, meaning that PyNIO only closes a NetCDF file when the close method is invoked. If set to True, PyNIO closes the file after each operation it performs, including defining a dimension or variable, adding or modifying an attribute, or reading or writing data from any variable. This helps ensure the file's integrity for writable files if the close method does not get called for some reason. However, it may result in loss of performance, particularly when adding new variables, dimensions, or attributes to files that already have large variables defined. This is because each time a new element is defined, all existing data in the file must be moved to make room for the metadata of the new element in the header. One way to mitigate the performance loss is to use the HeaderReserveSpace option when first creating the file to make room in the header for subsequently defined NetCDF elements.
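The rule described under MissingToFillValue can be written out as a few lines of plain Python. This is a sketch under assumptions: the function visible_attributes is hypothetical, attributes are modelled as an ordinary dictionary, and the virtual _FillValue is assumed to take the same value as missing_value, which the description implies but does not state outright.

```python
def visible_attributes(file_attributes, missing_to_fill_value=True):
    """Return the attributes a reader would see for one variable.

    file_attributes holds the attributes actually stored in the file;
    the returned dictionary may add a virtual _FillValue on top of them.
    """
    visible = dict(file_attributes)  # never modify what is stored in the file
    has_missing = "missing_value" in visible
    has_fill = "_FillValue" in visible
    # The option acts only when missing_value is present and _FillValue is absent.
    if missing_to_fill_value and has_missing and not has_fill:
        visible["_FillValue"] = visible["missing_value"]  # assumed to share the value
    return visible


stored = {"units": "meters", "missing_value": -99999}
seen = visible_attributes(stored)
seen_without_option = visible_attributes(stored, missing_to_fill_value=False)
```

The stored dictionary is left untouched, which mirrors the statement that the virtual attribute is not actually part of the NetCDF file.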
GRIB file format options
- DefaultNCEPPtable (ignored unless file is in GRIB 1 format)
- This option has two valid values:
"Operational", the default, or "Reanalysis". It
specifies whether to default to the use of the NCEP operational
parameter table
(http://www.ncl.ucar.edu/Document/Manuals/Ref_Manual/ncep_opn.htm)
or the NCEP reanalysis parameter table
(http://www.ncl.ucar.edu/Document/Manuals/Ref_Manual/ncep_reanal.htm).
The option only applies in cases where PyNIO, on its own, cannot
definitively determine which of these tables to use because of
historical ambiguities in NCEP usage.
- InitialTimeCoordinateType
- This string-valued option has two valid values: "Numeric",
the default, or "String". Note that in PyNIO's representation of
a GRIB file the initial time dimension is distinguished from the
forecast time dimension, whose coordinate values are numerical offsets
from a particular initial time. The default value results in
initial time coordinates that are COARDS and CF compliant, with the time
represented in units of hours since 1800-01-01. Setting the option to
"String" results in human-readable time coordinates, but with the disadvantage
that they are not compliant with standard conventions and are likely not to
be understood by many processing and visualization software packages. Note that
in either case both the string and numerical coordinates are available as variables -- the
only difference is which is considered to be the coordinate dimension.
- SingleElementDimensions (available in version 1.2.0 or later)
- This option allows the user to specify that variables with only a
single initial time, forecast time, level, ensemble or probability
value, usually handled as attributes, be treated as containing single
element dimensions. It is a string-valued option whose default value
"None" means that no single-element dimensions will appear in
PyNIO's representation of the GRIB file. Conversely, if the option is
given the value "All", then all possible dimensions will be
created for each variable. Otherwise, the desired single element
dimensions may be specified individually. The valid choices are
"Initial_time", "Forecast_time", "Level",
"Ensemble", and "Probability".
Note that dimensions are not created if the variable does not have an actual value associated with the dimension type, regardless of the value given to this option. For example, variables that are not part of an ensemble forecast will never have an ensemble dimension, and variables whose level type (e.g. Tropopause) does not have a numerical value will never have a level dimension. In the case of level types, it may depend on who wrote the record: files written by some centers may give no value for certain level types where others may use a numerical value such as 0.
The intent of this option is to make it easier to concatenate conforming variables from multiple files together.
- ThinnedGridInterpolation
- This string-valued option has two valid values: "Linear", the default, or "Cubic". It has an effect only for GRIB files that contain data on a thinned grid. The GRIB documentation refers to these grids as "quasi-regular". The option controls the interpolation performed in converting variable data on the grid to the standard rectangular form that is returned by PyNIO.
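The numeric initial-time coordinate described under InitialTimeCoordinateType is expressed in hours since 1800-01-01. A minimal sketch of converting between that representation and Python datetime objects follows; the function names are hypothetical, and the conversion assumes the proleptic Gregorian calendar used by the standard datetime module.

```python
from datetime import datetime, timedelta

# Reference time taken from the text above: hours since 1800-01-01.
EPOCH = datetime(1800, 1, 1)


def to_hours_since_1800(when):
    """Convert a datetime to a numeric coordinate value in hours."""
    return (when - EPOCH).total_seconds() / 3600.0


def from_hours_since_1800(hours):
    """Convert a numeric coordinate value in hours back to a datetime."""
    return EPOCH + timedelta(hours=hours)


one_day = to_hours_since_1800(datetime(1800, 1, 2))
initial_time = datetime(2006, 5, 3, 12)
round_trip = from_hours_since_1800(to_hours_since_1800(initial_time))
```

Here one_day is 24.0, and round_trip equals initial_time.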
Class NioVariable: Variable contained in a NioFile object
PyNIO creates an NioVariable object for each variable in the file when the open_file constructor is called. PyNIO also creates an NioVariable object when you call the create_variable method on an NioFile object for writing.
NioVariable objects behave much like array objects defined in the NumPy module, except that their data resides in a file. Data is read by indexing and written by assigning to an indexed subset. You can access the complete contents of the variable using the [:] index notation or using the methods get_value and assign_value.
NioVariable objects have 4 standard attributes:
- rank
- This attribute is a scalar value indicating the number of dimensions in the variable.
- shape
- This attribute is a tuple containing the numbers of elements for each dimension in the same order as the dimensions themselves.
- dimensions
- This attribute is a tuple containing the names of the dimensions in order, left to right, of the slowest varying dimension to the fastest varying dimension.
- __dict__
- This attribute is a dictionary containing the attributes
associated with the variable in the file. It has the special property
that its contents are managed using the Python
getattr and setattr methods.
Like the NioFile object attributes, these attributes should be considered read-only.
Methods:
- assign_value(value)
Assigns value to the variable. This method is the only way to assign values to scalar variables, which cannot be indexed. Otherwise, the method requires that all elements of the variable array be supplied, since there is no way of indicating a slice.
- get_value()
Returns the value of the variable. This method is the only way to access the value of scalar variables, which cannot be indexed.
- typecode()
Return the variable's type code.
Usage guide
This guide uses small snippets of sample code to document the basic capabilities of the PyNIO module. Operations that can be performed in read-only mode are listed first.
Importing the module
Using PyNIO version 1.2.0 or later:
from numpy import *
import Nio
PyNIO requires the NumPy module to be installed in the Python distribution you are using, because it depends on its C API to implement its data array interface. Importing the NumPy interface provides many useful tools for manipulating NumPy arrays. Therefore it is recommended that you import NumPy into your Python script or interactive session. However, it is not required for the basic operations provided by PyNIO.
To use the Numeric module instead of the NumPy module, assuming the Numeric version of PyNIO is installed, substitute the following:
from Numeric import *
import PyNIO_numeric.Nio as Nio
To import the NumPy module using earlier versions of PyNIO:
from numpy import *
import PyNIO_numpy.Nio as Nio
To import the Numeric module using earlier versions of PyNIO:
from Numeric import *
import Nio
Open a supported-format file
The constructor for an NioFile object is open_file. It gathers all the essential metadata from the file and makes it available to the Python script.
f = Nio.open_file("gribfile.grb")
Open a GRIB-formatted file in the current directory named either "gribfile.grb" or "gribfile" using the default read-only mode. Note that the extension ".grb" need not be present in the actual file name, but serves to indicate the expected format of the file. Since GRIB is a read-only format for PyNIO, this file could not be opened using any mode other than read-only.
f = Nio.open_file("~/data/netcdffile.nc","c")
Create a new file called "netcdffile.nc" under the user's home directory in a directory called "data". This call will fail if a file with this name and path already exists.
The other access modes ("w","rw","a","r+") all mean the same thing: open a file for writing. Do not overwrite an existing file, but allow modifications to it. Create a new file if it does not exist.
Open a supported-format file, setting options
opt = Nio.options()
opt.PreFill = False
opt.HeaderReserveSpace = 4000
f = Nio.open_file("netcdffile.nc","w",opt,"modified 2006/05/03")
Open a file called "netcdffile.nc" for writing in the current directory. If it does not exist, create it. To improve performance, set the option for prefilling the elements of newly created variables to False. Also reserve 4000 additional bytes of space in the file header to hold new dimensions, variables, and/or attributes that are to be created. Finally, add a line to the global 'history' attribute.
f = Nio.open_file("netcdffile.nc",options=opt,history="modified 2006/05/03",mode="w")
Alternate version of the previous open_file call using named arguments.
Print a summary of the file contents
print f
Here is a sample print of a NetCDF file named "mars.nc":
Nio file: mars.nc
   global attributes:
      title : 1 degree mars topography
      creation_date : Wed Mar 8 17:33:51 MST 2000
      Conventions : COARDS
   dimensions:
      lon = 360
      lat = 180
   variables:
      float lon [ lon ]
         units : degrees_east
         long_name : longitude
      float lat [ lat ]
         units : degrees_north
         long_name : latitude
      float elev [ lat, lon ]
         units : meters
         long_name : median topography
         _FillValue : -99999
Close a file
f.close()
f.close('modified 2006/05/03')
Close the file, adding a line to the global 'history' attribute.
Get the global attributes in the file
This construct returns the global attribute names from the file:
globalAtts = f.__dict__.keys()
The global attribute names are also returned by the Python dir() function, as in:
globalAtts_plus = dir(f)
However, in this case, mixed in with the actual global attributes are the methods attached to the NioFile object. If you are only interested in the file attributes, you would need to cull out the method attributes: 'close', 'create_dimension', and 'create_variable'.
You can get all the global attributes along with their associated values by getting the whole attribute dictionary. The following two lines of code are equivalent:
globalAttsAndVals = f.__dict__
globalAttsAndVals = getattr(f,'__dict__')
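Culling the method names out of a dir() listing can be sketched as follows. Because PyNIO may not be installed where this runs, the example uses a stand-in class, FakeNioFile, that carries the three method names listed above plus two sample attributes; both the class and the helper attribute_names are hypothetical. The extra filter on names beginning with a double underscore is needed for an ordinary Python class and is an assumption, not something the text says about real NioFile objects.

```python
class FakeNioFile(object):
    """Stand-in with the three NioFile method names; not a real NioFile."""

    def __init__(self):
        self.title = "1 degree mars topography"
        self.Conventions = "COARDS"

    def close(self, history=""):
        pass

    def create_dimension(self, name, length):
        pass

    def create_variable(self, name, type, dimensions):
        pass


NIOFILE_METHODS = ("close", "create_dimension", "create_variable")


def attribute_names(obj):
    """Return dir(obj) without the NioFile methods or Python special names."""
    return [name for name in dir(obj)
            if name not in NIOFILE_METHODS and not name.startswith("__")]


global_atts = attribute_names(FakeNioFile())
```

With a real file object, f.__dict__.keys() remains the more direct way to get the same list.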
Get the value of a global attribute by name
The following three lines are equivalent:
globalAttVal = f.globalAttName
globalAttVal = f.__dict__['globalAttName']
globalAttVal = getattr(f,'globalAttName')
Note that all numerical attribute values are returned as NumPy arrays, including scalar values. Character array attributes are returned as Python strings.
Check for the existence of a global attribute
if hasattr(f,'globalAttName'): do_whatever
Get the dimension names from the file
dimNames = f.dimensions.keys()
Get the size of a dimension by name
dimSize = f.dimensions['dimName']
Get the sizes of all dimensions in the same order as the names
dim_sizes = f.dimensions.values()
Get all the variable names in the file
varNames = f.variables.keys()
Print a summary of a variable in the file
print f.variables['varName']
Here is a sample print of the variable "elev":
Variable: elev
Type: float
Total Size: 259200 bytes
            64800 values
Number of Dimensions: 2
Dimensions and sizes: [lat | 180] x [lon | 360]
Coordinates:
            lat: [89.5..-89.5]
            lon: [0.5..359.5]
Number of Attributes: 3
  units : meters
  long_name : median topography
  _FillValue : -99999
Get a variable object reference to a specific variable in the file
var = f.variables['varName']
Get the type, number of dimensions, dimension sizes, and dimension names of a variable object
Assuming
var = f.variables['varName']
then
type = var.typecode()
numDims = var.rank
dimSizes = var.shape
dimNames = var.dimensions
Note that typecode() is a function, while rank, shape and dimensions are attributes of the NioVariable object type.
If the variable is scalar, the rank attribute is set to 0, while the shape and dimensions attributes are both empty tuples.
The typecode method generally returns a 1 or 2 character string with the same significance as the type argument in the NioFile create_variable method. Note however that in the current release the typecode returned for a variable created as type 'long' is always 'i'. Also note that GRIB files that have an ensemble dimension contain an ensemble_info variable that is an array of strings. In this case, the NumPy version of PyNIO returns the type as 'Sn' where n represents the length of the strings. The Numeric version returns the type as 'O', which stands for the Numeric 'object' type.
Get the attributes of a variable
Assuming
var = f.variables['varName']
varAtts = var.__dict__.keys()
As with global attributes, you can also get the variable attributes using the dir() function. However, the methods of the NioVariable object are also included in the return value. These are 'assign_value', 'get_value', and 'typecode'.
varAtts_plus = dir(var)
You can get all the variable attributes and their associated values by getting the whole attribute dictionary. The following two lines of code are equivalent:
varAttsAndVals = var.__dict__
varAttsAndVals = getattr(var,'__dict__')
Get the value of a variable attribute
Assuming
var = f.variables['varName']
The following 3 lines are equivalent:
varAttVal = var.varAttName
varAttVal = var.__dict__['varAttName']
varAttVal = getattr(var,'varAttName')
Note that all numerical attribute values are returned as NumPy arrays, including scalar values. Character array attributes are returned as Python strings.
Check for the existence of a variable attribute
if hasattr(f.variables['varName'],'varAttName'): do_whatever
Get the data in a variable object into a NumPy array
If the variable is not scalar (rank > 0):
data = f.variables['varName'][:]
or
data = f.variables['varName'].get_value()
Assuming the shape of the variable is (5,20,30), get a slice along the first element of the first dimension:
var = f.variables['varName']
data = var[0]
Get the same slice but reverse the elements of the second dimension. The following two statements are equivalent:
data = var[0,::-1,:]
data = var[0,19::-1,:]
Note that when using a negative stride in Python the first subscript must be larger than the second. Also, since the second subscript is an excluded bound, logically you would need a -1 for the second index if you wanted to explicitly specify all indexes including 0 (for this example, var[0,19:-1:-1,:] to reverse the entire second dimension). However, because negative subscripts in Python wrap around from the highest array index, a '-1' as the second subscript is equivalent to the '19', and the subscript 19:-1:-1 selects only one element of the 20 element array dimension.
If the variable is scalar (rank == 0):
Scalar NioVariable objects cannot be subscripted. Therefore the only way to retrieve the variable value is to use the get_value method:
scalar_data = var.get_value()
Note that scalar_data is still a NumPy array. It can be distinguished as a scalar value because its shape attribute is an empty tuple:
scalar_data.shape == ()
evaluates to True.
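The negative-stride behavior discussed earlier for non-scalar variables can be tried on an ordinary Python list, which follows the same wrap-around rule for negative subscripts. This sketch uses a 20-element list as a stand-in for the second dimension of the example variable. For a plain list the wrapped slice 19:-1:-1 comes back empty, whereas the note above describes an NioVariable returning a single element; in either case it does not reverse the dimension.

```python
# A 20-element list stands in for one dimension of length 20.
row = list(range(20))

# Omitting the bounds reverses the whole dimension.
reversed_implicit = row[::-1]

# Starting at the last index and omitting the stop does the same.
reversed_from_19 = row[19::-1]

# A stop of -1 wraps around to index 19, so nothing below 19 is reached.
wrapped = row[19:-1:-1]
```

The first two slices are identical and begin with 19; the third does not produce the reversed list.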
The following operations require the file to be opened for writing.
Create a global attribute in the file
The following 2 statements are equivalent:
f.globalAttName = globalAttVal
setattr(f,'globalAttName',globalAttVal)
Note that creating an attribute by creating a new element in the __dict__ dictionary does not work even though no error is raised and the attribute subsequently appears to be present in the dictionary.
Create a dimension in the file
Assuming 'dimName' is a string, rather than a variable, create a dimension of length 74:
f.create_dimension('dimName',74)
Create a variable in the file
Create a variable named 'varName', of type float, with two dimensions named 'dim1' and 'dim2'. The dimensions must already exist. An NioVariable object is returned.
var = f.create_variable('varName','f',('dim1','dim2'))
Types corresponding to the single character type designators that can be used to create file variables are as follows:
(When using the NumPy version of PyNIO)
- 'd': 64 bit float
- 'f': 32 bit float
- 'l': long
- 'i': 32 bit integer
- 'h': 16 bit integer
- 'b': 8 bit integer
- 'S1': character
(When using the Numeric version of PyNIO)
- 'd': 64 bit float
- 'f': 32 bit float
- 'l': long
- 'i': 32 bit integer
- 's': 16 bit integer
- 'b': 8 bit integer
- 'c': character
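The two tables differ only in the codes for 16-bit integers and characters. When porting a script from the Numeric version to the NumPy version, a small lookup like the following could translate the codes; the dictionary and function names are hypothetical, and the mapping is taken directly from the two lists above.

```python
# Numeric-version type code -> NumPy-version type code, per the tables above.
NUMERIC_TO_NUMPY = {
    "d": "d",    # 64 bit float
    "f": "f",    # 32 bit float
    "l": "l",    # long
    "i": "i",    # 32 bit integer
    "s": "h",    # 16 bit integer
    "b": "b",    # 8 bit integer
    "c": "S1",   # character
}


def numpy_typecode(numeric_code):
    """Translate a Numeric-version type code to its NumPy-version equivalent."""
    return NUMERIC_TO_NUMPY[numeric_code]


# Codes that actually change between the two versions.
changed_codes = sorted(code for code, new in NUMERIC_TO_NUMPY.items() if code != new)
```

Only 'c' and 's' end up in changed_codes; every other code can be passed to create_variable unchanged.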
The long ('l') type is 32 bits (the same as an integer 'i' type) on 32-bit computing platforms, but it is 64 bits in versions of PyNIO created for 64-bit computers. However, because NetCDF and HDF 4, the writable file formats, do not support 64-bit integers, PyNIO converts 64-bit integers to 32-bit integers before writing data to the file. This is true both for NumPy and Numeric versions of PyNIO. If the data contains values that exceed the maximum value of a 32-bit integer, truncation will occur without, in the current version, any warning.
The dimensions parameter is a tuple. If you are creating a single dimensional variable, and, therefore, the tuple has only one element, you must place a comma after the dimension name for Python to recognize it properly, e.g.: ('dim1',). To create a scalar variable use an empty tuple: ( ).
Create an attribute of a variable in the file
The following 2 statements are equivalent:
f.variables['varName'].varAttName = varAttVal
setattr(f.variables['varName'],'varAttName',varAttVal)
As with global attributes, attempting to create an attribute by modifying the variable's __dict__ dictionary does not work even though no error is raised and the attribute subsequently appears to be present in the dictionary.
Assigning values to a variable in the file
If the variable is not scalar (rank > 0):
You assign values to a file variable either using the subscript syntax or using the assign_value method. Assuming data is a NumPy array or a Python sequence containing the same number of elements as the target file variable, the following statements are equivalent:
f.variables['varName'][:] = data
f.variables['varName'].assign_value(data)
Also note that, even though it may look somewhat counterintuitive, assigning to an NioVariable object variable also works:
var = f.variables['varName'] # create an NioVariable object variable
var[:] = data # or var.assign_value(data)
Using other subscripting syntax, you can assign to subsets of the file variable. For instance, suppose var.shape is equal to (5,6). Then you could make assignments like the following:
var[1] = ( 1,2,3,4,5,6 ) # assign values to all elements indexed by the second element of the first dimension
var[1:3,2:4] = ((12,13),(22,23)) # assign values to a 2x2 rectangular sub-space of the variable
You can assign values to character variables using Python strings. The last dimension of the variable should have a size equal to the length of the string. Strings used to populate multidimensional character variables should all have the same length. Pad shorter strings with blanks at the end.
If the variable is scalar (rank == 0):
Scalar NioVariable objects cannot be subscripted. Therefore the only way to assign a value to a scalar variable is to use the assign_value method:
f.variables['scalar_var'].assign_value(42.0)
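For the character-variable case mentioned above, strings need to be padded with trailing blanks to a common length before assignment. A minimal sketch of that preparation step follows; pad_strings is a hypothetical helper and does not call PyNIO. The width would normally be the size of the variable's last dimension.

```python
def pad_strings(strings, width=None):
    """Pad each string with trailing blanks so all have the same length.

    width defaults to the length of the longest string; pass the size of the
    character variable's last dimension to match an existing variable.
    """
    if width is None:
        width = max(len(s) for s in strings)
    too_long = [s for s in strings if len(s) > width]
    if too_long:
        raise ValueError("strings longer than %d characters: %r" % (width, too_long))
    return [s.ljust(width) for s in strings]


names = pad_strings(["lat", "lon", "elev"])
wide_names = pad_strings(["lat", "lon"], width=6)
```

Each entry of names is four characters long, so the list could populate a character variable whose last dimension has length 4.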
Note: the preceding usage information was modelled after this guide to the Python NetCDF module