4.1NCSA HDF Calling Interfaces and Utilities

Storing Rectangular Gridded Arrays of Scientific Data4.1

National Center for Supercomputing Applications

November 1989














Chapter 4  Storing Rectangular Gridded Arrays of Scientific Data



Chapter Overview
Scientific Datasets
Reasons to Use Scientific Datasets
Header File
Writing Scientific Datasets to a File
The "Set" Routines:  Preparing to Write Scientific Datasets
Writing Scientific Datasets to a File
Writing Parts of a Scientific Dataset
Reading Scientific Datasets from a File
Getting the Dimensions of a Scientific Dataset
Reading an Entire Scientific Dataset
Getting Other Information About SDSs
Reading Parts of a Scientific Dataset
How SDS Routines Store and Convert Scientific Data
How HDF Normally Stores Arrays
How HDF Normally Represents Numbers
DFSDsettype:  Setting Scientific Dataset Attributes
Sample Programs
A FORTRAN Program
A C Program

Chapter Overview

This chapter describes the routines that are available for storing and retrieving scientific datasets.


Scientific Datasets

A scientific dataset (SDS) is an HDF set that stores rectangular gridded arrays of floating-point data, together with information about the data. Specifically, an SDS is a set of tags and associated information about scientific data. Assuming that a user of scientific data often will want information about the data, an SDS might contain the following information:

• The actual data values in floating-point format

• The coordinate system to be used when interpreting or displaying the data

• The number of dimensions (rank) and the size of each dimension

• Scales to be used along the different dimensions when interpreting or displaying the data

• Labels for all dimensions (each label can be thought of as the name of an independent variable) and for the data (the dependent variable)

• Units for all dimensions and for the data

• Format specifications to be used when displaying values for the dimensions and for the data

• The maximum and minimum values of the data

Other information could be added at a later time, such as the data type for the data values (other than floating-point), data types for the dimensions, and identification of regions of interest.

Figure 4.1 shows a conceptual view of a sample scientific dataset. The actual two-dimensional array of data is only one element in the set. Other elements include the number of dimensions (rank), the sizes of the dimensions, identifying information about the data and axes, and scales (ranges) for the axes.

Figure 4.1  HDF File with Scientific Dataset




The HDF library provides an SDS interface with routines for storing and retrieving scientific datasets. This user interface lets you (a) build an SDS and (b) extract data from an SDS. These routines can be called from C and FORTRAN programs that have access to the library. They work on several computers, including the Cray, Silicon Graphics, Alliant, Sun, VAX, Macintosh, and IBM PC.

All routines are functions of type integer.

Table 4.1 lists the long and short names and the functions of the SDS routines currently contained in the HDF library. The following sections provide descriptions and examples of these calling routines.

Table 4.1  Scientific Dataset Routines in the HDF Library

Long Name         Short Name   Function

DFSDsetlengths    dsslens      sets maximum lengths for strings that will hold labels, units, formats, and the name of the coordinate system.

DFSDsetdims       dssdims      sets the default rank and dimension sizes for succeeding datasets.

DFSDsetdimstrs    dssdist      sets label, unit, and format specifications for a dimension and its scale.

DFSDsetdimscale   dssdisc      sets the scale for a dimension.

DFSDsetdatastrs   dssdast      sets label, unit, and format specifications for the data.

DFSDsetmaxmin     dssmaxm      sets maximum and minimum data values.

DFSDputdata       dspdata      writes the data to the file, overwriting other file contents.

DFSDadddata       dsadata      appends the data to the file, not overwriting other file contents.

DFSDclear         dsclear      clears all possible set values.

DFSDstartslice    dssslc       prepares system to write part of a dataset to a file.

DFSDputslice      dspslc       writes part of a dataset to a file.

DFSDendslice      dseslc       indicates write completion for part of a dataset.

DFSDgetdims       dsgdims      gets the number of dimensions of the dataset and the sizes of the dimensions for the next SDS in the file.

DFSDgetdata       dsgdata      reads the next dataset in the file.

DFSDgetdimstrs    dsgdist      reads the label, unit, and format for a dimension and its scale.

DFSDgetdimscale   dsgdisc      reads the scale for a dimension.

DFSDgetdatastrs   dsgdast      reads the label, unit, and format specification for the data.

DFSDgetmaxmin     dsgmaxm      reads the maximum and minimum values.

DFSDrestart       dsfirst      sets the next get command to read from the first SDS in the file, rather than the next.

DFSDgetslice      dsgslc       reads part of a dataset.

DFSDsettype       dsstype      specifies data attributes: data type and representation, system type, and array order.


Reasons to Use Scientific Datasets

HDF scientific datasets are self-describing. This means that you or your programs can find out from the file itself what must be known in order to interpret or display the scientific data in the file. You do not have to look elsewhere for this information or do without it.

NCSA's DataScope, for instance, uses the information from an SDS for displaying scientific data and images created from the data. If the coordinate system is polar, it displays one kind of image; if the coordinate system is Cartesian, it displays another. Because an SDS can hold scales for each dimension, DataScope can present the data or its image in a more meaningful context than would a display with no scales along the axes.

Sometimes it is useful to be able to store different kinds of data in a single file. With scientific datasets, you can store a variety of different scientific datasets, each accompanied by its self-description. You also can store one or more raster images, derived from a dataset, in the same file with a scientific dataset.

A frequent problem among scientists who want to share data is that datasets from different sources are often stored in very different formats. Since SDS provides a common data format, different machines and different programs can access the same files without relying on user-written conversion routines or trying to decipher special data formats when new data is encountered.


Header File

The header file dfsdg.h contains the declarations and definitions that are used by the routines listed here. This file can, if needed, be included with your C source code, and in some cases also with FORTRAN code.


Writing Scientific Datasets to a File

SDS information is written to a file in two steps. The first involves execution of a series of "set" calls, which put information about the actual data into a special structure in primary memory. If you do not wish to specify a certain item, you need not invoke its corresponding set call.

The second step involves actually writing the data to a file, along with the information that has been set. To do this, execute either DFSDputdata or DFSDadddata.

In general, you perform these same two steps for each dataset you want to write to your file. However, it is not usually necessary to perform all set calls for every dataset you wish to write out. For example, if the rank and dimensions of all datasets are exactly the same, you only have to call the routine DFSDsetdims before writing out the first set. The HDF software remembers the rank and dimension values and associates them with all subsequent data arrays that are written to the same file, unless you change them.

In other words, once an item has been set, it does not normally go away even after a DFSDputdata or DFSDadddata call. If it does not get overwritten by another set call or otherwise get cleared, the same item is associated with all subsequent scientific datasets that are written to the same file. (The only exception to this is that the values set by DFSDsetmaxmin are cleared after they are written to a file.) The two functions, DFSDclear and DFSDsetdims, cause all previous set calls to be cleared. 


The "Set" Routines:  Preparing to Write Scientific Datasets
DFSDsetlengths
FORTRAN:
INTEGER FUNCTION DFSDsetlengths(maxlen_label, maxlen_unit, maxlen_format, maxlen_coordsys)

INTEGER maxlen_label     - max length of any label
INTEGER maxlen_unit      - max length of any unit
INTEGER maxlen_format    - max length of any format
INTEGER maxlen_coordsys  - max length of any coordsys

C:
int DFSDsetlengths(maxlen_label, maxlen_unit, maxlen_format, maxlen_coordsys)

int maxlen_label;     /* max length of any label */
int maxlen_unit;      /* max length of any unit */
int maxlen_format;    /* max length of any format */
int maxlen_coordsys;  /* max length of any coordsys */

Purpose:  To set, optionally, the maximum lengths for the strings that will hold labels, units, formats, and the name of the coordinate system.

Returns:  0 on success; -1 on failure.

These lengths are used by the routines DFSDgetdimstrs and DFSDgetdatastrs to determine the maximum lengths of strings that they get from the file.

Normally, DFSDsetlengths is not needed. If it is not called, default maximum lengths of 255 are used for all strings.


DFSDsetdims
FORTRAN:
INTEGER FUNCTION DFSDsetdims(rank, dimsizes)

INTEGER rank            - number of dimensions
INTEGER dimsizes(rank)  - sizes of the dimensions

C:
int DFSDsetdims(rank, dimsizes)

int   rank;        /* number of dimensions */
int32 dimsizes[];  /* sizes of the dimensions */

Purpose:  To set the rank and dimension sizes for subsequent scientific datasets that are written to the file.

Returns:  0 on success; -1 on failure.

This routine must be called before calling DFSDsetdimstrs and DFSDsetdimscale. DFSDsetdims need not be called if these routines are not called, and the correct dimensions are supplied in DFSDputdata or DFSDadddata.

If rank or dimension sizes change, all previous set calls are cleared.


DFSDsetdimstrs
FORTRAN:
INTEGER FUNCTION DFSDsetdimstrs(dim, label, unit, format)

INTEGER       dim     - dimension this label, unit and format refer to
CHARACTER*256 label   - label that describes this dimension
CHARACTER*256 unit    - unit to be used with this dimension
CHARACTER*256 format  - format to be used in displaying scale for this dimension

C:
int DFSDsetdimstrs(dim, label, unit, format)

int  dim;      /* dimension this label, unit and format refer to */
char *label;   /* label that describes this dimension */
char *unit;    /* unit to be used with this dimension */
char *format;  /* format to be used to display scale */

Purpose:  To set the items corresponding to dimension dim that are to be stored as strings in the SDS, namely label, unit, and format.

Returns:  0 on success; -1 on failure.


DFSDsetdimscale
FORTRAN:
INTEGER FUNCTION DFSDsetdimscale(dim, dimsize, scale)

INTEGER dim             - dimension this scale goes with
INTEGER dimsize         - size of scale
REAL    scale(dimsize)  - the scale

C:
int DFSDsetdimscale(dim, dimsize, scale)

int     dim;      /* dimension this scale corresponds to */
int32   dimsize;  /* size of scale */
float32 scale[];  /* the scale */

Purpose:  To set the scale corresponding to dimension dim by taking it from the floating-point array scale.

Returns:  0 on success; -1 on failure.

The input parameter dimsize gives the size of the array scale.


DFSDsetdatastrs
FORTRAN:
INTEGER FUNCTION DFSDsetdatastrs(label, unit, format, coordsys)

CHARACTER*256 label     - label that describes the data
CHARACTER*256 unit      - unit to be used with the data
CHARACTER*256 format    - format to be used in displaying data
CHARACTER*256 coordsys  - coordinate system

C:
int DFSDsetdatastrs(label, unit, format, coordsys)

char *label;     /* label that describes the data */
char *unit;      /* unit to be used with the data */
char *format;    /* format to be used in displaying the data */
char *coordsys;  /* coordinate system */

Purpose:  To set the items corresponding to the data that are to be stored as strings in the SDS, namely label, unit, format, and coordsys (coordinate system).

Returns:  0 on success; -1 on failure.


DFSDsetmaxmin
FORTRAN:
INTEGER FUNCTION DFSDsetmaxmin(max, min)

REAL max  - highest value in data array
REAL min  - lowest value in data array

C:
int DFSDsetmaxmin(max, min)

float32 max;  /* highest value in data array */
float32 min;  /* lowest value in data array */

Purpose:  To set maximum and minimum data values.

Returns:  0 on success; -1 on failure.

This routine does not compute the maximum and minimum values. It merely stores the values it is given as maximum and minimum.

NOTE:  When the maximum and minimum values are written to a file, the HDF element that holds these values is cleared, because it is assumed that subsequent datasets will have different values for max and min.
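
Because DFSDsetmaxmin only stores the values it is given, the calling program has to find them first. The following is a minimal sketch of one way to do that in C; the helper name find_max_min is hypothetical and is not part of the HDF library.

```c
#include <stddef.h>

/* Hypothetical helper (not an HDF routine): scans n values and stores
 * the largest in *max and the smallest in *min.
 * Returns 0 on success, -1 if the array is empty or a pointer is NULL. */
int find_max_min(const float *data, size_t n, float *max, float *min)
{
    size_t i;

    if (data == NULL || max == NULL || min == NULL || n == 0)
        return -1;

    *max = data[0];
    *min = data[0];
    for (i = 1; i < n; i++) {
        if (data[i] > *max) *max = data[i];
        if (data[i] < *min) *min = data[i];
    }
    return 0;
}
```

The two results could then be passed to DFSDsetmaxmin before the dataset is written.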


Writing Scientific Datasets to a File
DFSDputdata
FORTRAN:
INTEGER FUNCTION DFSDputdata(filename, rank, dimsizes, data)

CHARACTER*64 filename        - name of file to store SDS in
INTEGER      rank            - number of dimensions of data array to be stored
INTEGER      dimsizes(rank)  - array that holds sizes of dimensions
REAL         data(*)         - array holding data to be stored

C:
int DFSDputdata(filename, rank, dimsizes, data)

char    *filename;   /* name of file to store SDS in */
uint16  rank;        /* number of dimensions of data array to be stored */
int32   dimsizes[];  /* array that holds sizes of dimensions */
float32 *data;       /* array holding data to be stored */

Purpose:  To write to the file the scientific data in the floating-point array data, as well as all other information that has previously been set.

Returns:  0 on success; -1 on failure.

If the file is not empty, DFSDputdata overwrites whatever was previously in the file.


DFSDadddata
FORTRAN:
INTEGER FUNCTION DFSDadddata(filename, rank, dimsizes, data)

CHARACTER*64 filename        - name of file to store SDS in
INTEGER      rank            - number of dimensions of data array to be stored
INTEGER      dimsizes(rank)  - array that holds sizes of dimensions
REAL         data(*)         - array holding data to be stored

C:
int DFSDadddata(filename, rank, dimsizes, data)

char    *filename;   /* name of file to store SDS in */
uint16  rank;        /* number of dimensions of data array to be stored */
int32   dimsizes[];  /* array that holds sizes of dimensions */
float32 *data;       /* array holding data to be stored */

Purpose:  DFSDadddata does the same thing that DFSDputdata does, except that it appends the scientific dataset to the file. If there was other data in the file, it remains undisturbed.

Returns:  0 on success; -1 on failure.


DFSDclear
FORTRAN:
INTEGER FUNCTION DFSDclear()

C:
int DFSDclear()

Purpose:  To cause all possible "set" values to be cleared.

Returns:  0 on success; -1 on failure.

After a call to DFSDclear, values set by the "set" calls will not be written unless they have been set again.


Example: Writing an Array as a Scientific Dataset
Figure 4.2 shows a sample call that stores scientific data in a 5 x 20 x 5000 array of reals called points. The array is to be stored as an SDS in a file called 'SDex3.hdf', with no labels, scales, or other information.

Figure 4.2  Storing Just Scientific Data
FORTRAN:
INTEGER DFSDsetdims, DFSDputdata
REAL    points(5,20,5000)
INTEGER shape(3), ret

shape(1) = 5
shape(2) = 20
shape(3) = 5000

ret = DFSDsetdims(3, shape)
ret = DFSDputdata('SDex3.hdf',3, shape, points)
  .
  .
  .


Remarks:

• This illustrates the simplest use of DFSDputdata. No values have been set for anything other than the rank and dimensions of the array, so nothing is stored in the file except the rank, dimensions, and data.


Example: Writing a Scientific Dataset with Associated Information
Figure 4.3 demonstrates a call that writes four 200 x 200 data arrays in succession to a file called 'SDex4.hdf'. The first two datasets have different values for data labels, but the same values for all other labels, units, and formats. The last two datasets have data labels, units, and formats, but no information associated with the dimensions.

Figure 4.3  Storing Scientific Data with Associated Information
FORTRAN:
INTEGER DFSDsetdims, DFSDsetdatastrs, DFSDsetdimstrs
INTEGER DFSDsetdimscale, DFSDadddata, DFSDclear
REAL    press1(200,200), press2(200,200)
REAL    den1(200,200), den2(200,200)
INTEGER shape(2), ret
REAL    xscale(200), yscale(200)

shape(1) = 200
shape(2) = 200

ret = DFSDsetdims(2, shape)
ret = DFSDsetdatastrs('pressure 1','Pascals','E15.9',' ')
ret = DFSDsetdimstrs(1,'x','cm','F10.2')
ret = DFSDsetdimstrs(2,'y','cm','F10.2')
ret = DFSDsetdimscale(1, shape(1), xscale)
ret = DFSDsetdimscale(2, shape(2), yscale)
ret = DFSDadddata('SDex4.hdf', 2, shape, press1)

ret = DFSDsetdatastrs('pressure 2','Pascals','E15.9',' ')
ret = DFSDadddata('SDex4.hdf', 2, shape, press2)

ret = DFSDclear()
ret = DFSDsetdatastrs('density 1','g/cm3','E15.9',' ')
ret = DFSDadddata('SDex4.hdf', 2, shape, den1)

ret = DFSDsetdatastrs('density 2','g/cm3','E15.9',' ')
ret = DFSDadddata('SDex4.hdf', 2, shape, den2)
  .
  .
  .


Remarks:

• The only differences between the first and second datasets are the data arrays and the two data labels ("pressure 1" and "pressure 2"). All other information that is set before the first DFSDadddata is associated with both scientific datasets in the file.

• Before the third and fourth calls to DFSDadddata, the information that has been set has to be cleared so that it is not also associated with the third and fourth scientific datasets. This is done by calling DFSDclear.

• The coordsys parameter is of no interest to the user in this application, so the empty string (' ') is given as the fourth argument to DFSDsetdatastrs.


Writing Parts of a Scientific Dataset
The routines DFSDstartslice, DFSDputslice, and DFSDendslice let you store an array in slices. To store an array in slices, make calls to these routines in the following order:

DFSDstartslice(filename)
DFSDputslice(winst, winend, data, ndims, dims)
DFSDputslice(winst, winend, data, ndims, dims)
...
DFSDputslice(winst, winend, data, ndims, dims)
DFSDendslice()

You must call DFSDstartslice before either of the other routines. Thereafter, DFSDputslice may be called many times to write several slices. DFSDendslice must be called to complete the write. No other HDF routines may be called between the calls to DFSDstartslice and DFSDendslice.


DFSDstartslice
FORTRAN:
INTEGER FUNCTION DFSDstartslice(filename)
CHARACTER*64 filename  - name of HDF file

C:
int DFSDstartslice(filename)
char *filename;  /* name of HDF file */

Purpose:  To prepare the system to write a slice to a file.

Returns:  0 on success; -1 on failure.

Before this routine is called, DFSDsetdims must be called to specify the dimensions of the dataset to be written to the file. DFSDstartslice always appends a new dataset to an existing file.

You must call DFSDstartslice before calling DFSDputslice or DFSDendslice.


DFSDputslice
FORTRAN:
INTEGER FUNCTION DFSDputslice(winst,winend,source,ndims,dims)

INTEGER winst(*)   - array with coordinates of start of slice
INTEGER winend(*)  - array with coordinates of end of slice
REAL    source(*)  - array containing slice
INTEGER ndims      - number of dimensions of array source
INTEGER dims(*)    - dimensions of array source

C:
int DFSDputslice(winst, winend, source, ndims, dims)

int32   winst[], winend[];  /* start and end of slice */
float32 *source;            /* array containing slice */
int     ndims;              /* number of dimensions of source */
int32   dims[];             /* sizes of dimensions of source */

Purpose:  To write a slice to an SDS.

Returns:  0 on success; -1 on failure.

winst and winend ("window start" and "window end") specify the coordinates of the start and end of the slice, as shown in the above program boxes. winst and winend have as many elements as there are dimensions in the array. source is an array containing the slice. Note that the actual data to be written out is assumed to be contained in the last dimensions of the array source and is assumed to be at the beginning of each dimension. All parameters assume FORTRAN-style 1-based arrays.

NOTE:  If dims is larger than the size of the slice, the actual data may not be contiguous in the array source. 

Example:  If source is an 8 x 5 x 6 array and the slice being written out is 2 x 5, it is assumed to be contained in {1, 1-2, 1-5}. ndims specifies the number of dimensions of the array source, in this example 3. dims is the array containing the actual dimensions of source, in this case {8, 5, 6}.
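
To see why the slice may not be contiguous in source, it can help to list the linear positions it occupies. The sketch below assumes that source is laid out in row major order, as it would be in a C caller (a FORTRAN caller would see a column major layout instead). The helper block_offsets and the MAX_RANK limit are hypothetical names introduced for illustration only.

```c
#include <stddef.h>

#define MAX_RANK 8   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: lists the linear (0-based) offsets, within a
 * row major array of shape dims[0..ndims-1], of the elements
 * {1..count[0], 1..count[1], ...} in the 1-based notation of the text.
 * Writes at most max_out offsets to out and returns the number of
 * elements in the block, or -1 on bad arguments. */
long block_offsets(int ndims, const long dims[], const long count[],
                   long out[], long max_out)
{
    long idx[MAX_RANK] = {0};
    long total = 1, n = 0;
    int d;

    if (ndims < 1 || ndims > MAX_RANK)
        return -1;
    for (d = 0; d < ndims; d++) {
        if (count[d] < 1 || count[d] > dims[d])
            return -1;
        total *= count[d];
    }

    while (n < total) {
        long off = 0;
        for (d = 0; d < ndims; d++)
            off = off * dims[d] + idx[d];   /* row major linear index */
        if (n < max_out)
            out[n] = off;
        n++;
        /* advance the index like an odometer, last dimension fastest */
        for (d = ndims - 1; d >= 0; d--) {
            if (++idx[d] < count[d])
                break;
            idx[d] = 0;
        }
    }
    return total;
}
```

With dims = {8, 5, 6} and count = {1, 2, 5}, the ten offsets are 0 through 4 and 6 through 10. Offset 5 is skipped because each row of source holds six values while the slice uses only five of them.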


Writes MUST be contiguous. For example, if you wish to write a 10 x 12 array, you may make a series of calls to DFSDputslice, perhaps to write the following slices: {1-2, 1-12}, {3-6, 1-12}, {7-7, 1-4}, {7-7, 5-12}, {8-10, 1-12}. Note that there are no gaps. A slice such as {1-2, 1-10} is not allowed, because it leaves the last two columns of the array unfilled.

In the above example it is assumed that the array is stored in row major order, the default order. If you have called DFSDsettype (see below) to write arrays to the file in column major order, then the writes must be column major contiguous. For instance, to write a 5 x 6 x 9 array, an acceptable order might be {1-5, 1-6, 1-4}, {1-5, 1-3, 5-5}, {1-3, 4-4, 5-5}, {4-5, 4-4, 5-5}, {1-5, 5-6, 5-5}, {1-5, 1-6, 6-9}.
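
The contiguity rule for the row major case can be expressed as a small check. The sketch below is one interpretation of the rule described above for two-dimensional arrays only; it is not HDF library code, and the names slice2d and slices_are_contiguous are hypothetical.

```c
/* One 2-D slice in the 1-based notation of the text: {r1-r2, c1-c2}. */
struct slice2d {
    long r1, r2;   /* first and last row    */
    long c1, c2;   /* first and last column */
};

/* Hypothetical check: returns 1 if the slices, written in the given
 * order, fill a row major nrows x ncols array with no gaps or
 * overlaps, and 0 otherwise. */
int slices_are_contiguous(const struct slice2d *s, int nslices,
                          long nrows, long ncols)
{
    long next = 0;   /* linear offset where the next write must begin */
    int i;

    for (i = 0; i < nslices; i++) {
        long start, end;

        if (s[i].r1 < 1 || s[i].r2 > nrows || s[i].r1 > s[i].r2 ||
            s[i].c1 < 1 || s[i].c2 > ncols || s[i].c1 > s[i].c2)
            return 0;                       /* outside the array */

        /* A slice spanning several rows must cover whole rows. */
        if (s[i].r1 != s[i].r2 && (s[i].c1 != 1 || s[i].c2 != ncols))
            return 0;

        start = (s[i].r1 - 1) * ncols + (s[i].c1 - 1);
        end   = (s[i].r2 - 1) * ncols + s[i].c2;   /* one past the last */
        if (start != next)
            return 0;                       /* gap or overlap */
        next = end;
    }
    return next == nrows * ncols;           /* whole array written */
}
```

Applied to the 10 x 12 example, the five slices listed in the text pass the check, while {1-2, 1-10} fails because it spans two rows without covering them completely.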


DFSDendslice
FORTRAN:
INTEGER FUNCTION DFSDendslice()

C:
int DFSDendslice()

Purpose:  To specify that the entire dataset has been written.

Returns:  0 on success; -1 on failure.

DFSDendslice must be called after all the slices are written. It checks to ensure that the entire dataset has been written, and if it has not, returns an error code.

Reading Scientific Datasets from a File

You can read an SDS from a file by executing a series of get calls. Each call retrieves from the file one or more pieces of information associated with the SDS.

You must invoke at least one of two routines, DFSDgetdims or DFSDgetdata, before calling any of the others. These two routines open the desired file, allocate space for special HDF structures that must be loaded into primary memory, and perform other initializing operations. Once this initialization is done, the other routines can be called in any order and as many times as desired.


Getting the Dimensions of a Scientific Dataset
DFSDgetdims
FORTRAN:
INTEGER FUNCTION DFSDgetdims(filename, rank, sizes, maxrank)

CHARACTER*64 filename        - name of file with SDS
INTEGER      rank            - number of dimensions
INTEGER      maxrank         - size of array for holding dim sizes
INTEGER      sizes(maxrank)  - array for holding dim sizes

C:
int DFSDgetdims(filename, rank, sizes, maxrank)

char  *filename;  /* name of file with SDS */
int   *rank;      /* number of dimensions */
int32 sizes[];    /* array for holding dim sizes */
int   maxrank;    /* size of array for holding dim sizes */

Purpose:  To get the rank (number of dimensions) of the dataset and the sizes of each dimension in the next SDS in the file.

Returns:  0 on success; -1 on failure.

The input argument maxrank tells DFSDgetdims the size of the array that is allocated for storing the array of dimension sizes. The value of rank cannot exceed the value of maxrank.


Reading an Entire Scientific Dataset
DFSDgetdata
FORTRAN:
INTEGER FUNCTION DFSDgetdata(filename, rank, sizes, data)

CHARACTER*64 filename     - name of file with SDS
INTEGER      rank         - number of dimensions
INTEGER      sizes(rank)  - array that holds sizes of dimensions
REAL         data(*)      - array for holding the data

C:
int DFSDgetdata(filename, rank, sizes, data)

char    *filename;  /* name of file with SDS */
int     rank;       /* number of dimensions */
int32   sizes[];    /* array that holds sizes of dimensions */
float32 data[];     /* array for holding the data */

Purpose:  To get the dataset from the next SDS in the file and store it in the floating-point array data.

Returns:  0 on success; -1 on failure.

The input argument filename is the same as it is in DFSDgetdims. rank tells the rank of the data to be read, and sizes gives the actual dimensions of the array data.

If you know the rank and dimensions of the dataset beforehand, then there is no need to call DFSDgetdims. Simply allocate arrays with the proper dimensions for the dataset and let DFSDgetdata read it in.

If you do not know the values of rank or sizes, you must call DFSDgetdims to get them and then use them to provide the right amount of space for the array data.

Each new call to DFSDgetdims followed by DFSDgetdata (or just to DFSDgetdata) reads from the SDS that follows the last one read. For example, if DFSDgetdata is called three times in succession, the third call reads data from the third SDS in the file.

If DFSDgetdims or DFSDgetdata is called and there are no more scientific datasets left in the file, an error code is returned and nothing is read. DFSDrestart (see below) enables you to override this convention.


Getting Other Information About SDSs
DFSDgetdimstrs
FORTRAN:
INTEGER FUNCTION DFSDgetdimstrs(dim, label, unit, format)

INTEGER       dim     - dimension this label, unit and format refer to
CHARACTER*256 label   - label that describes this dimension
CHARACTER*256 unit    - unit to be used with this dimension
CHARACTER*256 format  - format to be used in displaying scale for this dimension

C:
int DFSDgetdimstrs(dim, label, unit, format)

int  dim;      /* dimension this label, unit and format refer to */
char *label;   /* label that describes this dimension */
char *unit;    /* unit to be used with this dimension */
char *format;  /* format to be used in displaying scale for this dimension */

Purpose:  To get the items corresponding to the dimension dim that are stored as strings in the SDS, namely label, unit, and format.

Returns:  0 on success; -1 on failure.


DFSDgetdimscale
FORTRAN:
INTEGER FUNCTION DFSDgetdimscale(dim, size, scale)

INTEGER dim          - dimension this scale corresponds to
INTEGER size         - size of scale
REAL    scale(size)  - the scale

C:
int DFSDgetdimscale(dim, size, scale)

int     dim;      /* dimension this scale corresponds to */
int32   size;     /* size of scale */
float32 scale[];  /* the scale */

Purpose:  To get the scale corresponding to the dimension dim and store it in the floating-point array scale.

Returns:  0 on success; -1 on failure.

The input parameter size gives the size of the scale array.


DFSDgetdatastrs
FORTRAN: 
INTEGER FUNCTION DFSDgetdatastrs(label, unit, format, coordsys)

CHARACTER*256 label     - label that describes the data
CHARACTER*256 unit      - unit to be used with the data
CHARACTER*256 format    - format to be used in displaying data
CHARACTER*256 coordsys  - coordinate system

C:
int DFSDgetdatastrs(label, unit, format, coordsys)

char *label;     /* label that describes the data */
char *unit;      /* unit to be used with the data */
char *format;    /* format to be used in displaying data */
char *coordsys;  /* coordinate system */

Purpose:  To get the items corresponding to the data that are stored as strings in the SDS, namely label, unit, format, and coordsys (coordinate system).

Returns:  0 on success; -1 on failure.

The parameter coordsys gives the coordinate system that is to be used for interpreting the dimension information.


DFSDgetmaxmin
FORTRAN:
INTEGER FUNCTION DFSDgetmaxmin(max, min)

REAL max  - highest value in data array
REAL min  - lowest value in data array

C:
int DFSDgetmaxmin(max, min)

float32 *max;  /* highest value in data array */
float32 *min;  /* lowest value in data array */


Purpose:  To get the maximum and minimum of the values in the data array.

Returns:  0 on success; -1 on failure or if there are no max or min values.

NOTE:  These values need to have been set by a user via a call to DFSDsetmaxmin. They are not automatically stored.


DFSDrestart
FORTRAN:
INTEGER FUNCTION DFSDrestart()

C:
int DFSDrestart()


Purpose:  To cause the next get to read from the first SDS in the file, rather than the SDS following the one that was most recently read.

Returns:  0 on success; -1 on failure.


Example: Reading in a Simple Scientific Dataset
Figure 4.4 contains a simple call in which the dimensions of the dataset are already known, and nothing other than the data itself is desired.

Figure 4.4  Reading in a Dataset
FORTRAN:
INTEGER DFSDgetdata
REAL    density(100, 500)
INTEGER sizes(2), ret

sizes(1) = 100
sizes(2) = 500
ret = DFSDgetdata('myfile.hdf', 2, sizes, density)
  .
  .
  .


Remarks:

• The SDS is stored in a file called 'myfile.hdf'. The hdf extension is not required. Any valid filename can be used.

• The data stored in the file is known to be a 100 x 500 array of reals.

• The array density is exactly the right size.

• DFSDgetdata is declared as a function of type integer.

• If DFSDgetdata executes successfully, 0 is assigned to ret. Otherwise -1 is assigned.


Example: Reading Two Scientific Datasets from a File of Unknown Size
In Figure 4.5, two arrays of the same size are stored in a file. The size is not known ahead of time, but it is known that the arrays are two-dimensional and no larger than 800 x 500.

Figure 4.5  Reading Multiple SDSs from a Single File of Unknown Size
FORTRAN:
INTEGER       DFSDgetdims, DFSDgetdata, DFSDgetdatastrs
INTEGER       DFSDgetdimstrs, DFSDgetdimscale
INTEGER       rank, dimsizes(2), ret
CHARACTER*100 datalabel, dataunit, datafmt, coordsys
CHARACTER*100 xlabel, ylabel, xunit, yunit, xfmt, yfmt
REAL          xscale(800), yscale(500)
REAL          pressure(800,500), density(800,500)

ret = DFSDgetdims('SDex2.hdf', rank, dimsizes, 2)
ret = DFSDgetdata('SDex2.hdf', rank, dimsizes, pressure)
ret = DFSDgetdatastrs(datalabel, dataunit, datafmt, coordsys)
ret = DFSDgetdimstrs(1, xlabel, xunit, xfmt)
ret = DFSDgetdimstrs(2, ylabel, yunit, yfmt)
ret = DFSDgetdimscale(1, dimsizes(1), xscale)
ret = DFSDgetdimscale(2, dimsizes(2), yscale)

ret = DFSDgetdata('SDex2.hdf', 2, dimsizes, density)
ret = DFSDgetdatastrs(datalabel, dataunit, datafmt, coordsys)
  .
  .
  .


Remarks:

• The first call to DFSDgetdims provides sufficient information about the data, so you do not need to call it before the second array is loaded.

• The full battery of get routines is called for getting the pressure data. For the density data only the data and its strings are read, because in this case the dimension information (scales, labels, units, formats) is already available from the calls made for the first dataset.

• The interface remembers from the first call to the second that one SDS has already been accessed, so on the second call it gets the second SDS.


Reading Parts of a Scientific Dataset
DFSDgetslice
FORTRAN:
INTEGER FUNCTION DFSDgetslice(filename, winst, winend, dest, ndims, dims)

CHARACTER*(*) filename   - name of HDF file
INTEGER       winst(*)   - array with coordinates of start of slice
INTEGER       winend(*)  - array with coordinates of end of slice
REAL          dest(*)    - array for returning slice
INTEGER       ndims      - number of dimensions of array dest
INTEGER       dims(*)    - dimensions of array dest

C:
int DFSDgetslice(filename, winst, winend, dest, ndims, dims)

char    *filename;  /* name of HDF file to use */
int32   winst[];    /* array containing start of slice */
int32   winend[];   /* array containing end of slice */
float32 dest[];     /* array for returning slice */
int     ndims;      /* number of dimensions of array dest */
int32   dims[];     /* dimensions of array dest */

Purpose:  To read part of an SDS from a file.

Returns:  0 on success; -1 on failure.

DFSDgetslice accesses the dataset last accessed by DFSDgetdims. If DFSDgetdims has not been called for the named file, DFSDgetslice gets a slice from the next dataset in the file.

winst and winend are arrays that specify the start and end of the slice. The number of elements in winst and winend must be equal to the rank of the dataset.

NOTE: All the parameters on the call assume FORTRAN-style 1-based arrays.

For example, if the file contains a three dimensional dataset, winst may contain the values {2, 4, 3}, while wifend contains the values {4, 4, 6}. This will extract a 3 x 4 two-dimensional slice, containing the elements {2Ð4, 4, 3Ð6} from the original dataset. 

dest is the array into which the slice is read. It must be at least as big as the desired slice. For instance, dest may be a fourÐdimensional array {5, 4, 6, 5}. Then the 2D slice will be placed in the last two dimensions like this: {5, 4, 1Ð3, 1Ð4}. 

ndims is the number of dimensions of the array dest, in this case 4.

dims is an array containing the actual dimensions of the array dest. The user assigns values to dims before calling DFSDgetslice. In the above example, dims should contain {5, 4, 6, 5}. In the event that dims is larger than the size of the slice, the actual data may not be contiguous in the array dest.
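
The effect of an oversized dest can be sketched in C. The helper below is hypothetical and is not an HDF routine; it assumes that dest is laid out in C (row major) order and that indices are 0-based, so the 1-based coordinates used above are reduced by one before the call. It computes where an element of dest lies relative to the start of the array.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Offset of the element with 0-based index idx[] in a C (row major)
 * array whose dimensions are dims[].
 * Illustration only; not part of the HDF library.
 */
size_t dest_offset(int ndims, const int dims[], const int idx[])
{
    size_t off = 0;
    int d;

    assert(ndims > 0);
    for (d = 0; d < ndims; d++) {
        /* each index must lie inside its dimension */
        assert(idx[d] >= 0 && idx[d] < dims[d]);
        off = off * (size_t)dims[d] + (size_t)idx[d];
    }
    return off;
}
```

With dims set to {5, 4, 6, 5}, the last element of the first slice row, {5, 4, 1, 4}, lies at offset 573, while the first element of the second slice row, {5, 4, 2, 1}, lies at offset 575. The element between them is not part of the slice, which is the sense in which the data is not contiguous.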

When calling DFSDgetslice, make sure that the arrays winst, winend, and dims contain the right values, and that ndims is specified correctly. Note, for example, that a slice that is 2 x 10 x 1 is treated as a 2 x 10 slice, and would be placed in the last two dimensions of the array dest.
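
The shape of the slice that a pair of winst and winend arrays describes can be worked out before the call is made. The sketch below is a hypothetical helper, not part of the HDF library; it assumes the 1-based, inclusive coordinates described above and drops every dimension whose extent is 1, in line with the 2 x 10 x 1 example.

```c
#include <assert.h>

/*
 * Given 1-based, inclusive start and end coordinates of a slice, store
 * the extents of the dimensions that are longer than 1 in shape[] and
 * return how many such dimensions there are.
 * Illustration only; not part of the HDF library.
 */
int slice_shape(int rank, const int winst[], const int winend[], int shape[])
{
    int d, n = 0;

    assert(rank > 0);
    for (d = 0; d < rank; d++) {
        int extent = winend[d] - winst[d] + 1;

        assert(extent >= 1);      /* end must not precede start */
        if (extent > 1)
            shape[n++] = extent;  /* keep only the non-trivial dimensions */
    }
    return n;
}
```

For winst = {2, 4, 3} and winend = {4, 4, 6}, slice_shape returns 2 and stores {3, 4} in shape, which agrees with the 3 x 4 slice in the example above.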


How SDS Routines Store and Convert Scientific Data

How HDF Normally Stores Arrays
Sometimes it is helpful to know how SDS routines store scientific data. When the data in a scientific dataset is stored in a file, it is stored in row major order. When the data is loaded into memory, it is stored in row major order if DFSDgetdata was called from a C program, but in column major order if DFSDgetdata was called from a FORTRAN program. This is exactly how C and FORTRAN programmers, respectively, normally want their data to be stored.

When a dataset is taken from memory and put into a file, the reverse happens. Specifically, if DFSDputdata is called from a FORTRAN program, it is assumed that the data is stored in primary memory in column major order, so the elements are "flipped" in order to put them into the file in row major order. If the calling program is a C program, no flipping is needed. The data is stored in the file in row major order, just as it is in primary memory.
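
The "flipping" described above amounts to a change of index arithmetic. The following sketch handles a two-dimensional array only and is an illustration, not the code that HDF itself uses; the function name colmajor_to_rowmajor is made up for this example.

```c
#include <assert.h>

/*
 * Copy a rows x cols array held in column major (FORTRAN) order in
 * src[] into row major (C) order in dst[].
 * Illustration only; this is not the HDF library's own code.
 */
void colmajor_to_rowmajor(int rows, int cols, const float src[], float dst[])
{
    int i, j;

    assert(rows > 0 && cols > 0);
    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++)
            /* element (i, j): column major offset j*rows + i,
               row major offset i*cols + j */
            dst[i * cols + j] = src[j * rows + i];
}
```

For a 2 x 3 array whose rows are (11, 12, 13) and (21, 22, 23), the column major sequence 11, 21, 12, 22, 13, 23 becomes the row major sequence 11, 12, 13, 21, 22, 23.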


How HDF Normally Represents Numbers 
Just as the protocol for storing arrays is language dependent, the internal representation of single precision floating-point numbers is machine dependent. On a Cray, for instance, floating-point numbers are stored in a 64-bit format. On many other machines, they are stored in IEEE standard floating-point format.

In HDF scientific datasets, data is stored by default in IEEE standard floating-point format. Therefore, when DFSDgetdata loads data from an SDS, it converts the data to the single precision floating-point format that is standard for the machine that gets it. And when DFSDputdata takes data from memory and stores it in an SDS, it converts the data from the machine's single precision floating-point format to the IEEE format.

All of this converting can result in low order inaccuracies in the data. Data that has been converted from 64-bit to 32-bit floating-point representation is accurate to about 10^-7 (roughly seven significant decimal digits).
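
The size of this effect can be checked on any machine that has both a 64-bit and a 32-bit floating-point type. The sketch below uses the host's C double and float types as a stand-in; a Cray's 64-bit format is not the same as an IEEE double, so the result is indicative only. roundtrip_error is a hypothetical helper, not an HDF routine.

```c
#include <assert.h>

/*
 * Relative error introduced by storing a 64-bit value as a 32-bit
 * float and reading it back. Uses the host's double and float types;
 * this is an analogy for the conversion, not HDF's conversion code.
 */
double roundtrip_error(double x)
{
    float stored;
    double diff, mag;

    assert(x != 0.0);
    stored = (float)x;             /* the narrowing step, as on a write */
    diff = (double)stored - x;     /* widen again, as on a read */
    if (diff < 0.0)
        diff = -diff;
    mag = (x < 0.0) ? -x : x;
    return diff / mag;
}
```

On a machine with IEEE arithmetic, roundtrip_error(0.1) is small but nonzero (below about 1.2 x 10^-7), whereas a value such as 0.5, which both formats represent exactly, gives 0.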

In many instances it does not matter to a user how data is stored or what conversions it must undergo. However, sometimes these conversions cannot be tolerated, either because they slow down processing too much or because they introduce intolerable inaccuracies. 


DFSDsettype:  Setting Scientific Dataset Attributes

DFSDsettype
FORTRAN:
INTEGER FUNCTION DFSDsettype(datatype, systemtype, representation, arrayorder)

INTEGER datatype - format of numbers to be assumed in subsequent SDS calls
INTEGER systemtype - type of system to organize data for
INTEGER representation - representation to use for the datatype
INTEGER arrayorder - row major or column major

C:
int DFSDsettype(datatype, systemtype, representation, arrayorder)

int datatype;          /* format of numbers to be assumed in subsequent SDS calls */
int systemtype;        /* type of system to organize data for */
int representation;    /* representation to use for the datatype */
int arrayorder;        /* row major or column major */

Purpose:  To specify attributes of the data to be stored in an SDS. These attributes are in effect in subsequent calls to the SDS library.

Returns:  0 on success; -1 on failure.

datatype describes the format of numbers to be assumed in subsequent SDS calls. Currently, the only legal values for datatype are the constant DFNT_FLOAT, which specifies that floating-point numbers are being written, and 0, which specifies the default type (also floating-point). In later versions of HDF, other data types will be permitted.

systemtype specifies that the dataset should be written so that it can be read more efficiently on the specified system. The only value currently legal for systemtype is 0, which specifies the local system. 

representation specifies the representation to be used for the datatype. The only legal values for representation currently are 0, specifying the default representation; DFNTF_IEEE, which specifies that IEEE floating-point is to be written (the default); and on UNICOS only, DFNTF_CRAY, which specifies that 64-bit CRAY floating-point is to be written. 

arrayorder specifies whether the data should be stored in the file in row major or column major order. The legal values are 0, indicating the default (row major order); DFO_FORTRAN, indicating column major (FORTRAN) order; and DFO_C indicating row major (C) order.


Currently, the only real use of DFSDsettype is on UNICOS, to specify that FORTRAN-style arrays should be written, that 64-bit Cray floating-point should be written, or both. If a file is to be written and read by FORTRAN programs on UNICOS, writing column major, Cray floating-point datasets would probably be considerably quicker and would also preserve precision. This can be done with the call DFSDsettype(0, 0, DFNTF_CRAY, DFO_FORTRAN).

Note that currently, numbers in Cray floating-point format cannot be read on most other machines. Column major order, however, can be read. Hence if a file is produced by a FORTRAN program on UNICOS and is to be read often by a FORTRAN program on an Alliant, you may want to set column major order only. This may speed up the read on the Alliant.

WARNING:  Presently, Cray floating-point numbers cannot be read by NCSA DataScope, etc. In addition, current versions of NCSA DataScope, NCSA ImageTool, etc. cannot read arrays stored in column major order; thus, they will display such data transposed.


Sample Programs

Two complete sample programs, the first in FORTRAN, the second in C, are presented below.


A FORTRAN Program
This program does the following, in order:

• Calls the routine randArray (not shown) to generate an array called pressure of random numbers

• Calls the routine findMaxMin (listed after the main program) to find the maximum and minimum values in the array pressure

• Writes the contents of pressure to an HDF file called "testsds.df", together with scales, label, unit, format and max/min information

• Reads back the array from "testsds.df", together with the associated information

• Compares the contents of the information read with the original information

Note that if this program were run on the Cray as is, the values that were written to the file would not, in general, be equal to the numbers that are read back in due to the loss of precision on the write. See the section on DFSDsettype for a discussion of this problem.


Figure 4.6  FORTRAN Program Dealing with Scientific Datasets

FORTRAN:
      PROGRAM SDex5

      INTEGER           DFSDsetdims, DFSDsetdatastrs, DFSDsetdimstrs, DFSDsetdimscale
      INTEGER           DFSDsetmaxmin, DFSDputdata, DFSDgetdims, DFSDgetdata
      INTEGER           DFSDgetdatastrs, DFSDgetdimstrs, DFSDgetdimscale, DFSDgetmaxmin
      INTEGER           ret, i, j
      INTEGER           rank, inRank
      INTEGER           shape(2), inShape(2)
      REAL              pressure(10,10), inPressure(10,10)
      REAL              xscales(10), inXscales(10)
      REAL              yscales(10), inYscales(10)
      REAL              maxpressure, inMaxpressure
      REAL              minpressure, inMinpressure
      CHARACTER*256     datalabel, inDatalabel
      CHARACTER*256     dataunit, inDataunit
      CHARACTER*256     datafmt, inDatafmt
      CHARACTER*256     dimlabels(2), inDimlabels(2)
      CHARACTER*256     dimunits(2), inDimunits(2)
      CHARACTER*256     dimfmts(2), inDimfmts(2)
      CHARACTER*256     inDummy

      rank = 2
      shape(1) = 10
      shape(2) = 10

      datalabel         = 'Pressure 1'
      dataunit          = 'Pascals'
      datafmt           = 'E15.9'
      dimlabels(1)      = 'x'
      dimunits(1)       = 'cm'
      dimfmts(1)        = 'F10.2'
      dimlabels(2)      = 'y'
      dimunits(2)       = 'cm'
      dimfmts(2)        = 'F10.2'

      call randArray(pressure, 100)
      call findMaxMin(pressure, 100, maxpressure, minpressure)

      do 10 i = 1, 10
         xscales(i) = i
         yscales(i) = i
 10   continue

C  Write data to file
      ret = DFSDsetdims(rank, shape)
      ret = DFSDsetdatastrs(datalabel, dataunit, datafmt, '')
      ret = DFSDsetdimstrs(1, dimlabels(1), dimunits(1), dimfmts(1))
      ret = DFSDsetdimstrs(2, dimlabels(2), dimunits(2), dimfmts(2))
      ret = DFSDsetdimscale(1, shape(1), xscales)
      ret = DFSDsetdimscale(2, shape(2), yscales)
      ret = DFSDsetmaxmin(maxpressure, minpressure)
      ret = DFSDputdata('testsds.df', 2, shape, pressure)
C  Read data back
      ret = DFSDgetdims('testsds.df', inRank, inShape, 2)
      ret = DFSDgetdata('testsds.df', 2, inShape, inPressure)
      ret = DFSDgetdatastrs(inDatalabel, inDataunit, inDatafmt, inDummy)
      ret = DFSDgetdimstrs(1, inDimlabels(1), inDimunits(1), inDimfmts(1))
      ret = DFSDgetdimstrs(2, inDimlabels(2), inDimunits(2), inDimfmts(2))
      ret = DFSDgetdimscale(1, inShape(1), inXscales)
      ret = DFSDgetdimscale(2, inShape(2), inYscales)
      ret = DFSDgetmaxmin(inMaxpressure, inMinpressure)

C  Compare information read in with original information   :
      print *, 'Output rank is :', rank
      print *, 'Input rank is :', inRank

      print *, 'Output shape is :', shape(1), ',', shape(2)
      print *, 'Input shape is :', inShape(1), ',', inShape(2)

      do 200 i = 1, 10
         do 210 j = 1, 10
            if (pressure(i, j) .ne. inPressure(i, j)) then
               print *, 'Array position ', i, ',', j, 'is different'
            end if
 210     continue
 200  continue

      print *, 'Output datalabel is :', datalabel
      print *, 'Input datalabel is :', inDatalabel

      print *, 'Output dataunit is :', dataunit
      print *, 'Input dataunit is :', inDataunit

      print *, 'Output datafmt is :', datafmt
      print *, 'Input datafmt is :', inDatafmt

      print *, 'Output dimlabels(1) is :', dimlabels(1)
      print *, 'Input dimlabels(1) is :', inDimlabels(1)

      print *, 'Output dimunits(1) is :', dimunits(1)
      print *, 'Input dimunits(1) is :', inDimunits(1)

      print *, 'Output dimfmts(1) is :', dimfmts(1)
      print *, 'Input dimfmts(1) is :', inDimfmts(1)

      print *, 'Output dimlabels(2) is :', dimlabels(2)
      print *, 'Input dimlabels(2) is :', inDimlabels(2)

      print *, 'Output dimunits(2) is :', dimunits(2)
      print *, 'Input dimunits(2) is :', inDimunits(2)

      print *, 'Output dimfmts(2) is :', dimfmts(2)
      print *, 'Input dimfmts(2) is :', inDimfmts(2)

      do 300 i = 1, 10
         if (xscales(i) .ne. inXscales(i)) then
            print *, 'Xscales is different at position ', i
         end if
         if (Yscales(i) .ne. inYscales(i)) then
            print *, 'Yscales is different at position ', i
         end if
 300  continue

      print *, 'Output maxpressure is :', maxpressure
      print *, 'Input maxpressure is :', inMaxpressure

      print *, 'Output minpressure is :', minpressure
      print *, 'Input minpressure is :', inMinpressure

      print *, 'Check completed.'
      stop
      end


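      subroutine findMaxMin(array, size, max, min)

      integer   size
      real      array(size), max, min
      integer   i

C  Start from the first element, then scan the rest of the array
      max = array(1)
      min = array(1)
      do 400 i = 2, size
         max = amax1(max, array(i))
         min = amin1(min, array(i))
 400  continue

      return
      end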



A C Program
This program does the following, in order:

• Calls the routine randArray (also listed) to generate an array called pressure of random numbers

• Calls the routine findMaxMin (also listed) to find the maximum and minimum values in the array pressure

• Writes the contents of pressure to an HDF file called "testsds.df", together with scales, label, unit, format and max/min information

• Reads back the array from "testsds.df", together with the associated information

• Compares the contents of the information read with the original information

Figure 4.7  C Program Dealing with Scientific Datasets

C:
#include "df.h"
#include <stdio.h>

#define MAX_ROW 10
#define MAX_COL 10
#define SIZE_ARRAY (MAX_ROW * MAX_COL)

main()
{
  int ret, i, j;
  int rank, inRank;
  int32 shape[2], inShape[2];
  float32 pressure[MAX_ROW][MAX_COL], inPressure[MAX_ROW][MAX_COL];
  float32 xscales[MAX_ROW], inXscales[MAX_ROW];
  float32 yscales[MAX_COL], inYscales[MAX_COL];
  float32 maxpressure, inMaxpressure;
  float32 minpressure, inMinpressure;
  char *datalabel, inDatalabel[256];
  char *dataunit, inDataunit[256];
  char *datafmt, inDatafmt[256];
  char *dimlabels[2], inDimlabels[2][256];
  char *dimunits[2], inDimunits[2][256];
  char *dimfmts[2], inDimfmts[2][256];
  char inDummy[256];

  rank = 2;
  shape[0] = MAX_ROW;
  shape[1] = MAX_COL;

  datalabel= "Pressure 1";
  dataunit= "Pascals";
  datafmt= "E15.9";
  dimlabels[0] = "x";
  dimlabels[1] = "y";
  dimunits[0]= "cm";
  dimunits[1]= "cm";
  dimfmts[0]= "F10.2";
  dimfmts[1]= "F10.2";

  randArray(pressure, SIZE_ARRAY);
  findMaxMin(pressure, SIZE_ARRAY, &maxpressure, &minpressure);

  for(i=0;i<MAX_ROW;i++)
    xscales[i] = i;
  for(i=0;i<MAX_COL;i++)
    yscales[i] = i;

  ret = DFSDsetdims(rank, shape);
  printf("Return %d\n",ret);
  ret = DFSDsetdatastrs(datalabel, dataunit, datafmt, "");
  printf("Return %d\n",ret);
  ret = DFSDsetdimstrs(1, dimlabels[0], dimunits[0], dimfmts[0]);
  printf("Return %d\n",ret);
  ret = DFSDsetdimstrs(2, dimlabels[1], dimunits[1], dimfmts[1]);
  printf("Return %d\n",ret);
  ret = DFSDsetdimscale(1, shape[0], xscales);
  printf("Return %d\n",ret);
  ret = DFSDsetdimscale(2, shape[1], yscales);
  printf("Return %d\n",ret);


  ret = DFSDsetmaxmin(maxpressure, minpressure);
  printf("Return %d\n",ret);
  ret = DFSDputdata("testsds.df", 2, shape, pressure);
  printf("Return %d\n",ret);

  puts("Getting");
  ret = DFSDgetdims("testsds.df", &inRank, inShape, 2);
  printf("Return %d\n",ret);
  ret = DFSDgetdata("testsds.df", 2, inShape, inPressure);
  printf("Return %d\n",ret);
  ret = DFSDgetdatastrs(inDatalabel, inDataunit, inDatafmt, inDummy);
  printf("Return %d\n",ret);
  ret = DFSDgetdimstrs(1, inDimlabels[0], inDimunits[0], inDimfmts[0]);
  printf("Return %d\n",ret);
  ret = DFSDgetdimstrs(2, inDimlabels[1], inDimunits[1], inDimfmts[1]);
  printf("Return %d\n",ret);
  ret = DFSDgetdimscale(1, inShape[0], inXscales);
  printf("Return %d\n",ret);
  ret = DFSDgetdimscale(2, inShape[1], inYscales);
  printf("Return %d\n",ret);
  ret = DFSDgetmaxmin(&inMaxpressure, &inMinpressure);
  printf("Return %d\n",ret);

  printf("Output rank is %d\nInput rank is %d\n", rank, inRank);

  printf("Output shape is %d, %d\nInput shape is %d, %d\n", shape[0], shape[1],
         inShape[0], inShape[1]);

  for(i=0;i<MAX_ROW;i++)
    for(j=0;j<MAX_COL;j++)
      if (pressure[i][j] != inPressure[i][j])
        printf("Array position %d, %d is different\n", i, j);

  printf("Output datalabel is %s\nInput datalabel is %s\n", datalabel, inDatalabel);
  printf("Output dataunit is %s\nInput dataunit is %s\n", dataunit, inDataunit);
  printf("Output datafmt is %s\nInput datafmt is %s\n", datafmt, inDatafmt);
  printf("Output dimlabels[0] is %s\nInput dimlabels[0] is %s\n",
         dimlabels[0], inDimlabels[0]);
  printf("Output dimunits[0] is %s\nInput dimunits[0] is %s\n", dimunits[0], inDimunits[0]);
  printf("Output dimfmts[0] is %s\nInput dimfmts[0] is %s\n", dimfmts[0], inDimfmts[0]);
  printf("Output dimlabels[1] is %s\nInput dimlabels[1] is %s\n",
         dimlabels[1], inDimlabels[1]);
  printf("Output dimunits[1] is %s\nInput dimunits[1] is %s\n", dimunits[1], inDimunits[1]);
  printf("Output dimfmts[1] is %s\nInput dimfmts[1] is %s\n", dimfmts[1], inDimfmts[1]);

  for(i=0;i<MAX_ROW;i++)
    if (xscales[i] != inXscales[i])
      printf("Xscales is different at position %d\n", i);
  for(i=0;i<MAX_COL;i++)
    if (yscales[i] != inYscales[i])
      printf("Yscales is different at position %d\n", i);

  printf("Output maxpressure is %f\nInput maxpressure is %f\n", maxpressure, inMaxpressure);
  printf("Output minpressure is %f\nInput minpressure is %f\n", minpressure, inMinpressure);

  puts("Check Completed");
}


/*
 * findMaxMin
 * Finds the maximum and minimum values in a data array.
 */
findMaxMin(data, size, max, min)
float32 data[];
int32 size;
float32 *max, *min;
{
  int32 i;

  *max = *min = data[0];
  for(i=1;i<size;i++) {
    if (*max < data[i])
      *max = data[i];
    else if (*min > data[i])
      *min = data[i];
  }
}

/*---------------------- new file to be linked to main program ---------------------*/
/* rand.c
 */
#include "rand.h"
#include "df.h"

/*
 * randArray
 * Fills an array with random values.
 * Input:
 *   array : pointer to the array
 *   size  : size of the array in elements
 */
randArray(array, size)
float32 array[];
int size;
{
  int i;

  for (i=0; i<size; i++)
    array[i] = (float32) RND(BYTE_MAX);
}




