HDF Corruption Problem
In the HDF MFHDF (also referred to as SD) library, there is a potential
for corruption of some SDSs and/or dimensions, when a dimension has the
same name as an SDS. Prior to HDF4.2r1, the corruption could happen to
either one- or multi-dimensional datasets. As of HDF4.2r1, only
one-dimensional datasets can be affected because a fix was provided which
prevents the newly created multi-dimensional datasets from being corrupted.
The SD library uses a concept of a "variable" to store an SDS. This concept was introduced in order to conform to and harmonize with the netCDF data model. A "variable" has a name, spatial information (rank and dimension sizes), and other related information. When an attribute is assigned to a dimension, or a dimension scale is written to it, the dimension too will be stored in a variable, as an SDS. This type of SDS is referred to as a "dimension variable".
The corruption is introduced due to two limitations in the SD library. First, the library makes no distinction between "variables" and "dimension variables". Second, the library searches for a dimension simply by using its name and the search is completed when a variable with the same name as the dimension is found. This simple algorithm sometimes causes the wrong item to be found.
One such scenario, where an SDS is corrupted, can be described as follows:
- A 1-D SDS named "My Data" is created (SDcreate).
- Its dimension is also named "My Data" (SDsetdimname) and is assigned an attribute (SDsetattr).
- The function SDsetattr searches the variable list to make sure there is not a variable associated with this dimension already.
- The search finds the SDS variable named "My Data" and concludes that a variable for the dimension "My Data" indeed already exists.
- Subsequent writing (SDsetdimscale) to the dimension variable actually writes to the SDS "My Data" instead.
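The first scenario can be illustrated with a small stand-alone model. This is only a sketch of the name-only search described above, not HDF code: the types VarList and Var and the functions add_var, find_by_name, get_dim_var and write_dim_scale are hypothetical names invented for this example, and a single int stands in for the stored data.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified model of the SD variable list.
 * None of these names belong to the HDF API. */
#define MAX_VARS 8

typedef enum { KIND_SDS, KIND_DIM } VarKind;

typedef struct {
    const char *name;
    VarKind     kind;   /* recorded, but never consulted by the search */
    int         value;  /* stand-in for the stored data or scale */
} Var;

typedef struct {
    Var vars[MAX_VARS];
    int count;
} VarList;

/* Append a variable and return its index in the list. */
static int add_var(VarList *list, const char *name, VarKind kind, int value)
{
    assert(list->count < MAX_VARS);
    Var *v = &list->vars[list->count];
    v->name = name;
    v->kind = kind;
    v->value = value;
    return list->count++;
}

/* Name-only search: stops at the first variable whose name matches. */
static int find_by_name(const VarList *list, const char *name)
{
    for (int i = 0; i < list->count; i++) {
        if (strcmp(list->vars[i].name, name) == 0)
            return i;
    }
    return -1;
}

/* Reuse the variable found for this dimension name, or create one.
 * Because the search ignores 'kind', an SDS can be returned here. */
static int get_dim_var(VarList *list, const char *dim_name)
{
    int idx = find_by_name(list, dim_name);
    if (idx >= 0)
        return idx;
    return add_var(list, dim_name, KIND_DIM, 0);
}

/* Model of writing a dimension scale: overwrites whatever was found. */
static int write_dim_scale(VarList *list, const char *dim_name, int scale)
{
    int idx = get_dim_var(list, dim_name);
    list->vars[idx].value = scale;
    return idx;
}
```

In this model, if the list holds only an SDS named "My Data" with value 42, calling write_dim_scale(&list, "My Data", 7) returns the index of the SDS itself and overwrites its value with 7. No dimension variable is created.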
Another scenario, where a dimension is corrupted:
- An SDS named "Dataset 1" is created (SDcreate) and stored as variable #0 in the variable list.
- Its dimension is named "My Data" (SDsetdimname) and some dimension scale values are written to it (SDsetdimscale).
- As a result of SDsetdimscale, this dimension is stored in a dimension variable and the dimension variable is added to the variable list as variable #1.
- An SDS named "My Data" is then created (SDcreate) and safely stored as variable #2 in the variable list.
- All datasets are closed.
- The dataset named "My Data" is opened (SDnametoindex and SDselect) to write data to it.
- Using the search described above, SDnametoindex returns the index of the first variable with the name "My Data", namely the dimension variable "My Data", which is #1, instead of the SDS variable "My Data", which is #2.
- This results in the dimension scale of dimension "My Data" being corrupted when SDwritedata is called to write to the SDS named "My Data".
- This also means that the SDS named "My Data" does not receive the data that was supposed to be written to it.
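The lookup in this second scenario can be replayed with a small stand-alone model. This is a sketch under stated assumptions, not the library's implementation: Var, name_to_index, name_to_index_kind and scenario_vars are hypothetical names for this example. The kind-aware variant is included only for contrast, to show what the first limitation costs; it is not the fix mentioned in this note.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the variable list; not part of the HDF API. */
typedef enum { VAR_SDS, VAR_DIM } VarKind;

typedef struct {
    const char *name;
    VarKind     kind;
} Var;

/* Name-only lookup: returns the index of the first variable whose
 * name matches, whatever its kind, or -1 if there is none. */
static int name_to_index(const Var *vars, size_t n, const char *name)
{
    assert(vars != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(vars[i].name, name) == 0)
            return (int)i;
    }
    return -1;
}

/* Contrast only (an assumption for illustration): a lookup that also
 * requires the kind to match. */
static int name_to_index_kind(const Var *vars, size_t n,
                              const char *name, VarKind kind)
{
    assert(vars != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (vars[i].kind == kind && strcmp(vars[i].name, name) == 0)
            return (int)i;
    }
    return -1;
}

/* The variable list as it stands after the steps in the scenario. */
static const Var scenario_vars[] = {
    { "Dataset 1", VAR_SDS },  /* #0: the first SDS              */
    { "My Data",   VAR_DIM },  /* #1: dimension variable         */
    { "My Data",   VAR_SDS },  /* #2: the SDS the user wants     */
};
```

With this list, name_to_index(scenario_vars, 3, "My Data") yields 1, the dimension variable, while name_to_index_kind(scenario_vars, 3, "My Data", VAR_SDS) yields 2, the SDS that was meant.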
Once corrupted as described, the affected datasets or dimensions are not recoverable. We are working on a fix so that data files created in the future will not suffer from these limitations.
HDF does not build on the Macintosh with gcc 4.*
HDF was built and tested with gcc 3.3, and is known not to work with gcc 4.* on the Macintosh. You can set your gcc version with the following command, in order to build HDF:
sudo gcc_select 3.3
HDF JAVA: Problem with creating an unlimited dimension SDS
With the HDF Java 2.2 software, there were some problems with
creating unlimited dimension SDSs. The patched JAR files and
an example program can be downloaded from the following
directory:
ftp://ftp.hdfgroup.org/HDF5/hdf-java/bin/PATCH/
Problem building HDF 4.2r1 on Linux 2.6 with gcc 3.4
When building from source, HDF fails to build hdiff, because
the C math library is not linked:
hdiff_array.o(.text+0x10cc): In function `array_diff': : undefined reference to `sqrt'
Add the path to the math library to LDFLAGS and then re-configure and build the software:
setenv LDFLAGS "-L/usr/lib -lm"
./configure ...
Another solution is to go into the directory of the hdiff source code, build hdiff, and then continue with the make. A user was able to do:
#begin
cd /mfhdf/hdiff;
gcc -lm -O3 -fomit-frame-pointer -o hdiff hdiff.o \
    hdiff_array.o hdiff_gr.o hdiff_list.o hdiff_main.o hdiff_mattbl.o \
    hdiff_gattr.o hdiff_misc.o hdiff_sds.o hdiff_table.o hdiff_vs.o \
    ../libsrc/libmfhdf.a ../../hdf/src/libdf.a -ljpeg -lz -ljpeg -lz
cd ../..
make
#end
Problem with hrepack and empty datasets
There is a problem with the hrepack tool that comes with
HDF 4.2r1. This problem affects datasets for which no data
has been written. The hrepack tool incorrectly exits in
this case. The correct behavior for hrepack is to continue,
creating the output as an empty dataset.
The changes which enable hrepack to continue for empty datasets can be obtained from the directory:
ftp://ftp.hdfgroup.org/HDF/HDF_Current/src/patches/
The file containing the changed files is:
4.2r1-hrepack-patch.tar
Limitation of the following Fortran functions: vsfread, vsfwrit, vntrc
The HDF4 Fortran functions vsfread, vsfwrit, and vntrc support
only integer data buffers on Windows systems.
Known Problems at Release Time
Last modified: October 09th 2007
