General Description of the NRRD format
(See also full definition of NRRD format.)
Like the nrrd library, the main virtue of the NRRD file
format is stark simplicity. The NRRD header is simple ASCII text, one field
per line. The fields in the header do not have a strict ordering, and
most of them are optional. Most strings are case insensitive, and
alternate forms of many of the identifiers and descriptors are
allowed. When writing non-ASCII data, the byte ordering is
recorded, but not altered to match one particular endianness.
Key/value pairs (of strings) are stored in plain text, one pair per line.
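To make the header layout concrete, here is a minimal parsing sketch in Python. It is not the nrrd library's parser. It assumes that the first line is the magic, that fields are written "name: value", that key/value pairs are written "key:=value", that comment lines start with "#", and that a blank line ends the header. The function name and the example header are made up for illustration.

```python
def parse_nrrd_header(text):
    """Split NRRD header text into (magic, fields, key_values, comments)."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("NRRD"):
        raise ValueError("missing NRRD magic line")
    magic = lines[0].strip()
    fields, key_values, comments = {}, {}, []
    for line in lines[1:]:
        if line.strip() == "":
            break  # a blank line ends the header
        if line.startswith("#"):
            comments.append(line[1:].strip())
            continue
        kv = line.find(":=")   # key/value separator
        fd = line.find(": ")   # field separator
        if kv != -1 and (fd == -1 or kv < fd):
            key_values[line[:kv]] = line[kv + 2:]
        elif fd != -1:
            # Field identifiers are treated as case insensitive here.
            fields[line[:fd].strip().lower()] = line[fd + 2:].strip()
        else:
            raise ValueError("unrecognized header line: %r" % line)
    return magic, fields, key_values, comments


# Illustrative header; the values are invented for this example.
EXAMPLE_HEADER = (
    "NRRD0004\n"
    "# example\n"
    "type: short\n"
    "dimension: 3\n"
    "sizes: 4 5 6\n"
    "encoding: raw\n"
    "endian: little\n"
    "modality:=CT\n"
    "\n"
)
magic, fields, key_values, comments = parse_nrrd_header(EXAMPLE_HEADER)
```

Because field order is not significant, storing fields in a dictionary loses nothing; a real reader would go on to validate each value.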
The NRRD file format was also conceived as being somewhat analogous to
the PPM format for color images: straightforward, friendly to
programmers, and descriptive of a sufficiently large class of data to
be useful in research. Time and experience with the NRRD format have
gradually increased its complexity (such as with the introduction of
node- versus cell-centered samples), but the feature set has very
nearly converged. As a general representation of raster data, NRRD is
intended to occupy the very large but sparsely populated niche between:
- Raw, headerless data, hopefully with some nearby README file
explaining the type and dimensions.
- Very sophisticated, powerful (complicated) formats such as HDF
(http://hdf.ncsa.uiuc.edu/).
Many aspects of the NRRD format borrow heavily from the PCGV volume
dataset format developed by James Durkin at the Cornell Program of
Computer Graphics.
Optional encodings
NRRD has two basic encodings: ascii and raw. It has other optional
encodings which are useful in different situations:
- hex: If you know enough PostScript to learn the image
dimensions, this allows you to snarf image data out of a PostScript file.
- gzip: Allows you to read and write data with the zlib
compression library, in a way that is compatible with the gzip/gunzip
command-line tools.
- bzip2: Allows you to read and write data with
bzip2 compression, compatible with the bzip2/bunzip2 command-line tools.
Having an optional encoding means that the nrrd library can be
compiled without these turned on, so that no external libraries are
needed. Builds of the nrrd library which are missing the
compression encodings will fail with a warning message when asked to
read or write compressed data.
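As a rough illustration of why gzip compatibility matters, the sketch below builds a tiny NRRD file in memory whose data portion is a gzip stream, then recovers the samples with nothing but Python's standard gzip module. It assumes an attached header ending in a blank line, with "\n" line endings; the function names and the header values are invented for this example.

```python
import gzip


def write_gzip_nrrd(samples: bytes) -> bytes:
    """Return the bytes of a 1-D unsigned char NRRD with gzip-encoded data."""
    header = (
        "NRRD0001\n"
        "type: unsigned char\n"
        "dimension: 1\n"
        "sizes: %d\n"
        "encoding: gzip\n"
        "\n"  # blank line separates header from data
    ) % len(samples)
    return header.encode("ascii") + gzip.compress(samples)


def read_gzip_payload(nrrd_bytes: bytes) -> bytes:
    """Decompress the data portion that follows the header's blank line."""
    header, sep, payload = nrrd_bytes.partition(b"\n\n")
    if not sep:
        raise ValueError("no blank line found after the header")
    return gzip.decompress(payload)


blob = write_gzip_nrrd(bytes(range(10)))
recovered = read_gzip_payload(blob)
```

The same payload, saved to its own file, is what a command-line gunzip would be expected to accept.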
Other optional encodings may be added in the future. However, there
is no risk that NRRD will turn into another TIFF, a format so flexible
that few readers actually support all of the 121-page
specification. The only optional encodings which may be
added to NRRD in the future will be ones for which there exist freely
available command-line tools to convert the encoded data (in
isolation) to raw data.
If you have a NRRD file volume.nrrd, with an attached header,
using a data encoding not supported by the available nrrd
implementation, you can always use the Unix/GNU/Linux/Cygwin command
"tail +N volume.nrrd" (where N is two plus the
number of lines in the header; current versions of tail spell this
"tail -n +N volume.nrrd") to get at all the data, so as to pass
it on to a stand-alone converter. Or, the Utah Nrrd Utilities command
"unu data" is a much easier way of doing the exact same
thing. Data in a separate file, detached from the NRRD header, is
obviously trivial to pass to a converter.
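The line count N can also be found programmatically. The following sketch assumes an attached header that uses "\n" line endings and contains only ASCII text; the function names are hypothetical, not part of unu or the nrrd library.

```python
def split_attached_nrrd(blob: bytes):
    """Return (header_lines, data) for an NRRD file with an attached header."""
    end = blob.find(b"\n\n")  # the first blank line ends the header
    if end == -1:
        raise ValueError("no blank line found after the header")
    header_lines = blob[:end].decode("ascii").split("\n")
    data = blob[end + 2:]
    return header_lines, data


def tail_start_line(header_lines) -> int:
    """Line number N at which the data starts: header lines, blank line, data."""
    return len(header_lines) + 2


example = (
    b"NRRD0001\n"
    b"type: uchar\n"
    b"dimension: 1\n"
    b"sizes: 3\n"
    b"encoding: raw\n"
    b"\n"
    b"\x01\x02\x03"
)
header_lines, data = split_attached_nrrd(example)
n = tail_start_line(header_lines)
```

Here the header has five lines, so the data begins on line seven.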
I couldn't find stand-alone converters for hex data, so I wrote them:
Because unu data will always be able to spit out the
data portion of a NRRD file, even if the nrrd library on which
it was built wasn't compiled with the optional encodings enabled,
other non-teem NRRD readers should feel no obligation to support the
optional encodings.
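For a sense of how little a stand-alone hex converter has to do, here is a sketch in Python. It is not the converter mentioned above. It assumes that hex-encoded data is a stream of two-digit hexadecimal byte values with arbitrary whitespace between them; the names dehex and enhex and the 70-column line width are choices made for this example.

```python
def dehex(text: str) -> bytes:
    """Convert whitespace-separated hex digits to raw bytes."""
    digits = "".join(text.split())  # drop all whitespace, including newlines
    return bytes.fromhex(digits)


def enhex(data: bytes, width: int = 70) -> str:
    """Convert raw bytes to hex digits, wrapped to lines of `width` characters."""
    digits = data.hex()
    return "\n".join(digits[i:i + width] for i in range(0, len(digits), width))


raw = dehex("00ff 10\n7f")
round_trip = dehex(enhex(b"abc"))
```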
NRRD compared to VTK format
The VTK
(http://public.kitware.com/VTK/pdf/file-formats.pdf) file
formats (non-XML versions) are more general than NRRD in the types of
information represented, and slightly less general than NRRD when it
comes to raster data.
Unlike NRRD, VTK can represent:
- Point sets, polygonal data, structured and unstructured meshes
of various types
- Multiple pieces of data in one file, allowing samples to have
many various attributes.
- Vector and tensor types explicitly. In NRRD, these are
represented implicitly, by using a short non-spatial axis prior to
the spatial axes.
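The implicit representation of vectors can be sketched as simple index arithmetic. This example assumes that NRRD axes are listed from fastest to slowest, so a 3-vector field on a 4 x 5 x 6 grid has sizes "3 4 5 6" and the vector components of one sample are adjacent in memory. The function name is hypothetical.

```python
def vector_sample_offset(component, x, y, z, sizes):
    """Offset (in samples) of one vector component in a flat 4-D array.

    `sizes` is (ncomp, sx, sy, sz), fastest axis first.
    """
    ncomp, sx, sy, sz = sizes
    for index, size in ((component, ncomp), (x, sx), (y, sy), (z, sz)):
        if not 0 <= index < size:
            raise IndexError("index out of range")
    return component + ncomp * (x + sx * (y + sy * z))


sizes = (3, 4, 5, 6)
first = vector_sample_offset(0, 0, 0, 0, sizes)
last = vector_sample_offset(2, 3, 4, 5, sizes)
```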
But with raster data, unlike VTK, NRRD can:
- Read and write data in either byte-ordering (VTK is always
big-endian).
- Have the data in a separate file from the header.
- Represent the difference between cell- and node-centered samples.
- Store data of any dimension, and any C scalar type.
- Encode data in more than just ascii and binary, including gzip
and bzip2 compression.
- Store more peripheral information, such as axis labels and units.
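Since the byte ordering is recorded rather than normalized, a reader has to honor it when decoding raw data. The sketch below does so with Python's struct module. It assumes the "endian" field holds "little" or "big", and it covers only a handful of type names with their usual sizes (16-bit short, 32-bit int); the table and function name are invented for this example.

```python
import struct

# NRRD type name -> struct format character (a partial, illustrative table)
_FORMATS = {"short": "h", "ushort": "H", "int": "i", "uint": "I",
            "float": "f", "double": "d"}
# NRRD endian value -> struct byte-order prefix
_ORDER = {"little": "<", "big": ">"}


def decode_raw(data: bytes, nrrd_type: str, endian: str, count: int):
    """Decode `count` samples of raw data using the recorded byte order."""
    fmt = _ORDER[endian] + str(count) + _FORMATS[nrrd_type]
    needed = struct.calcsize(fmt)
    if len(data) < needed:
        raise ValueError("not enough data for %d samples" % count)
    return list(struct.unpack(fmt, data[:needed]))


little = decode_raw(b"\x01\x00\x02\x00", "short", "little", 2)
big = decode_raw(b"\x01\x00\x02\x00", "short", "big", 2)
```

The same four bytes decode to different values depending on the recorded endianness, which is why the field must travel with the data.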
NRRD compared to MetaImage format
This format (specs available from http://caddlab.rad.unc.edu/software/MetaIO/)
was developed at the Computer
Aided Diagnosis and Display Lab at UNC. It is extremely similar
to NRRD in terms of representational capabilities, in that it
represents arrays of general type and dimension. As of version 4
of the NRRD file format (with magic "NRRD0004") there are
basically only superficial differences
between the representational abilities of the two formats.
In favor of MetaImage:
- The ElementNumberOfChannels field allows a nominally
3-D data header to describe what is logically a 4-D array. NRRD
suffers from a slight weirdness in this regard (a color image is a
three-dimensional nrrd), a consequence of its "everything is a scalar"
mentality.
In favor of NRRD:
- The NRRD format has a simple "magic" on a line by itself at the
beginning of the file, to unambiguously identify the type of the file
to multi-format readers.
- NRRD supports both gzip and bzip2
compression, as well as hex, and the underlying modular implementation
allows new encodings to be easily added, which work with and without
attached headers.
- A very conservative approach to representing optional information.
If you don't know information like sample spacing, you don't include
that field. The NRRD reader remembers that you didn't know spacing,
rather than contriving a default value.
- A flexible and general approach to defining the array orientation,
which enables the lossless representation of orientation information
from NIFTI-1, Analyze, and DICOM images.
- Has a powerful associated command-line program, unu, for doing image
manipulation and pre-processing quickly and easily.
The underlying nrrd library provides an
easy-to-use C API to the same functionality.
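The conservative treatment of optional information can be sketched as follows: an absent field stays "unknown" instead of becoming a default such as 1.0. This is an illustration in Python, not the nrrd library's behavior. It assumes a "spacings" field with one whitespace-separated value per axis, and it assumes that a per-axis value of "nan" marks an axis with no known spacing. The function name is hypothetical.

```python
import math


def parse_spacings(fields: dict, dimension: int):
    """Return one entry per axis: a float, or None where spacing is unknown."""
    raw = fields.get("spacings")
    if raw is None:
        # Field omitted: every axis is unknown, no default is invented.
        return [None] * dimension
    values = [float(token) for token in raw.split()]
    if len(values) != dimension:
        raise ValueError("expected %d spacings, got %d" % (dimension, len(values)))
    return [None if math.isnan(v) else v for v in values]


unknown = parse_spacings({}, 3)
partial = parse_spacings({"spacings": "nan 0.5 2"}, 3)
```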
There are many one-to-one parallels between the header fields in
the two formats:
NRRD                            | MetaImage
#                               | Comment
dimension                       | NDims
sizes                           | DimSize
type                            | ElementType
endian                          | ElementByteOrderMSB
byte skip                       | HeaderSize
min                             | ElementMin
max                             | ElementMax
thicknesses                     | ElementSize
content                         | Name
axis mins                       | Position
space origin, space directions  | Orientation, AnatomicalOrientation
data file                       | ElementDataFile
(not using a detached header)   | ElementDataFile LOCAL
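The one-to-one parallels lend themselves to a simple lookup table. The sketch below only renames fields; it does not convert values (type names, byte-order flags, and so on differ between the formats and would need their own handling). The dictionary and function names are made up for this example, and the orientation fields are left out because they do not map one-to-one.

```python
# NRRD field name -> MetaImage field name, for the one-to-one cases only
NRRD_TO_METAIMAGE = {
    "dimension": "NDims",
    "sizes": "DimSize",
    "type": "ElementType",
    "endian": "ElementByteOrderMSB",
    "byte skip": "HeaderSize",
    "min": "ElementMin",
    "max": "ElementMax",
    "thicknesses": "ElementSize",
    "content": "Name",
    "axis mins": "Position",
    "data file": "ElementDataFile",
}


def rename_fields(nrrd_fields: dict):
    """Split NRRD fields into (renamed for MetaImage, left without an analog)."""
    renamed, unmatched = {}, {}
    for name, value in nrrd_fields.items():
        target = NRRD_TO_METAIMAGE.get(name)
        if target is None:
            unmatched[name] = value
        else:
            renamed[target] = value  # value is copied as-is, not converted
    return renamed, unmatched


renamed, unmatched = rename_fields(
    {"dimension": "3", "sizes": "4 5 6", "centers": "cell cell cell"}
)
```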
There are various MetaImage fields that NRRD has no immediate analog for,
because NRRD aims to be more minimalist in its representational abilities.
This sort of information would be stored as key/value pairs in NRRD:
- ObjectType, ObjectSubType, TransformType,
Modality: descriptive strings
- ID, ParentID: integers
- Color: arrays
- SequenceID: 4-tuple of integers specific to DICOM format.
Some NRRD fields that MetaImage doesn't seem to have analogs for:
- centers: Records
whether samples are cell-centered or node-centered, which is
fundamental to properly representing, for example, histograms (cell-centered)
versus some simulation data (node-centered).
- axis maxs: Helpful
for unambiguously storing histograms, scatterplots, and images with a
specific field of view.
- old min,
old max: Very handy for remembering
the original range of floating point values, prior to quantization.
- line skip: When
snarfing data from other formats (VTK, VFF, VisPack, and even
PostScript), specifying the skip in terms of lines (rather than bytes)
is much simpler.
- labels: arbitrary
string per axis
- units: arbitrary string
per axis giving units that spacing is measured in.
- space: the space in
which the orientation of the array is defined can be explicitly named,
mainly to disambiguate between medical image format standards using
different spaces (DICOM's left-posterior-superior versus NIFTI-1's
right-anterior-superior).
- measurement
frame: for non-scalar data such as vectors and tensors, it is
extremely useful to record the coordinate frame in which the vector
and tensor coefficients are measured.
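The effect of centering on sample locations can be worked through in a few lines. The sketch assumes the usual convention: with node-centered samples the first and last samples sit exactly on the axis min and max, while with cell-centered samples the min and max are the outer edges of the first and last cells and each sample sits in the middle of its cell. The function name is hypothetical.

```python
def sample_positions(axis_min, axis_max, size, centering):
    """Return the position of each sample along one axis."""
    if centering == "node":
        if size < 2:
            raise ValueError("node centering needs at least two samples")
        spacing = (axis_max - axis_min) / (size - 1)
        return [axis_min + i * spacing for i in range(size)]
    if centering == "cell":
        spacing = (axis_max - axis_min) / size
        return [axis_min + (i + 0.5) * spacing for i in range(size)]
    raise ValueError("centering must be 'node' or 'cell'")


nodes = sample_positions(0.0, 4.0, 5, "node")
cells = sample_positions(0.0, 4.0, 4, "cell")
```

With the same min and max, five node-centered samples and four cell-centered samples both have unit spacing, but the sample locations differ by half a sample. That is the ambiguity the centers field removes.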
Future Extensions
I have some ideas on how the NRRD file format may be extended in
the future, but these are not likely to happen within the next year.
- More than one array per NRRD file: There are many situations
where it is good to logically associate multiple NRRD files together
as one "data set". Examples include a large volume and its
pre-computed univariate histogram (to help in isovalue navigation), or
a collection of different one-, two-, and three-dimensional lookup
tables which are good transfer functions for a given volume. I am
leaning towards implementing this multi-NRRD association with XML on
top of regular NRRD files. If you need this, though, you should
probably be using HDF
(http://hdf.ncsa.uiuc.edu/).
- Bricking: Whenever I do get around to implementing bricking in
nrrd, the results will be saved in NRRD files. One level of
bricking will turn a 3-D array into a 6-D array. For various subtle
reasons, the representation of the axis mins and maxs becomes
ambiguous in the case of cell-centered data, and this information is
meaningless for all the bricked axes. Brick overlap should be
represented too, but this means a new field.
- Other compression methods: It would be really nice to have a
compression method that worked well on floating point data.
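To illustrate how one level of bricking turns a 3-D array into a 6-D array, here is a purely hypothetical index mapping. Nothing here is defined by the NRRD format; in particular, the choice to put the three within-brick axes first (fastest) and the three brick-position axes last is an assumption made for this sketch, as is the function name.

```python
def brick_index(x, y, z, brick):
    """Map a 3-D index to a 6-D bricked index.

    `brick` is the brick size (bx, by, bz). The result is
    (x in brick, y in brick, z in brick, brick x, brick y, brick z).
    """
    bx, by, bz = brick
    brick_x, in_x = divmod(x, bx)
    brick_y, in_y = divmod(y, by)
    brick_z, in_z = divmod(z, bz)
    return (in_x, in_y, in_z, brick_x, brick_y, brick_z)


index6 = brick_index(5, 2, 9, (4, 4, 4))
```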