Teem: bane tutorial: Step 1

`bane`/`gkms` tutorial: Step 1: Creating the Histogram Volume

This tutorial does not work with any recent versions of Teem. Sorry.

Running "gkms hvol" lists the options for creating histogram volumes:
gkms hvol: Make histogram volume. The histogram volume is a three-dimensional
histogram recording the relationship between data value, gradient magnitude,
and the second directional derivative along the gradient direction. Creating
it is the first step in semi-automatic transfer function generation. 

Usage: gkms hvol [-s <incV incG incH>] [-d <dimV dimG dimH>] [-k00 <kernel>] \
       [-k11 <kernel>] [-k22 <kernel>] [-l] [-slow] [-gz] -i <volumeIn> \
       -o <hvolOut>

-s <incV incG incH> = Strategies for determining how much of the range of a
                      quantity should be included and quantized in its axis of
                      the histogram volume. Possibilities include:
                    o "f:<F>": included range is some fraction of the
                      total range, as scaled by F
                    o "p:<P>": exclude the extremal P percent of the
                      values
                    o "s:<S>": included range is S times the standard
                      deviation of the values
                    o "a:<min>,<max>": range is from <min> to <max>
                      (3 inclusion strategies)
                      default: "f:1.0 p:0.005 p:0.015"
-d <dimV dimG dimH> = Dimensions of histogram volume; number of samples along
                      each axis (3 ints); default: "256 256 256"
      -k00 <kernel> = value reconstruction kernel; default: "tent"
      -k11 <kernel> = first derivative kernel; default: "cubicd:1,0"
      -k22 <kernel> = second derivative kernel; default: "cubicdd:1,0"
                 -l = Use Laplacian instead of Hessian to approximate second
                      directional derivative. No faster, less accurate.
              -slow = Instead of allocating a floating point VGH volume and
                      measuring V,G,H once, measure V,G,H multiple times on
                      seperate passes (slower, but needs less memory)
                -gz = Use gzip compression for output histo-volume; much less
                      disk space, slightly slower to read/write
      -i <volumeIn> = input scalar volume for which a transfer function is
                      needed
       -o <hvolOut> = output histogram volume, used by "gkms scat" and "gkms
                      info" (string)
One result of my Master's thesis research is that you very likely don't need to worry about any of the optional command-line parameters: the defaults generally work quite well. So, for the impatient, this may be all you need:
gkms hvol -i engine-crop.nrrd -o engine-hvol.nrrd
The main subtlety in histogram volume creation is deciding what range of derivative values should be represented by the index space of the histogram volume. This aspect of histogram volume creation is called the "inclusion strategy". I have found that the most reliable way to determine inclusion is to look at linear histograms of the derivative values, and then determine what range would exclude some percentile of the hits. This is why the default inclusion strategies for the first and second derivative axes (of the histogram volume) are "p:0.005 p:0.015". Any time the data is something other than super-clean synthetic data, there are going to be a few very high derivative values. These voxels should not be included in the histogram volume, because including them would decrease the precision with which we can represent the meaningful variations in the derivative values.
On the other hand, the range of data values should probably be fully represented, so the default inclusion strategy for that axis is "f:1.0", meaning 100% of the range of values should be included. In other cases, you know exactly what range of values to include, in which case it makes the most sense to use inclusion "a": absolute specification of the derivative values to be represented by the histogram. Some multiple of the standard deviation ("s") may be another interesting inclusion strategy.
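To make the percentile idea concrete, here is a minimal Python sketch of how excluding a small number of extreme derivative values tightens the range that the histogram bins have to cover. It is not bane's implementation: the function name `percentile_cutoff`, the synthetic `values` list, and the choice to treat P literally as a percentage of a non-negative quantity are all assumptions made for illustration.

```python
import math

def percentile_cutoff(values, percent):
    """Return a top-of-range cutoff that leaves out (at most) the largest
    `percent` percent of the samples. Hypothetical helper, not part of bane."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    # Number of largest samples to leave outside the included range.
    n_exclude = int(math.floor(len(ordered) * percent / 100.0))
    n_exclude = min(n_exclude, len(ordered) - 1)  # always keep one sample
    return ordered[len(ordered) - 1 - n_exclude]

# Synthetic "gradient magnitudes": 995 ordinary values plus 5 large outliers.
values = [i / 1000.0 for i in range(995)] + [50.0, 60.0, 70.0, 80.0, 90.0]

full_top = max(values)                   # top of range if everything is included
cutoff = percentile_cutoff(values, 0.5)  # drop the top 0.5 percent (5 samples)

bins = 256
print("bin width, full range:   ", full_top / bins)
print("bin width, outliers cut: ", cutoff / bins)
```

With the outliers included, nearly all of the ordinary values would share the first few bins; with them excluded, the same 256 bins are spread over the range where the meaningful variation is.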
The next option available in gkms is the dimensions ("-d") of the output histogram volume. I have found 256³ to work fine. If you want to skimp on how big the histogram volume is, it makes more sense to skimp on the derivative axes than on the data value axis. For example, to make a histogram volume with only 128 bins along each of the two derivative axes, you'd run:
gkms hvol -i engine-crop.nrrd -d 256 128 128 -o tmp.hvol
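As a rough illustration of what the dimensions control, the following Python sketch maps a measured quantity to a bin index along one axis, given that axis's included range and number of samples. The helper `bin_index` is hypothetical, and its handling of out-of-range values (they are simply not counted) is an assumption about the general idea, not a description of bane's code.

```python
def bin_index(value, lo, hi, n_bins):
    """Map `value` in [lo, hi] to an integer bin in [0, n_bins - 1].
    Returns None for values outside the included range (assumed behavior)."""
    if value < lo or value > hi:
        return None
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # value == hi lands in the last bin

# The same gradient magnitude lands in a coarser bin when the axis has 128 samples.
fine = bin_index(0.5, 0.0, 1.0, 256)
coarse = bin_index(0.5, 0.0, 1.0, 128)
print(fine, coarse)

# Total number of bins for the two sizes discussed above.
full_size = 256 * 256 * 256
reduced_size = 256 * 128 * 128
print(full_size // reduced_size)
```

Halving both derivative axes cuts the number of bins by a factor of four while leaving the resolution along the data value axis untouched.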
Finally, there is the matter of just how to measure the second derivative. I know of three different ways of approximating the second directional derivative along the gradient direction: the Laplacian, something based on the Hessian, and something based on the gradient of the gradient magnitude. As explained in Appendix C of my thesis, I feel the Hessian-based measure is probably the best, so it is made available in gkms. The other measurement methods were available in previous versions, but all recent versions use gage for convolution-based derivative measurement. Thus, in gkms hvol you can choose the kernels that will be used for values, 1st derivatives, and 2nd derivatives. The default settings generate central and second central differences for 1st and 2nd derivatives, respectively.
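For reference, the second directional derivative of f along a unit vector v is vᵀHv, where H is the Hessian; taking v to be the normalized gradient g/|g| gives gᵀHg / |g|². The Python sketch below evaluates that expression next to the Laplacian for two hand-written test cases. It works from analytically known gradients and Hessians rather than convolution kernels, so it only illustrates the quantities involved, not how gage measures them; the function names are hypothetical.

```python
def second_directional_derivative(grad, hess):
    """g^T H g / |g|^2: second derivative of f along its gradient direction."""
    gg = sum(g * g for g in grad)
    if gg == 0.0:
        raise ValueError("gradient is zero; its direction is undefined")
    gHg = sum(grad[i] * hess[i][j] * grad[j]
              for i in range(3) for j in range(3))
    return gHg / gg

def laplacian(hess):
    """Trace of the Hessian."""
    return hess[0][0] + hess[1][1] + hess[2][2]

# Case 1: f = x^3 at (1, 0, 0). The value varies along x only (a flat boundary).
planar_grad = (3.0, 0.0, 0.0)
planar_hess = ((6.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0))

# Case 2: f = x^2 + y^2 + z^2 at (1, 2, 2). The isosurfaces are spheres.
sphere_grad = (2.0, 4.0, 4.0)
sphere_hess = ((2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0))

for name, g, h in (("planar", planar_grad, planar_hess),
                   ("sphere", sphere_grad, sphere_hess)):
    print(name, second_directional_derivative(g, h), laplacian(h))
```

In the planar case the two measures agree; in the spherical case the Laplacian also picks up second derivatives perpendicular to the gradient, which is one way to see why it is only an approximation.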
