ImageSeries

Originally, hexrd could only process GE images. We developed the imageseries package to allow for other image formats. The imageseries package provides a standard interface for images coming from different sources. The idea is that we could be working with a large number of images that we don’t want to keep in memory. Instead we load the images from a file or generate them dynamically, but the interface is independent of the source.

See Example of Imageseries Usage for a complete worked example.

Open and Write. The imageseries package has two main functions: open and write.

ims = imageseries.open(file, format, **kwargs)
imageseries.write(ims, file, format, **kwargs)

The format refers to the source of the images; file and kwargs depend on the format. Possible formats currently are:

hdf5: The images are stored in an HDF5 file and loaded on demand. file is the name of the HDF5 file.
frame-cache: The images are stored as sparse matrices in a numpy .npz file; all of the sparse arrays are loaded on open, and a full (not sparse) array is delivered when a frame is requested. There are two ways this can be done. In one, file is the name of the .npz file, and the metadata is stored in that file. In the other, file is a YAML file that includes the name of the .npz file as well as the metadata.
image-files: The images are stored as one or more regular image files on the file system. file is a YAML file listing a sequence of image files and metadata.
raw-image: This is for nonstandard or less common image formats that do not load with fabio <https://pypi.org/project/fabio/>. In that case, you can define your own data format. The file argument is a YAML file.
array: The images are stored as a 3D numpy array; used for testing. It takes no file argument; pass None.

Processed Imageseries. This is a subclass of imageseries. It has a number of built-in operations, such as flipping, dark subtraction, restriction to a sub-rectangle, and selecting frames. It can be further subclassed by adding more operations. It is instantiated with an existing imageseries and a list of operations. When a frame is requested, the processed imageseries gets the frame from the original image series and applies the operations in order. It can then be saved as a regular imageseries and loaded as usual.

For more detail, see Processed Image Series.

Interface. The imageseries provides a standard interface for accessing images, somewhat like a 3D array. Note that indexing does not work for slices or multiple indices.

If ims is an imageseries instance:

len(ims) is the number of frames
ims[j] returns the j’th frame
ims.shape is the shape of each frame
ims.dtype is the numpy.dtype of each frame
ims.metadata is a dictionary of metadata
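
The interface above can be illustrated with a small stand-in class. This is only a sketch: GeneratedSeries is a hypothetical name, not part of hexrd; frames are plain nested lists rather than numpy arrays, and dtype is a placeholder string where the real attribute is a numpy.dtype. It shows frames being generated on demand behind the same accessors.

```python
class GeneratedSeries:
    """Hypothetical stand-in that mimics the imageseries interface."""

    def __init__(self, nframes, shape, metadata=None):
        self._nframes = nframes
        self.shape = shape        # shape of each frame: (rows, cols)
        self.dtype = "uint16"     # placeholder; the real attribute is a numpy.dtype
        self.metadata = dict(metadata or {})

    def __len__(self):
        return self._nframes

    def __getitem__(self, j):
        # Only single integer indices are accepted, matching the note
        # that slices and multiple indices do not work.
        if not isinstance(j, int):
            raise TypeError("only single integer frame indices are supported")
        if not 0 <= j < self._nframes:
            raise IndexError(j)
        nrows, ncols = self.shape
        # Frame j is built on demand: every pixel holds the frame number.
        return [[j] * ncols for _ in range(nrows)]


ims = GeneratedSeries(nframes=5, shape=(2, 3), metadata={"note": "demo"})
print(len(ims), ims.shape, ims.dtype, ims.metadata)
print(ims[2])
```

Nothing is stored per frame here; each request builds the frame, which is the point of keeping the interface independent of the source.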

Stats module. This module delivers pixel by pixel stats on the imageseries. For each function below, there is also a corresponding iterator that does the same thing, but in smaller chunks. The iterators are much better for median and percentiles, which involve data for all frames. See the API docs. Functions are:

max(ims, nframes=0) gives a single image that is the max over all frames or a subset
min(ims, nframes=0) gives a single image that is the min over all frames or a subset
average(ims, nframes=0) gives the mean pixel value over all the frames or a subset
median(ims, nframes=0) gives the median over all frames or a subset
percentile(ims, pct, nframes=0) gives the percentile over all frames or a subset

The median is typically used to generate background images, but a percentile could also be used.
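
To see why a pixel-by-pixel median makes a reasonable background, here is a minimal sketch using plain nested lists and the standard library. The function median_image is a hypothetical helper written for this illustration, not the hexrd implementation.

```python
from statistics import median


def median_image(frames):
    """Pixel-by-pixel median over a list of equally shaped frames."""
    nrows, ncols = len(frames[0]), len(frames[0][0])
    return [
        [median(f[r][c] for f in frames) for c in range(ncols)]
        for r in range(nrows)
    ]


frames = [
    [[10, 11], [12, 13]],
    [[10, 90], [12, 13]],   # bright spot at pixel (0, 1)
    [[11, 11], [80, 13]],   # bright spot at pixel (1, 0)
]
background = median_image(frames)
print(background)
```

A spot that appears in only a minority of frames does not move the median, so the result tracks the slowly varying background rather than the diffraction signal.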

Here is an example showing how to use the iterators:

#  Using the standard function call
img = stats.average(ims)

# Using the iterable with 10 chunks
for img in stats.average_iter(ims, 10):
    # update progress bar
    pass
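
The chunked pattern can be sketched without hexrd. The generator below is an assumption about how such an iterator might behave: it yields the running average after each chunk of frames, so the last value yielded equals the full average. The name average_iter_sketch and the exact chunk boundaries are illustrative only; check the API docs for the real behavior.

```python
def average_iter_sketch(frames, nchunk):
    """Yield the running pixel-by-pixel average after each chunk of frames."""
    nframes = len(frames)
    nrows, ncols = len(frames[0]), len(frames[0][0])
    total = [[0.0] * ncols for _ in range(nrows)]
    for k in range(nchunk):
        # Frame range of chunk k; chunks split the frames as evenly as possible.
        start = k * nframes // nchunk
        stop = (k + 1) * nframes // nchunk
        for frame in frames[start:stop]:
            for r in range(nrows):
                for c in range(ncols):
                    total[r][c] += frame[r][c]
        if stop:  # skip leading empty chunks when nchunk > nframes
            yield [[v / stop for v in row] for row in total]


frames = [[[2]], [[4]], [[6]], [[8]]]   # four 1x1 frames
for img in average_iter_sketch(frames, 2):
    print(img)   # a progress bar could be updated here
```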

Omega module. For the HEDM work, we usually have a sequence of rotations about the vertical axis. Omega refers to the angle of rotation. The OmegaImageSeries is a subclass that has metadata for the rotation angles.

See Omega Module.

Example of Imageseries Usage

Here is an example of how the imageseries is used. One thing we commonly do is to process the raw image files by adding flips and subtracting the background. Then we save the result as a new imageseries. This example saves it into an HDF5 file, but it is more common to use the frame-cache (sparse matrices), which is much smaller.

import numpy as np

from hexrd import imageseries
from hexrd.imageseries.process import ProcessedImageSeries as ProcessedIS
from hexrd.imageseries.omega import OmegaWedges

# Inputs
darkfile = 'dark-50pct-100f'
h5file = 'example.h5'
fname = 'example-images.yml'
mypath = '/example'

# Raw image series: directly from imagefiles
imgs = imageseries.open(fname, 'image-files')
print(
   "number of frames: ", len(imgs),
   "\ndtype: ", imgs.dtype,
   "\nshape: ", imgs.shape
)

# Make dark image from first 100 frames
pct = 50
nf_to_use = 100
dark = imageseries.stats.percentile(imgs, pct, nf_to_use)
np.save(darkfile, dark)


# Now, apply the processing options
ops = [('dark', dark), ('flip', 'h')]
pimgs = ProcessedIS(imgs, ops)


# Save the processed imageseries in HDF5 format
print(f"writing HDF5 file (may take a while): {h5file}")
imageseries.write(pimgs, h5file, 'hdf5', path=mypath)

Here is the YAML file for the raw image-series.

image-files:
  directory: GE
  files: "ti7_*.ge2"
options:
  empty-frames: 0
  max-frames: 2
meta:
  omega: "! load-numpy-array example-omegas.npy"

Keyword Options for imageseries

Each type of imageseries has its own keyword options for loading and saving.

HDF5

The format name is hdf5.

This is used at CHESS (Cornell High Energy Synchrotron Source). Raw data from the Dexela detectors comes out in HDF5 format. We still apply the dark subtraction and flipping.

On Open.

path: (required) path to directory containing data group (data set is named images)
dataname: name of data set, default = “images”; note that there is no actual write option for this.

On Write.

path: (required) path to directory containing data group
dataname: name of data set, default = “images”
shuffle: (default=True) HDF5 write option
gzip: (default=1) compression level
chunk_rows: (default=all) sets HDF5 chunk size in terms of number of rows in image

Frame Cache

The format name is frame-cache.

A better name might be sparse matrix format, because the images are stored as sparse matrices in a numpy .npz file. There are actually two forms of the frame-cache. The original is a YAML-based format, which is now deprecated. The current format is a single .npz file that includes array data and metadata.

On Open. No options are available on open.

On Write.

threshold: (required) this is the main option; all data below the threshold is ignored. Be careful, because too small a threshold creates huge files. Normally, however, the file size is reduced massively, since the images are usually over 99% sparse.
output_yaml: (default=False) This is deprecated.
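
The effect of the threshold can be sketched as follows. This is an illustration with plain Python containers, not the hexrd storage format; to_sparse and to_full are hypothetical helpers, and keeping only values strictly greater than the threshold is an assumption.

```python
def to_sparse(frame, threshold):
    """Keep only pixels above the threshold as {(row, col): value}."""
    # Assumption: "above" means strictly greater than the threshold.
    return {
        (r, c): v
        for r, row in enumerate(frame)
        for c, v in enumerate(row)
        if v > threshold
    }


def to_full(sparse, shape):
    """Rebuild a full frame, filling dropped pixels with zero."""
    nrows, ncols = shape
    full = [[0] * ncols for _ in range(nrows)]
    for (r, c), v in sparse.items():
        full[r][c] = v
    return full


frame = [[3, 0, 250], [1, 2, 0], [0, 900, 4]]
for threshold in (0, 10):
    sparse = to_sparse(frame, threshold)
    print(threshold, len(sparse), to_full(sparse, (3, 3)))
```

With the low threshold, noise pixels are stored along with the two real peaks; with the higher one, only the peaks survive, which is where the file-size reduction comes from.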

Image Files

The format name is image-files.

On Open.

The YAML file is usually written by hand. Because this is a YAML-based format, the options are in the file, not passed as keyword arguments. The file defines a list of image files. It could be a list of single images or a list of multi-image files.

YAML keywords are:

image-files

dictionary defining the image files

directory: the directory containing the images
files: the list of images; it is a space separated list of file names or glob patterns

metadata

(required) it usually contains array data or string, but it can be empty

empty-frames

(optional) number of frames to skip at the beginning of each multiframe file; this is a commonly used option

max-total-frames

(optional) the maximum number of frames to make available in the imageseries; this option might be used for testing the data on a small number of frames

max-file-frames

(optional) the maximum number of frames to read per file; this would be unusual

On Write.

There is no write function for this type of imageseries. It is usually used to load image data that is then converted to sparse form and saved in another (usually frame-cache) format.

Raw Image

The format name is raw-image.

On Open.

This is another YAML-based format.

YAML keywords are:

filename

name of the data file

scalar

This defines the scalar details.

type: can be “i”, “f”, “d”, or “b” for integer, float, double or bool
bytes: 1, 2, 4, or 8, for integer types only
signed: true or false
endian: can be big or little

shape

2-tuple of ints describing the shape

skip

number of bytes to skip at the beginning of the file (in the header)

Here is an example that describes the GE format:

#
# YAML example for raw image
#
# For scalar definition:
#   "type": i -> int, f -> float, d -> double, b -> bool
#    "bytes" and "signed" are only for int types
#    "bytes": 1, 2, 4, or 8
#    "signed": true | false
#    "endian": use sys.byteorder to determine value for local system
#
filename: RUBY_4537.ge
shape: 2048 2048
skip: 8192
scalar:
  type: i
  bytes: 2
  signed: false
  endian: little
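
To make the meaning of these keywords concrete, here is a sketch that reads a frame from a raw binary file using only the standard library. It is not the hexrd reader: read_raw_frame and INT_CODES are hypothetical names, only integer scalars are handled, and the layout (a header of skip bytes followed by frames stored back to back in row-major order) is an assumption. The demo writes a tiny file with an 8-byte header instead of using a real GE file.

```python
import os
import struct
import tempfile

# (bytes, signed) from the "scalar" block -> struct format character.
INT_CODES = {
    (1, True): "b", (1, False): "B",
    (2, True): "h", (2, False): "H",
    (4, True): "i", (4, False): "I",
    (8, True): "q", (8, False): "Q",
}


def read_raw_frame(filename, shape, skip, nbytes=2, signed=False,
                   endian="little", frame=0):
    """Read one integer frame; assumes frames follow the header back to back."""
    nrows, ncols = shape
    order = "<" if endian == "little" else ">"
    fmt = f"{order}{nrows * ncols}{INT_CODES[(nbytes, signed)]}"
    size = struct.calcsize(fmt)
    with open(filename, "rb") as f:
        f.seek(skip + frame * size)   # jump past the header and earlier frames
        values = struct.unpack(fmt, f.read(size))
    return [list(values[r * ncols:(r + 1) * ncols]) for r in range(nrows)]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.raw")
    with open(path, "wb") as f:
        f.write(b"\x00" * 8)                               # 8-byte header
        f.write(struct.pack("<6H", 1, 2, 3, 4, 5, 6))      # frame 0
        f.write(struct.pack("<6H", 7, 8, 9, 10, 11, 12))   # frame 1
    first = read_raw_frame(path, shape=(2, 3), skip=8)
    second = read_raw_frame(path, shape=(2, 3), skip=8, frame=1)

print(first)
print(second)
```

For the GE description above, the corresponding call would use shape=(2048, 2048), skip=8192, nbytes=2, signed=False and endian="little".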

On Write.

Like the image-files format, there is no writer for this format.

Array

This loads a 3D numpy array and treats it as an imageseries.

On Open.

data: The 3-dimensional numpy array data.
metadata: The metadata dictionary.

On Write.

There is no writer for this format.

Processed Image Series

This class is intended for image manipulations applied to all images of the imageseries. It is instantiated with an existing imageseries and a sequence of operations to be performed on each image. The class has built-in operations for common transformations and a mechanism for adding new operations. This class is typically used for preparing raw detector images for analysis, but other uses are possible. The rectangle operation is used in stats.percentile to compute percentiles one image section at a time to avoid loading all images at once.

Instantiation

Here is an example:

oplist = [('dark', darkimg), ('flip', 'v')]
frames = range(2, len(ims))
pims = ProcessedImageSeries(ims, oplist, frame_list=frames)

Here, ims is an existing imageseries with two empty frames. The operation list has two operations. First, a dark (background) image is subtracted. Then the image is flipped about a vertical axis. Order is important here; operations do not always commute. Note that the dark image is usually constructed from the raw images, so if you flipped first, the dark subtraction would be wrong. Finally, the only keyword argument available is frame_list; it takes a sequence of frames. In the example, the first two frames are skipped.

Built-In Operations

The operation list is a sequence of (key, data) pairs. The key specifies the operation, and the data is passed with the image to the requested function. Here are the built-in functions by key.

dark

dark subtraction; its data is an image

flip

These are simple image reorientations; the data is a short string; possible values are:

y or v: flip about y-axis (vertical)
x or h: flip about x-axis (horizontal)
vh, hv or r180: 180 degree rotation
t or T: transpose
ccw90 or r90: rotate 90 degrees
cw90 or r270: rotate 270 degrees

Note there are possible image shape changes in the last three.

rectangle

restriction to a sub-rectangle of the image; data is a 2x2 array with each row giving the range of rows and columns forming the rectangle
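
The (key, data) mechanism and the built-in operations can be sketched with nested lists. This is one reading of the descriptions above, not the hexrd code: the function names and the OPS table are hypothetical, the dark subtraction does no clipping of negative values, a flip about the vertical axis is taken to reverse the column order, and the rectangle ranges are assumed to be half-open (start inclusive, stop exclusive).

```python
def dark(img, darkimg):
    # Plain subtraction; handling of negative results is left out.
    return [[p - d for p, d in zip(irow, drow)]
            for irow, drow in zip(img, darkimg)]


def flip(img, how):
    if how in ("y", "v"):            # about the vertical axis: reverse columns
        return [row[::-1] for row in img]
    if how in ("x", "h"):            # about the horizontal axis: reverse rows
        return [list(row) for row in img[::-1]]
    if how in ("vh", "hv", "r180"):  # 180 degree rotation
        return [row[::-1] for row in img[::-1]]
    transposed = [list(col) for col in zip(*img)]
    if how in ("t", "T"):
        return transposed
    if how in ("ccw90", "r90"):      # counterclockwise quarter turn
        return transposed[::-1]
    if how in ("cw90", "r270"):      # clockwise quarter turn
        return [row[::-1] for row in transposed]
    raise ValueError(f"unknown flip: {how}")


def rectangle(img, rect):
    (r0, r1), (c0, c1) = rect        # assumed half-open ranges
    return [row[c0:c1] for row in img[r0:r1]]


OPS = {"dark": dark, "flip": flip, "rectangle": rectangle}


def apply_ops(img, oplist):
    """Apply the (key, data) pairs to one frame, in order."""
    for key, data in oplist:
        img = OPS[key](img, data)
    return img


raw = [[5, 7, 9], [6, 8, 10]]
darkimg = [[1, 2, 3], [1, 2, 3]]
dark_then_flip = apply_ops(raw, [("dark", darkimg), ("flip", "v")])
flip_then_dark = apply_ops(raw, [("flip", "v"), ("dark", darkimg)])
print(dark_then_flip)
print(flip_then_dark)
print(apply_ops(raw, [("flip", "ccw90")]))
print(apply_ops(raw, [("rectangle", [[0, 1], [1, 3]])]))
```

The first two results differ, which shows why the dark image must be subtracted before the flip when it was built from the raw frames. The ccw90 result has shape 3 x 2, an instance of the shape change noted for the last three flip options.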

Methods

In addition to the usual imageseries methods, there are:

@classmethod
def addop(cls, key, func):
    """Add operation to processing options

    *key* - string to use to specify this op
    *func* - function to call for this op: f(img, data)
    """
@property
def oplist(self):
    """list of operations to apply"""

Omega Module

This module has two classes. The OmegaImageSeries is used for the analysis. It is basically an imageseries with omega metadata (an n x 2 numpy array of rotation angle ranges for each frame) and methods for associating the frames with the omega angles. The OmegaWedges is used for setting up the omega metadata. During a scan, the specimen is rotated through an angular range while frames are being written. We call a continuous angular range a wedge. We commonly use a single wedge of 360 degrees or 180 degrees, but sometimes there are multiple wedges, e.g. if there is some fixture in the way.

Examples

We start with a couple of examples. In the first example, we have 3 files, each with 240 frames, going through 180 degrees in quarter degree increments. The omega array is saved into a numpy file.

nf = 3*240
omw = OmegaWedges(nf)
omw.addwedge(0, nf*0.25, nf)
omw.save_omegas('ti7-145-147-omegas')

In the second example, there are four wedges, each with 240 frames. The wedges go through angular ranges of 0 to 60, 120 to 180, 180 to 240, and 300 to 360. The omega array is then added to the imageseries metadata.

nsteps = 240
totalframes = 4*nsteps
omwedges = omega.OmegaWedges(totalframes)
omwedges.addwedge(0, 60, nsteps)
omwedges.addwedge(120, 180, nsteps)
omwedges.addwedge(180, 240, nsteps)
omwedges.addwedge(300, 360, nsteps)

ims.metadata['omega'] = omwedges.omegas
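
For reference, the n x 2 omega array that these wedges describe can be sketched in plain Python. This is an illustration of the idea, not the OmegaWedges implementation; wedge_omegas is a hypothetical helper, and it assumes each wedge is divided into equal steps with one (start, stop) pair per frame.

```python
def wedge_omegas(ostart, ostop, nsteps):
    """Return one [start, stop] omega pair per frame for a single wedge."""
    step = (ostop - ostart) / nsteps
    return [[ostart + i * step, ostart + (i + 1) * step]
            for i in range(nsteps)]


# First example: one wedge, 720 frames over 180 degrees.
single = wedge_omegas(0, 180, 720)
print(len(single), single[0], single[-1])

# Two of the wedges from the second example, concatenated in order.
wedges = [(0, 60, 240), (120, 180, 240)]
omegas = [pair for w in wedges for pair in wedge_omegas(*w)]
print(len(omegas), omegas[239], omegas[240])
```

The jump between rows 239 and 240 is the gap between wedges: consecutive frames in the file need not be contiguous in omega.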

OmegaWedges Class

__init__(self, nframes): instantiate with the number of frames.
omegas: (property) n x 2 array of omega values
nwedges: (property) number of wedges
addwedge(self, ostart, ostop, nsteps, loc=None): add a new wedge to the wedge list; takes the starting omega, the end omega, and the number of steps; the keyword argument loc is where to insert the wedge (at the end by default)
delwedge(self, i): delete wedge i
wframes: (property) number of steps in each wedge
save_omegas(self, fname): save the omega array to a numpy file (for use with YAML-based formats)