author Thomas White <taw@physics.org> 2011-06-16 17:53:28 +0200
committer Thomas White <taw@physics.org> 2012-02-22 15:27:28 +0100
commit 34b21127ea75e6a714a6c04a09f226180b2eb541
tree db5d75b4365cbbda4728f0d512d24abcdd3ece88 /doc/man
parent aa4d05d94275baa8c87acc6343a23d16f1877b24
Move documentation to manpages
Diffstat (limited to 'doc/man')
-rw-r--r-- doc/man/crystfel_geometry.1 | 123
-rw-r--r-- doc/man/indexamajig.1 | 220
-rw-r--r-- doc/man/pattern_sim.1 | 36
-rw-r--r-- doc/man/process_hkl.1 | 51
4 files changed, 430 insertions, 0 deletions
diff --git a/doc/man/crystfel_geometry.1 b/doc/man/crystfel_geometry.1
new file mode 100644
index 00000000..0deb058e
--- /dev/null
+++ b/doc/man/crystfel_geometry.1
@@ -0,0 +1,123 @@
+.\"
+.\" Geometry man page
+.\"
+.\" (c) 2009-2011 Thomas White <taw@physics.org>
+.\"
+.\" Part of CrystFEL - crystallography with a FEL
+.\"
+
+.TH CRYSTFEL\_GEOMETRY 1
+.SH NAME
+CrystFEL detector geometry files
+
+.SH OVERVIEW
+
+The detector geometry is taken from a text file rather than hardcoded into the
+program. Programs which care about the geometry (particularly indexamajig,
+pattern_sim and powder_plot) take an argument "--geometry=<file>"
+(or "-g <file>"), where <file> contains the geometry.
+
+A flexible (and pedantic) representation of the detector has been developed to
+avoid all possible sources of ambiguity. CrystFEL's representation of a
+detector is broken down into one or more "panels", each of which has its own
+camera length, geometry, resolution and so on. Each panel fits into the overall
+image taken from the HDF5 file, defined by minimum and maximum coordinates in
+the "fast scan" and "slow scan" directions. "Fast scan" refers to the direction
+whose coordinate changes most quickly as the bytes in the HDF5 file are moved
+through. The coordinates are specified inclusively, meaning that a minimum of 0
+and a maximum of 9 results in a width of ten pixels. Counting begins from zero.
+All pixels in the image must be assigned to a panel - gaps are not permitted.
+
+In the current version, panels are assumed to be perpendicular to the incident
+beam and to have their edges parallel. Within these limitations, any geometry
+can be constructed.
+
+The job of the geometry file is to establish a relationship between the array
+of pixel values in the HDF5 file, defined in terms only of the "fast scan" and
+"slow scan" directions, and the laboratory coordinate system defined as follows:
+
++z is the beam direction, and points along the beam (i.e. away from the source)
++y points towards the zenith (ceiling).
++x completes the right-handed coordinate system.
+
+Naively speaking, this means that CrystFEL looks at the images from the "into the
+beam" perspective, but please avoid thinking of things in this way. It's much
+better to consider the precise way in which the coordinates are mapped.
+
+The syntax for a simple geometry might include several entries of the following
+form:
+
+; Lines which should be ignored start with a semicolon.
+
+; The name before the slash indicates which panel is referred to. You can use
+; any name as long as it doesn't start with "bad" (see below).
+; The range of pixels in the HDF5 file which correspond to a panel are given:
+panel0/min_fs = 0
+panel0/min_ss = 0
+panel0/max_fs = 193
+panel0/max_ss = 184
+
+; The readout direction (x, y or -). If more than three peaks are found in
+; the same readout region, they are all discarded. This helps to avoid
+; problems due to streaks appearing along the readout direction.
+; If the badrow direction is '-', then the culling described above will not
+; be performed for this panel.
+panel0/badrow_direction = -
+
+; The resolution (in pixels per metre) for this panel
+panel0/res = 9090.91
+
+; The characteristic peak separation in pixels. The peak detection will assume
+; that genuine peaks are separated by at least this amount.
+panel0/peak_sep = 6.0
+
+; You need to specify the peak integration radius, which should be a little
+; larger than the actual radii of the peaks in pixels
+panel0/integr_radius = 2.0
+
+; The camera length (in metres) for this panel
+; You can also specify the HDF path to a scalar floating point value containing
+; the camera length in millimetres.
+panel0/clen = /LCLS/detectorPosition
+
+; For this panel, the fast and slow scan directions correspond to the given
+; directions in the lab coordinate system described above, measured in pixels.
+panel0/fs = +y
+panel0/ss = -x
+
+; The corner of this panel, defined as the first point in the panel to appear in
+; the HDF5 file, is now given a position in the lab coordinate system.
+; Note that "first point in the panel" is a conceptual simplification. We refer
+; to that corner, and to the very corner of the pixel - NOT, for example, to the
+; centre of the first pixel to appear.
+panel0/corner_x = 429.39
+panel0/corner_y = -17.30
+
+; You can suppress indexing for this panel if required, by setting "no_index" to
+; "true" or "1".
+panel0/no_index = 0
+
+; You can also specify bad regions. Peaks with centroid locations within such
+; a region will not be integrated nor indexed. Bad regions are specified in
+; pixel units, but in the lab coordinate system (i.e. "y" points at the ceiling,
+; "z" is the beam direction and "x" completes the right-handed system).
+badregionA/min_x = -20.0
+badregionA/max_x = +20.0
+badregionA/min_y = -100.0
+badregionA/max_y = +100.0
+
+; If you have a bad pixel mask, you can include it in the HDF5 file as an
+; unsigned 16-bit integer array of the same size as the data. You need to
+; give its path within each HDF5 file, and two bitmasks. The pixel is
+; considered good if all of the bits which are set in "mask_good" are set, AND
+; if none of the bits which are set in "mask_bad" are set.
+mask = /processing/hitfinder/masks
+mask_good = 0x27
+mask_bad = 0x00
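The good/bad bitmask rule described in the comment above can be sketched in Python as follows. This is only an illustration of the stated rule, not CrystFEL's actual implementation; the function name `pixel_is_good` is hypothetical.

```python
def pixel_is_good(mask_value: int, mask_good: int, mask_bad: int) -> bool:
    """Apply the rule from the geometry file comment (hypothetical helper).

    A pixel is good if every bit set in mask_good is also set in the
    pixel's mask value, AND no bit set in mask_bad is set in it.
    """
    all_good_bits_set = (mask_value & mask_good) == mask_good
    no_bad_bits_set = (mask_value & mask_bad) == 0
    return all_good_bits_set and no_bad_bits_set
```

With `mask_good = 0x27` and `mask_bad = 0x00` as in the example, a pixel whose mask value is 0x27 is accepted, while one with value 0x07 is rejected because bit 0x20 is missing.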
+
+; Any of the per-panel values can be given without a panel prefix, for example:
+peak_sep = 6.0
+; in which case the value will be used for all *subsequent* panels.
+
+
+See the "examples" folder for some examples (look at the ones ending in .geom).
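As a rough illustration of the file format described in crystfel_geometry.1, the following Python sketch reads "name/key = value" lines, skips semicolon comments, applies unprefixed values to panels that appear later in the file, and computes panel sizes from the inclusive min/max coordinates. The function names and the returned dictionaries are assumptions made for this sketch; CrystFEL's own parser is written in C and may behave differently in detail.

```python
def parse_geometry(text):
    """Parse geometry text into (panels, bad_regions, top_level).

    Hypothetical helper: values are kept as strings, and unprefixed keys
    are copied into each panel that first appears *after* them.
    """
    panels = {}
    bad_regions = {}
    top_level = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or comment
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if "/" in key:
            name, field = key.split("/", 1)
            if name.startswith("bad"):
                bad_regions.setdefault(name, {})[field] = value
            else:
                if name not in panels:
                    # New panel: inherit the defaults seen so far.
                    panels[name] = dict(top_level)
                panels[name][field] = value
        else:
            top_level[key] = value
    return panels, bad_regions, top_level


def panel_size(panel):
    """Width and height in pixels; min and max are both inclusive."""
    width = int(panel["max_fs"]) - int(panel["min_fs"]) + 1
    height = int(panel["max_ss"]) - int(panel["min_ss"]) + 1
    return width, height
```

For the example panel above (min_fs = 0, max_fs = 193, min_ss = 0, max_ss = 184), `panel_size` gives 194 by 185 pixels, in line with the inclusive counting rule.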
diff --git a/doc/man/indexamajig.1 b/doc/man/indexamajig.1
new file mode 100644
index 00000000..fcb1afc4
--- /dev/null
+++ b/doc/man/indexamajig.1
@@ -0,0 +1,220 @@
+.\"
+.\" indexamajig man page
+.\"
+.\" (c) 2009-2011 Thomas White <taw@physics.org>
+.\"
+.\" Part of CrystFEL - crystallography with a FEL
+.\"
+
+.TH INDEXAMAJIG 1
+.SH NAME
+indexamajig \- bulk indexing and data reduction program
+.SH SYNOPSIS
+.PP
+.B indexamajig
+[options]
+
+.SH DESCRIPTION
+
+The "indexamajig" program takes as input a list of diffraction image files,
+currently in HDF5 format. For each image, it attempts to find peaks and then
+index the pattern. If successful, it will measure the intensities of the peaks
+at Bragg locations and produce a list in the form "h k l I", with some extra
+information about the locations of the peaks.
+
+For minimal basic use, you need to provide the list of diffraction patterns,
+the method which will be used to index, a file describing the geometry of the
+detector, a PDB file which contains the unit cell which will be used for the
+indexing, and that you'd like the program to output a list of intensities for
+each successfully indexed pattern. Here is what the minimal use might look like
+on the command line:
+
+indexamajig -i mypatterns.lst -j 10 -g mygeometry.geom --indexing=mosflm,dirax --peaks=hdf5 --cell-reduction=reduce -b myxfel.beam -o test.stream -p mycell.pdb --record=integrated
+
+More typical use includes all the above, but might also include a noise or
+common mode filter (--filter-noise or --filter-cm respectively) if detector
+noise causes problems for the peak detection. The HDF5 files might be in some
+folder a long way from the current directory, so you might want to specify a
+full pathname to be added in front of each filename. You'll probably want to
+run more than one indexing job at a time (-j <n>).
+
+You can include a table of saturation values in the HDF5 file, if you have
+a method for estimating the intensities of saturated peaks. It goes in
+/processing/hitfinder/peakinfo_saturated, and should be an n*3 two dimensional
+array, where the first two columns contain fast scan and slow scan coordinates
+(in that order) and the third contains the value which should belong in a peak
+at the given location. The value will be spread in a small cross centred on
+that location.
+
+See doc/geometry for information about how to create a geometry description
+file.
+
+You can control what information is included in the output stream using
+'--record=<flags>'. Possible flags are:
+
+ pixels Include a list of sums of pixel values within the
+ integration domain, correcting for individual pixel
+ solid angles.
+
+ integrated Include a list of reflection intensities, produced by
+ integrating around predicted peak locations.
+
+ peaks Include peak locations and intensities from the peak
+ search.
+
+ peaksifindexed As 'peaks', but only if the pattern could be indexed.
+
+ peaksifnotindexed As 'peaks', but only if the pattern could NOT be indexed.
+
+So, if you just want the integrated intensities of indexed peaks, use
+"--record=integrated". If you just want to check that the peak detection is
+working, use "--record=peaks". If you want the integrated peaks for the
+indexable patterns, but also want to check the peak detection for the patterns
+which could not be indexed, you might use
+"--record=integrated,peaksifnotindexed" and then use "check-peak-detection" from
+the "scripts" folder to visualise the results of the peak detection.
+
+.SH PEAK DETECTION
+
+You can control the peak detection on the command line. Firstly, you can choose
+the peak detection method using "--peaks=<method>". Currently, two possible
+values for "method" are available. "hdf5" will take the peak locations from the
+HDF5 file. It expects a two dimensional array at /processing/hitfinder/peakinfo
+where size in the first dimension is the number of peaks and the size in the
+second dimension is three. The first two columns contain the x and y
+coordinates (see the "Note about data orientation" in geometry.txt for details),
+the third contains the intensity. However, the intensity will be ignored since
+the pattern will always be re-integrated using the unit cell provided by the
+indexer on the basis of the peaks.
+
+The "zaef" method uses a simple gradient search after Zaefferer (2000). You can
+control the overall threshold and minimum gradient for finding a peak using the
+"--threshold" and "--min-gradient" options. Both of these have units of "ADU"
+(i.e. units of intensity according to the contents of the HDF5 file).
+
+A minimum peak separation can also be provided in the geometry description file
+(see geometry.txt for details). This number serves two purposes. Firstly,
+it is the maximum distance allowed between the peak summit and the foot point
+(where the gradient exceeds the minimum gradient). Secondly, it is the minimum
+distance allowed between one peak and another, before the later peak will be
+rejected "by proximity".
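The second role of the peak separation, rejecting later peaks "by proximity", can be sketched as below. This is a minimal Python illustration under stated assumptions: peaks are (x, y) tuples in pixels listed in the order they were found, distance is Euclidean, and a later peak is compared only against peaks already accepted. The function name is hypothetical and the details of CrystFEL's real peak search may differ.

```python
import math


def reject_by_proximity(peaks, peak_sep):
    """Keep a peak only if it is at least peak_sep pixels from every
    previously accepted peak (hypothetical sketch)."""
    accepted = []
    for x, y in peaks:
        too_close = any(
            math.hypot(x - ax, y - ay) < peak_sep for ax, ay in accepted
        )
        if not too_close:
            accepted.append((x, y))
    return accepted
```

With `peak_sep = 6.0`, a candidate at (12, 11) found after a peak at (10, 10) would be dropped, since the two are only about 2.2 pixels apart.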
+
+You can suppress peak detection altogether for a panel in the geometry file by
+specifying the "no_index" value for the panel as non-zero.
+
+
+.SH INDEXING METHODS
+
+You can choose between a variety of indexing methods. You can choose more than
+one method, in which case each method will be tried in turn until the later cell
+reduction step says that the cell is a "hit". Choose from:
+
+ dirax : invoke DirAx
+ mosflm : invoke MOSFLM (DPS)
+
+The available choices depend on what you have installed. For "dirax" and
+"mosflm", you need to have the dirax or ipmosflm binaries in your PATH.
+
+Example: --indexing=dirax,mosflm
+
+.SH CELL REDUCTION
+
+You can choose from various options for cell reduction with the
+"--cell-reduction=" option. The choices are "none", "reduce" and "compare".
+This choice is important because all autoindexing methods produce an "ab
+initio" estimate of the unit cell (nine parameters), rather than just finding
+the orientation of the target cell (three parameters). It's clear that this is
+not optimal, and will hopefully be fixed in future versions.
+
+With "none", the raw cell from the autoindexer will be used. The cell probably
+won't match the target cell, but it'll still get used. Use this option to test
+whether the patterns are basically "indexable" or not, or if you don't know the
+cell parameters. In the latter case, you'll need to plot some kind of histogram
+of the resulting parameters from the output stream to see which are the most
+popular. If you're lucky, this will reveal the true unit cell.
+
+With "reduce", linear combinations of the raw cell will be checked against the
+target cell. If at least one candidate is found for each axis of the target
+cell, the angles will be checked for correspondence. If a match is found, this
+cell will be used for further processing. This option should generate the most
+matches, but might produce spurious results in many cases. The predicted peaks
+are always checked to verify that at least 10% of the predicted peaks are close
+to peaks located by the peak search. If not, the next candidate unit cell is
+tried until there are no more options.
+
+The "compare" method is like "reduce", but linear combinations are not taken.
+That means that the cell must either match or match after a simple permutation
+of the axes. This is useful when the target cell is subject to reticular
+twinning, such as if one cell axis length is close to twice another. With
+"reduce", there is a possibility that the axes might be confused in this
+situation. This happens for lysozyme (1VDS), so watch out.
+
+The tolerance for matching with "reduce" and "compare" is hardcoded as 5% in
+the reciprocal axis lengths and 1.5 degrees in the (reciprocal) angles. Cells
+from these reduction routines are further constrained to be right-handed. The
+unmatched raw cell might be left-handed: CrystFEL doesn't check this for you.
+Always using a right-handed cell means that the Bijvoet pairs can be told
+apart.
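The matching tolerances quoted above (5% in the reciprocal axis lengths, 1.5 degrees in the reciprocal angles) can be expressed as a simple check. The Python sketch below assumes that the 5% is taken relative to the target value and that each cell is supplied as a tuple (a*, b*, c*, alpha*, beta*, gamma*) with angles in degrees; both the layout and the function names are assumptions for illustration, not CrystFEL's own code.

```python
def axis_length_matches(candidate, target, tolerance=0.05):
    """True if a reciprocal axis length is within 5% of the target."""
    return abs(candidate - target) <= tolerance * target


def angle_matches(candidate_deg, target_deg, tolerance_deg=1.5):
    """True if a reciprocal angle is within 1.5 degrees of the target."""
    return abs(candidate_deg - target_deg) <= tolerance_deg


def cell_matches(candidate, target):
    """Compare two cells given as (a*, b*, c*, alpha*, beta*, gamma*)."""
    lengths_ok = all(
        axis_length_matches(c, t) for c, t in zip(candidate[:3], target[:3])
    )
    angles_ok = all(
        angle_matches(c, t) for c, t in zip(candidate[3:], target[3:])
    )
    return lengths_ok and angles_ok
```

This covers only the final comparison. The search over linear combinations ("reduce") or axis permutations ("compare") and the right-handedness constraint are not shown.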
+
+If the unit cell is centered (i.e. if the space group begins with I, R, C, A or
+F), you should be careful when using "compare" for the cell reduction, since
+(for example) DirAx will always find a primitive unit cell, and this cell must
+be converted to the non-primitive conventional cell from the PDB.
+
+
+.SH TUNING CPU AFFINITIES FOR NUMA HARDWARE
+
+If you are running indexamajig on a NUMA (non-uniform memory architecture)
+machine, a performance gain can sometimes be made by preventing the kernel from
+allowing a process or thread to run on a CPU which is distant from the one on
+which it started. Distance, in this context, might mean that the CPU is able to
+access all the memory visible to the original CPU, but perhaps only relatively
+slowly via a cable link. In many cases a group of CPUs will have direct access
+to a certain region of memory, and so the process may be scheduled on any CPU in
+that group without any penalty. However, scheduling the process to any CPU
+outside the group may be slow. When running under Linux, indexamajig is able to
+avoid such sub-optimal process scheduling by setting CPU affinities for its
+threads. The CPU affinities are also inherited by subprocesses (e.g. MOSFLM or
+DirAx).
+
+To do this usefully, you need to give indexamajig some information about your
+hardware's architecture. Specify the size of the CPU groups using
+"--cpugroup=<n>". You also need to specify the overall number of CPUs using
+"--cpus=<n>", so that the program knows when to 'wrap around'. Using
+"--cpuoffset=<n>", where "n" is a group number (not a CPU number), allows you to
+manually skip a few CPUs, which may be useful if you do not want to use all the
+available CPUs but want to avoid running all your jobs on the same ones.
+
+Note that specifying the above options is NOT the same thing as giving the
+number of analyses to run in parallel (the 'number of threads'), which is done
+with "-j <n>". The CPU tuning options provide information to indexamajig about
+how to set the CPU affinities for its threads, but they do not specify how many
+threads to use.
+
+Example: 72-core Altix UV 100 machine at the author's institution
+
+This machine consists of six blades, each containing two 6-core CPUs and some
+local memory. Any CPU on any blade can access the memory on any other blade,
+but the access will be slow compared to accessing memory on the same blade.
+When running two instances of indexamajig, a sensible choice of parameters might
+be:
+
+1: --cpus=72 --cpugroup=12 --cpuoffset=0 -j 36
+2: --cpus=72 --cpugroup=12 --cpuoffset=3 -j 36
+
+This would dedicate half of the CPUs to one instance, and the other half to the
+other.
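One plausible way to turn these three options into a CPU set per thread is sketched below in Python. It assumes that consecutive threads fill one group before moving to the next, that the offset is counted in groups, and that group numbers wrap around at the total number of groups. The function name and the exact mapping are assumptions for illustration; the man page does not spell out the formula indexamajig uses.

```python
def cpus_for_thread(thread_id, n_cpus, group_size, group_offset):
    """Return the CPU numbers a thread may run on (hypothetical mapping).

    thread_id    -- zero-based index of the worker thread
    n_cpus       -- total number of CPUs (as given with --cpus)
    group_size   -- CPUs per group (as given with --cpugroup)
    group_offset -- number of groups to skip (as given with --cpuoffset)
    """
    n_groups = n_cpus // group_size
    group = (thread_id // group_size + group_offset) % n_groups
    first_cpu = group * group_size
    return list(range(first_cpu, first_cpu + group_size))
```

Under this mapping, on a 72-CPU machine with groups of 12, threads 0 to 35 with no offset are confined to CPUs 0 to 35, while the same threads with an offset of three groups are confined to CPUs 36 to 71.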
+
+
+.SH A NOTE ABOUT UNIT CELL SETTINGS
+
+CrystFEL's core symmetry module only knows about one setting for each unit cell.
+You must use the same setting. That means that the unique axis (for cells which
+have one) must be "c".
+
+
+.SH KNOWN BUGS
+
+Don't run more than one indexamajig job simultaneously in the same working
+directory - they'll overwrite each other's DirAx or MOSFLM files, causing subtle
+problems which can't easily be detected.
diff --git a/doc/man/pattern_sim.1 b/doc/man/pattern_sim.1
new file mode 100644
index 00000000..d53aeae5
--- /dev/null
+++ b/doc/man/pattern_sim.1
@@ -0,0 +1,36 @@
+.\"
+.\" pattern_sim man page
+.\"
+.\" (c) 2009-2011 Thomas White <taw@physics.org>
+.\"
+.\" Part of CrystFEL - crystallography with a FEL
+.\"
+
+.TH PATTERN\_SIM 1
+.SH NAME
+pattern\_sim \- Simulation of nanocrystallographic diffraction patterns
+.SH SYNOPSIS
+.PP
+.B pattern\_sim
+[options]
+
+.SH DESCRIPTION
+
+pattern_sim does not know about symmetry, so your input reflection list
+(given with "-i") must be expanded. You can do this with:
+
+$ get_hkl -i myfile.hkl -o output.hkl -y mypointgroup -e 1
+
+get_hkl does not currently understand symmetry, which means you'll have to
+expand any molecular model (the PDB) out to P1 to get the correct results. You
+can achieve that, for example, by loading it into Mercury, turning on "Packing"
+and re-saving. Alternatively, you can do this using CCP4 with a command like:
+
+$ echo symgen P63 | pdbset xyzin model.pdb xyzout model-P1.pdb
+
+While on this subject, you might also want to include hydrogens in the model
+using something like:
+$ echo HYDROGENS APPEND | hgen xyzin model.pdb xyzout model-with-H.pdb
+
+Please be sure to read the "Note about Unit Cell Settings" in the documentation
+for indexamajig.
diff --git a/doc/man/process_hkl.1 b/doc/man/process_hkl.1
new file mode 100644
index 00000000..6c626e31
--- /dev/null
+++ b/doc/man/process_hkl.1
@@ -0,0 +1,51 @@
+.\"
+.\" process_hkl man page
+.\"
+.\" (c) 2009-2011 Thomas White <taw@physics.org>
+.\"
+.\" Part of CrystFEL - crystallography with a FEL
+.\"
+
+.TH PROCESS\_HKL 1
+.SH NAME
+process\_hkl \- Monte Carlo merging program
+.SH SYNOPSIS
+.PP
+.B process\_hkl
+-i mypatterns.stream -o mydata.hkl -y mypointgroup [options]
+
+.SH DESCRIPTION
+
+This program takes as input the data stream from "indexamajig". It merges the
+many individual intensities together to form a single list of reflection
+intensities which are useful for crystallography.
+
+Typical usage is of the form:
+
+$ process_hkl -i mypatterns.stream -o mydata.hkl -y mypointgroup
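The core idea of merging many individual measurements into one list can be illustrated with a plain per-reflection average. The Python sketch below takes observations as (h, k, l, intensity) tuples, which is an assumed in-memory form rather than the stream file format, and deliberately ignores symmetry: a real merge would first map symmetry-equivalent reflections onto a common index using the point group given with "-y". The function name is hypothetical.

```python
from collections import defaultdict


def merge_intensities(observations):
    """Average all intensities recorded for each (h, k, l).

    Simplified sketch: no symmetry handling, scaling or error estimates.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for h, k, l, intensity in observations:
        sums[(h, k, l)] += intensity
        counts[(h, k, l)] += 1
    return {hkl: sums[hkl] / counts[hkl] for hkl in sums}
```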
+
+.SH CHOICE OF POINT GROUP FOR MERGING
+
+One of the main features of serial crystallography is that the orientations of
+individual crystals are random. That means that the orientation of each
+crystal must be determined independently, with no information about its
+relationship to the orientation of crystals in other patterns (as would be the
+case for a rotation series of patterns).
+
+Some Laue classes are merohedral. This means that the orientation will have an
+ambiguity, but this time more serious. The two (or more) possible
+orientations could be called "twins", but the mechanism of their formation is
+somewhat different to the conventional use of the term. In these cases, you
+will need to merge according to the point group corresponding to the
+holohedral Laue class.
+
+You can also tell process_hkl the "apparent" symmetry, which is the symmetry as
+far as whatever produced the stream was concerned. In the case of most indexing
+algorithms, this will be the corresponding holohedral point group (not the
+Laue class nor the holohedral Laue class). If you use the "-a" option to give
+this information, process_hkl will try to resolve the remaining orientational
+ambiguities to get from the apparent symmetry to the true symmetry (given with
+"-y"). Currently, it won't do a very good job of it.
+
+The document twin-calculator.pdf contains more detailed information about this
+issue, as well as tables which contain all the required information.