Frealix: beginner's documentation
This documentation is meant to help you get started with frealix.
First, you'll need to make sure you have installed frealix or that someone has done this for you.
To check that frealix was installed correctly, type "frealix" at the terminal prompt.
$ ./frealix
(assuming you copied the executable to the current directory)
You should see something like this:
** Welcome to Frealix **
Version: 0.8
SVN revision: 901
Build date: Aug 09 2011
Debug symbols?: No
Interactive mode?: Yes
User name (home directory): joe (/home/joe)
Working directory: /data/joe
Command line: ./frealix
Host name: kosice.rose2.brandeis.edu
Host, machine type: x86_64-linux, x86_64
OpenMP threading available?: No
Date & time: 2011-08-16 18:21:59
Please give FREALIX_MODE:
Frealix asks you to specify FREALIX_MODE, the mode you want Frealix to run in. For example, if you want to bin (decimate) an image, you could give "BIN" here. Most of the time, you will actually be using mode FILAMENT_REFINE, though you will probably never have to type this yourself.
Now, press Ctrl-C (on *nix systems) to kill frealix. You should see something like this:
**info(SIGNAL_HANDLER): caught interrupt signal (SIGINT). This may be because you used Ctrl-C. Will terminate.
Interrupt signal received
Image            PC                Routine            Line     Source
frealix          000000000073388A  Unknown            Unknown  Unknown
frealix          0000000000732405  Unknown            Unknown  Unknown
frealix          0000000000668A16  Unknown            Unknown  Unknown
frealix          000000000061D74B  Unknown            Unknown  Unknown
frealix          000000000046FDE5  runtime_parameter  182      runtime_parameters.f90
frealix          000000000046FCA1  runtime_parameter  269      runtime_parameters.f90
frealix          000000000061326C  Unknown            Unknown  Unknown
libpthread.so.0  0000003F6500EB70  Unknown            Unknown  Unknown
libpthread.so.0  0000003F6500D93E  Unknown            Unknown  Unknown
frealix          0000000000689091  Unknown            Unknown  Unknown
frealix          0000000000687F82  Unknown            Unknown  Unknown
frealix          0000000000641287  Unknown            Unknown  Unknown
frealix          00000000004C11D5  ui_mp_get_frealix  28       ui.f90
frealix          0000000000611820  MAIN__             15       frealix.f90
frealix          000000000042ED7C  Unknown            Unknown  Unknown
libc.so.6        0000003F6441D994  Unknown            Unknown  Unknown
frealix          000000000042EC79  Unknown            Unknown  Unknown
Termination date & time : 2011-08-16 18:25:03
2011-08-16 18:25:03: FREALIX says sorry (total execution time: 3 minutes and 4 seconds)
After the stack trace (which shows which part of the code was being executed when you killed Frealix), the date & time is printed, and Frealix says sorry! It is worth remembering that Frealix should always exit saying either "sorry" or "goodbye". So if you run "grep sorry" on the output of Frealix and you get a match, chances are that something went wrong and Frealix crashed out.
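The "sorry"/"goodbye" convention lends itself to an automated check. The following is a minimal sketch, assuming you have captured Frealix's output as text (for example by redirecting it to a log file). The function name frealix_exit_status is hypothetical and is not part of Frealix.

```python
def frealix_exit_status(log_text):
    """Classify captured Frealix output by its farewell message.

    Returns "sorry", "goodbye", or "unknown" (neither word was found,
    e.g. because the process was killed before it could print anything).
    """
    # Scan from the end, since the farewell is expected near the last line.
    for line in reversed(log_text.splitlines()):
        lowered = line.lower()
        if "sorry" in lowered:
            return "sorry"
        if "goodbye" in lowered:
            return "goodbye"
    return "unknown"


sample = ("Termination date & time : 2011-08-16 18:25:03\n"
          "2011-08-16 18:25:03: FREALIX says sorry "
          "(total execution time: 3 minutes and 4 seconds)\n")
print(frealix_exit_status(sample))
```

Treating "unknown" as a failure is probably the safest choice when checking many logs in a batch.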
A simple use of Frealix: binning volumes
As an example to illustrate the ways Frealix can be run, we will pretend that you have a 3D volume called 3dvol_in.mrc with a voxel size of 1 Angstrom, which you would like to "bin" or "decimate" so that it has a final voxel size of 4 Angstroms. There are many software packages out there that let you do this, but if you decide you want to use Frealix to do it, there are essentially three ways you could go about it:
1. Interactively
2. With a shell script
3. With a command (.flx) file as argument
The wrapper script for amyloid fibril projects
This Python script is called flx_wrap.py and is located at /gusr/alr99/cluster/workspace/frealix/flx_wrap.py. It should be all you need to obtain your first reconstructions of amyloid fibrils. It will handle most of the interactions with frealix for you, and will create and manage a directory structure for iterative refinement. I recommend copying it to a fresh directory where the refinement will happen.
Preparing for the refinement & reconstruction
Make sure you have available:
- Your dataset's pixel size in Angstroms.
- All your scanned films in one folder, named consistently (e.g. film_???.mrc)
- Boxer-style coordinate files, one per filament, named consistently. The first number in the filename must match the film number, and the second number is the filament's index within that micrograph, starting at 1 (e.g. ???_001.box, ???_002.box etc.)
- Output from CTFFIND or CTFTILT, named consistently with a number matching the film number (e.g. ctffind_???_output.log)
- A (generous) estimate of the maximum radial width of your fibrils, in Angstroms
- An estimate of the helical parameters for the fibril. The twist does not need to be exact (it will be recalculated on the fly), but the rise (probably 4.8) and symmetry should be.
- A symlink or a copy of the frealix2 executable in the directory where you'll do the refinement. You can symlink with this command:
ln -s /gusr/alr99/cluster/workspace/frealix/build_intel_openmp/frealix2
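Before starting, it can be worth checking that the box files really follow the film/filament numbering described above. This is a minimal sketch only: the regular expression assumes names of the form <film number>_<filament index>, optionally followed by a suffix such as _bin4, and the function names are hypothetical. Adjust the pattern to your own naming scheme.

```python
import re
from collections import defaultdict

# Assumed pattern: <film number>_<filament index>[_suffix].box
BOX_NAME = re.compile(r"^(\d+)_(\d+)(?:_\w+)?\.box$")


def group_box_files(filenames):
    """Map film number -> sorted list of filament indices found for it."""
    by_film = defaultdict(list)
    for name in filenames:
        match = BOX_NAME.match(name)
        if match is None:
            raise ValueError(f"unexpected box file name: {name}")
        film, filament = int(match.group(1)), int(match.group(2))
        by_film[film].append(filament)
    return {film: sorted(indices) for film, indices in by_film.items()}


def missing_filaments(indices):
    """Return the indices absent from the expected run 1..max(indices)."""
    return sorted(set(range(1, max(indices) + 1)) - set(indices))


names = ["012_001.box", "012_003.box", "013_001_bin4.box"]
for film, indices in sorted(group_box_files(names).items()):
    print(film, indices, "missing:", missing_filaments(indices))
```

A gap in the filament indices is not necessarily an error, but it is a cheap hint that a file may have been misnamed or left behind.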
Preparing the base parameter file
The python script will fill in the details for you, but you need to provide it with a basic frealix-formatted parameter file.
Here is an example:
INPUT_BOX_FILES ../scans/????_???_bin4.box
CTFFIND_FILES none
CTFTILT_FILES ../ctftilt/????_ctftilt.log
FILM_FILES ../scans/????_bin4.mrc
FILM_SELECTION 1-7
TWIST_PER_SUBUNIT 179.705 #degrees
RISE_PER_SUBUNIT 2.38 #Angstroms
SYMMETRY_AXIAL 1
SYMMETRY_PERP 1
AMYLOID_BETA T
ASYM_UNIT_MW 2550 #MW for the 26-residue fragment used for synth #4329.9 is the MW for Abeta(1-40)
PIXEL_SIZE 4.6664 #2.3332 #1.1666 #Angstroms
HELIX_MAX_RADIUS 100 #Angstroms
#Optional parameters below
POLARITY_CHECK T
WRITE_MATCHING_PROJECTIONS T
WRITE_FILAMENT_TRACES T
EXTRA_CCF_PER_WP T
NUM_MINIM_ROUNDS 4
SINGLE_FILAMENT_RECONSTRUCTIONS T #output central slices of individual filament reconstructions to disk
#The following two options are for when you have prepared first and last coordinates before and after the first and last actual crossovers
FIRST_LAST_WPS_NOT_XOVERS T
FIRST_LAST_WPS_NOT_IN_3D T
MIN_SCORE_FIL_TO_3D -1.0 #Filaments with score lower than this are not included in the 3D reconstruction. -1.0 means all filaments will be included.
The format for this file is straightforward. Each line specifies a parameter and its value, and comments are preceded by # or !. Blank lines are allowed. Neither the parameter's name nor its value may contain spaces, tabs or other blank characters. The name, the value and any comment can be separated from each other by spaces, tabs or other blank characters.
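To make these format rules concrete, here is a minimal sketch of a reader for such a file. It is not Frealix's own parser, which may be stricter or more lenient; the function name parse_flx_parameters is hypothetical, and all values are kept as plain strings.

```python
def parse_flx_parameters(text):
    """Parse 'NAME value  # comment' lines into a dict of strings."""
    params = {}
    for raw_line in text.splitlines():
        # Drop comments: everything from the first '#' or '!' onwards.
        line = raw_line
        for marker in ("#", "!"):
            line = line.split(marker, 1)[0]
        fields = line.split()  # splits on any run of blanks or tabs
        if not fields:
            continue  # blank or comment-only line
        if len(fields) != 2:
            raise ValueError(f"expected 'NAME value', got: {raw_line!r}")
        name, value = fields
        params[name] = value
    return params


example = """\
TWIST_PER_SUBUNIT 179.705 #degrees
PIXEL_SIZE\t4.6664 #2.3332 #1.1666 #Angstroms

#Optional parameters below
POLARITY_CHECK T ! try both orientations
"""
print(parse_flx_parameters(example))
```

Raising an error on lines with more than two fields is a design choice of this sketch: it catches values that accidentally contain a space.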
In the above example, a few optional parameters are specified towards the end, as indicated by the comment.
- POLARITY_CHECK: the program will try two possible orientations of each filament
- EXTRA_CCF_PER_WP: in addition to the full-filament scoring function maximisation, the program will alternate with 1-crossover-at-a-time, CCF-based alignments. This can help in getting out of local minima. (WP is short for waypoint; a crossover is a waypoint)
- NUM_MINIM_ROUNDS: the number of times to cycle through maximisation and CCF_PER_WP.
Running the wrapper script
The wrapper script can help you if you give it -h or --help:
[alr99@kosice ~]$ flx_wrap.py --help
Usage: flx_wrap.py [options] flxRTPFile
Options:
  -h, --help
      show this help message and exit
  -n NUM_ROUNDS, --num_rounds=NUM_ROUNDS
      number of refinement rounds to run
  -s START_MAX_RES, --start_max_res=START_MAX_RES
      max. resolution (fraction of Nyquist) for refinement in 1st round
  -f FINISH_MAX_RES, --finish_max_res=FINISH_MAX_RES
      max. resolution (fraction of Nyquist) for refinement in final round
  --num_nodes=NUM_NODES
      number of nodes for distributed processing
  -m STARTING_MODEL_3D, --model_3d=STARTING_MODEL_3D
      initial 3D model
  -p STARTING_PARAMETER_FILE, --input_parameters=STARTING_PARAMETER_FILE
      starting parameter file
The script has only one argument, the base parameter file you created above.
It has a few options, most of which are easily understood. If you don't specify a starting 3D model, one will be calculated for you. If you don't specify a starting parameter file (here, "parameters" means the alignment parameters for the filaments' waypoints, not the base parameter file), the results from the previous round will be used or, if no rounds have been run yet, a parameter file will be bootstrapped.
The START_MAX_RES and FINISH_MAX_RES specify the maximum resolution to be considered during refinements at the first and final rounds of refinement respectively. For reasons to do with noise bias as well as minimisation, it is really worth starting by considering only low resolution information, and gradually moving on to higher resolutions. The parameters here can range from 0.0 to 1.0, where 1.0 is the Nyquist frequency. An example of reasonable values for the first few rounds might be 10 rounds, starting at 0.25 and going to 0.75.
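To get a feel for what these fractions mean in Angstroms, recall that the Nyquist limit corresponds to a resolution of twice the pixel size, so a fraction f of Nyquist corresponds to 2 * pixel size / f. The sketch below applies this to the suggested 10 rounds from 0.25 to 0.75. Note the assumption: the intermediate values are spaced linearly here purely for illustration, and how the wrapper script actually steps between START_MAX_RES and FINISH_MAX_RES is not stated in this document. Both function names are hypothetical.

```python
def nyquist_fraction_to_angstroms(fraction, pixel_size):
    """Resolution in Angstroms for a given fraction of the Nyquist frequency."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return 2.0 * pixel_size / fraction


def linear_schedule(start, finish, num_rounds):
    """Evenly spaced fractions, one per round (linear spacing is assumed)."""
    if num_rounds == 1:
        return [start]
    step = (finish - start) / (num_rounds - 1)
    return [start + i * step for i in range(num_rounds)]


# Pixel size taken from the example base parameter file (bin4 data).
for round_index, fraction in enumerate(linear_schedule(0.25, 0.75, 10), start=1):
    resolution = nyquist_fraction_to_angstroms(fraction, pixel_size=4.6664)
    print(f"round {round_index:2d}: {fraction:.3f} of Nyquist = {resolution:.1f} A")
```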
So, an example of an invocation of the wrapper script might be:
[alr99@delphi 100504_berlin_flx2]$ ./flx_wrap.py frealix_parameters_bin4.flx -n 1 -s 0.25 -f 0.25 -m orig_vol_lp_bin4.mrc
This would run 1 round of refinement with a maximum resolution of 0.25 (as a fraction of Nyquist), and would force frealix to use the file orig_vol_lp_bin4.mrc as the starting 3D model.
About threading
Using option -t of the wrapper script, you can request that each node uses a number of threads in its computation. However, please note:
- Threading only becomes efficient when the pixel size is small enough, and therefore the volumes/images large enough.
- Even in the best circumstances, multiplying the number of threads by n will not lead to an n-fold speed-up
- If you have more filaments than cores available for processing (the Grigorieff/Nicastro cluster has 376 cores), then it should always be better to leave -t at 1.
- The 3D reconstruction part of frealix is not threaded, so definitely use -t 1 if only doing a reconstruction (no refinement)
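The point about sub-linear speed-up can be illustrated with Amdahl's law, a general model that has not been measured on Frealix: if only a fraction p of the run time benefits from threading, the speed-up with n threads is 1 / ((1 - p) + p / n). The value p = 0.9 below is purely an illustrative assumption.

```python
def amdahl_speedup(parallel_fraction, num_threads):
    """Ideal speed-up predicted by Amdahl's law."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / num_threads)


# p = 0.9 is an assumed value, not a Frealix measurement.
for threads in (1, 2, 4, 8):
    print(threads, round(amdahl_speedup(0.9, threads), 2))
```

Under this model the unthreaded part of the run sets a ceiling on the speed-up, which is the same reasoning behind using -t 1 for reconstruction-only runs.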
Binning / Unbinning during refinement
General strategy
The wrapper script and frealix do not worry about binning; they only care about the pixel size you specify in the parameter file.
So: say you've refined your structure at bin4 for 10 rounds, and you want to move on to bin2, then you have to:
- prepare a new base parameter file
- "unbin" your 3D reconstruction (should be called frealix_round_010/3drec.mrc). You can call the unbinned 3D something like frealix_round_010/3drec_unbin2.mrc.
- "unbin" the final alignment parameters (should be called frealix_round_010/waypoints.par). Maybe call this frealix_round_010/waypoints_unbin2.par
- call the wrapper script with appropriate options, so it knows to start with the unbinned 3d/alignment rather than the default results from round_010. For example:
[alr99@delphi 100504_berlin_flx2]$ ./flx_wrap.py frealix_parameters_bin2.flx -n 10 -s 0.45 -f 0.99 -m frealix_round_010/3drec_unbin2.mrc -p frealix_round_010/waypoints_unbin2.par
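If you move between binning levels more than once, it may help to build this command line programmatically. The following is a minimal sketch that only uses the options shown in the wrapper's help output. It assumes the round directories are named frealix_round_NNN with three zero-padded digits and that the unbinned files follow the naming suggested above; it does not create those files. The function name is hypothetical.

```python
def next_stage_command(base_parameters, previous_round, unbin_factor,
                       num_rounds, start_max_res, finish_max_res):
    """Build the flx_wrap.py argument list for refinement at finer sampling."""
    # Assumed directory naming, inferred from the frealix_round_010 example.
    round_dir = f"frealix_round_{previous_round:03d}"
    return [
        "./flx_wrap.py", base_parameters,
        "-n", str(num_rounds),
        "-s", str(start_max_res),
        "-f", str(finish_max_res),
        "-m", f"{round_dir}/3drec_unbin{unbin_factor}.mrc",
        "-p", f"{round_dir}/waypoints_unbin{unbin_factor}.par",
    ]


command = next_stage_command("frealix_parameters_bin2.flx", 10, 2, 10, 0.45, 0.99)
print(" ".join(command))
```

Remember that unbinning by 2 halves the pixel size, so the new base parameter file needs the corresponding PIXEL_SIZE (for example 2.3332 instead of 4.6664).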
Using frealix to do binning and unbinning
Frealix(2) provides an easy way to bin or unbin images, volumes, boxer files or alignment parameter files. Just call the program "frealix2", and give the following answers:
- for the first question (FREALIX_MODE), give "BIN"
- then give the input file (with extension),
- the output file (with extension),
- the (un)binning factor (say, 2),
- whether you want to unbin: T will unbin, F will bin.
- For the last question (shift correction), you should say F when binning micrographs, T when binning 3D volumes, and F when unbinning anything.
Frealix2 will treat the input file as an image if it has extension .mrc, .img, .hed or .spi. If the file has extension .box, it will treat it as a "database" (i.e. set of coordinates) from EMAN1's boxer program. Otherwise, it will try to bin/unbin the file as an alignment parameter file.
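The extension rule above can be summarised in a few lines. This is a sketch of the rule as described, not Frealix's own code; in particular, comparing extensions case-insensitively is an assumption, and the function name is hypothetical.

```python
import os

IMAGE_EXTENSIONS = {".mrc", ".img", ".hed", ".spi"}


def classify_bin_input(filename):
    """Return how BIN mode is described as treating a file, by extension."""
    # Lower-casing is an assumption; Frealix may be case-sensitive.
    extension = os.path.splitext(filename)[1].lower()
    if extension in IMAGE_EXTENSIONS:
        return "image"
    if extension == ".box":
        return "boxer coordinates"
    # Anything else falls through to the alignment-parameter case.
    return "alignment parameters"


for name in ("3drec.mrc", "012_001.box", "waypoints.par"):
    print(name, "->", classify_bin_input(name))
```

The fall-through case is worth keeping in mind: a mistyped image extension would be handled as an alignment parameter file rather than rejected.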
To run this binning/unbinning as a script, you can call frealix2 like this:
frealix2 <<EOF
BIN
input.mrc
output.spi
${binning_factor}
F
F
EOF
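The same answers can be generated from Python, which makes it harder to get the last two T/F answers wrong. This is a minimal sketch: it encodes the shift-correction rule given above (T only when binning a 3D volume) and feeds the answers to frealix2 on standard input. The function names and the ./frealix2 path are assumptions, and whether frealix2 returns a non-zero exit code on failure is not documented here, so checking the output for "goodbye" remains advisable.

```python
import subprocess


def bin_mode_answers(input_file, output_file, factor, unbin, input_is_volume):
    """Return the text to feed to frealix2's prompts in BIN mode."""
    # Shift correction: T only when binning (not unbinning) a 3D volume.
    shift_correction = (not unbin) and input_is_volume
    answers = [
        "BIN",
        input_file,
        output_file,
        str(factor),
        "T" if unbin else "F",
        "T" if shift_correction else "F",
    ]
    return "\n".join(answers) + "\n"


def run_bin_mode(answers, executable="./frealix2"):
    """Pipe the answers into frealix2 (defined only, not called here)."""
    return subprocess.run([executable], input=answers, text=True, check=True)


print(bin_mode_answers("3dvol_in.mrc", "3dvol_bin4.mrc", 4,
                       unbin=False, input_is_volume=True), end="")
```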
Binning/unbinning .box files from EMAN1's boxer
To prepare the boxer files for the very start of the analysis, you may need to bin or unbin them from the scale at which you picked to the scale at which you intend to start processing.
To do this, you can use frealix2 as detailed above.
Alternatively, I have two small C-shell scripts available:
/gusr/alr99/linux/scripts/bin_box.csh
and
/gusr/alr99/linux/scripts/unbin_box.csh
Both take an input filename as first argument and a binning factor as second argument. Both write the resulting boxer-formatted coordinates to STDOUT. So a usage might be:
/gusr/alr99/linux/scripts/bin_box.csh input_bin1.box 4 > input_bin4.box
In future
In future, this binning/unbinning logic could be handled by the wrapper script, but I am reluctant to do this for now: I don't think it will be needed that often, and it doesn't take much work to do by hand. Let me know if it's really a pain.
Frealix conventions
Coordinate system
Frealix uses a 3D right-handed cartesian coordinate system.
Euler angles
Frealix uses the Spider & Frealign definitions for Euler angles. Angles are clockwise-positive, when looking from the positive end of an axis towards the origin.
Helical operators
The helical screw operators in Frealix consist of a twist angle TWIST (usually in degrees at the end-user level) and a rise distance RISE (usually in Angstroms at the end-user level). The polarity of the helix is defined as positive with increasing Z (from -Z towards +Z). The screw operator consists of a shift of +RISE along the Z axis (Z increases) and a rotation of TWIST around the Z axis (this is the Phi angle; it is positive-clockwise when looking from +Z towards the origin).
RISE should always be given a positive value.
If TWIST is positive, the resulting screw operator describes a left-handed helix. A negative TWIST value describes a right-handed helix.
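As a worked illustration of these conventions, the sketch below applies one screw operation to a point. It assumes that the operator acts on the point itself (rather than on the coordinate frame), which this document does not state explicitly. With that assumption, a positive TWIST is a clockwise rotation when viewed from +Z, i.e. the opposite sense to the usual counter-clockwise-positive mathematical convention. The function names are hypothetical.

```python
import math


def apply_screw(point, twist_degrees, rise):
    """Apply one helical screw operation to an (x, y, z) point."""
    x, y, z = point
    angle = math.radians(twist_degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Clockwise rotation about Z (seen from +Z), then a shift of +rise.
    return (x * cos_a + y * sin_a, -x * sin_a + y * cos_a, z + rise)


def handedness(twist_degrees):
    """Hand of the helix implied by the sign of TWIST."""
    if twist_degrees > 0:
        return "left-handed"
    if twist_degrees < 0:
        return "right-handed"
    return "no twist"


# Successive subunit positions for the example parameters used earlier.
position = (10.0, 0.0, 0.0)
for subunit in range(3):
    print(subunit, tuple(round(c, 3) for c in position))
    position = apply_screw(position, twist_degrees=179.705, rise=2.38)
print(handedness(179.705))
```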
Defocus
Increasing Z goes from the electron gun towards the sample.
Underfocus means that the focal plane is further from the gun than the point of interest (particle; filament; waypoint) is.
The focal plane is defined as Z=0. Therefore, since most cryo-EM exposures are underfocus, Z is mostly negative. Making Z less negative (increasing it) means getting closer to focus.