Single or multiple stacks and alignment files?

Hello,

I have about 200 micrographs (~200 GB) and want to run a Frealign refinement on my cluster. Should I group them all into one stack and one alignment file, or is it better to keep them in separate files and submit each one to a separate node?
I guess it doesn't matter much for the alignment part, but it may matter quite a bit for the reconstruction part?

Many thanks in advance.

If I understand you correctly, your micrographs take up 200 GB but your image stack might be smaller? In any case, a big stack should not be a problem. If you look at the DLP data set, you will see that it was a stack of 52 GB. The ~18,000 particles were quite large (880 x 880 pixels) and, therefore, the reconstruction took about 10 hours using the multiprocessor version of Frealign on an 8 CPU workstation. The icosahedral symmetry was another reason why this data set took so long to reconstruct, because each particle was inserted 60 times into the reconstruction. If you have smaller particles and lower symmetry, it will run a lot faster. For example, for 100,000 particles with no symmetry and an image size of 180 x 180 pixels, Frealign will take about 15 minutes on an 8 CPU workstation. I would advise keeping all your particles in one stack if possible.
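As a rough cross-check of the sizes quoted above, the pixel data in a stack can be estimated as particles x width x height x bytes per pixel. The following is only a minimal sketch: the function name is made up for illustration, it assumes square images stored as 4-byte reals, and it ignores file headers.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def stack_size_bytes(n_particles, box_size, bytes_per_pixel=4):
    """Estimate the raw pixel data size of a stack of square images.

    Hypothetical helper: headers are ignored and 4-byte real pixels
    are assumed unless bytes_per_pixel says otherwise.
    """
    return n_particles * box_size * box_size * bytes_per_pixel


# DLP example from the post: ~18,000 particles of 880 x 880 pixels
dlp = stack_size_bytes(18_000, 880)
print(f"DLP stack: {dlp / GIB:.1f} GiB")

# Second example from the post: 100,000 particles of 180 x 180 pixels
small = stack_size_bytes(100_000, 180)
print(f"100,000 x 180 x 180 stack: {small / GIB:.1f} GiB")

# Symmetry multiplies the number of insertions into the reconstruction:
# 60 per particle for icosahedral symmetry, 1 per particle for none.
print("DLP insertions:", 18_000 * 60)
print("No-symmetry insertions:", 100_000 * 1)
```

With these assumptions the DLP estimate comes out at roughly 52 GiB, in line with the stack size mentioned above.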

In reply to by niko

The image stacks together are about 200 GB in size in SPIDER format (image size 350 x 350, no symmetry).
Why do you advise keeping them all in one stack? To make things easier, or to avoid other issues?

Thanks!

In reply to by adesgeorges

Yes, basically to make it easier to manage the files. But you can also divide the stack into smaller stacks and treat them as separate data sets. Frealign will still be able to read in all the stacks and merge them into one 3D reconstruction.
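If you do divide a stack, one simple way to decide which particles go into which sub-stack is to cut the particle list into contiguous, nearly equal ranges. This is only a sketch of that bookkeeping: `split_ranges` is a hypothetical helper, not part of Frealign, and the 1-based inclusive numbering is an assumption made for readability.

```python
def split_ranges(n_particles, n_chunks):
    """Split particles 1..n_particles into contiguous (first, last) ranges.

    Hypothetical helper for planning sub-stacks. Ranges are 1-based and
    inclusive, and their sizes differ by at most one particle.
    """
    if n_particles < 1 or n_chunks < 1:
        raise ValueError("n_particles and n_chunks must be positive")
    n_chunks = min(n_chunks, n_particles)  # never create empty ranges
    base, extra = divmod(n_particles, n_chunks)
    ranges = []
    first = 1
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more
        last = first + size - 1
        ranges.append((first, last))
        first = last + 1
    return ranges


# Example: plan 8 sub-stacks for 100,000 particles
for first, last in split_ranges(100_000, 8):
    print(f"particles {first} to {last}")
```

The same ranges can then be used to cut the alignment parameters so that each sub-stack keeps its matching entries.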

In reply to by niko

I have a problem that may be related to this post. I am trying to run a refinement and reconstruction with a large stack (10,417 particles, 910 x 910 pixels, 33 GB) that was created with EM2EM. I can display it in Chimera and it appears to be fine, but when I run Frealign it suddenly stops while reading the stack, without showing any error.

The output looks like this:

Opening MRC/CCP4 file for READ...
File : frealign_image_stack.mrc
NX, NY, NZ: 910 910 10417
MODE : real
Min, max : -6.927997 38.29705
Mean, RMS : 0.1728310E-04 0.000000
TITLE 1: NAME2/5 02/61 1183:5cim rgor_hpa2000fit. // Created by IMAGIC: SPIDER image = pr
TITLE 2: oj_0000001.spi 14-08-20 16:07:27 // // Created by IMAGIC: SPIDER image = freali
TITLE 3: gn_image_stack.spi 14-08-20 19:49:13 //

RES?

I tried both SPIDER and MRC formats for my stack, but both stop in the same way. I am wondering if this may be related to memory problems.

How can I check whether my workstation has enough memory to handle this stack?

Do you have any advice on how to run a job with a stack this large?

Thanks a lot,

ROGELIO

In reply to by rogelio

Hi Rogelio,

I'm pretty sure your problem is not related at all to your input image stack or memory allocation. Rather, it's probably one of your RMAX input parameters that does not make sense. You should have a look at your parameters.

The reason I say this is that the last thing printed out by Frealign is "RES?", right after it opened your input stack. This happens in the subroutine CARDS8AND9, which you can find in src/cards8and9.f. If you look around lines 40 to 47, you'll see that this is where your RMAX parameter is converted, and that there are a few checks that can lead to the program STOPping with the error message "RES?" (lines 45 to 47).

Hope this helps,
Alexis

In reply to by Alexis

Thanks Alexis,

I just had a look at the source code and realized that the smallest RREC I can use is 2.0.
By default I was setting RREC to twice my pixel size, which in this case is a bit below one, hence the error.
I increased RREC and now everything works again. Thanks for the help!

ROGELIO
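For anyone hitting the same "RES?" stop, a small pre-flight check before submitting a job could catch this. The sketch below is not Frealign code: the function name is hypothetical, the 2.0 lower bound is taken only from the report above rather than verified against the source, and the 0.95 pixel size is an illustrative value standing in for "a bit below one".

```python
# Lower bound as reported in this thread (assumption, not checked against
# src/cards8and9.f); resolution limits are assumed to be in Angstrom.
MIN_RESOLUTION_LIMIT = 2.0


def check_resolution_limit(value, name="RREC"):
    """Raise ValueError if a resolution limit is below the assumed bound.

    Hypothetical helper for validating parameters before a run.
    """
    if value < MIN_RESOLUTION_LIMIT:
        raise ValueError(
            f"{name}={value} is below {MIN_RESOLUTION_LIMIT}; "
            "the program may stop with 'RES?'"
        )
    return value


pixel_size = 0.95  # illustrative value only
try:
    check_resolution_limit(2 * pixel_size)  # twice the pixel size, as in the post
except ValueError as err:
    print("Parameter problem:", err)

# Clamping to the assumed bound avoids the stop:
rrec = check_resolution_limit(max(2 * pixel_size, MIN_RESOLUTION_LIMIT))
print("Using RREC =", rrec)
```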