Multiprocessor performance

I'm playing around with the v8 release of Frealign. I have been benchmarking the various versions ("vanilla", fftw, mp, mpfftw) using the test data set you include (pdh). Perhaps I'm doing something wrong, but for the search example I see absolutely no difference when I add more processors to the mp and mpfftw versions. I know that the program thinks it has access to more processors because the output includes lines such as

Parallel processing: NCPUS = 4

But I never see more than a single processor being used, and the time to run through the data set remains the same whether I request 1, 2 or 4 processors. So it looks to me like nothing changes when I request additional processors. Should I expect to see a difference for the search example?

The multiprocessor version of Frealign affects mainly the reconstruction algorithm, not the search or refinement. To parallelize search or refinement, it is far more efficient to split up the job into smaller jobs that each deal with a portion of the data set (see multiprocessor examples in the distribution archive).

In reply to by niko

I run out of memory (32 GB) when I submit jobs to more than 3 cores on each node. That leaves 5 cores idle with this scheme.
Would a parallelised version of the search and refinement circumvent that problem? Or is it unsolvable?

In reply to by adesgeorges

You should not run out of memory with 32 GB. You mentioned that your particle dimension is 350 pixels. If you set FSTAT = F and IBLOW = 1, this means one job requires 1.3 GB. It sounds like you are using IBLOW = 4, which will increase the memory requirements for one job to 11 GB.
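
To make the arithmetic concrete, here is a small sketch that estimates how many jobs fit on one node. It uses only the approximate per-job figures quoted above (1.3 GB and 11 GB), the 32 GB of node memory, and the 8 cores per node implied earlier in the thread (3 in use plus 5 idle). The function name `max_concurrent_jobs` is hypothetical, and the estimate ignores memory used by the operating system or anything else on the node.

```python
def max_concurrent_jobs(node_memory_gb, job_memory_gb, cores_per_node):
    """Estimate how many jobs fit on a node, limited by memory and cores.

    Hypothetical helper for illustration; the inputs are rough estimates.
    """
    if job_memory_gb <= 0:
        raise ValueError("job_memory_gb must be positive")
    # Whole number of jobs that fit in memory.
    by_memory = int(node_memory_gb // job_memory_gb)
    return min(cores_per_node, by_memory)


# About 1.3 GB per job: memory is not the limit, all 8 cores can be used.
print(max_concurrent_jobs(32, 1.3, 8))  # 8

# About 11 GB per job: only 2 jobs fit in 32 GB.
print(max_concurrent_jobs(32, 11, 8))  # 2
```

With the larger footprint the node fills up after only two or three jobs, which is roughly the behaviour described in the question.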

If you parallelize your job for the refinement of parameters, you should also set the first number (RELMAG) on the final CARD 6 to -100. This will skip the calculation of a 3D structure which you will not need when you just refine particle parameters (you run another Frealign job with MODE = 0 for the 3D reconstruction). Please read the 2007 paper on Frealign for more details.

Hi: I am not using qsub to submit mreconstruct.com; I use an alternative submission tool that has no control over the number of cores. Is there a way to specify the number of cores somewhere else? It seems that Frealign reads in NCPUS and copies it into OMP_NUM_THREADS. How/where can I specify NCPUS? Thanks!