I am attempting to process a (true) helical filament (symmetry H) with Frealign v9.11. The processing is being performed locally on a workstation.
After initial alignment of segments with RELION, I ported the data to Frealign and, using the RELION alignment parameters in mode 0 with the provided frealign_calc_reconstructions script, generated a believable 3.5 Å reconstruction (judged by comparison to known structures of the protomer).
I can then get through a single round of "priming" refinement with parameter_mask set to "0 0 0 0 0", running frealign_run_refine. However, when I then attempt an actual refinement with parameter_mask set to "1 1 1 1 1", the refinement step fails in inconsistent ways. The refinement starts, but individual frealign processes then begin to fail, generally after writing the header but before writing any alignment parameters to their respective temporary parameter files in the scratch directory. Sometimes they chug along for a little while and then die, with no obvious pattern to where they fail. Oddly, the frealign_run_refine script sometimes seems aware that this has happened and writes a "Job XXXXX crashed" note to the frealign.log file; in that case, however, it does not kill the other threads, and frealign_kill will not kill them either. Other times the threads die off without the script noticing, and in that condition frealign_kill does work.
When a "Job XXXXX crashed" note does appear, it will generally point to a specific mult_refine log file in the scratch directory. However, I can find no difference between these files and those from successful threads, other than the lack of any scores.
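For anyone wanting to reproduce the symptom, the dead threads can be listed by looking for temporary parameter files that contain a header but no alignment lines. The following is only a minimal sketch: it assumes the temporary files are plain-text Frealign parameter files whose header/comment lines begin with "C", and the "*.par" glob pattern and the function names are hypothetical, so they would need adjusting to whatever the scratch directory actually contains.

```python
from pathlib import Path


def count_data_lines(par_path):
    """Count non-blank lines that are not "C" comment/header lines."""
    n = 0
    with open(par_path, "r", errors="replace") as fh:
        for line in fh:
            stripped = line.strip()
            # Assumption: header/comment lines start with "C";
            # alignment lines start with a particle number.
            if stripped and not stripped.startswith("C"):
                n += 1
    return n


def find_header_only(scratch_dir, pattern="*.par"):
    """Return files matching `pattern` that hold no alignment lines.

    `pattern` is a hypothetical default; adjust it to the real
    names of the temporary parameter files in the scratch directory.
    """
    return sorted(
        p for p in Path(scratch_dir).glob(pattern)
        if count_data_lines(p) == 0
    )
```

Calling `find_header_only("scratch")` while the job is running (or after it stalls) would list the files that stopped at the header, which could then be matched against the mult_refine logs.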
I can't think of any obvious differences between this job and others we routinely run successfully, other than that the dataset is somewhat larger: ~160,000 segments vs. ~70,000. We are using a 512-pixel box, so the stack is fairly large. We have been using MRC stacks: does this become an issue with large stacks? Would a different format potentially be better? That said, it is a bit confusing that the failures only begin once parameter_mask is set to "1 1 1 1 1"; I would expect input-file read issues to manifest themselves systematically...
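To put a number on "fairly large", and to rule out a truncated stack, the expected file size can be computed from the MRC header and compared with the size on disk. This is a rough sketch under stated assumptions: a little-endian file with the standard 1024-byte MRC header, and 32-bit float pixels (mode 2) for the worked figure. The function names are made up for illustration, and nothing here says whether Frealign itself has a stack-size limit.

```python
import os
import struct

# Bytes per pixel for the common MRC data modes
# (0: int8, 1: int16, 2: float32, 6: uint16).
BYTES_PER_PIXEL = {0: 1, 1: 2, 2: 4, 6: 2}


def expected_stack_bytes(nx, ny, nz, mode, ext_header=0):
    """Size implied by the header: 1024-byte header + extended header + data."""
    return 1024 + ext_header + nx * ny * nz * BYTES_PER_PIXEL[mode]


def check_mrc_stack(path):
    """Return (expected_bytes, actual_bytes) for an MRC stack.

    Assumes a little-endian file; a byte-swapped stack would need ">".
    """
    with open(path, "rb") as fh:
        header = fh.read(1024)
    nx, ny, nz, mode = struct.unpack("<4i", header[:16])
    # Extended-header length is stored at byte offset 92.
    (ext_header,) = struct.unpack("<i", header[92:96])
    expected = expected_stack_bytes(nx, ny, nz, mode, ext_header)
    return expected, os.path.getsize(path)


# Worked figure for this dataset, assuming float32 pixels (mode 2):
print(expected_stack_bytes(512, 512, 160_000, 2))  # 167772161024
```

Under the float32 assumption each 512 x 512 segment is exactly 1 MiB, so ~160,000 segments come to roughly 156 GiB; a mismatch between the two values returned by `check_mrc_stack` would point to a damaged or incomplete stack rather than a refinement problem.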
Apologies for the long post, and thanks very much in advance for any help!