Giant terabyte-sized Frealign .par files

I have noticed a recurring error where, on occasion (thankfully rarely), a giant .par file gets written; if it is not detected early enough, it can easily grow to many TB in size. The first time I saw this the file was 4.2 TB; the second time it was 2.4 TB.

The scripts most likely get stuck in a while loop that keeps writing a .par file (my guess is that this happens around line 410 of the mult_refine.com script), and the job, while still running, generates no new output and therefore never reaches the completed "end cycle". The result is a potentially terabyte-sized .par file containing the same parameters written over and over. I don't know the exact source of the problem yet, but according to the Frealign log file the particles had already been refined and the scripts were writing the first batch of final .par files for the given cycle.
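Until the root cause is found, something like the watchdog sketch below could catch runaway files early. It is not part of the Frealign scripts; the scratch path, the 50 GB threshold, and the check interval are just placeholder assumptions:

    #!/bin/bash
    # Periodically check all .par files in the refinement working directory
    # and log a warning once any of them grows past a size threshold.
    # The path, threshold and interval below are examples only.
    SCRATCH=/path/to/frealign/scratch
    LIMIT=$((50 * 1024 * 1024 * 1024))   # 50 GB in bytes
    while true; do
        for f in "$SCRATCH"/*.par; do
            [ -f "$f" ] || continue
            size=$(stat -c %s "$f")
            if [ "$size" -gt "$LIMIT" ]; then
                echo "$(date): WARNING, $f has grown to $size bytes" >> runaway_par.log
            fi
        done
        sleep 600   # check every 10 minutes
    done

Running this from a separate shell avoids touching the refinement scripts themselves.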

I'm attaching the frealign.log file and the first 10k lines of the giant .par file (originally 2.4 TB). You can see that particles 1-1000 were written successfully, and that the script then tries to do the same thing over and over, but without writing the "C" header lines.
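In case it is useful, an excerpt like that can be taken and checked without reading the whole multi-terabyte file; the file name below is only a placeholder:

    # Grab the first 10k lines without scanning the full multi-TB file
    head -n 10000 huge_cycle.par > huge_cycle_first10k.par

    # Count header/comment lines vs. data lines in the excerpt;
    # comment/header lines in Frealign .par files start with "C"
    grep -c  '^C' huge_cycle_first10k.par
    grep -vc '^C' huge_cycle_first10k.par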

Has anybody else seen this behaviour, or can anyone suggest possible solutions?

Dmitry