follow up on various versions of frealign

I've done some benchmarking of the various pre-compiled frealign versions (v8) for the search example using the 200 particle PDH data set. frealign_v8.exe and frealign_v8_fftw.exe give essentially the same results, with some minor numerical differences (seen in the final values for things like FSPR, FSC, CC, SIG C and the ERFC). On my Ubuntu x86_64 machine, the fftw version runs about 8% faster. I will do some real stats (multiple runs) on this to see whether it is a real difference.
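For the multiple-run statistics mentioned above, here is a minimal sketch of a timing harness. It assumes each benchmark can be launched as a single command (for example, the script that starts one run of the search example); the function names and the placeholder command are illustrative and not part of Frealign.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return (mean, sample standard deviation) of a list of timings."""
    mean = statistics.mean(timings)
    spread = statistics.stdev(timings) if len(timings) > 1 else 0.0
    return mean, spread


def relative_speedup(baseline, candidate):
    """Fraction by which the candidate's mean time is below the baseline's."""
    base_mean = statistics.mean(baseline)
    return (base_mean - statistics.mean(candidate)) / base_mean


if __name__ == "__main__":
    # Placeholder command (assumption): replace with the command that
    # launches one complete run of the binary being timed.
    demo = [sys.executable, "-c", "pass"]
    mean, spread = summarize(time_command(demo, runs=3))
    print(f"mean {mean:.3f} s, stdev {spread:.3f} s")
```

If the speedup reported by `relative_speedup` is small compared with the run-to-run standard deviation, the difference is probably noise rather than a real effect.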

frealign_v8_mp.exe running on one processor seems to give exactly the same results as frealign_v8.exe and takes essentially the same amount of time. The volume generated during this run is fine.

The first time I tried to run using 2 processors, I got a floating exception after the last particle had been oriented. However, when I repeated that run exactly, it finished (implying that there are run-time effects). There are differences in the final table, especially for the FSPR and FSC values, and the generated volume looks horrible (meaningless).

A single run on 4 processors avoided the floating exception, but the values for FSPR and FSC are all NaNs. The other numbers collected in the final table are identical to those from the run using the single processor (and thus frealign_v8.exe). However, the volume is full of NaNs (which doesn't quite make sense given that some of the values in the resolution table are OK). A second run using 4 processors failed with a floating exception after the resolution table had been completed. That table was identical to the one from the single-processor run EXCEPT for the FSPR and FSC values. No final volume was generated (and the pair of volumes for the resolution test look bad), so I assume that the exception happened at that point...
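A quick way to check a volume for NaNs without opening it in a viewer is to count the non-finite voxels. The sketch below assumes the voxel data have already been read as raw 32-bit floats with any file header skipped; the header size and byte order depend on the image format in use, which is not specified here, so those details are left to the caller. The function names are illustrative.

```python
import math
import struct


def count_nonfinite(values):
    """Count NaN and infinite entries in an iterable of floats."""
    return sum(1 for v in values if not math.isfinite(v))


def floats_from_bytes(raw, little_endian=True):
    """Decode a bytes object of packed 32-bit floats into a tuple.

    Assumption: `raw` holds voxel data only (header already stripped).
    Trailing bytes that do not fill a whole float are ignored.
    """
    count = len(raw) // 4
    fmt = ("<" if little_endian else ">") + f"{count}f"
    return struct.unpack(fmt, raw[: count * 4])
```

A count of zero does not prove the volume is meaningful, but a non-zero count flags the failure described above immediately.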

The results with frealign_v8_mpfftw.exe are even stranger. When run on a single processor, the results appear identical to those from frealign_v8_fftw.exe (and thus almost identical to frealign_v8.exe). The volume generated using a single processor appears to be fine. The times for the fftw and mpfftw runs are also almost identical, so the 8% speedup noted above may be a real effect. When I run using 2 processors, there are minor differences in the FSPR and FSC values in the final table, but the rest of the values are identical. The time is a bit longer (~2%). When I tried with 3 processors, the program simply stopped with a glibc "corrupted double-linked list" error and an "abort" message after the last particle had been oriented. When I tried with 4 processors, the program hung at that point for tens of minutes until I manually killed it.
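Comparing the final tables by eye gets tedious, so here is a minimal sketch that reports which columns differ between two runs. It assumes the tables have already been parsed into rows of floats, with the column names (e.g. FSPR, FSC, CC) supplied by the caller; parsing the actual output file is not shown, and the function name and tolerance are illustrative.

```python
import math


def differing_columns(table_a, table_b, names, rel_tol=1e-6):
    """Return the column names whose values differ between two tables.

    Each table is a list of rows; each row is a list of floats in the
    same order as `names`. A NaN on either side counts as a difference.
    """
    if len(table_a) != len(table_b):
        raise ValueError("tables have different numbers of rows")
    changed = []
    for index, name in enumerate(names):
        for row_a, row_b in zip(table_a, table_b):
            a, b = row_a[index], row_b[index]
            same = (not math.isnan(a) and not math.isnan(b)
                    and math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-12))
            if not same:
                changed.append(name)
                break  # one mismatch is enough to flag this column
    return changed
```

Setting `rel_tol` loosely separates the minor numerical differences between builds from the gross FSPR and FSC discrepancies seen with multiple processors.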

I have also compiled all these programs locally on my Ubuntu machine. With the locally compiled version of frealign_v8_mpfftw.exe, I get the same double-linked list error and the program fails when using 2 or more processors.

And just to make things even messier, I have compiled all these versions on a different machine (one of the large IU clusters, which runs an x86_64 version of RHEL) and I get different results with the various MP versions.

All this seems to mean that you use any of the MP-enabled versions at your own risk, with the understanding that, since there seem to be huge differences between machines (OS, compilers, etc.), your system may behave properly.

I am sorry the mp versions do not run reliably on your system. I have tested the currently posted version of Frealign using the pre-compiled binaries by running the search example (examples/search/frealign.com). I used the binary frealign_v8.08_mp.exe and assigned different numbers of CPUs (setenv NCPUS X, where X was 2, 3, 4 and 8) on three different systems. These systems run Redhat Enterprise Linux but with somewhat different kernels. One of them is a cluster. At this point I am out of ideas as to what the problem on your setup might be. I would recommend using the non_mp versions of Frealign until I have had a chance to reproduce and fix the problems.
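For anyone who wants to repeat this kind of test over several CPU counts, the following is a minimal sketch that sets NCPUS in the environment, runs a command, and records whether it exited normally, died from a signal (such as a floating exception or abort), or hung past a timeout. The command to run and the timeout are assumptions to be filled in for the local setup; the function name is illustrative.

```python
import os
import signal
import subprocess
import sys


def run_with_ncpus(cmd, ncpus, timeout=None):
    """Run cmd with NCPUS set in the environment and describe how it ended."""
    env = dict(os.environ, NCPUS=str(ncpus))
    try:
        done = subprocess.run(cmd, env=env, timeout=timeout,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so a hung run
        # does not linger.
        return "timeout"
    if done.returncode < 0:
        # On POSIX a negative return code means death by that signal.
        return "signal " + signal.Signals(-done.returncode).name
    return "exit " + str(done.returncode)


if __name__ == "__main__":
    # Placeholder command (assumption): replace with the command that
    # launches the run being tested.
    demo = [sys.executable, "-c", "pass"]
    for n in (2, 3, 4, 8):
        print(n, run_with_ncpus(demo, n, timeout=60))
```

Note that if the binary is launched through a shell script, the signal may be reported as a shell exit status rather than as a negative return code.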

When you recompiled frealign, which Makefiles were you using? How different were the results you obtained with the different mp versions that you ran on your Redhat cluster?

In reply to by niko

Since the posting earlier, I re-tested the version of frealign_v8_mp.exe I compiled on the RHEL system with 4 CPUs. Everything seemed to have run properly but when I looked at the final volume, it was garbage. However, the pair of volumes generated for the resolution measurement seem to be fine (and since the resolution table seems to be OK, that makes sense). I also know that the volume produced by this executable running on a single processor was OK, but I'm no longer confident that the volume I made using 3 processors was actually OK.

That version was compiled using Makefile_linux_amd64_gnu_mp.