frealign seg faults on specific particles during search

Hello Team Frealign. I am running frealign v9.09 in search mode for a helical reconstruction on my Mac, using the bin_OSX executables. The job starts well (the new scripts are great for streamlining everything - thanks), outputting reasonable alignment parameters for many particles, but then suddenly fails at specific particles with a score of -100 followed by a seg fault (see the end of a log file below). The failure happens whether I run as a single process or multiple processes, and no matter which particle number I start on (for example, if I start the search with the first particle it fails on particle 125; if I start at particle 125 it fails immediately).

This happens consistently on the same images. The images look fine, appear to have normal statistics (similar to the particles immediately preceding or following them, which are processed fine), and all have been normalized to an average of 0 and an s.d. of 1. I thought it might be a formatting problem, so I converted my spider stack into an mrc stack or an imagic stack, and got the same failure, but now on a different subset of particles (which I think indicates it's not something wrong with the particle images themselves). The problematic particles are dispersed throughout the data set, and don't seem to be unique in any way (i.e. they aren't at the beginning or end of a filament, don't have similar defocus values, orientations, etc.).

Oh, one last thing I just tried - if I run in refinement mode things seem to work fine.

Any insights would be much appreciated.

Thanks,
Justin

end of a log file from failed job:
Best score for particle 121 at Rmin/Rmax 500.0 20.0: -4.698
values of PSI,THETA,PHI at FMATCH extraction 95.694 279.000 314.064
Best score for particle 122 at Rmin/Rmax 500.0 20.0: 8.185
values of PSI,THETA,PHI at FMATCH extraction 96.326 267.000 44.230
Best score for particle 123 at Rmin/Rmax 500.0 20.0: 11.257
values of PSI,THETA,PHI at FMATCH extraction 96.799 270.000 225.024
Best score for particle 124 at Rmin/Rmax 500.0 20.0: 8.473
values of PSI,THETA,PHI at FMATCH extraction 95.310 270.000 236.640
Best score for particle 125 at Rmin/Rmax 500.0 20.0: -100.000
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
frealign_v9.exe 000000010732567E Unknown Unknown Unknown
frealign_v9.exe 00000001073238E1 Unknown Unknown Unknown
frealign_v9.exe 00000001072EEB51 Unknown Unknown Unknown
frealign_v9.exe 00000001072EE986 Unknown Unknown Unknown
frealign_v9.exe 00000001072A0A3A Unknown Unknown Unknown
frealign_v9.exe 00000001072AA4BE Unknown Unknown Unknown
libsystem_platfor 00007FFF9453EF1A Unknown Unknown Unknown
frealign_v9.exe 00000001072080F6 Unknown Unknown Unknown
frealign_v9.exe 000000010727AFCE Unknown Unknown Unknown
frealign_v9.exe 000000010723F6F4 Unknown Unknown Unknown
frealign_v9.exe 00000001071FB501 Unknown Unknown Unknown
frealign_v9.exe 00000001071F492E Unknown Unknown Unknown
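
To compare the failing particles against the parameter file, it can help to pull every particle with the -100 score out of the log. The following is a minimal sketch, assuming the log lines have exactly the "Best score for particle ..." layout shown above; the function name `failed_particles` and the `sentinel` argument are made up for illustration and are not part of Frealign.

```python
import re

# Matches log lines of the form shown above, e.g.
#   Best score for particle 125 at Rmin/Rmax 500.0 20.0: -100.000
# (Assumption: this layout is stable across the whole log.)
SCORE_LINE = re.compile(
    r"Best score for particle\s+(\d+)\s+at Rmin/Rmax\s+\S+\s+\S+:\s*(-?\d+(?:\.\d+)?)"
)


def failed_particles(log_text, sentinel=-100.0):
    """Return the particle numbers whose best score equals the sentinel."""
    failed = []
    for line in log_text.splitlines():
        match = SCORE_LINE.search(line)
        if match and float(match.group(2)) == sentinel:
            failed.append(int(match.group(1)))
    return failed
```

Reading the log with `open(path).read()` and passing the text to `failed_particles` gives a list of particle numbers to inspect in the .par file.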

Hi Justin,

Have a look at your parameter file. Does it have values "*****" for any of the particles? e.g. particle 125?

I recall, a few years ago, unknown values, marked by "****" in the parameter files themselves, would get into the Frealign .par files. This would happen I believe in the first cycle after you converted your inputs, presumably from another package (at least that's how it was for me).

To fix this, temporarily, you can just change the "****" values to, say 0.000. At least that might get you going, and you would be able to run the refinement. However, I presume this is a bigger issue, and perhaps Niko might have to weigh in here ...
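
Fortran formatted output fills a field with asterisks when a number does not fit the field width, which is how such values end up in a parameter file. A quick check along these lines could be sketched as follows. It assumes that header and comment lines in the .par file start with "C"; the function name `overflowed_lines` is hypothetical.

```python
def overflowed_lines(par_text):
    """Return (line_number, line) pairs for data lines that contain '*'."""
    hits = []
    for lineno, line in enumerate(par_text.splitlines(), start=1):
        # Assumption: header/comment lines in the .par file start with 'C'.
        if line.startswith("C"):
            continue
        # A run of asterisks marks a value that overflowed its Fortran field.
        if "*" in line:
            hits.append((lineno, line))
    return hits
```

An empty result means no overflowed fields were found, in which case the cause is probably elsewhere.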

Dmitry

In reply to dlyumkis

Thanks, Dmitry. No, that doesn't seem to be the problem - no strange values for any of the parameters in the input file. This is really stumping me. Especially strange as the problem only occurs when running in search mode and not in refinement mode. Still probing around trying to figure it out...

Justin

In reply to jkoll

Hey Justin,

To get more of a clue, you could re-run this with a re-built version of frealign. I'd suggest modifying one of the Makefiles to include compiler options so that you get a backtrace with source filenames and line numbers (rather than the cryptic trace you're getting now). If you use one of the gfortran Makefiles, I'd suggest turning off the optimisation (-O3 becomes -O0) and turning on the backtrace (-fbacktrace) and debug symbols (-g).

If you need more detailed instructions let me know.

Then once you have a new version of frealign, re-run the same script (but with only one or two particles, because it will be much, much slower) and hopefully the bug will be reproduced, but this time you'll be able to pinpoint where exactly in the program things are going wrong. With that information, we can probably find a fix or a workaround.

Cheers,
Alexis

In reply to Alexis

Thanks, Alexis. I recompiled as you suggested, and get a bit more information. The log file now ends with a seg fault due to an invalid memory reference (see error log below). Does that help the diagnosis? I have mem_per_cpu set to 1024, but have tried various values (the machine has 64GB memory). I am running just on my desktop machine, with 12 processors. I have also tried using fewer processors, down to 1, and still get the error.

Thanks,
Justin

Time before particle 48833 was 19:06:26
Best score for particle 48833 at Rmin/Rmax 500.0 20.0: -100.000

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x104bd8be2
#1 0x104bd8e00
#2 0x7fff9453ef19
#3 0x104c183d4
#4 0x104c82390
#5 0x104c4c7c1
#6 0x104c11439
#7 0x104c13b0d

In reply to jkoll

Hi Justin,

Oh well, I was hoping the backtrace would include source files and line numbers, but never mind. We can get the source filenames and line numbers using the addr2line utility.

So please do something like this:

addr2line -e /path/to/the/frealign/exe/with/debug/symbols.exe 0x104bd8be2 0x104bd8e00 0x7fff9453ef19 0x104c183d4 0x104c82390 0x104c4c7c1 0x104c11439 0x104c13b0d

and post the output here.
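
To avoid copying addresses by hand, the command can be assembled from the backtrace text. This is only a sketch: it assumes frames look like "#0 0x104bd8be2" as in the log above, and `addr2line_command` is a made-up helper name. It builds the argument list without running anything.

```python
import re

# One backtrace frame per line, e.g. "#0 0x104bd8be2".
FRAME = re.compile(r"^#\d+\s+(0x[0-9a-fA-F]+)")


def addr2line_command(backtrace_text, exe_path):
    """Build an 'addr2line -e EXE ADDR...' argument list from a backtrace."""
    addresses = []
    for line in backtrace_text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            addresses.append(match.group(1))
    return ["addr2line", "-e", exe_path] + addresses
```

Joining the result with spaces gives a line to paste into a shell, or the list can be handed to `subprocess.run` directly.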

Alexis

In reply to Alexis

Hi Alexis -

Hmm, doesn't look like this provides any more information:
??:0
??:0
??:0
??:0
??:0
??:0
??:0

But I think I found a fix. I started digging through the scripts, looking for differences between search and refinement, as a refinement job with the same data worked fine. What I realized is that mult_hsearch.com sets the particle mask to 1 0 1 1 1 for helical jobs (makes sense, as omitting theta saves a lot of time and isn't really necessary for the initial search). I then noticed that the search was barfing when it hit particles whose theta deviated fairly far from 90 degrees (around 15-18 degrees of out-of-plane tilt). If I change theta to 90 degrees for the problematic particles, everything works for the search. So, as a workaround I'm just running the search using dummy input angles of theta=90 and psi/phi=0 for every particle. It's running fine now.
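
For anyone wanting to check their own data for the same pattern, here is a rough sketch that lists particles whose input theta is far from 90 degrees. It rests on several assumptions: that comment lines in the .par file start with "C", that the data fields are whitespace-separated, and that the first four fields are particle number, PSI, THETA, PHI in that order. The function name `tilted_particles` and the `max_tilt` threshold are invented for illustration.

```python
def tilted_particles(par_text, max_tilt=10.0):
    """Return (particle_number, theta) pairs with |theta - 90| > max_tilt."""
    flagged = []
    for line in par_text.splitlines():
        # Assumption: blank lines and lines starting with 'C' carry no data.
        if not line.strip() or line.startswith("C"):
            continue
        fields = line.split()
        # Assumed column order: particle number, PSI, THETA, PHI, ...
        particle = int(fields[0])
        theta = float(fields[2])
        if abs(theta - 90.0) > max_tilt:
            flagged.append((particle, theta))
    return flagged
```

If the flagged particles match the ones that fail with a score of -100, that would support the idea that the out-of-plane tilt is the trigger.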

I assume the error arises because of something to do with constraints on theta either by filament or for the entire data set? The initial parameters here were determined in Spider, but I was always suspicious of the high out-of-plane tilts that show up for many segments (part of why I wanted to run a full search in frealign, instead of just refining from the spider parameters).

Thanks,
Justin