refining helical structure - strange FSC curve behavior

I'm having a problem with refinement, and was hoping someone could help.

I have a helical structure refined to ~7.5 Å (gold-standard approach) using SPIDER, and I wanted to see whether Frealign would improve it. I wasn't certain that I had converted the angle assignments correctly, so I ran an initial search (mode 4) with a high-resolution cutoff of 10 Å. The structure after this first round had a reasonable-looking FSC curve (see attached, black curve) and a resolution of ~7 Å at the 0.143 cutoff, and the map looked very similar to the map refined in SPIDER.

After this first search, I ran several rounds of refinement in mode 1, again using 10 Å as my high-resolution limit and keeping all other parameters the same as in the initial search (same mparameters file). The scores jumped fairly dramatically for each particle at this point, which I took as encouraging. However, the output map looked like it had been low-pass filtered at 10 Å, and the FSC curve looks healthy at resolutions lower than 10 Å but plummets to zero right at 10 Å (dashed curve in the attached FSC plot). So far, my attempts to vary input parameters for refinement or reconstruction have resulted in the same problem. I suspect I am setting some parameter to an inappropriate value, but I can't figure out which one.

The header of my output parameter file for the first refinement round is pasted in below. I'd be very grateful if anyone has insight into what I might be doing wrong.

Thanks,
Justin

C Image format . . . . . . . . . . . . S
C Mode . . . . . . . . . . . . . . . . 1
C PMASK for parameter refinement . . . 1 1 1 1 1
C Magnification refinement . . . . . . F
C Defocus refinement . . . . . . . . . F
C Astigmatism refinement . . . . . . . F
C Defocus ref. of individual particles F
C Ewald sphere correction. . . . . . . 0
C Beautify the final real space map. . F
C Apply Wiener filter to final map . . F
C B-factor correction of final map . . F
C Write out matching projections . . . F
C Calculate FSPR and FSC curves. . . . 0
C Calculate more statistics. . . . . . T
C Memory/speed optimization. . . . . . 3
C 3D interpolation . . . . . . . . . . 1
C Outer Radius of object [Angstroms] . 60.00
C Inner Radius of object [Angstroms] . 0.00
C Pixel size [Angstroms] . . . . . . . 2.44000
C Molecular mass [kDa] . . . . . . . . 540.000
C % Amplitude contrast . . . . . . . . 0.07
C STD level for 3D mask. . . . . . . . 0.00
C Score / B factor constant. . . . . . 100.00
C Average score for weighting. . . . . 3.00
C Symmetry card as input . . . . . . . H
C Helical rotation per subunit . . . 157.29
C Helical rise per subunit . . . . . 25.12
C Number of subunits to average. . . 2
C Number of starts . . . . . . . . . 1
C Stiffness parameter. . . . . . . . 5.00
C First particle . . . . . . . . . . . 1
C Last particle. . . . . . . . . . . . 3297
C Relative magnification . . . . . . . 1.0000
C Densitometer step size (microns) . . 7.6
C Score target . . . . . . . . . . . . 10.00
C Score threshold. . . . . . . . . . . 0.00
C Cs [mm]. . . . . . . . . . . . . . . 2.16
C Voltage [kV] . . . . . . . . . . . . 300.00
C Beam tilt Tx, Ty [mrad]. . . . . . . 0.00 0.00
C Resolution of reconstruction . . . . 4.880
C Low resol. limit refinement. . . . . 250.000
C High resol. limit refinement . . . . 10.000
C High resol. limit classification . . 8.000
C Defocus uncertainty. . . . . . . . . 100.000
C B-factor for parameter refinement. . 0.000
C Input image stack /Users/jmk/raid/140304quad/frealign_relgroup5/stack.spi
C Input parameter file /Users/jmk/raid/140304quad/frealign_relgroup5/q_2.par
C Output parameter file q_3.par_1_3297
C Output shifts file q_3.shft_1_3297
C 3D reconstruction file /Users/jmk/raid/140304quad/frealign_relgroup5/q_2.spi
C 3D weights file q_3_weights
C 3D reconstruction halfset 1 q_3_map1.spi
C 3D reconstruction halfset 2 q_3_map2.spi
C 3D ave phase residual file q_3_phasediffs
C 3D point spread function q_3_pointspread

I do not see anything unusual except that you set FSTAT=T. The FSTAT option (to calculate additional statistics) has not been maintained for a while as we have not used it much in our work. I have therefore removed it in the latest version of Frealign (v9.08). It is possible that by switching it on you have triggered some strange behavior/bug in Frealign. Could you either try running v9.08 (see download page) or set FSTAT=F?

In reply to niko

Hi Niko -

Thanks for the response. I set FSTAT=T to see if I could get any more info to try to figure things out - the problem happens when set to F as well. Also, just tried v9.08 and have the same behavior.

I'm wondering now if it is something to do with how my images have been prepared? When running in mode 4 I get this warning about the particle sigma in the log file (same for every particle):
Best score for particle 1 at Rmin/Rmax 250.0 10.0: 2.772
Resetting unrealistic sigma for particle 1

But I don't get that warning when refining in mode 1?

Curious, I looked at the reprojections for one round of refinement, and many have horizontal stripes across the particle image (not the reprojection, though). The input images look okay, but the stripiness appears on at least half of the particles in the diagnostic images.

Thanks,
Justin

In reply to justin.kollman

The warning is normal and just means that the first round (Mode 4 in your case) encountered sigma values that were set to some default, i.e. they were not actually calculated from your data. Presumably, your script to convert parameters from SPIDER did this. It is not a problem, and the warning should not come up in later refinement cycles, once the sigma values have been calculated by Frealign from your data.

We sometimes see the stripes in the matching projection montages as well. I need to track this down but it is not a problem that should affect particle alignment and reconstruction. It is limited to the generation of the matching projections.

Your problem might have another cause: During Mode 4, the resolution used in the alignment is increased stepwise from Rmax to Rrec, in your case from 10 to 4.88 Å. Furthermore, because your input parameter file was generated from the SPIDER parameters, it does not contain any FSC and SSNR curves (these curves are output at the end of the parameter file by Frealign when calculating a reconstruction). The SSNR curve is used by Frealign to apply appropriate weighting to the correlation coefficient used for alignment. If no SSNR curve is available, Frealign uses some default that will not reflect the correct signal distribution in your data and reference. This, together with the Mode 4 refinement at 4.88 Å, might lead to severely overfitted reconstructions and a biased FSC.
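
To put numbers on the two limits, here is a rough sketch that uses only values from the header posted above (pixel size 2.44 Å, Rmax 10 Å, Rrec 4.88 Å). The function names are made up for illustration and are not part of Frealign.

```python
def nyquist_resolution(pixel_size_A):
    """Finest resolution (in Angstroms) that a given pixel size can represent."""
    return 2.0 * pixel_size_A


def fraction_of_nyquist(resolution_A, pixel_size_A):
    """Spatial frequency of `resolution_A` as a fraction of the Nyquist frequency."""
    return nyquist_resolution(pixel_size_A) / resolution_A


# Values copied from the parameter-file header in the original post.
pixel_size = 2.44   # "Pixel size [Angstroms]"
rmax = 10.0         # "High resol. limit refinement"
rrec = 4.88         # "Resolution of reconstruction"

# Rrec sits exactly at Nyquist for this pixel size.
print("Nyquist [A]:", nyquist_resolution(pixel_size))

# Fraction of the Fourier radius covered by each limit.
print("Rmax as fraction of Nyquist:", fraction_of_nyquist(rmax, pixel_size))
print("Rrec as fraction of Nyquist:", fraction_of_nyquist(rrec, pixel_size))
```

In other words, a 10 Å limit covers a little under half of the Fourier radius, while stepping out to Rrec brings in everything up to Nyquist.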

I would recommend using your Spider alignment parameters as starting parameters and calculating a 3D reconstruction with Frealign and these parameters as the first step. Paste the resolution (FSC and SSNR) table generated with this reconstruction at the end of your converted parameter file. Then you can run additional refinement cycles with Frealign and Mode 1.
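
The pasting step can also be scripted. Below is a minimal sketch, assuming the resolution table is the final run of lines starting with "C" in the parameter file written by the reconstruction, and that particle lines never start with "C". The helper names and the demo lines are placeholders, not real Frealign output, so please check the result by eye before using it.

```python
def is_comment(line):
    """Header and statistics lines in a parameter file start with 'C'."""
    return line.startswith("C")


def trailing_comment_block(lines):
    """Return the run of comment lines at the very end of a parameter file."""
    non_blank = [ln for ln in lines if ln.strip()]
    block = []
    for line in reversed(non_blank):
        if not is_comment(line):
            break  # reached the last particle line
        block.append(line)
    return block[::-1]


def append_statistics(converted_lines, reconstruction_lines):
    """Copy the trailing table of the reconstruction file onto the converted file."""
    body = [ln for ln in converted_lines if ln.strip()]
    return body + trailing_comment_block(reconstruction_lines)


# Placeholder content standing in for the two files (not real values).
converted = ["C header of converted file", "      1  <particle parameters>", ""]
reconstruction = [
    "C header of reconstruction output",
    "      1  <particle parameters>",
    "C <resolution table row 1>",
    "C <resolution table row 2>",
    "",
]

for line in append_statistics(converted, reconstruction):
    print(line)
```

With real files, the two lists would come from something like `open(path).read().splitlines()`, and the merged lines would be written back out with newline separators.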

It is always a bit complicated to convert from SPIDER to Frealign. Working with another person from the Agard lab who appears to be using SPIDER scripts that are related to yours, I wrote a little Fortran program that converts the SPIDER files into a Frealign parameter file. I have attached an archive with the program and an example script to run it. Please try it to see if this is helpful.

Hi Justin,

The sigma warning is normal, and is not causing your problem.

Could you perhaps post the header of the log file which you used for mode 4 search? I suspect that may be where the problem is happening.

On another note, I believe that we have SPIDER conversion scripts posted among the example scripts somewhere. I am sure Chuck Sindelar or Alexis Rohou has scripts for conversion as well.

HTH,

Axel

Thanks Niko and Axel -

I think you were right about both the overfitting in mode 4 (I misunderstood how the post-search refinement works) and the SSNR weighting of the initial model from SPIDER.

I did as you suggested, calculating a reconstruction in Frealign using the converted SPIDER parameters, and the bizarre behavior of the FSCs has gone away. Subsequent refinement seems to be proceeding more reasonably.

Something is not quite right with the parameter conversion, as my output volume has a resolution of only about 20 Å, rather than the ~7.5 Å from SPIDER. I suspect there is a problem in converting the x/y shifts, as the result is essentially the same if I just set all x/y shifts to zero. But the 20 Å structure does look like a 20 Å version of the original, so I used that to initiate refinement, and things seem to be going okay - I have essentially now recovered my original structure after a few rounds of refinement in Frealign. The resolution still hasn't improved over the SPIDER reconstruction, but there is a clear bimodal distribution of the scores output by Frealign, which I think may suggest multiple conformational states. I have started looking at your paper on likelihood-based classification, and may try that using models with different helical symmetries.
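
As a quick way to look at the two score populations, here is a minimal sketch that splits a list of scores with a plain one-dimensional two-means clustering. This is a generic technique, not Frealign's likelihood-based classification. The scores below are synthetic, and reading the real scores out of the parameter file is left out because the column layout depends on the Frealign version.

```python
def two_means_threshold(scores, iterations=50):
    """Split 1D values into two groups and return the midpoint of the group means."""
    lo, hi = min(scores), max(scores)
    if lo == hi:
        raise ValueError("all scores are identical; nothing to split")
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        low_group = [s for s in scores if s <= mid]
        high_group = [s for s in scores if s > mid]
        new_lo = sum(low_group) / len(low_group)
        new_hi = sum(high_group) / len(high_group)
        if new_lo == lo and new_hi == hi:
            break  # group means stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


# Synthetic scores for illustration only (not taken from any real data set).
scores = [10.0, 11.0, 12.0, 30.0, 31.0, 32.0]

threshold = two_means_threshold(scores)
low_particles = [i + 1 for i, s in enumerate(scores) if s <= threshold]
high_particles = [i + 1 for i, s in enumerate(scores) if s > threshold]

print("threshold:", threshold)
print("particles in low-score group:", low_particles)
print("particles in high-score group:", high_particles)
```

A split like this only shows which particles fall on which side of the gap. Whether the two groups really are different conformational states still has to be tested by reconstructing them separately.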

Of course, it's probably better to just switch to Frealix. I'll be trying that next, so you'll likely see more questions from me here.

Thanks for your help!

Justin

In reply to justin.kollman

Hi Justin,

Since Frealix can't (yet) do multi-reference refinement, I'd recommend doing the sorting in Frealign. Then once you have converged onto your two (or more?) solutions, transition to using Frealix, which I'll be able to help with at that point.

Alexis