Add command to extract Neff scores for MSA #647

neftlon · 2022-12-12T20:01:44Z

As discussed in #638, this code adds a new command to output Neff scores. The command is called profile2neff. It takes a profile database as input and outputs per-residue Neff scores for a query sequence.

The scores are written to a DBWriter that then contains two lines for each sequence: a header similar to profile2pssm's output and a line containing tab-separated Neff scores (from the range [1;255]) for each residue. The score is converted from the internal float representation to char using the convertNeffToChar function from MathUtil.h.
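To illustrate the score line described above, here is a minimal sketch of how per-residue Neff values could be encoded and joined with tabs. It is not the PR's code: `neffToCode` and `formatNeffLine` are hypothetical names, `std::log2` stands in for MMseqs2's fast `flog2` approximation (so codes may differ slightly from the real function), and the header line is omitted because its exact format is not given here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for MathUtil::convertNeffToChar (assumption: std::log2 replaces
// the fast approximation flog2, so results can differ slightly from MMseqs2).
static unsigned char neffToCode(float neff) {
    float scaled = std::min(255.0f, 1.0f + 64.0f * std::log2(neff));
    // Clamp at 1 before the cast so inputs below 1 cannot wrap around.
    scaled = std::max(1.0f, scaled);
    return static_cast<unsigned char>(scaled + 0.5f);
}

// Hypothetical helper: one tab-separated line of codes in [1;255],
// one code per residue, terminated by a newline.
static std::string formatNeffLine(const std::vector<float>& neffPerResidue) {
    std::string line;
    for (std::size_t i = 0; i < neffPerResidue.size(); ++i) {
        if (i > 0) {
            line.push_back('\t');
        }
        // Convert to int first so the code is printed as a number, not a character.
        line += std::to_string(static_cast<int>(neffToCode(neffPerResidue[i])));
    }
    line.push_back('\n');
    return line;
}
```

Under these assumptions, Neff values of 1, 2 and 4 would be written as the codes 1, 65 and 129.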

neftlon · 2022-12-12T21:29:42Z

It looks like some of the CI tests are not passing. Am I missing something, or are parts of the CI pipeline broken? Can someone help me with that?

milot-mirdita · 2022-12-13T08:42:20Z

No idea why Windows is failing in Azure; you didn't change anything that would affect that. Cirrus is currently okay to fail: something changed on their side and I haven't gotten around to fixing the issue.

I think you are still using the wrong function. Neff is stored as a char, you need to use convertNeffToFloat to convert it back.

neftlon · 2022-12-13T11:34:30Z

Ok good, then I will ignore these pipelines.

The Neff scores I use come from the neffM field of the Sequence class (Sequence.h). According to the following code, these are stored as floats.

MMseqs2/src/commons/Sequence.h

Line 453 in 7b95387

float *neffM;

Therefore, I think the convertNeffToChar function is more appropriate, since it takes a Neff score that is stored as a float. (The convertNeffToFloat function expects its parameter as an unsigned char, which I don't have when using the neffM field.)

MMseqs2/src/commons/MathUtil.h

Lines 216 to 224 in 7b95387

    
    static char convertNeffToChar(const float neff) {
        float retVal = std::min(255.0f, 1.0f+64.0f*flog2(neff) );
        return std::max(static_cast<unsigned char>(1), static_cast<unsigned char>(retVal + 0.5) );
    }

    static float convertNeffToFloat(unsigned char neffToScale) {
        float retNeff = fpow2((static_cast<float>(neffToScale)-1.0f)/64.0f);;
        return retNeff;
    }

Sorry if I am missing something here. Is there another location/a better way of extracting the Neff scores?

(I don't know whether this is just personal preference, but I like the idea of values not being floats when writing them to an output. A fixed range of [1;255] somehow sounds more appealing to me than a floating-point number with an obscure precision.)

milot-mirdita · 2022-12-16T03:51:02Z

Okay, sorry, I didn't remember the code very well. Your initial implementation without the MathUtil functions was correct; the Sequence object already handles the conversion to float. I wouldn't use convertNeffToChar here: it just spreads the possible range of Neffs (0 to 20, but more realistically 0 to 14) over the char range (0 to 255). I don't think it makes a lot of sense to print a value from 0 to 255.
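The spreading described above can be made concrete with a small round-trip sketch. This is an illustration only: `encodeNeff`, `decodeNeff` and `roundTripNeff` are hypothetical names mirroring the two MathUtil functions, and `std::log2`/`std::exp2` stand in for the fast approximations `flog2`/`fpow2`, so exact values may differ slightly from MMseqs2.

```cpp
#include <algorithm>
#include <cmath>

// Stand-in for MathUtil::convertNeffToChar (assumption: exact std::log2
// instead of the approximate flog2).
static unsigned char encodeNeff(float neff) {
    float scaled = std::min(255.0f, 1.0f + 64.0f * std::log2(neff));
    // Clamp at 1 before the cast so inputs below 1 cannot wrap around.
    scaled = std::max(1.0f, scaled);
    return static_cast<unsigned char>(scaled + 0.5f);
}

// Stand-in for MathUtil::convertNeffToFloat (assumption: exact std::exp2
// instead of the approximate fpow2).
static float decodeNeff(unsigned char code) {
    return std::exp2((static_cast<float>(code) - 1.0f) / 64.0f);
}

// Hypothetical helper: the value a Neff takes after being stored as a
// one-byte code and read back.
static float roundTripNeff(float neff) {
    return decodeNeff(encodeNeff(neff));
}
```

With these stand-ins, each code step multiplies Neff by 2^(1/64), roughly 1.1%, and the code saturates at 255 for any Neff above about 15.66, so a Neff of 20 would read back as about 15.66. Printing the float that the Sequence object already provides avoids both the quantization and the saturation.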

neftlon · 2023-01-16T08:53:48Z

Sorry for the long round-trip delay, I've reverted my changes to the original implementation :)

milot-mirdita · 2023-01-30T07:36:22Z

Thank you. I was traveling and forgot about the PR, sorry!

Add command to extract Neff scores for an MSA

f527bd9

neftlon force-pushed the master branch from 7cf4fe8 to f527bd9 January 11, 2023 10:43

Merge branch 'soedinglab:master' into master

6ac0412

neftlon added 2 commits January 16, 2023 12:26

Make Neff output complient with other OpenMP code

981c189

Add more precision to Neff output

ae2f722

milot-mirdita merged commit 4148e09 into soedinglab:master Jan 30, 2023
