-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port the Biological Scattering Tools website to smallangle.org #3
Comments
@smk78 has requested a zip/tar bundle of the requisite files from the current hostmaster at ISIS. |
The periodictable package already does most of the calculation so it may be available within the sasview sld calculator. The results I'm getting from the online activation calculator are somewhat different from those for BSLDC. Can someone with bio knowledge cross-check the code and update periodictable to produce the correct answer? Compare the β-casein protein. The number of H (1370+371 vs 1643.5) and O (309 vs 305) in the formula are different though C and N are the same. The estimated density also differs (1.27 g/cm³ vs 1.35 g/mL). Activation calculatorScattering from aa:RELEELNVPGEIVESLSSSEESITRINKKIEKFQSEEQQQTEDELQDKIHPFAQTQSLVYPFPGPIPNSLPQNIPPLTQTPVVVPPFLQPEVMGVSKVKEAMAPKHKEMPFPKYPVEPFTESQSLTLTDVENLHLPLPLLQSWMHQPHQPLPPTVMFPPQSVLSLSQSKVLPVPQKAVPYPQRDMPIQAFLLYQEPVLGPVRGPFPIIV
Neutron transmission is 0.819% for 1 cm of sample (after absorption and incoherent scattering). BSLDC toolTable 1. The Total number of residues, Chemical composition, Molecular weight in kilodaltons, scattering length, scattering length density (ρ), molecular volume and number of exchangeable hydrogens for the inputed protein given it's sequence, % deuteration, % D2O concentration and % exchange. The sequence submitted was called "Not entered" The sequence submitted was "RELEELNVPGEIVESLSSSEESITRINKKIEKFQSEEQQQTEDELQDKIHPFAQTQSLVYPFPGPIPNSLPQNIPPLTQTPVVVPPFLQPEVMGVSKVKEAMAPKHKEMPFPKYPVEPFTESQSLTLTDVENLHLPLPLLQSWMHQPHQPLPPTVMFPPQSVLSLSQSKVLPVPQKAVPYPQRDMPIQAFLLYQEPVLGPVRGPFPIIV " The percentage of D2O in the solution is "100" The percentage of deuteration is "0" The percentage of exchange is "90" The concentration of the biomolecule in mg/ml is "0" Total number of residues is 209. ('Chemical composition C1080', 'N', 268, 'O', 305, 'S', 6, 'H', 1643.5, 'P', 0) ('The molecular weight of the biomolecule is 23.826', 'kDa') The Scattering length is 812.63 x10-4Å The scattering length density (ρm) of the molecule is 2.768 x10-6Å-2 The scattering length denisty (ρs) of the solvent is 6.376 x10-6Å-2 The molecular volume is 29353.0 Å3 The number of exchangable hydrogens at pH 7.0 is 311.5 The density of the biomolecule is 1.35 g/ml The estimate of intensity at zero angle I(0) of the biomolecule is 0 cm-1 |
Here are the files that make the current website. |
I've updated the sld calculator to python 3 and recent scipy (see attached). I did a little bit of code formatting in the process, but left the generated html more or less intact. It could use some cleanup as well. For |
Doing some checking of the BSLDC code, I see some surprises and don't want to blindly copy the details. Overall the calculations done by periodictable and BSLDC are reasonable, it's just that they are using different molecules with different number of labile hydrogen. BSLDC uses Perkins(1986). See manual for full reference. periodictable (pt) uses Perkins(1988). DOI:10.1016/S0167-7306(08)60575-X](https://doi.org/10.1016/S0167-7306(08)60575-X) periodictable missed adding an extra H + OH for each sequence, which is need to terminate the dangling connectors at the ends of the sequence. The BSLDC code adds these to the molecular weight, but doesn't account for their scattering lengths. Most of the BSLDC calculation is embedded in the magic numbers for mw (molecular weight) and sl (scattering length) which are precomputed for each residue. I have code below to print the equivalent values for the four cases (deuterated/non-deuterated in H2O/D2O). For RNA and DNA the masses and scattering lengths differ by a sodium atom. These appear in Perkins(1988). No idea what Perkins(1986) is using for its formula. For the amino acids the number of hydrogen and the number of labile hydrogen differ quite a bit between the two sources. I do not know which one to use. Tryptophan in BSLDC is inconsistent. The molecular weights for D2O and H2O differ by one but it claims two labile hydrogen. The scattering lengths differ by 20.8 for the undeuterated case (as expected for 2 labile H) but 10.6 for the deuterated case (should be about 10.4 going from -3.7409 for 1-H to 6.674 for 2-H; I'm ignoring the .015% natural abundance of D in natural H). I'm guessing that this was a programming error, where the scattering length for deuterated T in H2O was computed with the wrong number of labile H. For several residues in BSLDC the number added to hydro does not match the number added to H. For example Threonine. This probably doesn't matter because "hydro" isn't used, and the C, H, O, N totals are only used for reporting, not calculation, but it will lead to inconsistencies between reported values in periodictable and BSLDC. Verifying that this is the source of the casein discrepancy will be tedious. periodictable used different values for residue volumes (Perkins 1988 rather than Perkins 1986). I don't understand the BSLDC volume fraction calculation. I expect it should be independent of exchange percent since you weigh the sample after deuteration, but before adding to the solvent. I suspect it is effectively equivalent to mass fraction since we are using it to compute overall transmission and scattering. That way we don't have to guess how much the sample will swell with the addition of solvent. # Code for precomputing H/D mass and scattering lengths
from periodictable import fasta, H, D, Na
def sum_b(f):
total = 0
for el, count in f.atoms.items():
total += el.neutron.b_c * count
return total
def gen_bsldc(codes):
for code, mol in codes.items():
f = mol.labile_formula
HH = f.replace(H[1], H)
HD = f.replace(H[1], D)
DH = f.replace(H, D).replace(H[1], H)
DD = f.replace(H, D).replace(H[1], D)
print(f"=== {mol.name} vol={mol.cell_volume:.2f} ex={f.atoms.get(H[1],0)} ===")
print(f"{code} mw HH:{HH.mass:6.1f} HD:{HD.mass:6.1f} DH:{DH.mass:6.1f} DD:{DD.mass:6.1f}")
print(f" sl HH:{sum_b(HH):6.1f} HD:{sum_b(HD):6.1f} DH:{sum_b(DH):6.1f} DD:{sum_b(DD):6.1f}")
#print(fasta.AMINO_ACID_CODES)
print(f"Na mass: {Na.mass:.1f}, b_c: {Na.neutron.b_c}")
print("!!! RNA")
gen_bsldc(fasta.RNA_BASES)
print("!!! DNA")
gen_bsldc(fasta.DNA_BASES)
print("!!! Amino acids")
gen_bsldc(fasta.AMINO_ACID_CODES) |
More questions:
|
Initial comments by email from Luke Clifton (originator of the Protein SLD Calculator):
|
There are inconsistencies other than Trp. For each of the residues in bsldc I compute the difference in molecular weight and the difference in scattering length divided by the number of labile hydrogens for the undeuterated (u) and the fully deuterated (d) cases. I expect the mw ratios to be 1 and the sl ratios to be ≈10.4, but I get something quite different (see below). I notice that sl u≈10.4 always. I think the underlying values come from the tables in the Perkins (1986). Is there a reason that the values in the remaining columns are all over the place? This does not address differences in the 1986 and 1988 chemical formulas for the residues, which sometimes differ in the total number of hydrogens and in the number of hydrogens which are labile.
Note: I see that |
ORSO calculator at https://slddb.esss.dk/slddb/bio_blender produces yet another result for casein (see here) Compound protein Their table of amino acid chemical formulas and unit volumes is here |
Comments from a bio user re: differences in H between bsldc/ORSO and periodictable for Histidine and Cysteine:
Clearly this is a matter for the larger neutron bio community to address. I'm leaving periodictable as is until a consensus is reached. |
Regarding dna/rna volumes, the values in bsldc/ORSO are traceable to [1], using the rna volumes of ACG for dna, with a correction of 18.3 mL/mol to account for + PO₃ – H₂O in chain relative to the isolated neucleoside. I'm switching to these volumes for periodictable, though using the measured values for the deoxygenated forms of ACG for dna. [1] V.A. Buckin, B.I. Kankiya, R.L. Kazaryan, |
This is not strictly a SasView issue, but is concerned with a related analysis tool.
In late 2019 a round of emails between @smk78 , @timsnow , @butlerpd and @ajj agreed that CanSAS would take over the hosting/maintenance of the Biological Scattering Tools website currently hosted at psldc.isis.rl.ac.uk . This site is most notable for its BSLDC tool which gets a lot of use. But the developer behind the site and calculator left employment with ISIS/Diamond several years earlier.
The intention was to re-establish the site at http://psldc.smallangle.org, but that link currently points to the DANSE SANS project, the default for addresses on smallangle.org that do not have a specific DNS entry.
@ajj confirms that we are free to use the address and appropriate virtualhost setup.
This task therefore probably needs someone to edit the Apache configs. @ajj may need to add the DNS entry as he owns smallangle.org (unless he has permitted more admins?).
The text was updated successfully, but these errors were encountered: