Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type guessing should attempt to go via elements first #4697

Open
IAlibay opened this issue Aug 31, 2024 · 3 comments
Open

Type guessing should attempt to go via elements first #4697

IAlibay opened this issue Aug 31, 2024 · 3 comments
Assignees
Milestone

Comments

@IAlibay
Copy link
Member

IAlibay commented Aug 31, 2024

Current behaviour

The current behaviour, and something we reinforce in #3753 is that type works by attempting to guess elements from atom names.

Proposed behaviour

The proposal here is to always try to attempt to read from elements first rather than guessing through names. If elements exist AND they are complete, then you return those, rather than guessing them.

Where would this matter?

A good example here is FHIAIMS where:

names == elements
types == guessed elements from names

In this case it would have just been "safer" (i.e. fewer code bits gone through) to just do names == elements == types.

Target release

This needs discussion. From my own limited look at things, there aren't any cases where making this behaviour change would negatively impact behaviour. Indeed, I'm not sure I can see any cases where behaviour would change.

@IAlibay
Copy link
Member Author

IAlibay commented Aug 31, 2024

cc @lilyminium

@lilyminium
Copy link
Member

Agreed, I think this would make a lot more sense! It also avoids potential weirdness in cases like the RDKitParser, where type guessing can occur as trying to guess the element of atom names that can be variously MonomerInfo names or _TriposAtomNames. However this probably needs @MDAnalysis/coredevs consensus.

@isolated-matrix
Copy link

Hi there, I was just wondering, would this proposed change address the issue where, as is stated in the MDAnalysis documentation, 'atoms named “CA” are much more likely to represent an alpha-carbon than a calcium atom'? Because I'm trying to analyse a system containing argon (Ar) and MDAnalysis is coming up with an error stating that the atom mass cannot be guessed.

@IAlibay IAlibay modified the milestones: Release 2.8.0, Release 3.0 Oct 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants