Releases: dan2097/opsin
v2.8.0
- Support for undecahectane/undecadictane (previously only hendeca was supported)
- Support for dicarboximido
- Improved support for lysergic acid derivatives
- Added a few more sugars e.g. digitalose
- Added borodeuteride and hydro contractions of pharmaceutical salts e.g. hydromethanesulfonate
- Support substitution on glyceric acid
- Corrected interpretation of imidazolium, trioxane and phthalhydrazide
v2.7.0
- Improved coverage of flavonoid parent structures
- Support for apiofuranosyl, added 5 locant to apiose
- Improved support for n-amyl
- Superscripted numbers in poly spiro systems are now intelligently determined if the input lacks superscript indication
- Support for annulynes
- Fixed issues where amino acid salts were being interpreted as functionalisation of the amino acid
- Fixed bug where annulene parsing was case sensitive
- Chalcone, in accordance with current IUPAC recommendations, is now interpreted as specifically the trans isomer
- Minor dependency updates
v2.6.0
- OPSIN now requires Java 8 (or higher)
- OPSIN command-line functionality moved to opsin-cli module
- OPSIN standalone jars are now built with mvn package
- Updated from InChI 1.03 to InChI 1.06
- Support for capturing relative/racemic stereochemistry (output via CxSmiles) [contributed by John Mayfield]
- Support for deaza/dethia
- Support nitrile as a suffix on amino acids [contributed by John Mayfield]
- Support more glycero-n-phospho substituents
- Support for chloroxime and other haloximes
- Support cis/trans on rings where a stereocenter has two non-hydrogen substituents, using Cahn-Ingold-Prelog rules to determine which are relative
- Multiple improvements to implicit bracketing logic
- Corrected interpretation of methylselenopyruvate
- Added group 1/2 nitrides e.g. magnesium nitride
- Added molecular diatomics e.g. molecular hydrogen (or dihydrogen)
- Fixed out of memory error if a fusion bracket referenced an interior atom instead of a peripheral atom
- Fixed out of memory error while parsing very long ambiguous input, by switching parsing algorithm from breadth-first to depth-first
Dependency changes:
- Updated logging from Log4J v1.2.17 to the latest Log4J2 (v2.17.0). Neither OPSIN 2.5.0 nor 2.6.0 is vulnerable to Log4Shell. The logging implementation is only included in the opsin-cli module
- opsin-inchi now uses JNA-InChI rather than JNI-InChI. This supports the latest version of InChI and also supports new Macs with ARM64 processors
- Woodstox now uses groupid com.fasterxml.woodstox (the groupid change did not signify a break in API compatibility)
- dk.brics.automaton now uses groupid dk.brics (the groupid change did not signify a break in API compatibility)
- commons-cli is only used by the opsin-cli module
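The out-of-memory fix in v2.6.0 replaced a breadth-first search with a depth-first one. The following is a minimal, self-contained sketch of why that change can help; it is not OPSIN's actual parser. It enumerates every way to split an ambiguous input into tokens and records the peak number of partial parses held in memory at once. The class name, method name, and two-token vocabulary are all illustrative assumptions.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class ParseOrderSketch {
    // Hypothetical vocabulary: "a" and "aa" make any run of a's highly ambiguous.
    static final List<String> VOCAB = Arrays.asList("a", "aa");

    /**
     * Explores all tokenisations of the input.
     * Returns {number of complete parses, peak number of pending partial parses}.
     */
    static long[] explore(String input, boolean depthFirst) {
        // Each entry is the offset that one partial parse has reached.
        Deque<Integer> pending = new ArrayDeque<>();
        pending.add(0);
        long complete = 0;
        long peak = 1;
        while (!pending.isEmpty()) {
            // Depth-first resumes the newest partial parse, breadth-first the oldest.
            int pos = depthFirst ? pending.pollLast() : pending.pollFirst();
            if (pos == input.length()) {
                complete++;
                continue;
            }
            for (String token : VOCAB) {
                if (input.startsWith(token, pos)) {
                    pending.addLast(pos + token.length());
                }
            }
            peak = Math.max(peak, pending.size());
        }
        return new long[] {complete, peak};
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 20; i++) {
            sb.append('a');
        }
        String input = sb.toString();
        long[] bfs = explore(input, false);
        long[] dfs = explore(input, true);
        System.out.println("breadth-first: parses=" + bfs[0] + ", peak pending=" + bfs[1]);
        System.out.println("depth-first:   parses=" + dfs[0] + ", peak pending=" + dfs[1]);
    }
}
```

Both orders find the same set of complete parses. The difference is that the breadth-first queue holds a whole level of partial parses at once, whereas the depth-first stack only holds the unexplored alternatives along the current path.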
v2.5.0
- OPSIN now requires Java 7 (or higher)
- Support for traditional oxidation state names e.g. ferric
- Added support for defining the stereochemistry of phosphines/arsines
- Added newly discovered elements
- Improved algorithm for correctly interpreting ester names with a missing space e.g. 3-aminophenyl-4-aminobenzenesulfonate
- Fixed structure of canavanine
- Corrected interpretation of silver oxide
- Vocabulary improvements
- Minor improvements/bug fixes
Internal XML Changes:
- tokenList files now all use the same schema (tokenLists.dtd)
v2.4.0
- OPSIN is now licensed under the MIT License
- Locant labels included in extended SMILES output
- Command-line now has a name flag to include the input name in SMILES/InChI output (tab delimited)
- Added support for carotenoids
- Added support for Vitamin B-6 related compounds
- Added support for more fused ring system bridge prefixes
- Added support for anilide as a functional replacement group
- Allow heteroatom replacement as a detachable prefix e.g. 3,6,9-triaza-2-(4-phenylbutyl)undecanoic acid
- Support Boughton system isotopic suffixes for 13C/14C/15N/17O/18O
- Support salts of acids in CAS inverted names
- Improved support for implicitly positively charged purine nucleosides/nucleotides
- Added various biochemical groups/substituents
- Improved logic for determining intended substitution in names with too few brackets
- Incorrectly capitalized locants can now be used to reference ring fusion atoms
- Some names no longer allow substitution e.g. water, hydrochloride
- Many minor precision/recall improvements
v2.3.1
- Fixed fused ring numbering algorithm incorrectly numbering some ortho- and peri-fused systems involving 7-membered rings
- Support P-thio to indicate thiophosphate linkage
- Count of isotopic replacements no longer required if locants given
- Fixed bug where CIP algorithm could assign priorities to identical substituents
- Fixed "DL" before a substituent not assigning the substituted alpha-carbon as racemic stereo
- L-stereochemistry no longer assumed on semi-systematic glycine derivatives e.g. phenylglycine
- Fixed some cases where substituents like carbonyl should have been part of an implicitly bracketed section
- Fixed interpretation of leucinic acid and 3/4/5-pyrazolone
v2.3.0
- D/L stereochemistry can now be assigned algorithmically e.g. L-2-aminobutyric acid
- Other minor improvements to amino acid support e.g. homoproline added
- Extended SMILES added to command-line interface
- Names intended to include the triiodide/tribromide anion no longer erroneously have three monohalides
- Ambiguity detected when applying unlocanted subtractive prefixes
- Better support for adjacent multipliers e.g. ditrifluoroacetic acid
- deoxynucleosides are now implicitly 2'-deoxynucleosides
- Added support for <number> as a syntax for a superscripted number
- Added support for amidrazones
- Aluminium hydrides/chlorides/bromides/iodides are now covalently bonded
- Fixed names with isotopes less than 10 not being supported
- Fixed interpretation of some trivial names that clash with systematic names
v2.2.0
- Added support for IUPAC system for isotope specification e.g. (3-14C,2,2-2H2)butane
- Added support for specifying deuteration using the Boughton system e.g. butane-2,2-d2
- Added support for multiplied bridges e.g. 1,2:3,4-diepoxy
- Front locants after a von Baeyer descriptor are now supported e.g. bicyclo[2.2.2]-7-octene
- onosyl substituents now supported e.g. glucuronosyl
- More sugar substituents e.g. glucosaminyl
- Improved support for malformed polycyclic spiro names
- Support for oximino as a suffix
- Added method [NameToStructure.getVersion()] to retrieve OPSIN version number
- Allowed bridges to be used as detachable prefixes
- Allow odd numbers of hydro to be added e.g. trihydro
- Added support for unbracketed R stereochemistry (but not S, for the moment, due to the ambiguity with sulfur locants)
- Various minor bug fixes e.g. stereochemistry was incorrect for isovaline
- Minor vocabulary improvements
v2.1.0
- Added support for fractional multipliers e.g. hemihydrochloride
- Added support for abbreviated common salts e.g. HCl
- Added support for sandwich compounds e.g. ferrocene
- Improved recognition of names missing the last 'e' (common in German)
- Support for E/Z directly before double bond indication e.g. 2Z-ylidene, 2Z-ene
- Improved support for functional class ethers e.g. "glycerol triglycidyl ether"
- Added general support for names involving an ester formed from an alcohol and an ate group
- Grignard reagents and certain compounds (e.g. uranium hexafluoride) are now treated as covalent rather than ionic
- Added experimental support for outputting extended SMILES. Polymers and attachment points are annotated explicitly
- Polymers when output as SMILES now have atom classes to indicate which end of the repeat unit is which
- Support * as a superscript indicator e.g. *6* to mean superscript 6
- Improved recognition of racemic stereochemistry terms
- Added general support for names like "beta-alanine N,N-diacetic acid"
- Allowed "one" and "ol" suffixes to be used in more cases where another suffix is also present
- "ic acid halide" is now interpreted the same as "ic halide"
- Fixed some cases where ambiguous operations were not considered ambiguous e.g. monosubstituted phenyl
- Improvements/bug fixes to heuristics for detecting when spaces are omitted from ether/ester names
- Improved support for stereochemistry in older CAS index names
- Many precision improvements e.g. cyclotriphosphazene, thiazoline, TBDMS/TBDPS protecting groups, S-substituted-methionine
- Various minor bug fixes e.g. names containing "SULPH" not recognized
- Minor vocabulary improvements
Internal XML Changes:
- Synonyms of the same concept are now or-ed rather than being separate entities e.g.
<token>tertiary|tert-|t-</token>
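To illustrate the or-ed synonym syntax, here is a minimal sketch (not OPSIN's actual token loader) that expands the text of one `<token>` element into a lookup table in which every synonym maps to the same value. The class name, method name, and the value "tert" are hypothetical; only the token text `tertiary|tert-|t-` comes from the note above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class OrTokenSketch {
    /** Maps every '|'-separated synonym in tokenText to the same (hypothetical) concept value. */
    static Map<String, String> expand(String tokenText, String concept) {
        Map<String, String> lookup = new LinkedHashMap<>();
        // Pattern.quote is needed because '|' is the alternation operator in a regex.
        for (String synonym : tokenText.split(Pattern.quote("|"))) {
            lookup.put(synonym, concept);
        }
        return lookup;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = expand("tertiary|tert-|t-", "tert");
        // Prints the three synonyms in the order they appear in the token text.
        System.out.println(lookup.keySet());
    }
}
```

Keeping the synonyms in one element means a single entry carries the shared meaning, rather than three entries that have to be kept in sync.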
v2.0.0
MAJOR CHANGES
- Requires Java 1.6 or higher
- CML (Chemical Markup Language) is now returned as a String rather than a XOM Element
- OPSIN now attempts to identify if a chemical name is ambiguous. Names that appear ambiguous return with a status of WARNING with the structure provided being one interpretation of the name
- Added support for "alcohol esters" e.g. phenol acetate [meaning phenyl acetate]
- Multiplied unlocanted substitution is now more intelligent e.g. all substituents must connect to same group, and degeneracy of atom environments is taken into account
- The ester interpretation is now preferred in more cases where a name does not contain a space but the parent is methanoate/ethanoate/formate/acetate/carbamate
- Inorganic oxides are now interpreted, yielding structures with [O-2] ions
- Added more trivial names of simple molecules
- Support for nitrolic acids
- Fixed parsing issue where a directly substituted acetal was not interpretable
- Fixed certain groups e.g. phenethyl, not having their suffix attached to a specific location
- Corrected interpretation of xanthyl, and various trivial names that look systematic
- Name to structure is now ~20% faster
- Initialisation time reduced by a third
- InChI generation is now ~20% faster
- XML processing dependency changed from XOM to Woodstox
- Significant internal refactoring
- Utility functions designed for internal use are no longer on the public API
- Various minor bug fixes
Internal XML Changes:
- Groups lacking a labels attribute now have no locants (previously had ascending numeric locants)
- Syntax for addGroup/addHeteroAtom/addBond attributes changed to be easier to parse and allow specification of whether the name is ambiguous if a locant is not provided