save basis for 200_Prescriptions_2011.smi
This .csv is the basis for 200_Prescriptions_2011.smi.  It is derived
from Wikipedia:WikiProject_Pharmacology/Top_200_US_Prescriptions_2011,
accessed at

https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Pharmacology/Top_200_US_Prescriptions_2011

Despite its age and its restriction to the U.S. market, this list was
used because it already contains crosslinks to the individual entries
in at least the English Wikipedia.  Then,
+ the page was printed to .pdf and converted into an .xls Excel
  spreadsheet on https://www.pdftoexcel.com/.

+ The spreadsheet was simplified in gnumeric by removing empty lines,
  decoration, and the columns about rank and sales figures.  A new
  first column was added for the SMILES string given in the
  crosslinked Wikipedia entries.  When an entry mentioned more than
  one active ingredient, the SMILES of all active ingredients were
  included, separated by "and".

  Entries clearly about small molecules as active ingredients (by
  then the second table) were reformatted into a third list; entries
  with more than one active ingredient are now included as

  SMILES_1  entry_a_1
  SMILES_2  entry_a_2

  and spaces in the remaining second column (e.g., "Proair HFA")
  should be replaced by underscores (e.g., "Proair_HFA").
  The result was exported as a text / .csv file.  In the .smi,
  trailing ",," were removed and each "," was substituted by " "
  (an explicit space).
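
The manual clean-up described above could also be scripted.  The
following is only a minimal sketch in Python, not the procedure used
for this commit.  It assumes that no field of the .csv contains a
quoted comma, and that the halves of a multi-ingredient entry are
named by appending _1, _2, ... to the entry name (one reading of the
"entry_a_1" pattern above).  The function names csv_line_to_smi and
split_ingredients are hypothetical.

```python
def split_ingredients(smiles_field, name):
    """Split a SMILES field joined by "and" into one pair per ingredient.

    Assumption: the parts are separated by the literal " and ", and
    each part gets a numbered suffix appended to the entry name.
    """
    parts = [part.strip() for part in smiles_field.split(" and ")]
    if len(parts) == 1:
        # Single active ingredient: keep the name unchanged.
        return [(parts[0], name)]
    return [(smi, f"{name}_{i}") for i, smi in enumerate(parts, start=1)]


def csv_line_to_smi(line):
    """Turn one exported .csv line into one .smi line.

    Assumption: fields are separated by plain commas without quoting.
    """
    fields = [field.strip() for field in line.rstrip("\n").split(",")]
    # Drop empty trailing fields, i.e. the trailing ",," of the export.
    while fields and fields[-1] == "":
        fields.pop()
    # Replace spaces inside a field by underscores, then join the
    # fields with one explicit space.
    return " ".join(field.replace(" ", "_") for field in fields)
```

With the placeholders used above, csv_line_to_smi("SMILES_1,Proair HFA,,")
returns "SMILES_1 Proair_HFA", and
split_ingredients("SMILES_1 and SMILES_2", "entry_a") returns the pairs
("SMILES_1", "entry_a_1") and ("SMILES_2", "entry_a_2").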

This commit preserves the .csv underlying the listing in the state
it was left in yesterday, 2020-04-29 (YYYY-MM-DD).
nbehrnd committed Apr 30, 2020
1 parent f7bc035 commit f534eca
Showing 1 changed file with 514 additions and 0 deletions.