ChEMBL relational database ER diagram #1306
Comments
Before I explain any specific normalization issue with the ChEMBL ER diagram, I will define a few terms for clarification. Normalization: First Normal Form (1NF): 2NF: 3NF: 4NF: 5NF:
Uniqueness of Rows: We have a table called Winner with two columns (FirstName and LastName) and the following records: Now, to make each row unique, we add another column called WinnerID as an integer and make it the primary key. 1, Michael, Jordan Now, if the winner number is 6 and we call the winner "Michael Jordan", the question is: which "Michael Jordan"? WinnerID is a proxy unique key that has nothing to do with a real person, and it should not be created before a set of non-proxy attributes has been defined as a unique key (candidate key).
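The point about proxy keys can be sketched with a small in-memory SQLite database. This is only an illustration under stated assumptions: the row with WinnerID 6 is hypothetical, the table name `WinnerKeyed` is invented for the comparison, and (FirstName, LastName) stands in for whatever set of real-world attributes would actually identify a person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Proxy key only: nothing stops the same person from being stored twice.
conn.execute(
    "CREATE TABLE Winner ("
    " WinnerID INTEGER PRIMARY KEY,"
    " FirstName TEXT NOT NULL,"
    " LastName TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO Winner VALUES (?, ?, ?)",
    [(1, "Michael", "Jordan"), (6, "Michael", "Jordan")],  # row 6 is hypothetical
)
same_name_rows = conn.execute(
    "SELECT COUNT(*) FROM Winner WHERE FirstName = ? AND LastName = ?",
    ("Michael", "Jordan"),
).fetchone()[0]

# Candidate key declared first (assumed here to be FirstName + LastName),
# with the proxy key added on top of it.
conn.execute(
    "CREATE TABLE WinnerKeyed ("
    " WinnerID INTEGER PRIMARY KEY,"
    " FirstName TEXT NOT NULL,"
    " LastName TEXT NOT NULL,"
    " UNIQUE (FirstName, LastName))"
)
conn.execute("INSERT INTO WinnerKeyed VALUES (1, 'Michael', 'Jordan')")
try:
    conn.execute("INSERT INTO WinnerKeyed VALUES (6, 'Michael', 'Jordan')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # The UNIQUE constraint on the candidate key blocks the second row.
    duplicate_rejected = True

print(same_name_rows, duplicate_rejected)
```

In the first table the integer key makes every row formally distinct, yet the two "Michael Jordan" rows cannot be told apart by any real attribute. In the second table the database itself enforces the candidate key, so the proxy key is safe to use as a shorthand.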
Problem: No unique key(s)
Table: target_dictionary
Proxy Columns:
A combined key of the following columns is still not unique:
Conclusion: Unfortunately, many more tables in the ChEMBL database have the same issue. Only after we have proved that all records in the target_dictionary table are unique are we allowed to use "tid" or "chembl_id" as a proxy primary key anywhere in the ER diagram.
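One way to test whether a set of columns forms a unique key is to group by those columns and look for groups with more than one row. The sketch below is not run against ChEMBL: it uses a toy in-memory table, the helper name `duplicate_groups` is hypothetical, the rows are invented, and the non-proxy columns `pref_name` and `organism` are placeholders for illustration, not the column list from this comment.

```python
import sqlite3

def duplicate_groups(conn, table, columns):
    """Return (values..., count) for every column combination that occurs more than once."""
    cols = ", ".join(f'"{c}"' for c in columns)
    sql = (
        f'SELECT {cols}, COUNT(*) AS n FROM "{table}" '
        f"GROUP BY {cols} HAVING COUNT(*) > 1"
    )
    return conn.execute(sql).fetchall()

# Toy stand-in for target_dictionary; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target_dictionary ("
    " tid INTEGER PRIMARY KEY, pref_name TEXT, organism TEXT)"
)
conn.executemany(
    "INSERT INTO target_dictionary VALUES (?, ?, ?)",
    [
        (1, "Example target", "Homo sapiens"),
        (2, "Example target", "Homo sapiens"),
        (3, "Other target", "Mus musculus"),
    ],
)

by_proxy = duplicate_groups(conn, "target_dictionary", ["tid"])
by_attributes = duplicate_groups(conn, "target_dictionary", ["pref_name", "organism"])
print(by_proxy)
print(by_attributes)
```

An empty result for the proxy column alone says nothing about real uniqueness; the check only becomes meaningful when it is run on the non-proxy columns proposed as the candidate key.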
Why is there a table with only one column in a relational database?
Hello,
Unfortunately, the ER diagram does not follow the Normal Form rules and has many errors. This means the result sets from JOIN queries are not reliable.
Best,
Saed