Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drop pandas requirement? #179

Closed
speleo3 opened this issue Mar 3, 2021 · 3 comments · Fixed by #180
Closed

drop pandas requirement? #179

speleo3 opened this issue Mar 3, 2021 · 3 comments · Fixed by #180

Comments

@speleo3
Copy link
Contributor

speleo3 commented Mar 3, 2021

pdb2pqr uses pandas in only one place for a pretty trivial task: Pass a table from run_propka() to non_trivial() only to extract two columns. The code could easily be changed to:

--- a/pdb2pqr/main.py
+++ b/pdb2pqr/main.py
@@ -561,8 +560,7 @@ def run_propka(args, biomolecule):
         else:
             row_dict["coupled_group"] = None
         rows.append(row_dict)
-    df = pandas.DataFrame(rows)
-    return df, pka_str
+    return rows, pka_str
 
 
 def non_trivial(args, biomolecule, ligand, definition, is_cif):
@@ -621,7 +619,7 @@ def non_trivial(args, biomolecule, ligand, definition, is_cif):
             biomolecule.apply_pka_values(
                 forcefield_.name,
                 args.ph,
-                dict(zip(pka_df.group_label, pka_df.pKa)),
+                dict((row['group_label'], row['pKa']) for row in pka_df),
             )
         _LOGGER.info("Adding hydrogens to biomolecule.")
         biomolecule.add_hydrogens()

With this change, we could drop pandas from install_requires and move it to tests_require (it's used for a few tests).

I like to keep run dependencies lightweight when possible, and dropping pandas would mean dropping 10+MB when packaging pdb2pqr with PyMOL.

If run_propka() is considered part of the public API, then this would be a breaking change. Although compatible code could be written like this:

pka_list_or_df, pka_str = run_propka(args, biomolecule)
pka_df = pandas.DataFrame(pka_list_or_df)
@sobolevnrm
Copy link
Member

That makes sense to me. I can try to work on it this weekend unless you or @intendo want to submit a PR sooner. Thanks!

speleo3 added a commit to speleo3/pdb2pqr that referenced this issue Mar 3, 2021
@speleo3
Copy link
Contributor Author

speleo3 commented Mar 3, 2021

I already had the patch ready to be uploaded :-)
PR #180

@sobolevnrm
Copy link
Member

Nice; thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants