Skip to content
This repository has been archived by the owner on Nov 28, 2023. It is now read-only.

debug deeprank for models with no feature value #242

Merged
merged 26 commits into from
Nov 17, 2021
Merged

debug deeprank for models with no feature value #242

merged 26 commits into from
Nov 17, 2021

Conversation

manonreau
Copy link
Contributor

@manonreau manonreau commented Nov 9, 2021

  1. Compute grid_shape only if it is not provided as an input

  2. Compute the feature_mean if self.clip_features == True
    feature_mean was computed when self.normalize_features == True, this is not required anymore.

  3. Format logger message in the compute_norm function

  4. Allow feature clipping in the _clip_feature function only if values exists for that feature

  5. Add a condition not to transform features values when they correspond to an empty vector in the mapping process

  6. set save_hit_rate as False by default since it call the IRMSD target that may not be used by the users

  7. Added plots as optional in the test() function since the users, in principle, have no target information for the test set, excepted in benchmark conditions

@manonreau manonreau requested review from CunliangGeng and NicoRenaud and removed request for CunliangGeng November 9, 2021 19:23
@manonreau manonreau mentioned this pull request Nov 10, 2021
@coveralls
Copy link

coveralls commented Nov 11, 2021

Pull Request Test Coverage Report for Build 1468853462

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 77.12%

Totals Coverage Status
Change from base Build 1094124503: 0.0%
Covered Lines: 1628
Relevant Lines: 2111

💛 - Coveralls

Copy link
Member

@NicoRenaud NicoRenaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have left a few comments but overall it looks good :) Thanks !

deeprank/learn/DataSet.py Outdated Show resolved Hide resolved
deeprank/learn/DataSet.py Outdated Show resolved Hide resolved
deeprank/learn/NeuralNet.py Outdated Show resolved Hide resolved
maxv = self.feature_mean[ic] + w * self.feature_std[ic]
if minv != maxv:
feature[ic] = np.clip(feature[ic], minv, maxv)
#feature[ic] = self._mad_based_outliers(feature[ic],minv,maxv)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is that line commented ? should we simply remove it ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea, this line was already commented in the code

@CunliangGeng CunliangGeng self-assigned this Nov 15, 2021
compute the norm values if `self.clip_features` exists
compute the norm values if `self.clip_features` exists
compute the norm values if `self.clip_features` exists
Remove the benchmarking mode:
The prediction/benchmarking mode can be detected by simply checking the presence of target values for a given input data set when `plot == True`

Added the different required controls

Note that the hitrate function is only adapted for IRMSD input, should be modified
@manonreau
Copy link
Contributor Author

I have left a few comments but overall it looks good :) Thanks !

I modified the NeuralNet.py class to skip the input data set in the plotting function if no target is provided. This way no benchmarking mode is required.

Modify Hitrate so that it can handle any type of target values (not limited to irmsd anymore)

targ = self.data[l]['targets'].flatten()
try:
targ = self.data[l]['targets']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to comfirm: the targets data does not need to be flatten(), right? I see the old code is using targ = self.data[l]['targets'].flatten()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch ! It must be flattened

for fname, mol in data['mol']:

f5 = h5py.File(fname, 'r')
irmsd.append(f5[mol + '/targets/IRMSD'][()])
targets.append(f5[mol + f'/targets/self.data_set.select_target'][()])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to change f'/targets/self.data_set.select_target' to f'/targets/{self.data_set.select_target}'

Copy link
Member

@CunliangGeng CunliangGeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @manonreau, the changes look good. I left a few comments for you to check.

manonreau and others added 10 commits November 16, 2021 11:40
1. Change field self.grid_shape to self._grid_shape
2. Update DataSet class docstring for grid_info
3. Update get_grid_shape method
1. Rename target_thr to hit_cutoff
2. Update hit_cutoff docstring
3. Replace print with logger
4. Shorten long lines
@CunliangGeng
Copy link
Member

Hi @manonreau I pushed two commits (since it's not easy to comment all the details), take a look please :-)

Copy link
Contributor Author

@manonreau manonreau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CunliangGeng, everything looks fine, I will merge that PR once you approve it

deeprank/learn/DataSet.py Outdated Show resolved Hide resolved
deeprank/learn/DataSet.py Outdated Show resolved Hide resolved

targ = self.data[l]['targets'].flatten()
try:
targ = self.data[l]['targets']
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch ! It must be flattened

@CunliangGeng CunliangGeng mentioned this pull request Nov 17, 2021
2 tasks
Copy link
Member

@CunliangGeng CunliangGeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @manonreau, I approve now.

@CunliangGeng
Copy link
Member

BTW, please also close the related issues after merging the PR :-)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants