[RF][PyROOT] New functions for conversion between RooDataHist and NumPy arrays #8784

Merged: guitargeek merged 4 commits into root-project:master from roofit_numpy on Jan 14, 2022

Contributor

@guitargeek guitargeek commented Jul 30, 2021

This PR introduces new PyROOT features for converting between RooDataHist and NumPy arrays, following up on a PR that already introduced similar functionality for RooDataSet:
#9346

The new methods are (with checkmarks if they are already implemented in this PR):

  • RooDataHist.to_numpy()
  • RooDataHist.from_numpy()

These new methods are also advertised in the release notes, and the existing rf409_NumPyPandasToRooFit.py tutorial is extended to also explain the functionality introduced in this PR.

Note that any new Python functions prefixed with an underscore are not meant to be part of the public stable user interface, which is why they are not advertised in the release notes and also don't have docstrings.
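
As a rough illustration of the kind of NumPy input involved, here is a minimal sketch that builds per-bin weight sums and per-bin sums of squared weights with plain NumPy. It assumes these are natural candidates for the hist_weights and weights_squared_sum arguments of from_numpy; the exact form the method expects is not spelled out in this conversation, and the sample data, bin count, and range below are made up for the example.

```python
import numpy as np

# Made-up weighted sample (illustration only, not taken from the PR).
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
w = rng.uniform(0.5, 1.5, size=1000)

n_bins = 20
x_range = (-4.0, 4.0)

# Sum of weights per bin: assumed to play the role of hist_weights.
hist_weights, edges = np.histogram(x, bins=n_bins, range=x_range, weights=w)

# Sum of squared weights per bin: assumed to play the role of
# weights_squared_sum, which is needed for correct errors on weighted data.
weights_squared_sum, _ = np.histogram(x, bins=n_bins, range=x_range, weights=w**2)

print(hist_weights.shape, edges.shape, weights_squared_sum.shape)
```

Both arrays have one entry per bin, while the edges array has one more entry than there are bins.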

@guitargeek guitargeek self-assigned this Jul 30, 2021
@root-project root-project deleted a comment from phsft-bot Aug 12, 2021
@guitargeek guitargeek force-pushed the roofit_numpy branch 4 times, most recently from 7c99dc8 to 6118536 Compare November 28, 2021 21:08
@root-project root-project deleted a comment from lgtm-com bot Nov 28, 2021
@root-project root-project deleted a comment from phsft-bot Nov 28, 2021

@property
def shape(self):
    import ROOT
Contributor

Can we remove this import, since it is not used?

Contributor Author

Thanks! You're right.

return self._to_array(self.sumW2Array())

@staticmethod
def from_numpy(hist_weights, variables, bins, ranges=None, weights_squared_sum=None, name=None, title=None):
Contributor

Should we name it from_numpy and to_numpy or FromNumpy and ToNumpy to comply with the C++ naming (and also other non-C++ cases such as RDataFrame.AsNumpy)?

Contributor Author

Good point, but let me argue why I'd rather stick with from_numpy and to_numpy.

By the way, RooFit follows camelCase, unlike the rest of ROOT. So if we try to comply with the C++ names, it would rather be toNumpy and fromNumpy.

def _to_array(self, buffer):
    import numpy as np

    # check if buffer is nullptr (for some reason comparing with ROOT.nullptr doesn't work)
Contributor

What's the type of buffer here?

Contributor

Perhaps adding some docs would be good too.

Contributor Author

The function is meant to create a numpy array from a double *. The type of buffer is usually a cppyy.LowLevelView, which is what you get when you call a C++ function that returns a double *. I'll write some docs!

Contributor

@etejedor etejedor Jan 14, 2022

OK, thanks. My question was more along these lines: if a cppyy.LowLevelView points to null, is it not equal to ROOT.nullptr?

Contributor

I just checked that, and indeed:

>>> ROOT.gInterpreter.Declare("double * foo() { return nullptr; }")
True
>>> buffer = ROOT.foo()
>>> buffer == ROOT.nullptr
False

but what you can do is use the boolean value of the cppyy.LowLevelView:

if not buffer: return None
...
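
For readers without a ROOT installation, the same truthiness pattern can be sketched with the standard ctypes module as a stand-in for cppyy.LowLevelView. This is only an analogy: the to_array helper below is hypothetical and is not the _to_array method from this PR, and a ctypes pointer is not the type PyROOT returns. It relies only on the documented fact that a null ctypes pointer evaluates to False.

```python
import ctypes
import numpy as np


def to_array(buffer, shape):
    """Copy a C double buffer into a NumPy array, or return None for a null pointer.

    Hypothetical helper: `buffer` is a ctypes POINTER(c_double) standing in
    for the cppyy.LowLevelView discussed above.
    """
    # The boolean value of the pointer is False when it is null.
    if not buffer:
        return None
    n = int(np.prod(shape))
    flat = np.ctypeslib.as_array(buffer, shape=(n,))
    # Copy so the result does not depend on the lifetime of the C buffer.
    return flat.reshape(shape).copy()


# A valid buffer of six doubles and a null pointer of the same type.
data = (ctypes.c_double * 6)(*range(6))
ptr = ctypes.cast(data, ctypes.POINTER(ctypes.c_double))
null_ptr = ctypes.POINTER(ctypes.c_double)()

arr = to_array(ptr, (2, 3))
print(arr)
print(to_array(null_ptr, (2, 3)))
```

The null check comes first so that no memory is ever read through a null pointer.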

Contributor Author

Ah! Nice, I will do the boolean check then, as it's much cleaner 👍

return a.reshape(self.shape)

def _var_is_category(self):
    """ """
Contributor

Do we want some docs here?

Contributor Author

Sure! Indeed it looked like I forgot that...

@phsft-bot
Collaborator

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Collaborator

Build failed on ROOT-debian10-i386/cxx14.
Running on pcepsft10.dyndns.cern.ch:/build/workspace/root-pullrequests-build
See console output.

@phsft-bot
Collaborator

Build failed on ROOT-performance-centos8-multicore/default.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

@guitargeek
Contributor Author

I think the CI failing is unrelated to the changes in this PR and should be fixed by #8784.

@phsft-bot
Collaborator

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Collaborator

Build failed on ROOT-debian10-i386/cxx14.
Running on pcepsft10.dyndns.cern.ch:/build/workspace/root-pullrequests-build
See console output.

@phsft-bot
Collaborator

Build failed on ROOT-performance-centos8-multicore/default.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Member

@lmoneta lmoneta left a comment

LGTM!
Thank you Jonas for this nice improvement

@phsft-bot
Collaborator

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@guitargeek
Contributor Author

Autosquashed fixup commits.

@phsft-bot
Collaborator

Build failed on ROOT-performance-centos8-multicore/default.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Warnings:

  • [2022-01-14T14:09:58.858Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/mathtext/src/table/adobeglyphlist.h:128:44: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
  • [2022-01-14T14:09:58.858Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/mathtext/src/table/adobeglyphlist.h:128:55: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
  • [2022-01-14T14:15:45.572Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/postscript/src/AdobeGlyphList.h:110:44: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
  • [2022-01-14T14:15:45.572Z] /data/sftnight/workspace/root-pullrequests-build/root/graf2d/postscript/src/AdobeGlyphList.h:110:55: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]

@guitargeek guitargeek merged commit 6394ef5 into root-project:master Jan 14, 2022
@guitargeek guitargeek deleted the roofit_numpy branch January 14, 2022 16:58
@phsft-bot
Collaborator

Build failed on windows10/cxx14.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Errors:

  • [2022-01-14T18:54:46.963Z] CMake Error at C:/build/workspace/root-pullrequests-build/rootspi/jenkins/root-build.cmake:1064 (message):
