feat(add): numpy ndarray--template-less init #255

Khushiyant · 2024-09-10T15:59:56Z

Previously, the following example raised an error:

Example

from cppyy.gbl.std import vector
import numpy as np

def test():
    arr = np.random.randint(0, 10, size=(2,3,4))
    v1 = vector(arr)
    print(v1)
test()

Error:

Traceback (most recent call last):
  File "/workspaces/codespaces-blank/test.py", line 12, in <module>
    test()
  File "/workspaces/codespaces-blank/test.py", line 9, in test
    v1 = vector(arr)
         ^^^^^^^^^^^
  File "/workspaces/codespaces-blank/.venv/lib/python3.12/site-packages/cppyy/_cpython_cppyy.py", line 117, in __call__
    if args0 and (type(args0) is tuple or type(args0) is list):
       ^^^^^
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Hence, it was not possible to initialise a vector from a numpy array without specifying both the vector template arguments and the numpy array dtype.
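The traceback points at the condition `if args0 and (...)`, which calls `bool()` on the argument before looking at its type. A minimal sketch of why that fails and how reordering the check avoids it, without depending on numpy: `AmbiguousTruth` is a hypothetical stand-in for `ndarray`, and both helper functions are illustrative names, not cppyy code.

```python
class AmbiguousTruth:
    """Hypothetical stand-in for numpy.ndarray: bool() raises for >1 element."""

    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def __bool__(self):
        if len(self.values) > 1:
            raise ValueError(
                "The truth value of an array with more than one element is ambiguous."
            )
        return bool(self.values and self.values[0])


def is_nonempty_sequence_unsafe(args0):
    # Mirrors the failing condition: `args0 and ...` evaluates bool(args0) first.
    return bool(args0 and (type(args0) is tuple or type(args0) is list))


def is_nonempty_sequence_safe(args0):
    # Check the type first, so the emptiness test only runs on a tuple or list.
    return (type(args0) is tuple or type(args0) is list) and len(args0) > 0
```

With the type test first, an array-like argument is rejected by this branch without its truth value ever being evaluated.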

Current support

from cppyy.gbl.std import vector
import numpy as np

def test():
    arr = np.random.randint(0, 10, size=(2,3,4,2), dtype=np.int32)
    v1 = vector[vector[vector[vector[int]]]](arr)
    print(v1)
test()

Spelling out the nested vector templates at the time of writing the code is not feasible, especially when working with various ML backends.

CPyCppyy#28 was the C++-side implementation of this support, but implementing it in the cppyy frontend is the better approach.

tests: add template and dtpye less tests chore: remove cppyy import chore: remove numpy--import recover: deleted valgrind--file

aaronj0 · 2024-09-11T16:17:50Z

etc/valgrind-cppyy-cling.supp

@@ -755,4 +755,4 @@
   fun:_rl_init_terminal_io
   obj:/lib/libreadline.so.4.3
   fun:rl_initialize
-}
+}


Why is this changed?

aaronj0 · 2024-09-11T16:27:34Z

python/cppyy/_cpython_cppyy.py

+            ):
+                t = args0.dtype.type.__name__
+                if t.startswith("int"):
+                    t = "int"


This is incorrect: float32 should map to a float type. cppyy already seems to do this at a later stage, in the PyBuffer interface for numpy arrays. Why do we translate types here?

aaronj0 · 2024-09-11T17:26:22Z

python/cppyy/_cpython_cppyy.py

+                and type(args0).__name__ == "ndarray"
+                and hasattr(args0, "dtype")
+            ):
+                t = args0.dtype.type.__name__


If I understand correctly, this makes dtype optional when the template is not passed, but required when it is. Those are two independent things.

Whether or not you specify dtype, the array always has one: either the default or one deduced from the values. Current support always requires specifying a template type, and hence a numpy array of a compatible type, which is the expected behavior.

This works:

a = np.array([1, 2, 3], dtype=np.int32)
b = vector[int](a)

This doesn't:

a = np.array([1, 2, 3], dtype=np.int32)
b = vector(a)

Now, the TypeError you are trying to solve is not the result of a missing dtype. I believe you are using something like:

a = np.array([1, 2, 3])
b = vector['int'](a)

and taking the resulting TypeError as an example.

But this has nothing to do with dtype. The TypeError we were seeing (in our meeting with @vgvassilev) occurs because numpy defaults to np.int64 in the absence of dtype, which requires long in the template argument. This is an expected error and can be fixed with:

b = vector['long'](a)
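The `long` fix relies on `long` being 8 bytes wide, which holds on LP64 platforms such as Linux and macOS but not on 64-bit Windows, where `long` is 4 bytes. A small sketch of how one might pick an 8-byte C++ integer name per platform using only `ctypes`; `cpp_name_for_int64` is a hypothetical helper, and whether cppyy accepts the returned name for an int64 buffer is an assumption not confirmed in this thread.

```python
import ctypes


def cpp_name_for_int64():
    """Return a C++ integer type name that is 8 bytes wide on this platform."""
    # On LP64 systems `long` is 8 bytes; otherwise fall back to `long long`.
    if ctypes.sizeof(ctypes.c_long) == 8:
        return "long"
    return "long long"
```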

What I would suggest here is that if we want to drop the requirement for template args for std.vector (which is a question for @wlav) then we need a robust mapping from the dtype to equivalent C++ types, in order for something like:

a = np.array([1, 2, 3])
b = vector(a)

to work. In my opinion, this mix-up of type requirements should be dropped: the current support enforces the most type safety, whereas dropping the template can lead to failures when passing a vector of objects, in which case you have to specify the class type as a template parameter for overload selection to work.
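As an illustration of the dtype-to-C++ mapping discussed here, a minimal sketch keyed on the names that `ndarray.dtype.name` reports. The table `DTYPE_TO_CPP`, the helper `cpp_type_for_dtype`, and the choice of fixed-width `std::` typedefs are all assumptions for illustration, not what the PR or cppyy implements.

```python
# Assumed mapping from numpy dtype names to C++ type names.
# Fixed-width typedefs avoid the platform-dependent width of `long`.
DTYPE_TO_CPP = {
    "bool": "bool",
    "int8": "std::int8_t",
    "int16": "std::int16_t",
    "int32": "std::int32_t",
    "int64": "std::int64_t",
    "uint8": "std::uint8_t",
    "uint16": "std::uint16_t",
    "uint32": "std::uint32_t",
    "uint64": "std::uint64_t",
    "float32": "float",   # assumes IEEE 754 single precision
    "float64": "double",  # assumes IEEE 754 double precision
}


def cpp_type_for_dtype(dtype_name):
    """Look up the C++ element type for a dtype name, failing loudly if unknown."""
    try:
        return DTYPE_TO_CPP[dtype_name]
    except KeyError:
        raise TypeError(
            f"no C++ equivalent known for dtype {dtype_name!r}"
        ) from None
```

Raising on unknown dtypes, rather than guessing, keeps the failure mode close to the type safety of the explicit-template path.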

The rest of the PR looks good in terms of removing the requirement to nest template types for nested std::vectors initialized from multidimensional numpy arrays.
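For reference, deriving the nested template from the number of dimensions can be sketched as plain string construction. `nested_vector_name` is a hypothetical helper that only builds the C++ type name; how the PR actually instantiates the nested vector is not shown here.

```python
def nested_vector_name(element_type, ndim):
    """Build the C++ name of an ndim-deep nested std::vector of element_type."""
    if ndim < 1:
        raise ValueError("ndim must be at least 1")
    name = element_type
    # Wrap the element type once per array dimension, innermost first.
    for _ in range(ndim):
        name = f"std::vector<{name}>"
    return name
```

For a 2-dimensional array of `int`, this yields `std::vector<std::vector<int>>`, so the caller only needs to supply the element type and `arr.ndim`.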

wlav · 2024-10-22

> What I would suggest here is that if we want to drop the requirement for template args for std.vector

Not requiring template args for vector came about b/c since C++17 it's legal not to do that in C++. That is, something like this "auto v = std::vector({1, 2, 3});" is legal C++ and the argument was that Python should be at least as simple. (It isn't really, b/c C++ literals are a richer type set than Python literals, but resolving basic cases was/is simple enough.)

Khushiyant added 2 commits September 10, 2024 21:19

feat(add): numpy ndarray--template-less init

c287260

chore: add np bulk insert, dict lookup

eaf4ff3

Khushiyant mentioned this pull request Sep 11, 2024

feat: modify VectorInit to handle numpy multi dimensional arrays wlav/CPyCppyy#28

Closed

Khushiyant force-pushed the np_args branch from 3979f9d to dafc18f Compare September 11, 2024 16:24

chore: remove imports--numpy, cppyy

8f5d781

tests: add template and dtpye less tests chore: remove cppyy import chore: remove numpy--import recover: deleted valgrind--file

Khushiyant force-pushed the np_args branch from dafc18f to 8f5d781 Compare September 11, 2024 16:27

aaronj0 reviewed Oct 22, 2024


chore: remove type indeference, add optimisation steps

8ea5d7c

Khushiyant requested a review from aaronj0 October 22, 2024 18:21

Khushiyant added 4 commits October 24, 2024 23:01

chore: remove contigous array logic

9750bab

test: add non-int test

7ec37a0

chore: add explicit init--double, long

b47ebc1

chore: add test for multiple dtype

1961a85

wlav Oct 22, 2024
