Add implementation of `dpnp.unique` #1972

antonwolfy · 2024-08-06T12:01:59Z

This PR adds an implementation of the dpnp.unique function.

When axis is None, the implementation relies on the dpctl.tensor implementation; otherwise it is implemented through Python calls. The functionality is covered by new tests and by newly enabled third-party tests.
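
The two code paths described above can be pictured with a toy pure-Python sketch. It is not the dpnp implementation: `unique_sketch` is a hypothetical name, nested lists stand in for arrays, only axis=None and axis=0 are covered, and NaN values are not handled.

```python
def unique_sketch(data, axis=None):
    """Toy unique over a list of equal-length rows (no NaN handling)."""
    if axis is None:
        # Flattened case: deduplicate individual elements.
        return sorted({x for row in data for x in row})
    if axis != 0:
        raise NotImplementedError("the sketch only covers axis=None and axis=0")
    # Axis case: treat each row as one item and deduplicate whole rows.
    return [list(row) for row in sorted({tuple(row) for row in data})]
```

With `data = [[1, 0, 0], [1, 0, 0], [2, 3, 4]]`, the flattened call deduplicates scalars, while `axis=0` deduplicates whole rows.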

Have you provided a meaningful PR description?
Have you added a test, reproducer or referred to issue with a reproducer?
Have you tested your changes locally for CPU and GPU devices?
Have you made sure that new changes do not introduce compiler warnings?
Have you checked performance impact of proposed changes?
If this PR is a work in progress, are you filing the PR as a draft?

github-actions · 2024-08-06T12:41:31Z

View rendered docs @ https://intelpython.github.io/dpnp/index.html

vtavana

When axis is given, NumPy lists all rows containing NaN at the bottom, while dpnp does not.

import dpnp, numpy
import numpy as np

a = numpy.array([[1, 0, 0], [1, 0, 0], [np.nan, np.nan, np.nan], [2, 3, 4], [1, 0, 1], [np.nan, np.nan, np.nan]])
numpy.unique(a, axis=0)
# array([[ 1.,  0.,  0.],
#       [ 1.,  0.,  1.],
#       [ 2.,  3.,  4.],
#       [nan, nan, nan],
#       [nan, nan, nan]])

dpnp.unique(dpnp.asarray(a), axis=0)
# array([[ 1.,  0.,  0.],
#       [nan, nan, nan],
#       [ 1.,  0.,  1.],
#       [ 2.,  3.,  4.],
#       [nan, nan, nan]])
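
The ordering NumPy produces above can be reproduced in a minimal pure-Python sketch, assuming rows are compared lexicographically and NaN is treated as larger than any ordinary number. The helper names `nan_last_key` and `unique_rows_nan_last` are hypothetical and are not part of dpnp or NumPy.

```python
import math


def nan_last_key(row):
    """Sort key that orders NaN after every ordinary number."""
    # Each element becomes (is_nan, value); NaN maps to (True, 0.0) so that
    # tuple comparison never has to compare NaN itself.
    return tuple((math.isnan(x), 0.0 if math.isnan(x) else x) for x in row)


def unique_rows_nan_last(rows):
    """Return sorted unique rows; rows containing NaN stay distinct."""
    result = []
    for row in sorted(rows, key=nan_last_key):
        # Element-wise == is used on purpose: NaN != NaN, so rows that
        # contain NaN are never merged with one another.
        if not result or not all(a == b for a, b in zip(result[-1], row)):
            result.append(row)
    return result
```

Applied to the six rows of the example, the sketch returns the three ordinary rows in sorted order followed by both NaN rows, matching the NumPy output shown above.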

In addition, equal_nan=True does not work as expected when axis is given, for both NumPy and dpnp. Is that the intended behavior?

import dpnp, numpy
import numpy as np

a = numpy.array([[1, 0, 0], [1, 0, 0], [np.nan, np.nan, np.nan], [2, 3, 4], [1, 0, 1], [np.nan, np.nan, np.nan]])
numpy.unique(a, axis=0, equal_nan=True)
# array([[ 1.,  0.,  0.],
#       [ 1.,  0.,  1.],
#       [ 2.,  3.,  4.],
#       [nan, nan, nan],
#       [nan, nan, nan]])

dpnp.unique(dpnp.asarray(a), axis=0, equal_nan=True)
# array([[ 1.,  0.,  0.],
#       [nan, nan, nan],
#       [ 1.,  0.,  1.],
#       [ 2.,  3.,  4.],
#       [nan, nan, nan]])
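
For comparison, here is a minimal pure-Python sketch of one possible reading of equal_nan=True for whole rows, in which rows whose NaNs sit in the same positions are merged. This is an assumption about the expected semantics, not the behavior either library shows above; `canonical` and `unique_rows_equal_nan` are hypothetical names.

```python
import math


def canonical(row):
    """Map a row to a hashable, orderable form in which all NaNs are equal."""
    return tuple((math.isnan(x), 0.0 if math.isnan(x) else x) for x in row)


def unique_rows_equal_nan(rows):
    """Return sorted unique rows, merging rows whose NaNs line up."""
    first_seen = {}
    for row in rows:
        # setdefault keeps the first row seen for each canonical form.
        first_seen.setdefault(canonical(row), row)
    # Sorting the canonical keys places rows containing NaN last.
    return [first_seen[key] for key in sorted(first_seen)]
```

Under this reading, the six example rows collapse to four: the three ordinary rows followed by a single all-NaN row.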

dpnp/dpnp_iface_manipulation.py

antonwolfy · 2024-08-16T09:37:08Z

Thank you for noticing that.
I will address this comment in a separate PR, if you don't mind.

antonwolfy added 2 commits July 18, 2024 14:02

Implement dpnp.unique()

0475aad

Remove TODO since resolved by dpctl

29887c4

antonwolfy self-assigned this Aug 6, 2024

antonwolfy and others added 3 commits August 6, 2024 14:06

Merge branch 'master' into impl_unique

1d90aff

Use dpnp.trim_zeros() call

bbfb2dc

Applied pre-commit hooks

c3295f2

antonwolfy and others added 7 commits August 6, 2024 19:02

Split implementation into a few internal functions

5f85846

Updated third party tests

130ff11

Implement more tests to cover different use cases

3621be2

Add CFD tests

494cf02

Merge branch 'master' into impl_unique

cc3648f

Add a test per every integer dtype

804098d

Update code examples

a0758d5

antonwolfy requested review from npolina4, vlad-perevezentsev and vtavana August 8, 2024 17:30

vtavana reviewed Aug 14, 2024

dpnp/dpnp_iface_manipulation.py (resolved)

dpnp/dpnp_iface_manipulation.py (resolved)

Merge branch 'master' into impl_unique

82cb5ec

antonwolfy merged commit 82cb5ec into master Aug 16, 2024

antonwolfy deleted the impl_unique branch August 16, 2024 09:35

antonwolfy mentioned this pull request Aug 16, 2024

Add proper handling of NaN values in dpnp.unique implementation with axis not None #1989

Merged

6 tasks
