Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python] Calling pyarrow.dataset.ParquetFileFormat.make_write_options as a class method results in a segfault #39440

Closed
nonatomiclabs opened this issue Jan 3, 2024 · 3 comments

Comments

@nonatomiclabs
Copy link

nonatomiclabs commented Jan 3, 2024

Description

I had issues finding good examples of how to use make_write_options and ended up calling it without having instantiated the ParquetFileFormat, which resulted in a segmentation fault:

>>> import pyarrow.dataset
>>> pyarrow.dataset.ParquetFileFormat.make_write_options(42)
[1]    64569 segmentation fault  python

Expected behavior

Even though the way I called the method was nonsense, I'm guessing there could be better error handling than a segfault 🙂

Additional information

  • The issue happens regardless of the parameters passed (as long as there is at least one)
  • Tested on Python 3.11.6 and 3.8.14
  • PyArrow version 14.0.2
  • macOS 14.2 ARM

Component(s)

Python

@raulcd
Copy link
Member

raulcd commented Jan 8, 2024

This probably should be a blocker for the release, right?

@raulcd raulcd added the Priority: Blocker Marks a blocker for the release label Jan 8, 2024
@raulcd raulcd changed the title Calling pyarrow.dataset.ParquetFileFormat.make_write_options as a class method results in a segfault [Python] Calling pyarrow.dataset.ParquetFileFormat.make_write_options as a class method results in a segfault Jan 8, 2024
@jorisvandenbossche
Copy link
Member

Nice to have fixed given that it is a segfault, but it's not new, so I wouldn't say it's a blocker.

We have this issue in general, when calling cython methods as class instead of instance methods, many of them will segfault. For example also:

In [9]: pa.Array.get_total_buffer_size(1)
Segmentation fault (core dumped)

Because we nowhere do any type checking on self and just assume it is the correct type.

Now, in this case of ParquetFileFormat.make_write_options, it's more important I think, because it makes more sense to indeed try to call this thinking it is a static class method.

@raulcd raulcd removed the Priority: Blocker Marks a blocker for the release label Jan 9, 2024
@raulcd raulcd modified the milestones: 15.0.0, 16.0.0 Jan 9, 2024
@AlenkaF AlenkaF self-assigned this Apr 3, 2024
AlenkaF added a commit that referenced this issue Apr 5, 2024
…te_options as a class method results in a segfault (#40976)

### Rationale for this change

Calling `make_write_options()` method as class instead of instance method results in segfault.

### What changes are included in this PR?

Adds a type check on `self` and raises an error if not `ParquetFileFormat`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: #39440

Lead-authored-by: AlenkaF <frim.alenka@gmail.com>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
@AlenkaF
Copy link
Member

AlenkaF commented Apr 5, 2024

Issue resolved by pull request 40976
#40976

@AlenkaF AlenkaF closed this as completed Apr 5, 2024
tolleybot pushed a commit to tmct/arrow that referenced this issue May 2, 2024
…ke_write_options as a class method results in a segfault (apache#40976)

### Rationale for this change

Calling `make_write_options()` method as class instead of instance method results in segfault.

### What changes are included in this PR?

Adds a type check on `self` and raises an error if not `ParquetFileFormat`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#39440

Lead-authored-by: AlenkaF <frim.alenka@gmail.com>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
tolleybot pushed a commit to tmct/arrow that referenced this issue May 4, 2024
…ke_write_options as a class method results in a segfault (apache#40976)

### Rationale for this change

Calling `make_write_options()` method as class instead of instance method results in segfault.

### What changes are included in this PR?

Adds a type check on `self` and raises an error if not `ParquetFileFormat`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#39440

Lead-authored-by: AlenkaF <frim.alenka@gmail.com>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
rok pushed a commit to tmct/arrow that referenced this issue May 8, 2024
…ke_write_options as a class method results in a segfault (apache#40976)

### Rationale for this change

Calling `make_write_options()` method as class instead of instance method results in segfault.

### What changes are included in this PR?

Adds a type check on `self` and raises an error if not `ParquetFileFormat`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#39440

Lead-authored-by: AlenkaF <frim.alenka@gmail.com>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
rok pushed a commit to tmct/arrow that referenced this issue May 8, 2024
…ke_write_options as a class method results in a segfault (apache#40976)

### Rationale for this change

Calling `make_write_options()` method as class instead of instance method results in segfault.

### What changes are included in this PR?

Adds a type check on `self` and raises an error if not `ParquetFileFormat`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#39440

Lead-authored-by: AlenkaF <frim.alenka@gmail.com>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…ke_write_options as a class method results in a segfault (apache#40976)

### Rationale for this change

Calling `make_write_options()` method as class instead of instance method results in segfault.

### What changes are included in this PR?

Adds a type check on `self` and raises an error if not `ParquetFileFormat`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#39440

Lead-authored-by: AlenkaF <frim.alenka@gmail.com>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: AlenkaF <frim.alenka@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants