WIP: Dummy PR to check maint-16.0.0 status #41118

Closed
wants to merge 32 commits into from

Commits on Apr 9, 2024

  1. GH-40799: [Doc][Format] Implementation status page should list canoni…

    …cal extension types (#41053)
    
    ### Rationale for this change
    
    Two specifications and one implementation of canonical extension types were added and this should be documented.
    
    ### What changes are included in this PR?
    
    This represents the current state of canonical extension types.
    
    ### Are these changes tested?
    
    No, docs only.
    
    ### Are there any user-facing changes?
    
    Only insofar as users read the docs.
    * GitHub Issue: #40799
    
    Authored-by: Rok Mihevc <rok@mihevc.org>
    Signed-off-by: Rok Mihevc <rok@mihevc.org>
    rok authored and raulcd committed Apr 9, 2024
    Commit 35afd40
  2. GH-41047: [C#] Address performance issue of reading from StringArray (#…

    …41048)
    
    ### Rationale for this change
    
    The motivation here is to address #41047. There is a severe performance drawback in reading a StringArray as the value array of a DictionaryArray, because of repeated and unnecessary UTF-8 string decoding.
    
    ### What changes are included in this PR?
    
    - Added a new function Materialize() to materialize the values to a list. When materialized, GetString() reads from that list directly.
    - Added test coverage.
    
    ### Are these changes tested?
    
    Yes
    
    ### Are there any user-facing changes?
    
    No. This change maintains backwards compatibility on the API surface. It is up to the client application to decide whether to materialize the array and gain performance. 
    
    * GitHub Issue: #41047
    
    Authored-by: Keshuang Shen <keshen@microsoft.com>
    Signed-off-by: Curt Hagenlocher <curt@hagenlocher.org>
    keshen-msft authored and raulcd committed Apr 9, 2024
    Commit 3a4962d

Commits on Apr 10, 2024

  1. GH-38768: [Python] Empty slicing an array backwards beyond the start …

    …is now empty (#40682)
    
    ### What changes are included in this PR?
    
    `_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).
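
As a rough illustration of why `slice.indices` helps here (a pure-Python sketch, not the actual `_normalize_slice` code; `normalized_range` is a hypothetical helper), the built-in method clamps out-of-range bounds the same way list slicing does, so a backwards slice that starts and stops before index 0 selects nothing:

```python
def normalized_range(length, key):
    """Return the positions selected by `key` on a sequence of `length` items.

    Hypothetical helper for illustration; it only wraps the built-in
    slice.indices() method.
    """
    start, stop, step = key.indices(length)
    return range(start, stop, step)


values = [10, 20, 30, 40, 50]

# Backwards slice whose start and stop both lie before the first element.
beyond_start = slice(-10, -20, -1)
picked = [values[i] for i in normalized_range(len(values), beyond_start)]

# Ordinary backwards slice over the whole sequence, for comparison.
reversed_all = [values[i] for i in normalized_range(len(values), slice(None, None, -1))]
```

Here `picked` matches what `values[-10:-20:-1]` gives for a plain list, which is the behaviour the fix aims for.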
    
    ### Are these changes tested?
    
    Yes.
    
    ### Are there any user-facing changes?
    
    Fixing wrong data returned in an edge case.
    * GitHub Issue: #40642
    * GitHub Issue: #38768
    
    Lead-authored-by: LucasG0 <guillermou.lucas@gmail.com>
    Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
    Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
    2 people authored and raulcd committed Apr 10, 2024
    Commit 1558d3f
  2. GH-40689: [Docs] Add nanoarrow to implementation status page (#41052)

    ### Rationale for this change
    
    nanoarrow now supports most types and reading IPC streams. As discussed in the issue, there is interest in adding it to the implementation status page for visibility.
    
    ### What changes are included in this PR?
    
    A column was added to the table with the appropriate values characterizing the C library implementation status.
    
    ### Are these changes tested?
    
    Not needed (docs!)
    
    ### Are there any user-facing changes?
    
    No (docs!)
    * GitHub Issue: #40689
    
    Lead-authored-by: Dewey Dunnington <dewey@voltrondata.com>
    Co-authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
    Signed-off-by: Dewey Dunnington <dewey@fishandwhistle.net>
    paleolimbot authored and raulcd committed Apr 10, 2024
    Commit 93d2db0
  3. GH-41088: [CI][Crossbow] Fix GitHub Actions workflow syntax error (#4…

    …1091)
    
    ### Rationale for this change
    
    We can't use multiple top-level `env:` in workflow. GH-40949 introduced a top-level `env:` by `macros.github_header()`. It broke workflows that already have top-level `env:`.
    
    ### What changes are included in this PR?
    
    Omit top-level `env:` key and reuse the top-level `env:` key generated by `macros.github_header()` in workflows.
    
    ### Are these changes tested?
    
    Yes.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #41088
    
    Authored-by: Sutou Kouhei <kou@clear-code.com>
    Signed-off-by: Sutou Kouhei <kou@clear-code.com>
    kou authored and raulcd committed Apr 10, 2024
    Commit ef530eb
  4. GH-41043: [CI][Python] check message in test_make_write_options_error…

    … for Cython 2 (#41059)
    
    ### Rationale for this change
    
    `test_make_write_options_error` has been failing on the Cython 2 crossbow build because in older versions of Cython "regular" C extension methods had a type check automatically built in. In Cython 3 that is not the case (see cython/cython#6127), so the check for `ParquetFileFormat` was added in #40976.
    
    ### What changes are included in this PR?
    
    Check the raised error against both messages: the built-in type check and the `ParquetFileFormat` check added in #40976.
    
    ### Are these changes tested?
    
    Yes.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #41043
    
    Authored-by: AlenkaF <frim.alenka@gmail.com>
    Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
    AlenkaF authored and raulcd committed Apr 10, 2024
    Commit e138bdc

Commits on Apr 11, 2024

  1. GH-40801: [Docs] Clarify device identifier documentation in the Arrow…

    … C Device data interface (#41101)
    
    ### Rationale for this change
    
    It is not explicit what the value of the `ArrowDeviceArray::device_id` should be when a given device type has no notion of a device identifier (e.g., there is always only one).
    
    ### What changes are included in this PR?
    
    The text was clarified to recommend a value of -1. This was the value already used by Arrow C++.
    
    ### Are these changes tested?
    
    No tests needed (documentation)
    
    ### Are there any user-facing changes?
    
    No
    * GitHub Issue: #40801
    
    Authored-by: Dewey Dunnington <dewey@voltrondata.com>
    Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
    paleolimbot authored and raulcd committed Apr 11, 2024
    Commit 903be59
  2. GH-40866: [C++][Python] Basic conversion of RecordBatch to Arrow Tens…

    …or - add support for row-major (#40867)
    
    ### Rationale for this change
    
    The conversion from `RecordBatch` to `Tensor` class now exists but it doesn't support row-major `Tensor` as an output. This PR adds support for an option to construct row-major `Tensor`.
    
    ### What changes are included in this PR?
    
    This PR adds a `row_major` option in `RecordBatch::ToTensor` so that row-major `Tensor` can be constructed. The default conversion will be row-major. This for example works:
    
    ```python
    >>> import pyarrow as pa
    >>> import numpy as np
    
    >>> arr1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> arr2 = [10, 20, 30, 40, 50, 60, 70, 80, 90]
    >>> batch = pa.RecordBatch.from_arrays(
    ...     [
    ...         pa.array(arr1, type=pa.uint16()),
    ...         pa.array(arr2, type=pa.int16()),
    ... 
    ...     ], ["a", "b"]
    ... )
    
    # Row-major
    
    >>> batch.to_tensor()
    <pyarrow.Tensor>
    type: int32
    shape: (9, 2)
    strides: (8, 4)
    
    >>> batch.to_tensor().to_numpy().flags
      C_CONTIGUOUS : True
      F_CONTIGUOUS : False
      OWNDATA : False
      WRITEABLE : True
      ALIGNED : True
      WRITEBACKIFCOPY : False
    
    # Column-major
    
    >>> batch.to_tensor(row_major=False)
    <pyarrow.Tensor>
    type: int32
    shape: (9, 2)
    strides: (4, 36)
    
    >>> batch.to_tensor(row_major=False).to_numpy().flags
      C_CONTIGUOUS : False
      F_CONTIGUOUS : True
      OWNDATA : False
      WRITEABLE : True
      ALIGNED : True
      WRITEBACKIFCOPY : False
    ```
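
The strides in the output above follow from the shape and the item size alone. A small pure-Python sketch (the helper name `byte_strides` is made up for this example and is not part of pyarrow) reproduces them for a `(9, 2)` tensor of 4-byte `int32` values:

```python
def byte_strides(shape, itemsize, row_major=True):
    """Compute byte strides for a dense array with the given shape."""
    strides = [0] * len(shape)
    step = itemsize
    # Row-major: the last dimension varies fastest.
    # Column-major: the first dimension varies fastest.
    order = reversed(range(len(shape))) if row_major else range(len(shape))
    for dim in order:
        strides[dim] = step
        step *= shape[dim]
    return tuple(strides)


row_major_strides = byte_strides((9, 2), itemsize=4)
column_major_strides = byte_strides((9, 2), itemsize=4, row_major=False)
```

In the row-major case, moving to the next row skips 2 values (8 bytes). In the column-major case, moving to the next column skips 9 values (36 bytes).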
    
    ### Are these changes tested?
    
    Yes, in C++ and Python.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #40866
    
    Lead-authored-by: AlenkaF <frim.alenka@gmail.com>
    Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
    Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
    Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
    2 people authored and raulcd committed Apr 11, 2024
    Commit 4d3d4ac
  3. GH-41119: [Archery][Packaging][CI] Avoid using --progress flag on Doc…

    …ker on Windows on archery (#41120)
    
    ### Rationale for this change
    
    Windows wheels are currently failing because `ARCHERY_DEBUG=1` is now set by default. With that setting, `--progress` is passed to `docker build`, which is not supported on Windows.
    
    ### What changes are included in this PR?
    
    Do not use `--progress` on the Windows builds.
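
A minimal sketch of the idea, assuming a hypothetical `docker_build_args` helper (this is not archery's actual code, and the `--progress=plain` value is an assumption made for the example):

```python
import sys


def docker_build_args(context_dir, debug, platform=sys.platform):
    """Assemble an argument list for `docker build` (hypothetical helper)."""
    args = ["docker", "build"]
    # Only request plain progress output in debug mode, and never on Windows,
    # where the flag is reported as unsupported.
    if debug and not platform.startswith("win"):
        args.append("--progress=plain")
    args.append(context_dir)
    return args


linux_args = docker_build_args(".", debug=True, platform="linux")
windows_args = docker_build_args(".", debug=True, platform="win32")
```

Passing the platform as a parameter keeps the branch easy to exercise on any host.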
    
    ### Are these changes tested?
    
    Yes on CI via archery.
    
    ### Are there any user-facing changes?
    No
    * GitHub Issue: #41119
    
    Lead-authored-by: Raúl Cumplido <raulcumplido@gmail.com>
    Co-authored-by: Sutou Kouhei <kou@clear-code.com>
    Signed-off-by: Sutou Kouhei <kou@clear-code.com>
    raulcd and kou committed Apr 11, 2024
    Commit b18b1f4
  4. GH-41034: [C++][FS][Azure] Adjust DeleteDir/DeleteDirContents/GetFile…

    …InfoSelector behaviors against Azure for generic filesystem tests (#41068)
    
    ### Rationale for this change
    
    They are failing:
    
    ```text
    [  FAILED  ] TestAzureHierarchicalNSGeneric.DeleteDir
    [  FAILED  ] TestAzureHierarchicalNSGeneric.DeleteDirContents
    [  FAILED  ] TestAzureHierarchicalNSGeneric.GetFileInfoSelector
    ```
    
    ### What changes are included in this PR?
    
    * `DeleteDir()`: Check the not-a-directory case
    * `DeleteDirContents()`: Check the not-a-directory case
    * `GetFileInfoSelector()`:
      * Add a not-a-directory check for the input
      * Add support for returning metadata for a directory
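
A minimal local-filesystem sketch of the not-a-directory check, using Python's `pathlib` rather than the Azure filesystem code touched by this change (the function below is illustrative only and does not mirror Arrow's C++ API):

```python
import tempfile
from pathlib import Path


def delete_dir_contents(path):
    """Remove everything inside `path`, keeping `path` itself."""
    path = Path(path)
    # The "not a directory" case: refuse to treat an existing file as a directory.
    if path.exists() and not path.is_dir():
        raise NotADirectoryError(f"Not a directory: {path}")
    # Reverse-sorted order visits children before their parent directories.
    for child in sorted(path.rglob("*"), reverse=True):
        if child.is_dir():
            child.rmdir()
        else:
            child.unlink()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "dir" / "sub").mkdir(parents=True)
    (root / "dir" / "sub" / "file.txt").write_text("data")

    # Pointing the function at a file must fail instead of deleting anything.
    try:
        delete_dir_contents(root / "dir" / "sub" / "file.txt")
        rejected_file = False
    except NotADirectoryError:
        rejected_file = True

    delete_dir_contents(root / "dir")
    remaining = list((root / "dir").iterdir())
```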
    
    ### Are these changes tested?
    
    Yes.
    
    ### Are there any user-facing changes?
    
    Yes.
    * GitHub Issue: #41034
    
    Authored-by: Sutou Kouhei <kou@clear-code.com>
    Signed-off-by: Sutou Kouhei <kou@clear-code.com>
    kou authored and raulcd committed Apr 11, 2024
    Commit 2853ecb
  5. GH-40695 [C++] Expand Substrait type support (#40696)

    ### Rationale for this change
    
    See #40695 
    
    ### What changes are included in this PR?
    
    This PR does a few things:
    
     * Substrait is upgraded to the latest version
     * Support is added for the parameterized timestamp type (but not literals due to substrait-io/substrait#611).
     * Support is added for the following arrow-specific types:
       * fp16
       * date_millis
       * time_seconds
       * time_millis
       * time_nanos
       * large_string
       * large_binary
    
    When adding support for the new timestamp types I also relaxed the restrictions on the time zone column. Substrait puts time zone information in the function and not the type. In other words, to print the "America/New York" value of a column of instants one would do something like `to_char(my_timestamp, "America/New York")` instead of `to_char(cast(my_timestamp, timestamp("nanos", "America/New York")))`.
    
    However, the current implementation makes it impossible to produce or consume a plan with `to_char(my_timestamp, "America/New York")` because the type would be rejected for having a non-UTC time zone. With this latest change, we treat any non-empty timezone as a `timestamp_tz` type.
    
    In addition, I have enabled conversions from "encoded types" to their unencoded representation. E.g. a type of `DICTIONARY<INT32>` will convert to `INT32`. From a logical expression / plan perspective these encodings are irrelevant. If anything, they may belong in a more physical plan representation. Should a need for them arise we can dig into it more later. However, I believe it is better to err on the side of generating "something" rather than failing in these cases. I don't consider this last change critical and can back it out if need be.
    
    ### Are these changes tested?
    
    Yes, I added new unit tests
    
    ### Are there any user-facing changes?
    
    Yes, via the Substrait conversion.  These changes should be backwards compatible in that they only add functionality in places that previously reported "Not Supported".
    * GitHub Issue: #40695
    
    Lead-authored-by: Weston Pace <weston.pace@gmail.com>
    Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
    Signed-off-by: Weston Pace <weston.pace@gmail.com>
    2 people authored and raulcd committed Apr 11, 2024
    Commit 5330df7
  6. Commit bcde34a
  7. GH-41147: [CI][C++] Use newer LLVM on Ubuntu 24.04 (#41150)

    ### What changes are included in this PR?
    
    Use LLVM 15 on Ubuntu 24.04, as LLVM 14 packages seem not always available.
    
    ### Are these changes tested?
    
    Yes, on CI.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #41147
    
    Authored-by: Antoine Pitrou <antoine@python.org>
    Signed-off-by: Antoine Pitrou <antoine@python.org>
    pitrou authored and raulcd committed Apr 11, 2024
    Commit 84cd493
  8. GH-41145: [R][CI] test-r-dev-duckdb fails installing duckdb (#41152)

    ### Rationale for this change
    
    An error is received installing R duckdb:
    
    ```
    #15 18.13 > remotes::install_github('duckdb/duckdb-r', build = FALSE)
    #15 18.27 Error: Failed to install 'unknown package' from GitHub:
    #15 18.27   Line starting 'Roxyg ...' is malformed!
    ```
    
    Some searching seems to suggest that this is because R cannot process UTF-8 characters in DESCRIPTION files if the `LANG` is set to `C`.
    
    ### What changes are included in this PR?
    
    `LANG` is set to `C.UTF-8` in the Dockerfile for this CI job.
    
    ### Are these changes tested?
    
    The change only affects a test
    
    ### Are there any user-facing changes?
    
    No
    * GitHub Issue: #41145
    
    Authored-by: Weston Pace <weston.pace@gmail.com>
    Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
    westonpace authored and raulcd committed Apr 11, 2024
    Commit a2c5a9a

Commits on Apr 12, 2024

  1. GH-41154: [C++] Fix Valgrind error in string-to-float16 conversion (#…

    …41155)
    
    ### Rationale for this change
    
    Only do the final conversion to float16 on success, to avoid conditional jump on uninitialized values.
    
    Note: this is a benign error. But we want our Valgrind CI job to be as successful as possible.
    
    ### Are these changes tested?
    
    Yes, on CI.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #41154
    
    Authored-by: Antoine Pitrou <antoine@python.org>
    Signed-off-by: Antoine Pitrou <antoine@python.org>
    pitrou authored and raulcd committed Apr 12, 2024
    Commit 0f33954
  2. GH-41127: [CI] Use GitHub Actions instead of Azure Pipelines for dock…

    …er-tests (#41153)
    
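    ### Rationale for this change

    To reduce maintenance cost, we don't want to maintain multiple CI platforms.

    ### What changes are included in this PR?

    Use GitHub Actions for docker-tests.

    ### Are these changes tested?

    Yes.

    ### Are there any user-facing changes?

    No.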
    * GitHub Issue: #41127
    
    Authored-by: Sutou Kouhei <kou@clear-code.com>
    Signed-off-by: Sutou Kouhei <kou@clear-code.com>
    kou authored and raulcd committed Apr 12, 2024
    Commit ba85125
  3. GH-41004: [C++][FS][Azure] Don't run TestGetFileInfoGenerator() with …

    …Valgrind (#41163)
    
    ### Rationale for this change
    
    `GetFileInfo()` with a generator reports a false-positive memory leak in the Azure SDK for C++.
    
    ### What changes are included in this PR?
    
    Don't run `TestGetFileInfoGenerator()` with Valgrind.
    
    ### Are these changes tested?
    
    Yes.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #41004
    
    Authored-by: Sutou Kouhei <kou@clear-code.com>
    Signed-off-by: Sutou Kouhei <kou@clear-code.com>
    kou authored and raulcd committed Apr 12, 2024
    Commit 3856bdd
  4. GH-41124: [CI][C++] Don't use CMake 3.29.1 with vcpkg (#41151)

    ### Rationale for this change
    
    vcpkg doesn't work with CMake 3.29.1.
    
    See also: microsoft/vcpkg#37968
    
    ### What changes are included in this PR?
    
    Use CMake 3.29.0 temporarily.
    
    ### Are these changes tested?
    
    Yes.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #41124
    
    Lead-authored-by: Sutou Kouhei <kou@clear-code.com>
    Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
    Co-authored-by: Jacob Wujciak-Jens <jacob@wujciak.de>
    Signed-off-by: Sutou Kouhei <kou@clear-code.com>
    3 people authored and raulcd committed Apr 12, 2024
    Commit 9bdf082

Commits on Apr 15, 2024

  1. GH-41169: [CI][Release] Specify --build-config explicitly on Windows (#…

    …41178)
    
    ### Rationale for this change
    
    #37821 changed the `add_test()` usage from the old style to the new style:
    
    1a1d2c8?diff=unified&w=1#diff-1ce47eec54afaee769086e1a720c5ed65bc347cd8fc60a233de67fd895dda329L763-R764
    
    MSVC generators are multi-config generators. With the old style, all tests are run without specifying `--build-config` explicitly. With the new style, we need to specify `--build-config` explicitly.
    
    See also: https://cmake.org/cmake/help/latest/command/add_test.html
    
    ### What changes are included in this PR?
    
    Specify `--build-config` explicitly.
    
    ### Are these changes tested?
    
    Yes.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #41169
    
    Authored-by: Sutou Kouhei <kou@clear-code.com>
    Signed-off-by: Sutou Kouhei <kou@clear-code.com>
    kou authored and raulcd committed Apr 15, 2024
    Commit ed06440
  2. GH-41167: [CI][Release][GLib][Conda] Pin gobject-introspection to 1.7…

    …8.1 (#41181)
    
    ### Rationale for this change
    
    GObject Introspection is being merged into GLib. The merge is not complete yet.
    
    GLib-related `.gir` files were moved from GObject Introspection to GLib as of GLib/GObject Introspection 2.80.0. But the glib 2.80.0 conda package doesn't support this merge yet: conda-forge/glib-feedstock#174
    
    ### What changes are included in this PR?
    
    Pin gobject-introspection to 1.78.1 for now. 
    
    ### Are these changes tested?
    
    Yes.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #41167
    
    Authored-by: Sutou Kouhei <kou@clear-code.com>
    Signed-off-by: Sutou Kouhei <kou@clear-code.com>
    kou authored and raulcd committed Apr 15, 2024
    Commit 4ec41de
  3. GH-41176: [C++] Stop defining ARROW_TEST_MEMCHECK in config.h.cmake (#…

    …41177)
    
    ### Rationale for this change
    
    We already have `ARROW_VALGRIND`.
    
    ### What changes are included in this PR?
    
    Remove redundant macro.
    
    ### Are these changes tested?
    
    Yes.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #41176
    
    Authored-by: Sutou Kouhei <kou@clear-code.com>
    Signed-off-by: Sutou Kouhei <kou@clear-code.com>
    kou authored and raulcd committed Apr 15, 2024
    Commit 8c625d7
  4. GH-38010: [Python] Construct pyarrow.Field and ChunkedArray through A…

    …rrow PyCapsule Protocol (#40818)
    
    ### Rationale for this change
    
    See #38010 (comment) for more context. Right now for _consuming_ ArrowSchema-compatible objects that implement the PyCapsule interface, we only have the private `_import_from_c_capsule` (on Schema, Field, DataType) and we check for the protocol in the public `pa.schema(..)`.
    
    But that means you currently can only consume objects that represent the schema of a batch (struct type), and not schemas of individual arrays. 
    
    ### What changes are included in this PR?
    
    Expand the `pa.field(..)` constructor to accept objects implementing the protocol method.
    
    ### Are these changes tested?
    
    TODO
    
    * GitHub Issue: #38010
    
    Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
    Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
    jorisvandenbossche authored and raulcd committed Apr 15, 2024
    Commit 41b99cb
  5. GH-39848: [Python][Packaging] Build pyarrow wheels with numpy RC inst…

    …ead of nightly (#41097)
    
    ### Rationale for this change
    
    Now that NumPy has released a first RC for 2.0, we should update our PyArrow wheels to build with this released version instead of with the nightly numpy packages, to ensure we don't build our release wheels with an unstable numpy version.
    
    ### What changes are included in this PR?
    
    Increased the version requirement for numpy for the installed packages at build time to `numpy>=2.0.0rc1`, to force installing this RC instead of numpy 1.26
    
    ### Are these changes tested?
    
    The wheel tests ensure that those wheels still work with older versions of numpy
    * GitHub Issue: #39848
    
    Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
    Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
    jorisvandenbossche authored and raulcd committed Apr 15, 2024
    Commit 0d1a016
  6. GH-41098: [Python] Add copy keyword in Array.__array__ for numpy 2.0+…

    … compatibility (#41071)
    
    ### Rationale for this change
    
    Adapting for changes in numpy 2.0 as described at https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword and future changes to pass copy=True (numpy/numpy#26208)
    
    ### What changes are included in this PR?
    
    Add a `copy=None` to the signatures of our `__array__` methods.
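
To make the keyword's meaning concrete, here is a toy sketch of the `copy` semantics described in the migration guide: `copy=True` always copies, `copy=None` copies only when needed, and `copy=False` must never copy and raises `ValueError` if a copy would be required. It is not pyarrow's implementation. The `IntColumn` class and its `has_nulls` flag are hypothetical, and it returns a stdlib `array.array` instead of a NumPy array so that it runs without NumPy:

```python
from array import array


class IntColumn:
    """Toy container exposing an `__array__`-style method with a `copy` keyword."""

    def __init__(self, values, has_nulls=False):
        self._data = array("q", values)
        # Assumption for this sketch: data with nulls cannot be shared as-is.
        self._has_nulls = has_nulls

    def __array__(self, dtype=None, copy=None):
        # `dtype` is accepted for signature compatibility but ignored here.
        zero_copy_possible = not self._has_nulls
        if copy is False and not zero_copy_possible:
            # copy=False promises no copy, so fail rather than copy silently.
            raise ValueError("a copy is required but copy=False was passed")
        if copy is True or not zero_copy_possible:
            return array("q", self._data)  # independent copy
        return self._data  # copy=None or copy=False: share the buffer


col = IntColumn([1, 2, 3])
shared = col.__array__()           # copy=None: no copy needed, buffer is shared
copied = col.__array__(copy=True)  # always a new buffer
```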
    
    This does have an impact on user-facing behaviour, though. That is being questioned upstream at numpy/numpy#25941 (comment)
    
    ### Are these changes tested?
    
    Yes
    
    ### Are there any user-facing changes?
    
    No (compared to usage with numpy<2)
    * GitHub Issue: #39532
    * GitHub Issue: #41098
    
    Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
    Signed-off-by: Antoine Pitrou <antoine@python.org>
    jorisvandenbossche authored and raulcd committed Apr 15, 2024
    Commit 0d3af8d
  7. GH-41016: [C++] Fix null count check in BooleanArray.true_count() (#4…

    …1070)
    
    ### Rationale for this change
    
    Loading the `null_count` attribute doesn't take into account the possible value of -1, leading to a code path where the validity buffer is accessed, but which is not necessarily present in that case.
    
    ### What changes are included in this PR?
    
    Use `data->MayHaveNulls()` instead of `data->null_count.load()`
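
The failure mode can be sketched in pure Python. This is a toy model, not Arrow's C++ code: `BoolArrayData` is a made-up stand-in, and it assumes that `MayHaveNulls()` amounts to "the null count is not known to be zero and a validity buffer is present":

```python
from dataclasses import dataclass
from typing import List, Optional

UNKNOWN_NULL_COUNT = -1  # sentinel: the null count has not been computed yet


@dataclass
class BoolArrayData:
    """Toy stand-in for array data; not Arrow's real structure."""

    values: List[bool]
    validity: Optional[List[bool]]  # None means "no validity buffer"
    null_count: int = UNKNOWN_NULL_COUNT

    def may_have_nulls(self):
        # Nulls are only possible if the count is not known to be zero
        # AND a validity buffer is actually present.
        return self.null_count != 0 and self.validity is not None


def true_count(data):
    if data.may_have_nulls():
        # Count only slots that are both valid and true.
        return sum(v and valid for v, valid in zip(data.values, data.validity))
    # No validity buffer to consult: every slot is valid.
    return sum(data.values)


# Unknown null count (-1) and no validity buffer: a check on `null_count != 0`
# alone would try to read the missing buffer here.
unknown = BoolArrayData([True, False, True], validity=None)
with_nulls = BoolArrayData([True, True, False], validity=[True, False, True], null_count=1)
```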
    
    ### Are these changes tested?
    
    Yes
    
    * GitHub Issue: #41016
    
    Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
    Signed-off-by: Antoine Pitrou <antoine@python.org>
    jorisvandenbossche authored and raulcd committed Apr 15, 2024
    Commit 1e40252
  8. GH-41121: [C++] Fix: left anti join filter empty rows. (#41122)

    ### Rationale for this change
    
    Since the left anti filter implementation is based on the left semi filter, and an assertion error occurs when the left semi filter rows are empty, this problem should be fixed.
    
    ### What changes are included in this PR?
    
    Changes to `swiss_join.cc` and `hash_join_node_test.cc`.
    
    ### Are these changes tested?
    Yes
    
    ### Are there any user-facing changes?
    No
    
    * GitHub Issue: #41121
    
    Lead-authored-by: light-city <455954986@qq.com>
    Co-authored-by: Antoine Pitrou <pitrou@free.fr>
    Signed-off-by: Antoine Pitrou <antoine@python.org>
    2 people authored and raulcd committed Apr 15, 2024
    Commit 9cb361c

Commits on Apr 16, 2024

  1. GH-41227: [CI][Release][GLib][Conda] Unpin gobject-introspection (#41228

    )
    
    ### Rationale for this change
    
    Upstream problem conda-forge/glib-feedstock#174 has been fixed.
    
    ### What changes are included in this PR?
    
    Revert pinning.
    
    ### Are these changes tested?
    
    Yes.
    
    ### Are there any user-facing changes?
    
    No.
    * GitHub Issue: #41227
    
    Authored-by: Sutou Kouhei <kou@clear-code.com>
    Signed-off-by: Sutou Kouhei <kou@clear-code.com>
    kou authored and raulcd committed Apr 16, 2024
    Commit b7187ad
  2. GH-35081: [Python] construct pandas.DataFrame with public API in `to_…

    …pandas` (#40897)
    
    ### Rationale for this change
    
    Avoid using pandas internals to create Block objects ourselves by using a new API for pandas>=3.
    
    * GitHub Issue: #35081
    
    Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
    Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
    jorisvandenbossche authored and raulcd committed Apr 16, 2024
    Commit 5a146ed
  3. GH-41201: [C++] Fix mistake in integration test. Explicitly cast std:…

    …:string to avoid compiler interpreting char* -> bool (#41202)
    
    * GitHub Issue: #41201
    
    Authored-by: David Li <li.davidm96@gmail.com>
    Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
    lidavidm authored and raulcd committed Apr 16, 2024
    Commit eb5f162
  4. Commit 2847737
  5. Commit 4cd1b34
  6. Commit 6a28035