From 8567163d017e5816aa6bb65916ad2ca5f204cf70 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= Date: Wed, 16 Oct 2024 11:19:50 +0200 Subject: [PATCH] MINOR: [Release] Update CHANGELOG.md for 18.0.0 --- CHANGELOG.md | 346 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 346 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 6101f5d3cac25..3a9241cb953be 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,4 +1,350 @@ +# Apache Arrow 18.0.0 (2024-10-16) + +## Bug Fixes + +* [GH-36295](https://github.com/apache/arrow/issues/36295) - [C++] data corruption when using \`group\_by\` and \`aggregate\` on large data sets +* [GH-39789](https://github.com/apache/arrow/issues/39789) - [Go][Parquet] Close current row group when finished writing unbuffered batch (#43326) +* [GH-40557](https://github.com/apache/arrow/issues/40557) - [C++] Use `PutObject` request for S3 in OutputStream when only uploading small data (#41564) +* [GH-41396](https://github.com/apache/arrow/issues/41396) - [Ruby] Add workaround for re2.pc on Ubuntu 20.04 (#43721) +* [GH-41481](https://github.com/apache/arrow/issues/41481) - [CI] Update how extra environment variables are specified for the integration test docker job (#42009) +* [GH-41696](https://github.com/apache/arrow/issues/41696) - [Python][Packaging] Bump MACOSX_DEPLOYMENT_TARGET to 12 instead of 11 (#43137) +* [GH-41891](https://github.com/apache/arrow/issues/41891) - [C++] Clean up implicit fallthrough warnings (#41892) +* [GH-41993](https://github.com/apache/arrow/issues/41993) - [Go] IPC writer shift voffsets when offsets array does not start from zero (#43176) +* [GH-42240](https://github.com/apache/arrow/issues/42240) - [R] Fix crash in ParquetFileWriter$WriteTable and add WriteBatch (#42241) +* [GH-43046](https://github.com/apache/arrow/issues/43046) - [C++] Fix avx2 gather rows more than 2^31 issue in `CompareColumnsToRows` (#43065) +* [GH-43130](https://github.com/apache/arrow/issues/43130) - 
[C++][ArrowFlight] Crash due to UCS thread mode (#43120) +* [GH-43150](https://github.com/apache/arrow/issues/43150) - [Docs] Correct documentation of pyarrow.compute.microsecond (#43151) +* [GH-43152](https://github.com/apache/arrow/issues/43152) - [Release] Require "digest/sha1" explicitly for thread safety (#43154) +* [GH-43153](https://github.com/apache/arrow/issues/43153) - [R] pull on a grouped query returns the wrong column (#43172) +* [GH-43163](https://github.com/apache/arrow/issues/43163) - [R] Fix bindings in Math group generics (#43162) +* [GH-43167](https://github.com/apache/arrow/issues/43167) - [C++] Add workaround for missing Boost dependency of Thrift (#43328) +* [GH-43175](https://github.com/apache/arrow/issues/43175) - [C++] Skip not Emscripten ready tests in CSV tests (#43724) +* [GH-43183](https://github.com/apache/arrow/issues/43183) - [C++] Add `date{32,64}` to `date{32,64}` cast (#43192) +* [GH-43186](https://github.com/apache/arrow/issues/43186) - [Go] Use auto-aligned atomic int64 for pqarrow pathbuilders (#43206) +* [GH-43194](https://github.com/apache/arrow/issues/43194) - [R] R_existsVarInFrame isn't available earlier than R 4.2 (#43243) +* [GH-43202](https://github.com/apache/arrow/issues/43202) - [C++][Compute] Detect and explicit error for offset overflow in row table (#43226) +* [GH-43211](https://github.com/apache/arrow/issues/43211) - [C++] Fix decimal benchmarks to avoid out-of-bounds accesses (#43212) +* [GH-43217](https://github.com/apache/arrow/issues/43217) - [Java] Remove flight-core shaded jars (#43224) +* [GH-43218](https://github.com/apache/arrow/issues/43218) - [C++] Resolve Abseil like any other dependency in the build system (#43219) +* [GH-43221](https://github.com/apache/arrow/issues/43221) - [C++][Parquet] Refactor parquet::encryption::AesEncryptor to use unique_ptr (#43222) +* [GH-43228](https://github.com/apache/arrow/issues/43228) - [C++] Fix Abseil compile error on GCC 13 (#43157) +* 
[GH-43232](https://github.com/apache/arrow/issues/43232) - [Release][Packaging][Python] Add tzdata as conda env requirement to avoid ORC failure (#43233) +* [GH-43245](https://github.com/apache/arrow/issues/43245) - [Packaging][deb] Add missing libabsl-dev dependency (#43246) +* [GH-43267](https://github.com/apache/arrow/issues/43267) - [C#] Correctly import sliced arrays through the C Data interface (#44117) +* [GH-43270](https://github.com/apache/arrow/issues/43270) - [Release] Fix input variables on post-01-tag.sh (#43271) +* [GH-43276](https://github.com/apache/arrow/issues/43276) - [Go][Parquet] Make DeltaBitPacking Encoders/Decoders Generic (#43279) +* [GH-43282](https://github.com/apache/arrow/issues/43282) - [Release][Docs][Packaging] Upload correct docs job when uploading binaries (#43283) +* [GH-43284](https://github.com/apache/arrow/issues/43284) - [Release] Fix version detection timing for bump deb package names on post-12-bump-versions.sh script (#43294) +* [GH-43293](https://github.com/apache/arrow/issues/43293) - [Docs] Update code block for Installing Java Modules (#43295) +* [GH-43299](https://github.com/apache/arrow/issues/43299) - [Release][Packaging] Only include pyarrow folder when finding packages on setuptools (#43325) +* [GH-43314](https://github.com/apache/arrow/issues/43314) - [CI][Java] Delete arrow-maven-plugins from release script (#43313) +* [GH-43320](https://github.com/apache/arrow/issues/43320) - [Java] fix for SchemaChangeRuntimeException transferring empty FixedSizeListVector (#43321) +* [GH-43331](https://github.com/apache/arrow/issues/43331) - [C++] Add missing serde methods to Location (#43332) +* [GH-43346](https://github.com/apache/arrow/issues/43346) - [Docs][Format] Update broken links (#43347) +* [GH-43349](https://github.com/apache/arrow/issues/43349) - [R] Fix altrep string columns from readr (#43351) +* [GH-43357](https://github.com/apache/arrow/issues/43357) - [R] Fix some lints (#43338) +* 
[GH-43359](https://github.com/apache/arrow/issues/43359) - [Go][Parquet] ReadRowGroups panics with canceled context (#43360) +* [GH-43377](https://github.com/apache/arrow/issues/43377) - [Java][CI] Java-Jars CI is Failing with a linking error on macOS (#43385) +* [GH-43378](https://github.com/apache/arrow/issues/43378) - [Java][CI] Don't configure multithreading when building javadocs (#43674) +* [GH-43382](https://github.com/apache/arrow/issues/43382) - [C++][Parquet] min-max Statistics doesn't work well when one of min-max is truncated (#43383) +* [GH-43388](https://github.com/apache/arrow/issues/43388) - [Python] Give precedence to pycapsule interface in pa.schema(..) (#43486) +* [GH-43393](https://github.com/apache/arrow/issues/43393) - [C++][Parquet] parquet-dump-footer: Remove redundant link and fix --debug processing (#43375) +* [GH-43394](https://github.com/apache/arrow/issues/43394) - [Java][Benchmarking] Fix Java benchmarks for Java 17+ (#43395) +* [GH-43400](https://github.com/apache/arrow/issues/43400) - [C++] Ensure using bundled GoogleTest when we use bundled GoogleTest (#43465) +* [GH-43412](https://github.com/apache/arrow/issues/43412) - [Java][Benchmarking] Use JDK_JAVA_OPTIONS for JVM arguments (#43411) +* [GH-43414](https://github.com/apache/arrow/issues/43414) - [C++][Compute] Fix invalid memory access when resizing var-length buffer in row table (#43415) +* [GH-43429](https://github.com/apache/arrow/issues/43429) - [C++][FlightRPC] Fix Flight UCX build issues (#43430) +* [GH-43432](https://github.com/apache/arrow/issues/43432) - [Java][Packaging] Clean up java-jars job (#43431) +* [GH-43440](https://github.com/apache/arrow/issues/43440) - [R] Unable to filter a factor column with %in% (#43446) +* [GH-43447](https://github.com/apache/arrow/issues/43447) - [C++] Filter out zero length buffers on gRPC transport (#43448) +* [GH-43449](https://github.com/apache/arrow/issues/43449) - [CI][Conan] Don't push used images (#43470) +* 
[GH-43463](https://github.com/apache/arrow/issues/43463) - [C++][Gandiva] Always use gdv_function_stubs.h in context_helper.cc (#43464) +* [GH-43467](https://github.com/apache/arrow/issues/43467) - [C++] Add support for the official LZ4 CMake package (#43468) +* [GH-43487](https://github.com/apache/arrow/issues/43487) - [Python] Sanitize Python reference handling in UDF implementation (#43557) +* [GH-43502](https://github.com/apache/arrow/issues/43502) - [Java] Fix Java JNI / AMD64 manylinux2014 Java JNI test not test dataset module (#43503) +* [GH-43506](https://github.com/apache/arrow/issues/43506) - [Java] Fix TestFragmentScanOptions result not match (#43639) +* [GH-43554](https://github.com/apache/arrow/issues/43554) - [Go] Handle excluded fields (#43555) +* [GH-43577](https://github.com/apache/arrow/issues/43577) - [Java] getBuffers method needs correction on clear flag usage (#43583) +* [GH-43588](https://github.com/apache/arrow/issues/43588) - [Python] Allow tuple for rename columns (#43609) +* [GH-43618](https://github.com/apache/arrow/issues/43618) - [Packaging][Python] Fix vcpkg version detection in macOS wheel build jobs (#43615) +* [GH-43627](https://github.com/apache/arrow/issues/43627) - [R] Fix summarize() performance regression (pushdown) (#43649) +* [GH-43635](https://github.com/apache/arrow/issues/43635) - [R][CI] Don't install Quarto (#43636) +* [GH-43665](https://github.com/apache/arrow/issues/43665) - [R] Remove references to bindings vignette (#43889) +* [GH-43667](https://github.com/apache/arrow/issues/43667) - [Java] Keeping Flight default header size consistent between server and client (#43697) +* [GH-43707](https://github.com/apache/arrow/issues/43707) - [Python] Fix compilation on Cython<3 (#43765) +* [GH-43717](https://github.com/apache/arrow/issues/43717) - [Java][FlightSQL] Add all ActionTypes to FlightSqlUtils.FLIGHT_SQL_ACTIONS (#43718) +* [GH-43735](https://github.com/apache/arrow/issues/43735) - [R] AWS SDK fails to build on one 
of CRAN's M1 builders (#43736) +* [GH-43743](https://github.com/apache/arrow/issues/43743) - [CI][Docs] Ensure creating build directory (#43744) +* [GH-43748](https://github.com/apache/arrow/issues/43748) - [R] Handle package_version in safe_r_metadata (#43895) +* [GH-43785](https://github.com/apache/arrow/issues/43785) - [Python][CI] Correct PARQUET_TEST_DATA path in wheel tests (#43786) +* [GH-43787](https://github.com/apache/arrow/issues/43787) - [C++] Register the new Opaque extension type by default (#43788) +* [GH-43815](https://github.com/apache/arrow/issues/43815) - [CI][Packaging][Python] Avoid uploading wheel to gemfury if version already exists (#43816) +* [GH-43837](https://github.com/apache/arrow/issues/43837) - [Go][IPC] Consolidate StreamWriter and FileWriter, ensuring that EOS indicator is written in file (#43890) +* [GH-43860](https://github.com/apache/arrow/issues/43860) - [Go][Parquet] Handle the error correctly (#43861) +* [GH-43868](https://github.com/apache/arrow/issues/43868) - [CI][Python] Skip test that requires PARQUET_TEST_DATA env on emscripten (#43906) +* [GH-43869](https://github.com/apache/arrow/issues/43869) - [Java][CI] Flight related failure in the AMD64 Windows Server 2022 Java JDK 11 CI (#43850) +* [GH-43870](https://github.com/apache/arrow/issues/43870) - [C++][Acero] Fix typos in join benchmark (#43871) +* [GH-43877](https://github.com/apache/arrow/issues/43877) - [Ruby] Add support for 0 decimal value (#43882) +* [GH-43885](https://github.com/apache/arrow/issues/43885) - [C++][CI] Catch potential integer overflow in PoolBuffer (#43886) +* [GH-43933](https://github.com/apache/arrow/issues/43933) - [CI] Remove docker-compose warnings (#43934) +* [GH-43952](https://github.com/apache/arrow/issues/43952) - [CI] Bump actions/{upload|download}-artifact from 3 to latest v4 in /.github/workflows (#43940) +* [GH-43960](https://github.com/apache/arrow/issues/43960) - [R] fix `str_sub` binding to properly handle negative `end` values 
(#44141) +* [GH-43966](https://github.com/apache/arrow/issues/43966) - [Java] Check for nullabilities when comparing StructVector (#43968) +* [GH-44046](https://github.com/apache/arrow/issues/44046) - [Python] Fix threading issues with borrowed refs and pandas (#44047) +* [GH-44050](https://github.com/apache/arrow/issues/44050) - [CI][Integration] Execute integration test again (#44051) +* [GH-44069](https://github.com/apache/arrow/issues/44069) - [Docs][R] Add note to to_arrow() docs about collect/compute (#44094) +* [GH-44071](https://github.com/apache/arrow/issues/44071) - [C++] Leak S3 structures if finalization happens too late (#44090) +* [GH-44076](https://github.com/apache/arrow/issues/44076) - [CI] Remove verify-rc-binaries-wheel-macos-11 which is now deprecated (#44077) +* [GH-44081](https://github.com/apache/arrow/issues/44081) - [C++][Parquet] Fix reported metrics in parquet-arrow-reader-writer-benchmark (#44082) +* [GH-44088](https://github.com/apache/arrow/issues/44088) - [Java] Fix copyFrom in BaseVariableWidthViewVector (#44078) +* [GH-44096](https://github.com/apache/arrow/issues/44096) - [C++] Don't use Boost.Process with Emscripten (#44097) +* [GH-44098](https://github.com/apache/arrow/issues/44098) - [C++] Add home made _mm256_set_m128i for compilers who are missing it (#44116) +* [GH-44122](https://github.com/apache/arrow/issues/44122) - [R] Don't use the new pipe yet (#44123) +* [GH-44127](https://github.com/apache/arrow/issues/44127) - [CI][R] Fix util_enable_core_dumps.sh path (#44128) +* [GH-44153](https://github.com/apache/arrow/issues/44153) - [GLib][FlightRPC] Fix closure annotation (#44154) +* [GH-44214](https://github.com/apache/arrow/issues/44214) - [C++] JsonExtensionType equality check ignores storage type (#44215) +* [GH-44218](https://github.com/apache/arrow/issues/44218) - [Benchmarking][Python] Avoid uwsgi install failure on macOS (#44221) +* [GH-44234](https://github.com/apache/arrow/issues/44234) - [CI][C++][AppVeyor] Use 
conda instead of Mamba (#44235) +* [GH-44253](https://github.com/apache/arrow/issues/44253) - [CI][Release][Python] Do not verify Python on Ubuntu 20.04 (#44254) +* [GH-44256](https://github.com/apache/arrow/issues/44256) - [C++][FS][Azure] Fix edgecase where GetFileInfo incorrectly returns NotFound on flat namespace and Azurite (#44302) +* [GH-44268](https://github.com/apache/arrow/issues/44268) - [Release][Ruby][CI] Pin version of glib used in verification script (#44270) +* [GH-44269](https://github.com/apache/arrow/issues/44269) - [C++][FS][Azure] Catch missing exceptions on HNS support check (#44274) +* [GH-44277](https://github.com/apache/arrow/issues/44277) - [CI] Use Miniforge instead of Mambaforge (#44278) +* [GH-44297](https://github.com/apache/arrow/issues/44297) - [Integration][CI] Skip nanoarrow IPC integration tests for compressed/dictionary-encoded files (#44298) +* [GH-44300](https://github.com/apache/arrow/issues/44300) - [Integration][Archery] Don't import unused testers (#44301) +* [GH-44303](https://github.com/apache/arrow/issues/44303) - [C++][FS][Azure] Fix minor hierarchical namespace bugs (#44307) +* [GH-44334](https://github.com/apache/arrow/issues/44334) - [C++] Fix S3 error handling in `ObjectOutputStream` (#44335) +* [GH-44337](https://github.com/apache/arrow/issues/44337) - [CI][GLib] Fix a flaky StreamDecoder and Buffer test (#44341) +* [GH-44342](https://github.com/apache/arrow/issues/44342) - [C++] Disable jemalloc by default on ARM (#44380) +* [GH-44358](https://github.com/apache/arrow/issues/44358) - [Packaging][Debian] Add workaround for CUDA include path (#44359) +* [GH-44369](https://github.com/apache/arrow/issues/44369) - [CI][Python] Remove ds requirement from test collection on test_dataset.py (#44370) +* [GH-44373](https://github.com/apache/arrow/issues/44373) - [Packaging][Java] Fix brew link to Python 3.13 on macOS (#44374) +* [GH-44381](https://github.com/apache/arrow/issues/44381) - [Ruby][Release] Pin not only glib but 
also python on verification jobs (#44382) +* [GH-44386](https://github.com/apache/arrow/issues/44386) - [Integration][Release] Pin Python 3.12 for Integration verification when using Conda (#44388) +* [GH-44422](https://github.com/apache/arrow/issues/44422) - [Packaging][Release][Linux] Upload artifacts before test (#44425) + + +## New Features and Improvements + +* [GH-15058](https://github.com/apache/arrow/issues/15058) - [C++][Python] Native support for UUID (#37298) +* [GH-17682](https://github.com/apache/arrow/issues/17682) - [C++][Python] Bool8 Extension Type Implementation (#43488) +* [GH-17682](https://github.com/apache/arrow/issues/17682) - [Go] Bool8 Extension Type Implementation (#43323) +* [GH-17682](https://github.com/apache/arrow/issues/17682) - [Format] Add Bool8 Canonical Extension Type (#43234) +* [GH-25118](https://github.com/apache/arrow/issues/25118) - [Python] Make NumPy an optional runtime dependency (#41904) +* [GH-28866](https://github.com/apache/arrow/issues/28866) - [Java] Java Dataset API ScanOptions expansion (#41646) +* [GH-30058](https://github.com/apache/arrow/issues/30058) - [Python] Add StructType attribute to access all its fields (#43481) +* [GH-30863](https://github.com/apache/arrow/issues/30863) - [JS] Use a singleton StructRow proxy handler (#44289) +* [GH-32538](https://github.com/apache/arrow/issues/32538) - [C++][Parquet] Add JSON canonical extension type (#13901) +* [GH-34529](https://github.com/apache/arrow/issues/34529) - [C++][Compute] Replace explicit checking with DCHECK for invariants in row segmenter (#44236) +* [GH-37756](https://github.com/apache/arrow/issues/37756) - [Format][Docs] Document IPC Compression (#43950) +* [GH-38041](https://github.com/apache/arrow/issues/38041) - [C++][CI] Improve IPC fuzzing seed corpus (#43621) +* [GH-38051](https://github.com/apache/arrow/issues/38051) - [Java] Remove Java 8 support (#43139) +* [GH-38183](https://github.com/apache/arrow/issues/38183) - [CI][Python] Use pipx to 
install GCS testbench (#43852) +* [GH-38255](https://github.com/apache/arrow/issues/38255) - [Java] Implement Flight SQL Bulk Ingestion (#43551) +* [GH-38847](https://github.com/apache/arrow/issues/38847) - [Documentation][C++] Explicitly note that compute is optional (#43629) +* [GH-39638](https://github.com/apache/arrow/issues/39638) - [Docs][R] Add r-universe instructions (#44033) +* [GH-39982](https://github.com/apache/arrow/issues/39982) - [Java] Add RunEndEncodedVector (#43888) +* [GH-40036](https://github.com/apache/arrow/issues/40036) - [C++] Azure file system write buffering & async writes (#43096) +* [GH-40154](https://github.com/apache/arrow/issues/40154) - [C++][Parquet] Separate encoders and decoder (#43972) +* [GH-40216](https://github.com/apache/arrow/issues/40216) - [Python][CI][Packaging] Don't upload sdist to scientific-python nightly channel (only wheels) (#43943) +* [GH-40216](https://github.com/apache/arrow/issues/40216) - [Python][CI][Packaging] Upload nightly wheels to main label of scientific-python-nightly-wheels channel (#43932) +* [GH-40216](https://github.com/apache/arrow/issues/40216) - [CI][Packaging][Python] Upload pyarrow nightly wheels to scientific python channel on Anaconda (#43862) +* [GH-40493](https://github.com/apache/arrow/issues/40493) - [GLib][Ruby] Add GArrowStreamDecoder (#44170) +* [GH-40570](https://github.com/apache/arrow/issues/40570) - [CI] Default environment to Ubuntu 22.04 instead of 20.04 (#44151) +* [GH-40860](https://github.com/apache/arrow/issues/40860) - [GLib][Parquet] Add `gparquet_arrow_file_writer_write_record_batch()` (#44001) +* [GH-40936](https://github.com/apache/arrow/issues/40936) - [Java] Implement Holder-based functions in \`ViewVarBinaryVector\` +* [GH-40937](https://github.com/apache/arrow/issues/40937) - [Java] Implement Holder-based functions for ViewVarCharVector & ViewVarBinaryVector (#44187) +* [GH-41056](https://github.com/apache/arrow/issues/41056) - [GLib][FlightRPC] Add 
gaflight_client_do_put() and related APIs (#43813) +* [GH-41272](https://github.com/apache/arrow/issues/41272) - [Java] LargeListViewVector Implementation (#43516) +* [GH-41291](https://github.com/apache/arrow/issues/41291) - [Java] LargeListViewVector Implementation transferPair implementation (#43637) +* [GH-41347](https://github.com/apache/arrow/issues/41347) - [FlightRPC][C#] Allow hosting flight server in pre-Kestrel .net versions (#41348) +* [GH-41569](https://github.com/apache/arrow/issues/41569) - [Java] ListViewVector Implementation for UnionListViewReader (#43077) +* [GH-41579](https://github.com/apache/arrow/issues/41579) - [C++][Python][Parquet] Support reading/writing key-value metadata from/to ColumnChunkMetaData (#41580) +* [GH-41584](https://github.com/apache/arrow/issues/41584) - [Java] ListView Implementation for C Data Interface (#43686) +* [GH-41585](https://github.com/apache/arrow/issues/41585) - [Java] LargeListView Implementation for C Data Interface +* [GH-41623](https://github.com/apache/arrow/issues/41623) - [Docs] Remove the warning for `arrow::dataset` (#43148) +* [GH-41640](https://github.com/apache/arrow/issues/41640) - [Go] Implement BYTE_STREAM_SPLIT Parquet Encoding (#43066) +* [GH-41665](https://github.com/apache/arrow/issues/41665) - [Python] Ensure (Chunked)Array/RecordBatch/Table methods don't crash with non-CPU data +* [GH-41673](https://github.com/apache/arrow/issues/41673) - [Format][Docs] Add arrow format introductory page (#41593) +* [GH-41909](https://github.com/apache/arrow/issues/41909) - [C++] Add arrow::ArrayStatistics (#43273) +* [GH-41922](https://github.com/apache/arrow/issues/41922) - [CI][C++] Update Minio version (#44225) +* [GH-41951](https://github.com/apache/arrow/issues/41951) - [Java] Add @FormatMethod annotations (#43376) +* [GH-42014](https://github.com/apache/arrow/issues/42014) - [Python] Let StructArray.from_array accept a type in addition to names or fields (#43047) +* 
[GH-42085](https://github.com/apache/arrow/issues/42085) - [Python] Test FlightStreamReader iterator (#42086) +* [GH-42102](https://github.com/apache/arrow/issues/42102) - [C++][Parquet] Add binary that extracts a footer from a parquet file (#42174) +* [GH-42222](https://github.com/apache/arrow/issues/42222) - [Python] Add bindings for CopyTo on RecordBatch and Array classes (#42223) +* [GH-42247](https://github.com/apache/arrow/issues/42247) - [C++] Support casting to and from utf8_view/binary_view (#43302) +* [GH-43044](https://github.com/apache/arrow/issues/43044) - [R] So-called non-API entry points (#43173) +* [GH-43069](https://github.com/apache/arrow/issues/43069) - [Python] Use Py_IsFinalizing from pythoncapi_compat.h (#43767) +* [GH-43075](https://github.com/apache/arrow/issues/43075) - [CI][Crossbow][Docker] Set timeout for docker-tests (#43078) +* [GH-43092](https://github.com/apache/arrow/issues/43092) - [Swift] Update ArrowData for Nested Types (allow children) (#43086) +* [GH-43095](https://github.com/apache/arrow/issues/43095) - [C++] Update bundled vendor/datetime to support for building with libc++ and C++20 (#43094) +* [GH-43097](https://github.com/apache/arrow/issues/43097) - [C++] Implement `PathFromUri` support for Azure file system (#43098) +* [GH-43114](https://github.com/apache/arrow/issues/43114) - [Archery][Dev] Support setuptools-scm >= 8.0.0 (#43156) +* [GH-43129](https://github.com/apache/arrow/issues/43129) - [C++][Compute] Fix the unnecessary allocation of extra bytes when encoding row table (#43125) +* [GH-43141](https://github.com/apache/arrow/issues/43141) - [C++][Parquet] Replace use of int with int32_t in the internal Parquet encryption APIs (#43413) +* [GH-43142](https://github.com/apache/arrow/issues/43142) - [C++][Parquet] Refactor Encryptor API to use arrow::util::span instead of raw pointers (#43195) +* [GH-43143](https://github.com/apache/arrow/issues/43143) - [C++][Parquet] Default initialize some parquet metadata 
variables (#43144) +* [GH-43160](https://github.com/apache/arrow/issues/43160) - [Swift] Add Struct Array (#43161) +* [GH-43164](https://github.com/apache/arrow/issues/43164) - [C++] Fix CMake link order for AWS SDK (#43230) +* [GH-43168](https://github.com/apache/arrow/issues/43168) - [Swift] Add buffer and array builders for Struct type (#43171) +* [GH-43169](https://github.com/apache/arrow/issues/43169) - [Swift] Add StructArray to ArrowReader (#43335) +* [GH-43185](https://github.com/apache/arrow/issues/43185) - [C++] Suggest a cast when Concatenate fails due to offsets overflow (#43190) +* [GH-43187](https://github.com/apache/arrow/issues/43187) - [C++] Support basic is_in predicate simplification (#43761) +* [GH-43197](https://github.com/apache/arrow/issues/43197) - [C++][AzureFS] Ignore password field in URI (#44220) +* [GH-43209](https://github.com/apache/arrow/issues/43209) - [C++] Add lint for DCHECK in public headers (#43248) +* [GH-43229](https://github.com/apache/arrow/issues/43229) - [Java] Update Maven project info (#43231) +* [GH-43238](https://github.com/apache/arrow/issues/43238) - [C++][FlightRPC] Reduce repetition in flight/types.cc in serde functions (#43237) +* [GH-43249](https://github.com/apache/arrow/issues/43249) - [C++][Parquet] remove useless template parameter of `DeltaLengthByteArrayEncoder` (#43250) +* [GH-43254](https://github.com/apache/arrow/issues/43254) - [C++] Always prefer mimalloc to jemalloc (#40875) +* [GH-43258](https://github.com/apache/arrow/issues/43258) - [C++][Flight] Use a Base CRTP type for the types used in RPC calls (#43255) +* [GH-43266](https://github.com/apache/arrow/issues/43266) - [C#] Add LargeBinary, LargeString and LargeList array types (#43269) +* [GH-43291](https://github.com/apache/arrow/issues/43291) - [C++] Expand the 'take' function tests to cover more chunked-array cases (#43292) +* [GH-43301](https://github.com/apache/arrow/issues/43301) - [C++][Parquet] Enhance the comment for ColumnReader/Decoder 
(#44003) +* [GH-43319](https://github.com/apache/arrow/issues/43319) - [R][Docs] Update packaging checklist (#43345) +* [GH-43329](https://github.com/apache/arrow/issues/43329) - [C++] Order classes in flight/types.h according to Flight.proto (#43330) +* [GH-43380](https://github.com/apache/arrow/issues/43380) - [Java] Add support for cross jdk version testing (#43381) +* [GH-43391](https://github.com/apache/arrow/issues/43391) - [Python] Add bindings for memory manager and device to Context class (#43392) +* [GH-43396](https://github.com/apache/arrow/issues/43396) - [Java] Remove/replace jsr305 (#43397) +* [GH-43418](https://github.com/apache/arrow/issues/43418) - [CI] Add wheels and java-jars to vcpkg group for tasks (#43419) +* [GH-43425](https://github.com/apache/arrow/issues/43425) - [Java] Upgrade JNI to version 10 (#43424) +* [GH-43427](https://github.com/apache/arrow/issues/43427) - [C++][Parquet] Deprecate ColumnChunk::file_offset field and no longer write Metadata at end of Chunk (#43428) +* [GH-43437](https://github.com/apache/arrow/issues/43437) - [Java] Update protobuf from 3.25.1 to 3.25.4 (#43436) +* [GH-43443](https://github.com/apache/arrow/issues/43443) - [Go][IPC] Infer schema from first record if not specified (#43484) +* [GH-43444](https://github.com/apache/arrow/issues/43444) - [C++] Add benchmark for binary view builder (#43445) +* [GH-43450](https://github.com/apache/arrow/issues/43450) - [CI] Temporarily turn off conda jobs that are failing (#43451) +* [GH-43453](https://github.com/apache/arrow/issues/43453) - [Format] Add Opaque canonical extension type (#43457) +* [GH-43454](https://github.com/apache/arrow/issues/43454) - [C++][Python] Add Opaque canonical extension type (#43458) +* [GH-43455](https://github.com/apache/arrow/issues/43455) - [Go] Add Opaque canonical extension type (#43459) +* [GH-43456](https://github.com/apache/arrow/issues/43456) - [Java] Add Opaque canonical extension type (#43460) +* 
[GH-43469](https://github.com/apache/arrow/issues/43469) - [Java] Change the default CompressionCodec.Factory to leverage compression support transparently (#43471) +* [GH-43479](https://github.com/apache/arrow/issues/43479) - [Java] Change visibility of MemoryUtil.UNSAFE (#43480) +* [GH-43483](https://github.com/apache/arrow/issues/43483) - [Java][C++] Support more CsvFragmentScanOptions in JNI call (#43482) +* [GH-43492](https://github.com/apache/arrow/issues/43492) - [C++] Thirdparty: Bump lz4 to 1.10.0 (#43493) +* [GH-43495](https://github.com/apache/arrow/issues/43495) - [C++][Compute] Widen the row offset of the row table to 64-bit (#43389) +* [GH-43500](https://github.com/apache/arrow/issues/43500) - [R][CI] Bump dev docs CI job from ubuntu 20.04 (#43501) +* [GH-43507](https://github.com/apache/arrow/issues/43507) - [C++] Use ViewOrCopyTo instead of CopyTo when pretty printing non-CPU data (#43508) +* [GH-43509](https://github.com/apache/arrow/issues/43509) - [R] Add link to ?acero from ?list_compute_functions (#44210) +* [GH-43512](https://github.com/apache/arrow/issues/43512) - [Java] ListViewVector Visitor-based component Integration (#43513) +* [GH-43514](https://github.com/apache/arrow/issues/43514) - [Python] Deprecate passing build flags to setup.py (#43515) +* [GH-43518](https://github.com/apache/arrow/issues/43518) - [Python][Packaging][CI] Drop Python 3.8 support (#43970) +* [GH-43519](https://github.com/apache/arrow/issues/43519) - [Python][CI] Add Python 3.13 conda test build (#44192) +* [GH-43519](https://github.com/apache/arrow/issues/43519) - [Python][CI][Packaging] Use released versions to build and test wheels on Python 3.13 (#44193) +* [GH-43519](https://github.com/apache/arrow/issues/43519) - [Python] Set up wheel building for Python 3.13 (#43539) +* [GH-43532](https://github.com/apache/arrow/issues/43532) - [Python] Remove usage of deprecated pkg_resources in setup.py (#43602) +* [GH-43536](https://github.com/apache/arrow/issues/43536) - 
[Python][CI] Add a Crossbow job with the free-threaded build (#43671) +* [GH-43536](https://github.com/apache/arrow/issues/43536) - [Python] Do not use borrowed references APIs (#43540) +* [GH-43536](https://github.com/apache/arrow/issues/43536) - [Python] Declare support for free-threading in Cython (#43606) +* [GH-43543](https://github.com/apache/arrow/issues/43543) - [FlightRPC][C++] Reduce the number of references to protobuf::Any (#43544) +* [GH-43548](https://github.com/apache/arrow/issues/43548) - [R][CI] Use grep -F to simplify matching or rchk output (#43477) +* [GH-43559](https://github.com/apache/arrow/issues/43559) - [Python][CI] Add a Crossbow job with a debug CPython interpreter (#43565) +* [GH-43578](https://github.com/apache/arrow/issues/43578) - [C++] Simplify arrow::ArrayStatistics::ValueType (#43581) +* [GH-43591](https://github.com/apache/arrow/issues/43591) - [C++][GLib] Don't install arrow-cuda.pc/arrow-cuda-glib.pc on Windows (#43593) +* [GH-43592](https://github.com/apache/arrow/issues/43592) - [C++] Remove redundant default constructor/deconstructor in arrow::ArrayStatistics (#43579) +* [GH-43594](https://github.com/apache/arrow/issues/43594) - [C++] Remove std::optional from arrow::ArrayStatistics::is_{min,max}_exact (#43595) +* [GH-43608](https://github.com/apache/arrow/issues/43608) - [CI][Archery] Prefer `docker compose` over `docker-compose` (#43586) +* [GH-43633](https://github.com/apache/arrow/issues/43633) - [R] Add tests for packages that might be tricky to roundtrip data to Tables + Parquet files (#43634) +* [GH-43638](https://github.com/apache/arrow/issues/43638) - [Java] LargeListViewVector RangeEqualVisitor and TypeEqualVisitor integration (#43642) +* [GH-43643](https://github.com/apache/arrow/issues/43643) - [Java] LargeListViewVector IPC Integration (#43681) +* [GH-43669](https://github.com/apache/arrow/issues/43669) - [Docs][Dev] Document archery --debug flag in section about docker (#43935) +* 
[GH-43672](https://github.com/apache/arrow/issues/43672) - [C#] Schema should be optional on FlightInfo (#43673)
+* [GH-43677](https://github.com/apache/arrow/issues/43677) - [C++][FlightRPC] Move the FlightTestServer to its own .cc and .h files (#43678)
+* [GH-43680](https://github.com/apache/arrow/issues/43680) - [Integration] Unskip nanoarrow in IPC integration tests (#43715)
+* [GH-43684](https://github.com/apache/arrow/issues/43684) - [Python][Dataset] Python / Cython interface to C++ arrow::dataset::Partitioning::Format (#43740)
+* [GH-43687](https://github.com/apache/arrow/issues/43687) - [C++] Compute: fix register kernel SimdLevel for AddMinMax512AggKernels (#43704)
+* [GH-43688](https://github.com/apache/arrow/issues/43688) - [C++] Prevent Snappy from disabling RTTI when bundled (#43706)
+* [GH-43690](https://github.com/apache/arrow/issues/43690) - [Python][CI] Simplify python/requirements-wheel-test.txt file (#43691)
+* [GH-43702](https://github.com/apache/arrow/issues/43702) - [C++][FS][Azure] Use the latest Azurite and update the bundled Azure SDK for C++ to azure-identity_1.9.0 (#43723)
+* [GH-43703](https://github.com/apache/arrow/issues/43703) - [C++][Parquet][CI] Parquet: Introducing more bad_data for testing (#43708)
+* [GH-43712](https://github.com/apache/arrow/issues/43712) - [C++][Parquet] Dataset: Handle num-nulls in Parquet correctly when !HasNullCount() (#43726)
+* [GH-43719](https://github.com/apache/arrow/issues/43719) - [C++] Clarify the way SIMD-enabled agg kernels come from the same code in different compilation units (#43720)
+* [GH-43727](https://github.com/apache/arrow/issues/43727) - [Python] RecordBatch fails gracefully on non-cpu devices (#43729)
+* [GH-43728](https://github.com/apache/arrow/issues/43728) - [Python] ChunkedArray fails gracefully on non-cpu devices (#43795)
+* [GH-43732](https://github.com/apache/arrow/issues/43732) - [Go] Require Go 1.22 or above (#43864)
+* [GH-43733](https://github.com/apache/arrow/issues/43733) - [C++] Fix Scalar boolean handling in row encoder (#43734)
+* [GH-43738](https://github.com/apache/arrow/issues/43738) - [GLib] Add `GArrowAzureFileSytem` (#43739)
+* [GH-43746](https://github.com/apache/arrow/issues/43746) - [C++] Add support for Boost 1.86 (#43766)
+* [GH-43758](https://github.com/apache/arrow/issues/43758) - [C++] Compute: More comment in RowEncoder (#43763)
+* [GH-43759](https://github.com/apache/arrow/issues/43759) - [C++] Acero: Minor code enhancement for Join (#43760)
+* [GH-43764](https://github.com/apache/arrow/issues/43764) - [Go][FlightSQL] Add NewPreparedStatement function (#43781)
+* [GH-43768](https://github.com/apache/arrow/issues/43768) - [C++] Fix the case when boolean_{any|all} meets constant input with length in Acero (#43799)
+* [GH-43776](https://github.com/apache/arrow/issues/43776) - [C++] Add chunked Take benchmarks with a small selection factor (#43772)
+* [GH-43790](https://github.com/apache/arrow/issues/43790) - [Go][Parquet] Add support for LZ4_RAW compression codec (#43835)
+* [GH-43796](https://github.com/apache/arrow/issues/43796) - [C++] Indent preprocessor directives (#43798)
+* [GH-43797](https://github.com/apache/arrow/issues/43797) - [C++] Attach `arrow::ArrayStatistics` to `arrow::ArrayData` (#43801)
+* [GH-43802](https://github.com/apache/arrow/issues/43802) - [GLib] Add `GAFlightRecordBatchWriter` (#43803)
+* [GH-43805](https://github.com/apache/arrow/issues/43805) - [C++] Enable filesystem automatically when one of ARROW_{AZURE,GCS,HDFS,S3}=ON is specified (#43806)
+* [GH-43809](https://github.com/apache/arrow/issues/43809) - [Docs] Update extension type examples to not use UUID (#44120)
+* [GH-43814](https://github.com/apache/arrow/issues/43814) - [GLib][FlightRPC] Add `GAFlightServerClass::do_put` (#43999)
+* [GH-43840](https://github.com/apache/arrow/issues/43840) - [CI] Add cuda group to tasks.yml and minor updates for new cuda runner image (#43841)
+* [GH-43846](https://github.com/apache/arrow/issues/43846) - [Python][Packaging] Remove numpy dependency from pyarrow packaging (#44148)
+* [GH-43854](https://github.com/apache/arrow/issues/43854) - [C++] Expose the set of device types where a ChunkedArray is allocated (#43853)
+* [GH-43872](https://github.com/apache/arrow/issues/43872) - [Go][CI] Disable Dependabot for Go (#44102)
+* [GH-43873](https://github.com/apache/arrow/issues/43873) - [Go][CI] Remove Go related test CI (#44143)
+* [GH-43874](https://github.com/apache/arrow/issues/43874) - [CI][Integration][Go] Use apache/arrow-go (#44142)
+* [GH-43875](https://github.com/apache/arrow/issues/43875) - [Go][CI] Remove Go related lint configurations (#44144)
+* [GH-43878](https://github.com/apache/arrow/issues/43878) - [Go][Release] Remove Go related codes from our release scripts (#44172)
+* [GH-43879](https://github.com/apache/arrow/issues/43879) - [Go] Remove go related code (#44293)
+* [GH-43883](https://github.com/apache/arrow/issues/43883) - [CI] Remove Python version guard when installing GCS testbench (#43884)
+* [GH-43894](https://github.com/apache/arrow/issues/43894) - [R] format_aggregation() should print options too (#43896)
+* [GH-43902](https://github.com/apache/arrow/issues/43902) - [Java] Support for Long memory addresses (#43903)
+* [GH-43907](https://github.com/apache/arrow/issues/43907) - [C#][FlightRPC] Add Grpc Call Options support on Flight Client (#43910)
+* [GH-43927](https://github.com/apache/arrow/issues/43927) - [C++] Make ChunkResolver::ResolveMany output a list of ChunkLocations (#43928)
+* [GH-43944](https://github.com/apache/arrow/issues/43944) - [C++][Parquet] Add support for arrow::ArrayStatistics: non zero-copy int based types (#43945)
+* [GH-43946](https://github.com/apache/arrow/issues/43946) - [C++][Parquet] Guard against use of cleared decryptor/encryptor (#43947)
+* [GH-43953](https://github.com/apache/arrow/issues/43953) - [C++] Add tests based on random data and benchmarks to ChunkResolver::ResolveMany (#43954)
+* [GH-43962](https://github.com/apache/arrow/issues/43962) - [Java] Consider warnings as errors for Adapter Module (#43963)
+* [GH-43964](https://github.com/apache/arrow/issues/43964) - [Python] Build macOS and manylinux wheels for free-threading (#43965)
+* [GH-43967](https://github.com/apache/arrow/issues/43967) - [C++] Enhance error message for URI parsing (#43938)
+* [GH-43969](https://github.com/apache/arrow/issues/43969) - [CI][Dev] Prune .dockerignore (#43971)
+* [GH-43973](https://github.com/apache/arrow/issues/43973) - [Python] Table fails gracefully on non-cpu devices (#43974)
+* [GH-43979](https://github.com/apache/arrow/issues/43979) - [CI][C++][Dev] Add cpplint to pre-commit (#43982)
+* [GH-43983](https://github.com/apache/arrow/issues/43983) - [C++][Parquet] Add support for arrow::ArrayStatistics: zero-copy types (#43984)
+* [GH-43986](https://github.com/apache/arrow/issues/43986) - [C++][Acero] Some code cleanup to `Grouper` (#43988)
+* [GH-43992](https://github.com/apache/arrow/issues/43992) - [C++] Add missing std::move() in array_nested.cc (#43993)
+* [GH-43996](https://github.com/apache/arrow/issues/43996) - [Java] Mark new allocated ArrowSchema as released (#43997)
+* [GH-43998](https://github.com/apache/arrow/issues/43998) - [C++][Docs] Add missing install command in building docs (#44000)
+* [GH-44006](https://github.com/apache/arrow/issues/44006) - [GLib][Parquet] Add `gparquet_arrow_file_writer_new_row_group()` (#44039)
+* [GH-44007](https://github.com/apache/arrow/issues/44007) - [GLib][Parquet] Add `gparquet_arrow_file_writer_new_buffered_row_group()` (#44100)
+* [GH-44008](https://github.com/apache/arrow/issues/44008) - [C++][Parquet] Add support for arrow::ArrayStatistics: boolean (#44009)
+* [GH-44011](https://github.com/apache/arrow/issues/44011) - [Java] Consider warnings as errors for C Module (#44012)
+* [GH-44013](https://github.com/apache/arrow/issues/44013) - [Java] Consider warnings as errors for Dataset Module (#44014)
+* [GH-44016](https://github.com/apache/arrow/issues/44016) - [Java] Consider warnings as errors for Format Module (#44017)
+* [GH-44034](https://github.com/apache/arrow/issues/44034) - [Go][Format][FlightRPC] Update go_package in Flight.proto and FlightSql.proto (#44035)
+* [GH-44036](https://github.com/apache/arrow/issues/44036) - [C++] IPC: ipc reader/writer code enhancement (#44019)
+* [GH-44044](https://github.com/apache/arrow/issues/44044) - [Java] Consider warnings as errors for Vector Module (#44045)
+* [GH-44052](https://github.com/apache/arrow/issues/44052) - [C++][Compute] Reduce the complexity of row segmenter (#44053)
+* [GH-44058](https://github.com/apache/arrow/issues/44058) - [CI][Integration] Group logs on GitHub Actions (#44060)
+* [GH-44062](https://github.com/apache/arrow/issues/44062) - [Dev][Archery][Integration] Reduce needless test matrix (#44099)
+* [GH-44063](https://github.com/apache/arrow/issues/44063) - [Python] Deprecate the no longer used serialize/deserialize Pyarrow C++ functions (#44064)
+* [GH-44072](https://github.com/apache/arrow/issues/44072) - [C++][Parquet] Add Float16 reading benchmarks (#44073)
+* [GH-44079](https://github.com/apache/arrow/issues/44079) - [C++][Parquet] Remove deprecated APIs (#44080)
+* [GH-44085](https://github.com/apache/arrow/issues/44085) - [CI][R] Update Ubuntu version for R force test (#44087)
+* [GH-44095](https://github.com/apache/arrow/issues/44095) - [CI][Python] Enable S3 testing on Windows wheel builds (#44093)
+* [GH-44111](https://github.com/apache/arrow/issues/44111) - [CI][Python] Enable S3 tests on macOS CI (#44129)
+* [GH-44149](https://github.com/apache/arrow/issues/44149) - [Packaging][CI] Remove references to deprecated Ubuntu bionic (#44150)
+* [GH-44155](https://github.com/apache/arrow/issues/44155) - [Archery][Integration] Rename "language" to "implementation" (#44156)
+* [GH-44158](https://github.com/apache/arrow/issues/44158) - [Archery][Integration] Add more explanation how --target-implementations works (#44177)
+* [GH-44167](https://github.com/apache/arrow/issues/44167) - [C++][Acero] Add more row segmenter tests (#44166)
+* [GH-44178](https://github.com/apache/arrow/issues/44178) - [GLib][FlightRPC] Add GAFlightCallOptions:timeout (#44181)
+* [GH-44186](https://github.com/apache/arrow/issues/44186) - [C++][Parquet] Fix typo in parquet/column_writer.cc (#40856)
+* [GH-44194](https://github.com/apache/arrow/issues/44194) - [C++] Avoid repeated ArrayData::offset lookups (#44190)
+* [GH-44206](https://github.com/apache/arrow/issues/44206) - [CI][macOS] Drop support for macOS 12 (#44212)
+* [GH-44222](https://github.com/apache/arrow/issues/44222) - [C++][Gandiva] Accept LLVM 19.1 (#44233)
+* [GH-44229](https://github.com/apache/arrow/issues/44229) - [Docs] Add PyArrow to JAX example to the docs (#44230)
+* [GH-44237](https://github.com/apache/arrow/issues/44237) - [C#] Use stack allocated buffer when serializing decimal values (#44238)
+* [GH-44249](https://github.com/apache/arrow/issues/44249) - [C++] Unify simd header includings (#44250)
+* [GH-44271](https://github.com/apache/arrow/issues/44271) - [C#] Add support for Decimal32 and Decimal64 (#44272)
+* [GH-44273](https://github.com/apache/arrow/issues/44273) - [C++][Decimal] Use 0E+1 not 0.E+1 for broader compatibility (#44275)
+* [GH-44290](https://github.com/apache/arrow/issues/44290) - [Java][Flight] Add ActionType description getter (#44291)
+* [GH-44314](https://github.com/apache/arrow/issues/44314) - [Packaging][Python] Use macOS 12 as deployment target to have macOS 12 pyarrow wheels (#44315)
+* [GH-44347](https://github.com/apache/arrow/issues/44347) - [Packaging][C++] Enable Azure file system for deb/rpm (#44348)
+* [GH-44355](https://github.com/apache/arrow/issues/44355) - [Packaging][Python] Disable interactive deb configuration in wheel-manylinux-*-cp313t-* (#44362)
+* [GH-44415](https://github.com/apache/arrow/issues/44415) - [Release][Ruby] Remove pins from glib section of release verification script (#44407)
+
+
+
 # Apache Arrow 6.0.1 (2021-11-18)

 ## Bug Fixes