Merged
90 commits
a0aac46
feat: consume Substrait Plan
davisusanibar Mar 9, 2023
0d91f09
fix: solving maven-dependency-plugin
davisusanibar Mar 13, 2023
0599dc2
feat: add support for execution of Substrait binary plans also
davisusanibar Mar 15, 2023
c794ae5
Upgrade to Java 11 to be able to consume Isthmus library
davisusanibar Mar 15, 2023
8cc5443
fix: profile to Java test with JDK11 (be able to consume Isthmus libr…
davisusanibar Mar 15, 2023
e5594f8
fix: solve error to call Isthmus by Dataset that use JDK8
davisusanibar Mar 15, 2023
223ddef
fix: detected both log4j-over-slf4j.jar AND bound slf4j-reload4j.jar …
davisusanibar Mar 16, 2023
795e619
fix: rollback changes on orc
davisusanibar Mar 16, 2023
3bd18f1
Merge branch 'main' into poc-substrait
davisusanibar Mar 16, 2023
088a101
fix: able to compile main source with jdk8 and test with jdk11
davisusanibar Mar 16, 2023
ba23e44
fix: able to compile main source with jdk8 and test with jdk11
davisusanibar Mar 16, 2023
8655815
fix: JAVA_HOME_11_X64: command not found
davisusanibar Mar 16, 2023
d22d6b1
fix: partial comments fix
davisusanibar Mar 19, 2023
f0d8a25
Update java/dataset/src/main/cpp/jni_util.h
davisusanibar Mar 20, 2023
632f90d
Update java/dataset/src/main/java/org/apache/arrow/dataset/substrait/…
davisusanibar Mar 20, 2023
9437f4e
fix: comments
davisusanibar Mar 21, 2023
61d6ee7
fix: comments
davisusanibar Mar 21, 2023
64c7607
fix: comments
davisusanibar Mar 22, 2023
721fe01
fix: hash boost_1_81_0 does not match expected value
davisusanibar Mar 22, 2023
b3c2e1e
fix: maven-shade-plugin:jar:3.1.1 -> org.ow2.asm:asm:jar:6.0: Failed …
davisusanibar Mar 22, 2023
f5596c9
Merge branch 'main' into poc-substrait
davisusanibar Mar 22, 2023
388446b
Merge branch 'main' into poc-substrait
davisusanibar Mar 28, 2023
ead80a8
fix: clean unit test, fix comments
davisusanibar Mar 28, 2023
0446453
fix: clean substrait method to get plan
davisusanibar Mar 28, 2023
8c57c16
fix: clean sout
davisusanibar Mar 28, 2023
766b383
fix: rollback maven-shade-plugin
davisusanibar Mar 28, 2023
5e8b887
fix: failures test
davisusanibar Mar 29, 2023
7f59fbd
fix: delete methods not needed, create files of substrait plan
davisusanibar Mar 30, 2023
0d2bcf8
fix: npe read resources
davisusanibar Mar 30, 2023
4380932
fix: add resources files for nosuchfile error
davisusanibar Mar 30, 2023
9bbe4fb
fix: add resources files for nosuchfile error
davisusanibar Mar 30, 2023
5351ee1
fix: update rst documentation
davisusanibar Mar 30, 2023
e966d32
Apply suggestions from code review
davisusanibar Mar 31, 2023
cfe4061
fix: code review
davisusanibar Mar 31, 2023
c7003a1
Added serialization and deserialization for ExtendedExpression. Upda…
westonpace Apr 1, 2023
2419896
Merge branch 'main' into poc-substrait
davisusanibar Apr 2, 2023
8811bc6
Merge branch 'main' into poc-substrait
davisusanibar Apr 2, 2023
ead4784
fix: rebase and changes to consider new arrow acero
davisusanibar Apr 3, 2023
9bfa15c
fix: solving PR comments
davisusanibar Apr 6, 2023
8a0eae6
Merge branch 'main' into poc-substrait
davisusanibar Apr 6, 2023
87e75eb
fix: solving PR comments
davisusanibar Apr 6, 2023
812921f
Merge branch 'main' into poc-substrait
davisusanibar Apr 10, 2023
89060eb
fix: rebase
davisusanibar Apr 10, 2023
33c634f
Update java/dataset/src/main/java/org/apache/arrow/dataset/substrait/…
davisusanibar Apr 11, 2023
34979a5
fix: comment on code review
davisusanibar Apr 11, 2023
1a6f0e5
fix: comment on code review
davisusanibar Apr 11, 2023
e388be5
fix: validate input on arrow Table associated with a given table name
davisusanibar Apr 12, 2023
8eb3e40
fix: code review
davisusanibar Apr 13, 2023
72bbf5d
rebase + serdeser expresion
davisusanibar Apr 26, 2023
84dfd44
Merge branch 'poc-substrait' into otro
davisusanibar Apr 26, 2023
d632bf1
feat: initia feature to support extended expression as a substrait pr…
davisusanibar May 12, 2023
f21ee31
fix: java lint
davisusanibar May 12, 2023
0da5ed6
Added serialization and deserialization for ExtendedExpression. Upda…
westonpace Apr 1, 2023
b244f8f
WIP
westonpace Apr 1, 2023
f2b0f8a
Added python bindings to extended expression
westonpace May 16, 2023
cf618ae
Support index-based FieldRefs
benibus May 26, 2023
3e540fe
build: merge + solving conflicts
davisusanibar May 29, 2023
4099ca6
feat: support also filter expression
davisusanibar May 30, 2023
d74c116
Merge branch 'GH-35579-parquet-dataset-field-refs' into GH-34252
davisusanibar May 30, 2023
dbb9e7a
feat: support Filter and Porjections
davisusanibar May 30, 2023
93f147d
fix: comment code to test partially
davisusanibar May 30, 2023
892dedc
feat: add documentation for Project and Filters usng Substrait
davisusanibar Jun 1, 2023
b0fb9a5
Merge branch 'main' into GH-34252
davisusanibar Jun 1, 2023
ec8230c
Apply suggestions from code review
davisusanibar Jun 3, 2023
9e81af0
Apply suggestions from code review
davisusanibar Jun 5, 2023
dbe2622
Merge branch 'main' into GH-34252
davisusanibar Jun 5, 2023
d57517e
fix: solve code review comments
davisusanibar Jun 6, 2023
fb9df5a
fix: rebase
davisusanibar Aug 23, 2023
b9f6f94
fix: clean code
davisusanibar Aug 23, 2023
7469725
Apply suggestions from code review
davisusanibar Aug 23, 2023
52c148d
fix: code review
davisusanibar Aug 24, 2023
5748a2e
Apply suggestions from code review
davisusanibar Aug 25, 2023
b82b142
fix: code review
davisusanibar Aug 28, 2023
35fce8c
fix: code review
davisusanibar Sep 1, 2023
1a54100
fix: datset tutorial
davisusanibar Sep 1, 2023
23117d0
fix: datset tutorial
davisusanibar Sep 1, 2023
696e9c9
fix: datset tutorial
davisusanibar Sep 1, 2023
b280863
Apply suggestions from code review
davisusanibar Sep 1, 2023
18eb414
fix: code review
davisusanibar Sep 1, 2023
b9e2818
Merge branch 'main' into GH-34252
davisusanibar Sep 6, 2023
daaa25a
fix: code review
davisusanibar Sep 7, 2023
ddd1ad0
fix: code review
davisusanibar Sep 7, 2023
9900315
fix: code review
davisusanibar Sep 7, 2023
5a4f462
Merge branch 'main' into GH-34252
davisusanibar Sep 12, 2023
4de37a3
fix: expect valid objects for projection and filter
davisusanibar Sep 12, 2023
173dbec
Update java/dataset/src/main/cpp/jni_wrapper.cc
davisusanibar Sep 14, 2023
a3ddae6
Merge branch 'main' into GH-34252
davisusanibar Sep 14, 2023
dadc809
Merge branch 'GH-34252' of github.com:davisusanibar/arrow into GH-34252
davisusanibar Sep 14, 2023
0c292cf
fix: code review
davisusanibar Sep 15, 2023
faadc50
fix: code review
davisusanibar Sep 15, 2023
29 changes: 24 additions & 5 deletions docs/source/java/dataset.rst
@@ -132,12 +132,10 @@ within method ``Scanner::schema()``:

.. _java-dataset-projection:

-Projection
-==========
+Projection (Subset of Columns)
+==============================

-User can specify projections in ScanOptions. For ``FileSystemDataset``, only
-column projection is allowed for now, which means, only column names
-in the projection list will be accepted. For example:
+Users can specify projections in ScanOptions. For example:

.. code-block:: Java

@@ -159,6 +157,27 @@ Or use shortcut construtor:

Then all columns will be emitted during scanning.

+Projection (Produce New Columns) and Filters
+============================================
+
+Users can specify projections (new columns) or filters in ScanOptions using Substrait. For example:
+
+.. code-block:: Java
+
+   // Use Substrait APIs to create an Expression and serialize to a ByteBuffer
+   ByteBuffer substraitExpressionFilter = getSubstraitExpressionFilter();
+   ByteBuffer substraitExpressionProject = getSubstraitExpressionProjection();
+   ScanOptions options = new ScanOptions.Builder(/*batchSize*/ 32768)
+       .columns(Optional.empty())
+       .substraitExpressionFilter(substraitExpressionFilter)
+       .substraitExpressionProjection(substraitExpressionProject)
+       .build();
+
+.. seealso::
+
+   :doc:`Executing Projections and Filters Using Extended Expressions <substrait>`
+      Projections and Filters using Substrait.

Read Data from HDFS
===================

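The added documentation calls ``getSubstraitExpressionFilter()`` and ``getSubstraitExpressionProjection()`` without defining them. The following is a minimal sketch of what such a helper might look like, assuming the serialized Substrait extended expression is already available as raw bytes or as Base64 text, and assuming the native scanner prefers a direct (off-heap) buffer. The class and method names are hypothetical and are not part of the Arrow API; only ``java.nio.ByteBuffer`` and ``java.util.Base64`` from the JDK are used.

```java
import java.nio.ByteBuffer;
import java.util.Base64;

// Hypothetical helper class; not part of Arrow or Substrait.
public class SubstraitBufferSketch {

    // Copy already-serialized expression bytes into a direct ByteBuffer.
    // Assumption: the bytes were produced elsewhere by a Substrait producer.
    static ByteBuffer toDirectBuffer(byte[] serialized) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(serialized.length);
        buffer.put(serialized);
        buffer.flip(); // reset position to 0 so the whole payload is readable
        return buffer;
    }

    // Convenience overload for expressions stored as Base64 text.
    static ByteBuffer fromBase64(String base64Expression) {
        return toDirectBuffer(Base64.getDecoder().decode(base64Expression));
    }

    public static void main(String[] args) {
        // Placeholder payload, NOT a valid Substrait message.
        ByteBuffer buffer = fromBase64("AQID");
        System.out.println(buffer.isDirect() + " " + buffer.remaining());
    }
}
```

A buffer built this way could then be passed to the builder methods shown in the diff in place of the undefined getters. Whether the scanner requires a direct buffer is an assumption to verify against the linked Substrait page.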