-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(spark): Refactoring datasources #514
feat(spark): Refactoring datasources #514
Conversation
On branch 493-datasources-spark-refactoring Changes to be committed: modified: maven-projects/spark/datasources-33/src/main/scala/org/apache/graphar/datasources/GarDataSource.scala renamed: maven-projects/spark/datasources-33/src/main/scala/org/apache/graphar/datasources/GarCommitProtocol.scala -> maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/GarCommitProtocol.scala renamed: maven-projects/spark/datasources-33/src/main/scala/org/apache/graphar/datasources/GarScan.scala -> maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/GarScan.scala renamed: maven-projects/spark/datasources-33/src/main/scala/org/apache/graphar/datasources/GarScanBuilder.scala -> maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/GarScanBuilder.scala renamed: maven-projects/spark/datasources-33/src/main/scala/org/apache/graphar/datasources/GarTable.scala -> maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/GarTable.scala renamed: maven-projects/spark/datasources-33/src/main/scala/org/apache/graphar/datasources/GarWriterBuilder.scala -> maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/GarWriteBuilder.scala renamed: maven-projects/spark/datasources-33/src/main/scala/org/apache/graphar/datasources/csv/CSVWriterBuilder.scala -> maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/csv/CSVWriteBuilder.scala renamed: maven-projects/spark/datasources-33/src/main/scala/org/apache/graphar/datasources/orc/OrcOutputWriter.scala -> maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/orc/OrcOutputWriter.scala renamed: maven-projects/spark/datasources-33/src/main/scala/org/apache/graphar/datasources/orc/OrcWriteBuilder.scala -> maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/orc/OrcWriteBuilder.scala renamed: maven-projects/spark/datasources-33/src/main/scala/org/apache/graphar/datasources/parquet/ParquetWriterBuilder.scala -> maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/parquet/ParquetWriteBuilder.scala
@acezen What do you think about this? For me it looks like there are no changes for public API |
Hi, Sem, thanks for the draft, change to |
Yes, it should be. For example, check this:
but coordinates are still So, if you OK with it I will update 32 too and fix license headers. |
Looks NICE! The change is OK for me. Thanks for the work. |
On branch 493-datasources-spark-refactoring Changes to be committed: modified: licenserc.toml modified: maven-projects/spark/datasources-32/src/main/scala/org/apache/graphar/datasources/GarDataSource.scala renamed: maven-projects/spark/datasources-32/src/main/scala/org/apache/graphar/datasources/GarCommitProtocol.scala -> maven-projects/spark/datasources-32/src/main/scala/org/apache/spark/sql/graphar/GarCommitProtocol.scala renamed: maven-projects/spark/datasources-32/src/main/scala/org/apache/graphar/datasources/GarScan.scala -> maven-projects/spark/datasources-32/src/main/scala/org/apache/spark/sql/graphar/GarScan.scala renamed: maven-projects/spark/datasources-32/src/main/scala/org/apache/graphar/datasources/GarScanBuilder.scala -> maven-projects/spark/datasources-32/src/main/scala/org/apache/spark/sql/graphar/GarScanBuilder.scala renamed: maven-projects/spark/datasources-32/src/main/scala/org/apache/graphar/datasources/GarTable.scala -> maven-projects/spark/datasources-32/src/main/scala/org/apache/spark/sql/graphar/GarTable.scala renamed: maven-projects/spark/datasources-32/src/main/scala/org/apache/graphar/datasources/GarWriterBuilder.scala -> maven-projects/spark/datasources-32/src/main/scala/org/apache/spark/sql/graphar/GarWriteBuilder.scala renamed: maven-projects/spark/datasources-32/src/main/scala/org/apache/graphar/datasources/csv/CSVWriterBuilder.scala -> maven-projects/spark/datasources-32/src/main/scala/org/apache/spark/sql/graphar/csv/CSVWriteBuilder.scala renamed: maven-projects/spark/datasources-32/src/main/scala/org/apache/graphar/datasources/orc/OrcOutputWriter.scala -> maven-projects/spark/datasources-32/src/main/scala/org/apache/spark/sql/graphar/orc/OrcOutputWriter.scala renamed: maven-projects/spark/datasources-32/src/main/scala/org/apache/graphar/datasources/orc/OrcWriteBuilder.scala -> maven-projects/spark/datasources-32/src/main/scala/org/apache/spark/sql/graphar/orc/OrcWriteBuilder.scala renamed: maven-projects/spark/datasources-32/src/main/scala/org/apache/graphar/datasources/parquet/ParquetWriterBuilder.scala -> maven-projects/spark/datasources-32/src/main/scala/org/apache/spark/sql/graphar/parquet/ParquetWriteBuilder.scala modified: maven-projects/spark/datasources-33/src/main/scala/org/apache/graphar/datasources/GarDataSource.scala modified: maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/GarCommitProtocol.scala modified: maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/GarScan.scala modified: maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/GarScanBuilder.scala modified: maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/GarTable.scala modified: maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/GarWriteBuilder.scala modified: maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/csv/CSVWriteBuilder.scala modified: maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/orc/OrcOutputWriter.scala modified: maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/orc/OrcWriteBuilder.scala modified: maven-projects/spark/datasources-33/src/main/scala/org/apache/spark/sql/graphar/parquet/ParquetWriteBuilder.scala
Ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* feat(spark): Refactoring datasources (#514) ### Reason for this PR By moving datasources under `org.apache.spark.sql` we are able to access private Spark API. Last time when I was trying to fully migrate datasources to V2 it was a blocker. Detailed motivation is in #493 ### What changes are included in this PR? Mostly refactoring. ### Are these changes tested? Unit tests are passed I manually checked the generated JARs: ![image](https://github.com/apache/incubator-graphar/assets/29755009/1b094516-88b1-490a-a2ea-8dcd092a3b1d) ### Are there any user-facing changes? Mostly not because `GarDataSource` was left under the same package. Close #493 * feat(dev): Add release and verify scripts (#507) Reason for this PR Add scripts for developer or release manager to easily release version or verify a version. What changes are included in this PR? Add release and verify scripts related document is updated to website, see Update the release and verify document, and add development document incubator-graphar-website#18 Are these changes tested? yes Are there any user-facing changes? no --------- Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * chore: Bump to version v0.12.0 (Round 1) (#517) Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * chore: Add CHANGELOG.md (#513) Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * Introduce buf - v2 - buf.gen - buf On branch format-definition-dev Your branch is up to date with 'origin/format-definition-dev'. Changes to be committed: new file: buf.gen.yaml new file: buf.yaml modified: format/adjacent_list.proto modified: format/edge_info.proto modified: format/graph_info.proto modified: format/property_group.proto modified: format/types.proto modified: format/vertex_info.proto --------- Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> Co-authored-by: Weibin Zeng <qiaozi.zwb@alibaba-inc.com>
* feat(spark): Refactoring datasources (#514) ### Reason for this PR By moving datasources under `org.apache.spark.sql` we are able to access private Spark API. Last time when I was trying to fully migrate datasources to V2 it was a blocker. Detailed motivation is in #493 ### What changes are included in this PR? Mostly refactoring. ### Are these changes tested? Unit tests are passed I manually checked the generated JARs: ![image](https://github.com/apache/incubator-graphar/assets/29755009/1b094516-88b1-490a-a2ea-8dcd092a3b1d) ### Are there any user-facing changes? Mostly not because `GarDataSource` was left under the same package. Close #493 * feat(dev): Add release and verify scripts (#507) Reason for this PR Add scripts for developer or release manager to easily release version or verify a version. What changes are included in this PR? Add release and verify scripts related document is updated to website, see Update the release and verify document, and add development document incubator-graphar-website#18 Are these changes tested? yes Are there any user-facing changes? no --------- Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * chore: Bump to version v0.12.0 (Round 1) (#517) Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * chore: Add CHANGELOG.md (#513) Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * Introduce buf - v2 - buf.gen - buf On branch format-definition-dev Your branch is up to date with 'origin/format-definition-dev'. Changes to be committed: new file: buf.gen.yaml new file: buf.yaml modified: format/adjacent_list.proto modified: format/edge_info.proto modified: format/graph_info.proto modified: format/property_group.proto modified: format/types.proto modified: format/vertex_info.proto --------- Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> Co-authored-by: Weibin Zeng <qiaozi.zwb@alibaba-inc.com>
### Reason for this PR By moving datasources under `org.apache.spark.sql` we are able to access private Spark API. Last time when I was trying to fully migrate datasources to V2 it was a blocker. Detailed motivation is in apache#493 ### What changes are included in this PR? Mostly refactoring. ### Are these changes tested? Unit tests are passed I manually checked the generated JARs: ![image](https://github.com/apache/incubator-graphar/assets/29755009/1b094516-88b1-490a-a2ea-8dcd092a3b1d) ### Are there any user-facing changes? Mostly not because `GarDataSource` was left under the same package. Close apache#493
Reason for this PR
By moving datasources under
org.apache.spark.sql
we are able to access private Spark API. Last time when I was trying to fully migrate datasources to V2 it was a blocker. Detailed motivation is in #493What changes are included in this PR?
Mostly refactoring.
Are these changes tested?
Unit tests are passed
I manually checked the generated JARs:
Are there any user-facing changes?
Mostly not because
GarDataSource
was left under the same package.Close #493