-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Spark] Move datasources under org.apache.spark package #493
Comments
There is a problem if we move to org.apache.spark is that we can not release the datasource maven packages to the org.apache.spark in the future. |
It may also simplify the full support of |
But do we need it? We may release graphar itself and we may release PyPI package that already include JARs from datasources. Should datasources be released separately? |
The graphar package relys on datasource, I don't know how can we release graphar without release the datasource package? |
It will be a compile time dependency. For example, like in delta-spark: https://github.com/delta-io/delta/tree/master/spark/src/main/scala/org/apache/spark/sql |
It seems that as a compile time dependency, we need to put datasource back to maven-projects/graphar? Or do you know is there a way that datasource still be separated from graphar and can be a compile time dependency too? That would be perfect for this issue. |
I can experiment with it. It seems to me that yes, it should be possible. |
Great, thanks Sem, Can I assign this issue to you? |
### Reason for this PR By moving datasources under `org.apache.spark.sql` we are able to access private Spark API. Last time when I was trying to fully migrate datasources to V2 it was a blocker. Detailed motivation is in #493 ### What changes are included in this PR? Mostly refactoring. ### Are these changes tested? Unit tests are passed I manually checked the generated JARs: ![image](https://github.com/apache/incubator-graphar/assets/29755009/1b094516-88b1-490a-a2ea-8dcd092a3b1d) ### Are there any user-facing changes? Mostly not because `GarDataSource` was left under the same package. Close #493
* feat(spark): Refactoring datasources (#514) ### Reason for this PR By moving datasources under `org.apache.spark.sql` we are able to access private Spark API. Last time when I was trying to fully migrate datasources to V2 it was a blocker. Detailed motivation is in #493 ### What changes are included in this PR? Mostly refactoring. ### Are these changes tested? Unit tests are passed I manually checked the generated JARs: ![image](https://github.com/apache/incubator-graphar/assets/29755009/1b094516-88b1-490a-a2ea-8dcd092a3b1d) ### Are there any user-facing changes? Mostly not because `GarDataSource` was left under the same package. Close #493 * feat(dev): Add release and verify scripts (#507) Reason for this PR Add scripts for developer or release manager to easily release version or verify a version. What changes are included in this PR? Add release and verify scripts related document is updated to website, see Update the release and verify document, and add development document incubator-graphar-website#18 Are these changes tested? yes Are there any user-facing changes? no --------- Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * chore: Bump to version v0.12.0 (Round 1) (#517) Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * chore: Add CHANGELOG.md (#513) Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * Introduce buf - v2 - buf.gen - buf On branch format-definition-dev Your branch is up to date with 'origin/format-definition-dev'. Changes to be committed: new file: buf.gen.yaml new file: buf.yaml modified: format/adjacent_list.proto modified: format/edge_info.proto modified: format/graph_info.proto modified: format/property_group.proto modified: format/types.proto modified: format/vertex_info.proto --------- Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> Co-authored-by: Weibin Zeng <qiaozi.zwb@alibaba-inc.com>
* feat(spark): Refactoring datasources (#514) ### Reason for this PR By moving datasources under `org.apache.spark.sql` we are able to access private Spark API. Last time when I was trying to fully migrate datasources to V2 it was a blocker. Detailed motivation is in #493 ### What changes are included in this PR? Mostly refactoring. ### Are these changes tested? Unit tests are passed I manually checked the generated JARs: ![image](https://github.com/apache/incubator-graphar/assets/29755009/1b094516-88b1-490a-a2ea-8dcd092a3b1d) ### Are there any user-facing changes? Mostly not because `GarDataSource` was left under the same package. Close #493 * feat(dev): Add release and verify scripts (#507) Reason for this PR Add scripts for developer or release manager to easily release version or verify a version. What changes are included in this PR? Add release and verify scripts related document is updated to website, see Update the release and verify document, and add development document incubator-graphar-website#18 Are these changes tested? yes Are there any user-facing changes? no --------- Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * chore: Bump to version v0.12.0 (Round 1) (#517) Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * chore: Add CHANGELOG.md (#513) Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> * Introduce buf - v2 - buf.gen - buf On branch format-definition-dev Your branch is up to date with 'origin/format-definition-dev'. Changes to be committed: new file: buf.gen.yaml new file: buf.yaml modified: format/adjacent_list.proto modified: format/edge_info.proto modified: format/graph_info.proto modified: format/property_group.proto modified: format/types.proto modified: format/vertex_info.proto --------- Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com> Co-authored-by: Weibin Zeng <qiaozi.zwb@alibaba-inc.com>
### Reason for this PR By moving datasources under `org.apache.spark.sql` we are able to access private Spark API. Last time when I was trying to fully migrate datasources to V2 it was a blocker. Detailed motivation is in apache#493 ### What changes are included in this PR? Mostly refactoring. ### Are these changes tested? Unit tests are passed I manually checked the generated JARs: ![image](https://github.com/apache/incubator-graphar/assets/29755009/1b094516-88b1-490a-a2ea-8dcd092a3b1d) ### Are there any user-facing changes? Mostly not because `GarDataSource` was left under the same package. Close apache#493
Describe the enhancement requested
Currently datasources implementation are under
org.apache.graphar
space. I suggest to move these classes toorg.apache.spark
package. It allows us, for example, to use spark internals. For example, it resolve the problem in #488 whenJSONOptions
isprivate[sql]
.Component(s)
Spark, PySpark
The text was updated successfully, but these errors were encountered: