feat (format): Introduce buf #519

Merged
6 changes: 6 additions & 0 deletions .devcontainer/graphar-dev.Dockerfile
@@ -40,6 +40,12 @@ RUN git clone --branch v1.8.3 https://github.com/google/benchmark.git /tmp/bench
&& make install \
&& rm -rf /tmp/benchmark

RUN git clone --branch v3.6.0 https://github.com/catchorg/Catch2.git /tmp/catch2 --depth 1 \
&& cd /tmp/catch2 \
&& cmake -Bbuild -H. -DBUILD_TESTING=OFF \
&& cmake --build build/ --target install \
&& rm -rf /tmp/catch2

ENV LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/lib:/usr/local/lib64
ENV JAVA_HOME=/usr/lib/jvm/default-java

360 changes: 360 additions & 0 deletions CHANGELOG.md

Large diffs are not rendered by default.

13 changes: 8 additions & 5 deletions CONTRIBUTING.md
@@ -97,8 +97,7 @@ For small or first-time contributions, we recommend the dev container method. An
### Using a dev container environment

GraphAr provides a pre-configured [dev container](https://containers.dev/)
that could be used in [GitHub Codespaces](https://github.com/features/codespaces),
[VSCode](https://code.visualstudio.com/docs/devcontainers/containers), [JetBrains](https://www.jetbrains.com/remote-development/gateway/),
that could be used in [VSCode](https://code.visualstudio.com/docs/devcontainers/containers), [JetBrains](https://www.jetbrains.com/remote-development/gateway/),
[JupyterLab](https://jupyterlab.readthedocs.io/en/stable/).
Please pick up your favorite runtime environment.

@@ -107,6 +106,10 @@ Please pick up your favorite runtime environment.
Different components of GraphAr may require different setup steps. Please refer to their respective `README` documentation for more details.

- [C++ Library](cpp/README.md)
- [Java Library](java/README.md)
- [Spark Library](spark/README.md)
- [PySpark Library](pyspark/README.md)
- [Scala with Spark Library](spark/README.md)
- [Python with PySpark Library](pyspark/README.md) (under development)
- [Java Library](java/README.md) (under development)

----

This document is adapted from [Apache OpenDAL](https://opendal.apache.org/).
16 changes: 13 additions & 3 deletions LICENSE
@@ -212,7 +212,7 @@ Apache-2.0 licenses
The following components are provided under the Apache-2.0 License. See project link for details.
The text of each license is the standard Apache 2.0 license.

* spark 3.1.1 and 3.3.4 (https://github.com/apache/spark)
* Apache Spark 3.1.1 and 3.3.4 (https://github.com/apache/spark)
Files:
maven-projects/spark/datasourcs-32/src/main/scala/org/apache/graphar/datasources/GarCommitProtocol.scala
maven-projects/spark/datasourcs-32/src/main/scala/org/apache/graphar/datasources/GarDataSource.scala
@@ -234,9 +234,13 @@ The text of each license is the standard Apache 2.0 license.
maven-projects/spark/datasourcs-33/src/main/scala/org/apache/graphar/datasources/orc/ORCOutputWriter.scala
maven-projects/spark/datasourcs-33/src/main/scala/org/apache/graphar/datasources/orc/ORCWriteBuilder.scala
maven-projects/spark/datasourcs-33/src/main/scala/org/apache/graphar/datasources/parquet/ParquetWriteBuilder.scala
are modified from spark.
are modified from Apache Spark.

* Apache Arrow 12.0.0 (https://github.com/apache/arrow)
Files:
dev/release/setup-ubuntu.sh
are modified from Apache Arrow.

* arrow 12.0.0 (https://github.com/apache/arrow)
* fastFFI v0.1.2 (https://github.com/alibaba/fastFFI)
Files:
maven-projects/java/src/main/java/org/apache/graphar/stdcxx/StdString.java
@@ -251,6 +255,12 @@ The text of each license is the standard Apache 2.0 license.
maven-projects/java/src/main/java/org/apache/graphar/stdcxx/StdUnorderedMap.java
are modified from GraphScope.

* Apache OpenDAL v0.45.1 (https://github.com/apache/opendal)
Files:
dev/release/release.py
dev/release/verify.py
are modified from OpenDAL.

================================================================
MIT licenses
================================================================
8 changes: 8 additions & 0 deletions NOTICE
@@ -31,3 +31,11 @@ which includes the following in its NOTICE file:

fastFFI
Copyright 1999-2021 Alibaba Group Holding Ltd.

--------------------------------------------------------------------------------

This product includes code from Apache OpenDAL, which includes the following in
its NOTICE file:

Apache OpenDAL
Copyright 2022 and onwards The Apache Software Foundation.
19 changes: 12 additions & 7 deletions README.md
@@ -207,24 +207,29 @@ See [GraphAr C++
Library](./cpp) for
details about the building of the C++ library.


### The Scala with Spark Library

See [GraphAr Spark
Library](./maven-projects/spark)
for details about the Scala with Spark library.

### The Java Library

The Java library is under development.

The GraphAr Java library is created with bindings to the C++ library
(currently at version v0.10.0), utilizing
[Alibaba-FastFFI](https://github.com/alibaba/fastFFI) for
implementation. See [GraphAr Java
Library](./maven-projects/java) for
details about the building of the Java library.

### The Spark Library

See [GraphAr Spark
Library](./maven-projects/spark)
for details about the Spark library.
### The Python with PySpark Library

### The PySpark Library
The Python with PySpark library is under development.

The GraphAr PySpark library is developed as bindings to the GraphAr
The PySpark library is developed as bindings to the GraphAr
Spark library. See [GraphAr PySpark
Library](./pyspark)
for details about the PySpark library.
18 changes: 18 additions & 0 deletions buf.gen.yaml
@@ -0,0 +1,18 @@
version: v2
managed:
  enabled: true
  disable:
    - file_option: java_package
plugins:
  # Python classes
  - remote: buf.build/protocolbuffers/python:v27.1
    out: pyspark/graphar_pyspark/proto/
  # Python headers for IDEs and MyPy
  - remote: buf.build/protocolbuffers/pyi
    out: pyspark/graphar_pyspark/proto/
  # Cpp
  - remote: buf.build/protocolbuffers/cpp:v27.1
    out: cpp/src/proto
  # Java
  - remote: buf.build/protocolbuffers/java:v27.1
    out: maven-projects/info/src/main/java/
3 changes: 3 additions & 0 deletions buf.yaml
@@ -0,0 +1,3 @@
version: v2
modules:
  - path: format
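
Note: with `buf.yaml` and `buf.gen.yaml` at the repository root, generation is normally triggered with the `buf generate` CLI command. The sketch below only illustrates how a build script might wrap that call. The helper names `buf_generate_command` and `run_buf_generate` are hypothetical and are not part of this PR, and the wrapper assumes the `buf` binary is on `PATH`.

```python
import shutil
import subprocess


def buf_generate_command(template="buf.gen.yaml"):
    """Build the argument list for `buf generate` with an explicit template file."""
    return ["buf", "generate", "--template", template]


def run_buf_generate(repo_root="."):
    """Run buf from repo_root; return False when the buf CLI is not installed."""
    if shutil.which("buf") is None:
        return False
    # check=True raises CalledProcessError if generation fails.
    subprocess.run(buf_generate_command(), cwd=repo_root, check=True)
    return True
```

Calling `run_buf_generate(".")` from the repository root would be expected to write the generated sources into the `out` directories listed in `buf.gen.yaml`.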
4 changes: 2 additions & 2 deletions cpp/CMakeLists.txt
@@ -32,8 +32,8 @@ if (CMAKE_VERSION VERSION_GREATER_EQUAL "3.24.0")
endif()

set(GRAPHAR_MAJOR_VERSION 0)
set(GRAPHAR_MINOR_VERSION 11)
set(GRAPHAR_PATCH_VERSION 4)
set(GRAPHAR_MINOR_VERSION 12)
set(GRAPHAR_PATCH_VERSION 0)
set(GREAPHAR_VERSION ${GRAPHAR_MAJOR_VERSION}.${GRAPHAR_MINOR_VERSION}.${GRAPHAR_PATCH_VERSION})
project(graphar-cpp LANGUAGES C CXX VERSION ${GREAPHAR_VERSION})
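
Note: the three `set(...)` calls above are what `dev/release/release.py` in this PR reads to derive the release version. The following is a minimal, simplified sketch of that lookup, run against an inline sample rather than the real file. `CMAKE_SNIPPET` and `parse_graphar_version` are illustrative names, not code from the PR.

```python
import re

# Sample text mirroring the three set() calls in cpp/CMakeLists.txt.
CMAKE_SNIPPET = """\
set(GRAPHAR_MAJOR_VERSION 0)
set(GRAPHAR_MINOR_VERSION 12)
set(GRAPHAR_PATCH_VERSION 0)
"""


def parse_graphar_version(cmake_text):
    """Return 'major.minor.patch', or None if any component is missing."""
    parts = []
    for name in ("MAJOR", "MINOR", "PATCH"):
        pattern = rf"set\s*\(\s*GRAPHAR_{name}_VERSION\s+(\d+)\s*\)"
        match = re.search(pattern, cmake_text, re.IGNORECASE)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


print(parse_graphar_version(CMAKE_SNIPPET))  # prints 0.12.0
```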

4 changes: 1 addition & 3 deletions cpp/README.md
@@ -67,9 +67,7 @@ repository and navigated to the ``cpp`` subdirectory with:

```bash
$ git clone https://github.com/apache/graphar.git
$ cd graphar
$ git submodule update --init
$ cd cpp
$ cd graphar/cpp
```

Release build:
3 changes: 1 addition & 2 deletions cpp/test/test_arrow_chunk_reader.cc
@@ -158,8 +158,7 @@ TEST_CASE_METHOD(GlobalFixture, "ArrowChunkReader") {
<< '\n';
std::cout << "Column Nums: " << table->num_columns() << "\n";
std::cout << "Column Names: ";
for (int i = 0;
i < table->num_columns() && i < expected_cols.size(); i++) {
for (int i = 0; i < table->num_columns(); i++) {
REQUIRE(table->ColumnNames()[i] == expected_cols[i]);
std::cout << "`" << table->ColumnNames()[i] << "` ";
}
32 changes: 32 additions & 0 deletions dev/download_test_data.sh
@@ -0,0 +1,32 @@
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# A script to download test data for GraphAr

if [ -n "${GAR_TEST_DATA}" ]; then
    if [[ ! -d "$GAR_TEST_DATA" ]]; then
        echo "GAR_TEST_DATA is set but the directory does not exist, cloning the test data to $GAR_TEST_DATA"
        git clone https://github.com/apache/incubator-graphar-testing.git "$GAR_TEST_DATA" --depth 1 || true
    fi
else
    echo "GAR_TEST_DATA is not set, cloning the test data to /tmp/graphar-testing"
    git clone https://github.com/apache/incubator-graphar-testing.git /tmp/graphar-testing --depth 1 || true
    echo "Test data has been cloned to /tmp/graphar-testing, please run"
    echo "    export GAR_TEST_DATA=/tmp/graphar-testing"
fi
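
Note: the branching in this script can be summarised as a small decision function. The sketch below is only an illustration of that logic in Python. `plan_test_data` and `DEFAULT_TEST_DATA_DIR` are hypothetical names, and the function merely reports what the script would do without running `git clone`.

```python
import os

# Fallback location the script clones into when GAR_TEST_DATA is unset.
DEFAULT_TEST_DATA_DIR = "/tmp/graphar-testing"


def plan_test_data(env, dir_exists=os.path.isdir):
    """Return (target_dir, needs_clone) following the script's branches."""
    target = env.get("GAR_TEST_DATA", "")
    if target:
        # Variable set: clone only when the directory is missing.
        return target, not dir_exists(target)
    # Variable unset or empty: the script always attempts a clone to the default.
    return DEFAULT_TEST_DATA_DIR, True
```

For example, `plan_test_data(os.environ)` reports where the data would end up for the current shell environment. In the unset case the script does not export the variable itself, so the caller still has to run the printed `export` line.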
22 changes: 22 additions & 0 deletions dev/release/conda_env_cpp.txt
@@ -0,0 +1,22 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

cmake
conda-forge::arrow-cpp=13.0.0
make
clangxx_linux-64
conda-forge::catch2=3.6.0
19 changes: 19 additions & 0 deletions dev/release/conda_env_scala.txt
@@ -0,0 +1,19 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

maven
openjdk=11.0.13
119 changes: 119 additions & 0 deletions dev/release/release.py
@@ -0,0 +1,119 @@
#!/usr/bin/env python3
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# Derived from Apache OpenDAL v0.45.1
# https://github.com/apache/opendal/blob/5079125/scripts/release.py

import re
import subprocess
from pathlib import Path

ROOT_DIR = Path(__file__).parent.parent.parent

def get_package_version():
    major_version = None
    minor_version = None
    patch_version = None
    major_pattern = re.compile(r'set\s*\(\s*GRAPHAR_MAJOR_VERSION\s+(\d+)\s*\)', re.IGNORECASE)
    minor_pattern = re.compile(r'set\s*\(\s*GRAPHAR_MINOR_VERSION\s+(\d+)\s*\)', re.IGNORECASE)
    patch_pattern = re.compile(r'set\s*\(\s*GRAPHAR_PATCH_VERSION\s+(\d+)\s*\)', re.IGNORECASE)

    file_path = ROOT_DIR / "cpp/CMakeLists.txt"
    with open(file_path, 'r') as file:
        for line in file:
            major_match = major_pattern.search(line)
            minor_match = minor_pattern.search(line)
            patch_match = patch_pattern.search(line)

            if major_match:
                major_version = major_match.group(1)
            if minor_match:
                minor_version = minor_match.group(1)
            if patch_match:
                patch_version = patch_match.group(1)

    if major_version and minor_version and patch_version:
        return f"{major_version}.{minor_version}.{patch_version}"
    else:
        return None

def archive_source_package():
    print(f"Archive source package started")

    version = get_package_version()
    assert version, "Failed to get the package version"
    name = f"apache-graphar-{version}-incubating-src"

    archive_command = [
        "git",
        "archive",
        "--prefix",
        f"apache-graphar-{version}-incubating-src/",
        "-o",
        f"{ROOT_DIR}/dist/{name}.tar.gz",
        "HEAD",
    ]
    subprocess.run(
        archive_command,
        cwd=ROOT_DIR,
        check=True,
    )

    print(f"Archive source package to dist/{name}.tar.gz")


def generate_signature():
    for i in Path(ROOT_DIR / "dist").glob("*.tar.gz"):
        print(f"Generate signature for {i}")
        subprocess.run(
            ["gpg", "--yes", "--armor", "--output", f"{i}.asc", "--detach-sig", str(i)],
            cwd=ROOT_DIR / "dist",
            check=True,
        )

    for i in Path(ROOT_DIR / "dist").glob("*.tar.gz"):
        print(f"Check signature for {i}")
        subprocess.run(
            ["gpg", "--verify", f"{i}.asc", str(i)], cwd=ROOT_DIR / "dist", check=True
        )


def generate_checksum():
    for i in Path(ROOT_DIR / "dist").glob("*.tar.gz"):
        print(f"Generate checksum for {i}")
        subprocess.run(
            ["sha512sum", str(i.relative_to(ROOT_DIR / "dist"))],
            stdout=open(f"{i}.sha512", "w"),
            cwd=ROOT_DIR / "dist",
            check=True,
        )

    for i in Path(ROOT_DIR / "dist").glob("*.tar.gz"):
        print(f"Check checksum for {i}")
        subprocess.run(
            ["sha512sum", "--check", f"{str(i.relative_to(ROOT_DIR / 'dist'))}.sha512"],
            cwd=ROOT_DIR / "dist",
            check=True,
        )


if __name__ == "__main__":
    (ROOT_DIR / "dist").mkdir(exist_ok=True)
    archive_source_package()
    generate_signature()
    generate_checksum()
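
Note: `generate_checksum()` above shells out to `sha512sum`, which may not be available on every platform. As a rough, standard-library-only sketch of the same write-then-verify round trip, something like the following could be used. The helpers `sha512_of`, `write_checksum`, and `verify_checksum` are hypothetical and not part of this PR, and the file layout (`<hex>  <name>` on one line) is assumed to match `sha512sum`'s default text output.

```python
import hashlib
import tempfile
from pathlib import Path


def sha512_of(path):
    """Hash a file in 1 MiB chunks so large archives are not loaded at once."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum(archive):
    """Write '<hex>  <name>' next to the archive and return the checksum path."""
    archive = Path(archive)
    checksum_file = archive.with_name(archive.name + ".sha512")
    checksum_file.write_text(f"{sha512_of(archive)}  {archive.name}\n")
    return checksum_file


def verify_checksum(checksum_file):
    """Recompute the hash of the file named in checksum_file and compare."""
    checksum_file = Path(checksum_file)
    expected, name = checksum_file.read_text().split(maxsplit=1)
    name = name.strip().lstrip("*")  # tolerate sha512sum's binary-mode marker
    return sha512_of(checksum_file.with_name(name)) == expected


# Demonstration on a throwaway file rather than a real release archive.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "example-src.tar.gz"
    archive.write_bytes(b"placeholder archive contents")
    print(verify_checksum(write_checksum(archive)))  # prints True
```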