[BSE-4386] Add Linux ARM Support #83

Merged
merged 25 commits into from
Jan 2, 2025
1 change: 1 addition & 0 deletions .github/actionlint.yml
@@ -7,3 +7,4 @@ self-hosted-runner:
- self-hosted-small
- self-hosted-medium
- self-hosted-large
- self-hosted-xlarge
10 changes: 7 additions & 3 deletions .github/workflows/_build_bodo_conda_linux_comm.yml
@@ -11,11 +11,15 @@ on:
required: false
type: boolean
default: false
arm:
description: 'Is this an ARM build'
required: true
type: boolean

jobs:
build-bodo:
runs-on: ubuntu-latest
container: condaforge/linux-anvil-alma-x86_64:8
runs-on: ${{ inputs.arm && 'self-arm-medium' || 'ubuntu-latest' }}
container: ${{ inputs.arm && 'condaforge/linux-anvil-aarch64:alma8' || 'condaforge/linux-anvil-x86_64:alma8' }}
permissions:
id-token: write
contents: read
@@ -70,7 +74,7 @@ jobs:
- name: Upload Conda Package
uses: actions/upload-artifact@v4
with:
name: bodo-conda-linux-${{ inputs.python-version }}-community
name: bodo-conda-${{ inputs.python-version }}-linux-${{ inputs.arm && 'arm' || 'x86' }}
path: /github/home/conda-bld/
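
The `inputs.arm && X || Y` expressions in this workflow behave like a ternary. As a rough illustration, the same selection can be sketched in Python, assuming Python's `and`/`or` as a stand-in for the Actions operators. `select_build_config` is a hypothetical helper written for this note and is not part of the repository.

```python
def select_build_config(arm: bool, python_version: str) -> dict:
    """Mirror the workflow's `cond && a || b` selections (illustrative only)."""
    # Python's `and`/`or` return one of their operands, much like the
    # `&&`/`||` operators in GitHub Actions expressions.
    runs_on = arm and "self-arm-medium" or "ubuntu-latest"
    container = (
        arm
        and "condaforge/linux-anvil-aarch64:alma8"
        or "condaforge/linux-anvil-x86_64:alma8"
    )
    arch = arm and "arm" or "x86"
    return {
        "runs_on": runs_on,
        "container": container,
        "artifact": f"bodo-conda-{python_version}-linux-{arch}",
    }


x86_cfg = select_build_config(arm=False, python_version="3.12")
arm_cfg = select_build_config(arm=True, python_version="3.12")
print(x86_cfg["artifact"])  # bodo-conda-3.12-linux-x86
print(arm_cfg["artifact"])  # bodo-conda-3.12-linux-arm
```

The x86 name is the one `_build_bodosql_conda.yml` downloads. If the middle operand were an empty string, both the Actions expression and this sketch would fall through to the last value, so the idiom is only safe with non-empty values.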

- name: Publish to Artifactory / Anaconda
2 changes: 1 addition & 1 deletion .github/workflows/_build_bodo_conda_native.yml
@@ -1,4 +1,4 @@
name: Bodo Conda Build (Linux)
name: Bodo Conda Build (Other)
on:
workflow_call:
inputs:
33 changes: 19 additions & 14 deletions .github/workflows/_build_bodo_pip.yml
@@ -7,23 +7,21 @@ on:
description: 'Operating System to Build On/For'
type: string
required: true
name:
description: 'Name of the OS to Build For'
type: string
required: true
bodo_version:
description: 'Bodo Version to Build'
type: string
required: true

# Recommended with setup-micromamba
# https://github.com/mamba-org/setup-micromamba#about-login-shells
defaults:
run:
shell: bash -leo pipefail {0}

jobs:
build_bodo_wheels:
permissions:
id-token: write
contents: read
name: Build Bodo Wheels for ${{ inputs.os }}
name: Build Bodo Wheels for ${{ inputs.name }}
Contributor

Thank you :). Having a name is more descriptive and easier to follow than the numbers 13 and 14.

runs-on: ${{ inputs.os }}

steps:
@@ -38,7 +36,7 @@ jobs:

# https://github.com/actions/runner-images/issues/10624
- name: Fix PATH in .bashrc
if: inputs.os == 'macos-14'
if: inputs.name == 'macos-arm'
run: |
sed -i '' '/; export PATH;/d' ~/.bashrc
echo 'export PATH="/opt/homebrew/bin:/opt/homebrew/sbin:$PATH"' >> ~/.bashrc
@@ -59,9 +57,16 @@ jobs:
if: contains(inputs.os, 'macos')
run: |
pixi global install sccache
# TODO: Remove once the self-hosted runners use Ubuntu
- name: Install Pipx on Self-Hosted Runner
if: inputs.name == 'linux-arm'
run: |
sudo dnf install -y python3-pip
python3 -m pip install --user pipx
python3 -m pipx ensurepath

- name: Build Wheels
uses: pypa/cibuildwheel@v2.21.3
run: ${{ contains(inputs.os, 'macos') && 'pipx' || 'python3 -m pipx' }} run cibuildwheel==2.22.0
env:
CIBW_BEFORE_ALL_LINUX: |
# Install Pixi and Environment
@@ -82,7 +87,7 @@ jobs:
SCCACHE_REGION=us-east-2
SCCACHE_S3_USE_SSL=true
SCCACHE_S3_SERVER_SIDE_ENCRYPTION=true
MACOSX_DEPLOYMENT_TARGET=${{ inputs.os == 'macos-14' && '12.0' || '10.15' }}
MACOSX_DEPLOYMENT_TARGET=${{ inputs.name == 'macos-arm' && '12.0' || '10.15' }}
BODO_VENDOR_MPICH=1
PATH=$HOME/.pixi/bin:$PATH
CONDA_PREFIX=$(pwd)/.pixi/envs/pip-cpp
@@ -92,8 +97,8 @@ jobs:
LD_LIBRARY_PATH=/project/.pixi/envs/pip-cpp/lib
CFLAGS="-isystem /project/.pixi/envs/pip-cpp/include"
CPPFLAGS="-isystem /project/.pixi/envs/pip-cpp/include"
CC=/project/.pixi/envs/pip-cpp/bin/x86_64-conda-linux-gnu-gcc
CXX=/project/.pixi/envs/pip-cpp/bin/x86_64-conda-linux-gnu-g++
CC=/project/.pixi/envs/pip-cpp/bin/$(uname -m)-conda-linux-gnu-gcc
CXX=/project/.pixi/envs/pip-cpp/bin/$(uname -m)-conda-linux-gnu-g++
DISABLE_CCACHE=1
SCCACHE_BUCKET=engine-codebuild-cache
SCCACHE_REGION=us-east-2
@@ -123,10 +128,10 @@ jobs:
--exclude libmpi.so.12 --exclude libmpi.so.40
--exclude libarrow.so.1801 --exclude libarrow_acero.so.1801 --exclude libarrow_dataset.so.1801
--exclude libarrow_python.so --exclude libparquet.so.1801
--plat manylinux_2_35_x86_64 {wheel} -w {dest_dir} &&
--plat manylinux_2_35_$(uname -m) {wheel} -w {dest_dir} &&
python buildscripts/bodo/pip/manylinux/patch_libs_for_pip.py -p {dest_dir}

- uses: actions/upload-artifact@v4
with:
name: cibw-wheels-${{ inputs.os }}
name: cibw-wheels-${{ inputs.name }}
path: ./wheelhouse/*.whl
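
The `$(uname -m)` substitutions in this file let one set of environment values serve both x86_64 and aarch64 builds. A minimal Python sketch of that derivation follows; `linux_toolchain` is a hypothetical helper introduced here for illustration, and the paths are copied from the diff.

```python
import platform


def linux_toolchain(machine: str) -> dict:
    """Build the arch-dependent values the workflow derives from `uname -m`."""
    prefix = "/project/.pixi/envs/pip-cpp/bin"
    return {
        "CC": f"{prefix}/{machine}-conda-linux-gnu-gcc",
        "CXX": f"{prefix}/{machine}-conda-linux-gnu-g++",
        "plat": f"manylinux_2_35_{machine}",
    }


# platform.machine() reports the same string as `uname -m` on Linux
# ("x86_64" or "aarch64"); on other systems the value may differ.
print(linux_toolchain(platform.machine())["plat"])
print(linux_toolchain("aarch64")["CC"])
```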
4 changes: 1 addition & 3 deletions .github/workflows/_build_bodosql_conda.yml
@@ -41,7 +41,7 @@ jobs:
- name: 'Download the Bodo conda'
uses: actions/download-artifact@v4
with:
name: bodo-conda-linux-3.12-community
name: bodo-conda-3.12-linux-x86
run-id: ${{ github.event.workflow_run.id }}
path: ./Bodo-CondaPkg-Linux

@@ -72,8 +72,6 @@ jobs:
if: ${{ inputs.is-release }}
run: |
set -eo pipefail
echo "${{ secrets.PUBLISH_BINARY_SECRETS }}" > $HOME/secret_file
sudo chmod a+r $HOME/secret_file

echo "BODOSQL_VERSION: $BODOSQL_VERSION"
artifactory_channel=`./buildscripts/bodosql/get_channel.sh`
2 changes: 0 additions & 2 deletions .github/workflows/_build_iceberg_conda.yml
@@ -33,8 +33,6 @@ jobs:
- name: 'Set Secret File Permissions and Conda Build and Publish Iceberg Binary to Artifactory'
run: |
set -eo pipefail
echo "${{ secrets.PUBLISH_BINARY_SECRETS }}" > $HOME/secret_file
sudo chmod a+r $HOME/secret_file

artifactory_channel=`./buildscripts/iceberg/get_channel.sh`
echo "artifactory_channel: $artifactory_channel"
25 changes: 17 additions & 8 deletions .github/workflows/build_wheels.yml
@@ -31,23 +31,32 @@ jobs:
outputs:
bodo_version: ${{ steps.get_version.outputs.bodo_version }}

build_bodo_linux_wheels:
build_bodo_linux:
uses: ./.github/workflows/_build_bodo_pip.yml
needs: [get_version]
with:
# Only Linux x86 in this job
os: ubuntu-latest
name: linux-x86
bodo_version: ${{ needs.get_version.outputs.bodo_version }}
secrets: inherit

build_bodo_macos_wheels:
build_bodo_other:
Contributor

Why change this name? Isn't it still only Mac?

Contributor Author

In expectation of Windows :)

uses: ./.github/workflows/_build_bodo_pip.yml
needs: [get_version]
strategy:
fail-fast: false
matrix:
os: [macos-13, macos-14]
include:
- os: macos-13
name: macos-x86
- os: macos-latest
name: macos-arm
- os: self-arm-medium
name: linux-arm
with:
os: ${{ matrix.os }}
name: ${{ matrix.name }}
bodo_version: ${{ needs.get_version.outputs.bodo_version }}
secrets: inherit

@@ -58,12 +67,12 @@ jobs:
# The manylinux image we use to build the wheels can't install the wheels since it's too old.
# For this reason we test them separately
runs-on: ubuntu-latest
needs: build_bodo_linux_wheels
needs: build_bodo_linux
steps:
- uses: actions/download-artifact@v4
id: download-artifact
with:
pattern: cibw-wheels-ubuntu-*
pattern: cibw-wheels-linux-x86
path: .
- uses: actions/setup-python@v5
with:
@@ -104,7 +113,7 @@ jobs:
contents: read
name: Build no-arch BodoSQL wheels on ubuntu-latest
runs-on: ubuntu-latest
needs: build_bodo_linux_wheels
needs: build_bodo_linux

steps:
- name: Configure AWS Credentials
@@ -161,7 +170,7 @@ jobs:
- uses: actions/download-artifact@v4
id: download-bodo-artifact
with:
pattern: cibw-wheels-ubuntu-*
pattern: cibw-wheels-linux-x86
path: .
- uses: actions/download-artifact@v4
id: download-bodosql-artifact
@@ -231,7 +240,7 @@ jobs:
# TODO: Chain with E2E tests for effective testing

upload-all:
needs: [build-iceberg, test-iceberg-import, build_bodo_linux_wheels, build_bodo_macos_wheels, test-bodo-linux, build_bodosql_wheels, test-bodosql]
needs: [build-iceberg, test-iceberg-import, build_bodo_linux, build_bodo_other, test-bodo-linux, build_bodosql_wheels, test-bodosql]
permissions:
id-token: write
runs-on: ubuntu-latest
4 changes: 2 additions & 2 deletions .github/workflows/e2e_ci.yml
@@ -16,7 +16,7 @@ on:
jobs:
run-e2e:
name: Run E2E
runs-on: [self-hosted, xlarge]
runs-on: self-hosted-xlarge
steps:
- uses: actions/checkout@v4
- uses: prefix-dev/setup-pixi@v0.8.1
@@ -97,7 +97,7 @@ jobs:

run-examples:
name: Run Examples
runs-on: [self-hosted, large]
runs-on: self-hosted-large
steps:
- uses: actions/checkout@v4
- uses: prefix-dev/setup-pixi@v0.8.1
26 changes: 18 additions & 8 deletions .github/workflows/release.yml
@@ -1,9 +1,7 @@
name: Release
on:
schedule:
# Two events so that we can limit MacOS runs for cost
- cron: '0 21 * * 1,3,5' # 9PM EST Mon, Wed, Fri
- cron: '0 21 * * 2,4' # 9PM EST Tue, Thu
- cron: '0 21 * * 1,2,3,4,5' # 9PM EST Mon-Fri
release:
types: [published]
workflow_dispatch:
@@ -18,7 +16,7 @@ jobs:
# But for the platform package to have the best potential performance,
# we want to build outside of a container on the native architecture.
# MacOS also builds on the native VM (no docker container). Thus, they are grouped together.
bodo-conda-linux:
bodo-conda-linux-x86:
strategy:
# Don't cancel other jobs if one fails
fail-fast: false
@@ -28,10 +26,22 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
is-release: ${{ github.event_name == 'release' }}
arm: false
secrets: inherit
bodo-conda-linux-arm:
strategy:
# Don't cancel other jobs if one fails
fail-fast: false
matrix:
# On pull requests, only test building for 3.12
Contributor

Why is this the only version whose build we test on PRs?

Contributor Author

So it's faster. Plus, GitHub limits the number of concurrent macOS agents, so starting 3 at once is too much.
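
The `python-version` matrix expression in this job first picks one of two JSON strings and then parses the result with `fromJson`. A small Python sketch of that evaluation, using `json.loads` in place of `fromJson`; `python_versions` is a hypothetical name used only for this illustration.

```python
import json


def python_versions(event_name: str) -> list[str]:
    """Emulate the `fromJson(cond && '[...]' || '[...]')` matrix expression."""
    # Step 1: choose the JSON text based on the triggering event.
    if event_name == "pull_request":
        raw = '["3.12"]'
    else:
        raw = '["3.12", "3.11", "3.10"]'
    # Step 2: parse the text into a list, as fromJson does for the matrix.
    return json.loads(raw)


print(python_versions("pull_request"))  # ['3.12']
print(python_versions("schedule"))      # ['3.12', '3.11', '3.10']
```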

python-version: ${{ fromJson(github.event_name == 'pull_request' && '["3.12"]' || '["3.12", "3.11", "3.10"]') }}
uses: ./.github/workflows/_build_bodo_conda_linux_comm.yml
with:
python-version: ${{ matrix.python-version }}
is-release: ${{ github.event_name == 'release' }}
arm: true
secrets: inherit
bodo-conda-mac:
# Only run every other day to save on MacOS VM costs
if: github.event.schedule != '0 21 * * 1,3,5'
strategy:
# Don't cancel other jobs if one fails
fail-fast: false
@@ -64,14 +74,14 @@ jobs:
secrets: inherit

bodosql-conda:
needs: bodo-conda-linux
needs: bodo-conda-linux-x86
uses: ./.github/workflows/_build_bodosql_conda.yml
with:
is-release: ${{ github.event_name == 'release' }}
secrets: inherit

docker-img:
needs: [bodo-conda-linux, iceberg-conda, bodosql-conda, azurefs-sas-conda]
needs: [bodo-conda-linux-x86, iceberg-conda, bodosql-conda, azurefs-sas-conda]
uses: ./.github/workflows/docker_build_and_publish.yml
with:
is-release: ${{ github.event_name == 'release' }}
2 changes: 1 addition & 1 deletion BodoSQL/bodosql/tests/test_javascript_udfs.py
@@ -37,7 +37,7 @@
create or replace function test_regex_udf(A varchar) RETURNS DOUBLE LANGUAGE JAVASCRIPT AS
$$
try {
return parseInt(A.match(/(\d+).*?(\d+).*?(\d+)/)[2]);
return parseInt(A.match(/(\\d+).*?(\\d+).*?(\\d+)/)[2]);
Contributor

Does this still pass?

Contributor Author

The file still runs (technically no tests ran, because I don't have V8 installed locally), but I would expect it to fail if this were wrong, since this is a docstring.
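
To make the escaping question concrete: in a regular (non-raw) Python string, `\\d` yields the two characters `\d`, which is what the embedded JavaScript regex should receive. The sketch below checks that equivalence and, purely for illustration, runs the same pattern through Python's `re` module rather than V8. The sample input `a1b22c333` is made up for this note.

```python
import re

# "\\d" in a normal string literal is a backslash followed by "d",
# identical to the raw-string spelling r"\d".
escaped = "A.match(/(\\d+).*?(\\d+).*?(\\d+)/)[2]"
raw = r"A.match(/(\d+).*?(\d+).*?(\d+)/)[2]"
assert escaped == raw

# Same pattern applied with Python's regex engine (not the JavaScript one).
pattern = re.compile(r"(\d+).*?(\d+).*?(\d+)")
match = pattern.search("a1b22c333")
print(match.group(2))  # 22
```

A bare `\d` in a non-raw literal is an invalid escape sequence that newer Python versions warn about, which is presumably the motivation for the doubled backslashes.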

} catch (e) {
return null;
}
@@ -609,7 +609,7 @@ def to_interval_scalar_fn(x):
),
pytest.param(
pd.Series(
[None] * 3 + list(pd.date_range("2022-1-1", periods=21, freq="40D5H4S"))
[None] * 3 + list(pd.date_range("2022-1-1", periods=21, freq="40D5h4s"))
).values,
id="timestamp-vector",
),
@@ -753,7 +753,7 @@ def impl(date_val):
[None] * 3
+ list(
pd.date_range(
"2022-1-1", periods=21, freq="40D5H4S", tz="US/Pacific"
"2022-1-1", periods=21, freq="40D5h4s", tz="US/Pacific"
)
)
).array,
@@ -800,7 +800,7 @@ def localize_scalar_fn(ts_val):
pd.Series(
[None] * 3
+ list(
pd.date_range("2022-1-1", periods=21, freq="40D5H4S")
pd.date_range("2022-1-1", periods=21, freq="40D5h4s")
.to_series()
.astype("str")
)
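
The `freq` strings in these hunks switch from `40D5H4S` to `40D5h4s`, in line with recent pandas releases deprecating the uppercase `H` and `S` aliases in favor of `h` and `s`. The spacing itself is unchanged: each step is 40 days, 5 hours and 4 seconds. The standard-library sketch below shows that step. `parse_step` and `_UNITS` are hypothetical names, and this is not how pandas parses frequencies.

```python
import re
from datetime import datetime, timedelta

# Illustrative unit table only; pandas supports many more aliases.
_UNITS = {"D": "days", "h": "hours", "s": "seconds"}


def parse_step(freq: str) -> timedelta:
    """Turn a compound frequency such as '40D5h4s' into a timedelta."""
    parts = re.findall(r"(\d+)([A-Za-z]+)", freq)
    return sum(
        (timedelta(**{_UNITS[unit]: int(count)}) for count, unit in parts),
        timedelta(),
    )


step = parse_step("40D5h4s")
start = datetime(2022, 1, 1)
stamps = [start + i * step for i in range(21)]
print(stamps[1])  # 2022-02-10 05:00:04
```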
@@ -229,7 +229,7 @@ def impl(arr):
_dates = pd.Series(pd.date_range("2010-1-1", periods=10, freq="841D")).apply(
lambda x: x.date()
)
_timestamps = pd.Series(pd.date_range("20130101", periods=10, freq="H"))
_timestamps = pd.Series(pd.date_range("20130101", periods=10, freq="h"))
_dates_nans = _dates.copy()
_timestamps_nans = _timestamps.copy()
_dates_nans[4] = _dates_nans[7] = np.nan
@@ -968,7 +968,7 @@ def impl(arr):
-100,
3,
1,
pd.Series([None], dtype=pd.Float64Dtype),
pd.Series([None], dtype=pd.Float64Dtype()),
id="scalar_int_invalid",
),
pytest.param(
@@ -989,7 +989,7 @@ def impl(arr):
10.123,
2,
1,
pd.Series([None], dtype=pd.Float64Dtype),
pd.Series([None], dtype=pd.Float64Dtype()),
id="scalar_float_invalid",
),
pytest.param(
@@ -1010,7 +1010,7 @@ def impl(arr):
"10.123",
2,
1,
pd.Series([None], dtype=pd.Float64Dtype),
pd.Series([None], dtype=pd.Float64Dtype()),
id="scalar_string_invalid",
),
],