
Align the scaling strategy of the Ansible-based setup with the other setups #201

Merged — julienrf merged 3 commits into scylladb:master from align-ansible-setup on Aug 27, 2024

Conversation

julienrf (Collaborator) commented on Aug 22, 2024:

Fixes #192.

julienrf (Collaborator, Author) commented:
I tested these changes by creating two Ubuntu containers running an SSH server, plus a local DynamoDB instance, using the following Compose file:

```yaml
services:
  master:
    build: dockerfiles/ansible
  worker:
    build: dockerfiles/ansible

  dynamodb:
    command: "-jar DynamoDBLocal.jar -sharedDb -inMemory"
    image: "amazon/dynamodb-local:latest"
    expose:
      - 8000
    ports:
      - "8000:8000"
    working_dir: /home/dynamodblocal
```

where `dockerfiles/ansible/Dockerfile` is the following:

```dockerfile
FROM ubuntu

RUN apt-get update && apt-get install -y openssh-server sudo software-properties-common iproute2

RUN mkdir /var/run/sshd

# Create an "ubuntu" user with password "aaaaaa" and passwordless sudo
RUN useradd -ms /bin/bash ubuntu \
    && echo 'ubuntu:aaaaaa' | chpasswd \
    && adduser ubuntu sudo \
    && echo "ubuntu ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/ubuntu

# Allow password-based SSH logins
RUN echo "PasswordAuthentication yes" >> /etc/ssh/sshd_config \
    && echo "PermitRootLogin yes" >> /etc/ssh/sshd_config

EXPOSE 22

CMD ["/usr/sbin/sshd", "-D"]
```
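As a sketch, the whole environment can then be brought up as follows (the Compose project name `scylla-migrator` is an assumption, inferred from the container names used in the `docker inspect` commands below):

```sh
# Build the images and start the master, worker, and DynamoDB containers
docker compose -p scylla-migrator up -d --build
```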

I noted the IP addresses of the Spark master and worker nodes with the following commands:

```sh
docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' scylla-migrator-worker-1
docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' scylla-migrator-master-1
```

I then used those IP addresses in the Ansible inventory.
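For illustration, a minimal inventory could look like the following sketch (the group names and IP addresses are hypothetical; the credentials come from the Dockerfile above):

```ini
# Hypothetical inventory file; replace the IPs with the ones reported by docker inspect
[spark_master]
172.18.0.2 ansible_user=ubuntu ansible_password=aaaaaa

[spark_worker]
172.18.0.3 ansible_user=ubuntu ansible_password=aaaaaa
```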

Then I ran `ansible-playbook scylla-migrator.yml` to set up the Migrator on both the Spark master and worker nodes.
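For example (the inventory file path is an assumption):

```sh
ansible-playbook -i inventory/hosts.ini scylla-migrator.yml
```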

Afterwards, I opened a terminal on both nodes to run `start-spark.sh` and `start-slave.sh`.
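A sketch of that step, assuming the playbook installs the scripts into the `ubuntu` user's home directory (the actual locations depend on the playbook):

```sh
# Run the startup scripts as the ubuntu user on each container
docker compose -p scylla-migrator exec master su - ubuntu -c './start-spark.sh'
docker compose -p scylla-migrator exec worker su - ubuntu -c './start-slave.sh'
```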

I created a DynamoDB table and put an item in it. Then, I edited the file `dynamodb.config.yml` to configure a migration from this table. Finally, I executed the migration with `submit-alternator-job.sh`.
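For example, a table and an item can be created against the local DynamoDB instance with the AWS CLI (the table name and item are illustrative):

```sh
aws dynamodb create-table \
  --endpoint-url http://localhost:8000 \
  --table-name TestTable \
  --attribute-definitions AttributeName=id,AttributeType=S \
  --key-schema AttributeName=id,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST

aws dynamodb put-item \
  --endpoint-url http://localhost:8000 \
  --table-name TestTable \
  --item '{"id": {"S": "1"}}'
```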

guy9 (Collaborator) commented on Aug 25, 2024:

@pdbossman please review

- Use `{start,stop}-worker.sh` instead of the deprecated `{start,stop}-slave.sh`
- Use `{start,stop}-mesos-shuffle-service.sh` instead of `{start,stop}-shuffle-service.sh`

According to the Spark documentation, the configuration property `spark.driver.memory` has no effect in our case (we use the “client” deploy mode):

> In client mode, this config must not be set through the SparkConf directly in your application, because the driver JVM has already started at that point. Instead, please set this through the --driver-memory command line option or in your default properties file.

https://spark.apache.org/docs/latest/configuration.html
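A sketch of the two alternatives the quote mentions (the `4g` value is illustrative):

```sh
# Option 1: pass the driver memory on the spark-submit command line
spark-submit --driver-memory 4g ...

# Option 2: set it in conf/spark-defaults.conf instead:
#   spark.driver.memory 4g
```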
julienrf force-pushed the align-ansible-setup branch from c0a2e06 to 84f8e9b on August 27, 2024.
julienrf marked this pull request as ready for review.
julienrf merged commit 04aa85c into scylladb:master on Aug 27, 2024 (3 checks passed).
julienrf deleted the align-ansible-setup branch.
Development

Successfully merging this pull request may close these issues:

- The Ansible-based approach does not document precisely how to configure the Spark cluster (#192)