[Feature]: Add a method to GenericContainer to expose a random container port with the same number as the host port #9553

Closed
linghengqian opened this issue Nov 23, 2024 · 8 comments

Comments

@linghengqian

linghengqian commented Nov 23, 2024

Module

Core

Problem

  • Currently, GenericContainer cannot expose a random container port on the host under the same random port number. The only way to achieve this is the deprecated class org.testcontainers.containers.FixedHostPortGenericContainer. For HiveServer2 with ZooKeeper service discovery enabled, the setup looks like this:
import org.apache.curator.test.InstanceSpec;
import org.junit.jupiter.api.Test;
import org.testcontainers.containers.FixedHostPortGenericContainer;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.Network;
public class ExampleTest {
    @Test
    void test() {
        Network network = Network.newNetwork();
        int randomPort = InstanceSpec.getRandomPort();
        try (
                GenericContainer<?> zookeeper = new GenericContainer<>("zookeeper:3.9.3-jre-17")
                        .withNetwork(network)
                        .withNetworkAliases("foo")
                        .withExposedPorts(2181);
                GenericContainer<?> hiveServer2 = new FixedHostPortGenericContainer<>("apache/hive:4.0.1")
                        .withNetwork(network)
                        .withEnv("SERVICE_NAME", "hiveserver2")
                        .withFixedExposedPort(randomPort, randomPort)
                        .dependsOn(zookeeper)
        ) {
            zookeeper.start();
            hiveServer2.withEnv("SERVICE_OPTS", "-Dhive.server2.support.dynamic.service.discovery=true" + " "
                    + "-Dhive.zookeeper.quorum=" + zookeeper.getNetworkAliases().get(0) + ":2181" + " "
                    + "-Dhive.server2.thrift.bind.host=0.0.0.0" + " "
                    + "-Dhive.server2.thrift.port=" + randomPort);
            hiveServer2.start();
        }
    }
}
  • org.apache.curator.test.InstanceSpec#getRandomPort() is used here only to obtain a random free host port. The chosen number can occasionally conflict with a port already in use inside the container.
  • This is equivalent to the following Docker Compose file:
services:
  zookeeper:
    image: zookeeper:3.9.3-jre-17
    ports:
      - "12181:2181"
  apache-hive-1:
    image: apache/hive:4.0.1
    depends_on:
      - zookeeper
    environment:
      SERVICE_NAME: hiveserver2
      SERVICE_OPTS: >-
        -Dhive.server2.support.dynamic.service.discovery=true
        -Dhive.zookeeper.quorum=zookeeper:2181
        -Dhive.server2.thrift.bind.host=0.0.0.0
        -Dhive.server2.thrift.port=23593
    ports:
      - "23593:23593"
  • Without org.testcontainers.containers.FixedHostPortGenericContainer there is almost no way to use the same random port number on the host, inside the container, and in the container's environment variables.
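
For illustration, a random free host port can also be obtained with the JDK alone. This is a minimal sketch, not part of Testcontainers or Curator; the class name FreeHostPort is hypothetical. The port is only known to be free on the host at the moment of the probe, so another process may claim it before the container starts, and the probe says nothing about ports used inside the container.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical helper, named for this example only.
final class FreeHostPort {

    // Binds to port 0 so the OS picks an unused ephemeral port, then releases it.
    static int find() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The number differs on every run.
        System.out.println("free host port: " + find());
    }
}
```

The returned number could then be passed to withFixedExposedPort(port, port) and to -Dhive.server2.thrift.port, as in the example above.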

Solution

  • I expected GenericContainer to offer a method that exposes a random container port on the host under the same port number. If such a method were called org.testcontainers.containers.GenericContainer#withRandomExposedPorts(), and the host could read the chosen port number through org.testcontainers.containers.GenericContainer#getFirstMappedPort(), the FixedHostPortGenericContainer example above could be simplified to:
import org.junit.jupiter.api.Test;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.Network;
public class ExampleTest {
    @Test
    void test() {
        Network network = Network.newNetwork();
        try (
                GenericContainer<?> zookeeper = new GenericContainer<>("zookeeper:3.9.3-jre-17")
                        .withNetwork(network)
                        .withNetworkAliases("foo")
                        .withExposedPorts(2181);
                GenericContainer<?> hiveServer2 = new GenericContainer<>("apache/hive:4.0.1")
                        .withNetwork(network)
                        .withEnv("SERVICE_NAME", "hiveserver2")
                        .withRandomExposedPorts()
                        .dependsOn(zookeeper)
        ) {
            zookeeper.start();
            hiveServer2.withEnv("SERVICE_OPTS", "-Dhive.server2.support.dynamic.service.discovery=true" + " "
                    + "-Dhive.zookeeper.quorum=" + zookeeper.getNetworkAliases().get(0) + ":2181" + " "
                    + "-Dhive.server2.thrift.bind.host=0.0.0.0" + " "
                    + "-Dhive.server2.thrift.port=" + hiveServer2.getFirstMappedPort());
            hiveServer2.start();
        }
    }
}

Benefit

  • This helps simplify the process of starting a HiveServer2 with Zookeeper service discovery enabled.

Alternatives

Would you like to help contributing this feature?

No

@kiview
Member

kiview commented Nov 28, 2024

Hey @linghengqian, what you are trying to achieve should already work like this:

import org.apache.curator.test.InstanceSpec;
import org.junit.jupiter.api.Test;
import org.testcontainers.containers.FixedHostPortGenericContainer;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.Network;
public class ExampleTest {
    @Test
    void test() {
        Network network = Network.newNetwork();
        try (
                GenericContainer<?> zookeeper = new GenericContainer<>("zookeeper:3.9.3-jre-17")
                        .withNetwork(network)
                        .withNetworkAliases("foo")
                        .withExposedPorts(2181);
                GenericContainer<?> hiveServer2 = new GenericContainer<>("apache/hive:4.0.1")
                        .withNetwork(network)
                        .withEnv("SERVICE_NAME", "hiveserver2")
                        .withExposedPorts(4711)
                        .dependsOn(zookeeper)
        ) {
            zookeeper.start();
            hiveServer2.withEnv("SERVICE_OPTS", "-Dhive.server2.support.dynamic.service.discovery=true" + " "
                    + "-Dhive.zookeeper.quorum=" + zookeeper.getNetworkAliases().get(0) + ":2181" + " "
                    + "-Dhive.server2.thrift.bind.host=0.0.0.0" + " "
                    + "-Dhive.server2.thrift.port=" + hiveServer2.getFirstMappedPort());
            hiveServer2.start();
        }
    }
}

I don't see why the port within the container needs to be the same and why the internal port needs to be random (I understand it can be arbitrary though).

@linghengqian
Author

hiveServer2.withEnv("SERVICE_OPTS", "-Dhive.server2.support.dynamic.service.discovery=true" + " "
                    + "-Dhive.zookeeper.quorum=" + zookeeper.getNetworkAliases().get(0) + ":2181" + " "
                    + "-Dhive.server2.thrift.bind.host=0.0.0.0" + " "
                    + "-Dhive.server2.thrift.port=" + hiveServer2.getFirstMappedPort());
hiveServer2.start();
  • @kiview I tried something similar before, but Testcontainers does not allow it. Calling hiveServer2.getFirstMappedPort() before hiveServer2.start() throws the following exception:
java.lang.IllegalStateException: Mapped port can only be obtained after the container is started
	at org.testcontainers.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:513)
	at org.testcontainers.containers.ContainerState.getMappedPort(ContainerState.java:161)
	at io.github.linghengqian.hive.server2.jdbc.driver.thin.ZookeeperServiceDiscoveryTest.assertShardingInLocalTransactions(ZookeeperServiceDiscoveryTest.java:75)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)

@eddumelendez
Member

eddumelendez commented Nov 28, 2024

Hi, what you can try is to create a custom script; see LocalStackContainer for reference.

withCreateContainerCmdModifier(cmd -> {
    cmd.withEntrypoint(
        "sh",
        "-c",
        "while [ ! -f " + STARTER_SCRIPT + " ]; do sleep 0.1; done; " + STARTER_SCRIPT
    );
});

The script must be copied into the container while it is starting:

protected void containerIsStarting(InspectContainerResponse containerInfo) {
    String command = "#!/bin/bash\n";
    command += "export LAMBDA_DOCKER_FLAGS=" + configureServiceContainerLabels("LAMBDA_DOCKER_FLAGS") + "\n";
    command += "export ECS_DOCKER_FLAGS=" + configureServiceContainerLabels("ECS_DOCKER_FLAGS") + "\n";
    command += "export EC2_DOCKER_FLAGS=" + configureServiceContainerLabels("EC2_DOCKER_FLAGS") + "\n";
    command += "export BATCH_DOCKER_FLAGS=" + configureServiceContainerLabels("BATCH_DOCKER_FLAGS") + "\n";
    command += "/usr/local/bin/docker-entrypoint.sh\n";
    copyFileToContainer(Transferable.of(command, 0777), STARTER_SCRIPT);
}

Hope that helps.

@linghengqian
Author

  • @eddumelendez I must admit I'm a Testcontainers newbie. I assume /usr/local/bin/docker-entrypoint.sh is some kind of built-in script? I tested org.testcontainers.containers.GenericContainer#containerIsStarting further in "Removes unit test usage of FixedHostPortGenericContainer" (linghengqian/hive-server2-jdbc-driver#14) and concluded that the apache/hive:4.0.1 Docker image does not pick up changes to environment variables after it has started. Therefore, modifying the environment variables in the /testcontainers_start.sh script has no effect.

I don't see why the port within the container needs to be the same and why the internal port needs to be random (I understand it can be arbitrary though).

  • @kiview Allow me to explain what the following Docker Compose file does.
services:
  zookeeper:
    image: zookeeper:3.9.3-jre-17
    ports:
      - "12181:2181"
  apache-hive-1:
    image: apache/hive:4.0.1
    depends_on:
      - zookeeper
    environment:
      SERVICE_NAME: hiveserver2
      SERVICE_OPTS: >-
        -Dhive.server2.support.dynamic.service.discovery=true
        -Dhive.zookeeper.quorum=zookeeper:2181
        -Dhive.server2.thrift.bind.host=0.0.0.0
        -Dhive.server2.thrift.port=23593
    ports:
      - "23593:23593"
  • With this setup, the zookeeper service contains the znode /hiveserver2/serverUri=0.0.0.0:23593;version=4.0.1;sequence=0000000000, whose value is hive.server2.instance.uri=0.0.0.0:23593;hive.server2.authentication=NONE;hive.server2.transport.mode=binary;hive.server2.thrift.sasl.qop=auth;hive.server2.thrift.bind.host=0.0.0.0;hive.server2.thrift.port=23593;hive.server2.use.SSL=false.
  • To connect to the HiveServer2 instance apache-hive-1 from unit tests outside the Docker network, I need to create a database connection through the JDBC driver with the JDBC URL jdbc:hive2://localhost:12181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2.
  • This connection first connects to the ZooKeeper server zookeeper, reads the host and port of the HiveServer2 instance apache-hive-1 (0.0.0.0:23593) from the /hiveserver2/ znode, and then reconnects to HiveServer2 at 0.0.0.0:23593.
  • This means that for apache-hive-1, the value of -Dhive.server2.thrift.port, the hostPort, and the containerPort all need to be the same random number, such as 23593. It looks like there is no way for me to completely eliminate the use of org.testcontainers.containers.FixedHostPortGenericContainer.
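
To make the constraint concrete, here is a minimal sketch of how a client could read the advertised endpoint out of the znode value quoted above. It is not the Hive JDBC driver's actual implementation; the class name HiveZnodeData and its parse method are hypothetical and only illustrate that the port stored in ZooKeeper is the one the test JVM ends up connecting to.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, named for this example only.
final class HiveZnodeData {

    // Splits "key=value;key=value" into an ordered map, skipping malformed pairs.
    static Map<String, String> parse(String data) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : data.split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                result.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String data = "hive.server2.instance.uri=0.0.0.0:23593;hive.server2.authentication=NONE;"
                + "hive.server2.thrift.bind.host=0.0.0.0;hive.server2.thrift.port=23593";
        Map<String, String> config = parse(data);
        // Prints 0.0.0.0:23593, the endpoint the client would try next.
        System.out.println(config.get("hive.server2.thrift.bind.host")
                + ":" + config.get("hive.server2.thrift.port"));
    }
}
```

Because the client uses the advertised port as-is, a host port that differs from hive.server2.thrift.port would leave the client connecting to a port nothing is mapped to.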

@eddumelendez
Member

The entrypoint in the apache/hive image is /entrypoint.sh. What I am suggesting is to override the entrypoint with a custom /testcontainers_start.sh script that sets the new environment variables and then calls /entrypoint.sh for Hive initialization.
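
A minimal sketch of what such a starter script could contain, written as a plain Java helper so it can be tried without Docker. The class name StarterScript and its parameters are hypothetical. In a container subclass, the returned string would be handed to copyFileToContainer(Transferable.of(script, 0777), "/testcontainers_start.sh") from containerIsStarting, as in the LocalStackContainer snippet above. Whether this is enough for Hive depends on which port HiveServer2 actually listens on inside the container.

```java
// Hypothetical helper, named for this example only.
final class StarterScript {

    // Builds a bash script that exports SERVICE_OPTS and then runs the image's own entrypoint.
    static String render(String zookeeperQuorum, int thriftPort) {
        return "#!/bin/bash\n"
                + "export SERVICE_OPTS=\"-Dhive.server2.support.dynamic.service.discovery=true"
                + " -Dhive.zookeeper.quorum=" + zookeeperQuorum
                + " -Dhive.server2.thrift.bind.host=0.0.0.0"
                + " -Dhive.server2.thrift.port=" + thriftPort + "\"\n"
                + "/entrypoint.sh\n";
    }

    public static void main(String[] args) {
        // The quorum and port are example values taken from the Compose file in this thread.
        System.out.print(render("zookeeper:2181", 23593));
    }
}
```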

@linghengqian
Author

The entrypoint in the apache/hive image is /entrypoint.sh. What I am suggesting is to override the entrypoint with a custom /testcontainers_start.sh script that sets the new environment variables and then calls /entrypoint.sh for Hive initialization.

[main] ERROR tc.apache/hive:4.0.1 - Could not start container
org.testcontainers.containers.ContainerLaunchException: Timed out waiting for container port to open (localhost ports: [32818] should be listening)
	at org.testcontainers.containers.wait.strategy.HostPortWaitStrategy.waitUntilReady(HostPortWaitStrategy.java:112)
	at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:52)
	at org.testcontainers.containers.GenericContainer.waitUntilContainerStarted(GenericContainer.java:909)
	at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:500)
  • The error above can be reproduced with the following two classes, which require JDK 17+. I could not find an explanation for it in the Testcontainers documentation.
import com.github.dockerjava.api.command.InspectContainerResponse;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.images.builder.Transferable;
import java.io.IOException;
import java.net.ServerSocket;
@SuppressWarnings({"resource"})
public class HS2Container extends GenericContainer<HS2Container> {
    String zookeeperConnectionString;
    private static final String STARTER_SCRIPT = "/testcontainers_start.sh";
    private final int randomPortFirst = getRandomPort();
    public HS2Container(final String dockerImageName) {
        super(dockerImageName);
        withEnv("SERVICE_NAME", "hiveserver2");
        withExposedPorts(randomPortFirst);
        withCreateContainerCmdModifier(cmd ->
                cmd.withEntrypoint("sh", "-c", "while [ ! -f " + STARTER_SCRIPT + " ]; do sleep 0.1; done; " + STARTER_SCRIPT)
        );
    }
    public HS2Container withZookeeperConnectionString(final String zookeeperConnectionString) {
        this.zookeeperConnectionString = zookeeperConnectionString;
        return self();
    }
    @Override
    protected void containerIsStarting(InspectContainerResponse containerInfo) {
        String command = """
                #!/bin/bash
                export SERVICE_OPTS="-Dhive.server2.support.dynamic.service.discovery=true -Dhive.zookeeper.quorum=%s -Dhive.server2.thrift.bind.host=0.0.0.0 -Dhive.server2.thrift.port=%s"
                /entrypoint.sh
                """.formatted(zookeeperConnectionString, getMappedPort(randomPortFirst));
        copyFileToContainer(Transferable.of(command, 0777), STARTER_SCRIPT);
    }
    private int getRandomPort() {
        try (ServerSocket server = new ServerSocket(0)) {
            server.setReuseAddress(true);
            return server.getLocalPort();
        } catch (IOException exception) {
            throw new Error(exception);
        }
    }
}
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.Test;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.Network;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import static org.awaitility.Awaitility.await;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.is;
@SuppressWarnings({"SqlDialectInspection", "SqlNoDataSourceInspection", "resource"})
@Testcontainers
class ZookeeperServiceDiscoveryTest {
    private static final Network NETWORK = Network.newNetwork();
    @Container
    private static final GenericContainer<?> ZOOKEEPER_CONTAINER = new GenericContainer<>("zookeeper:3.9.3-jre-17")
            .withNetwork(NETWORK)
            .withNetworkAliases("foo")
            .withExposedPorts(2181);
    private final String jdbcUrlSuffix = ";serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2";
    private final String jdbcUrlPrefix = "jdbc:hive2://" + ZOOKEEPER_CONTAINER.getHost() + ":" + ZOOKEEPER_CONTAINER.getMappedPort(2181) + "/";
    @AfterAll
    static void afterAll() {
        NETWORK.close();
    }
    @Test
    void assertShardingInLocalTransactions() throws SQLException {
        try (GenericContainer<?> hs2FirstContainer = new HS2Container("apache/hive:4.0.1")
                .withNetwork(NETWORK)
                .withZookeeperConnectionString(ZOOKEEPER_CONTAINER.getNetworkAliases().get(0) + ":2181")
                .dependsOn(ZOOKEEPER_CONTAINER)) {
            hs2FirstContainer.start();
            awaitHS2(hs2FirstContainer.getFirstMappedPort());
            HikariConfig config = new HikariConfig();
            config.setDriverClassName("org.apache.hive.jdbc.HiveDriver");
            config.setJdbcUrl(jdbcUrlPrefix + jdbcUrlSuffix);
            DataSource dataSource = new HikariDataSource(config);
            extractedSQL(dataSource);
        }
    }
    private static void extractedSQL(final DataSource dataSource) throws SQLException {
        try (Connection connection = dataSource.getConnection();
             Statement statement = connection.createStatement()) {
            statement.execute("CREATE DATABASE demo_ds_0");
        }
    }
    private void awaitHS2(final int hiveServer2Port) {
        String connectionString = ZOOKEEPER_CONTAINER.getHost() + ":" + ZOOKEEPER_CONTAINER.getMappedPort(2181);
        await().atMost(Duration.ofMinutes(1L)).ignoreExceptions().until(() -> {
            try (CuratorFramework client = CuratorFrameworkFactory.builder()
                    .connectString(connectionString)
                    .retryPolicy(new ExponentialBackoffRetry(1000, 3))
                    .build()) {
                client.start();
                List<String> children = client.getChildren().forPath("/hiveserver2");
                assertThat(children.size(), is(1));
                return children.get(0).startsWith("serverUri=0.0.0.0:" + hiveServer2Port + ";version=4.0.1;sequence=");
            }
        });
        await().atMost(Duration.ofMinutes(1L)).ignoreExceptions().until(() -> {
            DriverManager.getConnection(jdbcUrlPrefix + jdbcUrlSuffix, new Properties()).close();
            return true;
        });
    }
}

@eddumelendez
Member

eddumelendez commented Nov 30, 2024

Thanks for sharing! Let's revisit the code.

HS2Container picks a random free port on the host, whereas getMappedPort returns the random host port mapped to the given container port. So the current code asks for a port that exists only on the host, not in the container. The HiveServer2 default port is 10000, so the code should call getMappedPort(10000). However, according to the docs, hive.server2.thrift.port changes the HiveServer2 port, so I guess your only option is a fixed port. This is an unfortunate limitation with services where client and server exchange server metadata: the client fails because the server advertises its original port instead of the random mapped one. I wonder if adding a proxy would help. If you have an isolated project with proper tests working without Testcontainers, I can help.

The original title and description are not valid because Testcontainers checks ports on the container, not on the host. I'll move this to discussions.

@linghengqian
Author

No problem. Should I just close this issue? I prefer that there is no such proxy.

@eddumelendez closed this as not planned on Dec 2, 2024

3 participants