[llvm-ic-v0/cBench-v0] Leaderboard submission: Random Search.

facebookresearch · Mar 3, 2021 · 94b3a0c
1 parent bcb8b34
commit 94b3a0c

Showing 7 changed files with 528 additions and 0 deletions.
diff --git a/leaderboard/llvm_codesize/random_search/BUILD b/leaderboard/llvm_codesize/random_search/BUILD
@@ -0,0 +1,22 @@
+# Copyright (c) Facebook, Inc. and its affiliates.
+#
+# This source code is licensed under the MIT license found in the
+# LICENSE file in the root directory of this source tree.
+
+py_library(
+    name = "random_search",
+    srcs = ["random_search.py"],
+    deps = [
+        "//leaderboard/llvm_codesize:eval_policy",
+    ],
+)
+
+py_test(
+    name = "random_search_test",
+    timeout = "short",
+    srcs = ["random_search_test.py"],
+    deps = [
+        ":random_search",
+        "//tests:test_main",
+    ],
+)
diff --git a/leaderboard/llvm_codesize/random_search/README.md b/leaderboard/llvm_codesize/random_search/README.md
@@ -0,0 +1,113 @@
+# Random Search
+
+**tl;dr:**
+A pure random policy that records the best result found within a fixed time
+budget.
+
+**Authors:**
+Facebook AI Research
+
+**Results:**
+[results_t60_p125.csv](results_t60_p125.csv),
+[results_t1800_p125.csv](results_t1800_p125.csv).
+
+**Publication:**
+<!-- TODO(cummins): Add CompilerGym citation when ready. -->
+
+**CompilerGym version:**
+0.1.3
+
+**Open source?**
+Yes, MIT licensed. [Source Code](random_search.py).
+
+**Did you modify the CompilerGym source code?**
+No.
+
+**What parameters does the approach have?**
+Search time *t*, patience *p*.
+
+**What range of values were considered for the above parameters?**
+Search time is fixed and is indicated by the leaderboard entry name, e.g.
+"Random search (t=30)" means a random search for 30 seconds. Eight values for
+patience were considered: *n/4*, *n/2*, *3n/4*, *n*, *5n/4*, *3n/2*, *7n/4*, and
+*2n*, where *n* is the size of the action space. The patience value was selected
+using the `blas-v0` dataset for validation; see "Tuning the patience parameter" below.
+
+**Is the policy deterministic?**
+No.
+
+## Description
+
+This approach uses a simple random agent on the action space of the target
+program. This is equivalent to running
+`python -m compiler_gym.bin.random_search` on the programs in the test set.
+
+Each episode of the random search selects actions at random until a fixed number
+of steps (the "patience" of the search) have been evaluated without an improvement
+to the best reward; the environment is then reset and a new episode begins. The
+search as a whole stops once a predetermined amount of search time has elapsed.
+
+Pseudo-code for this search is:
+
+```c++
+float search_with_patience(CompilerEnv env, int patience, int search_time) {
+    float best_reward = -INFINITY;
+    int end_time = time() + search_time;
+    while (time() < end_time) {
+        env.reset();
+        int p = patience;
+        bool done = false;
+        do {
+            env.step(env.action_space.sample());
+            if (env.reward() > best_reward) {
+                p = patience;  // Reset patience every time progress is made
+                best_reward = env.reward();
+            }
+        } while (--p && time() < end_time && !env.done());
+        // The episode ends when patience or search time is exhausted, or
+        // a terminal state is reached.
+    }
+    return best_reward;
+}
+```
+
+To reproduce the search, run the [random_search.py](random_search.py) script.
+
+
+### Tuning the patience parameter
+
+The `--patience_ratio` value was selected by running 200 random searches on
+programs from the `blas-v0` dataset for each candidate value and selecting the
+value that produced the best average reward:
+
+```sh
+#!/usr/bin/env bash
+set -euo pipefail
+
+SEARCH_TIME=30
+for patience_ratio in 0.25 0.50 0.75 1.00 1.25 1.50 1.75 2.00 ; do
+    echo -e "\nEvaluating --patience_ratio=$patience_ratio"
+    logfile="blas_t${SEARCH_TIME}_${patience_ratio}.csv"
+    python random_search.py --max_benchmarks=20 --n=10 --dataset=blas-v0 \
+        --search_time="${SEARCH_TIME}" --patience_ratio="${patience_ratio}" \
+        --logfile="${logfile}"
+    python -m compiler_gym.bin.validate --env=llvm-ic-v0 \
+        --reward_aggregation=geomean < "${logfile}" | tail -n4
+done
+```
+
+The patience value that returned the best geomean reward can then be used as the
+value for the search on the test set. For example:
+
+```sh
+python random_search.py --search_time=30 --patience_ratio=1.25 --logfile=random_search_t30_p125.csv
+```
+
+
+### Experimental Setup
+
+|        | Hardware Specification                        |
+| ------ | --------------------------------------------- |
+| OS     | Ubuntu 20.04                                  |
+| CPU    | Intel Xeon Gold 6230 CPU @ 2.10GHz (80× core) |
+| Memory | 754.5 GiB                                     |
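Note on the README above: the patience-based search it describes in pseudo-code can be sketched in runnable Python as follows. This is only an illustrative sketch, not the code used for the leaderboard results. `ToyEnv`, its reward function, and its method names are hypothetical stand-ins and are not part of CompilerGym.

```python
import math
import random
import time


class ToyEnv:
    """Hypothetical stand-in for a compiler environment (not CompilerGym)."""

    def __init__(self, num_actions=8, max_steps=50, seed=0):
        self.num_actions = num_actions
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.steps = 0
        self.cumulative_reward = 0.0

    def sample_action(self):
        return self.rng.randrange(self.num_actions)

    def step(self, action):
        # Toy reward: high-numbered actions help, low-numbered ones hurt.
        self.steps += 1
        self.cumulative_reward += (action - self.num_actions / 2) / self.num_actions
        return self.steps >= self.max_steps  # True once the episode is done


def search_with_patience(env, patience, search_time):
    """Return the best cumulative reward found within search_time seconds."""
    best_reward = -math.inf
    end_time = time.monotonic() + search_time
    while time.monotonic() < end_time:
        env.reset()
        p = patience
        while True:
            done = env.step(env.sample_action())
            if env.cumulative_reward > best_reward:
                p = patience  # Reset patience every time progress is made.
                best_reward = env.cumulative_reward
            p -= 1
            # The episode ends when patience or time runs out, or on a
            # terminal state; the outer loop then starts a fresh episode.
            if p <= 0 or done or time.monotonic() >= end_time:
                break
    return best_reward
```

Calling `search_with_patience(ToyEnv(), patience=4, search_time=0.05)` runs the toy search for about 50 ms. The inner loop mirrors the `do { ... } while (...)` structure of the pseudo-code, so every episode takes at least one step.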
diff --git a/leaderboard/llvm_codesize/random_search/random_search.py b/leaderboard/llvm_codesize/random_search/random_search.py
@@ -0,0 +1,78 @@
+# Copyright (c) Facebook, Inc. and its affiliates.
+#
+# This source code is licensed under the MIT license found in the
+# LICENSE file in the root directory of this source tree.
+"""An implementation of a random search policy for the LLVM codesize task.
+
+The search is the same as the included compiler_gym.bin.random_search. See
+README.md for a detailed description.
+"""
+import os
+import sys
+from time import sleep
+
+import gym
+from absl import flags
+
+from compiler_gym.envs import LlvmEnv
+from compiler_gym.random_search import RandomAgentWorker
+
+# Import the ../eval_policy.py helper.
+sys.path.insert(0, os.path.dirname(os.path.realpath(__file__)) + "/..")
+from eval_policy import eval_policy  # noqa pylint: disable=wrong-import-position
+
+flags.DEFINE_float(
+    "patience_ratio",
+    1.0,
+    "The ratio of patience to the size of the action space. "
+    "Patience = patience_ratio * action_space_size",
+)
+flags.DEFINE_integer(
+    "search_time",
+    60,
+    "The minimum number of seconds to run the random search for. After this "
+    "many seconds have elapsed the best results are aggregated from the "
+    "search threads and the search is terminated.",
+)
+FLAGS = flags.FLAGS
+
+
+def random_search(env: LlvmEnv) -> None:
+    """Run a random search on the given environment."""
+    patience = int(env.action_space.n * FLAGS.patience_ratio)
+
+    # Start parallel random search workers.
+    workers = [
+        RandomAgentWorker(
+            make_env=lambda: gym.make("llvm-ic-v0", benchmark=env.benchmark),
+            patience=patience,
+        )
+        for _ in range(FLAGS.nproc)
+    ]
+    for worker in workers:
+        worker.start()
+
+    sleep(FLAGS.search_time)
+
+    # Stop the workers.
+    for worker in workers:
+        worker.alive = False
+    for worker in workers:
+        worker.join()
+
+    # Aggregate the best results.
+    best_actions = []
+    best_reward = -float("inf")
+    for worker in workers:
+        if worker.best_returns > best_reward:
+            best_reward, best_actions = worker.best_returns, list(worker.best_actions)
+
+    # Replay the best sequence of actions to produce the final environment
+    # state.
+    for action in best_actions:
+        _, _, done, _ = env.step(action)
+        assert not done
+
+
+if __name__ == "__main__":
+    eval_policy(random_search)
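Note on `random_search.py` above: the start, sleep, stop, join, and aggregate pattern it uses can be tried in isolation with the sketch below. `ToyWorker` and `parallel_search` are hypothetical names. `ToyWorker` only imitates the `alive`, `best_returns`, and `best_actions` attributes that the script reads from `RandomAgentWorker`, and the toy objective is an assumption made for illustration.

```python
import random
import threading
import time


class ToyWorker(threading.Thread):
    """Hypothetical worker thread; stands in for RandomAgentWorker."""

    def __init__(self, seed):
        super().__init__()
        self.rng = random.Random(seed)
        self.alive = True  # Cleared by the main thread to request a stop.
        self.best_returns = -float("inf")
        self.best_actions = []

    def run(self):
        while self.alive:
            actions = [self.rng.randrange(10) for _ in range(5)]
            returns = float(sum(actions))  # Toy objective, for illustration.
            if returns > self.best_returns:
                self.best_returns, self.best_actions = returns, actions
            time.sleep(0.001)


def parallel_search(nproc, search_time):
    """Run nproc workers for search_time seconds and return the best result."""
    workers = [ToyWorker(seed=i) for i in range(nproc)]
    for worker in workers:
        worker.start()

    time.sleep(search_time)

    # Signal every worker to stop before joining any of them, so that the
    # workers wind down concurrently rather than one after another.
    for worker in workers:
        worker.alive = False
    for worker in workers:
        worker.join()

    best = max(workers, key=lambda w: w.best_returns)
    return best.best_returns, list(best.best_actions)
```

For example, `parallel_search(nproc=2, search_time=0.05)` returns the highest toy return seen by either worker, together with the action list that produced it.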
diff --git a/leaderboard/llvm_codesize/random_search/random_search_test.py b/leaderboard/llvm_codesize/random_search/random_search_test.py
@@ -0,0 +1,36 @@
+# Copyright (c) Facebook, Inc. and its affiliates.
+#
+# This source code is licensed under the MIT license found in the
+# LICENSE file in the root directory of this source tree.
+"""Tests for //leaderboard/llvm_codesize/random_search."""
+import pytest
+from absl import flags
+
+from leaderboard.llvm_codesize.random_search.random_search import (
+    eval_policy,
+    random_search,
+)
+from tests.test_main import main as _test_main
+
+FLAGS = flags.FLAGS
+
+
+def test_random_search():
+    FLAGS.unparse_flags()
+    FLAGS(
+        [
+            "argv0",
+            "--n=1",
+            "--max_benchmarks=1",
+            "--search_time=1",
+            "--nproc=1",
+            "--patience_ratio=0.1",
+            "--novalidate",
+        ]
+    )
+    with pytest.raises(SystemExit):
+        eval_policy(random_search)
+
+
+if __name__ == "__main__":
+    _test_main()