diff --git a/README.md b/README.md
index c2eb651..25a408a 100644
--- a/README.md
+++ b/README.md
@@ -24,10 +24,17 @@ If you find this tool useful in your research, please cite us at:
 
 #### Option 1: conda install
 
+To install using `conda` execute the following command:
+```
+$ conda install schmidt73::fastbe
+```
+Currently, `fastBE` is only available through `conda` for Linux. Additional
+platforms will be supported upon request.
 
 #### Option 2: Pre-compiled binaries
 
-Pre-compiled binaries for `fastBE` are available for both macOS and Linux under the releases section. To install, simply copy the binary into a directory which is recognized by your shell.
+Pre-compiled binaries for `fastBE` are available for both macOS and Linux under the releases section. 
+To install, simply copy the binary into a directory which is recognized by your shell.
 
 #### Option 3: Build from source
 
@@ -49,7 +56,7 @@ $ make
 ```
 The output binary will be located at `build/src/fastbe`.
 
-### Usage
+## Usage
 
 To run *fastbe*, simply execute the binary. 
 ```
@@ -65,32 +72,42 @@ Subcommands:
 ```
 
 The two modes of fastbe are `search` and `regress`. The `search` mode
-solves the variant allele frequency $\ell_1$-deconvolution problem, 
+infers the phylogenetic tree best fitting the input frequency matrix, while
+the `regress` mode infers the fit of the input tree to the frequency matrix.
+If one is interested in inferring a phylogenetic tree
+from a frequency matrix, the `search` mode should be used. 
+
+Formally, the `search` mode
+solves the variant allele frequency $\ell_1$-factorization problem, 
 while the `regress` mode solves the variant allele frequency 
 $\ell_1$-regression problem, both of which are defined in 
 our manuscript. 
-
-The `search` mode takes as input an $m \times n$ frequency matrix $F$ and outputs
-an $n$-clonal tree that best fits the frequency matrix. **Important note: the search
-command requires a root vertex specified with the `-f/--assigned-root` flag.**
-By default this root vertex is set to be $0$. When the root vertex is unknown,
-it suffices to append an extra column to the beginning of the frequency matrix 
-and specify the root as $0$.
-
+The `search` mode takes as input an $m \times n$ frequency matrix $F$ 
+and outputs an $n$-clonal tree that best fits the frequency matrix. 
 The `regress` mode
 takes as input an $m \times n$ frequency matrix $F$ and an $n$-clonal
 tree $\mathcal{T}$ and outputs the minimum value of 
 $\lVert F - UB_{\mathcal{T}} \rVert_1$ over all usage matrices $U$.
 
+[!IMPORTANT] 
+The search command requires a root vertex specified with the 
+`-f/--assigned-root` flag, corresponding to the appropriate column of the 
+frequency matrix. By default this root vertex is set to be $0$, or the
+first column of the frequency matrix. When the root vertex is unknown,
+it suffices to append an extra column to the beginning of the frequency matrix 
+and specify the root as $0$. As best practice, keep the root vertex
+as the first column of the frequency matrix.
+
 ### Input format
 
 The input format for the `search` mode of *fastbe* consists of a frequency 
 matrix $F$ in `.txt` format. Rows are separated by newlines
 and columns are separated by spaces. Rows correspond
-to distinct samples and columns correspond to distinct mutation clusters.
+to distinct samples and columns correspond to distinct mutations 
+(or mutation clusters).
 More formally, $F_{ij}$ is the frequency of the $j^{\text{th}}$ mutation
-cluster in the $i^{\text{th}}$ sample. As an example, a frequency matrix $F$ 
-describing $20$ samples and $10$ clones is:
+in the $i^{\text{th}}$ sample. As an example, a frequency matrix $F$ 
+describing $20$ samples and $10$ mutations is:
 ```
 1.0000 0.9801 0.0000 0.8265 0.0156 0.3683 0.2450 0.1218 0.1260 0.0000
 1.0000 1.0000 0.0000 0.1257 0.0000 0.0000 0.0000 0.0000 0.0000 0.1436
@@ -133,7 +150,9 @@ which consists of $10$ clones rooted at the $0$ vertex is:
 9
 ```
 
-The above clonal tree $\mathcal{T}$ is provided as an input file at `examples/sim_tree.txt`.
+The above clonal tree $\mathcal{T}$ is provided as an input file at `examples/sim_tree.txt`. Each
+vertex in the adjacency list corresponds to a clone. The name of the vertex corresponds
+to the index of the mutation in the frequency matrix on the edge leading to the vertex.
 
 The above frequency matrix and clonal tree were generated using the command,
 `python scripts/simulation.py --clones 20 --samples 10 --coverage 100 --seed 0 --mutations 100 --output examples/sim`
@@ -142,9 +161,9 @@ read depth of $100\times$ and $100$ mutations distribution across the $10$ mutat
 Several other files such as the mutation to clone mapping, the ground truth usage matrix $U$, clonal
 matrix $B$, and read count matrices are also provided in the `examples/` directory.
 
-## Usage Example
+### Example
 
-As an example, we will infer a phylogenetic tree from the simulated
+As an example, we infer a phylogenetic tree from the simulated
 data with $20$ samples and $10$ clones. To run `fastbe` on this data,
 execute:
 ```
@@ -153,3 +172,37 @@ fastbe search examples/sim_obs_frequency_matrix.txt -o examples/fastbe
 This command will output an adjacency list describing the clonal tree 
 at `examples/fastbe_tree.txt` and a `.json` file containing metadata
 at `examples/fastbe_results.json`.
+
+### Benchmarks
+
+The runtime of `fastbe` scales approximately linearly with the number 
+of samples and exponentially with the number of mutations. As a 
+reference, we provide the runtime of `fastbe` on the simulated data
+with $m = 100$ samples and a varying number of mutations $n$ with
+default parameters. The system used for benchmarking is an 
+Intel Core i7-8665U CPU @ 1.90GHz × 8 with 8GB of RAM. 
+
+| Mutations ($n$) | Samples ($m$) | Runtime (s) |
+|-----------------|---------------|-------------|
+| 10              | 100           | 4           |
+| 25              | 100           | 41          |
+| 50              | 100           | 148         |
+| 100             | 100           | 486         |
+| 200             | 100           | 923         |
+| 500             | 100           |             |
+| 1000            | 100           |             |
+
+The memory footprint of `fastbe` is $O(mn + nb)$ where $b$ is the
+beam width for almost all applications. The memory footprint of 
+`fastbe` is negligible for almost all applications.
+
+When running the `search` mode of `fastbe`, the beam width of the
+search algorithm can be specified with the `-b/--beam_width` flag.
+The beam width controls the number of candidate trees that are
+considered at each iteration of the search algorithm. The runtime
+of the search algorithm scales approximately linearly with the beam 
+width, and the quality of the inferred tree improves with increased 
+beam width. 
+
+[!TIP]
+Use the default beam width chosen by `fastbe` for most applications.
diff --git a/src/fastbe.cxx b/src/fastbe.cxx
index 27fc62a..86fb3cb 100644
--- a/src/fastbe.cxx
+++ b/src/fastbe.cxx
@@ -370,10 +370,8 @@ std::pair<digraph<clone_tree_vertex>, std::unordered_map<int, int>> beam_search(
         return one_fastbe(a, vertex_map, F, root) < one_fastbe(b, vertex_map, F, root);
     });
 
-    // print scores of partial trees
-    for (size_t t = 0; t < partial_trees.size(); ++t) {
-        spdlog::info("Tree {} objective is {}", t, one_fastbe(partial_trees[t], vertex_map, F, root));
-    }
+    spdlog::info("Best tree objective is {}", one_fastbe(partial_trees[0], vertex_map, F, root));
+    spdlog::info("Worst tree objective is {}", one_fastbe(partial_trees[partial_trees.size() - 1], vertex_map, F, root));
     
     return {partial_trees[0], vertex_map};
 }
@@ -484,13 +482,34 @@ void perform_search(const argparse::ArgumentParser &search) {
     auto root_it = std::find(clone_order.begin(), clone_order.end(), root);
     clone_order.erase(root_it);
     clone_order.insert(clone_order.begin(), root);
+    
+    int beam_width = search.get<int>("beam_width");
+    if (beam_width == -1) {
+        if (ncols <= 10) {
+            beam_width = 1000;
+        } else if (ncols <= 25) {
+            beam_width = 500;
+        } else if (ncols <= 50) {
+            beam_width = 250;
+        } else if (ncols <= 100) {
+            beam_width = 100;
+        } else if (ncols <= 200) {
+            beam_width = 25;
+        } else if (ncols <= 500) {
+            beam_width = 5;
+        } else {
+            beam_width = 1;
+        }
+
+        spdlog::info("Beam width not specified. Using default beam width of {} for {} clones.", beam_width, ncols);
+    }
 
     spdlog::info("Performing beam search to find tree(s)...");
     auto [clone_tree, vmap] = beam_search(
         frequency_matrix, 
         root, 
         clone_order, 
-        search.get<unsigned int>("beam_width"),
+        beam_width,
         num_threads
     );
 
@@ -572,8 +591,8 @@ int main(int argc, char *argv[])
 
     search.add_argument("-b", "--beam_width")
           .help("beam width")
-          .default_value(10U)
-          .scan<'u', unsigned int>();
+          .default_value(-1)
+          .scan<'d', int>();
 
     program.add_subparser(search);
     program.add_subparser(regress);