Merge pull request #55 from outbrain/3mrSpeedup
Few updates
SkBlaz authored Nov 13, 2023
2 parents 78c205b + 170e9bd commit ac44d16
Showing 8 changed files with 1,567 additions and 1,454 deletions.
31 changes: 31 additions & 0 deletions docs/DOCSMAIN.md
@@ -33,3 +33,34 @@ outrank --help
* A minimal showcase of performing feature ranking on a generic CSV is demonstrated with [this example](https://github.com/outbrain/outrank/tree/main/scripts/run_minimal.sh).

* [More examples](https://github.com/outbrain/outrank/tree/main/examples) demonstrating OutRank's capabilities are also available.


# OutRank as a Python library
Once installed, _OutRank_ can be used like any other Python library. For example, the generic feature ranking algorithms can be imported and called directly:

```python
import numpy as np

from outrank.algorithms.feature_ranking.ranking_mi_numba import (
    mutual_info_estimator_numba,
)

# Some synthetic minimal data (Numpy vectors)
a = np.array([1, 0, 0, 0, 1, 1, 1, 0], dtype=np.int32)

lowest = np.array(np.random.permutation(a), dtype=np.int32)
medium = np.array([1, 1, 0, 0, 1, 1, 1, 1], dtype=np.int32)
high = np.array([1, 0, 0, 0, 1, 1, 1, 1], dtype=np.int32)

lowest_score = mutual_info_estimator_numba(
a, lowest, np.float32(1.0), False,
)
medium_score = mutual_info_estimator_numba(
a, medium, np.float32(1.0), False,
)
high_score = mutual_info_estimator_numba(
a, high, np.float32(1.0), False,
)

# Rank the candidates by score; the expected ascending order is lowest, medium, high
scores = [lowest_score, medium_score, high_score]
sorted_score_indices = np.argsort(scores)
assert np.sum(np.array([0, 1, 2]) - sorted_score_indices) == 0
```
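
To make the scores above easier to interpret, here is a minimal plain-NumPy sketch of a plug-in mutual information estimate for discrete vectors. It is not OutRank's implementation: `plugin_mutual_info` and the `unrelated` vector are hypothetical names introduced only for illustration, and the numba-based estimator takes additional arguments and may produce different numeric values.

```python
import numpy as np


def plugin_mutual_info(x: np.ndarray, y: np.ndarray) -> float:
    """Plug-in mutual information (in nats) between two discrete vectors.

    Illustrative sketch only; this is not OutRank's estimator.
    """
    # Map raw values to dense codes 0..k-1 so they can index a count table.
    _, x_codes = np.unique(x, return_inverse=True)
    _, y_codes = np.unique(y, return_inverse=True)

    # Joint counts, normalised to joint probabilities.
    joint = np.zeros((x_codes.max() + 1, y_codes.max() + 1))
    np.add.at(joint, (x_codes, y_codes), 1)
    joint /= joint.sum()

    # Product of the marginals: what the joint would be under independence.
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    expected = px @ py

    # Sum only over observed cells (0 * log 0 is treated as 0).
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / expected[mask])))


a = np.array([1, 0, 0, 0, 1, 1, 1, 0], dtype=np.int32)

# Hypothetical vector constructed to be exactly independent of `a`.
unrelated = np.array([1, 1, 1, 0, 1, 0, 0, 0], dtype=np.int32)
medium = np.array([1, 1, 0, 0, 1, 1, 1, 1], dtype=np.int32)
high = np.array([1, 0, 0, 0, 1, 1, 1, 1], dtype=np.int32)

scores = [plugin_mutual_info(a, v) for v in (unrelated, medium, high)]

# Strict ordering check: indices must come out exactly as [0, 1, 2].
assert np.array_equal(np.argsort(scores), [0, 1, 2])
```

The sketch uses a deterministic `unrelated` vector rather than a random permutation, so the ordering is reproducible. It also compares the sorted indices with `np.array_equal`: a sum of index differences is zero for any permutation of `[0, 1, 2]`, so it cannot detect a wrong ordering.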
33 changes: 33 additions & 0 deletions docs/outrank.html
@@ -26,6 +26,7 @@ <h2>Contents</h2>
<li><a href="#welcome-to-outranks-documentation">Welcome to OutRank's documentation!</a></li>
<li><a href="#setup">Setup</a></li>
<li><a href="#example-use-cases">Example use cases</a></li>
<li><a href="#outrank-as-a-python-library">OutRank as a Python library</a></li>
</ul>


@@ -96,6 +97,38 @@ <h1 id="example-use-cases">Example use cases</h1>
<li><p>A minimal showcase of performing feature ranking on a generic CSV is demonstrated with <a href="https://github.com/outbrain/outrank/tree/main/scripts/run_minimal.sh">this example</a>.</p></li>
<li><p><a href="https://github.com/outbrain/outrank/tree/main/examples">More examples</a> demonstrating OutRank's capabilities are also available.</p></li>
</ul>

<h1 id="outrank-as-a-python-library">OutRank as a Python library</h1>

<p>Once installed, <em>OutRank</em> can be used like any other Python library. For example, the generic feature ranking algorithms can be imported and called directly:</p>

<div class="pdoc-code codehilite">
<pre><span></span><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>

<span class="kn">from</span> <span class="nn"><a href="outrank/algorithms/feature_ranking/ranking_mi_numba.html">outrank.algorithms.feature_ranking.ranking_mi_numba</a></span> <span class="kn">import</span> <span class="p">(</span>
<span class="n">mutual_info_estimator_numba</span><span class="p">,</span>
<span class="p">)</span>

<span class="c1"># Some synthetic minimal data (Numpy vectors)</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int32</span><span class="p">)</span>

<span class="n">lowest</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">permutation</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int32</span><span class="p">)</span>
<span class="n">medium</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int32</span><span class="p">)</span>
<span class="n">high</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int32</span><span class="p">)</span>

<span class="n">lowest_score</span> <span class="o">=</span> <span class="n">mutual_info_estimator_numba</span><span class="p">(</span>
<span class="n">a</span><span class="p">,</span> <span class="n">lowest</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">(</span><span class="mf">1.0</span><span class="p">),</span> <span class="kc">False</span><span class="p">,</span>
<span class="p">)</span>
<span class="n">medium_score</span> <span class="o">=</span> <span class="n">mutual_info_estimator_numba</span><span class="p">(</span>
<span class="n">a</span><span class="p">,</span> <span class="n">medium</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">(</span><span class="mf">1.0</span><span class="p">),</span> <span class="kc">False</span><span class="p">,</span>
<span class="p">)</span>
<span class="n">high_score</span> <span class="o">=</span> <span class="n">mutual_info_estimator_numba</span><span class="p">(</span>
<span class="n">a</span><span class="p">,</span> <span class="n">high</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">float32</span><span class="p">(</span><span class="mf">1.0</span><span class="p">),</span> <span class="kc">False</span><span class="p">,</span>
<span class="p">)</span>

<span class="n">scores</span> <span class="o">=</span> <span class="p">[</span><span class="n">lowest_score</span><span class="p">,</span> <span class="n">medium_score</span><span class="p">,</span> <span class="n">high_score</span><span class="p">]</span>
<span class="n">sorted_score_indices</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">argsort</span><span class="p">(</span><span class="n">scores</span><span class="p">)</span>
<span class="k">assert</span> <span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">])</span> <span class="o">-</span> <span class="n">sorted_score_indices</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span>
</code></pre>
</div>
</div>

<input id="mod-outrank-view-source" class="view-source-toggle-state" type="checkbox" aria-hidden="true" tabindex="-1">