Built site for gh-pages
Quarto GHA Workflow Runner committed Nov 6, 2023
1 parent c1f69b7 commit 1c30115
Showing 5 changed files with 108 additions and 62 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
Original file line number Diff line number Diff line change
@@ -1 +1 @@
f4c8872c
92b70b93
21 changes: 12 additions & 9 deletions projects/data_cleaning/Project_module_02.html
@@ -118,8 +118,11 @@ <h2 class="anchored" data-anchor-id="module-2-modules-and-classes">Module 2: Mod
<ul>
<li>Add default arguments to the functions. Commit.</li>
<li>Make sure that you only use positional arguments where there is only one argument. Use keyword arguments everywhere else. Commit.</li>
<li>Consider modifying the cleaning functions if they modify the input (remember that inputs are passed as reference, not a copy), e.g.&nbsp; <code>python data_cleaned = data.copy() ... return data_cleaned</code></li>
</ul></li>
<li>Consider changing the cleaning functions if they modify their input (remember that arguments are passed by reference, not as a copy), for example:</li>
</ul>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>data_cleaned <span class="op">=</span> data.copy()</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a>...</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="cf">return</span> data_cleaned</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div></li>
<li>2.2 Modules
<ul>
<li>Move cleaner functions into a separate module <code>cleaning.py</code>. Commit.</li>
@@ -142,13 +145,13 @@ <h2 class="anchored" data-anchor-id="module-2-modules-and-classes">Module 2: Mod
</ul></li>
<li>modify <code>main.py</code> and check that it runs</li>
</ul>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>cleaners <span class="op">=</span> [</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> SpikeCleaner(max_jump<span class="op">=</span><span class="dv">10</span>),</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> OutOfRangeCleaner(min_val<span class="op">=</span><span class="dv">0</span>, max_val<span class="op">=</span><span class="dv">50</span>),</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> FlatPeriodCleaner(flat_period<span class="op">=</span><span class="dv">5</span>),</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>]</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> cleaner <span class="kw">in</span> cleaners:</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> data <span class="op">=</span> cleaner.clean(data)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>cleaners <span class="op">=</span> [</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> SpikeCleaner(max_jump<span class="op">=</span><span class="dv">10</span>),</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> OutOfRangeCleaner(min_val<span class="op">=</span><span class="dv">0</span>, max_val<span class="op">=</span><span class="dv">50</span>),</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> FlatPeriodCleaner(flat_period<span class="op">=</span><span class="dv">5</span>),</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>]</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> cleaner <span class="kw">in</span> cleaners:</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> data <span class="op">=</span> cleaner.clean(data)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<ul>
<li>commit</li>
</ul></li>
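<p>A minimal sketch of the 2.1 points (default arguments, keyword arguments, and leaving the input untouched), assuming the data is a <code>pd.Series</code> as in the later modules. The function name <code>clean_out_of_range</code> and the NaN-masking logic are hypothetical and chosen only for illustration; <code>min_val</code> and <code>max_val</code> mirror the <code>OutOfRangeCleaner</code> parameters.</p>

```python
import pandas as pd


def clean_out_of_range(
    data: pd.Series, *, min_val: float = 0.0, max_val: float = 50.0
) -> pd.Series:
    """Return a copy of `data` with values outside [min_val, max_val] set to NaN."""
    # Work on a copy so the caller's Series is not modified.
    data_cleaned = data.copy()
    out_of_range = (data_cleaned < min_val) | (data_cleaned > max_val)
    data_cleaned[out_of_range] = float("nan")
    return data_cleaned


raw = pd.Series([1.0, 75.0, 20.0, -3.0])
# One positional argument (the data), keyword arguments for everything else.
cleaned = clean_out_of_range(raw, min_val=0.0, max_val=50.0)
```

<p>The bare <code>*</code> in the signature makes <code>min_val</code> and <code>max_val</code> keyword-only, which is one way to enforce the keyword-argument rule rather than relying on convention.</p>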
89 changes: 66 additions & 23 deletions projects/data_cleaning/Project_module_05.html
@@ -20,6 +20,40 @@
margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */
vertical-align: middle;
}
/* CSS for syntax highlighting */
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
}
pre.numberSource { margin-left: 3em; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
</style>


@@ -86,34 +120,43 @@ <h2 class="anchored" data-anchor-id="module-5-object-oriented-design">Module 5:
</ul></li>
<li>5.2 Data class
<ul>
<li>Make all the cleaner classes dataclasses. e.g.: ```python from dataclasses import dataclass</li>
<li>Make all the cleaner classes dataclasses.</li>
</ul>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> dataclasses <span class="im">import</span> dataclass</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="at">@dataclass</span></span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="kw">class</span> SpikeCleaner:</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> ...</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<ul>
<li>remove the <code>__init__</code> method (not needed anymore)</li>
<li>Check that the notebook still runs and that the classes indeed work as data classes (e.g.&nbsp;they have a string representation and support equality testing)</li>
<li>Commit</li>
</ul></li>
<li>5.3 Module level function
<ul>
<li>Make a private module function <code>_print_stats()</code> that prints the number of cleaned values</li>
<li>call the function from each of the clean methods (note: inheritance is not required to obtain common functionality)</li>
</ul></li>
<li>5.4 Composition or inheritance
<ul>
<li>Create a new cleaner class called CleanerWorkflow that takes a list of cleaners when constructed and has a clean method that runs all the cleaners’ clean methods.</li>
</ul>
<p><span class="citation" data-cites="dataclass">@dataclass</span> class SpikeCleaner:</p>
<pre><code> - remove the `__init__` method (not needed anymore)
- Check that the notebook still runs and that the classes indeed work as data classes (e.g. have a string representation and support equality testing etc)
- Commit
- 5.3 Module level function
- Make a private module function `_print_stats()` that prints the number of cleaned values
- call the function from each of the clean methods (note: inheritance is not required to obtain common functionality)
- 5.4 Composition or inheritance
- Create a new cleaner class called CleanerWorkflow that takes a list of cleaners when constructed and has a clean method that run all the cleaners' clean methods.
```python
class CleanerWorkflow:
def __init__(self, cleaners) -&gt; None:
self.cleaners = cleaners

def clean(self, data: pd.Series) -&gt; pd.Series:
data_cleaned = data.copy()
for cleaner in self.cleaners:
...</code></pre>
<pre><code>- Modify the notebook to use the CleanerWorkflow instead of looping over the cleaners
- Consider what type of validation you would want CleanerWorkflow to have? Is it better check validity up front or to just go ahead and handle problems afterwards?
- Consider whether it would be better to create a base class BaseCleaner - write down your considerations as a comment in the pull request, refer to specific lines of code
- e.g. how would you handle e.g. common plotting functionality in the cleaner classes? </code></pre>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="kw">class</span> CleanerWorkflow:</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="kw">def</span> <span class="fu">__init__</span>(<span class="va">self</span>, cleaners) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>.cleaners <span class="op">=</span> cleaners</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> <span class="kw">def</span> clean(<span class="va">self</span>, data: pd.Series) <span class="op">-&gt;</span> pd.Series:</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> data_cleaned <span class="op">=</span> data.copy()</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> <span class="cf">for</span> cleaner <span class="kw">in</span> <span class="va">self</span>.cleaners:</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> ...</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<ul>
<li>Modify the notebook to use the CleanerWorkflow instead of looping over the cleaners</li>
<li>Consider what type of validation you would want CleanerWorkflow to have. Is it better to check validity up front, or to just go ahead and handle problems afterwards?</li>
<li>Consider whether it would be better to create a base class BaseCleaner - write down your considerations as a comment in the pull request, refer to specific lines of code</li>
<li>For example, how would you handle common plotting functionality in the cleaner classes?</li>
<li>Create a pull request in GitHub and “request review” from your reviewers</li>
<li>Get feedback, adjust the code until it is approved, then merge (and delete the branch)</li>
</ul></li>
</ul>
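<p>Steps 5.2 to 5.4 could fit together roughly as sketched below. This is one possible completion under stated assumptions, not the reference solution: the <code>_print_stats()</code> signature, the NaN-masking logic, the default values, and the up-front check in <code>CleanerWorkflow.__init__</code> are all hypothetical choices made for illustration.</p>

```python
from dataclasses import dataclass

import pandas as pd


def _print_stats(name: str, before: pd.Series, after: pd.Series) -> None:
    """Print how many values a cleaner set to NaN (hypothetical signature)."""
    n_cleaned = int(after.isna().sum() - before.isna().sum())
    print(f"{name}: cleaned {n_cleaned} values")


@dataclass
class OutOfRangeCleaner:
    # Fields replace the hand-written __init__; the defaults are assumptions.
    min_val: float = 0.0
    max_val: float = 50.0

    def clean(self, data: pd.Series) -> pd.Series:
        data_cleaned = data.copy()
        mask = (data_cleaned < self.min_val) | (data_cleaned > self.max_val)
        data_cleaned[mask] = float("nan")
        # Shared behaviour through a module-level function, no base class needed.
        _print_stats(type(self).__name__, data, data_cleaned)
        return data_cleaned


class CleanerWorkflow:
    def __init__(self, cleaners) -> None:
        # Up-front validation: fail early if an item cannot clean.
        for cleaner in cleaners:
            if not callable(getattr(cleaner, "clean", None)):
                raise TypeError(f"{cleaner!r} has no clean() method")
        self.cleaners = list(cleaners)

    def clean(self, data: pd.Series) -> pd.Series:
        data_cleaned = data.copy()
        for cleaner in self.cleaners:
            # Composition: delegate to each cleaner in turn.
            data_cleaned = cleaner.clean(data_cleaned)
        return data_cleaned


workflow = CleanerWorkflow(cleaners=[OutOfRangeCleaner(min_val=0, max_val=50)])
result = workflow.clean(pd.Series([1.0, 75.0, 20.0, -3.0]))
```

<p>Checking for a <code>clean</code> method in the constructor is the "validate up front" option; the alternative is to drop the check and let the <code>AttributeError</code> surface when <code>clean()</code> runs. Because the workflow only relies on each item having a <code>clean</code> method, it works without a shared <code>BaseCleaner</code>.</p>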
<p><a href="../../projects/data_cleaning/index.html">Back to course project overview</a></p>
