Commit

Formating
ecomodeller committed Nov 6, 2023
1 parent c944b34 commit e89189d
Showing 2 changed files with 29 additions and 24 deletions.
10 changes: 5 additions & 5 deletions projects/data_cleaning/Project_module_02.md
@@ -7,11 +7,11 @@ After last module, your script now uses functions `clean_spikes`, `clean_outofra
- Add default arguments to the functions. Commit.
- Make sure that you only use positional arguments where there is only one argument. Use keyword arguments everywhere else. Commit.
- Consider changing the cleaning functions if they mutate their input (remember that arguments are passed by reference, not as a copy), e.g.
```python
data_cleaned = data.copy()
...
return data_cleaned
```
- 2.2 Modules
- Move cleaner functions into a separate module `cleaning.py`. Commit.
- Move the plotting function into a separate module `plotting.py`. Commit.
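
The points in 2.1 about default arguments, keyword arguments and copying the input can be sketched together as follows. This is only an illustration: the function name `mask_out_of_range` and the parameters `lower` and `upper` are hypothetical and are not taken from the course code.

```python
import pandas as pd


def mask_out_of_range(
    data: pd.Series, lower: float = 0.0, upper: float = 100.0
) -> pd.Series:
    """Return a copy of `data` with values outside [lower, upper] set to NaN.

    Hypothetical example function; the names and defaults are assumptions.
    """
    # Work on a copy so that the caller's Series is left untouched.
    data_cleaned = data.copy()
    data_cleaned[(data_cleaned < lower) | (data_cleaned > upper)] = float("nan")
    return data_cleaned


raw = pd.Series([1.0, 250.0, 3.0])

# One positional argument (the data), keyword arguments for everything else.
cleaned = mask_out_of_range(raw, lower=0.0, upper=10.0)
```

After the call, `raw` still contains `250.0`, while `cleaned` has `NaN` in that position.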
43 changes: 24 additions & 19 deletions projects/data_cleaning/Project_module_05.md
@@ -6,37 +6,42 @@ In this module you will benefit from the automatic testing that you have added i
- 5.1 Type Hints
- Add type hints to all functions and methods. Commit
- 5.2 Data class
- Make all the cleaner classes dataclasses, e.g.:
```python
from dataclasses import dataclass

@dataclass
class SpikeCleaner:
    ...
```
- remove the `__init__` method (not needed anymore)
- Check that the notebook still runs and that the classes indeed work as data classes (e.g. have a string representation and support equality testing etc)
- Commit
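
A quick way to check that a class really behaves as a data class is sketched below. The field `max_jump` is a hypothetical attribute chosen only for illustration; the real cleaner may have different fields.

```python
from dataclasses import dataclass


@dataclass
class SpikeCleaner:
    # Hypothetical field, used here only to demonstrate dataclass behaviour.
    max_jump: float = 10.0


a = SpikeCleaner(max_jump=5.0)
b = SpikeCleaner(max_jump=5.0)

# The generated __repr__ lists the field values.
print(a)

# The generated __eq__ compares field by field, not by identity.
print(a == b)
print(a == SpikeCleaner())
```

No `__init__` is written by hand: the decorator generates it from the annotated fields, together with `__repr__` and `__eq__`.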


- 5.3 Module level function
- Make a private module function `_print_stats()` that prints the number of cleaned values
- call the function from each of the clean methods (note: inheritance is not required to obtain common functionality)
- 5.4 Composition or inheritance
- Create a new cleaner class called CleanerWorkflow that takes a list of cleaners when constructed and has a clean method that runs all the cleaners' clean methods.
```python
class CleanerWorkflow:
    def __init__(self, cleaners) -> None:
        self.cleaners = cleaners

    def clean(self, data: pd.Series) -> pd.Series:
        data_cleaned = data.copy()
        for cleaner in self.cleaners:
            ...
```
- Modify the notebook to use the CleanerWorkflow instead of looping over the cleaners
- Consider what type of validation you would want CleanerWorkflow to have. Is it better to check validity up front or to just go ahead and handle problems afterwards?
- Consider whether it would be better to create a base class BaseCleaner - write down your considerations as a comment in the pull request, refer to specific lines of code
- e.g. how would you handle common plotting functionality in the cleaner classes?
- Create a pull request in GitHub and "request review" from your reviewers
- Get feedback, adjust the code until approval, then merge (and delete the branch)
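
One possible answer to the validation question in 5.4, combined with the module-level helper from 5.3, is sketched below. It is not the reference solution: `NegativeCleaner` is a hypothetical cleaner invented for the example, and counting cleaned values as newly introduced NaNs is an assumption about what "cleaned" means.

```python
import pandas as pd


def _print_stats(before: pd.Series, after: pd.Series) -> None:
    """Print the number of values that became NaN during cleaning."""
    n_cleaned = int(after.isna().sum() - before.isna().sum())
    print(f"Cleaned {n_cleaned} values")


class NegativeCleaner:
    """Hypothetical cleaner that masks negative values (illustration only)."""

    def clean(self, data: pd.Series) -> pd.Series:
        data_cleaned = data.copy()
        data_cleaned[data_cleaned < 0] = float("nan")
        # Shared behaviour through a plain function call, no base class needed.
        _print_stats(data, data_cleaned)
        return data_cleaned


class CleanerWorkflow:
    def __init__(self, cleaners) -> None:
        # Up-front validation: fail at construction rather than mid-run.
        for cleaner in cleaners:
            if not callable(getattr(cleaner, "clean", None)):
                raise TypeError(f"{cleaner!r} has no clean() method")
        self.cleaners = list(cleaners)

    def clean(self, data: pd.Series) -> pd.Series:
        data_cleaned = data.copy()
        for cleaner in self.cleaners:
            data_cleaned = cleaner.clean(data_cleaned)
        return data_cleaned


workflow = CleanerWorkflow(cleaners=[NegativeCleaner()])
result = workflow.clean(data=pd.Series([1.0, -2.0, 3.0]))
```

Checking in `__init__` means a misconfigured workflow is reported before any data is processed. The alternative is to let `clean` raise an `AttributeError` on the first bad cleaner, which is simpler but surfaces the problem later.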




