0.6.0 #80

Merged
merged 119 commits into from
Jul 31, 2022
119 commits
51b89ce
Dates are now dates and not just strings
Oxid15 Jun 30, 2022
0ff03ce
Page size is now arg
Oxid15 Jul 4, 2022
e393f45
Merge pull request #62 from Oxid15/dates_in_metric_viewer
Oxid15 Jul 4, 2022
c29d83a
Add read meta from file util
Oxid15 Jul 4, 2022
d35e04a
Tests of traceable
Oxid15 Jul 4, 2022
b2c1d11
Merge pull request #63 from Oxid15/update_meta_from_str
Oxid15 Jul 4, 2022
e2efdd7
Close logging files at deletion of Repo
Oxid15 Jul 4, 2022
6b7c4f6
Test with pytest instead of unittests
Oxid15 Jul 4, 2022
84761f8
Change repo name in different tests
Oxid15 Jul 4, 2022
9bca388
Migrate apply modifier tests to pytest
Oxid15 Jul 4, 2022
052c3e1
Naively migrate all tests to pytest
Oxid15 Jul 5, 2022
2f5ce85
Use fixtures (#65)
Oxid15 Jul 5, 2022
26ba39e
Merge branch '0.6.0' into update_tests
Oxid15 Jul 5, 2022
e2a7640
Merge pull request #64 from Oxid15/update_tests
Oxid15 Jul 5, 2022
24ffe10
Add samplers tests
Oxid15 Jul 5, 2022
60f631d
Update tests
Oxid15 Jul 5, 2022
c1d4a7d
Warn in ModelAggregate
Oxid15 Jul 5, 2022
b3e8d02
Utils tests moved
Oxid15 Jul 5, 2022
12886ec
Migrate traceable test to pytest
Oxid15 Jul 5, 2022
481ea67
Image folder tests
Oxid15 Jul 5, 2022
82e1ff6
Add the ability to use different handlers not only json
Oxid15 Jul 8, 2022
6208cc1
Fix test in meta viewer
Oxid15 Jul 8, 2022
3c6699f
Rename custom encoder to custom json encoder
Oxid15 Jul 9, 2022
a9a6fbe
Mark hv tests as slow
Oxid15 Jul 9, 2022
4553ff5
Cover all dtypes in tests
Oxid15 Jul 9, 2022
c5609d1
Move json encoders default out of class
Oxid15 Jul 9, 2022
69a1b83
Add YAML support
Oxid15 Jul 9, 2022
ebe8106
Tests formatting
Oxid15 Jul 9, 2022
62e0277
Make yaml work using conversion of objects to dict first
Oxid15 Jul 9, 2022
2c990f5
Parametrize mh tests
Oxid15 Jul 9, 2022
0b0b283
Remove unused marks
Oxid15 Jul 9, 2022
23e387a
Add text reader
Oxid15 Jul 10, 2022
6770461
Add other stuff
Oxid15 Jul 10, 2022
44e5aa0
Drop support of SkClassifier
Oxid15 Jul 10, 2022
5ec5386
Let model line set meta format
Oxid15 Jul 10, 2022
99c59b9
Update docs
Oxid15 Jul 24, 2022
6ab2eae
Update docs
Oxid15 Jun 25, 2022
3b1cc95
Update docs
Oxid15 Jul 24, 2022
0177a5d
Add nbsphinx dependency to render ipynb in docs
Oxid15 Jul 24, 2022
e7a6f8e
Add an example for pipeline building
Oxid15 Jul 24, 2022
8e5a295
Revert "Drop support of SkClassifier"
Oxid15 Jul 10, 2022
7d77d76
Drop support but now with message!
Oxid15 Jul 24, 2022
d74e645
Fix description - model training is separate usecase
Oxid15 Jul 24, 2022
13a6163
Add BasicModel interface
Oxid15 Jul 26, 2022
db7c8c2
Test BasicModel
Oxid15 Jul 26, 2022
8d0fae6
Merge pull request #67 from Oxid15/drop_skclassifier
Oxid15 Jul 26, 2022
73a5572
Update model's docstring
Oxid15 Jul 26, 2022
93369cc
SkModel is now BasicModel
Oxid15 Jul 26, 2022
80a4c4a
Construct pipeline now static function
Oxid15 Jul 26, 2022
f32364a
Make it staticmethod
Oxid15 Jul 26, 2022
28a56d7
Typo... (need to implement this functionality)
Oxid15 Jul 26, 2022
69d9c75
Comment code that isn't in use
Oxid15 Jul 26, 2022
ce657e1
Add BasicModelModifier
Oxid15 Jul 26, 2022
8870aa0
Add constant baseline
Oxid15 Jul 26, 2022
db1ccba
Evaluate documentation
Oxid15 Jul 26, 2022
68dbca7
Remove old file
Oxid15 Jul 26, 2022
467c431
Add TorchModel
Oxid15 Jul 26, 2022
c7a1cda
Merge pull request #68 from Oxid15/torch_model
Oxid15 Jul 26, 2022
8f14871
Add model_training to toctree
Oxid15 Jul 26, 2022
099b356
Add draft model_training
Oxid15 Jul 26, 2022
155f461
Code of model training example
Oxid15 Jul 26, 2022
b4cae80
Fix the description
Oxid15 Jul 26, 2022
0c7c37c
Changes that are not related to docs
Oxid15 Jul 26, 2022
b40a3c6
Add comments to model training example
Oxid15 Jul 29, 2022
b02522a
Merge pull request #69 from Oxid15/update_docs
Oxid15 Jul 29, 2022
9c3be59
Clean up docs
Oxid15 Jul 29, 2022
e13739c
Merge branch '0.6.0' into model_update
Oxid15 Jul 29, 2022
0d453d0
Merge pull request #70 from Oxid15/model_update
Oxid15 Jul 29, 2022
b961fa6
Merge branch '0.6.0' into abstract_from_json
Oxid15 Jul 29, 2022
f34133c
Merge pull request #66 from Oxid15/abstract_from_json
Oxid15 Jul 29, 2022
eb39eb7
Refine docs
Oxid15 Jul 29, 2022
6c45ab8
Merge pull request #71 from Oxid15/serve_args
Oxid15 Jul 29, 2022
aa3476c
Numpy wrapper now has its own path in meta
Oxid15 Jul 29, 2022
e08760b
Fix the bug with num don't correspond to real number of run on Linux
Oxid15 Jul 29, 2022
097eb54
Add test for this fix
Oxid15 Jul 29, 2022
66606b6
Concatenator's meta is now dict
Oxid15 Jul 29, 2022
375894a
Add default value for constant
Oxid15 Jul 29, 2022
580208a
TorchModel now has graph description in meta
Oxid15 Jul 29, 2022
3cf76b6
Init methods are now in docs
Oxid15 Jul 29, 2022
b5adc5f
Make ModelRepo lines true alias for add_line
Oxid15 Jul 29, 2022
0c1b8d1
Add Repo interface
Oxid15 Jul 29, 2022
bd3fbb1
Add repo concatenator and the ability to sum repos
Oxid15 Jul 30, 2022
60c9884
Test add methods
Oxid15 Jul 30, 2022
3e74b27
Debug and test getitem interface
Oxid15 Jul 30, 2022
b3c6b36
Change cls to model_cls everywhere
Oxid15 Jul 30, 2022
3524e60
Merge pull request #73 from Oxid15/debug_num_in_mv
Oxid15 Jul 30, 2022
58e0436
Clean some code formatting
Oxid15 Jul 30, 2022
6687617
Now columns in selectors for graph are from flat table
Oxid15 Jul 30, 2022
b5ba8f1
Add include and exclude keywords
Oxid15 Jul 30, 2022
7a93fae
Add test of numpy wrapper
Oxid15 Jul 30, 2022
7d7f10b
Move undersampler's test to tests
Oxid15 Jul 30, 2022
15038e2
Add test for table dataset
Oxid15 Jul 30, 2022
6b06aa8
Add test for text classification dataset
Oxid15 Jul 30, 2022
8bd1dc6
Merge pull request #74 from Oxid15/meta_update
Oxid15 Jul 30, 2022
2372418
Merge pull request #75 from Oxid15/torch_model_update
Oxid15 Jul 30, 2022
d124256
Merge branch '0.6.0' into repo_math
Oxid15 Jul 30, 2022
d7f2cc9
Merge pull request #76 from Oxid15/repo_math
Oxid15 Jul 30, 2022
6f0e5f7
Merge pull request #77 from Oxid15/metric_viewer_update
Oxid15 Jul 30, 2022
47d5ae4
Merge branch '0.6.0' into docs_update
Oxid15 Jul 30, 2022
1a96d63
Update documentation sources
Oxid15 Jul 30, 2022
9812df9
Write utils documentation
Oxid15 Jul 30, 2022
b66892f
Merge branch '0.6.0' into utils_update
Oxid15 Jul 30, 2022
9f3effc
Merge pull request #78 from Oxid15/utils_update
Oxid15 Jul 30, 2022
879de99
Merge branch '0.6.0' into docs_update
Oxid15 Jul 30, 2022
0653bf1
Add utils source file
Oxid15 Jul 31, 2022
9dd547c
Add utils to toctree
Oxid15 Jul 31, 2022
caebd22
Missing import
Oxid15 Jul 31, 2022
9f93378
Update version
Oxid15 Jul 31, 2022
f9f92a7
Fix import
Oxid15 Jul 31, 2022
b941c27
Refine documentation, links
Oxid15 Jul 31, 2022
e488f2d
Merge pull request #79 from Oxid15/docs_update
Oxid15 Jul 31, 2022
092d056
Merge pull request #72 from Oxid15/0.6.0
Oxid15 Jul 31, 2022
dfc1a89
Fail if can't find requirements files in workflow
Oxid15 Jul 31, 2022
2d1269e
Fix paths to test requirements
Oxid15 Jul 31, 2022
52d78ae
Fix error in test meta viewer
Oxid15 Jul 31, 2022
0f7d7f9
Change order in accordance to sorting
Oxid15 Jul 31, 2022
1acf93c
Add utils tests to workflow
Oxid15 Jul 31, 2022
53fb56d
Add root path to utils tests
Oxid15 Jul 31, 2022
d1dbcba
Fix error in test of folder img ds
Oxid15 Jul 31, 2022
20 changes: 16 additions & 4 deletions .github/workflows/python-package.yml
@@ -24,21 +24,33 @@ jobs:
uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python-version }}

- name: Install dependencies
run: |
pwd
ls
python -m pip install --upgrade pip
python -m pip install flake8 pytest
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
if [ -f utils_requirements.txt ]; then pip install -r utils_requirements.txt; fi
python -m pip install -r requirements.txt
python -m pip install -r cascade/tests/requirements.txt
python -m pip install -r utils_requirements.txt

- name: Lint with flake8
run: |
# stop the build if there are Python syntax errors or undefined names
flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
# exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics

- name: Test
run: |
pwd
ls
python --version
python -m unittest discover ./cascade/tests
cd ./cascade/tests
pytest --cov=cascade .
- name: Test utils
run: |
pwd
ls
cd ./cascade/utils/tests
pytest
4 changes: 2 additions & 2 deletions cascade/__init__.py
@@ -14,7 +14,7 @@
limitations under the License.
"""

__version__ = '0.5.2'
__version__ = '0.6.0'
__author__ = 'Ilia Moiseev'
__author_email__ = 'ilia.moiseev.5@yandex.ru'

@@ -25,7 +25,7 @@
from . import tests

# cascade does not have
# `from . import utils`
# from . import utils
# because it will bring additional dependencies that may not be needed by the user
# if you need to use cascade.utils, you can install utils_requirements.txt and then
# import as any other cascade module
1 change: 1 addition & 0 deletions cascade/base/__init__.py
@@ -1,2 +1,3 @@
from .meta_handler import MetaHandler
from .traceable import Traceable
from .meta_handler import CustomEncoder as JSONEncoder
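Reviewer note: the `JSONEncoder` alias exported here points at `CustomEncoder` from `meta_handler.py`, whose diff follows. As a rough illustration of its date and time branches, here is a standard-library-only sketch. `DateAwareEncoder` and `encode` are hypothetical names used for illustration, and the numpy branches are omitted on purpose.

```python
import datetime
import json


class DateAwareEncoder(json.JSONEncoder):
    """Hypothetical stand-in for CustomEncoder; numpy branches omitted."""

    def default(self, obj):
        # Dates, datetimes and times are stored as ISO 8601 strings
        if isinstance(obj, (datetime.datetime, datetime.date, datetime.time)):
            return obj.isoformat()
        # Same expression as in the diff: the timedelta is added to the
        # minimum datetime and only the time-of-day part is kept
        if isinstance(obj, datetime.timedelta):
            return (datetime.datetime.min + obj).time().isoformat()
        return super().default(obj)


def encode(obj):
    """Serialize obj to a JSON string using the sketch encoder."""
    return json.dumps(obj, cls=DateAwareEncoder)
```

With this expression a 25-hour timedelta encodes to the same string as a 1-hour one, because the day component is dropped by `.time()`. Durations of a day or more may therefore need a different representation if they matter in meta.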
102 changes: 89 additions & 13 deletions cascade/base/meta_handler.py
@@ -16,23 +16,28 @@

import os
import json
from typing import Union, Dict, List
import datetime
from typing import List, Dict
from json import JSONEncoder

import yaml
import numpy as np


class CustomEncoder(JSONEncoder):
def default(self, obj):
if isinstance(obj, type):
return str(obj)
if isinstance(obj, (datetime.datetime, datetime.date, datetime.time)):
return obj.isoformat()

elif isinstance(obj, datetime.timedelta):
return (datetime.datetime.min + obj).time().isoformat()

elif isinstance(obj, (np.int_, np.intc, np.intp, np.int8,
np.int16, np.int32, np.int64, np.uint8,
np.uint16, np.uint32, np.uint64)):

return int(obj)

elif isinstance(obj, (np.float_, np.float16, np.float32, np.float64)):
@@ -44,29 +49,39 @@ def default(self, obj):
elif isinstance(obj, (np.ndarray,)):
return obj.tolist()

elif isinstance(obj, (np.bool_)):
elif isinstance(obj, np.bool_):
return bool(obj)

elif isinstance(obj, (np.void)):
elif isinstance(obj, np.void):
return None

return super(CustomEncoder, self).default(obj)

def obj_to_dict(self, obj):
return json.loads(self.encode(obj))

class MetaHandler:

class BaseHandler:
def read(self, path) -> List[Dict]:
raise NotImplementedError()

def write(self, path, obj, overwrite=True) -> None:
raise NotImplementedError()


class JSONHandler(BaseHandler):
"""
Handles the logic of dumping and loading json files
"""
def read(self, path) -> dict:
def read(self, path) -> Union[Dict, List[Dict]]:
"""
Reads json from path

Parameters
----------
path:
Path to the file. If no extension provided, then .json assumed
Path to the file. If no extension provided, then .json will be added
"""
assert os.path.exists(path)
_, ext = os.path.splitext(path)
if ext == '':
path += '.json'
@@ -77,16 +92,77 @@ def read(self, path) -> dict:
meta = json.loads(meta)
return meta

def write(self, name, obj, overwrite=True) -> None:
def write(self, name, obj:List[Dict], overwrite=True) -> None:
"""
Writes json to path using custom encoder
"""

if not overwrite and os.path.exists(name):
return

with open(name, 'w') as json_meta:
json.dump(obj, json_meta, cls=CustomEncoder, indent=4)
with open(name, 'w') as f:
json.dump(obj, f, cls=CustomEncoder, indent=4)


class YAMLHandler(BaseHandler):
def read(self, path) -> Union[Dict, List[Dict]]:
"""
Reads yaml from path

Parameters
----------
path:
Path to the file. If no extension provided, then .yml will be added
"""
_, ext = os.path.splitext(path)
if ext == '':
path += '.yml'

with open(path, 'r') as meta_file:
meta = yaml.safe_load(meta_file)
return meta

def write(self, path, obj, overwrite=True) -> None:
if not overwrite and os.path.exists(path):
return

obj = CustomEncoder().obj_to_dict(obj)
with open(path, 'w') as f:
yaml.safe_dump(obj, f)

def encode(self, obj):
return CustomEncoder().encode(obj)

class TextHandler(BaseHandler):
def read(self, path) -> Dict:
"""
Reads text file from path and returns dict in the form {path: 'text from file'}

Parameters
----------
path:
Path to the file
"""

with open(path, 'r') as meta_file:
meta = {path: ''.join(meta_file.readlines())}
return meta

def write(self, path, obj, overwrite=True) -> None:
raise NotImplementedError('MetaHandler does not write text files, only reads')


class MetaHandler:
def read(self, path) -> List[Dict]:
handler = self._get_handler(path)
return handler.read(path)

def write(self, path, obj, overwrite=True) -> None:
handler = self._get_handler(path)
return handler.write(path, obj, overwrite=overwrite)

def _get_handler(self, path) -> BaseHandler:
ext = os.path.splitext(path)[-1]
if ext == '.json':
return JSONHandler()
elif ext == '.yml':
return YAMLHandler()
else:
return TextHandler()
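Reviewer note: `MetaHandler` now chooses a handler from the file extension. The sketch below shows the same dispatch idea using only the standard library, so the YAML branch is left out. `JsonHandlerSketch`, `TextHandlerSketch`, `pick_handler`, and `roundtrip_demo` are illustrative names, not the classes from `cascade.base`.

```python
import json
import os
import tempfile


class JsonHandlerSketch:
    def read(self, path):
        with open(path, "r") as f:
            return json.load(f)

    def write(self, path, obj, overwrite=True):
        # Mirror the diff: silently skip when the file exists and overwrite is off
        if not overwrite and os.path.exists(path):
            return
        with open(path, "w") as f:
            json.dump(obj, f, indent=4)


class TextHandlerSketch:
    def read(self, path):
        # Text files come back as {path: contents}, as in TextHandler.read
        with open(path, "r") as f:
            return {path: f.read()}

    def write(self, path, obj, overwrite=True):
        raise NotImplementedError("text files are read-only in this sketch")


def pick_handler(path):
    """Choose a handler from the extension; anything unknown is treated as text."""
    ext = os.path.splitext(path)[-1]
    if ext == ".json":
        return JsonHandlerSketch()
    return TextHandlerSketch()


def roundtrip_demo():
    with tempfile.TemporaryDirectory() as tmp:
        json_path = os.path.join(tmp, "meta.json")
        text_path = os.path.join(tmp, "README.md")
        pick_handler(json_path).write(json_path, [{"name": "ds", "len": 3}])
        # A second write with overwrite=False must leave the file untouched
        pick_handler(json_path).write(json_path, [], overwrite=False)
        with open(text_path, "w") as f:
            f.write("hello")
        meta = pick_handler(json_path).read(json_path)
        text = pick_handler(text_path).read(text_path)[text_path]
        return meta, text
```

One consequence of dispatching this way, as the diff reads: a path with no extension falls through to the text handler, so the "no extension provided" branches inside `JSONHandler.read` and `YAMLHandler.read` look reachable only when those handlers are used directly.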
17 changes: 11 additions & 6 deletions cascade/base/traceable.py
@@ -1,17 +1,19 @@
import warnings
from typing import List, Dict
from typing import List, Dict, Union


class Traceable:
def __init__(self, *args, meta_prefix=None, **kwargs) -> None:
if meta_prefix is None:
meta_prefix = {}
if isinstance(meta_prefix, str):
from . import MetaHandler

meta_prefix = MetaHandler().read(meta_prefix)
elif isinstance(meta_prefix, str):
meta_prefix = self._read_meta_from_file(meta_prefix)
self.meta_prefix = meta_prefix

def _read_meta_from_file(self, path: str) -> Union[List[Dict], Dict]:
from . import MetaHandler
return MetaHandler().read(path)

def get_meta(self) -> List[Dict]:
"""
Returns
@@ -30,10 +32,13 @@ def get_meta(self) -> List[Dict]:
self._warn_no_prefix()
return [meta]

def update_meta(self, obj: Dict) -> None:
def update_meta(self, obj: Union[Dict, str]) -> None:
"""
Updates meta_prefix, which is then updates dataset's meta when get_meta() is called
"""
if isinstance(obj, str):
obj = self._read_meta_from_file(obj)

if hasattr(self, 'meta_prefix'):
self.meta_prefix.update(obj)
else:
Expand Down
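Reviewer note: after this change both `meta_prefix` and `update_meta` accept either a dict or a path to a meta file. The sketch below is a minimal illustration, assuming a JSON file that holds a single mapping. `TraceableSketch` and `update_demo` are hypothetical, and the file read stands in for `MetaHandler().read(path)`.

```python
import json
import os
import tempfile


class TraceableSketch:
    def __init__(self, meta_prefix=None):
        if meta_prefix is None:
            meta_prefix = {}
        elif isinstance(meta_prefix, str):
            meta_prefix = self._read_meta_from_file(meta_prefix)
        self.meta_prefix = meta_prefix

    def _read_meta_from_file(self, path):
        # Stand-in for MetaHandler().read(path); JSON only in this sketch
        with open(path, "r") as f:
            return json.load(f)

    def update_meta(self, obj):
        # A string is treated as a path and replaced by the file's contents
        if isinstance(obj, str):
            obj = self._read_meta_from_file(obj)
        self.meta_prefix.update(obj)


def update_demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "prefix.json")
        with open(path, "w") as f:
            json.dump({"author": "someone"}, f)
        t = TraceableSketch(meta_prefix={"version": 1})
        t.update_meta(path)            # update from a file
        t.update_meta({"version": 2})  # update from a dict
        return t.meta_prefix
```

The sketch relies on the file containing a mapping, since `dict.update` is applied to whatever the read returns. The annotation in the diff also allows a list of dicts, a case this sketch does not cover.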
6 changes: 3 additions & 3 deletions cascade/data/concatenator.py
@@ -64,10 +64,10 @@ def __repr__(self) -> str:

def get_meta(self) -> List[Dict]:
"""
Concatenator calls `get_meta()` of all its datasets and appends to its own meta
Concatenator calls `get_meta()` of all its datasets
"""
meta = super().get_meta()
meta[0]['data'] = []
meta[0]['data'] = {}
for ds in self._datasets:
meta[0]['data'] += ds.get_meta()
meta[0]['data'][repr(ds)] = ds.get_meta()
return meta
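Reviewer note: the `data` field changes from a flat list to a dict keyed by `repr(ds)`. The sketch below shows the resulting shape. `DatasetSketch` and `concat_meta` are hypothetical stand-ins, and the `name` and `len` fields are invented for illustration.

```python
class DatasetSketch:
    def __init__(self, name, length):
        self.name = name
        self.length = length

    def __repr__(self):
        return f"DatasetSketch({self.name})"

    def get_meta(self):
        return [{"name": repr(self), "len": self.length}]


def concat_meta(datasets):
    """Build concatenator-style meta with one entry per dataset repr."""
    meta = [{"name": "ConcatenatorSketch", "data": {}}]
    for ds in datasets:
        meta[0]["data"][repr(ds)] = ds.get_meta()
    return meta
```

In this sketch two datasets with the same `repr` share a key, so the later one overwrites the earlier one. Whether real cascade datasets can produce identical reprs is not visible in this diff, but it may be worth a test.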
7 changes: 6 additions & 1 deletion cascade/data/dataset.py
@@ -76,6 +76,11 @@ def __getitem__(self, item):
def __iter__(self):
for item in self._data:
yield item

def get_meta(self):
meta = super().get_meta()
meta[0]['obj_type'] = str(type(self._data))
return meta


class Wrapper(Dataset):
@@ -95,7 +100,7 @@ def __len__(self) -> int:
def get_meta(self):
meta = super().get_meta()
meta[0]['len'] = len(self)
meta[0]['obj_type'] = type(self._data)
meta[0]['obj_type'] = str(type(self._data))
return meta


Expand Down
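Reviewer note: storing `str(type(self._data))` instead of the class object keeps the meta serializable without relying on an encoder. The sketch below uses the plain `json` module to show the difference. The helper names are hypothetical.

```python
import json


def meta_with_type(data):
    # Old style: the class object itself goes into meta
    return {"len": len(data), "obj_type": type(data)}


def meta_with_type_name(data):
    # New style: the string form of the class goes into meta
    return {"len": len(data), "obj_type": str(type(data))}


def is_json_serializable(obj):
    """Return True if the default json encoder accepts obj."""
    try:
        json.dumps(obj)
    except TypeError:
        return False
    return True
```

The string form is also something a YAML writer can store as a plain scalar, which presumably matters now that meta can be written in more than one format.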
6 changes: 3 additions & 3 deletions cascade/data/pickler.py
@@ -25,10 +25,10 @@ class Pickler(Modifier):
"""
def __init__(self, path, dataset=None, *args, **kwargs) -> None:
"""
Loads pickled dataset or dumps one depending on parameters passed
Loads pickled dataset or dumps one depending on parameters passed:

If only path is passed - loads dataset from path provided if path exists
if path provided with a dataset dumps dataset to the path
1. If only path is passed - loads dataset from path provided if path exists
2. if path provided with a dataset dumps dataset to the path

Parameters
----------
Expand Down
13 changes: 13 additions & 0 deletions cascade/data/random_sampler.py
@@ -19,7 +19,20 @@


class RandomSampler(Sampler):
"""
Shuffles dataset. Can randomly sample from dataset
if num_samples is not None and less than length of dataset.
"""
def __init__(self, dataset: Dataset, num_samples=None, **kwargs) -> None:
"""
Parameters
----------
dataset: Dataset
Input dataset to sample from
num_samples: int, optional
Should be less than len(dataset), but oversampling can be added in the future.
If None, then just shuffles the dataset.
"""
if num_samples is None:
num_samples = len(dataset)
super().__init__(dataset, num_samples, **kwargs)
Expand Down
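Reviewer note: the new docstring describes two modes: shuffle everything when `num_samples` is None, or draw a smaller random subset. The rest of the implementation is not shown in this diff, so the sketch below is only one plausible reading. `RandomSamplerSketch` and its `seed` parameter are assumptions made for illustration.

```python
import random


class RandomSamplerSketch:
    def __init__(self, dataset, num_samples=None, seed=None):
        # None means "use every item", which amounts to a shuffle
        if num_samples is None:
            num_samples = len(dataset)
        if num_samples > len(dataset):
            raise ValueError("oversampling is not supported in this sketch")
        self._dataset = dataset
        # Draw distinct indices without replacement
        rng = random.Random(seed)
        self._indices = rng.sample(range(len(dataset)), num_samples)

    def __len__(self):
        return len(self._indices)

    def __getitem__(self, index):
        return self._dataset[self._indices[index]]
```

Sampling indices rather than copying items keeps the wrapped dataset untouched and makes the sampler cheap to construct.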
1 change: 1 addition & 0 deletions cascade/docs/requirements.txt
@@ -1,3 +1,4 @@
sphinx
furo
sphinx-copybutton
nbsphinx
5 changes: 0 additions & 5 deletions cascade/docs/source/build_pipeline.py

This file was deleted.

19 changes: 19 additions & 0 deletions cascade/docs/source/cascade.base.rst
@@ -0,0 +1,19 @@
cascade.base
============
.. autoclass:: cascade.base.Traceable
:members:

|

|

|

.. autoclass:: cascade.base.MetaHandler
:members:

|

|

|
12 changes: 10 additions & 2 deletions cascade/docs/source/cascade.data.rst
@@ -82,15 +82,14 @@ cascade.data

|


.. autoclass:: cascade.data.FolderDataset
:members:

|

|

|
|

.. autoclass:: cascade.data.Pickler
:members:
@@ -99,6 +98,15 @@ cascade.data

|

|

.. autoclass:: cascade.data.RandomSampler
:members:

|

|

|

.. autoclass:: cascade.data.SequentialCacher
Expand Down