Commit

Refactor submissions processor so we can grab the cells data from it
s2t2 committed Dec 14, 2023
1 parent 02a0c22 commit d393608
Showing 14 changed files with 389 additions and 217 deletions.
12 changes: 8 additions & 4 deletions .gitignore
@@ -1,8 +1,12 @@

-# ignore results saved to data dir:
-data/*.csv
-data/*.png
-data/*.html
+# ignore artifacts saved to results dir:
+results/*.csv
+results/*.png
+results/*.html

+#test/results/*




 # Byte-compiled / optimized / DLL files
14 changes: 6 additions & 8 deletions README.md
@@ -71,23 +71,21 @@ python -m app.submissions_manager
 Process the starter file:

 ```sh
-python -m app.jobs.starter
+python -m app.starter_doc_processor

-# FIG_SHOW=false python -m app.jobs.starter
+# FIG_SHOW=false python -m app.starter_doc_processor

-# FIG_SHOW=false CHUNK_SIZE=600 CHUNK_OVERLAP=0 python -m app.jobs.starter
+# FIG_SHOW=false CHUNK_SIZE=600 CHUNK_OVERLAP=0 python -m app.starter_doc_processor

-# FIG_SHOW=false CHUNK_SIZE=600 CHUNK_OVERLAP=0 SIMILARITY_THRESHOLD=0.75 python -m app.jobs.starter
+# FIG_SHOW=false CHUNK_SIZE=600 CHUNK_OVERLAP=0 SIMILARITY_THRESHOLD=0.75 python -m app.starter_doc_processor
 ```

 Process all submission files:

 ```sh
-python -m app.jobs.submissions
+python -m app.submissions_processor

-#FIG_SHOW=false CHUNK_SIZE=600 CHUNK_OVERLAP=0 python -m app.jobs.submissions
-
-# FIG_SHOW=false CHUNK_SIZE=600 CHUNK_OVERLAP=0 SIMILARITY_THRESHOLD=0.75 python -m app.jobs.submissions
+#FIG_SHOW=false CHUNK_SIZE=600 CHUNK_OVERLAP=0 python -m app.submissions_processor
 ```

 ## Testing
2 changes: 1 addition & 1 deletion app/__init__.py
@@ -3,4 +3,4 @@

 import os

-DATA_DIRPATH = os.path.join(os.path.dirname(__file__), "..", "data")
+RESULTS_DIRPATH = os.path.join(os.path.dirname(__file__), "..", "results")
191 changes: 0 additions & 191 deletions app/jobs/submissions.py

This file was deleted.

File renamed without changes.
21 changes: 11 additions & 10 deletions app/submissions_manager.py
@@ -15,9 +15,10 @@

 class SubmissionsManager:

-    def __init__(self, dirpath=SUBMISSIONS_DIRPATH, file_ext=".IPYNB"):
+    def __init__(self, dirpath=SUBMISSIONS_DIRPATH, file_ext=".IPYNB", starter_filename=None):
         self.dirpath = dirpath
         self.file_ext = file_ext
+        self.starter_filename = starter_filename

     @cached_property
     def filenames(self):
@@ -34,19 +35,19 @@ def find_filepath(self, substr):
         return None


-    #@cached_property
-    #def starter_filepath(self):
-    # return self.find_filepath("STARTER")
+    @cached_property
+    def starter_filepath(self):
+        if self.starter_filename:
+            return os.path.join(self.dirpath, self.starter_filename)
+        else:
+            return self.find_filepath("STARTER")



 if __name__ == "__main__":


     sm = SubmissionsManager()
-    print(sm.dirpath)
-    print(len(sm.filenames))
-
-    starter_filepath = sm.find_filepath("STARTER")
-
-    print(starter_filepath)
+    print("SUBMISSIONS DIRPATH:", sm.dirpath)
+    print("FILES:", len(sm.filenames))
+    print("STARTER DOC:", sm.starter_filepath)
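
The new `starter_filepath` property prefers an explicit `starter_filename` and otherwise falls back to searching for "STARTER". The following is a minimal, self-contained sketch of that behavior, not the repository's code: the bodies of `filenames` and `find_filepath` are not visible in this diff, so the versions here are assumptions (extension filter, first substring match), and the file names and `demo` helper are hypothetical.

```python
import os
import tempfile
from functools import cached_property


class SubmissionsManager:
    """Simplified stand-in for the class in app/submissions_manager.py."""

    def __init__(self, dirpath, file_ext=".IPYNB", starter_filename=None):
        self.dirpath = dirpath
        self.file_ext = file_ext
        self.starter_filename = starter_filename

    @cached_property
    def filenames(self):
        # Assumption: keep only files ending in the configured extension.
        return sorted(f for f in os.listdir(self.dirpath) if f.upper().endswith(self.file_ext))

    def find_filepath(self, substr):
        # Assumption: return the first file whose name contains the substring.
        for filename in self.filenames:
            if substr in filename:
                return os.path.join(self.dirpath, filename)
        return None

    @cached_property
    def starter_filepath(self):
        # Mirrors the logic added in this commit: explicit name first, search second.
        if self.starter_filename:
            return os.path.join(self.dirpath, self.starter_filename)
        return self.find_filepath("STARTER")


def demo():
    """Create a throwaway directory and resolve the starter file both ways."""
    with tempfile.TemporaryDirectory() as dirpath:
        for name in ("Homework_STARTER.IPYNB", "student_1.IPYNB", "template.IPYNB"):
            with open(os.path.join(dirpath, name), "w"):
                pass  # empty placeholder files (hypothetical names)

        auto = SubmissionsManager(dirpath).starter_filepath
        explicit = SubmissionsManager(dirpath, starter_filename="template.IPYNB").starter_filepath
        return os.path.basename(auto), os.path.basename(explicit)


if __name__ == "__main__":
    print(demo())
```

Note that when `starter_filename` is given, the path is built without checking that the file exists, so a mistyped name would only surface when the file is later opened.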