Scribe does business lead generation on a large dataset of contact records. The dataset includes a job title column that is not yet normalized and that could be used to identify seniority, department, job type, etc.
The minimum viable project is to assign each job title a classification from the Canadian government's job label classes database. Our dataset has no subset of data pre-labeled according to the desired bins.
However, we ought to be able to use the Canadian government's job label classes to get started. Once the NLP system is functioning, we can begin to define our own categorization with a mix of manual and semi-automatic labeling.
I will try to provide code for generating appropriate bags of words for this purpose iteratively, if I have the time to program it. I will also write out my thoughts on the matter in case further code development is required.
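As a rough illustration of the bag-of-words idea described above, here is a minimal Python sketch. The class labels, the seed keywords, and the function names (`tokenize`, `classify`, `expand_bags`) are all hypothetical placeholders, not taken from the government database or from any existing Scribe code. It scores a job title by word overlap with each class's bag, and grows the bags from manually labeled titles.

```python
import re
from collections import Counter

# Hypothetical class labels and seed keywords, for illustration only.
CLASS_BAGS = {
    "management": {"manager", "director", "chief", "head", "vp"},
    "engineering": {"engineer", "developer", "software", "programmer"},
    "sales": {"sales", "account", "representative"},
}


def tokenize(title):
    """Lowercase a job title and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", title.lower())


def classify(title, class_bags=CLASS_BAGS):
    """Return the class whose bag shares the most words with the title.

    Returns None when no bag matches any word. Ties go to the class
    listed first in class_bags.
    """
    tokens = set(tokenize(title))
    scores = {label: len(tokens & bag) for label, bag in class_bags.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def expand_bags(labeled_titles, class_bags, min_count=2):
    """Return new bags extended with words seen in labeled titles.

    A word is added to a class's bag once it appears in at least
    min_count titles labeled with that class. Labels that are not in
    class_bags are skipped. The input bags are not modified.
    """
    counts = {label: Counter() for label in class_bags}
    for title, label in labeled_titles:
        if label in counts:
            counts[label].update(set(tokenize(title)))
    return {
        label: bag | {w for w, c in counts[label].items() if c >= min_count}
        for label, bag in class_bags.items()
    }
```

One iteration of the semi-automatic loop could then be: classify the unlabeled titles, hand-label a sample of the ones that come back as `None`, call `expand_bags` on those labels, and repeat. Words added this way should still be reviewed manually, since frequent but uninformative words (such as "senior") can leak into a bag.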
- Summary of set up
- Configuration
- Dependencies
- Database configuration
- How to run tests
- Deployment instructions