Evaluate the similarity code with multiple parameters and regimes
@corinne-hcr more examples of what the initial analysis might have looked like.

Build similarity models for:
- 100m, 300m, 500m
- all combinations of filtering (yes/no) and cutoffs (yes/no)
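
The parameter grid above could be enumerated roughly as follows. This is a minimal sketch, not the code from this commit: the names `RADII_M` and `build_configs` are hypothetical, and reading the distances as similarity radii in meters is an assumption.

```python
from itertools import product

# Assumption: the three distances are similarity radii in meters.
RADII_M = [100, 300, 500]


def build_configs(radii=RADII_M):
    """Enumerate every (radius, filter, cutoff) combination."""
    return [
        {"radius": r, "filter": f, "cutoff": c}
        for r, f, c in product(radii, [True, False], [True, False])
    ]


configs = build_configs()
# 3 radii x 2 filtering options x 2 cutoff options
print(len(configs))
```

Each dictionary would then be handed to whatever routine builds one similarity model, so that every model is tagged with the settings that produced it.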

Generate labels for all labeled trips
Determine ground truth by looking at: unique tuples and unique values for each
    of the user inputs
Use these models to compute the metrics (homogeneity score and request %)
    for all combinations, along with a few other metrics like the number of
    unique tuples, cluster_trip_pct, etc. At this point, we are focusing on
    ground truth from tuples since the homogeneity score is already fairly high.
    What we really need to do is to bring down the request %, or determine
    *why* the user % is so high so that we can fix it (e.g. polygon).
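
For reference, the two headline metrics might be computed as below. This is a sketch under stated assumptions rather than the notebook's actual code: `homogeneity` follows the standard entropy-based definition, while `request_pct` assumes one label request per cluster plus one per unclustered trip, with unclustered trips marked by a hypothetical `-1` cluster id.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a list of hashable labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def homogeneity(true_labels, cluster_ids):
    """1 - H(truth | cluster) / H(truth); 1.0 when every cluster is pure."""
    h_truth = entropy(true_labels)
    if h_truth == 0:
        return 1.0
    n = len(true_labels)
    joint = Counter(zip(cluster_ids, true_labels))
    sizes = Counter(cluster_ids)
    h_cond = -sum(
        (c / n) * log(c / sizes[k]) for (k, _), c in joint.items()
    )
    return 1.0 - h_cond / h_truth


def request_pct(cluster_ids, noise=-1):
    """Assumed definition: one request per cluster plus one per noise trip."""
    n_clusters = len({c for c in cluster_ids if c != noise})
    n_noise = sum(1 for c in cluster_ids if c == noise)
    return 100.0 * (n_clusters + n_noise) / len(cluster_ids)


# Ground truth as tuples of user inputs (illustrative values only).
truth = [("bike", "work"), ("bike", "work"),
         ("car", "shop"), ("car", "shop"), ("walk", "home")]
clusters = [0, 0, 1, 1, -1]  # last trip is unclustered

print(homogeneity(truth, clusters))
print(request_pct(clusters))
```

In this toy case every cluster is pure, so homogeneity is already at its maximum, yet three of the five trips would still trigger a request. That mirrors the situation described above, where the request % is the metric left to improve.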

Some results in:
#28 (comment)
shankari committed Jul 25, 2021
1 parent 1190416 commit abf4f78
Showing 1 changed file with 973 additions and 0 deletions.