SeriesFinder

This is the code for paper "Learning to detect patterns of crime" (http://web.mit.edu/rudin/www/WangRuWaSeECML13.pdf).

The Series Finder model has two components. First, run learn_lambda.m to learn the pattern-general coefficients lambda, and then run find.m to grow series from given seeds.

Step 0: pre-computed similarity of each feature

In our project, we precomputed 11 feature matrix: LocEntry,MnsEntry, dayapart, HB_dis, premises,ransacked,residents, timeframe,dayweek,suspect and victim. For Ncrimes, these are Ncrimes by Ncrimes matrix where an element on i-th row and j-th column represents the similaerity between the i-th crime and j-th crime for that particular feature. Since the similarity matrix is symmetric, we only keep the upper triangular of each matrix and set the lower half to be 0.

Step 1: learn_lambda.m

Learn_lambda.m finds the optimal lambda. It initializes the model with a randomly generated starting point lambda_init, and uses coordinate ascent to gradually improve the objective until it converges. Since this could lead to a local optima, we initialize the search with different randomly generated starting points

trainSeries: a list of series, where trainSeries(t,:) contains the index of crimes in series t
NtrainSeries: the number of training series
stepTable: a table that contains results at different lambdas. For example, stepTable(j,r) stores the results when multiply the j-th feature by scales(r)
Maxlen: the maximum number of crime that can be discovered in a series
maxIter:maximum number of steps
Nchain: the number of times the searching is run from with a randomly generated starting point
scales: the scales that are used to be multiplied with each feature

Step 2: find.m

find.m grows series of crimes from seeds provided by users. Every time, the model computes the pattern-crime similarity between the current pattern and all candidate crimes, and choose the one that has the highest similarity, adds it to the pattern and update the pattern-specific coefficients eta accordingly. The series keeps growing until the cohesion drops below certain cuttoff or the size of the series is bigger than the maximum length allowed

Ncrimes: the number of all crimes from which we want to discover series.
Nseries: the number of series we want to discover
cutoff: the cutoff for cohesion of a series where if the cohesion is below the cutoff, the series stops growing
Maxlen:the maximum number of crime that can be discovered in a series

Name		Name	Last commit message	Last commit date
Latest commit History 49 Commits
.DS_Store		.DS_Store
PremisesIndex.m		PremisesIndex.m
README.md		README.md
compute_HB_dis.m		compute_HB_dis.m
compute_LocEntry.m		compute_LocEntry.m
compute_MnsEntry.m		compute_MnsEntry.m
compute_daysapart.m		compute_daysapart.m
compute_dayweek.m		compute_dayweek.m
compute_premises.m		compute_premises.m
compute_ransacked.m		compute_ransacked.m
compute_residents.m		compute_residents.m
compute_suspect.m		compute_suspect.m
compute_timeframe.m		compute_timeframe.m
compute_victim.m		compute_victim.m
find.m		find.m
learn_lambda.m		learn_lambda.m

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SeriesFinder

Step 0: pre-computed similarity of each feature

Step 1: learn_lambda.m

Step 2: find.m

About

Releases

Packages

Languages

wangtongada/SeriesFinder

Folders and files

Latest commit

History

Repository files navigation

SeriesFinder

Step 0: pre-computed similarity of each feature

Step 1: learn_lambda.m

Step 2: find.m

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages