Analyses regarding Duplicate Reimbursements #286
base: master
…-amor # Conflicts: # conda_requirements.txt # research/Dockerfile # research/requirements.txt
# Including pdf > png # png > sift descriptors # png > keras classifier
PDF to PNG ok PNG to SIFT (error in opencv)
Change the workflow for png references
Download files OK Split files OK Testing training ...
…-amor # Conflicts: # research/Dockerfile
# Building Reference Dataset ok # Building Keras model and evaluation OK # PDF-> PNG OK
# Using dhash to detect near duplications.
# Inclusion of Fourier transformation to detect rotation, zoom, and filters.
Hi @silviodc, thanks for the contribution! What I did to test this PR:
$ git clone git@github.com:datasciencebr/serenata-de-amor.git
$ cd serenata-de-amor
$ git checkout -b silviodc-silvio-cardoso master
$ git pull https://github.com/silviodc/serenata-de-amor.git silvio-cardoso
$ conda update conda
$ conda create --name serenata_de_amor python=3
$ source activate serenata_de_amor
$ ./setup
$ jupyter notebook
I really liked your work on it, it looks really impressive! There is only one thing that I'll ask of you, and then, for me, we can merge it!
research/requirements.txt
tensorflow>=1.2.1
h5py>=2.7.0
Pillow>=4.2.1
opencv-python
Can you add the versions of the libraries that you are using? It helps in case they change something that breaks it ;)
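One way to look up the installed versions to pin, as a minimal sketch: the helper name `pinned_lines` is hypothetical, and the exact-pin `==` style is an assumption (the file currently uses `>=` for the other libraries).

```python
from importlib.metadata import version


def pinned_lines(packages, get_version=version):
    """Return 'name==version' lines for the given installed packages.

    get_version defaults to importlib.metadata.version, which raises
    PackageNotFoundError if a package is not installed in the environment.
    """
    return [f"{name}=={get_version(name)}" for name in packages]


# Example call (output depends on the local environment):
# print("\n".join(pinned_lines(["opencv-python"])))
```

The resulting lines can be pasted into `research/requirements.txt` in place of the unversioned entry.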
Done!
No. I guess I finished with these analyses.
d02bbbd to 3f76ad6
Detecting duplicate Reimbursements using dhash.
The last commit has the notebook to detect duplicate Reimbursements. It uses dhash and Hamming distance.
The other files concern a future implementation, such as a CFMT (Compact Fourier-Mellin Transform) block, to make the detection more precise.
It is related to issue #32.
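For readers unfamiliar with the approach, here is a rough sketch of how a difference hash and Hamming distance fit together. It is not the notebook's code: the function names and the `max_distance` threshold are illustrative assumptions, and the input is assumed to be a grayscale image already shrunk to a small grid (commonly 8 rows of 9 pixels).

```python
def dhash_bits(gray_rows):
    """Difference hash of a small grayscale grid.

    gray_rows: list of rows of brightness values; each row yields
    len(row) - 1 bits, so 8 rows of 9 pixels give a 64-bit hash.
    """
    bits = 0
    for row in gray_rows:
        for left, right in zip(row, row[1:]):
            # One bit per adjacent pair: 1 if brightness drops left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of bit positions where the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_near_duplicate(hash_a, hash_b, max_distance=5):
    # max_distance is an illustrative threshold, not a value from this PR.
    return hamming_distance(hash_a, hash_b) <= max_distance
```

With Pillow, the grid would typically come from something like `Image.open(path).convert("L").resize((9, 8))`. Two receipts whose hashes differ in only a few bits are then flagged as likely duplicates.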