2019 uses different units for volume than other years/branches #5

Closed
fearofcode opened this issue May 3, 2020 · 5 comments

@fearofcode

Thank you for creating this repository.

https://github.com/FX-Data/FX-Data-EURUSD-DS/blob/EURUSD-2008/EURUSD/2008/01/2008-01-01--01h_ticks.csv shows volume in what looks like millions.

The 2019 data, however, appears to be in flat units. https://github.com/FX-Data/FX-Data-EURUSD-DS/blob/EURUSD-2019/EURUSD/2019/01/2019-01-01--22h_ticks.csv

Are there any other inconsistencies to be aware of if you wanted to combine multiple years into a single, consistent dataset?
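When combining years, one rough way to tell which scale a given file uses is to look at the typical size of its volume columns. The sketch below is only a heuristic and is not part of the repository's scripts: the function name `guess_volume_scale`, the threshold of 1000, and the assumption that the volumes sit in the 4th and 5th comma-separated columns are all illustrative choices.

```python
import csv
import io
from statistics import median


def guess_volume_scale(csv_text, threshold=1000.0):
    """Guess whether volumes are raw units ('full') or scaled ('millions').

    Assumptions (hypothetical, adjust to the real files):
    - each row has at least 5 comma-separated columns,
    - columns 4 and 5 hold the volumes,
    - typical raw volumes are far above `threshold`, scaled ones far below.
    """
    volumes = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 5:
            continue  # too short to contain both volume columns
        try:
            volumes.extend((float(row[3]), float(row[4])))
        except ValueError:
            continue  # skip a header or malformed line
    if not volumes:
        raise ValueError("no volume values found")
    return "full" if median(volumes) >= threshold else "millions"
```

Running this over one file per year (for example on `Path(...).read_text()`) would flag the branches whose scale differs, though the threshold would need checking against real data before relying on it.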

@kenorb
Member

kenorb commented May 8, 2020

Thanks for the report. The data is downloaded as-is from the Dukascopy endpoints using a Python script, without changing the original values, e.g.:

./dl_bt_dukascopy.py -c -p EURUSD

Possibly they changed the volume format in recent years. Maybe they decided that providing full volume values makes more sense.

If there is some inconsistency in volumes, it needs to be fixed manually or in the script.

@kenorb
Member

kenorb commented May 8, 2020

Which one do you suggest should be the base format: values in millions (with a decimal separator) or full values?

@fearofcode
Author

Yes, I think the change in question is this one: FX31337/FX-BT-Scripts@d3e46bb#diff-e39272740e5f05eca41cf05889724b89R357-R358

I would keep the original data as is and not round it.

@kenorb
Member

kenorb commented May 9, 2020

OK, thanks for identifying the issue. I'll try to regenerate the files soon using the original values.

@kenorb
Member

kenorb commented Jun 15, 2020

The scripts have been fixed.

I've fixed the data files in this repo manually.

Command for reference:

find . -name "*.csv" -exec bash -c 'awk -F, '\''{print $1","$2","$3","($4/1000000)","($5/1000000)}'\'' {} > {}.new && mv {}.new {}' ';'
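For readers who prefer Python, the same per-line transformation can be sketched as below. This is an illustration, not the command that was actually run: the names `scale_line` and `rewrite_csv` are hypothetical, and it assumes every line has five comma-separated fields with numeric volumes in the 4th and 5th (so no header row). It uses `Decimal` because awk converts non-integer results to strings with `%.6g` by default, which would round volumes that need more than six significant digits.

```python
from decimal import Decimal
from pathlib import Path

MILLION = Decimal(1_000_000)


def scale_line(line: str) -> str:
    """Divide the 4th and 5th CSV fields by one million, keeping exact digits."""
    fields = line.rstrip("\n").split(",")
    # Assumption: columns 4 and 5 are the volumes, as in the awk command.
    for i in (3, 4):
        # "f" formatting avoids scientific notation in the output.
        fields[i] = format(Decimal(fields[i]) / MILLION, "f")
    # Like the awk command, emit only the first five fields.
    return ",".join(fields[:5])


def rewrite_csv(path: Path) -> None:
    """Rewrite one file via a temporary '.new' file, then move it into place."""
    lines = path.read_text().splitlines()
    tmp = path.with_name(path.name + ".new")
    tmp.write_text("".join(scale_line(line) + "\n" for line in lines))
    tmp.replace(path)
```

Applying it to a tree would look like `for p in Path(".").rglob("*.csv"): rewrite_csv(p)`, ideally on a copy of the data first.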

Binary files for the MT platform have been regenerated (build: 698734474).

Tested with this build.

@kenorb kenorb removed the question label Jun 16, 2020
@kenorb kenorb closed this as completed Jun 16, 2020
kenorb added a commit that referenced this issue Jul 10, 2022