🐛 Bug Description
Normalizing US stock data crashes with KeyError: None partway through the run.
To Reproduce
Steps to reproduce the behavior:
python collector.py normalize_data --region US --max_workers=4 --interval 1d --source_dir /home/jovyan/quants/qlib-data/us_source --normalize_dir /home/jovyan/quants/qlib-data/us_data_normal
Expected Behavior
Normalizing the data without crashing
Screenshot
2022-07-15 11:09:48.999 | WARNING | collector:normalize_yahoo:425 - MCH change is abnormal for 18 consecutive days, please check the specific data file carefully
61%|█████████████████████████████████████████████████▌ | 6841/11175 [04:14<02:41, 26.89it/s]
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/conda/envs/qlib/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/opt/conda/envs/qlib/lib/python3.8/concurrent/futures/process.py", line 198, in _process_chunk
return [fn(*args) for args in chunk]
File "/opt/conda/envs/qlib/lib/python3.8/concurrent/futures/process.py", line 198, in <listcomp>
return [fn(*args) for args in chunk]
File "/home/jovyan/quants/qlib/scripts/data_collector/base.py", line 293, in _executor
df = self._normalize_obj.normalize(df)
File "/home/jovyan/quants/qlib/scripts/data_collector/yahoo/collector.py", line 475, in normalize
df = super(YahooNormalize1d, self).normalize(df)
File "/home/jovyan/quants/qlib/scripts/data_collector/yahoo/collector.py", line 440, in normalize
df = self.normalize_yahoo(df, self._calendar_list, self._date_field_name, self._symbol_field_name)
File "/home/jovyan/quants/qlib/scripts/data_collector/yahoo/collector.py", line 391, in normalize_yahoo
symbol = df.loc[df[symbol_field_name].first_valid_index(), symbol_field_name]
File "/opt/conda/envs/qlib/lib/python3.8/site-packages/pandas/core/indexing.py", line 925, in __getitem__
return self._getitem_tuple(key)
File "/opt/conda/envs/qlib/lib/python3.8/site-packages/pandas/core/indexing.py", line 1100, in _getitem_tuple
return self._getitem_lowerdim(tup)
File "/opt/conda/envs/qlib/lib/python3.8/site-packages/pandas/core/indexing.py", line 838, in _getitem_lowerdim
section = self._getitem_axis(key, axis=i)
File "/opt/conda/envs/qlib/lib/python3.8/site-packages/pandas/core/indexing.py", line 1164, in _getitem_axis
return self._get_label(key, axis=axis)
File "/opt/conda/envs/qlib/lib/python3.8/site-packages/pandas/core/indexing.py", line 1113, in _get_label
return self.obj.xs(label, axis=axis)
File "/opt/conda/envs/qlib/lib/python3.8/site-packages/pandas/core/generic.py", line 3776, in xs
loc = index.get_loc(key)
File "/opt/conda/envs/qlib/lib/python3.8/site-packages/pandas/core/indexes/range.py", line 388, in get_loc
raise KeyError(key)
KeyError: None
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "collector.py", line 1203, in <module>
fire.Fire(Run)
File "/opt/conda/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/opt/conda/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/opt/conda/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "collector.py", line 1022, in normalize_data
super(Run, self).normalize_data(
File "/home/jovyan/quants/qlib/scripts/data_collector/base.py", line 427, in normalize_data
yc.normalize()
File "/home/jovyan/quants/qlib/scripts/data_collector/base.py", line 306, in normalize
for _ in worker.map(self._executor, file_list):
File "/opt/conda/envs/qlib/lib/python3.8/concurrent/futures/process.py", line 484, in _chain_from_iterable_of_lists
for element in iterable:
File "/opt/conda/envs/qlib/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
yield fs.pop().result()
File "/opt/conda/envs/qlib/lib/python3.8/concurrent/futures/_base.py", line 437, in result
return self.__get_result()
File "/opt/conda/envs/qlib/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
KeyError: None
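Reading the traceback, the failing line passes the result of first_valid_index() straight to df.loc. A plausible cause (an assumption, not confirmed against the actual source files) is a source file whose symbol column is empty or entirely NaN, so first_valid_index() returns None. The sketch below reproduces that with toy frames; the column name "symbol", the helper names, and the sample values are illustrative only and are not taken from qlib.

```python
import pandas as pd


def first_symbol(df, symbol_field_name="symbol"):
    # Same lookup pattern as the failing line in normalize_yahoo.
    return df.loc[df[symbol_field_name].first_valid_index(), symbol_field_name]


def safe_first_symbol(df, symbol_field_name="symbol"):
    # Hypothetical guard: return None instead of raising when there is
    # no valid symbol value in the frame.
    idx = df[symbol_field_name].first_valid_index()
    if idx is None:
        return None
    return df.loc[idx, symbol_field_name]


def lookup_fails(df):
    # True when the unguarded lookup raises KeyError.
    try:
        first_symbol(df)
    except KeyError:
        return True
    return False


# Toy frames (made-up values): one usable, one with no rows,
# one whose symbol column is entirely NaN.
ok = pd.DataFrame({"symbol": ["MCH", "MCH"], "close": [1.0, 1.1]})
empty = pd.DataFrame(columns=["symbol", "close"])
all_nan = pd.DataFrame({"symbol": [float("nan"), float("nan")], "close": [1.0, 1.1]})

for name, frame in [("ok", ok), ("empty", empty), ("all_nan", all_nan)]:
    print(name, "fails:", lookup_fails(frame), "guarded:", safe_first_symbol(frame))
```

If this is what is happening, the crash would come from one specific input file rather than from the multiprocessing layer, which only re-raises the worker's exception.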
Environment
Note: Users can run
cd scripts && python collect_info.py all
under the project directory to get system information and paste it here directly.
Linux
x86_64
Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.10
#1 SMP Wed Mar 2 00:30:59 UTC 2022
Python version: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10) [GCC 10.3.0]
Qlib version: 0.8.6.99
numpy==1.23.1
pandas==1.4.3
scipy==1.8.1
requests==2.28.1
sacred==0.8.2
python-socketio==5.6.0
redis==4.3.4
python-redis-lock==3.7.0
schedule==1.1.0
cvxpy==1.2.1
hyperopt==0.1.2
fire==0.4.0
statsmodels==0.13.2
xlrd==2.0.1
plotly==5.5.0
matplotlib==3.5.1
tables==3.7.0
pyyaml==6.0
mlflow==1.27.0
tqdm==4.64.0
loguru==0.6.0
lightgbm==3.3.2
tornado==6.1
joblib==1.1.0
fire==0.4.0
ruamel.yaml==0.17.21
Additional Notes
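To narrow down which input triggers the KeyError, a scan of the source directory for files with no usable symbol value may help. This is only a sketch under assumptions: it assumes the source files are CSVs with a column named "symbol", and find_unusable_csvs is a hypothetical helper, not part of the qlib scripts.

```python
from pathlib import Path

import pandas as pd


def find_unusable_csvs(source_dir, symbol_field_name="symbol"):
    """Return (file name, reason) pairs for CSVs with no valid symbol value."""
    bad = []
    for path in sorted(Path(source_dir).glob("*.csv")):
        try:
            df = pd.read_csv(path)
        except pd.errors.EmptyDataError:
            # Zero-byte file: pandas cannot even read a header.
            bad.append((path.name, "empty file"))
            continue
        if symbol_field_name not in df.columns:
            bad.append((path.name, "no symbol column"))
        elif df[symbol_field_name].first_valid_index() is None:
            # Header only, or every symbol value is NaN.
            bad.append((path.name, "no valid symbol value"))
    return bad
```

It could be pointed at the --source_dir used in the command above, for example find_unusable_csvs("/home/jovyan/quants/qlib-data/us_source"), and any files it reports inspected by hand.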