I have the following dataset (only a sample is shown here):
id,date,participants
250,1517344239,2
418,1497884457,6
63,1515513662,3
67,1498667379,2
498,1503235860,2
45,1501160446,10
61,1515016822,3
1,1515169968,2
563,1497884443,8
184,1523390349,3
42,1516111608,3
85,1516095293,2
498,1503531487,2
id - identifies a virtual meeting (each id identifies a series of meetings that occurred; some occur once every week, some every day, some once every n weeks, and so on). I am treating each meeting id as one series. I hope that's fine.
date - time (UTC) when the meeting happened
participants - number of participants that joined the meeting
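To make the "one series per id" idea concrete, here is a minimal sketch that groups the sample rows by id and sorts each group by date. It uses only the standard library; the helper name `group_by_meeting` is made up for illustration and is not part of tsfresh.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the dataset above.
SAMPLE = """id,date,participants
250,1517344239,2
418,1497884457,6
63,1515513662,3
67,1498667379,2
498,1503235860,2
45,1501160446,10
61,1515016822,3
1,1515169968,2
563,1497884443,8
184,1523390349,3
42,1516111608,3
85,1516095293,2
498,1503531487,2
"""


def group_by_meeting(csv_text):
    """Return {meeting id: [(date, participants), ...]}, each list sorted by date."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        series[int(row["id"])].append((int(row["date"]), int(row["participants"])))
    for observations in series.values():
        observations.sort()  # tuples sort by date first
    return dict(series)


meetings = group_by_meeting(SAMPLE)
# Number of occurrences per meeting id.
lengths = {meeting_id: len(obs) for meeting_id, obs in meetings.items()}
```

In this sample only id 498 has more than one occurrence, so checking `lengths` on the full dataset is a quick way to see how many ids actually form a series with several points.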
I am trying to see whether I can apply time series analysis to this kind of data to predict the number of users who might join a future occurrence. I am exploring tsfresh to see if I can extract more features from my data to improve my prediction (it's a regression problem). This is the code:
import pandas as pd
from tsfresh import extract_features

series = pd.read_csv('meetings.data')
X = series[[col for col in series.columns if col != 'participants']]
y = series['participants']
extracted_features = extract_features(X, column_id='id', column_sort='date')
This results in the following error:
Traceback (most recent call last):
File "tsfresh_predict.py", line 16, in <module>
extracted_features = extract_features(X, column_id='id', column_sort='date')
File "/Users/vganapathy/mcufe/dev3/lib/python3.6/site-packages/tsfresh/feature_extraction/extraction.py", line 152, in extract_features
distributor=distributor)
File "/Users/vganapathy/mcufe/dev3/lib/python3.6/site-packages/tsfresh/feature_extraction/extraction.py", line 233, in _do_extraction
function_kwargs=kwargs)
File "/Users/vganapathy/mcufe/dev3/lib/python3.6/site-packages/tsfresh/utilities/distribution.py", line 142, in map_reduce
total_number_of_expected_results = math.ceil(data_length / chunk_size)
ZeroDivisionError: division by zero
In my data, the time of the meeting is the only parameter I have. I would just like to know what is causing this exception.
You get the ZeroDivisionError because you pass no data. You extract features from the X dataframe, but it consists solely of the columns id and date, so no value column is left to extract features from.
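The effect can be illustrated with a small standard-library sketch. It is not tsfresh's code: `value_columns` and `expected_results` are hypothetical helpers, and the assumption that an empty input leads to a chunk size of 0 is inferred from the traceback, not taken from the library source. Only the `math.ceil(data_length / chunk_size)` expression comes from the traceback itself.

```python
import math


def value_columns(columns, column_id, column_sort):
    """Columns left to act as time series values once id and sort are set aside."""
    return [c for c in columns if c not in (column_id, column_sort)]


def expected_results(data_length, chunk_size):
    # Same expression as the failing line in the traceback.
    return math.ceil(data_length / chunk_size)


# X from the question only has these two columns.
remaining = value_columns(["id", "date"], "id", "date")

data_length = len(remaining)  # nothing left to process
chunk_size = data_length      # assumption: a chunk size derived from empty input is 0

try:
    expected_results(data_length, chunk_size)
    raised = False
except ZeroDivisionError:
    raised = True
```

With the participants column kept in, `value_columns(["id", "date", "participants"], "id", "date")` returns one column, so there is something to extract features from.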
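You need to pass your series dataframe instead of X and specify a column_value, i.e. call extract_features with column_id='id', column_sort='date' and column_value='participants'.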