Fix categorical column after sequence_index column issue #357

fealho · 2021-03-23T02:13:33Z

Resolve #314.

csala · 2021-03-23T20:22:12Z

sdv/timeseries/deepecho.py

@@ -67,7 +67,8 @@ def _fit(self, timeseries_data):

        data_types = list()
        context_types = list()
-        for field, meta in self._metadata.get_fields().items():
+        for field in self._entity_columns + self._data_columns:
+            meta = self._metadata.get_fields()[field]


I would store the fields metadata in a variable (e.g. fields_metadata) before the loop, to avoid calling self._metadata.get_fields() on every iteration.
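As a rough illustration of the suggestion above, here is a minimal sketch that looks the fields up once and reuses the result inside the loop. `FakeMetadata` and `collect_types` are hypothetical stand-ins written for this example, not part of the SDV code base; only the `get_fields()` call and the `fields_metadata` name come from this thread.

```python
class FakeMetadata:
    """Hypothetical stand-in for the metadata object; it counts get_fields() calls."""

    def __init__(self, fields):
        self._fields = fields
        self.calls = 0

    def get_fields(self):
        self.calls += 1
        return dict(self._fields)


def collect_types(metadata, columns):
    """Return the 'type' entry of each column, calling get_fields() only once."""
    fields_metadata = metadata.get_fields()  # hoisted out of the loop
    types = []
    for column in columns:
        meta = fields_metadata[column]
        types.append(meta['type'])
    return types


metadata = FakeMetadata({
    'store': {'type': 'categorical'},
    'sales': {'type': 'numerical'},
})
types = collect_types(metadata, ['store', 'sales'])
```

With this shape the number of `get_fields()` calls no longer depends on how many columns are iterated.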

csala · 2021-03-23T20:22:51Z

sdv/timeseries/deepecho.py

@@ -67,7 +67,8 @@ def _fit(self, timeseries_data):

        data_types = list()
        context_types = list()
-        for field, meta in self._metadata.get_fields().items():
+        for field in self._entity_columns + self._data_columns:


I think this may not work: it changes the order of the columns, and it also leaves out the context_columns.

If the order of the key/value pairs returned by self._metadata.get_fields() is the problem, one option would be to iterate over self._output_columns instead (this is the list of columns from the input data).

Then, to solve the sequence_index problem, we could change line 74 (of the old code) from:

if field == self._sequence_index: data_types.append('continuous')

to:

if field == self._sequence_index: data_types.extend(['continuous', 'continuous'])

and then simply remove line 82 (of the old code).
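To make the proposal concrete, here is a minimal, self-contained sketch of the loop shape described above. It is not the actual `_fit` code: `build_types`, the `kinds` mapping, and the example column names are hypothetical, and the sketch assumes that entity columns are skipped and that the sequence index contributes two continuous entries, which is what the suggested `extend` call implies.

```python
def build_types(output_columns, kinds, sequence_index, context_columns, entity_columns):
    """Build the data/context type lists by walking the columns in input order."""
    data_types = []
    context_types = []
    for field in output_columns:
        if field == sequence_index:
            # Both entries are added at the position of the sequence index,
            # so nothing needs to be appended after the loop.
            data_types.extend(['continuous', 'continuous'])
        elif field in context_columns:
            context_types.append(kinds[field])
        elif field not in entity_columns:  # assumption: entity columns are not modeled
            data_types.append(kinds[field])
    return data_types, context_types


# Hypothetical table with a categorical column placed after the sequence index.
output_columns = ['store', 'region', 'date', 'category', 'sales']
kinds = {'region': 'categorical', 'category': 'categorical', 'sales': 'continuous'}

data_types, context_types = build_types(
    output_columns,
    kinds,
    sequence_index='date',
    context_columns=['region'],
    entity_columns=['store'],
)
```

Because the types are emitted in the same order as the input columns, a categorical column that follows the sequence index keeps its categorical type.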

csala · 2021-03-23T20:55:05Z

tests/integration/timeseries/test_par.py

+def test_column_after_date():
+    """Test that adding columns after the `sequence_index` column works."""
+    date = datetime.datetime.strptime('2020-01-01', '%Y-%m-%d')
+    daily_timeseries = pd.DataFrame({


I think it would be worth making this test slightly more complex, so that it covers multiple data types and includes both entity columns and context columns.

csala · 2021-09-14T11:35:50Z

tests/integration/timeseries/test_par.py

+    })
+
+    model = PAR(entity_columns=['col'], sequence_index='date', epochs=1)
+    model.fit(daily_timeseries)


Should we validate a bit more here? For example, check that the output types are actually right.
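One possible shape for that extra validation is sketched below, using plain pandas. The helper `assert_same_schema` and the `col2` column are hypothetical, and `sampled` is only a copy of the input standing in for whatever the fitted model would return from sampling.

```python
import pandas as pd


def assert_same_schema(real, sampled):
    """Check that sampled data has the same columns, order and dtypes as the real data."""
    assert list(sampled.columns) == list(real.columns)
    mismatched = {
        column: (str(real[column].dtype), str(sampled[column].dtype))
        for column in real.columns
        if real[column].dtype != sampled[column].dtype
    }
    assert not mismatched, f'dtype mismatches: {mismatched}'


real = pd.DataFrame({
    'date': pd.date_range('2020-01-01', periods=3),
    'col': ['a', 'a', 'a'],
    'col2': ['x', 'y', 'z'],  # hypothetical categorical column after the date
})

# Stand-in for the sampled output; in the real test this would come from the model.
sampled = real.copy()
assert_same_schema(real, sampled)
```

A check like this would fail if, for instance, the categorical column came back as floats.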

codecov-commenter · 2021-12-15T17:32:19Z

Codecov Report

Merging #357 (9678f8d) into master (643de0a) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #357   +/-   ##
=======================================
  Coverage   65.01%   65.01%           
=======================================
  Files          34       34           
  Lines        2590     2590           
=======================================
  Hits         1684     1684           
  Misses        906      906


fealho added 3 commits March 22, 2021 15:49

Fixes the issue

f649641

Add test

aa1ce10

Fix lint

c7859d4

fealho requested review from csala and pvk-developer March 23, 2021 02:17

csala suggested changes Mar 23, 2021

fealho added 2 commits September 7, 2021 15:52

Merge branch 'master' into par-wrong-type

8079d83

Addresses feedback/adds new test case

0dc3711

fealho requested a review from a team as a code owner September 8, 2021 04:02

fealho added 2 commits September 7, 2021 21:06

Fix lint/remove get_fields() from loop

091aee6

Changes fields to fields_metadata

92efecd

fealho requested a review from csala September 8, 2021 14:44

csala reviewed Oct 5, 2021

fealho added 2 commits October 6, 2021 12:10

Merge branch 'master' into par-wrong-type

baff90b

Add more validation to the test cases

6e3125f

fealho requested review from csala and pvk-developer and removed request for pvk-developer October 6, 2021 19:55

csala approved these changes Dec 14, 2021

Merge branch 'master' into par-wrong-type

9678f8d

pvk-developer approved these changes Dec 16, 2021

fealho merged commit f87f503 into master Dec 16, 2021

fealho deleted the par-wrong-type branch December 16, 2021 18:00
