Added CSVData tests for io streams #327

gautomdas · 2021-07-13T19:46:04Z

Added tests that compare CSVData functionality when reading from io streams.

JGSweets · 2021-07-14T14:06:03Z

dataprofiler/tests/data_readers/test_csv_data.py

+            with self.assertRaisesRegex(ValueError,
+                                        '`header` must be one of following: auto, '
+                                        'none for no header, or a non-negative '
+                                        'integer for the row that represents the '
+                                        'header \(0 based index\)'):
+                csv_data = CSVData(filename, options=options)
+                first_value = csv_data.data.loc[0][0]
+
+            # set bad header setting
+            options = dict(header='abcdef')
+            with self.assertRaisesRegex(ValueError,
+                                        '`header` must be one of following: auto, '
+                                        'none for no header, or a non-negative '
+                                        'integer for the row that represents the '
+                                        'header \(0 based index\)'):
+                csv_data = CSVData(filename, options=options)
+                first_value = csv_data.data.loc[0][0]


We can remove the assertRaises checks, since they are already covered by the other test. If we did want to keep them, I would suggest moving them out of the for loop, as they only need to run once.
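
A minimal sketch of that suggestion, assuming a hypothetical `validate_header` helper that stands in for the option check CSVData performs (the real validation lives inside DataProfiler and is not reproduced here; the `-1` case is also illustrative). The invalid-option assertion runs once, outside any per-file loop, and the pattern is a raw string so that `\(` is read as a regex escape.

```python
import unittest


def validate_header(header):
    """Hypothetical stand-in for the header option check in CSVData."""
    valid = header in ('auto', None) or (
        isinstance(header, int)
        and not isinstance(header, bool)
        and header >= 0
    )
    if not valid:
        raise ValueError(
            '`header` must be one of following: auto, none for no header, '
            'or a non-negative integer for the row that represents the '
            'header (0 based index)'
        )
    return header


class TestHeaderOption(unittest.TestCase):

    def test_bad_header_option(self):
        # The option check does not depend on the input file, so it is
        # asserted once here instead of inside the loop over input files.
        expected = r'header \(0 based index\)'
        for bad_header in (-1, 'abcdef'):
            with self.subTest(header=bad_header):
                with self.assertRaisesRegex(ValueError, expected):
                    validate_header(bad_header)
```

Using `subTest` keeps each bad value reported separately while still living in a single test method.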

JGSweets · 2021-07-15T16:38:11Z

dataprofiler/tests/data_readers/test_csv_data.py

+                                input_file["path"])
+
+            with open(input_file['path'], 'r', encoding=input_file['encoding']) as fp:
+                byte_string = StringIO(fp.read())


Not a blocker for approval, but byte_string should probably be named buffer or stream, since it holds a StringIO object rather than bytes.

JGSweets

I think this is fine for now, but we should refactor immediately afterwards to clean up the tests and set a precedent for future ones.

Essentially, whether the data is read from a file or from a buffer, the results should be identical, which means the same tests can be used for both.

Something like this:

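from io import StringIO

@classmethod
def setUpClass(cls):
    # ... existing setup that builds cls.input_file_names ...

    cls.buffer_list = []
    for input_file in cls.input_file_names:
        # Create the stream here, e.g. by reading the file into a StringIO
        with open(input_file['path'], 'r',
                  encoding=input_file['encoding']) as fp:
            buffer = StringIO(fp.read())
        buffer_info = input_file.copy()
        buffer_info['path'] = buffer
        cls.buffer_list.append(buffer_info)

# ...
# Then, in the tests, also loop through cls.buffer_list with the existing
# tests rather than tests written specifically for streams.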

Again, not saying it should be this PR. We can refactor directly after.
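
One way the refactor could look, as a self-contained sketch rather than the project's actual test code. `load_rows` is a hypothetical stand-in for a reader such as CSVData that accepts either a path or an open stream; the temporary file, the `count` key, and the contents of `input_file_names` are illustrative assumptions.

```python
import csv
import os
import tempfile
import unittest
from io import StringIO


def load_rows(path_or_stream, encoding='utf-8'):
    """Hypothetical reader: accepts a file path or a text stream."""
    if isinstance(path_or_stream, str):
        with open(path_or_stream, 'r', encoding=encoding, newline='') as fp:
            return list(csv.reader(fp))
    path_or_stream.seek(0)  # rewind so the stream can be reused across tests
    return list(csv.reader(path_or_stream))


class TestReaderInputs(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        # Illustrative fixture: one small CSV file in a temporary directory.
        cls.tmp_dir = tempfile.TemporaryDirectory()
        path = os.path.join(cls.tmp_dir.name, 'sample.csv')
        with open(path, 'w', encoding='utf-8', newline='') as fp:
            fp.write('a,b\n1,2\n')
        cls.input_file_names = [dict(path=path, encoding='utf-8', count=2)]

        # Mirror every file entry with a stream-backed entry.
        cls.buffer_list = []
        for input_file in cls.input_file_names:
            with open(input_file['path'], 'r',
                      encoding=input_file['encoding'], newline='') as fp:
                buffer = StringIO(fp.read())
            buffer_info = input_file.copy()
            buffer_info['path'] = buffer
            cls.buffer_list.append(buffer_info)

    @classmethod
    def tearDownClass(cls):
        cls.tmp_dir.cleanup()

    def test_row_count(self):
        # The same assertion covers both file paths and streams.
        for source in self.input_file_names + self.buffer_list:
            with self.subTest(source=type(source['path']).__name__):
                rows = load_rows(source['path'], source['encoding'])
                self.assertEqual(source['count'], len(rows))
```

Because each buffer entry is a copy of the file entry with only `path` swapped, any existing test that iterates over the file list can iterate over the combined list without further changes.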

* Added CSVData tests for io streams
* Change Data to CSVData
* Fixed parameter issue and checked tests
* Added change to fix 3.6 bug
* Made small changes

Added CSVData tests for io streams

22c4820

gautomdas requested review from AnhTruong, ChrisWallace2020, grant-eden, JGSweets and lettergram as code owners July 13, 2021 19:46

gautomdas added 2 commits July 13, 2021 15:50

Change Data to CSVData

2ea210f

Fixed parameter issue and checked tests

e3b2248

JGSweets reviewed Jul 14, 2021

gautomdas added 2 commits July 15, 2021 12:32

Added change to fix 3.6 bug

8e449b2

Merge branch 'main' of https://github.com/capitalone/DataProfiler into csv_data_tests

905c0b5

gautomdas force-pushed the csv_data_tests branch from f0cf84e to 905c0b5 Compare July 15, 2021 16:35

JGSweets reviewed Jul 15, 2021

JGSweets enabled auto-merge (squash) July 15, 2021 16:48

Made small changes

eacc88e

auto-merge was automatically disabled July 15, 2021 16:49
Head branch was pushed to by a user without write access

JGSweets reviewed Jul 15, 2021

JGSweets enabled auto-merge (squash) July 15, 2021 16:53

JGSweets approved these changes Jul 15, 2021

ChrisWallace2020 approved these changes Jul 15, 2021

JGSweets merged commit 5116641 into capitalone:main Jul 15, 2021
