
Added CSVData tests for io streams #327

Merged
merged 6 commits into from
Jul 15, 2021

Conversation

gautomdas

Added the tests to compare CSVData functionality with streams

Comment on lines 440 to 456
with self.assertRaisesRegex(ValueError,
                            '`header` must be one of following: auto, '
                            'none for no header, or a non-negative '
                            'integer for the row that represents the '
                            'header \(0 based index\)'):
    csv_data = CSVData(filename, options=options)
    first_value = csv_data.data.loc[0][0]

# set bad header setting
options = dict(header='abcdef')
with self.assertRaisesRegex(ValueError,
                            '`header` must be one of following: auto, '
                            'none for no header, or a non-negative '
                            'integer for the row that represents the '
                            'header \(0 based index\)'):
    csv_data = CSVData(filename, options=options)
    first_value = csv_data.data.loc[0][0]

@JGSweets JGSweets Jul 14, 2021


We can remove the assertRaises checks, since they are already covered in the other test. Additionally, if we did want these checks, I would suggest moving them out of the for loop, as we only need to test this once.
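A minimal sketch of the suggestion above: option validation does not depend on which input is being read, so it can be asserted once, outside the per-input loop. `FakeCSVData`, the `sources` list, and the bad value `-1` are hypothetical stand-ins so that the example runs on its own; only the error message and the `'abcdef'` value come from the excerpt under review.

```python
import re
import unittest
from io import StringIO

# Error text taken from the excerpt under review.
HEADER_ERROR = (
    "`header` must be one of following: auto, none for no header, or a "
    "non-negative integer for the row that represents the header "
    "(0 based index)"
)


class FakeCSVData:
    """Hypothetical stand-in for CSVData; it validates only the header option."""

    def __init__(self, source, options=None):
        header = (options or {}).get("header", "auto")
        valid = header in ("auto", None) or (
            isinstance(header, int)
            and not isinstance(header, bool)
            and header >= 0
        )
        if not valid:
            raise ValueError(HEADER_ERROR)
        self.source = source


class TestHeaderOption(unittest.TestCase):
    # Hypothetical inputs; the real suite iterates over its own file list.
    sources = [StringIO("a,b\n1,2\n"), StringIO("x,y\n3,4\n")]

    def test_bad_header_rejected_once(self):
        # Validation is input-independent, so a single source is enough.
        for bad_header in (-1, "abcdef"):
            with self.subTest(header=bad_header):
                with self.assertRaisesRegex(ValueError, re.escape(HEADER_ERROR)):
                    FakeCSVData(self.sources[0], options={"header": bad_header})

    def test_valid_header_for_every_source(self):
        # Input-dependent behaviour is what belongs in the per-source loop.
        for source in self.sources:
            with self.subTest(source=source):
                FakeCSVData(source, options={"header": 0})
```

Using `re.escape` on the message avoids hand-escaping the parentheses inside the pattern.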

input_file["path"])

with open(input_file['path'], 'r', encoding=input_file['encoding']) as fp:
    byte_string = StringIO(fp.read())

Not a blocker for approval, but byte_string should probably be named buffer or stream, since it holds a StringIO object rather than bytes.

@JGSweets JGSweets enabled auto-merge (squash) July 15, 2021 16:48
auto-merge was automatically disabled July 15, 2021 16:49

Head branch was pushed to by a user without write access


@JGSweets JGSweets left a comment


I think this is fine for now, but we should refactor immediately afterward to clean up the tests and set a precedent for future ones.

Essentially, whether the data is read from a file or from a buffer, the results should be identical, meaning the same tests can be used for both.

Something like this:

@classmethod
def setUpClass(cls):
    .
    .
    .

    cls.buffer_list = []
    for input_file in cls.input_file_names:
        #
        # Create stream here
        # buffer = ...
        buffer_info = input_file.copy()
        buffer_info['path'] = buffer
        cls.buffer_list.append(buffer_info)
        
.
.
.
# then in tests also loop through the buffer list with existing tests not made specifically for streams

Again, not saying it should be this PR. We can refactor directly after.
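For illustration, here is one way the refactor outlined above might look as a runnable sketch. It is not the project's actual test code: `read_first_line` is a hypothetical stand-in for the reader under test, the temporary CSV file and the `first_line` key are invented for the example, and `StringIO` is assumed as the stream type because that is what the reviewed snippet uses.

```python
import os
import tempfile
import unittest
from io import StringIO


def read_first_line(path_or_buffer, encoding="utf-8"):
    """Hypothetical reader under test: accepts a file path or a text stream."""
    if isinstance(path_or_buffer, str):
        with open(path_or_buffer, "r", encoding=encoding) as fp:
            return fp.readline().rstrip("\n")
    path_or_buffer.seek(0)  # rewind so earlier reads do not affect this one
    return path_or_buffer.readline().rstrip("\n")


class TestFileAndBufferInputs(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Hypothetical fixture: a single small CSV file in a temp directory.
        cls.tmp_dir = tempfile.TemporaryDirectory()
        path = os.path.join(cls.tmp_dir.name, "sample.csv")
        with open(path, "w", encoding="utf-8") as fp:
            fp.write("id,name\n1,a\n")
        cls.input_file_names = [
            dict(path=path, encoding="utf-8", first_line="id,name"),
        ]

        # Mirror every file entry with an in-memory stream of the same content.
        cls.buffer_list = []
        for input_file in cls.input_file_names:
            with open(input_file["path"], "r",
                      encoding=input_file["encoding"]) as fp:
                buffer = StringIO(fp.read())
            buffer_info = input_file.copy()
            buffer_info["path"] = buffer
            cls.buffer_list.append(buffer_info)

    @classmethod
    def tearDownClass(cls):
        cls.tmp_dir.cleanup()

    def test_first_line(self):
        # The same assertions run against file paths and buffers alike.
        for input_info in self.input_file_names + self.buffer_list:
            with self.subTest(source=type(input_info["path"]).__name__):
                self.assertEqual(
                    read_first_line(input_info["path"], input_info["encoding"]),
                    input_info["first_line"],
                )
```

Because each buffer entry is a copy of its file entry with only `path` swapped, any expected values stored alongside the path carry over to the stream case without duplication.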

@JGSweets JGSweets enabled auto-merge (squash) July 15, 2021 16:53
@JGSweets JGSweets merged commit 5116641 into capitalone:main Jul 15, 2021
stevensecreti pushed a commit to stevensecreti/DataProfiler that referenced this pull request Jun 15, 2022
* Added CSVData tests for io streams

* Change Data to CSVData

* Fixed parameter issue and checked tests

* Added change to fix 3.6 bug

* Made small changes