TST: failing windows parser test #7623

Closed
jreback opened this issue Jun 30, 2014 · 7 comments
Labels
IO CSV (read_csv, to_csv); Testing (pandas testing functions or related to the test suite)
Milestone

Comments

@jreback
Contributor

jreback commented Jun 30, 2014

This test is failing on all versions of Windows (but not Linux).

related to #7582, #7591
cc @AmrAS1
cc @mcwitt

not sure what is causing the failure

FAIL: test_concat_invalid_first_argument (pandas.tools.tests.test_merge.TestConcatenate)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tools\tests\test_merge.py", line 2107, in test_concat_invalid_first_argument
    assert_frame_equal(result,expected)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\util\testing.py", line 641, in assert_frame_equal
    check_exact=check_exact)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\util\testing.py", line 588, in assert_series_equal
    assert_almost_equal(left.values, right.values, check_less_precise)
  File "testing.pyx", line 58, in pandas._testing.assert_almost_equal (pandas\src\testing.c:2465)
  File "testing.pyx", line 93, in pandas._testing.assert_almost_equal (pandas\src\testing.c:1793)
  File "testing.pyx", line 69, in pandas._testing.assert_almost_equal (pandas\src\testing.c:1489)
AssertionError: nan != 'bar'

can you guys take a look...thxs

Here's some debug output

C:\Users\Jeff Reback\Documents\GitHub\pandas>c:\python27-64\Scripts\nosetests.exe build\lib.win-amd64-2.7\pandas\tools\tests\test_merge.py --pdb --pdb-failure
................> c:\users\jeff reback\documents\github\pandas\testing.pyx(69)pandas._testing.assert_almost_equal (pandas\src\testing.c:1489)()
(Pdb) u
> c:\users\jeff reback\documents\github\pandas\testing.pyx(93)pandas._testing.assert_almost_equal (pandas\src\testing.c:1793)()
(Pdb) u
> c:\users\jeff reback\documents\github\pandas\testing.pyx(58)pandas._testing.assert_almost_equal (pandas\src\testing.c:2465)()
(Pdb) u
> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-2.7\pandas\util\testing.py(588)assert_series_equal()
-> assert_almost_equal(left.values, right.values, check_less_precise)
(Pdb) u
> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-2.7\pandas\util\testing.py(641)assert_frame_equal()
-> check_exact=check_exact)
(Pdb) u
> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-2.7\pandas\tools\tests\test_merge.py(2107)test_concat_invalid_first_argument()
-> assert_frame_equal(result,expected)
(Pdb) l
2102    """
2103
2104            reader = read_csv(StringIO(data), chunksize=1)
2105            result = concat(reader, ignore_index=True)
2106            expected = read_csv(StringIO(data))
2107 ->         assert_frame_equal(result,expected)
2108
2109    class TestOrderedMerge(tm.TestCase):
2110
2111        def setUp(self):
2112            self.left = DataFrame({'key': ['a', 'c', 'e'],
(Pdb) p result
  index     A   B   C   D
0   foo     2   3   4   5
1   NaN  ?¶  n NaN NaN
2   baz    12  13  14  15
3   qux    12  13  14  15
4  foo2    12  13  14  15
5  bar2    12  13  14  15
(Pdb) p data
'index,A,B,C,D\nfoo,2,3,4,5\nbar,7,8,9,10\nbaz,12,13,14,15\nqux,12,13,14,15\nfoo2,12,13,14,15\nbar2,12,13,14,15\n'
(Pdb) p expected
  index   A   B   C   D
0   foo   2   3   4   5
1   bar   7   8   9  10
2   baz  12  13  14  15
3   qux  12  13  14  15
4  foo2  12  13  14  15
5  bar2  12  13  14  15
(Pdb) !reader = read_csv(StringIO(data),chunksize=1)
(Pdb) p list(reader)
[  index  A  B  C  D
0   foo  2  3  4  5,    index     A   B   C   D
0    NaN  ?¶  n NaN NaN,   index   A   B   C   D
0   baz  12  13  14  15,   index   A   B   C   D
0   qux  12  13  14  15,   index   A   B   C   D
0  foo2  12  13  14  15,   index   A   B   C   D
0  bar2  12  13  14  15]
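
For anyone who wants to try this outside the test suite, here is a minimal standalone sketch of what the test does. It assumes a current pandas where `pd.testing.assert_frame_equal` is available (the 2014 tree used `pandas.util.testing`); the `data` string is the one printed by `p data` above.

```python
from io import StringIO

import pandas as pd

# Same CSV text as printed by `p data` in the pdb session above.
data = (
    "index,A,B,C,D\n"
    "foo,2,3,4,5\n"
    "bar,7,8,9,10\n"
    "baz,12,13,14,15\n"
    "qux,12,13,14,15\n"
    "foo2,12,13,14,15\n"
    "bar2,12,13,14,15\n"
)

# Read one row per chunk, then stitch the chunks back together.
reader = pd.read_csv(StringIO(data), chunksize=1)
result = pd.concat(reader, ignore_index=True)

# Reading the whole buffer in one go should produce the same frame.
expected = pd.read_csv(StringIO(data))

# Raises AssertionError if the chunked read differs from the full read.
pd.testing.assert_frame_equal(result, expected)
```

On a build that shows the problem, the assertion should fail on the second data row, as in the traceback above.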

jreback added this to the 0.14.1 milestone Jun 30, 2014
@mcwitt
Contributor

mcwitt commented Jun 30, 2014

oof... I suspect this is related to #7591 because the 2nd line after the header is getting corrupted. I don't see anything yet but will keep looking. Does this same line get corrupted for other inputs as well?
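
One way to check whether other inputs are affected is sketched below. It is only a rough diagnostic, not something from the pandas test suite: the helper name `mismatched_rows` and the second sample input are made up for illustration, and it assumes a current pandas.

```python
from io import StringIO

import pandas as pd


def mismatched_rows(data, chunksize):
    """Return positions of rows where a chunked read differs from a full read.

    Hypothetical helper for illustration; not part of pandas.
    """
    expected = pd.read_csv(StringIO(data))
    chunks = pd.read_csv(StringIO(data), chunksize=chunksize)
    result = pd.concat(chunks, ignore_index=True)

    # If the overall layout differs, report every row as suspect.
    if list(result.columns) != list(expected.columns) or len(result) != len(expected):
        return list(range(max(len(result), len(expected))))

    # Compare as strings so NaN and mixed dtypes do not break the comparison.
    same = (result.astype(str) == expected.astype(str)).all(axis=1)
    return [i for i, ok in enumerate(same) if not ok]


samples = {
    # The input from the failing test.
    "original": (
        "index,A,B,C,D\nfoo,2,3,4,5\nbar,7,8,9,10\nbaz,12,13,14,15\n"
        "qux,12,13,14,15\nfoo2,12,13,14,15\nbar2,12,13,14,15\n"
    ),
    # A made-up smaller input with a different shape.
    "small": "a,b\n1,2\n3,4\n5,6\n",
}

for name, text in samples.items():
    for size in (1, 2, 3):
        print(name, size, mismatched_rows(text, size))
```

An empty list means the chunked read matched the full read for that input and chunk size; any listed position points at a row worth inspecting.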

@jreback
Contributor Author

jreback commented Jun 30, 2014

that's the only failing test; maybe it has to do with the 'backing up' in the parse stream (note the chunksize is 1)?

@mcwitt
Contributor

mcwitt commented Jun 30, 2014

hmm, there shouldn't be any backing up in this case because the 3rd row has the same number of fields as the header... any idea what could be different on Windows?

@jreback
Contributor Author

jreback commented Jun 30, 2014

no idea... want to try to set up a Windows build (it's not that hard)? see here: https://github.com/pydata/pandas/wiki/Building-On-Windows

@jreback
Contributor Author

jreback commented Jul 2, 2014

@mcwitt any luck?

@mcwitt
Contributor

mcwitt commented Jul 2, 2014

No luck yet... Unfortunately the timing was really bad on this one as now I'm traveling with limited internet access for 2 weeks (typing this from my phone). I'll be able to give this my full attention when I get back, but in the meantime if it's a priority maybe someone else could look at it or even revert the bad commit? (Have we established that it's definitely #7591?)

@jreback
Contributor Author

jreback commented Jul 2, 2014

reverted #7591 (original issue was reopened)

jreback closed this as completed Jul 2, 2014
2 participants