BUG: Respect na_rep in DataFrame.to_latex() when used with formatters (#9046) #25799
pandas/io/formats/format.py
@@ -1047,7 +1047,8 @@ def get_result_as_array(self):
"""

if self.formatter is not None:
return np.array([self.formatter(x) for x in self.values])
return np.array([self.formatter(x) if not isna(x) else self.na_rep
construct the array (as object), then set the NaNs via a mask
I've changed that list comp. to a np.array then mask. Is that what you mean?
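For readers following the review, the "object array, then mask" suggestion can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `format_with_na_rep` is a hypothetical standalone helper, whereas the real change lives inside `get_result_as_array` and uses `self.formatter`, `self.values` and `self.na_rep`.

```python
import numpy as np
import pandas as pd


def format_with_na_rep(values, formatter, na_rep="NaN"):
    """Hypothetical helper: format non-missing values, fill the rest with na_rep."""
    values = np.asarray(values, dtype=object)
    mask = pd.isna(values)  # boolean mask of missing entries

    # Build the result as an object array so it can hold arbitrary strings.
    result = np.empty(len(values), dtype=object)
    # The formatter is only ever called on non-missing values.
    result[~mask] = [formatter(x) for x in values[~mask]]
    # Missing entries are set in one step via the mask.
    result[mask] = na_rep
    return result


formatted = format_with_na_rep([1.5, np.nan, 3.0], "{:.1f}".format, na_rep="-")
print(list(formatted))
```

One consequence of this design is that the user's formatter never sees a missing value, which is exactly the behavior change debated further down the thread.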
Codecov Report
@@ Coverage Diff @@
## master #25799 +/- ##
=======================================
Coverage 41.77% 41.77%
=======================================
Files 173 173
Lines 53002 53002
=======================================
Hits 22141 22141
Misses 30861 30861
Continue to review full report at Codecov.
Codecov Report
@@ Coverage Diff @@
## master #25799 +/- ##
==========================================
- Coverage 91.82% 91.8% -0.02%
==========================================
Files 175 175
Lines 52580 52583 +3
==========================================
- Hits 48279 48275 -4
- Misses 4301 4308 +7
Continue to review full report at Codecov.
@tomneep looks like a conflict can you merge master?
doc/source/whatsnew/v0.25.0.rst
@@ -280,6 +280,7 @@ I/O
- :meth:`DataFrame.to_html` now raises ``TypeError`` when using an invalid type for the ``classes`` parameter instead of ``AsseertionError`` (:issue:`25608`)
- Bug in :meth:`DataFrame.to_string` and :meth:`DataFrame.to_latex` that would lead to incorrect output when the ``header`` keyword is used (:issue:`16718`)
- Bug in :func:`read_csv` not properly interpreting the UTF8 encoded filenames on Windows on Python 3.6+ (:issue:`15086`)
- Bug in :meth:`DataFrame.to_latex` that would ignore `na_rep` if the `formatters` argument was used (:issue:`9046`)
Want double backticks around na_rep here
lgtm @jreback
i'm -1 on this PR since i thought the idea of the custom formatters was to override the behavior with a user defined function, hence the shortcut. if you want to define a custom formatter for a particular column, but have the standard na_rep for the remaining columns, doesn't this change prevent that?
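To make the objection concrete, here is a small sketch of a user-defined formatter that deliberately handles missing values itself. The function name `percent_or_na` and the `"n/a"` marker are illustrative assumptions, not anything taken from pandas or from this PR.

```python
import math


def percent_or_na(x):
    """Hypothetical per-column formatter that chooses its own missing-value text."""
    if x is None or (isinstance(x, float) and math.isnan(x)):
        # The user wants this column to show "n/a", regardless of na_rep.
        return "n/a"
    return f"{x:.0%}"


samples = [percent_or_na(v) for v in (0.25, float("nan"), 1.0)]
print(samples)
```

A formatter like this would typically be passed for a single column, for example `formatters={"ratio": percent_or_na}` (the column name is made up), together with a different `na_rep` for the other columns. If `na_rep` were substituted before the callable runs, the `"n/a"` branch would never be reached, which is the situation the comment above warns about.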
@simonjayhawkins this is a good point that I hadn't considered. I'm happy for this to be closed.
the
which is out-of-sync with the only reference to
so any improvements here would be welcome, but I don't think it is strictly necessary in order to close the issue.
Thinking about it, this could work if we change the docstring. At the moment the custom function must return a string. We could allow NaN, None, or whatever to be returned from the custom function. Then apply the
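One possible reading of this idea, sketched under assumptions: the callable is always invoked, may return a missing value instead of a string, and `na_rep` is substituted afterwards. `apply_formatter_then_na_rep` is a hypothetical name used only for illustration; it is not an existing pandas function.

```python
import numpy as np
import pandas as pd


def apply_formatter_then_na_rep(values, formatter, na_rep="NaN"):
    """Hypothetical helper: run the formatter first, then replace missing results."""
    # The formatter sees every value, including missing ones.
    out = np.array([formatter(x) for x in values], dtype=object)
    # Only results the formatter left as None/NaN are replaced with na_rep.
    out[pd.isna(out)] = na_rep
    return out


def two_decimals(x):
    # Example formatter that passes missing values through untouched.
    return None if pd.isna(x) else f"{x:.2f}"


result = apply_formatter_then_na_rep([1.0, np.nan], two_decimals, na_rep="-")
print(list(result))
```

Under this scheme a formatter that returns its own string for missing values keeps full control, while one that returns `None` or NaN opts in to the shared `na_rep`.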
There's been an issue raised, #26278, regarding using string formatters. This makes sense for compatibility with the .style API. I think it would also make sense that, if string formatters were accepted in to_latex, to_html etc., then na_rep should be respected. So although I was -1 on this PR originally, I think that, in combination with adding string formatters and keeping the behavior unchanged for callables, this PR would close the original issue.
Ok, I think that it is probably still best to close this for now, as I think we'd still want to keep the current behaviour if a list of callables is given, if I'm reading your suggestion correctly, e.g. df.to_latex(formatters=[i.format for i in ('{:d}', '{:1.1f}', '{:1.3f}')]) Then the df.to_latex(formatters=['%1i', '%1.1f', '%1.3f']) It might be worth considering how this change would (or maybe wouldn't) affect the df.to_latex(float_format=['%1i', '%1.1f', '%1.3f']) could be a viable choice for this feature too. Anyway, if there's no objection I'll close this PR tomorrow.
No problem. Thanks @tomneep for looking into this.
git diff upstream/master -u -- "*.py" | flake8 --diff
A very old issue so low priority!
I'd be tempted to remove the shortcut in _format_strings() since I'd guess in most cases it doesn't really save much time.