Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output to LaTeX added (useful for IJulia notebook export to PDF) #845

Merged
merged 11 commits into from
Oct 3, 2016

Conversation

maximerischard
Copy link
Contributor

At the moment DataFrames look great in IJulia (thanks to #433) until they are exported to PDF. This commit adds an additional writemime method for the text/latex mimetype.

@maximerischard
Copy link
Contributor Author

For example

using DataFrames
using LaTeXStrings
df = DataFrame(A = 1:4, B = ["M", "F", "F", L"\alpha"])
print(stringmime(MIME("text/latex"), df))

outputs

\begin{tabular}{r|cc}
    &A&B\\ 
    \hline 
    1 &1&M\\ 
    2 &2&F\\ 
    3 &3&F\\ 
    4 &4&$\alpha$\\ 
\end{tabular}

@johnmyleswhite
Copy link
Contributor

This is awesome. One concern: I think you don't do any string escaping, so what happens if the rows contain data with ampersands in them?

@nalimilan
Copy link
Member

Sorry for missing this. Would you mind updating this to the new show system in Julia 0.5? Then I promise I'll merge.

#
##############################################################################

function latex_char_escape(char::SubString)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my understanding that SubString will eventually be merged into String. Also, why not use Char?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, should either be Char or AbstractString.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this should be Char or AbstractString. Either is fine, though the function and argument name suggest to me that Char would make the most sense.

@maximerischard
Copy link
Contributor Author

Hmm, I tried rebasing the PR, not sure what happened here. But I've made the few changes that were requested. Here's a new example:

using DataFrames
using LaTeXStrings
df = DataFrame(A = 1:4, 
               B = ["\$10.0", "M&F", "A~B", "\\alpha"], 
               C = [L"\alpha", L"\beta", L"\gamma", L"\sum_{i=1}^n \delta_i"],
               D = [π, 2π, Nullable(), 3π]
               )
print(reprmime(MIME("text/latex"), df))

outputs

\begin{tabular}{r|cccc}
    & A & B & C & D\\ 
    \hline 
    1 & 1 & \$10.0 & $\alpha$ & 3.141592653589793 \\ 
    2 & 2 & M\&F & $\beta$ & 6.283185307179586 \\ 
    3 & 3 & A\textasciitilde{}B & $\gamma$ &  \\ 
    4 & 4 & \textbackslash{}alpha & $\sum_{i=1}^n \delta_i$ & 9.42477796076938 \\ 
\end{tabular}

Let me know what needs to be done to make this merge-able.

@nalimilan
Copy link
Member

Hmm, I'm not sure what you did, but that's definitely not right. Try a git rebase origin/master.

@maximerischard
Copy link
Contributor Author

OK, I just spent a bit of time in git hell, and I think this might be alright now.

end

function Base.show(io::IO,
::MIME"text/latex",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alignment issue here.

nrows = size(df, 1)
ncols = size(df, 2)
cnames = _names(df)
alignment = join(["c" for _ in 1:ncols])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can write this as repeat("c", ncols) or "c"^ncols.

if !isnull(cell)
content = get(cell)
if mimewritable(MIME("text/latex"), content)
Base.show(io, MIME("text/latex"), content)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for Base. here and below.

@nalimilan
Copy link
Member

Cool, thanks! Could you address the comments and add some tests?


function Base.show(io::IO, ::MIME"text/latex", df::AbstractDataFrame)
nrows = size(df, 1)
ncols = size(df, 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be written more succinctly as nrows, ncols = size(df). Doesn't matter too much, just thought I'd point it out.

@maximerischard
Copy link
Contributor Author

I've added a test using my example. It's not the most elegant test (suggestions welcome!), and it adds a dependency to LaTeXStrings for tests, but I think it's important to have objects in the test dataframe for which a show function to LaTeX already exists.

@@ -1,6 +1,7 @@
module TestIO
using Base.Test
using DataFrames, Compat
using LaTeXStrings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this introduces a test dependency on an external package, the package will have to be included in test/REQUIRE.

C = [L"\alpha", L"\beta", L"\gamma", L"\sum_{i=1}^n \delta_i"],
D = [1.0, 2.0, Nullable(), 3.0]
)
@test isequal(reprmime(MIME("text/latex"), df), "\\begin{tabular}{r|cccc}\n\t& A & B & C & D\\\\ \n\t\\hline \n\t1 & 1 & \\\$10.0 & \$\\alpha\$ & 1.0 \\\\ \n\t2 & 2 & M\\&F & \$\\beta\$ & 2.0 \\\\ \n\t3 & 3 & A\\textasciitilde{}B & \$\\gamma\$ & \\\\ \n\t4 & 4 & \\textbackslash{}alpha & \$\\sum_{i=1}^n \\delta_i\$ & 3.0 \\\\ \n\\end{tabular}\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using a multiline string block (""") you should be able to replace the \n by actual line breaks to make it more readable.

@@ -498,6 +498,13 @@ module TestIO
C = [L"\alpha", L"\beta", L"\gamma", L"\sum_{i=1}^n \delta_i"],
D = [1.0, 2.0, Nullable(), 3.0]
)
@test isequal(reprmime(MIME("text/latex"), df), "\\begin{tabular}{r|cccc}\n\t& A & B & C & D\\\\ \n\t\\hline \n\t1 & 1 & \\\$10.0 & \$\\alpha\$ & 1.0 \\\\ \n\t2 & 2 & M\\&F & \$\\beta\$ & 2.0 \\\\ \n\t3 & 3 & A\\textasciitilde{}B & \$\\gamma\$ & \\\\ \n\t4 & 4 & \\textbackslash{}alpha & \$\\sum_{i=1}^n \\delta_i\$ & 3.0 \\\\ \n\\end{tabular}\n")
@test isequal(reprmime(MIME("text/latex"), df),
"""\\begin{tabular}{r|cccc}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use spaces instead of tabs. Also, why not align the last line with the rest of the contents?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the multiline string has to match the output of reprmime.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But triple-quoted strings are also automatically deindented, so you can still align them with the """ block: http://docs.julialang.org/en/latest/manual/strings/#triple-quoted-string-literals

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's really cool! I didn't know that.

@maximerischard
Copy link
Contributor Author

What happens next?

#
##############################################################################

function latex_char_escape(char::SubString)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this should be Char or AbstractString. Either is fine, though the function and argument name suggest to me that Char would make the most sense.

@nalimilan
Copy link
Member

What happens next?

We generally keep PRs open few a few days to allow people to comment. And indeed @ararslan found an issue I had missed the second time.

@nalimilan nalimilan merged commit 1658c35 into JuliaData:master Oct 3, 2016
@nalimilan
Copy link
Member

Thanks!

for col in 1:ncols
write(io, " & ")
cell = df[row,col]
if !isnull(cell)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I hadn't noticed this. This should print #NULL instead of nothing for null values, just like at the REPL and in HTML. Could you apply the strategy I described at #1084 in a new PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants