-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Download converted documents with uploadable configuration #2413
Conversation
notebook/nbconvert/handlers.py
Outdated
} | ||
) | ||
|
||
print(resources) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forgotten debug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
notebook/nbconvert/handlers.py
Outdated
|
||
@web.authenticated | ||
def get(self, format, path): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpicking 2 whitelines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing you meant the one below get and the one above self.finish().
fileformat.append( | ||
$('<option>script</option>') | ||
.attr('value','script') | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you may be able to chain the append, or put that in a loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, via loop (which takes a standard structure which will make it easier for linking into by 3rd party exporters)
form.append(fileinput); | ||
body.append(form); | ||
|
||
var notebook_path = utils.encode_uri_components(this.notebook.notebook_path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you've the same line that defines notebook_path
above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gone
}; | ||
if (fileinput[0].files.length>0){ | ||
filereader.readAsText(fileinput[0].files[0]); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else on same line as }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
xhr.setRequestHeader('X-XSRFToken', xsrf_token); | ||
xhr.onreadystatechange = function(){ | ||
if(xhr.readyState === 4){ | ||
var blob = xhr.response; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you figured out the unicode thing ? was is "just" not using jquery ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more or less that seemed to do it as bizarre as that is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know exactly what the issue is, but I hope we can work out how to do this with a more familiar API than the low-level XMLHttpRequest interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We searched a lot, and many responses on internet were "do not use jquery". It's trying to do fancy stuff to guess encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's possible to turn the encoding detection off? We have Ajax functionality that works throughout the rest of the codebase, I find it hard to believe it's totally incompatible in this one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, in our specific case, raw XMLHttpRequest was the only thing that was stated (by others) to work for passing around binary objects unaltered as we wanted. And we tried using other mechanisms to get this to work and none of them worked — figuring out this is probably the part that took longest of this entire PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is precisely the place in which it's totally incompatible. It's a known issue: http://stackoverflow.com/a/23797348/1816995. That answer admittedly is a bit old, but I spent a long time trying to solve it and this works without finagling, so I'd prefer to stick with it.
a.download = fileName; | ||
a.click(); | ||
window.URL.revokeObjectURL(url); | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you probably want to remove a
from the dom.
notebook/templates/notebook.html
Outdated
@@ -84,6 +84,7 @@ | |||
<a href="#">Open...</a></li> | |||
<!-- <hr/> --> | |||
<li class="divider"></li> | |||
<li id="custom_download"><a href="#">Custom Download</a></li> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to put that back in its original place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think moving it here is an improvement in the design in the menubar. It should not require drilling down into the menu structure to access this feature.
I'm fine moving it down next to "Download as…" but it shouldn't be one option among the various downloads as it is strictly speaking a broader class of functionality.
Re: "Download as…" I'm thinking of actually switching them to use this post method as then they won't open a new window (which I've always disliked).
Thanks that's great. We might be able to make this more readable with Promises at some point. Don't forget to write some documentation as well :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also post a screenshot of the new dialog, so it's easy to see what it will look like?
notebook/nbconvert/handlers.py
Outdated
json_config = json.loads(json_upload.get("config",{})) | ||
c.merge(json_config) | ||
nb_content = json.dumps(json_upload["notebook"]) | ||
self.call_nbconvert(format, path, config=c, content=nb_content, post=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit confusing - it allows for both specifying a path and passing the notebook in the request body. I think we should stick to one or the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's for symmetry with get
that was defined before. The "only" difference between get
and post
is that it gets an extra config object. There is also an efficiency question that if the notebook is on the server already we don't need to send it on the wire (and that's how it works currently). In the end we want to be able to post the notebook content and this is part of the transition effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is adding a new handler, so I feel like it should be possible to come up with a neater, clearer way - there's no existing code using POST to this endpoint, so compatibility isn't a concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's actually not adding a new handler (from the perspective of the web endpoints), but merely distinguishing whether it got a GET or a POST request. I was trying to keep the code as similar in the conversion step as possible and that seemed the cleanest way to do it.
I'm going to write something longer in the main PR comment thread based on a conversation with Matthias. But the approach I took is the way to have the cleanest backwards compatibility story. We're going to have to come up with a migration story for the more comprehensive version of this feature.
notebook/nbconvert/handlers.py
Outdated
def get(self, format, path): | ||
|
||
exporter = get_exporter(format, config=self.config, log=self.log) | ||
def call_nbconvert(self, format, path, config=None, content=None, post=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the post
parameter used anywhere? I don't see it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that might have been a remnant of an earlier effort, I'll nix it.
notebook: notebook.toJSON(), | ||
config: config | ||
}; | ||
return json_to_pass; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same on this side, you're passing both the path to the notebook and the serialised notebook file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same answer that's on purpose: currently you need the path which forces us to save the notebook before converting. Also yo need the "path" as it affects things like the naming of the generated files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can change the API more fundamentally, but then the name should probably be passed in explicitly as part of the JSON object.
xhr.setRequestHeader('X-XSRFToken', xsrf_token); | ||
xhr.onreadystatechange = function(){ | ||
if(xhr.readyState === 4){ | ||
var blob = xhr.response; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know exactly what the issue is, but I hope we can work out how to do this with a more familiar API than the low-level XMLHttpRequest interface.
a.style = "display: none"; | ||
return function (data, fileName) { | ||
var blob = new Blob([data], {type: "octet/stream"}), | ||
url = window.URL.createObjectURL(blob); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat, I didn't know about object URLs.
|
||
var that = this; | ||
|
||
var handle_nbconvert_post = function(){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming stuff: 'handling' something usually refers to the code that runs when that something is received. This function is getting values from the UI and sending the post request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It technically is receiving the file if it has been uploaded when clicking the "Download" button.
But, ok, it's changed to trigger_nbconvert_post
.
on_done(); | ||
} | ||
}; | ||
var on_done = function(){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming stuff again: on what being done? on_config_file_read
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also being called if there is no config file being passed in so that wouldn't be quite right either.
if(xhr.readyState === 4){ | ||
var blob = xhr.response; | ||
var content_disposition = xhr.getResponseHeader('Content-Disposition'); | ||
var filename = content_disposition.match(/filename="(.+)"/)[1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point to have a fallback of some kind, in case the regex doesn't match, or the header isn't even there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is, I should just have an error?
I'm pretty sure that the request did not work properly if it doesn't have an attachment and a filename.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably an error, it's just best for it to be a clear one rather than a cryptic one because something you assumed would always be true no longer is.
@Carreau and I talked about the choice of continuing to include For example a post request handler of a sort exists that does not have a path following it, but the way that the As a more general topic, we need to discuss our path toward having nbconvert as a service. I was trying to write this in a way that would allow that fairly easily and I made decisions while keeping that goal in mind. For example, right now this is transmitting the entire notebook as json when that notebook is already on the server. That introduces higher bandwidth costs but accommodates the case where the notebook is not on the server. But solving that would also require passing along (or embedding) resources to the endpoint that would not otherwise be available for conversion. That seems like a much harder problem, but something we should be thinking about. @Carreau had specific suggestions about how to move forward with this PR and what issues to open…I don't remember all of them in particular (and need to run and want to post this now) so I'll let him raise those points later. |
Fair enough about the raw XHR, though I still hope we can make that code a bit neater. I'm still puzzled why you're talking about backwards compatibility. You're adding a new operation that wasn't possible before, so there's nothing for it to be backwards compatible with. It doesn't matter that the same URLs already work for GET requests; POST requests are entirely separate. If you don't want to pass the path in the URL, make a new endpoint like Backwards compatibility is what forces us to make awkward compromises. You have the liberty of designing a new API that doesn't have to be backwards compatible with anything, so design it to be beautiful and simple and consistent, not involving a load of duplicate information with a 'backwards compatible' label stuck on it. |
oops…messed up a rebase and forced pushed without thinking about it. I'll try to fix this. |
Recently I have been looking at the UX design around RStudio's document conversion stuff. I knmow some folks who use Jupyter and RStudio and from their perspective, the UX of RStudio is so far ahead of us on this point, they wouldn't think of using nbconvert - even though nbconvert does much of what is offered on RStudio from the technical perspective. What is a remarkable is that from a technical perspective, what RStudio does is much more primitive that what we do in the notebook (custom REST endpoint, menus, custom UI). They have nicer UI, but the result of that UI is to just fire off a command in the terminal. It converts things in place and leaves the converted documents (HTML, pdf, etc.) in that directory. I think we should step back from the technical details of this and do some UX design work to understand what Jupyter users really need in this area. It may be that we can provide what they need with some small UI improvements and no difficult technical work. I think studying the RMarkdown/RStudio model would be a good starting point. |
@ellisonbg I agree with you that we have much to learn from the RMarkdown and RStudio approach to document creation. Fortunately, that is exactly the kind of thing that this PR is trying to do! nbconvert (as a CLI) works as a simple command that you can just fire off in the terminal, which converts things in place and leaves the file(s†) in the directory. You can configure it by passing in a configuration file with the Unfortunately, the notebook doesn't have any good way to surface this functionality when exporting to different formats. That is even taking into account that nbconvert is being used on the backend to manage the conversion. The primary problem is that there is no user-friendly way to pass a configuration file to nbconvert from the front end. It is (strictly speaking) possible to configure your exports in a way that is mostly in the front-end. Specifically, the configuration file that nbconvert uses by default is the same configuration file that the NotebookApp receives at the point of its instantiation. So the only way to currently change the configuration of an export is to shut down the server, and then restart it. Forcing a user to stop whatever they were doing in order to shut down the server and restart it is about as user hostile an approach to configuration as I can imagine. Solving that was the first task of this PR. This PR makes it possible to pass in a JSON version of the configuration file to the nbconvert calls, which allows us to get around that limitation. You can just create whatever configuration file you want and pass it in. The second problem that this addresses has to do with how the download occurs. Currently our downloads occur by redirecting us to a new tab where we pass the information needed for conversion via the url. At best, opening a new tab is an inconvenient and unnecessary context switch. This PR allows us to trigger the download in the background of the current tab without any interruption to a person's workflow. A third problem this approach inadvertently solves is a feature that @ian-r-rose was asking for: the ability to pass in a notebook as part of the data attached to a POST request that is then able to be converted by nbconvert and return the resulting file in the response to the POST request. Perhaps most importantly to your points, this PR solves many problems that will make it easier for others to try other UX deisgns on the basis of the infrastructure built here. So, I think in fact your arguments are reasons for moving forward with this PR, not stepping away from it. Hopefully with these explanations, you agree. If not, then please respond, and I will do my best to address any remaining concerns. Footnotes: †: For example, for document formats that don't support images being directly embedded (e.g., plaintext formats like rst), nbconvert creates a directory that contains the relevant images along side the main file. |
I probably didn't flush out enough details. Here is what I am proposing:
In the classic notebook, all of this is possible, but it will be a bit clunky because the different UI's needed (file browser, notebook, file editor, terminal) are on different pages. But it is still possible. In JupyterLab, we can provide an extremely nice and optimized UX around these abstractions. This is exactly how RStudio works. The only difference is that RMarkdown allows the conversion config to be put into the markdown file in a YAML block. We could also explore things like that as well if we want, but it is not needed for all of this to work. |
It seems to me that there are two points of discussion here which are related but distinct:
|
I agree with M and Ian, and we had this discussion at the dev meeting a couple of month ago; while the approach that you (@ellisonbg) is proposing has some advantage from the user point of view it is in large part orthogonal to what is developed here. This not only expose a UI to do some conversion; but also expose nbconvert through a rest API. This has been discussed quite a bit in various context and is necessary for other improvement we'd like to do; like an nbconvert CLI that use remote conversion - for example because getting latex installation correct is not easy. I also want to point out that not everyone is using and developing on top of JupyterLab and that the rest API for nbconvert is extremely useful for custom clients wishing to do conversions. |
7d1f362
to
deaac69
Compare
ok… now when i go back and run tests locally on 817a87c, the tests now fail. I'm deeply baffled as to why. but I'm going to stop trying to appease the casperjs gods and call this a night. If someone tries this branch and they can't get this to work, please let me know. |
@Carreau thanks for clarifying the intended usage of the different parts of this PR. I completely understand the general need for a REST API for nbconvert. In particular, the main usage case that I see is a hosted version of nbconvert that is run separately from the rest of the notebook server. In general, I tend to think that the REST services of the notebook server are one of the project's "open standards" and we should evolve it conservatively and with very clear versioning semantics and documentation. At the same time, I want nbconvert to be able to move quickly and innovate. Because of this, I would probably feel more comfortable with the server side parts of this being in nbconvert and used in the notebook server as a server extension. That would also make it very easy to develop a separate server that just does nbconvert REST calls. We have taken the exact same approach with the Git server extension that exposes a REST API to git. For the classic notebook I am fine with the UI additions in this PR, if the server side stuff is moved to a server extension, the UI could remain and add a small amount of logic to see if the server side stuff is installed and enabled. It could also be added as a nbextension, but that is a pain. The only potential issue there is version mismatch between the UI and the server extension. For JupyterLab usage, the UI would probably be developed as a JupyterLab extension (an npm extension) so any of these options are fine on that front. I agree these things are mostly separate from the issues I was bringing up about the usability of nbconvert. However, my points about the usability still stand and I don't think that these proposed changes remove the need for us to consider those matters on their own. |
…entmodel, output format & config
…sing to the json object
var url = | ||
utils.url_path_join( | ||
that.base_url, | ||
"nbconvert", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switch consts to var per remove trailing commas
Working on some tests & changes that came about as a result of writing those tests. I've introduced a decision point. I need to make either the configuration file that is uploaded the source of truth for the output format or the setting in the application. I'm going to go with the configuration for now, as there would be no way to otherwise point to a non-entrypoint specified output format. |
- use Config object - keep naming consistent with other Handlers - use export_format instead of output_format, to match NbConvert.export_format - search configuration for export format and fallback to the json body
also, change name to NbconvertConfigHandler
|
||
var fileformat = $("<select>"); | ||
|
||
var export_opts = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see slides
here... could be the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well the tests pass after I made the changes @rgbkrk suggested, but I definitely should include slides there. Apologies!
notebook/templates/notebook.html
Outdated
<li id="download_markdown"><a href="#">{% trans %}Markdown (.md){% endtrans %}</a></li> | ||
<li id="download_rst"><a href="#">{% trans %}reST (.rst){% endtrans %}</a></li> | ||
<li id="download_latex"><a href="#">{% trans %}LaTeX (.tex){% endtrans %}</a></li> | ||
<li id="download_pdf"><a href="#">{% trans %}PDF via LaTeX (.pdf){% endtrans %}</a></li> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I ask you why you remove {% trans %}
on these ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a while ago that I had made that decision with input from either @takluyver or @Carreau, on the grounds that for some of those, a translation would never make sense (e.g., LaTeX should always be LaTeX). However you’re right that we can always include the translation tag and assume that there won’t be a mistranslation (e.g., treating reST as rest (the verb) with weird capitalisation and translating accordingly). Certainly Notebook, Script and the word “via” make sense to translate. I can add those tags back.
- change the capitalization to match menu options - add slides - add translations back Thanks @damianavila
This changes the base-path to be / instead of /api so that in the future other APIs that return files and do not redirect you to a new address (e.g., see jupyter#2413). This also improves the way that the api spec handler responds to a request, by actually displaying the spec with the appropriate content-type header (text/x-yaml) instead of triggering a download with the StaticFileHandler's default application/octet-stream mimetype.
@gnestor @takluyver I just realised I never wrote out the API spec for this, so I definitely need to get that in before this would be merged and now that our spec is cleaner, this should be much easier. |
Closing this PR as stale as part of housekeeping (and because of the merge conflicts). Thanks for exploring this @mpacer! |
At long last! The ability to upload a custom configuration file to customise exports is here.
There are still a lot of issues I need to work out in terms of the UI, but the core functionality is present.
Steps to accomplish this:
.json
config fileLooking forward to getting feedback on this.
Note: by doing this, I've actually made the current "Download as" sub-menu somewhat redundant as that is equivalent to this case but where you simply don't give it any config file.
I'm a little concerned about size limitations due to the current anchor + dataURI approach to downloading things, but I figured others would have clearer thoughts on that than I would be able to express.edit: I realised that the size worries stemmed from an earlier attempt of mine and that I had actually addressed that by using the blob object and createObjectURL approach.Bonus: it allows downloading without needing to open a new window, which I see as a win.
Thank you to everyone who helped bring this to fruition (@minrk, @takluyver and especially @Carreau).
cc: @bollwyvl