Asynchronous PUT #12097
Comments
Why would polling make anything easier/more performant? The server side change is not trivial and introduces another piece of non-standard webdav api - just like my idea in #12001 - I really question if we should not skip webdav then and rely on our own protocol? |
I believe this might not be possible with PHP, as PHP doesn't support multithreading (except using hacks like spawning processes with popen()). One idea would be to make the PHP code return an XML response with maybe a transaction id or something, but then continue processing. This means the network connection will still remain open and the client must manually close the connection after receiving the response. Another issue: the PHP process itself might time out if assembling chunks takes more than an hour for any reason. It's all quite hacky. And as @DeepDiver1975 said, it's not standard WebDAV. |
I forgot to mention that the json can also contains This is not about performance, it is about reliability. |
Is reliability something that could otherwise be achieved with checksums ? ducks |
From a server side implementation point of view, something like this requires an architectural change where we have true server side processing within a daemon/worker. In order to build a reliable and scalable infrastructure, job queueing systems come to my mind. e.g. something like |
refs #12052 (comment) |
if we really want to walk this path - this is a new api - no webdav mangling please |
Something to be discussed very carefully because this has a lot of implications. Frank |
Linking some other issues which might be fixed by implementing this task or a similar solution with the same goal: |
Could checksums be another option to avoid uploading or downloading existing files? |
Related: http://php.net/manual/de/function.fastcgi-finish-request.php |
Hm, others seem to have gone the way of supporting the Content Range Header in webdav PUT and POST requests: @ogoffart Are there any special reasons why you did not choose to go down that road? |
@butonic Hmmm. Range Headers are not part of the official http spec for PUT and POST, and we were also not aware of the fact that others are bending the standard like that at the time we implemented our big file chunking. |
@karlitschek which is why sabredav uses PATCH for that. |
sure. Unfortunately nothing we can/should change in the short term |
We still need to decide if we want to implement the custom polling solution described by @ogoffart, or the PATCH based solution. Thus, adding triage label. |
I don't know why we came to the current chunking algorithm, it was done before I joined the project. Maybe @dragotin knows. I'm not sure if PATCH will work fine, because the client does not really know how much was properly received and saved on the server. And it won't fix the cases where the server needs a lot of time to process the file before accepting it. |
I repeat: This feature is useful to fix the case in which the connection times out because the server takes a long time before accepting a file (and returning an etag or an error to the client). The best fix would be to make sure that chunk assembly is always fast enough by speeding up the processing (pre-assembling on each chunk or something like that). We had to solve the problem for a specific server where speeding up was not possible. For that special case, this is the best solution we came up with, and it was implemented. This issue was just created to document what the client implements. |
I'm a little late to the conversation, but I just started using owncloud last month. I don't know if this has ever been discussed for owncloud, but setting up a job queue using a platform like beanstalkd (http://kr.github.io/beanstalkd/) would help facilitate async server requests. Uploaded files could be shuttled off to a temp storage area as part of queuing the job, and chunking/comparing/etc. could be triggered as part of the upload, a cron job, or an ajax request. |
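The job-queue idea in the comment above can be sketched roughly as follows. This is only an illustration in Python, not ownCloud code: the in-process `queue.Queue` stands in for an external queue such as beanstalkd, and the names `enqueue_assembly`, `worker`, `results` and `run_demo` are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()   # assumption: stand-in for an external queue (e.g. beanstalkd)
results = {}           # job id -> (state, data); a later poll request would read this


def enqueue_assembly(job_id, chunks):
    """Called from the request handler: record the job and return immediately."""
    results[job_id] = ("pending", None)
    jobs.put((job_id, chunks))


def worker():
    """Runs outside the request: assembles chunks and records the outcome."""
    while True:
        item = jobs.get()
        if item is None:               # sentinel value: stop the worker
            jobs.task_done()
            break
        job_id, chunks = item
        results[job_id] = ("done", b"".join(chunks))
        jobs.task_done()


def run_demo():
    """Queue one job, let the worker process it, and return the stored result."""
    thread = threading.Thread(target=worker)
    thread.start()
    enqueue_assembly("job-1", [b"hello ", b"world"])
    jobs.put(None)
    thread.join()
    return results["job-1"]
```

The point of the split is that the upload request only pays for `enqueue_assembly`, while the slow assembly happens in the worker and its outcome can be looked up later by job id.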
IIRC it will be much better with the new chunking but not solved in principle. Once we can breathe we should think about an async solution. |
Note that in the case of external storage uploads, the connection might stall for a while, with all bytes already consumed by PHP but no response sent yet. This happens when the file is uploaded from temp storage to the external storage. This kind of situation is a good example where async put would be useful for the client to check for completion. CC @mmattel |
@PVince81 I couldn't agree more, thanks! Let's take the opportunity to do things right :-) |
I don't know if any decision was made how to tackle async put, yet. I had this idea: we could allow the client to create the file without uploading any chunks, yet. It could even show in the file listing, but trying to download it will give a Revealing a file in the filelist would even allow the client to discover chunks that have already been uploaded so downloads can start as soon as the first chunk is available. This approach could even lead to deduplication if we identified the file chunks by their hash and stored them that way. An easy enough task for object storage based installations. A path towards that could be #17638, a file based objectstorage. Our local storage is not supposed to be accessed outside of owncloud ... so why not move ahead and store files in a way that makes our uploads easier. @DeepDiver1975 @dragotin I'd like to discuss this with you two. |
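To make the hash-based deduplication idea above concrete, here is a minimal sketch of a content-addressed chunk store in Python. It is an assumption-laden illustration, not the design of #17638: the `ChunkStore` class, the in-memory dict, and the choice of SHA-256 are all hypothetical.

```python
import hashlib


class ChunkStore:
    """Hypothetical in-memory store that keys each chunk by its SHA-256 digest."""

    def __init__(self):
        self._chunks = {}   # hex digest -> chunk bytes

    def put(self, data):
        """Store a chunk once; identical chunks map to the same digest."""
        digest = hashlib.sha256(data).hexdigest()
        self._chunks.setdefault(digest, data)
        return digest

    def assemble(self, digests):
        """Rebuild a file from its ordered list of chunk digests."""
        return b"".join(self._chunks[d] for d in digests)

    def __len__(self):
        return len(self._chunks)
```

A file is then just an ordered list of digests, so a chunk that appears in several files (or several times in one file) is stored only once.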
Before diving into further discussions - any async operation requires a proper background processing sub system. We are aiming to get this implemented for 9.1. Once in place we can think/discuss about the next steps in this area |
...and how we combine that with our new chunking, which should not be a problem, but to consider. |
need to take care of the case where files get overwritten, might still need part files... The advantage of part files is that it reduces the time window of locking, else the file might be locked during an hour long upload. |
Raised #24509 for more generic async file operations (also DELETE, etc). |
What about this please? |
This topic will be addressed in ownCloud infinite scale. |
This is a feature required to have reliable upload of big files. Sometimes, the server might take a while to assemble chunks or to put them in their final destination, and the client times out which causes the file to be downloaded again.
( owncloud/client#2074 )
The solution to this problem is Asynchronous PUT via a poll URL.
This is already implemented on the client and will be part of the 1.8 release.
It is working successfully in a proprietary deployment (with a custom app on the server).
The client sends an "OC-Async:1" header with the PUT, indicating it supports that feature.
The idea is that, for the last chunk, if the server has reason to believe that assembling the chunks may take more than a few seconds, it may return "202 Try Later" instead of the usual "201 Created". The response also carries an "OC-Finish-Poll" header with the path to the poll URL.
Example:
OC-Finish-Poll: /remote.php/poll?pollId=123456
The client will then query this URL. The server can write some spaces on the wire to avoid a timeout, and when processing has finally finished, it returns JSON with "etag" and "fileid" properties in case of success, or an "error" property in case of failure.
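As an illustration of the client side of this flow, a minimal Python sketch could look like the following. It is not the actual sync client code: the function names and the returned tuple shapes are hypothetical, and only the 202 status, the "OC-Finish-Poll" header, the leading spaces, and the "etag"/"fileid"/"error" properties come from the description above.

```python
import json


def poll_url_from_put_response(status, headers):
    """Return the poll path if the server deferred the final chunk, else None."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if status == 202 and "oc-finish-poll" in lowered:
        return lowered["oc-finish-poll"]
    return None


def parse_poll_body(body):
    """Decode the poll response body.

    The server may pad the body with spaces as a keep-alive, so surrounding
    whitespace is stripped before the JSON is decoded.
    Returns ("ok", etag, fileid) on success or ("error", message, None).
    """
    data = json.loads(body.strip())
    if "error" in data:
        return ("error", data["error"], None)
    return ("ok", data["etag"], data["fileid"])
```

A client would call `poll_url_from_put_response` on the reply to the last chunk; if it gets a path back, it requests that URL and hands the body to `parse_poll_body` instead of expecting the etag directly from the PUT.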