Simplify filefetcher, remove chunking and use guzzle #26
This removes a major feature from FileFetcher that has proven too hard to get right: chunking file downloads and processing the chunks in batches. We have repeatedly run into edge cases where chunking causes file corruption, and its performance is a major issue. On a web server with reasonable bandwidth, even a 7 GB file should be downloadable in about 10-15 minutes, which makes a single-pass transfer reasonable.
We now use PHP's native copy() for local file copying, and Guzzle for remote copies. Future improvements could include passing specific Guzzle configuration through to the processor. Alternatively, we may simply want to move all of this into DKAN core and rely more heavily on the DKAN and Drupal APIs.
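The new strategy can be sketched roughly as follows. This is an illustrative sketch only, not FileFetcher's actual API: the function name and the local/remote detection logic are assumptions, and it presumes guzzlehttp/guzzle is installed via Composer.

```php
<?php
// Sketch of the simplified, non-chunked copy strategy.
// fetchFile() is a hypothetical helper, not part of FileFetcher's API.

require 'vendor/autoload.php';

use GuzzleHttp\Client;

/**
 * Copy a source file to a destination in a single pass, with no chunking.
 */
function fetchFile(string $source, string $destination): void
{
    if (parse_url($source, PHP_URL_SCHEME) === null) {
        // Local path: PHP's native copy() handles it in one call.
        if (!copy($source, $destination)) {
            throw new \RuntimeException("Could not copy {$source}.");
        }
        return;
    }

    // Remote URL: Guzzle's "sink" option streams the response body
    // straight to disk, so the whole file never has to fit in memory.
    (new Client())->get($source, ['sink' => $destination]);
}
```

Because Guzzle streams the response to the sink rather than buffering it, a single pass keeps memory usage flat even for multi-gigabyte files.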
This PR also simplifies a few other things in the library: it removes the PHP bridge class, eliminates the LastResort processor (the class remains as an empty extension of Local, in case any references to it still exist out there), makes testing a little more straightforward, and streamlines the CI.
See GetDKAN/dkan#3892 for DKAN integration.
This would necessitate a 5.x tag.