fcrepo-import-export-0.2.0.jar import function not importing all records #113

Open
straccers opened this issue Mar 23, 2018 · 7 comments

@straccers

I am trying to use the standalone import-export utility (fcrepo-import-export-0.2.0.jar) to migrate data between two machines. Both machines use the same version of fedora (4.7.5). The export appears to complete correctly, without any reported errors, and all the expected data appears to be present in the ttl files. However, when running the import, the output shows multiple errors.
This is the first one in the output log for an attempt to import the entire repository:
ERROR 15:24:04.094 (Importer) Error while importing /home/dlib/thursday_22_03_2018_11-27am_oasis_export/fedora/rest/oasis/d9/4a/ec/2f/d94aec2f-ad6c-47a0-87bb-36fd88ff4f5f.ttl (412): Request failed due to unspecified failed precondition.

(as you can see, it's not very informative)

and also quite a few errors of the type
WARN 15:24:05.012 (Importer) Skipping Membership Resource: http://localhost:8080/fedora/rest/oasis/57/12/m6/52/5712m6524

The import runs to completion but does not populate the records with all of the expected data elements, although when we look in the exported turtle files these are clearly present.

import export version: fcrepo-import-export-0.2.0.jar
fedora version: 4.7.5 (on both machines). We have set -Dfcrepo.properties.management=relaxed on the machine hosting the fedora repository we are trying to import into.

@awoods

awoods commented Mar 29, 2018

@bbpennel : Are you in a position to investigate this?

@bbpennel
Contributor

@straccers
I will check to see if I can replicate the issue later. Would you be able to share the parameters or config file you were using for the import and/or original export? And also, if possible, an example .ttl file or few that triggered the 412 response on import?

A few other questions that may help diagnose the issue:
Were all the objects present after the import, just not fully populated?
Was there a pattern to which properties were not populated?
And was the import into an empty repository?

@bbpennel
Contributor

bbpennel commented Apr 2, 2018

After some testing, I am wondering if you are reimporting objects that are already in the destination fedora instance, or if this is going into an empty fedora instance?

When importing an object, the import tool provides the current time as the If-Unmodified-Since header so that fedora will reject the request if another client modified the same object while the import was taking place. Normally this shouldn't block you from reimporting the same object, but if the object in the repository has a last modified date more recent than the current time, the update will be rejected. This could possibly happen if objects had been previously imported with timestamps in the future OR potentially during the same import if the system clocks of the client and the server disagree.

I was able to replicate the 412 response by importing an object into fedora with a future last modified date, and then importing it again. My test involved fcrepo-4.7.5 for both servers (using the embedded jetty distribution) and the -Dfcrepo.properties.management=relaxed property set.
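To make the failure mode concrete, here is a minimal sketch of how an If-Unmodified-Since check behaves when the client's clock lags the server's. It is a simplified model of the HTTP precondition, not fedora's actual code; the class name, method names, the 5-second skew and the truncation to whole seconds are all assumptions chosen for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class PreconditionSketch {

    /**
     * If-Unmodified-Since holds when the resource's last-modified time is
     * not later than the date in the header. HTTP dates carry whole seconds,
     * so both sides are truncated before comparing (an assumption of this sketch).
     */
    public static boolean preconditionHolds(Instant lastModified, Instant ifUnmodifiedSince) {
        Instant modified = lastModified.truncatedTo(ChronoUnit.SECONDS);
        Instant limit = ifUnmodifiedSince.truncatedTo(ChronoUnit.SECONDS);
        return !modified.isAfter(limit);
    }

    /** Status a server would return for a conditional PUT: 204 if allowed, 412 if not. */
    public static int statusFor(Instant lastModified, Instant ifUnmodifiedSince) {
        return preconditionHolds(lastModified, ifUnmodifiedSince) ? 204 : 412;
    }

    public static void main(String[] args) {
        Instant serverNow = Instant.parse("2018-03-22T15:24:04Z");
        // Hypothetical skew: the importing client's clock runs 5 seconds behind the server.
        Instant clientNow = serverNow.minus(Duration.ofSeconds(5));

        // The server stamped the resource with its own clock just now.
        Instant lastModified = serverNow;

        System.out.println(statusFor(lastModified, clientNow)); // 412: client clock is behind
        System.out.println(statusFor(lastModified, serverNow)); // 204: clocks agree
    }
}
```

Under this model, any resource the server touched "after" the client's idea of the current time is rejected, which is the same condition as a stored last modified date in the future.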

Also, the "Skipping Membership Resource" messages are normal (despite the WARN level); you will see those for any container that uses an ldp:DirectContainer or ldp:IndirectContainer to populate membership relations. We should most likely change that to INFO or DEBUG.

@straccers
Author

straccers commented Apr 3, 2018 via email

@straccers
Author

Right, apologies for the huge gap here; we had a live system issue to deal with. Looking into this some more, I can confirm we ARE starting from a fresh repository each time. The predicate the import seems to report as an error each time is fedora:createdBy in the fcr:metadata file. An example of one such turtle file is below (references to our own server names changed). In this case I am exporting and importing with the -inward, -external and -binary flags; however, this also happens with just the -binary flag. It's a tad puzzling, as the same predicate appears in the other ttl files too, but the error is only reported for the fcr:metadata files.

export config:
mode: export
external: true
legacyMode: false
predicates: http://www.w3.org/ns/ldp#contains
overwriteTombstones: false
auditLog: false
resource: http://<OUR_FEDORA_SERVER>:8080/fedora/rest
inbound: true
versions: false
dir: /home/dlib/oasisbackup_ei_09042018_1500
binaries: true
rdfLang: text/turtle

import config: (I didn't use legacy mode this time as I wanted to get the error output)
legacyMode: false
overwriteTombstones: false
auditLog: false
resource: http://localhost:8080/fedora/rest
inbound: true
dir: /home/dlib/oasisbackup_ei_09042018_1500
rdfLang: text/turtle
mode: import
external: true
predicates: http://www.w3.org/ns/ldp#contains
versions: false
map: http://<OUR_FEDORA_SERVER>:8080/fedora/rest,http://localhost:8080/fedora/rest
binaries: true

@straccers
Author

Whoops, here's the zipped ttl file folder:
fcr%253Ametadata_REDACTED.zip

@birkland
Contributor

I ran into the same issue today, with random "Request failed due to unspecified failed precondition" errors on fcrepo 4.7.5 with the latest 0.2.0 import/export tool. We were loading only RDF resources, no binaries or external content. Some observations:

  • The errors tended to happen at the beginning of an import. There seemed to be a point at which errors stopped occurring.
  • The errors were random; re-loading the same data resulted in different resources failing.
  • There were no discernible similarities between the resources that failed.

Ultimately, under extreme time pressure, I had to remove the .ifUnmodifiedSince(currentTimestamp()) option from the PutBuilder in Importer in order to get a clean load.
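For anyone weighing the same workaround, the sketch below shows roughly what the guard amounts to on the wire: a PUT either carries an If-Unmodified-Since header stamped with the client's clock, or it does not. This is not the fcrepo client API; the class, the `putHeaders` helper and the `guarded` flag are hypothetical names used only to illustrate the trade-off.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class ConditionalPutHeaders {

    // HTTP-date (IMF-fixdate) format, always expressed in GMT.
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                    .withZone(ZoneOffset.UTC);

    /**
     * Builds the headers for a PUT of a turtle document.
     * When guarded is true, the client's current time is sent as
     * If-Unmodified-Since; when false, the request is unconditional.
     */
    public static Map<String, String> putHeaders(Instant clientNow, boolean guarded) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", "text/turtle");
        if (guarded) {
            headers.put("If-Unmodified-Since", HTTP_DATE.format(clientNow));
        }
        return headers;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2018-03-22T15:24:04Z");
        System.out.println(putHeaders(now, true));  // conditional PUT
        System.out.println(putHeaders(now, false)); // unconditional PUT
    }
}
```

Dropping the header avoids the 412 but also removes the protection against overwriting a resource that another client changed mid-import, so it seems best suited to loads into a repository nothing else is writing to.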
