A high-performance bulk import tool for the open-source Alfresco Document Management System.
"'High Performance', you say?"
Why yes. Alfresco's built-in mechanisms for moving large amounts of content into the repository (the various file-server protocols, the venerable ACP mechanism, the mind-bogglingly inefficient CMIS standard, etc.) all suffer from a variety of limitations that make them a lot slower than the core Alfresco repository. This tool cuts out virtually all of that nonsense, attempts to maximise "mechanical sympathy" (which, for Alfresco, basically means treating your database nicely), and makes one or two large and opinionated assumptions that allow it to be a lot faster than anything else out there.
In terms of benchmarks, the old v1.x versions of the tool have regularly demonstrated sustained ingestion rates of over 500 documents per second in production environments, and in testing, the v2.x version has been shown to be up to 4X faster than 1.x (in specific circumstances, notably for streaming imports).
Older resources (less relevant for v2.0+):
This extension is not supported by Alfresco Software Inc. A fork of an early, pre-release version of the tool has, however, been included in Alfresco Enterprise since v4.0, and that embedded fork is supported by Alfresco support.
Please note that the embedded fork has never been rebased against upstream, meaning that it is functionally equivalent to the 1.0-RC1 (ancient, circa mid-2010) version of the tool. Its use is therefore discouraged.
Copyright © Peter Monks. Licensed under the Apache 2.0 License.
- Contributors list
- Icon adapted from Appzgear on www.flaticon.com.
- Contributing file heavily inspired by the Atom project.
Please see Contributing.