This repository has been archived by the owner on Oct 5, 2020. It is now read-only.

Using Slush on top of MarkLogic Data Hub

Geert edited this page Sep 11, 2018 · 1 revision

MarkLogic Data Hub has gained a lot of popularity. Unfortunately, it doesn't come with a polished search app that can be extended for specific use cases. People often still like to use Slush-MarkLogic-Node together with MarkLogic Data Hub, but have trouble making the two work together properly. This page explains how best to approach this.

Architecture

The architecture that works best keeps Data Hub and your ui project fairly isolated. You likely want to reuse the final data, but apply your own modules and security layer. It is difficult to isolate the two completely, but a fairly decent split is easy to make: connect the ui to the Final database from Data Hub, and keep everything else separate.

The biggest catch is that once control over the Final database is out of your hands, you depend on Data Hub to provide the indexes needed for facets and the like.

Set up Data Hub

Set up your Data Hub project as you normally would. Consider a naming convention like ${app-name}-mdh-xxx for convenience and easy recognition, though that is not required. Note the name of the final database (the mlFinalDbName property). Since the schemas and triggers databases are linked to the content database, note those names too (the mlSchemasDbName and mlTriggersDbName properties).

Set up Slush

Generate a Slush project as you normally would, then make the following adjustments. Most importantly:

  • edit deploy/build.properties, and set content-db to the name of the final db of your Data Hub project.
  • also enable schemas-db and triggers-db, and set them to the matching Data Hub db names.
  • edit deploy/ml-config.xml, and remove the <database> section for content-db, as well as any forest <assignment> that links to it.
  • also remove the placeholders for the schemas and triggers forest assignments: @ml.schemas-assignment and @ml.triggers-assignment
  • as well as those for the schemas and triggers databases: @ml.schemas-db-xml and @ml.triggers-db-xml
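The first two bullets amount to copying three database names from the Data Hub project into deploy/build.properties. A minimal Ruby sketch of that mapping follows. It assumes the Data Hub properties are available as simple key=value text (for instance from the project's gradle.properties, which is an assumption here), and the sample database names are made up for illustration.

```ruby
# Data Hub property names (from the text above) mapped to the Roxy keys
# in deploy/build.properties.
HUB_TO_ROXY = {
  "mlFinalDbName"    => "content-db",
  "mlSchemasDbName"  => "schemas-db",
  "mlTriggersDbName" => "triggers-db"
}.freeze

# Parse simple key=value lines, skipping blank lines and comments.
def parse_properties(text)
  text.each_line.with_object({}) do |line, props|
    line = line.strip
    next if line.empty? || line.start_with?("#", "!")
    key, value = line.split("=", 2)
    next if value.nil?
    props[key.strip] = value.strip
  end
end

# Build the lines to put in deploy/build.properties; fails loudly if a
# Data Hub property is missing.
def roxy_overrides(hub_props)
  HUB_TO_ROXY.map do |hub_key, roxy_key|
    value = hub_props.fetch(hub_key) { raise KeyError, "missing #{hub_key}" }
    "#{roxy_key}=#{value}"
  end
end

# Hypothetical excerpt of Data Hub properties; the names are examples only.
sample = <<~PROPS
  # excerpt of a hypothetical Data Hub properties file
  mlFinalDbName=myapp-mdh-FINAL
  mlSchemasDbName=myapp-mdh-SCHEMAS
  mlTriggersDbName=myapp-mdh-TRIGGERS
PROPS

puts roxy_overrides(parse_properties(sample))
```

The printed lines can be pasted into deploy/build.properties in place of the generated content-db, schemas-db and triggers-db entries.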

With the database and forest references gone from ml-config.xml, bootstrap can no longer interfere with the databases and indexes managed by Data Hub. A few last tweaks remain. Schemas and triggers are managed by Data Hub too, and their documents likely don't get the correct permissions for the ui app-role. Also, the Roxy deployer that comes with Slush should not try to deploy triggers itself, so we override the original behavior:

  • edit 'deploy/app_specific.rb', and add these lines near the bottom of class ServerConfig:
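  # Run the stock schemas deploy, then re-apply document permissions in the schemas db.
  alias_method :original_deploy_schemas, :deploy_schemas
  def deploy_schemas
    original_deploy_schemas
    change_permissions(@properties["ml.schemas-db"])
  end

  # Skip the stock triggers deploy (Data Hub manages triggers); only re-apply permissions.
  alias_method :original_deploy_triggers, :deploy_triggers
  def deploy_triggers
    # original_deploy_triggers
    puts "Skipping deploy triggers.."
    change_permissions(@properties["ml.triggers-db"])
  end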

Note: deploy_schemas could be an interesting place to add deployment of ui-specific TDEs, though that could be managed from the Data Hub side as well.

Note: in case Data Hub does not deploy any triggers, and you need ui-specific ones, consider re-enabling original_deploy_triggers, and adding the necessary deploy/triggers-config.xml.
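Re-enabling the original triggers deploy, as the note above suggests, could look roughly like the sketch below. So that it runs on its own, it uses a hypothetical StubServerConfig in place of Roxy's real ServerConfig; the stub's method bodies only record calls and are not Roxy's implementation. In a real project only the second class body would go into deploy/app_specific.rb, inside class ServerConfig.

```ruby
# Stand-in for Roxy's ServerConfig, only so this sketch is runnable by itself.
# The method bodies are hypothetical and merely record what was called.
class StubServerConfig
  attr_reader :calls

  def initialize(properties)
    @properties = properties
    @calls = []
  end

  def deploy_triggers
    @calls << "deploy_triggers"
  end

  def change_permissions(db)
    @calls << "change_permissions(#{db})"
  end
end

# The override itself: keep a handle on the stock method, call it first,
# then re-apply permissions on the triggers database.
class StubServerConfig
  alias_method :original_deploy_triggers, :deploy_triggers

  def deploy_triggers
    original_deploy_triggers
    change_permissions(@properties["ml.triggers-db"])
  end
end

# The database name is an example value.
config = StubServerConfig.new("ml.triggers-db" => "myapp-mdh-TRIGGERS")
config.deploy_triggers
puts config.calls
```

The alias must be created before the method is redefined; otherwise original_deploy_triggers would point at the new method and recurse.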

After these steps you can use ./ml local install to bootstrap and deploy everything, or use the finer-grained commands to deploy sub-parts.
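As an illustration of the finer-grained route, the sketch below prints a sequence of Roxy sub-commands for a given environment without running them. Which sub-steps install actually covers is not stated on this page, so the list and its order are assumptions to adjust for your project.

```ruby
# Roxy sub-commands to run one by one. The selection and order are
# assumptions, not a statement of what "install" does internally.
STEPS = [
  %w[bootstrap],
  %w[deploy modules],
  %w[deploy schemas],
  %w[deploy triggers]
].freeze

# Build the command lines for one environment (e.g. "local").
def ml_commands(env)
  STEPS.map { |step| ["./ml", env, *step].join(" ") }
end

# Dry run: print the commands instead of executing them.
puts ml_commands("local")
```

With the overrides from app_specific.rb in place, the deploy schemas and deploy triggers steps go through the customized methods shown earlier.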