svramusi/retrieve-and-rank-java

 
 


Retrieve and Rank

The IBM Watson Retrieve and Rank service helps users find the most relevant information for their queries by using a combination of search and machine learning algorithms to detect "signals" in the data. You load your data into the service, which is built on top of Apache Solr, and train a machine learning model. Then use the trained model to provide improved results to users.

Give it a try! Click the button below to fork into IBM DevOps Services and deploy your own copy of this application on Bluemix.
Deploy to Bluemix

View a demo of this app.

How it works

This application uses publicly available test data called the Cranfield collection. The collection contains abstracts of aerodynamics journal articles, a set of questions about aerodynamics, and labels that mark how relevant an article is to a question. Some questions are not used as training data, so you can use them to validate the performance of the trained ranker. This held-out subset of questions is used in the demo.
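
One common way to check a ranker against held-out questions is precision at k: the fraction of the top k returned documents that the labels mark as relevant. The sketch below illustrates that idea only. The `RankerValidation` class, the document IDs, and the choice of metric are assumptions for illustration and are not taken from this repository or the demo.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of this repository.
final class RankerValidation {

    /** Returns the fraction of the top-k ranked document IDs that are labeled relevant. */
    static double precisionAtK(List<String> rankedDocIds, Set<String> relevantDocIds, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        // Only inspect results that were actually returned.
        int limit = Math.min(k, rankedDocIds.size());
        int hits = 0;
        for (int i = 0; i < limit; i++) {
            if (relevantDocIds.contains(rankedDocIds.get(i))) {
                hits++;
            }
        }
        return (double) hits / k;
    }

    public static void main(String[] args) {
        // Made-up IDs standing in for the ranked answers to one held-out question.
        List<String> ranked = Arrays.asList("doc-1", "doc-2", "doc-3", "doc-4");
        Set<String> relevant = new HashSet<>(Arrays.asList("doc-1", "doc-3"));
        System.out.println("precision@4 = " + precisionAtK(ranked, relevant, 4));
    }
}
```

Averaging this value over all held-out questions gives a single number for comparing plain search results with re-ranked results.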

Before you begin

Ensure that you have the following prerequisites before you start:

Getting Started

  1. Create a Bluemix account.

    Sign up for Bluemix or use an existing account. Watson services in beta are free to use.

  2. Download and install the Cloud Foundry CLI tool.

  3. Edit the manifest.yml file and change the <application-name> to something unique.

applications:
- services:
  - retrieve-and-rank-service
  name: <application-name>
  path: webApp.war
  memory: 512M

The name you use determines your initial application URL, e.g., <application-name>.mybluemix.net.

  4. Connect to Bluemix in the command line tool.
$ cf api https://api.ng.bluemix.net
$ cf login -u <your-user-ID>
  5. Create the Retrieve and Rank service in Bluemix.
$ cf create-service retrieve_and_rank standard retrieve-and-rank-service
  6. Download and install Apache Maven.

  7. Build the project.

    Use Apache Maven to build the WAR file.

$ mvn install
  8. Push it live!
$ cf push -p target/webApp.war
  9. Configure the service to use the Cranfield collection, and train a ranker with the Cranfield data. See the tutorial in Getting started with the Retrieve and Rank service. As you complete the tutorial, save this information:
  • Solr cluster ID: The unique identifier of the Apache Solr cluster that you create.
  • Collection name: The name you give to the Solr collection when you create it.
  • Ranker ID: The unique identifier of the ranker you create.
  10. Use the values from the tutorial to specify environment variables in your app, as described in the following steps.

  11. Navigate to the application dashboard in Bluemix.

  12. Click the Retrieve and Rank application you created earlier.

  13. Click Environment Variables.

  14. Click USER-DEFINED.

  15. Add the following three environment variables with the values that you saved from the tutorial:

    • CLUSTER_ID
    • COLLECTION_NAME
    • RANKER_ID
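
Once the three variables are defined, Java code can read them from the process environment. The following is a minimal sketch of that lookup, assuming a hypothetical `RankerConfig` class that is not part of this repository. The application's own code may read these values differently.

```java
import java.util.Map;

// Hypothetical holder for the three identifiers saved from the tutorial.
final class RankerConfig {
    final String clusterId;
    final String collectionName;
    final String rankerId;

    RankerConfig(String clusterId, String collectionName, String rankerId) {
        this.clusterId = clusterId;
        this.collectionName = collectionName;
        this.rankerId = rankerId;
    }

    /** Builds a config from an environment map and fails fast if a variable is missing. */
    static RankerConfig fromEnv(Map<String, String> env) {
        return new RankerConfig(
                require(env, "CLUSTER_ID"),
                require(env, "COLLECTION_NAME"),
                require(env, "RANKER_ID"));
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // System.getenv() returns the variables of the running process.
        RankerConfig config = fromEnv(System.getenv());
        System.out.println("Cluster: " + config.clusterId
                + ", collection: " + config.collectionName
                + ", ranker: " + config.rankerId);
    }
}
```

Failing fast with a clear message makes a missing or misspelled variable name easy to spot in the application logs.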

Running locally

The application uses the WebSphere Liberty profile runtime as its server, so you need to download and install the profile as part of the steps below.

  1. Copy the credentials, CLUSTER_ID, COLLECTION_NAME and RANKER_ID from your retrieve-and-rank-service service in Bluemix to RetrieveAndRankResource.java.
    You can use the following command to see the credentials:

    $ cf env <application-name>

    Example output:

    System-Provided:
    {
      "VCAP_SERVICES": {
        "retrieve-and-rank": [{
          "credentials": {
            "url": "<url>",
            "password": "<password>",
            "username": "<username>"
          },
          "label": "retrieve-and-rank",
          "name": "retrieve-and-rank-service",
          "plan": "standard"
        }]
      }
    }
    User-Provided:
    CLUSTER_ID: xxxxxxxx_ca0e_zzzz_zzzz_95zzz3aa2404
    COLLECTION_NAME: ga
    RANKER_ID: F131F6-rank-10

    You need to copy the username, password, and url values.

  2. Create a Liberty profile server in Eclipse.

  3. Add the application to the server.

  4. Start the server.

  5. Go to http://localhost:9080/webApp to see the running application.
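
On Bluemix, the credentials shown in the `cf env` output reach the application through the `VCAP_SERVICES` environment variable. As a rough illustration of pulling out the three credential fields, the sketch below uses a hypothetical `VcapCredentials` class that is not part of this repository. It assumes a single bound service and simple string values without escaped quotes. A real application would normally use a JSON library instead of a regular expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of this repository.
final class VcapCredentials {
    final String url;
    final String username;
    final String password;

    VcapCredentials(String url, String username, String password) {
        this.url = url;
        this.username = username;
        this.password = password;
    }

    /** Extracts url, username, and password from VCAP_SERVICES-style JSON text. */
    static VcapCredentials parse(String vcapJson) {
        return new VcapCredentials(
                field(vcapJson, "url"),
                field(vcapJson, "username"),
                field(vcapJson, "password"));
    }

    // Finds the first "key": "value" pair; assumes the value has no escaped quotes.
    private static String field(String json, String key) {
        Pattern pattern = Pattern.compile("\"" + Pattern.quote(key) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher matcher = pattern.matcher(json);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No \"" + key + "\" field found");
        }
        return matcher.group(1);
    }

    public static void main(String[] args) {
        String vcap = System.getenv("VCAP_SERVICES");
        if (vcap == null) {
            System.out.println("VCAP_SERVICES is not set; running outside Bluemix?");
            return;
        }
        VcapCredentials credentials = parse(vcap);
        // Avoid printing the password; the URL and username are enough to confirm parsing.
        System.out.println("Service URL: " + credentials.url + ", user: " + credentials.username);
    }
}
```

When running locally, `VCAP_SERVICES` is usually not set, which is why the steps above copy the values into RetrieveAndRankResource.java by hand.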

Troubleshooting

When you troubleshoot your Bluemix application, the log files are the most useful source of information. To see them, run the following command:

$ cf logs <application-name> --recent

License

This sample code is licensed under Apache 2.0.
Full license text is available in LICENSE.

Contributing

See CONTRIBUTING.

Reference information

Open Source @ IBM

Find more open source projects on the IBM GitHub Page.
