CreateJob

Erwan KOFFI edited this page Jun 20, 2018 · 2 revisions

Create job

This task allows you to create a brand new job on Data Fabric.

The task can be launched with the command gradle createJob.

Configuration

The configuration allows you to create multiple jobs from the same archive.

Configuration object

Once the saagie object is available in your project and the server is correctly set up, fill in the list of jobs to create.

saagie {
    server {...}

    jobs {[
            {
                name = <job_name>
                type = <job_type>
                category = <job_category>
                language = <job_language>
                languageVersion = <language_version>
                sparkVersion = <spark_version>
                cpu = <job_cpu>
                memory = <job_memory>
                disk = <job_disk>
                streaming = <streaming_flag>
                mainClass = <spark_main_class>
                arguments = <job_arguments>
                description = <job_description>
                releaseNote = <job_release_note>
                email = <job_email_notification>
                template = <command_template>
                idFile = <file_name>
            }
        ]}
    fileName = <archive_name>
}

Job creation is only supported for the following job types:

  • Java/Scala
  • Spark
  • Python
  • R
  • Talend
  • SQOOP
  • Docker

The following is not yet supported:

  • Notebooks

Properties explanation

  • name (mandatory)

    • The name of the job that will be created.
    • type: string
    • default: Generic Job Name
  • type (mandatory)

    • The type of the job to be created.
    • type: string
    • default: java-scala
    • accepted values: java-scala, spark, python, r, talend, sqoop, docker
  • category (mandatory)

    • The category in which the job will be created. dataviz corresponds to the Smart Apps category.
    • type: string
    • default: extract
    • accepted values: extract, processing, dataviz
  • fileName (mandatory)

    • The archive file name. It is set on the saagie object itself, not inside a job entry. If the target property is set, it will be used as the path to this archive.
    • type: string
    • default:
  • language

    • Only relevant for Spark jobs, which can be written in either a JVM language or Python.
    • type: string
    • default: java
    • accepted values: java, python
  • languageVersion

    • Job's language version. When multiple versions are available, this property should be set to one of the versions available on the platform.
    • type: string
    • default: 8.131
  • sparkVersion

    • Spark version to use. The chosen version must be available on the platform.
    • type: string
    • default: 2.1.0
  • cpu

    • Job's CPU allocation. Note that requesting too many resources may prevent your job from starting.
    • type: float
    • default: 0.3
  • memory

    • Job's memory allocation in MB. Note that requesting too many resources may prevent your job from starting.
    • type: int
    • default: 512
  • disk

    • Job's disk allocation in MB. Note that requesting too many resources may prevent your job from starting.
    • type: int
    • default: 512
  • streaming

    • Whether your job should restart automatically whenever it fails.
    • type: boolean
    • default: false
  • mainClass

    • Main class used at launch for JVM-based Spark jobs.
    • type: string
    • default:
  • arguments

    • Job's arguments can be set through this variable.
    • type: string
    • default:
  • description

    • Job's description.
    • type: string
    • default:
  • releaseNote

    • Job's release note.
    • type: string
    • default:
  • email

    • Email used for job's notifications.
    • type: string
    • default:
  • template

    • This replaces the whole script command. If provided, it overrides any of the template commands, so arguments and mainClass will not be taken into account.
    • type: string
    • default:
  • idFile

    • This property allows you to save the newly created job's id into the provided file. The complete path must be provided, and write permissions on both the file and its directory must be granted.
    • type: string
    • default:
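
As a minimal sketch, assuming the defaults listed above are applied whenever an optional property is omitted, a configuration that sets only the mandatory properties might look like the following. The job name and archive name are placeholders chosen for illustration, and server {...} is left elided as in the rest of this page.

```groovy
saagie {
    server {...}

    jobs {[
            {
                // Mandatory properties only. cpu, memory, disk,
                // languageVersion, etc. are assumed to fall back to
                // the defaults documented above (0.3, 512, 512, 8.131).
                name = 'Minimal example job'
                type = 'java-scala'
                category = 'extract'
            }
        ]}
    // Mandatory, set on the saagie object rather than inside a job entry.
    fileName = './my-archive.jar'
}
```

Setting cpu, memory and disk explicitly, as the examples below do, is still the safer choice when the job's resource needs are known.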

Examples

JVM

saagie {
    server {...}

    jobs {[
            {
                name = 'JVM Example job'
                type = 'java-scala'
                category = 'processing'
                languageVersion = '8.131'
                cpu = 1
                memory = 1024
                disk = 2048
                arguments = 'http://www.saagie.com'
                description = 'This is an example job for jvm based languages.'
                releaseNote = 'This release is fine.'
                email = 'someone@domain.ext'
                idFile = './jvm-example.id'
            }
        ]}
    fileName = './my-cool-archive.jar'
}
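
The idFile property above writes the new job's id to ./jvm-example.id. As a sketch of how that file could be read back later in the same build.gradle, the hypothetical task below prints its content. The task name printJobId is made up for this example and is not part of the plugin, and the code assumes the file holds the id as plain text.

```groovy
// Hypothetical helper task, not provided by the plugin.
task printJobId {
    doLast {
        // Same path as the idFile property in the example above.
        def idFile = file('./jvm-example.id')
        if (idFile.exists()) {
            // Assumption: the file contains only the job id as plain text.
            println "Created job id: ${idFile.text.trim()}"
        } else {
            println "No id file found; run 'gradle createJob' first."
        }
    }
}
```

It would be run with gradle printJobId after gradle createJob has completed.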

Python

saagie {
    server {...}

    jobs {[
            {
                name = 'Python Example job'
                type = 'python'
                category = 'processing'
                languageVersion = '3.5.2'
                cpu = 0.7
                memory = 512
                disk = 1024
                arguments = 'http://www.saagie.com'
                description = 'This is an example job for python.'
                releaseNote = 'This release is fine.'
                email = 'someone@domain.ext'
                idFile = './python-example.id'
            }
        ]}
    fileName = './my-cool-archive.zip'
}

R

saagie {
    server {...}

    jobs {[
            {
                name = 'R Example job'
                type = 'r'
                category = 'processing'
                cpu = 0.3
                memory = 324
                disk = 2000
                template = 'unzip my-cool-archive.zip && Rscript my-cool-script.R'
                description = 'This is an example job for r.'
                releaseNote = 'This release is fine.'
                email = 'someone@domain.ext'
                idFile = './r-example.id'
            }
        ]}
    fileName = './my-cool-archive.zip'
}

SQOOP

saagie {
    server {...}

    jobs {[
            {
                name = 'SQOOP Example job'
                type = 'sqoop'
                category = 'extract'
                cpu = 0.3
                memory = 324
                disk = 2000
                template = 'sh my-script.sh'
                description = 'This is an example job for sqoop.'
                releaseNote = 'This release is fine.'
                email = 'someone@domain.ext'
                idFile = './sqoop-example.id'
            }
        ]}
}

Docker

saagie {
    server {...}

    jobs {[
            {
                 name = 'nginx'
                 category = 'dataviz'
                 type = 'docker'
                 packageUrl = 'nginx'
                 externalPort = 80
                 externalSubDomain = 'hudebert'
                 streaming = true
                 auth = false
            }
        ]}
}

Spark

JVM

saagie {
    server {...}

    jobs {[
            {
                name = 'Spark JVM Example job'
                type = 'spark'
                category = 'processing'
                language = 'java'
                languageVersion = '8.131'
                sparkVersion = '2.1.0'
                cpu = 2
                memory = 2048
                disk = 4096
                arguments = 'http://www.saagie.com'
                description = 'This is a Spark example job for jvm based languages.'
                releaseNote = 'This release is fine.'
                email = 'someone@domain.ext'
                idFile = './spark-jvm-example.id'
            }
        ]}
    fileName = './my-cool-archive.jar'
}

Python

saagie {
    server {...}

    jobs {[
            {
                name = 'Pyspark Example job'
                type = 'spark'
                language = 'python'
                category = 'processing'
                languageVersion = '3.5.2'
                sparkVersion = '2.1.0'
                cpu = 2
                memory = 2048
                disk = 4096
                arguments = 'http://www.saagie.com'
                description = 'This is a PySpark example job.'
                releaseNote = 'This release is fine.'
                email = 'someone@domain.ext'
                idFile = './spark-python-example.id'
            }
        ]}
    fileName = './my-cool-archive.zip'
}

Multiple jobs

saagie {
    server {...}

    jobs {[
            {
                name = 'JVM Example job sister'
                type = 'java-scala'
                category = 'processing'
                languageVersion = '8.131'
                cpu = 2
                memory = 2048
                disk = 1024
                arguments = 'http://www.saagie.com'
                description = 'This is an example job.'
                releaseNote = 'This release is fine.'
                email = 'someone@domain.ext'
                idFile = './jvm-sister-example.id'
            },
            {
                name = 'JVM Example job brother'
                type = 'java-scala'
                category = 'processing'
                languageVersion = '8.131'
                cpu = 1.1
                memory = 1024
                disk = 2048
                arguments = 'https://www.saagie.com/fr'
                description = 'This is an example job bis.'
                releaseNote = 'This release is fine.'
                email = 'someone@domain.ext'
                idFile = './jvm-brother-example.id'
            }
        ]}
    fileName = './my-cool-archive.jar'
}