From 5665dcfb7dce4805d917efee01bef7acd3f170ab Mon Sep 17 00:00:00 2001
From: Jesang Yoon
Date: Mon, 18 Jan 2016 00:20:03 +0900
Subject: [PATCH 1/3] Fix wrong HTML tags, indention and space between
 paragraph and tables. Remove unnecessary spaces.
---
 docs/interpreter/cassandra.md | 1025 ++++++++++++++++-----------------
 1 file changed, 496 insertions(+), 529 deletions(-)

diff --git a/docs/interpreter/cassandra.md b/docs/interpreter/cassandra.md
index eded545e52d..75a5218a161 100644
--- a/docs/interpreter/cassandra.md
+++ b/docs/interpreter/cassandra.md
@@ -6,10 +6,9 @@ group: manual
 ---
 {% include JB/setup %}
-## 1. Cassandra CQL Interpreter for Apache Zeppelin -
+## Cassandra CQL Interpreter for Apache Zeppelin + @@ -23,81 +22,78 @@ group: manual
## Enabling Cassandra Interpreter

In a notebook, to enable the **Cassandra** interpreter, click on the **Gear** icon and select **Cassandra**.

![Interpreter Binding](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterBinding.png)

![Interpreter Selection](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterSelection.png)

## Using the Cassandra Interpreter

In a paragraph, use **_%cassandra_** to select the **Cassandra** interpreter and then input all commands.

To access the interactive help, type **HELP;**

![Interactive Help](../assets/themes/zeppelin/img/docs-img/cassandra-InteractiveHelp.png)

## Interpreter Commands

The **Cassandra** interpreter accepts the following commands:

| Command Type | Command Name | Description |
|--------------|--------------|-------------|
| Help command | HELP | Display the interactive help menu |
| Schema commands | DESCRIBE KEYSPACE, DESCRIBE CLUSTER, DESCRIBE TABLES ... | Custom commands to describe the Cassandra schema |
| Option commands | @consistency, @retryPolicy, @fetchSize ... | Inject runtime options to all statements in the paragraph |
| Prepared statement commands | @prepare, @bind, @remove_prepared | Let you register a prepared command and re-use it later by injecting bound values |
| Native CQL statements | All CQL-compatible statements (SELECT, INSERT, CREATE ...) | All CQL statements are executed directly against the Cassandra server |
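For instance, a single paragraph can mix an option command with plain CQL. This is only an illustrative sketch; the `spark_demo.artists` table is borrowed from the prepared-statement examples later in this guide:

```sql
@consistency=ONE
SELECT * FROM spark_demo.artists LIMIT 10;
```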
## CQL statements

This interpreter is compatible with any CQL statement supported by Cassandra. Ex:

```sql
INSERT INTO users(login,name) VALUES('jdoe','John DOE');
SELECT * FROM users WHERE login='jdoe';
```

Each statement should be separated by a semi-colon ( **;** ) except the special commands below:

1. @prepare
2. @bind
3. @remove_prepare
4. @consistency
5. @serialConsistency
6. @timestamp
7. @retryPolicy
8. @fetchSize

Multi-line statements, as well as multiple statements on the same line, are also supported as long as they are
separated by a semi-colon. Ex:

```sql
    ...
    WHERE login='jlennon';
```

Batch statements are supported and can span multiple lines, as well as DDL (CREATE/ALTER/DROP) statements
(a short illustrative sketch follows the version table below):

```sql
    ...
    );
```

CQL statements are case-insensitive (except for column names and values).
This means that the following statements are equivalent and valid:

```sql
    ...
```

The complete list of all CQL statements and versions can be found below:

| Cassandra Version | Documentation Link |
|-------------------|--------------------|
| 2.2 | http://docs.datastax.com/en/cql/3.3/cql/cqlIntro.html |
| 2.1 & 2.0 | http://docs.datastax.com/en/cql/3.1/cql/cql_intro_c.html |
| 1.2 | http://docs.datastax.com/en/cql/3.0/cql/aboutCQL.html |
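As announced above, a batch can span several lines and several statements can share one line, provided each statement ends with a semi-colon. This is only a sketch reusing the `users` table and values from the surrounding examples:

```sql
BEGIN BATCH
    INSERT INTO users(login,name) VALUES('jlennon','John LENNON');
    UPDATE users SET age = 27 WHERE login='hsue';
APPLY BATCH;

SELECT * FROM users WHERE login='jlennon'; SELECT * FROM users WHERE login='hsue';
```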
## Comments in statements

It is possible to add comments between statements. Single line comments start with the hash sign (#). Multi-line comments are enclosed between /** and **/. Ex:

```sql
    ...
    Insert into users(login,name) vAlues('hsue','Helen SUE');
```
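A minimal sketch showing both comment styles around a statement (the `users` table is the one from the earlier examples):

```sql
    #Single line comment
    /**
     Multi-line
     comment
    **/
    INSERT INTO users(login,name) VALUES('hsue','Helen SUE');
```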
## Syntax Validation

The interpreter ships with a built-in syntax validator. This validator only checks for basic syntax errors.
All CQL-related syntax validation is delegated directly to **Cassandra**.

Most of the time, syntax errors are due to **missing semi-colons** between statements or **typos**.
## Schema commands

To make schema discovery easier and more interactive, the following commands are supported:

| Command | Description |
|---------|-------------|
| `DESCRIBE CLUSTER;` | Show the current cluster name and its partitioner |
| `DESCRIBE KEYSPACES;` | List all existing keyspaces in the cluster and their configuration (replication factor, durable writes, ...) |
| `DESCRIBE TABLES;` | List all existing keyspaces in the cluster and, for each, all the table names |
| `DESCRIBE TYPES;` | List all existing user-defined types in the current (logged) keyspace |
| `DESCRIBE FUNCTIONS <keyspace_name>;` | List all existing user-defined functions in the given keyspace |
| `DESCRIBE AGGREGATES <keyspace_name>;` | List all existing user-defined aggregates in the given keyspace |
| `DESCRIBE KEYSPACE <keyspace_name>;` | Describe the given keyspace configuration and all its table details (name, columns, ...) |
| `DESCRIBE TABLE (<keyspace_name>).<table_name>;` | Describe the given table. If the keyspace is not provided, the current logged-in keyspace is used. If there is no logged-in keyspace, the default system keyspace is used. If no table is found, an error message is raised |
| `DESCRIBE TYPE (<keyspace_name>).<type_name>;` | Describe the given type (UDT). If the keyspace is not provided, the current logged-in keyspace is used. If there is no logged-in keyspace, the default system keyspace is used. If no type is found, an error message is raised |
| `DESCRIBE FUNCTION (<keyspace_name>).<function_name>;` | Describe the given user-defined function. The keyspace is optional |
| `DESCRIBE AGGREGATE (<keyspace_name>).<aggregate_name>;` | Describe the given user-defined aggregate. The keyspace is optional |

The schema objects (cluster, keyspace, table, type, function and aggregate) are displayed in a tabular format.
There is a drop-down menu on the top left corner to expand object details. The icon legend is shown in the top right menu.
![Describe Schema](../assets/themes/zeppelin/img/docs-img/cassandra-DescribeSchema.png)
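For example, to inspect a keyspace and then one of its tables, you could run the following; the `spark_demo` keyspace is assumed, as in the other examples of this guide:

```sql
DESCRIBE KEYSPACE spark_demo;
DESCRIBE TABLE spark_demo.artists;
```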
## Runtime Parameters

Sometimes you want to be able to pass runtime query parameters to your statements.
Those parameters are not part of the CQL specs and are specific to the interpreter.
Below is the list of all parameters:

| Parameter | Syntax | Description |
|-----------|--------|-------------|
| Consistency Level | @consistency=value | Apply the given consistency level to all queries in the paragraph |
| Serial Consistency Level | @serialConsistency=value | Apply the given serial consistency level to all queries in the paragraph |
| Timestamp | @timestamp=long value | Apply the given timestamp to all queries in the paragraph. Please note that a timestamp value passed directly in a CQL statement will override this value |
| Retry Policy | @retryPolicy=value | Apply the given retry policy to all queries in the paragraph |
| Fetch Size | @fetchSize=integer value | Apply the given fetch size to all queries in the paragraph |

Some parameters only accept restricted values:

| Parameter | Possible Values |
|-----------|-----------------|
| Consistency Level | ALL, ANY, ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM |
| Serial Consistency Level | SERIAL, LOCAL_SERIAL |
| Timestamp | Any long value |
| Retry Policy | DEFAULT, DOWNGRADING_CONSISTENCY, FALLTHROUGH, LOGGING_DEFAULT, LOGGING_DOWNGRADING, LOGGING_FALLTHROUGH |
| Fetch Size | Any integer value |

> Please note that you should **not** add a semi-colon ( **;** ) at the end of each parameter statement.

Some examples:

```sql
    ...
    # Check for the result. You should see 'first insert'
    SELECT value FROM spark_demo.ts WHERE key=1;
```

Some remarks about query parameters:

> 1. **many** query parameters can be set in the same paragraph
> 2. if the **same** query parameter is set many times with different values, the interpreter only takes into account the first value
> 3. each query parameter applies to **all CQL statements** in the same paragraph, unless you override the option using plain CQL text (like forcing timestamp with the USING clause)
> 4. the order of each query parameter with regard to CQL statements does not matter
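A condensed sketch combining several runtime parameters with plain CQL in a single paragraph; the `spark_demo.ts` table with `key`/`value` columns is taken from the example above:

```sql
@consistency=QUORUM
@retryPolicy=DOWNGRADING_CONSISTENCY
@fetchSize=100

INSERT INTO spark_demo.ts(key,value) VALUES(1, 'first insert');
SELECT value FROM spark_demo.ts WHERE key=1;
```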
## Support for Prepared Statements

For performance reasons, it is better to prepare statements beforehand and reuse them later by providing bound values.
This interpreter provides 3 commands to handle prepared and bound statements:

1. **@prepare**
2. **@bind**
3. **@remove_prepared**

Example:

```
    ...
    @remove_prepare[statement_name]
```
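A compact sketch of the three commands working together; the statement name `select_albums` is arbitrary and the `spark_demo.albums` table is reused from the examples below:

```
@prepare[select_albums]=SELECT * FROM spark_demo.albums LIMIT ?

@bind[select_albums]=10

@remove_prepare[select_albums]
```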
#### @prepare

You can use the syntax _"@prepare[statement_name]=SELECT ..."_ to create a prepared statement.
The _statement_name_ is **mandatory** because the interpreter prepares the given statement with the Java driver and
saves the generated prepared statement in an **internal hash map**, using the provided _statement_name_ as search key.

> Please note that this internal prepared statement map is shared with **all notebooks** and **all paragraphs** because
there is only one instance of the interpreter for Cassandra

> If the interpreter encounters **many** @prepare statements for the **same _statement_name_ (key)**, only the **first** statement will be taken into account.

Example:

```
@prepare[select]=SELECT * FROM spark_demo.albums LIMIT ?
@prepare[select]=SELECT * FROM spark_demo.artists LIMIT ?
```

For the above example, the prepared statement is _SELECT * FROM spark_demo.albums LIMIT ?_.
_SELECT * FROM spark_demo.artists LIMIT ?_ is ignored because an entry already exists in the prepared statements map with the key _select_.

In the context of **Zeppelin**, a notebook can be scheduled to be executed at regular intervals, so it is necessary to **avoid re-preparing the same statement many times (considered an anti-pattern)**.
#### @bind

Once the statement is prepared (possibly in a separate notebook/paragraph), you can bind values to it:

```
@bind[select_first]=10
```

Bound values are not mandatory for the **@bind** statement. However, if you provide bound values, they need to comply with the following syntax:

* ...
* **udt** values should be enclosed between brackets (see **[UDT CQL syntax]**): {street_name: ‘Beverly Hills’, number: 104, zip_code: 90020, state: ‘California’, …}

> It is possible to use the @bind statement inside a batch:
>
> ```sql
> BEGIN BATCH
>     @bind[insert_user]='jdoe','John DOE'
>     UPDATE users SET age = 27 WHERE login='hsue';
> APPLY BATCH;
> ```
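As an additional sketch, preparing an insert and binding mixed value types (strings and an integer); the `age` column is taken from the earlier UPDATE example:

```
@prepare[insert_user]=INSERT INTO users(login,name,age) VALUES(?,?,?)

@bind[insert_user]='jdoe','John DOE',35
```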
#### @remove_prepare

To avoid a prepared statement staying forever in the prepared statement map, you can use the
**@remove_prepare[statement_name]** syntax to remove it.
Removing a non-existing prepared statement yields no error.
## Using Dynamic Forms

Instead of hard-coding your CQL queries, it is possible to use the mustache syntax ( **\{\{ \}\}** ) to inject simple values or multiple-choice forms.

The syntax for a simple parameter is: **\{\{input_Label=default value\}\}**. The default value is mandatory because the first time the paragraph is executed,
we launch the CQL query before rendering the form, so at least one value should be provided.

The syntax for a multiple-choice parameter is: **\{\{input_Label=value1 | value2 | … | valueN \}\}**. By default the first choice is used for the CQL query
the first time the paragraph is executed.

Example:

{% raw %}

    #Secondary index on performer style
    SELECT name, country, styles
    FROM spark_demo.performers
    WHERE name='{{performer=Sheryl Crow|Doof|Fanfarlo|Los Paranoia}}'
    AND styles CONTAINS '{{style=Rock}}';

{% endraw %}

In the above example, the first CQL query will be executed for _performer='Sheryl Crow' AND style='Rock'_.
For subsequent queries, you can change the value directly using the form.

> Please note that we enclosed the **\{\{ \}\}** block between simple quotes ( **'** ) because Cassandra expects a String here.
> We could have also used the **\{\{style='Rock'\}\}** syntax, but this time the value displayed on the form is **_'Rock'_** and not **_Rock_**.

It is also possible to use dynamic forms for **prepared statements**:

{% raw %}

    @bind[select]=='{{performer=Sheryl Crow|Doof|Fanfarlo|Los Paranoia}}', '{{style=Rock}}'

{% endraw %}
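A one-line sketch of the quoted-value variant mentioned in the note above; with this syntax the form displays **_'Rock'_** (with quotes) as the default value:

{% raw %}

    SELECT name, country, styles
    FROM spark_demo.performers
    WHERE styles CONTAINS {{style='Rock'}};

{% endraw %}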
## Execution parallelism and shared states

It is possible to execute many paragraphs in parallel. However, on the back-end side, we’re still using synchronous queries.
_Asynchronous execution_ is only possible when it is possible to return a `Future` value in the `InterpreterResult`.
It may be an interesting proposal for the **Zeppelin** project.

Another caveat is that the same `com.datastax.driver.core.Session` object is used for **all** notebooks and paragraphs.
Consequently, if you use the **USE _keyspace name_;** statement to log into a keyspace, it will change the keyspace for
**all current users** of the **Cassandra** interpreter, because we only create one `com.datastax.driver.core.Session` object
per instance of the **Cassandra** interpreter.

The same remark applies to the **prepared statement hash map**: it is shared by **all users** using the same instance of the **Cassandra** interpreter.

Until **Zeppelin** offers real multi-user separation, there is a work-around to segregate user environments and states:
_create different **Cassandra** interpreter instances_.

For this, first go to the **Interpreter** menu and click on the **Create** button.
![Create Interpreter](../assets/themes/zeppelin/img/docs-img/cassandra-NewInterpreterInstance.png)

In the interpreter creation form, put **cass-instance2** as **Name** and select **cassandra**
in the interpreter drop-down list.

![Interpreter Name](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterName.png)

Click on **Save** to create the new interpreter instance. Now you should be able to see it in the interpreter list.

![Interpreter In List](../assets/themes/zeppelin/img/docs-img/cassandra-NewInterpreterInList.png)

Go back to your notebook and click on the **Gear** icon to configure interpreter bindings.
You should be able to see and select the **cass-instance2** interpreter instance in the available
interpreter list instead of the standard **cassandra** instance.

![Interpreter Instance Selection](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterInstanceSelection.png)
## Interpreter Configuration

To configure the **Cassandra** interpreter, go to the **Interpreter** menu and scroll down to change the parameters.
The **Cassandra** interpreter uses the official **[Cassandra Java Driver]** and most of the parameters are used
to configure the Java driver.

Below are the configuration parameters and their default values.

| Property Name | Description | Default Value |
|---------------|-------------|---------------|
| cassandra.cluster | Name of the Cassandra cluster to connect to | Test Cluster |
| cassandra.compression.protocol | On-wire compression. Possible values are: NONE, SNAPPY, LZ4 | NONE |
| cassandra.credentials.username | If security is enabled, provide the login | none |
| cassandra.credentials.password | If security is enabled, provide the password | none |
| cassandra.hosts | Comma-separated Cassandra hosts (DNS name or IP address). Ex: '192.168.0.12,node2,node3' | localhost |
| cassandra.interpreter.parallelism | Number of concurrent paragraphs (query blocks) that can be executed | 10 |
| cassandra.keyspace | Default keyspace to connect to. It is strongly recommended to let the default value and prefix the table name with the actual keyspace in all of your queries | system |
| cassandra.load.balancing.policy | Load balancing policy. Default = new TokenAwarePolicy(new DCAwareRoundRobinPolicy()). To specify your own policy, provide the fully qualified class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) | DEFAULT |
| cassandra.max.schema.agreement.wait.second | Cassandra max schema agreement wait in seconds | 10 |
| cassandra.pooling.core.connection.per.host.local | Protocol V2 and below default = 2. Protocol V3 and above default = 1 | 2 |
| cassandra.pooling.core.connection.per.host.remote | Protocol V2 and below default = 1. Protocol V3 and above default = 1 | 1 |
| cassandra.pooling.heartbeat.interval.seconds | Cassandra pool heartbeat interval in secs | 30 |
| cassandra.pooling.idle.timeout.seconds | Cassandra idle time out in seconds | 120 |
| cassandra.pooling.max.connection.per.host.local | Protocol V2 and below default = 8. Protocol V3 and above default = 1 | 8 |
| cassandra.pooling.max.connection.per.host.remote | Protocol V2 and below default = 2. Protocol V3 and above default = 1 | 2 |
| cassandra.pooling.max.request.per.connection.local | Protocol V2 and below default = 128. Protocol V3 and above default = 1024 | 128 |
| cassandra.pooling.max.request.per.connection.remote | Protocol V2 and below default = 128. Protocol V3 and above default = 256 | 128 |
| cassandra.pooling.new.connection.threshold.local | Protocol V2 and below default = 100. Protocol V3 and above default = 800 | 100 |
| cassandra.pooling.new.connection.threshold.remote | Protocol V2 and below default = 100. Protocol V3 and above default = 200 | 100 |
| cassandra.pooling.pool.timeout.millisecs | Cassandra pool time out in millisecs | 5000 |
| cassandra.protocol.version | Cassandra binary protocol version | 3 |
| cassandra.query.default.consistency | Cassandra query default consistency level. Available values: ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM, ALL | ONE |
| cassandra.query.default.fetchSize | Cassandra query default fetch size | 5000 |
| cassandra.query.default.serial.consistency | Cassandra query default serial consistency level. Available values: SERIAL, LOCAL_SERIAL | SERIAL |
| cassandra.reconnection.policy | Cassandra Reconnection Policy. Default = new ExponentialReconnectionPolicy(1000, 10 * 60 * 1000). To specify your own policy, provide the fully qualified class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) | DEFAULT |
| cassandra.retry.policy | Cassandra Retry Policy. Default = DefaultRetryPolicy.INSTANCE. To specify your own policy, provide the fully qualified class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) | DEFAULT |
| cassandra.socket.connection.timeout.millisecs | Cassandra socket default connection timeout in millisecs | 500 |
| cassandra.socket.read.timeout.millisecs | Cassandra socket read timeout in millisecs | 12000 |
| cassandra.socket.tcp.no_delay | Cassandra socket TCP no delay | true |
| cassandra.speculative.execution.policy | Cassandra Speculative Execution Policy. Default = NoSpeculativeExecutionPolicy.INSTANCE. To specify your own policy, provide the fully qualified class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) | DEFAULT |
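As an illustrative sketch only, a few of these properties written in simple key = value form; the custom retry policy FQCN is hypothetical:

```
cassandra.hosts = 192.168.0.12,node2,node3
cassandra.protocol.version = 3
cassandra.retry.policy = com.mycompany.policy.MyRetryPolicy
```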
## Bugs & Contacts

If you encounter a bug for this interpreter, please create a **[JIRA]** ticket and ping me on Twitter
at **[@doanduyhai]**.

[Cassandra Java Driver]: https://github.com/datastax/java-driver

From 5b091e4b0e7e1417744cde1c81f5a44d3330ba25 Mon Sep 17 00:00:00 2001
From: Jesang Yoon
Date: Mon, 18 Jan 2016 00:20:03 +0900
Subject: [PATCH 2/3] Fix wrong HTML tags, indention and space between
 paragraph and tables. Remove unnecessary spaces.
---
 docs/interpreter/cassandra.md     | 1037 ++++++++++++++---------------
 docs/interpreter/elasticsearch.md |   27 +-
 docs/interpreter/flink.md         |    7 +-
 docs/interpreter/geode.md         |   52 +-
 docs/interpreter/hive.md          |    3 +-
 docs/interpreter/ignite.md        |   10 +-
 docs/interpreter/lens.md          |   15 +-
 docs/interpreter/markdown.md      |    4 +-
 docs/interpreter/postgresql.md    |   17 +-
 docs/interpreter/scalding.md      |   16 +-
 docs/interpreter/spark.md         |   34 +-
 docs/manual/interpreters.md       |   10 +-
 12 files changed, 587 insertions(+), 645 deletions(-)

diff --git a/docs/interpreter/cassandra.md b/docs/interpreter/cassandra.md
index eded545e52d..2fd29cc5a54 100644
--- a/docs/interpreter/cassandra.md
+++ b/docs/interpreter/cassandra.md
@@ -6,10 +6,9 @@ group: manual
 ---
 {% include JB/setup %}
-## 1. Cassandra CQL Interpreter for Apache Zeppelin -
+## Cassandra CQL Interpreter for Apache Zeppelin + @@ -23,81 +22,78 @@ group: manual
-## 2. Enabling Cassandra Interpreter +## Enabling Cassandra Interpreter + +In a notebook, to enable the **Cassandra** interpreter, click on the **Gear** icon and select **Cassandra** - In a notebook, to enable the **Cassandra** interpreter, click on the **Gear** icon and select **Cassandra** - -
- ![Interpreter Binding](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterBinding.png) +
- ![Interpreter Selection](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterSelection.png) -
+![Interpreter Binding](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterBinding.png) -
- -## 3. Using the Cassandra Interpreter +![Interpreter Selection](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterSelection.png) - In a paragraph, use **_%cassandra_** to select the **Cassandra** interpreter and then input all commands. - - To access the interactive help, type **HELP;** - -
- ![Interactive Help](../assets/themes/zeppelin/img/docs-img/cassandra-InteractiveHelp.png) -
+
-
-## 4. Interpreter Commands +## Using the Cassandra Interpreter + +In a paragraph, use **_%cassandra_** to select the **Cassandra** interpreter and then input all commands. + +To access the interactive help, type **HELP;** - The **Cassandra** interpreter accepts the following commands -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Command TypeCommand NameDescription
Help commandHELPDisplay the interactive help menu
Schema commandsDESCRIBE KEYSPACE, DESCRIBE CLUSTER, DESCRIBE TABLES ...Custom commands to describe the Cassandra schema
Option commands@consistency, @retryPolicy, @fetchSize ...Inject runtime options to all statements in the paragraph
Prepared statement commands@prepare, @bind, @remove_preparedLet you register a prepared command and re-use it later by injecting bound values
Native CQL statementsAll CQL-compatible statements (SELECT, INSERT, CREATE ...)All CQL statements are executed directly against the Cassandra server
+![Interactive Help](../assets/themes/zeppelin/img/docs-img/cassandra-InteractiveHelp.png)
-
-## 5. CQL statements - -This interpreter is compatible with any CQL statement supported by Cassandra. Ex: + +## Interpreter Commands + +The **Cassandra** interpreter accepts the following commands + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Command TypeCommand NameDescription
Help commandHELPDisplay the interactive help menu
Schema commandsDESCRIBE KEYSPACE, DESCRIBE CLUSTER, DESCRIBE TABLES ...Custom commands to describe the Cassandra schema
Option commands@consistency, @retryPolicy, @fetchSize ...Inject runtime options to all statements in the paragraph
Prepared statement commands@prepare, @bind, @remove_preparedLet you register a prepared command and re-use it later by injecting bound values
Native CQL statementsAll CQL-compatible statements (SELECT, INSERT, CREATE ...)All CQL statements are executed directly against the Cassandra server
+ + +## CQL statements + +This interpreter is compatible with any CQL statement supported by Cassandra. Ex: ```sql INSERT INTO users(login,name) VALUES('jdoe','John DOE'); SELECT * FROM users WHERE login='jdoe'; -``` +``` Each statement should be separated by a semi-colon ( **;** ) except the special commands below: @@ -109,9 +105,8 @@ Each statement should be separated by a semi-colon ( **;** ) except the special 6. @timestamp 7. @retryPolicy 8. @fetchSize - -Multi-line statements as well as multiple statements on the same line are also supported as long as they are -separated by a semi-colon. Ex: + +Multi-line statements as well as multiple statements on the same line are also supported as long as they are separated by a semi-colon. Ex: ```sql @@ -124,7 +119,7 @@ separated by a semi-colon. Ex: WHERE login='jlennon'; ``` -Batch statements are supported and can span multiple lines, as well as DDL(CREATE/ALTER/DROP) statements: +Batch statements are supported and can span multiple lines, as well as DDL(CREATE/ALTER/DROP) statements: ```sql @@ -139,8 +134,8 @@ Batch statements are supported and can span multiple lines, as well as DDL(CREAT ); ``` -CQL statements are case-insensitive (except for column names and values). -This means that the following statements are equivalent and valid: +CQL statements are case-insensitive (except for column names and values). +This means that the following statements are equivalent and valid: ```sql @@ -149,47 +144,45 @@ This means that the following statements are equivalent and valid: ``` The complete list of all CQL statements and versions can be found below: -
- - - - - - - - - - - - - - - - - -
| Cassandra Version | Documentation Link |
|-------------------|--------------------|
| 2.2 | http://docs.datastax.com/en/cql/3.3/cql/cqlIntro.html |
| 2.1 & 2.0 | http://docs.datastax.com/en/cql/3.1/cql/cql_intro_c.html |
| 1.2 | http://docs.datastax.com/en/cql/3.0/cql/aboutCQL.html |
+ -## 6. Comments in statements +## Comments in statements -It is possible to add comments between statements. Single line comments start with the hash sign (#). Multi-line comments are enclosed between /** and **/. Ex: +It is possible to add comments between statements. Single line comments start with the hash sign (#). Multi-line comments are enclosed between /** and **/. Ex: ```sql @@ -203,171 +196,160 @@ It is possible to add comments between statements. Single line comments start wi Insert into users(login,name) vAlues('hsue','Helen SUE'); ``` -
## Syntax Validation

The interpreter ships with a built-in syntax validator. This validator only checks for basic syntax errors.
All CQL-related syntax validation is delegated directly to **Cassandra**.

Most of the time, syntax errors are due to **missing semi-colons** between statements or **typos**.
- -## 8. Schema commands + +## Schema commands To make schema discovery easier and more interactive, the following commands are supported: -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
CommandDescription
DESCRIBE CLUSTER;Show the current cluster name and its partitioner
DESCRIBE KEYSPACES;List all existing keyspaces in the cluster and their configuration (replication factor, durable write ...)
DESCRIBE TABLES;List all existing keyspaces in the cluster and for each, all the tables name
DESCRIBE TYPES;List all existing user defined types in the current (logged) keyspace
DESCRIBE FUNCTIONS <keyspace_name>;List all existing user defined functions in the given keyspace
DESCRIBE AGGREGATES <keyspace_name>;List all existing user defined aggregates in the given keyspace
DESCRIBE KEYSPACE <keyspace_name>;Describe the given keyspace configuration and all its table details (name, columns, ...)
DESCRIBE TABLE (<keyspace_name>).<table_name>; - Describe the given table. If the keyspace is not provided, the current logged in keyspace is used. - If there is no logged in keyspace, the default system keyspace is used. - If no table is found, an error message is raised -
DESCRIBE TYPE (<keyspace_name>).<type_name>; - Describe the given type(UDT). If the keyspace is not provided, the current logged in keyspace is used. - If there is no logged in keyspace, the default system keyspace is used. - If no type is found, an error message is raised -
DESCRIBE FUNCTION (<keyspace_name>).<function_name>;Describe the given user defined function. The keyspace is optional
DESCRIBE AGGREGATE (<keyspace_name>).<aggregate_name>;Describe the given user defined aggregate. The keyspace is optional
-
- -The schema objects (cluster, keyspace, table, type, function and aggregate) are displayed in a tabular format. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
CommandDescription
DESCRIBE CLUSTER;Show the current cluster name and its partitioner
DESCRIBE KEYSPACES;List all existing keyspaces in the cluster and their configuration (replication factor, durable write ...)
DESCRIBE TABLES;List all existing keyspaces in the cluster and for each, all the tables name
DESCRIBE TYPES;List all existing user defined types in the current (logged) keyspace
DESCRIBE FUNCTIONS <keyspace_name>;List all existing user defined functions in the given keyspace
DESCRIBE AGGREGATES <keyspace_name>;List all existing user defined aggregates in the given keyspace
DESCRIBE KEYSPACE <keyspace_name>;Describe the given keyspace configuration and all its table details (name, columns, ...)
DESCRIBE TABLE (<keyspace_name>).<table_name>; + Describe the given table. If the keyspace is not provided, the current logged in keyspace is used. + If there is no logged in keyspace, the default system keyspace is used. + If no table is found, an error message is raised +
DESCRIBE TYPE (<keyspace_name>).<type_name>; + Describe the given type(UDT). If the keyspace is not provided, the current logged in keyspace is used. + If there is no logged in keyspace, the default system keyspace is used. + If no type is found, an error message is raised +
DESCRIBE FUNCTION (<keyspace_name>).<function_name>;Describe the given user defined function. The keyspace is optional
DESCRIBE AGGREGATE (<keyspace_name>).<aggregate_name>;Describe the given user defined aggregate. The keyspace is optional
+ +The schema objects (cluster, keyspace, table, type, function and aggregate) are displayed in a tabular format. There is a drop-down menu on the top left corner to expand objects details. On the top right menu is shown the Icon legend. -
![Describe Schema](../assets/themes/zeppelin/img/docs-img/cassandra-DescribeSchema.png)
-
- -## 9. Runtime Parameters - -Sometimes you want to be able to pass runtime query parameters to your statements. -Those parameters are not part of the CQL specs and are specific to the interpreter. -Below is the list of all parameters: - -
-
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
ParameterSyntaxDescription
Consistency Level@consistency=valueApply the given consistency level to all queries in the paragraph
Serial Consistency Level@serialConsistency=valueApply the given serial consistency level to all queries in the paragraph
Timestamp@timestamp=long value - Apply the given timestamp to all queries in the paragraph. - Please note that timestamp value passed directly in CQL statement will override this value -
Retry Policy@retryPolicy=valueApply the given retry policy to all queries in the paragraph
Fetch Size@fetchSize=integer valueApply the given fetch size to all queries in the paragraph
-
- Some parameters only accept restricted values: - -
-
- - - - - - - - - - - - - - - - - - - - - - - - - -
ParameterPossible Values
Consistency LevelALL, ANY, ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM
Serial Consistency LevelSERIAL, LOCAL_SERIAL
TimestampAny long value
Retry PolicyDEFAULT, DOWNGRADING_CONSISTENCY, FALLTHROUGH, LOGGING_DEFAULT, LOGGING_DOWNGRADING, LOGGING_FALLTHROUGH
Fetch SizeAny integer value
-
- ->Please note that you should **not** add semi-colon ( **;** ) at the end of each parameter statement - -Some examples: +## Runtime Parameters + +Sometimes you want to be able to pass runtime query parameters to your statements. +Those parameters are not part of the CQL specs and are specific to the interpreter. +Below is the list of all parameters: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ParameterSyntaxDescription
Consistency Level@consistency=valueApply the given consistency level to all queries in the paragraph
Serial Consistency Level@serialConsistency=valueApply the given serial consistency level to all queries in the paragraph
Timestamp@timestamp=long value + Apply the given timestamp to all queries in the paragraph. + Please note that timestamp value passed directly in CQL statement will override this value +
Retry Policy@retryPolicy=valueApply the given retry policy to all queries in the paragraph
Fetch Size@fetchSize=integer valueApply the given fetch size to all queries in the paragraph
+ +Some parameters only accept restricted values: + + + + + + + + + + + + + + + + + + + + + + + + + + +
ParameterPossible Values
Consistency LevelALL, ANY, ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM
Serial Consistency LevelSERIAL, LOCAL_SERIAL
TimestampAny long value
Retry PolicyDEFAULT, DOWNGRADING_CONSISTENCY, FALLTHROUGH, LOGGING_DEFAULT, LOGGING_DOWNGRADING, LOGGING_FALLTHROUGH
Fetch SizeAny integer value
+ +> Please note that you should **not** add semi-colon ( **;** ) at the end of each parameter statement. + +Some examples: ```sql @@ -392,26 +374,25 @@ Some examples: # Check for the result. You should see 'first insert' SELECT value FROM spark_demo.ts WHERE key=1; ``` - + Some remarks about query parameters: - + > 1. **many** query parameters can be set in the same paragraph > 2. if the **same** query parameter is set many time with different values, the interpreter only take into account the first value > 3. each query parameter applies to **all CQL statements** in the same paragraph, unless you override the option using plain CQL text (like forcing timestamp with the USING clause) > 4. the order of each query parameter with regard to CQL statement does not matter -
-## 10. Support for Prepared Statements +## Support for Prepared Statements -For performance reason, it is better to prepare statements before-hand and reuse them later by providing bound values. -This interpreter provides 3 commands to handle prepared and bound statements: +For performance reason, it is better to prepare statements before-hand and reuse them later by providing bound values. +This interpreter provides 3 commands to handle prepared and bound statements: 1. **@prepare** 2. **@bind** 3. **@remove_prepared** -Example: +Example: ``` @@ -424,41 +405,39 @@ Example: @remove_prepare[statement_name] ``` -
-#### a. @prepare -
-You can use the syntax _"@prepare[statement_name]=SELECT ..."_ to create a prepared statement. -The _statement_name_ is **mandatory** because the interpreter prepares the given statement with the Java driver and +#### @prepare + +You can use the syntax _"@prepare[statement_name]=SELECT ..."_ to create a prepared statement. +The _statement_name_ is **mandatory** because the interpreter prepares the given statement with the Java driver and saves the generated prepared statement in an **internal hash map**, using the provided _statement_name_ as search key. - -> Please note that this internal prepared statement map is shared with **all notebooks** and **all paragraphs** because + +> Please note that this internal prepared statement map is shared with **all notebooks** and **all paragraphs** because there is only one instance of the interpreter for Cassandra - + > If the interpreter encounters **many** @prepare for the **same _statement_name_ (key)**, only the **first** statement will be taken into account. - -Example: + +Example: ``` @prepare[select]=SELECT * FROM spark_demo.albums LIMIT ? @prepare[select]=SELECT * FROM spark_demo.artists LIMIT ? -``` +``` -For the above example, the prepared statement is _SELECT * FROM spark_demo.albums LIMIT ?_. -_SELECT * FROM spark_demo.artists LIMIT ?_ is ignored because an entry already exists in the prepared statements map with the key select. +For the above example, the prepared statement is _SELECT * FROM spark_demo.albums LIMIT ?_. +_SELECT * FROM spark_demo.artists LIMIT ?_ is ignored because an entry already exists in the prepared statements map with the key select. In the context of **Zeppelin**, a notebook can be scheduled to be executed at regular interval, thus it is necessary to **avoid re-preparing many time the same statement (considered an anti-pattern)**. -
-
-#### b. @bind -
-Once the statement is prepared (possibly in a separated notebook/paragraph). You can bind values to it: + +#### @bind + +Once the statement is prepared (possibly in a separated notebook/paragraph). You can bind values to it ``` @bind[select_first]=10 -``` +``` Bound values are not mandatory for the **@bind** statement. However if you provide bound values, they need to comply to some syntax: @@ -476,35 +455,32 @@ Bound values are not mandatory for the **@bind** statement. However if you provi * **udt** values should be enclosed between brackets (see **[UDT CQL syntax]**): {stree_name: ‘Beverly Hills’, number: 104, zip_code: 90020, state: ‘California’, …} > It is possible to use the @bind statement inside a batch: -> +> > ```sql -> +> > BEGIN BATCH > @bind[insert_user]='jdoe','John DOE' > UPDATE users SET age = 27 WHERE login='hsue'; > APPLY BATCH; > ``` -
-#### c. @remove_prepare -
+#### @remove_prepare + To avoid for a prepared statement to stay forever in the prepared statement map, you can use the -**@remove_prepare[statement_name]** syntax to remove it. +**@remove_prepare[statement_name]** syntax to remove it. Removing a non-existing prepared statement yields no error. -
-## 11. Using Dynamic Forms +## Using Dynamic Forms -Instead of hard-coding your CQL queries, it is possible to use the mustache syntax ( **\{\{ \}\}** ) to inject simple value or multiple choices forms. +Instead of hard-coding your CQL queries, it is possible to use the mustache syntax ( **\{\{ \}\}** ) to inject simple value or multiple choices forms. The syntax for simple parameter is: **\{\{input_Label=default value\}\}**. The default value is mandatory because the first time the paragraph is executed, -we launch the CQL query before rendering the form so at least one value should be provided. +we launch the CQL query before rendering the form so at least one value should be provided. -The syntax for multiple choices parameter is: **\{\{input_Label=value1 | value2 | … | valueN \}\}**. By default the first choice is used for CQL query -the first time the paragraph is executed. +The syntax for multiple choices parameter is: **\{\{input_Label=value1 | value2 | … | valueN \}\}**. By default the first choice is used for CQL query the first time the paragraph is executed. -Example: +Example: {% raw %} #Secondary index on performer style @@ -513,290 +489,273 @@ Example: WHERE name='{{performer=Sheryl Crow|Doof|Fanfarlo|Los Paranoia}}' AND styles CONTAINS '{{style=Rock}}'; {% endraw %} - -In the above example, the first CQL query will be executed for _performer='Sheryl Crow' AND style='Rock'_. -For subsequent queries, you can change the value directly using the form. +In the above example, the first CQL query will be executed for _performer='Sheryl Crow' AND style='Rock'_. +For subsequent queries, you can change the value directly using the form. -> Please note that we enclosed the **\{\{ \}\}** block between simple quotes ( **'** ) because Cassandra expects a String here. -> We could have also use the **\{\{style='Rock'\}\}** syntax but this time, the value displayed on the form is **_'Rock'_** and not **_Rock_**. +> Please note that we enclosed the **\{\{ \}\}** block between simple quotes ( **'** ) because Cassandra expects a String here. +> We could have also use the **\{\{style='Rock'\}\}** syntax but this time, the value displayed on the form is **_'Rock'_** and not **_Rock_**. -It is also possible to use dynamic forms for **prepared statements**: +It is also possible to use dynamic forms for **prepared statements**: {% raw %} @bind[select]=='{{performer=Sheryl Crow|Doof|Fanfarlo|Los Paranoia}}', '{{style=Rock}}' - + {% endraw %} -
-## 12. Execution parallelism and shared states +## Execution parallelism and shared states -It is possible to execute many paragraphs in parallel. However, at the back-end side, we’re still using synchronous queries. -_Asynchronous execution_ is only possible when it is possible to return a `Future` value in the `InterpreterResult`. +It is possible to execute many paragraphs in parallel. However, at the back-end side, we’re still using synchronous queries. +_Asynchronous execution_ is only possible when it is possible to return a `Future` value in the `InterpreterResult`. It may be an interesting proposal for the **Zeppelin** project. Another caveat is that the same `com.datastax.driver.core.Session` object is used for **all** notebooks and paragraphs. -Consequently, if you use the **USE _keyspace name_;** statement to log into a keyspace, it will change the keyspace for -**all current users** of the **Cassandra** interpreter because we only create 1 `com.datastax.driver.core.Session` object -per instance of **Cassandra** interpreter. +Consequently, if you use the **USE _keyspace name_;** statement to log into a keyspace, it will change the keyspace for **all current users** of the **Cassandra** interpreter because we only create 1 `com.datastax.driver.core.Session` object per instance of **Cassandra** interpreter. The same remark does apply to the **prepared statement hash map**, it is shared by **all users** using the same instance of **Cassandra** interpreter. -Until **Zeppelin** offers a real multi-users separation, there is a work-around to segregate user environment and states: +Until **Zeppelin** offers a real multi-users separation, there is a work-around to segregate user environment and states: _create different **Cassandra** interpreter instances_ For this, first go to the **Interpreter** menu and click on the **Create** button -
-
+
![Create Interpreter](../assets/themes/zeppelin/img/docs-img/cassandra-NewInterpreterInstance.png)
- -In the interpreter creation form, put **cass-instance2** as **Name** and select the **cassandra** -in the interpreter drop-down list -
-
+ +In the interpreter creation form, put **cass-instance2** as **Name** and select the **cassandra** in the interpreter drop-down list +
![Interpreter Name](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterName.png) -
+ + +Click on **Save** to create the new interpreter instance. Now you should be able to see it in the interpreter list. - Click on **Save** to create the new interpreter instance. Now you should be able to see it in the interpreter list. - -
-
![Interpreter In List](../assets/themes/zeppelin/img/docs-img/cassandra-NewInterpreterInList.png) -
+ Go back to your notebook and click on the **Gear** icon to configure interpreter bindings. -You should be able to see and select the **cass-instance2** interpreter instance in the available -interpreter list instead of the standard **cassandra** instance. +You should be able to see and select the **cass-instance2** interpreter instance in the available interpreter list instead of the standard **cassandra** instance. -
-
![Interpreter Instance Selection](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterInstanceSelection.png) -
+ -
-## 13. Interpreter Configuration +## Interpreter Configuration To configure the **Cassandra** interpreter, go to the **Interpreter** menu and scroll down to change the parameters. -The **Cassandra** interpreter is using the official **[Cassandra Java Driver]** and most of the parameters are used -to configure the Java driver +The **Cassandra** interpreter is using the official **[Cassandra Java Driver]** and most of the parameters are used to configure the Java driver Below are the configuration parameters and their default value. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Property NameDescriptionDefault Value
cassandra.clusterName of the Cassandra cluster to connect toTest Cluster
cassandra.compression.protocolOn wire compression. Possible values are: NONE, SNAPPY, LZ4NONE
cassandra.credentials.usernameIf security is enable, provide the loginnone
cassandra.credentials.passwordIf security is enable, provide the passwordnone
cassandra.hosts - Comma separated Cassandra hosts (DNS name or IP address). -
- Ex: '192.168.0.12,node2,node3' -
localhost
cassandra.interpreter.parallelismNumber of concurrent paragraphs(queries block) that can be executed10
cassandra.keyspace - Default keyspace to connect to. - - It is strongly recommended to let the default value - and prefix the table name with the actual keyspace - in all of your queries - - system
cassandra.load.balancing.policy - Load balancing policy. Default = new TokenAwarePolicy(new DCAwareRoundRobinPolicy()) - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. - At runtime the interpreter will instantiate the policy using - Class.forName(FQCN) - DEFAULT
cassandra.max.schema.agreement.wait.secondCassandra max schema agreement wait in second10
cassandra.pooling.core.connection.per.host.localProtocol V2 and below default = 2. Protocol V3 and above default = 12
cassandra.pooling.core.connection.per.host.remoteProtocol V2 and below default = 1. Protocol V3 and above default = 11
cassandra.pooling.heartbeat.interval.secondsCassandra pool heartbeat interval in secs30
cassandra.pooling.idle.timeout.secondsCassandra idle time out in seconds120
cassandra.pooling.max.connection.per.host.localProtocol V2 and below default = 8. Protocol V3 and above default = 18
cassandra.pooling.max.connection.per.host.remoteProtocol V2 and below default = 2. Protocol V3 and above default = 12
cassandra.pooling.max.request.per.connection.localProtocol V2 and below default = 128. Protocol V3 and above default = 1024128
cassandra.pooling.max.request.per.connection.remoteProtocol V2 and below default = 128. Protocol V3 and above default = 256128
cassandra.pooling.new.connection.threshold.localProtocol V2 and below default = 100. Protocol V3 and above default = 800100
cassandra.pooling.new.connection.threshold.remoteProtocol V2 and below default = 100. Protocol V3 and above default = 200100
cassandra.pooling.pool.timeout.millisecsCassandra pool time out in millisecs5000
cassandra.protocol.versionCassandra binary protocol version3
cassandra.query.default.consistency - Cassandra query default consistency level -
- Available values: ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM, ALL -
ONE
cassandra.query.default.fetchSizeCassandra query default fetch size5000
cassandra.query.default.serial.consistency - Cassandra query default serial consistency level + + + + + + + + + + + + + + + + + + + + + + + + + + + + + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Property NameDescriptionDefault Value
cassandra.clusterName of the Cassandra cluster to connect toTest Cluster
cassandra.compression.protocolOn wire compression. Possible values are: NONE, SNAPPY, LZ4NONE
cassandra.credentials.usernameIf security is enable, provide the loginnone
cassandra.credentials.passwordIf security is enable, provide the passwordnone
cassandra.hosts + Comma separated Cassandra hosts (DNS name or IP address).
- Available values: SERIAL, LOCAL_SERIAL -
SERIAL
cassandra.reconnection.policy - Cassandra Reconnection Policy. - Default = new ExponentialReconnectionPolicy(1000, 10 * 60 * 1000) - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. - At runtime the interpreter will instantiate the policy using - Class.forName(FQCN) - DEFAULT
cassandra.retry.policy - Cassandra Retry Policy. - Default = DefaultRetryPolicy.INSTANCE - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. - At runtime the interpreter will instantiate the policy using - Class.forName(FQCN) - DEFAULT
cassandra.socket.connection.timeout.millisecsCassandra socket default connection timeout in millisecs500
cassandra.socket.read.timeout.millisecsCassandra socket read timeout in millisecs12000
cassandra.socket.tcp.no_delayCassandra socket TCP no delaytrue
cassandra.speculative.execution.policy - Cassandra Speculative Execution Policy. - Default = NoSpeculativeExecutionPolicy.INSTANCE - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. - At runtime the interpreter will instantiate the policy using - Class.forName(FQCN) - DEFAULT
- -
- -## 14. Bugs & Contacts - - If you encounter a bug for this interpreter, please create a **[JIRA]** ticket and ping me on Twitter - at **[@doanduyhai]** + Ex: '192.168.0.12,node2,node3' +
localhost
cassandra.interpreter.parallelismNumber of concurrent paragraphs(queries block) that can be executed10
cassandra.keyspace + Default keyspace to connect to. + + It is strongly recommended to let the default value + and prefix the table name with the actual keyspace + in all of your queries + + system
cassandra.load.balancing.policy + Load balancing policy. Default = new TokenAwarePolicy(new DCAwareRoundRobinPolicy()) + To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. + At runtime the interpreter will instantiate the policy using + Class.forName(FQCN) + DEFAULT
cassandra.max.schema.agreement.wait.secondCassandra max schema agreement wait in second10
cassandra.pooling.core.connection.per.host.localProtocol V2 and below default = 2. Protocol V3 and above default = 12
cassandra.pooling.core.connection.per.host.remoteProtocol V2 and below default = 1. Protocol V3 and above default = 11
cassandra.pooling.heartbeat.interval.secondsCassandra pool heartbeat interval in secs30
cassandra.pooling.idle.timeout.secondsCassandra idle time out in seconds120
cassandra.pooling.max.connection.per.host.localProtocol V2 and below default = 8. Protocol V3 and above default = 18
cassandra.pooling.max.connection.per.host.remoteProtocol V2 and below default = 2. Protocol V3 and above default = 12
cassandra.pooling.max.request.per.connection.localProtocol V2 and below default = 128. Protocol V3 and above default = 1024128
cassandra.pooling.max.request.per.connection.remoteProtocol V2 and below default = 128. Protocol V3 and above default = 256128
cassandra.pooling.new.connection.threshold.localProtocol V2 and below default = 100. Protocol V3 and above default = 800100
cassandra.pooling.new.connection.threshold.remoteProtocol V2 and below default = 100. Protocol V3 and above default = 200100
cassandra.pooling.pool.timeout.millisecsCassandra pool time out in millisecs5000
cassandra.protocol.versionCassandra binary protocol version3
cassandra.query.default.consistency + Cassandra query default consistency level +
+ Available values: ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM, ALL +
ONE
cassandra.query.default.fetchSizeCassandra query default fetch size5000
cassandra.query.default.serial.consistency + Cassandra query default serial consistency level +
+ Available values: SERIAL, LOCAL_SERIAL +
SERIAL
cassandra.reconnection.policy + Cassandra Reconnection Policy. + Default = new ExponentialReconnectionPolicy(1000, 10 * 60 * 1000) + To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. + At runtime the interpreter will instantiate the policy using + Class.forName(FQCN) + DEFAULT
cassandra.retry.policy + Cassandra Retry Policy. + Default = DefaultRetryPolicy.INSTANCE + To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. + At runtime the interpreter will instantiate the policy using + Class.forName(FQCN) + DEFAULT
cassandra.socket.connection.timeout.millisecsCassandra socket default connection timeout in millisecs500
cassandra.socket.read.timeout.millisecsCassandra socket read timeout in millisecs12000
cassandra.socket.tcp.no_delayCassandra socket TCP no delaytrue
cassandra.speculative.execution.policy + Cassandra Speculative Execution Policy. + Default = NoSpeculativeExecutionPolicy.INSTANCE + To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. + At runtime the interpreter will instantiate the policy using + Class.forName(FQCN) + DEFAULT
+ + +## Bugs & Contacts + +If you encounter a bug for this interpreter, please create a **[JIRA]** ticket and ping me on Twitter at **[@doanduyhai]** [Cassandra Java Driver]: https://github.com/datastax/java-driver diff --git a/docs/interpreter/elasticsearch.md b/docs/interpreter/elasticsearch.md index 34a23ba3543..23459433c68 100644 --- a/docs/interpreter/elasticsearch.md +++ b/docs/interpreter/elasticsearch.md @@ -9,9 +9,8 @@ group: manual ## Elasticsearch Interpreter for Apache Zeppelin -### 1. Configuration +### Configuration -
@@ -49,18 +48,11 @@ group: manual > Note #2: if you use Shield, you can add a property named `shield.user` with a value containing the name and the password (format: `username:password`). For more details about Shield configuration, consult the [Shield reference guide](https://www.elastic.co/guide/en/shield/current/_using_elasticsearch_java_clients_with_shield.html). Do not forget, to copy the shield client jar in the interpreter directory (`ZEPPELIN_HOME/interpreters/elasticsearch`). - -
- -### 2. Enabling the Elasticsearch Interpreter +### Enabling the Elasticsearch Interpreter In a notebook, to enable the **Elasticsearch** interpreter, click the **Gear** icon and select **Elasticsearch**. - -
- - -### 3. Using the Elasticsearch Interpreter +### Using the Elasticsearch Interpreter In a paragraph, use `%elasticsearch` to select the Elasticsearch interpreter and then input all commands. To get the list of available commands, use `help`. @@ -88,7 +80,6 @@ Commands: > Tip: use (CTRL + .) for completion - #### get With the `get` command, you can find a document by id. The result is a JSON document. @@ -100,7 +91,6 @@ With the `get` command, you can find a document by id. The result is a JSON docu Example: ![Elasticsearch - Get](../assets/themes/zeppelin/img/docs-img/elasticsearch-get.png) - #### search With the `search` command, you can send a search query to Elasticsearch. There are two formats of query: @@ -110,7 +100,6 @@ With the `search` command, you can send a search query to Elasticsearch. There a * This is a shortcut to a query like that: `{ "query": { "query_string": { "query": "__HERE YOUR QUERY__", "analyze_wildcard": true } } }` * See [Elasticsearch query string syntax](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax) for more details about the content of such a query. - ```bash | %elasticsearch | search /index1,index2,.../type1,type2,... @@ -124,10 +113,8 @@ If you want to modify the size of the result set, you can add a line that is set | search /index1,index2,.../type1,type2,... ``` - > A search query can also contain [aggregations](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html). If there is at least one aggregation, the result of the first aggregation is shown, otherwise, you get the search hits. - Examples: * With a JSON query: @@ -146,7 +133,7 @@ Examples: | "field": "content_length" | } | } -| } } +| } } ``` * With query_string elements: @@ -179,12 +166,10 @@ Suppose we have a JSON document: The data will be flattened like this: - content_length | date | request.headers[0] | request.headers[1] | request.method | request.url | status ---------------|------|--------------------|--------------------|----------------|-------------|------- 1234 | 2015-12-08T21:03:13.588Z | Accept: \*.\* | Host: apache.org | GET | /zeppelin/4cd001cd-c517-4fa9-b8e5-a06b8f4056c4 | 403 - Examples: * With a table containing the results: @@ -205,7 +190,6 @@ Examples: * With a query containing a multi-bucket aggregation: ![Elasticsearch - Search with aggregation (multi-bucket)](../assets/themes/zeppelin/img/docs-img/elasticsearch-agg-multi-bucket-pie.png) - #### count With the `count` command, you can count documents available in some indices and types. You can also provide a query. @@ -222,7 +206,6 @@ Examples: * With a query: ![Elasticsearch - Count with query](../assets/themes/zeppelin/img/docs-img/elasticsearch-count-with-query.png) - #### index With the `index` command, you can insert/update a document in Elasticsearch. @@ -242,8 +225,6 @@ With the `delete` command, you can delete a document. | delete /index/type/id ``` - - #### Apply Zeppelin Dynamic Forms You can leverage [Zeppelin Dynamic Form]({{BASE_PATH}}/manual/dynamicform.html) inside your queries. 
You can use both the `text input` and `select form` parameterization features diff --git a/docs/interpreter/flink.md b/docs/interpreter/flink.md index ce1f7800814..4baa0b883ba 100644 --- a/docs/interpreter/flink.md +++ b/docs/interpreter/flink.md @@ -8,13 +8,15 @@ group: manual ## Flink interpreter for Apache Zeppelin -[Apache Flink](https://flink.apache.org) is an open source platform for distributed stream and batch data processing. +[Apache Flink](https://flink.apache.org) is an open source platform for distributed stream and batch data processing. ### How to start local Flink cluster, to test the interpreter + Zeppelin comes with pre-configured flink-local interpreter, which starts Flink in a local mode on your machine, so you do not need to install anything. ### How to configure interpreter to point to Flink cluster + At the "Interpreters" menu, you have to create a new Flink interpreter and provide next properties:
Property
@@ -39,14 +41,11 @@ At the "Interpreters" menu, you have to create a new Flink interpreter and provi
anything else from [Flink Configuration](https://ci.apache.org/projects/flink/flink-docs-release-0.9/setup/config.html)
-
- ### How to test it's working In example, by using the [Zeppelin notebook](https://www.zeppelinhub.com/viewer/notebooks/aHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL05GTGFicy96ZXBwZWxpbi1ub3RlYm9va3MvbWFzdGVyL25vdGVib29rcy8yQVFFREs1UEMvbm90ZS5qc29u) is from [Till Rohrmann's presentation](http://www.slideshare.net/tillrohrmann/data-analysis-49806564) "Interactive data analysis with Apache Flink" for Apache Flink Meetup. - ``` %sh rm 10.txt.utf-8 diff --git a/docs/interpreter/geode.md b/docs/interpreter/geode.md index 250495aaf2f..1717488e440 100644 --- a/docs/interpreter/geode.md +++ b/docs/interpreter/geode.md @@ -9,7 +9,6 @@ group: manual ## Geode/Gemfire OQL Interpreter for Apache Zeppelin -
@@ -23,7 +22,6 @@ group: manual
Name
-
This interpreter supports the [Geode](http://geode.incubator.apache.org/) [Object Query Language (OQL)](http://geode-docs.cfapps.io/docs/developing/querying_basics/oql_compared_to_sql.html). With the OQL-based querying language: [zeppelin-view](https://www.youtube.com/watch?v=zvzzA9GXu3Q) @@ -48,34 +46,35 @@ To create new Geode instance open the `Interpreter` section and click the `+Crea > Note: The `Name` of the instance is used only to distinguish the instances while binding them to the `Notebook`. The `Name` is irrelevant inside the `Notebook`. In the `Notebook` you must use `%geode.oql` tag. ### Bind to Notebook + In the `Notebook` click on the `settings` icon in the top right corner. The select/deselect the interpreters to be bound with the `Notebook`. ### Configuration -You can modify the configuration of the Geode from the `Interpreter` section. The Geode interpreter expresses the following properties: +You can modify the configuration of the Geode from the `Interpreter` section. The Geode interpreter expresses the following properties: - - - - - - - - - - - - - - - - - - - - - -
Property NameDescriptionDefault Value
geode.locator.hostThe Geode Locator Hostlocalhost
geode.locator.portThe Geode Locator Port10334
geode.max.resultMax number of OQL result to display to prevent the browser overload1000
+ + + + + + + + + + + + + + + + + + + + + +
Property NameDescriptionDefault Value
geode.locator.hostThe Geode Locator Hostlocalhost
geode.locator.portThe Geode Locator Port10334
geode.max.resultMax number of OQL result to display to prevent the browser overload1000
### How to use @@ -107,7 +106,6 @@ Above snippet re-creates two regions: `regionEmployee` and `regionCompany`. Note #### Basic OQL - ```sql %geode.oql SELECT count(*) FROM /regionEmployee @@ -144,10 +142,8 @@ Following query will return the EntrySet value as a Blob: SELECT e.key, e.value FROM /regionEmployee.entrySet e ``` - > Note: You can have multiple queries in the same paragraph but only the result from the first is displayed. [[1](https://issues.apache.org/jira/browse/ZEPPELIN-178)], [[2](https://issues.apache.org/jira/browse/ZEPPELIN-212)]. - #### GFSH Commands From The Shell Use the Shell Interpreter (`%sh`) to run OQL commands form the command line: diff --git a/docs/interpreter/hive.md b/docs/interpreter/hive.md index b37c421de12..437700c82dd 100644 --- a/docs/interpreter/hive.md +++ b/docs/interpreter/hive.md @@ -11,7 +11,6 @@ group: manual ### Configuration -
@@ -71,7 +70,7 @@ group: manual
Property
This interpreter provides multiple configuration with ${prefix}. User can set a multiple connection properties by this prefix. It can be used like `%hive(${prefix})`. - + ### How to use Basically, you can use diff --git a/docs/interpreter/ignite.md b/docs/interpreter/ignite.md index 963c7d96990..418765f9c77 100644 --- a/docs/interpreter/ignite.md +++ b/docs/interpreter/ignite.md @@ -6,9 +6,11 @@ group: manual --- {% include JB/setup %} + ## Ignite Interpreter for Apache Zeppelin ### Overview + [Apache Ignite](https://ignite.apache.org/) In-Memory Data Fabric is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies. ![Apache Ignite](../assets/themes/zeppelin/img/docs-img/ignite-logo.png) @@ -16,6 +18,7 @@ group: manual You can use Zeppelin to retrieve distributed data from cache using Ignite SQL interpreter. Moreover, Ignite interpreter allows you to execute any Scala code in cases when SQL doesn't fit to your requirements. For example, you can populate data into your caches or execute distributed computations. ### Installing and Running Ignite example + In order to use Ignite interpreters, you may install Apache Ignite in some simple steps: 1. Download Ignite [source release](https://ignite.apache.org/download.html#sources) or [binary release](https://ignite.apache.org/download.html#binaries) whatever you want. But you must download Ignite as the same version of Zeppelin's. If it is not, you can't use scala code on Zeppelin. You can find ignite version in Zepplin at the pom.xml which is placed under `path/to/your-Zeppelin/ignite/pom.xml` ( Of course, in Zeppelin source release ). Please check `ignite.version` .
Currently, Zeppelin provides ignite only in Zeppelin source release. So, if you download Zeppelin binary release( `zeppelin-0.5.0-incubating-bin-spark-xxx-hadoop-xx` ), you can not use ignite interpreter on Zeppelin. We are planning to include ignite in a future binary release. @@ -32,7 +35,8 @@ In order to use Ignite interpreters, you may install Apache Ignite in some simpl $ nohup java -jar ``` -### Configuring Ignite Interpreter +### Configuring Ignite Interpreter + At the "Interpreters" menu, you may edit Ignite interpreter or create new one. Zeppelin provides these properties for Ignite. @@ -71,6 +75,7 @@ At the "Interpreters" menu, you may edit Ignite interpreter or create new one. Z ![Configuration of Ignite Interpreter](../assets/themes/zeppelin/img/docs-img/ignite-interpreter-setting.png) ### Interpreter Binding for Zeppelin Notebook + After configuring Ignite interpreter, create your own notebook. Then you can bind interpreters like below image. ![Binding Interpreters](../assets/themes/zeppelin/img/docs-img/ignite-interpreter-binding.png) @@ -78,6 +83,7 @@ After configuring Ignite interpreter, create your own notebook. Then you can bin For more interpreter binding information see [here](http://zeppelin.incubator.apache.org/docs/manual/interpreters.html). ### How to use Ignite SQL interpreter + In order to execute SQL query, use ` %ignite.ignitesql ` prefix.
Supposing you are running `org.apache.ignite.examples.streaming.wordcount.StreamWords`, then you can use "words" cache( Of course you have to specify this cache name to the Ignite interpreter setting section `ignite.jdbc.url` of Zeppelin ). For example, you can select top 10 words in the words cache using the following query @@ -112,5 +118,3 @@ As long as your Ignite version and Zeppelin Ignite version is same, you can also ![Using Scala Code](../assets/themes/zeppelin/img/docs-img/ignite-scala-example.png) Apache Ignite also provides a guide docs for Zeppelin ["Ignite with Apache Zeppelin"](https://apacheignite.readme.io/docs/data-analysis-with-apache-zeppelin) - - diff --git a/docs/interpreter/lens.md b/docs/interpreter/lens.md index a3eb2848e04..f8e6352350c 100644 --- a/docs/interpreter/lens.md +++ b/docs/interpreter/lens.md @@ -6,14 +6,17 @@ group: manual --- {% include JB/setup %} + ## Lens Interpreter for Apache Zeppelin ### Overview + [Apache Lens](https://lens.apache.org/) provides an Unified Analytics interface. Lens aims to cut the Data Analytics silos by providing a single view of data across multiple tiered data stores and optimal execution environment for the analytical query. It seamlessly integrates Hadoop with traditional data warehouses to appear like one. ![Apache Lens](../assets/themes/zeppelin/img/docs-img/lens-logo.png) ### Installing and Running Lens + In order to use Lens interpreters, you may install Apache Lens in some simple steps: 1. Download Lens for latest version from [the ASF](http://www.apache.org/dyn/closer.lua/lens/2.3-beta). Or the older release can be found [in the Archives](http://archive.apache.org/dist/lens/). @@ -25,9 +28,10 @@ In order to use Lens interpreters, you may install Apache Lens in some simple st ``` ### Configuring Lens Interpreter + At the "Interpreters" menu, you can to edit Lens interpreter or create new one. Zeppelin provides these properties for Lens. -
+
@@ -73,17 +77,19 @@ At the "Interpreters" menu, you can to edit Lens interpreter or create new one. -
Property Name valueyyy anything else from [Configuring lens server](https://lens.apache.org/admin/config-server.html)
+ ![Apache Lens Interpreter Setting](../assets/themes/zeppelin/img/docs-img/lens-interpreter-setting.png) ### Interpreter Bindging for Zeppelin Notebook + After configuring Lens interpreter, create your own notebook, then you can bind interpreters like below image. ![Zeppelin Notebook Interpreter Biding](../assets/themes/zeppelin/img/docs-img/lens-interpreter-binding.png) For more interpreter binding information see [here](http://zeppelin.incubator.apache.org/docs/manual/interpreters.html). -### How to use +### How to use + You can analyze your data by using [OLAP Cube](http://lens.apache.org/user/olap-cube.html) [QL](http://lens.apache.org/user/cli.html) which is a high level SQL like language to query and describe data sets organized in data cubes. You may experience OLAP Cube like this [Video tutorial](https://cwiki.apache.org/confluence/display/LENS/2015/07/13/20+Minute+video+demo+of+Apache+Lens+through+examples). As you can see in this video, they are using Lens Client Shell(./bin/lens-cli.sh). All of these functions also can be used on Zeppelin by using Lens interpreter. @@ -163,7 +169,8 @@ As you can see in this video, they are using Lens Client Shell(./bin/lens-cli.sh These are just examples that provided in advance by Lens. If you want to explore whole tutorials of Lens, see the [tutorial video](https://cwiki.apache.org/confluence/display/LENS/2015/07/13/20+Minute+video+demo+of+Apache+Lens+through+examples). -### Lens UI Service +### Lens UI Service + Lens also provides web UI service. Once the server starts up, you can open the service on http://serverhost:19999/index.html and browse. You may also check the structure that you made and use query easily here. ![Lens UI Servive](../assets/themes/zeppelin/img/docs-img/lens-ui-service.png) diff --git a/docs/interpreter/markdown.md b/docs/interpreter/markdown.md index 7c339d25083..7b428cf9a84 100644 --- a/docs/interpreter/markdown.md +++ b/docs/interpreter/markdown.md @@ -6,11 +6,13 @@ group: manual --- {% include JB/setup %} + ## Markdown Interpreter for Apache Zeppelin ### Overview + [Markdown](http://daringfireball.net/projects/markdown/) is a plain text formatting syntax designed so that it can be converted to HTML. -Zeppelin uses markdown4j, for more examples and extension support checkout [markdown4j](https://code.google.com/p/markdown4j/) +Zeppelin uses markdown4j, for more examples and extension support checkout [markdown4j](https://code.google.com/p/markdown4j/) In Zeppelin notebook you can use ``` %md ``` in the beginning of a paragraph to invoke the Markdown interpreter to generate static html from Markdown plain text. In Zeppelin, Markdown interpreter is enabled by default. diff --git a/docs/interpreter/postgresql.md b/docs/interpreter/postgresql.md index 9d3a2837c3e..7cd48c25d87 100644 --- a/docs/interpreter/postgresql.md +++ b/docs/interpreter/postgresql.md @@ -9,7 +9,6 @@ group: manual ## PostgreSQL, HAWQ Interpreter for Apache Zeppelin -
@@ -23,7 +22,6 @@ group: manual
Name
-
[zeppelin-view](https://www.youtube.com/watch?v=wqXXQhJ5Uk8) This interpreter seamlessly supports the following SQL data processing engines: @@ -46,13 +44,14 @@ To create new PSQL instance open the `Interpreter` section and click the `+Creat > Note: The `Name` of the instance is used only to distinct the instances while binding them to the `Notebook`. The `Name` is irrelevant inside the `Notebook`. In the `Notebook` you must use `%psql.sql` tag. ### Bind to Notebook + In the `Notebook` click on the `settings` icon in the top right corner. The select/deselect the interpreters to be bound with the `Notebook`. ### Configuration -You can modify the configuration of the PSQL from the `Interpreter` section. The PSQL interpreter expenses the following properties: +You can modify the configuration of the PSQL from the `Interpreter` section. The PSQL interpreter expenses the following properties: - +
@@ -83,13 +82,14 @@ You can modify the configuration of the PSQL from the `Interpreter` section. Th -
Property Name DescriptionMax number of SQL result to display to prevent the browser overload 1000
- + ### How to use + ``` Tip: Use (CTRL + .) for SQL auto-completion. ``` + #### DDL and SQL commands Start the paragraphs with the full `%psql.sql` prefix tag! The short notation: `%psql` would still be able run the queries but the syntax highlighting and the auto-completions will be disabled. @@ -131,6 +131,7 @@ psql -h phd3.localdomain -U gpadmin -p 5432 < - ![Interpreter Binding](../assets/themes/zeppelin/img/docs-img/scalding-InterpreterBinding.png) +
+ +![Interpreter Binding](../assets/themes/zeppelin/img/docs-img/scalding-InterpreterBinding.png) - ![Interpreter Selection](../assets/themes/zeppelin/img/docs-img/scalding-InterpreterSelection.png) -
+![Interpreter Selection](../assets/themes/zeppelin/img/docs-img/scalding-InterpreterSelection.png) + + ### Configuring the Interpreter + Zeppelin comes with a pre-configured Scalding interpreter in local mode, so you do not need to install anything. ### Testing the Interpreter @@ -73,6 +78,7 @@ If you click on the icon for the pie chart, you should be able to see a chart li ![Scalding - Pie - Chart](../assets/themes/zeppelin/img/docs-img/scalding-pie.png) ### Current Status & Future Work + The current implementation of the Scalding interpreter does not support canceling jobs, or fine-grained progress updates. -The pre-configured Scalding interpreter only supports Scalding in local mode. Hadoop mode for Scalding is currently unsupported, and will be future work (contributions welcome!). \ No newline at end of file +The pre-configured Scalding interpreter only supports Scalding in local mode. Hadoop mode for Scalding is currently unsupported, and will be future work (contributions welcome!). diff --git a/docs/interpreter/spark.md b/docs/interpreter/spark.md index 20be7f8324a..664b7c01428 100644 --- a/docs/interpreter/spark.md +++ b/docs/interpreter/spark.md @@ -41,15 +41,11 @@ Spark Interpreter group, which consisted of 4 interpreters. -

- ### Configuration -
Without any configuration, Spark interpreter works out of box in local mode. But if you want to connect to your Spark cluster, you'll need following two simple steps. - -#### 1. export SPARK_HOME +#### 1. Export SPARK_HOME In **conf/zeppelin-env.sh**, export SPARK_HOME environment variable with your Spark installation path. @@ -66,9 +62,7 @@ export HADOOP_CONF_DIR=/usr/lib/hadoop export SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-csv_2.10:1.2.0" ``` - -
-#### 2. set master in Interpreter menu. +#### 2. Set master in Interpreter menu. After start Zeppelin, go to **Interpreter** menu and edit **master** property in your Spark interpreter setting. The value may vary depending on your Spark cluster deployment type. @@ -79,27 +73,22 @@ for example, * **yarn-client** in Yarn client mode * **mesos://host:5050** in Mesos cluster - - -
That's it. Zeppelin will work with any version of Spark and any deployment type without rebuild Zeppelin in this way. (Zeppelin 0.5.5-incubating release works up to Spark 1.5.1) Note that without exporting SPARK_HOME, it's running in local mode with included version of Spark. The included version may vary depending on the build profile. -

+ ### SparkContext, SQLContext, ZeppelinContext -
SparkContext, SQLContext, ZeppelinContext are automatically created and exposed as variable names 'sc', 'sqlContext' and 'z', respectively, both in scala and python environments. Note that scala / python environment shares the same SparkContext, SQLContext, ZeppelinContext instance. - -
-
+ + ### Dependency Management -
+ There are two ways to load external library in spark interpreter. First is using Zeppelin's %dep interpreter and second is loading Spark properties. #### 1. Dynamic Dependency Loading via %dep interpreter @@ -150,9 +139,8 @@ z.load("groupId:artifactId:version").exclude("groupId:*") z.load("groupId:artifactId:version").local() ``` - -
#### 2. Loading Spark Properties + Once `SPARK_HOME` is set in `conf/zeppelin-env.sh`, Zeppelin uses `spark-submit` as spark interpreter runner. `spark-submit` supports two ways to load configurations. The first is command line options such as --master and Zeppelin can pass these options to `spark-submit` by exporting `SPARK_SUBMIT_OPTIONS` in conf/zeppelin-env.sh. Second is reading configuration options from `SPARK_HOME/conf/spark-defaults.conf`. Spark properites that user can set to distribute libraries are: @@ -181,9 +169,9 @@ Once `SPARK_HOME` is set in `conf/zeppelin-env.sh`, Zeppelin uses `spark-submit`
Comma-separated list of files to be placed in the working directory of each executor.
+ Note that adding jar to pyspark is only availabe via %dep interpreter at the moment -
Here are few examples: * SPARK\_SUBMIT\_OPTIONS in conf/zeppelin-env.sh @@ -196,14 +184,11 @@ Here are few examples: spark.jars.packages com.databricks:spark-csv_2.10:1.2.0 spark.files /path/mylib1.py,/path/mylib2.egg,/path/mylib3.zip -
-
+ ### ZeppelinContext -
Zeppelin automatically injects ZeppelinContext as variable 'z' in your scala/python environment. ZeppelinContext provides some additional functions and utility. -
#### Object exchange ZeppelinContext extends map and it's shared between scala, python environment. @@ -224,7 +209,6 @@ Get object from python myObject = z.get("objName") ``` -
#### Form creation ZeppelinContext provides functions for creating forms. diff --git a/docs/manual/interpreters.md b/docs/manual/interpreters.md index 48cd1a35f3d..59e83b85d30 100644 --- a/docs/manual/interpreters.md +++ b/docs/manual/interpreters.md @@ -39,6 +39,7 @@ When you click on the ```+Create``` button in the interpreter page the interpret Zeppelin interpreter setting is the configuration of a given interpreter on zeppelin server. For example, the properties requried for hive JDBC interpreter to connect to the Hive server. + ### What is zeppelin interpreter group? Every Interpreter belongs to an InterpreterGroup. InterpreterGroup is a unit of start/stop interpreter. @@ -53,12 +54,11 @@ Interpreters belong to a single group a registered together and all of their pro ### Programming langages for interpreter If the interpreter uses a specific programming language (like Scala, Python, SQL), it is generally a good idea to add syntax highlighting support for that to the notebook paragraph editor. - + To check out the list of languages supported, see the mode-*.js files under zeppelin-web/bower_components/ace-builds/src-noconflict or from github https://github.com/ajaxorg/ace-builds/tree/master/src-noconflict - -To add a new set of syntax highlighting, + +To add a new set of syntax highlighting, + 1. add the mode-*.js file to zeppelin-web/bower.json (when built, zeppelin-web/src/index.html will be changed automatically) 2. add to the list of `editorMode` in zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js - it follows the pattern 'ace/mode/x' where x is the name 3. add to the code that checks for `%` prefix and calls `session.setMode(editorMode.x)` in `setParagraphMode` in zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js - - From 781954b82c54b25fa491ba24dd72d5c294639d69 Mon Sep 17 00:00:00 2001 From: Jesang Yoon Date: Mon, 18 Jan 2016 03:49:23 +0900 Subject: [PATCH 3/3] Interpreter documentation merge with commit #578 --- docs/interpreter/cassandra.md | 922 ++++++++++++++---------------- docs/interpreter/elasticsearch.md | 85 ++- docs/interpreter/flink.md | 1 - docs/interpreter/geode.md | 12 +- docs/interpreter/hive.md | 4 - docs/interpreter/ignite.md | 123 ++-- docs/interpreter/lens.md | 188 +++--- docs/interpreter/postgresql.md | 72 +-- docs/interpreter/scalding.md | 9 +- docs/manual/interpreters.md | 6 - 10 files changed, 670 insertions(+), 752 deletions(-) diff --git a/docs/interpreter/cassandra.md b/docs/interpreter/cassandra.md index 944f715307f..3cec02d18e3 100644 --- a/docs/interpreter/cassandra.md +++ b/docs/interpreter/cassandra.md @@ -7,7 +7,6 @@ group: manual {% include JB/setup %} ## Cassandra CQL Interpreter for Apache Zeppelin - @@ -28,7 +27,7 @@ In a notebook, to enable the **Cassandra** interpreter, click on the **Gear** ic
- + ## Using the Cassandra Interpreter In a paragraph, use **_%cassandra_** to select the **Cassandra** interpreter and then input all commands. @@ -36,7 +35,7 @@ In a paragraph, use **_%cassandra_** to select the **Cassandra** interpreter and To access the interactive help, type `HELP;`
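For example, a paragraph could look like the sketch below (the `spark_demo.albums` table is only an assumed example, it is not created by the interpreter):

```sql
%cassandra
SELECT * FROM spark_demo.albums LIMIT 10;
```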
- +
## Interpreter Commands @@ -82,8 +81,8 @@ This interpreter is compatible with any CQL statement supported by Cassandra. Ex ```sql - INSERT INTO users(login,name) VALUES('jdoe','John DOE'); - SELECT * FROM users WHERE login='jdoe'; +INSERT INTO users(login,name) VALUES('jdoe','John DOE'); +SELECT * FROM users WHERE login='jdoe'; ``` Each statement should be separated by a semi-colon ( **;** ) except the special commands below: @@ -97,169 +96,163 @@ Each statement should be separated by a semi-colon ( **;** ) except the special 7. @retryPolicy 8. @fetchSize -Multi-line statements as well as multiple statements on the same line are also supported as long as they are -separated by a semi-colon. Ex: +Multi-line statements as well as multiple statements on the same line are also supported as long as they are separated by a semi-colon. Ex: ```sql - USE spark_demo; +USE spark_demo; - SELECT * FROM albums_by_country LIMIT 1; SELECT * FROM countries LIMIT 1; +SELECT * FROM albums_by_country LIMIT 1; SELECT * FROM countries LIMIT 1; - SELECT * - FROM artists - WHERE login='jlennon'; +SELECT * +FROM artists +WHERE login='jlennon'; ``` Batch statements are supported and can span multiple lines, as well as DDL(CREATE/ALTER/DROP) statements: ```sql - BEGIN BATCH - INSERT INTO users(login,name) VALUES('jdoe','John DOE'); - INSERT INTO users_preferences(login,account_type) VALUES('jdoe','BASIC'); - APPLY BATCH; +BEGIN BATCH + INSERT INTO users(login,name) VALUES('jdoe','John DOE'); + INSERT INTO users_preferences(login,account_type) VALUES('jdoe','BASIC'); +APPLY BATCH; - CREATE TABLE IF NOT EXISTS test( - key int PRIMARY KEY, - value text - ); +CREATE TABLE IF NOT EXISTS test( + key int PRIMARY KEY, + value text +); ``` -CQL statements are case-insensitive (except for column names and values). +CQL statements are case-insensitive (except for column names and values). This means that the following statements are equivalent and valid: ```sql - INSERT INTO users(login,name) VALUES('jdoe','John DOE'); - Insert into users(login,name) vAlues('hsue','Helen SUE'); +INSERT INTO users(login,name) VALUES('jdoe','John DOE'); +Insert into users(login,name) vAlues('hsue','Helen SUE'); ``` The complete list of all CQL statements and versions can be found below: -
-
Name
- - - - - - - - - - - - - - - - -
Cassandra VersionDocumentation Link
2.2 - - http://docs.datastax.com/en/cql/3.3/cql/cqlIntro.html - -
2.1 & 2.0 - - http://docs.datastax.com/en/cql/3.1/cql/cql_intro_c.html - -
1.2 - - http://docs.datastax.com/en/cql/3.0/cql/aboutCQL.html - -
- -## Comments in statements + + + + + + + + + + + + + + + + + +
Cassandra VersionDocumentation Link
2.2 + + http://docs.datastax.com/en/cql/3.3/cql/cqlIntro.html + +
2.1 & 2.0 + + http://docs.datastax.com/en/cql/3.1/cql/cql_intro_c.html + +
1.2 + + http://docs.datastax.com/en/cql/3.0/cql/aboutCQL.html + +
+## Comments in statements It is possible to add comments between statements. Single line comments start with the hash sign (#). Multi-line comments are enclosed between /** and **/. Ex: ```sql - #First comment - INSERT INTO users(login,name) VALUES('jdoe','John DOE'); +#First comment +INSERT INTO users(login,name) VALUES('jdoe','John DOE'); - /** - Multi line - comments - **/ - Insert into users(login,name) vAlues('hsue','Helen SUE'); +/** + Multi line + comments + **/ +Insert into users(login,name) vAlues('hsue','Helen SUE'); ``` ## Syntax Validation - -The interpreters is shipped with a built-in syntax validator. This validator only checks for basic syntax errors. +The interpreters is shipped with a built-in syntax validator. This validator only checks for basic syntax errors. All CQL-related syntax validation is delegated directly to **Cassandra**. Most of the time, syntax errors are due to **missing semi-colons** between statements or **typo errors**. - -## 8. Schema commands +## Schema commands To make schema discovery easier and more interactive, the following commands are supported: -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
CommandDescription
DESCRIBE CLUSTER;Show the current cluster name and its partitioner
DESCRIBE KEYSPACES;List all existing keyspaces in the cluster and their configuration (replication factor, durable write ...)
DESCRIBE TABLES;List all existing keyspaces in the cluster and for each, all the tables name
DESCRIBE TYPES;List all existing user defined types in the current (logged) keyspace
DESCRIBE FUNCTIONS <keyspace_name>;List all existing user defined functions in the given keyspace
DESCRIBE AGGREGATES <keyspace_name>;List all existing user defined aggregates in the given keyspace
DESCRIBE KEYSPACE <keyspace_name>;Describe the given keyspace configuration and all its table details (name, columns, ...)
DESCRIBE TABLE (<keyspace_name>).<table_name>; - Describe the given table. If the keyspace is not provided, the current logged in keyspace is used. - If there is no logged in keyspace, the default system keyspace is used. - If no table is found, an error message is raised. -
DESCRIBE TYPE (<keyspace_name>).<type_name>; - Describe the given type(UDT). If the keyspace is not provided, the current logged in keyspace is used. - If there is no logged in keyspace, the default system keyspace is used. - If no type is found, an error message is raised. -
DESCRIBE FUNCTION (<keyspace_name>).<function_name>;Describe the given user defined function. The keyspace is optional.
DESCRIBE AGGREGATE (<keyspace_name>).<aggregate_name>;Describe the given user defined aggregate. The keyspace is optional.
-
- -The schema objects (cluster, keyspace, table, type, function and aggregate) are displayed in a tabular format. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
CommandDescription
DESCRIBE CLUSTER;Show the current cluster name and its partitioner
DESCRIBE KEYSPACES;List all existing keyspaces in the cluster and their configuration (replication factor, durable write ...)
DESCRIBE TABLES;List all existing keyspaces in the cluster and for each, all the tables name
DESCRIBE TYPES;List all existing user defined types in the current (logged) keyspace
DESCRIBE FUNCTIONS <keyspace_name>;List all existing user defined functions in the given keyspace
DESCRIBE AGGREGATES <keyspace_name>;List all existing user defined aggregates in the given keyspace
DESCRIBE KEYSPACE <keyspace_name>;Describe the given keyspace configuration and all its table details (name, columns, ...)
DESCRIBE TABLE (<keyspace_name>).<table_name>; + Describe the given table. If the keyspace is not provided, the current logged in keyspace is used. + If there is no logged in keyspace, the default system keyspace is used. + If no table is found, an error message is raised. +
DESCRIBE TYPE (<keyspace_name>).<type_name>; + Describe the given type(UDT). If the keyspace is not provided, the current logged in keyspace is used. + If there is no logged in keyspace, the default system keyspace is used. + If no type is found, an error message is raised. +
DESCRIBE FUNCTION (<keyspace_name>).<function_name>;Describe the given user defined function. The keyspace is optional.
DESCRIBE AGGREGATE (<keyspace_name>).<aggregate_name>;Describe the given user defined aggregate. The keyspace is optional.
+ +The schema objects (cluster, keyspace, table, type, function and aggregate) are displayed in a tabular format. There is a drop-down menu on the top left corner to expand objects details. On the top right menu is shown the Icon legend.
@@ -268,117 +261,111 @@ There is a drop-down menu on the top left corner to expand objects details. On t ## Runtime Parameters -Sometimes you want to be able to pass runtime query parameters to your statements. -Those parameters are not part of the CQL specs and are specific to the interpreter. +Sometimes you want to be able to pass runtime query parameters to your statements. +Those parameters are not part of the CQL specs and are specific to the interpreter. Below is the list of all parameters: -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
ParameterSyntaxDescription
Consistency Level@consistency=valueApply the given consistency level to all queries in the paragraph.
Serial Consistency Level@serialConsistency=valueApply the given serial consistency level to all queries in the paragraph.
Timestamp@timestamp=long value - Apply the given timestamp to all queries in the paragraph. - Please note that timestamp value passed directly in CQL statement will override this value. -
Retry Policy@retryPolicy=valueApply the given retry policy to all queries in the paragraph.
Fetch Size@fetchSize=integer valueApply the given fetch size to all queries in the paragraph.
-
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ParameterSyntaxDescription
Consistency Level@consistency=valueApply the given consistency level to all queries in the paragraph.
Serial Consistency Level@serialConsistency=valueApply the given serial consistency level to all queries in the paragraph.
Timestamp@timestamp=long value + Apply the given timestamp to all queries in the paragraph. + Please note that timestamp value passed directly in CQL statement will override this value. +
Retry Policy@retryPolicy=valueApply the given retry policy to all queries in the paragraph.
Fetch Size@fetchSize=integer valueApply the given fetch size to all queries in the paragraph.
- Some parameters only accept restricted values: +Some parameters only accept restricted values: -
- - - - - - - - - - - - - - - - - - - - - - - - - -
ParameterPossible Values
Consistency LevelALL, ANY, ONE, TWO, THREE, QUORUM, LOCAL\_ONE, LOCAL\_QUORUM, EACH\_QUORUM
Serial Consistency LevelSERIAL, LOCAL\_SERIAL
TimestampAny long value
Retry PolicyDEFAULT, DOWNGRADING\_CONSISTENCY, FALLTHROUGH, LOGGING\_DEFAULT, LOGGING\_DOWNGRADING, LOGGING\_FALLTHROUGH
Fetch SizeAny integer value
-
+ + + + + + + + + + + + + + + + + + + + + + + + + +
ParameterPossible Values
Consistency LevelALL, ANY, ONE, TWO, THREE, QUORUM, LOCAL\_ONE, LOCAL\_QUORUM, EACH\_QUORUM
Serial Consistency LevelSERIAL, LOCAL\_SERIAL
TimestampAny long value
Retry PolicyDEFAULT, DOWNGRADING\_CONSISTENCY, FALLTHROUGH, LOGGING\_DEFAULT, LOGGING\_DOWNGRADING, LOGGING\_FALLTHROUGH
Fetch SizeAny integer value
>Please note that you should **not** add semi-colon ( **;** ) at the end of each parameter statement. Some examples: ```sql +CREATE TABLE IF NOT EXISTS spark_demo.ts( + key int PRIMARY KEY, + value text +); +TRUNCATE spark_demo.ts; - CREATE TABLE IF NOT EXISTS spark_demo.ts( - key int PRIMARY KEY, - value text - ); - TRUNCATE spark_demo.ts; - - # Timestamp in the past - @timestamp=10 +# Timestamp in the past +@timestamp=10 - # Force timestamp directly in the first insert - INSERT INTO spark_demo.ts(key,value) VALUES(1,'first insert') USING TIMESTAMP 100; +# Force timestamp directly in the first insert +INSERT INTO spark_demo.ts(key,value) VALUES(1,'first insert') USING TIMESTAMP 100; - # Select some data to make the clock turn - SELECT * FROM spark_demo.albums LIMIT 100; +# Select some data to make the clock turn +SELECT * FROM spark_demo.albums LIMIT 100; - # Now insert using the timestamp parameter set at the beginning(10) - INSERT INTO spark_demo.ts(key,value) VALUES(1,'second insert'); +# Now insert using the timestamp parameter set at the beginning(10) +INSERT INTO spark_demo.ts(key,value) VALUES(1,'second insert'); - # Check for the result. You should see 'first insert' - SELECT value FROM spark_demo.ts WHERE key=1; +# Check for the result. You should see 'first insert' +SELECT value FROM spark_demo.ts WHERE key=1; ``` - + Some remarks about query parameters: - + > 1. **Many** query parameters can be set in the same paragraph. > 2. If the **same** query parameter is set many time with different values, the interpreter only take into account the first value. > 3. Each query parameter applies to **all CQL statements** in the same paragraph, unless you override the option using plain CQL text. ( Like forcing timestamp with the USING clause ) > 4. The order of each query parameter with regard to CQL statement does not matter. ## Support for Prepared Statements - -For performance reason, it is better to prepare statements before-hand and reuse them later by providing bound values. +For performance reason, it is better to prepare statements before-hand and reuse them later by providing bound values. This interpreter provides 3 commands to handle prepared and bound statements: 1. **@prepare** @@ -388,45 +375,41 @@ This interpreter provides 3 commands to handle prepared and bound statements: Example: ``` - @prepare[statement_name]=... +@prepare[statement_name]=... - @bind[statement_name]=’text’, 1223, ’2015-07-30 12:00:01’, null, true, [‘list_item1’, ’list_item2’] +@bind[statement_name]=’text’, 1223, ’2015-07-30 12:00:01’, null, true, [‘list_item1’, ’list_item2’] - @bind[statement_name_with_no_bound_value] +@bind[statement_name_with_no_bound_value] - @remove_prepare[statement_name] +@remove_prepare[statement_name] ``` #### @prepare -You can use the syntax `@prepare[statement_name]=SELECT ...` to create a prepared statement. -The `statement_name` is **mandatory** because the interpreter prepares the given statement with the Java driver and -saves the generated prepared statement in an **internal hash map**, using the provided `statement_name` as search key. - -> Please note that this internal prepared statement map is shared with **all notebooks** and **all paragraphs** because -there is only one instance of the interpreter for Cassandra. - +You can use the syntax `@prepare[statement_name]=SELECT ...` to create a prepared statement. 
+The `statement_name` is **mandatory** because the interpreter prepares the given statement with the Java driver and saves the generated prepared statement in an **internal hash map**, using the provided `statement_name` as search key. + +> Please note that this internal prepared statement map is shared with **all notebooks** and **all paragraphs** because there is only one instance of the interpreter for Cassandra. + > If the interpreter encounters **many** @prepare for the **same statement_name (key)**, only the **first** statement will be taken into account. - + Example: ``` - @prepare[select]=SELECT * FROM spark_demo.albums LIMIT ? +@prepare[select]=SELECT * FROM spark_demo.albums LIMIT ? - @prepare[select]=SELECT * FROM spark_demo.artists LIMIT ? +@prepare[select]=SELECT * FROM spark_demo.artists LIMIT ? ``` -For the above example, the prepared statement is `SELECT * FROM spark_demo.albums LIMIT ?`. -`SELECT * FROM spark_demo.artists LIMIT ?` is ignored because an entry already exists in the prepared statements map with the key select. +For the above example, the prepared statement is `SELECT * FROM spark_demo.albums LIMIT ?`. +`SELECT * FROM spark_demo.artists LIMIT ?` is ignored because an entry already exists in the prepared statements map with the key select. -In the context of **Zeppelin**, a notebook can be scheduled to be executed at regular interval, -thus it is necessary to **avoid re-preparing many time the same statement (considered an anti-pattern)**. +In the context of **Zeppelin**, a notebook can be scheduled to be executed at regular interval, thus it is necessary to **avoid re-preparing many time the same statement (considered an anti-pattern)**. #### @bind - Once the statement is prepared ( possibly in a separated notebook/paragraph ). You can bind values to it: ``` - @bind[select_first]=10 +@bind[select_first]=10 ``` Bound values are not mandatory for the `@bind` statement. However if you provide bound values, they need to comply to some syntax: @@ -446,29 +429,23 @@ Bound values are not mandatory for the `@bind` statement. However if you provide > It is possible to use the @bind statement inside a batch: > -> ```sql -> -> BEGIN BATCH -> @bind[insert_user]='jdoe','John DOE' -> UPDATE users SET age = 27 WHERE login='hsue'; -> APPLY BATCH; +> ```sql +> BEGIN BATCH +> @bind[insert_user]='jdoe','John DOE' +> UPDATE users SET age = 27 WHERE login='hsue'; +> APPLY BATCH; > ``` #### @remove_prepare - -To avoid for a prepared statement to stay forever in the prepared statement map, you can use the -`@remove_prepare[statement_name]` syntax to remove it. +To avoid for a prepared statement to stay forever in the prepared statement map, you can use the `@remove_prepare[statement_name]` syntax to remove it. Removing a non-existing prepared statement yields no error. ## Using Dynamic Forms +Instead of hard-coding your CQL queries, it is possible to use the mustache syntax ( **\{\{ \}\}** ) to inject simple value or multiple choices forms. -Instead of hard-coding your CQL queries, it is possible to use the mustache syntax ( **\{\{ \}\}** ) to inject simple value or multiple choices forms. - -The syntax for simple parameter is: **\{\{input_Label=default value\}\}**. The default value is mandatory because the first time the paragraph is executed, -we launch the CQL query before rendering the form so at least one value should be provided. +The syntax for simple parameter is: **\{\{input_Label=default value\}\}**. 
The default value is mandatory because the first time the paragraph is executed, we launch the CQL query before rendering the form so at least one value should be provided. -The syntax for multiple choices parameter is: **\{\{input_Label=value1 | value2 | … | valueN \}\}**. By default the first choice is used for CQL query -the first time the paragraph is executed. +The syntax for multiple choices parameter is: **\{\{input_Label=value1 | value2 | … | valueN \}\}**. By default the first choice is used for CQL query the first time the paragraph is executed. Example: @@ -479,13 +456,12 @@ Example: WHERE name='{{performer=Sheryl Crow|Doof|Fanfarlo|Los Paranoia}}' AND styles CONTAINS '{{style=Rock}}'; {% endraw %} - -In the above example, the first CQL query will be executed for _performer='Sheryl Crow' AND style='Rock'_. -For subsequent queries, you can change the value directly using the form. +In the above example, the first CQL query will be executed for _performer='Sheryl Crow' AND style='Rock'_. +For subsequent queries, you can change the value directly using the form. -> Please note that we enclosed the **\{\{ \}\}** block between simple quotes ( **'** ) because Cassandra expects a String here. -> We could have also use the **\{\{style='Rock'\}\}** syntax but this time, the value displayed on the form is **_'Rock'_** and not **_Rock_**. +> Please note that we enclosed the **\{\{ \}\}** block between simple quotes ( **'** ) because Cassandra expects a String here. +> We could have also use the **\{\{style='Rock'\}\}** syntax but this time, the value displayed on the form is **_'Rock'_** and not **_Rock_**. It is also possible to use dynamic forms for **prepared statements**: @@ -495,43 +471,38 @@ It is also possible to use dynamic forms for **prepared statements**: {% endraw %} -
- ## Execution parallelism and shared states - -It is possible to execute many paragraphs in parallel. However, at the back-end side, we’re still using synchronous queries. -_Asynchronous execution_ is only possible when it is possible to return a `Future` value in the `InterpreterResult`. +It is possible to execute many paragraphs in parallel. However, at the back-end side, we’re still using synchronous queries. +_Asynchronous execution_ is only possible when it is possible to return a `Future` value in the `InterpreterResult`. It may be an interesting proposal for the **Zeppelin** project. Another caveat is that the same `com.datastax.driver.core.Session` object is used for **all** notebooks and paragraphs. -Consequently, if you use the **USE _keyspace name_;** statement to log into a keyspace, it will change the keyspace for -**all current users** of the **Cassandra** interpreter because we only create 1 `com.datastax.driver.core.Session` object -per instance of **Cassandra** interpreter. +Consequently, if you use the **USE _keyspace name_;** statement to log into a keyspace, it will change the keyspace for **all current users** of the **Cassandra** interpreter because we only create 1 `com.datastax.driver.core.Session` object per instance of **Cassandra** interpreter. The same remark does apply to the **prepared statement hash map**, it is shared by **all users** using the same instance of **Cassandra** interpreter. Until **Zeppelin** offers a real multi-users separation, there is a work-around to segregate user environment and states: create different **Cassandra** interpreter instances. For this, first go to the **Interpreter** menu and click on the **Create** button. +
![Create Interpreter](../assets/themes/zeppelin/img/docs-img/cassandra-NewInterpreterInstance.png)
-In the interpreter creation form, put **cass-instance2** as **Name** and select the **cassandra** -in the interpreter drop-down list +In the interpreter creation form, put **cass-instance2** as **Name** and select the **cassandra** in the interpreter drop-down list +
![Interpreter Name](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterName.png)
- Click on **Save** to create the new interpreter instance. Now you should be able to see it in the interpreter list. - +Click on **Save** to create the new interpreter instance. Now you should be able to see it in the interpreter list. +
![Interpreter In List](../assets/themes/zeppelin/img/docs-img/cassandra-NewInterpreterInList.png)
Go back to your notebook and click on the **Gear** icon to configure interpreter bindings. -You should be able to see and select the **cass-instance2** interpreter instance in the available -interpreter list instead of the standard **cassandra** instance. +You should be able to see and select the **cass-instance2** interpreter instance in the available interpreter list instead of the standard **cassandra** instance.
![Interpreter Instance Selection](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterInstanceSelection.png) @@ -539,218 +510,215 @@ interpreter list instead of the standard **cassandra** instance. ## Interpreter Configuration To configure the **Cassandra** interpreter, go to the **Interpreter** menu and scroll down to change the parameters. -The **Cassandra** interpreter is using the official **[Cassandra Java Driver]** and most of the parameters are used -to configure the Java driver +The **Cassandra** interpreter is using the official **[Cassandra Java Driver]** and most of the parameters are used to configure the Java driver Below are the configuration parameters and their default value. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Property NameDescriptionDefault Value
cassandra.clusterName of the Cassandra cluster to connect toTest Cluster
cassandra.compression.protocolOn wire compression. Possible values are: NONE, SNAPPY, LZ4NONE
cassandra.credentials.usernameIf security is enable, provide the loginnone
cassandra.credentials.passwordIf security is enable, provide the passwordnone
cassandra.hosts - Comma separated Cassandra hosts (DNS name or IP address). -
- Ex: '192.168.0.12,node2,node3' -
localhost
cassandra.interpreter.parallelismNumber of concurrent paragraphs(queries block) that can be executed10
cassandra.keyspace - Default keyspace to connect to. - - It is strongly recommended to let the default value - and prefix the table name with the actual keyspace - in all of your queries. - - system
cassandra.load.balancing.policy - Load balancing policy. Default = new TokenAwarePolicy(new DCAwareRoundRobinPolicy()) - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. - At runtime the interpreter will instantiate the policy using - Class.forName(FQCN). - DEFAULT
cassandra.max.schema.agreement.wait.secondCassandra max schema agreement wait in second10
cassandra.pooling.core.connection.per.host.localProtocol V2 and below default = 2. Protocol V3 and above default = 12
cassandra.pooling.core.connection.per.host.remoteProtocol V2 and below default = 1. Protocol V3 and above default = 11
cassandra.pooling.heartbeat.interval.secondsCassandra pool heartbeat interval in secs30
cassandra.pooling.idle.timeout.secondsCassandra idle time out in seconds120
cassandra.pooling.max.connection.per.host.localProtocol V2 and below default = 8. Protocol V3 and above default = 18
cassandra.pooling.max.connection.per.host.remoteProtocol V2 and below default = 2. Protocol V3 and above default = 12
cassandra.pooling.max.request.per.connection.localProtocol V2 and below default = 128. Protocol V3 and above default = 1024128
cassandra.pooling.max.request.per.connection.remoteProtocol V2 and below default = 128. Protocol V3 and above default = 256128
cassandra.pooling.new.connection.threshold.localProtocol V2 and below default = 100. Protocol V3 and above default = 800100
cassandra.pooling.new.connection.threshold.remoteProtocol V2 and below default = 100. Protocol V3 and above default = 200100
cassandra.pooling.pool.timeout.millisecsCassandra pool time out in millisecs5000
cassandra.protocol.versionCassandra binary protocol version3
cassandra.query.default.consistency - Cassandra query default consistency level -
- Available values: ONE, TWO, THREE, QUORUM, LOCAL\_ONE, LOCAL\_QUORUM, EACH\_QUORUM, ALL -
ONE
cassandra.query.default.fetchSizeCassandra query default fetch size5000
cassandra.query.default.serial.consistency - Cassandra query default serial consistency level -
- Available values: SERIAL, LOCAL_SERIAL -
SERIAL
cassandra.reconnection.policy - Cassandra Reconnection Policy. - Default = new ExponentialReconnectionPolicy(1000, 10 * 60 * 1000) - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. - At runtime the interpreter will instantiate the policy using - Class.forName(FQCN). - DEFAULT
cassandra.retry.policy - Cassandra Retry Policy. - Default = DefaultRetryPolicy.INSTANCE - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. - At runtime the interpreter will instantiate the policy using - Class.forName(FQCN). - DEFAULT
cassandra.socket.connection.timeout.millisecsCassandra socket default connection timeout in millisecs500
cassandra.socket.read.timeout.millisecsCassandra socket read timeout in millisecs12000
cassandra.socket.tcp.no_delayCassandra socket TCP no delaytrue
cassandra.speculative.execution.policy - Cassandra Speculative Execution Policy. - Default = NoSpeculativeExecutionPolicy.INSTANCE - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. - At runtime the interpreter will instantiate the policy using - Class.forName(FQCN). - DEFAULT
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Property Name | Description | Default Value |
| ------------- | ----------- | ------------- |
| cassandra.cluster | Name of the Cassandra cluster to connect to | Test Cluster |
| cassandra.compression.protocol | On wire compression. Possible values are: NONE, SNAPPY, LZ4 | NONE |
| cassandra.credentials.username | If security is enabled, provide the login | none |
| cassandra.credentials.password | If security is enabled, provide the password | none |
| cassandra.hosts | Comma separated Cassandra hosts (DNS name or IP address). Ex: '192.168.0.12,node2,node3' | localhost |
| cassandra.interpreter.parallelism | Number of concurrent paragraphs (query blocks) that can be executed | 10 |
| cassandra.keyspace | Default keyspace to connect to. It is strongly recommended to keep the default value and to prefix table names with the actual keyspace in all of your queries | system |
| cassandra.load.balancing.policy | Load balancing policy. Default = new TokenAwarePolicy(new DCAwareRoundRobinPolicy()). To specify your own policy, provide the fully qualified class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) | DEFAULT |
| cassandra.max.schema.agreement.wait.second | Cassandra max schema agreement wait in seconds | 10 |
| cassandra.pooling.core.connection.per.host.local | Protocol V2 and below default = 2. Protocol V3 and above default = 1 | 2 |
| cassandra.pooling.core.connection.per.host.remote | Protocol V2 and below default = 1. Protocol V3 and above default = 1 | 1 |
| cassandra.pooling.heartbeat.interval.seconds | Cassandra pool heartbeat interval in seconds | 30 |
| cassandra.pooling.idle.timeout.seconds | Cassandra idle timeout in seconds | 120 |
| cassandra.pooling.max.connection.per.host.local | Protocol V2 and below default = 8. Protocol V3 and above default = 1 | 8 |
| cassandra.pooling.max.connection.per.host.remote | Protocol V2 and below default = 2. Protocol V3 and above default = 1 | 2 |
| cassandra.pooling.max.request.per.connection.local | Protocol V2 and below default = 128. Protocol V3 and above default = 1024 | 128 |
| cassandra.pooling.max.request.per.connection.remote | Protocol V2 and below default = 128. Protocol V3 and above default = 256 | 128 |
| cassandra.pooling.new.connection.threshold.local | Protocol V2 and below default = 100. Protocol V3 and above default = 800 | 100 |
| cassandra.pooling.new.connection.threshold.remote | Protocol V2 and below default = 100. Protocol V3 and above default = 200 | 100 |
| cassandra.pooling.pool.timeout.millisecs | Cassandra pool timeout in milliseconds | 5000 |
| cassandra.protocol.version | Cassandra binary protocol version | 3 |
| cassandra.query.default.consistency | Cassandra query default consistency level. Available values: ONE, TWO, THREE, QUORUM, LOCAL\_ONE, LOCAL\_QUORUM, EACH\_QUORUM, ALL | ONE |
| cassandra.query.default.fetchSize | Cassandra query default fetch size | 5000 |
| cassandra.query.default.serial.consistency | Cassandra query default serial consistency level. Available values: SERIAL, LOCAL_SERIAL | SERIAL |
| cassandra.reconnection.policy | Cassandra Reconnection Policy. Default = new ExponentialReconnectionPolicy(1000, 10 * 60 * 1000). To specify your own policy, provide the fully qualified class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) | DEFAULT |
| cassandra.retry.policy | Cassandra Retry Policy. Default = DefaultRetryPolicy.INSTANCE. To specify your own policy, provide the fully qualified class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) | DEFAULT |
| cassandra.socket.connection.timeout.millisecs | Cassandra socket default connection timeout in milliseconds | 500 |
| cassandra.socket.read.timeout.millisecs | Cassandra socket read timeout in milliseconds | 12000 |
| cassandra.socket.tcp.no_delay | Cassandra socket TCP no delay | true |
| cassandra.speculative.execution.policy | Cassandra Speculative Execution Policy. Default = NoSpeculativeExecutionPolicy.INSTANCE. To specify your own policy, provide the fully qualified class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) | DEFAULT |

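As a quick, hedged illustration of how these defaults interact with a paragraph: with `cassandra.keyspace` left at `system`, `cassandra.query.default.consistency` at `ONE` and `cassandra.query.default.fetchSize` at `5000`, a plain statement simply inherits those values, and they can be tuned per paragraph with the interpreter's option commands. The keyspace and table names below are hypothetical, and the exact option syntax should be checked against the option-commands part of this documentation.

```sql
%cassandra
-- Inherits the defaults from the table above
-- (consistency ONE, fetch size 5000 rows per page).
SELECT * FROM my_keyspace.my_table;

%cassandra
-- Sketch of overriding those defaults for this paragraph only.
@consistency=LOCAL_QUORUM
@fetchSize=100
SELECT * FROM my_keyspace.my_table;
```
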
## Bugs & Contacts If you encounter a bug for this interpreter, please create a **[JIRA]** ticket and ping me on Twitter at **[@doanduyhai]**. - [Cassandra Java Driver]: https://github.com/datastax/java-driver [standard CQL syntax]: http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html [Tuple CQL syntax]: http://docs.datastax.com/en/cql/3.1/cql/cql_reference/tupleType.html diff --git a/docs/interpreter/elasticsearch.md b/docs/interpreter/elasticsearch.md index e4114ba7f46..7b70528c00a 100644 --- a/docs/interpreter/elasticsearch.md +++ b/docs/interpreter/elasticsearch.md @@ -6,12 +6,10 @@ group: manual --- {% include JB/setup %} - ## Elasticsearch Interpreter for Apache Zeppelin [Elasticsearch](https://www.elastic.co/products/elasticsearch) is a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze big volumes of data quickly and in near real time. It is generally used as the underlying engine/technology that powers applications that have complex search features and requirements. ## Configuration - @@ -44,7 +42,6 @@ group: manual ![Interpreter configuration](../assets/themes/zeppelin/img/docs-img/elasticsearch-config.png) - > **Note #1 :** You can add more properties to configure the Elasticsearch client. > **Note #2 :** If you use Shield, you can add a property named `shield.user` with a value containing the name and the password ( format: `username:password` ). For more details about Shield configuration, consult the [Shield reference guide](https://www.elastic.co/guide/en/shield/current/_using_elasticsearch_java_clients_with_shield.html). Do not forget, to copy the shield client jar in the interpreter directory (`ZEPPELIN_HOME/interpreters/elasticsearch`). @@ -56,8 +53,9 @@ In a notebook, to enable the **Elasticsearch** interpreter, click the **Gear** i In a paragraph, use `%elasticsearch` to select the Elasticsearch interpreter and then input all commands. To get the list of available commands, use `help`. ```bash -| %elasticsearch -| help +%elasticsearch +help + Elasticsearch interpreter: General format: ///
Property
@@ -36,7 +34,6 @@ This interpreter supports the [Geode](http://geode.incubator.apache.org/) [Objec This [Video Tutorial](https://www.youtube.com/watch?v=zvzzA9GXu3Q) illustrates some of the features provided by the `Geode Interpreter`. ### Create Interpreter - By default Zeppelin creates one `Geode/OQL` instance. You can remove it or create more instances. Multiple Geode instances can be created, each configured to the same or different backend Geode cluster. But over time a `Notebook` can have only one Geode interpreter instance `bound`. That means you _cannot_ connect to different Geode clusters in the same `Notebook`. This is a known Zeppelin limitation. @@ -46,11 +43,9 @@ To create new Geode instance open the `Interpreter` section and click the `+Crea > Note: The `Name` of the instance is used only to distinguish the instances while binding them to the `Notebook`. The `Name` is irrelevant inside the `Notebook`. In the `Notebook` you must use `%geode.oql` tag. ### Bind to Notebook - In the `Notebook` click on the `settings` icon in the top right corner. The select/deselect the interpreters to be bound with the `Notebook`. ### Configuration - You can modify the configuration of the Geode from the `Interpreter` section. The Geode interpreter expresses the following properties:
Name
@@ -77,13 +72,11 @@ You can modify the configuration of the Geode from the `Interpreter` section. T
### How to use - > *Tip 1: Use (CTRL + .) for OQL auto-completion.* > *Tip 2: Always start the paragraphs with the full `%geode.oql` prefix tag! The short notation: `%geode` would still be able run the OQL queries but the syntax highlighting and the auto-completions will be disabled.* #### Create / Destroy Regions - The OQL specification does not support [Geode Regions](https://cwiki.apache.org/confluence/display/GEODE/Index#Index-MainConceptsandComponents) mutation operations. To `create`/`destroy` regions one should use the [GFSH](http://geode-docs.cfapps.io/docs/tools_modules/gfsh/chapter_overview.html) shell tool instead. In the following it is assumed that the GFSH is colocated with Zeppelin server. ```bash @@ -104,8 +97,7 @@ EOF Above snippet re-creates two regions: `regionEmployee` and `regionCompany`. Note that you have to explicitly specify the locator host and port. The values should match those you have used in the Geode Interpreter configuration. Comprehensive list of [GFSH Commands by Functional Area](http://geode-docs.cfapps.io/docs/tools_modules/gfsh/gfsh_quick_reference.html). -#### Basic OQL - +#### Basic OQL ```sql %geode.oql SELECT count(*) FROM /regionEmployee @@ -145,7 +137,6 @@ SELECT e.key, e.value FROM /regionEmployee.entrySet e > Note: You can have multiple queries in the same paragraph but only the result from the first is displayed. [[1](https://issues.apache.org/jira/browse/ZEPPELIN-178)], [[2](https://issues.apache.org/jira/browse/ZEPPELIN-212)]. #### GFSH Commands From The Shell - Use the Shell Interpreter (`%sh`) to run OQL commands form the command line: ```bash @@ -155,7 +146,6 @@ gfsh -e "connect" -e "list members" ``` #### Apply Zeppelin Dynamic Forms - You can leverage [Zeppelin Dynamic Form](../manual/dynamicform.html) inside your OQL queries. You can use both the `text input` and `select form` parameterization features ```sql diff --git a/docs/interpreter/hive.md b/docs/interpreter/hive.md index 58b060079ad..5871feb6af4 100644 --- a/docs/interpreter/hive.md +++ b/docs/interpreter/hive.md @@ -6,12 +6,10 @@ group: manual --- {% include JB/setup %} - ## Hive Interpreter for Apache Zeppelin The [Apache Hive](https://hive.apache.org/) ™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL. ### Configuration - @@ -73,7 +71,6 @@ The [Apache Hive](https://hive.apache.org/) ™ data warehouse software facilita This interpreter provides multiple configuration with `${prefix}`. User can set a multiple connection properties by this prefix. It can be used like `%hive(${prefix})`. ## How to use - Basically, you can use ```sql @@ -92,7 +89,6 @@ select * from my_table; You can also run multiple queries up to 10 by default. Changing these settings is not implemented yet. ### Apply Zeppelin Dynamic Forms - You can leverage [Zeppelin Dynamic Form]({{BASE_PATH}}/manual/dynamicform.html) inside your queries. You can use both the `text input` and `select form` parameterization features. 
```sql diff --git a/docs/interpreter/ignite.md b/docs/interpreter/ignite.md index 418765f9c77..6fa1f61248e 100644 --- a/docs/interpreter/ignite.md +++ b/docs/interpreter/ignite.md @@ -6,11 +6,9 @@ group: manual --- {% include JB/setup %} - ## Ignite Interpreter for Apache Zeppelin ### Overview - [Apache Ignite](https://ignite.apache.org/) In-Memory Data Fabric is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies. ![Apache Ignite](../assets/themes/zeppelin/img/docs-img/ignite-logo.png) @@ -18,64 +16,60 @@ group: manual You can use Zeppelin to retrieve distributed data from cache using Ignite SQL interpreter. Moreover, Ignite interpreter allows you to execute any Scala code in cases when SQL doesn't fit to your requirements. For example, you can populate data into your caches or execute distributed computations. ### Installing and Running Ignite example - In order to use Ignite interpreters, you may install Apache Ignite in some simple steps: - 1. Download Ignite [source release](https://ignite.apache.org/download.html#sources) or [binary release](https://ignite.apache.org/download.html#binaries) whatever you want. But you must download Ignite as the same version of Zeppelin's. If it is not, you can't use scala code on Zeppelin. You can find ignite version in Zepplin at the pom.xml which is placed under `path/to/your-Zeppelin/ignite/pom.xml` ( Of course, in Zeppelin source release ). Please check `ignite.version` .
Currently, Zeppelin provides ignite only in Zeppelin source release. So, if you download Zeppelin binary release( `zeppelin-0.5.0-incubating-bin-spark-xxx-hadoop-xx` ), you can not use ignite interpreter on Zeppelin. We are planning to include ignite in a future binary release. - - 2. Examples are shipped as a separate Maven project, so to start running you simply need to import provided /apache-ignite-fabric-1.2.0-incubating-bin/pom.xml file into your favourite IDE, such as Eclipse. - - * In case of Eclipse, Eclipse -> File -> Import -> Existing Maven Projects - * Set examples directory path to Eclipse and select the pom.xml. - * Then start `org.apache.ignite.examples.ExampleNodeStartup` (or whatever you want) to run at least one or more ignite node. When you run example code, you may notice that the number of node is increase one by one. - - > **Tip. If you want to run Ignite examples on the cli not IDE, you can export executable Jar file from IDE. Then run it by using below command.** - - ``` - $ nohup java -jar - ``` - -### Configuring Ignite Interpreter +1. Download Ignite [source release](https://ignite.apache.org/download.html#sources) or [binary release](https://ignite.apache.org/download.html#binaries) whatever you want. But you must download Ignite as the same version of Zeppelin's. If it is not, you can't use scala code on Zeppelin. You can find ignite version in Zepplin at the pom.xml which is placed under `path/to/your-Zeppelin/ignite/pom.xml` ( Of course, in Zeppelin source release ). Please check `ignite.version` .
Currently, Zeppelin provides Ignite only in the Zeppelin source release. So, if you download a Zeppelin binary release ( `zeppelin-0.5.0-incubating-bin-spark-xxx-hadoop-xx` ), you cannot use the Ignite interpreter on Zeppelin. We are planning to include Ignite in a future binary release.
+2. Examples are shipped as a separate Maven project, so to start running you simply need to import the provided /apache-ignite-fabric-1.2.0-incubating-bin/pom.xml file into your favourite IDE, such as Eclipse.
+
+* In Eclipse: File -> Import -> Existing Maven Projects
+* Set the examples directory path in Eclipse and select the pom.xml.
+* Then start `org.apache.ignite.examples.ExampleNodeStartup` (or whatever else you want) to run one or more Ignite nodes. When you run the example code, you may notice that the number of nodes increases one by one.
+> **Tip. If you want to run the Ignite examples from the command line instead of an IDE, export an executable JAR file from the IDE and run it with the command below.**
+
+```
+$ nohup java -jar 
+```
+
+### Configuring Ignite Interpreter
At the "Interpreters" menu, you may edit the Ignite interpreter or create a new one. Zeppelin provides these properties for Ignite.

-
Property
+
- - - + + + - - - + + + - - - + + + - - - - - - - - - - - - + + + -
Property NamevalueDescriptionProperty NamevalueDescription
ignite.addresses127.0.0.1:47500..47509Coma separated list of Ignite cluster hosts. See [Ignite Cluster Configuration](https://apacheignite.readme.io/v1.2/docs/cluster-config) section for more details.ignite.addresses127.0.0.1:47500..47509Coma separated list of Ignite cluster hosts. See [Ignite Cluster Configuration](https://apacheignite.readme.io/v1.2/docs/cluster-config) section for more details.
ignite.clientModetrueYou can connect to the Ignite cluster as client or server node. See [Ignite Clients vs. Servers](https://apacheignite.readme.io/v1.2/docs/clients-vs-servers) section for details. Use true or false values in order to connect in client or server mode respectively.ignite.clientModetrueYou can connect to the Ignite cluster as client or server node. See [Ignite Clients vs. Servers](https://apacheignite.readme.io/v1.2/docs/clients-vs-servers) section for details. Use true or false values in order to connect in client or server mode respectively.
ignite.config.urlConfiguration URL. Overrides all other settings.
ignite.jdbc.urljdbc:ignite:cfg://default-ignite-jdbc.xmlIgnite JDBC connection URL.
ignite.peerClassLoadingEnabledtrueEnables peer-class-loading. See [Zero Deployment](https://apacheignite.readme.io/v1.2/docs/zero-deployment) section for details. Use true or false values in order to enable or disable P2P class loading respectively.ignite.config.urlConfiguration URL. Overrides all other settings.
+ + ignite.jdbc.url + jdbc:ignite:cfg://default-ignite-jdbc.xml + Ignite JDBC connection URL. + + + ignite.peerClassLoadingEnabled + true + Enables peer-class-loading. See [Zero Deployment](https://apacheignite.readme.io/v1.2/docs/zero-deployment) section for details. Use true or false values in order to enable or disable P2P class loading respectively. + + ![Configuration of Ignite Interpreter](../assets/themes/zeppelin/img/docs-img/ignite-interpreter-setting.png) ### Interpreter Binding for Zeppelin Notebook - After configuring Ignite interpreter, create your own notebook. Then you can bind interpreters like below image. ![Binding Interpreters](../assets/themes/zeppelin/img/docs-img/ignite-interpreter-binding.png) @@ -83,38 +77,37 @@ After configuring Ignite interpreter, create your own notebook. Then you can bin For more interpreter binding information see [here](http://zeppelin.incubator.apache.org/docs/manual/interpreters.html). ### How to use Ignite SQL interpreter - In order to execute SQL query, use ` %ignite.ignitesql ` prefix.
Supposing you are running `org.apache.ignite.examples.streaming.wordcount.StreamWords`, then you can use "words" cache( Of course you have to specify this cache name to the Ignite interpreter setting section `ignite.jdbc.url` of Zeppelin ). For example, you can select top 10 words in the words cache using the following query - ``` - %ignite.ignitesql - select _val, count(_val) as cnt from String group by _val order by cnt desc limit 10 - ``` - - ![IgniteSql on Zeppelin](../assets/themes/zeppelin/img/docs-img/ignite-sql-example.png) - +``` +%ignite.ignitesql +select _val, count(_val) as cnt from String group by _val order by cnt desc limit 10 +``` + +![IgniteSql on Zeppelin](../assets/themes/zeppelin/img/docs-img/ignite-sql-example.png) + As long as your Ignite version and Zeppelin Ignite version is same, you can also use scala code. Please check the Zeppelin Ignite version before you download your own Ignite. - ``` - %ignite - import org.apache.ignite._ - import org.apache.ignite.cache.affinity._ - import org.apache.ignite.cache.query._ - import org.apache.ignite.configuration._ +``` +%ignite +import org.apache.ignite._ +import org.apache.ignite.cache.affinity._ +import org.apache.ignite.cache.query._ +import org.apache.ignite.configuration._ + +import scala.collection.JavaConversions._ - import scala.collection.JavaConversions._ +val cache: IgniteCache[AffinityUuid, String] = ignite.cache("words") - val cache: IgniteCache[AffinityUuid, String] = ignite.cache("words") +val qry = new SqlFieldsQuery("select avg(cnt), min(cnt), max(cnt) from (select count(_val) as cnt from String group by _val)", true) - val qry = new SqlFieldsQuery("select avg(cnt), min(cnt), max(cnt) from (select count(_val) as cnt from String group by _val)", true) +val res = cache.query(qry).getAll() - val res = cache.query(qry).getAll() +collectionAsScalaIterable(res).foreach(println _) +``` - collectionAsScalaIterable(res).foreach(println _) - ``` - - ![Using Scala Code](../assets/themes/zeppelin/img/docs-img/ignite-scala-example.png) +![Using Scala Code](../assets/themes/zeppelin/img/docs-img/ignite-scala-example.png) Apache Ignite also provides a guide docs for Zeppelin ["Ignite with Apache Zeppelin"](https://apacheignite.readme.io/docs/data-analysis-with-apache-zeppelin) diff --git a/docs/interpreter/lens.md b/docs/interpreter/lens.md index 8b46e4d128e..c883f10e8f3 100644 --- a/docs/interpreter/lens.md +++ b/docs/interpreter/lens.md @@ -16,69 +16,70 @@ group: manual ### Installing and Running Lens In order to use Lens interpreters, you may install Apache Lens in some simple steps: - 1. Download Lens for latest version from [the ASF](http://www.apache.org/dyn/closer.lua/lens/2.3-beta). Or the older release can be found [in the Archives](http://archive.apache.org/dist/lens/). - 2. Before running Lens, you have to set HIVE_HOME and HADOOP_HOME. If you want to get more information about this, please refer to [here](http://lens.apache.org/lenshome/install-and-run.html#Installation). Lens also provides Pseudo Distributed mode. [Lens pseudo-distributed setup](http://lens.apache.org/lenshome/pseudo-distributed-setup.html) is done by using [docker](https://www.docker.com/). Hive server and hadoop daemons are run as separate processes in lens pseudo-distributed setup. - 3. Now, you can start lens server (or stop). - - ``` - ./bin/lens-ctl start (or stop) - ``` +1. Download Lens for latest version from [the ASF](http://www.apache.org/dyn/closer.lua/lens/2.3-beta). 
Or the older release can be found [in the Archives](http://archive.apache.org/dist/lens/). +2. Before running Lens, you have to set HIVE_HOME and HADOOP_HOME. If you want to get more information about this, please refer to [here](http://lens.apache.org/lenshome/install-and-run.html#Installation). Lens also provides Pseudo Distributed mode. [Lens pseudo-distributed setup](http://lens.apache.org/lenshome/pseudo-distributed-setup.html) is done by using [docker](https://www.docker.com/). Hive server and hadoop daemons are run as separate processes in lens pseudo-distributed setup. +3. Now, you can start lens server (or stop). + +``` +./bin/lens-ctl start (or stop) +``` ### Configuring Lens Interpreter At the "Interpreters" menu, you can edit Lens interpreter or create new one. Zeppelin provides these properties for Lens. - - - + + + - - - + + + - - - + + + - - - + + + - - - + + + - - - + + + - - - + + + - - - + + + - - - + + +
Property NamevalueDescriptionProperty NamevalueDescription
lens.client.dbnamedefaultThe database schema namelens.client.dbnamedefaultThe database schema name
lens.query.enable.persistent.resultsetfalseWhether to enable persistent resultset for queries. When enabled, server will fetch results from driver, custom format them if any and store in a configured location. The file name of query output is queryhandle-id, with configured extensionslens.query.enable.persistent.resultsetfalseWhether to enable persistent resultset for queries. When enabled, server will fetch results from driver, custom format them if any and store in a configured location. The file name of query output is queryhandle-id, with configured extensions
lens.server.base.urlhttp://hostname:port/lensapiThe base url for the lens server. you have to edit "hostname" and "port" that you may use(ex. http://0.0.0.0:9999/lensapi)lens.server.base.urlhttp://hostname:port/lensapiThe base url for the lens server. you have to edit "hostname" and "port" that you may use(ex. http://0.0.0.0:9999/lensapi)
lens.session.cluster.user defaultHadoop cluster usernamelens.session.cluster.user defaultHadoop cluster username
zeppelin.lens.maxResult1000Max number of rows to displayzeppelin.lens.maxResult1000Max number of rows to display
zeppelin.lens.maxThreads10If concurrency is true then how many threads?zeppelin.lens.maxThreads10If concurrency is true then how many threads?
zeppelin.lens.run.concurrenttrueRun concurrent Lens Sessionszeppelin.lens.run.concurrenttrueRun concurrent Lens Sessions
xxxyyyanything else from [Configuring lens server](https://lens.apache.org/admin/config-server.html)xxxyyyanything else from [Configuring lens server](https://lens.apache.org/admin/config-server.html)
![Apache Lens Interpreter Setting](../assets/themes/zeppelin/img/docs-img/lens-interpreter-setting.png)

### Interpreter Binding for Zeppelin Notebook
-After configuring Lens interpreter, create your own notebook, then you can bind interpreters like below image.
+After configuring the Lens interpreter, create your own notebook, then bind the interpreters as shown in the image below.
+
 ![Zeppelin Notebook Interpreter Binding](../assets/themes/zeppelin/img/docs-img/lens-interpreter-binding.png)

For more interpreter binding information see [here](http://zeppelin.incubator.apache.org/docs/manual/interpreters.html).

@@ -90,80 +91,79 @@ As you can see in this video, they are using Lens Client Shell(./bin/lens-cli.sh
  • Create and Use(Switch) Databases. - ``` - create database newDb - ``` - - ``` - use newDb - ``` - +``` +create database newDb +``` + +``` +use newDb +``` +
  • Create Storage. - ``` - create storage your/path/to/lens/client/examples/resources/db-storage.xml - ``` - +``` +create storage your/path/to/lens/client/examples/resources/db-storage.xml +``` +
  • Create Dimensions, Show fields and join-chains of them. - ``` - create dimension your/path/to/lens/client/examples/resources/customer.xml - ``` - - ``` - dimension show fields customer - ``` - - ``` - dimension show joinchains customer - ``` - +``` +create dimension your/path/to/lens/client/examples/resources/customer.xml +``` + +``` +dimension show fields customer +``` + +``` +dimension show joinchains customer +``` +
  • Create Caches, Show fields and join-chains of them. - ``` - create cube your/path/to/lens/client/examples/resources/sales-cube.xml - ``` - - ``` - cube show fields sales - ``` - - ``` - cube show joinchains sales - ``` +``` +create cube your/path/to/lens/client/examples/resources/sales-cube.xml +``` + +``` +cube show fields sales +``` + +``` +cube show joinchains sales +```
  • Create Dimtables and Fact. - ``` - create dimtable your/path/to/lens/client/examples/resources/customer_table.xml - ``` - - ``` - create fact your/path/to/lens/client/examples/resources/sales-raw-fact.xml - ``` +``` +create dimtable your/path/to/lens/client/examples/resources/customer_table.xml +``` + +``` +create fact your/path/to/lens/client/examples/resources/sales-raw-fact.xml +```
  • Add partitions to Dimtable and Fact. - - ``` - dimtable add single-partition --dimtable_name customer_table --storage_name local --path your/path/to/lens/client/examples/resources/customer-local-part.xml - ``` - - ``` - fact add partitions --fact_name sales_raw_fact --storage_name local --path your/path/to/lens/client/examples/resources/sales-raw-local-parts.xml - ``` + +``` +dimtable add single-partition --dimtable_name customer_table --storage_name local --path your/path/to/lens/client/examples/resources/customer-local-part.xml +``` + +``` +fact add partitions --fact_name sales_raw_fact --storage_name local --path your/path/to/lens/client/examples/resources/sales-raw-local-parts.xml +```
  • Now, you can run queries on cubes. - - ``` - query execute cube select customer_city_name, product_details.description, product_details.category, product_details.color, store_sales from sales where time_range_in(delivery_time, '2015-04-11-00', '2015-04-13-00') - ``` - - - ![Lens Query Result](../assets/themes/zeppelin/img/docs-img/lens-result.png) + +``` +query execute cube select customer_city_name, product_details.description, product_details.category, product_details.color, store_sales from sales where time_range_in(delivery_time, '2015-04-11-00', '2015-04-13-00') +``` + +![Lens Query Result](../assets/themes/zeppelin/img/docs-img/lens-result.png) These are just examples that provided in advance by Lens. If you want to explore whole tutorials of Lens, see the [tutorial video](https://cwiki.apache.org/confluence/display/LENS/2015/07/13/20+Minute+video+demo+of+Apache+Lens+through+examples). ### Lens UI Service Lens also provides web UI service. Once the server starts up, you can open the service on http://serverhost:19999/index.html and browse. You may also check the structure that you made and use query easily here. - - ![Lens UI Servive](../assets/themes/zeppelin/img/docs-img/lens-ui-service.png) + +![Lens UI Servive](../assets/themes/zeppelin/img/docs-img/lens-ui-service.png) diff --git a/docs/interpreter/postgresql.md b/docs/interpreter/postgresql.md index 7cd48c25d87..54def09f3ec 100644 --- a/docs/interpreter/postgresql.md +++ b/docs/interpreter/postgresql.md @@ -6,9 +6,7 @@ group: manual --- {% include JB/setup %} - ## PostgreSQL, HAWQ Interpreter for Apache Zeppelin - @@ -30,11 +28,9 @@ This interpreter seamlessly supports the following SQL data processing engines: * [Apache HAWQ](http://pivotal.io/big-data/pivotal-hawq) - Powerful [Open Source](https://wiki.apache.org/incubator/HAWQProposal) SQL-On-Hadoop engine. * [Greenplum](http://pivotal.io/big-data/pivotal-greenplum-database) - MPP database built on open source PostgreSQL. - This [Video Tutorial](https://www.youtube.com/watch?v=wqXXQhJ5Uk8) illustrates some of the features provided by the `Postgresql Interpreter`. ### Create Interpreter - By default Zeppelin creates one `PSQL` instance. You can remove it or create new instances. Multiple PSQL instances can be created, each configured to the same or different backend databases. But over time a `Notebook` can have only one PSQL interpreter instance `bound`. That means you _cannot_ connect to different databases in the same `Notebook`. This is a known Zeppelin limitation. @@ -44,54 +40,50 @@ To create new PSQL instance open the `Interpreter` section and click the `+Creat > Note: The `Name` of the instance is used only to distinct the instances while binding them to the `Notebook`. The `Name` is irrelevant inside the `Notebook`. In the `Notebook` you must use `%psql.sql` tag. ### Bind to Notebook - In the `Notebook` click on the `settings` icon in the top right corner. The select/deselect the interpreters to be bound with the `Notebook`. ### Configuration - You can modify the configuration of the PSQL from the `Interpreter` section. The PSQL interpreter expenses the following properties:
    Name
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Property NameDescriptionDefault Value
    postgresql.urlJDBC URL to connect to jdbc:postgresql://localhost:5432
    postgresql.userJDBC user namegpadmin
    postgresql.passwordJDBC password
    postgresql.driver.nameJDBC driver name. In this version the driver name is fixed and should not be changedorg.postgresql.Driver
    postgresql.max.resultMax number of SQL result to display to prevent the browser overload1000
    Property NameDescriptionDefault Value
    postgresql.urlJDBC URL to connect to jdbc:postgresql://localhost:5432
    postgresql.userJDBC user namegpadmin
    postgresql.passwordJDBC password
    postgresql.driver.nameJDBC driver name. In this version the driver name is fixed and should not be changedorg.postgresql.Driver
    postgresql.max.resultMax number of SQL result to display to prevent the browser overload1000
    ### How to use - ``` Tip: Use (CTRL + .) for SQL auto-completion. ``` #### DDL and SQL commands - Start the paragraphs with the full `%psql.sql` prefix tag! The short notation: `%psql` would still be able run the queries but the syntax highlighting and the auto-completions will be disabled. You can use the standard CREATE / DROP / INSERT commands to create or modify the data model: @@ -121,7 +113,6 @@ select * from mytable; ``` #### PSQL command line tools - Use the Shell Interpreter (`%sh`) to access the command line [PSQL](http://www.postgresql.org/docs/9.4/static/app-psql.html) interactively: ```bash @@ -147,7 +138,6 @@ This will produce output like this: ``` #### Apply Zeppelin Dynamic Forms - You can leverage [Zeppelin Dynamic Form](../manual/dynamicform.html) inside your queries. You can use both the `text input` and `select form` parametrization features ```sql @@ -160,7 +150,6 @@ LIMIT ${limit=10}; ``` #### Example HAWQ PXF/HDFS Tables - Create HAWQ external table that read data from tab-separated-value data in HDFS. ```sql @@ -179,5 +168,4 @@ select * from retail_demo.payment_methods_pxf ``` ### Auto-completion - The PSQL Interpreter provides a basic auto-completion functionality. On `(Ctrl+.)` it list the most relevant suggestions in a pop-up window. In addition to the SQL keyword the interpreter provides suggestions for the Schema, Table, Column names as well. diff --git a/docs/interpreter/scalding.md b/docs/interpreter/scalding.md index 0f5e5d80d38..f84636af139 100644 --- a/docs/interpreter/scalding.md +++ b/docs/interpreter/scalding.md @@ -6,13 +6,10 @@ group: manual --- {% include JB/setup %} - ## Scalding Interpreter for Apache Zeppelin - [Scalding](https://github.com/twitter/scalding) is an open source Scala library for writing MapReduce jobs. ### Building the Scalding Interpreter - You have to first build the Scalding interpreter by enable the **scalding** profile as follows: ``` @@ -20,9 +17,8 @@ mvn clean package -Pscalding -DskipTests ``` ### Enabling the Scalding Interpreter - In a notebook, to enable the **Scalding** interpreter, click on the **Gear** icon,select **Scalding**, and hit **Save**. - +
    ![Interpreter Binding](../assets/themes/zeppelin/img/docs-img/scalding-InterpreterBinding.png) @@ -32,11 +28,9 @@ In a notebook, to enable the **Scalding** interpreter, click on the **Gear** ico
    ### Configuring the Interpreter - Zeppelin comes with a pre-configured Scalding interpreter in local mode, so you do not need to install anything. ### Testing the Interpreter - In example, by using the [Alice in Wonderland](https://gist.github.com/johnynek/a47699caa62f4f38a3e2) tutorial, we will count words (of course!), and plot a graph of the top 10 words in the book. ``` @@ -78,7 +72,6 @@ If you click on the icon for the pie chart, you should be able to see a chart li ![Scalding - Pie - Chart](../assets/themes/zeppelin/img/docs-img/scalding-pie.png) ### Current Status & Future Work - The current implementation of the Scalding interpreter does not support canceling jobs, or fine-grained progress updates. The pre-configured Scalding interpreter only supports Scalding in local mode. Hadoop mode for Scalding is currently unsupported, and will be future work (contributions welcome!). diff --git a/docs/manual/interpreters.md b/docs/manual/interpreters.md index ec77e60b4d3..c8867184364 100644 --- a/docs/manual/interpreters.md +++ b/docs/manual/interpreters.md @@ -19,14 +19,12 @@ limitations under the License. --> {% include JB/setup %} - ## Interpreters in Zeppelin In this section, we will explain about the role of interpreters, interpreters group and interpreter settings in Zeppelin. The concept of Zeppelin interpreter allows any language/data-processing-backend to be plugged into Zeppelin. Currently, Zeppelin supports many interpreters such as Scala ( with Apache Spark ), Python ( with Apache Spark ), SparkSQL, Hive, Markdown, Shell and so on. ## What is Zeppelin interpreter? - Zeppelin Interpreter is a plug-in which enables Zeppelin users to use a specific language/data-processing-backend. For example, to use scala code in Zeppelin, you need `%spark` interpreter. When you click the ```+Create``` button in the interpreter page, the interpreter drop-down list box will show all the available interpreters on your server. @@ -34,13 +32,11 @@ When you click the ```+Create``` button in the interpreter page, the interpreter ## What is Zeppelin Interpreter Setting? - Zeppelin interpreter setting is the configuration of a given interpreter on Zeppelin server. For example, the properties are required for hive JDBC interpreter to connect to the Hive server. ## What is Zeppelin Interpreter Group? - Every Interpreter is belonged to an **Interpreter Group**. Interpreter Group is a unit of start/stop interpreter. By default, every interpreter is belonged to a single group, but the group might contain more interpreters. For example, spark interpreter group is including Spark support, pySpark, SparkSQL and the dependency loader. @@ -51,7 +47,6 @@ Each interpreters is belonged to a single group and registered together. All of ## Programming Languages for Interpreter - If the interpreter uses a specific programming language ( like Scala, Python, SQL ), it is generally recommended to add a syntax highlighting supported for that to the notebook paragraph editor. To check out the list of languages supported, see the `mode-*.js` files under `zeppelin-web/bower_components/ace-builds/src-noconflict` or from [github.com/ajaxorg/ace-builds](https://github.com/ajaxorg/ace-builds/tree/master/src-noconflict). @@ -61,4 +56,3 @@ If you want to add a new set of syntax highlighting, 1. Add the `mode-*.js` file to `zeppelin-web/bower.json` ( when built, `zeppelin-web/src/index.html` will be changed automatically. ). 2. 
Add to the list of `editorMode` in `zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js` - it follows the pattern 'ace/mode/x' where x is the name. 3. Add to the code that checks for `%` prefix and calls `session.setMode(editorMode.x)` in `setParagraphMode` located in `zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js`. - \ No newline at end of file