diff --git a/docs/development/writingzeppelininterpreter.md b/docs/development/writingzeppelininterpreter.md
index f64f09246a2..dede756a9a3 100644
--- a/docs/development/writingzeppelininterpreter.md
+++ b/docs/development/writingzeppelininterpreter.md
@@ -32,10 +32,8 @@ All Interpreters in the same interpreter group are launched in a single, separat
### Make your own Interpreter

Creating a new interpreter is quite simple. Just extend [org.apache.zeppelin.interpreter](https://github.com/apache/incubator-zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/Interpreter.java) abstract class and implement some methods.
-
-You can include org.apache.zeppelin:zeppelin-interpreter:[VERSION] artifact in your build system.
-
-Your interpreter name is derived from the static register method
+You can include the `org.apache.zeppelin:zeppelin-interpreter:[VERSION]` artifact in your build system.
+Your interpreter name is derived from the static register method.

```
static {
@@ -44,16 +42,15 @@ static {
```

The name will appear later in the interpreter name option box during the interpreter configuration process.
-
The name of the interpreter is what you later write to identify a paragraph which should be interpreted using this interpreter.

```
%MyInterpreterName
-some interpreter spesific code...
+some interpreter-specific code...
```

### Install your interpreter binary

-Once you have build your interpreter, you can place your interpreter under directory with all the dependencies.
+Once you have built your interpreter, you can place it under the interpreter directory with all its dependencies.

```
[ZEPPELIN_HOME]/interpreter/[INTERPRETER_NAME]/
```

@@ -63,33 +60,34 @@ Once you have build your interpreter, you can place your interpreter under direc

To configure your interpreter you need to follow these steps:

-1. create conf/zeppelin-site.xml by copying conf/zeppelin-site.xml.template to conf/zeppelin-site.xml
-
-2. Add your interpreter class name to the zeppelin.interpreters property in conf/zeppelin-site.xml
+1. Add your interpreter class name to the zeppelin.interpreters property in `conf/zeppelin-site.xml`.

-   Property value is comma separated [INTERPRETER_CLASS_NAME]
-for example,
+   Property value is a comma-separated list of [INTERPRETER\_CLASS\_NAME].
+   For example,

-  ```
+```
<property>
  <name>zeppelin.interpreters</name>
  <value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter,com.me.MyNewInterpreter</value>
</property>
```

-3. start zeppelin by running ```./bin/zeppelin-deamon start```
-4. in the interpreter page, click the +Create button and configure your interpreter properties.
+2. Add your interpreter to the [default configuration](https://github.com/apache/incubator-zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/conf/ZeppelinConfiguration.java#L397) which is used when there is no `zeppelin-site.xml`.
+
+3. Start Zeppelin by running `./bin/zeppelin-daemon.sh start`.
+
+4. In the interpreter page, click the `+Create` button and configure your interpreter properties.

Now you are done and ready to use your interpreter.
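+As a concrete starting point, below is a minimal sketch of such an interpreter. The method names come from the `Interpreter` abstract class linked above; treat the exact signatures, and the `com.me` / `myintp` names, as assumptions to verify against the class itself before relying on them.
+
+```
+package com.me;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.zeppelin.interpreter.Interpreter;
+import org.apache.zeppelin.interpreter.InterpreterContext;
+import org.apache.zeppelin.interpreter.InterpreterResult;
+
+public class MyNewInterpreter extends Interpreter {
+
+  // Registers this class under the name used in the %myintp directive.
+  static {
+    Interpreter.register("myintp", MyNewInterpreter.class.getName());
+  }
+
+  public MyNewInterpreter(Properties property) {
+    super(property);
+  }
+
+  @Override
+  public void open() {
+    // Allocate connections or start your backend here.
+  }
+
+  @Override
+  public void close() {
+    // Release everything acquired in open().
+  }
+
+  @Override
+  public InterpreterResult interpret(String st, InterpreterContext context) {
+    // A real interpreter would evaluate the paragraph text; this sketch echoes it.
+    return new InterpreterResult(InterpreterResult.Code.SUCCESS, st);
+  }
+
+  @Override
+  public void cancel(InterpreterContext context) {
+    // Interrupt a running paragraph here if the backend supports it.
+  }
+
+  @Override
+  public FormType getFormType() {
+    return FormType.SIMPLE;
+  }
+
+  @Override
+  public int getProgress(InterpreterContext context) {
+    return 0;
+  }
+
+  @Override
+  public List<String> completion(String buf, int cursor) {
+    return Collections.emptyList();
+  }
+}
+```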
-Note that the interpreters shipped with zeppelin have a [default configuration](https://github.com/apache/incubator-zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/conf/ZeppelinConfiguration.java#L397) which is used when there is no zeppelin-site.xml.
+Note that the interpreters shipped with Zeppelin have a [default configuration](https://github.com/apache/incubator-zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/conf/ZeppelinConfiguration.java#L397) which is used when there is no `conf/zeppelin-site.xml`.

### Use your interpreter

#### 0.5.0

-Inside of a notebook, %[INTERPRETER_NAME] directive will call your interpreter.
+Inside of a notebook, the `%[INTERPRETER_NAME]` directive will call your interpreter.
Note that the first interpreter configuration in zeppelin.interpreters will be the default one.

-for example
+For example,

```
%myintp
@@ -100,16 +98,14 @@ println(a)
#### 0.6.0 and later

-Inside of a notebook, %[INTERPRETER\_GROUP].[INTERPRETER\_NAME] directive will call your interpreter.
+Inside of a notebook, the `%[INTERPRETER_GROUP].[INTERPRETER_NAME]` directive will call your interpreter.
Note that the first interpreter configuration in zeppelin.interpreters will be the default one.

-You can omit either [INTERPRETER\_GROUP] or [INTERPRETER\_NAME]. Omit [INTERPRETER\_NAME] selects first available interpreter in the [INTERPRETER\_GROUP].
-Omit '[INTERPRETER\_GROUP]' will selects [INTERPRETER\_NAME] from default interpreter group.
-
+You can omit either [INTERPRETER\_GROUP] or [INTERPRETER\_NAME]. If you omit [INTERPRETER\_NAME], then the first available interpreter in the [INTERPRETER\_GROUP] will be selected.
+Likewise, if you skip [INTERPRETER\_GROUP], then [INTERPRETER\_NAME] will be chosen from the default interpreter group.

-For example, if you have two interpreter myintp1 and myintp2 in group mygrp,
-you can call myintp1 like
+For example, if you have two interpreters myintp1 and myintp2 in group mygrp, you can call myintp1 like

```
%mygrp.myintp1
@@ -125,7 +121,7 @@ and you can call myintp2 like
codes for myintp2
```

-If you omit your interpreter name, it'll selects first available interpreter in the group (myintp1)
+If you omit your interpreter name, it'll select the first available interpreter in the group ( myintp1 ).

```
%mygrp
diff --git a/docs/interpreter/cassandra.md b/docs/interpreter/cassandra.md
index eded545e52d..2eb86c3d000 100644
--- a/docs/interpreter/cassandra.md
+++ b/docs/interpreter/cassandra.md
@@ -6,10 +6,8 @@ group: manual
---
{% include JB/setup %}
-
## 1. Cassandra CQL Interpreter for Apache Zeppelin -
@@ -23,35 +21,32 @@ group: manual
Name
-
## 2. Enabling Cassandra Interpreter - In a notebook, to enable the **Cassandra** interpreter, click on the **Gear** icon and select **Cassandra** + In a notebook, to enable the **Cassandra** interpreter, click on the **Gear** icon and select **Cassandra**. -
- ![Interpreter Binding](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterBinding.png) - - ![Interpreter Selection](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterSelection.png) +
+ +
+
-
- + ## 3. Using the Cassandra Interpreter In a paragraph, use **_%cassandra_** to select the **Cassandra** interpreter and then input all commands. - To access the interactive help, type **HELP;** + To access the interactive help, type `HELP;` -
- ![Interactive Help](../assets/themes/zeppelin/img/docs-img/cassandra-InteractiveHelp.png) +
+
-
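+For example, a first paragraph using this interpreter could simply ask for the built-in help:
+
+```
+%cassandra
+
+HELP;
+```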
## 4. Interpreter Commands - The **Cassandra** interpreter accepts the following commands + The **Cassandra** interpreter accepts the following commands.
@@ -85,10 +80,9 @@ group: manual -
All CQL-compatible statements (SELECT, INSERT, CREATE ...) All CQL statements are executed directly against the Cassandra server
+
-
## 5. CQL statements This interpreter is compatible with any CQL statement supported by Cassandra. Ex: @@ -97,7 +91,7 @@ This interpreter is compatible with any CQL statement supported by Cassandra. Ex INSERT INTO users(login,name) VALUES('jdoe','John DOE'); SELECT * FROM users WHERE login='jdoe'; -``` +``` Each statement should be separated by a semi-colon ( **;** ) except the special commands below: @@ -149,7 +143,7 @@ This means that the following statements are equivalent and valid: ``` The complete list of all CQL statements and versions can be found below: -
+
@@ -163,7 +157,7 @@ The complete list of all CQL statements and versions can be found below: http://docs.datastax.com/en/cql/3.3/cql/cqlIntro.html - + - + - +
Cassandra Version
2.1 & 2.0 @@ -172,7 +166,7 @@ The complete list of all CQL statements and versions can be found below: http://docs.datastax.com/en/cql/3.1/cql/cql_intro_c.html
1.2 @@ -181,11 +175,10 @@ The complete list of all CQL statements and versions can be found below: http://docs.datastax.com/en/cql/3.0/cql/aboutCQL.html
-
## 6. Comments in statements @@ -203,21 +196,19 @@ It is possible to add comments between statements. Single line comments start wi Insert into users(login,name) vAlues('hsue','Helen SUE'); ``` -
## 7. Syntax Validation

The interpreter is shipped with a built-in syntax validator. This validator only checks for basic syntax errors.

-All CQL-related syntax validation is delegated directly to **Cassandra**
+All CQL-related syntax validation is delegated directly to **Cassandra**.

Most of the time, syntax errors are due to **missing semi-colons** between statements or **typo errors**.

-
## 8. Schema commands To make schema discovery easier and more interactive, the following commands are supported: -
+
@@ -226,67 +217,65 @@ To make schema discovery easier and more interactive, the following commands are - + - + - + - + - + - + - - + + - + - + - - + + - - + +
Command
DESCRIBE CLUSTER; Show the current cluster name and its partitioner
DESCRIBE KEYSPACES; List all existing keyspaces in the cluster and their configuration (replication factor, durable write ...)
DESCRIBE TABLES; List all existing keyspaces in the cluster and for each, all the tables name
DESCRIBE TYPES; List all existing user defined types in the current (logged) keyspace
DESCRIBE FUNCTIONS <keyspace_name>; List all existing user defined functions in the given keyspace
DESCRIBE AGGREGATES <keyspace_name>; List all existing user defined aggregates in the given keyspace
DESCRIBE KEYSPACE <keyspace_name>;Describe the given keyspace configuration and all its table details (name, columns, ...)
Describe the given keyspace configuration and all its table details (name, columns, ...)
DESCRIBE TABLE (<keyspace_name>).<table_name>; Describe the given table. If the keyspace is not provided, the current logged in keyspace is used. If there is no logged in keyspace, the default system keyspace is used. - If no table is found, an error message is raised + If no table is found, an error message is raised.
DESCRIBE TYPE (<keyspace_name>).<type_name>; Describe the given type(UDT). If the keyspace is not provided, the current logged in keyspace is used. If there is no logged in keyspace, the default system keyspace is used. - If no type is found, an error message is raised + If no type is found, an error message is raised.
DESCRIBE FUNCTION (<keyspace_name>).<function_name>;Describe the given user defined function. The keyspace is optional
Describe the given user defined function. The keyspace is optional.
DESCRIBE AGGREGATE (<keyspace_name>).<aggregate_name>;Describe the given user defined aggregate. The keyspace is optional
Describe the given user defined aggregate. The keyspace is optional.
-
+
The schema objects (cluster, keyspace, table, type, function and aggregate) are displayed in a tabular format. There is a drop-down menu on the top left corner to expand objects details. On the top right menu is shown the Icon legend. -
![Describe Schema](../assets/themes/zeppelin/img/docs-img/cassandra-DescribeSchema.png)
-
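+As an illustration, a discovery paragraph mixing several of the commands above could look like this ( the keyspace name is only an example, and assumes the interpreter accepts several statements per paragraph as described in section 5 ):
+
+```
+%cassandra
+
+DESCRIBE CLUSTER;
+DESCRIBE KEYSPACES;
+DESCRIBE KEYSPACE spark_demo;
+```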
## 9. Runtime Parameters @@ -294,8 +283,7 @@ Sometimes you want to be able to pass runtime query parameters to your statement Those parameters are not part of the CQL specs and are specific to the interpreter. Below is the list of all parameters: -
-
+
@@ -305,38 +293,37 @@ Below is the list of all parameters: - + - + - + - +
Parameter
Consistency Level @consistency=valueApply the given consistency level to all queries in the paragraphApply the given consistency level to all queries in the paragraph.
Serial Consistency Level @serialConsistency=valueApply the given serial consistency level to all queries in the paragraphApply the given serial consistency level to all queries in the paragraph.
Timestamp @timestamp=long value Apply the given timestamp to all queries in the paragraph. - Please note that timestamp value passed directly in CQL statement will override this value + Please note that timestamp value passed directly in CQL statement will override this value.
Retry Policy @retryPolicy=valueApply the given retry policy to all queries in the paragraphApply the given retry policy to all queries in the paragraph.
Fetch Size @fetchSize=integer valueApply the given fetch size to all queries in the paragraphApply the given fetch size to all queries in the paragraph.
Some parameters only accept restricted values: -
-
+
@@ -344,11 +331,11 @@ Below is the list of all parameters: - + - + @@ -356,7 +343,7 @@ Below is the list of all parameters: - + @@ -365,7 +352,7 @@ Below is the list of all parameters:
Parameter
Consistency LevelALL, ANY, ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUMALL, ANY, ONE, TWO, THREE, QUORUM, LOCAL\_ONE, LOCAL\_QUORUM, EACH\_QUORUM
Serial Consistency LevelSERIAL, LOCAL_SERIALSERIAL, LOCAL\_SERIAL
Timestamp
Retry PolicyDEFAULT, DOWNGRADING_CONSISTENCY, FALLTHROUGH, LOGGING_DEFAULT, LOGGING_DOWNGRADING, LOGGING_FALLTHROUGHDEFAULT, DOWNGRADING\_CONSISTENCY, FALLTHROUGH, LOGGING\_DEFAULT, LOGGING\_DOWNGRADING, LOGGING\_FALLTHROUGH
Fetch Size
->Please note that you should **not** add semi-colon ( **;** ) at the end of each parameter statement
+>Please note that you should **not** add a semi-colon ( **;** ) at the end of each parameter statement.

Some examples:

@@ -395,12 +382,11 @@ Some examples:

Some remarks about query parameters:

-> 1. **many** query parameters can be set in the same paragraph
-> 2. if the **same** query parameter is set many time with different values, the interpreter only take into account the first value
-> 3. each query parameter applies to **all CQL statements** in the same paragraph, unless you override the option using plain CQL text (like forcing timestamp with the USING clause)
-> 4. the order of each query parameter with regard to CQL statement does not matter
+> 1. **Many** query parameters can be set in the same paragraph.
+> 2. If the **same** query parameter is set many times with different values, the interpreter only takes into account the first value.
+> 3. Each query parameter applies to **all CQL statements** in the same paragraph, unless you override the option using plain CQL text ( like forcing the timestamp with the USING clause ).
+> 4. The order of each query parameter with regard to the CQL statements does not matter.

-
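+To make the remarks above concrete, a paragraph combining several runtime parameters could look like this ( the values and the table are illustrative ):
+
+```
+%cassandra
+
+@consistency=QUORUM
+@fetchSize=100
+
+SELECT * FROM spark_demo.albums;
+```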
## 10. Support for Prepared Statements @@ -424,17 +410,15 @@ Example: @remove_prepare[statement_name] ``` -
#### a. @prepare -
-You can use the syntax _"@prepare[statement_name]=SELECT ..."_ to create a prepared statement.
-The _statement_name_ is **mandatory** because the interpreter prepares the given statement with the Java driver and
-saves the generated prepared statement in an **internal hash map**, using the provided _statement_name_ as search key.
+You can use the syntax `@prepare[statement_name]=SELECT ...` to create a prepared statement.
+The `statement_name` is **mandatory** because the interpreter prepares the given statement with the Java driver and
+saves the generated prepared statement in an **internal hash map**, using the provided `statement_name` as search key.

> Please note that this internal prepared statement map is shared with **all notebooks** and **all paragraphs** because
-there is only one instance of the interpreter for Cassandra
+there is only one instance of the interpreter for Cassandra.

-> If the interpreter encounters **many** @prepare for the **same _statement_name_ (key)**, only the **first** statement will be taken into account.
+> If the interpreter encounters **many** @prepare for the **same statement_name (key)**, only the **first** statement will be taken into account.

Example:

```
@prepare[select]=SELECT * FROM spark_demo.albums LIMIT ?
@prepare[select]=SELECT * FROM spark_demo.artists LIMIT ?
-```
+```

-For the above example, the prepared statement is _SELECT * FROM spark_demo.albums LIMIT ?_.
-_SELECT * FROM spark_demo.artists LIMIT ?_ is ignored because an entry already exists in the prepared statements map with the key select.
+For the above example, the prepared statement is `SELECT * FROM spark_demo.albums LIMIT ?`.
+`SELECT * FROM spark_demo.artists LIMIT ?` is ignored because an entry already exists in the prepared statements map with the key `select`.

In the context of **Zeppelin**, a notebook can be scheduled to be executed at regular intervals,
thus it is necessary to **avoid re-preparing many times the same statement (considered an anti-pattern)**.

-
-
+ #### b. @bind -
-Once the statement is prepared (possibly in a separated notebook/paragraph). You can bind values to it: + +Once the statement is prepared ( possibly in a separated notebook/paragraph ). You can bind values to it: ``` @bind[select_first]=10 -``` +``` -Bound values are not mandatory for the **@bind** statement. However if you provide bound values, they need to comply to some syntax: +Bound values are not mandatory for the `@bind` statement. However if you provide bound values, they need to comply to some syntax: * String values should be enclosed between simple quotes ( ‘ ) * Date values should be enclosed between simple quotes ( ‘ ) and respect the formats: 1. yyyy-MM-dd HH:MM:ss 2. yyyy-MM-dd HH:MM:ss.SSS -* **null** is parsed as-is -* **boolean** (true|false) are parsed as-is +* **null** is parsed as-is. +* **boolean** (true|false) is parsed as-is. * collection values must follow the **[standard CQL syntax]**: * list: [‘list_item1’, ’list_item2’, ...] * set: {‘set_item1’, ‘set_item2’, …} * map: {‘key1’: ‘val1’, ‘key2’: ‘val2’, …} -* **tuple** values should be enclosed between parenthesis (see **[Tuple CQL syntax]**): (‘text’, 123, true) -* **udt** values should be enclosed between brackets (see **[UDT CQL syntax]**): {stree_name: ‘Beverly Hills’, number: 104, zip_code: 90020, state: ‘California’, …} +* **tuple** values should be enclosed between parenthesis ( see **[Tuple CQL syntax]** ): (‘text’, 123, true) +* **udt** values should be enclosed between brackets ( see **[UDT CQL syntax]** ): {stree_name: ‘Beverly Hills’, number: 104, zip_code: 90020, state: ‘California’, …} > It is possible to use the @bind statement inside a batch: > @@ -485,14 +468,12 @@ Bound values are not mandatory for the **@bind** statement. However if you provi > APPLY BATCH; > ``` -
#### c. @remove_prepare -
+
To prevent a prepared statement from staying forever in the prepared statement map, you can use the
-**@remove_prepare[statement_name]** syntax to remove it.
+`@remove_prepare[statement_name]` syntax to remove it.
Removing a non-existing prepared statement yields no error.

-
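+For example, to drop the `select` prepared statement created earlier:
+
+```
+%cassandra
+
+@remove_prepare[select]
+```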
## 11. Using Dynamic Forms @@ -529,7 +510,6 @@ It is also possible to use dynamic forms for **prepared statements**: {% endraw %} -
## 12. Execution parallelism and shared states @@ -543,44 +523,34 @@ Consequently, if you use the **USE _keyspace name_;** statement to log into a ke per instance of **Cassandra** interpreter. The same remark does apply to the **prepared statement hash map**, it is shared by **all users** using the same instance of **Cassandra** interpreter. - Until **Zeppelin** offers a real multi-users separation, there is a work-around to segregate user environment and states: -_create different **Cassandra** interpreter instances_ +create different **Cassandra** interpreter instances. -For this, first go to the **Interpreter** menu and click on the **Create** button -
-
+For this, first go to the **Interpreter** menu and click on the **Create** button.
![Create Interpreter](../assets/themes/zeppelin/img/docs-img/cassandra-NewInterpreterInstance.png)
In the interpreter creation form, put **cass-instance2** as **Name** and select the **cassandra** -in the interpreter drop-down list -
-
+in the interpreter drop-down list.
![Interpreter Name](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterName.png) -
+
Click on **Save** to create the new interpreter instance. Now you should be able to see it in the interpreter list. -
-
![Interpreter In List](../assets/themes/zeppelin/img/docs-img/cassandra-NewInterpreterInList.png) -
+
Go back to your notebook and click on the **Gear** icon to configure interpreter bindings. You should be able to see and select the **cass-instance2** interpreter instance in the available interpreter list instead of the standard **cassandra** instance. -
-
![Interpreter Instance Selection](../assets/themes/zeppelin/img/docs-img/cassandra-InterpreterInstanceSelection.png)
-
## 13. Interpreter Configuration

@@ -638,7 +608,7 @@ Below are the configuration parameters and their default value.
It is strongly recommended to keep the default value and prefix the table name with the actual keyspace
- in all of your queries
+ in all of your queries.
system
@@ -649,7 +619,7 @@ Below are the configuration parameters and their default value.
Load balancing policy. Default = new TokenAwarePolicy(new DCAwareRoundRobinPolicy())
To specify your own policy, provide the fully qualified class name (FQCN) of your policy.
At runtime the interpreter will instantiate the policy using
- Class.forName(FQCN)
+ Class.forName(FQCN).
DEFAULT
@@ -723,7 +693,7 @@ Below are the configuration parameters and their default value.
Cassandra query default consistency level
- Available values: ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM, ALL
+ Available values: ONE, TWO, THREE, QUORUM, LOCAL\_ONE, LOCAL\_QUORUM, EACH\_QUORUM, ALL
ONE
@@ -748,7 +718,7 @@ Below are the configuration parameters and their default value.
Default = new ExponentialReconnectionPolicy(1000, 10 * 60 * 1000)
To specify your own policy, provide the fully qualified class name (FQCN) of your policy.
At runtime the interpreter will instantiate the policy using
- Class.forName(FQCN)
+ Class.forName(FQCN).
DEFAULT
@@ -759,7 +729,7 @@ Below are the configuration parameters and their default value.
Default = DefaultRetryPolicy.INSTANCE
To specify your own policy, provide the fully qualified class name (FQCN) of your policy.
At runtime the interpreter will instantiate the policy using
- Class.forName(FQCN)
+ Class.forName(FQCN).
DEFAULT
@@ -785,18 +755,17 @@ Below are the configuration parameters and their default value.
Default = NoSpeculativeExecutionPolicy.INSTANCE
To specify your own policy, provide the fully qualified class name (FQCN) of your policy.
At runtime the interpreter will instantiate the policy using
- Class.forName(FQCN)
+ Class.forName(FQCN).
DEFAULT

-
## 14. Bugs & Contacts If you encounter a bug for this interpreter, please create a **[JIRA]** ticket and ping me on Twitter - at **[@doanduyhai]** + at **[@doanduyhai]**. [Cassandra Java Driver]: https://github.com/datastax/java-driver diff --git a/docs/interpreter/elasticsearch.md b/docs/interpreter/elasticsearch.md index 34a23ba3543..ba1b6f78868 100644 --- a/docs/interpreter/elasticsearch.md +++ b/docs/interpreter/elasticsearch.md @@ -8,10 +8,11 @@ group: manual ## Elasticsearch Interpreter for Apache Zeppelin +[Elasticsearch](https://www.elastic.co/products/elasticsearch) is a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze big volumes of data quickly and in near real time. It is generally used as the underlying engine/technology that powers applications that have complex search features and requirements. -### 1. Configuration +
+## 1. Configuration -
@@ -31,7 +32,7 @@ group: manual - + @@ -45,22 +46,17 @@ group: manual -> Note #1: you can add more properties to configure the Elasticsearch client. +> **Note #1 :** You can add more properties to configure the Elasticsearch client. -> Note #2: if you use Shield, you can add a property named `shield.user` with a value containing the name and the password (format: `username:password`). For more details about Shield configuration, consult the [Shield reference guide](https://www.elastic.co/guide/en/shield/current/_using_elasticsearch_java_clients_with_shield.html). Do not forget, to copy the shield client jar in the interpreter directory (`ZEPPELIN_HOME/interpreters/elasticsearch`). +> **Note #2 :** If you use Shield, you can add a property named `shield.user` with a value containing the name and the password ( format: `username:password` ). For more details about Shield configuration, consult the [Shield reference guide](https://www.elastic.co/guide/en/shield/current/_using_elasticsearch_java_clients_with_shield.html). Do not forget, to copy the shield client jar in the interpreter directory (`ZEPPELIN_HOME/interpreters/elasticsearch`). - -
- -### 2. Enabling the Elasticsearch Interpreter +
+## 2. Enabling the Elasticsearch Interpreter In a notebook, to enable the **Elasticsearch** interpreter, click the **Gear** icon and select **Elasticsearch**. - -
- - -### 3. Using the Elasticsearch Interpreter +
+## 3. Using the Elasticsearch Interpreter In a paragraph, use `%elasticsearch` to select the Elasticsearch interpreter and then input all commands. To get the list of available commands, use `help`. @@ -82,14 +78,14 @@ Commands: . same comments as for the search - get /index/type/id - delete /index/type/id - - index /ndex/type/id + - index /index/type/id . the id can be omitted, elasticsearch will generate one ``` -> Tip: use (CTRL + .) for completion +> **Tip :** Use ( Ctrl + . ) for autocompletion. -#### get +### Get With the `get` command, you can find a document by id. The result is a JSON document. ```bash @@ -101,12 +97,12 @@ Example: ![Elasticsearch - Get](../assets/themes/zeppelin/img/docs-img/elasticsearch-get.png) -#### search +### Search With the `search` command, you can send a search query to Elasticsearch. There are two formats of query: * You can provide a JSON-formatted query, that is exactly what you provide when you use the REST API of Elasticsearch. * See [Elasticsearch search API reference document](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html) for more details about the content of the search queries. -* You can also provide the content of a `query_string` +* You can also provide the content of a `query_string`. * This is a shortcut to a query like that: `{ "query": { "query_string": { "query": "__HERE YOUR QUERY__", "analyze_wildcard": true } } }` * See [Elasticsearch query string syntax](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax) for more details about the content of such a query. @@ -134,7 +130,7 @@ Examples: ```bash | %elasticsearch -| search / { "query": { "match_all": {} } } +| search / { "query": { "match_all": { } } } | | %elasticsearch | search /logs { "query": { "query_string": { "query": "request.method:GET AND status:200" } } } @@ -159,7 +155,7 @@ Examples: | search /logs (404 AND (POST OR DELETE)) ``` -> **Important**: a document in Elasticsearch is a JSON document, so it is hierarchical, not flat as a row in a SQL table. +> **Important** : a document in Elasticsearch is a JSON document, so it is hierarchical, not flat as a row in a SQL table. For the Elastic interpreter, the result of a search query is flattened. Suppose we have a JSON document: @@ -179,12 +175,10 @@ Suppose we have a JSON document: The data will be flattened like this: - content_length | date | request.headers[0] | request.headers[1] | request.method | request.url | status ---------------|------|--------------------|--------------------|----------------|-------------|------- 1234 | 2015-12-08T21:03:13.588Z | Accept: \*.\* | Host: apache.org | GET | /zeppelin/4cd001cd-c517-4fa9-b8e5-a06b8f4056c4 | 403 - Examples: * With a table containing the results: @@ -206,7 +200,7 @@ Examples: ![Elasticsearch - Search with aggregation (multi-bucket)](../assets/themes/zeppelin/img/docs-img/elasticsearch-agg-multi-bucket-pie.png) -#### count +### Count With the `count` command, you can count documents available in some indices and types. You can also provide a query. ```bash @@ -223,7 +217,7 @@ Examples: ![Elasticsearch - Count with query](../assets/themes/zeppelin/img/docs-img/elasticsearch-count-with-query.png) -#### index +### Index With the `index` command, you can insert/update a document in Elasticsearch. ```bash @@ -234,7 +228,7 @@ With the `index` command, you can insert/update a document in Elasticsearch. 
| index /index/type ``` -#### delete +### Delete With the `delete` command, you can delete a document. ```bash @@ -243,14 +237,13 @@ With the `delete` command, you can delete a document. ``` +### Apply Zeppelin Dynamic Forms -#### Apply Zeppelin Dynamic Forms - -You can leverage [Zeppelin Dynamic Form]({{BASE_PATH}}/manual/dynamicform.html) inside your queries. You can use both the `text input` and `select form` parameterization features +You can leverage [Zeppelin Dynamic Form]({{BASE_PATH}}/manual/dynamicform.html) inside your queries. You can use both the `text input` and `select form` parameterization features. ```bash | %elasticsearch | size ${limit=10} -| search /index/type { "query": { "match_all": {} } } +| search /index/type { "query": { "match_all": { } } } ``` diff --git a/docs/interpreter/flink.md b/docs/interpreter/flink.md index ce1f7800814..6af339b2c53 100644 --- a/docs/interpreter/flink.md +++ b/docs/interpreter/flink.md @@ -8,13 +8,13 @@ group: manual ## Flink interpreter for Apache Zeppelin -[Apache Flink](https://flink.apache.org) is an open source platform for distributed stream and batch data processing. +[Apache Flink](https://flink.apache.org) is an open source platform for distributed stream and batch data processing. Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Flink also builds batch processing on top of the streaming engine, overlaying native iteration support, managed memory, and program optimization. - -### How to start local Flink cluster, to test the interpreter +
+## How to start local Flink cluster, to test the interpreter Zeppelin comes with pre-configured flink-local interpreter, which starts Flink in a local mode on your machine, so you do not need to install anything. -### How to configure interpreter to point to Flink cluster +## How to configure interpreter to point to Flink cluster At the "Interpreters" menu, you have to create a new Flink interpreter and provide next properties:
Property
elasticsearch.port 9300Connection port (important: this is not the HTTP port, but the transport port)Connection port ( Important: this is not the HTTP port, but the transport port )
elasticsearch.result.size
@@ -33,18 +33,13 @@ At the "Interpreters" menu, you have to create a new Flink interpreter and provi - - - - -
6123 port of running JobManager
xxxyyyanything else from [Flink Configuration](https://ci.apache.org/projects/flink/flink-docs-release-0.9/setup/config.html)
-
+For more information about Flink configuration, see [here](https://ci.apache.org/projects/flink/flink-docs-release-0.10/setup/config.html).

-### How to test it's working
+## How to test it's working

-In example, by using the [Zeppelin notebook](https://www.zeppelinhub.com/viewer/notebooks/aHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL05GTGFicy96ZXBwZWxpbi1ub3RlYm9va3MvbWFzdGVyL25vdGVib29rcy8yQVFFREs1UEMvbm90ZS5qc29u) is from [Till Rohrmann's presentation](http://www.slideshare.net/tillrohrmann/data-analysis-49806564) "Interactive data analysis with Apache Flink" for Apache Flink Meetup.
+For example, the following [Zeppelin notebook](https://www.zeppelinhub.com/viewer/notebooks/aHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL05GTGFicy96ZXBwZWxpbi1ub3RlYm9va3MvbWFzdGVyL25vdGVib29rcy8yQVFFREs1UEMvbm90ZS5qc29u) is from Till Rohrmann's presentation [Interactive data analysis with Apache Flink](http://www.slideshare.net/tillrohrmann/data-analysis-49806564) for an Apache Flink Meetup.

```
%sh
rm 10.txt.utf-8
wget http://www.gutenberg.org/ebooks/10.txt.utf-8
```

-```
+{% highlight scala %}
%flink
case class WordCount(word: String, frequency: Int)
val bible:DataSet[String] = env.readTextFile("10.txt.utf-8")
@@ -65,4 +60,4 @@ val wordCounts = partialCounts.groupBy("word").reduce{
(left, right) => WordCount(left.word, left.frequency + right.frequency)
}
val result10 = wordCounts.first(10).collect()
-```
+{% endhighlight %}
diff --git a/docs/interpreter/hive.md b/docs/interpreter/hive.md
index b37c421de12..c9e0a6c9a90 100644
--- a/docs/interpreter/hive.md
+++ b/docs/interpreter/hive.md
@@ -8,10 +8,10 @@ group: manual
## Hive Interpreter for Apache Zeppelin

+The [Apache Hive](https://hive.apache.org/) ™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.

-### Configuration
-
+## 1. Configuration @@ -31,48 +31,48 @@ group: manual - + - + - + - + - + - + - + - +
Property
default.user (Optional)Username of the connection( Optional ) Username of the connection
default.password (Optional)Password of the connection( Optional ) Password of the connection
default.xxx (Optional)Other properties used by the driver( Optional ) Other properties used by the driver
${prefix}.driver Driver class path of `%hive(${prefix})`Driver class path of %hive(${prefix})
${prefix}.url Url of `%hive(${prefix})`Url of %hive(${prefix})
${prefix}.user (Optional)Username of the connection of `%hive(${prefix})`( Optional ) Username of the connection of %hive(${prefix})
${prefix}.password (Optional)Password of the connection of `%hive(${prefix})`( Optional ) Password of the connection of %hive(${prefix})
${prefix}.xxx (Optional)Other properties used by the driver of `%hive(${prefix})`( Optional ) Other properties used by the driver of %hive(${prefix})
-This interpreter provides multiple configuration with ${prefix}. User can set a multiple connection properties by this prefix. It can be used like `%hive(${prefix})`.
+This interpreter provides multiple configurations via `${prefix}`. Users can set multiple connection properties with this prefix and use them like `%hive(${prefix})`.

-### How to use
+## 2. How to use

Basically, you can use

```
%hive
select * from my_table;
```

or

```
%hive(etl)
-- 'etl' is a ${prefix}
select * from my_table;
```

You can also run multiple queries up to 10 by default. Changing these settings is not implemented yet.

-#### Apply Zeppelin Dynamic Forms
+### Apply Zeppelin Dynamic Forms

-You can leverage [Zeppelin Dynamic Form]({{BASE_PATH}}/manual/dynamicform.html) inside your queries. You can use both the `text input` and `select form` parameterization features
+You can leverage [Zeppelin Dynamic Form]({{BASE_PATH}}/manual/dynamicform.html) inside your queries. You can use both the `text input` and `select form` parameterization features.

```sql
%hive
diff --git a/docs/interpreter/lens.md b/docs/interpreter/lens.md
index a3eb2848e04..5fc7a4584fb 100644
--- a/docs/interpreter/lens.md
+++ b/docs/interpreter/lens.md
@@ -25,7 +25,7 @@ In order to use Lens interpreters, you may install Apache Lens in some simple st
```

### Configuring Lens Interpreter
-At the "Interpreters" menu, you can to edit Lens interpreter or create new one. Zeppelin provides these properties for Lens.
+At the "Interpreters" menu, you can edit the Lens interpreter or create a new one. Zeppelin provides these properties for Lens.
diff --git a/docs/interpreter/markdown.md b/docs/interpreter/markdown.md
index 7c339d25083..08b44f84f20 100644
--- a/docs/interpreter/markdown.md
+++ b/docs/interpreter/markdown.md
@@ -10,13 +10,14 @@ group: manual
### Overview
[Markdown](http://daringfireball.net/projects/markdown/) is a plain text formatting syntax designed so that it can be converted to HTML.
-Zeppelin uses markdown4j, for more examples and extension support checkout [markdown4j](https://code.google.com/p/markdown4j/)
-In Zeppelin notebook you can use ``` %md ``` in the beginning of a paragraph to invoke the Markdown interpreter to generate static html from Markdown plain text.
+Zeppelin uses markdown4j. For more examples and extension support, please check out [markdown4j](https://code.google.com/p/markdown4j/).
+In Zeppelin notebook, you can use ` %md ` at the beginning of a paragraph to invoke the Markdown interpreter and generate static html from Markdown plain text.

In Zeppelin, Markdown interpreter is enabled by default.
-
+
+
### Example
The following example demonstrates the basic usage of Markdown in a Zeppelin notebook.
-
+
diff --git a/docs/interpreter/spark.md b/docs/interpreter/spark.md
index 20be7f8324a..87eaac1504e 100644
--- a/docs/interpreter/spark.md
+++ b/docs/interpreter/spark.md
@@ -7,7 +7,7 @@ group: manual
{% include JB/setup %}

-## Spark Interpreter
+## Spark Interpreter for Apache Zeppelin

[Apache Spark](http://spark.apache.org) is supported in Zeppelin with
Spark Interpreter group, which consisted of 4 interpreters.
@@ -40,18 +40,15 @@ Spark Interpreter group, which consisted of 4 interpreters.
+
+## Configuration -

- -### Configuration -
-
-Without any configuration, Spark interpreter works out of box in local mode. But if you want to connect to your Spark cluster, you'll need following two simple steps.
+Without any configuration, Spark interpreter works out of the box in local mode. But if you want to connect to your Spark cluster, you'll need to follow the two simple steps below.

-#### 1. export SPARK_HOME
+### 1. Export SPARK_HOME

-In **conf/zeppelin-env.sh**, export SPARK_HOME environment variable with your Spark installation path.
+In **conf/zeppelin-env.sh**, export the `SPARK_HOME` environment variable with your Spark installation path.

For example,

@@ -66,9 +63,7 @@ export HADOOP_CONF_DIR=/usr/lib/hadoop
export SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-csv_2.10:1.2.0"
```

-
-#### 2. set master in Interpreter menu.
+### 2. Set master in Interpreter menu

After starting Zeppelin, go to the **Interpreter** menu and edit the **master** property in your Spark interpreter setting. The value may vary depending on your Spark cluster deployment type.

For example,

-
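+the **master** property can take the usual Spark master URL forms ( hosts and ports below are placeholders ):
+
+```
+local[*]
+spark://masterhost:7077
+yarn-client
+mesos://host:5050
+```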
-That's it. Zeppelin will work with any version of Spark and any deployment type without rebuild Zeppelin in this way. (Zeppelin 0.5.5-incubating release works up to Spark 1.5.1) - -Note that without exporting SPARK_HOME, it's running in local mode with included version of Spark. The included version may vary depending on the build profile. +That's it. Zeppelin will work with any version of Spark and any deployment type without rebuilding Zeppelin in this way. ( Zeppelin 0.5.5-incubating release works up to Spark 1.5.2 ) -

-### SparkContext, SQLContext, ZeppelinContext -
+> Note that without exporting `SPARK_HOME`, it's running in local mode with included version of Spark. The included version may vary depending on the build profile. +
+## SparkContext, SQLContext, ZeppelinContext

SparkContext, SQLContext, ZeppelinContext are automatically created and exposed as variable names 'sc', 'sqlContext' and 'z', respectively, both in scala and python environments.

-Note that scala / python environment shares the same SparkContext, SQLContext, ZeppelinContext instance.
+> Note that the scala / python environments share the same SparkContext, SQLContext and ZeppelinContext instance.
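+For instance, the shared variables can be used directly in any paragraph ( a trivial sketch ):
+
+```
+%spark
+// 'sc' is the SparkContext that Zeppelin created for you
+val n = sc.parallelize(1 to 100).count()
+println(s"count = $n")
+```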
-
-### Dependency Management -
-There are two ways to load external library in spark interpreter. First is using Zeppelin's %dep interpreter and second is loading Spark properties. + +## Dependency Management +There are two ways to load external library in spark interpreter. First is using Zeppelin's `%dep` interpreter and second is loading Spark properties. -#### 1. Dynamic Dependency Loading via %dep interpreter +### 1. Dynamic Dependency Loading via %dep interpreter -When your code requires external library, instead of doing download/copy/restart Zeppelin, you can easily do following jobs using %dep interpreter. +When your code requires external library, instead of doing download/copy/restart Zeppelin, you can easily do following jobs using `%dep` interpreter. * Load libraries recursively from Maven repository * Load libraries from local filesystem @@ -112,7 +101,7 @@ When your code requires external library, instead of doing download/copy/restart * Automatically add libraries to SparkCluster (You can turn off) Dep interpreter leverages scala environment. So you can write any Scala code here. -Note that %dep interpreter should be used before %spark, %pyspark, %sql. +Note that `%dep` interpreter should be used before `%spark`, `%pyspark`, `%sql`. Here's usages. @@ -150,9 +139,7 @@ z.load("groupId:artifactId:version").exclude("groupId:*") z.load("groupId:artifactId:version").local() ``` - -
-#### 2. Loading Spark Properties
+### 2. Loading Spark Properties

Once `SPARK_HOME` is set in `conf/zeppelin-env.sh`, Zeppelin uses `spark-submit` as spark interpreter runner. `spark-submit` supports two ways to load configurations. The first is command line options such as --master, and Zeppelin can pass these options to `spark-submit` by exporting `SPARK_SUBMIT_OPTIONS` in conf/zeppelin-env.sh. The second is reading configuration options from `SPARK_HOME/conf/spark-defaults.conf`. Spark properties that users can set to distribute libraries are:
@@ -181,9 +168,8 @@ Once `SPARK_HOME` is set in `conf/zeppelin-env.sh`, Zeppelin uses `spark-submit`
Comma-separated list of files to be placed in the working directory of each executor.
-Note that adding jar to pyspark is only availabe via %dep interpreter at the moment
+> Note that adding jars to pyspark is only available via the `%dep` interpreter at the moment.

-
Here are a few examples:

* SPARK\_SUBMIT\_OPTIONS in conf/zeppelin-env.sh

@@ -197,40 +183,43 @@ Here are few examples:

spark.files /path/mylib1.py,/path/mylib2.egg,/path/mylib3.zip
-
-### ZeppelinContext -
+## ZeppelinContext Zeppelin automatically injects ZeppelinContext as variable 'z' in your scala/python environment. ZeppelinContext provides some additional functions and utility. -
-#### Object exchange +### Object Exchange ZeppelinContext extends map and it's shared between scala, python environment. So you can put some object from scala and read it from python, vise versa. +
+
-Put object from scala - -```scala +{% highlight scala %} +// Put object from scala %spark val myObject = ... z.put("objName", myObject) -``` +{% endhighlight %} -Get object from python +
+
-```python -%python +{% highlight python %} +# Get object from python +%pyspark myObject = z.get("objName") -``` - -
-#### Form creation +{% endhighlight %} + +
+
+### Form Creation ZeppelinContext provides functions for creating forms. In scala and python environments, you can create forms programmatically. +
+
-```scala +{% highlight scala %} %spark /* Create text input form */ z.input("formName") @@ -245,7 +234,30 @@ z.select("formName", Seq(("option1", "option1DisplayName"), /* Create select form with default value*/ z.select("formName", "option1", Seq(("option1", "option1DisplayName"), ("option2", "option2DisplayName"))) -``` +{% endhighlight %} + +
+
+ +{% highlight python %} +%pyspark +# Create text input form +z.input("formName") + +# Create text input form with default value +z.input("formName", "defaultValue") + +# Create select form +z.select("formName", [("option1", "option1DisplayName"), + ("option2", "option2DisplayName")]) + +# Create select form with default value +z.select("formName", [("option1", "option1DisplayName"), + ("option2", "option2DisplayName")], "option1") +{% endhighlight %} + +
+
In sql environment, you can create form in simple template.

```
%sql
select * from ${table=defaultTableName} where text like '%${search}%'
```

diff --git a/docs/manual/interpreters.md b/docs/manual/interpreters.md
index 48cd1a35f3d..fd4c2bcec5f 100644
--- a/docs/manual/interpreters.md
+++ b/docs/manual/interpreters.md
@@ -20,45 +20,50 @@ limitations under the License.
{% include JB/setup %}

-## Interpreters in zeppelin
+## Interpreters in Zeppelin
+In this section, we will explain the role of interpreters, interpreter groups and interpreter settings in Zeppelin.
+The concept of Zeppelin interpreter allows any language/data-processing-backend to be plugged into Zeppelin.
+Currently, Zeppelin supports many interpreters such as Scala ( with Apache Spark ), Python ( with Apache Spark ), SparkSQL, Hive, Markdown, Shell and so on.

-This section explain the role of Interpreters, interpreters group and interpreters settings in Zeppelin.
-Zeppelin interpreter concept allows any language/data-processing-backend to be plugged into Zeppelin.
-Currently Zeppelin supports many interpreters such as Scala(with Apache Spark), Python(with Apache Spark), SparkSQL, Hive, Markdown and Shell.

-### What is zeppelin interpreter?
+
+## What is Zeppelin interpreter?

+Zeppelin Interpreter is a plug-in which enables Zeppelin users to use a specific language/data-processing-backend. For example, to use Scala code in Zeppelin, you need the `%spark` interpreter.

-Zeppelin Interpreter is the plug-in which enable zeppelin user to use a specific language/data-processing-backend. For example to use scala code in Zeppelin, you need ```spark``` interpreter.

-When you click on the ```+Create``` button in the interpreter page the interpreter drop-down list box will present all the available interpreters on your server.
+When you click the ```+Create``` button in the interpreter page, the interpreter drop-down list box will show all the available interpreters on your server.

-### What is zeppelin interpreter setting?
+
+## What is Zeppelin Interpreter Setting?

Zeppelin interpreter setting is the configuration of a given interpreter on the Zeppelin server. For example, these are the properties that are required for the hive JDBC interpreter to connect to the Hive server.

-### What is zeppelin interpreter group?
-Every Interpreter belongs to an InterpreterGroup. InterpreterGroup is a unit of start/stop interpreter.
-By default, every interpreter belong to a single group but the group might contain more interpreters. For example, spark interpreter group include spark support, pySpark,
+
+## What is Zeppelin Interpreter Group?
+
+Every interpreter belongs to an **Interpreter Group**. An Interpreter Group is a unit of start/stop for interpreters.
+By default, every interpreter belongs to a single group, but the group might contain more interpreters. For example, the spark interpreter group includes Spark support, pySpark,
SparkSQL and the dependency loader.

-Technically, Zeppelin interpreters from the same group are running in the same JVM.
+Technically, Zeppelin interpreters from the same group are running in the same JVM. For more information about this, please check out [here](../development/writingzeppelininterpreter.html).

-Interpreters belong to a single group a registered together and all of their properties are listed in the interpreter setting.
+Each interpreter belongs to a single group and is registered together with it. All of their properties are listed in the interpreter setting, like in the image below.

-
+## Programming Languages for Interpreter

-If the interpreter uses a specific programming language (like Scala, Python, SQL), it is generally a good idea to add syntax highlighting support for that to the notebook paragraph editor.
+If the interpreter uses a specific programming language ( like Scala, Python, SQL ), it is generally recommended to add syntax highlighting support for that language to the notebook paragraph editor.

-To check out the list of languages supported, see the mode-*.js files under zeppelin-web/bower_components/ace-builds/src-noconflict or from github https://github.com/ajaxorg/ace-builds/tree/master/src-noconflict
+To check out the list of languages supported, see the `mode-*.js` files under `zeppelin-web/bower_components/ace-builds/src-noconflict` or from [github.com/ajaxorg/ace-builds](https://github.com/ajaxorg/ace-builds/tree/master/src-noconflict).

-To add a new set of syntax highlighting,
-1. add the mode-*.js file to zeppelin-web/bower.json (when built, zeppelin-web/src/index.html will be changed automatically)
-2. add to the list of `editorMode` in zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js - it follows the pattern 'ace/mode/x' where x is the name
-3. add to the code that checks for `%` prefix and calls `session.setMode(editorMode.x)` in `setParagraphMode` in zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js
+If you want to add a new set of syntax highlighting,
+
+1. Add the `mode-*.js` file to `zeppelin-web/bower.json` ( when built, `zeppelin-web/src/index.html` will be changed automatically ).
+2. Add to the list of `editorMode` in `zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js` - it follows the pattern 'ace/mode/x' where x is the name.
+3. Add to the code that checks for the `%` prefix and calls `session.setMode(editorMode.x)` in `setParagraphMode`, located in `zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js`.
diff --git a/docs/rest-api/rest-interpreter.md b/docs/rest-api/rest-interpreter.md
index 6e41fe87716..ccdea3f2f08 100644
--- a/docs/rest-api/rest-interpreter.md
+++ b/docs/rest-api/rest-interpreter.md
@@ -22,28 +22,29 @@ limitations under the License.
## Zeppelin REST API

Zeppelin provides several REST API's for interaction and remote activation of zeppelin functionality.
- All REST API are available starting with the following endpoint ```http://[zeppelin-server]:[zeppelin-port]/api```
-
+ All REST APIs are available starting with the following endpoint `http://[zeppelin-server]:[zeppelin-port]/api`.
Note that the Zeppelin REST APIs receive or return JSON objects; it is recommended that you install a JSON viewer such as
- [JSONView](https://chrome.google.com/webstore/detail/jsonview/chklaanhfefbnpoihckbnefhakgolnmc)
+ [JSON View](https://chrome.google.com/webstore/detail/jsonview/chklaanhfefbnpoihckbnefhakgolnmc).

- If you work with zeppelin and find a need for an additional REST API please [file an issue or send us mail](../../community.html)
+ If you work with Zeppelin and find a need for an additional REST API, please [file an issue or send us mail](http://zeppelin.incubator.apache.org/community.html).
-### Interpreter REST API list
+## Interpreter REST API List

- The role of registered interpreters, settings and interpreters group is described [here](../manual/interpreters.html)
+ The roles of registered interpreters, settings and interpreter groups are described [here](../manual/interpreters.html).

+### 1. List of Registered Interpreters & Interpreter Settings
+
List registered interpretersList of registered interpreters
DescriptionThis ```GET``` method return all the registered interpreters available on the server.This ```GET``` method returns all the registered interpreters available on the server.
URL200
Fail codeFail code 500
sample JSON response - Sample JSON response
 {
@@ -113,12 +113,12 @@ limitations under the License.
   
-      
+      
-      
+      
@@ -129,12 +129,11 @@ limitations under the License.
       
-      
+      
-      
+      
List interpreters settingsList of interpreters settings
DescriptionThis ```GET``` method return all the interpreters settings registered on the server.This ```GET``` method returns all the interpreters settings registered on the server.
URL200
Fail codeFail code 500
sample JSON response - Sample JSON response
 {
@@ -182,7 +181,8 @@ limitations under the License.
   

- +### 2. Create an Interpreter Setting + @@ -202,12 +202,11 @@ limitations under the License. - + - + - +
201
Fail codeFail code 500
sample JSON input - Sample JSON input
 {
@@ -227,8 +226,7 @@ limitations under the License.
       
sample JSON response - Sample JSON response
 {
@@ -256,7 +254,8 @@ limitations under the License.
   
   
 
- + +### 3. Update an Interpreter Setting @@ -276,12 +275,11 @@ limitations under the License. - + - + - +
200
Fail codeFail code 500
sample JSON input - Sample JSON input
 {
@@ -301,8 +299,7 @@ limitations under the License.
       
sample JSON response - Sample JSON response
 {
@@ -330,7 +327,8 @@ limitations under the License.
 
   
 
- +### 4. Delete an Interpreter Setting + @@ -354,17 +352,17 @@ limitations under the License. - +
500
sample JSON response - Sample JSON response -
{"status":"OK"}
+ {"status":"OK"}

- +### 5. Restart an Interpreter + @@ -373,7 +371,7 @@ limitations under the License. - + @@ -384,14 +382,13 @@ limitations under the License. - + - +
DescriptionThis ```PUT``` method restart the given interpreter id.This ```PUT``` method restarts the given interpreter id.
URL200
Fail codeFail code 500
sample JSON response - Sample JSON response -
{"status":"OK"}
+ {"status":"OK"}
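+As a quick illustration, the endpoints above can be exercised with any HTTP client, for example with curl ( the host, port and setting id below are placeholders ):
+
+```
+# List all interpreter settings
+curl http://localhost:8080/api/interpreter/setting
+
+# Restart one interpreter setting by id
+curl -X PUT http://localhost:8080/api/interpreter/setting/restart/2AYUGP2D5
+```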
diff --git a/docs/tutorial/tutorial.md b/docs/tutorial/tutorial.md
index 68b2ee7e83d..2192c220dba 100644
--- a/docs/tutorial/tutorial.md
+++ b/docs/tutorial/tutorial.md
@@ -17,20 +17,20 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
-### Zeppelin Tutorial
+## Zeppelin Tutorial

-We will assume you have Zeppelin installed already. If that's not the case, see [Install](../install/install.html).
+This tutorial walks you through some of the fundamental Zeppelin concepts. We will assume you have already installed Zeppelin. If not, please see [here](../install/install.html) first.

-Zeppelin's current main backend processing engine is [Apache Spark](https://spark.apache.org). If you're new to the system, you might want to start by getting an idea of how it processes data to get the most out of Zeppelin.
+The current main backend processing engine of Zeppelin is [Apache Spark](https://spark.apache.org). If you're new to this system, you might want to start by getting an idea of how it processes data to get the most out of Zeppelin.
-### Tutorial with Local File
+## Tutorial with Local File

-#### Data Refine
+### 1. Data Refine

Before you start the Zeppelin tutorial, you will need to download [bank.zip](http://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank.zip).

-First, to transform data from csv format into RDD of `Bank` objects, run following script. This will also remove header using `filter` function.
+First, to transform csv format data into an RDD of `Bank` objects, run the following script. This will also remove the header using the `filter` function.

```scala
val bankText = sc.textFile("yourPath/bank/bank-full.csv")

case class Bank(age:Integer, job:String, marital : String, education : String, balance : Integer)

-// split each line, filter out header (starts with "age"), and map it into Bank case class 
+// split each line, filter out header (starts with "age"), and map it into Bank case class
val bank = bankText.map(s=>s.split(";")).filter(s=>s(0)!="\"age\"").map(
s=>Bank(s(0).toInt,
s(1).replaceAll("\"", ""),
@@ -52,8 +51,7 @@ val bank = bankText.map(s=>s.split(";")).filter(s=>s(0)!="\"age\"").map(
bank.toDF().registerTempTable("bank")
```

-
-#### Data Retrieval +### 2. Data Retrieval Suppose we want to see age distribution from `bank`. To do this, run: @@ -74,9 +73,9 @@ Now we want to see age distribution with certain marital status and add combo bo ```
-### Tutorial with Streaming Data
+## Tutorial with Streaming Data

-#### Data Refine
+### 1. Data Refine

Since this tutorial is based on Twitter's sample tweet stream, you must configure authentication with a Twitter account. To do this, take a look at [Twitter Credential Setup](https://databricks-training.s3.amazonaws.com/realtime-processing-with-spark-streaming.html#twitter-credential-setup). After you get API keys, you should fill out the credential-related values (`apiKey`, `apiSecret`, `accessToken`, `accessTokenSecret`) with your API keys in the following script.

@@ -136,12 +135,11 @@ twt.print

ssc.start()
```

-
-#### Data Retrieval
+### 2. Data Retrieval

For each of the following scripts, every time you click the run button you will see a different result, since they are based on real-time data.

-Let's begin by extracting maximum 10 tweets which contain the word "girl".
+Let's begin by extracting a maximum of 10 tweets which contain the word **girl**.

```sql
%sql select * from tweets where text like '%girl%' limit 10
```

@@ -154,7 +152,7 @@ This time suppose we want to see how many tweets have been created per sec durin
```

-You can make user-defined function and use it in Spark SQL. Let's try it by making function named `sentiment`. This function will return one of the three attitudes(positive, negative, neutral) towards the parameter.
+You can make a user-defined function and use it in Spark SQL. Let's try it by making a function named `sentiment`. This function will return one of the three attitudes ( positive, negative, neutral ) towards the parameter.

```scala
def sentiment(s:String) : String = {