8 changes: 8 additions & 0 deletions conf/zeppelin-site.xml.template
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,14 @@
</property>
-->

<!-- For versioning your local notebook storage using a Git repository
<property>
<name>zeppelin.notebook.storage</name>
<value>org.apache.zeppelin.notebook.repo.GitNotebookRepo</value>
<description>notebook persistence layer implementation</description>
</property>
-->

<property>
<name>zeppelin.notebook.storage</name>
<value>org.apache.zeppelin.notebook.repo.VFSNotebookRepo</value>
Expand Down
3 changes: 2 additions & 1 deletion docs/_includes/themes/zeppelin/_navigation.html
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,8 @@
<li><a href="{{BASE_PATH}}/manual/notebookashomepage.html">Notebook as Homepage</a></li>
<li role="separator" class="divider"></li>
<!-- li><span><b>Notebook Storage</b><span></li -->
<li><a href="{{BASE_PATH}}/storage/storage.html">S3 Storage</a></li>
<li><a href="{{BASE_PATH}}/storage/storage.html#Git">Git Storage</a></li>
<li><a href="{{BASE_PATH}}/storage/storage.html#S3">S3 Storage</a></li>
<li role="separator" class="divider"></li>
<!-- li><span><b>REST API</b><span></li -->
<li><a href="{{BASE_PATH}}/rest-api/rest-interpreter.html">Interpreter API</a></li>
Expand Down
24 changes: 22 additions & 2 deletions docs/storage/storage.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,10 +19,30 @@ limitations under the License.
-->
### Notebook Storage

In Zeppelin there are two option for storage Notebook, by default the notebook is storage in the notebook folder in your local File System and the second option is S3.
Zeppelin has a pluggable notebook storage mechanism, controlled by the `zeppelin.notebook.storage` configuration option, with multiple implementations.
A few notebook storage implementations are available out of the box:
- (default) all notes are saved in the notebook folder on your local file system - `VFSNotebookRepo`
- the local notebook folder can additionally be versioned using a local Git repository - `GitNotebookRepo`
- notes can be stored in the Amazon S3 service - `S3NotebookRepo`

Multiple storages can be used at the same time by providing a comma-separated list of the class names in the configuration.
By default, only the first two of them will be automatically kept in sync by Zeppelin.
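
As a rough illustration of how a comma-separated storage value could be interpreted, here is a minimal Java sketch. It is not Zeppelin's actual parsing code: the class `StorageListSketch`, its methods, and the `MAX_SYNCED` constant are hypothetical names introduced only for this example, and the limit of two simply mirrors the statement above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Zeppelin: shows one way to read a
// comma-separated list of storage class names from a property value.
class StorageListSketch {
  // Assumption: mirrors "only the first two are kept in sync" from the text.
  static final int MAX_SYNCED = 2;

  /** Splits the property value into trimmed, non-empty class names. */
  static List<String> parseStorageClassNames(String value) {
    List<String> names = new ArrayList<String>();
    if (value == null) {
      return names;
    }
    for (String part : value.split(",")) {
      String name = part.trim();
      if (!name.isEmpty()) {
        names.add(name);
      }
    }
    return names;
  }

  /** Returns the leading entries that would take part in synchronization. */
  static List<String> syncedStorages(List<String> names) {
    int end = Math.min(MAX_SYNCED, names.size());
    return new ArrayList<String>(names.subList(0, end));
  }

  public static void main(String[] args) {
    String value = "org.apache.zeppelin.notebook.repo.GitNotebookRepo, "
        + "org.apache.zeppelin.notebook.repo.S3NotebookRepo";
    List<String> names = parseStorageClassNames(value);
    System.out.println("configured: " + names);
    System.out.println("synced:     " + syncedStorages(names));
  }
}
```

Trimming each entry means that spaces after the commas in the configuration value do not end up in the class names.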

</br>
#### Notebook Storage in local Git repository <a name="Git"></a>

To enable versioning for all your local notebooks through a standard Git repository, uncomment the following property in `zeppelin-site.xml` so that the `GitNotebookRepo` class is used:

```
<property>
<name>zeppelin.notebook.storage</name>
<value>org.apache.zeppelin.notebook.repo.GitNotebookRepo</value>
<description>notebook persistence layer implementation</description>
</property>
```

</br>
#### Notebook Storage in S3
#### Notebook Storage in S3 <a name="S3"></a>

For notebook storage in S3 you need AWS credentials. There are three options: the environment variables ```AWS_ACCESS_KEY_ID``` and ```AWS_SECRET_ACCESS_KEY```, a credentials file in the `.aws` folder in your home directory, or an IAM role for your instance. The following steps are necessary:

Expand Down
8 changes: 5 additions & 3 deletions zeppelin-distribution/src/bin_license/LICENSE
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
(Apache 2.0) nvd3.js v1.1.15-beta (http://nvd3.org/) - https://github.com/novus/nvd3/blob/v1.1.15-beta/LICENSE.md
(Apache 2.0) gson v2.2 (com.google.code.gson:gson:jar:2.2 - https://github.com/google/gson) - https://github.com/google/gson/blob/gson-2.2/LICENSE
(Apache 2.0) Amazon Web Services SDK for Java v1.10.1 (https://aws.amazon.com/sdk-for-java/) - https://raw.githubusercontent.com/aws/aws-sdk-java/1.10.1/LICENSE.txt

(Apache 2.0) JavaEWAH v0.7.9 (https://github.com/lemire/javaewah) - https://github.com/lemire/javaewah/blob/master/LICENSE-2.0.txt


The following components are provided under Apache License.
Expand Down Expand Up @@ -115,13 +115,15 @@ The text of each license is also included at licenses/LICENSE-[project]-[version
(BSD 3 Clause) d3 v2.10.2 (https://d3js.org/) - https://github.com/mbostock/d3/blob/v2.10.2/LICENSE
(BSD 3 Clause) ace-builds v1.1.9 (https://github.com/ajaxorg/ace-builds) - https://github.com/ajaxorg/ace-builds/blob/v1.1.9/LICENSE
(BSD 3 Clause) Ace v1.1.9 (http://ace.c9.io/) - https://github.com/ajaxorg/ace/blob/v1.1.9/LICENSE
(BSD Style) dom4j v1.6.1 (http://www.dom4j.org) - https://github.com/dom4j/dom4j/blob/dom4j_1_6_1/LICENSE.txt
(BSD Style) dom4j v1.6.1 (http://www.dom4j.org) - https://github.com/dom4j/dom4j/blob/dom4j_1_6_1/LICENSE.txt
(BSD Style) JSch v0.1.53 (http://www.jcraft.com) - http://www.jcraft.com/jsch/LICENSE.txt
(BSD 3 Clause) highlightjs v8.4.0 (https://highlightjs.org/) - https://github.com/isagalaev/highlight.js/blob/8.4/LICENSE



The following components are provided under the BSD-style License.

(New BSD License) JGit (org.eclipse.jgit:org.eclipse.jgit:jar:4.1.1.201511131810-r - https://eclipse.org/jgit/)
(New BSD License) Kryo (com.esotericsoftware.kryo:kryo:2.21 - http://code.google.com/p/kryo/)
(New BSD License) MinLog (com.esotericsoftware.minlog:minlog:1.2 - http://code.google.com/p/minlog/)
(New BSD License) ReflectASM (com.esotericsoftware.reflectasm:reflectasm:1.07 - http://code.google.com/p/reflectasm/)
Expand Down Expand Up @@ -155,7 +157,7 @@ EPL license
The following components are provided under the EPL License.

(EPL 1.0) Aether (org.sonatype.aether - http://www.eclipse.org/aether/)

(EPL 1.0) JDT Annotations For Enhanced Null Analysis (org.eclipse.jdt:org.eclipse.jdt.annotation:1.1.0 - https://repo.eclipse.org/content/repositories/eclipse-releases/org/eclipse/jdt/org.eclipse.jdt.annotation)


========================================================================
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,14 +19,12 @@

import java.io.File;
import java.io.IOException;
import java.lang.reflect.Constructor;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;

import javax.net.ssl.SSLContext;
import javax.servlet.DispatcherType;
import javax.servlet.Servlet;
import javax.ws.rs.core.Application;

import org.apache.cxf.jaxrs.servlet.CXFNonSpringJaxrsServlet;
Expand Down Expand Up @@ -60,22 +58,17 @@
/**
* Main class of Zeppelin.
*
* @author Leemoonsoo
*
*/

public class ZeppelinServer extends Application {
private static final Logger LOG = LoggerFactory.getLogger(ZeppelinServer.class);

private SchedulerFactory schedulerFactory;
public static Notebook notebook;

public static NotebookServer notebookServer;

public static Server jettyServer;

private InterpreterFactory replFactory;

private NotebookRepo notebookRepo;

public static void main(String[] args) throws Exception {
Expand Down Expand Up @@ -113,6 +106,7 @@ public static void main(String[] args) throws Exception {
try {
jettyServer.stop();
ZeppelinServer.notebook.getInterpreterFactory().close();
ZeppelinServer.notebook.close();
} catch (Exception e) {
LOG.error("Error while stopping servlet container", e);
}
Expand Down
32 changes: 23 additions & 9 deletions zeppelin-zengine/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -51,13 +51,13 @@
<artifactId>zeppelin-interpreter</artifactId>
<version>${project.version}</version>
</dependency>

<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-s3</artifactId>
<version>1.10.1</version>
</dependency>

<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
Expand Down Expand Up @@ -123,12 +123,6 @@
<artifactId>guava</artifactId>
</dependency>

<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.reflections</groupId>
<artifactId>reflections</artifactId>
Expand All @@ -151,11 +145,31 @@
<version>1.4.01</version>
</dependency>

<dependency>
<groupId>org.eclipse.jgit</groupId>
<artifactId>org.eclipse.jgit</artifactId>
<version>4.1.1.201511131810-r</version>
</dependency>

<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-all</artifactId>
<version>1.9.0</version>
<scope>test</scope>
</dependency>
</dependency>

<dependency>
<groupId>com.google.truth</groupId>
<artifactId>truth</artifactId>
<version>0.27</version>
<scope>test</scope>
</dependency>

</dependencies>
</project>
Original file line number Diff line number Diff line change
Expand Up @@ -275,7 +275,7 @@ private Note loadNoteFromRepo(String id) {
String noteId = snapshot.getAngularObject().getNoteId();
// at this point, remote interpreter process is not created.
// so does not make sense add it to the remote.
//
//
// therefore instead of addAndNotifyRemoteProcess(), need to use add()
// that results add angularObject only in ZeppelinServer side not remoteProcessSide
registry.add(name, snapshot.getAngularObject().get(), noteId);
Expand Down Expand Up @@ -457,4 +457,8 @@ public ZeppelinConfiguration getConf() {
return conf;
}

public void close() {
this.notebookRepo.close();
}

}
Original file line number Diff line number Diff line change
@@ -0,0 +1,126 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.zeppelin.notebook.repo;

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.zeppelin.conf.ZeppelinConfiguration;
import org.apache.zeppelin.notebook.Note;
import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.api.errors.GitAPIException;
import org.eclipse.jgit.diff.DiffEntry;
import org.eclipse.jgit.dircache.DirCache;
import org.eclipse.jgit.internal.storage.file.FileRepository;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.revwalk.RevCommit;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.google.common.base.Joiner;
import com.google.common.collect.Lists;

/**
* NotebookRepo that hosts all the notebook FS in a single Git repo
*
* This impl is intended to be simple and straightforward:
* - does not handle branches
* - only basic local git file repo, no remote GitHub push/pull yet
Member:

would we be handling push in the future? if so, how would it fit into the API or would there be new API for checkpointing/pushing to remote?

Member Author:

Checkpointing happens in Git on the commit level, so I was thinking that push, if there is enough user demand, can be implemented on the NotebookRepoSync level, like the FS->S3 sync does.
This should not require big API changes here.

Member:

+1 for remote push :)

*
* TODO(bzz): add default .gitignore
*/
public class GitNotebookRepo extends VFSNotebookRepo implements NotebookRepoVersioned {
private static final Logger LOG = LoggerFactory.getLogger(GitNotebookRepo.class);

private String localPath;
private Git git;

public GitNotebookRepo(ZeppelinConfiguration conf) throws IOException {
super(conf);
localPath = getRootDir().getName().getPath();
LOG.info("Opening a git repo at '{}'", localPath);
Repository localRepo = new FileRepository(Joiner.on(File.separator).join(localPath, ".git"));
if (!localRepo.getDirectory().exists()) {
LOG.info("Git repo {} does not exist, creating a new one", localRepo.getDirectory());
localRepo.create();
}
git = new Git(localRepo);
Member:

Member Author:

Very good point, thank you! addressed in 5d7ffea

maybeAddAndCommit(".");
}

@Override
public synchronized void save(Note note) throws IOException {
super.save(note);
maybeAddAndCommit(note.getId());
}

private void maybeAddAndCommit(String pattern) {
try {
List<DiffEntry> gitDiff = git.diff().call();
if (!gitDiff.isEmpty()) {
LOG.debug("Changes found for pattern '{}': {}", pattern, gitDiff);
DirCache added = git.add().addFilepattern(pattern).call();
LOG.debug("{} changes are about to be committed", added.getEntryCount());
git.commit().setMessage("Updated " + pattern).call();
} else {
LOG.debug("No changes found {}", pattern);
}
} catch (GitAPIException e) {
LOG.error("Failed to add+commit {} to Git", pattern, e);
}
}

@Override
public Note get(String noteId, String rev) throws IOException {
//TODO(bzz): something like 'git checkout rev', that will not change-the-world though
return super.get(noteId);
}

@Override
public List<Rev> history(String noteId) {
List<Rev> history = Lists.newArrayList();
LOG.debug("Listing history for {}:", noteId);
try {
Iterable<RevCommit> logs = git.log().addPath(noteId).call();
for (RevCommit log: logs) {
history.add(new Rev(log.getName(), log.getCommitTime()));
LOG.debug(" - ({},{})", log.getName(), log.getCommitTime());
}
} catch (GitAPIException e) {
LOG.error("Failed to get logs for {}", noteId, e);
}
return history;
}

@Override
public void close() {
git.getRepository().close();
}

//DI replacements for Tests
Git getGit() {
return git;
}

void setGit(Git git) {
this.git = git;
}


}
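
The "save, then commit only if something changed" flow and the per-note history listing in `GitNotebookRepo` can be illustrated without JGit. The sketch below is an in-memory stand-in written for this explanation only: `VersionedStoreSketch`, `Revision`, and the sequential revision ids are hypothetical and are not part of the PR, which delegates the real work to a Git repository.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration only; the PR itself uses JGit for versioning.
class VersionedStoreSketch {
  /** One recorded revision: a sequential id plus the content saved at that point. */
  static final class Revision {
    final int id;
    final String content;

    Revision(int id, String content) {
      this.id = id;
      this.content = content;
    }
  }

  private final Map<String, List<Revision>> log = new HashMap<String, List<Revision>>();
  private int nextId = 1;

  /**
   * Saves the content and records a new revision only when it differs from
   * the latest one. Returns true if a revision was recorded.
   */
  synchronized boolean save(String noteId, String content) {
    List<Revision> revisions = log.get(noteId);
    if (revisions == null) {
      revisions = new ArrayList<Revision>();
      log.put(noteId, revisions);
    }
    if (!revisions.isEmpty()
        && revisions.get(revisions.size() - 1).content.equals(content)) {
      return false; // nothing changed, so no new revision
    }
    revisions.add(new Revision(nextId++, content));
    return true;
  }

  /** Returns the revisions of one note, newest first. */
  synchronized List<Revision> history(String noteId) {
    List<Revision> result = new ArrayList<Revision>();
    List<Revision> revisions = log.get(noteId);
    if (revisions != null) {
      for (int i = revisions.size() - 1; i >= 0; i--) {
        result.add(revisions.get(i));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    VersionedStoreSketch store = new VersionedStoreSketch();
    store.save("note1", "first draft");
    store.save("note1", "first draft"); // unchanged, no revision recorded
    store.save("note1", "second draft");
    for (Revision rev : store.history("note1")) {
      System.out.println(rev.id + ": " + rev.content);
    }
  }
}
```

Skipping the revision when nothing changed plays the same role as the empty-diff check in `maybeAddAndCommit`: repeated saves of an unmodified note do not grow the history.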
Original file line number Diff line number Diff line change
Expand Up @@ -31,4 +31,9 @@ public interface NotebookRepo {
public Note get(String noteId) throws IOException;
public void save(Note note) throws IOException;
public void remove(String noteId) throws IOException;

/**
* Release any underlying resources
*/
public void close();
}
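
Since `close()` is now part of the repository interface, any wrapper that holds several storages needs to forward the call to each of them. A minimal sketch of that delegation follows, under the assumption that a failure in one storage should not prevent the others from being closed. The types `Repo`, `TrackingRepo`, `FailingRepo`, and `CompositeRepo` are hypothetical stand-ins, not Zeppelin classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins that only model the close() part of a repository.
class RepoCloseSketch {
  interface Repo {
    /** Release any underlying resources. */
    void close();
  }

  /** Remembers whether close() was called, for demonstration purposes. */
  static final class TrackingRepo implements Repo {
    boolean closed = false;

    public void close() {
      closed = true;
    }
  }

  /** Simulates a storage whose cleanup fails. */
  static final class FailingRepo implements Repo {
    public void close() {
      throw new IllegalStateException("close failed");
    }
  }

  /** Forwards close() to every delegate, continuing past individual failures. */
  static final class CompositeRepo implements Repo {
    private final List<Repo> delegates;

    CompositeRepo(List<Repo> delegates) {
      this.delegates = new ArrayList<Repo>(delegates);
    }

    public void close() {
      for (Repo repo : delegates) {
        try {
          repo.close();
        } catch (RuntimeException e) {
          System.err.println("Failed to close "
              + repo.getClass().getSimpleName() + ": " + e.getMessage());
        }
      }
    }
  }

  public static void main(String[] args) {
    TrackingRepo local = new TrackingRepo();
    List<Repo> repos = new ArrayList<Repo>();
    repos.add(new FailingRepo());
    repos.add(local);
    new CompositeRepo(repos).close();
    System.out.println("local closed: " + local.closed);
  }
}
```

Catching the exception per delegate keeps shutdown best-effort, which matches how the server's shutdown hook in this PR logs an error instead of aborting.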