Merge pull request #427 from c-martinez/gitlab-repo
Gitlab repo
albertmeronyo authored Sep 6, 2023
2 parents a7fc744 + d90f1b4 commit ab5ebf8
Showing 11 changed files with 62 additions and 30 deletions.
1 change: 1 addition & 0 deletions CONTRIBUTING.md
Expand Up @@ -58,6 +58,7 @@ services:
- USERMAP_GID=1000
- USERMAP_UID=1000
- GRLC_GITHUB_ACCESS_TOKEN=xxx
- GRLC_GITLAB_ACCESS_TOKEN=yyy
- GRLC_SERVER_NAME=grlc.io
```

Expand Down
2 changes: 2 additions & 0 deletions Dockerfile
Expand Up @@ -7,10 +7,12 @@ MAINTAINER albert.merono@vu.nl

# Default values for env variables
ARG GRLC_GITHUB_ACCESS_TOKEN=
ARG GRLC_GITLAB_ACCESS_TOKEN=
ARG GRLC_SERVER_NAME=grlc.io
ARG GRLC_SPARQL_ENDPOINT=http://dbpedia.org/sparql

ENV GRLC_GITHUB_ACCESS_TOKEN=$GRLC_GITHUB_ACCESS_TOKEN \
GRLC_GITLAB_ACCESS_TOKEN=$GRLC_GITLAB_ACCESS_TOKEN \
GRLC_SERVER_NAME=$GRLC_SERVER_NAME \
GRLC_SPARQL_ENDPOINT=$GRLC_SPARQL_ENDPOINT

Expand Down
36 changes: 26 additions & 10 deletions README.md
Expand Up @@ -30,7 +30,7 @@ If you use grlc in your work, please cite it as:
```

## What is grlc?
grlc is a lightweight server that takes SPARQL queries (stored in a GitHub repository, in your local filesystem, or listed in a URL), and translates them to Linked Data Web APIs. This enables universal access to Linked Data. Users are not required to know SPARQL to query their data, but instead can access a web API.
grlc is a lightweight server that takes SPARQL queries (stored in a GitHub or GitLab repository, in your local filesystem, or listed in a URL), and translates them to Linked Data Web APIs. This enables universal access to Linked Data. Users are not required to know SPARQL to query their data, but instead can access a web API.

## Quick tutorial
For a quick usage tutorial check out our wiki [walkthrough](https://github.com/CLARIAH/grlc/wiki/Quick-tutorial) and [list of features](https://github.com/CLARIAH/grlc/wiki/Features).
Expand All @@ -43,7 +43,7 @@ Your queries can add API parameters to each operation by using the [parameter ma
Your queries can include special [decorators](#decorator-syntax) to add extra functionality to your API.

### Query location
grlc can load your query collection from different locations: from a GitHub repository (`api-git`), from local storage (`api-local`), and from a specification file (`api-url`). Each type of location has specific features and is accessible via different paths. However all location types produce the same beautiful APIs.
grlc can load your query collection from different locations: from a GitHub repository (`api-git`), from a GitLab repository (`api-gitlab`), from local storage (`api-local`), and from a specification file (`api-url`). Each type of location has specific features and is accessible via different paths. However all location types produce the same beautiful APIs.

#### From a GitHub repository
> API path:
Expand All @@ -58,6 +58,19 @@ grlc can make use of git's version control mechanism to generate an API based on

grlc can also use a subdirectory inside your Github repo. This can be done by including a subdirectory in the URL path (`http://grlc-server/api-git/<user>/<repo>/subdir/<subdir>`).

#### From a GitLab repository
> API path:
`http://grlc-server/api-gitlab/<user>/<repo>`
grlc can build an API from any GitLab repository, specified by the GitLab user name of the owner (`<user>`) and repository name (`<repo>`).

For example, assuming your queries are stored on a GitLab repo: `https://gitlab.com/c-martinez/grlc-queries`, point your browser to the following location:
`http://grlc.io/api-gitlab/c-martinez/grlc-queries/`

grlc can make use of git's version control mechanism to generate an API based on a specific version of queries in the repository. This can be done by including the name of a branch in the URL path (`http://grlc-server/api-gitlab/<user>/<repo>/branch/<branch>`), for example: `http://grlc.io/api-gitlab/c-martinez/grlc-queries/branch/master`

grlc can also use a subdirectory inside your GitLab repo. This can be done by including a subdirectory in the URL path (`http://grlc-server/api-gitlab/<user>/<repo>/subdir/<subdir>`), for example: `http://grlc-server/api-gitlab/c-martinez/grlc-queries/subdir/subdir`.
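
The path patterns above can be sketched as a small helper. This is only an illustration: `gitlab_api_path` is a hypothetical name, not part of grlc, and because the text above does not describe combining a branch with a subdirectory, the sketch accepts only one of the two.

```python
def gitlab_api_path(server, user, repo, branch=None, subdir=None):
    """Build a grlc api-gitlab URL from the path patterns described in the README."""
    if branch and subdir:
        # Combining both selectors is not documented above, so this sketch refuses it.
        raise ValueError("pass either branch or subdir, not both")
    url = "{}/api-gitlab/{}/{}".format(server.rstrip("/"), user, repo)
    if branch:
        url += "/branch/" + branch
    elif subdir:
        url += "/subdir/" + subdir
    return url


print(gitlab_api_path("http://grlc.io", "c-martinez", "grlc-queries", branch="master"))
```

Leaving out both `branch` and `subdir` yields the plain `api-gitlab/<user>/<repo>` path.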

#### From local storage
> API path:
`http://grlc-server/api-local/`
Expand Down Expand Up @@ -255,6 +268,7 @@ Example [query](https://github.com/CLARIAH/grlc-queries/blob/master/transform.rq

Check these out:
- http://grlc.io/api-git/CLARIAH/grlc-queries
- http://grlc.io/api-gitlab/c-martinez/grlc-queries
- http://grlc.io/api-url?specUrl=https://raw.githubusercontent.com/CLARIAH/grlc-queries/master/urls.yml
- http://grlc.io/api-git/CLARIAH/wp4-queries-hisco
- http://grlc.io/api-git/albertmeronyo/lodapi
Expand Down Expand Up @@ -282,9 +296,9 @@ To run grlc via [docker](https://www.docker.com/), you'll need a working install
docker run -it --rm -p 8088:80 clariah/grlc
```

The docker image allows you to setup several environment variable such as `GRLC_SERVER_NAME` `GRLC_GITHUB_ACCESS_TOKEN` and `GRLC_SPARQL_ENDPOINT`:
The docker image allows you to set up several environment variables such as `GRLC_SERVER_NAME`, `GRLC_GITHUB_ACCESS_TOKEN`, `GRLC_GITLAB_ACCESS_TOKEN` and `GRLC_SPARQL_ENDPOINT`:
```bash
docker run -it --rm -p 8088:80 -e GRLC_SERVER_NAME=grlc.io -e GRLC_GITHUB_ACCESS_TOKEN=xxx -e GRLC_SPARQL_ENDPOINT=http://dbpedia.org/sparql -e DEBUG=true clariah/grlc
docker run -it --rm -p 8088:80 -e GRLC_SERVER_NAME=grlc.io -e GRLC_GITHUB_ACCESS_TOKEN=xxx -e GRLC_GITLAB_ACCESS_TOKEN=yyy -e GRLC_SPARQL_ENDPOINT=http://dbpedia.org/sparql -e DEBUG=true clariah/grlc
```

### Pip
Expand Down Expand Up @@ -346,19 +360,21 @@ You can use grlc as a library directly from your own python script. See the [usa
Regardless of how you are running your grlc server, you will need to configure it using the `config.ini` file. Have a look at the [example config file](./config.default.ini) to see how this file is structured.

The configuration file contains the following variables:
- `github_access_token` [access token](#github-access-token) to communicate with Github API.
- `github_access_token` [access token](#git-access-token) to communicate with GitHub API.
- `gitlab_access_token` [access token](#git-access-token) to communicate with GitLab API.
- `local_sparql_dir` local storage directory where [local queries](#from-local-storage) are located.
- `server_name` name of the server (e.g. grlc.io)
- `sparql_endpoint` default SPARQL endpoint
- `user` and `password` SPARQL endpoint default authentication (if required, specify `'none'` if not required)
- `debug` enable debug level logging.
- `gitlab_url` to specify the base url of your GitLab instance.

##### GitHub access token
In order for grlc to communicate with GitHub, you'll need to tell grlc what your access token is:
##### Git access token
In order for grlc to communicate with GitHub and/or GitLab, you'll need to tell grlc what your access token is:

1. Get a GitHub personal access token. In your GitHub's profile page, go to _Settings_, then _Developer settings_, _Personal access tokens_, and _Generate new token_
2. You'll get an access token string, copy it and save it somewhere safe (GitHub won't let you see it again!)
3. Edit your `config.ini` or `docker-compose.yml` as value of the environment variable `GRLC_GITHUB_ACCESS_TOKEN`.
1. Get a [GitHub personal access token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/about-authentication-to-github#authenticating-to-the-api-with-a-personal-access-token) or [GitLab personal access token](https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html#create-a-personal-access-token).
2. You'll get an access token string; copy it and save it somewhere safe.
3. Edit your `config.ini` (`github_access_token` and `gitlab_access_token` respectively) and/or `docker-compose.yml` (`GRLC_GITHUB_ACCESS_TOKEN` and `GRLC_GITLAB_ACCESS_TOKEN` environment variables).
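
As a rough illustration of how the two token settings might be read, here is a minimal sketch using Python's standard `ConfigParser`. The `[auth]` section and key names come from `config.default.ini`. The `load_tokens` helper and the inline sample text are assumptions made for this example and are not grlc code.

```python
from configparser import ConfigParser

# Inline stand-in for config.ini, mirroring the [auth] section of config.default.ini.
SAMPLE_CONFIG = """
[auth]
github_access_token = xxx
gitlab_access_token = yyy
"""


def load_tokens(text):
    """Return the access tokens from an ini string, using '' for missing keys."""
    fallbacks = {
        "github_access_token": "",
        "gitlab_access_token": "",
        "sparql_access_token": "",
    }
    config = ConfigParser(fallbacks)  # fallbacks act as defaults for every section
    config.read_string(text)
    return {name: config.get("auth", name) for name in fallbacks}


print(load_tokens(SAMPLE_CONFIG))
```

A key that is absent from the file, such as `sparql_access_token` here, falls back to the empty string instead of raising an error.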

# Contribute!
grlc needs **you** to continue bringing Semantic Web content to developers, applications and users. No matter if you are just a curious user, a developer, or a researcher; there are many ways in which you can contribute:
Expand Down
4 changes: 4 additions & 0 deletions config.default.ini
Expand Up @@ -4,6 +4,7 @@

[auth]
github_access_token = xxx
gitlab_access_token = yyy

[local]
local_sparql_dir = /home/grlc/queries/
Expand All @@ -12,9 +13,12 @@ local_sparql_dir = /home/grlc/queries/
# Default endpoint, if none specified elsewhere
sparql_endpoint = http://dbpedia.org/sparql
server_name = grlc.io

# endpoint default authentication
user = none
password = none
# sparql_access_token = SPARQL endpoint HTTP authorization token

# Logging level
debug = True

Expand Down
3 changes: 2 additions & 1 deletion docker-assets/entrypoint.sh
Expand Up @@ -22,9 +22,10 @@ case ${1} in
case ${1} in
app:start)
cd ${GRLC_INSTALL_DIR}
# put github's access_token in place
# put github and gitlab access_tokens in place
cp config.default.ini config.ini
sed -i "s/xxx/${GRLC_GITHUB_ACCESS_TOKEN}/" config.ini
sed -i "s/yyy/${GRLC_GITLAB_ACCESS_TOKEN}/" config.ini
# configure grlc server name
sed -i "s/grlc.io/${GRLC_SERVER_NAME}/" config.ini
# configure default sparql endpoint
Expand Down
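
The entrypoint fills in the tokens by plain text substitution on the `xxx` and `yyy` placeholders from `config.default.ini`. The sketch below mimics that behaviour in Python to make the mechanism explicit. `fill_placeholders` is a hypothetical helper, not part of the image. Because the substitution is purely textual, a token value containing `/` would likely break the `sed` expression in the real script.

```python
def fill_placeholders(template, github_token, gitlab_token):
    """Mimic `sed "s/xxx/.../"` then `sed "s/yyy/.../"`: first match on each line."""
    lines = []
    for line in template.splitlines():
        line = line.replace("xxx", github_token, 1)
        line = line.replace("yyy", gitlab_token, 1)
        lines.append(line)
    return "\n".join(lines)


template = "[auth]\ngithub_access_token = xxx\ngitlab_access_token = yyy"
print(fill_placeholders(template, "gh-token", "gl-token"))
```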
15 changes: 9 additions & 6 deletions src/fileLoaders.py
Expand Up @@ -18,6 +18,7 @@
from github import Github
from github.GithubObject import NotSet
from github.GithubException import BadCredentialsException
from gitlab.exceptions import GitlabAuthenticationError
from configparser import ConfigParser
from urllib.parse import urljoin

Expand Down Expand Up @@ -82,7 +83,7 @@ def __init__(self, user, repo, subdir=None, sha=None, prov=None):
self.subdir = (subdir + "/") if subdir else ""
self.sha = sha if sha else NotSet
self.prov = prov
gh = Github(static.ACCESS_TOKEN)
gh = Github(static.GITHUB_ACCESS_TOKEN)
try:
self.gh_repo = gh.get_repo(user + '/' + repo, lazy=False)
except BadCredentialsException:
Expand Down Expand Up @@ -173,7 +174,7 @@ def getRepoDescription(self):

class GitlabLoader(BaseLoader):

def __init__(self, user, repo, subdir=None, sha=None, prov=None, branch='main'):
def __init__(self, user, repo, subdir=None, sha=None, prov=None, branch=None):
"""Create a new GithubLoader.
# TODO: Update to GITLAB !
Expand All @@ -192,20 +193,22 @@ def __init__(self, user, repo, subdir=None, sha=None, prov=None, branch='main'):
self.prov = prov
gl = gitlab.Gitlab(
url=static.GITLAB_URL,
private_token=static.ACCESS_TOKEN
private_token=static.GITLAB_ACCESS_TOKEN
)
try:
self.gl_repo = gl.projects.get(user + '/' + repo)
except BadCredentialsException:
raise Exception('BadCredentials: have you set up github_access_token on config.ini ?')
if not self.branch: # Use default branch if not specified
self.branch = self.gl_repo.default_branch
except GitlabAuthenticationError:
raise Exception('GitlabAuthenticationError: have you set up gitlab_access_token on config.ini ?')
except Exception:
raise Exception('Repo not found: ' + user + '/' + repo)

def fetchFiles(self):
"""Returns a list of file items contained on the github repo."""
gitlab_files = self.gl_repo.repository_tree(path=self.subdir.strip('/'), ref=self.branch, all=True)
files = []
for gitlab_file in gitlab_files:
for gitlab_file in gitlab_files:
if gitlab_file['type'] == 'blob':
name = gitlab_file['name']
files.append({
Expand Down
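
The change from `branch='main'` to `branch=None` means the loader falls back to the project's own default branch. A minimal sketch of that fallback follows, assuming only that the project object exposes a `default_branch` attribute as used in the hunk above. `resolve_branch` and the `SimpleNamespace` stand-in are illustrative names, not grlc or python-gitlab code.

```python
from types import SimpleNamespace


def resolve_branch(project, branch=None):
    """Return the requested branch, or the project's default branch if none is given."""
    return branch if branch else project.default_branch


# Stand-in for a GitLab project whose default branch is not 'main'.
project = SimpleNamespace(default_branch="master")

print(resolve_branch(project))         # no branch requested
print(resolve_branch(project, "dev"))  # explicit branch requested
```

With the old hard-coded default, a repository whose default branch is `master` would have been queried on a `main` branch that may not exist.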
2 changes: 1 addition & 1 deletion src/gquery.py
Expand Up @@ -246,7 +246,7 @@ def get_enumeration_sparql(rq, v, endpoint, auth=None):
glogger.debug(endpoint)
codes_json = requests.get(endpoint, params={'query': codes_subquery},
headers={'Accept': static.mimetypes['json'],
'Authorization': 'token {}'.format(static.ACCESS_TOKEN)}, auth=auth).json()
'Authorization': 'token {}'.format(static.SPARQL_ACCESS_TOKEN)}, auth=auth).json()
for code in codes_json['results']['bindings']:
vcodes.append(list(code.values())[0]["value"])
else:
Expand Down
10 changes: 5 additions & 5 deletions src/server.py
Expand Up @@ -31,7 +31,7 @@ def api_docs_template():
"""Generate Grlc API page."""
return render_template('api-docs.html', relative_path=relative_path())

def swagger_spec(user, repo, subdir=None, spec_url=None, sha=None, content=None, git_type=None, branch='main'):
def swagger_spec(user, repo, subdir=None, spec_url=None, sha=None, content=None, git_type=None, branch=None):
""" Generate swagger specification """
glogger.info("-----> Generating swagger spec for /{}/{} ({}), subdir {}, params {}, on commit {}".format(user, repo, git_type, subdir, spec_url, sha))

Expand All @@ -45,7 +45,7 @@ def swagger_spec(user, repo, subdir=None, spec_url=None, sha=None, content=None,
glogger.info("-----> API spec generation for /{}/{}, subdir {}, params {}, on commit {} complete".format(user, repo, subdir, spec_url, sha))
return resp_spec

def query(user, repo, query_name, subdir=None, spec_url=None, sha=None, content=None, git_type=None, branch='main'):
def query(user, repo, query_name, subdir=None, spec_url=None, sha=None, content=None, git_type=None, branch=None):
"""Execute SPARQL query for a specific grlc-generated API endpoint"""
glogger.info("-----> Executing call name at /{}/{} ({})/{}/{} on commit {}".format(user, repo, git_type, subdir, query_name, sha))
glogger.debug("Request accept header: " + request.headers["Accept"])
Expand Down Expand Up @@ -210,7 +210,7 @@ def query_git(user, repo, query_name, subdir=None, sha=None, content=None):
@app.route('/api-gitlab/<user>/<repo>/commit/<sha>/api-docs')
@app.route('/api-gitlab/<user>/<repo>/subdir/<path:subdir>/commit/<sha>')
@app.route('/api-gitlab/<user>/<repo>/subdir/<path:subdir>/commit/<sha>/api-docs')
def api_docs_gitlab(user, repo, subdir=None, sha=None, branch='main'):
def api_docs_gitlab(user, repo, subdir=None, sha=None, branch=None):
"""Grlc API page for specifications loaded from a Github repo."""
glogger.debug("Entry in function: __main__.api_docs_gitlab")
return api_docs_template()
Expand All @@ -223,7 +223,7 @@ def api_docs_gitlab(user, repo, subdir=None, sha=None, branch='main'):
@app.route('/api-gitlab/<user>/<repo>/commit/<sha>/swagger')
@app.route('/api-gitlab/<user>/<repo>/subdir/<path:subdir>/commit/<sha>/swagger')
@app.route('/api-gitlab/<user>/<repo>/<path:subdir>/commit/<sha>/swagger')
def swagger_spec_gitlab(user, repo, subdir=None, sha=None, branch='main'):
def swagger_spec_gitlab(user, repo, subdir=None, sha=None, branch=None):
"""Swagger spec for specifications loaded from a Github repo."""
glogger.debug("Entry in function: __main__.swagger_spec_gitlab")
return swagger_spec(user, repo, subdir=subdir, spec_url=None, sha=sha, content=None, git_type=static.TYPE_GITLAB, branch=branch)
Expand All @@ -239,7 +239,7 @@ def swagger_spec_gitlab(user, repo, subdir=None, sha=None, branch='main'):
@app.route('/api-gitlab/<user>/<repo>/query/subdir/<path:subdir>/commit/<sha>/<query_name>', methods=['GET', 'POST'])
@app.route('/api-gitlab/<user>/<repo>/query/commit/<sha>/<query_name>.<content>', methods=['GET', 'POST'])
@app.route('/api-gitlab/<user>/<repo>/query/subdir/<path:subdir>/commit/<sha>/<query_name>.<content>', methods=['GET', 'POST'])
def query_gitlab(user, repo, query_name, subdir=None, sha=None, content=None, branch='main'):
def query_gitlab(user, repo, query_name, subdir=None, sha=None, content=None, branch=None):
"""SPARQL query execution for specifications loaded from a Github repo."""
glogger.debug("Entry in function: __main__.query_gitlab")
return query(user, repo, query_name, subdir=subdir, sha=sha, content=content, git_type=static.TYPE_GITLAB, branch=branch)
Expand Down
7 changes: 6 additions & 1 deletion src/static.py
Expand Up @@ -38,6 +38,8 @@
# Setting headers to use access_token for the GitHub API
config_fallbacks = {
'github_access_token': '',
'gitlab_access_token': '',
'sparql_access_token': '',
'sparql_endpoint': '',
'user': '',
'password': '',
Expand All @@ -51,10 +53,13 @@
config.add_section('defaults')
config.add_section('local')
config.add_section('api_gitlab')

config_filename = os.path.join(os.getcwd(), 'config.ini')
print('Reading config file: ', config_filename)
config.read(config_filename)
ACCESS_TOKEN = config.get('auth', 'github_access_token')
GITHUB_ACCESS_TOKEN = config.get('auth', 'github_access_token')
GITLAB_ACCESS_TOKEN = config.get('auth', 'gitlab_access_token')
SPARQL_ACCESS_TOKEN = config.get('auth', 'sparql_access_token')

# Default endpoint, if none specified elsewhere
DEFAULT_ENDPOINT = config.get('defaults', 'sparql_endpoint')
Expand Down
2 changes: 1 addition & 1 deletion src/swagger.py
Expand Up @@ -126,7 +126,7 @@ def get_path_for_item(item):
return item_path


def build_spec(user, repo, subdir=None, query_url=None, sha=None, prov=None, extraMetadata=[], git_type=None, branch='main'):
def build_spec(user, repo, subdir=None, query_url=None, sha=None, prov=None, extraMetadata=[], git_type=None, branch=None):
"""Build grlc specification for the given github user / repo."""
loader = grlc.utils.getLoader(user, repo, subdir, query_url, sha=sha, prov=prov, git_type=git_type, branch=branch)

Expand Down
10 changes: 5 additions & 5 deletions src/utils.py
Expand Up @@ -23,7 +23,7 @@

glogger = glogging.getGrlcLogger(__name__)

def getLoader(user, repo, subdir=None, spec_url=None, sha=None, prov=None, git_type=None, branch='main'):
def getLoader(user, repo, subdir=None, spec_url=None, sha=None, prov=None, git_type=None, branch=None):
"""Build a fileLoader (LocalLoader, GithubLoader, URLLoader) for the given parameters."""
if user is None and repo is None and not spec_url:
loader = LocalLoader()
Expand All @@ -49,7 +49,7 @@ def build_spec(user, repo, subdir=None, sha=None, prov=None, extraMetadata=[]):
return items


def build_swagger_spec(user, repo, subdir, spec_url, sha, serverName, git_type, branch='main'):
def build_swagger_spec(user, repo, subdir, spec_url, sha, serverName, git_type, branch=None):
"""Build grlc specification for the given github user / repo in swagger format."""
if user and repo:
# Init provenance recording
Expand Down Expand Up @@ -99,7 +99,7 @@ def build_swagger_spec(user, repo, subdir, spec_url, sha, serverName, git_type,

def dispatch_query(user, repo, query_name, subdir=None, spec_url=None, sha=None,
content=None, requestArgs={}, acceptHeader='application/json',
requestUrl='http://', formData={}, method="POST", git_type=None, branch='main'):
requestUrl='http://', formData={}, method="POST", git_type=None, branch=None):
"""Executes the specified SPARQL or TPF query."""
loader = getLoader(user, repo, subdir, spec_url, sha=sha, prov=None, git_type=git_type, branch=branch)
query, q_type = loader.getTextForName(query_name)
Expand Down Expand Up @@ -255,9 +255,9 @@ def dispatchTPFQuery(raw_tpf_query, loader, acceptHeader, content):
# TODO: pagination for TPF

# Prepare HTTP request
reqHeaders = {'Accept': acceptHeader, 'Authorization': 'token {}'.format(static.ACCESS_TOKEN)}
reqHeaders = {'Accept': acceptHeader, 'Authorization': 'token {}'.format(static.SPARQL_ACCESS_TOKEN)}
if content:
reqHeaders = {'Accept': static.mimetypes[content], 'Authorization': 'token {}'.format(static.ACCESS_TOKEN)}
reqHeaders = {'Accept': static.mimetypes[content], 'Authorization': 'token {}'.format(static.SPARQL_ACCESS_TOKEN)}
tpf_list = re.split('\n|=', raw_tpf_query)
subject = tpf_list[tpf_list.index('subject') + 1]
predicate = tpf_list[tpf_list.index('predicate') + 1]
Expand Down
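
The same `Accept` and `Authorization` header pair is now built in `gquery.py` and twice in `utils.py`. One possible way to keep them in sync is a small helper such as the one sketched here. `build_headers` is a hypothetical name and not part of this commit. It keeps the `token {}` format from the diff and, like the original code, sends the header even when the token is empty.

```python
def build_headers(accept, token):
    """Build request headers in the same shape as the diff's inline dictionaries."""
    return {
        "Accept": accept,
        "Authorization": "token {}".format(token),
    }


print(build_headers("application/json", "abc"))
```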
