The blob store is a simple file storage service backed by an S3 compatible storage system such as Minio. Storing a file provides a key - currently a UUID - that allows retrieval of the file when provided along with proper credentials. Once stored, files are immutable other than deletion.
The user is responsible for saving the key for use later - in the context of KBase, that means creating a handle for the file via the handle service and saving an object to the workspace containing that handle in an @id handle annotation, or saving the key directly in the workspace object in an @id bytestream annotation. See the workspace documentation for details; also the DataFileUtil module can assist with these functions in the context of KBase applications.
The API is nominally compatible with a minimal subset of the KBase fork of Shock's API. The vast majority of functions are not supported; only those required for the KBase codebase are included.
This data structure is a subset of Shock's node data structure.
{
"data": {
"attributes": null, # DEPRECATED
"created_on": "2019-05-30T23:50:19.000Z",
"file": {
"checksum": {
"md5": "1b9554867d35f0d59e4705f6b2712cd1"
},
"name": "foo", # Provided filename (see below)
"size": 8
},
"format": "bar", # Provided file format (see below)
"id": "c39192c7-45b1-4fec-b196-5976d8e628f7", # The node ID generated by the blobstore.
"last_modified": "2019-05-30T23:50:19.000Z"
},
"error": null,
"status": 200
}
attributes is deprecated, always null, and is only provided for backwards compatibility reasons.
last_modified is always the same as created_on and is only included for backwards compatibility reasons. Unlike Shock, the blobstore does not take ACL modifications into account when setting the last_modified date.
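As a rough client-side sketch, the node response above can be decoded in Go with plain structs. The type and function names (File, Node, NodeResponse, ParseNode) are illustrative choices, not part of the blobstore code base, and the deprecated attributes field is simply ignored.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// File mirrors the "file" object of a node.
type File struct {
	Checksum map[string]string `json:"checksum"`
	Name     string            `json:"name"`
	Size     int64             `json:"size"`
}

// Node mirrors the "data" object; the deprecated "attributes" field is skipped.
type Node struct {
	CreatedOn    time.Time `json:"created_on"`
	File         File      `json:"file"`
	Format       string    `json:"format"`
	ID           string    `json:"id"`
	LastModified time.Time `json:"last_modified"`
}

// NodeResponse is the envelope around a Node.
type NodeResponse struct {
	Data   *Node    `json:"data"`
	Error  []string `json:"error"`
	Status int      `json:"status"`
}

// ParseNode decodes a response body into a NodeResponse.
func ParseNode(body []byte) (*NodeResponse, error) {
	var r NodeResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, err
	}
	return &r, nil
}

// sample is the example response from this document with the comments removed.
const sample = `{
  "data": {
    "attributes": null,
    "created_on": "2019-05-30T23:50:19.000Z",
    "file": {
      "checksum": {"md5": "1b9554867d35f0d59e4705f6b2712cd1"},
      "name": "foo",
      "size": 8
    },
    "format": "bar",
    "id": "c39192c7-45b1-4fec-b196-5976d8e628f7",
    "last_modified": "2019-05-30T23:50:19.000Z"
  },
  "error": null,
  "status": 200
}`

func main() {
	r, err := ParseNode([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Data.ID, r.Data.File.Name, r.Data.File.Size)
}
```

The timestamps are RFC 3339, so time.Time decodes them without a custom layout.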
This data structure is a subset of Shock's ACL data structure.
{
"data": {
"delete": [User],
"owner": User,
"public": {
"delete": false,
"read": <true if the node is publicly readable, false otherwise>,
"write": false
},
"read": [User...],
"write": [User]
},
"error": null,
"status": 200
}
delete and write ACLs are deprecated and only provided for backwards compatibility reasons. They are always false for public access or contain only the node owner for standard ACLs.
A User is usually just the UUID assigned to the user by the blobstore, but when full verbosity (see below) is requested, the User data structure is:
{
"uuid": <the user's UUID assigned by the blobstore>,
"username": <the user's KBase account name>
}
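Because a User may arrive either as a bare UUID string or as an object, a client has to accept both shapes. The following is a minimal sketch of one way to do that in Go with a custom UnmarshalJSON; the type names and the sample UUIDs and usernames are hypothetical and only serve the illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is a blobstore user. Username is empty unless verbosity=full was used.
type User struct {
	UUID     string `json:"uuid"`
	Username string `json:"username"`
}

// UnmarshalJSON accepts either a bare UUID string or the full object form.
func (u *User) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err == nil {
		*u = User{UUID: s}
		return nil
	}
	type plain User // same fields, no methods, so no infinite recursion
	var p plain
	if err := json.Unmarshal(b, &p); err != nil {
		return err
	}
	*u = User(p)
	return nil
}

// ACL mirrors the "data" object of an ACL response.
type ACL struct {
	Owner  User   `json:"owner"`
	Read   []User `json:"read"`
	Write  []User `json:"write"`
	Delete []User `json:"delete"`
	Public struct {
		Read   bool `json:"read"`
		Write  bool `json:"write"`
		Delete bool `json:"delete"`
	} `json:"public"`
}

// ACLResponse is the envelope around an ACL.
type ACLResponse struct {
	Data   *ACL     `json:"data"`
	Error  []string `json:"error"`
	Status int      `json:"status"`
}

// ParseACL decodes a response body into an ACLResponse.
func ParseACL(body []byte) (*ACLResponse, error) {
	var r ACLResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, err
	}
	return &r, nil
}

// Hypothetical sample responses: default verbosity and verbosity=full.
const terseSample = `{"data": {"delete": ["user-uuid-1"], "owner": "user-uuid-1",
  "public": {"delete": false, "read": true, "write": false},
  "read": ["user-uuid-1", "user-uuid-2"], "write": ["user-uuid-1"]},
  "error": null, "status": 200}`

const fullSample = `{"data": {"delete": [{"uuid": "user-uuid-1", "username": "alice"}],
  "owner": {"uuid": "user-uuid-1", "username": "alice"},
  "public": {"delete": false, "read": false, "write": false},
  "read": [{"uuid": "user-uuid-1", "username": "alice"}],
  "write": [{"uuid": "user-uuid-1", "username": "alice"}]},
  "error": null, "status": 200}`

func main() {
	for _, s := range []string{terseSample, fullSample} {
		r, err := ParseACL([]byte(s))
		if err != nil {
			panic(err)
		}
		fmt.Printf("owner=%+v readers=%d public=%v\n", r.Data.Owner, len(r.Data.Read), r.Data.Public.Read)
	}
}
```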
This data structure is identical to Shock's error data structure.
{
"data": null,
"error": [<error string>],
"status": <http status code as an integer>
}
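A client will usually want to turn this envelope into a native error. The sketch below shows one possible approach in Go; ServerError and CheckError are hypothetical names, and the error message in the sample is made up for illustration rather than taken from the server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ServerError carries the status code and messages from an error response.
type ServerError struct {
	Status   int
	Messages []string
}

func (e *ServerError) Error() string {
	return fmt.Sprintf("blobstore error %d: %s", e.Status, strings.Join(e.Messages, "; "))
}

// CheckError returns a *ServerError if the body has a non-empty "error"
// field, a decoding error if the body is not valid JSON, and nil otherwise.
func CheckError(body []byte) error {
	var env struct {
		Error  []string `json:"error"`
		Status int      `json:"status"`
	}
	if err := json.Unmarshal(body, &env); err != nil {
		return err
	}
	if len(env.Error) == 0 {
		return nil
	}
	return &ServerError{Status: env.Status, Messages: env.Error}
}

func main() {
	// Hypothetical error body; the message text is an assumption.
	body := []byte(`{"data": null, "error": ["Node not found"], "status": 404}`)
	if err := CheckError(body); err != nil {
		fmt.Println(err)
	}
}
```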
Requests are authenticated by including the header Authorization: OAuth <kbase token> or including a cookie with the value of <kbase token> in the request.
The names of cookies that the server will check are set in the deployment configuration file.
The header takes precedence, then each cookie in the list in the configuration file in order.
Note that for backwards compatibility, incorrect or invalid authentication headers respond with a 400 HTTP code. Invalid cookies respond with the appropriate 401 code.
GET /
{
"deprecationwarning": "The id and version fields are deprecated.",
"id": "Shock",
"servername": "blobstore",
"servertime": <server time in epoch milliseconds>,
"serverversion": <server version>,
"version": "0.9.6",
"gitcommit": <git commit from which the server was built>
}
The id and version fields are deprecated and present only for backwards compatibility with Shock. The version field will not change.
AUTHORIZATION REQUIRED
Content-Length header required
POST /node[?filename=<filename>&format=<file format>]
<file content>
RETURNS: a Node.
The Content-Length header must be present and accurate.
PUT is also supported - but is not idempotent - in order to ease using the curl -T option:
curl -H "Authorization: OAuth $KBASE_TOKEN" -T mylittlefile \
"http://<host>/node?filename=mylittlefile&format=text"
filename can be at most 256 characters long and may contain only unicode alphanumerics, spaces, and the characters [ ] ( ) = . - _
format can be at most 100 characters long and may contain only unicode alphanumerics and the characters - _
AUTHORIZATION REQUIRED
POST /node/<id>/copy
RETURNS: a Node.
AUTHORIZATION OPTIONAL
GET /node/<id>
RETURNS: a Node.
AUTHORIZATION OPTIONAL
GET /node/<id>/acl[?verbosity=full]
RETURNS: an ACL.
AUTHORIZATION OPTIONAL
GET /node/<id>?download[_raw][&seek=#][&length=#][&del]
RETURNS: the file content.
?download_raw, as opposed to ?download, causes the Content-Disposition header to be omitted.
seek causes the first # bytes of the file to be skipped. A seek value greater than or equal to the file size is an error. Defaults to 0.
length determines the number of bytes of the file to return after skipping seek bytes. length may be greater than the remaining file length. Defaults to 0, which indicates that the remainder of the file should be returned.
del causes the node to be deleted once the file contents have been streamed. The user must be the node owner or a service administrator. Note this is playing very fast and loose with the semantics of an HTTP GET.
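The download query string mixes valueless flags (download, del) with key=value pairs, so the sketch below assembles it by hand rather than through url.Values, which would emit "download=". DownloadOptions and DownloadURL are hypothetical names and the host is a placeholder.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// DownloadOptions holds the optional download parameters.
type DownloadOptions struct {
	Raw    bool  // use ?download_raw, omitting Content-Disposition
	Seek   int64 // bytes to skip; 0 means start of file
	Length int64 // bytes to return; 0 means the remainder of the file
	Delete bool  // delete the node after streaming (owner or admin only)
}

// DownloadURL builds GET /node/<id>?download[_raw][&seek=#][&length=#][&del].
func DownloadURL(baseURL, nodeID string, o DownloadOptions) string {
	parts := []string{"download"}
	if o.Raw {
		parts[0] = "download_raw"
	}
	if o.Seek > 0 {
		parts = append(parts, "seek="+strconv.FormatInt(o.Seek, 10))
	}
	if o.Length > 0 {
		parts = append(parts, "length="+strconv.FormatInt(o.Length, 10))
	}
	if o.Delete {
		parts = append(parts, "del")
	}
	return strings.TrimSuffix(baseURL, "/") + "/node/" + url.PathEscape(nodeID) +
		"?" + strings.Join(parts, "&")
}

func main() {
	id := "c39192c7-45b1-4fec-b196-5976d8e628f7"
	// Bytes 2 through 5 of the file.
	fmt.Println(DownloadURL("http://localhost:8080", id, DownloadOptions{Seek: 2, Length: 4}))
	// Whole file, raw, deleted afterwards.
	fmt.Println(DownloadURL("http://localhost:8080", id, DownloadOptions{Raw: true, Delete: true}))
}
```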
AUTHORIZATION REQUIRED
PUT /node/<id>/acl/public_read[?verbosity=full]
RETURNS: an ACL.
AUTHORIZATION REQUIRED
DELETE /node/<id>/acl/public_read[?verbosity=full]
RETURNS: an ACL.
AUTHORIZATION REQUIRED
PUT /node/<id>/acl/read?users=<comma separated list of KBase user names>[&verbosity=full]
RETURNS: an ACL.
AUTHORIZATION REQUIRED
DELETE /node/<id>/acl/read?users=<comma separated list of KBase user names>[&verbosity=full]
RETURNS: an ACL.
AUTHORIZATION REQUIRED
PUT /node/<id>/acl/owner?users=<KBase user name>[&verbosity=full]
RETURNS: an ACL.
The users parameter must contain a single user name.
This upload method is provided for Shock compatibility. It is recommended that the prior upload method be used rather than this one.
AUTHORIZATION REQUIRED
POST /node
<multipart form>
RETURNS: a Node.
The form MUST contain a part called upload where the part contents are the file to be uploaded.
The part MUST have an accurate Content-Length header specifying the size of the file, not the entire multipart form.
The form may contain a part called format where the part contents are the format of the file, equivalent to the format query parameter for the standard upload method and with the same restrictions. The format part MUST come before the upload part.
Any file name provided in the Content-Disposition header has the same restrictions as the filename parameter for the standard upload method.
curl -H "Authorization: OAuth $KBASE_TOKEN" \
-F "upload=@mydata.fasta;headers=\"Content-Length: 67452\"" \
http://<host>/node
import os
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder
# filename and token are assumed to be defined by the caller
with open(filename, 'rb') as df:
    # The part's Content-Length is the size of the file, not of the whole form
    part_headers = {'Content-Length': str(os.path.getsize(filename))}
    files = {'upload': (filename, df, None, part_headers)}
    mpe = MultipartEncoder(fields=files)
    headers = {'Content-Type': mpe.content_type,
               'authorization': 'OAuth ' + token}
    res = requests.post('http://<host>/node', headers=headers, data=mpe, stream=True)
print(res.json())
package blobstoreclienttest;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.commons.io.IOUtils;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.FormBodyPartBuilder;
import org.apache.http.entity.mime.MultipartEntityBuilder;
import org.apache.http.entity.mime.content.InputStreamBody;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
public class blobstoreclient {
public static void main(final String[] args) throws ClientProtocolException, IOException {
final String fileName = args[0];
final String token = args[1];
// probably don't want to use the default client for most applications
final CloseableHttpClient cli = HttpClients.createDefault();
final HttpPost htp = new HttpPost("http://<host>/node");
htp.setHeader("authorization", "OAuth " + token);
final Path p = Paths.get(fileName);
final MultipartEntityBuilder mpeb = MultipartEntityBuilder.create();
final InputStream in = Files.newInputStream(p);
mpeb.addPart(FormBodyPartBuilder.create()
.setName("upload")
.addField("Content-Length", "" + Files.size(p))
.setBody(new InputStreamBody(in, p.getFileName().toString())).build());
htp.setEntity(mpeb.build());
final CloseableHttpResponse response = cli.execute(htp);
in.close();
IOUtils.copy(response.getEntity().getContent(), System.out);
response.close();
}
}
This copy method is provided for Shock compatibility. It is recommended that the prior copy method be used rather than this one.
AUTHORIZATION REQUIRED
POST /node
<multipart form>
RETURNS: a Node.
The multipart form must have exactly one part with the name copy_data and the value the id of the node to copy.
Curl example:
curl -H "Authorization: OAuth $KBASE_TOKEN" -F "copy_data=<node id>" http://<host>/node/
- go 1.16
- An S3 compatible storage system. The Blobstore is tested with Minio version 2019-05-23T00-29-34Z.
  - If Minio is used and the version is 2019-05-14T23-57-45Z or larger the server must be run in --compat mode.
- MongoDB 2.6+
- An S3 compatible storage system and MongoDB must be running.
- Copy deploy.cfg.example to deploy.cfg and adjust the values as necessary.
- In the module directory:
go build app/blobstore.go
./blobstore --conf deploy.cfg
To build the git commit into the server:
export GIT_COMMIT=$(git rev-list -1 HEAD) && go build -ldflags "-X main.gitCommit=$GIT_COMMIT" app/blobstore.go
- Adding code
  - All code additions and updates must be made as pull requests directed at the develop branch.
  - All tests must pass and all new code must be covered by tests.
  - All new code must be documented appropriately
    - Godoc
    - General documentation if appropriate
    - Release notes
  - Exception mapping is handled in server/errortypes.go.
- Releases
  - The master branch is the stable branch. Releases are made from the develop branch to the master branch.
  - Update the version as per the semantic version rules in app/blobstore.go.
  - Tag the version in git and github.
    - Tags must follow the Go module semantic version format, e.g. vX.Y.Z.
Copy test.cfg.example to test.cfg and adjust the values as necessary.
BLOBSTORE_TEST_CFG=[absolute path to test.cfg] go test ./...
Each package gets its own working directory during tests so the path to the test.cfg file cannot be relative.
Mocks are generated with https://github.com/vektra/mockery v1.0.0.
- Providing a Content-Type header of multipart/form-data; boundary= when trying to copy a node will result in the go function that parses multipart data asserting that the http body is not form data, and so the body will be processed as a file upload. This is an issue in the go mime library.
- Providing a Content-Length that is larger than the http body when uploading a file will cause the connection to hang forever. (Note that a content length > file length looks the same to the server as a hanging upload.)
- HTTP2 support
While exploring upload speeds with various upload methods, this server was generated.