Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bigtable: initial support #1279

Merged
merged 2 commits into from
Jul 11, 2016
Merged

bigtable: initial support #1279

merged 2 commits into from
Jul 11, 2016

Conversation

callmehiphop
Copy link
Contributor

@callmehiphop callmehiphop commented May 2, 2016

Closes #722

Add support for Bigtable!


TODOS:

  • Implementation
    • Admin Service
    • Bigtable Service
  • Docs - preview
    • Walkthrough
    • Private modules
    • Public modules
  • Tests
    • e2e Tests
    • Unit Tests

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label May 2, 2016
@callmehiphop callmehiphop added the api: bigtable Issues related to the Bigtable API. label May 2, 2016
@stephenplusplus
Copy link
Contributor

Can you throw up some code examples, or is https://gist.github.com/callmehiphop/7849a495cbb5c8deedf9 still accurate?

@callmehiphop
Copy link
Contributor Author

@stephenplusplus I'll add some code examples, still working on convenience methods. Mostly opening this because of the GrpcService changes involving supporting multiple proto files and supporting gRPC streaming APIs. Feedback around those two particular items is pretty important to me ATM.

@stephenplusplus
Copy link
Contributor

@callmehiphop cool, can you call out / briefly explain the changes and how they work together?

apiVersion: 'v1',
protoServices: {
BigtableService: googleProtoFiles.bigtable.v1,
BigtableTableService: {

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@stephenplusplus
Copy link
Contributor

I think it looks okay to me. The language (var names, methods, comments) is the most important thing, making sure things are super-logically named to minimize confusion. Seems like everything is okay, but feel free to make tweaks if you find anywhere that could be more explicit.

@callmehiphop
Copy link
Contributor Author

I believe this was brought up a couple of weeks ago, but it would appear the cluster portion of the API has been decoupled from BigTable and moved to a separate proto file.

I'm not sure if this is something I should be tackling here or if a separate PR is in order.

@stephenplusplus
Copy link
Contributor

Ah, interesting. @jgeewax do we want to support the Container Engine API anytime soon? It looks like they're working on a grpc API maybe?

If so, we can create {module:container/cluster}, use it from {module:bigtable/cluster}, and just not document {module:container} until we have filled in the rest of the support for Container.

});

it('should insert multiple rows', function(done) {
var rows = {

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@callmehiphop
Copy link
Contributor Author

@stephenplusplus I started work on the documentation, it's still a ways away from being complete, however it will give you an overview of the API usage.

http://callmehiphop.github.io/gcloud-node/#/docs/master/bigtable

PTAL!

@callmehiphop
Copy link
Contributor Author

Clarifying that when I said PTAL, I meant at the doc examples, I'm still refactoring code.


/**
*
*/

This comment was marked as spam.

@stephenplusplus
Copy link
Contributor

I'm still learning Bigtable as I go here, so I'm moving a little slower than I'd like in the review-- very sorry about that. I'm tracking some thoughts as I go. I figured instead of blasting you with them at the end, I might as well give you something sooner than later. Sorry again if these are bad or impossible ideas, but... here we go.

deleteCellsFrom

Can we remove this and instead add some options right on the Row object?

var row = table.row('Kennedy');

// delete a row
row.delete(function(err, apiResponse) {});

// delete a column family
row.deleteColumnFamily('name', function(err, apiResponse) {});

// delete a column
row.deleteColumn('name:first', function(err, apiResponse) {});

// delete all cells
row.empty(function(err, apiResponse) {});

Then when we need to batch multiple mutations, we can introduce a batch method from the Table instance, that queues the mutations then runs them when exec() is called:

// delete multiple cells from rows
table.batch(function(exec) {
  // delete a column family
  row.deleteColumnFamily('name');

  // delete a column
  row.deleteColumn('name:first');

  // delete all cells
  row.empty();

  // run the deletes
  exec();
}, function(err, apiResponse) {});

getRows

At this point, I think there's a lot of scary terms and complexity with the filter builder. I think if we could come up with a more intuitive POJO, it might help:

var options = {
  filter: {
    column: [
      'name:first',
      'life:born'
    ]
  }
};

table.getRows(options)
  .on('data', function(row) {
    // row is a Row object
  });

insert

Introducing a save method on the Row object might save us from the verbose object that has to be created for insert currently.

var row = table.row('Lincoln');

row.save({
  name: {
    first: 'Abraham',
    last: 'Lincoln'
  }
}, function(err, apiResponse) {});

If we implement batch, that can support multiple saves to multiple rows.

@callmehiphop
Copy link
Contributor Author

Can we remove this and instead add some options right on the Row object?

Then when we need to batch multiple mutations, we can introduce a batch method from the Table instance, that queues the mutations then runs them when exec() is called

SGTM!

At this point, I think there's a lot of scary terms and complexity with the filter builder. I think if we could come up with a more intuitive POJO, it might help

I'm ok with making it a plain old javascript object, but I'm not sure I agree that it's going to make it any less scary. For very simple filters, like the one in your example, it might make it appear less frightening, however for more complex filters that leverage interleaving and/or conditionals the object will has the potential to be massive.

It might be worth mentioning that going POJO for this means that the order of the keys in the object matters - because this would be the order in which the filters are ran. I could see this being confusing for some.

Introducing a save method on the Row object might save us from the verbose object that has to be created for insert currently.

There's already a Row#put method that does exactly this, I can rename it to save() if you prefer.

@stephenplusplus
Copy link
Contributor

for more complex filters that leverage interleaving and/or conditionals the object will has the potential to be massive.

Can you whip one of those up so I can see?

There's already a Row#put

Woops, didn't see that. I do prefer save over put to match our API terminology more closely.

@callmehiphop
Copy link
Contributor Author

Here's what I'm thinking for a filter object

Simple filters

{
  sink: Boolean, // maps to `sink`
  pass: Boolean, // maps to `passAllFilter`
  block: Boolean, // maps to `blockAllFilter`
  time: { // maps to `timestampRangeFilter`
    start: Date,
    end: Date
  },
  labels: Boolean // maps to `applyLabelsTransformer`
}

Family filters

A number of filters can be regular expressions (family filter being one of them). For these I think we could allow for String, RegExp and String[].

{ // maps to `familyNameRegexFilter`
  family: 'name',
  family: /^(name|life)/,
  family: ['name', 'life']
}

Row filters

In its simplest form, we would use a similar pattern to the family name filter.

{ // maps to `rowKeyRegexFilter`
  row: 'lincoln',
  row: /^(lincoln|kennedy)/,
  row: ['lincoln', 'kennedy']
}

For further row options we would expand to an object

{
  row: {
    key: String || RegExp || Array, // maps to `rowKeyRegexFilter`
    sample: String, // maps to `rowSampleFilter`
    cellOffset: Number, // maps to `cellsPerRowOffsetFilter`
    cellLimit: Number // maps to `cellsPerRowLimitFilter`
  }
}

Column filters

In its simplest form, we would use a similar pattern to the family name filter.

{ // maps to `columnQualifierRegexFilter`
  column: 'first',
  column: /^(first|last)/,
  column: ['first', 'last']
}

For further column options we would expand into an object

{
  column: {
    name: String || RegExp || Array, // maps to `columnQualifierRegexFilter`
    cellLimit: Number // maps to `cellsPerColumnLimitFilter`
  }
}

You can also search for a range of columns, which I think should look like

{
  column: { // maps to `columnRangeFilter`
    // shorthand for { value: String }
    start: String,

     // expanded version
    start: {
      value: String,
      inclusive: Boolean
    },

    // shorthand for { value: String }
    end: String,

     // expanded version
    end: {
      value: String,
      inclusive: Boolean
    }
  }
}

Value filters

In its simplest form, we would use a similar pattern to the family name filter.

{ // maps to `valueRegexFilter`
  value: 'Abe',
  value: /(Abe|Abraham)/,
  value: ['Abe', 'Abraham']
}

An expanded version would include a value range filter and an option to strip values from results

{
  value: {
    value: String || RegExp || Array, // maps to `valueRegexFilter`
    start: String, // maps to `valueRangeFilter`
    end: String, // maps to `valueRangeFilter`
    strip: Boolean // maps to `stripValueTransformer`
  }
}

Advanced Filters

There are also interleave and condition filters which use multiple filter objects

{
  interleave: [Filter, Filter],
  condition: {
    test: Filter,
    pass: Filter,
    fail: Filter
  }
}

One caveat of using POJO for filters is that the order of the keys can matter, as the filters will execute in said order.

@stephenplusplus WDYT?

@stephenplusplus
Copy link
Contributor

It looks awesome 👍

One caveat of using POJO for filters is that the order of the keys can matter, as the filters will execute in said order.

The user will just have to take care to the order they define their filters in the object-- I actually think this is a nice "feature"; their filters will be executed in the order they're defined. I've never seen Object.keys or a for-in not sorted in the order of the definition of the object... is that the concern?

@callmehiphop
Copy link
Contributor Author

I've never seen Object.keys or a for-in not sorted in the order of the definition of the object... is that the concern?

While I have never seen it happen myself, it does not look like something we should depend on.

So maybe I spoke too soon and we should keep the builder?

@stephenplusplus
Copy link
Contributor

@callmehiphop
Copy link
Contributor Author

Nice, we should be ok then.

@sduskis
Copy link
Contributor

sduskis commented May 17, 2016

Are you aware that Cloud Bigtable is moving to a v2 API? A new node.js implementation should probably use v2 instead of v1. We should talk offline about timelines.

@dhermes
Copy link
Contributor

dhermes commented May 17, 2016

@sduskis I'd love to hear about v2 as well for gcloud-python

});

it('should delete a column family', function(done) {
FAMILY.delete(function(err) {

This comment was marked as spam.

@stephenplusplus
Copy link
Contributor

One lone change left, then I'll wait out the tests, and welcome Bigtable to the gcloud-node Family of APIs.

@coveralls
Copy link

coveralls commented Jul 11, 2016

Coverage Status

Coverage remained the same at 100.0% when pulling ccd85bd on callmehiphop:bigtable into 0a6da60 on GoogleCloudPlatform:master.

@stephenplusplus
Copy link
Contributor

CLAs are signed, but unable to verify author consent

I'm cool with you squashing these down to one commit... that might remedy this oddity.

add support for scan filters

add examples via jsdoc

add bigtable to docs site
// Update a row in your table.
var row = table.row('alincoln');

row.save('follows:gwashington', 1, function(err) {});

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@coveralls
Copy link

coveralls commented Jul 11, 2016

Coverage Status

Changes Unknown when pulling d7c5535 on callmehiphop:bigtable into * on GoogleCloudPlatform:master*.

@coveralls
Copy link

coveralls commented Jul 11, 2016

Coverage Status

Changes Unknown when pulling d7c5535 on callmehiphop:bigtable into * on GoogleCloudPlatform:master*.

6 similar comments
@coveralls
Copy link

Coverage Status

Changes Unknown when pulling d7c5535 on callmehiphop:bigtable into * on GoogleCloudPlatform:master*.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling d7c5535 on callmehiphop:bigtable into * on GoogleCloudPlatform:master*.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling d7c5535 on callmehiphop:bigtable into * on GoogleCloudPlatform:master*.

@coveralls
Copy link

coveralls commented Jul 11, 2016

Coverage Status

Changes Unknown when pulling d7c5535 on callmehiphop:bigtable into * on GoogleCloudPlatform:master*.

@coveralls
Copy link

coveralls commented Jul 11, 2016

Coverage Status

Changes Unknown when pulling d7c5535 on callmehiphop:bigtable into * on GoogleCloudPlatform:master*.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling d7c5535 on callmehiphop:bigtable into * on GoogleCloudPlatform:master*.

@callmehiphop callmehiphop merged commit 96eb260 into googleapis:master Jul 11, 2016
@stephenplusplus
Copy link
Contributor

@coveralls, don't worry man, you win the PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigtable Issues related to the Bigtable API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants