-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Perform load balancing when communicating with a cluster of Alternators #132
Conversation
* @param item The item retrieved from DynamoDB | ||
* @param toValue Where the converted item should be stored | ||
*/ | ||
protected abstract void convertDynamoDBItemToValue(DynamoDBItemWritable item, V toValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what other classes implemented this?
Aren't we loosing support for other key types than Text ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only implementation in emr-dynamodb-connector was DefaultDynamoDBRecordReader
, which also fixes the key type to Text
. Maybe there are other implementations in the wild…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to focus just on what is in emr-dynamodb-connector
@@ -38,26 +39,23 @@ | |||
* convertDynamoDBItemToValue with a conversion from the DynamoDB item to the desired value type. As | |||
* an alternative, you can also use DefaultDynamoDBRecordReader, and do the conversion to a | |||
* DynamoDBItemWritable inside of the reducer. | |||
* | |||
* @param <K> The type of Key that will be passed to the Mapper | |||
* @param <V> The type of Value that will be passed to the Mapper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand you are simplifying, but I don't get how only Text can be the key
(I didn't check DynamoDBItemWritable but I assume it's an interface to most data types value can have in Dynamo)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hadoop requires a key, but here in emr-dynamodb-connector they always use an empty Text
instance as a key. So, it is not really used. DynamoDBItemWritable
is a class that only contains a Map<String, AttributeValue>
(where AttributeValue
comes from the AWS Java SDK and models a column in a DynamoDB row).
Unfortunately, the PR contains a lot of changes, but in my opinion this is still the best approach to go for now. I’ve also considered using reflection-based approaches to access the On the other hand, with simple changes in emr-dynamodb-connector, we could easily achieve our goal. For instance, if awslabs/emr-dynamodb-connector#196 is merged, we will be able to set up our custom request handler like so: // in DynamoUtils.setDynamoDBJobConf
jobConf.set(DynamoDBConstants.CUSTOM_CLIENT_BUILDER_TRANSFORMER, "WithLoadBalancing")
jobConf.set("scylladb.alternator.endpoint", "https://127.0.0.1:8043")
// Where the class WithLoadBalancing would be defined as follows
class WithLoadBalancing extends DynamoDbClientBuilderTransformer with Configurable {
private var conf: Configuration = null
def apply(builder: DynamoDbClientBuilder): DynamoDbClientBuilder =
builder.endpointProvider(
new AlternatorEndpointProvider(URI.create(conf.get("scylladb.alternator.endpoint")))
)
def setConf(conf: Configuration): Unit = {
this.conf = conf
}
def getConf: Configuration = this.conf
} (note that we would also have to upgrade to v2 of AWS SDK to use the latest version of the connector) |
we have no power over AWS code, so if they will accept it or not is a dice roll ;-) |
Superseded by #173 |
Customize the
emr-dynamodb-connector
to configure the DynamoDB client to use the load balancing request handler from https://github.com/scylladb/alternator-load-balancing.Unfortunately, the project
emr-dynamodb-connector
has not been designed to support customization of the underlying DynamoDB client. Therefore, I had to copy-paste code from the project and change it to customize the underlying DynamoDB client. This led to creating a parallel class hierarchy co-existing with the existing one, but with names all prefixed byLoadBalanced
.The PR should be reviewed commit-by-commit and ignoring the content of 424141b and 2b7f076 which copy-paste verbatim content from
emr-dynamodb-connector
.Blocked by scylladb/alternator-load-balancing#18
Fixes #117