Remote shard mapping over TCP #3376

Merged
merged 2 commits into from
Jul 20, 2015

@otoolep otoolep commented Jul 18, 2015

With this change, remote mapping no longer uses HTTP, because the HTTP ports exposed by nodes in the cluster are not known cluster-wide. The TCP ports exposed by the cluster service are known cluster-wide, so this change uses them instead. Each RemoteMapper has its own dedicated connection pool for each node, and remote mapping TCP connections are in no way coupled with query TCP connections.
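To illustrate the per-node connection pool idea, here is a minimal sketch, not the code in this PR. The names `connPool`, `dialFunc`, `newConnPool`, and `countingDialer` are hypothetical, as are the node addresses, and `net.Pipe` stands in for a real TCP dial so the example runs without a cluster. The sketch is not safe for concurrent map access beyond what the channel provides.

```go
package main

import (
	"fmt"
	"net"
)

// dialFunc is a hypothetical hook so the sketch can run without a network.
type dialFunc func(addr string) (net.Conn, error)

// connPool is a hypothetical fixed-capacity pool of idle connections to one node.
type connPool struct {
	addr string
	idle chan net.Conn
	dial dialFunc
}

func newConnPool(addr string, size int, dial dialFunc) *connPool {
	return &connPool{addr: addr, idle: make(chan net.Conn, size), dial: dial}
}

// Get reuses an idle connection if one exists, otherwise dials a new one.
func (p *connPool) Get() (net.Conn, error) {
	select {
	case c := <-p.idle:
		return c, nil
	default:
		return p.dial(p.addr)
	}
}

// Put returns a connection to the pool, closing it if the pool is full.
func (p *connPool) Put(c net.Conn) {
	select {
	case p.idle <- c:
	default:
		c.Close()
	}
}

// countingDialer returns a dialer backed by net.Pipe (an in-memory stand-in
// for a TCP dial) and a counter of how many times it was called.
func countingDialer() (dialFunc, *int) {
	n := new(int)
	return func(string) (net.Conn, error) {
		*n++
		client, server := net.Pipe()
		server.Close() // the far end is irrelevant for this sketch
		return client, nil
	}, n
}

func main() {
	dial, dials := countingDialer()
	// One pool per node ID, used only for mapping traffic (addresses are made up).
	pools := map[uint64]*connPool{
		1: newConnPool("node1.example:8088", 2, dial),
		2: newConnPool("node2.example:8088", 2, dial),
	}
	for i := 0; i < 3; i++ {
		conn, err := pools[1].Get()
		if err != nil {
			fmt.Println("dial error:", err)
			return
		}
		pools[1].Put(conn)
	}
	fmt.Println("dials to node 1:", *dials)
}
```

Because each pool is keyed by node and owned by the mapping code, query traffic would never borrow these connections, which is the decoupling described above.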

It is not clear to me if this will be part of 0.9.2, so I have left the CHANGELOG untouched for now.

I tested this code by forcing all mapping to take place over the network, spinning up a single node, and performing some queries. This meant shard mapping actually took place over localhost. The returned data looked good.

The structure is as follows.

  • Query planner requests mappers from ShardMapper for 1 or more shards. This takes place on the co-ordinating node.
  • ShardMapper returns an object implementing the Mapper interface.
  • If a shard is local, the local store is asked for the Mapper.
  • If a shard is remote, the Mapper is actually a RemoteMapper.
  • RemoteMapper connects to the target node, sending the shard ID and query to the remote node.
  • The Cluster service on the remote node picks up the request, and asks its local store to create a Mapper for the requested shard.
  • Cluster service streams data, served by the Mapper, back to the co-ordinating node.
  • RemoteMapper decodes results and presents them to the query engine.
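The flow above can be sketched roughly as follows. This is an illustrative approximation, not the code in this PR: the `Mapper` interface is reduced to a single hypothetical `NextChunk` method, the request is JSON-encoded purely for brevity (an assumption about the wire format), chunks are plain strings, and `net.Pipe` stands in for the TCP connection to the remote cluster service. `shardMapper`, `serveMapShard`, and `pipeDialer` are made-up names.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"net"
)

// Mapper is a simplified, hypothetical version of the Mapper interface.
type Mapper interface {
	NextChunk() (string, bool) // ok is false once the mapper is exhausted
}

// localMapper serves chunks from a stand-in for the local store.
type localMapper struct{ chunks []string }

func (m *localMapper) NextChunk() (string, bool) {
	if len(m.chunks) == 0 {
		return "", false
	}
	c := m.chunks[0]
	m.chunks = m.chunks[1:]
	return c, true
}

// mapRequest is an assumed wire format, chosen only to keep the sketch short.
type mapRequest struct {
	ShardID uint64
	Query   string
}

// remoteMapper sends the shard ID and query, then decodes streamed chunks.
type remoteMapper struct{ lines *bufio.Scanner }

func (m *remoteMapper) NextChunk() (string, bool) {
	if !m.lines.Scan() {
		return "", false
	}
	return m.lines.Text(), true
}

// serveMapShard plays the cluster service on the remote node: it reads the
// request, asks its local store for a Mapper, and streams the chunks back.
func serveMapShard(conn net.Conn, store map[uint64][]string) {
	defer conn.Close()
	var req mapRequest
	if err := json.NewDecoder(conn).Decode(&req); err != nil {
		return
	}
	local := &localMapper{chunks: store[req.ShardID]}
	for c, ok := local.NextChunk(); ok; c, ok = local.NextChunk() {
		fmt.Fprintln(conn, c)
	}
}

// shardMapper hands out a local or remote Mapper depending on shard placement.
type shardMapper struct {
	local map[uint64][]string
	dial  func(shardID uint64) (net.Conn, error)
}

func (s *shardMapper) CreateMapper(shardID uint64, query string) (Mapper, error) {
	if chunks, ok := s.local[shardID]; ok {
		return &localMapper{chunks: chunks}, nil
	}
	conn, err := s.dial(shardID)
	if err != nil {
		return nil, err
	}
	req := mapRequest{ShardID: shardID, Query: query}
	if err := json.NewEncoder(conn).Encode(req); err != nil {
		conn.Close()
		return nil, err
	}
	return &remoteMapper{lines: bufio.NewScanner(conn)}, nil
}

// pipeDialer fakes a remote node in-process using net.Pipe.
func pipeDialer(remoteStore map[uint64][]string) func(uint64) (net.Conn, error) {
	return func(uint64) (net.Conn, error) {
		client, server := net.Pipe()
		go serveMapShard(server, remoteStore)
		return client, nil
	}
}

func main() {
	sm := &shardMapper{
		local: map[uint64][]string{1: {"cpu=0.1"}},
		dial:  pipeDialer(map[uint64][]string{2: {"cpu=0.5", "cpu=0.7"}}),
	}
	for _, id := range []uint64{1, 2} {
		m, err := sm.CreateMapper(id, "SELECT value FROM cpu")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		for c, ok := m.NextChunk(); ok; c, ok = m.NextChunk() {
			fmt.Println("shard", id, "chunk", c)
		}
	}
}
```

The point of the shape is that the query engine only ever sees a `Mapper`; whether the data came from the local store or over the network is hidden behind the interface.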

@otoolep otoolep mentioned this pull request Jul 18, 2015

otoolep commented Jul 18, 2015

@@ -26,13 +26,13 @@ func (a mapperValues) Len() int { return len(a) }
func (a mapperValues) Less(i, j int) bool { return a[i].Time < a[j].Time }
func (a mapperValues) Swap(i, j int) { a[i], a[j] = a[j], a[i] }

-type mapperOutput struct {
+type MapperOutput struct {

Simply exporting this type, no other changes.

@otoolep otoolep force-pushed the mapping_over_tcp branch 2 times, most recently from 813ad49 to ab53bdc Compare July 19, 2015 06:15
err := s.processMapShardRequest(conn, buf)
if err != nil {
	s.Logger.Printf("process map shard error: %s", err)
	_ = writeMapShardResponseMessage(conn, NewMapShardResponse(1, err.Error()))

Error ignored. Should at least log it.


Makes sense.
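
One way the suggestion above could be addressed is to check the error returned when writing the error response and log it rather than discarding it. The sketch below is an assumption about the shape of the fix, not the change that was actually committed: `writeResponse` is a hypothetical stand-in for `writeMapShardResponseMessage` (whose real signature is not shown here), and `brokenConn`, `handleMapShard`, and `logOutputOnBrokenConn` exist only to make the example runnable.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"log"
)

// writeResponse is a hypothetical stand-in for writeMapShardResponseMessage.
func writeResponse(w io.Writer, code int, msg string) error {
	_, err := fmt.Fprintf(w, "%d %s\n", code, msg)
	return err
}

// brokenConn simulates a connection whose writes always fail.
type brokenConn struct{}

func (brokenConn) Write([]byte) (int, error) {
	return 0, errors.New("connection reset")
}

// handleMapShard logs the processing error and, if the error reply itself
// cannot be written, logs that failure too instead of ignoring it.
func handleMapShard(logger *log.Logger, conn io.Writer, process func() error) {
	err := process()
	if err != nil {
		logger.Printf("process map shard error: %s", err)
		if werr := writeResponse(conn, 1, err.Error()); werr != nil {
			logger.Printf("write map shard response error: %s", werr)
		}
	}
}

// logOutputOnBrokenConn runs the handler against a failing connection and
// returns everything that was logged.
func logOutputOnBrokenConn() string {
	var buf bytes.Buffer
	logger := log.New(&buf, "", 0)
	handleMapShard(logger, brokenConn{}, func() error {
		return errors.New("shard not found")
	})
	return buf.String()
}

func main() {
	fmt.Print(logOutputOnBrokenConn())
}
```

With this shape, a failed reply shows up in the node's log alongside the original processing error, so a dropped connection during error reporting is no longer silent.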

jwilder commented Jul 20, 2015

A few small things but 👍 after they're fixed.

otoolep commented Jul 20, 2015

Thanks for the feedback @jwilder

otoolep and others added 2 commits July 20, 2015 10:44
With this change remote mapping no longer uses HTTP, as the HTTP ports
exposed by nodes on the cluster are not known cluster wide. The TCP
ports exposed by the cluster service are, so this change uses that
functionality. Each RemoteMapper has its own dedicated connection pool
for each node, and remote mapping TCP connections are in no way coupled
with query TCP connections.
This is useful primarily for testing.

otoolep commented Jul 20, 2015

OK, I believe I addressed all comments, merging now

otoolep added a commit that referenced this pull request Jul 20, 2015
@otoolep otoolep merged commit 30345f5 into master Jul 20, 2015
@otoolep otoolep deleted the mapping_over_tcp branch July 20, 2015 19:37

otoolep commented Jul 20, 2015

@jwilder - I just realised I merged the protobuf changes that you pointed out. I did not mean to do that. I meant to handle that in a separate PR, as you requested.
