This is the beginning of the clustering stuff. Client/server communication, etc. #20
Conversation
Feedback... "yay". 🍒
/cc @phobos182 @dysinger because they know a lot more about Go than I do.
… client and server for cluster communication. Fill out request/response in protocol. Add request handler interface and implementation.
…ything together. Move the server daemon to daemon/influxd.go. Remove extraneous stuff from the client server test.
…object from http/api over to cluster config since it makes more sense there.
…quest handler. Add LogRequest method stub to datastore.
Conflicts: src/datastore/leveldb_datastore.go
-  required int64 timestamp = 2;
-  required uint32 sequence_number = 3;
+  optional int64 timestamp = 2;
+  optional uint32 sequence_number = 3;
   }
Just a reminder: per our conversation, these will change back to required fields, and the timestamp/sequence number will be assigned by the server that received the initial request before proxying the points (possibly locally) to other nodes.
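To make that concrete, here is a rough Go sketch of the kind of assignment being described. The `Point`/`Series` stand-ins and the `nextSequenceNumber` helper are hypothetical, not the actual coordinator code:

```go
package sketch

import "time"

// Minimal stand-ins for the generated protobuf types; the field names follow
// the .proto snippet above, everything else here is an assumption.
type Point struct {
	Timestamp      *int64
	SequenceNumber *uint32
}

type Series struct {
	Points []*Point
}

// assignTimestampAndSequence sketches the idea: the server that receives the
// original request stamps every point with a timestamp and sequence number
// before proxying the points (possibly locally) to other nodes.
func assignTimestampAndSequence(series *Series, nextSequenceNumber func() uint32) {
	now := time.Now().UnixNano() / int64(time.Microsecond)
	for _, p := range series.Points {
		if p.Timestamp == nil {
			ts := now
			p.Timestamp = &ts
		}
		if p.SequenceNumber == nil {
			sn := nextSequenceNumber()
			p.SequenceNumber = &sn
		}
	}
}
```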
💖
if ok {
	responseChan <- response
	if *response.Type == protocol.Response_END_STREAM {
		close(responseChan)
Should the channel be deleted from the map as well?
Actually, a related issue: what happens to requests that are lost for one reason or another? We need some sort of garbage collector to go back and clean up the map after a certain timeout has elapsed.
yep, it should be. Will probably update this to use gocache.
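For reference, a minimal sketch of the timeout-based cleanup being discussed, using a plain mutex-guarded map rather than gocache; all of the names here (`requestMap`, `pendingRequest`, `reap`) are hypothetical:

```go
package sketch

import (
	"sync"
	"time"
)

// Response is a stand-in for protocol.Response.
type Response struct{}

// pendingRequest pairs a response channel with the time it was registered so
// stale entries can be garbage collected.
type pendingRequest struct {
	responses chan *Response
	createdAt time.Time
}

type requestMap struct {
	mu      sync.Mutex
	pending map[uint32]*pendingRequest
}

// complete closes and removes a request once its END_STREAM response arrives.
func (m *requestMap) complete(id uint32) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if req, ok := m.pending[id]; ok {
		close(req.responses)
		delete(m.pending, id)
	}
}

// reap removes requests whose responses never arrived within maxAge, closing
// their channels so readers don't block forever.
func (m *requestMap) reap(maxAge time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	for id, req := range m.pending {
		if time.Since(req.createdAt) > maxAge {
			close(req.responses)
			delete(m.pending, id)
		}
	}
}
```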
* Move ReplicateWrite to live only on Coordinator.
* Change SequenceNumber on Point to be a uint64 to support higher numbers and host ids being the first part of it.
* Move time and sequence_number assigning to the coordinator instead of in the datastore.
* Roll RequestLogging and PersistentAtomicIncrement into the Datastore interface.
* Add PersistentAtomicIncrement to LevelDbDatastore to support non-colliding sequence numbers.
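A short sketch of how the uint64 sequence number described above can avoid collisions: put the host id in the high bits and a per-host counter in the low bits. The bit split and names here are illustrative assumptions, not the actual implementation:

```go
package sketch

import "sync/atomic"

// HOST_ID_BITS is an assumed split: the high bits of the uint64 identify the
// host, the low bits hold a per-host counter, so two servers never hand out
// the same sequence number.
const HOST_ID_BITS = 16

type sequenceGenerator struct {
	hostId  uint64
	counter uint64 // the real counter would be backed by PersistentAtomicIncrement
}

func (s *sequenceGenerator) next() uint64 {
	n := atomic.AddUint64(&s.counter, 1)
	return s.hostId<<(64-HOST_ID_BITS) | n
}
```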
…sts when responses are sent.
}

func (self *ProtobufClient) reconnect() {
	swapped := atomic.CompareAndSwapUint32(&self.reconnecting, self.reconnecting, IS_RECONNECTING)
I think the second argument should be IS_CONNECTED, otherwise the compare will always be true.
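A sketch of what the fixed call could look like, assuming `IS_CONNECTED`/`IS_RECONNECTING` constants like the ones used in the snippet above (their values here are made up):

```go
package sketch

import "sync/atomic"

// The constant values are assumptions; only the pattern matters here.
const (
	IS_CONNECTED    uint32 = 0
	IS_RECONNECTING uint32 = 1
)

type ProtobufClient struct {
	reconnecting uint32
}

func (self *ProtobufClient) reconnect(dial func() error) {
	// Compare against the constant IS_CONNECTED instead of re-reading
	// self.reconnecting, so exactly one goroutine wins the swap.
	if !atomic.CompareAndSwapUint32(&self.reconnecting, IS_CONNECTED, IS_RECONNECTING) {
		return // another goroutine is already reconnecting
	}
	defer atomic.StoreUint32(&self.reconnecting, IS_CONNECTED)
	if err := dial(); err != nil {
		// the real client would log and retry; elided in this sketch
		return
	}
}
```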
if err != nil {
	// we failed to write locally and to any proxy, bail
	return err
}
What happens if the local write failed and the proxy write succeeded? Are we going to send the replicas another write?
The node that took the proxied write will be responsible for replicating to the other nodes. If those replications fail, the downed node will replay the missing data when it comes back online.
Ah, I see what you mean, fixing.
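For anyone following along, a rough sketch of the flow described in this thread: try the local write, fall back to proxying, and leave replication to whichever server accepted the proxied write. The `Server`/`Request` types and the function name are stand-ins, not the real coordinator API:

```go
package sketch

import "errors"

// Request and Server are stand-ins; the real types live in the coordinator
// and protocol packages.
type Request struct{}

type Server interface {
	Write(*Request) error
}

// writeWithProxyFallback tries the local write first and, if it fails, hands
// the request to another server in the replica set. Whichever server accepts
// the proxied write becomes responsible for replicating it to the remaining
// nodes; a downed node replays missing data when it comes back online.
func writeWithProxyFallback(local Server, proxies []Server, r *Request) error {
	if err := local.Write(r); err == nil {
		return nil
	}
	for _, p := range proxies {
		if err := p.Write(r); err == nil {
			return nil
		}
	}
	// we failed to write locally and to any proxy, bail
	return errors.New("write failed on local server and all proxies")
}
```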
The integration tests and the package script should be using daemon instead of server, which is the old name of the influxdb binary.
* Update Response in protocol to have nextPointTime
* Wire up distributed query in coordinator
* Add method to ClusterConfig to get servers to query
* Add query handling to request handler
* Add point sorting to protocol.Series
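A small sketch of what point sorting on protocol.Series amounts to: order by timestamp and break ties with the sequence number. The descending direction and the stand-in types here are assumptions:

```go
package sketch

import "sort"

// Stand-ins for the generated protobuf types.
type Point struct {
	Timestamp      int64
	SequenceNumber uint64
}

type Series struct {
	Points []*Point
}

// SortPoints orders points by timestamp and breaks ties with the sequence
// number. Descending order is an assumption; the real protocol.Series sort
// may go the other way.
func (s *Series) SortPoints() {
	sort.Slice(s.Points, func(i, j int) bool {
		a, b := s.Points[i], s.Points[j]
		if a.Timestamp != b.Timestamp {
			return a.Timestamp > b.Timestamp
		}
		return a.SequenceNumber > b.SequenceNumber
	})
}
```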
Conflicts: src/datastore/leveldb_datastore.go, src/engine/engine.go
…stributed query bugs. 3 days of debugging pays off in glorious success! The ring filter for the datastore is optional; it filters which points get yielded based on their location in the ring. This needs to be used when making distributed queries on a database whose replication factor doesn't evenly divide the size of the cluster.
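A rough sketch of the ring filter idea, under the assumption that a series' ring location comes from hashing its key; the hash choice and names are illustrative only:

```go
package sketch

import "hash/fnv"

// ringLocation maps a key (e.g. database + series name) onto one of ringSize
// slots. The FNV hash here is an assumption; the real datastore may hash
// differently.
func ringLocation(key string, ringSize int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32()) % ringSize
}

// ringFilter yields only the series whose ring location is assigned to this
// server for the current query. It is needed when the replication factor
// doesn't evenly divide the cluster size, because each replica then holds a
// different, partially overlapping slice of the ring.
func ringFilter(assigned map[int]bool, ringSize int, seriesKeys []string) []string {
	var owned []string
	for _, key := range seriesKeys {
		if assigned[ringLocation(key, ringSize)] {
			owned = append(owned, key)
		}
	}
	return owned
}
```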
…owner server id, and writing server id for use in request logging and replication.
…der than 48 hours.
Conflicts: package.sh
* Update protocol with replication log replays.
* Update protobuf server to keep map of connections to be able to close cleanly
* Add replay support to request handler, coordinator, and datastore
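A minimal sketch of the connection map mentioned above, so the protobuf server can close every open connection cleanly on shutdown; the `connectionTracker` name and methods are hypothetical:

```go
package sketch

import (
	"net"
	"sync"
)

// connectionTracker keeps a map of live connections keyed by remote address
// so the protobuf server can close them all cleanly on shutdown.
type connectionTracker struct {
	mu    sync.Mutex
	conns map[string]net.Conn
}

func (t *connectionTracker) add(conn net.Conn) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.conns == nil {
		t.conns = make(map[string]net.Conn)
	}
	t.conns[conn.RemoteAddr().String()] = conn
}

func (t *connectionTracker) closeAll() {
	t.mu.Lock()
	defer t.mu.Unlock()
	for addr, conn := range t.conns {
		conn.Close()
		delete(t.conns, addr)
	}
}
```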
* fix: wal skip persist and notify if empty buffer (#25211)
  This fixes the WAL so that it will skip persisting a file and notifying the file notifier if the wal buffer is empty.
* fix: fix last cache persist test
* fix: make ParquetChunk fields and mod chunk pub (#25219)
  This doesn't affect anything in the OSS version, but these changes are needed for Pro as part of our compactor work.
* fix: cargo deny failure
* fix: query bugs with buffer (#25213)
  This fixes three different bugs with the buffer. First, aggregations would fail because projection was pushed down to the in-buffer data that de-duplication needs to be called on. The test in influxdb3/tests/server/query.rs catches that. I also added a test in write_buffer/mod.rs to ensure that data is correctly queryable when combining different states: only data in buffer, only data in parquet files, and data across both. This showed two bugs: one where the parquet data was being doubled up (parquet chunks were being created in write buffer mod and in queryable buffer), and a second where the timestamp min/max on table buffer would panic if the buffer was empty.
* refactor: PR feedback
* fix: fix wal replay and buffer snapshot
  Fixes two problems uncovered by adding to the write_buffer/mod.rs test. Ensures we can replay wal data and that snapshots work properly with replayed data.
* fix: run cargo update to fix audit
* feat: use host identifier prefix in object store paths (#25224)
  This enforces the use of a host identifier prefix in all object store paths (currently, for parquet files, catalog files, and snapshot files). The persister retains the host identifier prefix and uses it when constructing paths. The WalObjectStore also holds the host identifier prefix so that it can use it when saving and loading WAL files. The influxdb3 binary requires a new argument 'host-id' to be passed that is used to specify the prefix.
* feat: add `system.parquet_files` table (#25225)
  This extends the system tables available with a new `parquet_files` table which will list the parquet files associated with a given table in a database. Queries to system.parquet_files must provide a table_name predicate to specify the table name of interest. The files are accessed through the QueryableBuffer. In addition, a test was added to check success and failure modes of the new system table query. Finally, the Persister trait had its associated error type removed. This was somewhat of a consequence of how I initially implemented this change, but I felt it cleaned the code up a bit, so I kept it in the commit.
* fix: un-pub QueryableBuffer and fix compile errors (#25230)
* refactor: Make Level0Duration part of WAL (#25228)
  I noticed this during some testing and cleanup with other PRs. The WAL had its own level_0_duration and the write buffer had a different one, which would cause some weird problems if they weren't the same. This refactors Level0Duration to be in the WAL and fixes up the tests. As an added bonus, this surfaced a bug where multiple L0 blocks getting persisted in the same snapshot wasn't supported, so now snapshot details can have many files per table.
* fix: have persisted files always return in descending data time order
* fix: sort record batches for test verification
* fix: main (#25231)
* feat: Add last cache create/delete to WAL (#25233)
  This moves the LastCacheDefinition into the WAL so that it can be serialized there. This ended up being a pretty large refactor to get the last cache creation to work through the WAL. I think I also stumbled on a bug where the last cache wasn't getting initialized from the catalog on reboot, so that it wouldn't actually end up caching values. The refactored last cache persistence test in write_buffer/mod.rs surfaced this. Finally, I also had to update the WAL so that it would persist if there were only catalog updates and no writes. Fixes #25203
* fix: typos
* feat: Catalog apply_catalog_batch only updates if new (#25236)
  This updates the Catalog so that when applying a catalog batch it only updates the inner catalog and bumps the sequence number and updated tracker if there are new updates in the batch. Also does validation that the catalog batch schema is compatible with any existing. Closes #25205
* feat: only persist catalog when updated (#25238)
* chore: ignore sqlx rustsec advisory (#25252)
* feat: Add FileIndex type to influxdb3_index
  This commit does two important things:
  1. It creates a new influxdb3_index crate under influxdb3_pro to contain all indexing logic and types that we might create for influxdb3_pro.
  2. It creates our first index type, the FileIndex, which is part of #20.
  Note we're starting off with just file ids, as this will let us set up the logic for creating and working with the `FileIndex` inside of the compactor first. Later we can add row groups, as that logic is a bit more complicated in nature. The `FileIndex` contains methods to lookup, insert, and delete items from the index as needed, and an associated test to make sure it works as expected. Note that the `FileIndex` is meant to have one created for each database table that has an index created for it. Later on, when it's being integrated into the compactor, a `FileIndex` will be returned per compaction of a given table. We'll later integrate this into the `WriteBuffer` for querying as well as adding this to the WAL so that indexes can be recreated as needed.

Co-authored-by: Paul Dix <paul@pauldix.net>
Co-authored-by: Trevor Hilton <thilton@influxdata.com>
Don't bother merging this in yet. Just review and give feedback please :)