
optimizing memory usage #190

Merged: 4 commits into talaia-labs:master from better-mem-cpu-usage, Sep 14, 2023

Conversation

@mariocynicys (Collaborator) commented Feb 18, 2023

Just by loading the minimal necessary data plus dropping the last_known_blocks we get a huge reduction (2GB -> 1.2GB for 2.2M appointments*).
We can even do better and further reduce the data stored in memory (dropping user_ids and potentially locators).

* This is on macOS. On Linux we get as low as 1.3G, but only 1.0G is actually being used; we know that because calling malloc_trim brings us down to 1.0G.
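
For reference, that trim can be triggered from Rust through the libc crate on Linux/glibc; a minimal sketch of the experiment, not something this PR adds:

// Requires the `libc` crate. Linux/glibc only: asks the allocator to return
// free heap pages to the OS, which is what the malloc_trim figure above measures.
fn trim_heap() {
    #[cfg(target_os = "linux")]
    unsafe {
        libc::malloc_trim(0); // 0 = keep no extra padding at the top of the heap
    }
}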

Fixes #224

@sr-gi (Member) commented Feb 20, 2023

We can even do better and further reduce the data stored in memory (dropping user_ids and potentially locators).

I thought you were going to pack all mem optimizations in the same PR. Are you planning on a follow-up for that?

@mariocynicys (Collaborator, Author) commented:

I thought you were going to pack all mem optimizations in the same PR. Are you planning on a follow-up for that?

Nah, will follow-up in this PR.

Polling the least amount of data already solved the iter/non-iter issue, since it only appears when a hashmap has a dynamically sized field (the encrypted_blob in our case). And I'm not really in favor of playing with trimming and such, since it's not a general solution.
What's left is:

  • streaming reads from the DB (shouldn't introduce any issues, but haven't looked into it yet).

Not sure if there are any trivial/non-structural opts left after that.


The next set of optimizations for the watcher, I think, are:

  • removing user_id from appointment summary (64 byte reduction) -> will need to call the DB for user_id when needed.
  • squeezing the locator_uuid_map to only a locator set -> will need to call the DB for uuids if a locator is matched on a block.
  • removing the locator set altogether and relying on DB calls in SQL transactions for each block, as discussed on Discord, plus streaming the DB reads with the breach handling.

I will need to review the watcher and see what roles these two hashmaps play, so as to understand how reducing/eliminating them would affect the tower CPU-wise in a normal case (not so many breaches).

Will also need to check similar memory optimization options for the responder and gatekeeper.

@mariocynicys (Collaborator, Author) commented:

After a very very long fight with lifetimes, boxes, dyns and everything I was able to assemble this f12cb90.

It's working as intended as a way of streaming data whilst hiding the DBM-specific things, but the design isn't good at all:

  • QueryIterator isn't actually an iterator, you need to call .iter() to get the real iterator: if you try to make it an iterator, you will end up reimplementing rusqlite all over, because you need to replace the Rows & Statement structs they offer.
  • You need to annotate QueryIterator's params generic type, which shouldn't be relevant to the caller at all: stmt.query_map(params, ...) needs params to be of a known size at compile time, so it can't be placed in a Box<dyn Params>.

The issue lies in DBM methods constructing a Statement which dies when the method ends. Since we need an iterator and don't want to collect just yet, we want to return Rows, which is bound to the lifetime of that Statement, so we have to return the statement as well, which is what the above-mentioned commit does. rusqlite/rusqlite#1265 should solve this if it were to be implemented.
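
A minimal sketch of that borrow problem (illustrative only, intentionally non-compiling):

use rusqlite::{Connection, Rows};

// `Rows` borrows the `Statement` it came from, and the `Statement` is a local
// that dies when the method returns, so this cannot compile without also
// returning (and thus keeping alive) the statement.
fn load_rows(conn: &Connection) -> Rows<'_> {
    let mut stmt = conn.prepare("SELECT ...").unwrap();
    let rows = stmt.query([]).unwrap();
    rows // error[E0515]: cannot return value referencing local variable `stmt`
}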

That said, I might be complicating the issue and it could be solved in another, simpler way, so I'm posting here to get a review on this.

Meanwhile I am moving on to other possible memory optimization options.

@sr-gi (Member) left a comment

I am only reviewing f12cb90 for now.

I don't think this approach is that bad. It'd be best if we could simply implement Iterator for QueryIterator, but no matter how much I've scratched my head, I haven't found a way of doing so.

With respect to changes being implemented in rusqlite, the one you've mentioned seems to depend on an issue from 2016, so I think it's pretty unlikely :(

Regarding the code itself: assuming our goal is to use this only for bootstrapping, we don't actually need the params to be part of QueryIterator, given there are none for either the collection of appointment summaries or the trackers. Therefore we could simplify it to something like the following:

diff --git a/teos/src/db_iterator.rs b/teos/src/db_iterator.rs
index 9bf61a1..feda098 100644
--- a/teos/src/db_iterator.rs
+++ b/teos/src/db_iterator.rs
@@ -1,22 +1,19 @@
-use rusqlite::{MappedRows, Params, Result, Row, Statement};
+use rusqlite::{MappedRows, Result, Row, Statement};
 use std::iter::Map;

 /// A struct that owns a [Statement] and has an `iter` method to iterate over the
 /// results of that DB query statement.
-pub struct QueryIterator<'db, P, T> {
+pub struct QueryIterator<'db, T> {
     stmt: Statement<'db>,
-    params_and_mapper: Option<(P, Box<dyn Fn(&Row) -> T>)>,
+    mapper: Option<Box<dyn Fn(&Row) -> T>>,
 }

-impl<'db, P, T> QueryIterator<'db, P, T>
-where
-    P: Params,
-{
+impl<'db, T> QueryIterator<'db, T> {
     /// Construct a new [QueryIterator].
-    pub fn new(stmt: Statement<'db>, params: P, f: impl Fn(&Row) -> T + 'static) -> Self {
+    pub fn new(stmt: Statement<'db>, f: impl Fn(&Row) -> T + 'static) -> Self {
         Self {
             stmt,
-            params_and_mapper: Some((params, Box::new(f))),
+            mapper: Some(Box::new(f)),
         }
     }

@@ -28,9 +25,9 @@ where
         &mut self,
     ) -> Option<Map<MappedRows<'_, impl FnMut(&Row) -> Result<T>>, impl FnMut(Result<T>) -> T>>
     {
-        self.params_and_mapper.take().map(move |(params, mapper)| {
+        self.mapper.take().map(move |mapper| {
             self.stmt
-                .query_map(params, move |row| Ok((mapper)(row)))
+                .query_map([], move |row| Ok((mapper)(row)))
                 .unwrap()
                 .map(|row| row.unwrap())
         })
diff --git a/teos/src/dbm.rs b/teos/src/dbm.rs
index 318d8c9..6823313 100644
--- a/teos/src/dbm.rs
+++ b/teos/src/dbm.rs
@@ -7,7 +7,7 @@ use std::path::PathBuf;
 use std::str::FromStr;

 use rusqlite::limits::Limit;
-use rusqlite::{params, params_from_iter, Connection, Error as SqliteError, Row, ParamsFromIter};
+use rusqlite::{params, params_from_iter, Connection, Error as SqliteError};

 use bitcoin::consensus;
 use bitcoin::hashes::Hash;
@@ -307,9 +307,7 @@ impl DBM {
     }

     /// Loads all [AppointmentSummary]s from that database.
-    pub(crate) fn load_appointment_summaries(
-        &self,
-    ) -> QueryIterator<ParamsFromIter<[u8; 0]>, (UUID, AppointmentSummary)> {
+    pub(crate) fn load_appointment_summaries(&self) -> QueryIterator<(UUID, AppointmentSummary)> {
         let stmt = self
                 .connection
                 .prepare(
@@ -318,7 +316,7 @@ impl DBM {
                 )
                 .unwrap();

-        let func = |row: &Row| {
+        QueryIterator::new(stmt, |row| {
             let raw_uuid: Vec<u8> = row.get(0).unwrap();
             let raw_locator: Vec<u8> = row.get(1).unwrap();
             let raw_userid: Vec<u8> = row.get(2).unwrap();
@@ -329,9 +327,7 @@ impl DBM {
                     UserId::from_slice(&raw_userid).unwrap(),
                 ),
             )
-        };
-
-        QueryIterator::new(stmt, params_from_iter([]), func)
+        })
     }

     /// Loads appointments from the database. If a locator is given, this method loads only the appointments

Also, notice how annotating the closure is not necessary as long as you pass it directly to QueryIterator::new, given the compiler can infer the types in that case (if you assign it to a variable it could be used in different contexts, so it looks like the compiler doesn't really like that).

All in all, I don't think this is too terrible. The alternative would be replacing rusqlite with something that has a design more compatible with our needs (which I'm not opposed to, tbh), or being a rustlang master and finding a tricky solution 🙃

@mariocynicys (Collaborator, Author) commented:

Happy you find it not that terrible 😂
Regarding Params, I wanted this to be generic over any DB query. I think an iterator is also going to be useful for processing a newly connected block's breaches without collecting all the appointments in memory at one time.

@sr-gi (Member) commented Mar 1, 2023

Happy you find it not that terrible 😂

It was you calling it not good at all, I'm just smoothing it out a bit lol

@mariocynicys (Collaborator, Author) commented:

The DB iterator stuff is backed up in https://github.com/mariocynicys/rust-teos/tree/dbm-iterator

@mariocynicys (Collaborator, Author) commented Mar 7, 2023

After this change, reached 631MiB after trimming using the 1G DB.

Some tests needed to be adapted because they were breaking the recipe for UUID = UserId + Locator; for example, Watcher::store_appointment was sometimes passed UUIDs that didn't link to the appointment being stored (they were generated using generate_uuid).
This was problematic because if we store locators in the gatekeeper's UserInfo and want to get the UUID back for some locator, it won't match. I think a similar problem shows up with the absence of a uuid -> locator map (i.e. Watcher::appointments), but I don't recall the details.
But these issues were in tests only anyway.

Also, some methods might need to have their signatures changed a little bit to be less confusing and less error prone.
For example, GateKeeper::add_update_appointment accepts (user id, locator, extended appointment). One can get the user id and locator from the extended appointment already, so there is a replication of passed data here; removing it would avoid us supplying a wrong user id or locator and avoid confusion about what these parameters should actually be and whether they should match the fields in the extended appointment or not.

Another thing that I think we should change is the excessive use of hashsets and hashmaps in functions. Hash maps and sets are more expensive to insert into and use more memory than a vector. In many cases we pass a hashmap to a function just to iterate over it (never using the mapping functionality it offers). Such hashmaps could be replaced with a Vec<(key, val)>.
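
A toy illustration of that kind of signature change (placeholder types, not teos code):

use std::collections::HashMap;

// Forces callers to build a HashMap even though the function only iterates:
fn log_all_map(items: &HashMap<u64, String>) {
    for (key, val) in items {
        println!("{key}: {val}");
    }
}

// Takes a slice of pairs instead; a Vec<(key, val)> is cheaper to build and
// smaller in memory when keyed lookups are never needed:
fn log_all(items: &[(u64, String)]) {
    for (key, val) in items {
        println!("{key}: {val}");
    }
}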

I pushed that commit 72b384d to get a concept ACK, but I think I will need to do some refactoring & renaming.
@sr-gi

@mariocynicys (Collaborator, Author) commented:

Let me also dump some notes I had written xD.


Trying to get rid of Watcher::locator_uuid_map and Watcher::appointments.

1- For locator_uuid_map:
locator_uuid_map is used to keep uuids for each locator.
On a breach for locator X, you need to load all the UUIDs associated with it, load the appointments for these UUIDs (load_appointment(uuid)), and try to decrypt and broadcast them.

This should be replaced with (a rough sketch follows this list):

  • Get all the locators found in a newly mined block.
  • Do a DB tx looking for all the UUIDs for these locators.
  • Do all of this in an iterable way to avoid storing any intermediaries in memory for no reason.
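
A rough sketch of that batched lookup, assuming the appointments(UUID, locator, ...) schema discussed in this thread; this is not the PR's actual code:

use rusqlite::{params_from_iter, Connection, Result};

// Fetch the UUIDs matching the locators seen in a newly mined block in a single
// query. A real implementation would chunk `locators` by SQLite's variable
// limit (note the `rusqlite::limits::Limit` import in the diff above) and
// handle the empty case.
fn uuids_for_locators(conn: &Connection, locators: &[Vec<u8>]) -> Result<Vec<Vec<u8>>> {
    let placeholders = vec!["?"; locators.len()].join(", ");
    let sql = format!("SELECT UUID FROM appointments WHERE locator IN ({placeholders})");
    let mut stmt = conn.prepare(&sql)?;
    let rows = stmt.query_map(params_from_iter(locators.iter()), |row| row.get(0))?;
    rows.collect()
}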

2- For appointments:
The appointments hashmap maps a UUID to its two parents (user_id & locator).

We extract the locator from a UUID when a client calls get_subscription_info. We map the UUIDs stored in the gatekeeper's UserInfo to Locators using the appointments hashmap; this can be avoided if we store locators instead of UUIDs in UserInfo. We can always recover the UUIDs from the locators since we know the user_id for that UserInfo, and we also save 20% in size (locators are 16 bytes while UUIDs are 20 bytes).

We extract the user_id from a UUID when a new block is connected (in filtered_block_connected); user_ids are used when creating trackers for breaches (why store user_id for a tracker?), and they are also used to instruct the gatekeeper which user to update after broadcasting the appointment with UUID X (returning the user's slots and so on).
As an alternative, we can get the user_id from the database: when a new block is connected and we pull the relevant penalty TXs for broadcasting, we can pull the user_signature as well, which we can recover the user_id from. And everything should be done in an iterable fashion.


That latest commit works on point 2. I will try to work on point 1 without the iterable fashion mentioned, and then adapt it once we can iterate over DB queries in a nice manner.

@sr-gi (Member) commented Mar 8, 2023

Alright, I think you are on the right path. I'm commenting on your notes, but haven't checked 72b384d (lmk if you'd like me to at this stage).

After this change, reached 631MiB after trimming using the 1G DB.

Some tests needed to be adapted because they were breaking the recipe for UUID = UserId + Locator; for example, Watcher::store_appointment was sometimes passed UUIDs that didn't link to the appointment being stored (they were generated using generate_uuid). This was problematic because if we store locators in the gatekeeper's UserInfo and want to get the UUID back for some locator, it won't match. I think a similar problem shows up with the absence of a uuid -> locator map (i.e. Watcher::appointments), but I don't recall the details. But these issues were in tests only anyway.

Yeah, that should only affect tests. In the test suite the uuid recipe was not being followed because data was being generated sort of randomly, but that should be relatively easy to patch.

Also, some methods might need to have their signatures changed a little bit to be less confusing and less error prone.
For example, GateKeeper::add_update_appointment accepts (user id, locator, extended appointment). One can get the user id and locator from the extended appointment already, so there is a replication of passed data here; removing it would avoid us supplying a wrong user id or locator and avoid confusion about what these parameters should actually be and whether they should match the fields in the extended appointment or not.

I partially agree with this. With respect to the user_id, I think we can simplify it given, as you mentioned, one is literally part of the other (and this indeed applies to multiple methods). With respect to the UUID, I'm not that sure given this will trigger a cascade of recomputing this same data. Take Watcher::add_appointment for instance. Here, after computing the UUID, we call:

  • self.responder.has_tracker
  • log (using UUID)
  • Gatekeeper::add_update_appointment
  • Watcher::store_triggered_appointment
  • Watcher::store_appointment

The former two need the UUID but don't get the ExtendedAppointment. The latter three need the UUID and also receive the ExtendedAppointment. Just for this method we would be creating the UUID four times (and I'm not counting the calls that happen inside those methods; the store_X methods also call the DBM, which also receives the UUID and ExtendedAppointment).

Another thing that I think we should change is the excessive use of hashsets and hashmaps in functions. Hash maps and sets are more expensive to insert into and use more memory than a vector. In many cases we pass a hashmap to a function just to iterate over it (never using the mapping functionality it offers). Such hashmaps could be replaced with a Vec<(key, val)>.

I agree, as long as collecting the map doesn't end up giving worse performance. The same applies to sets. The reasoning behind sets instead of vectors is that the former allow item deletion by identifier, while the latter don't. In a nutshell, if we're only iterating it may be worth it, but if we need addition/deletion it may be trickier. Actually, if we only want to iterate, wouldn't an iterator be an option? Also, are we passing copies of the maps/sets or references?

We extract locator from a uuid when a client calls get_subscription_info

The other way around, or am I missing something? The user requests a given locator and we compute the UUID based on his UserId and the requested Locator.

We map the UUIDs stored in the gatekeeper's UserInfo to Locators using the appointments hashmap; this can be avoided if we store locators instead of UUIDs in UserInfo. We can always recover the UUIDs from the locators since we know the user_id for that UserInfo, and we also save 20% in size (locators are 16 bytes while UUIDs are 20 bytes).

I barely remember this now, but I think the reason why UUIDs were stored was so, on deletion, Gatekeeper data could be mapped to Watcher and Responder data. We can indeed re-compute the UUID based on the locator and the user_id (that's why the UUID is created that way, so we could serve queries without having to do a reverse lookup), so I think you may be right here.
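
For reference, a hedged sketch of that recomputation. The exact digest is an assumption on my part (Ripemd160 matches the 20-byte UUID size mentioned above); the authoritative recipe lives in the codebase, presumably teos/src/extended_appointment.rs:

use bitcoin::hashes::{ripemd160, Hash};

// Deterministic UUID from its two "parents": the appointment locator and the
// user's public key, so it can be recomputed on demand instead of stored.
fn recompute_uuid(locator: &[u8; 16], user_id: &[u8; 33]) -> [u8; 20] {
    let mut data = locator.to_vec();
    data.extend_from_slice(user_id);
    ripemd160::Hash::hash(&data).into_inner()
}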

We extract the user_id from a UUID when a new block is connected (in filtered_block_connected); user_ids are used when creating trackers for breaches (why store user_id for a tracker?)

Same reasoning I think, so we can delete the corresponding data from Gatekeeper and Watcher/Responder when needed. But again, I'm speaking off the top of my head; I would need to review a change like this to see if it makes sense.

@mariocynicys (Collaborator, Author) commented Mar 10, 2023

With respect to the UUID, I'm not that sure given this will trigger a cascade of recomputing this same data.

That's true, we can keep it taking UUID but be cautious later in the tests not to provide a random non-related UUID. Or we can embed the UUID inside the extended appointment.


Actually, if we only want to iterate, wouldn't an iterator be an option? Also, are we passing copies of the maps/sets or references?

Yup, an iterator would be the best we can do if we manage to make the calls reacting to block_connected stream-able.
We pass references mostly/everywhere(?), but some of these sets and maps are never used for mapping, so we can simplify them to vecs instead (& refs to vecs).


The other way around, or am I missing something? The user requests a given locator and we compute the UUID based on his UserId and the requested Locator.

We extract locator from a uuid when a client calls get_subscription_info

When a client asks for their subscription info, they wanna know the locators they have given out, not the UUIDs (UUIDs are tower-implementation specific after all). The thing is, we store a user's appointments in terms of UUIDs inside UserInfo and we want to convert them to locators for the get_subscription_info response.
There is no going back from UUID -> Locator (without a map), so we could have stored Locators in UserInfo instead.

get_subscription_info is the message that returns subscription info to the user including all the locators they have sent out to us.

common_msgs::GetSubscriptionInfoResponse {
  available_slots: subscription_info.available_slots,
  subscription_expiry: subscription_info.subscription_expiry,
  locators: locators.iter().map(|x| x.to_vec()).collect(),
}

I think you confused it with get_appointment.


but haven't checked 72b384d (lmk if you'd like me to at this stage).

Nah, ACKing these comments was enough.
Will clean this commit, do some renamings and some refactoring for the tests.

@sr-gi (Member) commented Mar 10, 2023

The other way around, or am I missing something? The user requests a given locator and we compute the UUID based on his UserId and the requested Locator.

We extract locator from a uuid when a client calls get_subscription_info

When a client asks for their subscription info, they wanna know the locators they have given out, not the UUIDs (UUIDs are tower-implementation specific after all). The thing is, we store a user's appointments in terms of UUIDs inside UserInfo and we want to convert them to locators for the get_subscription_info response.
There is no going back from UUID -> Locator (without a map), so we could have stored Locators in UserInfo instead.

get_subscription_info is the message that returns subscription info to the user including all the locators they have sent out to us.

common_msgs::GetSubscriptionInfoResponse {
  available_slots: subscription_info.available_slots,
  subscription_expiry: subscription_info.subscription_expiry,
  locators: locators.iter().map(|x| x.to_vec()).collect(),
}

I think you confused it with get_appointment.

I was indeed thinking about single appointment requests 😅

@sr-gi (Member) left a comment

This is not a thorough review, some things may still be missing, but given I wanted to start using these optimizations at teos.talaia.watch, I went ahead and gave this a look.

I think this goes in the right direction, but it needs some polishing.

@mariocynicys (Collaborator, Author) commented May 25, 2023

Current mem usage: 69M
Which is basically the locator_cache & tx_index & carrier data & in-memory user info in the gatekeeper.

The last 4 commits should have their tests adapted and squashed into one commit. Leaving them for now for an easy linear review.

Do they make sense in terms of how much they query the DB (bottlenecks)?
@sr-gi

@sr-gi (Member) left a comment

Overlooking that tests are not building and that some methods have unnecessary arguments for now; just focusing on the big changes.

I reviewed up to the Gatekeeper (so Watcher and Responder are missing), but I don't trust GH not to delete all my pending comments, so I'll add the rest in a follow-up review.

@sr-gi (Member) left a comment

Looks like GH posted the comment twice, so I'm reserving this spot for later 😆

@sr-gi (Member) left a comment

Some comments on the Watcher and Responder.

I would need to do another general pass once the code is cleaned up; it's otherwise hard to follow.

Take a look at the comment regarding the disconnections, I think we should preserve that logic instead of trying to react to things on block_disconnected.

@mariocynicys (Collaborator, Author) commented Jun 22, 2023

Current mem usage: 69M
Which is basically the locator_cache & tx_index & carrier data & in-memory user info in the gatekeeper.

I also need to mention the IO implications of this refactor. Most of the IO is reads triggered each block by batch_check_locators_exist; other methods' IO will arise when breaches are found (like pulling encrypted penalties from the DB), but those aren't new.

  • 111 calls to batch_check_locators_exist made around ~27.6GiB of IO reads (~255MiB per block).
  • No extra writes.

Other affected reads:

  • get_subscription_info: Pulls all the user's locators from the DB, reads depend on how big the user is.

The 111 calls to batch_check_locators_exist I based those measurements on all had no breaches found, thus the ~255MiB is solely the search operation, not data being read and returned. I think the search will be more efficient if we have an index over the locators in the appointments table.
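
A minimal sketch of wiring such an index into the DB setup (the statement itself is the one applied in the next comment):

// Sketch only; in rust-teos this would presumably live next to the rest of the
// schema setup in dbm.rs.
fn create_locator_index(conn: &rusqlite::Connection) -> rusqlite::Result<()> {
    conn.execute(
        "CREATE INDEX IF NOT EXISTS locators_index ON appointments (locator)",
        [],
    )?;
    Ok(())
}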

@mariocynicys (Collaborator, Author) commented:

The 111 calls to batch_check_locators_exist I based those measurements on all had no breaches found, thus the ~255MiB is solely the search operation, not data being read and returned. I think the search will be more efficient if we have an index over the locators in the appointments table.

After applying CREATE INDEX IF NOT EXISTS locators_index ON appointments (locator), 266 calls to batch_check_locators_exist made 786MiB of reads (~3MiB per block).

@sr-gi (Member) commented Jun 22, 2023

The 111 calls to batch_check_locators_exist I based those measurements on all had no breaches found, thus the ~255MiB is solely the search operation, not data being read and returned. I think the search will be more efficient if we have an index over the locators in the appointments table.

After applying CREATE INDEX IF NOT EXISTS locators_index ON appointments (locator), 266 calls to batch_check_locators_exist made 786MiB of reads (~3MiB per block).

Wow, what a reduction, that's nice :)

@mariocynicys (Collaborator, Author) commented:

The current rework of the responder is actually equivalent to how things were previously, before the memory opt stuff (but without ConfirmationStatus::ReorgedOut).

I believe it still needs some reworking to keep tracking reorged disputes for a longer time (plus actually tracking reorged disputes for non-reorged penalties), but I don't know whether that should be in a follow-up since this one has grown a lot. Thoughts @sr-gi ?

@sr-gi (Member) left a comment

I think this makes sense, but it is starting to get to the point where a cleanup would be helpful, especially to be able to properly review each commit individually.

It would need a more thorough review, but see this as an Approach ACK.

@mariocynicys (Collaborator, Author) commented:

I think this makes sense, but it is starting to get to the point where a cleanup would be helpful, especially to be able to properly review each commit individually.

Ummm, I thought we could only do so many optimizations at first and relied on some in-memory structs being there. Incrementally, I realized we could remove these as well and updated the old commits, so it's a bit confusing not having done all of them in a single commit.

I would suggest squash-reviewing all the commits (minus the test-fixing one & the merge one) at once, as separating each of them into a standalone commit would be a very nasty rebase (and would probably end up squashing most/all of them together anyway).

@sr-gi (Member) commented Jul 17, 2023

I think this makes sense, but it is starting to get to the point where a cleanup would be helpful, especially to be able to properly review each commit individually.

Ummm, I thought we could only do so many optimizations at first and relied on some in-memory structs being there. Incrementally, I realized we could remove these as well and updated the old commits, so it's a bit confusing not having done all of them in a single commit.

I would suggest squash-reviewing all the commits (minus the test-fixing one & the merge one) at once, as separating each of them into a standalone commit would be a very nasty rebase (and would probably end up squashing most/all of them together anyway).

Fair, as long as there is a cleanup of old comments and suggestions are addressed.

@mariocynicys mariocynicys requested a review from sr-gi August 7, 2023 17:29
@sr-gi (Member) left a comment

Final nits.

Can you rebase so I can review this from the new base? I think only tests are missing, but just to avoid having to go through that again.

You can leave squashing for later, after the final review.

@mariocynicys mariocynicys force-pushed the better-mem-cpu-usage branch 4 times, most recently from 43cb570 to c9d4c3f Compare August 8, 2023 06:03
@sr-gi (Member) left a comment

This is a full review, including tests. It is looking pretty good.

We should be tracking things that are pending so we do not forget them in the follow-ups. I've added comments for some; here are more (some may be duplicated):

Comment on lines +1191 to +1195
assert_eq!(
    dbm.get_appointment_length(uuid).unwrap(),
    appointment.inner.encrypted_blob.len()
);
@sr-gi (Member) commented:

This is a bit tricky because all appointments created using generate_random_appointment do use the exact same penalty, which is hardcoded.

I don't really think it is a big issue, given the assertion doesn't need to work with random sizes, but in prod appointments will certainly have different lengths.

@sr-gi (Member) commented:

Ideally, a "random" transaction could be created by just modifying some of the transaction bits, such as value, prev_txid, ...

It may be worth adding something on these lines to the testing suite at some point.

@mariocynicys (Collaborator, Author) commented:

prev_txid is already random (used for encryption) but I don't think this affects the length of the encrypted bytes.

pub struct Transaction {
    /// The protocol version, is currently expected to be 1 or 2 (BIP 68).
    pub version: i32,
    /// Block number before which this transaction is valid, or 0 for valid immediately.
    pub lock_time: u32,
    /// List of transaction inputs.
    pub input: Vec<TxIn>,
    /// List of transaction outputs.
    pub output: Vec<TxOut>,
}

We can add more inputs/outputs to the transaction. A variable size OP_RETURN should do the trick.
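
A hypothetical test helper along those lines (not part of the PR; API names follow the older bitcoin crate version shown in the struct above):

use bitcoin::blockdata::opcodes::all::OP_RETURN;
use bitcoin::blockdata::script::Builder;
use bitcoin::{Transaction, TxOut};

// Pad a transaction with a variable-size OP_RETURN output so the serialized,
// and hence encrypted, penalty length differs between generated appointments.
fn pad_with_op_return(mut tx: Transaction, padding: usize) -> Transaction {
    let script_pubkey = Builder::new()
        .push_opcode(OP_RETURN)
        .push_slice(&vec![0u8; padding])
        .build();
    tx.output.push(TxOut { value: 0, script_pubkey });
    tx
}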

@@ -329,6 +270,9 @@ impl Watcher {
.lock()
.unwrap()
.store_appointment(uuid, appointment)
// TODO: Don't unwrap, or better, make this insertion atomic with the
@sr-gi (Member) commented:

Add this to the pending fixes issue

@mariocynicys (Collaborator, Author) commented:

We should be tracking things that are pending so we do not forget them in the follow-ups

I have typed up the issues (one for the add_appointment atomicity issues & one for DB persistence of tracker status) but will wait to add code references to them from this PR once it's on master.

@sr-gi (Member) commented Aug 24, 2023

Most comments have been addressed, I left additional comments in the ones that have not.

Once those are addressed, this should be good to go. It will need both rebasing and squashing.

@mariocynicys mariocynicys force-pushed the better-mem-cpu-usage branch 2 times, most recently from 83614ec to b4e6ba6 Compare August 30, 2023 14:43
@mariocynicys mariocynicys force-pushed the better-mem-cpu-usage branch 2 times, most recently from 5d32e26 to f965c40 Compare August 31, 2023 11:48
mariocynicys and others added 2 commits September 1, 2023 16:31
By loading the minimal necessary data during bootstrap, we get lower
memory usage and faster bootstrapping.

Co-authored-by: Sergi Delgado Segura <sergi.delgado.s@gmail.com>
`last_known_blocks` was taking up ~300MiB of memory (for 100 blocks) because it was not dropped in `main`.

Co-authored-by: Sergi Delgado Segura <sergi.delgado.s@gmail.com>
Regarding the `Watcher`, the fields (appointments, locator_uuid_map) have
been replaced by DB calls when needed.

For `Responder`, the field `trackers` has been replaced by DB calls when
needed, and `tx_tracker_map` wasn't actually needed for the tower to
operate, so it was just dropped.

For `GateKeeper`, `registered_users::appointments`, which used to hold
the uuids of every appointment the user submitted, was removed so that
`registered_users` only holds meta information about users.

The gatekeeper is now also the entity responsible for deleting appointments from the database. Instead of the watcher/responder asking the gatekeeper for the users to update and carrying out the deletion and update itself, the watcher/responder now hands the gatekeeper the uuids to delete and the gatekeeper figures out which users it needs to update (refund the freed slots to).

Also, as in `Watcher::store_triggered_appointment`, if the appointment is invalid or was rejected by the network during block connection, the freed slots will not be refunded to the user.

The block connection order now starts with the gatekeeper; this allows
the gatekeeper to delete the outdated users so that the watcher and the
responder don't take them into account.
@sr-gi (Member) left a comment

LGTM

@sr-gi sr-gi merged commit 658fcca into talaia-labs:master Sep 14, 2023
7 checks passed
Successfully merging this pull request may close this issue: Edge case where outdated users will persist in the DB.