Skip to content
This repository has been archived by the owner on Aug 30, 2022. It is now read-only.

refactor MaskObject into vector and scalar parts #524

Merged
merged 13 commits into from
Sep 28, 2020
Merged

Conversation

finiteprods
Copy link
Contributor

@finiteprods finiteprods commented Sep 18, 2020

The changes introduced for the generalised scalar extension had the unfortunate side effect of adding unwelcome duplication to the code base - effectively, a pair of MaskObjects were passed through the system, either a (masked model, masked scalar) pair, or their corresponding masks. This in turn lead to the need for a pair of Aggregations, a pair of MaskDicts, and so on. Worse, one is a vector and the other is a scalar, but the types don't reflect this at all.

This merge request introduces a new, more type-safe MaskObject:

pub struct MaskObject {
    pub vector: MaskMany,
    pub scalar: MaskOne,
}

consisting of a MaskMany (the new name for the old MaskObject) and a MaskOne (containing exactly one data value):

pub struct MaskOne {
    pub data: BigUint,
    pub config: MaskConfig,
}

As a result, the scalar-related code is better encapsulated away, for the most part. We get back the tidier code (roughly similar to before #496) - in particular, the new scaling correction step of the generalised extension becomes an implementation detail of the unmasking, as it probably should.

A separate MaskOne type also allows a MaskConfig to be associated with a scalar - previously, scalars could only be masked according to the same masking configuration as the weights. Although currently, masking configurations are very geared towards the masking of weights, this is nevertheless a necessary first step in achieving more flexibility in the masking of scalars.

a few extra details...

  1. The above refactoring also allows the existing unit tests to more properly cover the scalar extension code. This in turn revealed an issue with (probably) the masking of scalars, as unmasked models no longer match (within the tolerance we expect) the original model weights. As such, the tolerance checks of the masking tests are temporarily relaxed (commit). These should be reverted once the issue has been resolved.
  2. The change in structure of MaskObject necessitates changes in its (de)serialization. As with other types we serialize, a corresponding "buffer" type MaskObjectBuffer is introduced, and similarly MaskManyBuffer. But currently, a MaskOne is serialized by converting it into a MaskMany which was convenient to some extent for purposes of code reuse, but perhaps sub-optimal in other ways. For example, a MaskOne can be encoded more compactly than a MaskMany - there is always a single data value, so there is no need for a LENGTH field. This is an optimization that can be tackled later.
  3. Some comments in reviews (pairing of MaskConfigs into a type, more consistent naming) will also be dealt with in another PR, probably in conjunction with 1. above.

Todos:

  • tests still to fix
  • conflicts / rebase
  • complete the More to follow

Copy link
Contributor

@little-dude little-dude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice. My only concern is the naming, which I find slightly inconsistent sometimes.

@@ -76,10 +94,10 @@ impl Into<MaskObject> for Aggregation {
#[allow(clippy::len_without_is_empty)]
impl Aggregation {
/// Creates a new, empty aggregator for masks or masked models.
pub fn new(config: MaskConfig, object_size: usize) -> Self {
pub fn new(config_many: MaskConfig, config_one: MaskConfig, object_size: usize) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it's worth it but maybe we could have a

struct AggregationConfig {
  model: MaskConfig,
  scalar: MaskConfig,
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would further increase tidiness, thanks. will add this in a later PR if that's ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this could be tackled with #524 (comment) right?

if self.object.config != mask.config || self.object_size != mask.data.len() {
return Err(UnmaskingError::MaskMismatch);
if self.nb_models > self.object.scalar.config.model_type.max_nb_models() {
return Err(UnmaskingError::TooManyScalars);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can self.object.vector.config.model_type.max_nb_models() and self.object.scalar.config.model_type.max_nb_models() actually differ? Seems like the max is actually capped to the min of these values. More generally, can the scalar masking config and model masking config have incompatibilities?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this could probably done in one check. more generally, since the scalar could be masked with a different config to weights, we should do the same check to make sure it admits all of them for aggregating. will address separately.


#[error("too many models were aggregated for the current unmasking configuration")]
TooManyModels,

#[error("the model to aggregate is incompatible with the current aggregated model")]
#[error("too many scalars were aggregated for the current unmasking configuration")]
TooManyScalars,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth distingushing between TooManyModels and TooManyScalars? Could they be unified into an UnmaskingError::TooManyMasks for instance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess distinguishing the cases has the advantage when we get errors, e.g. to narrow down which mask config contributed to the error.

if self.object.vector.config != mask.vector.config
|| self.object_size != mask.vector.data.len()
{
return Err(UnmaskingError::MaskManyMismatch);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming is a bit inconsistent between model/scalar vs many/one. One one side we have TooManyScalars and TooManyModels, and here MaskManyMismatch and MaskOneMismatch. There are other places where that is the case. Is the motivation for introducing the many/one terminology to be broader than the ML specific term "model"? I like the idea but I think we should have a clear separation of where we use a specific terminology. For instance we could say that in xaynet-core there shouldn't be any mention of "model", while in xaynet-client and xaynet-server which are domain specific, we should only use the model/scalar terminology (it would even be better if we could find a more specific term than "scalar" imo).

On the same topic, I have a slight preference for vector/scalar over many/one. It's not domain specific but fits better the nature of the data we're referring to. Wdyt?

Copy link
Contributor Author

@finiteprods finiteprods Sep 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I agree the naming isn't the best at the moment. OT1H I had introduced new terminology, OTOH I was also trying to not deviate too much from the existing names here for these errors (which were already not named the best either!). Let me revise this separately if that's ok and we can continue the discussion there. And yes, the idea of Many vs One was to more abstractly capture the structure of masked models / scalars and model masks / scalar masks. Vector / Scalar would also have worked but scalar clashes with the more domain specific use of the term, as you also noted.

Copy link
Contributor

@Robert-Steiner Robert-Steiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice refactoring👍
This will make the extracting of the mask dictionary into Redis much easier

rust/xaynet-client/src/participant.rs Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Sep 28, 2020

Codecov Report

Merging #524 into master will increase coverage by 0.28%.
The diff coverage is 72.83%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #524      +/-   ##
==========================================
+ Coverage   56.82%   57.10%   +0.28%     
==========================================
  Files          65       65              
  Lines        3233     3299      +66     
==========================================
+ Hits         1837     1884      +47     
- Misses       1396     1415      +19     
Impacted Files Coverage Δ
...aynet-client/src/mobile_client/participant/sum2.rs 22.22% <0.00%> (+3.47%) ⬆️
...net-client/src/mobile_client/participant/update.rs 33.33% <0.00%> (-2.39%) ⬇️
rust/xaynet-server/src/state_machine/requests.rs 80.95% <ø> (-1.27%) ⬇️
rust/xaynet-server/src/storage/impls.rs 64.28% <ø> (ø)
rust/xaynet-core/src/mask/object/mod.rs 50.00% <51.72%> (+5.55%) ⬆️
rust/xaynet-core/src/message/payload/update.rs 88.88% <71.42%> (-2.26%) ⬇️
rust/xaynet-core/src/mask/masking.rs 71.96% <74.69%> (-8.72%) ⬇️
rust/xaynet-core/src/message/payload/sum2.rs 86.20% <80.00%> (-2.48%) ⬇️
...t/xaynet-server/src/state_machine/phases/update.rs 64.28% <83.33%> (+2.89%) ⬆️
...ust/xaynet-server/src/state_machine/phases/sum2.rs 78.26% <85.71%> (-0.08%) ⬇️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 29d23e0...aac5e1f. Read the comment docs.

@finiteprods finiteprods marked this pull request as ready for review September 28, 2020 09:28
Copy link
Contributor

@Robert-Steiner Robert-Steiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me

rust/xaynet-server/src/state_machine/phases/sum2.rs Outdated Show resolved Hide resolved
@little-dude little-dude self-requested a review September 28, 2020 12:53
Copy link
Contributor

@little-dude little-dude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Very nice cleanup. Could we just tidy the git history a bit before we merge?

…ect serialization

implementation of mask object serialization is completed here
the prior incomplete implementation lead to the failing tests above
note: these tests should be tightened once the source of the masking issue has been resolved
fixes a bug in aggregation for the case where the 1st mask object is added to an empty aggregation
the bug was revealed by the failing tests mentioned above
…-> MaskMany

fmt imports

remove stale comment

remove commented out code
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants