-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
0xPop #199
0xPop #199
Conversation
c90ab0a
to
d2717d8
Compare
b2a2785
to
b4ebf19
Compare
77afcc8
to
ea39b51
Compare
if let Some(bgp) = p.bgp.as_ref() { | ||
if bgp.id == id { | ||
let mut marked = p.clone(); | ||
marked.bgp = Some(bgp.as_stale()); | ||
return Some(marked); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this relying on the peer's Router ID to identify routes learned from a particular BGP session? That would probably suffice in the short term, but may pose problems in deployments with multiple sessions to the same peer (e.g. over redundant L3 links).
e.g.
If we have BGP sessions A and B to RID X and a route-refresh is only received on session A, we should only mark the paths learned from session A as stale. Paths from session B should still be valid.
If we only use RID to identify a BGP session, then it seems like a route-refresh from session A would result in routes from both session A/B being marked stale... and all paths from session B getting removed when EoRR is received on session A.
I don't know of any current deployments with redundant sessions to the same peer (or re-use of RID across multiple routers) so I don't think this is high priority today, but I think is something we'll need to address at some point (assuming I'm understanding the code correctly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created a follow-up issue.
7e47592
to
964d17a
Compare
This reverts commit 98c152b. will bring this back later when vlan work is complete in dendrite
// Omicron API XXX? use normal API now that it's somewhat civilized? | ||
register!(api, bgp_apply); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to refactor some things in that omicron RPW anyway, so we can move away from calling this api endpoint when we do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing work, this is a huge PR! I've reached a stopping point for today's review session and leaving some notes. So far I only have some very minor items and a few questions.
mgadm/src/bgp.rs
Outdated
#[derive(Debug, Args)] | ||
pub struct OmicronSubcommand { | ||
#[command(subcommand)] | ||
command: OmicronCmd, | ||
} | ||
|
||
#[derive(Subcommand, Debug)] | ||
pub enum OmicronCmd { | ||
/// Apply a BGP peer group configuration. | ||
Apply { filename: String }, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔
Are peer groups specially related to Omicron in some way? Not a big deal, just curious why peer group configurations are under this command hierarchy and not one of the BGP neighbor related ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit more than a peer group in the typical sense, that comment is very misleading will fix. Thanks for pointing that out.
The pinned sha for maghemite was out of date, pointing to an invalid sha (I believe because of some force pushes that removed the specific commit). This updates omicron to point to [the most recent sha for maghemite](oxidecomputer/maghemite#199) and updates the necessary checksums so the tests pass. Some context on why this is even needed: because there is no mac build of mgd, we build it from source for macs, and need the correct sha for that.
This PR knocks off the following issues.
--no-default-features
stopped working #234Additionally, the BGP API is "normalized" in the sense that everything is now organized around resources, and every resource is a full set of CRUD operations. This is to facilitate better interactions with automation frameworks like Terraform. There is currently a work in progress terraform provider for maghemite. This also greatly helped in standing up testing environments where we iterate through a number of configuration states and make sure things land where we expect in container lab and bespoke falcon environments.