Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0xPop #199

Merged
merged 39 commits into from
May 4, 2024
Merged

0xPop #199

merged 39 commits into from
May 4, 2024

Conversation

rcgoodfellow
Copy link
Collaborator

@rcgoodfellow rcgoodfellow commented Apr 7, 2024

This PR knocks off the following issues.

Additionally, the BGP API is "normalized" in the sense that everything is now organized around resources, and every resource is a full set of CRUD operations. This is to facilitate better interactions with automation frameworks like Terraform. There is currently a work in progress terraform provider for maghemite. This also greatly helped in standing up testing environments where we iterate through a number of configuration states and make sure things land where we expect in container lab and bespoke falcon environments.

Comment on lines +606 to +626
if let Some(bgp) = p.bgp.as_ref() {
if bgp.id == id {
let mut marked = p.clone();
marked.bgp = Some(bgp.as_stale());
return Some(marked);
}
}
Copy link
Contributor

@taspelund taspelund Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this relying on the peer's Router ID to identify routes learned from a particular BGP session? That would probably suffice in the short term, but may pose problems in deployments with multiple sessions to the same peer (e.g. over redundant L3 links).
e.g.
If we have BGP sessions A and B to RID X and a route-refresh is only received on session A, we should only mark the paths learned from session A as stale. Paths from session B should still be valid.
If we only use RID to identify a BGP session, then it seems like a route-refresh from session A would result in routes from both session A/B being marked stale... and all paths from session B getting removed when EoRR is received on session A.

I don't know of any current deployments with redundant sessions to the same peer (or re-use of RID across multiple routers) so I don't think this is high priority today, but I think is something we'll need to address at some point (assuming I'm understanding the code correctly).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rcgoodfellow rcgoodfellow force-pushed the oxpop branch 2 times, most recently from 7e47592 to 964d17a Compare May 2, 2024 04:29
@rcgoodfellow rcgoodfellow marked this pull request as ready for review May 2, 2024 21:24
Comment on lines +105 to +106
// Omicron API XXX? use normal API now that it's somewhat civilized?
register!(api, bgp_apply);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to refactor some things in that omicron RPW anyway, so we can move away from calling this api endpoint when we do that.

Copy link
Contributor

@internet-diglett internet-diglett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work, this is a huge PR! I've reached a stopping point for today's review session and leaving some notes. So far I only have some very minor items and a few questions.

rdb/src/bestpath.rs Outdated Show resolved Hide resolved
rdb/src/bestpath.rs Outdated Show resolved Hide resolved
rdb/src/bestpath.rs Show resolved Hide resolved
rdb/src/bestpath.rs Outdated Show resolved Hide resolved
rdb/src/bestpath.rs Outdated Show resolved Hide resolved
interop-lab/create-containers.sh Show resolved Hide resolved
rdb/src/db.rs Show resolved Hide resolved
mgadm/src/bgp.rs Outdated
Comment on lines 79 to 89
#[derive(Debug, Args)]
pub struct OmicronSubcommand {
#[command(subcommand)]
command: OmicronCmd,
}

#[derive(Subcommand, Debug)]
pub enum OmicronCmd {
/// Apply a BGP peer group configuration.
Apply { filename: String },
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔

Are peer groups specially related to Omicron in some way? Not a big deal, just curious why peer group configurations are under this command hierarchy and not one of the BGP neighbor related ones.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit more than a peer group in the typical sense, that comment is very misleading will fix. Thanks for pointing that out.

@rcgoodfellow rcgoodfellow merged commit 025389f into main May 4, 2024
12 checks passed
@rcgoodfellow rcgoodfellow deleted the oxpop branch May 4, 2024 00:25
@internet-diglett internet-diglett mentioned this pull request May 7, 2024
david-crespo pushed a commit to oxidecomputer/omicron that referenced this pull request May 12, 2024
The pinned sha for maghemite was out of date, pointing to an invalid sha
(I believe because of some force pushes that removed the specific
commit).

This updates omicron to point to [the most recent sha for
maghemite](oxidecomputer/maghemite#199) and
updates the necessary checksums so the tests pass.

Some context on why this is even needed: because there is no mac build
of mgd, we build it from source for macs, and need the correct sha for
that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment