Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Land RPKI tests in Internet.nl #613

Merged
merged 97 commits into from
Jul 19, 2022

Conversation

maertsen
Copy link
Contributor

@maertsen maertsen commented Oct 7, 2021

This is a request for comments on new RPKI functionality in Internet.nl (issue #30). All feedback to improve this work is greatly appreciated.

The RPKI category of tests is implemented for both the webserver and
mailserver tests. It tests IP-addresses of servers and nameservers for a
domain against two subtests: ROA existence for each IP-address [1] and
Route Announcement Validity against routes observed in BGP [2]:

  • the existence subtest checks if an IP-address is covered by at
    least a single valid ROA.
    It passes for a (name)server if all IP-addresses meet this test.
  • the validity subtest checks the outcome of validating the routes
    announcements observed in BGP for an IP-address against so-called
    validated ROA payloads (VRPs).
    It passes for a (name)server if all IP-addresses are covered by at
    least a single route that is RPKI Valid.

Platform Internet Standards should discuss these criteria for success.
For example, the criterion for validity described above would yield a
subtest failure during a forged origin attack, which may well be outside
the control of the domain owner or its suppliers.

[1] https://rpki.readthedocs.io/en/latest/rpki/securing-bgp.html#route-origin-authorisations
[2] https://rpki.readthedocs.io/en/latest/rpki/securing-bgp.html#route-announcement-validity

Dependencies

These tests depend on the availability of a Routinator instance and the
Team Cymru IPtoASN DNS service. docker/README.md contains an example
of a Routinator setup as part of the docker development environment. The
Team Cymru IPtoASN DNS service was already a dependency for the
connection test.

The increased use of the Team Cymru IPtoASN DNS service may require
implementation of additional caching, or switching over to an
alternative. See also the comments in checks/tasks/routing.py.

This work lays the foundation for a future batch test implementation of RPKI.

Limitations

This commit contains only the minimal translations to make the tables
with tech details work. It does not implement scoring beyond the bare
minimum to show a notice upon failure.

Signed-off-by: Maarten Aertsen maarten.aertsen@ncsc.nl
Reviewed-by: Christian Veenman christian.veenman@ncsc.nl

@maertsen
Copy link
Contributor Author

maertsen commented Oct 8, 2021

possible improvements:

  • write unit tests and/or integration tests, perhaps based on rpki-{in,}valid-beacon.meerval.net (thanks @ximon18)
  • improve error handling for unavailability of the configured routinator instance

Copy link
Collaborator

@gthess gthess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job all around!
Mostly nit picks but there are a couple of things that need addressing.
Also it would be good to add an extra "ready" function in checks/appconf.py that will check for routinator's API existence/readiness and print a message.

checks/migrations/0009_rpki.py Outdated Show resolved Hide resolved
checks/probes.py Outdated Show resolved Hide resolved
checks/probes.py Outdated Show resolved Hide resolved
checks/scoring.py Outdated Show resolved Hide resolved
checks/tasks/__init__.py Show resolved Hide resolved
checks/tasks/routing.py Outdated Show resolved Hide resolved
checks/tasks/routing.py Outdated Show resolved Hide resolved
checks/tasks/routing.py Outdated Show resolved Hide resolved
checks/tasks/rpki.py Outdated Show resolved Hide resolved
checks/tasks/rpki.py Outdated Show resolved Hide resolved
@maertsen maertsen changed the title [RFC] Implement RPKI tests in the frontend [RFC] Implement RPKI tests in Internet.nl Nov 18, 2021
@maertsen
Copy link
Contributor Author

maertsen commented Nov 18, 2021

71a3d77 implements RPKI for the batch test. It requires #636 and #637 and has been tested on top of #623.
update: the pull requests mentioned have since been merge into master.

@maertsen maertsen force-pushed the rpki-frontend-squashed branch from 011e98f to 3aafbf8 Compare November 18, 2021 15:49
@maertsen maertsen force-pushed the rpki-frontend-squashed branch from 2230a76 to acecfb7 Compare November 29, 2021 11:10
@maertsen maertsen mentioned this pull request Dec 10, 2021
@baknu
Copy link
Contributor

baknu commented Dec 10, 2021

The current implementation (i.e. this pull request) does not seem to cover testing RPKI ROA;s for the IP addresses of nameservers of CNAME domains (that of course can be different than the nameservers of primary domains). This is in line with the IPv6 test that currently does not test for the IPv6 capabilities of nameservers of CNAME domains (#265). On the other hand the DNSSEC test does currently check if zones of CNAME domains are DNSSEC signed (and thus tests if the full chain is validly DNSSEC signed). However the latter test does not make this transparant in the test results (#188).

@gthess: what are your thoughts on including testing RPKI ROA;s for the IP addresses of nameservers of CNAME domains?

@gthess
Copy link
Collaborator

gthess commented Dec 10, 2021

@baknu: I would say do as the IPv6 test. Currently no, in the future yes.
We currently don't do this explicitly in the IPv6 test, and in the DNSSEC test it is done implicitly because of DNSSEC logic.
So because those two places (IPv6 and RPKI) behave and present results similarly (RPKI is modeled after the IPv6 test) it would be less confusing if they behaved the same way in this matter.

@baknu
Copy link
Contributor

baknu commented Dec 10, 2021

@gthess: Thanks! Is there anyting that we could do in the this initial release to make it easier to add this extension later? And besides do you see any obstacels implementing this extension in the IPv6 and RPKI test, or should this later be quite straight forward?

Maarten Aertsen added 15 commits January 3, 2022 12:02
The RPKI category of tests is implemented for both the webserver and
mailserver tests. It tests IP-addresses of servers and nameservers for a
domain against two subtests: ROA /existence/ for each IP-address [1] and
Route Announcement /Validity/ against routes observed in BGP [2]:

  - the /existence/ subtest checks if an IP-address is covered by at
    least a single valid ROA.
    It passes for a (name)server if all IP-addresses meet this test.
  - the /validity/ subtest checks the outcome of validating the routes
    announcements observed in BGP for an IP-address against so-called
    validated ROA payloads (VRPs).
    It passes for a (name)server if all IP-addresses are covered by at
    least a single route that is RPKI Valid.

Platform Internet Standards should discuss these criteria for success.
For example, the criterion for validity described above would yield a
subtest failure during a forged origin attack, which may well be outside
the control of the domain owner or its suppliers.

[1] https://rpki.readthedocs.io/en/latest/rpki/securing-bgp.html#route-origin-authorisations
[2] https://rpki.readthedocs.io/en/latest/rpki/securing-bgp.html#route-announcement-validity

Dependencies
------------

These tests depend on the availability of a Routinator instance and the
Team Cymru IPtoASN DNS service. `docker/README.md` contains an example
of a Routinator setup as part of the docker development environment. The
Team Cymru IPtoASN DNS service was already a dependency for the
connection test.

The increased use of the Team Cymru IPtoASN DNS service may require
implementation of additional caching, or switching over to an
alternative. See also the comments in `checks/tasks/routing.py`.

This work lays the foundation for a future batch test implementation of RPKI.

Limitations
-----------

This commit contains only the minimal translations to make the tables
with tech details work. It does not implement scoring beyond the bare
minimum to show a notice upon failure.

Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Reviewed-by: Christian Veenman <christian.veenman@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Fixes: 35538eb
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
Signed-off-by: Maarten Aertsen <maarten.aertsen@ncsc.nl>
mxsasha added 4 commits June 16, 2022 16:27
# Conflicts:
#	documentation/example_configuration/etc_apache2_sites-available/internet_nl_shared_auth.conf
#	interface/batch/util.py
@mxsasha mxsasha force-pushed the rpki-frontend-squashed branch from 5d0eec3 to e912aca Compare June 16, 2022 14:29
@mxsasha
Copy link
Collaborator

mxsasha commented Jun 20, 2022

Latest progress: we're getting close to ready. Management commands for mail and web now work, test actually tests something, many smaller improvements and cleanups. API looks good as well. Remaining:

  • Fix a few more host/domain terminology issues
  • Why are web_score/ns_score empty in WebTestRpki objects? Bug? Intended? To be picked up when we implement scoring impact.
  • Some documentation on how it works and why
  • Review with @stitch whether the celery queue setup is correct Create a new celery queue for RPKI and configure it in the Makefile, settings-dist and sample systemd/celery configs
  • Deployment instructions
  • Content to be added by @baknu
  • Merge content and check
  • Fix issue where not-routed is displayed as not-tested
  • Migrations cleanup?
  • Review of the PR by @stitch
  • Deploy and check on Sasha's test server
  • Merge into main
  • Deploy on dev setups for further testing
  • Inform steering committee

Less specific not-found routes

@maertsen not sure whether you saw my last comment on this, but I would like your input on this: not-found less specifics currently lead to a warning. It's visible for forumstandaardisatie.nl:

One of the mailservers is 147.181.98.131
There is a BGP route for 147.181.98.0/24 with origin AS40837 and a matching ROA, so valid
There is a less specific BGP route for 147.181.64.0/18 with origin AS40837 with no ROA, so not-found

Also visible here in IRRexplorer. This causes a warning for route announcement validity now, because the RPKI test has found a not-found route. I think this is a bug: the check should count this as valid, because any host connecting to 147.181.98.14 will use the most specific /24 route, which is valid, therefore the route used to reach the host is secure. It's kind of a moot point as the routing for the /24 and /18 is probably identical, but it might not always be. Thoughts?

@maertsen
Copy link
Contributor Author

Nice work!

@maertsen not sure whether you saw my last comment on this, but I would like your input on this

I hope to look into this later this week.

@maertsen
Copy link
Contributor Author

Less specific not-found routes

@maertsen not sure whether you saw my last comment on this, but I would like your input on this: not-found less specifics currently lead to a warning. It's visible for forumstandaardisatie.nl:

One of the mailservers is 147.181.98.131
There is a BGP route for 147.181.98.0/24 with origin AS40837 and a matching ROA, so valid
There is a less specific BGP route for 147.181.64.0/18 with origin AS40837 with no ROA, so not-found

Also visible here in IRRexplorer. This causes a warning for route announcement validity now, because the RPKI test has found a not-found route. I think this is a bug: the check should count this as valid, because any host connecting to 147.181.98.14 will use the most specific /24 route, which is valid, therefore the route used to reach the host is secure. It's kind of a moot point as the routing for the /24 and /18 is probably identical, but it might not always be. Thoughts?

I'll try to summarize my thinking that went into the test; I hope that's useful.

The best practice is to have a ROA for every prefix announced in BGP. I like the current test design, because it flags BGP announcements lacking one.
I agree that it's likely that all traffic in practice will flow via valid routes, but at the same time we cannot know this for sure. Because neither the perspective on BGP we use (Team Cymru) or that of IRR explorer is necessarily representative of what all operators see.

For those reasons, I would advise against turning this into a success. You could argue that this should be a notice instead of the current warning, but I'm unclear why an operator would publish ROAs in this manner (which the test would be 'rewarding' in a sense). There's also the (minor) aspect of test complexity: the current behaviour is simple and fully relies on Routinator for prefix matching.

Incidentally, this pattern of not publishing a ROA for a less specific prefix seems to be a pattern within 147.181.64.0/18. It also applies to ns4.belastingdienst.nl (more specific: 147.181.112.0/24). This missing ROA for the less specific results in some services in this /18 to be unprotected, e.g. if you look at min{1,2}.ncsc.nl (mailtest for ncsc.nl) the /18 is the longest prefix announced. Perhaps @baknu can ask his colleagues for the reasoning, perhaps I'm missing something.

@maertsen
Copy link
Contributor Author

* [ ]  Why are web_score/ns_score empty in WebTestRpki objects? Bug? Intended?

This has not been implemented for lack of understanding of the scoring code. (see also: limitations at the top of this PR).

This will become a problem once the RPKI test results start to contribute to the overall test score. Currently, they do not. So intended or bug: a bit of both :-)

@mxsasha mxsasha force-pushed the rpki-frontend-squashed branch from 26193a9 to 2ccf9a8 Compare June 21, 2022 19:20
@mxsasha mxsasha force-pushed the rpki-frontend-squashed branch from 2ccf9a8 to 4f528be Compare June 21, 2022 19:21
@mxsasha
Copy link
Collaborator

mxsasha commented Jul 12, 2022

@maertsen another clarification question: if a host is not routed, this code sets the state to not-tested. Seems to me that should be not-routed. Am I right or am I overlooking something?

@maertsen
Copy link
Contributor Author

@maertsen another clarification question: if a host is not routed, this code sets the state to not-tested. Seems to me that should be not-routed. Am I right or am I overlooking something?

hmm, state is perhaps a misnomer here: it refers to the value of the last column of the tech table, not to the test status.

From the top of my head: I think I refrained from adding a detail tech data not-routed value (and stick to the existing values) because we can't know for sure if it's not routed. It's just that we can't find any routes, which precludes us from running the validity test (which is also what result_not_routed amounts to). Does that make any sense?

The other perspective (is that how your question arose?), is that the end-user may want to understand why a certain address is not tested. I figured that would be covered in the detail text for the verdict, but I suppose that's up for discussion.

I'm noticing it's been a while since I wrote this code; if you feel some changes help readability of the code or accessibility of the results, do go ahead!

@mxsasha mxsasha dismissed gthess’s stale review July 18, 2022 12:59

this comment was addressed already

@mxsasha mxsasha merged commit 8d20433 into internetstandards:main Jul 19, 2022
@maertsen maertsen deleted the rpki-frontend-squashed branch August 1, 2022 07:56
@maertsen
Copy link
Contributor Author

maertsen commented Aug 1, 2022

thanks @mxsasha!

@baknu
Copy link
Contributor

baknu commented Nov 4, 2022

The best practice is to have a ROA for every prefix announced in BGP.

In response to a question about this, MA pointed at RFC 7115 and especially at:
"Operators owning prefix P should issue ROAs for all ASes that may announce P. If a prefix is legitimately announced by more than one AS, ROAs for all of the ASes SHOULD be issued so that all are considered Valid."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

RPKI - whether BGP routes are signed
6 participants