This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add very basic domain validation for DomainSpecificString. #9071

Merged
merged 3 commits into from
Jan 13, 2021
Changes from 2 commits
1 change: 1 addition & 0 deletions changelog.d/9071.bugfix
@@ -0,0 +1 @@
Fix "Failed to send request" errors when a client provides an invalid room alias.
8 changes: 8 additions & 0 deletions synapse/types.py
@@ -247,6 +247,14 @@ def from_string(cls: Type[DS], s: str) -> DS:

        domain = parts[1]

        # TODO The checking of a valid domain name should be made stricter.
Member Author

There are a bunch of rules about valid domain names that we could enforce (at most 253 characters, made up of labels of at most 63 characters, etc.), but I'm not confident that those aren't broken by some real domain names out there. Commas shouldn't be allowed, nor should other general punctuation (:!@#$%^&*(), etc.)... I can expand this if we'd like. 😄
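For illustration only, the length and label rules mentioned in this comment could be sketched as below. `looks_like_dns_name` is a hypothetical helper, not part of Synapse or of this PR, and it deliberately handles plain DNS names only (no ports, no IP literals), which is part of why enforcing such rules on every identifier is risky.

```python
import re

# Hypothetical helper, not Synapse code: one DNS label is 1-63 characters of
# letters, digits and hyphens, and may not start or end with a hyphen.
_LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def looks_like_dns_name(domain: str) -> bool:
    """Return True if `domain` passes the basic DNS length and label rules."""
    # The whole name may be at most 253 characters in its textual form.
    if not domain or len(domain) > 253:
        return False
    # Every dot-separated label must match; empty labels ("a..b") fail.
    return all(_LABEL_RE.fullmatch(label) for label in domain.split("."))
```

Under these assumptions `looks_like_dns_name("my.domain")` is accepted and `looks_like_dns_name("domain,test")` is rejected, but so is a bracketed IPv6 literal such as `[::1]`.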

Member

":" absolutely can go in the domain part of a domain-specific string, since these can include IPv6 literals, with all the fun that entails.

There already exists a function parse_and_validate_server_name which does what I think you want here.
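To make the IPv6 point concrete, here is a minimal sketch of splitting a server name into host and optional port when the host may be a bracketed IPv6 literal. `split_server_name` is a hypothetical function written for this note; it is not the implementation of parse_and_validate_server_name, and it does not validate the host itself.

```python
from typing import Optional, Tuple


def split_server_name(server_name: str) -> Tuple[str, Optional[int]]:
    """Hypothetical sketch: split 'host[:port]', allowing '[ipv6]' hosts."""
    if server_name.startswith("["):
        # Bracketed IPv6 literal: colons inside the brackets belong to the host.
        end = server_name.find("]")
        if end == -1:
            raise ValueError("unterminated IPv6 literal: %r" % (server_name,))
        host, rest = server_name[: end + 1], server_name[end + 1 :]
        if rest == "":
            return host, None
        if not rest.startswith(":"):
            raise ValueError("unexpected text after ']': %r" % (server_name,))
        port_str = rest[1:]
    else:
        # Plain hostname or IPv4 address: the first colon starts the port.
        host, sep, port_str = server_name.partition(":")
        if sep == "":
            return host, None
    if not (port_str.isascii() and port_str.isdigit()):
        raise ValueError("invalid port in %r" % (server_name,))
    return host, int(port_str)
```

With this sketch, `split_server_name("[::1]:8448")` yields the host `[::1]` and the port `8448`, so a blanket ban on ":" in the domain part would reject legitimate names.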

Member

I'm trying to remember the reason we didn't already do more validation here. I think the reason is some combination of:

  • efficiency: we don't necessarily want to validate an identifier every time we pull it out of the database or otherwise wrap it in a DomainSpecificString. However, I'm prepared to believe that increased robustness from better checking makes that a price worth paying.
  • there may be cases where there are existing events which contain malformed identifiers, but which have already been accepted into the event DAG of a room. Changing the rules now could have unforeseen consequences.

If you want to give it a go, I won't object. Just trying to give some background.

Member Author

Right, IP literals complicate things a bit. 😢

Pulling things out of the database might not need validation, but we should be validating new data coming in. I might be able to do something a bit more specific for that endpoint to fix the Sentry issue, but that feels more like a hack. I could believe there are some broken aliases and such already in events, unfortunately. I'll think about it a bit.

Thanks for the pointer to parse_and_validate_server_name!

Member Author

I made some changes based on this conversation -- the additional validation is only done on is_valid and not with from_string. I did a bit of auditing and it looks like is_valid is never called with data pulled from the database so I hope this is safer!
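The split described in this comment (lenient parsing in from_string, stricter checking in is_valid) might look roughly like the sketch below. It is a simplified stand-in rather than the merged Synapse code: `RoomAliasSketch`, `InvalidIdentifier` and `_domain_ok` are hypothetical names, and `_domain_ok` is a placeholder for whatever server-name validation is actually used.

```python
class InvalidIdentifier(ValueError):
    """Hypothetical error type standing in for the real exception class."""


def _domain_ok(domain: str) -> bool:
    # Placeholder check (an assumption): non-empty, no commas, no whitespace.
    return bool(domain) and "," not in domain and not any(c.isspace() for c in domain)


class RoomAliasSketch:
    SIGIL = "#"

    def __init__(self, localpart: str, domain: str) -> None:
        self.localpart = localpart
        self.domain = domain

    @classmethod
    def from_string(cls, s: str) -> "RoomAliasSketch":
        """Lenient parse: structural checks only, so stored data still loads."""
        if not s.startswith(cls.SIGIL):
            raise InvalidIdentifier("expected %r to start with %r" % (s, cls.SIGIL))
        parts = s[1:].split(":", 1)
        if len(parts) != 2:
            raise InvalidIdentifier("expected %r to contain a domain" % (s,))
        return cls(localpart=parts[0], domain=parts[1])

    @classmethod
    def is_valid(cls, s: str) -> bool:
        """Strict check for new input: parse first, then validate the domain."""
        try:
            obj = cls.from_string(s)
        except InvalidIdentifier:
            return False
        return _domain_ok(obj.domain)
```

In this sketch a malformed alias already stored in an event still parses via from_string, while is_valid rejects the same string when it arrives as new input.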

        if "," in domain:
            raise SynapseError(
                400,
                "Invalid domain name for %s: %s" % (cls.__name__, domain),
                Codes.INVALID_PARAM,
            )

        # This code will need changing if we want to support multiple domain
        # names on one HS
        return cls(localpart=parts[0], domain=domain)
10 changes: 10 additions & 0 deletions tests/test_types.py
@@ -58,6 +58,16 @@ def test_build(self):

        self.assertEquals(room.to_string(), "#channel:my.domain")

    def test_validate(self):
        id_string = "#test:domain,test"

        try:
            RoomAlias.from_string(id_string)
            self.fail("Parsing '%s' should raise exception" % id_string)
        except SynapseError as exc:
            self.assertEqual(400, exc.code)
            self.assertEqual("M_INVALID_PARAM", exc.errcode)


class GroupIDTestCase(unittest.TestCase):
    def test_parse(self):