Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Socket endpoints, in particular reading them from a String #464

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

infinity0
Copy link
Contributor

This comes from the main part of #428, except I've made the following changes:

  • renamed the concept to "endpoint" which hopefully is less confusing than "name"
  • added some tests & usage examples

I did not implement the read functions as the Read instance because it doesn't follow the standard Read contract. [*]

In #428 the comments were that the code belongs in a more "high-level" library, however I hope the test cases & usage examples help to convince you that it is better in this library. I can't justify creating a separate package on hackage just for these few extra functions, and they would be out-of-place in a more specialised library (e.g. a http / web / p2p library).

The main entry point is socketFromEndpoint, which is meant as a convenience utility, replacing the need to call combinations of 3-4 other functions - similar to how openSocket was added to this library in the meantime (5b09871, 2 weeks ago). In my opinion, saying this is "too high-level" is similar to saying openSocket is "too high-level".

readSockEndpoint is useful for CLI parsers or other UI-related things - the user can specify an endpoint easily as a string, and you can pass it to some CLI framework like optparse-applicative and have the validation & parsing done automatically, your code can then directly work with a SockEndpoint object.

[*] I have another commit, not part of this PR, that adds support for reading SockAddr as well, and then has separate deriving (Show, Read) instances, which I can file afterwards if this PR is accepted.

@kazu-yamamoto
Copy link
Collaborator

First minor things:

  • Please use return instead of pure
  • Please use HostName instead of String

@kazu-yamamoto
Copy link
Collaborator

Major things: From test examples, the essential port is parsing strings such as "host:port". Personally, I support to add such a parser to combine getAddrInfo.

However, I don't want to add a new intermediate data type, SockEndpoint. At least, I don't want to export the internal of SockEndpoint. This is similar to Network.PortID. We had hard experience to remove it.

I'm also the maintainer of network-run. If you add this kind of feature to network-run, I'd love to accept.

@infinity0
Copy link
Contributor Author

Please use return instead of pure

Sorry, I will do this, this is just an automatic habit for me. My understanding is that newer Haskell code prefers pure due to it being cleaner conceptually - associated with Applicative a less restrictive concept than Monad. Is there any plan for network to switch?

Personally, I support to add such a parser to combine getAddrInfo.

I'm not sure what you mean by "combine". In this PR, getAddrInfo is only used as a shortcut to parse a numeric IP address using AI_NUMERICHOST. Otherwise it will do a DNS lookup, which IMO is inappropriate for a parser to do.

This is similar to Network.PortID. We had hard experience to remove it.

Taking a look at https://hackage.haskell.org/package/network-2.6.0.0/docs/src/Network.html#PortID - it seems that this was an inappropriate abstraction, combing a hostless portnumber with a unix domain socket path. I agree this doesn't make much sense. However, a previous bad experience with an inappropriate abstraction, should not prejudice you against a better abstraction.

I'm also the maintainer of network-run. If you add this kind of feature to network-run, I'd love to accept.

My motivation is to develop appropriate abstractions. With a single function socketFromEndpoint, it encapsulates everything you could want to do with addresses. Then, the caller can do further things with a socket - set socket options, bind or connect, anything they want as they wish. OTOH, with network-run it does not let the caller do everything with addresses (e.g. it does not support unix-domain sockets), yet it also does not let the caller do everything with sockets either (e.g. it sets options, and makes specific choices like listen sock 1024). So if this PR is rejected in the end, I would rather just create a separate package instead of merging it with network-run.

@infinity0
Copy link
Contributor Author

Please use return instead of pure

Sorry, I will do this, this is just an automatic habit for me. My understanding is that newer Haskell code prefers pure due to it being cleaner conceptually - associated with Applicative a less restrictive concept than Monad. Is there any plan for network to switch?

Done in 65b9b0b. BTW, some other uses of pure slipped into the current code in the meantime, in getAddrInfo and instance Read SocketType.

@infinity0
Copy link
Contributor Author

Please use HostName instead of String

Done in 1c11e49, it requires moving the aliases from Info into Types. Since PortNumber is already in Types, I think the new location is fine (and even better).

@kazu-yamamoto
Copy link
Collaborator

BTW, some other uses of pure slipped into the current code in the meantime, in getAddrInfo and instance Read SocketType.

Than you for pointing out. I have fixed them.

@kazu-yamamoto
Copy link
Collaborator

Sorry, I will do this, this is just an automatic habit for me. My understanding is that newer Haskell code prefers pure due to it being cleaner conceptually - associated with Applicative a less restrictive concept than Monad. Is there any plan for network to switch?

Maybe in version 4 (or 3.2). I think that we will have version 4 for #364. We are planning to release 3.1.2 with the current master.

@kazu-yamamoto
Copy link
Collaborator

I'm not sure what you mean by "combine". In this PR, getAddrInfo is only used as a shortcut to parse a numeric IP address using AI_NUMERICHOST. Otherwise it will do a DNS lookup, which IMO is inappropriate for a parser to do.

If we have:

parseEndpoint :: String -> (HostName, ServiceName)

we can easily pass the result to getAddrInfo.

@kazu-yamamoto
Copy link
Collaborator

@infinity0 What is your endpoint based on? RFC3986?

@infinity0
Copy link
Contributor Author

infinity0 commented Jun 2, 2020

@infinity0 What is your endpoint based on? RFC3986?

The concept fits parts of the ideas in that RFC, but it originates more directly from my years of experience programming a wide variety of real-world networking programs - client/server, different types of proxies, peer-to-peer nodes, etc etc, typically with strong security requirements.

I've seen many examples of this type of low-level networking code that does essentially the same thing but written slightly differently using whatever language's API that mirrors the C API, and they all support slightly different things because everyone solves their own problem in their own specific non-general way. Then you have to file a specific PR to add unix socket support, or another specific PR to add support for localhost, or another specific PR to add support for domain names, or another specific PR to support arbitrary port numbers, or another PR to support IPv6, etc etc etc. This all stems from the fact that the C API exposes the wrong abstractions (for modern usage, at least) - it exposes internal implementations, rather than thinking about the external abstractions.

To put it simply, there are two sources of information in opening a network link - (1) user-supplied info i.e. the address, which can vary arbitrarily, and (2) programmer-supplied parameters, which the user cannot vary. (It is highly unusual for a user to decide whether they want to run something as TCP or UDP for example).

Endpoint expresses (1), and expresses it fully [*]. Then the functions socketFromEndpoint captures part of (2), and the remaining socket-related traditional functions such as listen/connect/setSockOption etc capture the remainder of (2).

So if a programmer uses Endpoint, this is a uniform API consisting of one datatype and 2 functions (readEndpoint / socketFromEndpoint), you can only call it one way, you only need to call it one way, and it will enable your networking application to support any address the user can possibly want to use, in as general of a way as possible.

[*] OK, there is one minor thing missing, it doesn't support ServiceName right now, but I can add that to this PR if needed, simply by relaxing PortNumber to Either ServiceName PortNumber or equivalent.

…n of them

Useful for 3rd-party networking applications that might want to pass around
service specifiers without worrying whether these are IP addresses, DNS names,
or UNIX-domain socket paths.

Previously, there was no data type to encapsulate these options together. In
particular, getAddrInfo had to be used to resolve DNS names into a SockAddr
before calling connect/bind, but it could not deal with UNIX domain sockets.
The new function sockNameToAddr takes this role, transparently converting DNS
names and passing through non-DNS-names unaltered, so that it can be used
uniformly without worrying about the specific type of input name/address.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants