Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ACP(core::net): add Ipv[4|6]Address::from_octets and Ipv6Address::from_segments #447

Closed
thvdveld opened this issue Sep 25, 2024 · 19 comments
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@thvdveld
Copy link

thvdveld commented Sep 25, 2024

Proposal: Add Ipv[4|6]Address::from_octets and Ipv6Address::from_segments

Problem statement

It is common to convert &[u8] to an Ipv[4|6]Addr. This can currently be achieved using:

  • impl From<[u8; 16]> for Ipv6Addr
  • impl From<[u16; 8]> for Ipv6Addr
  • impl From<[u8; 4]> for Ipv4Addr

which requires us to write:

Ipv4Addr::from(<[u8;4]>::try_from(x).unwrap())

or

Ipv6Addr::from(<[u8;16]>::try_from(x).unwrap())

This is not convenient as the type needs to be implicitly written.

Motivating examples or use cases

Packets are usually just &[u8], which a TCP/IP stack needs to parse. Indexing the fields return either a u8 or a &[u8] when the field is one octet or multiple octets, respectively. Creating an IP address therefore requires us to parse a &[u8] into an Ipv4Addr or Ipv6Addr.

To achieve this in C#, the public IPAddress (byte[] address) can be used. When the address contains 4 elements, an IPv4 address is constructed. For other cases, an IPv6 address is constructed.

In go, an address can also be constructed by a byte array, without specifying the length of the array.

addr := net.IP([]byte{192, 168, 1, 1})

Python also allows creating an address by passing a bytes object.

smoltcp is a TCP/IP stack, which contains custom Ipv[4|6]Address types, instead of using core::net::Ipv[4|6]Addr. The network stack parses IPv6 address like this:

/// Return the source address field.
#[inline]
pub fn src_addr(&self) -> Address {
    let data = self.buffer.as_ref();
    Address::from_bytes(&data[field::SRC_ADDR])
}

This ACP seeks to add functions which allow creating addresses from a &[u8], where the compiler can infer the type when calling try_into().

Solution sketch

By adding the following functions:

pub const fn from_octets(octets: [u8; 4]) -> Ipv4Addr {
    Ipv4Addr { octets }
}

pub const fn from_octets(octets: [u8; 16]) -> Ipv6Addr {
    Ipv6Addr { octets }
}

pub const fn from_segments(segments: [u16; 8]) -> Ipv6Addr {
    let [a, b, c, d, e, f, g, h] = segments;
    Ipv6Addr::new(a, b, c, d, e, f, g, h)
}

the compiler can infer the type for try_into(), which was previously not possible with the From implementations

let octets: &[u8] = &[127, 0, 0, 1][..];
Ipv4Address::from_octets(<[u8; 4]>::try_into(octets).unwrap());

can now be written as

let octets: &[u8] = &[127, 0, 0, 1][..];
Ipv4Address::from_octets(octets.try_into().unwrap());

These functions are consistent with the Ipv4Addr::octets(&self) -> [u8; 4], Ipv6Addr::octets(&self) -> [u8; 16] and Ipv6Addr::segments(&self) -> [u16; 8] functions.

Another way of easily constructing an address from a &[u8], would be by implementing:

impl TryFrom<&[u8]> for Ipv4Addr
impl TryFrom<&[u8]> for Ipv6Addr

This would make it possible to construct an address as:

Ipv4Addr::try_from(some_slice).unwrap();
Ipv6Addr::try_from(some_slice).unwrap();

Alternatives

Instead of using try_into to convert &[u8] to [u8; 4] or [u8; 16], from_octets could just take a &[u8], and panic when the length is not 4 or 16 for Ipv4Addr and Ipv6Addr respectively.

Links and related work

rust-lang/rust#130629 already implements this ACP.

C# documentation: https://learn.microsoft.com/en-us/dotnet/api/system.net.ipaddress.-ctor?view=net-8.0#system-net-ipaddress-ctor(system-byte())

go documentation: https://pkg.go.dev/net#IP

Python documentation: https://docs.python.org/3/library/ipaddress.html#ipaddress.IPv4Address

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.
@thvdveld thvdveld added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Sep 25, 2024
@thvdveld thvdveld changed the title ACP(core::net): add Ipv[4|6]Address::from_octets and Ipv6Address::from_segments ACP(core::net): add Ipv[4|6]Address::from_octets and Ipv6Address::from_segments Sep 25, 2024
@scottmcm
Copy link
Member

If the goal is just to not write the length of the array, something like this works fine:

use std::net::Ipv4Addr;
fn demo(x: &[u8]) -> Ipv4Addr {
    let bytes = *x.first_chunk().unwrap();
    bytes.into()
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=6a61d31a38c421a4c372c5490329a513

Or if you want to check the slice size, something like

use std::net::Ipv4Addr;
fn demo2(x: &[u8]) -> Option<(Ipv4Addr, &[u8])> {
    let (bytes, rest) = x.split_first_chunk()?;
    Some(((*bytes).into(), rest))
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=4414f739381fad30fa3520ee7503feac

Basically, use a different way to get the array rather than .try_into().unwrap() and I think things are nicer, but maybe there's something I'm missing about why that's not ok. How did you get the correct-length slice in the first place?

@joshtriplett
Copy link
Member

joshtriplett commented Sep 25, 2024

I think the problem of handling a slice of bytes could be addressed better by impl TryFrom<&[u8]> for Ipv4Addr and impl TryFrom<&[u8]> for Ipv6Addr.

@thvdveld
Copy link
Author

How did you get the correct-length slice in the first place?

Most of the fields in IP packets have a fixed position and length. In smoltcp, the positions are represented as Fields, which are just Ranges. When parsing a packet, we first check if the length of the data we received is enough to contain a valid packet. Getting a field out of the packet is then just taking a subslice. For example, getting the source address:

&data[field::SRC_ADDR]

In smoltcp, we can then easily convert this subslice into an Ipv4Address (from smoltcp, not from core::net), doing Address::from_bytes(&data[field::SRC_ADDR]).

This is implemented as follows in smoltcp:

/// Construct an IPv4 address from a sequence of octets, in big-endian.
///
/// # Panics
/// The function panics if `data` is not four octets long.
pub fn from_bytes(data: &[u8]) -> Address {
    let mut bytes = [0; ADDR_SIZE];
    bytes.copy_from_slice(data);
    Address(bytes)
}

I find this method of creating an IP address in smoltcp very straightforward, which contrasts with the more complex approach currently required by the types in the core library.

However, I think the problem of handling a slice of bytes could be addressed better by impl TryFrom<&[u8]> for Ipv4Addr.

This would also be better to address the problem. I modified the ACP to also mention this.

@Dirbaio
Copy link

Dirbaio commented Sep 26, 2024

@scottmcm your suggested snippets behave a bit differently, they'll silently ignore slices longer than 4 bytes, while we'd like to panic in that case.

There's another argument in favor of adding these methods: consistency with other conversions, like for u32.

Currently the Ipv4Addr<->u32 conversions have all possible combinations implemented (method and From, in both directions):

  • from_bits()
  • to_bits()
  • impl From<u32> for Ipv4Addr
  • impl From<Ipv4Addr> for u32

but the Ipv4Addr<->[u8;4] conversions only have

  • octets()
  • impl From<[u8; 4]> for Ipv4Addr

Why is it like that? Why is one direction only available with a method, why is the other direction only available with a From impl? IMO it'd make a lot of sense to add the missing conversions:

  • from_octets()
  • impl From<Ipv4Addr> for [u8;4]

(and same for ipv6 octets and segments)

@scottmcm
Copy link
Member

scottmcm commented Sep 26, 2024

So I'm not at all opposed to the clearly-named methods for this that take arrays by value. That seems entirely reasonable, and array-of-bytes and IPv4 aren't exactly value-preserving in the https://doc.rust-lang.org/std/convert/trait.From.html#when-to-implement-from sense, so using From is a bit iffy for it anyway, though not really wrong either.

What I mean is that the slice part of this feels like it wants something different. You're only really having that problem because the &data[field::SRC_ADDR] is giving you a &[u8] even though you have all the information you need to get a &[u8; N] from it.

If you could do data.const_index::<SRC_ADDR>() or data.const_index_offset_len::<12, 4>(), for example, then you'd be able to preserve the length information that you already have from using the constant range anyway, and wouldn't need the try_+unwrap dance at all.

TL/DR: let's add a way that means you don't need to .try_into().unwrap() at all, rather than just avoiding type annotations on it. (And from_octets+from_segments can be part of that solution.)


your suggested snippets behave a bit differently, they'll silently ignore slices longer than 4 bytes, while we'd like to panic in that case.

There's of course various other phrasings of the same thing that do that, like

use std::net::Ipv4Addr;
fn demo(x: &[u8]) -> Ipv4Addr {
    let Some((bytes, [])) = x.split_first_chunk() else { panic!() };
    From::from(*bytes)
}

@Dirbaio
Copy link

Dirbaio commented Sep 26, 2024

If you could do data.const_index::<SRC_ADDR>() or data.const_index_offset_len::<12, 4>(), for example, then you'd be able to preserve the length information that you already have from using the constant range anyway, and wouldn't need the try_+unwrap dance at all.

TL/DR: let's add a way that means you don't need to .try_into().unwrap() at all, rather than just avoiding type annotations on it. (And from_octets+from_segments can be part of that solution.)

that would be awesome, but such a thing doesn't exist today in Rust and adding it is a much larger undertaking than adding a few methods to Ip*Addr. I would appreciate if this ACP wasn't blocked on that.

@programmerjake
Copy link
Member

&data[field::SRC_ADDR]

you could use something like:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=de44f37a6fd78bc5572a974cb8ea8960

use std::ops::Index;
pub struct MyField;

impl Index<MyField> for [u8] {
    type Output = [u8; 4];
    fn index(&self, _i: MyField) -> &Self::Output {
        (&self[8..12]).try_into().unwrap()
    }
}

@scottmcm
Copy link
Member

I would appreciate if this ACP wasn't blocked on that.

Well I bring it up because of other posts here mentioning impl TryFrom<&[u8]> for Ipv4Addr, for example.

Adding

pub const fn from_octets(octets: [u8; 4]) -> Ipv4Addr;
pub const fn from_segments(segments: [u16; 8]) -> Ipv6Addr;

seems entirely fine to me, I agree.

I'm just much less convinced that anything dealing in unknown-length slices is solving the right problem, because that problem should be solved at the slicing level instead IMHO, rather than at the every-type-constructible-from-an-array level.

@scottmcm
Copy link
Member

@programmerjake including generalizing it to something like

struct Field<const OFFSET: usize, const LEN: usize>;
const SRC_ADDR: Field<12, 4> = Field;
const DST_ADDR: Field<16, 4> = Field;

impl<T, const OFFSET: usize, const LEN: usize> Index<Field<OFFSET, LEN>> for [T] {
    type Output = [T; LEN];
    fn index(&self, Field: Field<OFFSET, LEN>) -> &Self::Output {
        (&self[OFFSET..][..LEN]).try_into().unwrap()
    }
}

fn packet_src_dst_addrs(packet: &[u8]) -> (Ipv4Addr, Ipv4Addr) {
    (packet[SRC_ADDR].into(), packet[DST_ADDR].into())
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=be3bbe2a37010bc6be86522e00b4e301

@programmerjake
Copy link
Member

tbh this makes me think that Rust needs some length-based range types, kinda like verilog's +: operator: value[start +: len]
https://electronics.stackexchange.com/questions/74277/what-is-the-operator-called-in-verilog/74285#74285

@scottmcm
Copy link
Member

scottmcm commented Sep 26, 2024

@programmerjake I only wrote it that way because I can't do

type Output = [T; END - BEGIN];

in Rust today :P

(Given the choice, I'd absolutely use a half-open range here too.)

@programmerjake
Copy link
Member

tbh this makes me think that Rust needs some length-based range types,

I guess they still wouldn't do what we want, which is: let v: [u8; 5] = v[3 +: 5] since the len is still passed in as a runtime argument and not a const generic.

@the8472
Copy link
Member

the8472 commented Sep 30, 2024

What I mean is that the slice part of this feels like it wants something different.

The way this is usually done in Java and Node is to use a buffer object with a cursor and then read things field by field into local variables and then process those variables further.

It could look like this

fn parse_header(buf: &[u8]) -> Option<Header> {
    let buf = Buffer::new(buf, Endianness::Big);
    let header = Header::default();
    header.version = buf.read_u8()?;
    header.flags = buf.read_u32()?;
    header.src_ip6 = buf::read_chunk::<16>()?.into();
    header.dst_ip6 = buf::read_chunk::<16>()?.into();
    Some(header)
}

this could even be added to io::Cursor, at least for reading, it would be less appropriate for writing.

@joshtriplett
Copy link
Member

We discussed this in today's @rust-lang/libs-api meeting.

We were in favor of adding the proposed from_octets/from_segments.

Some of us were in favor of adding the direct TryFrom<&[u8]> impls as well. That one will need an FCP to see if we have team consensus to add it. Please feel free to propose it, though, and we can start an FCP.

@Dirbaio
Copy link

Dirbaio commented Oct 1, 2024

i'm not convinced about TryFrom<&[u8]>, i'd prefer to FCP just from_octets/from_segments, would that be possible?

@cuviper
Copy link
Member

cuviper commented Oct 1, 2024

New methods can be added #[unstable] without FCP yet; just the trait impl needs that right away as it's instantly stable.

@scottmcm
Copy link
Member

scottmcm commented Oct 1, 2024

They should be separate PRs regardless -- the new methods can be added in nightly as soon as a reviewer r+s them, whereas a new trait impl needs a 10-day FCP, so procedurally they ought to be split even if they both might end up landing.

@Dirbaio
Copy link

Dirbaio commented Oct 6, 2024

ping! so what's the next steps here? someone from T-libs accepts this ACP?

@Amanieu
Copy link
Member

Amanieu commented Oct 7, 2024

The next step is to submit a PR to add from_octects/from_segments. This ACP is considered accepted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests

8 participants