Feature Request: MacAddress validation to support packed input types #108

Open
mcdiarmid opened this issue Dec 8, 2023 · 0 comments

Support Casting/Conversion of More Input Types

One of the awesome things about pydantic is its ability to cast or convert input values whose type differs from the field type. For example, if my field is of type datetime and I pass in a float/int, it will be treated as a Unix timestamp and converted to a datetime accordingly. Similarly, if a timestamp string such as 2023-09-22T12:30:01Z is passed to this field, it will also be converted to a datetime.

Currently the MacAddress._validate class method only supports inputs of type str with length 14.

def _validate(cls, __input_value: str, _: Any) -> str:
    return cls.validate_mac_address(__input_value.encode())

In practice, MAC addresses are often represented as a sequence of 6 bytes in packet headers. In my case, I've written some code to unpack the header of a Layer 2 Ethernet frame (https://en.wikipedia.org/wiki/Ethernet_frame#Structure). However, I must first convert mac_destination and mac_source to strings before constructing my model.

import struct
from enum import IntEnum
from typing import Final

from pydantic import BaseModel
from pydantic_extra_types.mac_address import MacAddress


class EtherType(IntEnum):
    IPV4 = 0x0800
    IPV6 = 0x86DD

class Layer2EthernetHeader(BaseModel):
    mac_dst: MacAddress
    mac_src: MacAddress
    ethertype: EtherType
    size_t: Final[int] = 14


def mac_str(mac_bytes: bytes) -> str:
    return ":".join(f"{b:02x}" for b in mac_bytes)


def decode_layer2_ethernet_header(data: bytes, index: int = 0) -> Layer2EthernetHeader:
    # Read the 14-byte header starting at `index`: two 6-byte MACs and a big-endian uint16.
    mac_destination, mac_source, ethertype = struct.unpack_from(">6s6sH", data, index)
    return Layer2EthernetHeader(
        mac_dst=mac_str(mac_destination),
        mac_src=mac_str(mac_source),
        ethertype=ethertype,
    )
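
To make the extra conversion step concrete, here is a small stdlib-only sketch of what the unpacking yields for a made-up 14-byte header. The sample addresses and the helper name `unpack_ethernet_header` are assumptions for illustration only and are not part of pydantic or pydantic-extra-types.

```python
import struct

# Hypothetical sample header: destination MAC, source MAC, EtherType (IPv4).
SAMPLE_HEADER = bytes.fromhex("ffffffffffff" "001122aabbcc" "0800")


def unpack_ethernet_header(data: bytes, index: int = 0):
    """Return (dst, src, ethertype) from the 14-byte header starting at index."""
    return struct.unpack_from(">6s6sH", data, index)


dst, src, ethertype = unpack_ethernet_header(SAMPLE_HEADER)

# dst and src are 6-byte `bytes` objects, so each one still has to be turned
# into a colon-separated string before it can be passed to a str-only validator.
dst_text = ":".join(f"{b:02x}" for b in dst)
src_text = ":".join(f"{b:02x}" for b in src)
```

The two `join` lines are exactly the boilerplate this request would like to see absorbed into validation.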

I propose that during validation, MacAddress checks for non-str input types and handles them accordingly, specifically iterable types of length 6 (bytes, bytearray, List[int], NDArray[int], ...). Below is some code that could accomplish this (I have also added a conversion from int, but this might not be an appropriate representation of a MAC address):

    def _validate(cls, __input_value: Union[str, int, Sequence[int]], _: Any) -> str:
        if isinstance(__input_value, int):
            # Most significant byte first, so the octets come out in written order.
            __input_value = [0xff & (__input_value >> (i*8)) for i in reversed(range(6))]

        if not isinstance(__input_value, str) and len(__input_value) == 6:
            __input_value = ":".join(f"{b:02x}" for b in __input_value)
        elif not isinstance(__input_value, str):
            raise TypeError(
                f"Input must be str of length 14, or Sequence[int] of length 6.  Got: {__input_value}."
            )
        return cls.validate_mac_address(__input_value.encode())
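
For anyone who wants to try the idea outside the library first, the same normalization can be sketched as a standalone, stdlib-only function. The name `normalize_mac_input` and the choice of big-endian order for ints are my assumptions, not existing pydantic-extra-types behaviour; str inputs are passed through untouched so the existing validator would still check them.

```python
import operator
from typing import Sequence, Union


def normalize_mac_input(value: Union[str, int, Sequence[int]]) -> str:
    """Convert a packed MAC address to colon-separated hex; pass str through."""
    if isinstance(value, str):
        return value
    if isinstance(value, bool):
        # bool is a subclass of int, but is never a meaningful MAC address.
        raise TypeError("bool is not a valid MAC address input")
    if isinstance(value, int):
        if not 0 <= value < (1 << 48):
            raise ValueError(f"int out of range for a 48-bit MAC address: {value}")
        value = value.to_bytes(6, "big")  # assumption: most significant byte first
    # operator.index accepts any object implementing __index__ (for example
    # NumPy integer scalars) and raises TypeError for floats or strings.
    octets = [operator.index(o) for o in value]
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}")
    if any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"each octet must be in 0..255: {octets}")
    return ":".join(f"{o:02x}" for o in octets)
```

Compared with the snippet above, this version also rejects out-of-range octets and ints wider than 48 bits instead of silently masking them.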

Furthermore, for IP addresses pydantic uses the standard library's ipaddress implementation, which stores the IP address as an int internally but presents the human-readable format via the __str__ method. Would it make sense to store MAC addresses in a Sequence[int] format behind the scenes, and implement the human-readable, colon-separated form as __str__?
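
As a rough sketch of what that could look like, here is a minimal stdlib-only class that keeps the 6 raw bytes internally and only renders the colon-separated form in `__str__`. The class name `PackedMac` and its `packed` property are hypothetical (the property name is borrowed from `ipaddress.IPv4Address.packed`); nothing like this exists in pydantic-extra-types as far as I know.

```python
class PackedMac:
    """Hypothetical MAC address type that stores 6 raw bytes internally."""

    __slots__ = ("_packed",)

    def __init__(self, packed) -> None:
        if isinstance(packed, int):
            # bytes(6) would silently produce six zero bytes, so reject ints here.
            raise TypeError("expected a bytes-like object or an iterable of ints")
        packed = bytes(packed)
        if len(packed) != 6:
            raise ValueError(f"expected 6 bytes, got {len(packed)}")
        self._packed = packed

    @property
    def packed(self) -> bytes:
        """The address as 6 raw bytes, in transmission order."""
        return self._packed

    def __int__(self) -> int:
        return int.from_bytes(self._packed, "big")

    def __str__(self) -> str:
        return ":".join(f"{b:02x}" for b in self._packed)

    def __repr__(self) -> str:
        return f"PackedMac({str(self)!r})"

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, PackedMac):
            return NotImplemented
        return self._packed == other._packed

    def __hash__(self) -> int:
        return hash(self._packed)
```

With this shape, `str(PackedMac(raw))` gives the familiar text form while `PackedMac(raw).packed` can be written straight back into a frame.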

Edits: Typos.
