
Document when to use Union[str, unicode] vs AnyStr #871

Closed
@JukkaL

This comes up pretty often, and it would be useful to have some documentation in the CONTRIBUTING.md file (or at least a link to this documentation).

Here's a starting point for what to document (there are probably other things that we should mention):

  • If a function accepts both str and ASCII-only unicode arguments, the best type to use is usually Union[str, unicode] (or Union[str, Text] in a 2and3 stub).
  • Use AnyStr if you have two or more types in a signature that must agree on whether they are str or unicode. (It would also be nice to give an example where this is important.)
  • You can also use AnyStr in invariant positions in generic type arguments. For example, List[AnyStr] is generally better than List[Union[str, unicode]] (also explain why). However, it's often even better to use a covariant type such as Iterable or Sequence. With a covariant type, the union variant is preferable if the container may hold a mix of str and unicode values; for example, Iterable[Union[str, unicode]] is fine in that case.
  • Try to avoid using Union[str, unicode] in a return type, since it means that every call site has to deal with both str and unicode values. It may be fine to use it if the return type is sufficiently unpredictable.
  • Similarly, try to avoid using Union[str, unicode] as an attribute type, since code using the attribute would again have to deal with both str and unicode values.
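
As a rough illustration of the first three points, here is a minimal sketch of signatures in the style of a 2and3 stub. All function names are hypothetical and made up for this example. It uses Text in place of unicode so that it also runs on Python 3, where Text is simply str and AnyStr ranges over str and bytes. The comment on List invariance is the likely answer to the "explain why" note above.

```python
from typing import AnyStr, Iterable, List, Text, Union


def is_ascii_keyword(name: Union[str, Text]) -> bool:
    # A single string argument whose type does not affect the result:
    # a plain union is enough here.
    return name in ("if", "else", "while")


def concat(first: AnyStr, second: AnyStr) -> AnyStr:
    # Both arguments and the return value must be the same string type.
    # With Union[str, Text] a checker would accept mixed calls and could
    # not tell the caller which type comes back.
    return first + second


def first_item(items: List[AnyStr]) -> AnyStr:
    # List is invariant: when checked as Python 2, a List[str] argument is
    # not accepted where List[Union[str, unicode]] is expected, but it is
    # accepted for List[AnyStr].
    return items[0]


def total_length(parts: Iterable[Union[str, Text]]) -> int:
    # Iterable is covariant, so an Iterable[str], an Iterable[Text], or an
    # iterable mixing both kinds of values is accepted.
    return sum(len(part) for part in parts)
```

With these signatures, a call such as concat("a", "b") is inferred to return the same string type as its arguments, so the call site never has to handle a union.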
