Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support field docstrings for dataclasses that use __slots__ #113878

Closed
rhettinger opened this issue Jan 9, 2024 · 13 comments
Closed

Support field docstrings for dataclasses that use __slots__ #113878

rhettinger opened this issue Jan 9, 2024 · 13 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@rhettinger
Copy link
Contributor

rhettinger commented Jan 9, 2024

Feature or enhancement

Proposal:

A really nice __slots__ feature is the ability to provide a data dictionary in the form of docstrings. That is also available for property objects. Consider this example:

class Rocket:
    __slots__ = {
        'velocity': 'Speed relative to Earth in meters per second',
        'mass': 'Gross weight in kilograms',
    }
    @property
    def kinetic_energy(self):
        'Kinetic energy in Newtons'
        return self.mass * self.velocity ** 2

Running help(Rocket) shows all the field docstrings which is super helpful:

class Rocket(builtins.object)
 |  Readonly properties defined here:
 |
 |  kinetic_energy
 |      Kinetic energy in Newtons
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  mass
 |      Gross weight in kilograms
 |
 |  velocity
 |      Speed relative to Earth in meters per second

It would be nice is the same can be done with dataclasses that define __slots__:

@dataclass(slots=True)
class Rocket:
    velocity: float = field(doc='Speed relative to Earth in meters per second')
    weight: float = field(doc='Gross weight in kilograms')

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

Linked PRs

@rhettinger rhettinger added the type-feature A feature request or enhancement label Jan 9, 2024
@AlexWaygood AlexWaygood added the stdlib Python modules in the Lib dir label Jan 9, 2024
@sobolevn
Copy link
Member

The change is quite simple, it results in:

import dataclasses

@dataclasses.dataclass(slots=True)
class A:
    a: int
    b: int = dataclasses.field(doc='bb')

print(help(A))
Снимок экрана 2024-01-10 в 11 41 17

The change:

diff --git Lib/dataclasses.py Lib/dataclasses.py
index 2fba32b5ffb..356c3df5874 100644
--- Lib/dataclasses.py
+++ Lib/dataclasses.py
@@ -303,11 +303,12 @@ class Field:
                  'compare',
                  'metadata',
                  'kw_only',
+                 'doc',
                  '_field_type',  # Private: not to be used by user code.
                  )
 
     def __init__(self, default, default_factory, init, repr, hash, compare,
-                 metadata, kw_only):
+                 metadata, kw_only, doc):
         self.name = None
         self.type = None
         self.default = default
@@ -320,6 +321,7 @@ def __init__(self, default, default_factory, init, repr, hash, compare,
                          if metadata is None else
                          types.MappingProxyType(metadata))
         self.kw_only = kw_only
+        self.doc = doc
         self._field_type = None
 
     @_recursive_repr
@@ -335,6 +337,7 @@ def __repr__(self):
                 f'compare={self.compare!r},'
                 f'metadata={self.metadata!r},'
                 f'kw_only={self.kw_only!r},'
+                f'doc={self.doc!r},'
                 f'_field_type={self._field_type}'
                 ')')
 
@@ -402,7 +405,7 @@ def __repr__(self):
 # so that a type checker can be told (via overloads) that this is a
 # function whose type depends on its parameters.
 def field(*, default=MISSING, default_factory=MISSING, init=True, repr=True,
-          hash=None, compare=True, metadata=None, kw_only=MISSING):
+          hash=None, compare=True, metadata=None, kw_only=MISSING, doc=None):
     """Return an object to identify dataclass fields.
 
     default is the default value of the field.  default_factory is a
@@ -414,7 +417,7 @@ def field(*, default=MISSING, default_factory=MISSING, init=True, repr=True,
     comparison functions.  metadata, if specified, must be a mapping
     which is stored but not otherwise examined by dataclass.  If kw_only
     is true, the field will become a keyword-only parameter to
-    __init__().
+    __init__().  doc is a docstring for this field.
 
     It is an error to specify both default and default_factory.
     """
@@ -422,7 +425,7 @@ def field(*, default=MISSING, default_factory=MISSING, init=True, repr=True,
     if default is not MISSING and default_factory is not MISSING:
         raise ValueError('cannot specify both default and default_factory')
     return Field(default, default_factory, init, repr, hash, compare,
-                 metadata, kw_only)
+                 metadata, kw_only, doc)
 
 
 def _fields_in_init_order(fields):
@@ -1156,7 +1159,7 @@ def _process_class(cls, init, repr, eq, order, unsafe_hash, frozen,
     if weakref_slot and not slots:
         raise TypeError('weakref_slot is True but slots is False')
     if slots:
-        cls = _add_slots(cls, frozen, weakref_slot)
+        cls = _add_slots(cls, frozen, weakref_slot, fields)
 
     abc.update_abstractmethods(cls)
 
@@ -1191,7 +1194,7 @@ def _get_slots(cls):
             raise TypeError(f"Slots of '{cls.__name__}' cannot be determined")
 
 
-def _add_slots(cls, is_frozen, weakref_slot):
+def _add_slots(cls, is_frozen, weakref_slot, defined_fields):
     # Need to create a new class, since we can't set __slots__
     #  after a class has been created.
 
@@ -1208,16 +1211,17 @@ def _add_slots(cls, is_frozen, weakref_slot):
     )
     # The slots for our class.  Remove slots from our base classes.  Add
     # '__weakref__' if weakref_slot was given, unless it is already present.
-    cls_dict["__slots__"] = tuple(
-        itertools.filterfalse(
+    cls_dict["__slots__"] = {
+        slot: getattr(defined_fields.get(slot), 'doc', None)
+        for slot in itertools.filterfalse(
             inherited_slots.__contains__,
             itertools.chain(
                 # gh-93521: '__weakref__' also needs to be filtered out if
                 # already present in inherited_slots
                 field_names, ('__weakref__',) if weakref_slot else ()
             )
-        ),
-    )
+        )
+    }
 
     for field_name in field_names:
         # Remove our attributes, if present. They'll still be

The only concern I have is that we change how __slots__ is defined: instead of tuple it would be a dict. I think that it should be fine, since it is a part of the language. But, some people might expect it to be only a tuple.

If others are fine with the change, I will add tests with docs and send a PR.

@ericvsmith
Copy link
Member

I think this change is okay. I don't think it would be reasonable for anyone to assume that dataclasses' __slots__ would always be a tuple.

That said, the documentation at https://docs.python.org/3/reference/datamodel.html says of __slots__: "This class variable can be assigned a string, iterable, or sequence of strings with variable names used by instances.". Perhaps that needs to be updated for the dict case?

@rhettinger
Copy link
Contributor Author

It would be nice to have the dict case specifically called out in the data model docs. Right now, all we have is the announcement in the 3.8 whatsnew.

@ericvsmith
Copy link
Member

Created #114044 for the documentation issue.

@ericvsmith
Copy link
Member

One question I have about this feature: should it be an error to specify doc if not using __slots__? It will be available in the Field object, so I guess that's of some utility. I guess if we got super excited about it, we could teach inspect about Field.doc if the class is a dataclass and __slots__ isn't a dict. Or maybe even if it is a dict, then we could skip changing __slots__ to a dict for dataclasses.

@sobolevn
Copy link
Member

sobolevn commented Jan 13, 2024

One question I have about this feature: should it be an error to specify doc if not using slots?

I think it should be ok, some other tool might also use this doc= field.

I guess if we got super excited about it, we could teach inspect about Field.doc if the class is a dataclass and slots isn't a dict. Or maybe even if it is a dict, then we could skip changing slots to a dict for dataclasses.

As a separate feature, I guess :)
Right now I don't understand the whole idea.

@rhettinger
Copy link
Contributor Author

One question I have about this feature: should it be an error to specify doc if not using slots? It will be available in the Field object, so I guess that's of some utility.

That makes sense to me. In this regard, it would be a bit like the metadata field but for a dedicated purpose.

@ericvsmith
Copy link
Member

Right now I don't understand the whole idea.

I'm not sure this answers your question, @sobolevn, but:

I mean that inspect.getdoc() could contain logic that says "if this is a dataclass, then show the documentation of the fields, using the Field.doc values, if they exist". It doesn't really fit into the "Data descriptors defined here" section, but we should be able to come up with something.

I'll grant you that this is an unrelated issue.

@ericvsmith
Copy link
Member

My typical questions when adding a feature to dataclasses are:

  • Does attrs do anything similar? If so, we probably want to do it in a similar way.
  • Is there any impact on typing.dataclass_transform (PEP 681)?

@sobolevn
Copy link
Member

I don't think attrs has this feature:

It does not impact dataclass_transform, because this is about field(), not about dataclass()

@kwollaston
Copy link

Hi, is there any update on this? I am working on a configuration module that needs to be able to read default values from the dataclass and I'd like to be able to set the docstring for the field as well.

@sobolevn
Copy link
Member

sobolevn commented Aug 1, 2024

I am waiting on @ericvsmith's review :)

ericvsmith pushed a commit that referenced this issue Sep 27, 2024
If using `slots=True`, the `doc` parameter ends up in the `__slots__` dict. The `doc` parameter is also in the corresponding `Field` object.
@ericvsmith
Copy link
Member

Merged in #114051. Thanks, @sobolevn!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

5 participants