Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-71587: Establish global state in _datetime #110475

Conversation

erlend-aasland
Copy link
Contributor

@erlend-aasland erlend-aasland commented Oct 6, 2023

  • Establish global state
  • Put singletons in global state

@erlend-aasland erlend-aasland changed the title gh-78469: Establish global state in _datetime gh-71587: Establish global state in _datetime Oct 6, 2023
Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe with shorter names, you can just write get_datetime_state()->utc directly (for example).

Modules/_datetimemodule.c Outdated Show resolved Hide resolved
Modules/_datetimemodule.c Outdated Show resolved Hide resolved
Modules/_datetimemodule.c Outdated Show resolved Hide resolved
Modules/_datetimemodule.c Outdated Show resolved Hide resolved
@erlend-aasland
Copy link
Contributor Author

PTAL, @vstinner

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If PyInit__datetime() fails, I suppose that the next call may override global state without clearing the previous strong refererence: reference leak. I don't think that we should bother since this state is going to become more dynamic and set differently, no?

Modules/_datetimemodule.c Outdated Show resolved Hide resolved
Modules/_datetimemodule.c Outdated Show resolved Hide resolved
Modules/_datetimemodule.c Outdated Show resolved Hide resolved
@erlend-aasland
Copy link
Contributor Author

erlend-aasland commented Oct 11, 2023

If PyInit__datetime() fails, I suppose that the next call may override global state without clearing the previous strong refererence: reference leak. I don't think that we should bother since this state is going to become more dynamic and set differently, no?

Well, it is easy to already now add the machinery needed to clear the state (future m_clear function). It's easy to add such a function now, and call it in the PyInit__datetime() error path. It will be an improvement over status quo where global state is overwritten anyway.

Note that parts of the state actually is cleared if the module exec function fails.

@erlend-aasland
Copy link
Contributor Author

Perhaps we should start with a PR that cleans up _datetime init code?

if (x == NULL) {
return -1;
if (st->utc == NULL) {
goto error;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently, now exec function has two different parts:

  • First part uses return -1
  • Second part initializes the static state and uses goto to make sure that the state always stays in a consistent state: either fully initialized, or cleared

I suggest to split the function in two parts (create a subfunction). I don't think that a second PR is needed.

Maybe just add a second _datetime_init_state() function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can bake it into this PR. I'll get to it later tonight.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some helpers in a9e8d1a (init_state and create_timezone_from_delta). Initialisation should now be slightly more robust than before, unless I missed something :)

Comment on lines +6771 to +6774
st->seconds_per_day = PyLong_FromLong(24 * 3600);
if (st->seconds_per_day == NULL) {
return -1;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to initialise the state members in the order they are declared, we'll have to move this. However, the following comment will then need to be tweaked. I'm not sure it is worth it.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

@erlend-aasland
Copy link
Contributor Author

Thanks for the review, Victor!

@erlend-aasland erlend-aasland merged commit ec5622d into python:main Oct 12, 2023
22 checks passed
@erlend-aasland erlend-aasland deleted the isolate-datetime/part1-establish-global-state branch October 12, 2023 08:28
@erlend-aasland
Copy link
Contributor Author

There are lots of paths we can take for the follow-up PRs.

Alt 1, continue adding to static state:

  • Convert static types to heap types; add them to static state
  • Convert static state to module state (will break encapsulated C API)
  • Deprecate encapsulated C API and introduce a new one based on module state

Alt 2, try to save the encapsulated C API by moving static state to the interpreter

  • Move existing static state to the interpreter
  • Convert static types to heap types; add them to the interpreter owned state struct

What do you think, Victor?

@vstinner
Copy link
Member

Convert static types to heap types; add them to static state

Sounds like a good step.

Move existing static state to the interpreter

I would prefer to consider that after converting types to heap types.

IMO the C API like PyDateTime_Check() will prevent to make this state per module. Instead, you can consider splitting this state in two states:

  • One state per interpreter: anything involing types and instances of these types
  • One state per module instance: constants like "microseconds per second"

You simply cannot have more than one instance of the same type beause of PyDateTime_Check().

PyDateTime_Check() will be per-interpreter. I had the exact same problem with PyAST_Check():

int PyAST_Check(PyObject* obj)
{
    struct ast_state *state = get_ast_state();
    if (state == NULL) {
        return -1;
    }
    return PyObject_IsInstance(obj, state->AST_type);
}

@vstinner
Copy link
Member

    if (state == NULL) {
        return -1;
    }

By the way, the error case could be avoided by running the initialization at Python startup.

@jackyk02
Copy link

Hi Erlend, would it be possible for you to support this for Python 3.12?

@erlend-aasland
Copy link
Contributor Author

Hi Erlend, would it be possible for you to support this for Python 3.12?

No; that would be a breaking change.

Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
* Use explicit initialiser for m_base
* Add module state stub; establish global state on stack
* Put conversion factors in state struct
* Move PyDateTime_TimeZone_UTC to state
* Move PyDateTime_Epoch to state struct
* Fix ref leaks in and clean up initialisation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants