Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENG-3867] Garden Variety Pickle #4054

Merged
merged 4 commits into from
Oct 4, 2024
Merged

[ENG-3867] Garden Variety Pickle #4054

merged 4 commits into from
Oct 4, 2024

Conversation

masenf
Copy link
Collaborator

@masenf masenf commented Oct 4, 2024

Avoid dill

Now that our states are simpler, we can pickle them using the built in c-implementation which is much faster.

Update __module__ for ComponentState

When we create component state substates, cache these generated classes on reflex.state so that the pickler can find them.

Reduce state instantiations

The StateManagerDisk was creating a new state tree for every get_state call, which is a very expensive operation. Instead pull the root state from the self.states cache if it's there.

Simplify loading logic to make it clear that we only populate states on initial load where a client token is not saved in memory.

Optimize output by only serializing states where _get_was_touched() returns True Moved to a different PR

Bonus

Redis now throws away state when the schema changes, allowing for smoother app updates.

Profile

This profiles 4000 operations against the state. The operation causes the state to grow linearly in size.

"""Measuring round-trip time of different StateManagers."""
import asyncio
import contextlib
import uuid

import async_timeit
from redis.asyncio import Redis

import reflex as rx
from reflex.state import StateManagerMemory, StateManagerDisk, StateManagerRedis, _substate_key


OPS = 4000
TOKEN = str(uuid.uuid4())


class MyState(rx.State):
    d_int: int = 0
    d_str: str = ""
    d_list: list = []
    d_dict: dict = {}
    d_set: set = set()

    def ev(self):
        self.d_int += 1
        self.d_str = str(self.d_int)
        self.d_list.append(self.d_int)
        self.d_dict[self.d_int] = self.d_int
        self.d_set.add(self.d_int)


async def test_state_manager(clz=StateManagerMemory, **kwargs):
    sm = clz(state=rx.State, **kwargs)

    async def modify_state():
        async with sm.modify_state(_substate_key(TOKEN, MyState)) as s:
            (await s.get_state(MyState)).ev()

    print(await async_timeit.timeit(modify_state, number=OPS))


PROFILE = False


@contextlib.contextmanager
def profile(filename):
    if not PROFILE:
        yield
        return
    import cProfile
    profiler = cProfile.Profile()
    profiler.enable()
    yield
    profiler.disable()
    profiler.dump_stats(filename)


if __name__ == "__main__":
    with profile("statemanagermemory.prof"):
        print("StateManagerMemory", end=": ")
        asyncio.run(test_state_manager())

    with profile("statemanagerdisk.prof"):
        print("StateManagerDisk", end=": ")
        asyncio.run(test_state_manager(clz=StateManagerDisk))

    with profile("statemanagerredis.prof"):
        print("StateManagerRedis", end=": ")
        asyncio.run(test_state_manager(clz=StateManagerRedis, redis=Redis.from_url("redis://localhost")))

async_timeit.py

Before (0.6.1)

(VENV-release311) masenf@minicone pickle-perf % python timing.py                 
StateManagerMemory: 1.048973000026308
StateManagerDisk: 30.536243582959287
StateManagerRedis: 35.89346333395224

After

(VENV-fd) masenf@minicone pickle-perf % python timing.py
StateManagerMemory: 1.0538166250335053
StateManagerDisk: 3.6349450410343707
StateManagerRedis: 11.734799916972406

Copy link

linear bot commented Oct 4, 2024

adhami3310
adhami3310 previously approved these changes Oct 4, 2024
@masenf
Copy link
Collaborator Author

masenf commented Oct 4, 2024

Removed the StateManagerDisk optimizations, since they are breaking some integration tests.

elif fp is not None and data is None:
(substate_schema, state) = pickle.load(fp)
else:
raise ValueError("Only one of `data` or `fp` must be provided")
Copy link
Collaborator

@Lendemor Lendemor Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This else will also trigger if both data and fp are None.
Maybe need to adjust the message for that case?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If neither are provided, then that's still an error. We need exactly one.

@masenf masenf merged commit d77b900 into main Oct 4, 2024
39 checks passed
Kastier1 pushed a commit that referenced this pull request Oct 23, 2024
* Use regular `pickle` module from stdlib

* Avoid recreating the rx.State tree for every `get_state`

* Remove dill dependency

* relock deps
@masenf masenf deleted the masenf/garden-variety-pickle branch December 12, 2024 07:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants