
Merge/json improvements #650

Merged
merged 12 commits into from
Dec 7, 2021
Conversation

MariusWirtz
Collaborator

Rebased #644 on the master branch to resolve conflicts.

@MariusWirtz MariusWirtz force-pushed the merge/json-improvements branch from d06afac to f7fa5d4 on December 2, 2021 21:16
@MariusWirtz
Collaborator Author

MariusWirtz commented Dec 2, 2021

Hi @gbryant-dev,

I rebased your branch on the master branch and resolved the few conflicts in the CellService. I aligned the naming as you and Ryan suggested.
The tests in the CellService all still pass with use_compact_json=True by default.

Before we merge it into master, I think there are a few small tasks left.

  • Today I wasn't able to quantify a performance difference (use_compact_json=True vs. use_compact_json=False) or see a difference in the script's memory footprint. I attached the script and its output below. I will look into this again, but I would appreciate it if you could run a similar test on your end; it would be good to have some numbers on the performance and memory improvements.

  • Decide whether to use use_compact_json by default or not

  • It would be good to have some specific tests for use_compact_json=True. I will do this in the next few days.
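On the memory-footprint question from the first bullet, here is a minimal, illustrative sketch of how peak memory during response parsing could be measured with Python's `tracemalloc`. The payload is synthetic stand-in data, not a real TM1 response or a TM1py call:

```python
import json
import tracemalloc

# Illustrative only: the payload is a synthetic stand-in shaped loosely like a
# cellset response, not the actual TM1 REST wire format.
payload = json.dumps({"Cells": [{"Ordinal": i, "Value": i * 1.5} for i in range(100_000)]})

# Track peak Python memory allocated while parsing the payload.
tracemalloc.start()
data = json.loads(payload)
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"peak memory while parsing: {peak / 1024 / 1024:.1f} MiB")
```

The same start/get_traced_memory pattern could wrap the `tm1.cells.execute_mdx` calls in the benchmark script to compare the two use_compact_json modes.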

I will continue working in this branch for the moment and keep the remote up to date.
If you want to add more commits, I think it would be easiest if you continue working based on this branch.

Here is my simple benchmarking script:

import time

from TM1py import TM1Service

mdx = """
SELECT
{[Big Cube Measure].[Big Cube Measure].Members} ON COLUMNS,
{[Big Dimension 1].[Big Dimension 1].Members} *
{[Big Dimension 2].[Big Dimension 2].[000001],
[Big Dimension 2].[Big Dimension 2].[000002]} ON ROWS
FROM [Big Cube]
"""


def benchmark(func, label, use_compact_json, runs=10):
    # Run the query `runs` times and print the average elapsed time.
    # perf_counter is better suited than time.time for measuring durations.
    times = []
    for _ in range(runs):
        before = time.perf_counter()
        func(mdx=mdx, use_compact_json=use_compact_json)
        times.append(time.perf_counter() - before)
    print(f"{label} with use_compact_json={use_compact_json}: {sum(times) / len(times)}")


with TM1Service(address="", port=12297, user="admin", password="apple", ssl=True) as tm1:
    benchmark(tm1.cells.execute_mdx, "execute_mdx", use_compact_json=True)
    benchmark(tm1.cells.execute_mdx, "execute_mdx", use_compact_json=False)
    benchmark(tm1.cells.execute_mdx_values, "execute_mdx_values", use_compact_json=True)
    benchmark(tm1.cells.execute_mdx_values, "execute_mdx_values", use_compact_json=False)

And the output:

execute_mdx with use_compact_json=True: 50.3670879205068
execute_mdx with use_compact_json=False: 50.339932560920715
execute_mdx_values with use_compact_json=True: 4.766747236251831
execute_mdx_values with use_compact_json=False: 4.669796625773112

@gbryant-dev
Contributor

gbryant-dev commented Dec 4, 2021

Hi @MariusWirtz,

Great, thanks for looking at this.

As for performance, the two key things that I would expect to have a positive impact when use_compact_json=True are size of payload and network latency.

From your benchmarking script, I can see you are running an instance on localhost, so that eliminates any improvement you would gain from reduced network latency. In terms of payload size, it will depend on how big the cellset is. From the initial testing I did, the payload size can drop by up to 70%. I suspect the reason you're not seeing any difference (besides testing on a local instance) is that the cellset isn't large enough to make a noticeable difference. How big are the dimensions in the cube used in the script?

In addition, since this PR only uses compact JSON for cells, the performance gain from network latency / payload size would be partly offset by making two requests instead of one: one to get the axes, tuples, and member information, and a second to get the cell data. This wouldn't apply to functions that eventually call extract_cellset_values, as that is a single request for cell values. In some cases performance could even be worse, since it's a trade-off between how long the TM1 server takes to return the cellset information versus the size of the payload reduction and the amount of network latency. I therefore think it makes sense to keep use_compact_json=False as the default, so compact JSON is only used when it benefits the environment.
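To illustrate the payload-size point, here is a self-contained sketch comparing a verbose cell payload (one object per cell, property names repeated) against a compact, values-only payload. The shapes are simplified stand-ins, not the exact TM1 REST wire format, and the actual reduction will vary with cell contents:

```python
import json

# Illustrative only: simplified stand-ins for a cellset response, not the
# exact TM1 REST wire format. Verbose responses repeat property names per
# cell; compact responses carry a bare array of values.
n_cells = 10_000
verbose = json.dumps({
    "Cells": [{"Ordinal": i, "Value": i * 1.5, "FormattedValue": str(i * 1.5)}
              for i in range(n_cells)]
})
compact = json.dumps({"Cells": [i * 1.5 for i in range(n_cells)]})

reduction = 1 - len(compact) / len(verbose)
print(f"verbose: {len(verbose):,} bytes, compact: {len(compact):,} bytes, "
      f"reduction: {reduction:.0%}")
```

Because the saving comes from bytes on the wire, it only translates into wall-clock time when there is real network latency and the cellset is large, which matches the observations above.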

I'll try and do some benchmarking but I suspect that won't be until the latter half of next week.

@MariusWirtz
Collaborator Author

Hi @gbryant-dev,

You are absolutely right. When I re-run the test against a remote server, I get a different picture. See below.
The cellset size is 1,000,000 cells; each dimension has 500,000 elements.

Agreed, by default we should probably set it to False. As you said, splitting the request into two can make it slower than the current approach.
For the execute_mdx_values functions it's a nice small performance boost.

I'm curious to see benchmarking results in other environments.

|                        | execute_mdx | execute_mdx_values |
|------------------------|-------------|--------------------|
| use_compact_json=True  | 106.95 sec  | 15.42 sec          |
| use_compact_json=False | 104.52 sec  | 15.92 sec          |

@MariusWirtz MariusWirtz force-pushed the merge/json-improvements branch from 0f30ee7 to 4224866 on December 6, 2021 20:51
@MariusWirtz
Collaborator Author

I renamed a few things and refactored the code a bit here and there.

@gbryant-dev please take a final look and approve the pull request.


@gbryant-dev gbryant-dev left a comment


Thanks @MariusWirtz, looks great. In hindsight, extracting the logic in the decorator into a core extract function seems like a no-brainer really, I don't know why I didn't do it. I'll keep it in mind for future contributions! 🙂

TM1py/Services/CellService.py (review comment resolved)
TM1py/Utils/Utils.py (review comment resolved)
@MariusWirtz MariusWirtz merged commit 2d38935 into master Dec 7, 2021
@MariusWirtz MariusWirtz deleted the merge/json-improvements branch April 9, 2023 18:17