Improve kassenbuch speed #138

Merged: 8 commits merged into development on Jun 21, 2016
Conversation

@patkan (Member) commented Jun 9, 2016

Since Julian was so thrilled with the kassenbuch performance, as was I, and really everyone, @schrolli and I sat down and hacked together a few simple performance upgrades :-)

Please review this, since it writes to the database :-P

@codecov-io commented Jun 9, 2016

Current coverage is 28.22%

Merging #138 into development will increase coverage by 1.09%

@@           development       #138   diff @@
=============================================
  Files               54         54          
  Lines             6709       6756    +47   
  Methods              0          0          
  Messages             0          0          
  Branches             0          0          
=============================================
+ Hits              1820       1907    +87   
+ Misses            4889       4849    -40   
  Partials             0          0          

Powered by Codecov. Last updated by 3c8a7cf...321cb9d

elif until_date:
    query = query + " WHERE datum < Datetime('{until_date}')".format(
        until_date=until_date
    )
Review comment from a Contributor:
Since it is exactly the same code as above (lines 370-381), why not make a "where query generator" function, which can be reused?
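
A hedged sketch of what such a shared helper could look like; the name date_filter_clause and the from_date handling are assumptions for illustration, not the code that ended up in the PR.

def date_filter_clause(from_date=None, until_date=None):
    """Build the WHERE clause that both kassenbuch queries currently duplicate."""
    # Mirrors the existing string formatting; bound parameters would be safer.
    if from_date and until_date:
        return " WHERE datum >= Datetime('{0}') AND datum < Datetime('{1}')".format(
            from_date, until_date
        )
    elif from_date:
        return " WHERE datum >= Datetime('{0}')".format(from_date)
    elif until_date:
        return " WHERE datum < Datetime('{0}')".format(until_date)
    return ""

Both call sites would then reduce to query = query + date_filter_clause(from_date, until_date).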

@patkan (Member, Author) replied:

Good idea, I will look into it.

@patkan (Member, Author) replied:

Done :-)

@patkan self-assigned this Jun 9, 2016
@cod3monk (Contributor) commented Jun 9, 2016

A cache of the sums would be another improvement and would significantly speed up summary_to_string. I did not include a running sum in the buchung table because it could become wrong. But if you find a way to make sure that nothing has changed up to a certain point, a cache could provide the sums up to that point; if things have changed, the cache needs to be updated. The buchung table is to remain the authoritative source of all account balances at all times.

The cache should be kept separate: at least a separate table, if not a separate database.
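
A minimal sketch of the separate cache table idea, assuming sqlite3 and ISO-8601 text timestamps; the saldo_cache table and its columns are made up for illustration and are not part of this PR.

import sqlite3

def get_cached_saldo(con, konto, last_change):
    """Return the cached sum for konto, or None if the cache is stale.

    last_change is the timestamp of the most recent write to the buchung
    table (how to obtain it is discussed further down in this thread).
    """
    con.execute(
        "CREATE TABLE IF NOT EXISTS saldo_cache "
        "(konto TEXT PRIMARY KEY, saldo TEXT, computed_at TEXT)"
    )
    row = con.execute(
        "SELECT saldo, computed_at FROM saldo_cache WHERE konto = ?", (konto,)
    ).fetchone()
    if row is not None and last_change is not None and row[1] >= last_change:
        # ISO-8601 strings compare chronologically, so the entry is still valid.
        return row[0]
    return None  # caller recomputes from buchung and refreshes the cache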

@patkan (Member, Author) commented Jun 9, 2016

Wouldn't computing the sums for summary_to_string in the database be faster? (Instead of creating the buchung objects and then adding them up.)

As a next step we could use an SQL trigger to automatically compute the sum of all or certain entries whenever the table is updated. For this to be useful, however, we have to come up with a scheme for how to use it.

@patkan (Member, Author) commented Jun 9, 2016

With 3dad868 I have added triggers that record the date of the last change in a table. We could then check those timestamps against the timestamp of a cache. This way we can keep the cache as long as no change has been made to the database.
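
Since 3dad868 was later dropped from this PR, the following is only a rough sketch of what such a last-change trigger might look like in SQLite; the last_changed table, the trigger name, and the setup function are assumptions, not the commit's actual contents.

import sqlite3

CHANGE_TRACKING = """
CREATE TABLE IF NOT EXISTS last_changed (tab TEXT PRIMARY KEY, changed_at TEXT);

CREATE TRIGGER IF NOT EXISTS buchung_insert_changed AFTER INSERT ON buchung
BEGIN
    INSERT OR REPLACE INTO last_changed (tab, changed_at)
    VALUES ('buchung', Datetime('now'));
END;
-- UPDATE and DELETE on buchung would need analogous triggers.
"""

def install_change_tracking(con):
    """Install the change-tracking table and trigger on an open sqlite3 connection."""
    con.executescript(CHANGE_TRACKING)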

@schrolli commented:
Before you start caching things in separate tables, use SQL queries instead of crawling over objects. I can think of a query like this:

SELECT
    konto,
    SUM(CASE WHEN betrag > 0 THEN betrag ELSE 0 END) AS haben,
    SUM(CASE WHEN betrag < 0 THEN ABS(betrag) ELSE 0 END) AS soll,
    SUM(betrag) AS saldo
FROM buchung
WHERE datum < ?
GROUP BY konto
ORDER BY konto ASC

If the GROUP BY is too slow, an option would be to move the konto text field out into a separate table and reference konto.id in the buchung table. The GROUP BY would then be performed on an integer field. But that involves restructuring the database...
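
Purely hypothetical and not part of this PR, that restructuring could amount to roughly the following migration; the column names are assumptions.

# konto becomes a lookup table and buchung references it by integer id,
# so the GROUP BY runs on an integer column once the data is migrated.
KONTO_MIGRATION = """
CREATE TABLE konto (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
ALTER TABLE buchung ADD COLUMN konto_id INTEGER REFERENCES konto(id);
"""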

To get the newest buchung, a separate query is needed:

SELECT
    beschreibung,
    datum
FROM buchung
WHERE datum < ?
ORDER BY datum DESC
LIMIT 1

Since I had no populated database to play with, I could not verify or test things while fiddling with the code...

@patkan (Member, Author) commented Jun 11, 2016

I would strip 3dad868 from this PR and then merge it. OK?

@mgmax (Member) commented Jun 11, 2016

We cannot do SUM in the database because we use Decimal and sqlite uses float.
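
A quick plain-Python illustration of the concern, nothing project-specific: summing the same amounts as float and as Decimal gives different results.

from decimal import Decimal

amounts = ["0.10", "0.10", "0.10"]
print(sum(float(a) for a in amounts))    # 0.30000000000000004
print(sum(Decimal(a) for a in amounts))  # 0.30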

@patkan force-pushed the improve-kassenbuch-speed branch from 3dad868 to 62b754c on June 11, 2016 19:46
@patkan (Member, Author) commented Jun 11, 2016

I have removed the mentioned commit and would now say that this is ready to merge.

@patkan removed their assignment Jun 11, 2016
@mgmax (Member) commented Jun 11, 2016

Now would be a good point to add a few tests that cover the date filtering and the sum calculation.

@schrolli commented:
In the meantime I have got a populated database to play with and have implemented the query mentioned above. It works fine in the sense that it displays the same results as the former object-crawling version. With this I got the execution time down from 1.2 s to under 0.25 s. I agree about the problems of summing up many float values, but if that is a deal-breaker, I would move away from SQLite to a more sophisticated database engine like MySQL/MariaDB or Postgres, which have decimal maths built in.

Testing: I could imagine providing a "test database" file with special values inserted and asserting against a precomputed result. The values would include some corner cases, like a buchung with the exact datetime of an --until argument, where this buchung should not be included...

@mgmax (Member) commented Jun 12, 2016

I think the existing unittests should be amended to generate test data by making bookings and running queries.
https://github.com/fau-fablab/FabLabKasse/blob/development/FabLabKasse/test_kassenbuch.py

@patkan (Member, Author) commented Jun 12, 2016

And for generating the data, the hypothesis package should be used. This makes it easier to write tests that do not make too many assumptions.
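
A minimal sketch of such a hypothesis-based test, using the current hypothesis strategies API and assuming nothing about the actual kassenbuch interface; the (datum, betrag) tuples and the filtered_sum reference function are made up for illustration and cover the --until corner case mentioned above.

import unittest
from datetime import datetime
from decimal import Decimal

from hypothesis import given
from hypothesis import strategies as st

# Bookings are generated as (datum, betrag) tuples for test purposes only.
bookings = st.lists(
    st.tuples(
        st.datetimes(min_value=datetime(2015, 1, 1), max_value=datetime(2017, 1, 1)),
        st.decimals(min_value=Decimal("-1000"), max_value=Decimal("1000"), places=2),
    ),
    max_size=50,
)

def filtered_sum(entries, until):
    """Reference implementation: sum all amounts dated strictly before until."""
    return sum((betrag for datum, betrag in entries if datum < until), Decimal(0))

class DateFilterSumTest(unittest.TestCase):
    @given(
        bookings,
        st.datetimes(min_value=datetime(2015, 1, 1), max_value=datetime(2017, 1, 1)),
        st.decimals(min_value=Decimal("1"), max_value=Decimal("1000"), places=2),
    )
    def test_booking_at_until_is_excluded(self, entries, until, betrag):
        # A buchung dated exactly --until must not change the sum (strict <).
        with_boundary = entries + [(until, betrag)]
        self.assertEqual(filtered_sum(entries, until),
                         filtered_sum(with_boundary, until))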

@patkan force-pushed the improve-kassenbuch-speed branch from 62b754c to 0e617f7 on June 20, 2016 22:51
@patkan force-pushed the improve-kassenbuch-speed branch from beb28a9 to a5913c6 on June 20, 2016 23:53
@patkan (Member, Author) commented Jun 21, 2016

I have added some basic unit tests; they should be enough for now.

@patkan merged commit ca065a8 into development on Jun 21, 2016
@patkan deleted the improve-kassenbuch-speed branch on June 21, 2016 00:50
@patkan (Member, Author) commented Jun 21, 2016

Deployed to Kasse.
