Optimize fetch performance, particularly for old zeroed accounts #21
Here is an example to help make that first discussion topic more concrete. Consider an account that had a daily balance at the end of May 2021 then remained zero until present day. After this pull request the following data will be loaded from the API for this account:
Which of these options is best for the final CSV? See the discussion above for some of the considerations.
Option 1 (vote with 🎉): Truncate all zero balances at the end of the CSV.
Option 2 (vote with 👀): Truncate zero balances at the end of the CSV but leave the first one.
Option 3 (vote with 🚀): Fill the CSV with zeroes out to the current day (matches the output before this PR, just with fewer API requests).
It sounds like we're in general agreement on Option 1 (truncate trailing 0s), in which case, @masonwolters or @vanessa if you can do the line-by-line here...? Thanks so much for contributing @idpaterson
Thanks for reviewing! The CSV files should now be trimmed, but any account with exclusively zero balances will keep one row so that the CSV at least notes the date the account was added to Mint. Zero balance accounts may be something to consider including in a post-export summary, per the request from @nickrenda to denote which accounts failed.
Mint seems to have implemented a severe cooling off penalty for automated API access. The entire mint.intuit.com site is blocked by an "Oops! We ran into an issue. Please try again later." message for about 5-10 minutes. It kicks me out after just a few small exports (not even whole account history, just a 6 month trend). Not sure if this is targeted at specific accounts, specific access patterns and request data, or if it is just a basic rate limit that would affect all users of the extension. I recommend using a significantly longer rate limit delay for development until we learn more. I will have one more pull request after this one that may be relevant to this problem. I reimplemented the "Export Daily History to CSV" button from my original userscript using your background script to pull the trend and build the CSV. The original intent was to allow exporting big picture trends that combine multiple accounts like "all assets over time" but it may end up being more important as a way to export individual accounts 🤷♂️.
I don't know how long it took the Mint API to respond previously, but I'm seeing about 10 seconds per trend request in the service worker and most of those responses do not include the [...]. Additionally, even in the Mint trend view I'm usually seeing "We got nothing" when selecting any combination of trend parameters. Hopefully it's just a bad day for the API.
This is great, thanks @idpaterson.
As far as the rate limit goes, my observations from last week were that it seems to be triggered purely by exceeding some threshold of X requests / second, and not by more general API usage patterns. Every case I saw was the same: it returns 429 for all requests for ~10 minutes or so, which is quite unfortunate because that doesn't allow for any sort of delay & retry opportunity. The 50ms delay seemed to be long enough most of the time, but given that we're still seeing some users hit the rate limit we should likely increase it.
That said, 10 second responses do seem unusually long so maybe it is also a bad day for the API 🤷♂️
Submitting for discussion to fix #17. There are a lot of factors to consider here but let's just say this is not exclusively about optimization.
Fewer requests for all accounts
This pull request changes the daily balance API requests from 1-per-month to 1-per-43 days. This magic number is the largest range where the API will still return daily granularity. The larger batches reduce the total number of API requests by about 30%.
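To make the batching concrete, here is a minimal sketch of splitting a date range into periods of at most 43 days. The names `splitIntoPeriods`, `Period`, and `MAX_DAILY_RANGE_DAYS` are hypothetical and not taken from this PR, and the sketch assumes the 43 days are counted inclusively and that dates are handled as UTC `YYYY-MM-DD` strings.

```typescript
// Hypothetical helper for illustration; not the code in this PR.
const MS_PER_DAY = 24 * 60 * 60 * 1000;
// Assumption: 43 days, counted inclusively, is the largest range with daily granularity.
const MAX_DAILY_RANGE_DAYS = 43;

interface Period {
  start: string; // inclusive, YYYY-MM-DD
  end: string;   // inclusive, YYYY-MM-DD
}

const toISODate = (ms: number): string => new Date(ms).toISOString().slice(0, 10);

function splitIntoPeriods(
  start: string,
  end: string,
  maxDays: number = MAX_DAILY_RANGE_DAYS
): Period[] {
  const periods: Period[] = [];
  // Date-only ISO strings are parsed as UTC midnight, so day arithmetic avoids DST issues.
  const endMs = Date.parse(end);
  let cursor = Date.parse(start);
  while (cursor <= endMs) {
    // Each period covers at most maxDays days, clamped to the overall end date.
    const periodEnd = Math.min(cursor + (maxDays - 1) * MS_PER_DAY, endMs);
    periods.push({ start: toISODate(cursor), end: toISODate(periodEnd) });
    cursor = periodEnd + MS_PER_DAY;
  }
  return periods;
}
```

With this sketch a 365-day year takes 9 requests instead of 12 monthly ones. Over longer histories the saving approaches 1 - 30.4/43, which is where the "about 30%" comes from.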
Far fewer requests for zeroed accounts
For a 10-year-old account that had 1 month of activity and then stayed at zero forever, it is far more efficient to pull only a couple of months rather than 10 years of zeroes from the Mint API.
The more nuanced reason is that we're going to be importing these files into Monarch. Consider a 401k account that upon change of employment was rolled over into an IRA. In Monarch I prefer to see the balance of that IRA from day one even though it was technically a different account at that time. If the 401k CSV contains zeroes all the way through to today's date I have to be very careful when importing those balances to choose the old defunct account first, then upload the new account which will overwrite the zeroes. Imported in the opposite order, the zeroes in the 401k CSV would overwrite my IRA balance history.
I also have one account that went through no fewer than 5 such transitions, some at the financial account level and some just from Mint cycling connection providers. In that case the advice to upload the oldest one first is pretty tough.
❓ Discussion: What should the CSV include for zeroed accounts?
My opinion is that after pulling the daily data, all trailing zero balances should be removed from the CSV.
In my Mint export extension I implemented this differently, leaving the last zero in place. It was well-intentioned: the idea was that showing a zero there would reinforce that the balance went to zero. However, that is actually not good for the use case above where the balance was never actually zero and Mint just transitioned the connection one day. We can't distinguish "the balance went to zero" from "Mint shut down the connection", so including a trailing zero in the CSV may not even be representative of the actual balance.
❓ Discussion: What about accounts that have no transactions at all?
The CSV has two pieces of useful information - the balance and the date. If we remove all zero balance rows then there is no data left for accounts with no transactions. I think that trimming zero balance rows from the end of the CSV should preserve the first and only row in the CSV. Then at least you can see the date the account was created in Mint even if there was never a balance.
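A minimal sketch of the trimming rule proposed in these two discussion topics: drop trailing zero balances but never drop the first row. The `DailyBalance` shape and the `trimTrailingZeros` name are assumptions for illustration, not the implementation in this PR.

```typescript
// Hypothetical row shape; the real CSV rows may carry different fields.
interface DailyBalance {
  date: string;   // YYYY-MM-DD
  amount: number;
}

function trimTrailingZeros(rows: DailyBalance[]): DailyBalance[] {
  let end = rows.length;
  // Walk back past trailing zero balances, but always keep the first row
  // so an all-zero account still records the date it was added.
  while (end > 1 && rows[end - 1]?.amount === 0) {
    end--;
  }
  return rows.slice(0, end);
}
```

Leading and interior zeroes are left alone, since only the trailing run is ambiguous between "the balance went to zero" and "Mint shut down the connection".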
What does this pull request do?
Currently it includes about a month of zeroes for each account because any removal of trailing zeroes is pending discussion. See test cases for the specific date intervals that are requested in each scenario.
I found in my own accounts a few cases where the Mint Trends chart showed a nonzero balance month followed by a zero balance month even though there were nonzero daily balances a week or two into that zero balance month. So the entire first zero balance month is requested from the daily API for safety.
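The selection rule described above can be sketched roughly as follows: take every month up to the last nonzero monthly balance, plus the first zero month after it. The `MonthlyBalance` shape and the `monthsToFetchDaily` name are hypothetical, and the handling of accounts with no nonzero month (keep only the first month) is an assumption here; see the test cases for the actual intervals.

```typescript
// Hypothetical shape for one point of the monthly trend data coming from Mint.
interface MonthlyBalance {
  month: string;  // YYYY-MM
  amount: number;
}

function monthsToFetchDaily(trend: MonthlyBalance[]): MonthlyBalance[] {
  // Index of the last month with a nonzero balance, or -1 if there is none.
  let lastNonZero = -1;
  for (let i = 0; i < trend.length; i++) {
    if (trend[i]?.amount !== 0) {
      lastNonZero = i;
    }
  }
  // Keep everything through the last nonzero month plus one following zero month,
  // because daily balances can still be nonzero early in that month.
  // Assumption: an all-zero trend keeps only its first month.
  return trend.slice(0, lastNonZero + 2);
}
```

For an account whose last monthly balance was in May 2021, this requests daily data through the end of June 2021 rather than through today.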
I also think the refactoring helps to avoid some confusion in reading the code between the monthly trend data coming from Mint and the daily trend data that we're gathering (renamed from "months" to "periods").
How does this affect the progress bar?
The progress bar still works perfectly and has not been changed in this PR. It tracks the total amount of data that needs to be downloaded, not the total number of accounts. However, the amount of data required for each account now varies far more. In my account, the progress bar was nearly full with less than 75% of accounts processed because the last couple dozen have very little history.
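As a rough illustration of why the bar can run ahead of the account count, here is a sketch that measures progress in requested periods rather than accounts. The `exportProgress` name and its inputs are hypothetical; the existing progress code is not shown here and may be structured differently.

```typescript
// Hypothetical progress calculation: fraction of all periods downloaded so far.
function exportProgress(periodsPerAccount: number[], completedPeriods: number): number {
  const total = periodsPerAccount.reduce((sum, n) => sum + n, 0);
  if (total === 0) {
    return 1; // nothing to download counts as complete
  }
  return Math.min(completedPeriods / total, 1);
}
```

For example, with four accounts needing 40, 40, 2, and 2 periods, finishing the first two accounts is only half of the accounts but 80 of 84 periods, so the bar is already about 95% full.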

Oh yeah and progress goes brrrrr now, takes much less time to export all 97 accounts.