fix_Yahoo_returning_live_separate() fails for "1wk" interval if a new year occurs midweek #2206

Open
meirinberg opened this issue Jan 5, 2025 · 0 comments


Describe bug

This week, I noticed the final row in my weekly dataframe is extraneous and does not represent a full week. After some investigation, I saw that "fix_Yahoo_returning_live_separate()", called in the "history" function, was created to address this issue.

According to the comment in "fix_Yahoo_returning_live_separate()", Yahoo sometimes returns live data as a separate row when the market is still open (or for the most recent interval), which can result in inconsistent intervals in the downloaded data. This appears to be a known quirk, especially with weekly or monthly intervals. For the weekly interval, "fix_Yahoo_returning_live_separate()" decides whether the last two rows belong to the same interval by checking that both the year and the week match (lines 620 and 621):

if interval == "1wk":
    last_rows_same_interval = dt1.year == dt2.year and dt1.week == dt2.week

However, a new year may fall midweek, such as the start of 2025. So, unfortunately, I am seeing this extra row when I currently get weekly data (since today is 1/4/2025).
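
To make the failure concrete, here is a minimal sketch using only the standard library. The names dt1 and dt2 mirror the snippet above, and the dates are the two rows from this report; the real function works on pandas timestamps, so treat this as an illustration of the calendar logic rather than the library's code.

```python
from datetime import date

# The two weekly rows from this report: the week's Monday and the live row.
dt1 = date(2024, 12, 30)
dt2 = date(2025, 1, 3)

# date.isocalendar() returns (ISO year, ISO week, ISO weekday).
iso1 = tuple(dt1.isocalendar())
iso2 = tuple(dt2.isocalendar())

# The calendar years differ, so a check that requires equal .year fails...
same_calendar_year = dt1.year == dt2.year
# ...even though both dates share the same ISO year and ISO week.
same_iso_year_and_week = iso1[:2] == iso2[:2]

print(iso1, iso2)
print(same_calendar_year, same_iso_year_and_week)
```

Both dates fall in ISO week 1 of ISO year 2025, which runs from Monday 2024-12-30 to Sunday 2025-01-05.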

Suggested Solution:
To correctly handle this scenario, the function should compare the ISO year together with the ISO week number (both available from isocalendar()), since an ISO week can span two calendar years.

if interval == "1wk":
    # Rows belong to the same interval when their ISO year and ISO week both match.
    last_rows_same_interval = dt1.isocalendar()[:2] == dt2.isocalendar()[:2]

This approach ensures rows are merged based on their true calendar week, not limited by year boundaries.
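
One caveat worth checking: comparing the ISO week number alone would also match rows that share a week number in different years. The sketch below compares ISO year and week together. Note that same_iso_week is a hypothetical helper name used for illustration, not something that exists in yfinance.

```python
from datetime import date


def same_iso_week(a: date, b: date) -> bool:
    """Return True when both dates fall in the same ISO year and ISO week.

    Hypothetical helper for illustration; not part of yfinance.
    """
    return tuple(a.isocalendar())[:2] == tuple(b.isocalendar())[:2]


# Same week, spanning the new year: should be treated as one interval.
across_new_year = same_iso_week(date(2024, 12, 30), date(2025, 1, 3))

# Same ISO week number (1) but a year apart: must not be merged.
a, b = date(2024, 1, 3), date(2025, 1, 3)
week_only_match = a.isocalendar()[1] == b.isocalendar()[1]
year_and_week_match = same_iso_week(a, b)

print(across_new_year, week_only_match, year_and_week_match)
```

In practice the last two rows of a weekly dataframe are unlikely to be a year apart, but including the ISO year keeps the check correct without relying on that.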

It would be great if this fix can be incorporated into both Ticker.history() and yf.download(). I would love a fast resolution as this is important to my personal project. Thank you!

Simple code that reproduces your problem

Call Ticker.history() or yf.download() with a "1wk" interval (this may only reproduce before the next week starts on 1/6/2025). The result contains an extraneous row for 2025-01-03, which should have been merged into the week of 2024-12-30.

For example, I see the data below:

2024-12-30   110.690002  115.480003  109.570000  115.199997 ...
2025-01-03   112.480003  115.459999  112.290001  115.199997 ...

These two rows are part of the same ISO calendar week, yet they remain separate because the function assumes week boundaries are confined to the same year.
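
For reference, here is a rough sketch of what I would expect the merged row to look like. It assumes the four numeric columns shown above are Open, High, Low, Close, and it uses a hypothetical merge_ohlc helper; this is not the actual merge logic inside yfinance, which I have not reproduced here.

```python
# Assumption: the four values per row are Open, High, Low, Close.
week_row = {"Open": 110.690002, "High": 115.480003, "Low": 109.570000, "Close": 115.199997}
live_row = {"Open": 112.480003, "High": 115.459999, "Low": 112.290001, "Close": 115.199997}


def merge_ohlc(first: dict, last: dict) -> dict:
    """Combine two rows of the same interval into a single OHLC row.

    Hypothetical helper for illustration; not yfinance's implementation.
    """
    return {
        "Open": first["Open"],                     # the interval opens with the first row
        "High": max(first["High"], last["High"]),  # highest high across both rows
        "Low": min(first["Low"], last["Low"]),     # lowest low across both rows
        "Close": last["Close"],                    # the interval closes with the last row
    }


merged = merge_ohlc(week_row, live_row)
print(merged)
```

Under that assumption, the single row for the week of 2024-12-30 would keep the first row's open and the overall high and low, with the close taken from the live row.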

Debug log

I don't see any new output when I enable_debug_mode().

Bad data proof

No response

yfinance version

0.2.51

Python version

No response

Operating system

No response
