NCAAB Boxscore causes IndexError #591
Comments
The issue is with parsing team records; it appears the HTML has changed. A temporary fix (if you don't need team records) is to comment out line 671 of boxscore.py and replace it with a dummy value. I'll leave the parsing issue to the experts since I couldn't figure it out.
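A less invasive variant of the same workaround is to guard the failing lookup instead of commenting it out. This is only a minimal sketch: `record_or_placeholder` is a hypothetical helper, not part of sportsipy, and the sample records are made up for illustration. The '0-0' placeholder follows the dummy value mentioned elsewhere in this thread.

```python
def record_or_placeholder(records, index, placeholder="0-0"):
    """Return records[index], or a placeholder if that index does not exist.

    Hypothetical helper: it mirrors the `return records[index]` lookup shown
    in the traceback, but swallows the IndexError instead of propagating it.
    """
    try:
        return records[index]
    except IndexError:
        return placeholder


# Illustrative values only; these are not real team records.
print(record_or_placeholder(["10-5", "8-7"], 1))  # prints 8-7
print(record_or_placeholder([], 0))               # prints 0-0
```

This hides the parsing problem rather than fixing it, so the team records it returns are not meaningful whenever the placeholder is used.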
|
Ahh the HTML change was my initial thought. Luckily I don't need team records; thank you for the fix! :) |
Hi, I'm currently trying to obtain the dataframe_extended for each team, but I am getting the "list index out of range" error. I tried commenting out the line and adding '0-0', but the problem still persists. Do you have any other suggestions? |
When I went to make the fix, I noticed that the self._parse_record call was past line 671. Make sure the line you comment out is the one that says:
then add:
|
Assuming I understand how PyQuery objects work, in my case the lookup fails while parsing the away_record field. The BOXSCORE_SCHEME entry for away_record is: 'div#boxes div[class="section_heading"] h2'. In the boxscores that work, the HTML looks like:
That seems to match the BOXSCORE_SCHEME selector for away_record. For the boxscores that give me an IndexError, the HTML looks like:
It appears that some additional text (assoc_box-score-basic-texas-am-corpus-christi) gets added to the "section_heading" class name, and I can't find a pattern to why or when it happens. Maybe somebody smarter than I am will know how to tweak the PyQuery selector to make this work correctly in both scenarios. *** Update |
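To illustrate why the extra text would break the lookup: in CSS, `[class="section_heading"]` compares the whole attribute string, while `.section_heading` (or `[class~="section_heading"]`) only requires it to be one of the whitespace-separated class tokens. The sketch below reproduces those two rules with the standard library only. The markup is hypothetical, and it assumes the extra text appears as a second, space-separated class, which the thread does not confirm.

```python
from html.parser import HTMLParser


class DivClassCollector(HTMLParser):
    """Collect the raw class attribute of every <div> in a document."""

    def __init__(self):
        super().__init__()
        self.div_classes = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            # A missing or valueless class attribute is treated as empty.
            self.div_classes.append(dict(attrs).get("class") or "")


def exact_match(class_attr, name="section_heading"):
    # Same rule as div[class="section_heading"]: full string equality.
    return class_attr == name


def token_match(class_attr, name="section_heading"):
    # Same rule as div.section_heading: name is one of the class tokens.
    return name in class_attr.split()


# Hypothetical markup; the real page structure may differ.
HTML = """
<div class="section_heading"><h2>Team A</h2></div>
<div class="section_heading assoc_box-score-basic-texas-am-corpus-christi"><h2>Team B</h2></div>
"""

collector = DivClassCollector()
collector.feed(HTML)

exact_results = [exact_match(c) for c in collector.div_classes]
token_results = [token_match(c) for c in collector.div_classes]
print(exact_results)  # [True, False]: the second heading is skipped
print(token_results)  # [True, True]: both headings are found
```

If that assumption holds, a selector such as 'div#boxes div.section_heading h2' would presumably match both cases, though nothing in this thread confirms which change was actually adopted.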
Was the fix presented above ever pushed out? I am still getting the error. I could apply the fix mentioned above; however, I would rather use the modified/fixed code. |
I posted this on another issue for the same error. It might be of use to you all. I just worked my way through this issue, and I believe the format for boxscore pages has changed on Sports Reference. @roclark I have it running locally by updating the
Not positive this is the correct way to fix the issue, but it's working for me. Great project by the way! I'd been trying to parse manually prior to finding it. Edit: apologies for the double @, roclark! |
Pulling an NCAAB Boxscore gives an IndexError in _parse_record
boxscore = Boxscore('2021-02-17-19-virginia-military-institute')
This should pull the boxscore from the 02/17 VMI game. It throws the same exception for any other boxscore index as well.
Traceback (most recent call last):
ncaam_scraper.py", line 24, in <module>
if __name__ == "__main__": main()
ncaam_scraper.py", line 12, in main
print(Boxscore('2021-02-17-19-virginia-military-institute'))
sportsipy\ncaab\boxscore.py", line 225, in __init__
self._parse_game_data(uri)
sportsipy\ncaab\boxscore.py", line 683, in _parse_game_data
value = self._parse_record(short_field, boxscore, index)
sportsipy\ncaab\boxscore.py", line 390, in _parse_record
return records[index]
IndexError: list index out of range