ERosendo changed the title from "parsing error for citations with defendant 'Thompson'" to "Parsing error for citations with defendant 'Thompson'" on Mar 28, 2024
Per discussion today, this seems to happen when citations appear to overlap. The simple solution here is to find the citations that overlap and then filter out the one that's incomplete.
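A rough sketch of that filtering idea, in plain Python. The `Candidate` class, `drop_incomplete_overlaps`, and `span_of` are hypothetical names made up for illustration and are not part of Eyecite; "incomplete" is assumed here to mean a missing volume or page.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for a parsed citation; not an Eyecite class."""
    start: int  # inclusive offset into the text
    end: int    # exclusive offset into the text
    volume: Optional[str]
    reporter: str
    page: Optional[str]

    def is_complete(self) -> bool:
        # Assumption: a citation is complete when it has a volume and a page.
        return self.volume is not None and self.page is not None

    def overlaps(self, other: "Candidate") -> bool:
        # Two half-open spans overlap when each starts before the other ends.
        return self.start < other.end and other.start < self.end


def drop_incomplete_overlaps(candidates: List[Candidate]) -> List[Candidate]:
    """Drop incomplete candidates that overlap a complete one."""
    kept = []
    for cand in candidates:
        beaten = (not cand.is_complete()) and any(
            other is not cand and other.is_complete() and cand.overlaps(other)
            for other in candidates
        )
        if not beaten:
            kept.append(cand)
    return kept


def span_of(text: str, fragment: str) -> Tuple[int, int]:
    """Return the (start, end) offsets of the first occurrence of fragment."""
    start = text.index(fragment)
    return start, start + len(fragment)


text = "Shapiro v. Thompson, 394 U. S. 618"
candidates = [
    # The spans below are assumed for the example, not taken from Eyecite.
    Candidate(*span_of(text, "Thompson, 394"),
              volume=None, reporter="Thompson", page="394"),
    Candidate(*span_of(text, "394 U. S. 618"),
              volume="394", reporter="U.S.", page="618"),
]
result = drop_incomplete_overlaps(candidates)
print(result)
```

With these assumed spans, only the 'U.S.' candidate survives, because the 'Thompson' candidate has no volume and shares "394" with a complete citation. A real fix would need to hook into wherever Eyecite resolves overlapping tokens.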
In issue #3924, we identified a bug in Eyecite's parsing method when the defendant's last name is 'Thompson'.
For example, for the citation 'Shapiro v. Thompson, 394 U. S. 618':
Expected output: volume: 394, reporter: 'U.S.', page: '618'
Actual output: volume: None, reporter: 'Thompson', page: '394'
Other examples of inputs that are incorrectly parsed are 'Adams v. Thompson, 560 F. Supp. 894' and 'Mozena v. Thompson, 44 A.2d 276'.
I've been using the first example to debug this issue, and noticed that Eyecite identifies two tokens within the input string: "Thompson's Unreported Cases (TN)" and "United States Supreme Court Reports". The problem arises because these tokens overlap (both include "394") and Eyecite's tokenize method prioritizes the rightmost token when encountering overlaps, leading to this result.
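To make the overlap concrete, here is a small standalone illustration using Python's `re` module. The two patterns are simplified, hypothetical stand-ins written for this example; they are not Eyecite's actual reporter regexes, and the snippet does not reproduce Eyecite's tokenizer.

```python
import re

text = "Shapiro v. Thompson, 394 U. S. 618"

# Simplified, hypothetical patterns for illustration only.
PATTERNS = {
    "Thompson": re.compile(r"(?P<reporter>Thompson),? (?P<page>\d+)"),
    "U.S.": re.compile(r"(?P<volume>\d+) (?P<reporter>U\. ?S\.) (?P<page>\d+)"),
}

# Find where each pattern matches in the text.
matches = {name: pat.search(text) for name, pat in PATTERNS.items()}
spans = {name: m.span() for name, m in matches.items()}

# The shared region is the intersection of the two half-open spans.
(a_start, a_end), (b_start, b_end) = spans["Thompson"], spans["U.S."]
shared = text[max(a_start, b_start):min(a_end, b_end)]

print(spans)
print(repr(shared))
```

Under these assumed patterns, both matches claim the characters "394", once as a page and once as a volume, which is the kind of overlap described above.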