Don't set court='scotus' for South Carolina citations #84

mattdahl · 2021-06-26T20:45:10Z

Eyecite thinks that South Carolina citations are SCOTUS citations:

from eyecite import get_citations
text = 'Lee County School Dist. No. 1 v. Gardner,  263 F.Supp. 26 (SC 1967)'
cites = get_citations(text)
cites[0].metadata.court

# prints 'scotus'

The SC in the year could be ambiguous, but the F.Supp. reporter should automatically rule SCOTUS out as a possibility for the court here.
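As a minimal sketch of that rule (not eyecite's actual implementation): read an ambiguous parenthetical such as SC as SCOTUS only when the reporter is itself a Supreme Court reporter. The function `resolve_court`, the `is_scotus_reporter` argument, and the lookup table are hypothetical names invented for illustration, and the returned labels are placeholders rather than real court identifiers.

```python
from typing import Optional

# Hypothetical table for illustration only; not eyecite's real court data.
AMBIGUOUS_ABBREVIATIONS = {
    "SC": {"scotus_reading": "scotus", "other_reading": "South Carolina"},
}


def resolve_court(paren_court: str, is_scotus_reporter: bool) -> Optional[str]:
    """Pick a court label for a parenthetical string such as 'SC'."""
    readings = AMBIGUOUS_ABBREVIATIONS.get(paren_court.strip().upper())
    if readings is None:
        return None  # not an abbreviation this sketch knows about
    if is_scotus_reporter:
        return readings["scotus_reading"]
    # A non-SCOTUS reporter such as F.Supp. rules the SCOTUS reading out.
    return readings["other_reading"]
```

With F.Supp. as the reporter, `resolve_court("SC", is_scotus_reporter=False)` would give the South Carolina reading.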


devlux76 · 2021-12-20T13:31:38Z

You will never see SC in the year field of a Supreme Court decision (if the person who wrote it is citing things properly).
The court is only included when the relevant court is unclear from the reporter cited.
The Supreme Court will always be cited to the U.S. or the S.Ct. reporters unless it's a slip opinion, so there should never be any ambiguity.

Bluebook R. 10.4(b) State courts.

In general, indicate the state and court of decision. However, do not include the name of the court if the court of decision is the highest court of the state.
The Bluebook: A Uniform System of Citation R. 10.4(b), at 106 (Columbia L. Rev. Ass’n et al. eds., 21st ed. 2020).

mlissner · 2021-12-21T20:08:29Z

Thanks @devlux76. This looks like a great first bug. Any interest in trying to tackle it with a test and a fix?

devlux76 · 2021-12-21T21:29:30Z

Sure! Do we know where this citation came from?
A search of "263 F.Supp. 26" on Westlaw shows just 5 citing cases.

Bowen v. Massachusetts, 487 U.S. 879, 108 S. Ct. 2722, 101 L. Ed. 2d 749 (1988)

Bd. of Pub. Instruction of Palm Beach Cty., Fla. v. Cohen, 413 F.2d 1201 (5th Cir. 1969)

Fort Sumter Tours, Inc. v. Andrus, 440 F. Supp. 914 (D.S.C.), aff'd, 564 F.2d 1119 (4th Cir. 1977)

Mandel v. U.S. Dep't of Health, Ed. & Welfare, 411 F. Supp. 542 (D. Md. 1976), rev'd in part, vacated in part sub nom. Mayor & City Council of Baltimore v. Mathews, 562 F.2d 914 (4th Cir. 1977), opinion withdrawn and superseded on reh'g, 571 F.2d 1273 (4th Cir. 1978), and aff'd sub nom. Mayor & City Council of Baltimore v. Mathews, 571 F.2d 1273 (4th Cir. 1978)

Cites above are from Westlaw, but links are to CourtListener

But I don't see any of them with the above citation just doing a fulltext search.
Is this coming from secondary authority of some kind?

mlissner · 2021-12-21T22:14:15Z

That'd be a question for @jcushman, but I suspect he wouldn't know anymore. At this point, it's worth just running with the example he gave. I'd make a test using it, make sure the test fails, then write the code to fix it.

mattdahl · 2021-12-22T00:28:34Z

The one I encountered it in was the Bowen case. Not sure if I got my version from CourtListener or Lexis, but in the CourtListener one you link you'll see it if you search 263 F. Supp. 26 (SC 1967). Thanks for working on this!!

flooie · 2025-01-13T19:18:58Z

Let's review this during this sprint to see whether it is still occurring.

quevon24 · 2025-01-15T18:43:49Z

It seems that it doesn't just fail there: the plaintiff is also incorrect. It returns 1 instead of Lee County School Dist. No. 1.

If you try to parse something like 'Foo 12334 v. Bar, 1 U.S. 1', the plaintiff comes back as only the number, while the defendant is correct.

quevon24 · 2025-01-17T20:54:57Z

I think this happens because the current approach to getting the plaintiff name is to take the two elements before v., and the problem is that spaces count as elements.

For example, if we pass this string Smith v. Bar, 263 F.Supp. 26 (SC 1967) we get this list of "words":

['Smith', ' ', StopWordToken(data='v.', start=6, end=8, groups={'stop_word': 'v'}), ' ', 'Bar,', ' ', CitationToken(data='263 F.Supp. 26', start=14, end=28, groups={'volume': '263', 'reporter': 'F.Supp.', 'page': '26'}, exact_editions=(), variation_editions=(Edition(reporter=Reporter(short_name='F. Supp.', name='Federal Supplement', cite_type='federal', source='reporters', is_scotus=False), short_name='F. Supp.', start=datetime.datetime(1932, 1, 1, 0, 0), end=datetime.datetime(1988, 12, 31, 0, 0)),), short=False), ' ', '(SC', ' ', '1967)']

The current algorithm looks for the stop word v. (which means there is a plaintiff). We use the stop word's index in the word list to take the two preceding elements, in this case ['Smith', ' '], which is correct. Here:

if isinstance(word, StopWordToken):
    if word.groups["stop_word"] == "v" and index > 0:
        citation.metadata.plaintiff = "".join(
            str(w) for w in words[max(index - 2, 0) : index]
        ).strip()

But it fails when the plaintiff has more than one word, for example Lee County School Dist. No. 1 in Lee County School Dist. No. 1 v. Gardner, 263 F.Supp. 26 (SC 1967).

The algorithm produces this word list:

['Lee', ' ', 'County', ' ', 'School', ' ', 'Dist.', ' ', 'No.', ' ', '1', ' ', StopWordToken(data='v.', start=30, end=32, groups={'stop_word': 'v'}), ' ', 'Gardner,', ' ', CitationToken(data='263 F.Supp. 26', start=42, end=56, groups={'volume': '263', 'reporter': 'F.Supp.', 'page': '26'}, exact_editions=(), variation_editions=(Edition(reporter=Reporter(short_name='F. Supp.', name='Federal Supplement', cite_type='federal', source='reporters', is_scotus=False), short_name='F. Supp.', start=datetime.datetime(1932, 1, 1, 0, 0), end=datetime.datetime(1988, 12, 31, 0, 0)),), short=False), ' ', '(SC', ' ', '1967)']

and the two elements before v. are: ['1', ' ']
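The behavior above can be reproduced without eyecite. This is only a sketch: it uses plain strings, with the literal "v." standing in for the StopWordToken, and the helper name `two_element_plaintiff` is made up for illustration. The slice itself is the one from the snippet quoted earlier.

```python
from typing import List


def two_element_plaintiff(words: List[str]) -> str:
    """Apply the same two-element slice, using 'v.' as a stand-in stop word."""
    index = words.index("v.")
    # Spaces are list elements too, so two elements cover one word at most.
    return "".join(words[max(index - 2, 0):index]).strip()


SMITH = ["Smith", " ", "v.", " ", "Bar,"]
LEE = ["Lee", " ", "County", " ", "School", " ", "Dist.", " ",
       "No.", " ", "1", " ", "v.", " ", "Gardner,"]

print(two_element_plaintiff(SMITH))  # Smith
print(two_element_plaintiff(LEE))    # 1
```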

I'm guessing this was set to two elements before v. because it is common for plaintiffs to have short names.

That's why I'm thinking about how we can adjust this.
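One possible adjustment, as a rough sketch and not the fix that was eventually adopted: walk backwards from the stop word and keep collecting until a boundary is hit. Here the boundary is another token or an ordinary lowercase word such as "see". `FakeStopWord`, `CONNECTORS`, and `plaintiff_before` are hypothetical names, and the capitalization heuristic is an assumption that will misfire on some real names.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class FakeStopWord:
    """Hypothetical stand-in for eyecite's StopWordToken."""
    data: str


Word = Union[str, FakeStopWord]

# Lowercase words that are allowed inside a party name (assumed list).
CONNECTORS = {"of", "the", "and", "for"}


def plaintiff_before(words: List[Word], index: int) -> str:
    """Collect the words before words[index], walking back to a boundary."""
    collected: List[str] = []
    for word in reversed(words[:index]):
        if not isinstance(word, str):
            break  # another token: a name cannot span it
        stripped = word.strip()
        if stripped and stripped[0].islower() and stripped not in CONNECTORS:
            break  # ordinary lowercase word, e.g. "see"
        collected.append(word)
    return "".join(reversed(collected)).strip()


words: List[Word] = ["see", " ", "Lee", " ", "County", " ", "School", " ",
                     "Dist.", " ", "No.", " ", "1", " ",
                     FakeStopWord("v."), " ", "Gardner,"]
stop_index = next(i for i, w in enumerate(words) if isinstance(w, FakeStopWord))
print(plaintiff_before(words, stop_index))  # Lee County School Dist. No. 1
```

A capitalized word that precedes the name, such as the first word of a sentence, would still be swept in, so a real fix would need a stricter boundary.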

quevon24 · 2025-01-24T19:29:11Z

I'll move this to a new issue so we can close this one. SC is now related to South Carolina instead of scotus, and a test case has been added.

mattdahl changed the title ~~Don't set court='scotus South Carolina cit~~ Don't set court='scotus' for South Carolina citations Jun 26, 2021

devlux76 added a commit to devlux76/eyecite that referenced this issue Dec 25, 2021

Addressing issue freelawproject#84

2590595

devlux76 mentioned this issue Dec 25, 2021

Don't set court='scotus' for South Carolina citations #84 #105

Open

github-project-automation bot added this to Citator Oct 12, 2024

github-project-automation bot added this to Case Law Sprint Nov 15, 2024

flooie moved this to General Backlog in Case Law Sprint Nov 19, 2024

flooie moved this from General Backlog to Backlog Dec 16 - Dec 27th in Case Law Sprint Dec 16, 2024

flooie moved this from Backlog Dec 16 - Dec 27th to To Do in Case Law Sprint Dec 17, 2024

flooie moved this from To Do to Buffer Zone in Case Law Sprint Jan 13, 2025

flooie assigned quevon24 Jan 13, 2025

flooie moved this from Buffer Zone to Backlog Jan 13 to Jan 24 in Case Law Sprint Jan 13, 2025

quevon24 closed this as completed Jan 24, 2025

github-project-automation bot moved this to Done in Citator Jan 24, 2025

github-project-automation bot moved this from Backlog Jan 13 to Jan 24 to Done in Case Law Sprint Jan 24, 2025
