Add State Attorney General Opinion Scrapers #168

Open
2 tasks done
mlissner opened this issue Dec 21, 2016 · 10 comments · May be fixed by #647

Comments

@mlissner
Member

mlissner commented Dec 21, 2016

Two top level tasks here:

  • Trawl the internet and find all the available sources.

  • Make the scrapers.

I'll develop a list below of all scrapers we want to build.

@arderyp
Collaborator

arderyp commented Jan 20, 2017

Maybe this should be a living ticket where we update the list of new scrapers to add. If so, maybe we could rename this issue, or create a new issue to handle the living list. Do you think we should create separate child issues for each new scraper we want to add, so we can close those when done? It would be helpful when adding to our living list to indicate:

  1. Resource URL
  2. Desired module name for new scraper (juriscraper.opinions.whatever.new_scraper_name)
  3. Link to child issue for this specific new scraper, if child issues are something you want

Maybe it could be a 3 column table that we add new desired scrapers to with the info above. If this sounds reasonable, could you update this issue (or create new issue for living list) to include this new format/info for your Maryland example above? And if we want this to be a living issue with links to sub issues, could you also add the new scrapers from #167 to the list here with the pertinent 1-3 info mentioned above?

@mlissner
Member Author

mlissner commented Jan 20, 2017

I like the idea of having one ticket per court, but obviously I occasionally make bigger tickets like this one. We've also used the GitHub wiki in the past for this purpose (I think there's still a page with a gazillion court links).

Dunno. I'd say let's keep this one focused on State AG. Closing tickets is nice, and keeping them short is nice too.

So, in that vein, here's the info we need:

| Done | State | Link | Module Name |
|------|-------|------|-------------|
| | Alabama | http://www.ago.state.al.us/Opinions.aspx | alaag |
| | Alaska | http://www.law.state.ak.us/doclibrary/opinions-index/opinions_chron.html | arkag |
| | Arizona | https://www.azag.gov/ag-opinions | azag |
| | Arkansas | http://www.arkansasag.gov/opinions/index.php | arizag |
| | California | https://oag.ca.gov/opinions | calag |
| | Colorado | https://coag.gov/resources/formal-ag-opinions | coloag |
| | Connecticut | http://www.ct.gov/ag/cwp/browse.asp?a=1770 | connag |
| | Delaware | http://opinions.attorneygeneral.delaware.gov/ | delag |
| | D.C. | https://oag.dc.gov/page/oagcorporation-counsel-opinions | dcag |
| | Florida | http://myfloridalegal.com/opinions | flaag |
| | Georgia | http://law.ga.gov/opinions | gaag |
| | Guam | http://www.guamag.org/ (but down at present) | guamag |
| | Hawaii | http://ag.hawaii.gov/publications/opinions/ | haag |
| | Idaho | http://www.ag.idaho.gov/publications/op-guide-cert_index.html | idahoag |
| | Illinois | http://www.ag.state.il.us/opinions/ | illag |
| | Indiana | http://www.in.gov/attorneygeneral/2352.htm | indag |
| | Iowa | https://www.iowaattorneygeneral.gov/about-us/attorney-general-opinions/ | iowaag |
| | Kansas | http://ag.ks.gov/media-center/ag-opinions | kanag |
| | Kentucky | http://ag.ky.gov/civil/civil-enviro/opinions/Pages/default.aspx | kyag |
| | Louisiana | https://www.ag.state.la.us/Opinions | laag |
| | Maine | https://www.maine.gov/ag/about/ag_opinions.html | meag |
| | Maryland | http://www.marylandattorneygeneral.gov/Pages/Opinions/index.aspx | mdag |
| | Mass. | http://www.mass.gov/ago/government-resources/ags-opinions/ | massag |
| | Michigan | http://www.ag.state.mi.us/opinion/opinions.aspx | michag |
| | Minnesota | http://www.ag.state.mn.us/office/opinions/ | minnag |
| | Mississippi | https://govt.westlaw.com/msag/Index?__lrTS=20170120220617535 (review TOS?) | missag |
| | Missouri | https://ago.mo.gov/other-resources/ag-opinions | moag |
| | Montana | https://dojmt.gov/agooffice/attorney-generals-opinions/ | montag |
| | Nebraska | https://ago.nebraska.gov/ag_opinion | nebag |
| | Nevada | http://ag.nv.gov/Publications/Opinions/ | nevag |
| | New Hampshire | http://www.doj.nh.gov/media-center/opinions.htm | nhag |
| | New Jersey | http://nj.gov/oag/ag-opinions.htm | njag |
| | New Mexico | http://public-records.nmag.gov/opinions | nmag |
| | New York | https://ag.ny.gov/appeals-and-opinions/numerical-index | nyag |
| | N. Carolina | http://www.ncdoj.gov/getdoc/683dc0b7-ad27-4dbd-8ea5-c9109ff016a0/Legal-Opinions.aspx | ncag |
| | N. Dakota | https://attorneygeneral.nd.gov/attorney-generals-office/legal-opinions/opinion-search | ndag |
| | N. Mariana Islands | No online presence found | nmiag |
| | Ohio | http://www.ohioattorneygeneral.gov/About-AG/Service-Divisions/Opinions/Opinions-Archive | ohag |
| | Oklahoma | https://www.ok.gov/oag/Legal_Resources/AG_Opinions.html | oklaag |
| | Oregon | http://www.doj.state.or.us/agoffice/pages/index.aspx | orag |
| | Penn. | https://www.attorneygeneral.gov/The_Office/Official_Attorney_General_Opinions/ | paag |
| | Puerto Rico | http://www.justicia1.pr.gov/ordenesa/opiniones.aspx (Spanish) | prag |
| | Rhode Island | Homepage, but cannot find opinions: http://www.riag.ri.gov/ | riag |
| | S. Carolina | http://www.scag.gov/opinions | scag |
| | S. Dakota | http://atg.sd.gov/OurOffice/OfficialOpinions/opinions.aspx | sdag |
| | Tennessee | https://www.tn.gov/attorneygeneral/topic/attorney-general-opinions | tennag |
| | Texas | https://texasattorneygeneral.gov/opinion/index-to-opinions | texag |
| | Utah | 1969-present on microfiche, but that's it! https://archives.utah.gov/research/inventories/20369.html | utahag |
| | Vermont | There's 12 between 2000 and today: http://ago.vermont.gov/divisions/about-the-attorney-generals-office/attorney-general-opinions.php More here: http://libguides.vermontlaw.edu/vermontlawguide/vermontadministrativelaw and here: http://llmc.com/titledescfull.aspx?type=6&coll=49&div=195&set=08140 (paywall?) | vtag |
| | Virgin Islands | Homepage, but no opinions: http://usvidoj.codemeta.com/DivisionContent_1.php?divId=84 | viag |
| | Virginia | http://ag.virginia.gov/citizen-resources/opinions/official-opinions | vaag |
| | Washington | http://www.atg.wa.gov/AGOOpinions/opinion | washag |
| | West Virginia | http://www.ago.wv.gov/publicresources/Attorney%20General%20Opinions/Pages/default.aspx | wvaag |
| | Wisconsin | https://docs.legis.wisconsin.gov/misc/oag | wisag |
| | Wyoming | http://ag.wyo.gov/formal-opinions | wyoag |

I'll also just point out that the module name will almost always correspond with the ID in CourtListener.

@mlissner
Member Author

If you're new here and can help, please say which scraper you're able to work on, and check out the readme to get started.

@arderyp
Collaborator

arderyp commented Jan 21, 2017

dear lord! Is there any strategy here, or just start working from the top?

I'll take care of the others in #167 first so we can close that ticket.

opinions/united_states/federal_special/ag.py
opinions/united_states/state_special/mdag.py

@mlissner
Member Author

mlissner commented Jan 21, 2017 via email

arderyp added a commit that referenced this issue Feb 4, 2017
…cludes a backscraper that should be run after deployment. Related to #168
arderyp added a commit that referenced this issue Feb 4, 2017
…is included and should be run after deployment. It will capture 533 cases and take only about 1 second to run. Relates to #168
arderyp added a commit that referenced this issue Feb 5, 2017
…f HTML variation across the history of opinions, so I've added multiple example files for coverage. A backscraper is included and should be run after deployment, taking around 2 minutes and yielding 18,377 cases. Relates to #168
@dmvj
Contributor

dmvj commented Apr 8, 2021

I don't know if this is useful or not, but the Alabama AG has opinions going back to 1979. They're numbered differently depending on period:

When searching by opinion number use the following formats:
Opinions 79-00001 to 95-00338, use format yynnnnn (ex: 8500001)
Opinions 96-00001 to 99-00290, use format yy-nnnnn (ex: 96-00331)
Opinions 2000-001 to Present, use format yyyy-nnn (ex: 2000-025)

And these are the correct ranges from 1994-2000:

9400001 - 9400267
9500001 - 9500338
96-00001 - 96-00331
97-00001 - 97-00298
98-00001 - 98-00225
99-00001 - 99-00290
2000-001 - 2000-252

The PDFs are named according to the above scheme. So, examples from the three different date formats:

https://www.alabamaag.gov/Documents/opin/9400001.pdf
https://www.alabamaag.gov/Documents/opin/97-00001.pdf
https://www.alabamaag.gov/Documents/opin/2000-003.pdf
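
For anyone picking up a backscraper, the numbering scheme above could be turned into PDF URLs roughly like this. This is only a sketch: the function names and the `LAST_NUMBER` table are made up for illustration (they are not part of juriscraper), the base URL and ranges are copied from the examples in this comment, and it assumes every number in a range maps to a PDF, which has not been checked.

```python
# Base URL taken from the example links above.
BASE_URL = "https://www.alabamaag.gov/Documents/opin/"

# Highest opinion number per year, copied from the 1994-2000 ranges above.
# (Hypothetical lookup table; other years would need to be filled in.)
LAST_NUMBER = {
    1994: 267, 1995: 338, 1996: 331, 1997: 298,
    1998: 225, 1999: 290, 2000: 252,
}


def opinion_id(year: int, number: int) -> str:
    """Format an opinion number using the scheme for its period."""
    if year < 1979:
        raise ValueError("Opinions only go back to 1979")
    if year <= 1995:
        # yynnnnn, e.g. 8500001
        return f"{year % 100:02d}{number:05d}"
    if year <= 1999:
        # yy-nnnnn, e.g. 96-00331
        return f"{year % 100:02d}-{number:05d}"
    # yyyy-nnn, e.g. 2000-025
    return f"{year}-{number:03d}"


def pdf_url(year: int, number: int) -> str:
    """Build the PDF URL for one opinion."""
    return f"{BASE_URL}{opinion_id(year, number)}.pdf"


def iter_pdf_urls(year: int):
    """Yield candidate PDF URLs for every opinion number in a known year."""
    for number in range(1, LAST_NUMBER[year] + 1):
        yield pdf_url(year, number)
```

A backscraper would probably still want to tolerate 404s, since gaps in the numbering are possible.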

@mlissner
Member Author

mlissner commented Apr 8, 2021

That's great. One day, perhaps, we'll get on this, but absent a volunteer picking it up, it's outside of our budget to do this work for the moment.

@dmvj
Contributor

dmvj commented Apr 8, 2021

I'm just learning Python, so maybe if I ever get to a place where I understand *args and **kwargs, I can help. But at least the information's there now. :)

flooie linked a pull request Feb 1, 2023 that will close this issue
@flooie
Contributor

flooie commented Feb 1, 2023

Notes on AGs

Missing

  • Massachusetts - no longer issued

  • Mississippi - Westlaw

  • Iowa - Westlaw

  • Rhode Island - None Found

  • Utah - None Found

  • Wyoming - Google drive

  • New Mexico - None Found

  • Virgin Islands - None found

  • Puerto Rico - None found

  • Guam - Google drive

I suppose the Google Drive ones could maybe be tackled with Selenium, but I wasn't up for figuring that out.
Massachusetts does issue opinions on meetings, but it's not the same thing.

Rhode Island is a mystery because I do think they exist, but I don't know where.

Future Notes

This was the bare minimum, and it didn't set up back scraping of opinions.
Also, it looks to me like lots of offices have scaled back, or they post these very intermittently.
Some may go years between posts and some haven't posted in years, but I set up the scrapers anyway.

@flooie
Contributor

flooie commented Feb 1, 2023

I also moved all the AG scrapers into a new folder juriscraper/opinions/united_states/attorney_general

And moved all the previously added ones into that directory

Labels
None yet
Projects
Status: State Trial/AG
Status: Scraper Coding

4 participants