Update the definition of 'identical' bills to encompass 'companion or antecedent' bills #465
Comments
PR #466 takes a step toward narrowing the definition, by only including bills that we identify as 'identical' and 'nearly identical'. As discussed above, that will eliminate most bills from previous congresses from the list of 'identical' bills.
Here's my best guess at what might help the antecedent search pick up
"this bill but from the prior Congress":
1. Identified by CRS (so we get the companion bill in the current Congress)
1b. For each companion bill that CRS has identified, we may also wish to
run the query for its antecedents. (And only look at prior congresses so we
don't create a recursive loop. :) )
2. Nearly identical text (should help us go across multiple Congresses, but
will miss some, but that's okay)
3. Same short title -- not additional titles. This should help us go across
multiple Congresses. This likely will lead to some false positives, but
hopefully there will not be too many false positives.
(a) Josh identified an approach of checking whether the first two
identified sponsors are the same as the first two identified sponsors of
the previous legislation. That would be very precise, but it could entail
significant effort.
(b) Perhaps a less intensive effort would be to compare the size of the
bill. You could look at the file size or the number of characters or words
in a bill. If they are within, say, 10% of each other in terms of size,
then it is more likely an antecedent bill. This would do a good job of
excluding cases where the bill we are examining is included in a much
larger bill, which is fine by me for the purposes of identifying an antecedent.
To narrow it further, you could also require at least one section to match.
Is this a useful approach?
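The short-title, prior-Congress, and size checks above could be sketched roughly as follows. This is a minimal sketch in Python, not BillMap's implementation: the `Bill` record, its field names, and the `is_likely_antecedent` helper are all hypothetical. The 10% tolerance is the figure suggested above, and size is measured in characters here (word count or file size would work the same way).

```python
from dataclasses import dataclass, field


@dataclass
class Bill:
    # Hypothetical record; field names are assumptions, not BillMap's schema.
    congress: int
    short_title: str
    text: str
    section_headings: set = field(default_factory=set)


def is_likely_antecedent(bill, candidate, tolerance=0.10, require_section_match=False):
    """Return True if `candidate` looks like an earlier version of `bill`."""
    # Only look at prior congresses, which also avoids a recursive loop.
    if candidate.congress >= bill.congress:
        return False
    # Same short title (case and surrounding whitespace ignored).
    if candidate.short_title.strip().lower() != bill.short_title.strip().lower():
        return False
    # Sizes within `tolerance` of each other, relative to the larger text.
    larger = max(len(bill.text), len(candidate.text))
    smaller = min(len(bill.text), len(candidate.text))
    if larger == 0 or (larger - smaller) / larger > tolerance:
        return False
    # Optional narrowing: at least one section heading in common.
    if require_section_match and not (bill.section_headings & candidate.section_headings):
        return False
    return True
```

A much larger bill that merely contains the text fails the size check, which matches the intent of keeping such bills out of the antecedent list.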
On Tue, Jun 22, 2021 at 4:40 PM Ari Hershowitz wrote:
PR #466 takes a step toward
narrowing the definition, by only including bills that we identify as
'identical' and 'nearly identical'. As discussed above, that will eliminate
most bills from previous congresses from the list of 'identical' bills.
aih changed the title from "Narrow the definition of 'identical' bills" to "Update the definition of 'identical' bills to encompass 'companion or antecedent' bills" on Jul 8, 2021
As discussed today on the call, the definition of 'identical' bills, used in the context section, should be narrower than it is now.
It should include: the bill itself, the companion bill (in the other chamber), bills identified by CRS as 'identical', and 'antecedent legislation'.
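As a rough illustration of that narrower definition, the four categories could be combined into a single de-duplicated list. This is only a sketch under assumptions: the function name, its parameters, and the bill identifier format in the usage note are hypothetical and are not taken from BillMap.

```python
def narrowed_identical_bills(bill_id, companion_id=None, crs_identical_ids=(), antecedent_ids=()):
    """Combine the four categories into one ordered list without duplicates."""
    ordered = [bill_id]  # the bill itself always comes first
    if companion_id is not None:
        ordered.append(companion_id)  # companion bill in the other chamber
    ordered.extend(crs_identical_ids)  # bills CRS identifies as 'identical'
    ordered.extend(antecedent_ids)  # antecedent legislation
    # dict.fromkeys keeps first-seen order while dropping repeats.
    return list(dict.fromkeys(ordered))
```

For example, a companion that CRS also lists as identical would appear only once in the result.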
Our current categories in the BillMap project do not easily map to 'antecedent' legislation. We find bills that are related in the following categories:
To identify antecedent legislation, Josh considers bills that have the same title for the whole bill + same sponsor. That is a more accurate measure, but would be a significant additional effort to implement in BillMap.
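The title-plus-sponsor check could look something like the sketch below. It assumes each bill is available as a dictionary with `congress`, `title`, and `sponsor` keys. Those keys and the helper names are hypothetical, and the effort of obtaining clean title and sponsor data in BillMap is not addressed here.

```python
def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(value.lower().split())


def find_antecedents(bill, earlier_bills):
    """Return bills from prior congresses with the same title and the same sponsor."""
    return [
        other
        for other in earlier_bills
        if other["congress"] < bill["congress"]
        and normalize(other["title"]) == normalize(bill["title"])
        and normalize(other["sponsor"]) == normalize(bill["sponsor"])
    ]
```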