Addresses #125 by adding some missing members of 115th Congress #132
This is very impressive work. I especially appreciate the addition of metadata documenting the source of each image -- that's crucial, if we're not getting them from GPO.
I do have some concerns about copyright for some of these, unfortunately. Photos from house.gov and senate.gov sites should be fine. But the photos from campaign websites have no guarantees around being public domain, and are likely to be copyrighted by the photographer.
The metadata for photos from Wikipedia/Wikimedia is a mixed bag. They're cited in a few different ways -- by the Wikipedia article, by the Wikimedia Commons URL, or the Wikipedia file URL. In each of these cases, Wikipedia is getting them from an underlying source, and our metadata should link to the underlying source instead. For example, https://commons.wikimedia.org/wiki/File:Scott_Taylor_(politician).jpg is from a Facebook album from a House committee, so that would be the appropriate link. (And I would also feel comfortable with the public domain status of that photo.) However, it's impossible to tell from the Wikipedia links where these come from.
We do want public domain here, because we want even commercial uses of this data to be unencumbered, and for there just to be as few questions as possible.
So, while respecting all the work that has gone into this, I would like to see the PR drop photos from campaign sources, and to replace the Wikipedia links with the underlying source that Wikipedia drew it from.
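For anyone who wants to pre-screen metadata links before a manual review, here is a minimal sketch of the rule of thumb above (house.gov and senate.gov are fine, everything else needs a human look). The function name `classify_source` and the two labels are hypothetical and not part of this repo; a "needs-review" result does not mean the image is rejected, only that its provenance is not self-evident from the host name.

```python
from urllib.parse import urlparse

# Assumption: only these two federal domains are treated as safe without review,
# following the comment above. The list is illustrative, not a repo policy file.
FEDERAL_SUFFIXES = ("house.gov", "senate.gov")


def classify_source(link: str) -> str:
    """Return 'federal' for house.gov/senate.gov hosts, otherwise 'needs-review'."""
    host = (urlparse(link).hostname or "").lower()
    for suffix in FEDERAL_SUFFIXES:
        # Match the domain itself or a subdomain; requiring the leading dot
        # keeps look-alike hosts such as myfloridahouse.gov from matching.
        if host == suffix or host.endswith("." + suffix):
            return "federal"
    return "needs-review"
```

Matching on a dot boundary rather than a bare suffix matters here, because a state legislature host can end in the same letters as house.gov without being a federal site.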
@konklone thank you for the detailed feedback. We will refine our PR as described:
Will get back to you soon.
Unfortunately, I have to recommend removing the Ballotpedia one, since they took it from his campaign. Inspired by your links from dems.gov, I did also take a look at https://www.gop.gov/about/members/, but Mike Gallagher's not there either. I think he might just have to stand empty unless an official photo of him can be found from another source.
That's Gallagher, gone. Mike Johnson (J000299) is also sourced from Ballotpedia.
congress/metadata/G000580.yaml
Outdated
@@ -0,0 +1,2 @@
name: Ballotpedia
link: https://ballotpedia.org/Vicente_Gonz%C3%A1lez
Confirmed through reverse image search: this one is also from Gonzalez's campaign.
congress/metadata/J000299.yaml
Outdated
@@ -0,0 +1,2 @@
name: Ballotpedia
link: https://ballotpedia.org/Mike_Johnson_(Louisiana)
Confirmed through reverse image search, also from the campaign. :/
congress/metadata/G000581.yaml
Outdated
@@ -0,0 +1,2 @@
name: Facebook
link: https://www.facebook.com/votevicente/
I'm not sure why this is two separate bioguide IDs -- the images for G000580 and G000581 are both for Vicente Gonzalez. Also, this one is from a campaign Facebook page.
G000580 should have been Thomas Garrett, I think.
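A mix-up like this one (the same person's photo filed under two bioguide IDs) can sometimes be caught mechanically. The sketch below groups `.jpg` files by content hash and reports IDs that share identical bytes. It assumes files are named `<BIOGUIDE>.jpg` in a single directory, and `find_duplicate_images` is a hypothetical helper, not a script in this repo. It only catches byte-identical files; two different photos of the same person would not be flagged.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_images(directory):
    """Return lists of file stems (assumed bioguide IDs) whose .jpg bytes are identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(directory).glob("*.jpg")):
        # Hash the full file contents; identical bytes give identical digests.
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        by_digest[digest].append(path.stem)
    # Keep only digests shared by more than one file.
    return [ids for ids in by_digest.values() if len(ids) > 1]
```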
congress/metadata/G000583.yaml
Outdated
@@ -0,0 +1,2 @@
name: Twitter
link: https://pbs.twimg.com/profile_images/735547603912433664/TIW7-Jzs.jpg
This image appears to be from the member's campaign. Two candidate replacements:
- Header from official page: https://gottheimer.house.gov/images/jg-headshot.png
- Avatar from https://twitter.com/RepJoshG
congress/metadata/H001077.yaml
Outdated
@@ -0,0 +1,2 @@
name: Facebook
link: https://www.facebook.com/captclayhiggins/
This Facebook page is the member's campaign account.
congress/metadata/K000390.yaml
Outdated
@@ -0,0 +1,2 @@
name: Votesmart
link: http://votesmartnv.org/us-congress/ruben-kihuen/
This is from the campaign. Candidates for replacement:
- Home page header: https://kihuen.house.gov/images/headshot-web.png
- Twitter avatar: https://twitter.com/RepKihuen
congress/metadata/L000587.yaml
Outdated
@@ -0,0 +1,2 @@
name: Genesis Communications Network
link: https://commons.wikimedia.org/w/index.php?title=User:Genesis_Communications_Network&action=edit&redlink=1
This is a link to the edit form for a user page, not to the image or its source.
congress/metadata/L000586.yaml
Outdated
@@ -0,0 +1,2 @@
name: Al Lawson Campaign - Own work
link: https://upload.wikimedia.org/wikipedia/commons/e/e2/Al_Lawson.jpg
The page with details is https://en.wikipedia.org/wiki/File:Al_Lawson.jpg, and it says the campaign donated it with a CC-BY-SA 4.0 license. That's better than usual, but we still need to look for a public domain image and not pass attribution requirements on down to downstream users.
Triaging remaining comments here:
Looks like the following are the images this PR tried to add but then removed due to attribution problems:
updated (described in later comment)
congress/metadata/B001299.yaml
Outdated
@@ -0,0 +1,2 @@
name: Gage Skidmore
link: https://commons.wikimedia.org/wiki/File:Jim_Banks_by_Gage_Skidmore.jpg
This has a CC-BY-SA 3.0 license attached, and needs to be replaced with a public domain version.
@konklone once this is suitable, I can squash so @unitedstates doesn't have to keep the git history for all the images we've rejected.
convert_to_jpg.sh
Outdated
convert $file $filename.jpg # Convert to a jpg
rm $file
fi
done
If you'd like to keep this script in the repo, please move it to the scripts/ directory. Thank you.
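Only the tail of the conversion script is visible in the hunk above, so its full logic is not known. As a rough illustration of the same idea (convert each PNG or JPEG original to a `.jpg` of the same name with ImageMagick's `convert`, then remove the original), here is a sketch in Python. The function names, the set of extensions, and the directory layout are assumptions for illustration, not the contents of `convert_to_jpg.sh`.

```python
import subprocess
from pathlib import Path

# Assumption: only these extensions need converting; .jpg files are left alone.
CONVERTIBLE = {".png", ".jpeg"}


def plan_conversions(directory):
    """Return (source, target) path pairs for files that should become <name>.jpg."""
    pairs = []
    for path in sorted(Path(directory).iterdir()):
        if path.suffix.lower() in CONVERTIBLE:
            pairs.append((path, path.with_suffix(".jpg")))
    return pairs


def convert_command(source, target):
    """Build the ImageMagick command line: convert <input> <output>."""
    return ["convert", str(source), str(target)]


def run_conversions(directory):
    """Convert each planned file, deleting the original only after success."""
    for source, target in plan_conversions(directory):
        # check=True raises if convert fails, so the original is not deleted.
        subprocess.run(convert_command(source, target), check=True)
        source.unlink()
```

Separating the planning step from the execution step makes it possible to print or review the list of conversions before any file is touched.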
congress/metadata/C001111.yaml
Outdated
@@ -0,0 +1,2 @@
name: LWVSPA
link: https://lwvspa.org/congressional-representatives-pinellas-county/
I can't figure out where Crist's image ultimately comes from here, so it's not safe to assume it's free of copyright. Options:
- http://crist.house.gov/images/crist-head.jpg
- https://twitter.com/repcharliecrist avatar (seems like the same)
(BTW, sorry for the delays here -- I'm doing this in my spare time at the end of evenings.)
congress/metadata/G000578.yaml
Outdated
@@ -0,0 +1,2 @@
name: Florida House of Representatives
link: http://www.myfloridahouse.gov/Sections/Representatives/representatives.aspx?SortField=district&SortDirection=asc&SessionId=73
Ugh, sorry for this, I know this is a .gov, but state governments don't have the same guarantee of public domain as the federal government. I can't find a quick replacement for this one except for the (bad) avatar at https://twitter.com/Rep_Matt_Gaetz.
congress/metadata/L000587.yaml
Outdated
@@ -0,0 +1,2 @@
name: Genesis Communications Network
link: https://commons.wikimedia.org/wiki/File:Jason_Lewis.png
This is licensed under CC-BY-SA 3.0, not public domain.
congress/metadata/M001200.yaml
Outdated
@@ -0,0 +1,2 @@
name: JCWilmore - Own work
link: https://commons.wikimedia.org/wiki/User:JCWilmore
This link doesn't contain the image in question. I would recommend looking for an official .gov source instead of Wikipedia anyway.
OK, I've completed my review, and the rest of the URLs look like they are from official government sources. In the future, it would be better to do a review like this ahead of time (this was fairly intensive). I'm going to add an issue and PR template to make it more clear that the images must be public domain, to hopefully avoid this going forward.
@hugovk You're the better maintainer to review this and merge.
And yeah, that would be great.
e4207cf to 039d6b0
@konklone, thank you for the additional work in preparing this PR. I've squashed our work so @unitedstates will not need to retain the rejected images in its git history. I've preserved that git history in a branch in case anyone wants or needs any of it later. Further feedback is welcome; I'm happy to continue amending these changes until they are satisfactory. Thanks.
- We wanted to help get it up to date
- Many thanks to @woobietuesday, @DesmondJones for helping to gather images
- We worked from this list https://gist.github.com/rthbound/9abeca9b0c4890d58f66d29b3e75bafe
- Hope this helps!
Adds originals. These need to be converted to .jpg of same name (for those that are jpeg or png)
Switch to images for which attributions can be found for some being added
Metadata for new additions
Converts updated originals, c/o @hiemanshu - also adds imagemagick script used in converting originals to the prescribed <BIOGUIDE>.jpg format
Refine Wikipedia/Wikimedia attributions
Improve source for R000606
Improve source for B001303
Improve source for B001300
Improve source for K000389
Improve source for B001301
Improve source for C001110
Improve source for G000579
Delete images for T000478 until public domain images can be found.
Improve source for J000298
Delete images for R000606 until public domain images can be found.
Delete images for J000298 until public domain images can be found.
Removes Ballotpedia sourced image taken from campaign photos
Delete images for G000580 until public domain images can be found.
Delete images for J000299 until public domain images can be found.
Delete images for H001077 until public domain images can be found.
Delete images for G000581 until public domain images can be found.
Delete images for L000586 until public domain images can be found.
Improve L000587 attribution
Adds G000580
Changes K000390, using twitter avatar as source
Changes G000583, previous choice was from a campaign photo
Use house.gov source for B001299
Relocate @hiemanshu's (png|jpeg) -> jpg script to ./scripts/
Use public domain image for C001111
Use public domain image for G000578
Omit M001200 until a public domain image can be found
Remove L000587 until a public domain replacement can be found
039d6b0 to d93b20c
@konklone Looks ok to me, feel free to merge after making any final checks.
Thank you @rthbound, @woobietuesday, @DesmondJones, @hiemanshu, @ProgressiveCoders and @konklone!
A group on ProgCode is using this repo
Adds originals.
Switch to images for which attributions can be found for some being added.
Metadata for new additions
Converts updated originals, c/o @hiemanshu, to the prescribed .jpg format
Addresses #125.