A first pass at a third party disclosure page #399
source for symantec's privacy policy covering their work as a CA: http://www.symantec.com/content/en/us/about/media/repository/stn-cps.pdf
Wow, great. |
Starter text to provide some context about these sites. To a lay reader, calling them out seems alarming. Need to set the correct context that these are not unusual things.
So, I really like this. But I don't think it should be on its own page. This is for a few reasons:
(sorry, NASA - it's out of love)
Anyway, just my two cents. Agree this is really important to state, but I think we should investigate other ways of doing it first. Perhaps creating a merged Privacy Policy that includes all of this, and links to the associated PIA/TPWA documents. |
👍 This is a helpful addition for several reasons. It's transparent and considerate to our users, but I also like it as a further post about our technical process. @OriHoffer raised a good point though about the dynamic of the page, but I think you've handled that really well with the third paragraph. @polastre - On the subject of user-centered design, I agree with its central role but don't think that it precludes informed piloting. This page would be an important precedent on the .gov scene, possibly the first of its kind. It's based off of significant sector research on the current state of privacy concerns and if anything, I'd contrast it to the strong HTTPS work (and the text posted to explain it). That was an incredibly successful effort and more lightweight piloting like this seems well worth it.
@jtag added some language to make this seem less scary to users, and I edited it further in 88c71c7. So there's an additional paragraph there now:
@polastre @gbinal I also feel that it's all right for us to get out ahead of what the public is clamoring for here, and to gauge the response. In general, I think the 18F website gives us the perfect opportunity to pilot things, as a (relatively) low-traffic, non-critical piece of infrastructure that nonetheless is still a .gov and an important part of our brand. @seanherron I agree with you in principle, but the two main reasons I'd argue for shipping this as a separate page right now are:
I completely share your view about footer creep. We're spinning up a full on redesign of the 18F site in the near future, and we should make sure we structure our site in such a way that we have other places to put this sort of thing than in the footer. For example, in #397 I couldn't find a better easy place to put the Edit link than in the footer. It's like the worst possible place for it, but it's all that's there right now. Let's consider that issue separately, and talk about it as part of the redesign process. |
@konklone good points. What about instead making this a generic "Privacy" page, including this content, and linking to the GSA Privacy Policy? Then we can build from there as we go. If we're going to create a URL (in this case |
Mentioning our own privacy policy is solid sense. And I like I've updated the site in 643ce73. I've updated the screenshot in this PR's description with a current screenshot. |
👍 This sets us up to do a bigger privacy initiative at |
Is the use of Google Fonts significantly special/different from the way we use code libraries? |
It's not. If 18F used other CDNs to serve code (e.g. jQuery), we'd have to list those too. I've specifically removed all use of CDNs from the 18F homepage to remove third parties from the mix. If other sites do a third party services page like this, they'd need to either remove the CDN or add the CDN to the page. Taking a quick glance at the dashboard, I see it uses Font Awesome from a CDN. If the dashboard wants to update its footer to also link to this page, as it's written, it'd need to move the Font Awesome resources into the repo and not use the CDN. |
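For anyone who wants to check a page for this kind of dependency, here is a rough sketch of how one might list the outside hosts a page pulls resources from. It is only an illustration, not something in this PR: the class and function names are made up, and it only looks at `script`, `link`, `img`, and `iframe` tags in static HTML, so it would miss anything loaded by JavaScript or from inside a stylesheet, and it may flag `link` tags (such as canonical links) that don't actually trigger a fetch.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ThirdPartyFinder(HTMLParser):
    """Collect hosts of referenced resources that differ from the first-party host."""

    # Tag -> attribute that usually makes the browser fetch a resource.
    # (Assumption: this short list is enough for a first pass.)
    RESOURCE_ATTRS = {"script": "src", "link": "href", "img": "src", "iframe": "src"}

    def __init__(self, first_party_host):
        super().__init__()
        self.first_party_host = first_party_host
        self.third_party_hosts = set()

    def handle_starttag(self, tag, attrs):
        wanted = self.RESOURCE_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Relative URLs have no hostname and are treated as first party.
                host = urlparse(value).hostname
                if host and host != self.first_party_host:
                    self.third_party_hosts.add(host)


def third_party_hosts(html, first_party_host):
    """Return a sorted list of third-party hosts referenced in the HTML."""
    finder = ThirdPartyFinder(first_party_host)
    finder.feed(html)
    return sorted(finder.third_party_hosts)
```

Calling `third_party_hosts(page_html, "18f.gsa.gov")` on a saved copy of a page would give the list of hosts that would need to be either removed or disclosed under the approach described above.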
@polastre - Of course we can take our time to work through this (fwiw - in the issue would be helpful), but I don't necessarily agree that it's reasonable to block proposed content with a counterproposal of something several times more complex. On its merits, I think this pull request is a good one. Do I think it's the end all be all? No. But it's helpful progress. Instead of saying no to this until it goes through several more levels of work, it seems reasonable to focus on whether this is progress and to suggest further commits (or better, suggest pull requests).
How is the AWS privacy policy relevant to an end user of our services? It seems mainly (only?) applicable to a first-hand user of aws.amazon.com. |
This is a good point, and I'd like to find a way to handle this better. What I'm trying to communicate is that Amazon very much has technical access to the behavior and profile of visitors to websites hosted on their servers. Their privacy policy and TOS is written to assure us about how they treat that responsibility, but it's also the only document I can think of that is relevant to end users about how Amazon handles their data. I never see websites talk about their hosting companies as an intermediary between them and the user, which I think gives users an incomplete view of the parties being depended upon to behave well in an interaction with our website. Any suggestions for a better way of tackling this? |
I think linking to AWS's privacy policy is confusing/misleading at best. Symantec's privacy policy is similarly tangential. Frankly, I suspect this page is overkill for the vast majority of users-- which touches back on the points made by @polastre, @cpapazian, @OriHoffer, @jpyuda and @quepol. |
Yeah, I don't see where the AWS privacy policy really particularly enters into matters. Let's walk this back a bit and do some research on two things:
We can use what we learn from those things to guide the design of such a page. Also could be a good opportunity for some of the folks doing the human-centered design class to use some of those techniques on a real (albeit small) thing. |
I'm with @jpyuda -- I think everything you're trying to accomplish should be part of the Privacy Policy, as I understand the federal regulations. Particularly the section in privacy policies entitled "Information collected automatically" (for example, see https://myusa.18f.us/legal#privacy-policy) What I would like to see 18F do is put in place our own privacy policy, using standard common language, that can be used as a model for other government websites and services. Submitted issue #411. I vote 👎 on this PR and 👎 on adding a new page, as I believe everything @konklone is looking to accomplish can and should be done in a privacy policy. |
Then let's put this PR at a #411 is a fantastic idea and I'm a big 👍 on it, but it's a policy task and could take a substantially long time to get shipped. The information in this PR is a subset of what we want to accomplish there. |
Makes sense. 👍 to iterations. |
This is a placeholder because I don't want to forget; I apologize if you already addressed this. We should probably mention that GSA has negotiated its own terms of service for government agencies with some of these companies that may affect this. For example: the Google Analytics tos amendment. |
Thanks, @leahbannon. @jpyuda - I've recently been going through all of the .gov requirements and can tell you that we're not in a position to replace the GSA.gov/privacy material anytime soon. Requirements can be found here (and in greater detail down a bit here). I think we'll continue having to link off to GSA's for a good while. That said, let's get started. It definitely makes sense to have this PR reside at /privacy. To @seanherron's and @polastre's point, it then asks to be subsumed into other related content instead of living on its own forever. To the question of user research, three significant government privacy experts have weighed in on this as a good idea (EFF, @JoshData, and @konklone). That seems ample to me, plus I believe that we should be bold in experimenting with new and innovative ideas - this qualifies for that. +1 to merging this as /privacy and communicating about it, with an emphasis on asking for feedback and promising iteration. |
After some further chatting with the comms team, I'm curious where we go
I'm feeling the need to put the brakes on this and back up a bit. (For Thoughts? On Wednesday, December 10, 2014, Eric Mill notifications@github.com wrote:
@quepol - think that's fine but that it's incumbent on you and any others to step up and help flesh out the underlying issues. @konklone is already putting a significant amount of work into this PR and I don't think it's reasonable for folks to say 'wait', 'stop', 'do more first' without stepping up to be a part of the solution.
Solid point, though I don't think I've ever seen nearly this much cost added for someone trying to publish to our site as with this PR. Your points are very reasonable but I think we need to make the effort to work with @konklone to figure these things out.
There are 10 people involved in this thread, making the effort! Thanks, all. @konklone let's chat today about the "slippery slope" issues (i.e. being clear that this is just for 18f.gsa.gov and doesn't apply to the organization, like our open source policy or other declarations); where it should live if we work those out; and anything else. |
Holidays, yes, but it's been over a month.... Any further word? This continues to seem to me a pocket veto. I strongly believe there needs to be an articulated standard for who/how decisions will be made about site edits or there needs to be a length of time or number of pings for people to make pull requests, otherwise something should be able to move forward. On another note - it's a somewhat uninformed piece, but this recent article buoys the argument for this page.
To my above, one response:
@konklone - in line with @gbinal comments, it may be worthwhile to revisit and decide if we still want to flesh this out more. In my mind, we still have some unresolved questions:
I believe I have addressed these points above, but it's good to get them succinctly stated.
The GSA Privacy Policy doesn't cover any of this. There's no information in the GSA privacy policy on the specific third parties that we use. The language is intentionally broad, and doesn't tell people what the third parties involved do with their data. We're also not in a position to say everything that happens to users' data. We can point to what Google says they'll do with it, or Amazon (if they said anywhere). At the very least, we can identify what parties are involved in the transaction. I think we should ship a separate disclosure page for this, and then work on a unified privacy and disclosure page. I'd be very much into a larger statement. I see it as all downside to wait on this aspect of it, when it's a small discrete component like this is.
If it makes sense for healthcare.gov, it makes sense for every website. There's nothing particularly special about healthcare.gov in this regard, except that it's more well known and more likely to get stories written about it. I see this as very similar to the reasons we push for HTTPS on internet connections. Fundamentally, it's about ensuring that the connection between two parties is between only two parties. When Comcast inserts ads into a public website, or Verizon inserts a tracking header, that's violating the connection between two parties, without the knowledge or consent of at least the user (if not the website). By using hosting outside of GSA's direct physical control, 18F (like many, many other organizations) has made the decision to grant access to our users' connections to other parties. We rely on Google and Amazon's internal policies to safeguard that information, and our privacy policies can do nothing to control how they're used, or to assure the user of anything. Mainstream users generally don't know that
I wrote this intentionally only to apply to 18f.gsa.gov. I think that's clear in the text, but I'm happy to make it clearer. I'm not sure what you mean by how far down the stack we disclose, but I've attempted to carefully limit this to third party services which see information about individual user connections. I don't think this would apply to e.g. New Relic (which we don't use for 18f.gsa.gov right now anyway), and certainly not to any of the open source software we use in our stack. More generally -- I don't think concerns about website sprawl or user testing should be this serious, with regards to this pull request. As stated above, user testing is not our absolute religion and we should be capable of piloting things in the wild when we want to, and adapting to user feedback as necessary. The content on the page is also quite small, and easily adaptable to wherever the site redesign takes us. I recognize that much of the team doesn't seem to feel the same passion for this issue that I do, and that there's not the same obvious consensus as for an open source policy. But that's because we're ahead of the curve, not just catching up to the private sector. It gives us an opportunity to get out ahead of the issue as attention grows on third party services, and to be a leader on the issue. Being a leader on an issue means making a statement on something when it's not obvious or popular yet. This is what that feels like. I strongly believe that if we give this a shot, it will be all upside for 18F, and potentially have ripple effects outside of our organization that help move the needle on making the web a place that better serves the people's interests. |
True, the GSA Privacy Policy doesn't cover this, but it does point people to GSA's disclosure of Privacy Information Act and Systems of Record Notices, which do tell people what specific third parties do with their data, and what the government is doing to protect and secure that data, often in great detail. We are in a position to dictate what happens with our users' data - by demanding specific outlines and privacy procedures from third parties we work with, like with the DAP. I'm not suggesting making that information easier to understand and more direct is a bad thing, but I also don't think there is enough value in what we have presented in this PR to warrant the creation of a new page on the site, especially if that information is already available in more detail elsewhere.
I very much disagree with that. Healthcare.gov deals with very sensitive, private information. We do not. We need to look at this through the lens of use - the impact of a list of people who have visited 18f.gsa.gov being released to the public is inherently less damaging than a list of people who have signed up for healthcare via healthcare.gov. I would go so far as to say that the "if it makes sense for x, it makes sense for everything" argument embodies everything wrong with the traditional government approach to IT Security.
Proactive disclosure here is very different than HTTPS. Implementing respect for Do Not Track (and disabling third party embeds where we can't ensure DNT is respected for DNT-enabled clients) is similar to HTTPS. A list of third party embeds (which is already available to users who look in our source code or open their inspector) is much less critical. I would venture to guess that a venn diagram of people who care enough to read this and people who can already determine our third party embeds has a lot of crossover, though I can't say that definitively without user research.
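To make the Do Not Track idea concrete, here is a minimal sketch of what "disabling third party embeds for DNT-enabled clients" could look like. It assumes a server-rendered page with access to the request headers, which may not match how 18f.gsa.gov is actually built (a static site would need a client-side check instead). The function names and the `respects_dnt` flag are hypothetical, made up for illustration.

```python
def wants_no_tracking(headers):
    """Return True when the request carries the header `DNT: 1`."""
    # HTTP header names are case-insensitive, so normalise before looking up.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    return normalised.get("dnt") == "1"


def embeds_to_render(headers, embeds):
    """Pick which third party embeds to include in the page.

    `embeds` is a list of dicts with a hypothetical `respects_dnt` flag that we
    would have to maintain ourselves for each third party.
    """
    if not wants_no_tracking(headers):
        return list(embeds)
    # For DNT-enabled clients, keep only embeds we believe honor the preference.
    return [embed for embed in embeds if embed["respects_dnt"]]
```

The hard part is not this check but deciding, per third party, whether the `respects_dnt` flag can honestly be set to true.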
How does the existence of this page improve the experience for users who don't know about it and don't care? Furthermore, open source as a policy was most definitely user tested, starting from way before either you or I were even born. The 18F open source policy is a descendant of, literally, decades of work, much of it pioneered in government by the military, research institutions, NASA, CFPB, and others.
By "how far in the stack", we, for instance, include AWS in here (though I know there is discussion on that above). We don't, however, disclose the fact that internet traffic between the user and AWS may pass through networks owned by a number of other companies. How is that materially different? Why would this not apply to New Relic, considering New Relic would have access to nginx access logs and the like? I understand where you're coming from on this, but I think we could do more work to clearly state the scope of this. Perhaps something along the lines of "here are the third party organizations that may know you are accessing this website when you load a page on our domain".
They absolutely should be this serious. I've been in far too many organizations without a disciplined approach to content strategy. It's a slippery slope that can turn very bad. We need to come to greater consensus as to when we determine something has reached some level of acceptance (a benevolent dictator? a vote? I don't know) but it absolutely cannot be death by a thousand small, individually important components. We should be piloting things, when there is a demonstrable need for such a pilot, which I frankly haven't seen here.
Okay, if we want to be a leader on this, then let's do it right and get team consensus on an approach. I deeply care about this issue and actually am in favor of minimizing third party dependencies as much as we can. But it doesn't seem like this PR does anything other than, frankly, get us something we can post on HackerNews. It definitely does not help solve the core issue of better informing users how we and third parties we use handle their private information. |
If it isn't, it should be pretty high on the list. After all, "user centric" comes right after "effective" on our home page. The point of user research and validation here is not to shut things down, but rather, to set things up for success. If third party disclosure information is important for us to communicate (I agree with you that it is), then we should make sure that we deliver that information successfully. To that end, if we are going to pilot this, we should have an idea of the audience we are trying to reach and test whether our messaging resonates or confuses that audience. If it's important enough for us to be doing this, then it's important enough to do right. |
👍 👍 👍 |
Though constructive pull requests and contributing to the tasks being racked up for @konklone's PR would be the most helpful, in their stead would someone be game to produce the standards or workflows that would satisfy the general concerns expressed here? I ask because I'm willing to chip in and help, but as best I can tell, such a standard or workflow hasn't been articulated, in this thread or elsewhere. Please let me know if I'm wrong though. |
@gbinal opposition to a pull request doesn't obligate the individual to contributing back to that pull request. Just because I think something isn't a good idea for us as an organization doesn't mean I should somehow feel committed to instead building my own idea out. Especially when all of us have obligations to 18F clients that max out our time. I believe it's been noted prior that some people are thinking through content standards, though I haven't been privy to that. Maybe @quepol could shed some light? |
(Hi everyone!) I think the general idea of being transparent and explicit about the privacy implications of our technical decisions has a lot of value, and strongly support it. It feels like this discussion is kind of caught up trying to perfectly capture what we want to say, rather than trying to iterate towards a final result. Is there a common set of things to talk about we can all agree on (e.g. start with Google Analytics), land that, and then try to have separate discussions on any additions; that way getting it perfect doesn't block getting it good. |
I agree that without user research that tells us that the users of our site want this information, and tells us how we might deliver it to them in a way that they'll be receptive to, we should hold off.
Agreed. Time and again, I've seen stakeholders in an organization insist that their pet thing is obviously vitally important to be on the website and force it on. Which leads to a thicket of a lot of information nobody actually cares about. In this case, we are the stakeholders. And while I think we pretty much all agree that this stuff is at least somewhat important, we're also not our users.
User research that tells us these things:
Lacking those three things above, there's an alternate route: work with GSA's existing privacy infrastructure to move our already-required privacy policies, privacy act statements, SORNs, etc. to a model that is more similar to this. It's a pretty high bar for me to add another privacy-related piece of information to the site. But fixing the ones we already have? Yes please. |
I'm not totally persuaded by the idea that the content in the article about healthcare.gov is relevant to our site but that does seem to have re-ignited the conversation. Apart from the new catalyst, it really feels like we're just restarting a conversation we had a month ago here. We're still concerned that this content isn't addressing a clear user need or who those users are. @jpyuda captured this concern very succinctly as I was typing nearly the same thing:
I think the goals of this PR are great but I'm more in favor of finding out what user needs do exist around this issue and then finding ways to make what we have better serve those needs. |
@gboone - what are our team's standards for closing issues? |
I had most of this typed up earlier, so to respond to @seanherron --
Where could I find a Systems of Record Notice that mentions our use of Google or Amazon? GSA has 28 SORNs since 2003, and the one 18F project that's included there, MyUSA's SORN, doesn't mention any services at all. There's no disclosure of our third party services, or what those services do with that data, anywhere except for our website's source code.
The DAP isn't the third party there, Google is.
It's not available in more detail elsewhere.
The knowledge that someone visited healthcare.gov is more revealing than someone visiting 18f.gsa.gov, but when third parties like Google or Amazon are in a position to see an individual's -- and a significant portion of the human species' -- traffic, we should consider all website visits to be part of a mosaic effect. It's not about creating yet another checkbox for government websites to check off before they can do anything, it's finding some way to communicate to users something important that they have no other recourse for without digging into highly technical details.
HTTPS is at least visually auditable for everyone (though that's still fraught), and Do Not Track ostensibly requires the user to confirm their preference in a visual dialogue. Making people look in our source code for third party embeds isn't reasonable as a method of disclosure -- and nor is that complete. That reveals our use of Google (and is the only thing the healthcare.gov article focused on), but not the OCSP call or our web host. I've reached out to Mozilla about showing the OCSP call in their native developer tools, and to Amazon about any public information on how they handle network traffic to their infrastructure. But unless either of those go anywhere, there's no reasonable auditability for those. And I'd argue the auditability of our use of Google is pretty limited.
I'd be interested in that research as well, but to the extent I think that what you say is likely, it's caveated to only be the embeds (Google) and definitely not the web host (Amazon).
On its own, and with no further attention? Not much. I think it can be part of the conversation as norms around third party services evolve, which may ultimately make services like Amazon feel similar kinds of pressure to more visible third parties like Google (who have extensive privacy features and information). It also makes us look awesome and proactive, in a way that's totally consonant with our image, at very little (to my mind) cost.
That's not "user testing" -- it was just made uncontroversial over decades of offices pushing the envelope inside the government. Nobody ever sat a group of users down and evaluated their reactions to their website having an open source policy on it. (This is such a cute image that I would greatly enjoy being proved wrong.)
The networks between us and the user aren't under our control. If anything, they're under the user's control. We chose to involve Amazon, and it's the third party services we chose to engage that I'm suggesting we list.
Maybe I just haven't thought it through - perhaps it should (though we're not using it on this site).
That sounds good to me.
Sorry, I just can't muster the sympathy here. We're so far away from hitting a slippery slope of content sprawl. And I don't see any disagreement that we'd want to have a Privacy Policy link of some kind anyway. So if we were to deploy this content, and then later replace it with a more comprehensive document that included it, we've not sprawled any outside of our plan.
This is me working on that. :)
If others on the team care about this and want to help get it past what is very clearly a "first pass" at the style and text, I'd love that. I've put in a couple hours of work now on the page and revisions to it, not counting time spent on the discussion. I'm happy to put in more, but only if I feel like it's on a positive trajectory with the team. |
So @jpyuda and @gboone, this clearly needs to advance further elsewhere before I'm going to get consensus to have this on our website. But I do need to point out that this is not accurate:
Our website has the following information about privacy on it, in its entirety:
That's not exactly a huge privacy footprint on our website. And it's also inaccurate. We totally share our users' information with a bunch of services, that we don't name and don't take responsibility for. The linked GSA doc doesn't fix that. I think we all have a pretty visceral negative reaction (at least an eye-roll) to giant boilerplate Terms of Service that no one reads, and when organizations use vapid corporate-speak to avoid confronting an issue head on. I think we'd all like to be better and more human than that. That's the negative reaction I have to our current privacy language, and the only reason we a) put it up without realizing how inaccurate it was, and b) get away with it without being called out for it, is because the norms right now around third party services are so depressingly invisible and exploitative. I will say, you've successfully motivated me to start asking around the privacy scene to see what research exists on the topic -- and if not, to start making some, the way that organizations like Google test out HTTPS warnings. But I absolutely do not believe that that's the bar that all ideas for content on our website need to clear, if we want to ever say anything bold in the world. |
For @konklone and @seanherron -- There's no System of Records Notice for GSA sites that do not collect info, like 18f.gsa.gov. A System of Records Notice (or SORN) is for websites that collect and store information from the public. As such, looking in the SORN is the wrong place, since a SORN would never be used for a one-way informational website like 18f.gsa.gov. We have no "records"!
This sentence is just wrong, and I agree with the criticism of it. GSA OGC scolded MyUSA for making similar claims. The correct statement is "we'll only share your information as described in GSA's Privacy and Security Notice." I definitely recommend making that content change. |
It came to our attention while discussing #399 that the privacy language we have on 18f.gsa.gov is inaccurate and not in keeping with terms we are using on other 18F projects. This commit integrates more accurate terms.
(Hi @alex, welcome!)
I highly support this approach. I suggested early on (but perhaps not here) that a blog post about how we engage (or don't) with third parties would have broader support than a monolithic page. We can go deeper and be more engaging with a real audience that cares about these issues with a blog post. We can also be more active with blog posts, covering different aspects of the issue in greater depth over a longer period of time. If traffic on recent blog posts is any indication, our readers love deep, technical posts where we show our work. We already know we want more of that, we already have more posts pending, and I'd love to see more posts explaining how different parts of our stack work and what the implications for our site visitors are. I don't think we can or should attempt to accomplish that with a static page.
Team - I'd like to hit pause on the digital side of this conversation for just a moment. Since we're all going to be physically together this week, I'd like to grab some time with everyone to circle up not only on this, but on an even larger issue superset we're dealing with around privacy. We'll then circle back to this thread and document. Thanks for your flexibility.
Super late to this discussion, but I, for one, would like to see this page. I vote 👍 to the page and this PR. As @gbinal said, it's considerate to our users. But putting the specific substance of this feature aside, I've got a question about our process here: Given the role the need for user research has played in the decision to close the issue, can we articulate how to satisfy this need somewhere? (e.g. when is it warranted, and how someone like me or @konklone or anyone not versed in the technique can conduct it.) |
I added a page for third party service disclosure, and attempted to explain how each one plays into our overall service. It also explains what extra things we do to try to protect privacy.
You can see the page, and the new link in the footer, in this screenshot (updated 2014-11-03):
If we keep going with our current plan, we can eventually remove the webfont dependency, and limit OCSP privacy leakage to browsers on Safari (which doesn't support stapling at all, but does make OCSP calls). (Also, we'll be switching from Symantec/Verisign to Comodo, when we get our new cert...someday.)
This giant Symantec PDF shows Symantec referencing their overall privacy policy to cover their work as a CA.
Fixes #293.