A first pass at a third party disclosure page #399

Closed pull request (14 commits)


konklone
Contributor

I added a page for third party service disclosure, and attempted to explain how each one plays into our overall service. It also explains what extra things we do to try to protect privacy.

You can see the page, and the new link in the footer, in this screenshot (updated 2014-11-03):

[Screenshot: 18F Digital Services Delivery]

If we keep going with our current plan, we can eventually remove the webfont dependency, and limit OCSP privacy leakage to browsers on Safari (which doesn't support stapling at all, but does make OCSP calls). (Also, we'll be switching from Symantec/Verisign to Comodo, when we get our new cert...someday.)

This giant Symantec PDF shows Symantec referencing their overall privacy policy to cover their work as a CA.

Fixes #293.

@JoshData

Wow, great.

Starter text to provide some context about these sites. To a lay reader, calling them out seems alarming. Need to set the correct context that these are not unusual things.
@seanherron
Contributor

So, I really like this. But I don't think it should be on its own page. This is for a few reasons:

  1. Navigation. Right now, we already have three different pages under "Policies" - it literally requires its own navigation section. We should probably also include links to the GSA Privacy Policy, FOIA Information, required egov notices, etc. If we get on a trend of giving each distinct subject of policy its own page and place in the footer, we're very quickly going to end up like this:

[Screenshot, 2014-12-03]

(sorry, NASA - it's out of love)

  2. User Friendliness. We're sprawling our policies around to different areas. As a user, I would expect "Privacy Policies" to include how third parties handle my privacy. By splitting this between an 18F "Third Party Privacy" policy page and a GSA "Privacy Policy", we're introducing complexity that doesn't need to exist. Especially because how the Government and the companies we use manage information would now be covered in totally different areas.

  3. Duplication. I think a lot of this (should be) already covered by the Privacy Information Act and Third Party Web Assessments we do for our platforms. We should focus on making those documents, which are the official standard for how we manage private information, human readable and accessible, rather than duplicating elsewhere.

Anyway, just my two cents. Agree this is really important to state, but I think we should investigate other ways of doing it first. Perhaps creating a merged Privacy Policy that includes all of this, and links to the associated PIA/TPWA documents.

@gbinal
Member

gbinal commented Dec 3, 2014

👍

This is a helpful addition for several reasons. It's transparent and considerate to our users, but I also like it as a further post about our technical process.

@OriHoffer raised a good point though about the dynamic of the page, but I think you've handled that really well with the third paragraph.

@polastre - On the subject of user-centered design, I agree with its central role but don't think that it precludes informed piloting. This page would be an important precedent on the .gov scene, possibly the first of its kind. It's based on significant sector research on the current state of privacy concerns and if anything, I'd compare it to the strong HTTPS work (and the text posted to explain it). That was an incredibly successful effort and more lightweight piloting like this seems well worth it.

@konklone
Contributor Author

konklone commented Dec 3, 2014

@jtag added some language to make this seem less scary to users, and I edited it further in 88c71c7. So there's an additional paragraph there now:

The third party services we use are quite common on the internet today, and you almost certainly visit other websites using them on a regular basis. It's nonetheless important to be aware of them, and in the spirit of transparency and an informed public, we're listing them here.

@polastre @gbinal I also feel that it's all right for us to get out ahead of what the public is clamoring for here, and to gauge the response. In general, I think the 18F website gives us the perfect opportunity to pilot things, as a (relatively) low-traffic, non-critical piece of infrastructure that nonetheless is still a .gov and an important part of our brand.

@seanherron I agree with you in principle, but the two main reasons I'd argue for shipping this as a separate page right now are:

  1. We haven't done any of that other work yet. I don't want to hold this up while we go through a lengthy legal review of our privacy policies and federal regulations. This page I've written is strictly informative and does not impose any policies on the user, so the bureaucratic overhead is extremely low.
  2. I don't want to dilute the message by surrounding it or integrating it into discussion of other aspects of the website, federal law, etc. This page only does one thing: it tells you what third party services the website uses. It's potentially a template other agencies (or even non-governmental groups) might adopt, without having to think hard about how to integrate it into their own site and workflow.

I completely share your view about footer creep. We're spinning up a full on redesign of the 18F site in the near future, and we should make sure we structure our site in such a way that we have other places to put this sort of thing than in the footer.

For example, in #397 I couldn't find a better easy place to put the Edit link than in the footer. It's like the worst possible place for it, but it's all that's there right now. Let's consider that issue separately, and talk about it as part of the redesign process.

@seanherron
Contributor

@konklone good points. What about instead making this a generic "Privacy" page, including this content, and linking to the GSA Privacy Policy? Then we can build from there as we go. If we're going to create a URL (in this case /third-parties/), we're going to need to support it for some amount of time in to the future, so I'd rather set us up for smart growth rather than sprawl. Either by doing /privacy/third-parties or /privacy#third-parties.

@konklone
Contributor Author

konklone commented Dec 3, 2014

Mentioning our own privacy policy makes solid sense. And I like /privacy/third-parties better as a URL anyway. GSA's privacy policy includes some stuff that doesn't apply to us -- like the creation of an ephemeral session cookie -- but whatever.

I've updated the site in 643ce73. I've updated the screenshot in this PR's description with a current screenshot.

@seanherron
Contributor

👍

This sets us up to do a bigger privacy initiative at /privacy, which can eventually just replace all related privacy things in the footer and link to details for specific groups/parties like this one.

@leahbannon
Contributor

Is the use of Google Fonts significantly special/different from the way we use code libraries?

@konklone
Contributor Author

konklone commented Dec 3, 2014

Is the use of Google Fonts significantly special/different from the way we use code libraries?

It's not. If 18F used other CDNs to serve code (e.g. jQuery), we'd have to list those too. I've specifically removed all use of CDNs from the 18F homepage to remove third parties from the mix. If other sites do a third party services page like this, they'd need to either remove the CDN or add the CDN to the page.

Taking a quick glance at the dashboard, I see it uses Font Awesome from a CDN. If the dashboard wants to update its footer to also link to this page, as it's written, it'd need to move the Font Awesome resources into the repo and not use the CDN.
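
As a rough way to check whether a page still pulls anything from a CDN or other outside host, the audit described above could be sketched like this. It is only an illustration using Python's standard library: the helper names and the sample markup are made up for this example (the hosts `fonts.example.com` and `cdn.example.net` are placeholders, not the services 18F actually uses), and it only looks at static `script`, stylesheet `link`, `img`, and `iframe` tags, so anything loaded dynamically by JavaScript would be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ThirdPartyHostFinder(HTMLParser):
    """Collect hostnames of fetched resources that differ from the site's own host."""

    # Tags (and the attribute on each) that make a browser fetch a resource.
    RESOURCE_ATTRS = {
        "script": "src",
        "link": "href",
        "img": "src",
        "iframe": "src",
    }

    def __init__(self, first_party_host):
        super().__init__()
        self.first_party_host = first_party_host
        self.third_party_hosts = set()

    def handle_starttag(self, tag, attrs):
        attr_name = self.RESOURCE_ATTRS.get(tag)
        if attr_name is None:
            return
        attr_map = dict(attrs)
        # <link> also covers non-fetching relations such as rel="canonical";
        # this sketch only counts stylesheets.
        if tag == "link" and "stylesheet" not in (attr_map.get("rel") or "").lower():
            return
        url = attr_map.get(attr_name)
        if not url:
            return
        host = urlparse(url).hostname
        # Relative URLs have no hostname, so they are first-party by definition.
        if host and host != self.first_party_host:
            self.third_party_hosts.add(host)


def find_third_party_hosts(html, first_party_host):
    """Return a sorted list of outside hosts referenced by the given HTML."""
    finder = ThirdPartyHostFinder(first_party_host)
    finder.feed(html)
    return sorted(finder.third_party_hosts)


# Hypothetical markup, for illustration only.
SAMPLE_PAGE = """
<html><head>
<link rel="stylesheet" href="//fonts.example.com/css?family=Demo">
<link rel="stylesheet" href="/assets/css/main.css">
<link rel="canonical" href="https://other.example.org/page">
<script src="https://cdn.example.net/font-awesome.js"></script>
</head><body><img src="/assets/img/logo.png"></body></html>
"""

if __name__ == "__main__":
    print(find_third_party_hosts(SAMPLE_PAGE, "18f.gsa.gov"))
```

An empty result would suggest the page's static markup is free of outside hosts; a non-empty one lists candidates to either remove or add to the disclosure page.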

@gbinal
Member

gbinal commented Dec 4, 2014

@polastre - Of course we can take our time to work through this (fwiw - in the issue would be helpful), but I don't necessarily agree that it's reasonable to block proposed content with a counterproposal of something several times more complex.

On its merits, I think this pull request is a good one. Do I think it's the end all be all? No. But it's helpful progress. Instead of saying no to this until it goes through several more levels of work, it seems reasonable to focus on whether this is progress and to suggest further commits (or better, suggest pull requests).

@arowla
Contributor

arowla commented Dec 4, 2014

How is the AWS privacy policy relevant to an end user of our services? It seems mainly (only?) applicable to a first-hand user of aws.amazon.com.

@konklone
Contributor Author

konklone commented Dec 4, 2014

How is the AWS privacy policy relevant to an end user of our services? It seems mainly (only?) applicable to a first-hand user of aws.amazon.com.

This is a good point, and I'd like to find a way to handle this better. What I'm trying to communicate is that Amazon very much has technical access to the behavior and profile of visitors to websites hosted on their servers. Their privacy policy and TOS are written to assure us about how they treat that responsibility, but they're also the only documents I can think of that are relevant to end users about how Amazon handles their data.

I never see websites talk about their hosting companies as an intermediary between them and the user, which I think gives users an incomplete view of the parties being depended upon to behave well in an interaction with our website.

Any suggestions for a better way of tackling this?

@arowla
Contributor

arowla commented Dec 4, 2014

I think linking to AWS's privacy policy is confusing/misleading at best. Symantec's privacy policy is similarly tangential. Frankly, I suspect this page is overkill for the vast majority of users-- which touches back on the points made by @polastre, @cpapazian, @OriHoffer, @jpyuda and @quepol.

@jpyuda
Member

jpyuda commented Dec 4, 2014

Yeah, I don't see where the AWS privacy policy really particularly enters into matters.

Let's walk this back a bit and do some research on two things:

  1. What are the legal/policy requirements around a privacy page (not necessarily what the norms are, but the actual requirements)
  2. Talk to a diverse handful of users and find out what they expect out of such a page, and what would be useful to them.

We can use what we learn from those things to guide the design of such a page.

Also could be a good opportunity for some of the folks doing the human-centered design class to use some of those techniques on a real (albeit small) thing.

@polastre
Contributor

polastre commented Dec 5, 2014

I'm with @jpyuda -- I think everything you're trying to accomplish should be part of the Privacy Policy, as I understand the federal regulations. Particularly the section in privacy policies entitled "Information collected automatically" (for example, see https://myusa.18f.us/legal#privacy-policy)

What I would like to see 18F do is put in place our own privacy policy, using standard common language, that can be used as a model for other government websites and services.

Submitted issue #411. I vote 👎 on this PR and 👎 on adding a new page, as I believe everything @konklone is looking to accomplish can and should be done in a privacy policy.

@konklone
Contributor Author

konklone commented Dec 5, 2014

Then let's put this PR at a /privacy URL, instead of a /privacy/third-parties URL, and plan for #411 to be a second version of that page.

#411 is a fantastic idea and I'm a big 👍 on it, but it's a policy task and could take a substantially long time to get shipped. The information in this PR is a subset of what we want to accomplish there.

@gbinal
Member

gbinal commented Dec 5, 2014

Makes sense. 👍 to iterations.

@leahbannon
Contributor

This is a placeholder because I don't want to forget; I apologize if you already addressed this. We should probably mention that GSA has negotiated its own terms of service for government agencies with some of these companies that may affect this. For example: the Google Analytics tos amendment.

@gbinal
Member

gbinal commented Dec 10, 2014

Thanks, @leahbannon.

@jpyuda - I've recently been going through all of the .gov requirements and can tell you that we're not in a position to replace the GSA.gov/privacy material anytime soon. Requirements can be found here (and in greater detail down a bit here). I think we'll continue having to link off to GSA's for a good while.

That said, let's get started. It definitely makes sense to have this PR reside at /privacy. To @seanherron's and @polastre's point, it then asks to be subsumed into other related content instead of living on its own forever.

To the question of user research, three significant government privacy experts have weighed in on this as a good idea (EFF, @JoshData, and @konklone). That seems ample to me, plus I believe that we should be bold in experimenting with new and innovative ideas - this qualifies for that.

+1 to merging this as /privacy and communicating about it, with an emphasis on asking for feedback and promising iteration.

@quepol
Contributor

quepol commented Dec 11, 2014

After some further chatting with the comms team, I'm curious where we go from "we're just disclosing this for 18f.gsa.gov" to "this is how the team operates." Issues with both, really...

  1. If it does move into the realm of policy, this is potentially an annoyance to product leads, if we intend for them to have to disclose all 3rd party services they use with a product.
  2. Even if it's just for our website (and if that's the case, we need to be explicit about that in the content), what about potentially wanting to embed tweets, or storify, or a YouTube video, or an Instagram photo in a blog post. The "cost" is already very high for people who want to publish on our site. I would hate to think that they also have to be thinking about this issue, and potentially also line up a PR to this page in order to publish their blog post.

I'm feeling the need to put the brakes on this and back up a bit. (For numerous reasons, really.) I love the spirit of this page, and the idea that we could be leaders here; however, I want us to continue down this already thoughtful path.

Thoughts?

On Wednesday, December 10, 2014, Eric Mill notifications@github.com wrote:

It is. Fundamentally, we trust Amazon not to use their network equipment to log all visitor behavior and IP addresses so they can be correlated/resold/weaponized. All hosting providers are in this position, and worth disclosing; Amazon's scope (including Netflix and Heroku) gives them an especially powerful vantage point.



@gbinal
Member

gbinal commented Dec 11, 2014

I'm feeling the need to put the brakes on this and back up a bit.

@quepol - I think that's fine, but it's incumbent on you and any others to step up and help flesh out the underlying issues. @konklone is already putting a significant amount of work into this PR and I don't think it's reasonable for folks to say 'wait', 'stop', 'do more first' without stepping up to be a part of the solution.

The "cost" is already very high for people who want to publish on our site.

Solid point, though I don't think I've ever seen nearly this much cost added for someone trying to publish to our site as has been added with this PR. Your points are very reasonable but I think we need to make the effort to work with @konklone to figure these things out.

@quepol
Contributor

quepol commented Dec 11, 2014

There are 10 people involved in this thread, making the effort! Thanks, all.

@konklone let's chat today about the "slippery slope" issues (i.e. being clear that this is just for 18f.gsa.gov and doesn't apply to the organization, like our open source policy or other declarations); where it should live if we work those out; and anything else.

@gbinal
Member

gbinal commented Jan 20, 2015

Holidays, yes, but it's been over a month.... Any further word, because this continues to seem to me a pocket veto.

I strongly believe there needs to be an articulated standard for who makes decisions about site edits and how, or a set length of time or number of pings after which a pull request should be able to move forward.

On another note - it's a bit of an uninformed piece, but this recent article buoys the argument for this page.

@gbinal
Member

gbinal commented Jan 20, 2015

To my above, one response:

IIRC, last time we discussed this, there were unanswered questions regarding what user need this is attempting to address. has there been any resolution on that? would like to see a clear user centered goal before we start throwing stuff against the wall and adding more complexity to things.

... a month has gone by, but no work to validate the content was done. so i would imagine status would remain the same

@seanherron
Contributor

@konklone - in line with @gbinal's comments, it may be worthwhile to revisit and decide if we still want to flesh this out more. In my mind, we still have some unresolved questions:

  1. How is this materially different than the GSA Privacy Policy, and instead of shipping a separate disclosure page for this, should we instead focus efforts on a unified Privacy and Disclosure approach that makes sense?
  2. Who is typically coming to 18f.gsa.gov, and does third party disclosure add value to them? In line with @gbinal's linked article - it definitely makes sense for healthcare.gov, but does it make as much sense for our site, or do we run the risk of confusing users more than educating them? We should do some research here, which I will leave to more informed UX minds to educate us on.
  3. Does this stand in for all 18F properties, or just 18f.gsa.gov? How far down the stack do we disclose?

@konklone
Contributor Author

I believe I have addressed these points above, but it's good to get them succinctly stated.

How is this materially different than the GSA Privacy Policy, and instead of shipping a separate disclosure page for this, should we instead focus efforts on a unified Privacy and Disclosure approach that makes sense?

The GSA Privacy Policy doesn't cover any of this. There's no information in the GSA privacy policy on the specific third parties that we use. The language is intentionally broad, and doesn't tell people what the third parties involved do with their data.

We're also not in a position to say everything that happens to users' data. We can point to what Google says they'll do with it, or Amazon (if they said anywhere). At the very least, we can identify what parties are involved in the transaction.

I think we should ship a separate disclosure page for this, and then work on a unified privacy and disclosure page. I'd be very much into a larger statement. I see it as all downside to wait on this aspect of it, when it's a small discrete component like this is.

Who is typically coming to 18f.gsa.gov, and does third party disclosure add value to them? In line with @gbinal's linked article - it definitely makes sense for healthcare.gov, but does it make as much sense for our site, or do we run the risk of confusing users more than educating them? We should do some research here, which I will leave to more informed UX minds to educate us on.

If it makes sense for healthcare.gov, it makes sense for every website. There's nothing particularly special about healthcare.gov in this regard, except that it's more well known and more likely to get stories written about it.

I see this as very similar to the reasons we push for HTTPS on internet connections. Fundamentally, it's about ensuring that the connection between two parties is between only two parties. When Comcast inserts ads into a public website, or Verizon inserts a tracking header, that's violating the connection between two parties, without the knowledge or consent of at least the user (if not the website).

By using hosting outside of GSA's direct physical control, 18F (like many, many other organizations) has made the decision to grant access to our users' connections to other parties. We rely on Google and Amazon's internal policies to safeguard that information, and our privacy policies can do nothing to control how they're used, or to assure the user of anything.

Mainstream users generally don't know that http:// connections can be preyed upon as much as they are. Users generally don't know or care whether the website they're visiting is open source. We don't user test these policies. They're how we want to work, and they're what we believe is right for the web.

Does this stand in for all 18F properties, or just 18f.gsa.gov? How far down the stack do we disclose?

I wrote this intentionally only to apply to 18f.gsa.gov. I think that's clear in the text, but I'm happy to make it clearer. I'm not sure what you mean by how far down the stack we disclose, but I've attempted to carefully limit this to third party services which see information about individual user connections. I don't think this would apply to e.g. New Relic (which we don't use for 18f.gsa.gov right now anyway), and certainly not to any of the open source software we use in our stack.

More generally --

I don't think concerns about website sprawl or user testing should be this serious, with regards to this pull request. As stated above, user testing is not our absolute religion and we should be capable of piloting things in the wild when we want to, and adapting to user feedback as necessary. The content on the page is also quite small, and easily adaptable to wherever the site redesign takes us.

I recognize that much of the team doesn't seem to feel the same passion for this issue that I do, and that there's not the same obvious consensus as for an open source policy.

But that's because we're ahead of the curve, not just catching up to the private sector. It gives us an opportunity to get out ahead of the issue as attention grows on third party services, and to be a leader on the issue. Being a leader on an issue means making a statement on something when it's not obvious or popular yet. This is what that feels like.

I strongly believe that if we give this a shot, it will be all upside for 18F, and potentially have ripple effects outside of our organization that help move the needle on making the web a place that better serves the people's interests.

@seanherron
Contributor

The GSA Privacy Policy doesn't cover any of this. There's no information in the GSA privacy policy on the specific third parties that we use. The language is intentionally broad, and doesn't tell people what the third parties involved do with their data.

We're also not in a position to say everything that happens to users' data. We can point to what Google says they'll do with it, or Amazon (if they said anywhere). At the very least, we can identify what parties are involved in the transaction.

True, the GSA Privacy Policy doesn't cover this, but it does point people to GSA's disclosure of Privacy Information Act and Systems of Record Notices, which do tell people what specific third parties do with their data, and what the government is doing to protect and secure that data, often in great detail. We are in a position to dictate what happens with our users' data - by demanding specific outlines and privacy procedures from third parties we work with, like with the DAP. I'm not suggesting that making that information easier to understand and more direct is a bad thing, but I also don't think there is enough value in what we have presented in this PR to warrant the creation of a new page on the site, especially if that information is already available in more detail elsewhere.

If it makes sense for healthcare.gov, it makes sense for every website. There's nothing particularly special about healthcare.gov in this regard, except that it's more well known and more likely to get stories written about it.

I very much disagree with that. Healthcare.gov deals with very sensitive, private information. We do not. We need to look at this through the lens of use - the impact of a list of people who have visited 18f.gsa.gov being released to the public is inherently less damaging than a list of people who have signed up for healthcare via healthcare.gov. I would go so far as to say that the "if it makes sense for x, it makes sense for everything" argument embodies everything wrong with the traditional government approach to IT Security.

I see this as very similar to the reasons we push for HTTPS on internet connections. Fundamentally, it's about ensuring that the connection between two parties is between only two parties. When Comcast inserts ads into a public website, or Verizon inserts a tracking header, that's violating the connection between two parties, without the knowledge or consent of at least the user (if not the website).

By using hosting outside of GSA's direct physical control, 18F (like many, many other organizations) has made the decision to grant access to our users' connections to other parties. We rely on Google and Amazon's internal policies to safeguard that information, and our privacy policies can do nothing to control how they're used, or to assure the user of anything.

Proactive disclosure here is very different than HTTPS. Implementing respect for Do Not Track (and disabling third party embeds where we can't ensure DNT is respected for DNT-enabled clients) is similar to HTTPS. A list of third party embeds (which is already available to users who look in our source code or open their inspector) is much less critical. I would venture to guess that a venn diagram of people who care enough to read this and people who can already determine our third party embeds has a lot of crossover, though I can't say that definitively without user research.
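
To make the Do Not Track idea concrete, here is a minimal sketch of what "respecting DNT" could look like on the server side. It assumes the request headers are available as a plain mapping and that each embed has been tagged by hand with whether its provider honors DNT; the function names and the `respects_dnt` key are hypothetical, not part of any existing 18F code.

```python
def visitor_opted_out_of_tracking(headers):
    """Return True when the request carries the Do Not Track header set to "1"."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


def embeds_to_render(headers, embeds):
    """Decide which third party embeds to include in the page.

    `embeds` is a list of dicts with the hypothetical keys "name" and
    "respects_dnt". Visitors who send DNT: 1 only get embeds that are
    known to respect it; everyone else gets the full list.
    """
    if not visitor_opted_out_of_tracking(headers):
        return list(embeds)
    return [embed for embed in embeds if embed["respects_dnt"]]
```

For example, with one embed marked `respects_dnt: False` and another marked `True`, a request sending `DNT: 1` would receive only the second, while a request with no DNT header would receive both.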

Mainstream users generally don't know that http:// connections can be preyed upon as much as they are. Users generally don't know or care whether the website they're visiting is open source. We don't user test these policies. They're how we want to work, and they're what we believe is right for the web.

How does the existence of this page improve the experience for users who don't know about it and don't care?

Furthermore, open source as a policy was most definitely user tested, starting from way before either you or I were even born. The 18F open source policy is a descendant of, literally, decades of work, much of it pioneered in government by the military, research institutions, NASA, CFPB, and others.

I wrote this intentionally only to apply to 18f.gsa.gov. I think that's clear in the text, but I'm happy to make it clearer. I'm not sure what you mean by how far down the stack we disclose, but I've attempted to carefully limit this to third party services which see information about individual user connections. I don't think this would apply to e.g. New Relic (which we don't use for 18f.gsa.gov right now anyway), and certainly not to any of the open source software we use in our stack.

By "how far down the stack", I mean that we, for instance, include AWS in here (though I know there is discussion on that above). We don't, however, disclose the fact that internet traffic between the user and AWS may pass through networks owned by a number of other companies. How is that materially different? Why would this not apply to New Relic, considering New Relic would have access to nginx access logs and the like? I understand where you're coming from on this, but I think we could do more work to clearly state the scope of this. Perhaps something along the lines of "here are the third party organizations that may know you are accessing this website when you load a page on our domain".

I don't think concerns about website sprawl or user testing should be this serious, with regards to this pull request. As stated above, user testing is not our absolute religion and we should be capable of piloting things in the wild when we want to, and adapting to user feedback as necessary. The content on the page is also quite small, and easily adaptable to wherever the site redesign takes us.

They absolutely should be this serious. I've been in far too many organizations without a disciplined approach to content strategy. It's a slippery slope that can turn very bad. We need to come to greater consensus as to when we determine something has reached some level of acceptance (a benevolent dictator? a vote? I don't know) but it absolutely cannot be death by a thousand small, individually important components. We should be piloting things, when there is a demonstrable need for such a pilot, which I frankly haven't seen here.

I recognize that much of the team doesn't seem to feel the same passion for this issue that I do, and that there's not the same obvious consensus as for an open source policy.

But that's because we're ahead of the curve, not just catching up to the private sector. It gives us an opportunity to get out ahead of the issue as attention grows on third party services, and to be a leader on the issue. Being a leader on an issue means making a statement on something when it's not obvious or popular yet. This is what that feels like.

Okay, if we want to be a leader on this, then let's do it right and get team consensus on an approach. I deeply care about this issue and actually am in favor of minimizing third party dependencies as much as we can. But it doesn't seem like this PR does anything other than, frankly, get us something we can post on HackerNews. It definitely does not help solve the core issue of better informing users how we and third parties we use handle their private information.

@cpapazian

As stated above, user testing is not our absolute religion and we should be capable of piloting things in the wild when we want to, and adapting to user feedback as necessary.

If it isn't, it should be pretty high on the list. After all, "user centric" comes right after "effective" on our home page.

The point of user research and validation here is not to shut things down, but rather, to set things up for success. If third party disclosure information is important for us to communicate (I agree with you that it is), then we should make sure that we deliver that information successfully.

To that end, if we are going to pilot this, we should have an idea of the audience we are trying to reach and test whether our messaging resonates or confuses that audience. If it's important enough for us to be doing this, then it's important enough to do right.

@seanherron
Contributor

If it isn't, it should be pretty high on the list. After all, "user centric" comes right after "effective" on our home page.

The point of user research and validation here is not to shut things down, but rather, to set things up for success. If third party disclosure information is important for us to communicate (I agree with you that it is), then we should make sure that we deliver that information successfully.

👍 👍 👍

@gbinal
Member

gbinal commented Jan 20, 2015

Though constructive pull requests and contributing to the tasks being racked up for @konklone's PR would be the most helpful, in their stead would someone be game to produce the standards or workflows that would satisfy the general concerns expressed here?

I ask because I'm willing to chip in and help, but as best I can tell, such a standard or workflow hasn't been articulated, in this thread or elsewhere. Please let me know if I'm wrong though.

@seanherron
Contributor

@gbinal, opposition to a pull request doesn't obligate the individual to contribute back to that pull request. Just because I think something isn't a good idea for us as an organization doesn't mean I should somehow feel committed to instead building my own idea out. Especially when all of us have obligations to 18F clients that max out our time.

I believe it's been noted prior that some people are thinking through content standards, though I haven't been privy to that. Maybe @quepol could shed some light?

@alex

alex commented Jan 20, 2015

(Hi everyone!)

I think the general idea of being transparent and explicit about the privacy implications of our technical decisions has a lot of value, and strongly support it.

It feels like this discussion is kind of caught up trying to perfectly capture what we want to say, rather than trying to iterate towards a final result. Is there a common set of things to talk about we can all agree on (e.g. start with Google Analytics), land that, and then try to have separate discussions on any additions; that way getting it perfect doesn't block getting it good.

@jpyuda
Member

jpyuda commented Jan 20, 2015

I agree that without user research that tells us that the users of our site want this information, and tells us how we might deliver it to them in a way that they'll be receptive to, we should hold off.

They absolutely should be this serious. I've been in far too many organizations without a disciplined approach to content strategy. It's a slippery slope that can turn very bad. We need to come to greater consensus as to when we determine something has reached some level of acceptance (a benevolent dictator? a vote? I don't know) but it absolutely cannot be death by a thousand small, individually important components.

Agreed. Time and again, I've seen stakeholders in an organization insist that their pet thing is obviously vitally important to have on the website, and force it on. That leads to a thicket of information nobody actually cares about. In this case, we are the stakeholders. And while I think we pretty much all agree that this stuff is at least somewhat important, we're also not our users.

Though constructive pull requests and contributing to the tasks being racked up for @konklone's PR would be the most helpful, in their stead would someone be game to produce the standards or workflows that would satisfy the general concerns expressed here?

User research that tells us these things:

  1. Who are the users of 18f.gsa.gov?
  2. What are they trying to do with the site?
  3. Is this information something that they would like to find out more about?

Lacking those three things above, there's an alternate route: work with GSA's existing privacy infrastructure to move our already-required privacy policies, privacy act statements, SORNs, etc. to a model that is more similar to this. It's a pretty high bar for me to add another privacy-related piece of information to the site. But fixing the ones we already have? Yes please.

@gboone
Contributor

gboone commented Jan 20, 2015

I'm not totally persuaded by the idea that the content in the article about healthcare.gov is relevant to our site, but that does seem to have re-ignited the conversation. Apart from the new catalyst, it really feels like we're just restarting a conversation we had a month ago here. We're still concerned that this content isn't addressing a clear user need, and that we don't know who those users are.

@jpyuda captured this concern very succinctly as I was typing nearly the same thing:

It's a pretty high bar for me to add another privacy-related piece of information to the site. But fixing the ones we already have? Yes please.

I think the goals of this PR are great but I'm more in favor of finding out what user needs do exist around this issue and then finding ways to make what we have better serve those needs.

@gboone gboone closed this Jan 20, 2015
@gbinal
Member

gbinal commented Jan 20, 2015

@gboone - what are our team's standards for closing issues?

@konklone
Contributor Author

I had most of this typed up earlier, so to respond to @seanherron --

True, the GSA Privacy Policy doesn't cover this, but it does point people to GSA's disclosure of Privacy Information Act and Systems of Record Notices, which do tell people what specific third parties do with their data, and what the government is doing to protect and secure that data, many times to great detail.

Where could I find a System of Records Notice that mentions our use of Google or Amazon? GSA has 28 SORNs since 2003, and the one 18F project that's included there, MyUSA's SORN, doesn't mention any services at all.

There's no disclosure of our third party services, or what those services do with that data, anywhere except for our website's source code.

We are in a position to dictate what happens with our user's data - by demanding specific outlines and privacy procedures from third parties we work with, like with the DAP.

The DAP isn't the third party there, Google is.

I'm not suggesting making that information easier to understand and more direct is a bad thing, but I also don't think there is enough value in what we have presented in this PR to warrant the creation of a new page on the site, especially if that information is already available in more detail elsewhere.

It's not available in more detail elsewhere.

I very much disagree with that. Healthcare.gov deals with very sensitive, private information. We do not. We need to look at this through the lens of use - the impact of a list of people who have visited 18f.gsa.gov being released to the public is inherently less damaging than a list of people who have signed up for healthcare via healthcare.gov. I would go so far as to say that the "if it makes sense for x, it makes sense for everything" argument embodies everything wrong with the traditional government approach to IT Security.

The knowledge that someone visited healthcare.gov is more revealing than someone visiting 18f.gsa.gov, but when third parties like Google or Amazon are in a position to see an individual's -- and a significant portion of the human species' -- traffic, we should consider all website visits to be part of a mosaic effect.

It's not about creating yet another checkbox for government websites to check off before they can do anything, it's finding some way to communicate to users something important that they have no other recourse for without digging into highly technical details.

Proactive disclosure here is very different than HTTPS. Implementing respect for Do Not Track (and disabling third party embeds where we can't ensure DNT is respected for DNT-enabled clients) is similar to HTTPS. A list of third party embeds (which is already available to users who look in our source code or open their inspector) is much less critical.

HTTPS is at least visually auditable for everyone (though that's still fraught), and Do Not Track ostensibly requires the user to confirm their preference in a visual dialogue. Making people look in our source code for third party embeds isn't reasonable as a method of disclosure -- nor is that complete. That reveals our use of Google (and is the only thing the healthcare.gov article focused on), but not the OCSP call or our web host.

I've reached out to Mozilla about showing the OCSP call in their native developer tools, and to Amazon about any public information on how they handle network traffic to their infrastructure. But unless either of those goes anywhere, there's no reasonable auditability for those. And I'd argue the auditability of our use of Google is pretty limited.

I would venture to guess that a venn diagram of people who care enough to read this and people who can already determine our third party embeds has a lot of crossover, though I can't say that definitively without user research.

I'd be interested in that research as well, but to the extent that what you say is likely, it would apply only to the embeds (Google) and definitely not to the web host (Amazon).

How does the existence of this page improve the experience for users who don't know about it and don't care?

On its own, and with no further attention? Not much. I think it can be part of the conversation as norms around third party services evolve, which may ultimately make services like Amazon feel pressure similar to what more visible third parties like Google (who have extensive privacy features and information) already feel. It also makes us look awesome and proactive, in a way that's totally consonant with our image, at very little (to my mind) cost.

Furthermore, open source as a policy was most definitely user tested, starting from way before either you or I were even born. The 18F open source policy is a descendant of, literally, decades of work, much of it pioneered in government by the military, research institutions, NASA, CFPB, and others.

That's not "user testing" -- it was just made uncontroversial over decades of offices pushing the envelope inside the government. Nobody ever sat a group of users down and evaluated their reactions to their website having an open source policy on it. (This is such a cute image that I would greatly enjoy being proved wrong.)

By "how far in the stack", we, for instance, include AWS in here (though I know there is discussion on that above). We don't, however, disclose the fact that internet traffic between the user and AWS may pass through networks owned by a number of other companies. How is that materially different?

The networks between us and the user aren't under our control. If anything, they're under the user's control. We chose to involve Amazon, and it's the third party services we chose to engage that I'm suggesting we list.

Why would this not apply to New Relic, considering New Relic would have access to nginx access logs and the like?

Maybe I just haven't thought it through - perhaps it should (though we're not using it on this site).

I understand where you're coming from on this, but I think we could do more work to clearly state the scope of this. Perhaps something along the lines of "here are the third party organizations that may know you are accessing this website when you load a page on our domain".

That sounds good to me.

They absolutely should be this serious. I've been in far too many organizations without a disciplined approach to content strategy. It's a slippery slope that can turn very bad. We need to come to greater consensus as to when we determine something has reached some level of acceptance (a benevolent dictator? a vote? I don't know) but it absolutely cannot be death by a thousand small, individually important components. We should be piloting things, when there is a demonstrable need for such a pilot, which I frankly haven't seen here.

Sorry, I just can't muster the sympathy here. We're so far away from hitting a slippery slope of content sprawl. And I don't see any disagreement that we'd want to have a Privacy Policy link of some kind anyway. So if we were to deploy this content, and then later replace it with a more comprehensive document that included it, we've not sprawled any outside of our plan.

Okay, if we want to be a leader on this, then let's do it right and get team consensus on an approach.

This is me working on that. :)

I deeply care about this issue and actually am in favor of minimizing third party dependencies as much as we can. But it doesn't seem like this PR does anything other than, frankly, get us something we can post on HackerNews. It definitely does not help solve the core issue of better informing users how we and third parties we use handle their private information.

If others on the team care about this and want to help get it past what is very clearly a "first pass" at the style and text, I'd love that. I've put in a couple hours of work now on the page and revisions to it, not counting time spent on the discussion. I'm happy to put in more, but only if I feel like it's on a positive trajectory with the team.

@konklone
Contributor Author

So @jpyuda and @gboone, this clearly needs to advance further elsewhere before I'm going to get consensus to have this on our website. But I do need to point out that this is not accurate:

It's a pretty high bar for me to add another privacy-related piece of information to the site. But fixing the ones we already have? Yes please.

Our website has the following information about privacy on it, in its entirety:

Your privacy and security are important to us, and we'll never share your information. Please see GSA's Privacy and Security Notice for more information.

That's not exactly a huge privacy footprint on our website. And it's also inaccurate. We totally share our users' information with a bunch of services that we don't name and don't take responsibility for. The linked GSA doc doesn't fix that.

I think we all have a pretty visceral negative reaction (at least an eye-roll) to giant boilerplate Terms of Service that no one reads, and when organizations use vapid corporate-speak to avoid confronting an issue head on. I think we'd all like to be better and more human than that.

That's the negative reaction I have to our current privacy language, and the only reason we a) put it up without realizing how inaccurate it was, and b) get away with it without being called out for it, is because the norms right now around third party services are so depressingly invisible and exploitative.

I will say, you've successfully motivated me to start asking around the privacy scene to see what research exists on the topic -- and if not, to start making some, the way that organizations like Google test out HTTPS warnings. But I absolutely do not believe that that's the bar that all ideas for content on our website need to clear, if we want to ever say anything bold in the world.

@polastre
Contributor

For @konklone and @seanherron --

There's no System of Records Notice for GSA sites that do not collect info, like 18f.gsa.gov. A System of Records Notice (or SORN) is for websites that collect and store information from the public. As such, looking in the SORN is the wrong place, since a SORN would never be used for a one-way informational website like 18f.gsa.gov. We have no "records"!

Your privacy and security are important to us, and we'll never share your information. Please see GSA's Privacy and Security Notice for more information.

This sentence is just wrong, and I agree with the criticism of it. GSA OGC scolded MyUSA for making similar claims. The correct statement is "we'll only share your information as described in GSA's Privacy and Security Notice." I definitely recommend making that content change.

gboone pushed a commit that referenced this pull request Jan 21, 2015
It came to our attention while discussing #399 that the privacy language we
have on 18f.gsa.gov is inaccurate and not in keeping with terms we are using on
other 18F projects. This commit integrates more accurate terms.
@gboone gboone mentioned this pull request Jan 21, 2015
@gboone
Contributor

gboone commented Jan 21, 2015

(Hi @alex, welcome!)

I think the general idea of being transparent and explicit about the privacy implications of our technical decisions has a lot of value, and strongly support it.
...
Is there a common set of things to talk about we can all agree on (e.g. start with Google Analytics), land that, and then try to have separate discussions on any additions; that way getting it perfect doesn't block getting it good.

I highly support this approach. I suggested early on (but perhaps not here) that a blog post about how we engage (or don't) with third parties would have broader support than a monolithic page.

We can go deeper and be more engaging with a real audience that cares about these issues with a blog post. We can also be more active with blog posts, covering different aspects of the issue in greater depth over a longer period of time. If traffic on recent blog posts is any indication, our readers love deep, technical posts where we show our work. We already know we want more of that, we already have more posts pending, and I'd love to see more posts explaining how different parts of our stack work and what the implications for our site visitors are.

I don't think we can or should attempt to accomplish that with a static page.

@polastre @konklone see #474.

@NoahKunin
Contributor

Team -

I'd like to hit pause on the digital side of this conversation for just a moment. Since we're all going to be physically together this week, I'd like to grab some time with everyone to circle up not only on this, but on an even larger superset of issues we're dealing with around privacy. We'll then circle back to this thread and document. Thanks for your flexibility.

@adelevie
Contributor

Super late to this discussion, but I, for one, would like to see this page. I vote 👍 to the page and this PR. As @gbinal said, it's considerate to our users.

But putting the specific substance of this feature aside, I've got a question about our process here:

Given the role the need for user research has played in the decision to close the issue, can we articulate how to satisfy this need somewhere? (e.g. when is it warranted, and how someone like me or @konklone or anyone not versed in the technique can conduct it.)
