Start plumbing redirect choice info into Director's response body #2054
base: main
Unfortunately, I think it's a violation of the GeoIP database license to provide this information in a web service.
I think we can release whether or not there's an accurate estimate, whether we leveraged a location override, etc. Just can't state a location.
@bbockelm, can you point me to where you're determining that this would violate their license? I spent some time trying to determine this today, and I came to the opposite conclusion. My observations were:
We use the GeoLite2 databases, so we fall under the "GeoLite2 End User License Agreement"
Under the following terms:
Since this section begins with "Except as explicitly permitted by the Creative Commons License...", it seems to me that we are explicitly allowed to "Share" and "Adapt" the information I've plumbed back to clients in this PR. If I'm misinterpreting all this legalese and you're right, then we're probably already violating the user agreement by exposing GeoLite-derived lat/lon coordinates for all of the origins/caches through the Director web UI. Do we need to remove that feature or put it behind an auth wall?
@jhiemstrawisc - I couldn't be happier to hear that I'm wrong and operating off old information. I'm a bit behind on things currently, so I'm not able to double-check your assertions. @turetske - could you take a look as well?
@jhiemstrawisc @bbockelm As Justin pointed out, their language of "Except as explicitly permitted by the Creative Commons License" seems contradictory, since the Creative Commons license allows almost anything. But at the top of the license is this line: "This Agreement controls in the event of any conflict with the above-referenced documents, except as otherwise provided in Section 7 (Personal Data). Thereafter, for any conflicts among the above 4 documents, the priority and precedence of interpretation is DPA, PP, WT and Creative Commons License." I believe this means that in the event of a conflict, what's written in the GeoLite2 End User License Agreement takes precedence over the Creative Commons license.
There is also this line in the agreement: "In addition and if you are using the Services for internal use, subject to the terms and conditions of this Agreement, MaxMind also hereby grants you a non-exclusive, non-transferable limited license to access and use the Services for your own internal business purposes." I think we actually violate that as is, since we aren't using it just for internal business purposes. By that they mean using it to, say, develop targeted ads based on IP location, whereas we are distributing a tool that uses the services.
I believe that in order to explicitly share the data, MaxMind would insist we need a commercial license. Specifically, this line: "If you would like to include data from MaxMind’s GeoLite2 databases in a product or service you provide to your users or customers, you will need a Commercial Redistribution License for GeoLite2"
I followed up with the MaxMind sales department about what our obligations are here. I explained the information we wanted to provide as part of this PR, and how we integrate with MaxMind. Their response:
Their "attribution requirement" link states explicitly:
So I think our only obligation is to say the lat/lon/accuracy information comes from MaxMind in the client response. I've asked for further clarification on whether we need that attribution per client request, or whether we can just add it to the Director's web page somewhere.
In response to my question about per-request attribution vs general web page attribution, they said sticking one attribution in the Director's web UI is sufficient. The set of further guidelines they provided is attached.
Previously, adaptive sort assumed all incoming weights were positive. However, some of the weights may be negative, which is a trick used to make sure the corresponding entries are always sorted to the end of a list. When negative weights made it into adaptive sort, they triggered an infinite loop. This fixes the bug by normalizing the weights in a way that guarantees negative weights are treated as the smallest possible ranges. Additionally, rather than generating random numbers over and over only to see them consistently fall in an already-visited range, this changes the function to remove and rescale ranges as we go. One consequence is that it no longer makes sense for the stochastic sort function to return the weights, because these weights only make sense in the context of the exact set of ranges that existed when the weight was created.
Most of these changes make sure the redirectInfo object picks up the values it should. There's also a new test called "test-adaptive-sort" that covers previously untested code. When I started writing the test, I found the infinite loop bug mentioned in this PR's previous commit. NOTE: The new test is not super rigorous, because I'm not sure it makes sense to spend time designing a good test for something as stochastic as our adaptive sort. In the future, when we're convinced adaptive sort does exactly what we want, we should revisit the test and make sure its assumptions hold true.
f83e788 to 387db89
cc6c8b8 to 6472ee9
One of the most frequent user/integrator issues I've encountered is "why is the Director choosing this list of caches/origins to redirect me to?" As this redirection logic grows more complex, it may be useful to empower curious users to inspect some of the underlying logic themselves (with the appropriate debug logging flags). Not only does this provide information about the Director's choices, but it might also let users identify issues that we can solve, such as client IPs that aren't resolvable with MaxMind.
This PR should accomplish two things: