feat: Add rate limiter to GQL endpoint #750

Merged Aug 29, 2024 (15 commits)

Contributor

@ajay-sentry commented Aug 13, 2024

Purpose/Motivation

This PR adds a rate limiter to our GraphQL endpoint, with two different "key" mechanisms depending on whether the user is logged in.

We first check whether the user is logged in and has a "pk" on the user object passed in with the request. If so, we use that value as the basis for their rate limit. If not (i.e. they're just a guest or not logged in), we fall back to deriving their IP address from the request headers. From my investigation, the headers that typically carry an IP address are x_forwarded_for and remote_addr. We use the former and fall back to the latter if it doesn't exist. If neither exists we're a little S.O.L., but we can monitor the initial launch and test on stage beforehand to see if we can "break it" before going to prod.
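A minimal sketch of the header fallback described above, assuming a Django-style `request.META` mapping in which the forwarded header shows up as `HTTP_X_FORWARDED_FOR`. The name `get_client_ip` matches the call visible in the stack traces further down, but the body here is a guess for illustration, not the merged code; the `SimpleNamespace` objects are stand-ins for real requests.

```python
from types import SimpleNamespace


def get_client_ip(request):
    """Return the best-guess client IP, or None if no header carries one."""
    meta = request.META
    forwarded = meta.get("HTTP_X_FORWARDED_FOR")
    if forwarded:
        # The header may hold a comma-separated proxy chain; the first
        # entry is conventionally the original client.
        return forwarded.split(",")[0].strip()
    # Fall back to the socket peer address; None if that is missing too.
    return meta.get("REMOTE_ADDR")


# Stand-in request objects (assumptions, for illustration only).
proxied = SimpleNamespace(
    META={"HTTP_X_FORWARDED_FOR": "203.0.113.7, 10.0.0.1", "REMOTE_ADDR": "10.0.0.1"}
)
direct = SimpleNamespace(META={"REMOTE_ADDR": "198.51.100.4"})
bare = SimpleNamespace(META={})
```

Because clients can set X-Forwarded-For themselves, the value is only trustworthy behind a proxy that overwrites or appends to it.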

Outside of that, the rate limiter implementation is pretty straightforward. You get a flat 300 req/min per "key", and we can easily modify and extend this implementation to include daily rate limits too if we want.
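To make the "flat 300 req/min per key" idea concrete, here is a rough fixed-window counter. The stack traces in this PR show the real code going through `get_redis_connection()`; this sketch swaps in a plain dict and an injectable clock so it runs on its own. The class name and parameters are hypothetical.

```python
import time

RATE_LIMIT = 300      # requests allowed per window (figure from the PR description)
WINDOW_SECONDS = 60   # one-minute window


class FixedWindowLimiter:
    """In-memory stand-in for a Redis-backed counter (illustration only)."""

    def __init__(self, limit=RATE_LIMIT, window=WINDOW_SECONDS, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._counters = {}  # key -> (window_start, count)

    def is_limited(self, key):
        """Record one request for `key`; return True if it exceeds the limit."""
        now = self.clock()
        start, count = self._counters.get(key, (now, 0))
        if now - start >= self.window:
            # The previous window expired, so start counting afresh.
            start, count = now, 0
        count += 1
        self._counters[key] = (start, count)
        return count > self.limit
```

A daily cap could be layered on by consulting a second instance with a larger `limit` and `window=86400`; the daily figure itself is not specified in this PR.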

Links to relevant tickets

Closes codecov/engineering-team#2148

Testing

  • Tested on stage. The highest request volume I could generate came from clicking into a repo -> clicking back to repo list -> clicking into a repo -> back to repo list -> etc. until I hit the rate limit
  • I had to do this pretty quickly to hit the rate limit even at 300 requests per min, so I'm somewhat confident this is a safe upper bound to start with
  • After hitting it while logged in, I logged out and was able to navigate to a separate page, meaning I wasn't hitting the IP-level check either ✅
  • After repeating the same process, I successfully rate limited myself both via IP and via userId lol

Screenshots

Screenshot 2024-08-27 at 4 55 42 PM

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

@ajay-sentry requested a review from a team as a code owner August 13, 2024 18:57

codecov-staging bot commented Aug 27, 2024

Codecov Report

Attention: Patch coverage is 96.66667% with 1 line in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

| Files | Patch % | Lines |
|---|---|---|
| graphql_api/views.py | 96.66% | 1 Missing ⚠️ |

📢 Thoughts on this report? Let us know!

codecov-qa bot commented Aug 27, 2024

❌ 5 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 2256 | 5 | 2251 | 6 |
View the top 3 failed tests by shortest run time
graphql_api.tests.test_views.ArianeViewTestCase test_when_debug_is_false_and_exception_we_know
Stack Traces | 0.004s run time
self = <graphql_api.tests.test_views.ArianeViewTestCase testMethod=test_when_debug_is_false_and_exception_we_know>

    @override_settings(DEBUG=False)
    async def test_when_debug_is_false_and_exception_we_know(self):
        schema = generate_schema_that_raise_with(Unauthorized())
>       data = await self.do_query(schema)

graphql_api/tests/test_views.py:110: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
graphql_api/tests/test_views.py:53: in do_query
    res = await view(request, service="gh")
graphql_api/views.py:231: in post
    if self._check_ratelimit(request=request):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <graphql_api.views.AsyncGraphqlView object at 0x7fe208c6e840>
request = <WSGIRequest: POST '/graphql/gh'>

    def _check_ratelimit(self, request):
        redis = get_redis_connection()
        user_ip = self.get_client_ip(request)
        try:
            # eagerly try to get user_id from request object
            user_id = request.user.pk
        except Exception:
            pass
    
>       if user_id:
E       UnboundLocalError: cannot access local variable 'user_id' where it is not associated with a value

graphql_api/views.py:317: UnboundLocalError
graphql_api.tests.test_views.ArianeViewTestCase test_when_debug_is_false_and_random_exception
Stack Traces | 0.004s run time
self = <graphql_api.tests.test_views.ArianeViewTestCase testMethod=test_when_debug_is_false_and_random_exception>

    @override_settings(DEBUG=False)
    async def test_when_debug_is_false_and_random_exception(self):
        schema = generate_schema_that_raise_with(Exception("hello"))
>       data = await self.do_query(schema)

graphql_api/tests/test_views.py:101: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
graphql_api/tests/test_views.py:53: in do_query
    res = await view(request, service="gh")
graphql_api/views.py:231: in post
    if self._check_ratelimit(request=request):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <graphql_api.views.AsyncGraphqlView object at 0x7fe208c94bf0>
request = <WSGIRequest: POST '/graphql/gh'>

    def _check_ratelimit(self, request):
        redis = get_redis_connection()
        user_ip = self.get_client_ip(request)
        try:
            # eagerly try to get user_id from request object
            user_id = request.user.pk
        except Exception:
            pass
    
>       if user_id:
E       UnboundLocalError: cannot access local variable 'user_id' where it is not associated with a value

graphql_api/views.py:317: UnboundLocalError
graphql_api.tests.test_views.ArianeViewTestCase test_when_bad_query
Stack Traces | 0.005s run time
self = <graphql_api.tests.test_views.ArianeViewTestCase testMethod=test_when_bad_query>

    @override_settings(DEBUG=False)
    async def test_when_bad_query(self):
        schema = generate_schema_that_raise_with(Unauthorized())
>       data = await self.do_query(schema, " { fieldThatDoesntExist }")

graphql_api/tests/test_views.py:119: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
graphql_api/tests/test_views.py:53: in do_query
    res = await view(request, service="gh")
graphql_api/views.py:231: in post
    if self._check_ratelimit(request=request):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <graphql_api.views.AsyncGraphqlView object at 0x7fe208c29340>
request = <WSGIRequest: POST '/graphql/gh'>

    def _check_ratelimit(self, request):
        redis = get_redis_connection()
        user_ip = self.get_client_ip(request)
        try:
            # eagerly try to get user_id from request object
            user_id = request.user.pk
        except Exception:
            pass
    
>       if user_id:
E       UnboundLocalError: cannot access local variable 'user_id' where it is not associated with a value

graphql_api/views.py:317: UnboundLocalError
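
All three traces above fail at the same line: when `request.user.pk` raises, the `except` branch only passes, so `user_id` is never bound and `if user_id:` raises `UnboundLocalError`. One possible fix is to default the value to `None` instead of relying on the `try` block to bind it. The helper name `resolve_user_id` is hypothetical, and this sketch assumes the failure is a missing attribute; it is not necessarily how the merged commit resolved it.

```python
from types import SimpleNamespace


def resolve_user_id(request):
    """Return request.user.pk, or None when there is no usable user."""
    # getattr with a default always produces a value, unlike a
    # try/except whose handler only passes and leaves the name unbound.
    user = getattr(request, "user", None)
    return getattr(user, "pk", None)


# Stand-in request objects (assumptions, for illustration only).
logged_in = SimpleNamespace(user=SimpleNamespace(pk=42))
anonymous = SimpleNamespace(user=SimpleNamespace(pk=None))
no_user = SimpleNamespace()  # e.g. a bare test request with no user attached
```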

To view individual test run time comparison to the main branch, go to the Test Analytics Dashboard

Test Failures Detected: Due to failing tests, we cannot provide coverage reports at this time.

❌ Failed Test Results:

Completed 2262 tests with 5 failed, 2251 passed and 6 skipped.

View the full list of failed tests

pytest

  • Class name: graphql_api.tests.test_views.ArianeViewTestCase
    Test name: test_when_bad_query

    self = <graphql_api.tests.test_views.ArianeViewTestCase testMethod=test_when_bad_query>

        @override_settings(DEBUG=False)
        async def test_when_bad_query(self):
            schema = generate_schema_that_raise_with(Unauthorized())
    >       data = await self.do_query(schema, " { fieldThatDoesntExist }")

    graphql_api/tests/test_views.py:119:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    graphql_api/tests/test_views.py:53: in do_query
        res = await view(request, service="gh")
    graphql_api/views.py:231: in post
        if self._check_ratelimit(request=request):
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    self = <graphql_api.views.AsyncGraphqlView object at 0x7fe208c29340>
    request = <WSGIRequest: POST '/graphql/gh'>

        def _check_ratelimit(self, request):
            redis = get_redis_connection()
            user_ip = self.get_client_ip(request)
            try:
                # eagerly try to get user_id from request object
                user_id = request.user.pk
            except Exception:
                pass

    >       if user_id:
    E       UnboundLocalError: cannot access local variable 'user_id' where it is not associated with a value

    graphql_api/views.py:317: UnboundLocalError
  • Class name: graphql_api.tests.test_views.ArianeViewTestCase
    Test name: test_when_costly_query

    self = <graphql_api.tests.test_views.ArianeViewTestCase testMethod=test_when_costly_query>
    mock_error_logger = <MagicMock name='error' id='140608786397184'>

        @override_settings(DEBUG=False, GRAPHQL_QUERY_COST_THRESHOLD=1000)
        @patch("logging.Logger.error")
        async def test_when_costly_query(self, mock_error_logger):
            schema = generate_cost_test_schema()
    >       data = await self.do_query(schema, " { stuff }")

    graphql_api/tests/test_views.py:130:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    graphql_api/tests/test_views.py:53: in do_query
        res = await view(request, service="gh")
    graphql_api/views.py:231: in post
        if self._check_ratelimit(request=request):
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    self = <graphql_api.views.AsyncGraphqlView object at 0x7fe208d3f3e0>
    request = <WSGIRequest: POST '/graphql/gh'>

        def _check_ratelimit(self, request):
            redis = get_redis_connection()
            user_ip = self.get_client_ip(request)
            try:
                # eagerly try to get user_id from request object
                user_id = request.user.pk
            except Exception:
                pass

    >       if user_id:
    E       UnboundLocalError: cannot access local variable 'user_id' where it is not associated with a value

    graphql_api/views.py:317: UnboundLocalError
  • Class name: graphql_api.tests.test_views.ArianeViewTestCase
    Test name: test_when_debug_is_false_and_exception_we_know

    self = <graphql_api.tests.test_views.ArianeViewTestCase testMethod=test_when_debug_is_false_and_exception_we_know>

        @override_settings(DEBUG=False)
        async def test_when_debug_is_false_and_exception_we_know(self):
            schema = generate_schema_that_raise_with(Unauthorized())
    >       data = await self.do_query(schema)

    graphql_api/tests/test_views.py:110:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    graphql_api/tests/test_views.py:53: in do_query
        res = await view(request, service="gh")
    graphql_api/views.py:231: in post
        if self._check_ratelimit(request=request):
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    self = <graphql_api.views.AsyncGraphqlView object at 0x7fe208c6e840>
    request = <WSGIRequest: POST '/graphql/gh'>

        def _check_ratelimit(self, request):
            redis = get_redis_connection()
            user_ip = self.get_client_ip(request)
            try:
                # eagerly try to get user_id from request object
                user_id = request.user.pk
            except Exception:
                pass

    >       if user_id:
    E       UnboundLocalError: cannot access local variable 'user_id' where it is not associated with a value

    graphql_api/views.py:317: UnboundLocalError
  • Class name: graphql_api.tests.test_views.ArianeViewTestCase
    Test name: test_when_debug_is_false_and_random_exception

    self = <graphql_api.tests.test_views.ArianeViewTestCase testMethod=test_when_debug_is_false_and_random_exception>

        @override_settings(DEBUG=False)
        async def test_when_debug_is_false_and_random_exception(self):
            schema = generate_schema_that_raise_with(Exception("hello"))
    >       data = await self.do_query(schema)

    graphql_api/tests/test_views.py:101:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    graphql_api/tests/test_views.py:53: in do_query
        res = await view(request, service="gh")
    graphql_api/views.py:231: in post
        if self._check_ratelimit(request=request):
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    self = <graphql_api.views.AsyncGraphqlView object at 0x7fe208c94bf0>
    request = <WSGIRequest: POST '/graphql/gh'>

        def _check_ratelimit(self, request):
            redis = get_redis_connection()
            user_ip = self.get_client_ip(request)
            try:
                # eagerly try to get user_id from request object
                user_id = request.user.pk
            except Exception:
                pass

    >       if user_id:
    E       UnboundLocalError: cannot access local variable 'user_id' where it is not associated with a value

    graphql_api/views.py:317: UnboundLocalError
  • Class name: graphql_api.tests.test_views.ArianeViewTestCase
    Test name: test_when_debug_is_true

    self = <graphql_api.tests.test_views.ArianeViewTestCase testMethod=test_when_debug_is_true>
    patched_log = <MagicMock name='info' id='140608786589520'>

        @override_settings(DEBUG=True)
        @patch("logging.Logger.info")
        async def test_when_debug_is_true(self, patched_log):
            before = REGISTRY.get_sample_value(
                "api_gql_counts_hits_total",
                labels={"operation_type": "unknown_type", "operation_name": "unknown_name"},
            )
            errors_before = REGISTRY.get_sample_value(
                "api_gql_counts_errors_total",
                labels={"operation_type": "unknown_type", "operation_name": "unknown_name"},
            )
            timer_before = REGISTRY.get_sample_value(
                "api_gql_timers_full_runtime_seconds_count",
                labels={"operation_type": "unknown_type", "operation_name": "unknown_name"},
            )
            schema = generate_schema_that_raise_with(Exception("hello"))
    >       data = await self.do_query(schema)

    graphql_api/tests/test_views.py:72:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    graphql_api/tests/test_views.py:53: in do_query
        res = await view(request, service="gh")
    graphql_api/views.py:231: in post
        if self._check_ratelimit(request=request):
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    self = <graphql_api.views.AsyncGraphqlView object at 0x7fe208c6eb70>
    request = <WSGIRequest: POST '/graphql/gh'>

        def _check_ratelimit(self, request):
            redis = get_redis_connection()
            user_ip = self.get_client_ip(request)
            try:
                # eagerly try to get user_id from request object
                user_id = request.user.pk
            except Exception:
                pass

    >       if user_id:
    E       UnboundLocalError: cannot access local variable 'user_id' where it is not associated with a value

    graphql_api/views.py:317: UnboundLocalError

codecov bot commented Aug 27, 2024

Codecov Report

Attention: Patch coverage is 96.66667% with 1 line in your changes missing coverage. Please review.

Project coverage is 96.14%. Comparing base (66c7dd3) to head (7f8441f).
Report is 1 commits behind head on main.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| graphql_api/views.py | 96.66% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##               main       #750   +/-   ##
===========================================
  Coverage   96.14000   96.14000           
===========================================
  Files           812        812           
  Lines         18430      18459   +29     
===========================================
+ Hits          17719      17747   +28     
- Misses          711        712    +1     
| Flag | Coverage Δ |
|---|---|
| unit | 91.98% <96.66%> (+<0.01%) ⬆️ |
| unit-latest-uploader | 91.98% <96.66%> (+<0.01%) ⬆️ |


if user_id:
    key = f"rl-user:{user_id}"
else:
    key = f"rl-ip:{user_ip}"

Contributor

Can we just fetch the ip here? Seems like we don't use it unless the user id does not exist

Contributor Author

Good point, will make the change
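
The suggestion amounts to deferring the IP lookup to the branch that needs it. A rough sketch, where `build_ratelimit_key`, the `get_ip` callable, and `fake_get_ip` are hypothetical names used only for illustration; the `rl-user:` and `rl-ip:` prefixes come from the snippet under review.

```python
def build_ratelimit_key(user_id, get_ip):
    """Prefer the user-based key; call get_ip() only for anonymous requests."""
    if user_id:
        return f"rl-user:{user_id}"
    # Only requests without a user id pay for the header lookup.
    return f"rl-ip:{get_ip()}"


calls = []  # records each IP lookup so the laziness is observable


def fake_get_ip():
    """Stand-in for the real IP lookup (illustration only)."""
    calls.append("called")
    return "203.0.113.7"
```

Passing a callable rather than a precomputed value keeps the logged-in path free of header parsing, which is the point of the review comment.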

@ajay-sentry added this pull request to the merge queue Aug 29, 2024
Merged via the queue into main with commit 4a389f0 Aug 29, 2024
17 of 18 checks passed
@ajay-sentry deleted the Ajay/add-ratelimit-gql branch August 29, 2024 16:58
Development

Successfully merging this pull request may close these issues.

[API] Implement rate limiting on the GQL endpoint
2 participants