Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: use cached results from GCS in TA GQL #952

Merged
merged 10 commits into from
Nov 11, 2024

Conversation

joseph-sentry
Copy link
Contributor

we want the GQL endpoints to consume the cached query results from GCS
and then do some filtering and ordering on top of that, and also do some
extra caching of those results in redis.

this commit also contains a bunch of reorganization and refactoring of
the GQL code

@codecov-staging
Copy link

codecov-staging bot commented Nov 1, 2024

Codecov Report

Attention: Patch coverage is 78.57143% with 54 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
graphql_api/types/test_analytics/test_analytics.py 78.41% 30 Missing ⚠️
...test_results_aggregates/test_results_aggregates.py 70.45% 13 Missing ⚠️
...hql_api/types/flake_aggregates/flake_aggregates.py 72.97% 10 Missing ⚠️
services/task/task.py 50.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link

codecov bot commented Nov 1, 2024

Codecov Report

Attention: Patch coverage is 78.57143% with 54 lines in your changes missing coverage. Please review.

Project coverage is 96.04%. Comparing base (72f3c2f) to head (b2ff349).
Report is 4 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
graphql_api/types/test_analytics/test_analytics.py 78.41% 30 Missing ⚠️
...test_results_aggregates/test_results_aggregates.py 70.45% 13 Missing ⚠️
...hql_api/types/flake_aggregates/flake_aggregates.py 72.97% 10 Missing ⚠️
services/task/task.py 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #952      +/-   ##
==========================================
- Coverage   96.25%   96.04%   -0.22%     
==========================================
  Files         827      827              
  Lines       19090    19068      -22     
==========================================
- Hits        18376    18313      -63     
- Misses        714      755      +41     
Flag Coverage Δ
unit 92.28% <78.57%> (-0.25%) ⬇️
unit-latest-uploader 92.28% <78.57%> (-0.25%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@codecov-qa
Copy link

codecov-qa bot commented Nov 1, 2024

❌ 5 Tests Failed:

Tests completed Failed Passed Skipped
2600 5 2595 6
View the top 3 failed tests by shortest run time
graphql_api/tests/test_test_analytics.py::TestAnalyticsTestCase::test_get_test_results_no_storage
Stack Traces | 6.36s run time
self = &lt;urllib3.connection.HTTPConnection object at 0x7fc1f7218500&gt;

    def _new_conn(self):
        """Establish a socket connection and set nodelay settings on it.
    
        :return: New socket connection.
        """
        extra_kw = {}
        if self.source_address:
            extra_kw["source_address"] = self.source_address
    
        if self.socket_options:
            extra_kw["socket_options"] = self.socket_options
    
        try:
&gt;           conn = connection.create_connection(
                (self._dns_host, self.port), self.timeout, **extra_kw
            )

.../local/lib/python3.12............/site-packages/urllib3/connection.py:174: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.../local/lib/python3.12.../urllib3/util/connection.py:72: in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

host = 'minio', port = 9000, family = &lt;AddressFamily.AF_UNSPEC: 0&gt;
type = &lt;SocketKind.SOCK_STREAM: 1&gt;, proto = 0, flags = 0

    def getaddrinfo(host, port, family=0, type=0, proto=0, flags=0):
        """Resolve host and port into list of address info entries.
    
        Translate the host/port argument into a sequence of 5-tuples that contain
        all the necessary arguments for creating a socket connected to that service.
        host is a domain name, a string representation of an IPv4/v6 address or
        None. port is a string service name such as 'http', a numeric port number or
        None. By passing None as the value of host and port, you can pass NULL to
        the underlying C API.
    
        The family, type and proto arguments can be optionally specified in order to
        narrow the list of addresses returned. Passing zero as a value for each of
        these arguments selects the full range of results.
        """
        # We override this function since we want to translate the numeric family
        # and socket type values to enum constants.
        addrlist = []
&gt;       for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
E       socket.gaierror: [Errno -3] Temporary failure in name resolution

.../local/lib/python3.12/socket.py:976: gaierror

During handling of the above exception, another exception occurred:

self = &lt;urllib3.connectionpool.HTTPConnectionPool object at 0x7fcb9c901e50&gt;
method = 'GET', url = '/codecov?location=', body = None
headers = HTTPHeaderDict({'Host': 'minio:9000', 'User-Agent': 'MinIO (Linux; x86_64) minio-py/7.1.13', 'x-amz-content-sha256': '...ers=host;x-amz-content-sha256;x-amz-date, Signature=05af952d2846b4ed5e339dcca185860b954d6bfa77fe8a375b3c6f7638bc6d11'})
retries = Retry(total=0, connect=None, read=None, redirect=None, status=None)
redirect = False, assert_same_host = False
timeout = &lt;object object at 0x7fcc0a922fe0&gt;, pool_timeout = None
release_conn = True, chunked = False, body_pos = None
response_kw = {'preload_content': True}
parsed_url = Url(scheme=None, auth=None, host=None, port=None, path='/codecov', query='location=', fragment=None)
destination_scheme = None, conn = None, release_this_conn = True
http_tunnel_required = False, err = None, clean_exit = False

    def urlopen(
        self,
        method,
        url,
        body=None,
        headers=None,
        retries=None,
        redirect=True,
        assert_same_host=True,
        timeout=_Default,
        pool_timeout=None,
        release_conn=None,
        chunked=False,
        body_pos=None,
        **response_kw
    ):
        """
        Get a connection from the pool and perform an HTTP request. This is the
        lowest level call for making a request, so you'll need to specify all
        the raw details.
    
        .. note::
    
           More commonly, it's appropriate to use a convenience method provided
           by :class:`.RequestMethods`, such as :meth:`request`.
    
        .. note::
    
           `release_conn` will only behave as expected if
           `preload_content=False` because we want to make
           `preload_content=False` the default behaviour someday soon without
           breaking backwards compatibility.
    
        :param method:
            HTTP request method (such as GET, POST, PUT, etc.)
    
        :param url:
            The URL to perform the request on.
    
        :param body:
            Data to send in the request body, either :class:`str`, :class:`bytes`,
            an iterable of :class:`str`/:class:`bytes`, or a file-like object.
    
        :param headers:
            Dictionary of custom headers to send, such as User-Agent,
            If-None-Match, etc. If None, pool headers are used. If provided,
            these headers completely replace any pool-specific headers.
    
        :param retries:
            Configure the number of retries to allow before raising a
            :class:`~urllib3.exceptions.MaxRetryError` exception.
    
            Pass ``None`` to retry until you receive a response. Pass a
            :class:`~urllib3.util.retry.Retry` object for fine-grained control
            over different types of retries.
            Pass an integer number to retry connection errors that many times,
            but no other types of errors. Pass zero to never retry.
    
            If ``False``, then retries are disabled and any exception is raised
            immediately. Also, instead of raising a MaxRetryError on redirects,
            the redirect response will be returned.
    
        :type retries: :class:`~urllib3.util.retry.Retry`, False, or an int.
    
        :param redirect:
            If True, automatically handle redirects (status codes 301, 302,
            303, 307, 308). Each redirect counts as a retry. Disabling retries
            will disable redirect, too.
    
        :param assert_same_host:
            If ``True``, will make sure that the host of the pool requests is
            consistent else will raise HostChangedError. When ``False``, you can
            use the pool on an HTTP proxy and request foreign hosts.
    
        :param timeout:
            If specified, overrides the default timeout for this one
            request. It may be a float (in seconds) or an instance of
            :class:`urllib3.util.Timeout`.
    
        :param pool_timeout:
            If set and the pool is set to block=True, then this method will
            block for ``pool_timeout`` seconds and raise EmptyPoolError if no
            connection is available within the time period.
    
        :param release_conn:
            If False, then the urlopen call will not release the connection
            back into the pool once a response is received (but will release if
            you read the entire contents of the response such as when
            `preload_content=True`). This is useful if you're not preloading
            the response's content immediately. You will need to call
            ``r.release_conn()`` on the response ``r`` to return the connection
            back into the pool. If None, it takes the value of
            ``response_kw.get('preload_content', True)``.
    
        :param chunked:
            If True, urllib3 will send the body using chunked transfer
            encoding. Otherwise, urllib3 will send the body using the standard
            content-length form. Defaults to False.
    
        :param int body_pos:
            Position to seek to in file-like body in the event of a retry or
            redirect. Typically this won't need to be set because urllib3 will
            auto-populate the value when needed.
    
        :param \\**response_kw:
            Additional parameters are passed to
            :meth:`urllib3.response.HTTPResponse.from_httplib`
        """
    
        parsed_url = parse_url(url)
        destination_scheme = parsed_url.scheme
    
        if headers is None:
            headers = self.headers
    
        if not isinstance(retries, Retry):
            retries = Retry.from_int(retries, redirect=redirect, default=self.retries)
    
        if release_conn is None:
            release_conn = response_kw.get("preload_content", True)
    
        # Check host
        if assert_same_host and not self.is_same_host(url):
            raise HostChangedError(self, url, retries)
    
        # Ensure that the URL we're connecting to is properly encoded
        if url.startswith("/"):
            url = six.ensure_str(_encode_target(url))
        else:
            url = six.ensure_str(parsed_url.url)
    
        conn = None
    
        # Track whether `conn` needs to be released before
        # returning/raising/recursing. Update this variable if necessary, and
        # leave `release_conn` constant throughout the function. That way, if
        # the function recurses, the original value of `release_conn` will be
        # passed down into the recursive call, and its value will be respected.
        #
        # See issue #651 [1] for details.
        #
        # [1] &lt;https://github..../urllib3/issues/651&gt;
        release_this_conn = release_conn
    
        http_tunnel_required = connection_requires_http_tunnel(
            self.proxy, self.proxy_config, destination_scheme
        )
    
        # Merge the proxy headers. Only done when not using HTTP CONNECT. We
        # have to copy the headers dict so we can safely change it without those
        # changes being reflected in anyone else's copy.
        if not http_tunnel_required:
            headers = headers.copy()
            headers.update(self.proxy_headers)
    
        # Must keep the exception bound to a separate variable or else Python 3
        # complains about UnboundLocalError.
        err = None
    
        # Keep track of whether we cleanly exited the except block. This
        # ensures we do proper cleanup in finally.
        clean_exit = False
    
        # Rewind body position, if needed. Record current position
        # for future rewinds in the event of a redirect/retry.
        body_pos = set_file_position(body, body_pos)
    
        try:
            # Request a connection from the queue.
            timeout_obj = self._get_timeout(timeout)
            conn = self._get_conn(timeout=pool_timeout)
    
            conn.timeout = timeout_obj.connect_timeout
    
            is_new_proxy_conn = self.proxy is not None and not getattr(
                conn, "sock", None
            )
            if is_new_proxy_conn and http_tunnel_required:
                self._prepare_proxy(conn)
    
            # Make the request on the httplib connection object.
&gt;           httplib_response = self._make_request(
                conn,
                method,
                url,
                timeout=timeout_obj,
                body=body,
                headers=headers,
                chunked=chunked,
            )

.../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:715: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:416: in _make_request
    conn.request(method, url, **httplib_request_kw)
.../local/lib/python3.12............/site-packages/urllib3/connection.py:244: in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
.../local/lib/python3.12/http/client.py:1336: in request
    self._send_request(method, url, body, headers, encode_chunked)
.../local/lib/python3.12/http/client.py:1382: in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
.../local/lib/python3.12/http/client.py:1331: in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
.../local/lib/python3.12/http/client.py:1091: in _send_output
    self.send(msg)
.../local/lib/python3.12/http/client.py:1035: in send
    self.connect()
.../local/lib/python3.12............/site-packages/urllib3/connection.py:205: in connect
    conn = self._new_conn()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = &lt;urllib3.connection.HTTPConnection object at 0x7fc1f7218500&gt;

    def _new_conn(self):
        """Establish a socket connection and set nodelay settings on it.
    
        :return: New socket connection.
        """
        extra_kw = {}
        if self.source_address:
            extra_kw["source_address"] = self.source_address
    
        if self.socket_options:
            extra_kw["socket_options"] = self.socket_options
    
        try:
            conn = connection.create_connection(
                (self._dns_host, self.port), self.timeout, **extra_kw
            )
    
        except SocketTimeout:
            raise ConnectTimeoutError(
                self,
                "Connection to %s timed out. (connect timeout=%s)"
                % (self.host, self.timeout),
            )
    
        except SocketError as e:
&gt;           raise NewConnectionError(
                self, "Failed to establish a new connection: %s" % e
            )
E           urllib3.exceptions.NewConnectionError: &lt;urllib3.connection.HTTPConnection object at 0x7fc1f7218500&gt;: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

.../local/lib/python3.12............/site-packages/urllib3/connection.py:186: NewConnectionError

During handling of the above exception, another exception occurred:

self = &lt;graphql_api.tests.test_test_analytics.TestAnalyticsTestCase object at 0x7fcbf110e2a0&gt;
transactional_db = None
repository = &lt;Repository: Repo&lt;Owner&lt;github/codecov-user&gt;/testRepoName&gt;&gt;

    def test_get_test_results_no_storage(self, transactional_db, repository):
        with pytest.raises(FileNotFoundError):
&gt;           get_results(repository.repoid, repository.branch, 30)

graphql_api/tests/test_test_analytics.py:155: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
utils/test_results.py:50: in get_results
    serialized_table = storage_service.read_file(
.../local/lib/python3.12.../shared/storage/minio.py:200: in read_file
    res = self.minio_client.get_object(bucket_name, path)
.../local/lib/python3.12............/site-packages/minio/api.py:1159: in get_object
    return self._execute(
.../local/lib/python3.12............/site-packages/minio/api.py:400: in _execute
    region = self._get_region(bucket_name, None)
.../local/lib/python3.12............/site-packages/minio/api.py:467: in _get_region
    response = self._url_open(
.../local/lib/python3.12............/site-packages/minio/api.py:266: in _url_open
    response = self._http.urlopen(
.../local/lib/python3.12.../site-packages/urllib3/poolmanager.py:376: in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
.../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:829: in urlopen
    return self.urlopen(
.../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:829: in urlopen
    return self.urlopen(
.../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:829: in urlopen
    return self.urlopen(
.../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:829: in urlopen
    return self.urlopen(
.../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:829: in urlopen
    return self.urlopen(
.../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:801: in urlopen
    retries = retries.increment(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Retry(total=0, connect=None, read=None, redirect=None, status=None)
method = 'GET', url = '/codecov?location=', response = None
error = NewConnectionError('&lt;urllib3.connection.HTTPConnection object at 0x7fc1f7218500&gt;: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')
_pool = &lt;urllib3.connectionpool.HTTPConnectionPool object at 0x7fcb9c901e50&gt;
_stacktrace = &lt;traceback object at 0x7fbd1f56bd40&gt;

    def increment(
        self,
        method=None,
        url=None,
        response=None,
        error=None,
        _pool=None,
        _stacktrace=None,
    ):
        """Return a new Retry object with incremented retry counters.
    
        :param response: A response object, or None, if the server did not
            return a response.
        :type response: :class:`~urllib3.response.HTTPResponse`
        :param Exception error: An error encountered during the request, or
            None if the response was received successfully.
    
        :return: A new ``Retry`` object.
        """
        if self.total is False and error:
            # Disabled, indicate to re-raise the error.
            raise six.reraise(type(error), error, _stacktrace)
    
        total = self.total
        if total is not None:
            total -= 1
    
        connect = self.connect
        read = self.read
        redirect = self.redirect
        status_count = self.status
        other = self.other
        cause = "unknown"
        status = None
        redirect_location = None
    
        if error and self._is_connection_error(error):
            # Connect retry?
            if connect is False:
                raise six.reraise(type(error), error, _stacktrace)
            elif connect is not None:
                connect -= 1
    
        elif error and self._is_read_error(error):
            # Read retry?
            if read is False or not self._is_method_retryable(method):
                raise six.reraise(type(error), error, _stacktrace)
            elif read is not None:
                read -= 1
    
        elif error:
            # Other retry?
            if other is not None:
                other -= 1
    
        elif response and response.get_redirect_location():
            # Redirect retry?
            if redirect is not None:
                redirect -= 1
            cause = "too many redirects"
            redirect_location = response.get_redirect_location()
            status = response.status
    
        else:
            # Incrementing because of a server error like a 500 in
            # status_forcelist and the given method is in the allowed_methods
            cause = ResponseError.GENERIC_ERROR
            if response and response.status:
                if status_count is not None:
                    status_count -= 1
                cause = ResponseError.SPECIFIC_ERROR.format(status_code=response.status)
                status = response.status
    
        history = self.history + (
            RequestHistory(method, url, error, status, redirect_location),
        )
    
        new_retry = self.new(
            total=total,
            connect=connect,
            read=read,
            redirect=redirect,
            status=status_count,
            other=other,
            history=history,
        )
    
        if new_retry.is_exhausted():
&gt;           raise MaxRetryError(_pool, url, error or ResponseError(cause))
E           urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='minio', port=9000): Max retries exceeded with url: /codecov?location= (Caused by NewConnectionError('&lt;urllib3.connection.HTTPConnection object at 0x7fc1f7218500&gt;: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

.../local/lib/python3.12.../urllib3/util/retry.py:594: MaxRetryError
graphql_api/tests/test_test_analytics.py::TestAnalyticsTestCase::test_get_test_results
Stack Traces | 6.37s run time
No failure message available
graphql_api/tests/test_test_analytics.py::TestAnalyticsTestCase::test_get_test_results_no_redis
Stack Traces | 6.41s run time
No failure message available

To view individual test run time comparison to the main branch, go to the Test Analytics Dashboard

Copy link

Test Failures Detected: Due to failing tests, we cannot provide coverage reports at this time.

❌ Failed Test Results:

Completed 2606 tests with 5 failed, 2595 passed and 6 skipped.

View the full list of failed tests

pytest

  • Class name: graphql_api.tests.test_test_analytics.TestAnalyticsTestCase
    Test name: test_get_test_results

    No failure message available
  • Class name: graphql_api.tests.test_test_analytics.TestAnalyticsTestCase
    Test name: test_get_test_results_no_redis

    No failure message available
  • Class name: graphql_api.tests.test_test_analytics.TestAnalyticsTestCase
    Test name: test_get_test_results_no_storage

    self = <urllib3.connection.HTTPConnection object at 0x7fc1f7218500>

    def _new_conn(self):
    """Establish a socket connection and set nodelay settings on it.

    :return: New socket connection.
    """
    extra_kw = {}
    if self.source_address:
    extra_kw["source_address"] = self.source_address

    if self.socket_options:
    extra_kw["socket_options"] = self.socket_options

    try:
    > conn = connection.create_connection(
    (self._dns_host, self.port), self.timeout, **extra_kw
    )

    .../local/lib/python3.12............/site-packages/urllib3/connection.py:174:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    .../local/lib/python3.12.../urllib3/util/connection.py:72: in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    host = 'minio', port = 9000, family = <AddressFamily.AF_UNSPEC: 0>
    type = <SocketKind.SOCK_STREAM: 1>, proto = 0, flags = 0

    def getaddrinfo(host, port, family=0, type=0, proto=0, flags=0):
    """Resolve host and port into list of address info entries.

    Translate the host/port argument into a sequence of 5-tuples that contain
    all the necessary arguments for creating a socket connected to that service.
    host is a domain name, a string representation of an IPv4/v6 address or
    None. port is a string service name such as 'http', a numeric port number or
    None. By passing None as the value of host and port, you can pass NULL to
    the underlying C API.

    The family, type and proto arguments can be optionally specified in order to
    narrow the list of addresses returned. Passing zero as a value for each of
    these arguments selects the full range of results.
    """
    # We override this function since we want to translate the numeric family
    # and socket type values to enum constants.
    addrlist = []
    > for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    E socket.gaierror: [Errno -3] Temporary failure in name resolution

    .../local/lib/python3.12/socket.py:976: gaierror

    During handling of the above exception, another exception occurred:

    self = <urllib3.connectionpool.HTTPConnectionPool object at 0x7fcb9c901e50>
    method = 'GET', url = '/codecov&location=', body = None
    headers = HTTPHeaderDict({'Host': 'minio:9000', 'User-Agent': 'MinIO (Linux; x86_64) minio-py/7.1.13', 'x-amz-content-sha256': '...ers=host;x-amz-content-sha256;x-amz-date, Signature=05af952d2846b4ed5e339dcca185860b954d6bfa77fe8a375b3c6f7638bc6d11'})
    retries = Retry(total=0, connect=None, read=None, redirect=None, status=None)
    redirect = False, assert_same_host = False
    timeout = <object object at 0x7fcc0a922fe0>, pool_timeout = None
    release_conn = True, chunked = False, body_pos = None
    response_kw = {'preload_content': True}
    parsed_url = Url(scheme=None, auth=None, host=None, port=None, path='/codecov', query='location=', fragment=None)
    destination_scheme = None, conn = None, release_this_conn = True
    http_tunnel_required = False, err = None, clean_exit = False

    def urlopen(
    self,
    method,
    url,
    body=None,
    headers=None,
    retries=None,
    redirect=True,
    assert_same_host=True,
    timeout=_Default,
    pool_timeout=None,
    release_conn=None,
    chunked=False,
    body_pos=None,
    **response_kw
    ):
    """
    Get a connection from the pool and perform an HTTP request. This is the
    lowest level call for making a request, so you'll need to specify all
    the raw details.

    .. note::

    More commonly, it's appropriate to use a convenience method provided
    by :class:`.RequestMethods`, such as :meth:`request`.

    .. note::

    `release_conn` will only behave as expected if
    `preload_content=False` because we want to make
    `preload_content=False` the default behaviour someday soon without
    breaking backwards compatibility.

    :param method:
    HTTP request method (such as GET, POST, PUT, etc.)

    :param url:
    The URL to perform the request on.

    :param body:
    Data to send in the request body, either :class:`str`, :class:`bytes`,
    an iterable of :class:`str`/:class:`bytes`, or a file-like object.

    :param headers:
    Dictionary of custom headers to send, such as User-Agent,
    If-None-Match, etc. If None, pool headers are used. If provided,
    these headers completely replace any pool-specific headers.

    :param retries:
    Configure the number of retries to allow before raising a
    :class:`~urllib3.exceptions.MaxRetryError` exception.

    Pass ``None`` to retry until you receive a response. Pass a
    :class:`~urllib3.util.retry.Retry` object for fine-grained control
    over different types of retries.
    Pass an integer number to retry connection errors that many times,
    but no other types of errors. Pass zero to never retry.

    If ``False``, then retries are disabled and any exception is raised
    immediately. Also, instead of raising a MaxRetryError on redirects,
    the redirect response will be returned.

    :type retries: :class:`~urllib3.util.retry.Retry`, False, or an int.

    :param redirect:
    If True, automatically handle redirects (status codes 301, 302,
    303, 307, 308). Each redirect counts as a retry. Disabling retries
    will disable redirect, too.

    :param assert_same_host:
    If ``True``, will make sure that the host of the pool requests is
    consistent else will raise HostChangedError. When ``False``, you can
    use the pool on an HTTP proxy and request foreign hosts.

    :param timeout:
    If specified, overrides the default timeout for this one
    request. It may be a float (in seconds) or an instance of
    :class:`urllib3.util.Timeout`.

    :param pool_timeout:
    If set and the pool is set to block=True, then this method will
    block for ``pool_timeout`` seconds and raise EmptyPoolError if no
    connection is available within the time period.

    :param release_conn:
    If False, then the urlopen call will not release the connection
    back into the pool once a response is received (but will release if
    you read the entire contents of the response such as when
    `preload_content=True`). This is useful if you're not preloading
    the response's content immediately. You will need to call
    ``r.release_conn()`` on the response ``r`` to return the connection
    back into the pool. If None, it takes the value of
    ``response_kw.get('preload_content', True)``.

    :param chunked:
    If True, urllib3 will send the body using chunked transfer
    encoding. Otherwise, urllib3 will send the body using the standard
    content-length form. Defaults to False.

    :param int body_pos:
    Position to seek to in file-like body in the event of a retry or
    redirect. Typically this won't need to be set because urllib3 will
    auto-populate the value when needed.

    :param \\**response_kw:
    Additional parameters are passed to
    :meth:`urllib3.response.HTTPResponse.from_httplib`
    """

    parsed_url = parse_url(url)
    destination_scheme = parsed_url.scheme

    if headers is None:
    headers = self.headers

    if not isinstance(retries, Retry):
    retries = Retry.from_int(retries, redirect=redirect, default=self.retries)

    if release_conn is None:
    release_conn = response_kw.get("preload_content", True)

    # Check host
    if assert_same_host and not self.is_same_host(url):
    raise HostChangedError(self, url, retries)

    # Ensure that the URL we're connecting to is properly encoded
    if url.startswith("/"):
    url = six.ensure_str(_encode_target(url))
    else:
    url = six.ensure_str(parsed_url.url)

    conn = None

    # Track whether `conn` needs to be released before
    # returning/raising/recursing. Update this variable if necessary, and
    # leave `release_conn` constant throughout the function. That way, if
    # the function recurses, the original value of `release_conn` will be
    # passed down into the recursive call, and its value will be respected.
    #
    # See issue #651 [1] for details.
    #
    # [1] <https://github..../urllib3/issues/651>
    release_this_conn = release_conn

    http_tunnel_required = connection_requires_http_tunnel(
    self.proxy, self.proxy_config, destination_scheme
    )

    # Merge the proxy headers. Only done when not using HTTP CONNECT. We
    # have to copy the headers dict so we can safely change it without those
    # changes being reflected in anyone else's copy.
    if not http_tunnel_required:
    headers = headers.copy()
    headers.update(self.proxy_headers)

    # Must keep the exception bound to a separate variable or else Python 3
    # complains about UnboundLocalError.
    err = None

    # Keep track of whether we cleanly exited the except block. This
    # ensures we do proper cleanup in finally.
    clean_exit = False

    # Rewind body position, if needed. Record current position
    # for future rewinds in the event of a redirect/retry.
    body_pos = set_file_position(body, body_pos)

    try:
    # Request a connection from the queue.
    timeout_obj = self._get_timeout(timeout)
    conn = self._get_conn(timeout=pool_timeout)

    conn.timeout = timeout_obj.connect_timeout

    is_new_proxy_conn = self.proxy is not None and not getattr(
    conn, "sock", None
    )
    if is_new_proxy_conn and http_tunnel_required:
    self._prepare_proxy(conn)

    # Make the request on the httplib connection object.
    > httplib_response = self._make_request(
    conn,
    method,
    url,
    timeout=timeout_obj,
    body=body,
    headers=headers,
    chunked=chunked,
    )

    .../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:715:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    .../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:416: in _make_request
    conn.request(method, url, **httplib_request_kw)
    .../local/lib/python3.12............/site-packages/urllib3/connection.py:244: in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
    .../local/lib/python3.12/http/client.py:1336: in request
    self._send_request(method, url, body, headers, encode_chunked)
    .../local/lib/python3.12/http/client.py:1382: in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
    .../local/lib/python3.12/http/client.py:1331: in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
    .../local/lib/python3.12/http/client.py:1091: in _send_output
    self.send(msg)
    .../local/lib/python3.12/http/client.py:1035: in send
    self.connect()
    .../local/lib/python3.12............/site-packages/urllib3/connection.py:205: in connect
    conn = self._new_conn()
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    self = <urllib3.connection.HTTPConnection object at 0x7fc1f7218500>

    def _new_conn(self):
    """Establish a socket connection and set nodelay settings on it.

    :return: New socket connection.
    """
    extra_kw = {}
    if self.source_address:
    extra_kw["source_address"] = self.source_address

    if self.socket_options:
    extra_kw["socket_options"] = self.socket_options

    try:
    conn = connection.create_connection(
    (self._dns_host, self.port), self.timeout, **extra_kw
    )

    except SocketTimeout:
    raise ConnectTimeoutError(
    self,
    "Connection to %s timed out. (connect timeout=%s)"
    % (self.host, self.timeout),
    )

    except SocketError as e:
    > raise NewConnectionError(
    self, "Failed to establish a new connection: %s" % e
    )
    E urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fc1f7218500>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

    .../local/lib/python3.12............/site-packages/urllib3/connection.py:186: NewConnectionError

    During handling of the above exception, another exception occurred:

    self = <graphql_api.tests.test_test_analytics.TestAnalyticsTestCase object at 0x7fcbf110e2a0>
    transactional_db = None
    repository = <Repository: Repo<Owner<github/codecov-user>/testRepoName>>

    def test_get_test_results_no_storage(self, transactional_db, repository):
    with pytest.raises(FileNotFoundError):
    > get_results(repository.repoid, repository.branch, 30)

    graphql_api/tests/test_test_analytics.py:155:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    utils/test_results.py:50: in get_results
    serialized_table = storage_service.read_file(
    .../local/lib/python3.12.../shared/storage/minio.py:200: in read_file
    res = self.minio_client.get_object(bucket_name, path)
    .../local/lib/python3.12............/site-packages/minio/api.py:1159: in get_object
    return self._execute(
    .../local/lib/python3.12............/site-packages/minio/api.py:400: in _execute
    region = self._get_region(bucket_name, None)
    .../local/lib/python3.12............/site-packages/minio/api.py:467: in _get_region
    response = self._url_open(
    .../local/lib/python3.12............/site-packages/minio/api.py:266: in _url_open
    response = self._http.urlopen(
    .../local/lib/python3.12.../site-packages/urllib3/poolmanager.py:376: in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
    .../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:829: in urlopen
    return self.urlopen(
    .../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:829: in urlopen
    return self.urlopen(
    .../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:829: in urlopen
    return self.urlopen(
    .../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:829: in urlopen
    return self.urlopen(
    .../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:829: in urlopen
    return self.urlopen(
    .../local/lib/python3.12......................../site-packages/urllib3/connectionpool.py:801: in urlopen
    retries = retries.increment(
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    self = Retry(total=0, connect=None, read=None, redirect=None, status=None)
    method = 'GET', url = '/codecov&location=', response = None
    error = NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc1f7218500>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')
    _pool = <urllib3.connectionpool.HTTPConnectionPool object at 0x7fcb9c901e50>
    _stacktrace = <traceback object at 0x7fbd1f56bd40>

    def increment(
    self,
    method=None,
    url=None,
    response=None,
    error=None,
    _pool=None,
    _stacktrace=None,
    ):
    """Return a new Retry object with incremented retry counters.

    :param response: A response object, or None, if the server did not
    return a response.
    :type response: :class:`~urllib3.response.HTTPResponse`
    :param Exception error: An error encountered during the request, or
    None if the response was received successfully.

    :return: A new ``Retry`` object.
    """
    if self.total is False and error:
    # Disabled, indicate to re-raise the error.
    raise six.reraise(type(error), error, _stacktrace)

    total = self.total
    if total is not None:
    total -= 1

    connect = self.connect
    read = self.read
    redirect = self.redirect
    status_count = self.status
    other = self.other
    cause = "unknown"
    status = None
    redirect_location = None

    if error and self._is_connection_error(error):
    # Connect retry&
    if connect is False:
    raise six.reraise(type(error), error, _stacktrace)
    elif connect is not None:
    connect -= 1

    elif error and self._is_read_error(error):
    # Read retry&
    if read is False or not self._is_method_retryable(method):
    raise six.reraise(type(error), error, _stacktrace)
    elif read is not None:
    read -= 1

    elif error:
    # Other retry&
    if other is not None:
    other -= 1

    elif response and response.get_redirect_location():
    # Redirect retry&
    if redirect is not None:
    redirect -= 1
    cause = "too many redirects"
    redirect_location = response.get_redirect_location()
    status = response.status

    else:
    # Incrementing because of a server error like a 500 in
    # status_forcelist and the given method is in the allowed_methods
    cause = ResponseError.GENERIC_ERROR
    if response and response.status:
    if status_count is not None:
    status_count -= 1
    cause = ResponseError.SPECIFIC_ERROR.format(status_code=response.status)
    status = response.status

    history = self.history + (
    RequestHistory(method, url, error, status, redirect_location),
    )

    new_retry = self.new(
    total=total,
    connect=connect,
    read=read,
    redirect=redirect,
    status=status_count,
    other=other,
    history=history,
    )

    if new_retry.is_exhausted():
    > raise MaxRetryError(_pool, url, error or ResponseError(cause))
    E urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='minio', port=9000): Max retries exceeded with url: /codecov&location= (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc1f7218500>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

    .../local/lib/python3.12.../urllib3/util/retry.py:594: MaxRetryError
  • Class name: graphql_api.tests.test_test_analytics.TestAnalyticsTestCase
    Test name: test_gql_query_aggregates

    self = <graphql_api.tests.test_test_analytics.TestAnalyticsTestCase object at 0x7fcbf10b3170>
    repository = <Repository: Repo<Owner<github/codecov-user>/testRepoName>>
    store_in_redis = None

    def test_gql_query_aggregates(self, repository, store_in_redis):
    query = base_gql_query % (
    repository.author.username,
    repository.name,
    """
    testResultsAggregates {
    totalDuration
    slowestTestsDuration
    totalFails
    totalSkips
    totalSlowTests
    }
    """,
    )

    result = self.gql_request(query, owner=repository.author)

    > assert result["owner"]["repository"]["testAnalytics"][
    "testResultsAggregates"
    ] == {
    "totalDuration": 1000.0,
    "slowestTestsDuration": 800.0,
    "totalFails": 3,
    "totalSkips": 3,
    "totalSlowTests": 1,
    }
    E AssertionError: assert None == {'slowestTestsDuration': 800.0, 'totalDuration': 1000.0, 'totalFails': 3, 'totalSkips': 3, ...}

    graphql_api/tests/test_test_analytics.py:510: AssertionError
  • Class name: graphql_api.tests.test_test_analytics.TestAnalyticsTestCase
    Test name: test_gql_query_flake_aggregates

    self = <graphql_api.tests.test_test_analytics.TestAnalyticsTestCase object at 0x7fcbf10b3800>
    repository = <Repository: Repo<Owner<github/codecov-user>/testRepoName>>
    store_in_redis = None

    def test_gql_query_flake_aggregates(self, repository, store_in_redis):
    query = base_gql_query % (
    repository.author.username,
    repository.name,
    """
    flakeAggregates {
    flakeRate
    flakeCount
    }
    """,
    )

    result = self.gql_request(query, owner=repository.author)

    > assert result["owner"]["repository"]["testAnalytics"]["flakeAggregates"] == {
    "flakeRate": 1 / 3,
    "flakeCount": 1,
    }
    E AssertionError: assert None == {'flakeCount': 1, 'flakeRate': 0.3333333333333333}

    graphql_api/tests/test_test_analytics.py:534: AssertionError

this refactor simplifies the tests from test results aggregates and
flake aggregates and adds some typing to the generate_test_results
function. It also changes the arguments passed to the
generate_test_results function so handling them is simpler and changes
some things regarding error handling in that function.
we want the GQL endpoints to consume the cached query results from GCS
and then do some filtering and ordering on top of that, and also do some
extra caching of those results in redis.

this commit also contains a bunch of reorganization and refactoring of
the GQL code
when we miss the redis cache but find the test rollup results in GCS
we want to queue up a task to cache it in redis so subsequent calls
to the GQL endpoint for the same repo get to hit redis instead of GCS
@joseph-sentry joseph-sentry force-pushed the joseph/cache-test-results branch from a4a6e79 to e3dd9a7 Compare November 7, 2024 21:28
@joseph-sentry joseph-sentry marked this pull request as ready for review November 7, 2024 21:47
@joseph-sentry joseph-sentry requested a review from a team as a code owner November 7, 2024 21:47
flake_rate: float
flake_rate_percent_change: float | None
flake_count_percent_change: float | None = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dang you didn't like having it grouped huh

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all args with defaults must come after arguments with no defaults 😭

flake_rate_percent_change: float | None = None


def calculate_flake_aggregates(table: pl.DataFrame) -> pl.DataFrame:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to know what columns exist on the table at this point? Right now it's a little "magic"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the names of the columns are in the worker repo where it's being written, come to think of it maybe all the serialization and reading stuff can be in shared but i don't know if that would make things worst, we only read in api and we only write in worker

),
(pl.col("total_skip_count").sum()).alias("skips"),
(pl.col("total_fail_count").sum()).alias("fails"),
((pl.col("avg_duration") >= pl.col("avg_duration").quantile(0.95)).sum()).alias(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still need to do the limit 100 here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed, that's being handled above but not here


merged_results: pl.DataFrame = pl.concat([past_aggregates, curr_aggregates])

merged_results = merged_results.with_columns(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a bit confusing for me at first to see this since I didn't realize with_columns is essentially an append or upsert depending on the column existing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i will leave a comment

first: int | None,
last: int | None,
) -> None:
if interval not in {1, 7, 30}:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: do we have these magic numbers typed out as consts somewhere?

).decode("ascii")


def validate(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random thought here, does the order of these validations matter? Do we want it to matter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, and i don't think it should matter (maybe in past iterations it would've(


if flags:
table = table.filter(
pl.col("flags").list.eval(pl.element().is_in(flags)).list.any()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude was telling me we might be able to add these guards to this check, not sure if you think we need em tho

pl.col("flags").is_not_null() & # handle null rows
pl.col("flags").list.len() > 0 & # handle empty lists

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the is not null check makes sense

total_count = table.height

def ordering_expression(cursor_value: CursorValue, is_forward: bool) -> pl.Expr:
if is_forward:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does forward imply here? It's also a little confusing that we use the value of is_forward and then change the value of is_forward in the same fn

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_forward is just about determining the ordering of the table, but ordering_expression only reads is_forward, it's the outer function that sets the value and invokes ordering_expression so i ended up just moving it out


rows = [TestResultsRow(**row) for row in page_elements.rows(named=True)]

page: list[dict[str, str | TestResultsRow]] = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be pages right? More than one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, this is a single page, since it's after all the ordering and filtering

return await sync_to_async(generate_test_results_aggregates)(
repoid=repository.repoid, interval=convert_interval_to_timedelta(interval)
repoid=repository.repoid, interval=interval.value if interval else 30
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these 30s can be MeasurementInterval.INTERVAL_30_DAY right?

interval_start: int,
interval_end: int | None = None,
) -> str:
key = f"test_results:{repoid}:{branch}:{interval_start}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is an int... does this have a reasonable enough level of precision so we actually have some cache hits sometimes?

Like if the timestamp is 1231232132 vs. 1231232135 it seems like that would trigger a new rollup to be cached/created

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no timestamp being use in the key, it's just the range of days that the cached values apply to, for example 1 day, 7 days, or 30 days

Copy link
Contributor

github-actions bot commented Nov 8, 2024

✅ All tests successful. No failed tests were found.

📣 Thoughts on this report? Let Codecov know! | Powered by Codecov

this commit refactors the code and tests related to ordering direction
if we map ordering direction asc = 1 and desc = 0 and first = 1 and last
= 0 then the ordering direction is an xor of those two variables

ordering direction determines whether we're filtering by greater than
or lesser than

this commit also fixes a bug in the specification of slow tests
Copy link
Contributor

@ajay-sentry ajay-sentry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, reminder to update the bool(after) thing and add a comment on the top_k() thing

@joseph-sentry joseph-sentry added this pull request to the merge queue Nov 11, 2024
Merged via the queue into main with commit 7065d56 Nov 11, 2024
14 of 19 checks passed
@joseph-sentry joseph-sentry deleted the joseph/cache-test-results branch November 11, 2024 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants