
GitHub build status takes very long time to come through (parallel builds) #1407

Closed · reececomo opened this issue Feb 28, 2020 · 13 comments
Labels: site-performance (slow builds, etc.)

@reececomo

  • We set COVERALLS_PARALLEL=true
  • We upload three parallel reports within a couple minutes of each other.
  • We update the build status manually with `curl -k $COVERALLS_ENDPOINT/webhook?repo_token=$COVERALLS_REPO_TOKEN -d "payload[build_num]=$BUILD_NUMBER&payload[status]=done"` (see the sketch after this list).
  • Coveralls merges these results and we can see the values on the Coveralls interface immediately 👍
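
For reference, here's a minimal sketch of that final "done" call as a standalone CI step (assuming `COVERALLS_ENDPOINT`, `COVERALLS_REPO_TOKEN`, and `BUILD_NUMBER` are all provided by the CI environment, as in our setup):

```sh
#!/usr/bin/env bash
# Minimal sketch of the "close parallel build" webhook call described above.
# Assumes COVERALLS_REPO_TOKEN and BUILD_NUMBER come from the CI environment;
# COVERALLS_ENDPOINT falls back to the public Coveralls instance if unset.
set -euo pipefail

ENDPOINT="${COVERALLS_ENDPOINT:-https://coveralls.io}"

# Tell Coveralls that all parallel jobs for this build have been uploaded,
# so it can merge them and post a single status back to GitHub.
curl -k "${ENDPOINT}/webhook?repo_token=${COVERALLS_REPO_TOKEN}" \
  -d "payload[build_num]=${BUILD_NUMBER}&payload[status]=done"
```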

However...

We're finding that the GitHub build status takes > 10 minutes to arrive, and in a couple of cases it looks like it never arrived at all.

Very confused here - can you offer any guidance?
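
In case it helps anyone narrow down where the delay sits, one way to watch for the status actually landing on GitHub is to poll the commit-statuses API. A rough sketch (where `OWNER`, `REPO`, and `GITHUB_TOKEN` are hypothetical placeholders, and the exact Coveralls status context name may differ per setup):

```sh
#!/usr/bin/env bash
# Rough sketch: list the statuses GitHub has recorded for a commit and look
# for a Coveralls context, to see when (or whether) the status arrives.
# OWNER and REPO are placeholders; GITHUB_TOKEN needs read access to the repo.
OWNER="my-org"
REPO="my-repo"
SHA="$(git rev-parse HEAD)"

curl -s -H "Authorization: token ${GITHUB_TOKEN}" \
  "https://api.github.com/repos/${OWNER}/${REPO}/commits/${SHA}/statuses" \
  | grep -i '"context".*coveralls'
```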

@jfarrell

We are also seeing an increase in the time it takes for build statuses to reach GitHub from Coveralls. After Coveralls has received the coverage report and the build has finished, we have seen it take upwards of 30 minutes in some cases for the successful status notification to reach GitHub.

@reececomo
Author

Yep, just noticed this is also happening on some of our non-parallel repos.

@reececomo changed the title from "Parallel builds - build status takes very long time to come through" to "GitHub build status takes very long time to come through (parallel builds)" on Mar 3, 2020
@glennpratt

Seeing the same thing here across several private repos. Emailed support with no response.

@reececomo
Author

We just switched our org over from CodeCov; I guess we'll have to switch back 😞

@danadaldos

Yep, also seeing 30-minute build times on CircleCI (when CircleCI is correctly configured for parallel builds).

[Screenshot: CircleCI Pipelines view]

@narwold commented Feb 25, 2021

We're seeing this same issue. This presents a major workflow challenge. Any status update on this?

@danadaldos commented Feb 25, 2021

> @narwold: We're seeing this same issue. This presents a major workflow challenge. Any status update on this?

Ours actually cleared up at some point. Here's our CircleCI configuration (edited down to the relevant parts):

Unit test step:

```yaml
  test:
    <<: *testing_container_config
    parallelism: 7
    steps:
      # ... [workspace setup and cache restore] ...

      - run:
          name: Run Non-Feature Tests
          command: |
            circleci tests glob "test/**/*_test.exs" | circleci tests split --split-by=timings > /tmp/tests-to-run
            echo ""
            echo "************ Test files in this container ***************"
            cat /tmp/tests-to-run
            echo ""
            mix coveralls.circle --parallel $(cat /tmp/tests-to-run) --exclude feature
```

Run feature tests step:

```yaml
  test_features:
    <<: *testing_container_config
    parallelism: 10
    steps:
      # ... [workspace setup and cache restore] ...

      - run:
          name: Run Feature Tests
          command: |
            # ...
            mix coveralls.circle --parallel --timeout=90000 $(cat /tmp/tests-to-run)
```

Notify Coveralls step:

```yaml
  notify_coveralls:
    docker:
      - image: cimg/base:2020.01
    steps:
      - *attach_workspace
      - run: |
          curl "https://coveralls.io/webhook?repo_token=$COVERALLS_REPO_TOKEN" \
            -d "payload[build_num]=$CIRCLE_WORKFLOW_WORKSPACE_ID&payload[status]=done"
          echo $CIRCLE_WORKFLOW_WORKSPACE_ID
          exit 0
```

Workflow:

```yaml
workflows:
  build_and_test:
    jobs:
      - build
      - check_formatted:
          requires:
            - build
      - check_for_unused_dependencies:
          requires:
            - build
      - credo:
          requires:
            - build
      - dialyzer:
          requires:
            - build
      - lint:
          requires:
            - build
      - test:
          context: slack-secrets
          requires:
            - build
      - test_features:
          context: slack-secrets
          requires:
            - build
      - notify_coveralls:
          requires: [test, test_features]
```

@narwold commented Feb 25, 2021

@danadaldos Do you have any idea what caused yours to clear up specifically?

@danadaldos

@narwold

I should point out that our feature test step actually has a retry within it, which we think was causing problems by getting bogged down deleting all of the old test coverage data:

```
.WARNING: Deleting data for module 'Elixir.FooApp.Services.KBB.ValuationBuilder' imported from
["/home/circleci/foo_app/Elixir.FooApp.Services.KBB.ValuationBuilder.182.coverdata",
 "/home/circleci/foo_app/Elixir.FooApp.Services.KBB.ValuationBuilder_meck_original.182.coverdata"]
```

IIRC, adding the `--stale` flag fixed it:

```yaml
      - run:
          name: Run Feature Tests
          command: |
            set +e
            sudo chmod 666 /etc/hosts
            echo "127.0.0.1 local.host" >> /etc/hosts
            echo "127.0.0.1 sf-shop.local.host" >> /etc/hosts
            circleci tests glob "test/features/**/*_test.exs" | circleci tests split --split-by=timings > /tmp/tests-to-run
            echo ""
            echo "************ Test files in this container ***************"
            cat /tmp/tests-to-run
            echo ""
            mix coveralls.circle --parallel --timeout=90000 $(cat /tmp/tests-to-run)
            if [ $? -eq 1 ] && [ -z ${RETRIED} ]; then
              set -e
              echo "Retrying failed tests"
              export RETRIED=true
              mix coveralls.circle --parallel --timeout=90000 --stale $(cat /tmp/tests-to-run)
            fi
```

@narwold commented Feb 25, 2021

I don't think ours has any such retry/stale issues. Here's the relevant portion of our config...

```yaml
<% # loop that produces n number of testing steps %>
  - label: Testing <%= package %>
    key: <%= test_key %>
    env:
      COVERALLS_FLAG_NAME: <%= coverage_flag %>
      COVERALLS_GIT_BRANCH: ${BUILDKITE_BRANCH}
      COVERALLS_SERVICE_NAME: ${CI_NAME}
      COVERALLS_SERVICE_NUMBER: ${CI_BUILD_NUMBER}
      COVERALLS_SERVICE_JOB_ID: ${BUILDKITE_JOB_ID}
      COVERALLS_PARALLEL: true
    command: |
      pnpm test -r --filter <%= package %>
      pnpm coveralls < "<%= relative_coverage_files_path %>/lcov.info"
<% # end of loop %>

  - label: "Coveralls uploads complete"
    depends_on:
<% test_runs.each do |test_run| %>
      - <%= test_run[:key] %>
<% end %>
    command: |
      curl -k https://coveralls.io/webhook?repo_token=${COVERALLS_REPO_TOKEN} \
        -d "payload[build_num]=${CI_BUILD_NUMBER}&payload[status]=done"

All the individual jobs send correctly and get a 200 response, and then the "done" call also gets a 200 response with this data:

{"done":true,"url":"https://coveralls.io/builds/########","jobs":##}

All the individual coverage numbers show up in the Coveralls dashboard, yet it still takes forever for the overall status to be reported (2+ hours in some cases).
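
For what it's worth, here's a rough sketch of how that final step could capture the response and fail loudly if Coveralls doesn't confirm the build was closed (assumes `jq` is available in the job image; `CI_BUILD_NUMBER` and `COVERALLS_REPO_TOKEN` are the same variables used in the config above):

```sh
#!/usr/bin/env bash
# Sketch: send the "done" webhook and fail the step unless Coveralls confirms
# the parallel build was closed. Assumes jq is installed in the job image.
set -euo pipefail

response=$(curl -sS "https://coveralls.io/webhook?repo_token=${COVERALLS_REPO_TOKEN}" \
  -d "payload[build_num]=${CI_BUILD_NUMBER}&payload[status]=done")

echo "Coveralls webhook response: ${response}"

# Expect something like {"done":true,"url":"https://coveralls.io/builds/...","jobs":N}
echo "${response}" | jq -e '.done == true' > /dev/null
```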

@narwold commented Feb 25, 2021

Update: Coveralls support advised us that they have been tracking some slowness over the past few days. This might just be on their end.

@danadaldos

@narwold Interesting, let me know if the slowness clears up, because ours was definitely configuration-related.

@afinetooth
Collaborator

@narwold @danadaldos Yes, per this incident from Tuesday, we had performance slowdowns caused by heavy traffic in backup queues dedicated to large projects. Some outlier projects had added hundreds of jobs to those queues, blocking processing for other projects sharing them. We've developed a procedure for clearing these backups faster (typically within ~1 hour of catching one), but we were still dialing it in on Wednesday, so some repos still had performance issues then.

If you think your repo has become caught up in one of these backups, you can email us at support@coveralls.io and we'll give you a status update.

Closing for now.
