v1.30.4 : Pants Filter operation seems stuck #12997

Closed
MrRexZ opened this issue Sep 24, 2021 · 9 comments


MrRexZ commented Sep 24, 2021

Describe the bug

Hi there,

I'm using Pants (version 1.30.4) on both my CI machines and local machines to run automated Scala tests.
I've noticed that Pants operations sometimes get stuck when run in CI.
My script uses the pants filter command to decide which of the input targets to pass to our tests, and it's always during the ./pants filter step that Pants stops making progress.
Does anyone have pointers on how to debug this further so that I can find the root cause?
Thank you! :)

Please see the attached file for the relevant logs.

Here is part of the logs (in the end, I had to cancel this CI job manually; the large gap between the timestamps indicates that pants filter was stuck):

2021-09-23T17:03:20.5708882Z ++ ./pants -ldebug filter --filter-target=anduin/server/test/scala/io/datalogue/anduin/server:anduin //:build_root //:jooq_gen //:scala-library //:scalac //:scalac-plugin-dep //:scalafmt //:scalapbc 3rdparty/jvm/com/amazonaws:redshift 3rdparty/jvm/com/amazonaws:s3 3rdparty/jvm/com/amazonaws:sts 3rdparty/jvm/com/beachape:enumeratum 3rdparty/jvm/com/beachape:enumeratum-circe 3rdparty/jvm/com/bloomberg:jdbc-comdb2 3rdparty/jvm/com/cloudera:jdbc-cloudera-impala 3rdparty/jvm/com/fasterxml/jackson/dataformat:jackson-cbor 3rdparty/jvm/com/github/ghik:silencer-lib 3rdparty/jvm/com/github/julien-truffaut:monocle 3rdparty/jvm/com/google/cloud:google-cloud-storage 3rdparty/jvm/com/google/protobuf:protobuf-runtime 3rdparty/jvm/com/lightbend:akka-management 3rdparty/jvm/com/lightbend:alpakka-csv 3rdparty/jvm/com/lightbend:alpakka-elasticsearch 3rdparty/jvm/com/lightbend:alpakka-ftp 3rdparty/jvm/com/lightbend:alpakka-kinesis 3rdparty/jvm/com/lightbend:alpakka-s3 3rdparty/jvm/com/microsoft/azure:azure-storage 3rdparty/jvm/com/microsoft/sqlserver:jdbc-sqlserver 3rdparty/jvm/com/newrelic/agent/java:newrelic 3rdparty/jvm/com/newrelic/agent/java:newrelic-api 3rdparty/jvm/com/omnisci:jdbc-omnisci 3rdparty/jvm/com/oracle/database/jdbc:jdbc-oracle 3rdparty/jvm/com/sap:jdbc-sap 3rdparty/jvm/com/storm-enroute:scalameter 3rdparty/jvm/com/sun/xml/bind:jaxb-core 3rdparty/jvm/com/sun/xml/bind:jaxb-impl 3rdparty/jvm/com/thesamenet/scalapb:scalapb-runtime 3rdparty/jvm/com/thesamenet/scalapb:scalapb-runtime-grpc 3rdparty/jvm/com/typesafe/akka:akka-cluster 3rdparty/jvm/com/typesafe/akka:akka-distributed-data 3rdparty/jvm/com/typesafe/akka:akka-http 3rdparty/jvm/com/typesafe/akka:akka-http-test 3rdparty/jvm/com/typesafe/akka:akka-http-xml 3rdparty/jvm/com/typesafe/akka:akka-persistence 3rdparty/jvm/com/typesafe/akka:akka-protobuf 3rdparty/jvm/com/typesafe/akka:akka-slf4j 3rdparty/jvm/com/typesafe/akka:akka-stream 3rdparty/jvm/com/typesafe/akka:akka-stream-kafka 
3rdparty/jvm/com/typesafe/akka:akka-stream-test 3rdparty/jvm/com/typesafe/akka:akka-test 3rdparty/jvm/com/typesafe/config:config 3rdparty/jvm/com/univocity:univocity 3rdparty/jvm/commons-io:commons-io 3rdparty/jvm/de/flapdoodle/embed:mongo-embedded 3rdparty/jvm/de/heikoseeberger:akka-http-circe 3rdparty/jvm/io/circe:circe 3rdparty/jvm/io/circe:circe-literal 3rdparty/jvm/io/circe:circe-yaml 3rdparty/jvm/io/confluent:confluent-avro 3rdparty/jvm/io/grpc:grpc-netty-shaded 3rdparty/jvm/io/higherkindness:droste 3rdparty/jvm/io/kamon:kamon 3rdparty/jvm/io/kamon:kamon-test 3rdparty/jvm/io/kamon:kanela-agent 3rdparty/jvm/io/nats:java-nats 3rdparty/jvm/io/nats:java-nats-server-runner 3rdparty/jvm/io/strimzi:oauth 3rdparty/jvm/javax/activation:jaxb-api 3rdparty/jvm/javax/xml/bind:jaxb-api 3rdparty/jvm/junit:junit 3rdparty/jvm/net/snowflake:jdbc-snowflake 3rdparty/jvm/net/sourceforge/jtds:jdbc-jtds 3rdparty/jvm/org/apache/avro:avro 3rdparty/jvm/org/apache/commons:commons-exec 3rdparty/jvm/org/apache/commons:commons-text 3rdparty/jvm/org/apache/directory/server:kerberos 3rdparty/jvm/org/apache/hadoop:hadoop-aws 3rdparty/jvm/org/apache/hadoop:hadoop-azure 3rdparty/jvm/org/apache/hadoop:hadoop-client 3rdparty/jvm/org/apache/hadoop:hadoop-common 3rdparty/jvm/org/apache/hive:jdbc-hive 3rdparty/jvm/org/apache/logging/log4j:log4j 3rdparty/jvm/org/apache/opennlp:opennlp 3rdparty/jvm/org/apache/parquet:parquet-avro 3rdparty/jvm/org/bouncycastle:bcprov-jdk15on 3rdparty/jvm/org/flywaydb:flyway 3rdparty/jvm/org/hsqldb:hsqldb 3rdparty/jvm/org/http4s:http4s 3rdparty/jvm/org/jooq:jooq 3rdparty/jvm/org/log4s:log4s 3rdparty/jvm/org/mariadb/jdbc:jdbc-mariadb 3rdparty/jvm/org/mockito-scala:mockito-scala 3rdparty/jvm/org/neo4j/driver:neo4j 3rdparty/jvm/org/parboiled:parboiled 3rdparty/jvm/org/pf4j:pf4j 3rdparty/jvm/org/reactivemongo:reactivemongo 3rdparty/jvm/org/scala-lang:scala-library 3rdparty/jvm/org/scalatest:scalatest 3rdparty/jvm/org/scalatra:scalate 3rdparty/jvm/org/tpolecat:doobie 
3rdparty/jvm/org/tpolecat:doobie-hikari 3rdparty/jvm/org/tpolecat:doobie-scalatest 3rdparty/jvm/org/typelevel:cats 3rdparty/jvm/org/typelevel:cats-effect 3rdparty/jvm/org/typelevel:fs2grpc 3rdparty/jvm/org/typelevel:mouse 3rdparty/python:numpy 3rdparty/python:pandas 3rdparty/python:pbkdf2 3rdparty/python:psycopg2 3rdparty/python:pyarrow 3rdparty/python:pytest 3rdparty/python:python-dateutil 3rdparty/python:pyyaml 3rdparty/python:requests 3rdparty/python:requirements.txt 3rdparty/python:validators anduin/commons/main/scala/io/datalogue/anduin/commons:commons anduin/commons/test/scala/io/datalogue/anduin/commons:commons anduin/plugins/databricksSpark/3rdparty:spark-jdbc anduin/plugins/databricksSpark/main/resources/META-INF:META-INF anduin/plugins/databricksSpark/main/scala/io/datalogue/anduin/plugins/databricksSpark:databricks-spark anduin/plugins/databricksSpark/main/scala/io/datalogue/anduin/plugins/databricksSpark:databricks-spark-plugin anduin/plugins/elasticsearch/main/resources/META-INF:META-INF anduin/plugins/elasticsearch/main/scala/io/datalogue/anduin/plugins/elasticsearch:elasticsearch anduin/plugins/elasticsearch/main/scala/io/datalogue/anduin/plugins/elasticsearch:elasticsearch-plugin anduin/plugins/elasticsearch/test/resources:resources anduin/plugins/elasticsearch/test/scala/io/datalogue/anduin/plugins/elasticsearch:elasticsearch anduin/plugins/hadoop/main/java/org/apache/parquet/tools:tools anduin/plugins/hadoop/main/resources/META-INF:META-INF anduin/plugins/hadoop/main/scala/io/datalogue/anduin/plugins/hadoop:hadoop anduin/plugins/hadoop/main/scala/io/datalogue/anduin/plugins/hadoop:hadoop-plugin anduin/plugins/hadoop/test/resources:resources anduin/plugins/hadoop/test/scala/io/datalogue/anduin/plugins/hadoop:hadoop anduin/plugins/mws/3rdparty:commons-codec anduin/plugins/mws/3rdparty:commons-logging anduin/plugins/mws/3rdparty:httpclient anduin/plugins/mws/3rdparty:httpcore anduin/plugins/mws/3rdparty:mws-orders 
anduin/plugins/mws/3rdparty:mws-runtime anduin/plugins/mws/main/resources/META-INF:META-INF anduin/plugins/mws/main/scala/io/datalogue/anduin/plugins/mws:mws anduin/plugins/mws/main/scala/io/datalogue/anduin/plugins/mws:mws-plugin anduin/plugins/mws/test/scala/io/datalogue/anduin/plugins/mws:mws anduin/plugins/redshift/main/resources/META-INF:META-INF anduin/plugins/redshift/main/scala/io/datalogue/anduin/plugins/redshift:redshift anduin/plugins/redshift/main/scala/io/datalogue/anduin/plugins/redshift:redshift-plugin anduin/plugins/redshift/test/scala/io/datalogue/anduin/plugins/redshift:redshift anduin/plugins/s3/main/resources/META-INF:META-INF anduin/plugins/s3/main/scala/io/datalogue:s3 anduin/plugins/s3/main/scala/io/datalogue:s3-plugin anduin/plugins/s3/test/resources:resources anduin/plugins/s3/test/scala/io/datalogue/anduin/plugins/s3:s3 anduin/plugins/sap/main/resources/META-INF:META-INF anduin/plugins/sap/main/scala/io/datalogue/anduin/plugins/sap:sap anduin/plugins/sap/main/scala/io/datalogue/anduin/plugins/sap:sap-plugin anduin/plugins/teradata/main/resources/META-INF:META-INF anduin/plugins/teradata/main/scala/io/datalogue/anduin/plugins/teradata:teradata anduin/plugins/teradata/main/scala/io/datalogue/anduin/plugins/teradata:teradata-plugin anduin/plugins/teradata/test/scala/io/datalogue/anduin/plugins/teradata:teradata anduin/server/3rdparty:cbc2jdbc anduin/server/3rdparty:impala anduin/server/3rdparty:omnisci anduin/server/main/resources:resources anduin/server/main/scala/io/datalogue/anduin/server:anduin-app anduin/server/main/scala/io/datalogue/anduin/server:anduin-bin anduin/server/main/scala/io/datalogue/anduin/server:anduin-docker anduin/server/main/scala/io/datalogue/anduin/server:anduin-server-lib anduin/server/main/scala/io/datalogue/anduin/server:databricks-spark-unstable-none-cccc43c72731bad8e6a95897a215325c19703fb9 anduin/server/main/scala/io/datalogue/anduin/server:elasticsearch-unstable-none-14876994b6d86e89c72f202726c438d64cd0e247 
anduin/server/main/scala/io/datalogue/anduin/server:hadoop-unstable-none-47527a13a44389a075ca2052b666b4e39d079bd3 anduin/server/main/scala/io/datalogue/anduin/server:kanela-agent-unstable-runtime-d6ea599397842915ba40c15d90591254083c9dd8 anduin/server/main/scala/io/datalogue/anduin/server:move_anduin_plugin anduin/server/main/scala/io/datalogue/anduin/server:mws-unstable-none-fff76cb8d6cbbedb77bd6a3851709c4ab2a8e60e anduin/server/main/scala/io/datalogue/anduin/server:newrelic-unstable-runtime-b51e41a12c440ed39de800942da2d473edef1daf anduin/server/main/scala/io/datalogue/anduin/server:redshift-unstable-none-7a2b1f6b1f9d079f2fa30242366cc329199a142e anduin/server/main/scala/io/datalogue/anduin/server:s3-unstable-none-177b201c1b065049289590df6dbc8b1b0d98fdc1 anduin/server/main/scala/io/datalogue/anduin/server:sap-unstable-none-0441d0258e0cbf1eb67e78f7fbf5cfe2b28776dc anduin/server/main/scala/io/datalogue/anduin/server:teradata-unstable-none-2d006a4ac578ff1df5ecb7f4af885a826153a799 anduin/server/test/resources:resources anduin/server/test/scala/io/datalogue/anduin/server:anduin apollo:apollo-docker apollo:apollo-docker-coverage apollo:build apollo:build-coverage apollo:bundle apollo:bundle-coverage cerby/main/resources:migrations cerby/main/resources:resources cerby/main/scala/io/datalogue/cerby:cerby-app cerby/main/scala/io/datalogue/cerby:cerby-bin cerby/main/scala/io/datalogue/cerby:cerby-docker cerby/main/scala/io/datalogue/cerby:cerby-server-lib cerby/main/scala/io/datalogue/cerby:jwt cerby/main/scala/io/datalogue/cerby:ldap cerby/main/scala/io/datalogue/cerby:log4j cerby/main/scala/io/datalogue/cerby:mail cerby/main/scala/io/datalogue/cerby:newrelic-unstable-runtime-b51e41a12c440ed39de800942da2d473edef1daf cerby/main/scala/io/datalogue/cerby:validator cerby/main/scala/io/datalogue/cerby:zxcvbn cerby/test/resources:resources cerby/test/scala/io/datalogue/cerby:cerby cerby/test/scala/io/datalogue/cerby:sttp cerby/test/scala/io/datalogue/cerby:wiremock 
cs/pipeline-tests:pipeline-tests cs/pipeline-tests:pipeline-tests-lib dtl-python-sdk:datalogue dtl-python-sdk:test_resources dtl-python-sdk:tests dtl-python-sdk:tests-utils hermes/main/scala/io/datalogue/hermes:hermes hermes/test/resources:resources hermes/test/scala/io/datalogue/hermes:hermes nexus/main/scala/io/datalogue/nexus:nexus nexus/test/scala/io/datalogue/nexus:nexus picasso/main/scala/io/datalogue/picasso:jsoup picasso/main/scala/io/datalogue/picasso:mongo picasso/main/scala/io/datalogue/picasso:picasso picasso/main/scala/io/datalogue/picasso:xslx-streamer picasso/test/resources/excel:excel picasso/test/scala/com/monitorjbl:monitorjbl picasso/test/scala/com/monitorjbl:xslx-streamer picasso/test/scala/io/datalogue/picasso:picasso protobuf:proto scout/main/java/db:migrations scout/main/resources:db scout/main/resources:migrations scout/main/resources:resources scout/main/scala:newrelic-unstable-runtime-b51e41a12c440ed39de800942da2d473edef1daf scout/main/scala:scout-app scout/main/scala:scout-bin scout/main/scala:scout-docker scout/main/scala:scout-lib scout/test/resources:resources scout/test/scala:scout utilities/main/scala:elastic4s utilities/main/scala:fs2 utilities/main/scala:inet utilities/main/scala:utilities utilities/test/scala/io/datalogue/utilities:utilities yggdrasil/main/resources:migrations yggdrasil/main/resources:resources yggdrasil/main/scala/io/datalogue/yggdrasil:yggdrasil-app yggdrasil/main/scala/io/datalogue/yggdrasil:yggdrasil-bin yggdrasil/main/scala/io/datalogue/yggdrasil:yggdrasil-docker yggdrasil/main/scala/io/datalogue/yggdrasil:yggdrasil-lib yggdrasil/test/resources:resources yggdrasil/test/scala/io/datalogue/yggdrasil:yggdrasil
...
2021-09-23T17:03:23.0567003Z 17:03:22 [DEBUG] workunit_store: Starting: Fingerprinting: scout/main/resources/migrations/V69__ADD_PERMISSION_DETAILS.sql
2021-09-23T17:03:23.0568508Z 17:03:22 [DEBUG] workunit_store: Starting: Reading dtl-python-sdk/tests/unit/transformations
2021-09-23T17:03:23.0569875Z 17:03:22 [DEBUG] workunit_store: Completed: Reading dtl-python-sdk/datalogue/models
2021-09-23T17:03:23.0571557Z 17:03:22 [DEBUG] workunit_store: Completed: Fingerprinting: cerby/test/resources/google-open-id-connect-discovery.json
2021-09-23T17:03:23.0573261Z 17:03:22 [DEBUG] workunit_store: Starting: Reading dtl-python-sdk/datalogue/models/transformations
2021-09-23T17:55:28.9747358Z ##[error]The operation was canceled.
2021-09-23T17:55:28.9841737Z ##[group]Run scacap/action-surefire-report@v1
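
The size of the gap in the log above can be checked with a little arithmetic. The following is a minimal sketch, not part of the original report: `seconds_of_day` and `gap_seconds` are hypothetical helper names, and the sketch assumes both timestamps fall on the same UTC day and ignores fractional seconds.

```shell
#!/bin/bash
# Hypothetical helper: convert the time-of-day part of a GitHub Actions log
# timestamp (e.g. 2021-09-23T17:03:23.0573261Z) into seconds since midnight.
seconds_of_day() {
  local t h rest m s
  t=${1#*T}      # drop the date part: 17:03:23.0573261Z
  t=${t%%.*}     # drop fractional seconds and zone: 17:03:23
  h=${t%%:*}
  rest=${t#*:}
  m=${rest%%:*}
  s=${rest#*:}
  # Strip one leading zero so that values like 08 are not parsed as octal.
  echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Hypothetical helper: whole seconds between two timestamps on the same day.
gap_seconds() {
  echo $(( $(seconds_of_day "$2") - $(seconds_of_day "$1") ))
}

# Last Pants log line versus the cancellation line from the log above.
gap_seconds 2021-09-23T17:03:23.0573261Z 2021-09-23T17:55:28.9747358Z   # prints 3125
```

3125 seconds is roughly 52 minutes with no output from Pants before the job was cancelled.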

The point where it gets stuck is different on every run. In this case it's workunit_store: Starting: Reading dtl-python-sdk/datalogue/models/transformations, but in other runs it happens while reading other files.
So I don't think the files themselves are the problem; it's more likely something that happens while the files are being read.
I've tried running the command locally (where most of the other operations are, of course, different from the ones usually run in CI), and the pants filter command works there.

Here's my pants.toml config:

[GLOBAL]
pants_version = "1.30.4"
v1 = true
v2 = true
process_execution_local_parallelism = 4
pythonpath.add = [
    "%(buildroot)s/pants-plugins/src/python"
]
backend_packages.add = [
    'verst.pants.docker',
    'scalapb.pants',
    'scalatest.pants',
    'anduinplugin.pants',
    'jooq.pants',
]

backend_packages2.add = [
    'pants.backend.python',
    'pants.backend.python.lint.black'
]

plugins = ['pantsbuild.pants.contrib.node']

pants_ignore = [
    'node_modules/**',
    '.pants.d/**',
    '.pids/**'
]

build_file_prelude_globs = [
  "pants-plugins/macros.py",
]

[source]
root_patterns = [
    "*/main/*",
    "*/test/*",
    "*/src",
    "build-support/bin",
]
marker_filenames = ["SOURCE_ROOT"]

[resolver]
resolver = "coursier"

[resolve.coursier]
# jvm option in case of large resolves
jvm_options = ['-Xmx4g', '-XX:MaxMetaspaceSize=256m']

[export]
# Same if needed for large resolves
jvm_options = ['-Xmx4g', '-XX:MaxMetaspaceSize=256m']

[coursier]
repos = [
    'https://maven-central.storage-download.googleapis.com/maven2/',
    'https://repo1.maven.org/maven2/',
    'https://packages.confluent.io/maven'
]

[scala]
version = 'custom'
suffix_version = 2.12
scalac_plugin_dep = '//:scalac-plugin-dep'
scalac_plugins = ['silencer', 'kind-projector', 'bm4']

[compile.rsc]
# Below to prevent a stack overflow error
jvm_options = [
    '-XX:MetaspaceSize=1G',
    '-XX:MaxMetaspaceSize=2G',
    '-Xmx2G',
    '-Xss100m',
    '-Xms512M'
]
args = [
    '-S-encoding', '-SUTF-8',
    '-S-unchecked',
    '-S-deprecation',
    '-S-feature',
    '-S-Xlint',
    '-S-Xfatal-warnings',
    '-S-Ypartial-unification', # To be removed with scala 2.13
    '-S-Ypatmat-exhaust-depth', '-S30',
    '-S-language:higherKinds',
    '-S-language:existentials',
    '-S-language:reflectiveCalls',
    '-S-language:implicitConversions'
]
compiler_option_sets_enabled_args = "{'nowarn': ['-S-nowarn']}"

[test.scalatest]
# Below to prevent a stack overflow error
jvm_options = [
    '-XX:MetaspaceSize=1G',
    '-XX:MaxMetaspaceSize=2G',
    '-Xmx2G',
    '-Xss100m',
    '-Xms512M'
]

[scalafmt]
config = '.scalafmt.conf'

[eslint]
setupdir = "%(buildroot)s/build-support/eslint"
config = "%(buildroot)s/build-support/eslint/.eslintrc"
# TODO: make ESlint work
skip = true

[black]
config = "pyproject.toml"

[pytest]
version = "pytest>=3.6.3"
args = ["-vv"]

[jvm-platform]
default_platform = "java8"
platforms = """
{
  'java8': {'source': '8', 'target': '8', 'args': [] },
}
"""

[cache.docker-publish]
ignore = true

[cache.docker-build.docker-jvm]
ignore = true

[cache.docker-build.docker-node]
ignore = true

[cache.resolve.node]

ignore = true

[python-setup]
interpreter_constraints = ["CPython==3.8.*"]

[cache]
read_from = ['https://<TRUNCATED>']

[node-distribution]
version = "v12.18.0"

[yarnpkg-distribution]
version = "v1.22.4"

Pants version
1.30.4

OS
Linux

@MrRexZ MrRexZ added the bug label Sep 24, 2021
@MrRexZ MrRexZ changed the title Pants Filter stuck v1.30.4 : Pants Filter operation stuck Sep 24, 2021
@MrRexZ MrRexZ changed the title v1.30.4 : Pants Filter operation stuck v1.30.4 : Pants Filter operation seems stuck Sep 24, 2021
Contributor

jsirois commented Sep 24, 2021

Are the pants.toml options you have above reasonable for the CI nodes you're running on? In particular, the heap sizes and parallelism? It could be that these values are good for a developer machine but bad for CI, and you're hitting GC thrash in Java tools, for example. If this seems likely to you, you can use a separate TOML file for CI that overrides the values that should be different in CI via, for example:

PANTS_CONFIG_FILES=pants.ci.toml ./pants ...
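
In a script shared between CI and developer machines, the override could be applied only when running in CI. This is a minimal sketch under stated assumptions: `select_pants_config` is a hypothetical helper, the CI system is assumed to export `CI=true` (GitHub Actions does), and the override file is assumed to be named `pants.ci.toml` as in the example above.

```shell
#!/bin/bash
# Hypothetical helper: print the config file to pass via PANTS_CONFIG_FILES,
# or print nothing when no CI override should be applied.
#   $1: value of the CI environment variable (may be empty)
#   $2: path of the CI override file
select_pants_config() {
  local ci_flag=$1 override_file=$2
  if [ "$ci_flag" = "true" ] && [ -f "$override_file" ]; then
    printf '%s\n' "$override_file"
  fi
}

# Example wiring (commented out; ./pants only exists inside the repository):
#   config=$(select_pants_config "${CI:-}" pants.ci.toml)
#   if [ -n "$config" ]; then
#     PANTS_CONFIG_FILES="$config" ./pants ...
#   else
#     ./pants ...
#   fi
```

Keeping the selection in one function means the developer workflow is unchanged when the override file is absent.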

Member

stuhood commented Sep 24, 2021

@MrRexZ : I presume that this issue also reproduces with the default logging (--level=info rather than debug)? I ask, because we've fixed a few issues in the 2.x series related to deadlocks with higher log levels, but those fixes likely haven't made it back to 1.30.x.

@benjyw benjyw self-assigned this Sep 24, 2021
Author

MrRexZ commented Sep 24, 2021

@jsirois Thanks for the fast reply! :)

Yes, I think they are reasonable values (maybe they could be tuned further to speed things up, but I don't think that would fix the issue I'm having).
My CI machine (an EC2 m5.2xlarge instance) and my laptop are roughly similar (both have 8 cores and 32 GB of RAM).
I had tried changing the parallelism value in the config before (initially I had left it unset), but setting it explicitly made no difference compared with the earlier runs (both in CI and locally).

I've never changed the heap size (I never set a max size in my code), so I assume the system will use as much as it's allocated.

Author

MrRexZ commented Sep 24, 2021

@stuhood Yes, I first encountered it with default logging. I'll try using lower log levels to see if the situation improves.

I ask, because we've fixed a few issues in the 2.x series related to deadlocks with higher log levels

Thanks a lot for the info! Going off on a tangent for a bit: our team has started using Pants, and we're on Pants 1.30.x because Pants 2.x doesn't support JVM languages. On our developers' machines, Pants processes occasionally get stuck on some operations (which is resolved by terminating the Pants processes and re-running). It's good to hear that there are improvements in 2.x, but we would be super grateful to have them in a Pants version that can build JVM languages (Java and Scala), either as a backport or through JVM support in Pants v2 :)

Thank you!

Member

stuhood commented Sep 24, 2021

Ok, thanks. The next step will unfortunately likely be attaching a debugger and getting a stacktrace: you'll need gdb in your container.

Are you able to open an SSH session into it while the process is hung?

If so, things are much easier: you can determine the pid of the hung process, and then run:

echo -e 'attach '$pid'\nset pagination off\nthread apply all bt' | sudo gdb

If you are unable to attach to the container, you should add the following script to your CI environment:

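#!/bin/bash
set -eux

# $1: seconds to wait before sampling.
# $2: pattern matched against full process command lines (pgrep -f).
sleep "$1"
pid=$(pgrep -f "$2" | head -n 1)
# Attach gdb to that process and dump a backtrace of every thread.
echo -e "attach $pid\nset pagination off\nthread apply all bt" | sudo gdb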

...and then call it from your CI script immediately above calling pants, like so:

# Launch in the background, sleep 360 seconds, and then backtrace the first process whose command line matches `pantsd`.
./sleep_then_backtrace 360 pantsd &
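
If `pgrep` finds nothing, the script above still pipes an `attach` command with an empty PID into gdb, because the exit status of the `pgrep | head` pipeline is that of `head`. A slightly more defensive variant could validate the PID first. This is a minimal sketch, not part of the script suggested in this thread; `gdb_backtrace_commands` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the gdb commands for a full thread backtrace,
# refusing anything that is not a purely numeric PID.
gdb_backtrace_commands() {
  case "$1" in
    ''|*[!0-9]*)
      echo "not a PID: '$1'" >&2
      return 1
      ;;
  esac
  printf 'attach %s\nset pagination off\nthread apply all bt\n' "$1"
}

# Example wiring inside sleep_then_backtrace (commented out so that sourcing
# this file has no side effects):
#   pid=$(pgrep -f "$2" | head -n 1)
#   commands=$(gdb_backtrace_commands "$pid")   # aborts under `set -e` if no PID was found
#   printf '%s\n' "$commands" | sudo gdb
```

Because `pgrep -f` matches full command lines, it can also match the helper script's own command line, which contains the pattern as an argument. It is worth checking in the `set -x` trace which PID was actually picked.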

Author

MrRexZ commented Oct 5, 2021

@stuhood

I've only just had time to collect the logs, and I've made some changes since then (I replaced the ./pants filter command with my own bash command as a workaround). However, a similar symptom (the Pants process getting stuck) still shows up when executing another Pants goal (the test goal) right after the point where filter used to run.

I've managed to obtain the stack trace following your suggestion above; let me know if the log attached to this post is helpful.
For context, ./pants test (the first line in the file) starts at 2021-10-05T18:23:30.6518614Z, and the GDB log ends at 2021-10-05T18:29:34.4443088Z.
I then had to cancel the job (via the GitHub Actions workflow) at 2021-10-05T18:51:34.8257802Z (as can be seen in the last line of the log).

gdb_log_1.txt
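
Until the hang itself is fixed, one way to avoid cancelling the job by hand is to put a deadline on the Pants step. The following is a minimal sketch, not something proposed in this thread: `run_with_deadline` is a hypothetical name, and it assumes the `timeout` utility from GNU coreutils is available on the CI image.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command under a deadline using `timeout`.
# TERM is sent at the deadline, then KILL 10 seconds later if the process
# is still alive.
#   $1: deadline in seconds; remaining arguments: the command to run.
run_with_deadline() {
  local deadline_seconds=$1
  shift
  timeout -k 10 -s TERM "$deadline_seconds" "$@"
}

# Example wiring (commented out; ./pants only exists inside the repository):
#   status=0
#   run_with_deadline 600 ./pants test ... || status=$?
#   if [ "$status" -eq 124 ]; then
#     echo "pants did not finish within 600 seconds" >&2
#   fi
```

`timeout` exits with status 124 when the deadline is reached and the command stops on TERM, which lets the CI script tell a hang apart from an ordinary test failure.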

@stuhood stuhood assigned stuhood and unassigned benjyw Oct 5, 2021
Member

stuhood commented Oct 5, 2021

Thanks a lot! That trace is exactly what we needed.

It looks like you found a new issue: I've posted #13127 with the fix, and will get it cherrypicked to 1.30.x ASAP.

Member

stuhood commented Oct 7, 2021

#13127 has been picked, and I'm getting one other fix in proactively in #13149: once those have both landed I'll cut a new release. Thanks for your patience!

Member

stuhood commented Oct 8, 2021

Pants 1.30.5rc1 is now released containing this fix: thanks for your patience! https://pypi.org/project/pantsbuild.pants/1.30.5rc1/
