Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pex files do not allow for OpenTelemetry auto-instrumentation #2065

Closed
Peder2911 opened this issue Feb 24, 2023 · 9 comments · Fixed by #2068
Closed

Pex files do not allow for OpenTelemetry auto-instrumentation #2065

Peder2911 opened this issue Feb 24, 2023 · 9 comments · Fixed by #2068

Comments

@Peder2911
Copy link

We are using pantsbuild to set up a developer platform with an emphasis on a nice, frictionless developer experience. A part of this experience is codeless auto-instrumentation such as is provided by Opentelemetry with auto instrumentation. Specifically, we are looking to auto-instrument using the Kubernetes Operator offered by OpenTelemetry.

Auto-instrumentation with OpenTelemetry works by monkey-patching libraries, creating passthrough functions that wrap existing functions with logging statements. The way this works is kind of obscure to me, and is something I have to figure out a bit more before I can give constructive input into how to support this.

Auto-instrumentation of non-pexified applications can be as easy as running (for example from a clean Python3.9 docker container):

pip install flask opentelemetry-distro opentelemetry-instrumentation-flask

cat << EOF > app.py
import flask
app = flask.Flask(__name__)
@app.route("/")
def helloworld():
  return "<h1>Hello world!</h1>"
app.run()
EOF

opentelemetry-instrument \
  --traces_exporter console \
  --metrics_exporter console \
 python app.py

This will spin up a Flask server that also emits telemetry. This separation of code and logging is super useful for maintaining a clean developer experience.

This does not work out-of-the-box with Pex, however. Ideally, one would be able to run the following to achieve the same thing:

pip install pex opentelemetry-distro opentelemetry-instrumentation-flask

cat << EOF > app.py
import flask
app = flask.Flask(__name__)
@app.route("/")
def helloworld():
  return "Hello from Pex"
app.run()
EOF

pex flask -o app.pex --exe ./app.py
opentelemetry-instrument \
  --metrics_exporter console \
  --traces_exporter console \
  ./app.pex

This does, as expected, not allow OpenTelemetry to inject any instrumentation ,and will not result in any telemetry being emitted. It would be pretty awesome if it did though! I would appreciate any thoughts and ideas for how to implement this.

@Peder2911
Copy link
Author

@jsirois helpfully showed me how I could make OpenTelemetry work within a Pex, with the auto-instrumentation being run as the Pex entrypoint: https://gist.github.com/Peder2911/4842c21c131c795d24e4f76883e9b315

This is fine, though I would really like to make this process external to Pex for the above-mentioned reasons (keeping telemetry out of the way of developers). Also, I could not get this to work in my Pants repo, but that might just be my lack of configuration skills showing: https://github.com/Peder2911/otel_pants_reprex

@Peder2911
Copy link
Author

@jsirois
Copy link
Member

jsirois commented Feb 24, 2023

Also, I could not get this to work in my Pants repo, but that might just be my lack of configuration skills showing: https://github.com/Peder2911/otel_pants_reprex

The issue there is fixed by:

diff --git a/apps/dockerfiles/app.dockerfile b/apps/dockerfiles/app.dockerfile
index 5b47ef0..7138bdb 100644
--- a/apps/dockerfiles/app.dockerfile
+++ b/apps/dockerfiles/app.dockerfile
@@ -3,4 +3,4 @@ FROM python:3.9
 ARG PEX_FILE
 COPY $PEX_FILE /app.pex
 RUN PEX_TOOLS=1 /app.pex venv --non-hermetic-scripts --bin-path prepend /app/venv
-ENTRYPOINT ./app.pex --metrics_exporter console --traces_exporter console uvicorn asciinator.app:app --host 0.0.0.0
+ENTRYPOINT /app/venv/pex --metrics_exporter console --traces_exporter console uvicorn asciinator.app:app --host 0.0.0.0

In other words, you created a non-hermetic venv successfully, but then missed using it as the entrypoint - you mistakenly used the original PEX.

@jsirois
Copy link
Member

jsirois commented Feb 24, 2023

This is fine, though I would really like to make this process external to Pex for the above-mentioned reasons (keeping telemetry out of the way of developers).

There is no need to make the default entrypoint of the PEX be opentelemetry-instrument, you just need to ensure the PEX contains all the opentelemetry dependencies. Normally you run the PEX and are none-the-wiser that it happens to contain opentelemetry (except for increased PEX size). When you want to use opentelemetry, follow the PEX_TOOLS... procedure and then set the PEX_SCRIPT=opentelemetry-instrument environment variable to have the PEX installed in the venv use that as its alternate entrypoint.

As far as the need to include opentelemetry in the PEX, I'm not sure that can be eliminated. I'll dig a bit more and report back.

@jsirois
Copy link
Member

jsirois commented Feb 24, 2023

As far as the need to include opentelemetry in the PEX, I'm not sure that can be eliminated. I'll dig a bit more and report back.

So opentelemetry deps definitely need to be included in the PEX. But, with a quick hack session to extend the --non-hermetic-scripts PEX_TOOLS venv option to regular PEX builds, This works:

$ python -mpex \
  opentelemetry-distro \
  opentelemetry-instrumentation-flask \
  flask \
  -D ~/support/OpenTelemetryPex/src \
  -c flask --inject-args '--app app run' \
  -o app.pex \
  --non-hermetic-venv-scripts \
  --venv

That yields a PEX that, run normally, just functions like a flask app:

$ ./app.pex
sys.path:
/home/jsirois/dev/pantsbuild/jsirois-pex
/home/jsirois/.pex/venvs/1c47953879efb9ed9fdb692fb1bde1aed7129c45/5985ed09b49a653d6596b0e14d134c5456cf1a9f
/usr/lib/python310.zip
/usr/lib/python3.10
/usr/lib/python3.10/lib-dynload
/home/jsirois/.pex/venvs/1c47953879efb9ed9fdb692fb1bde1aed7129c45/5985ed09b49a653d6596b0e14d134c5456cf1a9f/lib/python3.10/site-packages
 * Serving Flask app 'app'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
Press CTRL+C to quit

But, when run with the alternate opentelemetry-instrument entrypoint, yields an auto-instrumented run of the same PEX:

$ PEX_SCRIPT=opentelemetry-instrument ./app.pex --traces_exporter console --metrics_exporter console ./app.pex
Error in sitecustomize; set PYTHONVERBOSE for traceback:
ModuleNotFoundError: No module named 'opentelemetry'
sys.path:
/home/jsirois/dev/pantsbuild/jsirois-pex
/home/jsirois/.pex/venvs/1c47953879efb9ed9fdb692fb1bde1aed7129c45/5985ed09b49a653d6596b0e14d134c5456cf1a9f
/home/jsirois/.pex/venvs/s/b4d82d80/venv/lib/python3.10/site-packages/opentelemetry/instrumentation/auto_instrumentation
/home/jsirois/dev/pantsbuild/jsirois-pex
/usr/lib/python310.zip
/usr/lib/python3.10
/usr/lib/python3.10/lib-dynload
/home/jsirois/.pex/venvs/1c47953879efb9ed9fdb692fb1bde1aed7129c45/5985ed09b49a653d6596b0e14d134c5456cf1a9f/lib/python3.10/site-packages
 * Serving Flask app 'app'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
Press CTRL+C to quit
^C{"resource_metrics": [{"resource": {"attributes": {"telemetry.sdk.language": "python", "telemetry.sdk.name": "opentelemetry", "telemetry.sdk.version": "1.16.0", "telemetry.auto.version": "0.37b0", "service.name": "unknown_service"}, "schema_url": ""}, "scope_metrics": [], "schema_url": ""}]}

jsirois added a commit to jsirois/pex that referenced this issue Feb 25, 2023
This provides the same option when building a PEX as the
`--non-hermetic-scripts` does when creating a venv from a PEX using
`pex-tools`. In addition, both means of creating a non-hermetic venv
are extended to the venv `pex` script itself.

Closes pex-tool#2065
jsirois added a commit that referenced this issue Feb 26, 2023
This provides the same option when building a PEX as the
`--non-hermetic-scripts` does when creating a venv from a PEX using
`pex-tools`. In addition, both means of creating a non-hermetic venv
are extended to the venv `pex` script itself.

Closes #2065
@jsirois
Copy link
Member

jsirois commented Feb 27, 2023

Alright @Peder2911, with --non-hermetic-venv-scripts and new support in latest Pants for this (pantsbuild/pants@2d10fc3), this presents the closest option I think, to what you initially desired:

You build an opentelemetry PEX one time and include this in a base image:

pex \
  opentelemetry-distro \
  opentelemetry-instrumentation-flask \
  --no-strip-pex-env \
  -c opentelemetry-instrument \
  -o opentelemetry.pex

N.B.: The opentelemetry PEX should include all opentelemetry-instrumentation-* any of your apps may need. The key here for this PEX is --no-strip-pex-env.

Then, each app that wants to be instrumentable looks like so:

pex \
  flask \
  -D ~/support/OpenTelemetryPex/src \
  -c flask \
  --inject-args '--app app run' \
  --no-strip-pex-env \
  --venv \
  --non-hermetic-venv-scripts
  -o app.pex

In this case the critical options are, again --no-strip-pex-env, but also --venv --non-hermetic-venv-scripts. So these 3 flags are the leak to your developers; you must tell them they have to set those "just because" if they ever want instrumentation in production.

With that setup, you instrument a random app like so:

PEX_PATH=app.pex:opentelemetry.pex \
  ./opentelemetry.pex \
    --traces_exporter console \
    --metrics_exporter console \
    ./app.pex
Error in sitecustomize; set PYTHONVERBOSE for traceback:
ModuleNotFoundError: No module named 'opentelemetry'
 * Tip: There are .env or .flaskenv files present. Do "pip install python-dotenv" to use them.
sys.path:
/home/jsirois/dev/pantsbuild/jsirois-pants
/home/jsirois/.pex/venvs/56347834ba57d87009d61ab4c6051d6b53b88844/b6c4ced144deba18bee8fb73aa09dafbbda71843
/home/jsirois/.pex/installed_wheels/b994662c2ac27118fdef4c5273299815579b77e1298dd0f5c9fbb3751a746023/opentelemetry_instrumentation-0.37b0-py3-none-any.whl/opentelemetry/instrumentation/auto_instrumentation
/home/jsirois/dev/pantsbuild/jsirois-pants
/usr/lib/python310.zip
/usr/lib/python3.10
/usr/lib/python3.10/lib-dynload
/home/jsirois/.pex/venvs/56347834ba57d87009d61ab4c6051d6b53b88844/b6c4ced144deba18bee8fb73aa09dafbbda71843/lib/python3.10/site-packages
 * Serving Flask app 'app'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
Press CTRL+C to quit
^C{"resource_metrics": [{"resource": {"attributes": {"telemetry.sdk.language": "python", "telemetry.sdk.name": "opentelemetry", "telemetry.sdk.version": "1.16.0", "telemetry.auto.version": "0.37b0", "service.name": "unknown_service"}, "schema_url": ""}, "scope_metrics": [], "schema_url": ""}]}

In summary:

  • Good: You don't need to know the ./app.pex entrypoint.
  • Good: The ./app.pex need not have any depndencies on opentelemetry*
  • Bad: The ./app.pex must have been built with these flags: --no-strip-pex-env --venv --non-hermetic-venv-scripts

@jsirois
Copy link
Member

jsirois commented Feb 28, 2023

Actually, by including a pex requirement in the opentelemetry.pex you can fully eliminate the bad.

Creating an opentelemetry.pex tool like so:

pex \
    opentelemetry-distro \
    opentelemetry-instrumentation-flask \
    pex \
    --no-strip-pex-env \
    --venv \
    --non-hermetic-venv-scripts \
    -c pex-tools \
    -o opentelemetry.pex

Allows you to do this to any PEX with no restrictions on how the PEX was built and not knowing the PEX entrypoint:

RUN \
    PEX_PATH=opentelemetry.pex \
    ./opentelemetry.pex \
    ./app.pex \
    venv \
    --non-hermetic-scripts \
        app/venv

And that venv can then be run auto-instrumented via:

ENTRYPOINT \
    app/venv/bin/opentelemetry-instrument \
    --traces_exporter console \
    --metrics_exporter console \
        ./app/venv/pex

@Peder2911
Copy link
Author

This is a really elegant solution @jsirois, will try this out in my repo right away!

@Peder2911
Copy link
Author

Beautiful! It works!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants