Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor package distribution #445

Closed
reyang opened this issue Jan 14, 2019 · 8 comments
Closed

Refactor package distribution #445

reyang opened this issue Jan 14, 2019 · 8 comments

Comments

@reyang
Copy link
Contributor

reyang commented Jan 14, 2019

Currently OpenCensus Python is released as a monolithic package pip install opencensus which has the following disadvantages:

  1. One will have to install all the dependencies even if most of them are not being used. This also adds concerns on deployment time/size and security.
  2. Some of the exporters / integration with 3rd party libraries have to bind with the core OpenCensus Python release schedule. This is problematic for both the core OpenCensus Python and 3rd party integrations when it comes to the release / hotfix schedule.

The proposal is to refactor the current monolithic package using the namespace packaging approach https://packaging.python.org/guides/packaging-namespace-packages/.
Specifically, https://packaging.python.org/guides/packaging-namespace-packages/#pkgutil-style-namespace-packages will be used given we need to support both Python 2.x and 3.x.

@reyang
Copy link
Contributor Author

reyang commented Jan 14, 2019

@c24t

@c24t
Copy link
Member

c24t commented Jan 15, 2019

Following up from a conversation with @reyang:

We have two related problems here: how to split up the existing single distribution package (following the packaging docs, I'll call this a distribution), and how/whether to split the code in this repo into multiple repos. For now I'm assuming that we want to keep the existing package/namespace structure, and it looks like namespace packages let us do exactly that while moving the code between distributions and repos.

It's clear that we need a separate distribution for third party trace integrations (opencensus.trace.ext), but not clear whether we should move these into separate repos. For comparison, java client integrations are kept in the client repo, but in a separate contrib package. Keeping multiple distributions in the same repo is standard practice (see e.g. the GCP python client library repo, which includes a few dozen distributions), but it does make development more complicated.

We may also want to move the exporters (opencensus.trace.exporters, opencensus.stats.exporters) into a separate distribution, or move each exporter into its own distribution. Unless we decide that we want a single distribution per repo, I think there's less reason to move these than third party integrations.

Then there are other components like the google cloud format propagator that are backend-specific, but can't be easily extracted from the library. I don't think we should touch this now, but worth keeping in mind if we want to prioritize having a small core library later.

@c24t
Copy link
Member

c24t commented Jan 15, 2019

One will have to install all the dependencies even if most of them are not being used

FWIW our only current dependency is google-api-core. The exporter-specific dependencies are only installed if you install the client as e.g. pip install 'opencensus[stackdriver]'.

@reyang
Copy link
Contributor Author

reyang commented Jan 15, 2019

Good point @c24t.

I suggest that we do this step by step:

  1. Split the existing single distribution package, while keeping all the code in the current repo. In this phase, we don't need to change the existing test/release process. The goal is simple - integration with 3rd party libraries (e.g. Flask, Django) will not be installed by default, they will be provided in separate packages (e.g. opencensus-ext-flask, opencensus-ext-django) - we need to come up with a better naming guidance, let's do that in PRs.
  2. Use one integration outside the main repo (e.g. https://github.com/opencensus-integrations/ocpymemcache/), and practice what we did in phase 1.
  3. Split the exporters into separate distribution packages.

@reyang
Copy link
Contributor Author

reyang commented Jan 15, 2019

You're right, the current explicit dependency (things that will be installed while running pip install opencensus) is https://github.com/census-instrumentation/opencensus-python/blob/master/setup.py#L26.

There are other types of dependencies:

  1. Runtime dependency, such like Flask integration. Currently we don't specify package dependency, which means there is no semantic versioning guarentee, users would see exception on import if the actual dependency is missing.
  2. 3rd party introduced explicit dependency, e.g. https://github.com/opencensus-integrations/ocpymemcache/blob/master/setup.py#L20.

@c24t , what is your suggestion? Do we want to converge (e.g. having an explicit dependency from Flask integration distribution package to Flask)?
I see both pros and cons:

  1. pros: if the user decided to install opencensus-ext-flask, packaging system will get the dependencies and make sure the version compatibility.
  2. cons: integration that targets multiple major versions (e.g. https://github.com/census-instrumentation/opencensus-python/blob/master/opencensus/trace/ext/httplib/trace.py#L25) would be tricky if the underlying dependency is not part of the standard Python library.

@c24t
Copy link
Member

c24t commented Jan 15, 2019

Step 1 sounds exactly right to me.

2. Use one integration outside the main repo (e.g. https://github.com/opencensus-integrations/ocpymemcache/), and practice what we did in phase 1.

Do you mean using a namespace package to make the package name consistent with the other integrations in the library? Or moving the code into the repo alongside the other integrations? If the former that sounds great too.

Do we want to converge (e.g. having an explicit dependency from Flask integration distribution package to Flask)?

It looks like we can use environment markers to solve the problem of integrations importing different packages at runtime (see this blog for an example), but this may be a headache.

In any case I think it's more important that we enforce a minimum version for e.g. flask than that we install the right version of flask on installing the OC integration.

@c24t
Copy link
Member

c24t commented Jan 15, 2019

@reyang do you have an opinion on which integrations should go in the client library repo and which should live in their own repo in https://github.com/opencensus-integrations?

@reyang
Copy link
Contributor Author

reyang commented Jan 15, 2019

Do you mean using a namespace package to make the package name consistent with the other integrations in the library? Or moving the code into the repo alongside the other integrations? If the former that sounds great too.

The former.

It looks like we can use environment markers to solve the problem of integrations importing different packages at runtime (see this blog for an example), but this may be a headache.

In any case I think it's more important that we enforce a minimum version for e.g. flask than that we install the right version of flask on installing the OC integration.

Thank you! I'll investigate.

@reyang do you have an opinion on which integrations should go in the client library repo and which should live in their own repo in https://github.com/opencensus-integrations?

I don't have a very strong opinion. I believe the httplib/request and things that come with the default Python installation should be included into the core opencensus repo. For 3rd party things like MySQL, Flask, Django, I think it is mainly determined by the maintainer's preference.

This was referenced Feb 12, 2019
@reyang reyang closed this as completed Mar 13, 2019
@reyang reyang mentioned this issue May 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants