
insights: Insights-gateway #447

Closed
wants to merge 2 commits

Conversation

iNecas
Contributor

@iNecas iNecas commented Aug 20, 2020

No description provided.

@iNecas
Contributor Author

iNecas commented Aug 24, 2020

@radekvokal

From a business perspective it makes sense to me. To be absolutely clear: the components/operators using the c.r.c Insights gateway will be responsible for payload size and data privacy, and the Insights gateway won't do any processing of the data. Correct, @iNecas?

@iNecas
Contributor Author

iNecas commented Aug 24, 2020

Correct @radekvokal

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: iNecas
To complete the pull request process, please assign mfojtik
You can assign the PR to them by writing /assign @mfojtik in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

@smarterclayton smarterclayton left a comment


The drawbacks are all things I consider significant. I would probably lean towards a library approach (and would expect any gaps in it to be discussed here).


Given that insights-operator is already part of the OCP cluster and has proper credentials
configured as part of the installation, we propose that it expose
a proxy so that other components can send their payloads through it.
Contributor


Hrm - I would probably prefer these components remain decoupled architecturally. Insights is providing support data about the cluster - it really is not a generic operator (generic operators have a high bar).

There must be an extremely strong reason to expose a new generic API. It would probably be a better start to expose a library those operators can use rather than to provide a service; the justification for going beyond a library would have to be very compelling.

Contributor Author


Looping @jhjaggars and especially @chambridge to perhaps fill in some gaps around requirements/justification and more thoughts on the library alternative.

A few things regarding the library approach:

  1. some targeted operators are implemented with ansible (https://github.com/project-koku/korekuta-operator/blob/5fe7938ee87ca08dec6181cc0e37083979def62c/roles/collect/tasks/main.yml#L339): the library doesn't seem to help much with that
  2. with the library approach, the operator would need access to secrets in the openshift-config namespace: would we want to allow this, or minimize access to that resource? Or would the credentials need to be handled differently? The gateway approach limits the access to upload only; nothing else is exposed.
  3. the library still doesn't provide a single place for an overview of what data flow to c.r.c.
  4. additional networking requirements we've already seen and solved in insights-operator (such as the service-specific proxy implemented in openshift/insights-operator@65e2183) would need to be implemented again or left as gaps.

As for extending the operator into a generic one: we could argue the scope is still narrowed to providing c.r.c-related services, not becoming a single place for all outgoing traffic beyond the c.r.c ingress.

As mentioned at the bottom of this doc (https://github.com/openshift/enhancements/pull/447/files#diff-74d208115f9ccd14e14cb6d90d466a9eR196), long term we would like to see a generic solution, one that even insights-operator could use, provided by OCP core.

For the API, would adding v1alpha to the endpoint help address the concerns at all?


The goal here is to create a secure channel into cloud.redhat.com for data that doesn't fit telemetry yet provides value to customers through applications built on c.r.c. Here's a discussion of why Cost Management can't meet the budget and cardinality requirements of the monitoring team, and also why the current dataflow into c.r.c is not optimal: https://docs.google.com/document/d/1gdxlc37-CMniwptdccob4C3fwrAZNgyOQFDNL5msCsk/edit#heading=h.v7h8j8wxgjo2


Taking a library approach solves a couple of the issues we're wanting to address:

  1. abstract the mechanics of talking to cloudDot (HTTP client configuration and URI handling)
  2. centralize the proxy configuration

It doesn't address gaining access to credentials necessary for authentication with cloudDot.

Additionally, a library approach grants us the ability to just do it, which I really like.

Contributor Author


I'm waiting for more feedback on this proposal from the OLM operators interested in this feature.

If we were to go with the library approach, where should the proxy configuration live?


Can someone describe how the library would work with an Ansible-based operator?
Would this be something we'd have to put into an Ansible collection? I'd also like to understand the impact on EXD: how would the library be built and made available for a certified operator release? I think the documentation for the library would need some information on how an operator would gather the pull-secret config, plus any necessary RBAC role/cluster-role needs.

But if all that is easy, then I'm okay with a library approach, as it might have the side benefit of not needing all the backporting into z-streams.

I assume the library would also take into account the cluster global proxy configuration?

Contributor Author


For the ansible-operator, I can think of two ways (based more on how I think ansible-operator works, without much experience):

  1. exposing the library via an executable available in the operator image, used instead of the curl it uses today. You would need to figure out how to add the binary to your image. A disadvantage is that it would need to read the configuration on every upload, or deal with caching.
  2. running as a side-car container, exposing a local endpoint to provide the gathering functionality.
    Regarding the side-car: what do you think about making this generic, regardless of whether it's an ansible operator or not, @smarterclayton, @jhjaggars? This way we could also standardize on metrics, and it would actually feel like the k8s way?


What would be the advantages of using the library over the solution we have today, coded by ourselves? It seems that the long-term solution of an upload service would better fit the air-gapped use cases, and would simplify credential management and RBAC configuration.


At least two advantages that I can think of:

  1. standardization of how to make the connection
  2. clear instructions for users to follow

The sidecar container approach (basically a service wrapper around the library) should address the ansible operator case. Would it be a big lift to include another image in the existing certification process for each operator?

Contributor Author

@iNecas iNecas Sep 1, 2020


Another question: could the OLM-managed operators create these bindings as part of the deployment process (or ask the administrator to do so)?

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: some-olm-operator
  namespace: openshift-config
rules:
- apiGroups:
  - ""
  resources:
  - secrets
  resourceNames:
  - pull-secret
  - support
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: some-olm-operator
  namespace: openshift-config
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: some-olm-operator
subjects:
- kind: ServiceAccount
  name: operator
  namespace: some-olm-operator

If so, the sidecar could also pull the right token from the pull secret and watch for changes.

The sidecar is probably my favorite option, also because it could standardize things like tracked metrics.

each needing additional data to be sent from the OCP clusters, such as:

- [cost management](https://github.com/project-koku/korekuta-operator)
- [subscription watch](https://github.com/chambridge/subscription-watch-operator)
Contributor


We are generally moving towards all of this data being sent through telemetry in the future.


- [cost management](https://github.com/project-koku/korekuta-operator)
- [subscription watch](https://github.com/chambridge/subscription-watch-operator)
- [marketplace](https://github.com/redhat-marketplace/redhat-marketplace-operator)
Contributor


Over time, I expect all of this data to flow through telemetry once we have completed the necessary data interlock.

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 6, 2021
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 5, 2021
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci-robot

@openshift-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
