4.0 arch draft #12880
90fff57
to
6f8c6cf
Compare
@vikram-redhat, will you PTAL? Do you have a suggestion for someone who'd be willing to do a SME review of this version and look at some graphics markups?
9b25fcc
to
987344c
Compare
@kalexand-rh - thanks. @derekwaynecarr - would you be the best person to review this for docs?
0161298
to
de6f9da
Compare
@smarterclayton and @eparis - tagging you as well for a review.
The group leads need to get engaged more and help flesh out the high level topics. This is a good start but I expect more input from them to describe the high level areas.
Can you add a rough “high level topics” tree to this PR so we can comment and discuss what the top sections should be and ensure they match?
modules/architecture-overview.adoc
Outdated
Like {product-title} v3, {product-title} v4 is a layered system designed to
expose underlying container images and Kubernetes concepts as accurately as
possible, with a focus on easy composition of applications by a developer.
We need to touch this up, I’m not sure this is going to be the right way to phrase 4. I think this is where we reset expectations for customers, and it’s going to be a combination of the core Kubernetes mission (application infrastructure), the core 4 mission (automated and self-monitoring, flexible infrastructure), and the OpenShift story (giving developers tools to evolve their applications under operational oversight).
modules/architecture-overview.adoc
Outdated
as both the fundamental unit of the product and an option for easily deploying
and managing utilities that your apps use.

{product-title} has an Operator-based architecture of smaller, decoupled units
I think I’d focus on selling us as Kubernetes first, with a set of standard components that you need (network, ingress, logging, monitoring), made simple to scale, upgrade, and maintain through a focus on automating operations with Operators.
objects stored in etcd, a reliable clustered key-value store. Those services are
broken down by function:

* Operators, which run the core {product-title} services.
This feels a bit too much like a list - I would hit the three core layers above and give examples of each. Operators are a detail, really, even if they underlie everything (more glue).
Which three layers? Dev apps, K8 core app management, and the OpenShift management layer?
modules/architecture-overview.adoc
Outdated
build working images and react to new images.
** Deployments, which expand support for the software development and deployment
lifecycle.
** Routes, which announce your service to the world.
Ingress and Routes
modules/architecture-overview.adoc
Outdated
** Deployments, which expand support for the software development and deployment
lifecycle.
** Routes, which announce your service to the world.
** Templates, which allow you to simultaneously create many objects that are
It’s time to de-emphasize templates - we need to describe our catalog of supporting application infrastructure as having three parts - Operators (expose APIs that automate full component lifecycle like databases), service bindings (consume services running elsewhere), and templates (simple instant examples).
modules/node-types.adoc
Outdated
[id='node-types-{context}']
= Node types in {product-title}

{product-title} uses bootstrap, master, and worker nodes. The bootstrap node
Haven’t settled on naming yet for worker nodes. I would however say that we don’t “use ... nodes”; we assign nodes roles within the cluster, and those roles define their security boundaries.
It’s unclear to me that this is the best way to communicate this though - bootstrap nodes are a detail of installation, while master and worker/compute nodes are what an admin deals with every day. I wouldn’t give them the same level of importance.
This page would be better as a discussion of the roles nodes play (use role instead of type, role is the official term). It would be better to connect it to the ways people can change nodes (masters today cannot be changed, machine sets mean adding new node roles is easy), how security boundaries work, and what tools are used to subdivide and group nodes.
I'll make some updates based on your comments, but I don't have all of this context. Do you have a suggestion for who to talk to for more context?
modules/operators-overview.adoc
Outdated
upgrades your cluster. By accepting automatic updates, you can automatically
keep your cluster up to date with the most recent compatible components.

To allow Cincinnati to provide only compatible updates, a release verification
Don’t call it Cincinnati. Say it’s the OpenShift update service. Cincinnati is more of a code name.
Thanks! I can't remember hearing its true name. Is it the "OpenShift update service" in all versions, or is it the "{product-title} update service?"
[id='security-overview-{context}']
= Security in {product-title}

The {product-title} and Kubernetes APIs authenticate users who present
This is not a great set of content (it’s really unbalanced). The auth teams need to provide feedback to correct this but it should be talking about our threat model, the different tools we have for security, and provide high level overview of our security subsystems. This page as it is is not useful.
I dragged it over from the 3.x stream. :)
@enj, @ericavonb, thoughts?
modules/operators-overview.adoc
Outdated
always runs as a `systemd` process.


[id='cluster-version-operator-{context}']
I don’t want to focus on these lower level details until we have an architecture topic on “upgrades” (which covers most of the details). I would do link outs to the major functional areas and list a few of the core operators and their roles, but this is too focused on “what” and not “why”. The details below would be maybe a subtopic under specific operators but as is this is too messy.
There are basically no link outs or xrefs in the 4.0 docs - if more information about a particular topic is required, it has to be included in the same assembly, or we can link from one top-level assembly to another top-level assembly.
Who do you suggest to talk to for the improvements to the upgrade portion?
modules/operators-overview.adoc
Outdated
The cluster version Operator orchestrates all things.

{product-title} 4.0 introduces several new components that support the cluster
version Operator, including Cincinnati and Telemetry.
Telemetry should be split out and part of an “observability” high level arch topic that then describes the major subsystems and what they do together (monitoring with Prometheus, alertmanager and grafana, logging, and telemetry).
Do you either have a suggestion for a contact who can explain how these fit together or the outline for the user story? I'm not familiar enough with how monitoring, alerting, grafana, logging, and telemetry fit together to say what that story is, and all of our assemblies need to have one.
@derekwaynecarr @eparis I’d like to see us get the high level topics agreed on and help flesh out which topics are under-covered. We need to invest more time with Kathryn and the docs team than we have historically on these docs.
@smarterclayton, thank you! I'm working on updates based on your comments.
I'm not sure who the best contacts are for some of your requested changes - will you tag some more people in for me?
based on customized parameters.
* Controllers, which read those REST APIs, apply changes to other objects, and
report status or write back to the object.
Who should provide this information?
{product-title} offers two installation options: fully-managed infrastructure and bring your own
infrastructure. In version 4.0, the fully-managed option allows you to install a
cluster in Amazon Web Services (AWS) that runs on Red Hat CoreOS nodes. If you want to
use any other cloud or install your cluster on-premise, use the bring your own
Do you have a public-facing roadmap I can link to? If there's a good source of information that interested parties can use to track our changes, I'm happy to point them to it.
With both installation types, installation and upgrade both use a controller
that constantly reconciles component versions as if it were any other Kubernetes
Who's the right person to talk to about "controller" versus "operator," and do you have a suggestion for who might provide better nuance?
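In case it helps with the controller-versus-Operator wording, here is a minimal sketch of the reconcile pattern the snippet describes. Everything in it is an assumption for illustration: the component names, the version strings, the status labels, and the `reconcile`, `apply_actions`, and `run_until_settled` helpers are hypothetical and are not taken from {product-title} or Kubernetes code. It only shows the general idea of comparing desired state to observed state, applying the difference, and reporting a status.

```python
def reconcile(desired, observed):
    """Return the actions needed to move `observed` toward `desired`.

    Both arguments map a (hypothetical) component name to a version string.
    """
    actions = []
    for name, version in sorted(desired.items()):
        if name not in observed:
            actions.append(("install", name, version))
        elif observed[name] != version:
            actions.append(("update", name, version))
    return actions


def apply_actions(actions, observed):
    """Apply the actions to a copy of `observed` and return the new state."""
    new_state = dict(observed)
    for _kind, name, version in actions:
        new_state[name] = version
    return new_state


def run_until_settled(desired, observed, max_passes=10):
    """Reconcile repeatedly until there is nothing left to change.

    Returns the final state and an illustrative status label.
    """
    for _ in range(max_passes):
        actions = reconcile(desired, observed)
        if not actions:
            return observed, "Available"
        observed = apply_actions(actions, observed)
    return observed, "Progressing"


if __name__ == "__main__":
    desired = {"ingress": "4.0.1", "monitoring": "4.0.1"}
    observed = {"ingress": "4.0.0"}
    print(reconcile(desired, observed))
    print(run_until_settled(desired, observed))
```

A real controller would watch the API for changes rather than loop a fixed number of times; the bounded loop here just keeps the sketch self-contained.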
both its updates and updates to RHCOS, {product-title} provides an opinionated
lifecycle management experience that simplifies the orchestration of node upgrades.

{product-title} employs three DaemonSets and controllers to simplify node management:
Who's your SME of choice for this update?
de6f9da
to
b1e41b0
Compare
@enxebre, @crawford, will you PTAL at this draft architecture overview? @smarterclayton has some great suggestions and requests that I don't have enough information to implement. (If you have other suggestions for stakeholders to review, please tag them in!)
class, which describes the types of compute nodes that are offered for different
cloud platforms. For example, a `machine` type for a worker node on Amazon Web
Services (AWS) might define a specific machine type and required metadata.
`MachineClasses`:: A unit that defines a class of `machines` and facilitates
MachineClasses are not yet integrated with the platform.
MachineDeployments might be landing soon in openshift/installer#990. These are mainly to machines what "deployments" are to pods. I'll keep this thread updated.
IMO, extending this and adding more info like https://www.linux.com/blog/event/kubecon/2018/4/extending-kubernetes-cluster-api will clarify a lot for end users who are not familiar with the K8s Cluster API.
@enxebre, I'll keep an eye out for your updates and look for additional content to include.
modules/operators-overview.adoc
Outdated
or roles the second-level Operator runs as, the CRD and pull secret that drives
the operation of the Operator, and the Operator deployment.

Second-level Operators write out to a CRD resource called the cluster Operator
@vikram-redhat, what's your opinion on linking to versus incorporating upstream docs like this?
modules/operators-overview.adoc
Outdated
Some Red Hat Operators drive the cluster functions, like the scheduler and
problem detectors. Others are provider for you to manage yourself and use in
your applications, like etcd. {product-title} also offers certified Operators,
which the community built around the OLM and maintains. These certified Operators
@adellape, do you have more information about this process?
modules/operators-overview.adoc
Outdated
your applications, like etcd. {product-title} also offers certified Operators,
which the community built around the OLM and maintains. These certified Operators
are traditional applications that are Kubernetes-aware because they are wrapped
in an Operator.
@wking, I see where you're going, but I don't think that "wrap" would translate well. Does this sound ok to you?
These certified Operators provide an API layer to traditional applications so you can manage the application through Kubernetes constructs.
This reverts commit 987344ce872c971f8acb889858b4d57f9f386da1.
I've created some follow-up issues and am merging per @vikram-redhat.
I'd initially wanted to address "It provides a graph, or diagram that contain" -> "It provides a graph, or diagram, that contains". But the "of component Operators" bit was confusing to me too. So I've reworded to lead with "graph of recommended update(s)", and reduced it to a single sentence. I'd also be fine leading with a description of release images, and then introducing the recommended update edges that link them, but that seemed like a bigger refactor. Wording I'm replacing is from 29016d7 (draft of early 4.0 architecture updates, 2018-11-08, openshift#12880) and 6e1b894 (Initial add of modularized arch guide content, 2019-05-21, openshift#14991).
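To make the "graph of recommended updates" wording concrete, here is a minimal sketch, assuming made-up release versions and a plain adjacency mapping. The `RECOMMENDED_UPDATES` data and the `next_updates` and `update_path` helpers are hypothetical; this is not the update service's actual data format or API. Release images are the nodes, and each recommended update is a directed edge between two of them.

```python
from collections import deque

# Hypothetical release versions (nodes) and recommended-update edges.
RECOMMENDED_UPDATES = {
    "4.0.0": ["4.0.1", "4.0.2"],
    "4.0.1": ["4.0.2"],
    "4.0.2": ["4.1.0"],
    "4.1.0": [],
}


def next_updates(graph, current):
    """Return the releases directly recommended from `current`."""
    return list(graph.get(current, []))


def update_path(graph, current, target):
    """Breadth-first search for a shortest chain of recommended updates.

    Returns the list of releases from `current` to `target`, or None if
    no chain of recommended edges connects them.
    """
    queue = deque([[current]])
    seen = {current}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(next_updates(RECOMMENDED_UPDATES, "4.0.0"))
    print(update_path(RECOMMENDED_UPDATES, "4.0.0", "4.1.0"))
```

Because the edges are directed, there is no path from a newer release back to an older one in this sketch, which matches the idea that only recommended updates are offered.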
…hift, including RHCOS"

The RHCOS mention is from way back in 1ac3751 (some updates per Clayton, 2018-11-26, openshift#12880). But:

  $ git --no-pager grep -h supported modules/rhcos-about.adoc
  {op-system} is supported only as a component of {product-title} {product-version} for all {product-title} machines....

This commit rephrases the update docs to put RHCOS under OpenShift, so folks don't get ideas and think it is a stand-alone product.
Draft of architecture updates from the arch overview call.
http://file.rdu.redhat.com/kalexand/111618/4.0_arch/architecture/architecture.html