Elafros resources: the missing link #412

steren · 2018-03-17T04:18:28Z

Here is an illustration showing a very common use case: within one cluster, developers deployed, and later updated, two different "components of their architecture", for example a "function that implements an API endpoint" and a "web frontend".
As a result, from an API resource perspective, there are two Routes, each of them pointing to a different Revision that has been created from a Configuration:

Notice the orange boxes:
In our example, they represent the "function" and the "web frontend". They are what Google Cloud Functions or AWS Lambda would today call a "Function", what App Engine would call a "Service", and what Cloud Foundry would call an "Application".

We are currently performing some user research to understand how users think about these and how they would name them.

Problem statement

Problem 1: A missing higher level concept

Conceptually, neither the Route nor the Configuration is the top-level entity that maps exactly to the concept used in my introduction (the "function" and the "web frontend"). In most cases, we want developers to think first about the orange boxes and only later, if needed, about the Route and Configuration(s).

Saying that an orange box is equivalent to a Route is not strictly correct, because Elafros enforces no parent-child or ownership relationship between a Route and its Configuration(s). A Configuration can live completely detached from a Route.

Problem 2: Name

Neither "Route" nor "Configuration" is a good name for this "orange box". These terms should not be the top-level "thing" to create, or the main collection of things our end users operate against.

I am opening this issue to evaluate solutions to these problems.

mikehelmick · 2018-03-23T05:07:11Z

Problem Summary

In summary, there is a desire to have a more structured grouping than provided in the Elafros API. This is appealing from the standpoint of product simplicity and potentially reduced cognitive load on our users, leading to a more elegant CLI/UI experience.

This is a proposed solution to Problem 1 as stated above.

Problem 2 will be addressed in a separate comment. For that reason, we will refer to this resource simply as “Steve” for the purposes of this proposal.

Proposed Solution

Steve will exist as a higher-level resource that users interact with in order to make changes to their compute services, rather than interacting with Route and Configuration directly. Whether to restrict mutate access to Route and Configuration is up to the cluster operator.

Steve is a composition of two pieces of information: a name and a rollout policy. There will be several kinds of rollout policies defined, but Steve can only accept one at a time. This is inspired by the k8s core Volume and the associated kinds that appear in it.
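As a rough illustration of the "only one rollout policy at a time" rule, here is a minimal Python sketch of a one-of check on the spec. It is not part of the proposal: the field names runLatest and manualRollout are the ones used in the examples in this comment, while the function name and the tuple of known policies are assumptions made for illustration.

```python
# Hypothetical one-of validation for a Steve spec (illustration only).
# The policy names come from the examples in this proposal; the real
# set of rollout policies is not defined here.
KNOWN_POLICIES = ("runLatest", "manualRollout")


def rollout_policy(spec: dict) -> str:
    """Return the single rollout policy set on the spec, or raise ValueError."""
    present = [name for name in KNOWN_POLICIES if name in spec]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one rollout policy, found {len(present)}: {present}"
        )
    return present[0]
```

A spec with neither policy, or with both, is rejected, which mirrors how a Volume names exactly one volume source.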

Steve is a new CRD that will be hosted at github.com/elafros/steve along with a full OSS implementation of Steve. Use of Steve is optional in Elafros.
While we envision many possibilities for interaction with Steve, we will start with two basic scenarios.

Atomic Rollout Policy

There is a single name for Steve, which creates a corresponding Route, and a single Configuration object. When the Steve object is created, a Route and Configuration will be created, and the appropriate metadata.ownerReferences values will be set.

In this scenario, the user creates a Steve object with a simple rollout type (runLatest) and a single Configuration. In the example below, this results in 4 objects being created: The specified Steve object, a route named “my-function”, a configuration named "my-function", and the first revision. When the first revision is created, it will begin serving.

POST /apis/elafros.dev/v1alpha1/namespaces/default/steve

apiVersion: elafros.dev/v1alpha1
kind: Steve
metadata:
  name: my-function 
spec:
  # With runLatest set, the latest ready revision from the configuration
  # will always be set to 100% traffic allocation in the route. 
  runLatest:
    configuration:
      ...
    
status:
  ...

Manual

If a user finds themselves in a broken state, they can switch their Steve to have a rollout policy of manual. Manual requires a specific revision, which will upon reconciliation be immediately set to 100% in the route, and no further automatic changes will happen until the Steve is switched to a different kind.

Upon reconciliation, the Route is adjusted to point 100% to the specified revisionName. The revisionName provided must belong to the configuration contained by Steve.

apiVersion: elafros.dev/v1alpha1
kind: Steve
metadata:
  name: my-function 
spec:
  manualRollout:
    revisionName: abc
    configuration:
      ...
status:
  ...
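The two policies above differ only in how the Route's traffic target is chosen. The following Python sketch shows one way a controller might compute it. The dictionary shapes, the "ready" flag, and the function name are assumptions for illustration and are not the Elafros API.

```python
# Hypothetical sketch of how a Steve controller might pick the Route's
# traffic target. Data shapes here are assumptions, not the Elafros API.
def desired_traffic(spec: dict, revisions: list) -> list:
    """Compute the Route traffic block for a Steve spec.

    revisions: the revisions created from Steve's configuration, oldest
    first, each shaped like {"name": "abc", "ready": True}.
    """
    if "runLatest" in spec:
        ready = [r for r in revisions if r["ready"]]
        if not ready:
            return []  # no revision is ready to serve yet
        # The latest ready revision always receives 100% of traffic.
        return [{"revisionName": ready[-1]["name"], "percent": 100}]
    if "manualRollout" in spec:
        pinned = spec["manualRollout"]["revisionName"]
        # The pinned revision must belong to the configuration held by Steve.
        if pinned not in {r["name"] for r in revisions}:
            raise ValueError(
                f"revision {pinned!r} does not belong to this Steve's configuration"
            )
        return [{"revisionName": pinned, "percent": 100}]
    raise ValueError("no supported rollout policy set")
```

In this sketch, switching a spec from runLatest to manualRollout changes only which branch runs on the next reconciliation, which is why moving between the two policies can be cheap.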

Changing Between Steves

In general, changing between different specializations of Steve (different rollout policies) is a fine and normal thing to do.

Consider a user in the simple case, using runLatest, who has a bad push. To recover from it, they switch the rollout policy to “manual” and pin their Steve to a specific revision. Once they have patched their configuration, they change Steve back to runLatest and resume the release train.

Future Expansion

These are just two example rollout policies. There are other rollout policies that will be proposed as individual issues once the new repository is created.

mattmoor · 2018-03-24T16:47:53Z

I don't like the name Manual in this context. I would suggest Pinned or something along those lines.

To elaborate on that a bit, I think that Steve represents a higher-level "easy-mode" that groups a whole bunch of concepts together. I see Manual as the user shedding Steve and taking control of the lower-level resources themselves, but even here Steve presents value as a grouping construct (e.g. UX, cascaded deletion, ...).

Ordinarily, my expectation is that when Steve is managing the Route and Configuration, if something reaches around Steve to manipulate those resources, the Steve controller would quickly correct them.

I'd propose an alternate concept of Manual (perhaps in addition to Pinned) that users can change to/from and that lets them take control of the underlying resources:

apiVersion: elafros.dev/v1alpha1
kind: Steve
metadata:
  name: my-function 
spec:
  # The underlying resources become the source of truth.
  manual: {}

If a user changes to Manual, then Steve simply stops managing them and edits are allowed.

If a user changes from Manual, then Steve simply makes the state of the world that of the new specification.
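To make the difference between the managed and Manual behaviors concrete, here is a minimal Python sketch of a single reconcile pass. The function name and the plain-dict representation of resource state are assumptions for illustration. Only the manual field comes from the example above.

```python
# Hypothetical reconcile pass (illustration only, not the Steve controller).
def reconcile(steve_spec: dict, desired: dict, observed: dict) -> dict:
    """Return the state the underlying resources have after one pass.

    desired:  state derived from the Steve spec (unused in Manual mode).
    observed: current state of the Route and Configuration in the cluster.
    """
    if "manual" in steve_spec:
        # Manual: the underlying resources are the source of truth,
        # so out-of-band edits are left alone.
        return observed
    # Managed: edits made around Steve are overwritten with the
    # state described by the Steve spec.
    return desired
```

Switching from Manual back to a managed policy needs no special handling in this sketch: the next pass simply returns the desired state again.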

This is a bit weird for Kubernetes objects, but I find this variation surprisingly appealing, in particular the capacity to switch between this model (Some Ops) and a more managed model (No Ops).

To be clear: I would expect UX to degrade in Manual mode, since it is a lower level of abstraction.

@vaikas-google @mikehelmick @steren WDYT?

steren · 2018-03-27T23:28:52Z

I support allowing access to the underlying resources. This is probably better for tooling compatibility.

I am convinced that the only thing we need is better grouping.

We could consider Steve to be a new resource that has label selectors on Configurations and Routes (like Kubernetes Services do with Pods).
This would allow us to easily list Steves, while still having Routes and Configurations as independent resources.
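As a sketch of what selector-based grouping means in practice, the Python below does equality-based label matching in the style of a Kubernetes Service selector. The label key steve, the resource dictionaries, and the function names are assumptions for illustration only.

```python
# Hypothetical grouping of resources by label selector (illustration only).
def matches(selector: dict, labels: dict) -> bool:
    """Equality-based selection: every selector pair must appear in labels.

    Note: an empty selector matches everything in this sketch.
    """
    return all(labels.get(key) == value for key, value in selector.items())


def members(selector: dict, resources: list) -> list:
    """Return (kind, name) for each resource whose labels match the selector."""
    return [
        (r["kind"], r["metadata"]["name"])
        for r in resources
        if matches(selector, r["metadata"].get("labels", {}))
    ]


resources = [
    {"kind": "Route", "metadata": {"name": "my-function", "labels": {"steve": "my-function"}}},
    {"kind": "Configuration", "metadata": {"name": "my-function", "labels": {"steve": "my-function"}}},
    {"kind": "Route", "metadata": {"name": "web", "labels": {"steve": "web-frontend"}}},
    {"kind": "Configuration", "metadata": {"name": "orphan"}},
]
print(members({"steve": "my-function"}, resources))
```

The Routes and Configurations remain independent objects here. Steve only selects them, so a resource without the label is simply not part of any group.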

asciimike · 2018-03-28T01:19:02Z

Chatted w/ @mikehelmick and I think we have a nice way to represent automatic and manual Steve that should meet these requirements.

Automatic `Steve`

This is a copy from @mikehelmick's design above, with the addition of label selectors, and labels being required on resources being created. If the Steve controller creates additional resources, those must have steve labels as well.

apiVersion: elafros.dev/v1alpha1
kind: Steve
metadata:
  name: my-function 
spec:
  selector:
    steve: my-function
  runLatest:
    configuration:
      metadata:
        labels:
          steve: my-function
      spec:
        ...

Manual `Steve`

Manual mode is just the label selector piece of automatic Steve. Note that resources will still have the steve label, such that they can be grouped in a Steve.

apiVersion: elafros.dev/v1alpha1
kind: Steve
metadata:
  name: my-function 
spec:
  selector:
    steve: my-function

----------------

apiVersion: elafros.dev/v1alpha1
kind: Configuration
metadata:
  name: my-function 
  labels:
    steve: my-function
spec:
  ...

Switching between modes

Automatic to manual: Automatic Steve will create labels (and/or owner references) on all resources, thus Steve as a grouping concept will remain even after Steve as a simplification concept is no longer needed.

Manual to automatic: Developers will need to re-provide the appropriate Route(s)/Configuration(s), and any resources changed in manual mode will be re-set to the desired configuration.

In either case, the same UI and CLI tools should be able to view both flavors of Steve.

tliberman · 2018-03-28T22:28:46Z

Last week I conducted research with 47 developers to assess names for Steve.

As context for the naming exercises, participants were given a walkthrough of a basic UI, CLI, and the diagram @steren shared above. They were then asked to provide their own name for Steve:

After providing their own suggestions, participants completed a MaxDiff survey in order to demonstrate their preferences toward some names that were being considered. MaxDiff analysis was performed using this code from E. Bahna, & C. Chapman (2018). "Constructed, Augmented MaxDiff." In B. Orme, ed. (forthcoming), Proceedings of the 2018 Sawtooth Software Conference, Orlando, FL. March 2018.

Result of the MaxDiff Steve naming exercise:

Overall, “service” and “component” were the top two candidates both when names were solicited as well as in the MaxDiff preference exercise, with “service” performing slightly better.

steren · 2018-03-28T23:41:06Z

Thanks for driving this study @tliberman.

This is a strong case for Service.
But as we know, Kubernetes already defines the concept of Service.

After consulting with @kelseyhightower, we suggest using Service under a namespace:

Kubernetes resource namespaces have been introduced for that.
We prefer to start with the name that makes the most sense based on user data, and leave room for community feedback.

This means that Steve's full name is elafros.dev/Service.
And when the Elafros context is obvious (e.g. in our tooling or docs), we can refer to it as Service.

This solves Problem 2: Name described in my original issue.

mikehelmick · 2018-03-28T23:54:21Z

I spoke with several of you offline.

The proposed solution still stands, with a name change to service (described above).

What I had named manual before will become pinned. We will also specify a new manual mode, in which the user is free to interact directly with the resources created by the service object and the controller for the service will not try to correct things.

We will be setting up this resource in a new repository, likely elafros/service.

I will close this issue once that repository is created.

mattmoor · 2018-03-29T04:29:31Z

One challenge of the name-collision is the kubectl experience, where typing kubectl get services won't WAI (I think you'd need to fully-qualify elafros.dev/v1alpha1/Service). Something I'd suggest here (and possibly for the other CRDs) is that we adopt "shortNames" in our CRD descriptors, which allow you to alias the resource:

    # shortNames allow shorter string to match your resource on the CLI
    shortNames:
    - ela-svc

(not sure if hyphens are allowed)

I've been debating revision => rev and configuration => cfg for some time to save myself some typing, but here I think it's especially useful.


bobcatfish · 2018-04-05T20:48:37Z

I've created #591 to add this functionality to the conformance tests.

vaikas · 2018-04-05T22:57:55Z

You can do: kubectl get service.elafros.dev, which is better, but still not super awesome.


mattmoor · 2018-04-11T14:11:28Z

The initial type definition and skeleton controller have gone in, and we're starting to break out smaller issues for missing things (e.g. docs), so I'm going to optimistically close this issue.

If there is some reason for it to still exist, please reopen with a comment explaining why.

ergannon · 2019-03-21T17:14:43Z

To expand on the research @tliberman shared and further assess our object model, we conducted a study with 19 participants: 9 with mostly Operator responsibilities, 8 with mostly Developer responsibilities, and 2 with an even split of both.

Qualitative data demonstrated that participants define a service as an entity that operates independently, is opaque to the consumer, sends and receives requests, and performs a defined business function. A diagram labeling exercise revealed that service was the most commonly chosen term to describe the Knative service|steve concept, and load balancer was the most common term for the Kubernetes service concept. These findings were consistent when sampling moderate to advanced Kubernetes users in a follow-up study. See the full methodology and results here.

evankanderson · 2019-03-21T17:53:45Z

Has any of this research been shared with the Kubernetes sig-apps group?

Chairs

The Chairs of the SIG run operations and processes governing the SIG.

Matt Farina (@mattfarina), Samsung SDS
Adnan Abdulhussein (@prydonius), Bitnami
Kenneth Owens (@kow3ns), Google


mikehelmick self-assigned this Mar 18, 2018
mattmoor added the area/API (API objects and controllers) label Mar 19, 2018
sixolet mentioned this issue Mar 21, 2018: Commit API spec and design doc #444 (Merged)
mikehelmick mentioned this issue Apr 3, 2018: CRD type definitions for the Service object (#412) #577 (Merged)
This was referenced Apr 4, 2018: remove pointers from types, add webhook validation for service #583 (Merged); Verify gofmt on pullrequests #585 (Closed)
bobcatfish mentioned this issue Apr 5, 2018: Add Service CRD to Elafros docs #594 (Closed)
vaikas mentioned this issue Apr 5, 2018: Steve controller #597 (Merged)
mattmoor closed this as completed Apr 11, 2018
loganlee mentioned this issue May 24, 2018: Proposal: How to structure the project's roadmaps and their upkeep #949 (Closed)
pmorie mentioned this issue Jun 28, 2018: Service (services.serving.knative.dev) is very confusing. #1397 (Closed)
steren mentioned this issue Jul 16, 2018: Docs use the term "App", but this term is not defined knative/docs#131 (Closed)
steren mentioned this issue Aug 29, 2018: Moving "routing mode" to the route (from service) #1865 (Closed)
bgrant0607 mentioned this issue Mar 21, 2019: Proposal: Replace Service with ReverseProxy kubernetes/kubernetes#73352 (Closed)
steren mentioned this issue Sep 12, 2019: Revert Importer naming decision. knative/eventing#1857 (Closed)
markusthoemmes pushed a commit to markusthoemmes/knative-serving that referenced this issue Mar 11, 2020: Bump servleress-operator to v1.5.0 and remove istio (knative#412), a6e4b7f. This patch contains following changes: to remove istio related resources; to bump servleress-operator to v1.5.0
yanweiguo mentioned this issue Aug 19, 2021: Configuration & Route lifecycle operations are required knative/specs#36 (Closed)
