From 064d0b5cb26d7897453883e5e5b84aba0788dbd3 Mon Sep 17 00:00:00 2001 From: Dan Mosora Date: Thu, 13 Jan 2022 14:16:32 -0500 Subject: [PATCH 1/8] Add basic template with header info filled out --- proposals/draft/19-layered-model-of-singer | 98 ++++++++++++++++++++++ 1 file changed, 98 insertions(+) create mode 100644 proposals/draft/19-layered-model-of-singer diff --git a/proposals/draft/19-layered-model-of-singer b/proposals/draft/19-layered-model-of-singer new file mode 100644 index 0000000..ae3c558 --- /dev/null +++ b/proposals/draft/19-layered-model-of-singer @@ -0,0 +1,98 @@ +# SIP #`` - Layered Model of Singer + +_This document follows the [Singer Improvement Proposal (SIP) process](./draft/PR21%20-%20Proposal%20Documentation%20and%20Review.md)_ + +## Proposal Status + +| header | header | +| ------ | ------ | +| State | Draft | +| Issue Link | [#19](https://github.com/MeltanoLabs/Singer-Working-Group/issues/19) | +| Created | 2022-01-13 | +| Updated | 2022-01-13 | + +----------------------- + +## I. Proposal Summary + +### TL;DR Overview + +...(Summarize in a few sentences)... + +### What specific change do you propose to make? + +...(Detailed description)... + +## Motivation + +... + +### What problem does it solve? + +... + +### Why is it needed? + +... + +----------------------- + +## II. Proposal Details + +...(Detailed information here)... + +----------------------- + +## III. Additional Information + + + +### Which layer(s) of the Singer ecosystem does this proposal directly touch? + +Select all that apply: + +- [ ] Singer Specification - required capabilities and behaviors +- [ ] Singer Specification - optional capabilities and behaviors +- [ ] Singer best practices and other guidance +- [x] **Singer Working Group - practices and procedures** +- [ ] Singer documentation (Other) + +### Are there any downsides to this change? + +... + +### Is the change backwards compatible? + +... + +### Other Considerations + +... 
+ +### How are Singer developers affected by the change (if applicable)? + +... + +### How are Singer users affected by the change? (if applicable)? + +... + +### Prototype Implementations + +...(if applicable)... + +### Future Plans + +...(if applicable)... + +### Excluded Alternatives + +...(if applicable)... + +### Acknowledgements + +...(if applicable)... + +### What defines this SIP as "done"? + +... From 28be95d6e914e0729f42ee998a62f39fe470ee4a Mon Sep 17 00:00:00 2001 From: Dan Mosora Date: Thu, 13 Jan 2022 14:40:06 -0500 Subject: [PATCH 2/8] First pass at high-level information --- proposals/draft/19-layered-model-of-singer | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/proposals/draft/19-layered-model-of-singer b/proposals/draft/19-layered-model-of-singer index ae3c558..25bacfc 100644 --- a/proposals/draft/19-layered-model-of-singer +++ b/proposals/draft/19-layered-model-of-singer @@ -17,23 +17,29 @@ _This document follows the [Singer Improvement Proposal (SIP) process](./draft/P ### TL;DR Overview -...(Summarize in a few sentences)... +The Layered Model of Singer is a mechanism for organizing the mass collection of libraries, patterns, and practices that organizations using Singer and providing a means of discussing how these best practices fit into the overall ecosystem. + +In addition to categorization, its second goal is to enable structured conversation about how to move a practice between levels. For example, pulling a concept from a library and generalizing it as a widespread standard. ### What specific change do you propose to make? -...(Detailed description)... +This change is a document for the Singer Working Group as a meta-specification to provide guidelines for discussing proposed features to categorize a proposal and its components for discussion. ## Motivation -... +Singer is a lot of things aside from a JSON-line based data exchange protocol. 
It is a collection of tools, best practices, reserved metadata keywords, standard command-line arguments, web applications, orchestration tooling/practices, and more. A part of how this could evolve in this way is from the open-endedness of the original spec. Data extraction use cases vary by source, by runtime environment, by orchestration mechanism, etc. and those use cases generally cannot be anticipated up front. + +The motivation here is to help keep that simplicity by defining levels from a most generic level (Spec) to a most specific level (Framework/Application). ### What problem does it solve? -... +The problem it is solving is having a consistent language to discuss features and changes proposed in the Singer Working Group so that all participants can be sure that they are talking about the same thing. ### Why is it needed? -... +Having this sort of language to categorize pieces of a proposed change into varying degrees of specificity will allow proposals to be refined and keep use-case specific mechanisms out of lower levels. + +For example, if use-case specific details make their way into the over-the-wire JSON line Spec, this would end up pushing the protocol to become more complicated to write software that reads and writes (more conditions to handle), thus leading to a spread of logical bloat in those implementing the protocol. 
----------------------- From 94087ab71909625e73b9651312e6b2b6e8f9360d Mon Sep 17 00:00:00 2001 From: Dan Mosora Date: Fri, 18 Feb 2022 19:06:48 -0500 Subject: [PATCH 3/8] WIP from last session of writing --- proposals/draft/19-layered-model-of-singer | 35 ++++++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/proposals/draft/19-layered-model-of-singer b/proposals/draft/19-layered-model-of-singer index 25bacfc..9fe252e 100644 --- a/proposals/draft/19-layered-model-of-singer +++ b/proposals/draft/19-layered-model-of-singer @@ -46,6 +46,41 @@ For example, if use-case specific details make their way into the over-the-wire ## II. Proposal Details ...(Detailed information here)... +## The Layered Model of Singer + +Looking at Singer, there are a lot of design choices baked in around a core value of simplicity. The reasoning for this has always been to give developers the freedom and flexibility to make it what they want, since all data sources are vastly different, and one cannot effectively design for all future cases in the ELT space. + +As we discuss evolving Singer as a whole and as a community, it will be important to take care to not lose the core value of simplicity that has allowed the space for best practices to be invented like those encoded in the current existing frameworks/libraries. + +Approaching the stack as a layered model can give us a means of aligning where an idea fits, and a tool iterate organically to "upgrade" concepts from a framework feature to a codified standard to a spec change if it makes sense. + +### Layer 1: Specification +This is the current specification as it stands, some principles of features here: +- Language agnostic and implementation independent +- Focus on the std-out portions of using Singer (serialization format, message types, required keys for messages, etc.) 
+ +### Layer 2: Standards and Best Practices +These are being tracked in #10, but as far as the initial design decisions of Singer go, this conceptually includes things like Command-Line Arguments, Catalog, Metadata Keys/Custom Metadata, Standard State Keys, etc. + +Some principles here: +- These are also language agnostic and implementation independent +- They help standardize the nitty-gritty to make writing frameworks and libraries more easy +- They strive to make code more portable, readable, and usable for users and devs alike. + +[TODO: Split into required and optional] + +### Layer 3: Libraries and Frameworks +This is where we get into the language specific stuff. Libraries like `singer-python` and/or `singer-clojure` or frameworks like the MeltanoSDK take the standards plus best practices and encode them in a way that makes sense for the patterns of each language. This is also a good place to be a test bed for things that might become standards. + +Principles: +- Language specific +- Generic use cases +- These influence the way that code is written for their specific language + +### Layer 4: Tooling/Orchestration/UX/Infrastructure +I'm not quite sure about this one, but these are things that don't seem to fit in the other layers, and kind of make up an analog to the "Application Layer" of the OSI layered model. This layer is included to be a spot to hold things that are in use on a specific industry, use case, deployment method, etc., but not quite ready to be standardized. + +[Aside:] This layer could use the most work, but it seemed worth including here. My gut says that it's likely harder to standardize these kinds of things, since it'll be where our orgs' respective product offerings fall into a lot of the time, and with that comes IP concerns, specifics for our target users (e.g., technical vs. non-technical), a specific slice of the industry, and/or a more narrow set of use cases. 
That said, tools like `singer-discover` would also fall here, and fit into a standardization conversation more easily. ----------------------- From 4bb9f26a909623885f3af2e9c999f6aa3b3dafb5 Mon Sep 17 00:00:00 2001 From: longestpath <27017104+longestpath@users.noreply.github.com> Date: Fri, 18 Feb 2022 20:06:03 -0500 Subject: [PATCH 4/8] Fill out proposal --- proposals/draft/19-layered-model-of-singer | 84 +++++++++++----------- 1 file changed, 41 insertions(+), 43 deletions(-) diff --git a/proposals/draft/19-layered-model-of-singer b/proposals/draft/19-layered-model-of-singer index 9fe252e..58ef671 100644 --- a/proposals/draft/19-layered-model-of-singer +++ b/proposals/draft/19-layered-model-of-singer @@ -17,9 +17,9 @@ _This document follows the [Singer Improvement Proposal (SIP) process](./draft/P ### TL;DR Overview -The Layered Model of Singer is a mechanism for organizing the mass collection of libraries, patterns, and practices that organizations using Singer and providing a means of discussing how these best practices fit into the overall ecosystem. +The Layered Model of Singer is a mechanism for organizing the mass collection of libraries, patterns, and practices that organizations using Singer rely on and providing a means of categorizing how these best practices fit into the overall ecosystem. -In addition to categorization, its second goal is to enable structured conversation about how to move a practice between levels. For example, pulling a concept from a library and generalizing it as a widespread standard. +In addition to categorization, its second goal is to enable a structured conversation about how to move a practice between levels. For example, pulling a concept from a library and generalizing it as a widespread standard. ### What specific change do you propose to make? 
@@ -27,7 +27,7 @@ This change is a document for the Singer Working Group as a meta-specification t ## Motivation -Singer is a lot of things aside from a JSON-line based data exchange protocol. It is a collection of tools, best practices, reserved metadata keywords, standard command-line arguments, web applications, orchestration tooling/practices, and more. A part of how this could evolve in this way is from the open-endedness of the original spec. Data extraction use cases vary by source, by runtime environment, by orchestration mechanism, etc. and those use cases generally cannot be anticipated up front. +Singer is a lot of things aside from a JSON-line based data exchange protocol. It is a collection of tools, best practices, reserved metadata keywords, standard command-line arguments, web applications, orchestration tooling/practices, and more. A part of how this was able to evolve in this way is from the open-endedness of the underlying spec. Data extraction use cases vary by source, by runtime environment, by orchestration mechanism, etc. and those use cases generally cannot be anticipated up front. The motivation here is to help keep that simplicity by defining levels from a most generic level (Spec) to a most specific level (Framework/Application). @@ -37,39 +37,42 @@ The problem it is solving is having a consistent language to discuss features an ### Why is it needed? -Having this sort of language to categorize pieces of a proposed change into varying degrees of specificity will allow proposals to be refined and keep use-case specific mechanisms out of lower levels. - -For example, if use-case specific details make their way into the over-the-wire JSON line Spec, this would end up pushing the protocol to become more complicated to write software that reads and writes (more conditions to handle), thus leading to a spread of logical bloat in those implementing the protocol. 
+Having this sort of language to categorize pieces of a proposed change into varying degrees of specificity will allow proposals to be refined and keep use-case specific mechanisms out of lower levels. The hope is that this sort of framework will encourage innovation at the top end of the hierarchy and promote conversation to bring the parts that map to a true generic data protocol down towards the spec level as practitioners adopt those pieces and work them thoroughly in their own data space. ----------------------- ## II. Proposal Details -...(Detailed information here)... -## The Layered Model of Singer - -Looking at Singer, there are a lot of design choices baked in around a core value of simplicity. The reasoning for this has always been to give developers the freedom and flexibility to make it what they want, since all data sources are vastly different, and one cannot effectively design for all future cases in the ELT space. +### The Layered Model of Singer -As we discuss evolving Singer as a whole and as a community, it will be important to take care to not lose the core value of simplicity that has allowed the space for best practices to be invented like those encoded in the current existing frameworks/libraries. +Looking at Singer, there are a lot of design choices baked in around a core value of simplicity. The reasoning for this has always been to give developers the freedom and flexibility to make it what they want, since all data sources are vastly different, and one cannot effectively design for all future cases in the ELT space. As discussed in the motivation of this SIP, organizing layers of the Singer ecosystem can provide the language necessary for the disparate organizations using Singer to propose and categorize the things that work for them and enable a wider conversation towards wider spread adoption. 
-Approaching the stack as a layered model can give us a means of aligning where an idea fits, and a tool iterate organically to "upgrade" concepts from a framework feature to a codified standard to a spec change if it makes sense. +#### Layer 1: Specification +This is the specification itself, which is purely a JSON protocol for transmitting data over the wire. -### Layer 1: Specification -This is the current specification as it stands, some principles of features here: -- Language agnostic and implementation independent +Some principles of features here: +- Language agnostic, implementation independent, and generic regardless of use case. - Focus on the std-out portions of using Singer (serialization format, message types, required keys for messages, etc.) -### Layer 2: Standards and Best Practices -These are being tracked in #10, but as far as the initial design decisions of Singer go, this conceptually includes things like Command-Line Arguments, Catalog, Metadata Keys/Custom Metadata, Standard State Keys, etc. +#### Layer 2: Standards +The pieces that systems using Singer can rely on such as catalogs and discovery mode. These pieces required to implement to be considered as up to modern standards for Singer. Not implementing these is not an incompatibility, but all Singer actors should strive to implement the standards. The standards are more focused upon Singer actors themselves themselves (e.g., taps, targets), as opposed to the over-the-wire protocol at the heart of Singer. Some tooling that rely on these standards may not function if an actor does not implement it. + +Conceptually, this includes things like Command-Line Arguments, Catalog, Metadata Keys/Custom Metadata, Standard State Keys, etc. 
Some principles here: -- These are also language agnostic and implementation independent -- They help standardize the nitty-gritty to make writing frameworks and libraries more easy -- They strive to make code more portable, readable, and usable for users and devs alike. +- Language agnostic, implementation specific, and generic regardless of use case +- They help standardize the nitty-gritty to make writing frameworks and libraries easier + +#### Layer 3: Best Practices -[TODO: Split into required and optional] +This is the set of things that Singer practitioners have figured out work in their space, but aren't sure whether they are globally generic or not yet. These best practices are generally not required, but are things that have been used as ways to make it easier to build taps. -### Layer 3: Libraries and Frameworks +Some principles here: +- Language agnostic, implementation specific, and use case specific. +- These items are candidates for lower levels of the ecosystem, but maybe are not quite yet standards (i.e., are not at the level that they can be considered required) +- They strive to make code more portable, readable, and usable for users and devs alike in a structural sense + +### Layer 4: Libraries and Frameworks This is where we get into the language specific stuff. Libraries like `singer-python` and/or `singer-clojure` or frameworks like the MeltanoSDK take the standards plus best practices and encode them in a way that makes sense for the patterns of each language. This is also a good place to be a test bed for things that might become standards. Principles: @@ -77,17 +80,20 @@ Principles: - Generic use cases - These influence the way that code is written for their specific language -### Layer 4: Tooling/Orchestration/UX/Infrastructure -I'm not quite sure about this one, but these are things that don't seem to fit in the other layers, and kind of make up an analog to the "Application Layer" of the OSI layered model. 
This layer is included to be a spot to hold things that are in use on a specific industry, use case, deployment method, etc., but not quite ready to be standardized. +### Layer 5: Tooling/Orchestration/UX/Infrastructure +This is the bread and butter of organizations that adopt singer. As far as standards go, this is not generally something that will be voted upon by the Singer communicty as a whole. Instead, this is the state that all proprietary code falls in and the specific product offerings of practitioners fit. Nothing in this space is expected to be open source, but the option is there for things that may qualify for best practices. -[Aside:] This layer could use the most work, but it seemed worth including here. My gut says that it's likely harder to standardize these kinds of things, since it'll be where our orgs' respective product offerings fall into a lot of the time, and with that comes IP concerns, specifics for our target users (e.g., technical vs. non-technical), a specific slice of the industry, and/or a more narrow set of use cases. That said, tools like `singer-discover` would also fall here, and fit into a standardization conversation more easily. +This layer can be considered an analog to the "Application Layer" of the OSI layered model of networking. + +Principles: +- Specific to a particular vertical or market space +- Specific to a single ops approach for deployment or other closer-to-the-metal conecerns +- Not generalizable without either losing value in the source product ----------------------- ## III. Additional Information - - ### Which layer(s) of the Singer ecosystem does this proposal directly touch? Select all that apply: @@ -100,40 +106,32 @@ Select all that apply: ### Are there any downsides to this change? -... +There are certainly downsides to this change. Directing the conversation with a categorical model is very important to consider. 
There are potentially nuances that are not yet considered that must be accounted for in this proposal, but the author intends that those pieces can be amended to clarify as they come out of working together. ### Is the change backwards compatible? -... - -### Other Considerations - -... +There is nothing to be backwards compatible with. This is a first language proposal of this mechanism purely for the Working Group's consideration. ### How are Singer developers affected by the change (if applicable)? -... +Hopefully not at all unless there is a good reason to. A primary goal of this is to only require things that are truly worthwhile to implement for the general Singer world as a whole. ### How are Singer users affected by the change? (if applicable)? -... - -### Prototype Implementations - -...(if applicable)... +Singer users should be able to utilize this language as well to evaluate things that claim to implement the Singer best practices. ### Future Plans -...(if applicable)... +The future plans for this proposal are that the Working Group will continue to iterate upon it and come to truly beneficial categorizations that advance the world of data as a whole. ### Excluded Alternatives -...(if applicable)... +A purely free approach to extension proposals has been explicitly excluded by this. An entirely bottom-up approach isn't compatible with this sort of structure. ### Acknowledgements -...(if applicable)... +Thank you to Taylor Murhy and Aaron Steers from Meltano for engaging with the initial Issue associated with this idea and helping me develop it conceptually over the past months. ### What defines this SIP as "done"? -... +Purely adoption by the Singer Working Group as an operational practice would consider this proposal as Done. 
From b763d26bb8a4468f33d3f1a66adfe17568e22ea8 Mon Sep 17 00:00:00 2001 From: Dan Mosora Date: Fri, 18 Feb 2022 21:41:49 -0500 Subject: [PATCH 5/8] Update updated-at value --- proposals/draft/19-layered-model-of-singer | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/draft/19-layered-model-of-singer b/proposals/draft/19-layered-model-of-singer index 58ef671..8353d13 100644 --- a/proposals/draft/19-layered-model-of-singer +++ b/proposals/draft/19-layered-model-of-singer @@ -9,7 +9,7 @@ _This document follows the [Singer Improvement Proposal (SIP) process](./draft/P | State | Draft | | Issue Link | [#19](https://github.com/MeltanoLabs/Singer-Working-Group/issues/19) | | Created | 2022-01-13 | -| Updated | 2022-01-13 | +| Updated | 2022-02-18 | ----------------------- From aecfc206f41b34e0d7e838fed55c9beb156b8dbb Mon Sep 17 00:00:00 2001 From: Dan Mosora <30501696+dmosorast@users.noreply.github.com> Date: Tue, 22 Feb 2022 10:45:13 -0500 Subject: [PATCH 6/8] Update proposals/draft/19-layered-model-of-singer Co-authored-by: Taylor A. Murphy --- proposals/draft/19-layered-model-of-singer | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/draft/19-layered-model-of-singer b/proposals/draft/19-layered-model-of-singer index 8353d13..b6dbdce 100644 --- a/proposals/draft/19-layered-model-of-singer +++ b/proposals/draft/19-layered-model-of-singer @@ -130,7 +130,7 @@ A purely free approach to extension proposals has been explicitly excluded by th ### Acknowledgements -Thank you to Taylor Murhy and Aaron Steers from Meltano for engaging with the initial Issue associated with this idea and helping me develop it conceptually over the past months. +Thank you to Taylor Murphy and Aaron Steers from Meltano for engaging with the [initial Issue](https://github.com/MeltanoLabs/Singer-Working-Group/issues/19) associated with this idea and helping me develop it conceptually over the past months. ### What defines this SIP as "done"? 
From b0c1e1cda323a5dd26e6f0b56c868c3e4d527d8b Mon Sep 17 00:00:00 2001 From: Dan Mosora <30501696+dmosorast@users.noreply.github.com> Date: Tue, 22 Feb 2022 10:46:02 -0500 Subject: [PATCH 7/8] Update proposals/draft/19-layered-model-of-singer Co-authored-by: Taylor A. Murphy --- proposals/draft/19-layered-model-of-singer | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/draft/19-layered-model-of-singer b/proposals/draft/19-layered-model-of-singer index b6dbdce..5d0ba3c 100644 --- a/proposals/draft/19-layered-model-of-singer +++ b/proposals/draft/19-layered-model-of-singer @@ -73,7 +73,7 @@ Some principles here: - They strive to make code more portable, readable, and usable for users and devs alike in a structural sense ### Layer 4: Libraries and Frameworks -This is where we get into the language specific stuff. Libraries like `singer-python` and/or `singer-clojure` or frameworks like the MeltanoSDK take the standards plus best practices and encode them in a way that makes sense for the patterns of each language. This is also a good place to be a test bed for things that might become standards. +This is where we get into the language specific stuff. Libraries like `singer-python` and/or `singer-clojure` or frameworks like the Meltano SDK take the standards plus best practices and encode them in a way that makes sense for the patterns of each language. This is also a good place to be a test bed for things that might become standards. 
Principles: - Language specific From 418f39fe553ff3faa9fed0ab817354f3f4b47a42 Mon Sep 17 00:00:00 2001 From: Dan Mosora Date: Tue, 22 Feb 2022 10:49:06 -0500 Subject: [PATCH 8/8] Some typos and wording change --- proposals/draft/19-layered-model-of-singer | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/proposals/draft/19-layered-model-of-singer b/proposals/draft/19-layered-model-of-singer index 5d0ba3c..5b4b944 100644 --- a/proposals/draft/19-layered-model-of-singer +++ b/proposals/draft/19-layered-model-of-singer @@ -55,7 +55,7 @@ Some principles of features here: - Focus on the std-out portions of using Singer (serialization format, message types, required keys for messages, etc.) #### Layer 2: Standards -The pieces that systems using Singer can rely on such as catalogs and discovery mode. These pieces required to implement to be considered as up to modern standards for Singer. Not implementing these is not an incompatibility, but all Singer actors should strive to implement the standards. The standards are more focused upon Singer actors themselves themselves (e.g., taps, targets), as opposed to the over-the-wire protocol at the heart of Singer. Some tooling that rely on these standards may not function if an actor does not implement it. +The pieces that systems using Singer can rely on such as catalogs and discovery mode. These pieces required to implement to be considered as up to modern standards for Singer. Not implementing these is not an incompatibility, but all Singer actors should strive to implement the standards. The standards are more focused upon Singer actors themselves (e.g., taps, targets), as opposed to the over-the-wire protocol at the heart of Singer. Some tooling that rely on these standards may not function if an actor does not implement it. Conceptually, this includes things like Command-Line Arguments, Catalog, Metadata Keys/Custom Metadata, Standard State Keys, etc. 
@@ -81,14 +81,15 @@ Principles: - These influence the way that code is written for their specific language ### Layer 5: Tooling/Orchestration/UX/Infrastructure -This is the bread and butter of organizations that adopt singer. As far as standards go, this is not generally something that will be voted upon by the Singer communicty as a whole. Instead, this is the state that all proprietary code falls in and the specific product offerings of practitioners fit. Nothing in this space is expected to be open source, but the option is there for things that may qualify for best practices. +This is the bread and butter of organizations that adopt Singer. As far as standards go, this is not generally something that will be voted upon by the Singer community as a whole. Instead, this is the state that all proprietary code falls in and the specific product offerings of practitioners fit. Nothing in this space is expected to be open source, but the option is there for things that may qualify for best practices. This layer can be considered an analog to the "Application Layer" of the OSI layered model of networking. Principles: - Specific to a particular vertical or market space - Specific to a single ops approach for deployment or other closer-to-the-metal conecerns -- Not generalizable without either losing value in the source product +- Not generalizable +- May not be a candidate for open sourcing -----------------------