This repository has been archived by the owner on Aug 25, 2024. It is now read-only.

docs: arch: 2nd and 3rd party plugins #1061

Closed
wants to merge 3 commits into from
235 changes: 235 additions & 0 deletions docs/arch/0001-2nd-and-3rd-party-plugins.rst
@@ -0,0 +1,235 @@
1. 2nd and 3rd party plugins
============================

Date: 2021-04-01

Status
------

Draft

Context
-------

DFFML currently has the main package and all plugins maintained within a single
repo. We always intended to support 3rd party plugins, meaning displaying them
on the main docs site as options for users. We are just now getting around to
it.

We decided that we will have the main package, 2nd party plugins, and 3rd party
plugins.

- The main package is ``dffml``.

- 2nd party plugins are plugins that are maintained by the core maintainers and
  whose repos reside in the ``dffml`` organization.

- 3rd party plugins are hosted in user or other org repos and core maintainers
are not owners of those repos.

We need to take the plugins that are currently maintained within the main repo
and put them in their own repos. We need the docs site to reflect the working /
not-working status of plugins on tutorial pages. We need a set of plugins that
we don't release unless they all work together. This is because a core part of
our functionality is letting users swap out underlying libraries, which they
can't do if the libraries can't be installed together.

What do we know
~~~~~~~~~~~~~~~

- Main package has no dependencies

- The plugins page lists all plugins that are in ``dffml/plugins.py``

- There are tutorials that are associated with specific plugins

- If a plugin's latest release doesn't pass CI against DFFML's latest
  release, its tutorials should show that it's not working.

- A main point of DFFML is to have a set of ML libraries that work together in
a single install environment so that a user can try multiple libraries and
choose the best one.

- This means we have to know which plugins can be installed together.

Decision
--------

- We want to do the compatibility matrix check in the main package and in each
  plugin.

- This lets the main package know, at docs build time, the status of each
  plugin.

- This lets plugin authors know in PR CI, etc. if they are about to cause a
compatibility issue.

- We need the ability to move things from level 1 to level 2 if we decide
  that something is no longer a showstopper for release.

- For the working / not-working status of tutorials, we want to show two
  things. This only applies to support levels 2 and 3, because support levels
  0 and 1 must always work for release.

- Does this tutorial work when other packages are installed latest / master?

- Does this tutorial work against all dependent packages for latest /
  master?

- Some plugins rely only on the main package

- Main package never has any dependencies. So from a dependency checking
perspective, there should never be any issue.

- Some plugins rely on other plugins as well

- These plugin-to-plugin dependencies are where conflicts can arise, so
  they are the combinations the compatibility check needs to cover.

- We need to know which plugins can be installed together

- We need to know which plugins' failure to validate against the master
  branch warrants blocking a release

- We need some sort of support level tracking.

- Support level tracking means testing against the latest release and the
  master branch.

- Possible support levels

- 0: main package

- 1: 2nd party, required to pass for release

- 2: 2nd party, not required to pass for release

  - Note on tutorials that involve level 2 plugins saying they aren't
    working at the moment

- 3: 3rd party, not required to pass for release

  - Note on tutorials that involve level 3 plugins saying they aren't
    working at the moment
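The support levels above map naturally onto a flat name-to-level file. A minimal sketch, assuming a simple JSON mapping; the file shape and the non-scikit plugin names here are hypothetical:

```python
import json

# Hypothetical shape for dffml/plugins.json: plugin name -> support level.
# 0: main package, 1: 2nd party required for release,
# 2: 2nd party not required, 3: 3rd party.
PLUGINS_JSON = json.loads("""
{
    "dffml": 0,
    "dffml-model-scikit": 1,
    "dffml-model-example2ndparty": 2,
    "dffml-model-example3rdparty": 3
}
""")

def must_pass_for_release(name: str) -> bool:
    """Support levels 0 and 1 block a release when they fail."""
    return PLUGINS_JSON[name] <= 1

def tutorial_needs_status_banner(name: str) -> bool:
    """Tutorials for level 2 and 3 plugins show working / not-working."""
    return PLUGINS_JSON[name] >= 2
```

Keeping the mapping this simple means both the docs build and each plugin's CI can consume the same file to decide what blocks release versus what only toggles a warning.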

Consequences
------------

- Based on support levels

- The ``dffml/plugins.json`` file should list each plugin and its support level

- Documentation related to specific plugins

- Tutorials that pip install packages of support level 2 or 3 must have some
  element at the top of the page that can show the working / not-working
  status.

- When tutorials are tested, they only install the set of plugins that they
  need. So a tutorial CI test will fail if those plugins do not work together.
  Therefore, when we display the warning we still know the tutorial itself
  works. If there is a failure to install all support level 1 and 2 plugins
  together, we know that we should display the warning: the tutorial works,
  but we're not sure what other plugins installed in the same environment
  might cause dependency conflicts.

- Matrix check, two perspectives (this translates into CI tests)

- Main package

- Support level 1

- For master, does installing all the plugins from their master zip
  archive URL work when all given to ``pip install`` at the same time?

- For latest release, does installing all the plugins by PyPI name work
  when all given to ``pip install`` at the same time?

- "work" here meaning does pip raise any issues about conflicting
dependency versions.

- Support level 2

- For master

- Does installing all the plugins in support levels 1 and 2 from
  their master zip archive URL work when all given to ``pip install``
  at the same time?

- PASS: No warning on tutorials.

- FAIL: Warning on tutorials; this may not work when other plugins
  are installed. This tutorial should still work when no other
  plugins are installed.

- For latest release

- Does installing all the plugins in support levels 1 and 2 from
  their PyPI name work when all given to ``pip install`` at the same
  time?

- PASS: No warning on tutorials.

- FAIL: Warning on tutorials; this may not work when other plugins
  are installed. This tutorial should still work when no other
  plugins are installed.

- "work" here meaning does pip raise any issues about conflicting
dependency versions.

- If they don't, do we care about finding more info about which ones
  are breaking it? No, we do not care, because figuring out the full
  matrix is exponential.

- Support level 3

- Always have a warning on tutorials: this may not work when other plugins
  are installed, because this is a tutorial based on a third party plugin.
  This tutorial should still work when no other plugins are installed.
  In the event that it doesn't, please report issues to the third party here:
  <Link to third party project URL for plugin>

- Plugin package

- Support level 1

- Fail CI if install of support level 1 plugins fails.

- Support level 2

- Fail CI if install of support level 1 plugins fails.

- If there is some way to warn via CI, then warn if the install of
  support level 1 and 2 plugins fails.

- Support level 3

- Fail CI if install of support level 1 plugins fails.

- If there is some way to warn via CI, then warn if the install of
  support level 1 and 2 plugins fails.
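The matrix checks above boil down to handing pip the whole plugin set in one invocation and seeing whether its resolver objects. A sketch under stated assumptions: ``--dry-run`` requires pip >= 22.2, and the ``archive/master.zip`` URL pattern and org name are illustrative, not the actual repo layout:

```python
import subprocess
import sys

def build_install_command(plugins, master=False):
    """Build one `pip install` covering every plugin, so pip's resolver
    sees the whole set at once and can report version conflicts."""
    if master:
        # Assumed URL pattern for master zip archives of 2nd party repos.
        specs = [
            f"https://github.com/dffml/{name}/archive/master.zip"
            for name in plugins
        ]
    else:
        # Latest releases, by PyPI name.
        specs = list(plugins)
    # --dry-run (pip >= 22.2) resolves dependencies without installing.
    return [sys.executable, "-m", "pip", "install", "--dry-run", *specs]

def matrix_check(plugins, master=False):
    """True when pip resolves the full set without raising conflicts."""
    proc = subprocess.run(build_install_command(plugins, master=master))
    return proc.returncode == 0
```

A failing check for support level 1 fails the build; for levels 2 and 3 it only toggles the tutorial warning, since pinning down which pair of plugins conflicts would take an exponential number of installs.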

Notes
-----

Tutorial check command

This command will help us check whether a tutorial, given its URL, is
compatible with the locally installed version of DFFML and all of the plugins
that are installed locally.

To check a tutorial by URL, we could push a JSON file or some other metadata
into the built docs, perhaps keyed by a unique ID per document. The docs build
would output some structure within each document that lets us determine which
versions of the plugins the document was tested against.
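A sketch of what the check could do locally, assuming the docs build embeds a JSON mapping of plugin name to tested version in each tutorial page (the metadata format and the plugin name below are hypothetical):

```python
import json
from importlib import metadata

def check_tutorial(tested_versions):
    """Compare the versions a tutorial was tested against with what is
    installed locally; returns name -> (tested, installed) mismatches."""
    mismatches = {}
    for name, tested in tested_versions.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Plugin the tutorial was tested with is not installed at all.
            installed = None
        if installed != tested:
            mismatches[name] = (tested, installed)
    return mismatches

# Metadata the docs build might embed in the page (hypothetical plugin).
embedded = json.loads('{"not-a-real-dffml-plugin": "0.4.0"}')
print(check_tutorial(embedded))
```

An empty result would mean the local environment matches what the tutorial was tested against; anything else would drive the working / not-working banner.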

Dependency declaration

We need some sort of file, maybe an SBOM format, which declares all of our
dependencies. We can convert that format into something that can be pip
installed, then do a ``pip download`` in one stage of the CI job, upload the
downloaded artifacts into the next stage, and expose them via a local file
server, or maybe use a ``file://`` index URL with pip. This way the test cases
are only allowed to install things which have been declared in the dependency
format.
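The two CI stages described above might look like the following sketch; ``--no-index --find-links`` is one standard way to restrict pip to the pre-downloaded artifacts (the local file server or ``file://`` index URL mentioned above is the alternative):

```python
import sys

def download_stage(requirements_file, dest):
    """Stage 1: pre-download every declared dependency into `dest`."""
    return [sys.executable, "-m", "pip", "download",
            "-r", requirements_file, "-d", dest]

def install_stage(package, dest):
    """Stage 2: install only from the downloaded artifacts, with the
    network index disabled, so undeclared dependencies cannot sneak in."""
    return [sys.executable, "-m", "pip", "install",
            "--no-index", "--find-links", dest, package]
```

Run with ``subprocess.run(download_stage("requirements.txt", "wheels/"))`` in the first job, hand the ``wheels/`` directory to the next job as an artifact, then ``subprocess.run(install_stage("dffml", "wheels/"))`` there.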

PR validation

Essentially we trigger a domino effect: we analyze the requirements files of
all of the plugins that are either first or second party (possibly supporting
third party later somehow) and build a dependency tree to understand which
plugins depend on the plugin being changed in the original pull request. We
run the validation for the original pull request, then trigger the CI runs of
all of the downstream projects with the original PR applied. If any of the
downstream repos would need to be changed for their CI to pass, we can create
PRs against those repos. In the original PR we can provide overrides for each
downstream package, so that when we trigger the validation for each downstream
package we can say "use this PR". So if you've made an API breaking change and
you need to go through all of the downstream dependencies, make changes, and
submit the PRs that would make it OK, then you specify all of those PRs, which
will be used when running the CI of the downstream dependencies respectively.
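The dependency-tree step can be sketched as a reverse lookup over each plugin's requirements; the plugin names below are illustrative:

```python
def downstream_of(changed, requirements):
    """Given plugin -> direct dependencies, return every plugin whose CI
    must be re-triggered when `changed` changes (transitive dependents)."""
    # Invert the dependency mapping: dep -> plugins that depend on it.
    dependents = {}
    for plugin, deps in requirements.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(plugin)
    # Walk dependents transitively.
    stack, seen = [changed], set()
    while stack:
        for plugin in dependents.get(stack.pop(), ()):
            if plugin not in seen:
                seen.add(plugin)
                stack.append(plugin)
    return seen

reqs = {
    "dffml": set(),
    "dffml-model-scikit": {"dffml"},
    "dffml-feature-git": {"dffml"},
    "example-3rd-party": {"dffml-model-scikit"},
}
print(sorted(downstream_of("dffml", reqs)))
# -> ['dffml-feature-git', 'dffml-model-scikit', 'example-3rd-party']
```

Each plugin in the returned set would get its CI triggered with the original PR applied, plus any per-package PR overrides.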

We don't necessarily need to update status checks via API; we can just have a pipeline within PR workflows which says this other PR must be merged in an upstream or downstream repo before this one can auto merge.

@johnandersen777 johnandersen777 Feb 8, 2023


schema/github/actions/result/container/example-pull-request-validation.yaml

$schema: "https://github.com/intel/dffml/raw/dffml/schema/github/actions/result/container/0.0.0.schema.json"
commit_url: "https://github.com/intel/dffml/commit/1f347bc7f63f65041a571d9e3c174d8b9ead24aa"
job_url: "https://github.com/intel/dffml/actions/runs/4185582030/jobs/7252852590"
result: "docker.io/intelotc/dffml@sha256:ae636f72f96f499ff5206150ebcaafbd64ce30affa7560ce0a41f54e871da2"
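A minimal stdlib check that a result document like the one above carries the fields it appears to require; a real pipeline would validate against the published ``0.0.0.schema.json`` itself, and the required-field set here is an assumption read off the example:

```python
# Assumed required fields, read off the example document above.
REQUIRED = {"$schema", "commit_url", "job_url", "result"}

def missing_fields(doc):
    """Return the required fields absent from a result document."""
    return REQUIRED - doc.keys()

doc = {
    "$schema": "https://github.com/intel/dffml/raw/dffml/schema/github/actions/result/container/0.0.0.schema.json",
    "commit_url": "https://github.com/intel/dffml/commit/1f347bc7f63f65041a571d9e3c174d8b9ead24aa",
    "job_url": "https://github.com/intel/dffml/actions/runs/4185582030/jobs/7252852590",
    "result": "docker.io/intelotc/dffml@sha256:ae636f72f96f499ff5206150ebcaafbd64ce30affa7560ce0a41f54e871da2",
}
print(missing_fields(doc))  # set()
```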

@johnandersen777 johnandersen777 Feb 24, 2023


2023-02-23 @pdxjohnny Engineering Logs

  • https://github.com/cloudfoundry-community/node-cfenv
  • Eventing helps us have Alice sit alongside and look at new issues, workflow runs, etc. This will help her help developers stay away from known bad/unhelpful trains of thought.
    • She can look at issue bodies for similar stack traces
      • Eventually we'll have the updating like we do where we update issue or discussion thread with what console commands and outputs we run while debugging, or we'll just do peer to peer depending on context!
      • docs: arch: Inventory #1207
        • live at HEAD is great, but poly repo PR validation will bring us into the future, since we'll be running inference over all the active pull requests
          • We'll take this further to branches, then to the in progress trains of thought (active debug, states of the art which gatekeeper/umbrella/prioritizer says are active based on overlays for context of scientific exploration)
            • As our inference gets better, we'll look across the trains of thought and Prophet.predict() state of the art trains of thought, then validate those via dispatch/distributed compute, then we'll start to just infer the outputs of the distributed compute, and validate based on risk and criticality; we'll then have our best guess muscle memory machine.
  • Mermaid has mind map functionality now
  • https://www.youtube.com/watch?v=tXJ03mPChYo&t=375s
    • Alice helps us understand the security posture of this whole stack over its lifecycle. She's trying to help us understand the metrics and models produced from analysis of our software and improve it in arbitrary areas (via overlays). She has overlays for dependency analysis and deciding if there is anything she can do to help improve those dependencies. alice threats will be where she decides if those changes or the stats mined from shouldi are aligned to her strategic principles; we'll also look to generate threat models based on analysis of dependencies found going down the rabbit hole again with alice shouldi (shouldi: deptree: Create dependency tree of project #596). These threat models can then be improved via running https://github.com/johnlwhiteman/living-threat-models auditor.py alice threats audit. Threats are inherently strategic, based on deployment context; they require knowledge of the code (static), past behavior (pulled from event stream of distributed compute runs), and understanding of what deployments are relevant for vuln analysis per the threat model.
      • Entity, infrastructure (methodology for traversal and chaining), (open) architecture
      • What are you running (+deps), where are you running it (overlayed deployment, this is evaluated in federated downstream SCITT for applicability and reissuance of VEX/VDR by downstream), and what's the upstream threat model telling you if you should care whether what you're running and how you're running it yields unmitigated threats. If so, and Alice knows how to contribute, Alice please contribute. If not, and Alice doesn't know how to contribute, Alice please log todos across org-relevant poly repos.
      • When we do our depth of field mapping (ref early engineering log streams) we'll merge all the event stream analysis via the tuned brute force prioritizer (grep alice discussion arch)
  • Loosely coupled DID VC CI/CD enables AI in the loop development in a decentralized poly repo environment (Open Source Software cross orgs)

WIP: IETF SCITT: Use Case: OpenSSF Metrics: activitypub extensions for security.txt


@johnandersen777 johnandersen777 Apr 19, 2023


graph TD
    subgraph transparency_service[Transparency Service]
        transparency_service_pypi_known_good_package[Trust Attestation in-toto style<br>test result for known-good-package]
    end
    subgraph shouldi[shouldi - OSS Risk Analysis]
        subgraph shouldi_pypi[PyPi]
            shouldi_pypi_insecure_package[insecure-package]
            shouldi_pypi_known_good_package[known-good-package]
        end
    end
    subgraph cache_index[Container with pip download for use with file:// pip index]
        subgraph cache_index_pypi[PyPi]
            cache_index_pyOpenSSL[pyOpenSSL]
        end
    end
    subgraph fork[Forked Open Source Packages]
        subgraph fork_c[C]
            fork_OpenSSL[fork - OpenSSL]
        end
        subgraph fork_python[Python]
            fork_pyOpenSSL[fork - pyOpenSSL]
        end

        fork_OpenSSL -->|Compile, link, embed| fork_pyOpenSSL
    end
    subgraph cicd[CI/CD]
        runner_tool_cache[$RUNNER_TOOL_CACHE]
        runner_image[Runner container image - OSDecentrAlice]
        subgraph loopback_index_service[Loopback/sidecar package index]
            serve_package[Serve Package]
        end

        subgraph workflow[Python project workflow]
            install_dependencies[Install Dependencies]
            install_dependencies -->|Deps from N-1 2nd<br>party SBOMs get cached| runner_tool_cache
            install_dependencies -->|PIP_INDEX_URL| loopback_index_service
        end

        runner_tool_cache --> runner_image
    end

    shouldi_pypi_known_good_package --> transparency_service_pypi_known_good_package

    serve_package -->|Check for presence of trust attestation<br>inserted against relevant statement<br>URN of policy engine workflow used| transparency_service_pypi_known_good_package

    cache_index_pypi -->|Populate $RUNNER_TOOL_CACHE<br>from cached index| runner_image

    fork_pyOpenSSL -->|Publish| cache_index_pyOpenSSL


Repo locks

For this, what amounts to essentially a poly repo structure, to work with the way that we're validating all of our pull requests against each other before merge, we need to ensure that when the original PR is merged, all the rest of the PRs associated with it, which might fix API breaking changes in downstream dependent packages, are also merged. Therefore we will need some sort of a system account or bot which must approve every pull request. We can make the bot's logic so that if an approved reviewer has approved the pull request, the bot will approve the pull request, initiate the locking procedure, and rebase the pull request into the repo. So when we have a change which affects more than one repo, we will trigger rebases into the respective repos' main branches while all of those repos are locked. In fact all of the repos will be locked within the main repo and the 2nd party org. This is because we need to ensure that all of the changes get merged and there are no conflicts, so that we don't end up in an unknown state. Our state is known so long as we have tested all of the PRs involved against the main branch, or the latest commit before rebase. When all PRs in a set across repos are approved, the bot will merge starting with the farthest downstream PR. It will specify version information to the CI somehow, so that the CI can block waiting for the commit from the original PR to be merged before continuing. This will ensure that the CI jobs do not run against a slightly outdated version of the repo which the original PR was made against.
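The merge ordering in the locking procedure (farthest downstream first) is a reverse topological sort over the PR set; a sketch using Python 3.9's stdlib ``graphlib``, with illustrative repo names:

```python
from graphlib import TopologicalSorter

def merge_order(pr_repos, deps):
    """Order a cross-repo PR set so the farthest-downstream repo merges
    first. `deps` maps a repo to the repos it depends on; only repos
    that are part of the PR set constrain the ordering."""
    graph = {
        repo: deps.get(repo, set()) & set(pr_repos)
        for repo in pr_repos
    }
    # static_order() yields upstream repos first; reverse it so the
    # farthest downstream PR is merged first.
    return list(TopologicalSorter(graph).static_order())[::-1]

order = merge_order(
    ["dffml", "dffml-model-scikit", "example-3rd-party"],
    {
        "dffml-model-scikit": {"dffml"},
        "example-3rd-party": {"dffml-model-scikit"},
    },
)
print(order)  # ['example-3rd-party', 'dffml-model-scikit', 'dffml']
```

The bot would walk this order, merging each PR and blocking its CI until the upstream commits it depends on have landed.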
@johnandersen777 johnandersen777 Mar 21, 2023



Maintainer info

For support level 2 or 3 plugins which might break with the application of a PR, we should have some bot or workflow comment highlighting which plugins would break if the PR was applied. This information is mainly for maintainers, so that they understand whether they should request additional work or slight modifications, or whether we need to plan and create issues. For example, breaking a support level 2 plugin may be OK; we just need to make sure that we're tracking it.
1 change: 1 addition & 0 deletions docs/arch/index.rst
@@ -7,5 +7,6 @@ https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions
.. toctree::
:titlesonly:

0001-2nd-and-3rd-party-plugins
0002-Object-Loading-and-Instantiation-in-Examples
0003-Config-Property-Mutable-vs-Immutable