
Remove usage of global attributes #75

Closed
nineinchnick wants to merge 2 commits from the avoid-globals branch

Conversation

@nineinchnick (Collaborator) commented Feb 26, 2020

Using global attributes when building models makes it really hard to share assets between different models. Multiple models have to be stored in separate sources, cannot be processed together and unused assets are still drawn in diagrams.

This MR removes all usages of global/static attributes and requires elements to be explicitly passed to a model. Note that only Dataflows have to be passed. See the newly added tests for the difference in usage.

Also note that previously the sequence diagram relied on the order in which assets were defined (not dataflows); now dataflows are used to order assets. This is still not consistent with being able to manually order dataflows, but that can be improved in future MRs.

@nineinchnick nineinchnick requested a review from izar as a code owner February 26, 2020 13:32
@ghost commented Feb 26, 2020

Congratulations 🎉. DeepCode analyzed your code in 2.376 seconds and we found no issues. Enjoy a moment of no bugs ☀️.


@nineinchnick nineinchnick force-pushed the avoid-globals branch 2 times, most recently from df4e9a4 to fff68fd Compare March 4, 2020 11:03
@nineinchnick nineinchnick force-pushed the avoid-globals branch 2 times, most recently from 1780df7 to b598528 Compare March 4, 2020 23:28
@nineinchnick nineinchnick force-pushed the avoid-globals branch 3 times, most recently from eb437fe to a4f6be3 Compare March 16, 2020 12:12
@izar (Collaborator) left a comment

Not sure I like this one. The idea behind pytm is to have the code describe the system, transparently, without making it a whole "big thing": write the code that describes the system you're working on right here, and keep it with the code being described.
From a software engineering point of view, the separation and reuse you propose is great, but I think here it will necessarily complicate the 80% of cases pytm is used for in order to simplify the other 20%.

@nineinchnick (Collaborator Author)

I knew this one was controversial and would spawn some discussion :-)

If you take a look at the modified README and tm.py, the only thing that changes is that you have to specify which dataflows form the model when creating it, and because of that you have to define the model at the end. This makes the process of building a model a bit more explicit. I don't think it makes the model any less transparent.

We could retain the same order of first defining and describing the model if we could assign the model as a property to all assets and dataflows, but that would add a lot more overhead, so I went with the current solution.

The 80/20 argument depends on how much more complicated it would be to build small models. I'm not sure that being explicit means complicating things. You already have to be explicit when defining Dataflows by pointing to a source and a sink. This fits nicely into the following workflow (a sketch follows the list):

  • define assets
  • define dataflows by connecting assets
  • collect dataflows into a model

So maybe it only seems like a complication because it reorders the current workflow?
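
For illustration, a minimal sketch of that ordering, mirroring the example further down in this thread (the names and the elements attribute are illustrative of the proposed API, not a final interface):

from pytm import TM, Actor, Server, Dataflow

# 1. define assets
user = Actor("User")
server = Server("Web server")

# 2. define dataflows by connecting assets
request = Dataflow(user, server, "Request")

# 3. collect dataflows into a model, defined at the end
tm = TM("Some model")
tm.elements = [request]
tm.process()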

@nineinchnick (Collaborator Author)

Note that, before we can complete the discussion here, #84 tries to address the issue of assets shared between multiple models.

@izar (Collaborator) commented Mar 17, 2020 via email

@nineinchnick (Collaborator Author)

I understand, but it sounds like a trap. The kinds of changes needed to make the library more extensible are breaking changes. Making them early will help keep the user base.

What if I didn't move creating the model to the end in the examples?

from pytm import TM, Actor, Server, Dataflow

tm = TM("Some model")

user = Actor("User")
server = Server("Web server")

tm.elements = [
    Dataflow(user, server, "Request"),
]

tm.process()

@izar (Collaborator) commented Mar 18, 2020 via email

@nineinchnick nineinchnick force-pushed the avoid-globals branch 6 times, most recently from 00b20a8 to 4ba2fae Compare March 31, 2020 18:44
@nineinchnick nineinchnick force-pushed the avoid-globals branch 2 times, most recently from ca014ee to a459d66 Compare June 16, 2020 20:03
@izar (Collaborator) commented Jun 23, 2020

@shambho @colesmj what do you guys think? Is it worth moving this way and breaking the "simplicity" of the approach? @nineinchnick correctly points out that the way it has been so far creates a problem in handling multiple models together, but I offer that it is a marginal case compared to how people have been working so far. I mean, I see the value of the change from a development point of view, but I am thinking of the users here. I think it breaks "anyone can write a model in 5 minutes" by adding a layer of syntax that most people will probably not be using.

@nineinchnick (Collaborator Author)

As a last resort I can create a default context and keep the reset() method. That would retain the old behavior without breaking backward compatibility. The only question is which approach we should recommend as the example in the README and tm.py.

@colesmj (Collaborator) commented Jun 24, 2020

@izar I can see your argument on this, and @nineinchnick's as well. This PR contains more than one change, however.

To be honest, the suggested change of adding the context is, on its face and when used with simple TMs, not much of a change (two lines?). The bigger change is the paradigm shift that models are defined by their dataflows: it takes the act of creating a threat model from asset-centric to flow-centric, and is potentially an area where this change would be both a benefit and a pitfall.

I think our core audience for pytm is experienced developers, not experienced Python developers, but the added requirement that the user initiate a model does not seem onerous.

What may be strange is how things might be shared across models: elements are defined within the context (if I read the test and sample correctly) and then dataflows are specified to build the model, again within the context. At this point nothing is shared, but I can see the benefit of not including assets and elements unrelated to the dataflows in the analysis. This may catch some people off guard, but it sort of makes sense if you consider that the model may be incomplete at the time of rendering (i.e. in the course of conversation the author identifies elements but does not yet identify how they are used, so there is nothing to connect them).

I would have liked to see a sample of two models sharing assets or flows, to highlight the benefit of using a context manager, before merging this PR.

@izar (Collaborator) commented Aug 26, 2021

Where do we go with this one? @colesmj @nineinchnick

@nineinchnick (Collaborator Author) commented Aug 30, 2021

I was supposed to provide more examples, especially with multiple models sharing assets. Other than that, there are no changes requested in the PR.
I should rebase the branch and can also make sure this is backward compatible.

@raphaelahrens (Contributor) commented Aug 31, 2021

Hi,

I would like to see this feature merged.

Why would this feature help me?

I work on a research project with ~10 teams, where each team builds their own services.
But all the teams share a common infrastructure; for example, they all use a common data format, a registration service, a message broker, and others.

To make the modelling easier and to have common names and attributes in the threat models, I would like to create libraries like the following company.py and data.py:

# company.py
from pytm.pytm import ExternalEntity
from pytm.pytm import Boundary
from pytm.pytm import Actor

boundary = Boundary('Unternehmen')
mes = ExternalEntity('MES', inBoundary=boundary)
ceo = Actor('CEO')

# data.py
from pytm.pytm import Data
from pytm.pytm import Classification

efdm = Data("EFDM")
efdm.classification = Classification.RESTRICTED
efdm.format = "JSON"

efdm_signal = Data("Umsetzungssignal")
efdm_signal.classification = Classification.RESTRICTED

time = Data("Zeit")
time.classification = Classification.PUBLIC
machine_info = Data("Anlangendaten")
machine_info.classification = Classification.SECRET

These can then be used by the other teams to create their threat models like this:

from pytm.pytm import Server
from pytm.pytm import Dataflow as D

import company
import data

service = Server('Example', inBoundary=company.boundary)

D(
   service,
   company.mes,
   'Push machine info',
   data=data.machine_info
)

But with the current global state this will also add the CEO to the drawings and the reports, even though the CEO is not relevant for this threat model.
With the data module this only affects the reports, so machine_info is listed as well as time and the others.

I think it would also help in creating small per-feature threat models, since the users of the software could create their own reusable libraries of elements which they can reuse in their next feature's threat models.

I hope this example was helpful.

@nineinchnick (Collaborator Author)

@raphaelahrens until this gets merged, as a workaround, you should (see the sketch after this list):

  • set the ignoreUnused property in the model (TM object)
  • call TM.reset() before building each model
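
A minimal sketch of that workaround, assuming pytm's current global registry and the ignoreUnused / TM.reset() features mentioned above (all element names are illustrative):

from pytm import TM, Actor, Server, Dataflow

# First model: ignoreUnused drops elements that appear in no dataflow.
tm1 = TM("Model A", ignoreUnused=True)
user = Actor("User")
web = Server("Web server")
Dataflow(user, web, "Request")
tm1.process()

# Clear the global registry before building the next model,
# so the second model does not pick up the first model's elements.
TM.reset()

tm2 = TM("Model B", ignoreUnused=True)
admin = Actor("Admin")
api = Server("API server")
Dataflow(admin, api, "Admin request")
tm2.process()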

@izar (Collaborator) commented Aug 31, 2021

Thanks @raphaelahrens, that was a very useful example.

I definitely see the use case. I also continue to think it falls into the small side of an 80/20 split.

A naive question, but one that might (if I am not dreamwalking) be a useful compromise: what if we added functionality to add one TM object to another? This way company.py and data.py would be included, their TMs joined (smartly) with the new TM, and everything would be accessible, as in the example.

The simple syntax is maintained, and we have minimal change. Would that work?
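
A hypothetical sketch of what that could look like; neither TM.merge() nor the company/data modules carrying their own TM exist in pytm today, so every name here is illustrative only:

from pytm import TM, Server, Dataflow

import company   # would ship its own TM plus the shared elements
import data

tm = TM("Feature model")
tm.merge(company.tm)   # hypothetical API: join the shared model into this one

service = Server("Example", inBoundary=company.boundary)
Dataflow(service, company.mes, "Push machine info", data=data.machine_info)

tm.process()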

@raphaelahrens (Contributor)

We seem to agree that composable threat models are useful, and I also agree that the first model must be easy to create with minimal effort.
But regarding the 80/20 concern, I also believe there must be a way to easily grow the threat modelling effort.
As mentioned earlier, it is helpful to model different features in separate models, and then it helps if one can reuse parts of an earlier model.
Currently I create one model for one feature and then copy it for the next, and updates to one model stay in that model.
The benefits I see in having the ability to import models are:

  1. Less typing for later models
  2. Older models might get more attention, because you have to read them if you want to reuse them
  3. Updates are not only applied to one model; they propagate to all models.

As soon as your repository has more than one model, this feature will be helpful.

The question then is: what is the desired/recommended workflow for pytm?

Currently it is simpler to have one big threat model, because reuse is much easier.
But large models are hard to reason about.
Alternatively, you can copy the threat models, which makes updates harder, and older models will age more quickly.

For me it made sense to put elements which I would copy from other models into a separate Python module for reuse, as I have done in my example.
This felt natural to me since these are Python objects, so I can write functions and classes that give me what I need, and I can keep constants in my data module (by the way, this is what made me use pytm).
Using these functions, classes and constants like normal members of a module felt straightforward, since I can write something like ... inBoundary=company.boundary).

I agree that not all users will be Python programmers, and something like with tm.build(): and the need to indent all lines would drive people mad.

I have started a branch, raphaelahrens:object_context, where I experimented with the idea of putting the context inside the TM object instead of keeping it in the class object.
All other element classes then use TM.context() to get a reference to the current threat model object.
TM.context() returns the object only if the call was not made via an import.
It relies on this function:

import inspect


def _check_if_imported():
    # Get the stack of the caller
    stack = inspect.stack()
    # Get the module of the second-to-last frame, i.e. the original caller
    module = inspect.getmodule(stack[-2][0])
    # Return True only if the call came from a real module,
    # not from some import
    return module is not None

This check runs whenever TM() and TM.context() are called.

This has the benefit that when TM() is called inside an import, TM._context() is not changed.
So you can combine two threat models like this:

#!/usr/bin/env python3

from pytm import (
    TM,
    Dataflow,
    ExternalEntity,
)


t = TM(
    "my test tm",
    ignoreUnused=True,
)

test = ExternalEntity('Test')

print("import")
import tm as example

Dataflow(
    test,
    example.user,
    'flow',
)


if __name__ == "__main__":
    t.process()

Currently the merging is done by setting ignoreUnused=True, since it looks at all dataflows and rebuilds the element list.
The next step would be to write the merging process and to see if it even makes sense to reuse the model via t.merge(example.tm) or so.
Also, the function _check_if_imported() needs to be verified for safety; I guess there are still some corner cases where it returns a wrong value.

@izar (Collaborator) commented Sep 9, 2021 via email

@raphaelahrens (Contributor)

@izar yes, the dataflow will be part of t.
I added the weird-looking import in the middle to make sure the solution would still work with non-PEP 8 code.
As mentioned, there will be corner cases of Python's import behavior that I have missed.

@raphaelahrens (Contributor)

I created pull request #175 with the implementation and some small test cases. It would be nice if someone could verify that I have not interfered with the normal behavior.
