Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BYOC] Exclude external params from Graph Runtime #7564

Closed
wants to merge 1 commit into from

Conversation

d-smirnov
Copy link
Contributor

This patch exculdes external params from created Graph Runtime module
to prevent duplication as the external params will be also serialized
from Metadata module.

This patch exculdes external params from created Graph Runtime module
to prevent duplication as the external params will be also serialized
from Metadata module.

Change-Id: I47eda7e53b85f0b840445b60d33c2939b1cf99a0
@comaniac
Copy link
Contributor

comaniac commented Mar 2, 2021

It seems to me that the parameter name prefix doesn't indicate if it is in the external module or metadata module? Could you show some use cases?

cc @zhiics

@d-smirnov
Copy link
Contributor Author

I understood that the agreement is p[0-9]+ graph_runtime_codegen.cc:319 for internal params and <codegen_name><something_with_sequence_value> (e.g. utils.h, default implemtation uses _const_ suffix though) for param names generated from annotated functions.

@comaniac
Copy link
Contributor

comaniac commented Mar 2, 2021

ConstantUpdater can be customized as introduced in this #6697, so the parameter naming convention is not guaranteed (cc @zhiics, does codegen_name and symbol already the same?) Even it's the agreement, I still have the following two concerns:

  1. The naming convention can be changed anytime, and seems like we have no way to know that we need to change this part accordingly. Specifically, we need a test case if we really need to use the parameter name to decide whether to put the constants to metadata module.

  2. If a customized ConstantUpdater doesn't assign "null", then the intention is still maintaining the constants in the metadata module.

In summary, a safer solution is to have a way to check whether the constant is alaready stored in the extenral module when building a module, but it's not easy and that's part of the reasons that we haven't fixed this issue until now.

@d-smirnov
Copy link
Contributor Author

Agree, However it still looks to me as the agreement to put codegen_name as prefix for const naming. Bth, would it help if for filtering out external consts regexp ^p[0-9]+$ were used? I was cautious of the usage of encoded literal here

@d-smirnov
Copy link
Contributor Author

@comaniac Any suggestions then? My first attempt was to explicitly split params from internal (p prefixed) and external modules in different sets. However the subsequent modifications were quite invasive. Basically my understanding of the issue is graph runtime module should not need external constatns and metadata module should not contain internal constants.

@comaniac
Copy link
Contributor

comaniac commented Mar 2, 2021

Agree, However it still looks to me as the agreement to put codegen_name as prefix for const naming. Bth, would it help if for filtering out external consts regexp ^p[0-9]+$ were used? I was cautious of the usage of encoded literal here

I personally don't think it's a good idea to make decisions depending on variable names. The reason was mentioned in my previous comment. The variable names are supposed to be referred by users for debugging or development only.

@comaniac Any suggestions then? My first attempt was to explicitly split params from internal (p prefixed) and external modules in different sets. However the subsequent modifications were quite invasive. Basically my understanding of the issue is graph runtime module should not need external constatns and metadata module should not contain internal constants.

I don't have a good idea at this moment. It might be a way to extend the solution from @mbaret that assigns "null" pointers to the external module handled parameters to be a more general solution, but we need to be careful. Maybe we should start with an RFC for discussion.

@d-smirnov
Copy link
Contributor Author

d-smirnov commented Mar 2, 2021

I noticed that there is a check implemented for demanding codegen_name to be placed first (utils.h) in case of external codegen uses its own implementation of ConstUpdater. Looks to me now that I am on a safe side with current patch

@comaniac
Copy link
Contributor

comaniac commented Mar 2, 2021

I noticed that there is a check implemented for demanding codegen_name to be placed first (utils.h) in case of external codegen uses its own implementation of ConstUpdater. Looks to me now that I am on a safe side with current patch

Yeha that's why I asked the question about if codegen_name and symbol always the same, but it doesn't solve the problem that even a parameter is named with the external codegen prefix, it's still possible not maintained by the external module.

btw, if the purpose of this PR is going to solve this issue (https://discuss.tvm.apache.org/t/model-graph-size-doubling-after-partitioning-for-arm-compute-library/9131), I think it would be much easier to add an ACL constant updater like what Ethos-N does.

@d-smirnov
Copy link
Contributor Author

d-smirnov commented Mar 2, 2021

I doubt that what was implemented in Ethos-N is applicable to ACL (ACL relies on JSON codegen and Ethos-N uses C codegen). Could you elaborate your proposal?

@comaniac
Copy link
Contributor

comaniac commented Mar 2, 2021

I doubt that what was implemented in Ethos-N is applicable to ACL (ACL relies on JSON codegen and Ethos-N uses C). Could you elaborate you proposal?

I think this is not related to what codegen you are using? Based on the functionality of this PR, what you need is getting rid of some constatnts in the metadata module. Then it should be fine to put this logic to the ACL specific constant updater. @mbaret could you comment whether this proposal would work?

@d-smirnov
Copy link
Contributor Author

d-smirnov commented Mar 2, 2021

The constants removed by this PR are actually from graph runtime module. The metadata constants are kept intact.

The usage of custom ConstUpdater (in its current implementation) seems unrealistic to me due to the updater is called before the metadata and graph runtime modules are created. Metadata module gets all the local consts and the consts fetched via ConstUpdater as one map. Then this map of all constants is copied to graph runtime module

@trevor-m
Copy link
Contributor

trevor-m commented Mar 2, 2021

Thanks @d-smirnov for this fix! I was also trying to find ways to fix the duplicate weights bug but couldn't find any as clean as yours.

To address @comaniac's concern about using the names, could we use some sort of pass like free_vars to prune params and remove any params that aren't used in the module that is being compiled?

Also, this problem also affects relay vm. Can we do the same thing for VM?

@zhiics
Copy link
Member

zhiics commented Mar 3, 2021

I think this is just a work around. One way I can think of is that we can make graph runtime similar to some other runtimes. Therefore, it is able to query the needed parameters. I had a local draft commit to do this some time ago but I could not find it now.

@mbaret
Copy link
Contributor

mbaret commented Mar 9, 2021

I doubt that what was implemented in Ethos-N is applicable to ACL (ACL relies on JSON codegen and Ethos-N uses C). Could you elaborate you proposal?

I think this is not related to what codegen you are using? Based on the functionality of this PR, what you need is getting rid of some constatnts in the metadata module. Then it should be fine to put this logic to the ACL specific constant updater. @mbaret could you comment whether this proposal would work?

Apologies @comaniac that I missed this earlier. We did explore using the same fix as for Ethos-N (giving a 'null' ConstantUpdater), but this doesn't work for ACL as ACL does load the constants from MetadataModule. Ethos-N just directly serializes the constants into the binary module so it doesn't care if they're missing in the MetadataModule.

This problem seems quite fundamental to me and shouldn't just affect ACL, but any BYOC target that tries to make use of MetadataModule.

I think this is just a work around. One way I can think of is that we can make graph runtime similar to some other runtimes. Therefore, it is able to query the needed parameters. I had a local draft commit to do this some time ago but I could not find it now.

Given this is a user-facing issue (https://discuss.tvm.apache.org/t/model-graph-size-doubling-after-partitioning-for-arm-compute-library/9131) and we're very interested in getting a fix in soon, could you elaborate on your alternative proposal @zhiics?

@mbaret
Copy link
Contributor

mbaret commented Mar 22, 2021

Ping @comaniac @zhiics, what can we do to make progress here?

@zhiics
Copy link
Member

zhiics commented Mar 23, 2021

The approach I was thinking is that we can probably use the MetadataModule as the only place to save the weights. Other modules including GraphRuntimeModule can directly query the needed parameters from it (external modules have done that).

The way we filter the redundant parameters in the this PR can be used by users after the build (i.e. we can document it somewhere) but should not be part of the build codebase in my opinion.

@comaniac
Copy link
Contributor

We did explore using the same fix as for Ethos-N (giving a 'null' ConstantUpdater), but this doesn't work for ACL as ACL does load the constants from MetadataModule. Ethos-N just directly serializes the constants into the binary module so it doesn't care if they're missing in the MetadataModule.

Could we improve ACL's customized runtime so that it could know if the constant should load from MetadataModule or itself? To me, Ethos-N is just an extreme case that completely gets rid of constants in MetadataModule, but constant updater should be capable of supporting all three cases: all in MetadataModule, some in MetadataModule, and none in MetadataModule.

@d-smirnov
Copy link
Contributor Author

Closed as superseded with #7977

@d-smirnov d-smirnov closed this May 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants