Please reduce the size of this module bundle by eliminating documentation bloat and sharing code inside of nested modules #428

Closed
KirkMunro opened this issue Oct 15, 2020 · 4 comments · Fixed by #566
KirkMunro commented Oct 15, 2020

Today the Graph module bundle is far too big. Its size slows installation and has a large performance impact when the modules are deployed into containers or serverless execution environments, where size really matters.

To give some perspective:

The Az module bundle consists of 62 modules that define 3871 commands, and requires 180MB of disk space.
The AWS.Tools module bundle consists of 213 modules that define 7904 commands, and requires 261MB of disk space.
The Microsoft.Graph module bundle consists of 33 modules that define 4254 commands, and requires 947MB of disk space.
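
To put those three figures on a common scale, here is a minimal Python sketch that computes the average disk footprint per command. It uses only the numbers quoted above. The `kb_per_command` helper is a name made up for illustration, and it assumes 1 MB = 1024 KB.

```python
# Figures quoted in the comment: (modules, commands, size on disk in MB).
bundles = {
    "Az": (62, 3871, 180),
    "AWS.Tools": (213, 7904, 261),
    "Microsoft.Graph": (33, 4254, 947),
}


def kb_per_command(commands: int, size_mb: float) -> float:
    """Average on-disk footprint per command in KB (assumes 1 MB = 1024 KB)."""
    return size_mb * 1024 / commands


for name, (_modules, commands, size_mb) in bundles.items():
    print(f"{name:16} {kb_per_command(commands, size_mb):7.1f} KB per command")
```

By this measure Microsoft.Graph works out to roughly 228 KB per command, against about 48 KB for Az and about 34 KB for AWS.Tools.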

Clearly you can see that the Microsoft.Graph module contains a ton of bloat that is unnecessary and can be reduced through smarter usage of DLLs and nested modules, as well as smarter documentation efforts. There is no justification for this module bundle to consume close to 1GB of disk space.

Looking more closely at the Graph module bundle, here is the size breakdown by file extension:

Extension Size
--------- ----
.dll      329.63 MB
.md       301.53 MB
.ps1      298.04 MB
.ps1xml   11.32 MB
.json     2.52 MB
.psm1     1.28 MB
.psd1     0.38 MB
.p7s      0.29 MB
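
A breakdown like the one above can be reproduced with a short script. The Python sketch below is not the command used to produce the table (the comment does not say how it was generated). It simply walks a directory tree and sums file sizes per extension. The function names and the root path you pass in are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def size_by_extension(root: str) -> dict[str, int]:
    """Sum file sizes in bytes under root, grouped by lowercased extension."""
    totals: dict[str, int] = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            totals[path.suffix.lower()] += path.stat().st_size
    return dict(totals)


def print_breakdown(root: str) -> None:
    """Print one line per extension, largest first, with sizes in MB."""
    totals = size_by_extension(root)
    for ext, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{ext or '(none)':10} {size / 1024**2:8.2f} MB")
```

Calling `print_breakdown` with the folder that holds the installed Microsoft.Graph modules would print one line per extension. The exact numbers depend on the module version installed.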

Looking at the .dll sizes, I cannot determine what could be done to reduce the size of those because each module produces a single dll. There is zero doubt, however, that those dlls contain a lot of shared library code, because the unique portion of what they do is very, very simple. With that in mind, and with some programmatic rearchitecture, you should be able to pull out the shared logic into a common DLL or set of DLLs that are stored in a single Microsoft.GraphShared module that is then added as a nested module to each of the 32 Graph modules that use it. That should also result in a tremendous reduction in required disk space, likely cutting the required space for DLLs by 50-66%, or more.

If you look at the documentation (the documentation portion of the .ps1 files and all of the .md files), it consumes the bulk of the 600MB of disk space shared between those two file extensions. The .ps1 files contain comment-based help that includes far too much information, to the point where you cannot properly consume it in PowerShell itself. New-MgTeam, for example, contains 8732 lines of documentation for the v1 version and 16906 lines for the beta version. That documentation is then duplicated in the corresponding markdown documents, where it takes up even more space.

In reviewing this documentation it is clear that the bulk of overconsumption of disk space comes from two design issues:

  1. Duplication of information

    You should ship comment-based help or markdown files, but not both -- just make sure it is consumable and reasonably useful when displayed using PowerShell's Get-Help cmdlet.

  2. Inclusion of technical details that should be provided online

    In the .NOTES section, you provide details with a label of COMPLEX PARAMETER PROPERTIES -- those details are necessary, but they only need to be online, with a reference from PowerShell command documentation in the NOTES section.

I haven't done the exact math, but I bet that correcting those two design problems alone would cut your module bundle size by at least half, reducing the documentation size by 500MB or more.

Reducing DLL size by isolating shared components, generating documentation more carefully, and moving the space-consuming deep technical details to online documentation sites should together bring the disk space required by the Microsoft.Graph module bundle down to below 300MB, possibly significantly below.
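
As a rough check of that estimate, the arithmetic can be sketched in Python using only figures quoted in this comment. It assumes the two savings are independent and simply add up, which is a simplification. The `projected_total` helper is a hypothetical name.

```python
# Figures in MB taken from this comment.
TOTAL_MB = 947.0          # quoted size of the Microsoft.Graph bundle
DLL_MB = 329.63           # .dll row of the size breakdown
DOC_SAVING_MB = 500.0     # "reducing the documentation size by 500MB or more"
DLL_CUT_RANGE = (0.50, 0.66)  # "cutting the required space for DLLs by 50-66%"


def projected_total(dll_cut: float) -> float:
    """Bundle size in MB after the documentation saving and a fractional DLL cut.

    Assumption: the two savings are independent and additive.
    """
    return TOTAL_MB - DOC_SAVING_MB - DLL_MB * dll_cut


for cut in DLL_CUT_RANGE:
    print(f"DLL cut of {cut:.0%}: about {projected_total(cut):.0f} MB")
```

Under those assumptions the bundle lands at about 282 MB with a 50% DLL cut and about 229 MB with a 66% cut, both under the 300MB figure.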

If AWS can ship a module bundle that defines almost 8K cmdlets in just 261MB of disk space, surely the Microsoft Graph PowerShell team can ship a module bundle that defines just over 4K cmdlets in less than 300MB (and I'm being conservative with that estimate).

Please, please take on these disk space cutting efforts. In my organization we are using the Graph module bundle inside of Windows containers, and the size of this module bundle has a strong negative impact on container deployment time.

cc: @darrelmiller
AB#7402

@ghost ghost added the ToTriage label Oct 15, 2020
@KirkMunro
Author

Two more comments on this:

  1. This is really an important issue to solve, because you support multiple versions of the Graph API, and produce tremendous bloat through duplication of documentation across those versions. If a new version gets added at some point (2.0, for example), you'll suddenly have a 50% increase in disk space requirements.

  2. PowerShell supports updatable help. There is no need to ship help documentation in modules that are used in unattended workloads, and Update-Help can be used to download documentation for modules onto systems where that documentation is needed. That gives you an opportunity to remove the documentation from the default module bundle entirely, so that there is no in-box help until you run Update-Help. With that approach, users who want in-box help could still get it, and if you also move shared logic into a nested module to reduce the DLL bloat, your module bundle disk space requirements could drop below 200MB, which would be a huge win.

@MIchaelMainer
Contributor

This work is underway.

@finsharp

@ghost ghost removed the ToTriage label Oct 15, 2020
@JeremyTBradshaw

@KirkMunro just saying hello as I noticed you're nearly local to me... and I second this so am subscribing to the issue.

@KirkMunro
Author

@finsharp: Would you mind sharing the size of the module bundle with this change in place?

I'm asking because there were a number of points called out here related to size, and I'd like to know how much the new module bundle size has decreased with this change. I suspect I'll think it's a good start, but that there is still more work to be done, in which case closing this issue may be premature (although the additional changes required could be copied out of here into a new issue if that is preferred).

Anyway, I won't know what I think until I learn how much the module bundle has shrunk, so please share.
