Modernize Javascript bundling in LORIS #9424

Open
maximemulder opened this issue Oct 25, 2024 · 2 comments

maximemulder commented Oct 25, 2024

This issue presents my opinion on the way we bundle Javascript files in LORIS, why I think our approach is flawed, and how I think it should be improved. This issue proposes a lot of different changes, which can be discussed and implemented on an individual basis.

This is a controversial topic that should be discussed at a LORIS meeting.

Current structure

Currently, in LORIS, there are source Javascript files in the following directories:

  • htdocs/js/: Some minified libraries and old scripts.
  • jslib/: A single Javascript file that is imported by some module Javascript files.
  • jsx/: Some React components that are used in the main template, as well as some React components that are imported by other components.
  • modules/[module]/js/: Javascript scripts used by a given module.
  • modules/[module]/jsx/: React components used by a given module.

When building LORIS with Webpack, the following files are produced (minified and accompanied by a source map):

  • Module React components in modules/[module]/jsx/ are bundled to modules/[module]/js/.
  • The Javascript file under jslib/ is directly bundled in the files that import it.
  • The React components in jsx/ are bundled to htdocs/js/components/.

Javascript files are served to the client using the following pattern:

  • Files in htdocs/js/ (and htdocs/js/components) are served directly by Apache, without going through the LORIS PHP code and routers.
  • Files in modules/[module]/js/ are served through the LORIS PHP code and routers, which notably makes it possible to withhold the files of disabled modules.

Current directory structure template (source and compiled):

htdocs/
    js/
        components/
            CommonComponent.js
            CommonComponent.js.map
        script.js
jslib/
    commonLib.ts
jsx/
    CommonComponent.tsx
modules/
    module_a/
        js/
            moduleScript.js
            ModuleComponent.js
            ModuleComponent.js.map
        jsx/
            ModuleComponent.tsx
        .gitignore

Problems of the current structure

There are several problems with the current approach. A few minor problems are:

  • Each module must add its compiled Javascript files to its own .gitignore, which is notably annoying when changing branches with leftover compiled files.
  • There is a lot of redundancy:
    • There are three directories to place source common Javascript files: htdocs/js/, jslib/ and jsx/.
    • There are two directories to place module source Javascript files: modules/[module]/js/ and modules/[module]/jsx/.
  • Libraries used in bundled files and not marked as external are included directly in each file that includes them, which can cause duplication and large file sizes.
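The duplication described in the last point is usually handled in one of two ways, sketched below as a webpack configuration fragment. `externals` and `optimization.splitChunks` are standard webpack options, but the values shown (the `React` and `ReactDOM` globals, the `vendor` chunk name) are assumptions for illustration and not the current LORIS configuration.

```typescript
// Hypothetical fragment of a webpack configuration (not the actual LORIS one).
const sharedLibraryConfig = {
  // Option 1: libraries listed here are not copied into each bundle. Webpack
  // expects them to exist at runtime as the named global variables instead.
  externals: {
    'react': 'React',
    'react-dom': 'ReactDOM',
  },
  optimization: {
    // Option 2: move code shared by several entry points into separate
    // chunks instead of duplicating it in every bundle.
    splitChunks: {
      chunks: 'all',
      cacheGroups: {
        vendor: {
          // Group everything that comes from node_modules.
          test: /[\\/]node_modules[\\/]/,
          name: 'vendor',
        },
      },
    },
  },
};
```

With the second option, every page that loads a bundle must also load the shared chunk, so the templates that emit the script tags would need to change accordingly.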

A more fundamental problem is that our approach to bundling and serving Javascript files is archaic IMO: going through PHP to serve Javascript files is not how bundlers are meant to be used nowadays. This results in code and processes that are unnecessarily complex and that do not follow the field's best practices (which is notably confusing for newcomers).

Typical modern structure

Modern web projects tend to follow two design principles with regards to Javascript files and bundling:

  • Separate the front-end and the back-end (as much as reasonable).
  • Place all the compiled code in a single directory (usually named dist).

This design philosophy has several advantages:

  • Simpler structure: All the Javascript files served to the client are located in a single directory, which contains no source file and is added to the .gitignore.
  • Simpler code: All the Javascript files are served directly by the web server, without going through the back-end code, which simplifies the back-end.
  • Simpler responsibilities: All the Javascript files are public. Permissions should only be enforced for back-end endpoints [1].

New structure proposal

I propose several changes to gradually improve the LORIS front-end structures. These changes can be discussed and applied individually:

  • Enforce that JSX appears only in .jsx / .tsx files, not in .js or .ts files.
  • Do not segregate JS and JSX directories, or rather include JSX directories inside JS directories (example: .../js/components/).
  • Bundle all the Javascript files in a single public directory (probably htdocs/js/dist). No compiled Javascript file should appear outside of this directory. These files will be served by Apache and not go through the LORIS router.
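The first point can be checked automatically. This is a minimal sketch, assuming the project lints with ESLint and eslint-plugin-react, whose `react/jsx-filename-extension` rule restricts which file extensions may contain JSX. The variable name is hypothetical. For TypeScript sources, the compiler already rejects JSX syntax in `.ts` files, so the rule mostly matters for `.js` files.

```typescript
// Hypothetical rules fragment to merge into the project's ESLint configuration.
const jsxExtensionRules = {
  // Report JSX found in any file whose extension is not listed here.
  'react/jsx-filename-extension': ['error', { extensions: ['.jsx', '.tsx'] }],
};
```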

Ideally, the resulting structure would look like this:

htdocs/
    js/
        dist/
            modules/
                module_a/
                    moduleScript.js
                    moduleScript.js.map
                    components/
                        ModuleComponent.js
                        ModuleComponent.js.map
            components/
                CommonComponent.js
                CommonComponent.js.map
            commonScript.js
            commonScript.js.map
    .gitignore
js/
    components/
        CommonComponent.tsx
    commonScript.ts
modules/
    module_a/
        js/
            components/
                ModuleComponent.tsx
            moduleScript.ts
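The mapping between source files and compiled files in the layout above can be sketched as a small function. The output root comes from the proposal (htdocs/js/dist); the function name and the exact rules are assumptions made for illustration, not an existing LORIS helper.

```typescript
// Output root taken from the proposal above.
const DIST_ROOT = 'htdocs/js/dist';

// Map a source file to its compiled location under the proposed layout.
// Returns null for paths that are not front-end sources in this layout.
function distPath(sourcePath: string): string | null {
  // Compiled files use a .js extension whatever the source extension was.
  const toJs = (path: string): string => path.replace(/\.(tsx?|jsx?)$/, '.js');

  // modules/[module]/js/<rest> -> htdocs/js/dist/modules/[module]/<rest>
  const moduleMatch = sourcePath.match(/^modules\/([^/]+)\/js\/(.+)$/);
  if (moduleMatch) {
    const [, moduleName, rest] = moduleMatch;
    return toJs(`${DIST_ROOT}/modules/${moduleName}/${rest}`);
  }

  // js/<rest> -> htdocs/js/dist/<rest>
  const commonMatch = sourcePath.match(/^js\/(.+)$/);
  if (commonMatch) {
    return toJs(`${DIST_ROOT}/${commonMatch[1]}`);
  }

  return null;
}
```

For example, `distPath('modules/module_a/js/components/ModuleComponent.tsx')` gives `htdocs/js/dist/modules/module_a/components/ModuleComponent.js`, and `distPath('js/commonScript.ts')` gives `htdocs/js/dist/commonScript.js`. A bundler's entry map could be generated from such a function rather than maintained by hand.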

Notes

What about modularity?

Some may think that the proposed change goes against the modularity of LORIS, notably because serving Javascript files directly with Apache overrides the enabled module check present in PHP. However, I argue the following:

  1. Is there any benefit to this behaviour? The user does not see links to modules they do not have access to anyway, and Javascript files alone (that is, without back-end access to the disabled modules) cannot do anything by themselves.
  2. This view of the server providing or restricting access to front-end files is outdated IMO. Modern web applications are often single-page applications (SPA), and although LORIS is not an SPA, decoupling the front-end from the back-end is still a modern web principle that LORIS should follow.

IMO, true modularity would be to not even include the files of disabled modules in the bundle. But I do not think that is worth the effort at this point.

What about security?

Front-end files such as Javascript files should not contain any sensitive data or any means of performing an unauthorized operation. By enforcing permissions only for back-end endpoints, we reduce the area we have to cover, which may even increase security. Note that the back-end might still send the results of the permission checks to the front-end so that it knows which interface to show to the user (example: which modules the user has access to).
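The last point can be sketched as follows. The `PermissionSummary` shape and the function name are hypothetical, not an existing LORIS API; the idea is only that the front-end uses the permission results to decide what to display, while enforcement stays in the back-end.

```typescript
// Shape of a hypothetical permission summary returned by the back-end.
interface PermissionSummary {
  // Names of the modules the current user may access.
  accessibleModules: string[];
}

// Choose which module links to display. This only shapes the interface:
// the back-end endpoints must still enforce the same permissions.
function visibleModules(
  allModules: string[],
  permissions: PermissionSummary,
): string[] {
  const allowed = new Set(permissions.accessibleModules);
  return allModules.filter((name) => allowed.has(name));
}
```

A user who bypasses this filter by reading the public Javascript would reach a page whose requests the back-end refuses, which is the situation the footnote describes.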

What about project overrides?

Project overrides can either be compiled to /htdocs/js/dist/modules/[file].js (replacing the original compiled file) or /htdocs/js/dist/project/modules/[file].js (using a project directory).
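With the second option, whatever emits the script URL has to prefer the project file when it exists. This is a minimal sketch of that lookup, using the two paths named above; the function name and the injected `exists` check are assumptions for illustration.

```typescript
// Resolve which compiled file to use for a module script, assuming that
// overrides live under a separate project/ directory (the second option).
// `exists` is injected so the sketch does not depend on a real file system.
function resolveCompiledFile(
  relativePath: string,
  exists: (path: string) => boolean,
): string {
  const overridePath = `/htdocs/js/dist/project/modules/${relativePath}`;
  const defaultPath = `/htdocs/js/dist/modules/${relativePath}`;
  // Prefer the project override when one was compiled.
  return exists(overridePath) ? overridePath : defaultPath;
}
```

The first option needs no lookup at all, since the override simply replaces the original compiled file, at the cost of no longer being able to tell the two apart in the output directory.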

Footnotes

  1. Independently of the permissions, this does not change the fact that the front-end code should not show links to pages a user does not have access to. Those pages are theoretically accessible to any user with some reverse engineering, but they would be non-functional as the back-end would not serve data a user does not have access to.

@ridz1208
Collaborator

Let me start with this... I'm really not an expert in front end code organization and best practices so I may be way off here.

I see some of the advantages of what you are proposing and I don't have a problem with most of it, but I'm a bit on the fence about serving all the JS regardless of which modules you have access to. First, because some of our Javascript still offers insight into how we do things and could maybe be leveraged by hackers if it's just available on the login page, whereas now you would need to at least be logged in and have access to the module to see the JS code itself (think GUID generation and storage algorithms potentially exposing the source and destination of the data). YES the endpoints have permissions, YES we can modify the code to only provide the data once the module is loaded and not just off the bat... but still, that's more info out there than is necessary... my 2 cents.

As for the rest, as I understand it the JSX/TSX code remains under the module directory, so I'm okay with that, and I'm okay with centralizing the compiled code... but other than the annoyance of adding .gitignore entries, I don't really see gains that justify the work, unless you tell me that compiling the code will be 10x faster or bring some other great speed improvement.

maximemulder commented Oct 31, 2024

Regarding the most controversial change, "compile everything in one place": it is my "ideal" vision for LORIS, as it would allow LORIS to one day be converted to a single-page application (which, I repeat, is the standard for modern web projects), but I guess that may be overly ambitious for now. If I am to take things step by step, it is probably better to have a local dist folder in each module, and have all the source code live in the module js directory. The .gitignore could then simply contain modules/*/dist/.

On another note, I recently spent about two hours experimenting with another bundler for LORIS. I managed to build LORIS in 50 s to 1 min with Vite, which is about 1.5x to 2x faster than the current build but less of an improvement than I had hoped for. I guess Rust-based bundlers (which Vite intends to move to some day) may be able to significantly improve build times, but those are less mature options for now, so I don't think I'll be trying them.
