Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce number of calls to wasmTable.grow #23633

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

martenrichter
Copy link

Before, getEmptyTableSlot increased the wasmTable by 1, increasing the overhead if many calls of addFunction exceeded the table length. Now, an exponential scaling strategy is implemented.

#17891

@martenrichter
Copy link
Author

Note: I have tested on a compiled Emscripten module, so I am not sure if the formatting is correct. I hope the test will show it.

@@ -151,15 +151,17 @@ addToLibrary({
}
// Grow the table
try {
var growBy = {{{ from64Expr('wasmTable.length') }}};
freeTableIndexes.push(...Array.from({ length: growBy }, (_, i) => {{{ from64Expr('wasmTable.length') }}} + i));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would use less memory to record the last used entry separately. Also I'm not sure we want to grow the table exponentially, @sbc100 was suggesting linear or quadratic.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I have changed the behavior. Unfortunately, the last commit tests failed, and I did not get a clue what was wrong. (It must be something with the templating.) Maybe the new one is better.
Regarding linear or quadratic scaling, is the initial size of the table somewhere available? We need it to achieve linear or quadratic scaling. I also think it is important that the growth is in a similar order of magnitude as the current size.

@hoodmane
Copy link
Collaborator

hoodmane commented Feb 9, 2025

I was imagining something roughly like:

  $freeTableIndexes: [],
  $usedTableLength: 0, // TODO: initialize this correctly

  // Weak map of functions in the table to their indexes, created on first use.
  $functionsInTableMap: undefined,

  $getEmptyTableSlot__deps: ['$freeTableIndexes', '$wasmTable', '$usedTableLength'],
  $getEmptyTableSlot: () => {
    // Reuse a free index if there is one, otherwise grow.
    if (freeTableIndexes.length) {
      return freeTableIndexes.pop();
    }
    if (usedTableLength < wasmTable.length) {
      return usedTableLength++;
    } 
    // Grow the table
    try {
      /** @suppress {checkTypes} */
      wasmTable.grow({{{ toIndexType('wasmTable.length') }}});
    } catch (err) {
      if (!(err instanceof RangeError)) {
        throw err;
      }
      throw 'Unable to grow wasm table. Set ALLOW_TABLE_GROWTH.';
    }
    return usedTableLength++;
  },

@martenrichter
Copy link
Author

Ok, I try if this works. But I am not completely sure, what needs to be changed in libdylink.js .

@hoodmane
Copy link
Collaborator

hoodmane commented Feb 9, 2025

I think it should be something like:

        var tableBase = metadata.tableSize ? usedTableLength : 0;
        if (handle) {
          {{{ makeSetValue('handle', C_STRUCTS.dso.mem_allocated, '1', 'i8') }}};
          {{{ makeSetValue('handle', C_STRUCTS.dso.mem_addr, 'memoryBase', '*') }}};
          {{{ makeSetValue('handle', C_STRUCTS.dso.mem_size, 'metadata.memorySize', 'i32') }}};
          {{{ makeSetValue('handle', C_STRUCTS.dso.table_addr, 'tableBase', '*') }}};
          {{{ makeSetValue('handle', C_STRUCTS.dso.table_size, 'metadata.tableSize', 'i32') }}};
        }
      } else {
        memoryBase = {{{ makeGetValue('handle', C_STRUCTS.dso.mem_addr, '*') }}};
        tableBase = {{{ makeGetValue('handle', C_STRUCTS.dso.table_addr, '*') }}};
      }

      var tableGrowthNeeded = tableBase + metadata.tableSize - {{{ from64Expr('wasmTable.length') }}};
      if (tableGrowthNeeded > 0) {
          wasmTable.grow({{{ toIndexType('tableGrowthNeeded') }}});
      }
      usedTableLength = tableBase + metadata.tableSize

@hoodmane
Copy link
Collaborator

hoodmane commented Feb 9, 2025

And probably $dumpTable in libdylink.js should use usedTableLength instead of wasmTable.length.

@martenrichter
Copy link
Author

I agree, I came up with similar changes. But I am not familiar with the code....

@martenrichter
Copy link
Author

But if you use dylink, would it not generally interfere, if you have added functions in the meantime...? (Beside this change)

@hoodmane
Copy link
Collaborator

hoodmane commented Feb 9, 2025

Dynamic linking currently works fine with addFunction. The library exports get placed in a block at the end of the wasmTable, ignoring any freelist entries.

@hoodmane
Copy link
Collaborator

hoodmane commented Feb 9, 2025

I think usedTableLength should also be initialized in two places: line 911 of preamble.js if !RELOCATABLE:

#if '$wasmTable' in addedLibraryItems && !RELOCATABLE
    wasmTable = wasmExports['__indirect_function_table'];
#if ALLOW_TABLE_GROWTH
    usedTableLength = wasmTable.length;
#endif
    {{{ receivedSymbol('wasmTable') }}}

and line 2276 of libcore.js:

#if RELOCATABLE
  // In RELOCATABLE mode we create the table in JS.
  $wasmTable: `=new WebAssembly.Table({
  'initial': {{{ toIndexType(INITIAL_TABLE) }}},
#if !ALLOW_TABLE_GROWTH
  'maximum': {{{ toIndexType(INITIAL_TABLE) }}},
#endif
#if MEMORY64 == 1
  'address': 'i64',
   // TODO(sbc): remove this alias for 'address' once both firefox and
   // chrome roll out the spec change.
   // See https://github.com/WebAssembly/memory64/pull/92
  'index': 'i64',
#endif
  'element': 'anyfunc'
});
`,
#if ALLOW_TABLE_GROWTH
   $wasmTable_postset: `usedTableLength = wasmTable.length`,
#endif
#else

But I'm not really sure where to stick the $usedTableLength definition to make that work...

@martenrichter
Copy link
Author

Ok, I was just wondering, as it contains the if clause with one branch being first load. I was wondering, if a wasm was loaded a second time with an increased number of functions. As it recycles the position in the table, it will overwrite the function pointers. If it would be running on the os and not in the sandbox, we would have a vulnerability.

@hoodmane
Copy link
Collaborator

hoodmane commented Feb 9, 2025

Are you saying that you're concerned that addFunction isn't thread safe? It's probably not.

@martenrichter
Copy link
Author

No, I am concerned with the following scenario
1.) You load a lib first with say x functions.
2.) Then you use addfunction with some functions
3.) You load the lib a second time but now it has x +1 functions, and the x+1 will overwrite the function added in 2.
But I am not sure if this would work, I was just wondering because of firstLoad .

@hoodmane
Copy link
Collaborator

hoodmane commented Feb 9, 2025

dynamic linking + pthreads is experimental:
https://emscripten.org/docs/compiling/Dynamic-Linking.html#pthreads-support

I think the safe way to use it is to make sure all library loading and addFunction calls happen before you start any threads.

@hoodmane
Copy link
Collaborator

hoodmane commented Feb 9, 2025

firstLoad is about threads. It's true when one thread has already loaded the library and it is being loaded into a second thread. If you load two different versions of the same library that will probably cause crashes.

@martenrichter
Copy link
Author

Ok, I understand now. But should it check, that it is not having a different number of table entries and throw?

@kg
Copy link

kg commented Feb 9, 2025

With fast table growing it's quite possible to hit the limit for how big a function table can be if your function table already started big. I've heard of applications with tables containing 100k entries or more.

Given that, you probably want to handle the failure case and attempt one retry where you grow by a smaller amount. Otherwise if you have an application with a table that has 400k entries and the size limit is 500k, it'd never be able to grow, even though in practice there's plenty of room to grow.

@martenrichter
Copy link
Author

Ok, good point, a factor of 10 down? Or better 100 ?

@kg
Copy link

kg commented Feb 9, 2025

Your measurements showed that 'grow by 100' and 'grow by 1000' were both big improvements over 'grow by 1'. I'd suggest just falling back to a flat 100 or 1000. 100 is more likely to work, but if an application is so close to the limit that 1000 won't work, it's probably close to broken anyway, so I could understand going with 1000 for better performance. I'd personally pick 100.

// Grow the table
try {
var growBy = lastUnused;
freeTableIndexes.push(...Array.from({ length: growBy - 1 }, (_, i) => lastUnused + i + 1));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably going to be really expensive for large tables. You might want to track free ranges instead of free indices.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This already gone, I have changed it to the suggestions by hoodmane. I will push soon, but I have to fix a failing test.

@sbc100
Copy link
Collaborator

sbc100 commented Feb 9, 2025

No, I am concerned with the following scenario 1.) You load a lib first with say x functions. 2.) Then you use addfunction with some functions 3.) You load the lib a second time but now it has x +1 functions, and the x+1 will overwrite the function added in 2. But I am not sure if this would work, I was just wondering because of firstLoad .

You can't load the same dynamic library twice. If you were to give them different names and load two version of the same library each version will get its own separate table and memory allocation.

@martenrichter
Copy link
Author

Ok, I have added the changes and the test failing before now works. But I did not run the full suite, I am waiting for CI.

@martenrichter
Copy link
Author

Your measurements showed that 'grow by 100' and 'grow by 1000' were both big improvements over 'grow by 1'. I'd suggest just falling back to a flat 100 or 1000. 100 is more likely to work, but if an application is so close to the limit that 1000 won't work, it's probably close to broken anyway, so I could understand going with 1000 for better performance. I'd personally pick 100.

I have now implemented growing by 1000 and then falling back to 10. But other variants are also possible.

@sbc100
Copy link
Collaborator

sbc100 commented Feb 9, 2025

Your measurements showed that 'grow by 100' and 'grow by 1000' were both big improvements over 'grow by 1'. I'd suggest just falling back to a flat 100 or 1000. 100 is more likely to work, but if an application is so close to the limit that 1000 won't work, it's probably close to broken anyway, so I could understand going with 1000 for better performance. I'd personally pick 100.

I have now implemented growing by 1000 and then falling back to 10. But other variants are also possible.

It seems to me that unless you fall all the way back to "grow by 1" you might not ever be able to use the final available slots.

I was going to suggest that instead of implementing falloff we could just query the max length .. but I don't think WebAssembly.Memory exposes that.

@martenrichter
Copy link
Author

Your measurements showed that 'grow by 100' and 'grow by 1000' were both big improvements over 'grow by 1'. I'd suggest just falling back to a flat 100 or 1000. 100 is more likely to work, but if an application is so close to the limit that 1000 won't work, it's probably close to broken anyway, so I could understand going with 1000 for better performance. I'd personally pick 100.

I have now implemented growing by 1000 and then falling back to 10. But other variants are also possible.

It seems to me that unless you fall all the way back to "grow by 1" you might not ever be able to use the final available slots.

I was going to suggest that instead of implementing falloff we could just query the max length .. but I don't think WebAssembly.Memory exposes that.

Ok, then we go with 1 . And yes the spec does not have this function. I could look in the source code of Chromium if this helps, but if it is not in the specs it can change anytime.

sbc100 added a commit to sbc100/emscripten that referenced this pull request Feb 9, 2025
Also add an extra assert that should catch table layout inconsistencies
between threads.

I noticed this could be improved while reviewing emscripten-core#23633.
sbc100 added a commit that referenced this pull request Feb 10, 2025
Also add an extra assert that should catch table layout inconsistencies
between threads.

I noticed this could be improved while reviewing #23633.
@@ -668,6 +669,7 @@ var LibraryDylink = {
#endif
wasmTable.grow({{{ toIndexType('metadata.tableSize') }}});
}
usedTableLength = tableBase + metadata.tableSize;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line should be moved up inside the above if (metadata.tableSize) check, no?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

@martenrichter
Copy link
Author

The new attempt uses a different structure for freeTableIndexes, which uses less memory and may also be used to easily reuse freed indices.

Before, getEmptyTableSlot increased the wasmTable by 1, increasing the overhead if many calls of addFunction exceeded the table length. Now, an exponential scaling strategy is implemented.

Co-authored-by: Alon Zakai <alonzakai@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants