Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FML's ChangeTracker and Forge's ModConfigSpec implementations can cause infinite log spam/disk writes/shutdown hangs #226

Open
LeadAssimilator opened this issue Dec 14, 2024 · 5 comments

Comments

@LeadAssimilator
Copy link

LeadAssimilator commented Dec 14, 2024

Minecraft Version: ALL

NeoForge Version: ALL

FML Version: ALL

Logs:

[12Dec2024 06:27:33.288] [pool-13-thread-1/WARN] [net.neoforged.neoforge.common.ModConfigSpec/CORE]: Configuration LoadedConfig[...] is not correct. Correcting
[12Dec2024 06:27:35.281] [pool-13-thread-1/WARN] [net.neoforged.neoforge.common.ModConfigSpec/CORE]: Configuration LoadedConfig[...] is not correct. Correcting

Steps to Reproduce:

  1. Make a mod with an incorrect ModConfigSpec whose default value doesn't pass its own validation function.
  2. Ensure fml.toml disableConfigWatcher = false (should be the default)
  3. Observe infinite config correcting cycle

Description of issue:

If a ModConfigSpec is incorrectly defined by a mod then an infinite config correction cycle may occur about every 2 seconds. This cycle has major consequences:

  1. constant log spam
  2. constant disk writes
  3. multiple .toml.bak files created (thankfully capped at 5)
  4. difficulty manually correcting the config (if even possible) due to racing against the config system writes and editors detecting the file changing underneath
  5. prevents graceful server shutdown (jvm must be killed most of the time) Related but different root cause - Fix server stop/shutdown hang by stopping the FileWatcher when unloading configs. #227
  6. goes unnoticed by both mod and pack creators

Triage:

Mods typically mess up their ModConfigSpec by using the wrong define function or overload of the builder, or specifying a default value or default supplier that provides a default value which does not pass the supplied validation function or implicit validation function. Most recently, I've seen use of defineList with a default value of an empty list instead of using defineListAllowEmpty from various mods. And another recent occurrence with Mekanism is a bit more complicated, where there is a dependency between different config values such that changing one value reduces the allowed range of another making its default invalid. While the defaults were self-consistent in this case, a pack author tried to change one value which invalidated another and that invalidated value's default was now also invalid due to the dependency. For complex configurations with dependencies, it can actually be quite difficult to get it right in all cases, especially since forge tries to individually correct specific values without any knowledge of the dependency tree.

While the underlying trigger for this problem is caused by mods, forge's config correction is in error by not preventing the cycle. When forge encounters an incorrect config toml, it attempts to correct it by rebuilding the config and resetting individual broken values to their defaults. If the defaults are specified incorrectly by the mod or the defaults for a specific value have a complex interdependency tree between other values that cannot be reasonably or correctly evaluated in all cases, then the newly "corrected" config won't actually be correct and will fail validate. After the newly "corrected" config is built it is saved. If fml disableConfigWatcher = false which is the default, the newly "corrected" config file will be picked up by the watcher, and loaded. On load, it will fail validate and cause another config correction cycle.

Proposed fix:

  1. Run the config validation step on an in-memory copy of the corrected config toml BEFORE writing it to disk.
  2. If the validation fails after correction on the in-memory copy, log an error and do NOT write out the new file to prevent the watcher from reloading it starting the cycle again.
  3. Optionally consider making this error FATAL and stop/shutdown - it would help point 6 above and ensure mod and pack authors check their work since it is painfully obviously something is wrong when the server disappears rather than an ignorable log warning.
  4. Optionally write the in-memory copy that failed validation out with a different extension like ".bkn" to aid debug while still preventing the watcher from causing the cycle.

This is dual reported to NeoForge since both projects may need work to address this issue. See neoforged/NeoForge#1768.

@LeadAssimilator
Copy link
Author

In looking into implementing the proposed fix, I realized that FML cannot get feedback as to which config values are incorrect as IConfigSpec.correct lacks a form which takes CorrectionListener that is implemented in NeoForge. While not absolutely required, without such feedback, making an invalid change to a config leaves the user guessing as to what is wrong. Currently the feedback is given indirectly through NeoForge's IConfigSpec.acceptConfig implementation as called by ModConfig.setConfig from ConfigTracker.loadConfig, but that is after the broken/uncorrectable config is written, which I think should be blocked.

So, an ideal fix definitely will require some coordination with NeoForge. Perhaps NeoForge's IConfigSpec.isCorrect impl should be emitting messages as to what is incorrect or IConfigSpec needs changed to add a new method for this purpose.

Another more convoluted solution is for ConfigTracker.loadConfig to call ModConfig.setConfig instead of IConfigSpec.correct in the try block. Since NeoForge's IConfigSpec.acceptConfig impl calls isCorrect and correct followed by save, it will provide the required feedback. Then if FML's LoadedConfig.save also asserts IConfigSpec.isCorrect before writing the config, most of the rest of the code can work as is. The downsides are isCorrect will be called 3x when editing a config file, it can be fragile, and it is otherwise a hot mess.

@Technici4n
Copy link
Member

Can we maybe check in FML that the corrected config is now correct? If not, we log an error and don't correct it.

@LeadAssimilator
Copy link
Author

Yes, that is what I was proposing. I have been testing an implementation of that where isCorrect is again called after correct and if it is still false, then throw an exception. But in doing, so I discovered as stated above, that you won't get feedback as to what is exactly wrong with the config - i.e. which config value was wrong and what it was attempting to be corrected to outside of the call to IConfigSpec.acceptConfig such as this 09:39:21] [pool-11-thread-1/WARN] [ne.ne.ne.co.ModConfigSpec/CORE]: Incorrect key materials.refined_obsidian.refined_obsidianHoeDamage was corrected from -8.0 to its default, -8.0. This seems to be an error.

I haven't been able to come up with any brilliant ideas that don't involve either changing the IConfigSpec interface or the convoluted method of calling setConfig instead of correct and writeConfig, which ends up calling acceptConfig and relies on NF's impl to provide the needed feeback, plus calls save which goes back into FML where the actual write can be blocked. It does work in my testing, but it just doesn't feel right.

@Technici4n
Copy link
Member

I'd say it's better to come up with a simple solution. I think that log you just posted is fine. Hard error seems good to me, it will force modders to correct their code.

@TelepathicGrunt
Copy link
Contributor

As long as the hard error is dev environment only, that would be ideal imo. Providing a default value that fails the configs own validation is well, an invalid state for the modder to do

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants