-
Notifications
You must be signed in to change notification settings - Fork 851
Description
While testing 9.2.x, I accidentally created an invalid remap.config file. In previous releases, ATS would fail to start if it loaded an invalid remap.config file and it would emit a FATAL log message. In 9.2.0 (not released yet), this is no longer the case: ATS starts, albeit with ERROR messages.
This change looks intentional per the following PR: #7782
However we might want to reconsider this because this is a pretty significant change in behavior that can lead to confusing 404 responses for the user. Consider the following possible scenario:
- An admin creates a problematic remap.config entry near the end of the file, maybe 95% down.
- They restart ATS. In 9.1, ATS would refuse to come up with a FATAL message. In 9.2, it now emits an ERROR but does start with the previous remap entries active in its running state.
- Since the majority of the remap.config entries were parsed, their smoke testing shows that the remap.config file seems to work. Remapping works fine in general, and unless they test entries at or after their corrupted remap.config entry, remapping will seem to work.
- This runs in prod for a while and the diags.log file is rotated.
- The admin gets reports about unexpected 404 errors due to the few entries near the bottom of the remap.config file.
In this situation, he does not see diags.log errors anymore because they only happened after ATS restart. The corruption in the remap.config may not be obvious. In fact, particular entries may be fine with their only problem being that they come after the corrupted entry, which may be many lines before that for which the current 404 responses correspond. Thus for large remap.config files (ours are thousands of lines long), a corrupted entry may be almost impossible to detect visually.
Perhaps it's preferable for reloads to be more lenient, while keeping initial remap.config loading strict?