Add a patch for RubyInstaller to avoid crash on start up #620

Merged
kenhys merged 1 commit into master from avoid-crash-on-non-ascii-registry
Feb 22, 2024

Conversation

ashie
Member

@ashie ashie commented Feb 16, 2024

When a key with a non-ASCII name exists under the registry key SOFTWARE/Microsoft/Windows/CurrentVersion/Uninstall/, Fluentd fails to start its workers because of Encoding::UndefinedConversionError. This patch avoids the issue.

Fix #616
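
For readers unfamiliar with the error class, here is a minimal Ruby sketch of how `Encoding::UndefinedConversionError` arises in general. It is not the patch itself, which is not shown in this thread. The sample name and the `US-ASCII` target encoding are assumptions chosen only to trigger the error; the actual conversion on Windows may involve a different encoding pair.

```ruby
# Hypothetical subkey name containing non-ASCII characters (UTF-8 string).
name = "バッファロー らくらくアップデート"

# Transcoding to an encoding that cannot represent these characters raises.
error = nil
begin
  name.encode("US-ASCII")
rescue Encoding::UndefinedConversionError => e
  error = e
end
puts error.class # Encoding::UndefinedConversionError

# One defensive option (an assumption, not necessarily what the patch does):
# substitute every unrepresentable character instead of raising.
safe = name.encode("US-ASCII", invalid: :replace, undef: :replace, replace: "?")
puts safe.ascii_only? # true
```

Replacing characters loses information, so it is only appropriate when the name is used for display or for a best-effort lookup.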

@ashie ashie force-pushed the avoid-crash-on-non-ascii-registry branch from 0bb59ad to 3081471 Compare February 16, 2024 05:41
@daipom daipom self-requested a review February 16, 2024 05:41
@ashie ashie force-pushed the avoid-crash-on-non-ascii-registry branch 3 times, most recently from b7d7c3d to 592fc53 Compare February 16, 2024 07:31
@ashie
Member Author

ashie commented Feb 16, 2024

Hmm, this patch breaks the MSYS2 lookup during the build 🤔

@ashie ashie force-pushed the avoid-crash-on-non-ascii-registry branch 2 times, most recently from ff55475 to 7c937a3 Compare February 16, 2024 09:43
fluent-package/msi/build.bat (review comment, outdated and resolved)
@ashie ashie force-pushed the avoid-crash-on-non-ascii-registry branch from 7c937a3 to c4b4056 Compare February 20, 2024 03:37
@ashie
Member Author

ashie commented Feb 20, 2024

I've confirmed that this patch fixes the issue.

@ashie ashie marked this pull request as ready for review February 20, 2024 09:39
@ashie ashie added this to the 5.0.3 milestone Feb 20, 2024
@ashie ashie requested a review from kenhys February 21, 2024 06:09
When a non-ASCII key exists under the registry key
`SOFTWARE/Microsoft/Windows/CurrentVersion/Uninstall/`, Fluentd fails to
start workers due to `Encoding::UndefinedConversionError`.
This patch avoids the issue.

Fix #616

Signed-off-by: Takuro Ashie <ashie@clear-code.com>
@ashie ashie force-pushed the avoid-crash-on-non-ascii-registry branch from c4b4056 to f651dac Compare February 22, 2024 01:10
@ashie
Member Author

ashie commented Feb 22, 2024

We observed that the fluentd supervisor process exited unexpectedly after about one hour of repeated recovery attempts. During that time the number of open handles kept increasing and eventually exceeded 8,400.

We got the following backtrace when the supervisor process exited.

2024-02-21 15:34:23 +0900 [debug]: fluent/log.rb:341:debug: Got Win32 event "fluentd_7928_STOP_EVENT_THREAD"
Unexpected error undefined method `pid' for nil:NilClass
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/supervisor.rb:417:in `after_start'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_spawn_server.rb:77:in `ensure in start_worker'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_spawn_server.rb:77:in `start_worker'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:175:in `delayed_start_worker'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:159:in `restart_worker'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:125:in `block in keepalive_workers'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:102:in `each'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:102:in `each_with_index'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:102:in `keepalive_workers'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:58:in `run'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_spawn_server.rb:50:in `run'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/server.rb:128:in `main'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/daemon.rb:119:in `main'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/daemon.rb:68:in `run'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/supervisor.rb:796:in `supervise'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/supervisor.rb:582:in `run_supervisor'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/command/fluentd.rb:352:in `<top (required)>'
  <internal:C:/opt/fluent/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
  <internal:C:/opt/fluent/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
  C:/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/bin/fluentd:15:in `<top (required)>'
  C:/opt/fluent/bin/fluentd:32:in `load'
  C:/opt/fluent/bin/fluentd:32:in `<main>'

In the log above, the root cause is masked by the error raised in the `ensure` clause, so I captured an additional backtrace for the original exception:

#<Errno::EMFILE: Too many open files - dup>
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/process_manager.rb:190:in `spawn'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/supervisor.rb:413:in `spawn'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_spawn_server.rb:75:in `start_worker'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:175:in `delayed_start_worker'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:159:in `restart_worker'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:125:in `block in keepalive_workers'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:102:in `each'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:102:in `each_with_index'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:102:in `keepalive_workers'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_worker_server.rb:58:in `run'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/multi_spawn_server.rb:50:in `run'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/server.rb:128:in `main'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/daemon.rb:119:in `main'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/serverengine-2.3.2/lib/serverengine/daemon.rb:68:in `run'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/supervisor.rb:796:in `supervise'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/supervisor.rb:582:in `run_supervisor'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/command/fluentd.rb:352:in `<top (required)>'
<internal:C:/opt/fluent/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
<internal:C:/opt/fluent/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
C:/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/bin/fluentd:15:in `<top (required)>'
C:/opt/fluent/bin/fluentd:32:in `load'
C:/opt/fluent/bin/fluentd:32:in `<main>'
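
The masking visible in the two backtraces can be reproduced with a small Ruby sketch. It is not ServerEngine's or Fluentd's code: `spawn_sketch` and `start_worker_sketch` are hypothetical stand-ins, and reading `Exception#cause` is only one possible way to recover the original error, not necessarily how the backtrace above was obtained.

```ruby
# Hypothetical stand-in for the spawn call that failed with EMFILE.
def spawn_sketch
  raise Errno::EMFILE, "dup"
end

# Hypothetical stand-in for start_worker: the ensure clause assumes
# that a worker object was returned.
def start_worker_sketch
  worker = nil
  begin
    worker = spawn_sketch
  ensure
    worker.pid # worker is still nil here, so this raises NoMethodError
  end
end

surfaced = nil
begin
  start_worker_sketch
rescue => e
  surfaced = e
end

puts surfaced.class       # NoMethodError
puts surfaced.cause.class # Errno::EMFILE
```

The exception that reaches the caller is the `NoMethodError` from the `ensure` clause, while the `Errno::EMFILE` that triggered the unwinding is only reachable through `cause`.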

@ashie
Member Author

ashie commented Feb 22, 2024

We observed that the fluentd supervisor process exited unexpectedly after about one hour of repeated recovery attempts.

I filed a new issue for tracking this problem at ServerEngine's repository:
treasure-data/serverengine#145

@kenhys
Contributor

kenhys commented Feb 22, 2024

Before:

Crashed because of a problematic registry key.
image

After:

  • Uninstalled msys64 beforehand.
  • No crash on startup.

image

@ashie
Member Author

ashie commented Feb 22, 2024

"バッファロー らくらくアップデート" (Buffalo's update utility)! So it was you!

Contributor

@kenhys kenhys left a comment

LGTM

@kenhys kenhys merged commit 8a17a8c into master Feb 22, 2024
22 checks passed
@kenhys kenhys deleted the avoid-crash-on-non-ascii-registry branch February 22, 2024 07:04
ashie added a commit that referenced this pull request Feb 29, 2024
When a non-ASCII key exists under the registry key
`SOFTWARE/Microsoft/Windows/CurrentVersion/Uninstall/`, Fluentd fails to
start workers due to `Encoding::UndefinedConversionError`.
This patch avoids the issue.

Backported from v5.0.3: #620

Signed-off-by: Takuro Ashie <ashie@clear-code.com>
@ashie ashie mentioned this pull request Feb 29, 2024
ashie added a commit that referenced this pull request Feb 29, 2024
When a non-ASCII key exists under the registry key
`SOFTWARE/Microsoft/Windows/CurrentVersion/Uninstall/`, Fluentd fails to
start workers due to `Encoding::UndefinedConversionError`.
This patch avoids the issue.

Backported from v5.0.3: #620

Signed-off-by: Takuro Ashie <ashie@clear-code.com>
daipom pushed a commit that referenced this pull request Feb 29, 2024
When a non-ASCII key exists under the registry key
`SOFTWARE/Microsoft/Windows/CurrentVersion/Uninstall/`, Fluentd fails to
start workers due to `Encoding::UndefinedConversionError`. This patch
avoids the issue.

Backported from v5.0.3: #620

---------

Signed-off-by: Takuro Ashie <ashie@clear-code.com>
@daipom
Contributor

daipom commented Apr 24, 2024
