
Unify logging to use ASP.NET default console logger #215

Merged: 9 commits into ppy:master on May 24, 2024

Conversation

bdach
Collaborator

@bdach commented Jan 31, 2024

This has been bothering me for a little while. Spectator server logging is currently a mix of built-in ASP.NET console logging, direct Console.WriteLine() usage, and (over)use of the framework logger. The last of these is especially problematic when deploying to k8s, as file-based logs are not how you're supposed to be doing logging on k8s.

So here's a diff that moves everything to the ASP.NET logger and just straight up bans the other methods to keep things in shape.

@smoogipoo
Contributor

smoogipoo commented Feb 1, 2024

Code-wise this looks alright. Haven't tested yet.

One question: @ThePooN seemed to imply we would have a hard time retrieving these logs while we were on the k8s cluster. Now that this is no longer running on k8s, I gather it won't be such an issue. But if we do move back to k8s, are we going to have trouble pulling these logs after e.g. a server crash?

@bdach
Collaborator Author

bdach commented Feb 1, 2024

As far as I gather (and @ThePooN plz correct me here if wrong), the lowdown is as follows:

  • When we were on k8s, the file-based logs would be lost when an instance was killed, but stdout from a previous run was being preserved by k8s internals
  • Now we're not on k8s anymore, and deploying this would require some extra infra so that the logs could go somewhere persistent
  • We may still return to k8s, but that depends on whether the large overheads in packet handling observed on k8s can be fixed (there's a prospective setting that can be tuned, but it hasn't been confirmed yet whether it fixes the issue)

@ThePooN
Member

ThePooN commented Feb 1, 2024

For now we're not on k8s. Log files are not lost on redeploy, and stdout is persisted by systemd with no concerns about longevity.

If we were to come back to k8s, log files would be lost on every redeploy and crash. stdout is the way to go for logging in containerized environments, but we have no logging infrastructure yet, and out of the box Kubernetes persists very few logs; with osu-server-spectator's throughput, only the last 5 minutes are retained. So we'd want to re-enable Datadog logging or a self-hosted solution like Loki.

tl;dr: we should most likely merge this, but if we come back to k8s it will also require some infra work to accommodate it.

@bdach
Collaborator Author

bdach commented Feb 1, 2024

with osu-server-spectator's throughput, only the last 5 minutes are retained

This would also probably get worse after this change, by the way, given the number of file-based log calls I changed to print to stdout instead.

bdach added a commit to bdach/osu-server-spectator that referenced this pull request Feb 1, 2024
Before you open the diff, a few paragraphs of explanation.

For ASP.NET Core apps, there appear to be three disparate pathways to
getting events onto sentry.

- The first, and most direct one, is directly calling
  `SentrySdk.CaptureException()`. Generally does what you'd expect it
  to.

- The second is via `IWebHostBuilder.UseSentry()`. This is an
  ASP.NET-ism that works by injecting a sentry-provided middleware into
  the request handling stack. Thus, all requests that fail due to a
  thrown exception will be reported to sentry, as the middleware will
  catch the error, log it, and throw it back up to the rest of the
  ASP.NET stack to handle.

  However, note that *this does not apply to background workers*, as
  they are generally outside of this entire flow. Even if we weren't
  hand-rolling them via singletons instantiated in `Startup`,
  and instead using `IHostedService` / `BackgroundService`
  which is the most correct ASP.NET-ism for that, for a proper
  persistent background service *you can't throw exceptions because
  you'd kill both the background service, and the entire server
  with it*.

- Therefore, the third method, and the one officially recommended by the
  sentry team for background worker use cases
  (getsentry/sentry-dotnet#190 (comment)),
  is to use sentry's `Sentry.Extensions.Logging` package. Doing this
  will middleman all log calls made through `Microsoft.Extensions.Logging`
  and push all errors (and warnings too, I believe) to sentry. A sketch
  of all three pathways follows this list.
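
For illustration only, a minimal sketch of how the three pathways above
are typically wired up. This is not the change in this repository; the
DSN, type names, and hosting setup below are placeholder assumptions.

    using System;
    using Microsoft.AspNetCore.Hosting;
    using Microsoft.Extensions.Hosting;
    using Microsoft.Extensions.Logging;
    using Sentry;

    public static class SentryWiringSketch
    {
        // Pathway 1: direct capture at the call site.
        public static void ReportDirectly(Exception ex)
            => SentrySdk.CaptureException(ex);

        // Pathway 2: request middleware. Only exceptions thrown while
        // handling a request are reported; background workers are not covered.
        public static IWebHostBuilder WireRequestPipeline(IWebHostBuilder builder)
            => builder.UseSentry("https://publickey@example.invalid/1");

        // Pathway 3: the logging integration. Warnings/errors logged via
        // Microsoft.Extensions.Logging from anywhere (including background
        // workers) are forwarded to sentry.
        public static IHostBuilder WireLogging(IHostBuilder builder)
            => builder.ConfigureLogging(logging =>
                logging.AddSentry(options => options.Dsn = "https://publickey@example.invalid/1"));
    }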

In the context of all of the above,
ppy#215 becomes doubly
relevant; however, because that change requires more infra preparations
and we probably want sentry logging on the replay upload process *now*,
this is the minimal required change to make that happen.

A side effect of this change is that the replay upload errors - which
would be printed to stdout via `Console.WriteLine()` - will still be
printed there, but using ASP.NET's default logging format, which is a
little more... talkative, as the example below shows:

	fail: ScoreUploader[0]
	      Error during score upload
	      System.InvalidOperationException: nuh uh
		 at osu.Server.Spectator.Storage.ThrowingScoreStorage.WriteAsync(Score score) in /home/dachb/Documents/opensource/osu-server-spectator/osu.Server.Spectator/Storage/ThrowingScoreStorage.cs:line 12
		 at osu.Server.Spectator.Hubs.ScoreUploader.Flush() in /home/dachb/Documents/opensource/osu-server-spectator/osu.Server.Spectator/Hubs/ScoreUploader.cs:line 117
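
For reference, output in this shape comes from a standard
`Microsoft.Extensions.Logging` call; the `ScoreUploader[0]` header is the
log category plus the event id. The rewritten call site presumably looks
roughly like this (a sketch; the exact code isn't shown here):

    // `logger` is an ILogger<ScoreUploader>; logging the exception produces
    // the "fail: ScoreUploader[0]" block above, message plus stack trace.
    logger.LogError(e, "Error during score upload");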

Additionally, note that we have two *other* background workers similar to
`ScoreUploader`: `MetadataBroadcaster` and `BuildUserCountUpdater`. I
don't consider them anywhere near as key as replay upload, so they are
left as they are until we can get the full logging unification changes in.
@smoogipoo added the "blocked (Don't merge this.)" label Feb 7, 2024
@smoogipoo
Contributor

Blocked while we figure out a server-side log ingest solution, since it looks like we're moving back to k8s.

@smoogipoo
Contributor

@ThePooN Has grafana been deployed now / is this PR now ready to be merged? Looking to reduce my inbox.

@ThePooN
Member

ThePooN commented Mar 20, 2024

It is not deployed yet; I've been catching up since I came back from holidays a week ago and have been focused on higher priority stuff.

@ThePooN
Member

ThePooN commented Apr 11, 2024

@ThePooN Has grafana been deployed now / is this PR now ready to be merged? Looking to reduce my inbox.

Has been deployed now and in a state I'm pretty happy with.

We also have the ability to apply regular expressions to infer additional labels. That is most commonly used with prefixes such as [<component name>], to make an independent logging stream for every component; a hypothetical example of such a prefix regex is sketched below.
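
For illustration, a prefix regex of the kind described might look like the
following (a hypothetical sketch in C#; the actual ingest pipeline config
isn't shown in this thread):

    using System.Text.RegularExpressions;

    // Hypothetical example: extract a "component" label from log lines that
    // start with a "[<component name>]" prefix, as described above.
    var componentPrefix = new Regex(@"^\[(?<component>[^\]]+)\]");

    Match match = componentPrefix.Match("[ScoreUploader] Error during score upload");
    string component = match.Success ? match.Groups["component"].Value : "(none)";
    // component is now "ScoreUploader".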

@peppy removed the "blocked (Don't merge this.)" label May 22, 2024
@peppy merged commit 22e4888 into ppy:master May 24, 2024
2 checks passed