
121 logging improvements #122

Merged
merged 25 commits into master from 121-logging
Dec 19, 2023
Conversation

@skx (Owner) commented Dec 16, 2023

This pull-request, once complete, will close #121 by improving our logging. Improving in this case means that we're very consistent with how we log messages, and their contents.

  • There will be three levels
    • DEBUG to be used for anything a developer wants to see for themselves easily, either during or after development of a new feature.
    • WARN for something a user might wish to see, but which is not a fatal error.
    • ERROR for something that causes a (near immediate) program exit.

All logging messages will be written to STDERR. By default that will be done in plain text; however, it is possible to use JSON for ingestion into other systems, and this will be documented in the README.md file.

@skx commented Dec 16, 2023

So far the main CLI sub-commands report their errors appropriately, but the actual application/processor does not yet take advantage of this facility.

Sample usage shows how I'm using custom fields to log things:

$ LOG_ALL=1 ./rss2email import  x
time=2023-12-16T09:27:32.030+02:00 level=DEBUG msg="Adding entry from file" file=x url=https://blog.steve.fi/index.rss
time=2023-12-16T09:27:32.030+02:00 level=DEBUG msg="Adding entry from file" file=x url=https://www.filfre.net/feed/rss/
time=2023-12-16T09:27:32.030+02:00 level=DEBUG msg="Adding entry from file" file=x url=https://briancallahan.net/blog/feed.xml
time=2023-12-16T09:27:32.030+02:00 level=DEBUG msg="Adding entry from file" file=x url=http://www.breakintoprogram.co.uk/feed

@skx commented Dec 16, 2023

The various sub-commands now log via the logger, and I've set things up so the processor can have a logging-handle, via SetLogger.

The final step will be to update the processor to log appropriately, ideally with per-feed context.

@skx commented Dec 17, 2023

Current state:

  • The main CLI commands log properly.
  • The processor which handles the feeds logs properly.

Logging is still missing from the "withstate" package, and the "email sending" code is still essentially unlogged. The HTTP-fetching code probably needs logging too, as that's a common source of issues: specifically, I see "processing feed xxxx" and then, if the remote host is down, I have to wait for the fetch to time out, with zero visible progress and sleep-delays between retries. Just adding "attempt N of Max" logs would suffice.

I also need to go through and work out what to do with the boltdb updates - are they fatal errors or warnings?

I'm extremely pleased with the structured logging approach, especially now I get source/function notes in the output too.

skx added 2 commits December 17, 2023 12:24
This is useful for showing the retries and failures
@skx commented Dec 17, 2023

Changes now:

  • The -verbose flag just changes the logging level.
    • This has the benefit of recycling old behaviour and doing the right thing.
  • The HTTP-fetcher now logs a lot.
  • Test cases all continue to pass.

Outstanding tasks:

  • The boltdb stuff still mostly lacks logging.
  • As is the email-sending stuff.

However in both cases errors are returned and are sane, so perhaps that is enough? Will ponder for the next few days.

@skx commented Dec 17, 2023

We're now in good shape:

  • Logging is used appropriately in all the places I think it needs to be - though perhaps more could be added over time, or some removed.
  • Logging shows only "important" stuff by default.
  • But -verbose will show "everything".

Finally we've got only a few fmt.Printf calls remaining in our code, and they look legitimate:

$ rgrep fmt.Printf . |grep -v .git
grep: ./rss2email: binary file matches
./daemon_cmd.go:		fmt.Printf("Usage: rss2email daemon email1@example.com .. emailN@example.com\n")
./daemon_cmd.go:			fmt.Printf("Usage: rss2email daemon [flags] email1 .. emailN\n")
./cron_cmd.go:		fmt.Printf("Usage: rss2email cron email1@example.com .. emailN@example.com\n")
./cron_cmd.go:			fmt.Printf("Usage: rss2email cron [flags] email1 .. emailN\n")
./seen_cmd.go:		fmt.Printf("%s\n", buck)
./seen_cmd.go:				fmt.Printf("\t%s\n", key)
./unsee_cmd.go:		fmt.Printf("Please specify the URLs to unsee (i.e. remove from the state database).\n")

I can't think of anything to tighten here, except perhaps that the handling of the errors here is a bit redundant:

                // Process all the feeds
                errors := p.ProcessFeeds(recipients)

                // If we found errors then show them.
                if len(errors) != 0 {
                        for _, err := range errors {
                                fmt.Fprintln(os.Stderr, err.Error())
                        }
                }

If there are errors now we see that twice:

{"time":"2023-12-17T18:25:23.731978815Z","level":"DEBUG","source":{"function":"rss2email/processor.(*Processor).ProcessFeeds","file":"rss2email/processor/processor.go","line":101},"msg":"about to process feeds","feed_count":34}
{"time":"2023-12-17T18:25:23.732158375Z","level":"DEBUG","source":{"function":"rss2email/processor.(*Processor).ProcessFeeds","file":"rss2email/processor/processor.go","line":107},"msg":"starting to process feed","feed":"https://rachelbythebay.com/w/atom.xml"}
{"time":"2023-12-17T18:25:23.762472328Z","level":"DEBUG","source":{"function":"rss2email/httpfetch.(*HTTPFetch).Fetch","file":"rss2email/httpfetch/httpfetch.go","line":124},"msg":"fetching URL","feed":{"link":"https://rachelbythebay.com/w/atom.xml"},"httpfetch":{"link":"https://rachelbythebay.com/w/atom.xml","user-agent":"rss2email (https://github.com/skx/rss2email)","insecure":false,"retry-max":3,"retry-delay":1},"attempt":1}
{"time":"2023-12-17T18:25:53.783143722Z","level":"DEBUG","source":{"function":"rss2email/httpfetch.(*HTTPFetch).Fetch","file":"rss2email/httpfetch/httpfetch.go","line":134},"msg":"fetching URL failed","feed":{"link":"https://rachelbythebay.com/w/atom.xml"},"httpfetch":{"link":"https://rachelbythebay.com/w/atom.xml","user-agent":"rss2email (https://github.com/skx/rss2email)","insecure":false,"retry-max":3,"retry-delay":1},"error":"Get \"https://rachelbythebay.com/w/atom.xml\": dial tcp 216.218.228.215:443: i/o timeout"}
{"time":"2023-12-17T18:25:54.7833987Z","level":"DEBUG","source":{"function":"rss2email/httpfetch.(*HTTPFetch).Fetch","file":"rss2email/httpfetch/httpfetch.go","line":124},"msg":"fetching URL","feed":{"link":"https://rachelbythebay.com/w/atom.xml"},"httpfetch":{"link":"https://rachelbythebay.com/w/atom.xml","user-agent":"rss2email (https://github.com/skx/rss2email)","insecure":false,"retry-max":3,"retry-delay":1},"attempt":2}
{"time":"2023-12-17T18:26:24.812271984Z","level":"DEBUG","source":{"function":"rss2email/httpfetch.(*HTTPFetch).Fetch","file":"rss2email/httpfetch/httpfetch.go","line":134},"msg":"fetching URL failed","feed":{"link":"https://rachelbythebay.com/w/atom.xml"},"httpfetch":{"link":"https://rachelbythebay.com/w/atom.xml","user-agent":"rss2email (https://github.com/skx/rss2email)","insecure":false,"retry-max":3,"retry-delay":1},"error":"Get \"https://rachelbythebay.com/w/atom.xml\": dial tcp 216.218.228.215:443: i/o timeout"}
{"time":"2023-12-17T18:26:25.812461705Z","level":"DEBUG","source":{"function":"rss2email/httpfetch.(*HTTPFetch).Fetch","file":"rss2email/httpfetch/httpfetch.go","line":124},"msg":"fetching URL","feed":{"link":"https://rachelbythebay.com/w/atom.xml"},"httpfetch":{"link":"https://rachelbythebay.com/w/atom.xml","user-agent":"rss2email (https://github.com/skx/rss2email)","insecure":false,"retry-max":3,"retry-delay":1},"attempt":3}
{"time":"2023-12-17T18:26:55.836623461Z","level":"DEBUG","source":{"function":"rss2email/httpfetch.(*HTTPFetch).Fetch","file":"rss2email/httpfetch/httpfetch.go","line":134},"msg":"fetching URL failed","feed":{"link":"https://rachelbythebay.com/w/atom.xml"},"httpfetch":{"link":"https://rachelbythebay.com/w/atom.xml","user-agent":"rss2email (https://github.com/skx/rss2email)","insecure":false,"retry-max":3,"retry-delay":1},"error":"Get \"https://rachelbythebay.com/w/atom.xml\": dial tcp 216.218.228.215:443: i/o timeout"}
{"time":"2023-12-17T18:26:56.837833238Z","level":"WARN","source":{"function":"rss2email/processor.(*Processor).processFeed","file":"rss2email/processor/processor.go","line":279},"msg":"failed to fetch feed","feed":{"link":"https://rachelbythebay.com/w/atom.xml"},"error":"Get \"https://rachelbythebay.com/w/atom.xml\": dial tcp 216.218.228.215:443: i/o timeout"}

Here you see a lot of noise, but summarizing:

  • We're going to process 34 feeds.
  • The first feed is rachelbythebay.com
  • Fetching that feed fails, and retries happen.
  • The last line shows "WARN"
    • So this would be visible by default. (WARN/ERROR are shown by default, DEBUG requires extra effort or a -verbose flag to view).

At the end of the run we see the list of errors, which in this case just shows the same thing:

..
..
{"time":"2023-12-17T18:27:26.504908829Z","level":"DEBUG","source":{"function":"rss2email/httpfetch.(*HTTPFetch).Fetch","file":"rss2email/httpfetch/httpfetch.go","line":124},"msg":"fetching URL","feed":{"link":"https://www.leadedsolder.com/feed.xml"},"httpfetch":{"link":"https://www.leadedsolder.com/feed.xml","user-agent":"rss2email (https://github.com/skx/rss2email)","insecure":false,"retry-max":3,"retry-delay":1},"attempt":1}
{"time":"2023-12-17T18:27:27.727671007Z","level":"DEBUG","source":{"function":"rss2email/processor.(*Processor).processFeed","file":"rss2email/processor/processor.go","line":285},"msg":"feed retrieved","feed":{"link":"https://www.leadedsolder.com/feed.xml"},"entries":10}
{"time":"2023-12-17T18:27:27.843660767Z","level":"DEBUG","source":{"function":"rss2email/processor.(*Processor).processFeed","file":"rss2email/processor/processor.go","line":439},"msg":"feed processed","feed":{"link":"https://www.leadedsolder.com/feed.xml"},"seen_count":10,"unseen_count":0}
{"time":"2023-12-17T18:27:27.843843979Z","level":"DEBUG","source":{"function":"rss2email/processor.(*Processor).ProcessFeeds","file":"rss2email/processor/processor.go","line":244},"msg":"all feeds processed","feed_count":34}
error processing https://rachelbythebay.com/w/atom.xml - Get "https://rachelbythebay.com/w/atom.xml": dial tcp 216.218.228.215:443: i/o timeout

@skx merged commit 3166c16 into master Dec 19, 2023
3 checks passed
@skx deleted the 121-logging branch December 19, 2023 18:46
skx added a commit that referenced this pull request Dec 30, 2023
In #122 we updated our codebase to ensure that we used a common
logging handle to send our output messages to the user:

* This included warnings and errors by default.
* However "developer", or other internal messages, were also available.

This pull-request builds upon that work to write the log messages to
a file, as well as showing them to the user.  By default a file
`rss2email.log` is generated when the application starts and log
entries are appended to it.

However:

* LOG_FILE_PATH can be set to a different path.
* LOG_FILE_DISABLE can be used to disable this duplication.

Why do this?  Partly for reference, but also partly to allow a local
setup to view errors.

We could now resolve #119 by adding a cronjob:

        #!/bin/sh
        # show logs, and delete them, run once per day
        cat .../rss2email.log
        rm .../rss2email.log

(Of course this might not work 100% as the current approach assumes
the file is held open forever in the case of the `daemon` sub-command.)

This isn't a complete solution, but without getting into the whole
template-customization and more complex considerations it's not a
terrible thing to do.
@skx mentioned this pull request Dec 30, 2023
skx added a commit that referenced this pull request Jan 4, 2024