Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve captive-core error reporting #2669

Closed
2opremio opened this issue Jun 5, 2020 · 5 comments · Fixed by #3189
Closed

Improve captive-core error reporting #2669

2opremio opened this issue Jun 5, 2020 · 5 comments · Fixed by #3189
Assignees

Comments

@2opremio
Copy link
Contributor

2opremio commented Jun 5, 2020

If the captive core fails to start for whatever reason, the relevant error isn't printed and a parsing error is printed instead:

2020/06/04 19:55:15 error preparing range: opening getting ledger `from-1`: unmarshalling framed LedgerCloseMeta: unmarshalling XDR frame header: xdr:DecodeUint: EOF while decoding 4 bytes - read: '[]'

In this case, the relevant error was:

/usr/local/bin/stellar-core: error while loading shared libraries: libpq.so.5: cannot open shared object file: No such file or directory

which wasn't printed out until enabling logging (uncommenting the following lines):

_, w := io.Pipe()
// br := bufio.NewReader(r)
// // Strip timestamps from log lines from captive stellar-core. We emit our own.
// dateRx := regexp.MustCompile(`^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3} `)
// go func() {
// for {
// line, e := br.ReadString('\n')
// if e != nil {
// break
// }
// line = dateRx.ReplaceAllString(line, "")
// fmt.Print(line)
// }
// }()
return w

We should also print out errors with proper formatting (i.e. the same logging format as the rest of horizon messages)

@bartekn
Copy link
Contributor

bartekn commented Jun 5, 2020

I wonder if we should create a new union used for Core-Horizon communication instead of sending LedgerCloseMeta only. I think about something like:

union CaptiveMessage switch (CaptiveMessageType type)
{
case ERROR_MSG:
    Error error;
case SYNC_PROGRESS:
    int percent;
case META:
    LedgerCloseMeta meta;
}

This way we can get errors via pipe instead of parsing stellar-core logs. Will forward this to stellar-core team.

@bartekn bartekn added this to the Horizon 1.6.0 milestone Jul 1, 2020
@bartekn
Copy link
Contributor

bartekn commented Jul 14, 2020

This is solved: #2803. The error should be part of Horizon log when #2673 is done.

@bartekn bartekn closed this as completed Jul 14, 2020
@bartekn
Copy link
Contributor

bartekn commented Oct 16, 2020

I'm reopening this. I think it's worth adding Stellar-Core logs to Horizon log but wrap it in our log format. This way we can add labels (I propose service=ingest subservice=stellar-core) so it's easy to filter them out but. We can also identified critical (form Horizon point of view) errors in Stellar-Core log and promote such logs to error level.

@bartekn bartekn reopened this Oct 16, 2020
@bartekn bartekn removed this from the Horizon 1.6.0 milestone Oct 16, 2020
@ire-and-curses
Copy link
Member

I'd be cautious about identifying specific critical errors. It couples Horizon to stellar-core's log format, which is not a formal API and could change at any time (and silently stop reporting in our logs).

@bartekn
Copy link
Contributor

bartekn commented Oct 16, 2020

I'd be cautious about identifying specific critical errors.

We won't do that. We are going to keep Core log together with Horizon log.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants