More options for encoding.timestamp_format #17323

Closed
vbmithr opened this issue May 5, 2023 · 7 comments · Fixed by #18817
Labels
domain: codecs Anything related to Vector's codecs (encoding/decoding) type: feature A value-adding code addition that introduce new functionality.

Comments

@vbmithr

vbmithr commented May 5, 2023

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Use Cases

Currently only rfc3339 and unix are supported. It would be good to also have unix_ms, unix_us, and unix_ns.

Attempted Solutions

No response

Proposal

No response

References

No response

Version

vector 0.28.2 (x86_64-unknown-linux-gnu)

@vbmithr vbmithr added the type: feature A value-adding code addition that introduce new functionality. label May 5, 2023
@spencergilbert
Contributor

👋 thanks for the request @vbmithr,

I'm wondering if we should rethink that feature altogether. VRL has support for formatting timestamps, and it feels a bit off to have two separate ways to format timestamps.

Maybe we could consider dropping encoding.timestamp_format and having it be a separate step in a preceding remap transform, or work out a way to expose VRL's encoding abilities through the encoding config.

Either way, it does seem like a good idea to expand the options there.

@jszwedko
Member

Hey @vbmithr ! Could you provide a bit more detail about your use-case including which sinks you are using? We are trying to understand if it makes sense to expand encoding.timestamp_format or to provide different ways to satisfy the same use-case.

@vbmithr
Author

vbmithr commented May 12, 2023

I use Vector to simplify the logging and recording of public WebSocket data from cryptocurrency exchanges. I feed data to Vector with the (unix) socket source and want the raw JSON data to be sent to AWS for archival. From there, I will use a proprietary system to download this data and convert it to other formats or extract the parts I want from it, so I need to parse that data using fast JSON parsing code (like simdjson). The correct granularity for timestamps is the microsecond, though even millisecond would be enough. RFC 3339 takes more time to parse than a plain integer.

So I'd like to have my timestamp printed in UNIX microseconds.

Sinks I use: file, aws-s3

@jszwedko
Member

Gotcha, thanks for adding your use-case @vbmithr . I have a vague sense that timestamp handling in Vector could benefit from an overhaul to improve the UX, but I think your suggestion here is reasonable and a small change so I'm happy to see it go in. If you or anyone else feels so motivated it would be something like adding the new formats here:

#[configurable_component]
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
#[serde(rename_all = "lowercase")]
/// The format in which a timestamp should be represented.
pub enum TimestampFormat {
    /// Represent the timestamp as a Unix timestamp.
    Unix,

    /// Represent the timestamp as a RFC 3339 timestamp.
    Rfc3339,
}
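
As a rough illustration of what "adding the new formats" could look like, here is a minimal, standard-library-only sketch. The variant names (`UnixMs`, `UnixUs`, `UnixNs`) and the hand-written `FromStr` impl are assumptions made for this example; they are not taken from Vector's source, which derives its option names through serde. One detail worth noting: with `rename_all = "lowercase"`, serde would render a variant such as `UnixMs` as `unixms`, so the snake_case names requested in this issue would presumably need a different rename rule.

```rust
use std::str::FromStr;

/// Hypothetical extension of `TimestampFormat` with the sub-second variants
/// requested in this issue. Variant names are assumptions for illustration.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum TimestampFormat {
    /// Seconds since the Unix epoch.
    Unix,
    /// Milliseconds since the Unix epoch.
    UnixMs,
    /// Microseconds since the Unix epoch.
    UnixUs,
    /// Nanoseconds since the Unix epoch.
    UnixNs,
    /// RFC 3339 string.
    Rfc3339,
}

impl FromStr for TimestampFormat {
    type Err = String;

    /// Map the snake_case option names proposed in the issue to variants.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "unix" => Ok(Self::Unix),
            "unix_ms" => Ok(Self::UnixMs),
            "unix_us" => Ok(Self::UnixUs),
            "unix_ns" => Ok(Self::UnixNs),
            "rfc3339" => Ok(Self::Rfc3339),
            other => Err(format!("unknown timestamp format: {other}")),
        }
    }
}
```

With this sketch, `"unix_us".parse::<TimestampFormat>()` yields the `UnixUs` variant, while an unrecognized name such as `"unixms"` is rejected with an error.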

And handling them here:

fn apply_timestamp_format(&self, log: &mut LogEvent) {
    if let Some(timestamp_format) = self.timestamp_format.as_ref() {
        match timestamp_format {
            TimestampFormat::Unix => {
                if log.value().is_object() {
                    let mut unix_timestamps = Vec::new();
                    for (k, v) in log.all_fields().expect("must be an object") {
                        if let Value::Timestamp(ts) = v {
                            unix_timestamps.push((k.clone(), Value::Integer(ts.timestamp())));
                        }
                    }
                    for (k, v) in unix_timestamps {
                        log.insert(k.as_str(), v);
                    }
                } else {
                    // root is not an object
                    let timestamp = if let Value::Timestamp(ts) = log.value() {
                        Some(ts.timestamp())
                    } else {
                        None
                    };
                    if let Some(ts) = timestamp {
                        log.insert(event_path!(), Value::Integer(ts));
                    }
                }
            }
            // RFC3339 is the default serialization of a timestamp.
            TimestampFormat::Rfc3339 => (),
        }
    }
}
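
The conversion each new variant would need can be sketched without any of Vector's types. The example below is only an approximation under stated assumptions: `Encoded` is a hypothetical stand-in for Vector's `Value`, `UnixResolution` and `encode_unix` are made-up names, and the timestamp is modeled as a `std::time::Duration` since the Unix epoch rather than the chrono timestamp used in the snippet above. In real code, chrono's `timestamp_millis()` and `timestamp_micros()` would likely do most of this work.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the value written back into the event.
#[derive(Debug, PartialEq)]
pub enum Encoded {
    Integer(i64),
    Float(f64),
}

/// Hypothetical resolutions matching the formats discussed in this issue.
#[derive(Clone, Copy, Debug)]
pub enum UnixResolution {
    Seconds,
    Millis,
    Micros,
    Nanos,
    Float,
}

/// Convert a duration since the Unix epoch into the requested representation.
/// Returns `None` if the result does not fit in an `i64`.
pub fn encode_unix(since_epoch: Duration, res: UnixResolution) -> Option<Encoded> {
    let secs = i64::try_from(since_epoch.as_secs()).ok()?;
    let nanos = i64::from(since_epoch.subsec_nanos());
    // Scale whole seconds up, then add the sub-second part truncated
    // (not rounded) to the target resolution.
    let scaled = |factor: i64, sub: i64| secs.checked_mul(factor)?.checked_add(sub);
    let value = match res {
        UnixResolution::Seconds => Encoded::Integer(secs),
        UnixResolution::Millis => Encoded::Integer(scaled(1_000, nanos / 1_000_000)?),
        UnixResolution::Micros => Encoded::Integer(scaled(1_000_000, nanos / 1_000)?),
        UnixResolution::Nanos => Encoded::Integer(scaled(1_000_000_000, nanos)?),
        UnixResolution::Float => Encoded::Float(since_epoch.as_secs_f64()),
    };
    Some(value)
}
```

For example, a timestamp 1,683,244,800 seconds and 123,456,700 nanoseconds after the epoch would encode as `1683244800123` in milliseconds and `1683244800123456` in microseconds. The integer forms truncate rather than round, and timestamps before the epoch are outside the scope of this sketch.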

@jszwedko jszwedko added the domain: codecs Anything related to Vector's codecs (encoding/decoding) label May 16, 2023
@mbneimann-at-work

mbneimann-at-work commented Aug 14, 2023

I would like to add rfc3339_ms, rfc3339_us and rfc3339_ns to the suggested extra formats.
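
To make the request concrete, the sketch below shows one way an RFC 3339 string with a fixed number of fractional digits could be produced using only the standard library. The function names are hypothetical, the output is always UTC with a `Z` suffix, and the fraction is truncated rather than rounded. In a chrono-based codebase, `DateTime::to_rfc3339_opts` with a `SecondsFormat` value would probably be the more natural tool; this version exists only to show the arithmetic.

```rust
/// Convert days since 1970-01-01 into a (year, month, day) civil date in the
/// proleptic Gregorian calendar (the widely used days-to-civil algorithm).
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month, 0..=11
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Format a Unix timestamp as RFC 3339 in UTC with `digits` fractional
/// digits (0 to 9; larger values are treated as 9).
pub fn rfc3339_with_precision(secs: i64, subsec_nanos: u32, digits: usize) -> String {
    let days = secs.div_euclid(86_400);
    let sod = secs.rem_euclid(86_400); // second of day
    let (y, m, d) = civil_from_days(days);
    let (hh, mm, ss) = (sod / 3_600, sod % 3_600 / 60, sod % 60);
    let mut out = format!("{y:04}-{m:02}-{d:02}T{hh:02}:{mm:02}:{ss:02}");
    if digits > 0 {
        // Zero-pad to nine digits, then keep only the leading `digits`.
        let nanos = format!("{:09}", subsec_nanos.min(999_999_999));
        out.push('.');
        out.push_str(&nanos[..digits.min(9)]);
    }
    out.push('Z');
    out
}
```

Under these assumptions, `rfc3339_ms`, `rfc3339_us` and `rfc3339_ns` would correspond to calling the function with 3, 6 and 9 digits respectively.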

srstrickland pushed a commit to srstrickland/vector that referenced this issue Oct 9, 2023
* `unix_ms`: milliseconds
* `unix_us`: microseconds
* `unix_ns`: nanoseconds
* `unix_float`: seconds float

vectordotdev#17323
@srstrickland
Contributor

I have a similar use case. We want to write logs to CloudWatch, and incredibly, CloudWatch lacks the ability to parse RFC3339 (or any other standard format) strings into usable datetime objects! So, I was forced to use unixtime, but would really prefer sub-second resolution.

I opened a PR to add some new formats.

@srstrickland
Contributor

> I would like to add rfc3339_ms, rfc3339_us and rfc3339_ns to the suggested extra formats.

@mbneimann-at-work I didn't add these, but you could build on the pattern from the linked PR to add more formats. Here's a dummy PoC with some rfc3339 customizations. But at this point, I think (as suggested at the top) it might be better to just let users specify a format string, or to provide a VRL function to manipulate the timestamp however they want. Feel free to poach / modify my PoC, though.
