Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benefits and downsides of Atoms #1891

Closed
ghost opened this issue Feb 21, 2020 · 4 comments
Closed

Benefits and downsides of Atoms #1891

ghost opened this issue Feb 21, 2020 · 4 comments
Labels
domain: data model Anything related to Vector's internal data model domain: performance Anything related to Vector's performance meta: idea Anything in the idea phase. Needs further discussion and consensus before work can begin. type: tech debt A code change that does not add user value.

Comments

@ghost
Copy link

ghost commented Feb 21, 2020

In this issue I want to discuss are there still any real benefits in using Atoms as key values in LogEvent instead of Strings.

It seems to me that switching to BTreeMap (which uses actual strings comparisons instead of comparing hashes) and introduction of default log schemas (#1769, which reduced use of static atoms) greately reduced benefits of using Atoms instead of plain Strings. On the other hand, maintaining code working with Strings directly is easier and some invocations of clone over the codebase could potentially be avoided with them.

@ghost ghost added the meta: idea Anything in the idea phase. Needs further discussion and consensus before work can begin. label Feb 21, 2020
@ghost ghost added this to the Improve data processing milestone Feb 21, 2020
@ghost ghost added the needs: approval Needs review & approval before work can begin. label Feb 21, 2020
@binarylogic binarylogic added the type: tech debt A code change that does not add user value. label Feb 22, 2020
@ghost ghost removed this from the Improve data processing milestone Feb 23, 2020
@lukesteensen
Copy link
Member

This is a great question! They definitely add a bit of friction to the development process. Their clone is very cheap, so I wouldn't worry about that too much.

I'd be curious to see the effects on throughput due to memory use (duplicate string keys on the heap) and CPU (interning and reusing atoms vs allocating and deallocating strings). With some numbers, we'd be able to weigh ergonomics in the code vs any performance differences.

@Hoverbear
Copy link
Contributor

@a-rodin Would using Strings ease integration with Lua and Javascript transforms? Seems like it would for WASM/FFI in general

@lukesteensen
Copy link
Member

Some simple benchmarks show that we get about 6-7% more throughput by removing Atom in favor of std::string::String.

Better performance and better ergonomics make this a clear win, so we should do it.

@lukesteensen lukesteensen removed the needs: approval Needs review & approval before work can begin. label Apr 10, 2020
@binarylogic binarylogic added domain: data model Anything related to Vector's internal data model domain: performance Anything related to Vector's performance labels Aug 7, 2020
@jszwedko
Copy link
Member

jszwedko commented Aug 1, 2022

Closing since we dropped the use of Atoms due to poor performance.

@jszwedko jszwedko closed this as completed Aug 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: data model Anything related to Vector's internal data model domain: performance Anything related to Vector's performance meta: idea Anything in the idea phase. Needs further discussion and consensus before work can begin. type: tech debt A code change that does not add user value.
Projects
None yet
Development

No branches or pull requests

4 participants