-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Signal disk buffering #913
Conversation
This reverts commit d84d329.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disclaimer: I have not looked at every line of code, but I have looked at the overall design and a bunch of the code and think this is a great start/addition. I do wish it would have been more incremental, but alas, he were are.
I have a number of things I am thinking about iterating on with this, but since this is new and experimental contrib code, at this point I would prefer to merge the big module and do piecemeal issues/changes after.
Thanks @LikeTheSalad !
Co-authored-by: jason plumb <75337021+breedx-splk@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments. Definitely easier to grok with the simplified serialization logic.
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package io.opentelemetry.contrib.disk.buffering.exporters; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's put all the public classes in io.opentelemetry.contrib.disk.buffering
, and put the remaining stuff in io.opentelemetry.contrib.disk.buffering.internal.*
.
In other words, I don't think there's a need for the breaking out i.o.c.d.b.exporters
and i.o.c.d.b.storage
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me, I've applied the changes.
...java/io/opentelemetry/contrib/disk/buffering/storage/files/DefaultTemporaryFileProvider.java
Outdated
Show resolved
Hide resolved
import java.io.IOException; | ||
|
||
public class DefaultTemporaryFileProvider implements TemporaryFileProvider { | ||
public static final TemporaryFileProvider INSTANCE = new DefaultTemporaryFileProvider(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We prefer exposing singletons via public static methods rather than public static fields:
private static final TemporaryFileProvider INSTANCE = new DefaultTemporaryFileProvider();
public static TemporaryFileProvider getInstance() {
return INSTANCE;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've applied this change to all the cases where an INSTANCE field was directly accessed.
* provided. FALSE if either of those conditions didn't meet. | ||
* @throws IOException If an unexpected error happens. | ||
*/ | ||
boolean exportStoredBatch(long timeout, TimeUnit unit) throws IOException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this method has roughly the same semantics as CompletableResultCode {Signal}Exporter#flush()
.
Can we get rid of this method and instead have users use flush()
to force an read from disk and export?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it can work yeah. My only concern with doing so would be that the expectations of someone using this method won't probably match the reality of what actually happens, in the sense that, I believe the flush()
method suggests that all the unexported signals will be exported right away. Though for the case of the disk exporter, it would mean that the next available batch of signals in the queue will be sent, but not all the batches at once. So I'm wondering if that could cause misunderstandings and probably invalid issues to be created in the repo because of it.
|
||
package io.opentelemetry.contrib.disk.buffering.internal.storage.utils; | ||
|
||
public class TimeProvider { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using io.opentelemetry.sdk.common.Clock instead of introducing a new abstraction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof, I forgot to make this same observation! Good call jack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just made the change, although I'm not sure if the use case here is properly addressed by a Clock implementation because the time format that the storage classes use is milliseconds, which is also the format used in the configuration parameters, making it handy to do the math when checking for changes in the state of the cache files based on the user-provided parameters. So if we provide nanoseconds as local time, we'd have to make conversions all the time, which would turn in overall to more than one conversion needed per action, considering that the way a simple clock like the one used here would provide the current time in nanoseconds would be by converting the system current milliseconds to nanos.
I think it's also worth noting that this is meant to be an internal utility class which makes it possible to change the time for testing purposes, whereas I believe the usage of Clock is also handy for when we need to provide the consumers the ability to use their own implementation, though at the moment that's not needed in here.
So based on the above, the changes I made are essentially an incorrect implementation of Clock, which returns the current time in milliseconds to avoid unnecessary conversion issues. It's a bit ugly to me, though it's a simple way to reuse the Clock interface, though it's ugly nevertheless, so I'd like to know your opinion on it.
…fering/storage/files/DefaultTemporaryFileProvider.java Co-authored-by: jack-berg <34418638+jack-berg@users.noreply.github.com>
# Conflicts: # disk-buffering/src/main/java/io/opentelemetry/contrib/disk/buffering/storage/files/DefaultTemporaryFileProvider.java
plugins { | ||
id("otel.java-conventions") | ||
id("otel.publish-conventions") | ||
id("me.champeau.jmh") version "0.7.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both opentelemetry-java
and opentelemetry-java-instrumentation
use animalsniffer to avoid using apis that are not available on android, see if it would be useful in this project
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I've just added it to verify API 24 supported code
private LogRecordDataSerializer() {} | ||
|
||
static LogRecordDataSerializer get() { | ||
if (instance == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not thread safe, consider whether you need to initialize this lazily
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've made the changes
proto.writeDelimitedTo(out); | ||
return out.toByteArray(); | ||
} catch (IOException e) { | ||
throw new IllegalArgumentException(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to me IllegalStateException
seems better suited than IllegalArgumentException
for wrapping IOException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I don't have a strong opinion on this, I've just changed it
} | ||
} | ||
|
||
private static void copyFile(File from, File to) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Files.copy
isn't available on android?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is, it's just that it was added in API level 26 which is quite high for apps that need to support older devices, we've got a couple of customers which are on level 24. Though now that I think about it better, if the core OTel SDK requires library desugaring, then probably there's no need to make this lib safe for older versions. I'll take a deeper look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that makes sense. I had a quick peek at desugared api list and couldn't find Files.copy
. I'd just add a comment there explaining why you are not using Files.copy
. An alternative would be to implement copy using FileChannel.transferTo
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! Thanks for taking the time 👍 Btw I'm planning to make all the requested changes first thing tomorrow, since I'm already EOD for today
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment explaining why it was done this way, and also, the animalsniffer check will complain if Files.copy is used
|
||
private static void copyFile(File from, File to) throws IOException { | ||
try (InputStream in = new BufferedInputStream(new FileInputStream(from)); | ||
OutputStream out = new FileOutputStream(to, false)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
false
should be the default (overwrite instead of append)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've removed it
} | ||
|
||
private static void copyFile(File from, File to) throws IOException { | ||
try (InputStream in = new BufferedInputStream(new FileInputStream(from)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using BufferedInputStream
here feels a bit weird as you are already using your own buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I've removed it.
|
||
public final class Constants { | ||
|
||
public static final byte[] NEW_LINE_BYTES = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only used in a test. Is this a remnant from the json based serialzier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, it is from when we were using the custom json serialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's removed now.
} | ||
|
||
/** | ||
* Attempts to write a line into a writable file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line
feels a bit misleading as far as I can tell it's just binary data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also a leftover from when json was used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's removed now.
expireTimeMillis = createdTimeMillis + configuration.getMaxFileAgeForReadMillis(); | ||
originalFileSize = (int) file.length(); | ||
temporaryFile = configuration.getTemporaryFileProvider().createTemporaryFile(file.getName()); | ||
copyFile(file, temporaryFile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the copy source file to temp file, read temp file and overwrite the source after each batch logic hard to follow. Comments might help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it was also hard for me. The general design is documented in a markdown file now. Maybe just link to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of copying the data around it might be more efficient to alter the file format so that it would be possible to tell which chunks have already been handled. Having some kind of header in the data file could also be useful. If it is possible that files written by older version of library are read by newer version then having a header could allow skipping data in case file format has changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zeitlinger and I also discussed some more optimized approaches as well, such as having a separate file to keep track of the bytes already read and updating that number instead. The header approach I think it's a good idea as well, especially because it'd keep everything in the same file, however, I'd have to double-check how could we update the header values at the same time the file is being read to make sure we don't lose the current position if the app gets terminated without a chance to properly close the reader. Though ultimately, since there's a lot to this PR already, and, based on a benchmark added to this specific functionality, I found out that the simple approach is quite decent in terms of performance, I was thinking it would be ideal to add these optimizations on a future PR if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exploring these ideas in the future is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've added an explanation in the class's doc
return ReadableResult.FAILED; | ||
} | ||
if (hasExpired()) { | ||
close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something else is supposed to delete the expired file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A clean up is done when creating a new file here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx @LikeTheSalad!
and thx @zeitlinger @breedx-splk @jack-berg @laurit for reviewing!!
Description:
Feature addition - Allows to cache signals in the disk and later send them on-demand.
This feature is pretty similar to the one already existing in the OpenTelemetry Swift SDK.
Requirements
Existing Issue(s):
Testing:
Documentation:
Javadoc and README.
How it works
You can take a look at the CONTRIBUTING file to get a more detailed overview of how it all works.
Outstanding items: