DRILL-5270: Improve loading of profiles listing in the WebUI #1250
Conversation
[Graph caption: "[Current Apache Master] User latency when 8 web-clients (wget) request for …"]
[Graph caption: "[DRILL-5270] User latency when 8 web-clients (wget) request for …"]
[Log excerpt: backend logging reveals the archiving process]
@arina-ielchiieva / @parthchandra could you review this?
@kkhatua Why not use the Guava Cache? http://www.baeldung.com/guava-cache . I think it would simplify the implementation.
I did consider using the Guava Cache initially, but I could not figure out how to specify the eviction policy based on the profile name. Guava provides a mechanism to limit the cache size and evict the oldest entry, but I wanted to override the mechanism that defines 'oldest'. Lastly, the TreeSet allows us to access the elements in sorted order, which seemed missing in Guava. Do you think it makes the code cleaner if I were to extract the mechanism into a separate implementation of this 'cache'?
@kkhatua I'm still not sure why you want to override the definition of oldest. Why is the default LRU eviction policy not sufficient? If you need an ordered list of keys for the cache, you can accomplish this with the Guava cache by adding a key to a TreeSet when the Loader is called, and removing a key from the TreeSet when the Removal Listener is called. My main concern is that implementing our own cache creates complexity and opens up the possibility for bugs, whereas a pre-existing cache is already debugged and tested for us.
The cache is constructed by first listing all the profile files and sorting them (the profile ID is generated as a monotonically decreasing value to ensure sortedness in stores like HBase). This customized TreeSet is used to inject profiles (since the FileSystem is not guaranteed to return the list in order), so the TreeSet provides the ordering. We retain only the first N (which are, implicitly, the latest profiles). If we were to add more profiles than the max capacity, the TreeSet is pruned at the rightmost end.
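The bounded, sorted set described above can be sketched as follows. This is a minimal illustration, not Drill's actual code; the class and method names are hypothetical, and the key insight is that because profile IDs sort newest-first, pruning the rightmost (last) element always drops the oldest profile.

```java
import java.util.TreeSet;

// Hypothetical sketch of the bounded, sorted profile set described above.
// Profile IDs sort so that the lexicographically smallest entry is the
// newest; the rightmost (last) entry is the oldest and is pruned on overflow.
public class ProfileSetSketch {
  private final TreeSet<String> store = new TreeSet<>();
  private final int maxCapacity;

  public ProfileSetSketch(int maxCapacity) {
    this.maxCapacity = maxCapacity;
  }

  // Adds a profile ID; returns the evicted (oldest) ID, or null if under capacity.
  public String add(String profileId) {
    store.add(profileId);
    return store.size() > maxCapacity ? store.pollLast() : null;
  }

  public int size() {
    return store.size();
  }
}
```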
I think I understand the difference in our two perspectives. You wanted a cache that always contains only the N most recently created profiles: if you happen to access the (N+1)th youngest profile, the cache will not contain it and never will. I still prefer the approach with the Guava cache because you can effectively achieve the same result. As new profiles are created they can be added to the cache. If you access a very old profile, one more recently created profile will be evicted from the cache and the old profile will be added, since a user just requested it. I would argue this behavior is not only easier to implement, since we are leveraging a library, but actually more desirable, since it caches a profile based on when it is used, not when it was created. If you still disagree with using the Guava cache, I agree with your proposal of moving your cache into a separate class. I think you should also add some unit tests for the cache to verify that it works as expected. The unit tests will also make maintaining and enhancing the class easier for future developers.
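The access-based eviction the reviewer describes can be illustrated with a standard-library stand-in for the Guava cache (a `LinkedHashMap` in access order); the class name here is hypothetical. The point is that eviction follows *use* recency, not *creation* order: touching an old profile keeps it resident at the expense of the least recently used entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU sketch (standard-library stand-in for the Guava cache the
// reviewer suggests): eviction is driven by access recency, not creation time.
public class LruProfileCache<V> extends LinkedHashMap<String, V> {
  private final int maxEntries;

  public LruProfileCache(int maxEntries) {
    super(16, 0.75f, true); // accessOrder = true -> LRU iteration order
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
    return size() > maxEntries;
  }
}
```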
I actually like the Guava cache approach for its elegance and capabilities, but it expands the scope significantly without a huge benefit over what we currently have. The cache you are envisioning holds the complete profile; this one is only for the listing of profiles. When an individual profile is accessed, Drill ends up fetching a fresh copy from the PStore to serialize the contents for visualization.
@kkhatua Sounds good. Thanks for the explanations and thanks for improving the performance so much :) !
(force-pushed 8082760 to 94ff86f)
Done all the changes. Found an unused import in an unrelated file, so I fixed that to make sure the code builds after rebasing to latest master.
@kkhatua, before we proceed with code review, please clean up the code a little, since `LocalPersistentStore` looks too overloaded. I suggest we move the archiving logic to a helper class, as mentioned in my comments. That would also allow us to unit test it much better.
```java
    assertEquals(null, poppedProfile);
  }

  assert(testSet.size() == initCapacity);
```
Please use junit assertions in tests.
```java
   */
  public String add(String profile, boolean retainOldest) {
    store.add(profile);
    if ( size.incrementAndGet() > maxCapacity ) {
```
Please remove spaces.
```java
        logger.error("Unable to archive {} profiles to {}", pendingArchivalSet.size(), archivePath.toString());
      }
    } catch (IOException e) {
      e.printStackTrace();
```
Please replace with proper exception...
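A sketch of what the reviewer is asking for: instead of `e.printStackTrace()`, either log the failure with context or rethrow it wrapped so the caller sees it. The class and method names here are illustrative, not Drill's actual code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative replacement for e.printStackTrace(): wrap and rethrow so the
// failure is not silently swallowed. In Drill this could instead be a
// logger.error(..., e) call with the relevant context.
public class ArchiveErrorHandling {
  static void archive(boolean simulateFailure) {
    try {
      if (simulateFailure) {
        throw new IOException("simulated archive failure");
      }
    } catch (IOException e) {
      throw new UncheckedIOException("Unable to archive profiles", e);
    }
  }
}
```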
```java
  private ProfileSet profilesSet;
  private PathFilter sysFileSuffixFilter;
  // private String mostRecentProfile;
```
Please remove.
@arina-ielchiieva I want this as a placeholder until the Hadoop Filter API improves in performance. There are TODOs marked for that purpose.
```java
  private Iterable<Entry<String, V>> iterableProfileSet;
  private Stopwatch archiveWatch;

  public LocalPersistentStore(DrillFileSystem fs, Path base, PersistentStoreConfig<V> config, DrillConfig drillConfig) {
```
I don't think that `LocalPersistentStore` should be responsible for archiving. Instead we can consider implementing some "Archivers" (dummy and non-dummy, for now) and init them based on the `PROFILES_STORE_ARCHIVE_ENABLED` property.
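The "Archiver" suggestion might look something like the following. This is a hypothetical sketch of the reviewer's idea, not Drill's actual interface: `LocalPersistentStore` would delegate to an implementation chosen at init time from the `PROFILES_STORE_ARCHIVE_ENABLED` property, with a no-op ("dummy") variant when archiving is disabled.

```java
// Hypothetical interface for the reviewer's "Archiver" suggestion.
// A real implementation would move the oldest profiles to the archive
// directory; the dummy variant does nothing when archiving is disabled.
public interface ProfileArchiver {
  // Archives eligible profiles; returns the number actually archived.
  int archive();

  // No-op implementation, selected when PROFILES_STORE_ARCHIVE_ENABLED is false.
  ProfileArchiver NOOP = () -> 0;
}
```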
```java
      throw new RuntimeException(e);
    }

    // No need to acquire lock since incoming requests are synchronized
```
Could you please point to the code which proves this statement?
The method services HTTP requests, which is reflected in the tests (and graphs above). Hence, a lock seemed unnecessary.
```java
      if (fs.isDirectory(archivePath)) {
        int archivedCount = 0;
        archiveWatch.reset().start(); //Clocking
        //while (archivedCount < archivalRate) {
```
Please remove...
```java
  private Stopwatch archiveWatch;

  public LocalPersistentStore(DrillFileSystem fs, Path base, PersistentStoreConfig<V> config, DrillConfig drillConfig) {
    super();
```
Please remove...
@arina-ielchiieva I've made the following changes:
@arina-ielchiieva @kkhatua would this make it into the 1.14 release?
@arina-ielchiieva I've done some additional changes, please review.
(force-pushed 10a12c7 to c4c6bec)
@priteshm this required some more rework, which I'm hoping I've addressed. We can review and try to get this in as part of 1.15.0.
(force-pushed c4c6bec to 78fc4d2)
(force-pushed 78fc4d2 to 7e34415)
(force-pushed 7e34415 to 8feabce)
When Drill displays profiles stored on the file system (Local or Distributed), it does so by loading the entire list of `.sys.drill` files in the profile directory, then sorting and deserializing them. This can get expensive, since only a single CPU thread does this. As an example, for a directory of 120K profiles, the time just to fetch the list of files is over 6 seconds. After that, the time varies with the number of profiles being rendered: an average of 30ms is needed to deserialize a standard profile, which translates to an additional 3 seconds for rendering the default 100 profiles.

A user-reported issue confirms just that:
DRILL-5028 Opening profiles page from web ui gets very slow when a lot of history files have been stored in HDFS or Local FS

Additional JIRAs filed ask for managing these profiles:
DRILL-2362 Drill should manage Query Profiling archiving
DRILL-2861 enhance drill profile file management

This PR brings the following enhancements to achieve that:
1. Mimic the in-memory persistence of profiles (DRILL-5481) by keeping only a predefined `max-capacity` number of profiles in the directory and moving the oldest to an 'archived' sub-directory.
2. Improve loading times by pinning the deserialized list in memory (a TreeSet, for maintaining a memory-efficient sorted order of the profiles). That way, if no new profiles have appeared in the profileStore (i.e. the profile directory) since the last web request for rendering the profiles, we can re-serve the same listing and skip the trip to the filesystem to re-fetch all the profiles. Reload and reconstruction of the profiles in the tree is done when any of the following states changes:
   i. Modification time of the profile directory
   ii. Number of profiles in the profile directory
   iii. Number of profiles requested exceeds the currently available list
3. When 2 or more web requests for rendering arrive, the WebServer code already processes the requests sequentially. As a result, the earliest request triggers the reconstruction of the in-memory profile set, and the last-modified timestamp of the profileStore is tracked. This way, the remaining blocked requests can re-use the freshly reconstructed profile set for rendering if the underlying profileStore has not been modified. There is an assumption here that profiles are not being added to the profileStore so fast as to trigger a reconstruction for every queued-up request.
4. To prevent frequent archiving, a threshold (max-capacity) is defined for triggering the archive. However, the number of profiles archived is chosen so that the profiles left unarchived amount to 90% of the threshold.
5. To prevent the archiving process from taking too long, an archival rate (`drill.exec.profiles.store.archive.rate`) is defined so that at most that many profiles are archived in one go before rendering resumes.
6. On a Distributed FileSystem (e.g. HDFS), multiple Drillbits might attempt to archive. To mitigate this, if a Drillbit detects that it is unable to archive a profile, it assumes that another Drillbit is also archiving, and stops archiving any more.
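The reload decision in point 2 above can be sketched as a simple staleness check. This is a hypothetical illustration of conditions (i)-(iii), with assumed field and method names, not Drill's actual code.

```java
// Hypothetical sketch of the reload decision: the cached, sorted listing
// is reused unless one of the three staleness conditions holds.
public class ProfileListingState {
  private final long lastModTime;       // profile dir mtime at last rebuild
  private final int lastProfileCount;   // profile count at last rebuild
  private final int cachedListingSize;  // size of the in-memory listing

  public ProfileListingState(long modTime, int count, int cachedSize) {
    this.lastModTime = modTime;
    this.lastProfileCount = count;
    this.cachedListingSize = cachedSize;
  }

  // True if the in-memory listing must be rebuilt from the profileStore.
  public boolean needsReload(long dirModTime, int dirProfileCount, int requested) {
    return dirModTime != lastModTime           // (i)   profile dir was modified
        || dirProfileCount != lastProfileCount // (ii)  profile count changed
        || requested > cachedListingSize;      // (iii) more profiles requested
  }
}
```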
1. Separated 'cache' impl into a separate class
2. Added test cases for #1
3. Rebased on top of master (DRILL-6386)
1. Refactored to introduce an Archiver
2. Allow the cache to apply only to the WebServer
3. For non-WebServer requests, like SysTables, support recursive listing. While archiving speeds up performance for the WebServer, SysTables would need access to archived profiles for analytics.
4. Added tests for the ProfileSet cache
1. Manage thrown exceptions
2. Sort profiles via the Java 8 streams API
Code was rebased to account for impact of fixes for DRILL-6053 (locking of PStore), DRILL-6422 (shaded Guava imports) and DRILL-6492 (schema/workspace insensitivity).
(force-pushed 8feabce to 4c8d2e0)
Closing this in favor of PR #1654
Note: Closed the old PR #755 and opening this.