-
-
Notifications
You must be signed in to change notification settings - Fork 19
Vessel
Vessel is the FarNet module for Far Manager providing heuristically improved file and folder history lists (smart histories, self-organizing lists) and related UI tools for opening files, navigating to folders, and etc.
See the details about the module in its README.
This topic provides some technical details.
The name originally comes from "View/Edit/Save/SELect history".
Vessel records history of file and folder operations invoked from their smart lists and uses this information for optimizing these lists. Items having more chances to be used soon are closer to the top. These chances are calculated during analysis of item use patterns in the past.
Vessel implements an advanced self-organizing list.
Simple self-organizing list techniques:
- Move to front method (Far Manager history lists use it)
- Count method (partially used in Vessel on ranking)
- Transpose method
Our current model is non-parametric based on combination of simple methods and usage patterns extracted from the history. We assume that "interesting" items tend to be used periodically. We infer and use these periods dynamically.
Items are ranked and sorted as described below.
Rank 0
The most recently used items (2 last hours) are sorted by times. This is the "move to front method".
Rank 1
Older items are sorted by evidences of using them the past for approximately same idle times as items have now. The whole time line is split in several spans on logarithmic scale (4, 8, 16, 32... hours).
For the given item idle time we get its span log2(idle hours)
and count
recorded item uses for this span. This count is the rank used for sorting.
Rank 3
If the items are still equal by ranks then the "count method" is used. But literal use counts are not effective. Imagine one day an item was used many times and then became not used (very typical scenario). We do not want such items to be prioritized.
So we count and compare the number of days when an item was used at least once. For our application this looks more practically useful, intuitively and per experiments.
Rank 4
The remaining items are sorted by last used times. It is the "move to front method" again, only moving to the top of items with the same ranks.
Further tweaks are not useful. Experiments show that in our domain (file and folder history) many items in this group are never used again. In Vessel they eventually become aged first and removed from tracking later.
It is good. It is not necessarily optimal. But it is definitely much better that sorting by last used times and this is supported by numbers. Use Vessel "Train history" and "Train folders" commands in order to estimate the gain.
How training comparison works. For each history record X we build the smart list sorted by ranks and the normal list sorted by times. The lists are built from the previous items in history. And we know that the next item to be used is X. So we compare this item positions in lists. In most cases the ranked list wins, i.e. the item is much closer to the top of the ranked list. Normally there are just a few losers with not that significant position shift.
It is worth to mention that in the beginning of history records the gain is low and the lists are basically sorted by times. But the more you use these lists the smarter they become.
The following picture represents a snapshot of history records. The axis X is
items (files or folders). The axis Y is the logarithmic time (log2(hours)
)
and idle spans. Dots are recorded uses of items with their idle times, i.e.
times since the previous uses.
We can see that different items exhibit different use patterns. These patterns are taken into account by the proposed ranking. Note that item priorities may increase over time even if items remain not used. For example, if an item is used somewhat daily or weekly then it will have higher priority somewhat daily or weekly and lower priority in the spans between.
The following picture is just the research history. It represents training results of the originally used less effective parametric model with two factors.