Content popularity #3649
Comments
One way to handle the problem of conflicting responses is to do away with absolute measures of popularity and use relative measures instead. To put it another way, when we do "content popularity", we essentially want to sort torrents according to some function. Some sorting algorithms perform very well when the input consists of large runs of already sorted data. We can treat data from other nodes as this "half-sorted" input and finish the sort ourselves, doing as few checks ourselves as possible. In effect, a discrepancy in the data provided by other peers is treated as an error in the sort order, to be fixed by applying a sorting algorithm, with real-world data getting the last word.
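To make the "finish the sort ourselves" idea concrete, here is a minimal sketch using insertion sort, which is cheap on nearly sorted input. The names `finish_sort` and `local_seeders` are hypothetical and not part of Tribler, and the sketch assumes that each comparison is backed by a local health check of the torrents involved.

```python
from typing import Callable, Dict, List, Tuple


def finish_sort(
    peer_order: List[str],
    local_seeders: Callable[[str], int],
) -> Tuple[List[str], int]:
    """Insertion-sort a peer-supplied ranking (most seeders first).

    `local_seeders` is a hypothetical callback that performs our own check
    for one infohash. Returns the corrected order and the number of
    comparisons that were needed.
    """
    cache: Dict[str, int] = {}

    def check(infohash: str) -> int:
        # Check each torrent locally at most once.
        if infohash not in cache:
            cache[infohash] = local_seeders(infohash)
        return cache[infohash]

    result = list(peer_order)
    comparisons = 0
    for i in range(1, len(result)):
        j = i
        while j > 0:
            comparisons += 1
            if check(result[j - 1]) >= check(result[j]):
                break  # this element is already in place
            result[j - 1], result[j] = result[j], result[j - 1]
            j -= 1
    return result, comparisons
```

With n torrents, an ordering that is already correct costs n - 1 comparisons, and each misplaced pair adds one more, so a mostly honest peer ordering stays close to the minimum.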
Thank you for the suggestion. It would apply nicely to sorting multiple responses for multiple torrents. However, the current issue is different: we are talking about multiple health responses (number of seeders and leechers) for a single torrent, obtained from multiple peers. The real question is whose response we trust more when deciding whether to update our local database. One approach currently under consideration is to base the decision on 1) the freshness of the response and 2) the trust score of the sending peer from trustchain.
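A minimal sketch of how freshness and trust could be combined into one decision is shown below. Everything in it is an assumption for illustration: the `HealthResponse` fields, the exponential freshness decay with a one-hour half-life, the trust score normalised to [0, 1], and the `min_weight` threshold are not taken from Tribler or trustchain.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class HealthResponse:
    peer_id: str
    seeders: int
    leechers: int
    timestamp: float  # when the peer performed its check (seconds)
    trust: float      # assumed: trust score normalised to [0, 1]


def response_weight(r: HealthResponse, now: float, half_life: float = 3600.0) -> float:
    """Freshness halves every `half_life` seconds and is scaled by trust."""
    age = max(0.0, now - r.timestamp)
    freshness = 0.5 ** (age / half_life)
    return freshness * r.trust


def pick_response(
    responses: Iterable[HealthResponse],
    now: float,
    min_weight: float = 0.1,
) -> Optional[HealthResponse]:
    """Return the response to store locally, or None if none is credible."""
    best = max(responses, key=lambda r: response_weight(r, now), default=None)
    if best is None or response_weight(best, now) < min_weight:
        return None
    return best
```

Multiplying the two factors means a very fresh response from an untrusted peer and a stale response from a trusted peer are both discounted; a weighted sum would be an equally plausible choice.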
@xoriole, my suggestion solves this problem. Whenever you see a conflict in orderings, you check it yourself. The only alternative is to use the trust score. But even when using the trust score, you will have to do the check at some point, to catch cheaters and add your share of knowledge to the "swarm mind". The most conflicted opinions would get the most attention and re-checking.
It's just like selective checks in a supermarket or in transport.
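One possible way to decide which torrents deserve a re-check first is to count how often peers disagree about their relative order. The sketch below is an illustration under assumptions: `conflict_scores` is a hypothetical helper, each ranking lists every torrent at most once, and a pair counts as disputed once for every peer on the minority side.

```python
from collections import Counter
from itertools import combinations
from typing import List


def conflict_scores(rankings: List[List[str]]) -> Counter:
    """Score each torrent by how many pairwise orderings are disputed.

    Each ranking is one peer's ordering, most popular first.
    """
    before = Counter()
    for ranking in rankings:
        # combinations() keeps list order, so `a` is ranked above `b`.
        for a, b in combinations(ranking, 2):
            before[(a, b)] += 1

    scores = Counter()
    for (a, b), n_ab in before.items():
        n_ba = before.get((b, a), 0)
        if a < b and n_ba:  # visit each disputed unordered pair once
            disputed = min(n_ab, n_ba)
            scores[a] += disputed
            scores[b] += disputed
    return scores
```

Calling `conflict_scores(rankings).most_common(k)` then gives the `k` torrents to check locally first, which mirrors the selective-check analogy.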
It is usually better to ask for factual and verifiable data. Relative rankings are difficult to verify, which makes it hard to catch a spammer in a lie. Exact swarm data is better in this case, I believe.
I spent some time on this (or a similar) problem at BitTorrent many years ago. We eventually gave up once we realized how hard the problem was. Specifically, we tried to pass around, via gossip, which swarms are the most popular. Since the full set of torrents is too large to pass around, we ended up with feedback loops, because the ones that were considered popular early on got disproportionate reach. Anyway, one interesting aspect that we were aiming for was to create a "weighted" popularity, based on what your peers in the swarms you participated in thought was popular. In a sense, "what is popular in your cohort".
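The comment does not say how the weighting worked, so the following is only one guess at what a cohort-weighted popularity could look like. It assumes each peer reports the swarms it participates in and the torrents it considers popular, and it weights a peer's opinion by the Jaccard overlap between its swarms and ours. `cohort_popularity` and the report format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


def cohort_popularity(
    my_swarms: Set[str],
    peer_reports: Dict[str, Tuple[Set[str], Set[str]]],
) -> Dict[str, float]:
    """Score torrents by what peers with similar swarms consider popular.

    `peer_reports` maps peer id -> (swarms the peer is in, torrents it
    considers popular). Peers that share no swarm with us are ignored.
    """
    scores: Dict[str, float] = defaultdict(float)
    for peer_swarms, popular in peer_reports.values():
        union = my_swarms | peer_swarms
        if not union:
            continue
        weight = len(my_swarms & peer_swarms) / len(union)  # Jaccard overlap
        if weight == 0:
            continue
        for infohash in popular:
            scores[infohash] += weight
    return dict(scores)
```

Because the weights depend on our own swarm membership, two nodes can legitimately arrive at different rankings from the same reports.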
Content Popularity
We have channels and torrents which comprise/represent content within Tribler. However, we do not yet have a mechanism to check whether content is popular, alive or dead. This ticket serves as the home for tracking development work on content popularity.
Parent issue #2783
The basic idea is to get a simple implementation operational first and incrementally build upon it, starting with torrent popularity.
Torrent popularity
Check a set of torrents (max: 25) using the torrent checker and gossip the popular results to a set of connected peers (max: 25) at a fixed interval.
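The selection step of one gossip round could be sketched roughly as follows. This is not the Tribler implementation: `select_gossip` is a hypothetical helper, "popular" is assumed to mean at least one seeder with results ordered by seeder count, and target peers are assumed to be picked at random. Running the torrent checker and scheduling the round at a fixed interval are left out.

```python
import random
from typing import Dict, List, Optional, Tuple

MAX_TORRENTS = 25  # limit on torrents per round, from the description above
MAX_PEERS = 25     # limit on peers per round, from the description above


def select_gossip(
    checked: Dict[str, Tuple[int, int]],
    peers: List[str],
    rng: Optional[random.Random] = None,
) -> Tuple[List[Tuple[str, int, int]], List[str]]:
    """Pick what to gossip and to whom for a single round.

    `checked` maps infohash -> (seeders, leechers) as returned by our own
    health check.
    """
    rng = rng or random.Random()
    # Assumed popularity rule: alive torrents, most seeders first.
    alive = [(ih, s, l) for ih, (s, l) in checked.items() if s > 0]
    alive.sort(key=lambda item: item[1], reverse=True)
    payload = alive[:MAX_TORRENTS]
    # Sample without replacement so no peer is contacted twice per round.
    targets = rng.sample(peers, min(MAX_PEERS, len(peers)))
    return payload, targets
```

Passing a seeded `random.Random` makes the peer selection reproducible, which is convenient in tests.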
Channel Popularity
What are the indicators of a popular channel?
These parameters can be used to derive a popularity index/score (very simple, initially), which can be used to rank the channels and disseminate the most popular ones to peers at a regular interval.
Issue to address: How do we handle conflicting responses about the same torrent or channel from multiple peers?