Skewed revenue share #2
Comments
@dimitri-xyz - hi again! i'm not sure i agree. if 10x as many people go to site X as go to site Y, would we not expect site X to get credited 10x compared to site Y? sites that get more page views/durations should get credited more than sites that get less. that doesn't mean that small sites don't get credited, but it should be "relatively" proportional. does this make sense? what is the specification missing that would lead someone to think otherwise? thanks! /mtr
Are you using "duration" to weigh ad impressions? Consider the two sites:
Imagine personalblog.com is from a small blogger who writes one new in-depth article a week. The articles are long, so they take a while to write. At the same time, the quality keeps readers coming back for more. Assume personalblog.com has a significant readership of 50K readers. Because readers only go there once a week, it is unlikely that personalblog.com is on any of its readers' top 20 lists. At the same time, the site generates at least (50K readers x 4 weeks =) 200K impressions a month. This can lead to meaningful income. Is Brave, if successful, going to drive this blog out of business?

On the other end of the spectrum, we have buzzfeed.com, which is awash with gossip and light articles with little content, but which readers go to on a daily basis for a quick break. Because of this daily use, buzzfeed.com may be on many readers' top 20 lists.

My point is that the current proposal skews payments away from sites that do not make it to a reader's top 20. Here is another way to look at it: by using the "top 20 lists" we are penalizing "eventual use" publishers (like personalblog.com, who tend to be small) in favor of "continuous use" publishers (like buzzfeed). We are steering the content of the web to be less diverse by using this revenue distribution model. A linear (per impression) revenue distribution model would not have this effect.

I hope this makes it clearer! :-)
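To make the skew concrete, here is a minimal sketch comparing the two distribution models for a single reader. The function names, the visit counts, and the cutoff of 1 (standing in for "top 20") are all hypothetical and chosen only for illustration; this is not the code Brave uses.

```javascript
// Hypothetical helpers; visit counts below are illustrative, not measured data.

// Share of one reader's payout per site when only the N most-visited sites count.
function topNShares(visits, n) {
  const top = Object.entries(visits)
    .sort((a, b) => b[1] - a[1])
    .slice(0, n);
  const total = top.reduce((sum, [, count]) => sum + count, 0);
  const shares = {};
  for (const site of Object.keys(visits)) shares[site] = 0; // sites outside the top N get nothing
  for (const [site, count] of top) shares[site] = count / total;
  return shares;
}

// Share per site when every impression counts equally (linear, per-impression model).
function linearShares(visits) {
  const total = Object.values(visits).reduce((sum, count) => sum + count, 0);
  const shares = {};
  for (const [site, count] of Object.entries(visits)) shares[site] = count / total;
  return shares;
}

// One reader's month: a daily-use site and a weekly blog.
const visits = { 'buzzfeed.com': 30, 'personalblog.com': 4 };

console.log(topNShares(visits, 1));
console.log(linearShares(visits));
```

With the cutoff, the weekly blog's share for this reader is zero; under the linear model it keeps a share proportional to its four visits.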
let me make a request for some help!
what do you think a formula that returns a weight for each publisher might look like? many thanks! /mtr
Here’s one suggestion. (TL;DR: Go to Default Policy section)

First, the reward should be per page and independent of the grouping of pages per publisher. Visiting 100 pages from 100 different publishers or 100 pages by the same publisher should give out the same reward in either case. It should not matter how the publishers are organized, just how much time and attention each set of pages gets from users. A user may assign more value to an article because it is from a reputable source such as the New York Times, but the payment system itself should be agnostic to such metadata.

To make sure we are on the same page (pun intended ;-), let me first get two implicit assumptions out of the way.

User consumption

Different users spend different amounts of time on the web and each user may “consume” more or less of it in different months. Assume that user A uses the web for 100 hours in a month and that users B and C use it for 50 hours each. Shouldn’t user A have about as many “reward points” to give out as users B and C added together? Shouldn’t user A have about twice as many reward points to give out as user B?

I will call the conjecture that by spending twice as much time on the web user A has about twice as many reward points to give out as user B proposition “double time = double rewards” (DTDR for short). DTDR assumes that A will have gone to about twice as many pages as B or spent about twice as much time on each page (but not both). The number of available “reward units” given out by a single user each month may change according to how much he used the web that month (his “web consumption”).

In saying that “the client has 100,000 votes to spend each month”, I assume you are normalizing the calculation to be independent of how much the user surfed the web in a particular month.

User value

From the point of view of a publisher, users are not all the same. Different users may have different dollar amounts associated with their personas.
This may be because a particular user:
In fact, users may want to signal to publishers what experience they want by either changing how many dollars each of their “reward units” is worth or by giving out more reward units.

In saying “the client has 100,000 votes to spend each month” we are also normalizing with respect to user value. We may want to make it so that the average “user value” is 100,000 units per month, but I don’t think we should force all users to have the same value, even if they use the web for about the same amount of time. It seems more flexible to be able to give different users different amounts of reward points. This allows publishers to think of reward points as dollars and to tune their content to maximize revenue (rather than a “per unique user” weighted measure).

User configured reward policy

I propose we model the “reward votes” as putting a value on a user’s attention span. In other words, if an article grabbed a user’s attention for longer, it should be better rewarded. If a site got more distinct hits, it should be rewarded more. However, it is not immediately clear what the shape of the allocation function should be.

Different users may have different use cases. For some users, any page which they visit for less than 5 seconds (and do not click on) should be considered an annoyance and not rewarded. For these users, such pages are just annoying pop-up ads they couldn’t block fast enough. These users may also want to disproportionately reward sites on which they spent more than 2 minutes. On the other hand, some users may just want to quickly skim the web for news headlines and then get on with their lives. For these users, having a page open for 3 minutes means they forgot to close it.

Because of the very different expectations, I think different users should be able to change how they allocate rewards based on the trade-off between time spent on the page and number of pages visited.

Default policy

What should the default setting be then? I suggest the following policy:
more precisely:
Pages are rewarded based on how much time a user spent on them, but with diminishing returns. There is also a minimum page load time, before which no reward is given at all, and a click bonus which rewards pages that are so good the user is able to quickly find what they are looking for. Rewards are given per page and then just aggregated per publisher. The time spent on the page should be “with active focus” to prevent tabs which are left open for 5 days from getting huge rewards. The “load time” should prevent pages which turned out to be “mistakenly loaded” from getting rewards.

Following the argument above, I believe it is very important to allow each user to change this formula according to her preference. I also think users should be able to change how many reward votes they will give out each month based on their total web usage and how much money they want to spend to improve their web experience by just "buying" more or less "reward votes".
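A rough sketch of what such a default policy could look like in code follows. The square-root curve, the 5-second threshold, the size of the click bonus, and every name here are assumptions made for illustration; the comment above only fixes the qualitative shape (minimum time, diminishing returns, click bonus, per-page rewards aggregated per publisher).

```javascript
// Sketch only: the curve and all constants are assumptions, not the proposed formula.
const MIN_FOCUS_SECONDS = 5; // assumed threshold below which an unclicked page earns nothing
const CLICK_BONUS = 1;       // assumed bonus for pages the user clicked on

// Reward for a single page view, based on time with active focus.
function pageReward({ focusSeconds, clicked }) {
  // Pages closed quickly and without a click are treated as unwanted loads.
  if (focusSeconds < MIN_FOCUS_SECONDS && !clicked) return 0;
  // Diminishing returns: four times the focus time earns only twice the reward.
  const timeReward = Math.sqrt(focusSeconds);
  return timeReward + (clicked ? CLICK_BONUS : 0);
}

// Rewards are computed per page and only then summed per publisher.
function publisherRewards(pageViews) {
  const totals = new Map();
  for (const view of pageViews) {
    totals.set(view.publisher, (totals.get(view.publisher) || 0) + pageReward(view));
  }
  return totals;
}
```

Because the reward is computed per page, splitting the same pages across more or fewer publishers does not change the total paid out, which is the property argued for earlier in this comment.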
@dimitri-xyz - hi again! if you have a few minutes, could you take a look at the code starting at https://github.com/brave/ledger-publisher/blob/master/index.js#L126 and let me know what additional parameters / formulas should be used. for now, we can assume that the duration parameter is the number of milliseconds that the tab had focus. many thanks!
My suggestion is that we have diminishing rewards for the amount of time spent on a page. This means the reward function must be a concave function of the duration. You currently have a linear model. The score is a linear function of the duration:
in other words
There are many utility functions in the literature. I suggest changing the linear model into a quadratic model. In other words, into a “parabola on its side”.

First, zero
Your linear model gives a reward of

current linear model:
minDuration = 2000
what happens:
user spends 0 seconds on the page => reward = 0

After each extra 30 seconds the user spends in the page, the publisher gets an extra reward point (no diminishing returns). I will call this the “extra reward” time

We can use the first 3 points to calibrate a quadratic model.

quadratic model:
minDuration = 2000
user spends 0 seconds on the page => reward = 0

For the model
To give the publisher an extra reward point the user has to spend longer and longer on the page.

what happens:
The user has to spend an extra 26 seconds at each new interval for the publisher to get the same amount of extra reward points (diminishing returns). To use this quadratic model we solve for a and b based on the parameters you have (with duration in seconds and assuming visitWeight = 1):
and use the quadratic formula to find the rewards. This yields:
Note 1: The rewards should be diminishing and the concave function should not be bounded, but I don’t think this is the ideal way to calibrate the model. A better way to calibrate it would be to assume users’ behavior is optimal and maximizes their utility and use data such as this:

Note 2: I still think you should not use a function such as
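The quadratic calibration described in this comment can be sketched as follows. The calibration points (0 s, 0), (2 s, 1) and (30 s, 2) are an assumption: they are consistent with minDuration = 2000 ms, visitWeight = 1 and the 26-second increments mentioned above, but they are not stated explicitly in the thread. The function names are hypothetical, and returning 0 below the minimum duration is also an assumption.

```javascript
// Model ("parabola on its side"): duration = a * reward^2 + b * reward, duration in seconds.
const MIN_DURATION_SECONDS = 2; // assumed from minDuration = 2000 ms

// Solve for a and b from two assumed calibration points:
//   reward 1 at d1 seconds:  a + b   = d1
//   reward 2 at d2 seconds:  4a + 2b = d2
function calibrate(d1, d2) {
  const a = (d2 - 2 * d1) / 2;
  const b = d1 - a;
  return { a, b };
}

// Invert the model with the quadratic formula, taking the positive root.
function quadraticReward(durationSeconds, { a, b }) {
  if (durationSeconds < MIN_DURATION_SECONDS) return 0; // assumed: too short to count
  return (-b + Math.sqrt(b * b + 4 * a * durationSeconds)) / (2 * a);
}

const params = calibrate(2, 30); // a = 13, b = -11 under the assumed points

for (const seconds of [2, 30, 84, 164]) {
  console.log(seconds, quadraticReward(seconds, params));
}
```

Under these assumptions the reward reaches 1, 2, 3 and 4 points at 2, 30, 84 and 164 seconds, so each interval is 2a = 26 seconds longer than the previous one.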
@dimitri-xyz - thanks for the note. you may find the (new) documentation a bit helpful -- https://github.com/brave/ledger-publisher#page-visits -- but it's pretty clear that you already understand the existing code! the question i have for the formula above is whether visitWeight should simply go away as a parameter. should the score be different if the total duration is the same but the number of visits goes up or down? i suspect that if i spent the same amount of time at two publishers, but did only 1 visit for publisher #1 and 3 visits for publisher #2, then the score for publisher #1 should be higher somehow. what do you think? note 1: that is a fascinating study! note 2: i agree.
@mrose17
Yes, as it is superfluous. Look at your current formula
I would like to make two points here. First, a publisher is just a set of pages and I believe how we partition all the pages into different publishers should not influence how much money each publisher gets paid. That way, publishers have no incentive to claim to be a single entity (or multiple ones) just to increase their revenue. However, there is a trade-off in time spent. So, I would recast your question as: Does 60 seconds spent in 2 pages (30 seconds in each) provide the same

The answer to this question is that:
Notice that the linear model you proposed does not pay the same in either case. It needs to be tweaked. Also, the payments given to (2 pages for 30 seconds) vs (1 page for 60 seconds) are independent of how these pages are grouped by publisher. Hope this helps!
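The mismatch is easy to check with a small sketch. The form of the linear score used here (visitWeight plus one point per 30 seconds, with visitWeight = 1) is an assumption based on the parameters mentioned earlier in the thread, and the names are hypothetical.

```javascript
// Assumed linear model: score = visitWeight + durationSeconds / secondsPerPoint.
const VISIT_WEIGHT = 1;        // assumed, per "visitWeight = 1" earlier in the thread
const SECONDS_PER_POINT = 30;  // one extra reward point per 30 seconds

function linearScore(durationSeconds) {
  return VISIT_WEIGHT + durationSeconds / SECONDS_PER_POINT;
}

// Same 60 seconds of attention, split differently across pages.
const twoShortVisits = linearScore(30) + linearScore(30);
const oneLongVisit = linearScore(60);

console.log({ twoShortVisits, oneLongVisit });
```

Under this assumed formula the two 30-second pages earn 4 points in total while the single 60-second page earns 3, and the result is the same whether the two pages belong to one publisher or to two.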
@dimitri-xyz - thanks for the reply. sorry to be so late in replying. regrettably, github doesn't notify me via email for some reason... sigh. i agree regarding the formula. i have put a new version on github -- https://github.com/brave/ledger-publisher/blob/master/index.js#L223 this version of the code treats each visit as independent when applying the scoring formula. my thinking is that this fits with the concave model. if you disagree, please let me know... thanks!
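For readers following along, "each visit is independent" can be illustrated with the sketch below. It is not the code at the link above: the coefficients (a = 13, b = -11, in seconds), the 2-second minimum, and all names and example domains are assumptions carried over from the quadratic model discussed earlier.

```javascript
// Assumed concave model in seconds: duration = A * score^2 + B * score.
const A = 13;
const B = -11;
const MIN_DURATION = 2; // assumed minimum duration, in seconds

function concaveScore(durationSeconds) {
  if (durationSeconds < MIN_DURATION) return 0;
  return (-B + Math.sqrt(B * B + 4 * A * durationSeconds)) / (2 * A);
}

// Score each visit on its own, then add the scores up per publisher.
function scoreByPublisher(visits) {
  const totals = {};
  for (const { publisher, durationSeconds } of visits) {
    totals[publisher] = (totals[publisher] || 0) + concaveScore(durationSeconds);
  }
  return totals;
}

// Both publishers receive 84 seconds of attention in total.
const totals = scoreByPublisher([
  { publisher: 'one-visit.example', durationSeconds: 84 },
  { publisher: 'three-visits.example', durationSeconds: 30 },
  { publisher: 'three-visits.example', durationSeconds: 30 },
  { publisher: 'three-visits.example', durationSeconds: 24 },
]);

console.log(totals);
```

With a concave per-visit score, the same total time split over three visits adds up to more than one long visit, so under these assumptions the three-visit publisher ends up with the higher total.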
I think we're making a very big assumption on page view time. If I spend 30 minutes on a page with one paragraph (maybe referencing it while doing something else), does that mean it should have more weight than a long article that I read without interruption in less time? If I'm a fast reader, should video content have more weight just because it takes me longer to view it?

What is the optimal distribution anyway? Ads are much easier because the value is determined by a market, but how do we value content without a market? Is it the value that the user places on the content? The value that the publisher places on the content? Do we determine value by fiat? I don't think we have good answers to any of these questions right now.

While all of the discussion here is very good, it's based on a (so far) unproven premise that time on page is a good way to measure content value. I think this is valuable data to collect and analyze, but I think it's too early to put it into practice.
If we want the reward given to be independent of page view time, let's just make

However, you are right that there might be other factors that are more important to take into account. From the examples you mentioned, it seems that the metadata associated with a page (e.g. Is it a video? If it is text, is it long? Does it have pictures? etc) can be very useful in providing more information about the page's value.

I think users will have wildly different expectations about how Brave should reward a page. As I said in a previous comment on this thread, I think it is very important that we make this customizable.
One aspect of page view time that I think we should consider is the minimum duration. Anything less than a few seconds should be filtered out as a redirect or otherwise unwanted navigation.

Several things have changed since this thread was started, but the most important one is that we have abandoned the top N list for several reasons. First, it completely leaves out pubs who get a lot of visits in aggregate, but very few from any single user. You alluded to this problem above as well. The second reason we moved away from the top N list is privacy. It exposes too much information about a single user, and increasing the number of sites would make the problem much worse.

The current privacy model gives users 1 "reward" per day. A single site is randomly selected from a list of site visits over the last 30 days. Some sites will be represented more often in the list based on either raw visit counts or some other potential weighting algorithm. If the monthly payment is $5, sites that should make $100/month or more will get at least 90% of that amount from the random sampling. The daily submissions are independent and unlinkable to other daily submissions from the same user.

This could still map to what you've outlined, but we would randomly select one of the 100K reward points and send only the site that reward point was linked to.
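The daily selection described above amounts to weighted random sampling. Here is a minimal sketch of that idea; the function name, the example domains, and the use of raw visit counts as weights are assumptions, and nothing here reflects how the submission is actually made unlinkable.

```javascript
// Pick one site with probability proportional to its weight.
// rng is injectable so the sketch can be exercised deterministically; it defaults to Math.random.
function pickSite(weights, rng = Math.random) {
  const entries = Object.entries(weights);
  const total = entries.reduce((sum, [, weight]) => sum + weight, 0);
  if (total <= 0) return null; // nothing to sample from

  let threshold = rng() * total;
  for (const [site, weight] of entries) {
    if (threshold < weight) return site;
    threshold -= weight;
  }
  return entries[entries.length - 1][0]; // guard against floating-point drift
}

// Assumed weights: visit counts (or reward points) over the last 30 days.
const last30Days = { 'a.example': 3, 'b.example': 1 };
console.log(pickSite(last30Days));
```

Over many independent daily draws, each site's expected share of rewards matches its share of the weights, which is how a per-page reward-point scheme could be mapped onto one submission per day.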
@dimitri-xyz - a follow-up: the next release of the ledger-publisher package will support multiple scorekeepers. initially, there are two

could i trouble you for a one or two sentence description for the package's README file? many thanks!
sounds like
assuming we are using the same definition of
The current version of the documentation is (in my view) not yet sufficiently clear. I had a very hard time making sense of it. So, this is a preliminary assessment. A more formal outline would be useful.
The proposed format of the browsing summary report is: "a list of the top N most visited sites". This summary format skews the revenue share to the most successful sites. In other words, small publishers that generate value for the web do not monetize any of the value they generate. The value generated by all web publishers will be disproportionately monetized by the large sites (i.e. large publishers).
I understand that reporting every single site visited compromises privacy. I believe better crypto tools may need to be used here.