Oz Blog News Commentary

Paying for Data

September 23, 2019 - 23:53 -- Admin

In the New York Times, there is a video opinion piece from Jaron Lanier which makes the case for finding a way for consumers to be paid for their data. I really enjoyed the accessibility of this piece as I think it helped make a clearer case. But I found myself with some big questions and so wanted to put those out there.

The opinion is based on Lanier’s work with Glen Weyl on data dignity. The basic idea is familiar: when you do stuff on the Internet, you are often sharing that ‘data’ with platforms who then use it (a) as an input to target ads towards you and (b) as a means of building better algorithms to target ads to others. There are three broad classes of beneficiaries from this data being made available. First, there are merchants who buy ads. Second, there are consumers who get better ad matches (although I will warrant there is a debate about this one). Finally, there are the platforms who do the matching. Lanier argues that the current split of “who gets what” is not the best one (I would presume for social welfare) and that there is a second-order effect: if the shares aren’t right, then the parties do not have the right incentives. In particular, you don’t want too much going to entities that aren’t doing much; the undertone is that these entities are Facebook, Amazon and Google.

What is missing from the current world, Lanier argues, is a proper data market. This isn’t just something that platforms trade with each other but instead that explicitly involves consumers being paid for that data. The video is a bit short on how that might happen but, from their other work, what they argue for is a clearer articulation of consumer rights to withhold data from platforms (and beyond) and also a better technological means for them to do so. In other words, it is to give consumers property rights and a more cost-effective means of defending them.

[By the way, it is not like there aren’t private entities trying to do this right now. I installed iOS 13 this week on my iPhone and it has been telling me quite explicitly which apps are doing what with my data. In particular, it has prompted me, explicitly with a notification, to think about whether I want ‘Dark Sky’ or ‘Google’ to know my location when I am not using the app. (It was yes for the former and no for the latter, but I wouldn’t have set that without a prompt from Apple.) My guess is that each would like me not to make a quick decision at that point but to be allowed to make a case for particular decisions. This isn’t necessarily a market but it is at least a clear articulation of benefits. Right now, I suspect Apple doesn’t know how to do that in a fair manner and so we have a blunt instrument. If Google wants my data, it will need to make a special request of me for it at some point.]

If you can do this, then Lanier believes that a market can emerge organically. Like all markets with many transactions, you won’t have to decide much as a consumer as to how to negotiate and collect payments, but intermediaries will crop up to do that. They term them MIDs or “Mediators of Individual Data.” I couldn’t tell if these would just emerge (in which case, wouldn’t they just be other tech companies?) or if they would somehow be the creation and output of a regulatory infrastructure (like banking?) but, regardless, they would make the market.

This is all very interesting but, even aside from the challenges in doing all of this, there is one big issue that I can’t look away from: pricing. How is price going to be determined for each bit of data? The reason this is complicated is that there are two classes of data — articulated above — that Lanier puts together in terms of how consumers can be paid.

For your input data (that is, how your own data tells someone what sort of ads to show you), there is a more direct relationship between yourself, the platform and the merchant. As I wrote about some time ago, that makes it really complicated to think about payments for data at all. If you are being paid for your data, then that payment is coming from someone else. If it is coming from the merchant, then it is part of their cost and, if they are selling you stuff, that becomes part of your price. In other words, it is complicated and it is far from clear that you giving your data freely, in this case, is a problem.

How our data is used to build algorithms for platforms to sell to others is another matter. Today, however, we have some help on that matter from a new paper from Daron Acemoglu, Ali Makhdoumi, Azarakhsh Malekian, and Asuman Ozdaglar. I’m not going to do justice to their work but my paraphrase of their argument is that when it comes to building algorithms, your data to one platform is substitutable for data you have given to another platform and for data from other people. You may be unique as a datapoint but in a statistical pool, the whole point is to make you non-unique if you are building a robust predictive algorithm. That means you aren’t of much value, which means the price of your data will be low. Acemoglu and his co-authors show that it might be too low, which is one of the reasons why Lanier thinks that some collective negotiation of data payments is required.
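To get a feel for why one person's data might fetch a low price in a large pool, here is a toy sketch. It is my own illustration and not the model in the Acemoglu et al. paper: it assumes the platform is simply estimating an average from independent data points, so that the variance of its estimate is sigma2 / n. The function names and the sigma2 parameter are made up for the example. The "value" of one more person is taken to be how much their data reduces that variance.

```python
def estimate_variance(n, sigma2=1.0):
    """Variance of a sample mean built from n independent data points.

    Assumes each person's data has the same variance sigma2 (a toy assumption).
    """
    return sigma2 / n


def marginal_value(n, sigma2=1.0):
    """Reduction in estimate variance from adding one more person to a pool of n."""
    return estimate_variance(n, sigma2) - estimate_variance(n + 1, sigma2)


if __name__ == "__main__":
    # The marginal contribution shrinks roughly like 1 / n^2 as the pool grows.
    for n in (10, 1_000, 1_000_000):
        print(f"pool size {n:>9,}: marginal value {marginal_value(n):.3e}")
```

In this simple setting the contribution of the next person falls off quickly as the pool grows, which is the intuition behind a low individual price. It says nothing about whether that price is too low from a social point of view, which is the harder question the paper addresses.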

But herein I have another worry. We have seen this before. In the setting of prices for music royalties, we have a clear set of property rights, a set of collection agencies, a regulatory structure and a periodic ton of arguments amongst economists as to what the ‘right’ price is. Having participated in this fight recently, let me tell you that it is hard even to get price structure right let alone pricing levels. If Lanier gets his way, I’d be willing to participate in the fight over data pricing but my guess is that making a claim for prices that will pay households $20,000 a year (and that is what Lanier and Weyl suggest is possible) is another thing. To use an Australian phrase, “tell ’em they’re dreaming.” You can write music and not get that.