Few will have failed to notice that when people talk about data being collected, they often describe it in physical terms. A recent article in the Guardian titled ‘I asked Tinder for my data. It sent me 800 pages of my deepest, darkest secrets’ is a good example: it describes the amount of data collected not in gigabytes but in printed pages. That amount is easy to visualize. Imagine something the thickness of a copy of Anna Karenina and suddenly the data held becomes real and tangible, where before it was almost an abstract concept.
The Guardian article was written by Judith Duportail, who opened the app 920 times and matched with 870 people. There are people who will have opened the app 10 times as often and matched with 10 times as many people. Does that mean there are 8,000 sheets of paper for them? If Tinder, an app limited to the purpose of dating, is collecting roughly 0.86 pages of data every time you use it, how much does Google collect, and how many physical pages would that fill? On an average day I would guess I use Google 50 times to fact check, find stories, or just check the weather. If it collected data at the same rate as Tinder (which is a very conservative estimate), that would come to 43 pages a day, 301 pages every week, or 15,652 pages per year. To put that number in perspective, it would be around 15 copies of War and Peace stacked on top of one another. Printed on regular printing paper, it would stand over 1.5 metres high. That’s taller than Dolly Parton.
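For anyone who wants to check the back-of-the-envelope estimate above, here is a minimal sketch of the arithmetic in Python. The 800 pages, 920 opens and 50 uses a day come from the text. The sheet thickness of about 0.1 mm, one page per sheet, is an assumption added here for the stack height, and the function names are purely illustrative.

```python
import math

# Figures taken from the article.
PAGES_RECEIVED = 800       # pages of data Tinder sent
APP_OPENS = 920            # times the app was opened
GOOGLE_USES_PER_DAY = 50   # the author's own guess

# Assumption (not from the article): a sheet of regular printing
# paper is roughly 0.1 mm thick, printed one page per sheet.
SHEET_THICKNESS_MM = 0.1


def pages_per_use(pages: int, opens: int) -> float:
    """Pages collected per use, truncated to two decimals as in the article."""
    return math.floor(pages / opens * 100) / 100


def projected_pages(rate: float, uses_per_day: int, days: int) -> int:
    """Pages accumulated over a number of days at a fixed rate per use."""
    return round(rate * uses_per_day * days)


def stack_height_m(pages: int, thickness_mm: float = SHEET_THICKNESS_MM) -> float:
    """Height in metres of a stack of single-sided printed sheets."""
    return pages * thickness_mm / 1000


rate = pages_per_use(PAGES_RECEIVED, APP_OPENS)
weekly = projected_pages(rate, GOOGLE_USES_PER_DAY, 7)
yearly = projected_pages(rate, GOOGLE_USES_PER_DAY, 7 * 52)  # 52 weeks

print(f"pages per use:  {rate}")
print(f"pages per week: {weekly}")
print(f"pages per year: {yearly}")
print(f"stack height:   {stack_height_m(yearly):.2f} m")
```

Under these assumptions the script reproduces the 301 pages a week and 15,652 pages a year quoted above, and a stack a little over 1.5 metres high. Rounding the per-use rate to 0.87 instead of truncating it to 0.86 would push the totals slightly higher.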
This move away from the physical to the purely digital has allowed data to proliferate in society because there is no physical reinforcement of what is being shared. This is true not only of information but equally of money. A study by MIT, for instance, found that people were willing to spend 65% more for a basketball ticket if they were paying for it with a card rather than cash. The fact that you cannot see the money you’re spending means you are willing to spend more, and it’s the same with the data being collected online. If we needed to fill out a form by hand with all of the data that’s being collected, nobody in their right mind would do it. But because it’s quick and invisible, people let their data be collected and stored by millions of companies around the world without realizing it.
Data may be collected in ways that many are not comfortable with, but the benefits of this data collection are clear, which muddies the waters when it comes to right and wrong in this area. We all know that data has allowed things like social media to grow, has allowed huge repositories like Netflix and Amazon to show customers what they would like from millions of options, and allows Google to offer world-beating products for free. It has also made the world safer, allowing authorities to access information about potentially dangerous people and track criminals in ways that were not previously available.
However, the unfathomable amount of non-physical data being collected has undoubtedly had an unintended negative impact, as we have seen with the hacks that have taken place over the last few years. One of the key reasons the Equifax attack was perceived to be so much worse than attacks on other companies is that people had never directly given their data to Equifax. But is there much difference between that and a company losing data that you were unaware it had? Going back to Tinder, for instance, it collects information from every conversation you have on there, but it also analyzes these chats and makes psychographic findings based on them over time. Yet deep within its user agreement it states ‘you should not expect that your personal information, chats, or other communications will always remain secure’. This kind of information has the potential to be hugely damaging. Even just the content of the messages is personal enough, let alone the psychographic information within this dataset.
The potential damage that data can do is massive. Nobody would keep their deepest secrets written on paper in a flimsy shed that people were constantly trying to break into. When that shed is owned by the company keeping the data rather than by you, it becomes even more impractical. But because we can’t see where the data is stored, are unaware of what data is being stored, and cannot see the people constantly trying to get it, there seems to be no issue in sharing our information.
This may be a case of hyperbole, as many companies genuinely keep the data they hold as safely as possible. Regardless, there are still hacks taking place every day, and we have no way of knowing which company is holding our data in Fort Knox and which is using a drafty barn. We reveal a huge amount about ourselves because we aren’t aware of exactly what is being gathered and when, and there is little public scrutiny of the companies holding it, so is it any wonder that so much of it is lost?