Re-engineering News with Technology

Years ago, in college, I went to a presentation by a big internet company, as part of a recruitment event. At the time, I was working at the college newspaper, and the talk was about their “front page”. They said it was the biggest news site at the time, so I was excited.

The bulk of the talk was technical. But the presenter mentioned that one of the biggest challenges was staying ahead of what they called the “National Enquirer effect”. The problem, as she described it, was this: the main goal of the front page is to drive traffic to other properties, and the system was always optimizing both the selection of content on the front page and its ordering based on raw clicks. She said that, while no one admits to it, the content with the best click-through rate was always “bikini women”, so left alone, the algorithms would turn the front page into the National Enquirer. Ironically, this meant that, over a long enough period, no one would visit the site. They said they were trying to fix this with some longer-term optimizations, but for now, there was essentially a team for each locale that monitored the site and kept it “clean”.
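To make that dynamic concrete, here is a minimal sketch, assuming a front page that greedily orders stories by raw click-through rate. The story titles, the numbers, and the `rank_by_ctr` helper are all made up for illustration; this is not how that company's system was actually built.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    impressions: int  # how many times the story was shown
    clicks: int       # how many times it was clicked

    @property
    def ctr(self) -> float:
        # Click-through rate; guard against division by zero.
        return self.clicks / self.impressions if self.impressions else 0.0


def rank_by_ctr(stories, slots):
    """Fill the front page with the highest-CTR stories, best first."""
    return sorted(stories, key=lambda s: s.ctr, reverse=True)[:slots]


# Hypothetical numbers, chosen only to illustrate the effect.
stories = [
    Story("Budget vote delayed", impressions=10_000, clicks=150),
    Story("Celebrity beach photos", impressions=10_000, clicks=900),
    Story("Local election results", impressions=10_000, clicks=300),
]

front_page = rank_by_ctr(stories, slots=2)
print([s.title for s in front_page])
```

With only two slots and nothing but clicks to go on, the budget story never makes the page, which is roughly the job the per-locale teams were doing by hand.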

A couple of days ago, I saw a tweet about a NYTimes wedding post. It said “Trevor George asked Morgan Sarner out to dinner 10 nights in a row, and won her heart”. The person whose retweet I saw said she’d probably get a restraining order. It was funny, and I “liked” it, but it seemed odd that the NYTimes would promote such creepy behavior. So I clicked on the link. It turns out the groom did not just ask her out 10 days in a row, but actually took her out each time. It’s a minor difference between the text in the tweet and the actual content, but it was enough to get that person to retweet a mockery of it, and more ironically, to get me to click on it.

Ev Williams, the co-founder of Twitter and Medium, likened the algorithms that govern the internet to a deus ex machina that provides you with the most extreme version of what you want: you think car crashes are interesting? Here’s a pile-up! It feels true, and definitely explains the long-winded global nausea.

Looking at it another way, though, this is just a specific application of the paperclip maximizer. Instead of the natural resources of the earth, we are just mining minds. And instead of making more paperclips, we are just making some people in the Bay Area richer. I live in the Bay Area, for now, so of course I shouldn’t complain.

But what’s really missing from the debate is how technology has failed to find a way to attract the attention of readers without sacrificing the content. And while some of it is done automatically, some of it is self-inflicted.

There are structural and economic explanations for the problem. The internet first destroyed the newspapers’ monopoly on advertising. Then the glut of content came, with the democratization of publishing tools, further pushing down the value of any individual work. Unbundling pieces from the newspapers and magazines that carried them reduced the value of the brand, and in turn the value of the pieces that make up a bundle too. As social media platforms further flattened all content into the same structure, be it from The New York Times or some kids from Macedonia, any semblance of product differentiation disappeared.

The number of knobs publishers have is dwindling, and their editorial decisions are one of their last levers. When you are competing with so much content, and you don’t control how your content is distributed, your only option is to change your content to fit your distribution channels. Legitimate news organizations have long erected a wall between “the church and the state”, or rather “editorial and advertising”, but at least in terms of packaging, the wall no longer exists. The only difference is that it’s not the advertisers that determine your content now, but your distributors.

I saw this firsthand too. At Digg, we would casually tell big publishers and famous individuals alike that if they worded headlines a specific way, they would get more clicks. Sometimes it worked, sometimes it didn’t. Google has entire guides, mostly technical but with editorial hints, on how to get more traffic. Facebook does it too, but for slightly different reasons. They want people to click on the content, but not too much, so publishers had better avoid clickbait titles. And of course, most publishers, especially smaller ones that do not have big subscription revenues or rich patrons to back them, get in line.

This is not a jab at newspapers, although it is that a bit. My real qualm is that we still don’t have a proper way to consume the news where the hook doesn’t dictate the content. We built search engines that can scour the entire web in less than a second, but I still can’t figure out whether a piece of content is worth my time, or is just fluff. I can take a virtual tour across the globe, but I cannot tell what a federal policy change means for me as a resident in California. The primary problem is funding and revenue, but is there a lack of imagination as well?

I also don’t know if the solutions to these problems will exist on the supply side, or the demand side. Probably, it will need to be both. Publishers need ways to authenticate and brand their content, and consumers need reading experiences that respect those. Moreover, consumers need a better way to find and consume content that respects the integrity of it, and not let it be violated for distribution.

There are a lot of attempts to build a new stack for consuming news. Services like Blendle attempt to fix monetization by removing the hurdle of micropayment, and also consolidate subscriptions. Facebook and Google try various things too; AMP is a way to clear up the reading experience (and cynically move more of the content to Google servers), Facebook’s Instant Articles is a more locked-in and heavy-handed way of doing the same. Both Facebook and Google also want to help publishers gain more subscribers, and the subscribed users to have more fluid, integrated experiences on the web with their own platforms.

And of course, publishers try their hand too. One of my personal favorites is what Axios does, with their telegraphic, lightly structured way of presenting their content. It feels respectful of my time as a reader, and cuts through the fluff without sounding too clinical. I wish more publishers experimented, like them, with radically different but still thoughtful ways of producing and presenting content.

At that talk I went to, they said one of the ideas was to have a fluff lever; slide it to one side and you get practically smut. Slide it all the way to the other, and it’s all dreary politics, which wasn’t smut at the time. As far as I know, they never launched it.

The internet has undermined, intentionally or not, the workings of all news organizations. It took over their advertising, their users’ attention, and now a few companies are inadvertently guiding more of the content too. The different responses to this change lie across the political spectrum. What is common, though, is that the problems will not go away, and the economics that govern newspapers will not go back to where they were. But maybe there are ways to attack this problem with technology, as well as with policy.

I am not sure if that is the answer, but maybe it could be worth trying.

Digg was all about news and nothing else. It didn’t work out.

A couple of days ago, I was having lunch with a friend who used to work at Twitter. Eventually, the issue of Fake News came up. I told him, more as a joke, that Facebook could just solve the Fake News problem by taking the News out of News Feed, and turning it into essentially just a bunch of social updates. He retorted that the product already existed and it was called Instagram. We both sighed and shrugged and downed a few more drinks.

Now, apparently, Facebook is trying exactly that, and of course publishers are freaking out. You can’t really blame them. For many publishers, Facebook is their biggest source of traffic, which they monetize via ads. But you also can’t just feel bad for them, because that is the risk of building your business on someone else’s platform. Just ask Zynga.

My understanding is that Facebook started promoting news sources and publishers more or less as a defense mechanism against Twitter. It might be ancient history now, but there was a time when the fates of these companies weren’t as far apart as they are now. Facebook noticed that Twitter was getting an undue amount of attention from the media folks, with newscasters and individual journalists signing up in droves and moving the conversation there. Facebook wasn’t a fan, and decided to flex its muscles a bit.

I don’t know if that’s true, but it rings true. And I know this, because I used to work at a company that was in the same boat, at the same time. When we launched Digg V4, one of its goals was to cut down the noise of Twitter and just focus on links instead of the mundane status updates. It didn’t work out, and Digg imploded rather spectacularly, but the idea was solid. Digg was always at the forefront of many ideas that are common now, such as “liking” things both in and out of Digg’s website and apps. But with Digg V4, it all came crashing down.

To understand all this, you need to go back to 2010, if not earlier. Twitter and Digg were both merely curiosities, largely unknown outside of Silicon Valley. However, Digg controlled a significant amount of traffic, and getting on its front page could be a huge boost to not just publishers but really any company. Even Dropbox, now pretty much a household name, attributed a significant number of their early users to getting on Digg’s front page.

However, Twitter was already gaining momentum. Although the site could barely stand without failwhaling, it was already signing up big-time users like Ashton Kutcher and Justin Bieber. But they didn’t really drive traffic to anyone, and most people were using it as a more public stream of consciousness than anything.

So that was one of Digg’s plays with V4; that we’d be the driver of traffic to publishers because we didn’t have any of those pesky “I am eating a cheese sandwich” updates that littered your Twitter timeline that you didn’t know what to do with.

Kevin Rose tweet about Digg V4

The why and the how of Digg’s failure is complicated. But largely, it was a perfect storm of technical issues (mostly of our own doing), management mishaps, and of course the Cold War Digg used to wage on its users finally erupting into thermonuclear skirmishes. Digg always had a delicate relationship with its most influential user base; neither side ever really blinked, but with Digg V4, it all changed.

One of the most controversial changes was making My News, the logged-in personalized page, the default option, as opposed to “Top News”, which was The Digg Homepage. With this change, the importance of Top News was significantly reduced, since we essentially distributed the logged-in page views across thousands of personalized homepages. This was both a way to keep more people logged in, by providing them a better and more engaging homepage, and a way to make sure that we had more unique pages from which most publishers could get clicks.

The really controversial change was actually allowing publishers to automatically submit items to Digg by sucking in their RSS feeds. What this meant is that now you could participate in Digg without really participating; we could just suggest your account to new users who would see your content, which Digg would automatically ingest, without you ever doing anything. And the fact that we accidentally, I swear, promoted those items over manually submitted items did not help.

While Digg made a conscious decision to prioritize big publishers, we managed to scare away most of the user base. Without users, of course, the traffic publishers received started dwindling down. But more critically, without eyeballs on Digg itself, the advertisers slowly fled. The rest is history.

Facebook, for what it’s worth, never had a problem with users leaving its service and I doubt they ever will. I don’t know if they would, but they could remove all the links to publishers from News Feed and most users wouldn’t give a damn. The genius of News Feed was never the links, but the ability to give every living and breathing person on earth their own personalized rumor mill. The outrage articles, especially in the age of Trump, are addictive, for sure, but they don’t hold a candle to the addictiveness of being able to see a new update from one of your friends.

I am a strong believer in the importance of journalism for a liberal democracy. I would not dare wish for publishers and journalists to lose their sources of revenue. But at the same time, I can’t imagine that at least the big publishers did not see this coming. No one in their right mind would put all their eggs in someone else’s basket. And hey, maybe this is a good thing. Maybe this is the wake-up call we all needed.

Fighting Spam at Facebook

A couple of days ago, I wrote about how “Fake News” on Facebook is a spam problem caused, or at least exacerbated, by the economics of attention. Since there’s a limited amount of attention people can give in a day, and Facebook controls so much of it, if you can reverse engineer the mechanics of the News Feed, you can fan out your message, or boost it in Facebook parlance, to millions of people at a minuscule cost.

On this blog, I use a combination of my experience as a software engineer, what is reported in press, and some light rumor treading to explore ideas. But it is hard to not come off as navel-gazing. No one writes about spam at Facebook when it’s not a problem. And these systems are complex, involving hundreds of people working on them over many years. They have their own compromises. The inner workings aren’t always hidden (but they are, more than they should be), but they’re not always easily accessible to an outsider.

Luckily, I had some help. Melanie Ensign is a former co-worker of mine at Uber. And more importantly, she used to work at Facebook with teams working on fighting spam. Melanie has been at Uber for almost a year now, but I think her experience from Facebook (where a lot of Uber’s security team hails from, including the CSO) is well worth exploring.

Through a few tweets (embedded at the bottom of this post), she told me how Facebook used to combat spam, and why certain approaches worked better than others. It’s worth noting that her comments were about Facebook posts, not about ads (more on those in a bit).

Ensign says that Facebook fights spam primarily by targeting not the content of the posts, but the accounts that post them. She says “systems were trained [to] detect spam based on behavior of accounts spreading malware. It’s never really been about *content* until now”. She adds that monitoring content is tricky, for several reasons.

The first is obvious; with more than 2 billion monthly active users and many more billions of pieces of content posted on the site every day, it’s a big, unwieldy undertaking to even start monitoring that much content. Facebook is an engineering force to behold, but scaling an operation like that, building systems to analyze that much unstructured material, and doing it effectively in real time is not a simple task.

The second reason is the obvious risk of censorship. Facebook admittedly wanted to keep a neutral position on the content posted on its site (save for legal requirements). False negatives are bad, since you let in “spam”, but a false positive is akin to censorship. This might be less controversial now that Facebook works with fact-checkers to annotate content. But the Facebook promise was always one of extreme, sometimes admittedly labored, editorial impartiality.

The third is harder to appreciate, but it is one I can understand. When you build systems that recognize spammy content, you inherently give away your secret; it becomes much easier to work around. Ensign points to the Ray Ban spam that was going around on Facebook a couple of years ago. She says that since the content had the proper bona fides, the team fighting spam instead relied on the characteristics of the accounts that posted it. Facebook engineers who presented at the Spam Fighting @Scale conference share similar insights; “Fake accounts are a common vector of abuse across multiple platforms.” and “It is possible to fight spam effectively without having access to content, making it possible to support end-to-end encrypted platforms and still combat abuse” are two that are worth mentioning.
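As a rough illustration of what scoring the account rather than the post could look like, here is a minimal sketch. The signals, thresholds, and names (`AccountActivity`, `suspicion_score`, `looks_like_spammer`) are all hypothetical, invented for this example; they are not Facebook's actual features or rules. The only point is that nothing in it ever reads the content of a post.

```python
from dataclasses import dataclass


@dataclass
class AccountActivity:
    # All fields are hypothetical behavioral signals, not real Facebook features.
    account_age_days: int
    posts_last_hour: int
    distinct_targets_last_hour: int  # walls, groups, or threads posted to


def suspicion_score(activity: AccountActivity) -> int:
    """Count how many behavioral red flags an account trips."""
    score = 0
    if activity.account_age_days < 7:             # brand-new account
        score += 1
    if activity.posts_last_hour > 30:             # posting faster than a person would
        score += 1
    if activity.distinct_targets_last_hour > 20:  # spraying posts across many places
        score += 1
    return score


def looks_like_spammer(activity: AccountActivity, threshold: int = 2) -> bool:
    """Flag the account when enough red flags add up; thresholds are made up."""
    return suspicion_score(activity) >= threshold


fresh_bot = AccountActivity(account_age_days=1, posts_last_hour=50,
                            distinct_targets_last_hour=40)
regular_user = AccountActivity(account_age_days=900, posts_last_hour=2,
                               distinct_targets_last_hour=1)
print(looks_like_spammer(fresh_bot), looks_like_spammer(regular_user))
```

A real system would presumably learn such signals from data rather than hard-code them, but even this toy version shows why the approach sidesteps the censorship question: a well-crafted post from an account behaving like a botnet still gets caught.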

Running a user-generated content site is hard. When I first started at Digg, the thing that shocked me most was how much of the Digg engineering was really there to keep the site remotely clean. We had tools that worked to recognize botnets, stolen accounts, and everything in between. Every submission was evaluated automatically for many characteristics, such as “spamminess”, adult content, and a few more. There were tools to block certain content only in certain countries. Brigades, as they were called, would form on Yahoo! Groups to kick stuff off the site, or promote it. As we plugged one hole, some social media consultant would find a new way to use Digg’s various tools to send traffic to his or her ad-infested site.

And there was also the scaling. Digg was a lean and mean engineering organization, compared to Facebook. But still, we always struggled with scaling challenges, and so did everyone. The failwhale might be all gone now, but running a site that’s under attack 24/7, with features being added left and right causing unforeseen performance issues all of a sudden, and scaling an organization to support more than just a few million users, is an exercise that few can appreciate. Keeping a site that complicated up and running at that scale is a challenge. Doing that while keeping the site fast is a whole other beast. Former users of Friendster or Orkut might feel the same way; performance issues were what caused their users to leave the sites.

I stand by my initial assessment of the problem. Facebook built a massive attention pool, sliced and diced it, packaged it nicely, and now is making bank selling it to the highest bidder. The problems it faces, from spammy content to fake news, are inherent to the medium of exchange, attention. Sketchy characters flock to frothy marketplaces, like bees to honey. What makes or breaks a marketplace is being the trustworthy intermediary between the buyer and the seller. By being so large, and so influential, Facebook owns this problem.

And to be clear, this is not a dig (or Digg?) at the company; I rely on Facebook to keep in touch with friends scattered around the world. My WhatsApp groups, like those of many others, are my support system. I think, as the world’s address book, it is where any business, activist, or community organizer finds customers, supporters, or members. And of course, while I have no significant others working at the company, nor any financial exposure to it, I do have close friends who are former or current employees.

My main qualm with social networks has always been the commercializing of individual and collective attention spans. As we spend more of our waking hours plugged in, and move more and more of our political discourse, both in the United States and around the world, to these walled gardens with rules that are written by a few people living in California, we risk losing more than just the integrity of a single election.

Fake News is an attention economy problem

A common theme of this blog is that history repeats itself. There are some fundamental dynamics of information that are innate to the internet, and most companies coast on those trends. There are occasional shifts, like the smartphone with its always-on connectivity and sensors, but things more or less follow certain trends.

The recent rise of “fake news”, the cheap information that plagues everything and that Facebook, and to a smaller degree Google, is dealing with, has precedents and can be explained (and predicted, as many did) with a basic look at the economics of attention, which is another theme of this blog. Being somewhat reductionist, the problem can be viewed as a spam issue, on steroids. I admit the integrity of presidential elections is a more serious problem than loss of productivity, but a more sterile approach might help us come to some immediate solutions.

Facebook might be the punching bag these days for everyone, especially journalists, but Google had its fair share of spam issues. Not too long ago, around 2009, the Mountain View company was fighting a fierce war against what were then called “content farms”. These companies would basically figure out the trending Google searches, create extremely cheap content, real fast, do some SEO magic, and get traffic from Google, against which they could sell ads. As long as your cost of production was lower than your revenue from ads, you were golden.
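The arithmetic behind that last sentence is simple enough to sketch. The dollar figures and the `article_profit` helper below are hypothetical, picked only to show the shape of the incentive, not actual numbers from any content farm.

```python
def article_profit(cost_to_produce, pageviews, revenue_per_thousand_views):
    """Profit for a single article: ad revenue minus production cost."""
    ad_revenue = pageviews / 1000 * revenue_per_thousand_views
    return ad_revenue - cost_to_produce


# Hypothetical content-farm piece: cheap to write, modest search traffic.
cheap = article_profit(cost_to_produce=15.0, pageviews=10_000,
                       revenue_per_thousand_views=2.0)

# Hypothetical reported piece: same traffic and ad rate, far higher cost.
reported = article_profit(cost_to_produce=500.0, pageviews=10_000,
                          revenue_per_thousand_views=2.0)

print(cheap, reported)  # 5.0 -480.0
```

At the same traffic and ad rate, only the cheap piece clears its costs, so the rational move is to drive the cost of production toward zero and multiply the number of articles.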

This was a big, lucrative business. The biggest player in this game, the aptly named Demand Media, was a billion dollar public company. This Wired feature on the company is full of amazing anecdotes. The company ran many, many websites targeted at virtually any vertical, including one called Livestrong, a franchise of none other than Lance Armstrong.

Google soon woke up to the danger, and issued an update to its “algorithm”, called the Panda update, that effectively kneecapped the entire industry. Today we are looking to hear from Facebook CSO Alex Stamos, but Matt Cutts of Google was all the rage back then.

Facebook itself has had its fair share of “spam” problems, and while the company might seem paralyzed in an effort to satisfy both sides, it wasn’t always that way either. Zynga figured out the dynamics of News Feed, as well as the psychological reward mechanisms of unsuspecting “gamers”, and built a billion dollar business around it. In the meantime, though, Zynga and its flagship FarmVille game became synonymous with spam. When Facebook woke up to the problem and took action, the resulting tweaks nearly killed Zynga too. The gaming company is still around, as a public company, but it’s struggling to even pay for its HQ. The same pattern also played out with companies like Upworthy, and many other “viral” news sources.

As an outsider, it’s not clear how much of an existential crisis this is for Facebook. Google’s struggles with content farms were an existential risk; users losing trust in their search engine can jump ship to Bing or any other. Facebook users are locked in to the platform, and by virtue of how social networks work, as more users join, it gets harder for the next user to leave. The social network is more or less the world’s biggest address book for many, and the filter bubbles make fake news a problem that only someone else can diagnose for you, not unlike a mental disorder. Some, like Sam Biddle, even argue that Facebook inherently benefits from our endless craving for drama. Russian interference in US elections propelled the problem to mainstream media, but that was unintentional.

Moreover, the numbers themselves make it a challenge. Unlike a few content farms (or virtual farms, in the case of Zynga) that can be easily identified, for Facebook, there are 5 million advertisers who can push any sort of content to users’ news feeds. Still, it doesn’t seem like an unmanageable number. There are many businesses that have a similar number of customers and seem to keep a handle on them.

It wouldn’t be great for Facebook’s bottom line to have to increase the cost per customer, but it is probably the right approach for the long term. The media and tech analyst Ben Thompson argues the same in his column. (Subscription might be required) Facebook flew past its competitors partly by being the saner, more refined, Ivy-grad built and approved alternative. Google probably doesn’t miss the revenue it used to earn from the content farms, and Facebook certainly doesn’t miss Upworthy. A longer-term vision would help. A company that’s building solar-powered planes that communicate with each other via gyroscopically stabilized lasers should be able to solve some spam issues.

As a side note, it’s worth mentioning the opposite examples. These cheap SEO or virality games do not always end badly for companies. For each Demand Media, there’s a “success” story like Business Insider, and the like. The journalistic pasts of these organizations are questionable. These organizations, among others, have built their businesses on borrowing content from other organizations, having fewer and more junior staff, and really playing the SEO game better than anyone. Similarly, Buzzfeed is now a serious journalistic powerhouse, but the company was decidedly built on subsidizing actual journalism with more viral, bite-sized content.

The fact that solutions will emerge only points to the chronic nature of the problem, however. Facebook, Google, or any platform can solve the spam problem, given enough resources and focus. An economy that’s based on commodified attention poses not just passing economic challenges to tech behemoths, but existential risks for a regime that’s somewhat predicated on an educated public. The history of the attention economy is the subject of Tim Wu’s excellent book Attention Merchants, which I can’t recommend highly enough.

When people’s attention can be sold to the highest bidder, the producers with the lowest fixed costs will rule the world. A few years ago, it was Demand Media, then it was Zynga, then Upworthy and Huffington Post, and today it’s everyone. As costs of production go down (which is a good thing), the challenge will get harder. Moreover, as targeting of not just ads, but any content, becomes more precise, yet more opaque, the shared context that holds a society together will inevitably decay.

It might be a libertarian pipe dream to live free of interference from anyone, in one’s own digital and physical cocoon, but that seems untenable in the long run for a liberal democracy. At some point, we will have to have our rights to our information laid down in a more robust fashion, instead of relying on the good will of a few people living in California. Spam, as a risk to productivity, was solved by better technology, as well as regulation that required transparency for widely distributed emails. But most importantly, it got solved after we acknowledged the problem, saw the long-term risks, and attacked it at its mechanics.

With Big Data Comes Big Responsibility

It’s getting harder to suppress the sense of an impending doom. With the latest Equifax hack, the question of data stewardship has been propelled to the mainstream, again. There are valid calls to reprimand those responsible, and even shut down the company altogether. After all, if a company whose business is safekeeping information can’t keep the information safe, what other option is there?

The increased attention to the topic is welcome, but the outrage misses a key point. The Equifax hack is unfortunate, but it is not a black swan. It is merely the latest incarnation of a problem that will only get worse, unless we all do something.

The main issue is this: any mass collection of personally identifiable data is a liability. Individuals whose data is vacuumed en masse, the companies who do the vacuuming, and the legislators should become aware of the risks. It is fashionable to say “data is the new oil” but the analogy only goes so far, especially when you consider the current situation of the oil-rich countries. Silicon Valley itself here is especially vulnerable.

Big parts of the tech industry in the Bay Area are built on the mass collection of such private data, and deriving some value from it. A significant part of the value comes, somewhat depressingly, from ever more precise ad targeting. The problem with this model was long known, if not tacitly admitted by its creators, but it wasn’t until the Snowden revelations that a real national debate picked up. With the recent brouhaha following the 2016 elections, and a real risk of an authoritarian government in the US, the questions are louder this time.

Public outcry does help, but the change is very slow. Part of it is that the business models are wildly successful. Combined, Alphabet (née Google) and Facebook are a trillion dollar duopoly. With the cottage industry around these two companies, and practically all stakeholders in the area either beholden or financially tied to the industry, the motivation to change is small. Some companies, like Apple, try to raise the issue to a higher plane of morality, partly for ethical reasons, partly for competitive ones. But the data keeps getting collected, at an ever-increasing pace, and it’s getting more and more likely that a catastrophic event will occur.

Let’s first talk about how data gets exposed. Hacking, or unauthorized access, is the most talked about, but it’s far from the only way. A lot of the time, it’s just a matter of a small mistake. Take Dropbox. The cloud storage company once allowed anyone to log into anyone else’s account by entirely ignoring a password check. The case was caught quickly, but it’s a dire reminder that small mistakes can happen. And that is a point worth pondering, separate from the recent hack Dropbox suffered.

As easily as data is collected and stored, it’s even easier for it to change hands. Companies and their assets change hands, and so do the jurisdictions they live in. The Russian tech sector is a prime example. Pavel Durov, the founder of the oddly popular instant messaging platform Telegram, first built VKontakte, a Russian social network much more popular than Facebook in the country. But then came the Russian government with demands of censorship. Durov ran away, and the Russian social network is now owned by a figure much closer to the government. And there’s always LiveJournal, which again got sold to a Russian company, with all its data now under Russian jurisdiction.

And sometimes, the companies open themselves up to being hacked. Once an internet darling, Yahoo! was put in the spotlight when its own security team found a poorly designed hacking tool, installed by none other than the company itself. Initially designed to track certain child pornography related emails for the government, the tool was built without the knowledge of the company’s Chief Security Officer, Alex Stamos, a well-regarded security professional. He departed the company soon after, only to join Facebook. And again, this is just in addition to the Yahoo! hack that affected 1 billion users, and almost derailed a multi-billion dollar acquisition.

Government surveillance is a touchy subject, and moral decisions are always fuzzy, with someone being unhappy. Governments should use the tools at their disposal to keep their citizens safe, and this might sometimes require uncomfortable measures. This doesn’t mean they should be given direct access to millions of people’s private data, however. Intelligence efforts should be directed, not dragnet. Living in a liberal democracy requires a certain amount of discomfort, not pure order.

But it is hard to deny the evidence at hand: from once liberal darlings like Turkey to known autocratic regimes like China, any government will find it impossible to resist the temptation to take a peek at the data, one way or another.

Governments are made up of people, just like corporations are. The solutions to these problems won’t be easy; with so much already built, tearing it all down is not an option, or even preferable. The industries built add value and employ thousands, if not millions. But we have to start somewhere, as individuals, technology companies, and legislators.

First, individuals need to be more cognizant of their decisions about their data. Some of it will require education, from a much younger age. But even today, for many, there are a lot of easy steps one can take.

For many uses, more private, less surveillance-oriented tools already exist. Instant messaging tools like WhatsApp (once bought by Facebook for a whopping $19 billion) are easy to use while relying on cutting-edge end-to-end encryption technology borrowed from Signal. One can wonder if essentially playing spies is worth the hassle, but the risks are real, and getting more so every day, even for members of Congress in the US.

For regular browsing, things are in worse shape. Practically every site on the internet tracks you across every other site; shopping and news sites are particularly bad. Users are fighting back, with tools that are sometimes clunky and just as heavy-handed. Thanks to an overzealous adoption of ads, both intrusive and sometimes malicious, ad-blocking is on the rise around the world. It is hard to fault consumers; most would benefit from using an independently owned ad blocker like uBlock Origin, or a browser like Brave that has such technology built in. Apple recently updated its browser Safari on both macOS and iOS to “intelligently” curb cross-site tracking.

For things like email and cloud storage, things are trickier. For many users, their data is safer with a big company with a competent security team than with a smaller service provider. There’s a balance here; while big providers are much juicier targets (including for governments, which can request data legally), they also have the benefit of being hardened by such attacks. Companies like Google use their own services, further incentivizing them to safeguard data, at least from hackers.

However, even then, most people would benefit from raising their security above the default settings. For users of Gmail, Dropbox, and virtually any other cloud service, using two-factor authentication, coupled with a password manager, is a must.

And largely, going back to cognizance, individuals must be aware of the data they provide and be at least minimally informed. When you sign up for a new service, before sharing all your data with them, see if they at least have a way to delete it, or export it. Even if you never use either of those options, they can be good signs that the company treats your data properly, instead of letting it seep into its machinery.

For creators of such technology, things are harder but there’s hope. The first step is obvious; companies should treat personally identifiable data as a liability and collect as little as possible, and only for a specific purpose. This is also the general philosophy behind the EU’s new General Data Protection Regulation (GDPR). Instead of collecting as much data as possible, hoping to find a good use for it later, companies should only collect data when they need to. And most importantly, they should delete the data when they are done with it, instead of hoarding it.

Moreover, companies should invest in technologies that do not need to collect data at all, such as using client-side computation instead of server-side. Apple is the prime example here; the company uses machine learning models that are generated on the server, on aggregate data, for things like image recognition or speech synthesis on the devices themselves. Perhaps as a sign of poetic justice, the intelligent cross-site tracking prevention Apple built into its browser is based on data collected in aggregate form, instead of in a personally identifiable fashion.

It is not clear whether such technologies can keep up with a server-based solution, where iteration is much faster, but the investments might pay dividends. Today’s smartphones easily compete in performance with servers of just a few years ago. Things will only get better.

And for times when mass collection of data is required, companies should invest in techniques that allow aggregate collection instead of collecting personally identifying data. There are huge benefits to collecting data from big populations, and the patterns that emerge from such data can benefit everyone. Again, Apple is a good example here, though Uber is also worth mentioning. Both companies aggressively use a technique called differential privacy, where private data is essentially scrambled enough that it is no longer identifiable while the overall patterns remain. This way, Uber analysts can view traffic patterns in a city, or even do precise analysis for a given time, without knowing any individual’s trips.
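To make the idea of “scrambled but still useful” a little more concrete, here is a minimal sketch of randomized response, a classic technique from the same family as differential privacy. It is not what Apple or Uber actually ship; the function names, the 75% truth probability, and the simulated population where 30% of users share some sensitive trait are all assumptions made up for illustration. Each user’s individual report is unreliable by design, yet the aggregate rate can still be recovered.

```python
import random


def randomized_response(truth: bool, rng: random.Random, p_truth: float = 0.75) -> bool:
    """Report the real answer with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports: list[bool], p_truth: float = 0.75) -> float:
    """Undo the noise in aggregate.

    Expected share of True reports = p_truth * true_rate + (1 - p_truth) * 0.5,
    so we solve that equation for true_rate.
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth


def simulate(n_users: int = 100_000, seed: int = 42) -> float:
    """Hypothetical population in which exactly 30% of users have the trait."""
    rng = random.Random(seed)
    truths = [i % 10 < 3 for i in range(n_users)]
    # The collector only ever sees the noisy reports, never the truths.
    reports = [randomized_response(t, rng) for t in truths]
    return estimate_true_rate(reports)


if __name__ == "__main__":
    print(f"estimated rate: {simulate():.3f}")  # should land close to 0.30
```

The trade-off sits in `p_truth`: the lower it is, the more deniability each person has, and the more people you need before the estimate settles down.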

And more generally, companies should invest in and actively work on technologies that reduce the reliance on individuals’ private data. As mentioned, a big ad industry will not go away overnight, but it can be transformed into something more responsible. Technologists are known for their innovative spirit, not defeatism.

End-to-end encryption is another promising technology. While popular for instant messaging, the technology is still in its infancy for things like cloud storage and email. There are challenges; the technology is notoriously hard to use, and recovery is problematic when someone loses their encryption key, for example by forgetting their password. Maybe most importantly, encryption makes the data entirely opaque to storage companies, severely limiting the value they can provide on top of it.

However, there are solutions, some already invented, some being worked on. WhatsApp showed that end-to-end encryption can be deployed at massive scale and made easy to use. Other companies like Keybase work on more user-friendly ways to do group chat, and possibly storage, while also working on a new paradigm for identity. And there are also more futuristic technologies like homomorphic encryption. Still in the research phase, the technology, if it works as expected, might make it possible to build cloud storage services where the core data is private while still being searchable and indexable. Technology companies should direct more of their research and development resources to such areas, not just to better ways to collect and analyze data.

And lastly, legislators need to wake up to the issue before it is too late. The US government should enshrine the privacy of individuals as a right, instead of treating it as a commercial matter. Moreover, mass collection of personally identifiable data needs to be brought under supervision.

The current model, where an executive responsible for leaking the data of 140M US consumers can get away with a slap on the wrist and a $90M payday, does not work. Stronger punishment would help, but preventing such leaks at the source by limiting the size, fidelity, or longevity of the data would be better.

Moreover, legislators should work with the industry to better educate consumers about the risks. Companies will be unwilling to share details about what is possible with the data they have on their users (and unsuspecting visitors), but it is better in the long run for consumers to make informed decisions. Target made the headlines when it reportedly figured out a woman was pregnant before she could tell her parents. Customers should be made aware of such borderline creepy technology before they become subject to it, even more so considering Target itself was also a victim of multiple major hacks. Facebook was recently the subject of a similar report, in which the company discovered a family member of a tech reporter (the same reporter who broke the Target story), with no one able to explain how. Individuals should not feel this powerless against corporations.

The current wave of negative press against Silicon Valley, caused mostly by the haphazard way social networks were used to amplify messages from subversive actors, is emotionally charged but is not wholly undeserved. Legislators can and should help technology companies earn back people’s trust, by allowing informed debate about their capabilities. A bigger public backlash, when it happens, would make today’s pessimism seem like a nice day in the park.

There are huge benefits to mass amounts of data. There is virtually no industry that wouldn’t benefit from having more data. Cities can make better traffic plans, medical researchers can study diseases and health trends, governments can make better policy decisions. And it can be commercially beneficial too; with more data we can make better machine learning tools, from cars that can drive themselves to medical devices that can identify a disease early on. Even data that is collected for boring purposes can become useful; Google’s main revenue source is selling ads on top of its search results, which no user would want to get rid of.

Data might be the new oil, but only with mindful, responsible management of it will the future look like Norway, rather than Venezuela or Iraq. In its essence, personally identifiable data in huge troves is a big liability. And the benefits we currently derive from such data mostly go to things like better ad targeting. No one wants to go back to a time without Google, or Facebook. But it is possible to be more responsible with the data. The onus is on everyone.

Facebook and Uncanny Valley

I grew up with Facebook, in all senses of the word. The first time I was in the US for summer school in 2004, Facebook practically didn’t exist. Just a year after, in 2005, when I was again in the US for a different summer school and actually got a .edu email address from a major college, I remember my friends being really excited that they could join this service called The Facebook. I remember vaguely looking at it, not really getting what the big deal was and casually ignoring it.

Fast forward yet another year, to 2006: I am on CollegeConfidential, a forum frequented by high school seniors applying to colleges in the US, and I see that virtually everyone in the CMU forums is freaking out about getting their email addresses, simply because it’d allow them to get on Facebook.

And during the course of my studies, in a relatively short span of 4 years, I have seen Facebook evolve from this website where you would go to see if that girl in your Econ class was single or not to an alternative, second internet for a significant part of the world’s population. And maybe more interestingly, while “the-company-to-work-for” for computer science majors at CMU was definitely Google in 2006, Facebook had definitely become a much more appealing option by 2010, especially for those who wanted to work more on the consumer side of things, like yours truly.

It is not just because a significant part of my young-adult life evolved alongside Facebook that I get more value from Facebook than the average user; I am a Turkish native who went to an American prep school in Turkey. Now a significant chunk of my friends are scattered across the world. While we maintained a Yahoo! Group for some time, for some intra-class communication, that group died a pretty quick death as people’s lives got busier, other things took priority, and most importantly, Facebook simply became easier to use for the same purposes.

That is all to say that Facebook is very important to me, probably more so than it is to a nerd who grew up with BBSes and dial-up, or to a casual user.

There is however something way more essential for me, something so valuable that I can’t put a price on it and would do anything to keep it mine; my personal life.

Those two realities, that I value my interactions on Facebook as well as my personal life, would of course be irrelevant to each other had Facebook not taken a much bigger part in both my own life and that of my social circle. And even that would be fine; culture and our way of living will undoubtedly change with each advancing technology. But seeing the effect of Facebook on my own personal life and mental well-being, I have started to actually think about how to handle this new technology better.

Moreover, I have always been interested in how technology actually changes people’s lives. While computers and all things high-technology have always been fascinating in their own right, the biggest reason I started doing what I am doing and living where I am living is that I wanted to be around when technology changes our lives not only as individuals but also as a society.

While I am not as multi-cultural as I wish I were, I am lucky enough to have a good grasp of not just Turkish and American cultures but also the “internet” culture that I grew up with, as a kid who spent more time on his computer than outside for a significant part of his life.

So over time, I have formed some well informed, some not so well informed, opinions about Facebook. As culture and technology are two of my favorite topics, it’s something I have talked a lot to many people about and those people have told me many times I should share those thoughts with others.

These are all those thoughts, unabridged.

Facebook is public.

This is the guiding principle of my activity on Facebook.

It should be very clear to anyone with some knowledge of how advertising and marketing work that the more data Facebook has on you, the more money they can make. So it is definitely in Facebook’s interest to get you both to put in as much data as you are willing to and to make it more available to others. In fact, this horse has been beaten to death so thoroughly by everyone that I almost find typing all this pointless.

But the thing that is really worth mentioning is that I actually believe that Zuck and Co believe they are doing something good and worthwhile by inducing us to obsessively catalogue and index every inane activity happening in our lives. Sure, it’s easy to point out how this will all make Facebook the next AOL, but it takes a different, slightly but amusingly twisted mentality to build features so that you can mark on your timeline the first time you got a tattoo or when you recovered from chemo. While the nerds among us would pour hours and hours into organizing our Winamp playlists and no one else seemed to care, it somehow became acceptable, if not outright “cool”, to be the person who checks in not only himself but everyone around him to the hot spot that none of his friends are at, without a single care about how it might be used or abused.

However, you don’t need to look any further than the privacy kerfuffle Ms. Randi Zuckerberg raised to understand the implications of actually putting any content online. Ms. Zuckerberg posted a picture of her family, including her brother Mark, chatting around a kitchen counter. While the photograph itself wasn’t “public” in the Zuckian sense, one of Ms. Zuckerberg’s reporter friends, probably thinking it was a benign enough photograph, posted it on Twitter, resulting in Randi Zuckerberg first saying how “uncool” it was, posting a couple more angst-filled tweets, and then deleting them right after.

The irony of the whole situation notwithstanding, the point I am trying to get across is this: while there is already something a bit disturbing about a single entity having so much personally identifiable information about you, the more nuanced issue is that as long as you put any sort of information on Facebook, be it an image or a relationship status or a simple comment, you are in fact sharing all that information with not just Facebook’s evil privacy-hating overlords but also practically every single person who might see it on their Newsfeed. And sure, even if you somehow made your way through the Escheresque privacy controls and limited your online exposure today to your socially capable friends, you are still making a ton of assumptions, from what private means to each of your friends to, if we are going to get a bit dark, whether they actually mean well.

And then what happens when Facebook actually changes their privacy policy so that your new actions don’t adhere to your old controls and your friends can now share the content if they sacrifice their youngest new-born to the gods? And of course, there’s the problem of Facebook inventing ever more ways to expose more of your activity, not only on Facebook but also on any other application, with their frictionless sharing. Are you now going to trust the judgement of the seedy app developers who’d do anything to get their user numbers up, as well as their technical competence, in addition to everything else you already had to keep in mind?

Sure, it’s somewhat of a stretch for most people to be embarrassed by a photo of their dinner ending up on national news, but the chances of something you posted on Facebook making its way to someone other than its intended audience are higher than you’d think. I’m sure your off-the-cuff racist joke is hilarious, but do you think all your friends and your friends’ friends share your appreciation for Louis CK? And yes, you do look great in that bikini, but have you ever made your way to the darker corners of the internets where creepy men share such photos with the rest of the other creepy men, pretty much legally?

So do the easy thing and ask yourself: would you be OK with whatever you are posting on Facebook (or any other social network, for that matter) being public one day? It was only 2 months ago that Facebook removed the feature where you could truly hide yourself from all searches.

Facebook has more privacy controls than ever before, but arguing that Facebook hasn’t become more “public” over time, or that it simply won’t be fully public at some point in the future, is futile.

Privacy isn’t dead.

I find the new-fangled “privacy is dead, get over it” rhetoric utterly misinformed, if not outright stupid.

It’s easy, and fun if I say so myself, to be overly excited about a new way to check in to places, share your feelings and thoughts, or maybe snap a picture of a particularly attractive sunset (or a bike, if you are like me). You can argue until you can’t on Hacker News whether or not such things constitute innovation, but what you cannot argue is that mere mortals, people like you and me, enjoy them a lot.

It is however naive to think that the more of our lives we put online for others to see, the less we care about privacy. In fact, if anything, I’d say that most people I know are more aware of that nasty feeling you get whenever your privacy is infringed because it happens more and more every day.

And make no mistake; the moment when someone infringes on your privacy, when it comes to stuff that really matters, you will find yourself feeling the same way too.

Not to take any more cheap shots at Ms. Zuckerberg’s misfortune (though she probably deserves a lot more for producing the world’s most horrible show on TV), if one can get angry over a picture of her family just sitting around a kitchen counter being posted online, then saying that we should simply give up on privacy because it’s too damn inconvenient is not just wrong but actually dishonest.

In fact, one need not look any further than the tech scene to find stories of people doing the craziest things, things that you’d not expect them to do, just because someone arguably invaded their privacy. While some of them, like Google not talking to CNET for a year because they unearthed some publicly available information about its CEO, are more entertaining than not, others, like the famous Ruby developer _why collecting his things and quitting the internet, are more damning and dark. Barbra Streisand might not have been as foolish as we thought she was, after all.

And sure, you can again make the argument that all this doesn’t apply to you because you are essentially a nobody on the internet. But simply imagine how you’d feel if one day you found a notebook on your friend’s desk where he lists the times that you leave and enter your building. And moreover, he also lists every single thing you told him, the boring stuff like your favorite book as well as the awkward stuff like what kind of skin cream you have in your closet.

Are you going to argue that he can’t sit at the cafe across the street from you and watch you literally get on the public street or simply work as a cashier at the grocery store?

While you should go ahead and reconsider your friendship with that sociopath, you might as well come out of that traumatizing relationship with an epiphany about how much you care about your own privacy and how much it matters to you. And while it is easy to bend it here and there every once in a while, it hurts a lot, more than you think, once you lose it. And by definition, it is one of those things that you’ll not easily get back after it’s gone. So you might as well keep it close to your chest.

It is not real life.

As we spend more and more time online, digital, cyber or whatever we call it these days, our well curated online presence slowly slips into the uncanny valley. You look at someone’s online profile and get this feeling that person is living the life you wish you had been living; some shots by the beach that you will never go to or the concerts that no one invites you to.

Surely, it might just be my Fear of Missing Out, or FOMO as it is lovingly called, talking, but there is something utterly non-human and almost disturbing about seeing someone’s life in such great detail without the imperfections.

It reminds me of this time they were shooting a movie on campus, back when I was in college. As some scenes in the movie took place in a dorm room, the film crew actually built a “dorm room” in the common area of our dorm. What was amazing about it was that it was so much of a stereotypical dorm room, from the casually discarded clothes on the bed to the random containers of Cup-Noodles to the Harold and Kumar posters and the Macs, with every detail so picture-perfect, that you could definitely tell it wasn’t an actual dorm room but something that had been manufactured.

That is not to say everyone is constantly putting on an act on Facebook. The real issue is that it is very hard, if not impossible, to actually create any semblance of documentation of one’s social world using bits-and-bytes online. Maciej Cegłowski describes the technical problems around this (as well as the utter pointlessness of it) much better than I ever could in his post. I am simply approaching the issue from the other, psychological end.

Ask any social psychologist and you’ll hear about how self-reporting studies are extremely hard to validate. It turns out it is surprisingly easy to get people to give you the answers you want (or not) but extremely hard to actually get them to describe to you how they feel. In fact, if you look at enough social psychology studies, you might very well think that the entire field is about finding more ingenious and clever ways to trick people into giving you the true answers, instead of doing any “science” work.

And there is of course the social pressure, which muddies the waters even further. Are you actually going to post about your horrible break-up when you see your friends are having the time of their lives in Malibu? Maybe you fish for some easy likes and compliments by posting a joke or an Instagram. But then, why would you let anyone know that you are spending your valuable time, that time you’ll never get back, being on Facebook? And now we are back to where we started. Shouldn’t you actually be out and about on a tropical island, or simply be out meeting some new people? Maybe becoming a true Lawnmower Man and playing Farmville all day, every day is the answer, after all.

When I was looking for a new job, a couple of years ago, someone who I consider a good mentor told me that a lot of the really cool jobs aren’t actually public. They are not posted on companies’ websites or job boards. The only way you’ll hear about them is if someone actually reaches out to you because they think that it is the right job for you and you are the right person for that job.

I find the phenomenon extends well into social life too. As I mentioned before, my social circle on Facebook is pretty fun and I definitely learn about new stuff happening around me. But more often than not, I get notified about the really cool stuff that is happening through boring mediums like hearing it from a friend over some beers, or someone actually reaching out to me personally, thinking that I’d really enjoy that really cool thing.

And I haven’t even touched upon the actual living aspect of it all in this meta-noise. Now that you have excommunicated your sociopath friend and are shooting the shit with your best friends in Malibu, everyone is having a great time. You think this is what happiness must be like, just enjoying your friends’ company without a single care in the world other than your drink being a little too cold.

Would you rather be the person who’s actually having that great time or the person who is obsessively documenting every amazing thing happening around them? Just like most things in life, there’s a line in the sand (no pun intended) that one draws here; we all want to remember the good times and have mementos, but there’s a point at which the whole enjoyment becomes a simple vehicle for documentation and the reality becomes irrelevant. Of course, this did not start with Facebook, but there’s no denying that Facebook’s pervasiveness in our lives took it to unprecedented levels.

Facebook should do better.

As I mentioned before, I have no beef with Facebook, as a company or a product. In fact, I even applied there 3 years ago for a job opportunity, and I have a good deal of friends, including one of my best friends from college, who work there as engineers, designers, and product managers.

As an engineer myself, the fact that Facebook even works amazes me day and night. I have always considered their design team on top of their stuff, working with challenges that would make a regular designer’s head explode in a second. And moreover, I have high admiration for the speed and fervor with which they are able to ship features and change things.

In fact, I believe Facebook in and of itself is one of the places that has been operating in a relatively consistent and coherent manner, as far as consumer-focused companies go; there are deviations and distractions (looking at you, Poke) here and there, but you can’t blame Zuck & Co. for doing what they said they’d be doing.

If anything, given its sheer size and how much it has permeated our lives, I am surprised that Facebook hasn’t made as much of a cultural dent as many other online properties. Granted, it has created a never-ending stream of amusing stories (mostly caused by privacy blunders) for bored journalists to sift through, and the vast amount of data Facebook generates should feed generations of sometimes slightly misguided but mostly well-meaning social scientists and marketers. But Facebook The Company simply feels like it has been busy building features that you think it should have (as in, for example, seeing a list of all things you have “liked”), instead of doing something world-changing or jaw-dropping.

As I touched on before, Facebook did all this while creating a culture that not only nurtures but also attracts very high-caliber talent. I might be alone in feeling this way, but I hope that the company actually keeps that culture going after its eventful IPO, finds its true calling (and by that I mean revenue source), invests all that back into its technology and talent, and becomes a true tech company instead of a media conglomerate that everyone loves to hate.


One friend I asked to proof-read this essay told me that there’s no point to all this. And I agree, there isn’t. These are simply my thoughts on Facebook, just like I said, some are well-informed and some not-so-much.

But there is something I want to convey and that is that your online activity, be it on Facebook or Twitter or whatever, matters as much as you want it to.

Just think about what you are doing, every once in a while.