the information apocalypse

Feb 19, 2018

My friend Aviv was recently the subject of a popular Buzzfeed article about the possibility of an impending “information apocalypse”. tl;dr: Aviv extrapolates from the fake news crisis that started in 2016 to a world in which anyone can create AI-assisted misinformation campaigns indistinguishable from reality to the average observer. From there, a series of possible dystopian scenarios arise, including:

“Diplomacy manipulation,” in which someone uses deepfakes-style video manipulation tools to produce a realistic video of a political leader declaring war, in order to provoke their enemy into retaliation.
“Polity simulation,” where Congresspeople’s inboxes are spammed with messages from bots pretending to be their constituents.
“Laser phishing,” which improves the sophistication of phishing attacks by training phishing email generators on messages from your real friends, so you’re more likely to read them and fall for them.

With regard to how this would affect our everyday lives, Aviv postulates that exposure to a constant barrage of misinformation may lead to “reality apathy”; that is, people will start to assume that any information presented to them is untrustworthy and give up on finding the truth. In such a world, Aviv says, “People stop paying attention to news and that fundamental level of informedness required for functional democracy becomes unstable.”

The rest of this post consists of my half-formed thoughts on the possibility of reality apathy as it relates to existing technology, largely informed by my perspective working in the computer security industry.

Let’s start with the Internet of the late 90s and early 2000s. Back then, long before the 2016 fake news crisis, it was already possible to bombard everyone on the Internet with well-crafted, believable misinformation. All you had to do was create a legitimate-looking website with a legitimate-looking domain name and then spread the link via chain emails. Case in point: the first website that ever traumatized me as a kid was something called Bonsai Kitten. Complete with realistic-looking photos of cute kittens contorted into various jars and vases, this website completely fooled the 5th-grade me into believing that some unimaginably cruel person was raising and selling mutilated kittens. Apparently I wasn’t the only one - the site drew hundreds of complaints a day from concerned animal-lovers and was the subject of an FBI investigation according to Wikipedia. Eventually Bonsai Kitten was debunked, and I’m now friends with the creator of the site that once haunted my nightmares.

Another example of misinformation that people online have been exposed to since the dawn of the Internet is email-based phishing and spam. However, nowadays it is relatively difficult to create a successful bulk phishing/spam campaign, largely thanks to Google’s improvements in spam filtering and Gmail’s ever-increasing centralization of email. As Mike Hearn describes in great detail on the messaging@moderncrypto.org mailing list, Google eventually “won” the spam war by building a reputation system in which sender reputations could be calculated faster than the attacker could game the system. Notably, this solution relied on both Gmail’s ability to scan an incredibly large volume of plaintext email and their ability to broadly distinguish between legitimate Gmail users and bots trying to do Sybil attacks on the reputation scoring system.
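
To make the reputation idea concrete, here is a minimal sketch of a per-sender score built from spam/not-spam reports. It is not Gmail's actual system: the class name, the smoothing priors, and the `reporter_trust` weight (a stand-in for telling real users apart from Sybil bots) are all assumptions made for illustration.

```python
from collections import defaultdict


class SenderReputation:
    """Toy per-sender reputation built from user spam / not-spam reports."""

    def __init__(self, prior_good=1.0, prior_bad=1.0):
        # Smoothing priors (assumed values) so unknown senders start at 0.5.
        self.prior_good = prior_good
        self.prior_bad = prior_bad
        self.good = defaultdict(float)
        self.bad = defaultdict(float)

    def report(self, sender, is_spam, reporter_trust=1.0):
        # reporter_trust down-weights reports from accounts that look like
        # bots; a weight of 0.0 means the report is ignored entirely.
        if is_spam:
            self.bad[sender] += reporter_trust
        else:
            self.good[sender] += reporter_trust

    def score(self, sender):
        # Fraction of (weighted) reports saying "not spam", in [0, 1].
        good = self.good.get(sender, 0.0) + self.prior_good
        bad = self.bad.get(sender, 0.0) + self.prior_bad
        return good / (good + bad)


rep = SenderReputation()
for _ in range(8):
    rep.report("198.51.100.7", is_spam=True)
# A bot vouching for the spammer carries no weight in this sketch.
rep.report("198.51.100.7", is_spam=False, reporter_trust=0.0)
print(rep.score("198.51.100.7"))
```

The point of the sketch is only that the score depends on who is reporting as much as on what is reported, which is why distinguishing legitimate users from bots matters so much.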

How does this change in a world where phishing emails are generated by really smart AIs? We might imagine that the content of these emails would do a much better job at fooling both humans clicking the spam/not-spam labels and Gmail’s filter algorithms into thinking that they aren’t spam. However, the spammer would still have to figure out a way of obtaining non-blacklisted sender IP addresses, which can be done by signing up for a bunch of accounts with a webmail service like Gmail (which Gmail tries to prevent, as Mike Hearn notes) or by taking over someone else’s account and spamming their contacts.

The latter case may seem intractable. If a bot hacks my Gmail account, uses my sent mail to train itself to generate emails that sound exactly like me, and emails my friends asking them to send money to the hacker’s Bitcoin address, Gmail’s spam filters are going to have a hard time figuring out that those were not legitimate emails from me. On the other hand, I could have prevented this from happening had I done a better job of securing my email account. For these reasons, I consider email-based misinformation to be mostly a solved problem as long as Gmail’s anti-spam team keeps doing their job, Google still controls the vast majority of email (ugh), and users can keep their accounts secured.

What about phishing websites, then? There are some incomplete defenses against those:

Google SafeBrowsing is a service integrated into Chrome and other browsers that contains a dynamic blacklist of “bad” domains, such as phishing sites and sites distributing known malware. Browsers that enable SafeBrowsing will show a warning before allowing a user to view a site on the SafeBrowsing blacklist.
High-profile legitimate websites such as banks will often buy Extended Validation TLS certificates, which require their organization to be validated by the issuing certificate authority. In exchange, browsers show a green bar in the URL bar next to the lock icon, usually containing the name of the organization and the country code. However, it’s relatively easy to game the validation process, and it’s unclear how well this actually protects average users from real-life phishing attacks.
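
As a rough illustration of the blacklist check in the first defense above, the sketch below looks up a URL's hostname, and each of its parent domains, in a local set. This is a simplification and not the real SafeBrowsing protocol; the `BLOCKLIST` entries and the `is_blocked` helper are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist entries, using reserved example domains.
BLOCKLIST = {"phish.example", "malware-downloads.example"}


def is_blocked(url, blocklist=BLOCKLIST):
    """Return True if the URL's host or any parent domain is blacklisted."""
    host = (urlsplit(url).hostname or "").rstrip(".")
    labels = host.split(".")
    # Check "a.b.c", then "b.c", then "c".
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


print(is_blocked("https://login.phish.example/account"))  # subdomain of a listed domain
print(is_blocked("https://example.org/"))                 # not listed
```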

On the other hand, the problem of determining whether a website is presenting a distorted version of reality has some important differences from the problem of determining whether a website is phishing/malware. For one, many people who have no problems accepting Google’s decision on whether a website is distributing ransomware binaries would not be happy if Google were the sole arbiter of whether news stories were true or not. The occasionally-made argument that SafeBrowsing is a form of censorship would apply much more cleanly if SafeBrowsing (or a similar megacorp-controlled service) also blacklisted sites that were ruled to be “fake news”.

So how do we protect people from believing fake news in a world where anyone can generate realistic-looking videos, images, and stories?

One idea I brought up with Aviv was that web browsers or content platforms like Facebook and YouTube could show a special UI indicator for media and stories from reputable news sites, similar to how browsers show a green bar for EV certificates and how Twitter/Facebook show a check mark next to verified account names. For instance, videos that are approved as legitimate by the Associated Press could be signed with a private key controlled by AP, with the corresponding public key preloaded into browsers. Then when the video is playing in someone’s Facebook news feed, the browser’s trusted UI (for instance somewhere in the URL bar) would display a popup informing the user that the video has been approved by AP. This would not, however, prevent attacks where a Facebook user makes up a fake caption (“Trump declares war on Hawaii”) for a real image (an old photo of Trump speaking).
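
Here is a toy sketch of what the signing and checking could look like, using textbook RSA over a SHA-256 hash with only the Python standard library (Python 3.8+ for the modular inverse). Everything here is an assumption for illustration: the key is built from two small, well-known primes and is not secure, `TRUSTED_KEYS` stands in for whatever key list a browser might preload, and a real design would use a vetted signature library and format.

```python
import hashlib

# Toy key material from two well-known Mersenne primes. Far too small and
# too structured for real use; this only demonstrates the flow.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, held only by the signer

# Hypothetical list of publisher public keys preloaded into the browser.
TRUSTED_KEYS = {"AP": (N, E)}


def digest_as_int(media, n):
    """Hash the media bytes and reduce the digest into the range [0, n)."""
    return int.from_bytes(hashlib.sha256(media).digest(), "big") % n


def sign(media):
    """Publisher side: sign the hash of the media with the private key."""
    return pow(digest_as_int(media, N), D, N)


def approved_by(media, signature):
    """Browser side: return the publisher whose key verifies, else None."""
    for publisher, (n, e) in TRUSTED_KEYS.items():
        if pow(signature, e, n) == digest_as_int(media, n):
            return publisher
    return None


video = b"raw bytes of a video file"
sig = sign(video)
print(approved_by(video, sig))                     # the untouched video
print(approved_by(video + b" (doctored)", sig))    # any change breaks the signature
```

Note that the signature covers only the media bytes. A caption written by whoever shares the video sits outside it, which is exactly the fake-caption gap described above.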

(Browsers and news reader apps could even add an optional “trusted news mode” where they only display media files that have been cryptographically signed by a reputable journalist organization. Unfortunately this would also block content from “citizen journalists” such as people livetweeting photos from a protest.)

To end on a more pessimistic note, I fear that we ultimately can’t stop misinformation campaigns from becoming rampant and normalized, not because of purely technological reasons but because of psychological ones. During and in the wake of the 2016 Presidential Election, a lot of the shares, likes, and retweets that I saw boiled down to people on both sides trying as hard as they could to reinforce their existing beliefs. Social media, it turns out, is an excellent tool for propagating “evidence” of what you believe, regardless of whether the evidence is real or not.

Fake news that is unpalatable (e.g., Bonsai Kitten) stops spreading once it’s been debunked, but fake news that is crafted to be consistent with your desired reality can keep getting views, shares, and clicks. Instead of reality apathy, we end up with pick-your-own-reality filter bubbles, in which people gather and amplify fake evidence for the reality that best supports their underlying narrative. Instead of “giving up” on consuming information, people cherry-pick their information consumption based on feelings instead of fact, turning more and more online spaces into breeding grounds for extremism.

And so, maybe the majority of people wouldn’t even want SafeBrowsing-style blacklists of fake news sites or verification badges on legitimate journalist-vetted news articles, because they’re not reading the news to learn the truth - they’re reading the news to validate and spread their existing worldviews.

Effectively this means that any technological solution to the information apocalypse depends on a social/behavioral solution: people need to welcome cognitive dissonance into their online spaces instead of shunning it. But it sounds almost ridiculous to suggest that shares/likes/retweets should be based on factual accuracy, not emotions. That’s not how social media works.

discrete blogarithm
