Our online personality is now as measurable as our carbon footprint. In addition to some rather obvious statistics, such as how often we tweet, how many others we follow, and how many others follow us, we principally reveal ourselves in our choice of words. How often I refer to “I,” “me,” “myself,” “mine,” and “my” can tell you a good deal about my propensity for self-absorption, while a frequent use of “we” and “our” indicates a willingness to share either credit or blame. The frequency with which I use “you” or “your” is just as indicative of a desire to channel my feelings outward, so if I also show a partiality for negative words, this pairing of observables is strongly indicative of hostility. Frequent use of “LOL,” “OMG,” and the exclamation point reveals an excitable personality, while emoticons and hashtags such as #irony and #sarcasm make explicit not just my feelings but a playful stance toward the content of my own tweets. The use of complex sentence structures hinging on the logical connectives “if,” “but,” “yet,” and “therefore” suggests a capacity for analytical thinking, while frequent questions—especially those involving words of negative sentiment—offer a clue to a nervous disposition.
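As a minimal sketch of how such word-choice signals might be counted, the Python below tallies how often a text uses words from a few small categories. The category names and word lists are assumptions taken from the examples in the paragraph above; they are not the lexicons used by any real profiling tool.

```python
import re
from collections import Counter

# Hypothetical word categories for illustration only; these are NOT
# the lexicons of LIWC or AnalyzeWords.
CATEGORIES = {
    "self": {"i", "me", "myself", "mine", "my"},
    "collective": {"we", "our"},
    "outward": {"you", "your"},
    "connective": {"if", "but", "yet", "therefore"},
}

def category_rates(text):
    """Return each category's share of all word tokens in text (0.0 to 1.0)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for name, words in CATEGORIES.items():
            if token in words:
                counts[name] += 1
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {name: counts[name] / total for name in CATEGORIES}

sample = "I think my plan is great, but you and your friends hate it."
rates = category_rates(sample)
for name, rate in rates.items():
    print(f"{name}: {rate:.3f}")
```

Rates rather than raw counts are used here so that prolific and occasional tweeters can be compared on the same footing.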
Using simple linguistic criteria such as these in combination with an array of sentiment lexicons, James Pennebaker and his team at the University of Texas have developed a sentiment tool, Linguistic Inquiry and Word Count (LIWC), that can quantify an author’s personality along a range of dimensions, including positivity, anxiety, depression, anger, affability, social engagement, arrogance, enthusiasm, logicality, topicality, and self-absorption. An online version (at AnalyzeWords.com) allows users to affectively profile a Twitter personality of their choosing by entering the corresponding Twitter handle. Here is a screen-grab of an LIWC profile of @realDonaldTrump in July 2016:
Trump’s profile, captured in the days following his acceptance speech to the Republican National Convention in Ohio in 2016, contains notes of arrogance, anger, and intense positivity and even suggests a certain logical structure to his argumentativeness. If none of this seems particularly surprising, this is, after all, the goal of a credible analysis. Trump’s anger was voluble and contagious, and the real estate tycoon is famously unrestrained by modesty when touting his acumen for making deals, picking wives, or fighting terrorists. Frequent uses of the mantra “Make America Great Again” also leavened his attacks on “crooked” Hillary to establish an online personality that was angry about the present but very positive about the future. But author profiling is not palm reading, even if each involves the measurement of different kinds of “life” line. Online personality is dynamic and diachronic, not static and synchronic. Though one can discern a general disposition that persists over time, individual readings can reveal the effects of context. Consider Trump’s profile just a day after Hillary Clinton’s acceptance speech at the 2016 Democratic National Convention in Philadelphia, a speech during which Trump tweeted prolifically:
Trump remains upbeat in this profile, if less so than in the afterglow of his own speech, yet his anger levels spike as he devotes more energy to denouncing his rival than to expounding his own vision. So we see his attacks on Hillary become less analytical as he swaps logical structure for the pugilistic simplicity of insults. Online personalities as famously pungent as Trump’s are a magnet for satire, and one might well ask how good a job these phony Trump accounts do at capturing his volatile Twitter temperament. One real-life satirist is @DonaldDrumpf, a human account whose name pokes fun at the Trump family’s immigrant history.
@DonaldDrumpf aims to satisfy two competing goals: to mimic Trump’s barbed Twitter style while undercutting the content of his rhetoric. It achieves the latter by addressing topical issues, such as claims that Trump is Putin’s Manchurian candidate, while mocking Trump’s self-regarding origin story in tweets such as: “My father only left me a few measly million dollars. Now I have billions. Testimony to my ability to borrow. Drumpf 2016 #selfmademan.” At the same time, pitch-perfect word choice allows @DonaldDrumpf to echo the pugnacious Trump tone, and as shown in this AnalyzeWords profile, a preponderance of negative words such as “measly” pushes Drumpf’s perceived anger levels high into the red zone. While this is an assured act of Twitter ventriloquism, aspects of the satirist’s own comedic style will inevitably show through when delivering a comic message in the target’s voice. We see in the profile above that the satirist goes large on every measurable aspect of Trump’s personality—even upping the sensory dimension to lend him a touchy-feely sense of his own feelings—except for the two aspects that most clearly shine through in Trump’s own tweets: arrogance and positivity. And while a cutting irony is sometimes conveyed with a simple hashtag such as #selfmademan, the conflation into a single tweet of what we expect Trump to say and what we expect his critics to say often calls for an elaborate counterfactual logic, serving to boost Drumpf’s perceived analyticity to an artificially high level.
Now look at the profile of a satirical Trump Twitterbot named @DeepDrumpf:
Designed by MIT postdoctoral researcher Bradley Hayes, the @DeepDrumpf bot demonstrates the use of recurrent neural networks, specifically long short-term memory (LSTM) networks, to train a generative system on the speech patterns of a human exemplar, as found in, say, transcripts of one’s speeches and tweets. The experimental beat writers of the 1960s used the cutup method of Brion Gysin and William S. Burroughs to slice’n’dice the texts of other writers into novel arrangements that, one hopes, will preserve the themes of the original text while disrupting its inherent clichés. @DeepDrumpf uses the deep-learning technology of LSTM as its statistical answer to scissors and paste to achieve much the same ends: to cut transcripts of attested speech into training data so as to learn how to stick it all together again in resonant but strangely familiar ways. The dark irony of @DeepDrumpf is not planned with anything like the comedic rigor of @DonaldDrumpf, but in a testament to the cutup method, its outputs can seem just as disruptive of its target’s clichés as anything written by a human satirist. The following tweet, one of @DeepDrumpf’s most retweeted and favorited, fuses several Trumpian themes, from the anti-immigrant wall to “You’re fired!” to “job creation,” in a way that comically undercuts them all: “I can destroy a man’s life by firing him over the wall. That’s always been what I’m running, to kill people and create jobs. @HillaryClinton.” Unlike the very human @DonaldDrumpf, which strives for the comedic consistency of a human satirist so that every tweet is worthy of retweeting, @DeepDrumpf is a more hit-and-miss affair. Yet because we do not hold bots to the same standards as human creators on Twitter, we will gladly engage with the bot in a co-creative process of collaborative filtering, endowing some outputs with collective acclaim through our retweets and favorites.
Though human creators benefit from the same word-of-mouth marketing by avid followers, we don’t want our bots to simply ride on someone else’s coattails but to become an active part of the co-creation process. Like Duchamp recognizing the aesthetic merits of a lowly object that many others have scorned, we become connoisseurs of the generative objet trouvé when we acclaim these accidents of bot meaning.
As bots go, @DeepDrumpf tends to—not inappropriately—run its mouth, in an effort to squeeze as much content into its bite-sized tweets as possible. Because its content is drawn from the disassembled and reconstituted tweets of its target, its outputs convey a magnified sense of Trump’s personality. As shown in the profile above, the bot lights up the board on all dimensions measured by AnalyzeWords, except for the sensory dimension. To judge by Trump’s own words, he is not one to articulate his own feelings at length or in public, preferring to project feelings of love and admiration (for him) onto others while he gives a voice to their rage. In deciding which satirical account does a better job of capturing the personality of Donald Trump, it may seem odd to even imagine that a Twitterbot might have a personality at all. @DonaldDrumpf’s is essentially a blend of Trump’s and that of its human creator, while @DeepDrumpf’s personality is something else again, the exaggerated (yet undercut) personality of a digital über-Trump. In fact, every Twitterbot has a personality. It may be the personality of a raucous pet or a pet rock, but it is a personality nonetheless. How could any Twitterbot fail to have a personality, given that we unleash our bots onto a vast social network in which people judge the character of others by what and how they tweet? A Twitterbot may be an artificial entity, but each Twitterbot is an artificial social entity to boot.
By assigning a Twitter account to a specific point on each of 11 scales, AnalyzeWords effectively maps the account to a point in an 11-dimensional space. If we draw a line from the origin of this space (all zeroes) through this point, the resulting vector acts as a single representative needle for the personality of the account; as the personality changes—as its tweets become more or less angry, say, or more or less attuned to others—the needle will twitch about in the space. Now imagine the needles of all Twitter users, pointing in various directions and moving ever so slightly with each new tweet. When two needles seem to point in the same direction, leaving only a small angle between them in vector space, then we can say that the corresponding Twitter accounts are exhibiting highly similar personalities. We need only measure the cosine of the angle between two vectors to estimate how similar they are, since the cosine of a zero-degree angle is 1 and the cosine of a 180-degree angle is −1. Thus, to estimate the similarity of @realDonaldTrump to @DonaldDrumpf or @DeepDrumpf, or to any bot of your choosing, we can simply measure the cosine of the angle between their AnalyzeWords vectors.
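The cosine measure described above can be sketched in a few lines of Python. The function name and the toy vectors are illustrative assumptions rather than anything taken from AnalyzeWords; the calculation itself is the standard one, the dot product divided by the product of the two vector lengths.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine is undefined for a zero-length vector")
    return dot / (norm_a * norm_b)

# Toy needles (made up for illustration):
same = cosine_similarity([1, 2, 3], [2, 4, 6])         # same direction
opposite = cosine_similarity([1, 2, 3], [-1, -2, -3])  # opposite direction
orthogonal = cosine_similarity([1, 0], [0, 1])         # at right angles
```

Two needles pointing the same way score close to 1 regardless of their lengths, two pointing in opposite directions score close to −1, and needles at right angles score 0.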
As an example, let’s compare the AnalyzeWords profile of @realDonaldTrump to that of @Lord_Voldemort7. Our reasons for this choice will become clear soon enough but let’s proceed on the presumption that the tweets of a president (or a presidential candidate) will be quite unlike those of someone pretending to be the self-proclaimed Dark Lord. Sampled in mid-July 2016, @realDonaldTrump’s profile produced this vector:
<Analytic: 54, Angry: 65, Arrogant: 71, Depressed: 55, In-the-moment: 47, Personable: 51, Plugged In: 47, Sensory: 47, Spacy: 50, Upbeat: 55, Worried: 67>
To distinguish high from low scores for each dimension, as each has an opposing semantic interpretation—a low score for Angry actually means Calm, after all—we subtract 50 from each value, so that dimensions run from −50 to +50 instead:
<Analytic: 4, Angry: 15, Arrogant: 21, Depressed: 5, In-the-moment: −3, Personable: 1, Plugged In: −3, Sensory: −3, Spacy: 0, Upbeat: 5, Worried: 17>
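The centering step can be reproduced with a short Python sketch. The scores are those quoted above for @realDonaldTrump; the dictionary layout, the `center` helper, and the `MIDPOINT` constant are assumptions made for illustration.

```python
# Raw AnalyzeWords scores for @realDonaldTrump, as quoted in the text.
raw_profile = {
    "Analytic": 54, "Angry": 65, "Arrogant": 71, "Depressed": 55,
    "In-the-moment": 47, "Personable": 51, "Plugged In": 47,
    "Sensory": 47, "Spacy": 50, "Upbeat": 55, "Worried": 67,
}

MIDPOINT = 50  # the value subtracted from every score in the text

def center(profile, midpoint=MIDPOINT):
    """Shift every score so that the midpoint becomes zero."""
    return {name: score - midpoint for name, score in profile.items()}

centered = center(raw_profile)
for name, value in centered.items():
    print(f"{name}: {value}")
```

After centering, the sign of each value says which side of the scale the account falls on, which is what lets opposing personalities produce opposing vectors.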
The needles of opposing personalities will point in very different directions. Because we want to normalize each vector so that its length in our vector space is 1, we first calculate the length of the vector using the standard Euclidean metric, the square root of the sum of the squares of each dimension, which gives 32.388. We can now normalize the vector by dividing each dimension by this length, to give:
<Analytic: 0.124, Angry: 0.463, Arrogant: 0.648, Depressed: 0.154, In-the-moment: −0.093, Personable: 0.031, Plugged In: −0.093, Sensory: −0.093, Spacy: 0.000, Upbeat: 0.154, Worried: 0.525>
When we now calculate the length of this normalized vector using the Euclidean metric, we see that it has a unit length of 1.0. By way of comparison, the profile for @Lord_Voldemort7 yields:
To calculate the cosine of the angle between any two vectors of unit length, we just have to calculate the dot product of the two by summing the products of the corresponding dimensions of each vector. So the dot product of the vectors for @realDonaldTrump and @Lord_Voldemort7 is 0.7795. Recall that the more similar the AnalyzeWords profiles of two accounts, the closer their vectors will be in 11-dimensional space and the nearer the value will be to 1 (conversely, the more dissimilar the profiles, the nearer it will be to −1). By this reckoning, 0.423 suggests just a modest resemblance, while 0.7795 is indicative of deep similarities between @realDonaldTrump and @Lord_Voldemort7. In fact, if we use AnalyzeWords.com to profile 695 of the most followed accounts on Twitter (as ranked by TwitterCounter.com), we find that @Lord_Voldemort7 is the closest of them all to @realDonaldTrump in our vector space, with the profile of Family Guy creator and sharp-tongued satirist Seth MacFarlane racking up a similarity of 0.6237. Conversely, pop singer Carly Rae Jepsen (@carlyraejepsen) is the most dissimilar of all the 695 profiles, exhibiting a (dis)similarity of −0.7197 to @realDonaldTrump. In vector space terms, the personality needles of these celebrity tweeters resolutely point in opposite directions.
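A rough sketch of the whole procedure, normalizing each vector, taking dot products, and ranking accounts by similarity to a target, might look like the Python below. Only the @realDonaldTrump vector comes from the text; the three comparison profiles and their handles are invented purely for illustration and do not correspond to any real account or to the figures quoted above.

```python
import math

def normalize(vector):
    """Scale a vector to unit Euclidean length."""
    length = math.sqrt(sum(x * x for x in vector))
    if length == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / length for x in vector]

def dot(a, b):
    """Dot product; equals the cosine when both inputs have unit length."""
    return sum(x * y for x, y in zip(a, b))

# Centered @realDonaldTrump vector, as given in the text.
trump = [4, 15, 21, 5, -3, 1, -3, -3, 0, 5, 17]

# Hypothetical centered profiles, made up for this example.
others = {
    "@example_similar": [2, 12, 18, 3, -1, 0, -2, -4, 1, 2, 10],
    "@example_opposite": [-4, -15, -21, -5, 3, -1, 3, 3, 0, -5, -17],
    "@example_neutral": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}

target = normalize(trump)

# Pair each account with its cosine to the target, most similar first.
ranked = sorted(
    ((dot(target, normalize(vec)), name) for name, vec in others.items()),
    reverse=True,
)
for score, name in ranked:
    print(f"{name}: {score:.4f}")
```

Normalizing once and then using plain dot products keeps the comparison cheap when one target has to be checked against hundreds of profiles, as in the 695-account survey.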
Tony Veale is an associate professor of computer science at University College Dublin.
Mike Cook is a senior research fellow at the University of Falmouth.
Twitterbots: Making Machines that Make Meaning by Tony Veale and Mike Cook will be published by the MIT Press on Sept. 4, 2018. Copyright © 2018 by the Massachusetts Institute of Technology.