This post is a work in progress and is not yet complete or proofread. Read at your own risk.
The Santiago Swallow project started with a simple question: how easy is it to quickly and cheaply create credibility online? The answer was clear. Anybody with Internet access and PayPal can make a Twitter account and populate it with paid followers. Nothing more than that is needed. I extended the character into Wikipedia and this, his own web site, because I thought it might be necessary to stop him from being found out. It wasn’t: he would have done just as well without them.
I had been gradually improving Santiago’s automatic tweet generation so it was not too repetitive. I called these tweets fortune cookies. My intention was to have Santiago spout some faux TED-style wisdom every so often: a short line that might seem profound at first glance but was actually nonsense. One formula was:
“To [verb a] is to [verb b] [adverb].”
The automation software then randomly picked verbs and an adverb from a list and generated hundreds of sentences. Some were good:
Some were not:
There were more misses than hits. Letting the automation run unchecked—which I did at first to drive Santiago’s tweet count up—caused him to spew repetitive, cryptic nonsense:
This made it obvious he was a bot, so I had to curate the auto tweets before they were posted, deleting many and sometimes hand-correcting the grammar of others. I also wrote some tweets by hand and queued them to be released at random times and in random order. Many of Santiago’s tweets are still written by me and added to the queue, and many of his automated Tweets are deleted by me before they are sent. Curation teaches me what works and what does not. The eventual goal is to have everything he says be auto-generated. Santiago took a small step towards this goal, and became a more interesting Tweeter, once I simplified his fortune cookie formulae and made many more of them. This increased the variety, reduced the chance of the same formula appearing twice in succession, and resulted in a better ratio of sense to nonsense. One example is:
“I [verb] therefore I am.”
This yielded some interesting results:
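As a rough illustration of the fortune cookie approach described in this section, here is a minimal Python sketch. The word lists are invented placeholders (the real lists are not reproduced in this post), and the function name and the rule for avoiding a repeated formula are my own assumptions rather than a description of the actual automation software.

```python
import random

# Hypothetical word lists; the real lists are not given in the post.
VERBS = ["dream", "listen", "share", "wander", "doubt"]
ADVERBS = ["boldly", "quietly", "slowly"]

# The two formulae quoted in the post, written as format strings.
TEMPLATES = [
    "To {verb_a} is to {verb_b} {adverb}.",
    "I {verb_a} therefore I am.",
]


def fortune_cookie(rng, last_template=None):
    """Return (template, sentence), avoiding the template used last time."""
    choices = [t for t in TEMPLATES if t != last_template] or TEMPLATES
    template = rng.choice(choices)
    verb_a, verb_b = rng.sample(VERBS, 2)  # two distinct verbs
    sentence = template.format(
        verb_a=verb_a, verb_b=verb_b, adverb=rng.choice(ADVERBS)
    )
    return template, sentence


if __name__ == "__main__":
    rng = random.Random()
    last = None
    for _ in range(5):
        last, line = fortune_cookie(rng, last)
        print(line)
```

With only two formulae the output simply alternates between them; the benefit of the no-repeat rule shows up as more formulae are added. The output would still need the kind of curation described above before anything is queued.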
In addition to generating fortune cookies, Santiago posts information from the web too. For example, his queue includes scores from the social analytics web site Kred, which featured in the original Quartz story. He scrapes his own score, and also randomly selects from lists of public figures and Quartz editorial staff, by going to:
kred.com/[user name]
then using the site’s built-in “Tweet” function. There is a catch: often the resulting Tweet exceeds Twitter’s 140-character limit, so some tricky deletion is required, normally of some or all of this part of Kred’s boilerplate text: [and an Outreach level of x in the Global Community]. Another problem is that not all verified accounts use the public figure’s name in an expected way, so sometimes he is Tweeting the Kred score of an account that does not belong to the right person. I need to build his target lists by checking for verified accounts.
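The trimming step can be sketched roughly as follows. This is an illustration under assumptions, not the code Santiago runs: the regular expression guesses at the wording of Kred’s clause from the fragment quoted above, the function name is hypothetical, and returning None to flag a Tweet for manual editing is my own choice.

```python
import re

TWEET_LIMIT = 140  # Twitter's character limit, as stated in the post

# Assumed wording of the optional clause, based on the fragment quoted above.
OUTREACH_CLAUSE = re.compile(
    r"\s*and an Outreach level of \d+ in the Global Community"
)


def fit_to_limit(text, limit=TWEET_LIMIT):
    """Return text that fits the limit, or None if it needs hand editing."""
    if len(text) <= limit:
        return text  # already short enough, leave it alone
    shorter = OUTREACH_CLAUSE.sub("", text, count=1)
    if len(shorter) <= limit:
        return shorter
    return None  # still too long: leave it for manual curation
```

Dropping the whole clause is the bluntest option; deleting only part of it, as sometimes happens in practice, would need more careful handling of the surrounding sentence.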
I am also experimenting with having Santiago tweet lyrics from songs by Daft Punk, Gary Numan and others, but this is hit and miss at present unless I enter them manually. Sometimes lines of lyrics pasted from the web run together or include formatting instructions.
The biggest challenge is having Santiago respond realistically to Tweets directed at him by followers. To do this well, he almost has to pass the Turing Test. “Almost” is the key word. Twitter’s 140-character limit (124 characters in practice, as “@SantiagoSwallow” accounts for 16), plus Twitter users’ conventions, such as “RT” and hashtags, constrain the problem within helpful bounds. Twitter profiles also have a structure for user name, location and web site, which is helpful.
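The character arithmetic above is easy to check. The helper below is a hypothetical one-liner of my own, and it counts only the handle itself; a separating space after the handle would cost one more character.

```python
TWEET_LIMIT = 140  # Twitter's character limit, as stated in the post


def reply_budget(handle, limit=TWEET_LIMIT):
    """Characters left in a Tweet once "@handle" has been included."""
    return limit - len("@" + handle)


print(reply_budget("SantiagoSwallow"))  # 140 - 16 = 124
```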
This challenge nearly tripped Santiago up before he got started. As part of his early automation, he followed random Twitter accounts. This resulted in a Twitter suspension for overly aggressive following before the article was posted, after which I turned follow automation off completely, with one exception, discussed later.
But before the suspension, Santiago had randomly followed an account called @RachelRoseReid. Rachel sent him a tweet asking why he was following her. At that time his response was to favorite her message and retweet it. This amplified Rachel’s suspicion. I sent her a handwritten reply from my account, and then another from Santiago’s, to allay her concerns. I then sent her an e-mail explaining the project and asking her not to give Santiago away. Thankfully, Rachel agreed.
I have taken a number of steps to improve Santiago’s reactive automation. Building on the lesson I learned from the fortune cookies about the need for simplicity and variety, I have Santiago parse all Tweets that include the string [@SantiagoSwallow] to identify ones that are candidates for reply. Tweets containing links are immediately eliminated. There is no way of knowing where they lead and no way for him to understand linked content. Favorites, retweets, modified tweets and “heard throughs” are also eliminated. The remaining interactions come in two classes: new followers and messages. New follower protocol is straightforward. If a profile includes a blog in WordPress, Blogger or another common tool, Santiago will check the date of the last post. Most people do not update their blogs regularly. If the most recent post is more than three months old, Santiago will welcome the follower with a canned phrase teasing them about their blog, for example:
This is a nice automation trick. It appears personal, but the syntax is simple. Once Santiago identifies that the web site in the user’s profile is a dormant blog, he only has to complete this string:
“Welcome, [Twitter handle]. It has been so long since you posted something new on your home page I was starting to worry.”
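To make the reply filter and the new follower protocol concrete, here is a minimal Python sketch. It is an illustration under assumptions, not Santiago’s actual code: the function names are hypothetical, the filter looks only at the text of a Tweet, retweets and their variants are detected by the conventional “RT”, “MT” and “HT” markers, “three months” is approximated as 90 days, and the date of the last blog post is assumed to have been fetched already.

```python
from datetime import date, timedelta

HANDLE = "@SantiagoSwallow"
DORMANT_AFTER = timedelta(days=90)  # "more than three months", approximated

WELCOME = (
    "Welcome, {handle}. It has been so long since you posted something "
    "new on your home page I was starting to worry."
)


def is_reply_candidate(text):
    """Apply the elimination rules to the text of a single Tweet."""
    lowered = text.lower()
    if HANDLE.lower() not in lowered:
        return False  # not directed at Santiago
    if "http://" in lowered or "https://" in lowered:
        return False  # links cannot be followed or understood
    words = [w.strip(":") for w in lowered.split()]
    if any(w in ("rt", "mt", "ht") for w in words):
        return False  # retweet, modified tweet or "heard through"
    return True


def welcome_for(handle, last_post, today):
    """Return the canned welcome if the blog looks dormant, else None."""
    if last_post is None:
        return None  # no blog, or no post date could be found
    if today - last_post > DORMANT_AFTER:
        return WELCOME.format(handle=handle)
    return None


if __name__ == "__main__":
    print(is_reply_candidate("@SantiagoSwallow why are you following me?"))
    print(welcome_for("@example", date(2013, 1, 1), date(2013, 6, 1)))
```

Working out the date of the last post is the hard part in practice and is deliberately left out here, since it depends on the blogging tool in use.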