Smurfing a Twitter bot
The idea
The idea came up a while ago when I was talking with colleagues: would it be possible to make a bot that retweets while speaking like a Smurf?
Smurfs, or Schtroumpfs in French, are an imaginary people of small blue creatures living in a mushroom village. They were born in the imagination of the Belgian cartoonist Peyo. They speak a strange language, which is based on human language but replaces some words with "smurf".
I follow the French newspaper Le Monde on Twitter (@lemondefr), and I decided that smurfing their tweets would be a good test.
Language analysis
Analysing sentences is complicated, and it might be harder in French than in some other languages. I never imagined for a second that I could pull this project off without a good parser.
I searched the web for a few hours and finally discovered the work of Alpage, a research group at INRIA, the French national research institute for computer science.
They have created a set of tools that analyse a sentence and return its lexemes.
You can try it yourself if you speak a few words of French.
I tried to install the Alpage system but failed, so I finally decided to call a demo page, fetch the result and work with the data it returned.
Word transformation
Once the sentence is parsed into lexemes, the next step is to replace a word with its smurf equivalent. Smurf language is not just about changing a word to "smurf": you have to replace it with the smurf word of the same category.
Category | Smurf form | French word example |
---|---|---|
Adverb | schtroumpfement | admirablement |
Past verb, plural form | schtroumpfaient | venaient |
Noun, plural form | schtroumpfs | poneys |
I wrote a basic French-smurf translation dictionary to help me with the transformation.
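A minimal sketch of what such a dictionary could look like, assuming the parser labels each lexeme with a category. The category names (`ADV`, `V_IMPF_3PL`, `N_PL`), the lexeme shape and the function `smurfWord` are hypothetical and not taken from Alpage or from the bot's code; the three entries mirror the table above.

```javascript
// Hypothetical category labels; a real parser has its own tag set.
const SMURF_FORMS = new Map([
  ["ADV", "schtroumpfement"],        // adverb, e.g. "admirablement"
  ["V_IMPF_3PL", "schtroumpfaient"], // past verb, plural, e.g. "venaient"
  ["N_PL", "schtroumpfs"],           // plural noun, e.g. "poneys"
]);

// Returns the smurf form for a lexeme, or the original word when the
// category has no entry in the dictionary.
function smurfWord(lexeme) {
  return SMURF_FORMS.has(lexeme.category)
    ? SMURF_FORMS.get(lexeme.category)
    : lexeme.word;
}
```

Keeping the lookup keyed by category rather than by word is what lets a single entry cover every plural noun or every adverb.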
Transforming a single word is not enough. In some cases I also had to transform the preceding word, because the pronoun or article in front of it depends on the word that follows (elision, gender and number).
Smurf form | French example |
---|---|
Je schtroumpfe | J'arrive |
Du schtroumpf | De l'eau |
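The two rows above can be sketched as a small fix-up applied to the text in front of the replaced word. This is only an illustration under assumptions: the function name `joinWithSmurf` is made up, only these two patterns are handled, and "de l'" is always turned into "du", which assumes the masculine singular "schtroumpf".

```javascript
// Smurf forms start with a consonant, so an elided word in front of the
// replaced one has to be expanded again:
//   "j'"    -> "je "
//   "de l'" -> "du "  (assumes the masculine singular "schtroumpf")
// The capture group keeps the case of the first letter ("J'" -> "Je ").
function joinWithSmurf(before, smurfForm) {
  const fixed = before
    .replace(/\b(d)e l['’]$/i, "$1u ")
    .replace(/\b(j)['’]$/i, "$1e ");
  return fixed + smurfForm;
}
```

For example, `joinWithSmurf("J'", "schtroumpfe")` gives `"Je schtroumpfe"`, and text that does not end with one of the two patterns is left untouched.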
I didn't try to be clever when selecting the French word to transform: it is picked completely at random.
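A random pick of this kind could look like the sketch below. It assumes each lexeme carries a `category` field and that only some categories are replaceable; the function name, the lexeme shape and the category labels are hypothetical.

```javascript
// Picks one index at random among the lexemes whose category is in the
// `replaceable` set. Returns -1 when no lexeme qualifies.
// `random` defaults to Math.random and can be swapped for a
// deterministic function in tests.
function pickRandomIndex(lexemes, replaceable, random = Math.random) {
  const candidates = [];
  lexemes.forEach((lexeme, index) => {
    if (replaceable.has(lexeme.category)) candidates.push(index);
  });
  if (candidates.length === 0) return -1;
  return candidates[Math.floor(random() * candidates.length)];
}
```

Every replaceable word has the same chance of being chosen, which is why the quality of the result depends so much on luck.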
Integration with Twitter
The project is written in Node.js, and Twit does a good job of listening to Twitter's streaming API and posting tweets, so I didn't search for long and decided to go with it.
The bot listens to the streaming API for each tweet from @lemondefr and delegates the transformation work to the transformation tool.
Once the text is transformed, the bot makes the call to tweet the smurfed version.
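The listen-transform-post loop can be sketched as follows. This is not the bot's actual code: `startBot` and `smurf` are hypothetical names, and `client` is assumed to be a Twit instance, of which only `stream("statuses/filter", ...)` and `post("statuses/update", ...)` are used. Passing the client in as a parameter keeps the sketch runnable without credentials.

```javascript
// `client` is expected to behave like a Twit instance:
//   client.stream(path, params) returns an emitter with .on("tweet", fn)
//   client.post(path, params, callback) sends a request
// `smurf` is a hypothetical function: tweet text in, smurfed text out.
function startBot(client, sourceUserId, smurf) {
  const stream = client.stream("statuses/filter", { follow: sourceUserId });
  stream.on("tweet", (tweet) => {
    // The "follow" filter also delivers replies to and retweets of the
    // account, so keep only tweets written by the account itself.
    if (tweet.user.id_str !== sourceUserId) return;
    client.post("statuses/update", { status: smurf(tweet.text) }, (err) => {
      if (err) console.error("Could not post tweet:", err.message);
    });
  });
  return stream;
}
```

With the real library, `client` would be a `new Twit({ ... })` built from the application's OAuth credentials, and `sourceUserId` the numeric ID of @lemondefr passed as a string.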
Release
As always, I had a hard time convincing myself to release an unfinished project with known bugs and room for improvement, but I finally did it.
The bot is hosted on a DigitalOcean droplet.
Some bugs persist, and I have to restart the bot manually once in a while. I also tweak the transformation algorithm when a tweet doesn't come out well.
Results
The bot is living its own life out there. Go and follow it!
As the transformed word is selected randomly, luck plays a big part in the quality of the generated tweet.
The bot is able to produce some cool sentences, such as:
Droit de vote des schtroumpfs : Londres condamné à Strasbourg http://t.co/0mV5ahnxoK
— Le Schtroumpf (@leschtroumpffr) 10 Février 2015
Les experts divisés sur un schtroumpf attribué à Léonard de Vinci saisi en Suisse http://t.co/waHvCuE90i
— Le Schtroumpf (@leschtroumpffr) 10 Février 2015
Harvard interdit les relations sexuelles entre professeurs et schtroumpfs http://t.co/E1ELllkpzB sur @Campus_LeMonde pic.twitter.com/3nVVjIVJAE
— Le Schtroumpf (@leschtroumpffr) 9 Février 2015
Sometimes we are less lucky:
Spider-Man rejoint l'univers schtroumpf de Marvel http://t.co/FP4Xd40mlo
— Le Schtroumpf (@leschtroumpffr) 10 Février 2015
See the code
The code is freely available on GitHub.
Let's say it upfront: it's really not a good piece of software, but it gets the job done.
Hey! I'm on Twitter too, if you want to chat about the bot or something else. Feel free to comment below as well.