Plot Hole Studios

Ramblings of a rambler.

Generating UK Parliament Petitions using Markov Chaining

Okay so this is dumb but funny.

I recently noticed that my government’s online petition system has an API. Of course, the next logical step was to create software to generate petition titles automatically, so I pulled open my IDE and started work on something using everyone’s favourite throwing-stuff-together-quickly language, Python. If nothing else, this should stand as a story of how absurdly easy it is to make things with Python, just by copying off everyone else’s paper.

Oh look, you don’t have to sign up for the API! Nice. Sadly, you can only pull down the JSON for 500 petitions at a time, so we’ll have to work around this. Of course, we’ll be using the incredible requests library to manage API calls, because it’s stupidly easy. Here’s the basic first run:

import requests, math

# find out how many petitions there are
no_of_petitions = requests.get('http://lda.data.parliament.uk/epetitions.json').json() \
    ['result']['totalResults']

# you can only get 500 at a time, so we'll have to do it multiple times
for x in range(int(math.ceil(no_of_petitions / 500))):
    # get petition json
    response = requests.get(
        'http://lda.data.parliament.uk/epetitions.json?_view=basic&_properties=label&_pageSize=500&_page=' + str(x))
    all_petitions = response.json()['result']['items']
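The paging arithmetic is easy to check on its own, without hitting the network. Here's a minimal sketch, assuming (as the loop above does) that `_page` counts from zero; `page_count` and `page_urls` are made-up helper names, not part of requests or the API.

```python
import math

BASE_URL = 'http://lda.data.parliament.uk/epetitions.json'
PAGE_SIZE = 500  # the most the API hands over in one go


def page_count(total, page_size=PAGE_SIZE):
    """How many requests it takes to cover `total` petitions."""
    return math.ceil(total / page_size)


def page_urls(total, page_size=PAGE_SIZE):
    """Build one URL per page, assuming pages are numbered from 0."""
    query = '?_view=basic&_properties=label&_pageSize={}&_page={}'
    return [BASE_URL + query.format(page_size, page)
            for page in range(page_count(total, page_size))]
```

So a total of 1,200 petitions would need three requests, for pages 0, 1 and 2.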

After this, we’ll need to dump stuff into a text file to create our corpus. This gets added to the iteration:

    # put it all in a text file
    with open("corpus", 'a') as output_file:
        for petition in all_petitions:
            try:
                if not '\n' in petition['label']['_value']:
                    output_file.write(petition['label']['_value'] + "\n")
                else:
                    print("oh noes newline")
            except UnicodeEncodeError:
                print("oh noes unicode")

Oh yeah, the ‘if not \n’ and the ‘UnicodeEncodeError’ are solving some annoying edge cases by just ignoring them, don’t ask.
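If you'd rather keep those edge cases than skip them, here's one possible alternative, sketched under the assumption that you're on Python 3: flatten any whitespace (newlines included) into single spaces, and open the file as UTF-8 so nothing fails to encode. `clean_title` and `write_corpus` are hypothetical helpers, not what the script above does.

```python
def clean_title(title):
    """Collapse newlines, tabs and repeated spaces into single spaces."""
    return ' '.join(title.split())


def write_corpus(titles, path='corpus'):
    """Append one cleaned title per line, skipping any that end up empty."""
    # explicit UTF-8 so characters like £ don't trip the platform's default encoding
    with open(path, 'a', encoding='utf-8') as output_file:
        for title in titles:
            cleaned = clean_title(title)
            if cleaned:
                output_file.write(cleaned + '\n')
```

The trade-off is a slightly bigger corpus with a few oddly formatted titles in it, which may or may not make the output funnier.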

When this is over, we’re left with a large corpus containing something like 11,500 petitions concerning everything from Brexit, to education, to banning certain music genres? Now we need to start generating stuff. Thankfully, as this is Python, someone’s already written a library for creating Markov chains called ‘markovify’. We can just plug in our text file, and pop out 100 new petitions!

import markovify

# generate!
with open('corpus', 'r') as corpus_file:
    corpus_text = corpus_file.read()

text_model = markovify.Text(corpus_text)

for x in range(100):
    petition = text_model.make_short_sentence(90, tries=100)
    print(petition)
    print("\n")

That ‘make_short_sentence’ bit is pretty important. I quickly noticed that the longer titles were completely unrealistic and boring, so capping it at 90 characters made everything a lot more consistent.
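For the curious, here's roughly what's going on under the hood, as a toy word-level Markov chain with the same sort of length cap. This is a sketch of the general technique, not markovify's actual implementation (the real library does more than this); `build_chain`, `make_sentence` and `short_sentence` are made-up names, and the three-line corpus is invented for the example.

```python
import random
from collections import defaultdict

START, END = '<start>', '<end>'


def build_chain(lines):
    """Map each word to the list of words seen directly after it."""
    chain = defaultdict(list)
    for line in lines:
        words = [START] + line.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def make_sentence(chain, rng):
    """Walk the chain from START, picking a random follower each step."""
    words = []
    word = rng.choice(chain[START])
    while word != END:
        words.append(word)
        word = rng.choice(chain[word])
    return ' '.join(words)


def short_sentence(chain, max_chars, tries=100, rng=random):
    """Keep generating until one fits the cap; give up with None after `tries`."""
    for _ in range(tries):
        sentence = make_sentence(chain, rng)
        if len(sentence) <= max_chars:
            return sentence
    return None


# invented example titles, one per line like the corpus file
corpus = [
    'Ban the sale of fireworks to the public',
    'Lower the voting age to 16',
    'Ban the use of plastic straws',
]
chain = build_chain(corpus)
print(short_sentence(chain, 90))
```

Because all three titles share the word "the", the walk can hop between them at that point, which is exactly where the nonsense comes from.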

So, let’s see our output! My laptop would like to petition the government to…

  • A fine to those who served the British Army to protect the bees!
  • £40 per year Pressure public establishments to provide Mario Kart gaming for residents.
  • Lower the age of 18!
  • Regulate subscription Porn to safeguard it from being closed down.
  • A total outright ban of the UK.
  • Referendum for all sex offenders living within 1 mile of a workforce.

Truly inspirational.

Well, the next step is pretty clear. Time to make this into a Twitter bot! I’ve made like 6 Twitter bots before, so this is all pretty standard to me. My usual tool, Tweepy, will come in handy here: it’s a Python library that turns the Twitter API into a cute set of easy-to-use objects. Here’s the new version of our petition generating script, now with tweeting!

import markovify, tweepy

# tweepy boilerplate
CONSUMER_KEY = 'nope'
CONSUMER_SECRET = 'nuh-uh'
ACCESS_TOKEN = 'not gonna post my api codes again'
ACCESS_TOKEN_SECRET = 'i have done it before...'
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)

# generate!
with open('corpus', 'r') as corpus_file:
    corpus_text = corpus_file.read()

text_model = markovify.Text(corpus_text)
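petition = text_model.make_short_sentence(70, tries=100)
# make_short_sentence returns None if every try fails, so stop here rather than crash below
if petition is None:
    raise SystemExit("no petition this time")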

# create the tweet!
url = "https://petition.parliament.uk/petitions/new?q=" + petition.replace(' ', '+')
tweet = petition + "\n\n" + "Make this petition real! " + url

# tweet the tweet!
api.update_status(tweet)

Did I forget to mention my favourite feature? Every tweet comes with a link to create a new petition with the title already filled in, so you can help create the change my laptop wants to see in the world, all in less than 50 lines of code.
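One caveat on that link: swapping spaces for `+` works until a title contains something like `&` or `£`, which need escaping to survive a trip through a URL. Here's a slightly more defensive sketch using the standard library's `urllib.parse.quote_plus`; `build_tweet` is a hypothetical helper, and the `q` parameter is simply the one used in the script above.

```python
from urllib.parse import quote_plus

NEW_PETITION_URL = 'https://petition.parliament.uk/petitions/new?q='


def build_tweet(petition):
    """Return the tweet text, or None if there was no petition to tweet."""
    if not petition:
        return None
    # quote_plus turns spaces into '+' and percent-encodes other unsafe characters
    url = NEW_PETITION_URL + quote_plus(petition)
    return petition + "\n\n" + "Make this petition real! " + url
```

With this, a title like "Ban tea & scones" ends up as `q=Ban+tea+%26+scones` rather than being cut off at the ampersand.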

Here’s the Twitter bot, which posts at 12 PM GMT every day. Join My Cause!