Rise up, citizen software engineers!
How I created the UK Election Petition Tracker in a few hours.
Saturday 23 Nov, 2024. Somewhere in Leeds. While Storm Bert raged outside, inside I was raging about how the new “Labour Government” in the UK was doing more lying, gaslighting, and collective harm whilst throwing £billions at other countries. Millions of people in the UK are upset, for many and varied reasons, and up popped an outlet: a Parliament petition for a new General Election.
When I clicked on the link, it had 42k signatures. Then later, it had 52k. “Hey up,” I said in my Yorkshire accent, “there might be summat to this!”
I quickly got bored of clicking refresh on the petition page, so I wrote a Python script on my laptop to do it for me. But rather than scratch my fast-balding head and try to remember how (I have done web scraping before), I decided to ask my friend, colleague, co-worker, and co-pilot: ClaudeAI (by Anthropic).
- What elements to scrape on parliament.gov.uk?
- Ask ClaudeAI how to write a Python scraper
- Ask ClaudeAI to help write a Streamlit app
- Create a Github repo
- Create a Github access token for the app
- Write the Python code for the Streamlit app
- Run the app locally on my laptop
- Deploy the UK Petition Tracker to Streamlit Community Cloud
- Share the UK Petition Tracker online
What elements to scrape on parliament.gov.uk?
First of all — because I know what it’s going to need — I went to the Parliament UK website, opened the Chrome Developer Tools Inspector, and found the element I wanted to track: “span.count”…
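If you're new to this: the signature total lives in a span element with the class “count”, so the CSS selector span.count is all the scraper needs. A quick sanity check in Python (the HTML here is a simplified stand-in, not the real page markup):

from bs4 import BeautifulSoup

# Simplified stand-in for the petition page markup (illustrative only)
html = '<span class="count">52,000</span>'

soup = BeautifulSoup(html, 'html.parser')
print(soup.select_one('span.count').text)  # -> 52,000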
Ask ClaudeAI how to write a Python scraper
With this element identified (and with web scraping, this is as easy as it gets, because the parliament.gov.uk website is excellent and doesn’t hide this information behind JavaScript), I could then ask ClaudeAI:
The resulting code let me grab that span.count from the Parliament website and just report timestamp + count. I had to create a Python virtual environment and install a couple of packages with pip (requests and beautifulsoup4), and then I could run this:
import requests
from bs4 import BeautifulSoup
import time
from datetime import datetime
import os
from pathlib import Path

def get_petition_count(url):
    """Fetch the petition count from the given URL."""
    try:
        response = requests.get(url)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, 'html.parser')
        count_element = soup.select_one('span.count')
        if count_element:
            # Remove commas from the number and convert to integer
            count = int(count_element.text.replace(',', ''))
            return count
        else:
            raise ValueError("Count element not found on the page")
    except requests.RequestException as e:
        print(f"Error fetching the petition: {e}")
        return None
    except ValueError as e:
        print(f"Error parsing the count: {e}")
        return None

def log_count(count, filepath):
    """Log the count with timestamp to the specified file."""
    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    log_entry = f"{timestamp},{count}\n"

    # Create directory if it doesn't exist
    Path(filepath).parent.mkdir(parents=True, exist_ok=True)

    # Append to file
    with open(filepath, 'a') as f:
        f.write(log_entry)
    print(f"Logged count: {count} at {timestamp}")

def main():
    url = "https://petition.parliament.uk/petitions/700143"
    log_file = "petition_counts.csv"
    interval = 30 * 60  # 30 minutes in seconds

    # Create/check header in log file
    if not os.path.exists(log_file):
        with open(log_file, 'w') as f:
            f.write("timestamp,count\n")

    print(f"Starting petition count tracker. Logging to {log_file}")
    print(f"Checking every {interval/60} minutes...")

    while True:
        try:
            count = get_petition_count(url)
            if count is not None:
                log_count(count, log_file)
            time.sleep(interval)
        except KeyboardInterrupt:
            print("\nTracking stopped by user")
            break
        except Exception as e:
            print(f"Unexpected error: {e}")
            # Wait a minute before retrying in case of unexpected errors
            time.sleep(60)

if __name__ == "__main__":
    main()
And running that, the output is:
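Something like this (the counts and timestamps are illustrative, not real readings):

Starting petition count tracker. Logging to petition_counts.csv
Checking every 30.0 minutes...
Logged count: 52431 at 2024-11-23 14:00:12
Logged count: 53907 at 2024-11-23 14:30:12

And petition_counts.csv accumulates one row per reading:

timestamp,count
2024-11-23 14:00:12,52431
2024-11-23 14:30:12,53907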
Not very exciting. Not easy to share. All my fellow Brits signing up… if I only knew a way to get a better, more visual version of this onto the internet…
Enter Streamlit and Github:
- Streamlit for the public interface — charts, figures, hosted on the internet.
- Github for the public code — Python code, the collected petition data.
And Streamlit works “on top of” Github (well, more to the side, but you know what I mean if you’re a software architect type).
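If you’ve not met Streamlit before: you write a plain Python script, and Streamlit turns it into a web app. A minimal sketch, just to show the shape (this isn’t the tracker, and the numbers are made up):

import streamlit as st

# Run with: streamlit run hello.py
st.title("Hello, petition watchers")
st.metric("Signatures", "52,000")   # illustrative number
st.line_chart([42, 52, 67, 81])     # rendered as an interactive chart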
Ask ClaudeAI to help write a Streamlit app
So, back to ClaudeAI.
First I confused Replit with Streamlit — hey, as The Human League once sang, I’m Only Human — but eventually this is what I got from Streamlit.
This wouldn’t be as simple as running local code on my laptop because, in addition to Python, it needs Github and Streamlit… I will try to keep this concise! It took a few hours to get everything working.
I went through many iterations of the app, and I’m going to skip them (though, arguably, that’s where the value is) and jump to the end.
If you’re interested in the “How you got there” story, I can do that. Just ask.
Create a Github repo
For all of this, we need to venture into source code control with Git and the hosted Github.
I created a “remote repo” on Github called Viewyonder / UK General Election Tracker.
- I would sync my local code with this remote repository. (So I code locally using VScode on my laptop, and sync changes to Github).
- When I use Streamlit, it will run the code “from” that Github repository *AND* write the petition data to it (using a Python module, PyGithub; see the sketch below).
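That second bullet is the unusual bit: the app’s data store is just a CSV file committed to the repo. A minimal sketch of the round trip with PyGithub (the token and repo slug below are placeholders, not the real ones):

import base64
from github import Github

# Placeholders: use your own token and "owner/name" repo slug
repo = Github("ghp_yourtokenhere").get_repo("Viewyonder/uk-election-tracker")

# Read the CSV out of the repo (content arrives base64-encoded)...
file = repo.get_contents('petition_counts.csv')
data = base64.b64decode(file.content).decode()

# ...append a reading and commit it back as a new revision
new_data = data.rstrip() + "\n2024-11-23 14:00:12,52431"
repo.update_file('petition_counts.csv', "Update petition count", new_data, file.sha)

Every update is a Git commit, which is slower and more rate-limited than a real database, but it’s free and you get the full history for nothing.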
Create a Github access token for the app
For the Python app — whether it runs on my laptop (I can run Streamlit locally) or on the internet (on the Streamlit Community Cloud) — it needs to access the code and write the petition data.
For this I needed a Github token for the repo that contained the code.
That meant adding a new “Github App” and then creating a “classic token”. I used that token as a “secret” later in the app.
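Concretely, the token never appears in the code. The app reads it from Streamlit’s secrets store (a .streamlit/secrets.toml file locally, or the Secrets box when you deploy), and this one line from the tracker wires it up:

import streamlit as st
from github import Github

# The tracker expects two secrets:
#   GITHUB_TOKEN: the classic personal access token
#   GITHUB_REPO:  the "owner/name" slug of the repo holding the code and data
store = Github(st.secrets["GITHUB_TOKEN"]).get_repo(st.secrets["GITHUB_REPO"])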
Write the Python code for the Streamlit app
By “talking with Claude” and using my existing Python and Streamlit knowledge, we quickly got an app up (in less than an hour). The rest was revisions.
Here’s the Python code on Github for the UK Election Petition Tracker.
import streamlit as st
import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup
from datetime import datetime, timedelta
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.express as px
import os
import time
import csv
from io import StringIO
from pathlib import Path
import base64
from github import Github
from PIL import Image

# Page config
st.set_page_config(
    page_title="Petition Tracker",
    page_icon="📊",
    layout="wide"
)

# Add some content to dashboard
logo = Image.open('viewyonder eye logo lobster lobster 600x200.png')
st.image(logo)

st.write("A fun little Streamlit app to get the count of signatures every minute from the Parliament petition for a UK General Election.")
st.markdown("[🌐 Go sign the petition](https://petition.parliament.uk/petitions/700143)")
st.markdown("Write to your MP with [Write To Them](https://www.writetothem.com/)")

# Initialize GitHub connection
store = Github(st.secrets["GITHUB_TOKEN"]).get_repo(st.secrets["GITHUB_REPO"])
# Initialize session state
if 'last_fetch_time' not in st.session_state:
    st.session_state.last_fetch_time = datetime.min
if 'last_count' not in st.session_state:
    st.session_state.last_count = 0

def get_petition_count(url):
    """Fetch the petition count from the given URL."""
    time_since_last_fetch = datetime.now() - st.session_state.last_fetch_time
    if time_since_last_fetch.total_seconds() < 55:
        return None
    try:
        response = requests.get(url)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, 'html.parser')
        count_element = soup.select_one('span.count')
        st.session_state.last_fetch_time = datetime.now()
        if count_element:
            return int(count_element.text.replace(',', ''))
        return None
    except Exception as e:
        st.error(f"Error fetching count: {e}")
        return None

#def load_data():
#    """Load data from CSV file."""
#    if os.path.exists('petition_counts.csv'):
#        df = pd.read_csv('petition_counts.csv')
#        df['timestamp'] = pd.to_datetime(df['timestamp'])
#        return df
#    return pd.DataFrame(columns=['timestamp', 'count'])

def load_data():
    """Load data from GitHub CSV file."""
    try:
        # Get file content from GitHub
        file = store.get_contents('petition_counts.csv')
        content = base64.b64decode(file.content).decode()

        # Convert to DataFrame
        df = pd.read_csv(StringIO(content))
        df['timestamp'] = pd.to_datetime(df['timestamp'])
        return df
    except Exception:
        # If file doesn't exist or there's an error, return empty DataFrame
        return pd.DataFrame(columns=['timestamp', 'count'])

#def log_count(timestamp, count):
#    """Log count to CSV file."""
#    file_exists = os.path.exists('petition_counts.csv')
#
#    with open('petition_counts.csv', 'a', newline='') as f:
#        writer = csv.writer(f)
#        if not file_exists:
#            writer.writerow(['timestamp', 'count'])
#        writer.writerow([timestamp, count])

def log_count(timestamp, count):
    """Log count to GitHub CSV file."""
    try:
        # Try to get existing file
        try:
            file = store.get_contents('petition_counts.csv')
            # Decode existing content
            current_data = base64.b64decode(file.content).decode()
            # Add new row
            new_data = current_data.rstrip() + f"\n{timestamp},{count}"
            # Update file
            store.update_file(
                'petition_counts.csv',
                f"Update petition count: {count}",
                new_data,
                file.sha
            )
        except Exception:
            # File doesn't exist, create new with headers
            content = f"timestamp,count\n{timestamp},{count}"
            store.create_file(
                'petition_counts.csv',
                f"Initial petition count: {count}",
                content
            )
    except Exception as e:
        st.error(f"Error saving to GitHub: {e}")
def calculate_metrics(df):
    """Calculate metrics including rolling averages."""
    if len(df) == 0:
        return 0, 0, 0, pd.Series(), pd.Series()

    latest_count = df['count'].iloc[-1]

    # Calculate rolling averages
    df = df.set_index('timestamp')
    rolling_5min = df['count'].rolling('5min').mean()
    rolling_15min = df['count'].rolling('15min').mean()

    # Calculate rates using masks instead of last()
    current_time = df.index.max()
    hour_ago = current_time - timedelta(hours=1)
    five_min_ago = current_time - timedelta(minutes=5)

    # Create masks for time periods
    hour_mask = df.index >= hour_ago
    five_min_mask = df.index >= five_min_ago

    # Calculate hourly rate
    if hour_mask.sum() > 1:
        hour_start_count = df.loc[hour_mask, 'count'].iloc[0]
        hourly_rate = latest_count - hour_start_count
    else:
        hourly_rate = 0

    # Calculate minute rate
    if five_min_mask.sum() > 1:
        five_min_start_count = df.loc[five_min_mask, 'count'].iloc[0]
        minute_rate = (latest_count - five_min_start_count) / 5
    else:
        minute_rate = 0

    return latest_count, hourly_rate, minute_rate, rolling_5min, rolling_15min

def create_live_chart(df, rolling_5min, rolling_15min):
    """Create a live-updating chart with rolling averages."""
    fig = make_subplots(specs=[[{"secondary_y": True}]])

    # Raw data
    fig.add_trace(
        go.Scatter(x=df['timestamp'], y=df['count'],
                   name="Raw Count",
                   mode='lines',
                   line=dict(color='blue', width=1)),
        secondary_y=False
    )

    # 5-minute rolling average
    #fig.add_trace(
    #    go.Scatter(x=rolling_5min.index, y=rolling_5min,
    #               name="5-min Average",
    #               mode='lines',
    #               line=dict(color='red', width=2)),
    #    secondary_y=False
    #)

    # 15-minute rolling average
    #fig.add_trace(
    #    go.Scatter(x=rolling_15min.index, y=rolling_15min,
    #               name="15-min Average",
    #               mode='lines',
    #               line=dict(color='green', width=2)),
    #    secondary_y=False
    #)

    # Calculate and plot rate of change using masks
    df_rate = df.copy()
    df_rate['rate'] = df_rate['count'].diff() / ((df_rate['timestamp'] - df_rate['timestamp'].shift(1)).dt.total_seconds() / 60)

    fig.add_trace(
        go.Scatter(x=df_rate['timestamp'], y=df_rate['rate'],
                   name="Rate (signatures/min)",
                   mode='lines',
                   line=dict(color='orange', width=1)),
        secondary_y=True
    )

    fig.update_layout(
        title='Live Petition Count with Rolling Averages',
        xaxis_title="Time",
        yaxis_title="Signatures",
        yaxis2_title="Signatures per Minute",
        hovermode='x unified',
        height=600
    )

    return fig

def create_activity_heatmap(df):
    """Create a heatmap showing signature activity by hour and day."""
    if len(df) < 2:
        return None

    # Add day and hour columns
    df_heatmap = df.copy()
    df_heatmap['hour'] = df_heatmap['timestamp'].dt.hour
    df_heatmap['day'] = df_heatmap['timestamp'].dt.date

    # Calculate signatures per hour using diff
    df_heatmap['signatures'] = df_heatmap['count'].diff()

    # Create pivot table for heatmap
    pivot_table = df_heatmap.pivot_table(
        values='signatures',
        index='day',
        columns='hour',
        aggfunc='sum',
        fill_value=0
    )

    # Create heatmap
    fig = go.Figure(data=go.Heatmap(
        z=pivot_table.values,
        x=pivot_table.columns,
        y=pivot_table.index,
        colorscale='Viridis',
        hoverongaps=False,
        hovertemplate='Day: %{y}<br>Hour: %{x}<br>Signatures: %{z}<extra></extra>'
    ))

    fig.update_layout(
        title='Signature Activity Heatmap',
        xaxis_title='Hour of Day',
        yaxis_title='Date',
        height=400
    )

    return fig
def main():
    st.title("📊 Petition Tracker Dashboard")
    url = "https://petition.parliament.uk/petitions/700143"

    # Create columns for metrics
    col1, col2, col3, col4 = st.columns(4)

    # Load and update data
    df = load_data()
    current_count = get_petition_count(url)

    if current_count is not None and (len(df) == 0 or current_count != st.session_state.last_count):
        timestamp = datetime.now()
        log_count(timestamp, current_count)
        st.session_state.last_count = current_count
        df = load_data()

    # Calculate metrics
    latest_count, hourly_rate, minute_rate, rolling_5min, rolling_15min = calculate_metrics(df)

    # Display metrics
    with col1:
        st.metric("Current Count", f"{latest_count:,}")
    with col2:
        st.metric("Signatures per Hour", f"{hourly_rate:,}")
    with col3:
        st.metric("Signatures per Minute", f"{minute_rate:.1f}")
    with col4:
        if len(df) > 0:
            time_since_start = (datetime.now() - df['timestamp'].iloc[0])
            st.metric("Time Tracking",
                      f"{time_since_start.days}d {time_since_start.seconds//3600}h {(time_since_start.seconds//60)%60}m")

    # Create tabs for different visualizations
    tab1, tab2 = st.tabs(["Live Tracking", "Activity Heatmap"])

    with tab1:
        # Update live chart
        if not df.empty:
            fig = create_live_chart(df, rolling_5min, rolling_15min)
            st.plotly_chart(fig, use_container_width=True)

            # Show recent data
            st.subheader("Recent Readings")
            n_recent = 10

            # Create recent dataframe with rate calculation
            recent_df = df.copy().sort_values('timestamp', ascending=False).head(n_recent)

            # Calculate rate (signatures per minute)
            recent_df['rate'] = recent_df['count'].diff(-1).abs() / \
                (recent_df['timestamp'].diff(-1).dt.total_seconds() / 60)

            # Select and rename columns
            display_df = recent_df[['timestamp', 'count', 'rate']].copy()

            # Format the dataframe for display
            st.dataframe(
                display_df.style.format({
                    'timestamp': lambda x: x.strftime('%Y-%m-%d %H:%M:%S'),
                    'count': '{:,}',
                    'rate': '{:.1f}'
                }),
                use_container_width=True
            )

    with tab2:
        # Display activity heatmap
        if not df.empty:
            heatmap_fig = create_activity_heatmap(df)
            if heatmap_fig:
                st.plotly_chart(heatmap_fig, use_container_width=True)

            # Add summary statistics
            st.subheader("Activity Summary")
            df_summary = df.copy()
            df_summary['hour'] = df_summary['timestamp'].dt.hour
            df_summary['signatures'] = df_summary['count'].diff()

            hour_sums = df_summary.groupby('hour')['signatures'].sum()
            peak_hour = hour_sums.idxmax()
            peak_signatures = hour_sums.max()

            col1, col2 = st.columns(2)
            with col1:
                st.metric("Peak Activity Hour", f"{peak_hour:02d}:00")
            with col2:
                st.metric("Peak Hour Signatures", f"{int(peak_signatures):,}")

    # Auto-refresh
    time.sleep(1)
    st.rerun()

if __name__ == "__main__":
    main()
Run the app locally on my laptop
One of the many lovely things about Streamlit is that you can run it locally on your laptop… then Deploy it to the Streamlit Community Cloud.
Running locally (with “streamlit run app.py”, swapping in your own filename) starts a web server on a local port and points your browser at it… then you can click Deploy, do all the Streamlit setup, and your app is live…
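In the terminal, the output looks roughly like this (the port defaults to 8501; the addresses will be your own):

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://192.168.1.23:8501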
And in a browser:
Deploy the UK Petition Tracker to Streamlit Community Cloud
I wanted to share the petition tracker with the world, and I can’t do that with an app running on my laptop (well, I can, but that’s way too technical). So I opted to deploy the locally running app to Streamlit Community Cloud — it’s free!
Because my code was already working, in Python, on Github, and built for Streamlit locally… this is *dead easy*.
In the local app, in the browser, click Deploy… and make sure to put your secrets in (Github token, repo, and CSV filename in Github). Et voilà:
Share the UK Petition Tracker online
I’m mostly an X user, so I’ll give some examples from there. I posted my progress on X, tagged people, and replied to others. Most people ignored me, some loved it, and some negged me about the petition being a waste of time.
I appreciate those who supported my private, free endeavour!
C’est ça.