Best Actress Winners

For days 4 and 5 of dashboard week, we were challenged to web scrape data from Letterboxd and turn it into an infographic dashboard. Since this was a two-day project, I spent most of the first day gathering my data in Alteryx and planning my story, and left the second day for building the dashboard.

Collecting the data

First, I looked at the Letterboxd website to decide on a topic to explore in my dashboard. I chose Oscar winners for Best Actress, with the aim of gathering data on each film and the actress who won for it.

To do this, I created an Alteryx flow that downloaded the data from each page in this list. I then used RegEx to parse out the parts of the page code that I needed, in particular the web link for each individual film page.
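For readers who want to see the RegEx step outside Alteryx, here is a minimal Python sketch of the same idea. The HTML fragment, the film slugs and the function name are made up for illustration, and the real Letterboxd markup may well differ, so treat the pattern as an assumption rather than the expression used in my flow.

```python
import re

# Hypothetical fragment of a list page. The slugs and markup are invented
# for illustration; the real page may be structured differently.
SAMPLE_HTML = """
<ul>
  <li><a href="/film/example-film-one/">Example Film One</a></li>
  <li><a href="/film/example-film-two/">Example Film Two</a></li>
  <li><a href="/film/example-film-one/">Example Film One (repeat)</a></li>
  <li><a href="/actor/example-person/">Not a film link</a></li>
</ul>
"""

# Capture relative links of the form /film/<slug>/ and nothing else.
FILM_LINK = re.compile(r'href="(/film/[^"/]+/)"')


def extract_film_links(html, base="https://letterboxd.com"):
    """Return absolute film-page URLs in page order, without duplicates."""
    seen = set()
    links = []
    for path in FILM_LINK.findall(html):
        if path not in seen:
            seen.add(path)
            links.append(base + path)
    return links


film_links = extract_film_links(SAMPLE_HTML)
```

Removing duplicates at this stage keeps the second download step from requesting the same film page twice.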

Using these links, a second Download tool in the flow scraped each film page, which included a section of JSON code containing key data fields for the film. Once I had isolated the JSON code, I output it to a file, so that I could avoid having to rerun the Download tools too many times.
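The isolate-then-save step can be sketched in Python as follows. This assumes the JSON sits cleanly inside a script tag of type application/ld+json, which is my assumption for the example and not something taken from the flow. The sample page, field names and function names are all hypothetical.

```python
import json
import re
from pathlib import Path

# Assumed location of the JSON block; the real page may wrap it differently.
SCRIPT_JSON = re.compile(
    r'<script type="application/ld\+json">(.*?)</script>', re.DOTALL
)

# Hypothetical film page with invented field names.
SAMPLE_PAGE = (
    "<html><head>"
    '<script type="application/ld+json">'
    '{"name": "Example Film", "actors": [{"name": "Actor A"}]}'
    "</script></head></html>"
)


def extract_json_block(html):
    """Return the parsed JSON block from a page, or None if it is missing."""
    match = SCRIPT_JSON.search(html)
    if match is None:
        return None
    return json.loads(match.group(1))


def load_or_build_cache(cache_path, build):
    """Reuse the saved file if it exists; otherwise call build() and save it."""
    path = Path(cache_path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    data = build()
    path.write_text(json.dumps(data), encoding="utf-8")
    return data
```

Here `build` stands in for whatever does the slow downloading, so the expensive step only runs when no saved file is found.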

Now that I had access to the data I needed, I planned a schema for the tables I wanted to create, so that unnesting the JSON code wouldn’t multiply the number of rows more than necessary.
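To show why the schema matters, here is a small Python sketch comparing one flat table with separate tables. The film record and its field names (genres, actors) are hypothetical and chosen only to illustrate the row counts.

```python
from itertools import product

# Hypothetical film record with two nested lists.
film = {
    "film_id": 1,
    "name": "Example Film",
    "genres": ["Drama", "Romance"],
    "actors": ["Actor A", "Actor B", "Actor C"],
}


def flat_rows(film):
    """One wide table: every genre is paired with every actor."""
    return [
        {"film_id": film["film_id"], "genre": genre, "actor": actor}
        for genre, actor in product(film["genres"], film["actors"])
    ]


def split_tables(film):
    """Separate tables that share film_id as the key."""
    return {
        "films": [{"film_id": film["film_id"], "name": film["name"]}],
        "genres": [
            {"film_id": film["film_id"], "genre": g} for g in film["genres"]
        ],
        "actors": [
            {"film_id": film["film_id"], "actor": a} for a in film["actors"]
        ],
    }


flat = flat_rows(film)        # 2 genres x 3 actors = 6 rows
tables = split_tables(film)   # 1 + 2 + 3 rows across three tables
```

With only one film the difference is small, but in the flat version each extra nested list multiplies the row count, while the split version only adds to it.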

With my schema plan and many JSON Parse tools in Alteryx, I was able to construct these tables as I needed using the following flow, outputting each section to a sheet in an Excel document, which I could open in Tableau:

JSON Parse Hint: If the code is wrapped in ‘{ }’, use Unnest. If the code is wrapped in ‘[ ]’, use Flatten.
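The hint above maps onto a simple rule: an object ({ }) becomes extra columns on the same row, while an array ([ ]) becomes extra rows. The Python sketch below is only an analogue of that rule, not a description of how the Alteryx tool works internally, and the record and field names are hypothetical.

```python
def unnest(record, field):
    """Object ({ }): lift the nested keys into new columns on the same row."""
    row = {key: value for key, value in record.items() if key != field}
    for key, value in record[field].items():
        row[f"{field}.{key}"] = value
    return row


def flatten(record, field):
    """Array ([ ]): emit one row per element, repeating the other columns."""
    rest = {key: value for key, value in record.items() if key != field}
    return [{**rest, field: item} for item in record[field]]


# Hypothetical film record: "rating" is an object, "actors" is an array.
record = {
    "name": "Example Film",
    "rating": {"value": 4.1, "count": 1000},
    "actors": [{"name": "Actor A"}, {"name": "Actor B"}],
}

step1 = unnest(record, "rating")            # still one row, two new columns
step2 = flatten(step1, "actors")            # two rows, one per actor
rows = [unnest(row, "actors") for row in step2]  # each actor object to columns
```

In this example the array of actor objects needs both steps: flatten the array into rows first, then unnest each object into columns.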

At this point I realised that I had a full list of all actors in each film, with nothing to identify which of them was actually the Oscar winner. For my analysis, I needed to isolate the winners, and as much as I tried to web scrape that data, it seemed impossible. Given the time constraints for this project, I eventually turned to Gemini, asking it to create a dataset of all Best Actress winners, including the following fields: date of win, actress name, film name, date of birth.

Returning to Alteryx, I combined this with the actors and main movie sheets I had already created to find the actor ID and movie ID. These were needed so that I could create a relationship between tables in Tableau. Finally, I cleaned up the fields and outputted this to a new document.
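The ID lookup can be sketched in Python as a join on names. All field names and rows below are placeholders, and the name normalisation and unmatched list are my own additions for the example, not steps from the Alteryx flow. They are worth having because a generated dataset may spell a name slightly differently from the scraped one.

```python
def norm(text):
    """Lower-case and collapse whitespace so near-identical names match."""
    return " ".join(text.casefold().split())


def attach_ids(winners, actors, films):
    """Add actor_id and film_id to each winner row; collect rows that fail."""
    # Note: films that share a title would collide in this simple lookup.
    actor_ids = {norm(a["name"]): a["actor_id"] for a in actors}
    film_ids = {norm(f["name"]): f["film_id"] for f in films}
    matched, unmatched = [], []
    for winner in winners:
        actor_id = actor_ids.get(norm(winner["actress"]))
        film_id = film_ids.get(norm(winner["film"]))
        if actor_id is None or film_id is None:
            unmatched.append(winner)
        else:
            matched.append({**winner, "actor_id": actor_id, "film_id": film_id})
    return matched, unmatched


# Placeholder rows, not real data.
actors = [{"actor_id": 10, "name": "Actress A"}, {"actor_id": 11, "name": "Actor B"}]
films = [{"film_id": 1, "name": "Example Film"}]
winners = [
    {"actress": "actress  a", "film": "EXAMPLE FILM"},
    {"actress": "Actress Z", "film": "Example Film"},
]

matched, unmatched = attach_ids(winners, actors, films)
```

Anything that lands in `unmatched` can then be checked by hand instead of silently dropping out of the relationship in Tableau.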

Building the dashboard

Having collected my data and created the relationships between tables in Tableau, I now needed to plan a story and build my dashboard.

Since this was an infographic-style dashboard, I first wanted to find a more specific focus. I began by writing down some questions and found that I was particularly interested in the actresses who had won this award multiple times. I decided that this would be my focus, and began sketching a plan for my dashboard, thinking of chart designs, and putting together a colour palette.

I decided I wanted this to be a more long-form dashboard, using deep blues, pinks and purples to capture the feel of film colour grading. I also wanted a vertical timeline running down the side as a decorative feature to tie the different sections together.

Once I had established my plan, I built the dashboard in Tableau, making any necessary tweaks along the way, resulting in the following:

Click here to see the final dashboard.

Author:
Bethany Haysom
Powered by The Information Lab