Monthly Archives: May 2011

Music charts and playlist data for BBC Radio

This project took place late last year, but is worth writing up because it shows some of the problems that the BBC has when it tries to build user experiences based on data, at the scale that it needs.

This project provides the BBC with a data service for music charts and playlists. It turns out that, even in an age where music is competing with many other media, “what is top of the charts?” is still a question that can get 2 million visitors a week excited enough to visit the Radio 1 chart site.

This project provides the BBC with a technical solution that can cope with 2 million visitors, and that is easier and quicker to update live as the charts change.

The database schema we agreed was as follows:

Figure: An overview of the chart and playlist schema
Notice the ‘item’ entity. This is the thing (e.g. a particular top 40 single) being talked about.

One of the hard things about working at the BBC is the fact that it publishes a vast amount of content. For music, it turns out there isn’t a great source of super accurate music data. The official chart company publishes the charts, and the BBC licenses that information for use on its radio programmes, but the chart company data doesn’t use identifiers; it is just a text file.

What this means is that there is no super accurate way of curating charts over time. If Rihanna changes her name to Squiggle (hey, Prince did it) we know it’s her, because we’re told by a huge marketing machine that it’s the same person. But a computer can’t make that same leap. So any data associated with Rihanna would not be associated with Squiggle. Similarly, if a particular mix of a single becomes popular, can we associate that mix with the original song that it is a remix of?

When you’re trying to maintain data integrity for the BBC, so that you can tell stories about it and show the audience interesting journeys, the fact that we have many playlists and many charts each week becomes a real maintenance problem. Who is going to polish that data, curate it and maintain it? Is it something that represents good use of the licence fee?

Luckily, for music artists we have a great source of identifiers. The BBC uses the MusicBrainz data set. By matching artist names to an identifier in MusicBrainz, we can associate new data with the same music artist, even if they change their name. The data associated with that artist is therefore much easier to maintain, because their identifier won’t change.
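To make the idea concrete, here is a minimal sketch in Python of keying chart data on a stable artist identifier rather than on a name. Everything in it is an assumption for illustration: the `ArtistIndex` class, the placeholder identifier "artist-0001" (not a real MusicBrainz ID) and the sample chart positions are hypothetical, and this is not how the BBC service is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Artist:
    # Hypothetical stable identifier; a real system would hold a MusicBrainz ID here.
    identifier: str
    names: set = field(default_factory=set)


class ArtistIndex:
    """Maps every known name of an artist to one stable identifier."""

    def __init__(self):
        self._by_name = {}
        self._by_id = {}

    def register(self, identifier, name):
        # Reuse the existing artist record if the identifier is already known.
        artist = self._by_id.setdefault(identifier, Artist(identifier))
        artist.names.add(name)
        self._by_name[name.casefold()] = artist
        return artist

    def resolve(self, name):
        # Returns None for names that have not been matched to an identifier.
        return self._by_name.get(name.casefold())


index = ArtistIndex()
index.register("artist-0001", "Rihanna")
index.register("artist-0001", "Squiggle")  # the name changes, the identifier does not

# Invented chart rows: (artist name as printed in the text file, position).
chart_entries = [("Rihanna", 1), ("Squiggle", 3)]

positions_by_artist = {}
for name, position in chart_entries:
    artist = index.resolve(name)
    key = artist.identifier if artist else None
    positions_by_artist.setdefault(key, []).append(position)
```

Because both names resolve to the same identifier, the two chart rows end up grouped under one artist, which is the property the text file on its own cannot give us.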

Unfortunately, for the item data, there is no equally good source for track names.

For now, the BBC is trying to maintain that data itself, and accepting that some of the data may be slightly broken over time, and may need tidying in the future.

In the next few weeks, MusicBrainz will be releasing their ‘next generation schema’, which the BBC is supporting. As with the artist identifiers, MusicBrainz will then be able to give us a great set of identifiers for each item.

Exciting news for Information Architects like me, who want to be able to tell stories using data over a long period of time.

The Archers 60th Anniversary – Social media simulcasting experiment

It’s taken a while to write this up, but on January 2nd of this year, we conducted a little social media experiment with The Archers.

The audience were invited to tweet their reactions during the 60th Anniversary broadcast of the show. We built a web experience for users to enjoy. This experience summarised live reactions to the unfolding storyline in The Archers.

We also built a ‘summary experience’ for people to enjoy after the episode.

Figure: Screen grab of The Archers tweetalong, 2nd January 2011
The hard work was largely done by the Radio 4 interactive team, the BBC’s social media team, and a company called Metabroadcast (who are rather excellent at their engineering and at getting stuff done). My role, as usual, was to lead the information architecture for the BBC.

This project comprised an audience experience and a content management system, including integration with the Twitter ‘firehose’. The entire development took just under 5 weeks over Christmas. A great example of us doing agile development.

There are several interesting parts to the information architecture:

Despite a careful social media campaign, the hashtag that got used most by the audience was not one prescribed by the BBC, but one prescribed by the hardcore fans on The Archers message boards. They picked up on the marketing that had been done, suggesting that events in the 60th episode would be ‘Shaking Ambridge To The Core’. Their chosen hashtag? #sattc.

Naturally there were several other hashtags in use, each of which had to be monitored by the content management system. For example: #archers, #thearchers and #archers60.

The content management system we built can be reused for many other dramatic events. For example, Metabroadcast used it for covering dramatic news in Egypt. It would also work on The X Factor, The Apprentice or any other broadcast where there is a controlled vocabulary of terms (e.g. people’s names).

Different users refer to the people involved in the drama in different ways. For this reason there is a set of synonyms for each term. For example, in The Archers, fans refer to Helen as ‘The Ice Queen’. It’s important to capture this reference to Helen and attribute it correctly.
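As a rough illustration of the hashtag monitoring and the synonym handling, here is a minimal Python sketch. The hashtags and the ‘Ice Queen’ synonym come from this post; the function names, the data structures and the simple whole-phrase matching are assumptions made for the example, not a description of the system that was built.

```python
import re

# Hashtags mentioned in this post, lower-cased for comparison.
HASHTAGS = {"#sattc", "#archers", "#thearchers", "#archers60"}

# Controlled vocabulary: each synonym maps to one canonical term.
SYNONYMS = {
    "helen": "Helen",
    "the ice queen": "Helen",
}


def is_tracked(tweet):
    """True if the tweet carries at least one monitored hashtag."""
    tags = {tag.lower() for tag in re.findall(r"#\w+", tweet)}
    return bool(tags & HASHTAGS)


def mentioned_terms(tweet):
    """Return the canonical terms a tweet refers to, via any synonym."""
    text = tweet.lower()
    found = set()
    for phrase, canonical in SYNONYMS.items():
        # Word boundaries stop 'helen' from matching inside a longer word.
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            found.add(canonical)
    return found
```

With this, a tweet such as "The Ice Queen strikes again #sattc" is both picked up by the hashtag filter and attributed to Helen, even though her name never appears in it.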

The word cloud needs to reflect audience sentiment, but also block rude terms. So, our moderators had full control (including revoking approved terms, in case they made a mistake) and could look at the unmoderated list of terms, choosing which are appropriate to be published.
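The moderation flow described above can be sketched in a few lines of Python. The `WordCloudModerator` class and its method names are hypothetical, invented for this example; the sketch only shows the approve, revoke and publish steps, under the assumption that nothing is published until a moderator approves it.

```python
from collections import Counter


class WordCloudModerator:
    """Counts every term, but only publishes terms a moderator has approved."""

    def __init__(self):
        self.counts = Counter()
        self.approved = set()

    def record(self, term):
        # Every incoming term is counted, approved or not.
        self.counts[term.lower()] += 1

    def approve(self, term):
        self.approved.add(term.lower())

    def revoke(self, term):
        # Undo an approval made by mistake; the counts are kept.
        self.approved.discard(term.lower())

    def unmoderated(self):
        """Terms seen so far that are not currently approved."""
        return sorted(t for t in self.counts if t not in self.approved)

    def published(self):
        """Approved terms with their counts, ready for the word cloud."""
        return {t: n for t, n in self.counts.items() if t in self.approved}
```

Keeping the counts separate from the approved set means that revoking a term takes it out of the word cloud immediately without losing any data, so it can be approved again later.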

Similarly our moderators have full control over ‘Highlighted tweets’ and can also select a host account to be highlighted in the experience (in this case @BBCthearchers with a title of ‘Plot’ – see the above screengrab).

One of the interesting results of this work was the idea that when you’re moderating in real time, you can actually moderate too quickly for the audience! The highlighted tweets were, in my opinion, updating too quickly to be read by users – a problem quite specific to real-time experiences.
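One possible way to address this, sketched here in Python, is to hold each highlighted tweet on screen for a minimum time before showing the next one. This is not something we built; the `HighlightQueue` class and the choice of minimum display time are assumptions for illustration only.

```python
from collections import deque


class HighlightQueue:
    """Holds each highlighted tweet on screen for a minimum time."""

    def __init__(self, min_display_seconds):
        self.min_display_seconds = min_display_seconds
        self._pending = deque()
        self._current = None
        self._shown_at = None

    def highlight(self, tweet):
        # Moderators can highlight as fast as they like; tweets simply queue up.
        self._pending.append(tweet)

    def current(self, now):
        """Return the tweet to display at time `now` (in seconds)."""
        due = (
            self._shown_at is None
            or now - self._shown_at >= self.min_display_seconds
        )
        if self._pending and due:
            self._current = self._pending.popleft()
            self._shown_at = now
        return self._current
```

The trade-off is that during a busy moment the display lags behind the moderators, so a real version would probably also need a rule for dropping stale tweets.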

The experience seems to work best if it starts slightly before broadcast, and ends after it. That way you capture the lifecycle of the episode – from anticipation to the broadcast, to the aftermath.

I really enjoy the sense of connection that these experiences can bring. It feels a little similar to watching a film or theatre in an audience – the sense of joint participation and connection with the unfolding drama. It’s one of the compelling things about social media in general.

There are many things I would improve about the experience, but given the time and effort that we had available, I’m very proud of this work.

All credit to the audience and social media team for the fact that The Archers trended worldwide for an hour on January 2nd 2011. Not bad for a drama that’s been running 60 years :-).