Data journalism, robots … and predictions on where we go from here

 

No.09

At Google’s latest Digital News Initiative conference, held in Amsterdam last week, there were plenty of ideas being discussed around what the future of news looked like.

The DNI involves Google investing millions of pounds in projects put forward by media organisations large and small from across Europe, which could help shape the future of the media and support the development of journalism in the years to come.

Britain’s Press Association was one of the biggest winners this time, securing over 700,000 Euros to fund a new news service which will generate 30,000 local news stories a month sourced from data … and written by ‘robots’.

A team of five journalists will spot stories in data sets and then use artificial intelligence to create potentially hundreds of versions for different locations – hence the notion of robots.
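
To make the idea concrete: at its simplest, this kind of localisation takes one national dataset and one carefully written template, and fills in the blanks for each area. The sketch below is only an illustration of the principle, not PA’s actual system – the field names, file name and wording are invented.

```python
# Minimal sketch: one national dataset, one template, many localised versions.
# The CSV columns (area, count, period, pct_change) are hypothetical.
import csv

TEMPLATE = ("{area} recorded {count} incidents in {period}, "
            "a {direction} of {change:.0f}% on the previous year.")

def localise(row):
    """Turn one row of the dataset into a line of local copy."""
    change = float(row["pct_change"])
    return TEMPLATE.format(
        area=row["area"],
        count=row["count"],
        period=row["period"],
        direction="rise" if change >= 0 else "fall",
        change=abs(change),
    )

with open("national_dataset.csv", newline="") as f:  # hypothetical file
    for row in csv.DictReader(f):
        print(localise(row))
```

One template and a few hundred rows quickly becomes a few hundred ‘stories’ – which is why scale is the easy part.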

Journalism powered by AI (artificial intelligence) has been bubbling for a number of years. In America, Chicago Tribune publisher Tronc plans to use AI to auto-generate up to 2,000 videos a day to support stories; news agency AP has increased its volume of earnings reports from business announcements 10-fold using AI (with the firm saying there are fewer errors than when humans wrote them); and the same company is now using AI to write minor league baseball reports.

At the Washington Post, ‘robots’ are deployed to write up the results of some elections, and also for sport. At the LA Times, a bot automatically sends out alerts whenever an earthquake is recorded – and it inadvertently alarmed people with an earthquake prediction for 2025 after a bug entered the system which powers the data the LA Times relies on.

So are we at a tipping point where technology replaces reporters in newsrooms up and down the UK? I don’t think so – especially if we embrace its potential. Here’s why:


Robots have already had a huge impact on news

It would be wrong to see projects such as the PA one (known as RADAR – Reporters and Data and Robots) as the introduction of robots into local journalism. It might be the scaling up of the creation of articles using artificial intelligence, but local journalism is already being hugely influenced by robots through search engines and social media.

Google determines what people see when they search using algorithms which a whole industry has sought to game for years, resulting in a constant war of attrition as Google tries to ensure people still get quality results for their searches.

Likewise, Facebook’s algorithm is studied forensically by local newsrooms across the UK as journalists seek to get their work in front of as many people as possible. Neither Facebook nor Google has an army of humans curating every news feed or search in real time. They’ve effectively robotised the old role of the newsagent, and that has brought pros and cons for local newsrooms.

Which robots will have the most influence?

As Google and Facebook have sought to make their services – which can account for north of 60% of referrals to news websites – ever more relevant to individuals, they may have created a world in which robot journalism has to battle with the robot newsagent.

It’s widely accepted that Google, for example, penalises sites which appear to carry content duplicated from other news sites. For regional news publications with a heritage of taking national news from agencies like PA as part of their print content mix, this poses a challenge: what’s the point in publishing content which won’t be seen?

On Facebook, the content which is most widely shared and reacted to (likes, shares, comments) tends to get the greatest traction. Increasingly, this is content which comes with an ‘emotional sell’ to draw a reaction. Both Google and Facebook are challenging publishers, through their robots, to create compelling content. AI journalism can play a part here – but it’s a bedrock that journalists who know their readers have to build on if we are to be successful.

The oversupply of data

This brings into sharp focus the question of what the future is for news agencies when near ‘page-ready’ content has limited value online. In a social media world, newsrooms have eyes and ears everywhere thanks to tools like Dataminr and CrowdTangle. Against this backdrop, PA, under its smart editor-in-chief Pete Clifton, is making the most of the news agency’s role as an information source through its RADAR project.

In recent years, we have been deluged by so-called open data. There are probably millions of datasets on public sector websites waiting to be explored, containing fascinating stories which deserve to be told. The only way journalism is realistically going to be able to cover many of them is by making the most of the technological tools at our disposal.

Could RADAR scour the now free Companies House data and report on every company with income of over £1m? Probably. Property prices at a local level every month via Land Registry? I imagine so. Automated stories on crime data at council level from police.uk? You bet.
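
As a rough illustration of that last example, here is a short sketch using the public data.police.uk street-level crime API, which returns crimes near a given point for a given month. The coordinates, the wording and the jump straight from counts to copy are my own, and aggregating to council level would need boundary data on top of this.

```python
# Sketch: turn one month of street-level crime data from data.police.uk
# into a line of copy. The location and month below are illustrative.
from collections import Counter
import requests

def crime_summary(lat, lng, month):
    """Fetch crimes near a point for a month (YYYY-MM) and summarise them."""
    url = "https://data.police.uk/api/crimes-street/all-crime"
    crimes = requests.get(url, params={"lat": lat, "lng": lng, "date": month}).json()
    if not crimes:
        return f"No crimes were recorded in this area in {month}."
    category, n = Counter(c["category"] for c in crimes).most_common(1)[0]
    return (f"{len(crimes)} crimes were recorded in this area in {month}, "
            f"with {category.replace('-', ' ')} the most common ({n} reports).")

print(crime_summary(53.48, -2.24, "2017-05"))  # central Manchester, example month
```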

The problem with making the most of open data is that it has been pushed out in an inconsistent way, and often isn’t publicised. This is particularly so at council level, where for every Bristol Council truly embracing the spirit of open data, you have four or five who still only share data (sometimes only via PDF) under sufferance.

RADAR has the potential to redefine the role of the news agency approach in local journalism, making it an invaluable ‘have you seen?’ source for newsdesks.

Content v Context

It’s not the first time the DNI has funded a data journalism project. In the previous round, the Bureau of Investigative Journalism secured funding for Bureau Local, which seeks to investigate large datasets and produce locally relevant content by building up a network of journalists who can work with the Bureau and learn data journalism skills at the same time.

The PA project delivers scale through volume of content, while the Bureau Local delivers context – and that’s a crucial difference. It’s possible to push out hundreds of stories using story templates and AI-powered technology as PA will prove, but how do you add context?

It’s a challenge embraced by the company I work for via our Data Unit, which shares the ‘skeleton’ of dozens of news stories a week for local newsrooms (along with data visualisations, widgets and exclusive investigations using FOI data). Data gives us the content, and the sense of a story, but only local knowledge can provide context.

Some of the areas covered by newsrooms the Data Unit works with tend to sit at the bottom of almost every government data-driven league table – and readers don’t want to be told that every day. On other occasions, what might seem like a remarkable statistical piece of news is put in a different light when local context is applied: of course that rail route has had the most delays … they’ve been clearing a landslide for weeks.

Data journalism works best, in my view, when it’s treated as just another form of journalism – and plays by the same rules. Local knowledge is all important, and the art of a good story remains being able to make it relevant to local readers.

But to make something relevant to local readers, we first need to know about it – which, surely, is where RADAR will come in.

News v information


The digital revolution in news has resulted in news websites offering an increasing amount of information to readers, as well as news. Why? Because sometimes that is just what people want.

During the general election campaign last month, some of the most engaging articles on the titles I worked with were informational rather than ‘stories’ – ranging from ‘who should I vote for?’ through to ‘who are the candidates in my area and what do they stand for?’ And yes, ‘Can I use a pencil in the polling station?’ was up there too.

Real-time open data is becoming an informational source newsrooms should make the most of – the Manchester Evening News’ ‘Where can I park’ widget, built by the Trinity Mirror Data Unit, is one example. Traffic and travel is another obvious area.

I began in journalism at a time when people harked back to the early-90s phrase ‘news you can use’. If ever there was an example of that in a digital world, it’s Trinity Mirror’s ‘Real Schools Guide’, which applies journalistic and academic logic to provide an insight into all high schools based on no fewer than 25 sources of data.
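
The principle behind that kind of guide – boiling lots of indicators down into a single comparable score – can be sketched very simply. The indicator names, weights and scores below are invented for the example; the Real Schools Guide’s actual methodology is more involved.

```python
# Toy illustration: combine several indicators (already normalised to a
# 0-100 scale) into one weighted score and rank schools by it.
WEIGHTS = {"attainment": 0.4, "progress": 0.3, "attendance": 0.2, "outcomes": 0.1}

def score(school):
    """Weighted sum of a school's indicators."""
    return sum(WEIGHTS[k] * school[k] for k in WEIGHTS)

schools = [  # invented example data
    {"name": "Example High", "attainment": 62, "progress": 71, "attendance": 95, "outcomes": 58},
    {"name": "Sample Academy", "attainment": 70, "progress": 64, "attendance": 92, "outcomes": 66},
]

for s in sorted(schools, key=score, reverse=True):
    print(f"{s['name']}: {score(s):.1f}")
```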

Such projects show that readers often just want information. Open data gives them that. Getting it to readers is crucial if we want to remain relevant. Sometimes that’s via stories offering context, but increasingly often it’s just as valuable to share the data and make it searchable and understandable. 

The true value of local journalism

Not surprisingly, the pending arrival of robot journalists has prompted predictions of the demise of local reporters. If you see local journalism as merely an industrial process which fills boxes on pages and delivers a volume of stories online, then I can see why you’d draw that conclusion.

But local journalism’s value in a digital age isn’t about volume – it’s about the way it engages with people. That’s why it’s never been more important for every journalist to spend time talking to readers, listening to communities and allowing audience data around their work to influence their decisions.

The point at which a robot can replicate that intimate understanding of what matters locally, and readers prefer to engage with a well-programmed robot rather than a human being, will be the point at which we’ve all failed … and it is a moment we have no reason to expect will ever materialise, if we treat our relationship with readers with respect.

Used well, projects such as RADAR provide newsrooms with the chance to better understand what is going on locally … and become more relevant to local communities as a result.
