10 minute read time.
By Janet Greco | 9 March 2020
Accurate and granular TV metadata has long been a necessary ingredient for better content management, navigation and search in the TV industry.
The use of automated techniques for extracting new forms of enhanced metadata is increasingly being explored to help TV businesses maintain their competitive edge. Streamlining operations through better and more coherent data sets can lead to improved workflows and efficiency. TV archives can be monetised more effectively through algorithms that drive personalisation and help companies exploit the “long tail” of their content.
It’s become common to hear the term “Artificial Intelligence” (AI) in connection with media industry topics. It’s a buzzword that’s often heard on TV trade show floors and in many company announcements. It seems as if almost every TV technology vendor has incorporated “AI” into their software somehow.

What do we mean by “AI”?
But what are we really talking about when we say “AI”? In addition to AI, there are several other closely related topics that are good to know at least by name. These include machine learning, data science and deep learning.
Data science needs computer science and AI (but it also involves many other application domains). Computer science is a relatively broad field that includes AI alongside other subfields such as distributed computing, human-computer interaction and software engineering. Machine learning (ML) is usually considered to be the part of AI in which systems improve their performance at a given task with more experience or data, while deep learning is in turn a part of machine learning.
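The idea that a system improves with more experience or data can be made concrete with a tiny, purely illustrative sketch (all names and the synthetic data are invented for this example, not taken from any real system): a nearest-centroid classifier estimates one "average" per class, and the estimate gets better as more labelled samples arrive.

```python
import random

def train_centroids(samples):
    """Estimate one centroid (mean) per class from labelled 1-D samples."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, value):
    """Assign the class whose centroid lies nearest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def make_samples(n, rng):
    """Two synthetic classes: gaussians centred at -2 and +2."""
    return ([(rng.gauss(-2, 1), "A") for _ in range(n)] +
            [(rng.gauss(+2, 1), "B") for _ in range(n)])

rng = random.Random(42)
test_set = make_samples(200, rng)

for n_train in (3, 200):  # tiny vs. larger training set
    centroids = train_centroids(make_samples(n_train, rng))
    accuracy = sum(predict(centroids, v) == y for v, y in test_set) / len(test_set)
    print(f"{n_train:>3} samples/class -> accuracy {accuracy:.2f}")
```

With only a handful of samples the centroid estimates are noisy; with two hundred per class they settle close to the true means and accuracy approaches the best achievable on this toy data, which is the "more data, better performance" property the definition describes.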
All of these AI-driven technologies are at play when we speak about automatic metadata extraction (AME). What is the current state of the art of AME? Where are we headed with the application of these technologies? Will TV industry players and consumers be ultimately better off?

These were the questions posed to a diverse group of industry players, AI and metadata experts who came together to discuss AI and metadata at an event sponsored by the IET Media Technical Network and hosted by Janet Greco of Broadcast Projects on 21 Feb 2020 in London. This intimate event of 15 participants, conducted under the Chatham House Rule, featured two distinguished international speakers from the Fraunhofer Institute for Digital Media Technologies (IDMT) and the European Broadcasting Union (EBU), both of whom exempted themselves from the Rule and agreed to be quoted in this article.

AI is not Magic
Hanna Lukashevich, Head of Semantic Music Technologies at Fraunhofer IDMT, addressed the current state of the art and first of all aimed to demystify the topic. “AI is not magic”, she said, and “if AI is not working, there are reasons for that.” One typically forgets that AI needs humans: humans who have to feed the systems data, ideally clean data, and teach them to do the right task for a specific use case. It is a long and meticulous process.

The need for clean data and clear business use cases quickly became evident after the group viewed a video clip of wild turkeys crossing a road. The content analysis system did a great job of recognising everything in the scene except for one element: the turkeys. For humans it is obvious; the animal pops out as the main element of the scene. But the “dumb” machine had only been trained to recognise the elements that had the most business value (cars, houses, roads, etc.). There was no business case – as yet – that called for the automatic recognition of turkeys.
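The turkey anecdote comes down to a simple mechanism: an object-recognition model can only report labels that exist in its training vocabulary. A toy sketch (the label set here is invented for illustration, not any real model's vocabulary) makes the failure mode visible:

```python
# Hypothetical label vocabulary: a detector can only report classes it was
# trained on, no matter how prominent an object is in the scene.
TRAINED_LABELS = {"car", "house", "road", "traffic sign", "person"}

def recognised_elements(scene_elements):
    """Keep only the elements the (toy) model was trained to recognise."""
    return [e for e in scene_elements if e in TRAINED_LABELS]

scene = ["road", "car", "turkey", "turkey", "house"]
print(recognised_elements(scene))  # the turkeys silently disappear
```

The fix is not a cleverer algorithm but new training data and labels, which is exactly why a business case has to exist before turkeys are worth recognising.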

When AI doesn’t work as expected, fixing it is an ongoing process of further training the model. “AI has its best chance to succeed when it is developed with your specific use case in mind,” according to Hanna Lukashevich.

Our second international guest, Jean-Pierre Evain, Principal Project Manager of the EBU, has been involved with AI and automated metadata extraction for the past 15 years. An eminent figure who has been at the forefront of countless important industry innovations in digital media technology and metadata, including the development of the TV Anytime metadata specification, EBUCore and much more, he currently runs the EBU Metadata Developer Network (EBU MDN), which hosts its annual workshop every June.

Still a long road to “meta-truth”
Those in the industry who think that AI is all hype, and that no one is truly using it right now, can meet the experts pushing the boundaries of what is possible at the annual Metadata Developer Network meeting. The year 2020 marks the 10th anniversary of the EBU MDN, which has been growing steadily year on year, with three full days of demonstrations and workshops in Geneva and a growing number of participants. “It is the immense processing power available now that makes AI possible today,” said Jean-Pierre.
The EBU MDN showcases automatic metadata extraction technologies such as computer vision, speech to text and automatic translation, all of which are on course to bring transformative benefits to the TV business. Content analysis offers the automatic generation of keywords relating to scene elements (object recognition), together with sentiment analysis and natural language processing (NLP) techniques, and this is working well. But where will all these new forms of extracted metadata be stored? Into which systems should this additional metadata be deposited in order to realise the intended business benefits?
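To give a flavour of what automatic keyword generation involves at its very simplest, here is a minimal term-frequency sketch (the stopword list and sample transcript are invented for illustration; real NLP pipelines in this space use far richer models):

```python
import re
from collections import Counter

# Illustrative stopword list; production systems use full linguistic resources.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on", "for", "with"}

def extract_keywords(transcript, top_k=3):
    """Naive keyword extraction: the most frequent non-stopword terms."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [term for term, _ in counts.most_common(top_k)]

transcript = ("The turkeys crossed the road while cars waited. "
              "Turkeys on a road are rare, and the cars kept waiting.")
print(extract_keywords(transcript))
```

Even this crude approach surfaces plausible keywords from a speech-to-text transcript; the open question the group raised is where such generated terms should live once produced.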
Structured metadata is the cornerstone for all of this to work well. After many years, it is a recognised baseline and is beginning to be implemented now. But it has been a long time coming. The devil has always been in the detail, and the devil is still with us after all these years. It is long overdue to escalate this topic, was the general consensus of the group.

“Broadcasters who never took care of their metadata, meaning basically none of them, are doomed,” said one participant, with broad agreement across the room. Despite the meticulous creation of metadata standards over many years, there has been very little progress in harmonising metadata across enterprises. Siloed work practices still prevail, particularly at legacy operators. This means that huge challenges remain to merge, validate and arrive at a single “meta-truth” of data.
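What merging siloed records toward a “meta-truth” means in practice can be sketched in a few lines (the record shapes and field names below are hypothetical, invented purely to illustrate the reconciliation problem): agreeing or one-sided fields can be combined automatically, while disagreements must be flagged rather than silently overwritten.

```python
def merge_records(archive_rec, playout_rec):
    """Merge two siloed metadata records describing the same programme.

    Fields that agree, or exist on only one side, are kept; fields that
    disagree are flagged for human review instead of being overwritten.
    """
    merged, conflicts = {}, {}
    for field in archive_rec.keys() | playout_rec.keys():
        a, b = archive_rec.get(field), playout_rec.get(field)
        if a is not None and b is not None and a != b:
            conflicts[field] = (a, b)
        else:
            merged[field] = a if a is not None else b
    return merged, conflicts

# Hypothetical records from two departmental silos.
archive = {"id": "prog-001", "title": "Wild Turkeys", "duration_s": 1800}
playout = {"id": "prog-001", "title": "WILD TURKEYS", "genre": "nature"}
merged, conflicts = merge_records(archive, playout)
print(merged)
print(conflicts)
```

Even this toy case produces a conflict (two casings of the same title), which hints at why validation across silos, rather than the merge itself, is where the real effort lies.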

Recommendation Echo Chambers
Quality content recommendation algorithms can only achieve their true potential for honest personalisation with good baseline data that can be seamlessly merged with other data sets. Yet viewers are rarely aware of what is at play behind the scenes in terms of the data used to drive their recommendations, even though, when they sign up, they opt in to the heavily articulated terms of service for what they are buying into, including the underlying General Data Protection Regulation (GDPR).

Indeed, pay TV operators have increased five-fold the number of data points they hold on their audiences over the past three years. The targeting works well for ads but not for content recommendation, said one participant. Content recommendation is optimised for creating “echo chambers” and feedback loops that in the end don’t help the consumer at all. A “black swan” approach is what is needed instead, remarked another participant, to avoid the brute-force data analysis that sits behind both content recommendations and targeted advertising.
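One way out of the echo chamber, sketched here as a toy (the candidates, scores and genre penalty are all invented for illustration, not any operator's algorithm), is to re-rank recommendations so that each pick discounts further items from a genre already chosen, instead of simply taking the top-scored items:

```python
def rerank_with_diversity(candidates, k=3, penalty=0.5):
    """Greedy re-ranking: each pick discounts remaining items sharing a
    genre already chosen, nudging the list out of a single 'echo chamber'."""
    chosen, seen_genres = [], set()
    pool = list(candidates)
    while pool and len(chosen) < k:
        best = max(pool, key=lambda c: c["score"] *
                   (penalty if c["genre"] in seen_genres else 1.0))
        chosen.append(best["title"])
        seen_genres.add(best["genre"])
        pool.remove(best)
    return chosen

candidates = [
    {"title": "Crime Drama A", "genre": "crime",  "score": 0.95},
    {"title": "Crime Drama B", "genre": "crime",  "score": 0.90},
    {"title": "Crime Drama C", "genre": "crime",  "score": 0.88},
    {"title": "Nature Doc",    "genre": "nature", "score": 0.60},
]
print(rerank_with_diversity(candidates))
```

A pure score ranking would return three crime dramas; the penalised version surfaces the nature documentary in second place, which is the kind of deliberate break in the feedback loop the participants were calling for.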

“What is more interesting is the combination of two areas and the use of AI in both,” offered another participant. “First, AI to create and extract the metadata that can be used to enhance the user experience. Second, AI to understand the patterns of actual user experience (discovery and consumption – usage) and develop new ways to enhance and extract new metadata. Connect these two circles and let the machines discover whether turkeys crossing the street are more important to tag than the houses in the background, given the consumer behaviour, and let the metadata extractor dynamically adapt to the needs at any given time.

“In other words, if AI can tell me why Game of Thrones was so popular – what subjects, objects, feelings and storytelling elements, and which combination of them, made it a success – then AI can give “hints” to producers on the next success and truly recommend to consumers against not just flat profile tag lists but against complex, multi-layered profiles that audiences can actually relate to.”

Most broadcasters are not ready
So, will the application of AI bring positive changes to both consumers and TV companies? And is this the one buzzword that actually has staying power? AI as a technology concept is already having an impact, and it seems people want to believe in it. Continued progress seems inevitable, for better or worse.
Certainly, it is not always for the good of consumers: personalisation and recommendation as we know them are used rather to the advantage of some vendors and TV companies.
“We talked about the difficulties of enforcing GDPR. There is probably no possible regulation that will really work; it is all about ethics. Maybe AI might help more in giving access to archives, which has less commercial sensitivity and might be more in the scope of public service broadcasters? Maybe AI can also help against disinformation, or in the creation of new content and applications, to the benefit of both consumers and TV companies. There is probably more to come, but at what cost in terms of resources and competences?
“AI is only a means to an end. It is more about managing the data issued from AI and clearly most broadcasters are not ready. Many believe they don’t need to manage data as they think AI will save their day. A fatal error reflecting a complete lack of maturity”, said Jean-Pierre Evain. 
Structured and clean data always makes for a better result when applying these technologies. And indeed, AI will not be able to perform some sort of magic to sort out the mess of diverse, complex and overlapping data sets.
We have not heard the last of the “AI” buzzword. The topics of AI and metadata, sorting out internal metadata systems, data use, content recommendation and personalisation are all likely to be with us for a very long time.