GEMA News / 17 April 2024

Text and data mining: Reservation of use for works of the GEMA repertoire, e.g. by artificial intelligence (AI)

Works by GEMA members must not be used for commercial text and data mining (TDM). This also includes the training of generative AI models. GEMA has publicly declared the corresponding reservation of use.

Since the advent of ChatGPT at the latest, generative artificial intelligence (AI) has been in the public eye. At first, the focus was on AI that generates texts in no time at all; soon after, the authenticity of AI-generated images came under discussion. This shows that generative AI entails opportunities as well as risks. And it’s no different for music. But what does this mean for creators and music users?

On this page, you can find out:

  • Why works by GEMA members cannot simply be used to train AI applications,
  • Why licensees such as radio and television stations or online service providers also need to declare a reservation of rights of use if they make works of the GEMA repertoire accessible to the public.

Go to the reservation of the rights of use (PDF) 

Our task: Protecting the interests of our members

Generative AI can store, reproduce and redesign copyrighted works of all kinds. This does not mean that we can soon expect an AI-based Mozart or John Lennon. Nevertheless, it is understandable that creators are following the developments with apprehension. After all, adequate and fair payment and the very basis of their livelihood are at stake. These are the findings of a study commissioned by GEMA and our French sister organisation SACEM.

As a collective management organisation, we are accompanying this change. We are opening up to the potential for innovation. And, above all, we are safeguarding the interests and rights of the music creators we represent.

What is generative artificial intelligence (AI)?

It is helpful to remind ourselves how generative AI actually works. It is defined as a type of artificial intelligence that can generate various kinds of content. As mentioned before, these include texts, images, videos and sounds. Let’s use the example of text to illustrate the operating principle.

AI holds a lot of information...

AI systems such as ChatGPT are based on generative models, often Large Language Models (LLMs). Based on the data with which they were trained, they can generate new content. After training, an LLM holds statistical information about the underlying data, in this case natural language.

… and still doesn’t know anything

Based on this information, the model can complete texts; such models are known as text-to-text models. An LLM “knows” that the sentence “Water is ...” is more likely to end in “wet” than in “dry”. Thanks to the vast amount of training data, this rather limited core ability has enabled language models to solve more complex tasks. The knowledge applied here is, however, not knowledge in the actual sense but a set of probability statements. AI systems for music work in a similar way and require huge volumes of training data.
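The “Water is ...” example can be made concrete with a deliberately tiny sketch. It does not show how a real LLM is built; it only counts which word follows a prompt in a handful of invented sentences and turns those counts into probabilities. The corpus and the function name next_word_probabilities are assumptions made for illustration.

```python
from collections import Counter

# Toy "training data", invented for illustration only.
# Real language models are trained on vastly larger amounts of text.
CORPUS = [
    "water is wet",
    "water is wet",
    "water is cold",
    "water is wet",
    "sand is dry",
    "water is dry",
]


def next_word_probabilities(corpus, prompt):
    """Estimate P(next word | prompt) by counting continuations in the corpus."""
    prompt_words = prompt.lower().split()
    n = len(prompt_words)
    counts = Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        # Record the word that follows each occurrence of the prompt.
        for i in range(len(words) - n):
            if words[i:i + n] == prompt_words:
                counts[words[i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


probs = next_word_probabilities(CORPUS, "Water is")
for word, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{word}: {p:.2f}")
```

In this toy corpus, “wet” follows “water is” more often than “dry”, so it receives the higher probability. The program holds no understanding of water; it only reflects the statistics of the sentences it was given.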

Copyright: Rightsholders can declare a reservation of the right of use

AI models regularly use copyright-protected data for their training, which involves copying or reproducing existing works. The right to do so, however, lies exclusively with the creators. The only exception: commercial text and data mining (TDM). In TDM, large amounts of digital texts and data are analysed in order to extract information.

Whether generative AI can be classified as text and data mining, and whether it is even governed by the respective rules and regulations, has not been conclusively clarified. What is clear, however, is that such use can only be permissible if the rightsholders have not declared a reservation of use (a so-called opt-out).

GEMA has declared the reservation on their behalf

In the past, we already ensured that the right to declare the reservation of use on behalf of our members was assigned to us:

In this context, we would like to point out that GEMA publicly declared the reservation pursuant to Section 44b (3) UrhG (German Copyright Act) on its website, for uses of the works represented by GEMA, as soon as the regulation came into force. The reservation also covers uses outside of Germany in accordance with the legal systems in force abroad.

Important for licensees: You also need to declare your reservation of use

If you are a licensee, for example a broadcaster or an online service provider, and you make works of the GEMA repertoire available to the public, you are obligated to declare the reservation of use pursuant to Section 44b UrhG when making the works available. The declaration must be made in a machine-readable format and in such a way that third parties cannot use the licensed works without the payment of a licence fee.

As a consequence, third parties are not entitled to use our repertoire, or to make it available to others, for TDM and AI-based uses without consent. If you intend to use protected works for the purposes of text and data mining, please get in touch with us at kontakt@gema.de.
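For licensees wondering what a machine-readable reservation might look like in practice, the following is a minimal sketch only. The text above does not prescribe a format. One existing convention is the TDM Reservation Protocol (TDMRep) published by a W3C community group, which signals a reservation through an HTTP response header named tdm-reservation with the value 1, optionally accompanied by a tdm-policy header pointing to licensing information. Whether this convention satisfies Section 44b UrhG or the terms of a GEMA licence is an assumption that the sketch cannot confirm. The function names and the example.com URL are hypothetical.

```python
def tdm_reservation_headers(policy_url=None):
    """Build HTTP response headers that signal a TDM reservation (TDMRep-style).

    Assumption: the service attaches these headers to every response
    that delivers or links to protected works.
    """
    headers = {"tdm-reservation": "1"}  # "1" means: TDM rights are reserved
    if policy_url:
        # Optional pointer to a page or file describing licensing terms.
        headers["tdm-policy"] = policy_url
    return headers


def is_tdm_reserved(response_headers):
    """Crawler-side check: True if the response declares a reservation."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value.strip() for name, value in response_headers.items()}
    return normalised.get("tdm-reservation") == "1"


# Hypothetical policy URL, used for illustration only.
headers = tdm_reservation_headers("https://example.com/tdm-policy.json")
print(headers)
print(is_tdm_reserved(headers))
```

The second function shows the other side of the arrangement: a crawler collecting training data could check for the header before using the content and, if a reservation is declared, refrain from TDM use unless it has obtained consent.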