Case Overview
On 11 November 2025, the Regional Court of Munich I issued a landmark judgment in GEMA v. OpenAI (Ref. 42 O 14139/24), addressing the legal implications of using copyrighted works in AI training and outputs. The case examines whether the memorization and reproduction of copyrighted works represented by the German music rights society (GEMA) by OpenAI language models, such as ChatGPT, constitute copyright infringement or fall under exceptions in German copyright law, such as the one for text and data mining.
Plaintiff's Claim
The plaintiff asserted a claim for injunctive relief and damages against the defendants, the operators of the language model and chatbot ChatGPT, for memorizing the lyrics of nine well-known German authors and reproducing the copyrighted lyrics as “unchanged” outputs in response to user prompts.
Defense presented by Defendants
- The defendants denied that their language models copy or store copyrighted material, stating that the outputs are merely a reflection of what the models learned from the training dataset.
- In their defense, the defendants also relied on limitation provisions of the German Copyright Act (Act), including (i) Section 44b, which allows reproductions of lawfully accessible works for text and data mining (TDM), and (ii) Section 57, which permits reproduction of works that are incidental to the main work or subject. They contended that these provisions exempted their use of copyrighted materials in training and generating outputs through their language models.
- The defendants further argued that users, rather than the language models, should be held accountable for the infringement, since the output was generated in response to prompts made by the users.
Court's Decision & Rationale
- The Court ruled in favour of the plaintiff, finding that both the memorization of copyrighted materials in the language models and the reproduction of song lyrics as chatbot output constitute a clear infringement of copyright exploitation rights by the defendants.
- In its ruling, the Court noted that the language models reproduced the song lyrics because of “memorization,” meaning the models actually stored parts of the training data and could output them. The match between the training data and the outputs was too close to be just a coincidence.
- Regarding the defendants’ reliance on the limitation provisions of Section 44b and Section 57 of the Act, the Court observed that the reproduction of copyrighted works as output is not covered by these limitations. The TDM limitation allows only those reproductions that are necessary for TDM, such as converting a work into a digital format or temporarily storing it in memory for analysis, and does not cover permanent storage and output. Such preparatory acts do not interfere with the exploitation rights of the authors or harm their ability to earn from their work. Extending the scope of TDM to cover permanent reproductions of copyrighted work by the language models would leave the authors unprotected, as the law does not require model users to pay for reproduced output.
- In terms of Section 57 of the Act, the Court opined that there was no main work to which the reproduced output could be incidental and dispensable. To accept the defendants’ argument that the song lyrics were incidental, the entire training dataset would have to qualify as a single copyright-protected work, which is not the case.
- The Court also rejected the defendants’ argument that users, and not the defendants, should be held responsible. The Court emphasized that, since the architecture of the model was created and operated by the defendants, they were responsible for the training, memorization, and reproduction of the unauthorized output.
Conclusion & Analysis
This judgment could open doors for the protection of authors’ rights against infringement by AI-generated outputs. It establishes that memorization and reproduction of copyrighted material by language models constitute infringement, and may influence how other jurisdictions approach the training and use of copyrighted material in AI models. However, this precedent may apply only to outputs that contain a substantial or verbatim portion of copyrighted material. In the case of lyrics or music, even short sequences of words could be highly recognizable, whereas for other types of content, reproduction may need to be more substantial to qualify as infringement. The judgment remains subject to appeal, and OpenAI is expected to challenge it before an appellate court. It will be interesting to see how the question of whether memorization and reproduction by language models fall within the scope of TDM limitations unfolds.
Authored by: Sonali Mishra
Email: sonalimishra.mac@gmail.com