Graham Greenleaf (Macquarie University – Macquarie Law School) and David F. Lindsay (UTS: Law) have posted Copyright’s Public Domains: The Limits on AI Appropriation on SSRN. Here is the abstract:
The spectacular rise to commercial and intellectual prominence of artificial intelligence (AI) since 2022, and in particular the predominant role of generative AI and its use of large language models (LLMs), has given rise to many legal and policy problems. Four main types of problems for copyright law are sketched, in each of which one significant legal question raised is whether the use of the content is in the public domain or is an infringement of copyright. Can the developer of the training set, or the deployer of the AI system, or the end-user of the AI system, claim that the use they have made of the AI system’s content does not involve an infringement of copyright but is a use of that content which is in the public domain?
Those building AI systems are faced with three alternatives. They can negotiate to obtain the consent (through private licences) of the owners of copyright in the works intended to be used. Or they can ignore whether or not consent may be necessary, on the basis that they ‘can get away with it’ anyway. Or they can attempt to justify the use they wish to make of the content on the basis that it does not require owner consent because it is in the public domain. This last approach is the subject of this article, which aims to be comprehensive and precise about which aspect(s) of the public domain can be used to support claims that consent is not legally necessary.
Our previous work has identified fifteen aspects of the copyright public domain, the aggregate of which is the copyright public domain in a particular jurisdiction. We consider each of those fifteen categories, with a focus on Australian law, explaining first how the category is consistent with our overall definition of the public domain (that is, ‘the public’s ability to use content on equal terms without seeking permission’). Then the relationship of each category to international copyright law is stated, with an emphasis on how much (if any) expansion of that category in national laws is consistent with current international copyright law. We give brief examples of how each category is reflected in national laws, particularly in Australia. Finally, we consider possible ‘opportunities’, meaning how this aspect of the public domain has been or could be used or expanded (by legislation or case law) to assist the development of AI systems.
This article concludes that developing solutions for the substantial challenges posed by generative AI can be assisted by analysing the actual and potential extent to which all of the existing public domain categories apply, or could reasonably be developed to apply, to access and use for developing AI systems. Instead of a ‘magic bullet’ to be found in only one category, the best solution might come from a combination of various public domain elements.
From our survey we find that nine of the fifteen public domain categories offer some possibilities for supporting AI development, but in most categories these possibilities are slight. Two most clearly permit use of a substantial amount of content without the need for permission: the substantial number of works in which the copyright term has expired (category 5); and material available for use under neutral voluntary licences (category 14), such as CC licences. In two further categories access to significant amounts of public domain content is complicated due to legal uncertainties and practical obstacles: fair use exceptions, at least in the US and the EU (but not at present in Australia); and the tenet of copyright law that mere facts or ideas (category 10) can be freely used, which might seem to hold considerable potential for lawful AI training. Other public domain categories, as they currently exist, are of only theoretical value.
The question then turns to the extent to which there is potential for developing existing public domain categories, while appropriately balancing the interests of authors and owners, on the one hand, and AI developers, on the other. There is more scope for national jurisdictions to reform the public domain categories than is commonly thought. For example, works expressly excluded from copyright protection (category 3) could take advantage of the two permissible optional exclusions, namely laws and other official texts, and political and legal speeches, and ‘news of the day’. This leaves the two public domain categories that have been a focus of debates about policy responses to the rise of generative AI: ‘free use’ exceptions (category 12) and neutral compulsory licensing (category 13). We can see some scope for a very nuanced TDM exception for some AI training which is in the public interest, and for some conventional compulsory licences or extended collective licences (ECLs). Reliance on the status quo and on market-based approaches is unlikely to be sufficient.
Recommended!
To receive new posts from Legal Theory Blog by email, get a free subscription to Legal Theory Stack.
Lawrence Solum
