Content Protection Project – Post-Open: What Comes After Open Source

The Post Open Administration operates the Content Protection Project to help content creators protect their works from being appropriated unfairly by the operators of AI systems without payment, attribution, or compliance with the creator’s license. This project is open to all content creators, including those who wish to grant individual permission and receive payment for their content, Open Source and Creative Commons projects that want no payment but do want attribution and compliance with their licenses, and those who only wish fair attribution. Participation is free, but we would appreciate a donation, as serving process on all of the world’s AI companies costs us a lot of money.

The Problem

AI isn’t just a computer program. It’s a means by which large companies appropriate your work, and then offer it to others as theirs. They do it by reading your content into a large language model, a kind of computer database used by AI systems. The content that can be appropriated includes everything on your web site, your videos, music, and images however you publish them, and any other computer-readable information that they can get their hands on. Even your appearance and the sound of your voice can be taken, and are used to make deepfakes.

Many of us have already had the experience of seeing our own work parroted by AI without permission, attribution, or payment. The large language model doesn’t contain a copy of your work in any recognizable form, in the same way that after you read a book, medical imaging would not show a copy of the text in your brain. But unlike your brain, a large language model doesn’t forget, and isn’t creative, so everything that comes out of it is just a mix of what went in: your work, and that of other people.

Because no recognizable copy of your work exists in the large language model that we can see today, courts have failed to enforce copyright when AI parrots people’s work. It’s difficult to prove that there was actual copying in the way that people have copied work since the printing press was invented, which is what copyright law restricts, and science and the law haven’t caught up.

So, the Content Protection Project doesn’t base its effort on that sort of copying. Instead, we use legal remedies against another kind called ephemeral copying, which is part of all computer use and is essential to the training of a large language model, the process in which it receives your work. Your right to restrict ephemeral copying of your content is well-supported in copyright law.

We will serve a legal document on the corporate agents of companies that operate AI. The document lists you or your company, your content and its online location if it has one, and states a legal prohibition on ephemeral copying of that content for the purpose of training a large language model, AI, machine learning system, or neural network. In addition, we provide changes that you should make to the robots.txt file on your web site, and to the terms of service of your web site or any copyright notice accompanying your content. We will keep a record of every time AI companies are served with your prohibition.

When your content appears in the output of an AI, your task in court will be to show the likelihood that ephemeral copying occurred, rather than to prove the existence of an exact copy within the large language model, an approach that has so far been unsuccessful. You will be able to show that the operator of the AI received your prohibition.

It is likely that when AI appropriation happens, it will happen to many content creators who are registered with us, not just one. Thus, we will be able to ask legal counsel to collect multiple examples of the likelihood of ephemeral copying, and to bring a class action suit on behalf of those content creators. Many law firms take such cases on without any initial payment, for a share of the eventual penalties.

How to Participate

Register yourself or your company and your content with the Content Protection Project using this URL: https://postopen.org/content_protection/register

Add this text to your web site’s terms of service and to any copyright statements related to your content:

Any copying, including ephemeral copying, for the purpose of training an artificial intelligence (AI), large language model (LLM), machine learning system or neural network is prohibited.

You should also add a file called /robots.txt to your web site if it doesn’t already have one, and include these two lines in that file:

User-Agent: AI
Disallow: /

We will promote that usage as a web standard prohibiting access to a web site for the purpose of training AI.
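For site owners who maintain robots.txt by script, the step above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not an official tool of the project: the function names (`has_ai_group`, `ensure_ai_rule`, `ai_is_blocked`) and the example.com URL are hypothetical, and the check relies on Python's standard `urllib.robotparser` module. It also assumes a crawler identifies itself with the `AI` token proposed above, which crawlers are not currently obliged to do.

```python
from urllib.robotparser import RobotFileParser

# The two lines given above, as they would appear in robots.txt.
AI_RULE = "User-Agent: AI\nDisallow: /\n"


def has_ai_group(robots_text: str) -> bool:
    """Return True if any User-Agent line already names the token 'AI'."""
    for line in robots_text.splitlines():
        line = line.split("#", 1)[0]  # ignore trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "user-agent" and value.strip().lower() == "ai":
            return True
    return False


def ensure_ai_rule(robots_text: str) -> str:
    """Append the two-line rule unless an 'AI' group is already present."""
    if has_ai_group(robots_text):
        return robots_text
    if robots_text and not robots_text.endswith("\n"):
        robots_text += "\n"
    # A blank line separates the new group from any existing rules.
    separator = "\n" if robots_text else ""
    return robots_text + separator + AI_RULE


def ai_is_blocked(robots_text: str, url: str = "https://example.com/") -> bool:
    """Check, with the standard-library parser, that agent 'AI' may not fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return not parser.can_fetch("AI", url)


if __name__ == "__main__":
    existing = "User-agent: *\nAllow: /"  # hypothetical existing file content
    updated = ensure_ai_rule(existing)
    print(updated)
    print("AI blocked:", ai_is_blocked(updated))
```

Running `ensure_ai_rule` a second time on its own output leaves the text unchanged, so the script can be run repeatedly without duplicating the rule. It only edits and checks the file; whether a given crawler honors the rule is up to that crawler.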

This project is not anti-AI. We are against unfair appropriation of content using AI. If you would like to license your content to an AI company, or allow it to be used to train an AI that qualifies under the Open Source Definition or the Open Source AI Definition, we’re all for it. The goal is for your rights to be respected, and for your content to be paid for if you wish, and attributed correctly.

Our opinion is: AI companies know they’re appropriating your content unfairly and probably illegally. Their philosophy is that they’ll do it now, make a lot of money, and later on they’ll be able to afford to defend themselves in court until you run out of resources, or they’ll even be successful in lobbying to change the law in their favor. It’s time to bring some balance to this.

In most US states, only a licensed attorney who is contracted to advise you can provide legal advice. The Post Open Administration is neither a licensed attorney nor contracted to advise you. Please show this to your lawyer and ask for legal advice. This might not work in court, so we can offer no guarantee.

If you’d like to support this effort, please make a donation at this URL:

The Post Open Administration is operated by HamOpen.org, a California non-profit, with the 501(c)(3) fiscal sponsor Non-Profit Accounting Service.