OpenAI is fighting lawsuits from artists, writers, and publishers who allege it inappropriately used their work to train the algorithms behind ChatGPT and other AI systems. On Tuesday the company announced a tool apparently designed to appease creatives and rights holders, by granting them some control over how OpenAI uses their work.
The company says it will launch a tool in 2025 called Media Manager that allows content creators to opt their work out of the company’s AI development. In a blog post, OpenAI described the tool as a way to allow “creators and content owners to tell us what they own” and specify “how they want their works to be included or excluded from machine learning research and training.”
OpenAI said that it is working with “creators, content owners, and regulators” to develop the tool and intends it to “set an industry standard.” The company did not name any of its partners on the project or make clear exactly how the tool will operate.
Open questions about the system include whether content owners will be able to make a single request to cover all their works, and whether OpenAI will allow requests related to models that have already been trained and launched. Research is underway on machine “unlearning,” a process that adjusts an AI system to retrospectively remove the contribution of one part of its training data, but the technique has not yet been perfected.
Ed Newton-Rex, CEO of the startup Fairly Trained, which certifies AI companies that use ethically sourced training data, says OpenAI’s apparent shift on training data is welcome but that the implementation will be critical. “I’m glad to see OpenAI engaging with this issue. Whether or not it will actually help artists will come down to the detail, which hasn’t been provided yet,” he says. The first major question on his mind: Is this simply an opt-out tool that leaves OpenAI continuing to use data without permission unless a content owner requests its exclusion? Or will it represent a larger shift in how OpenAI does business? OpenAI did not immediately return a request for comment.
Newton-Rex is also curious to know if OpenAI will allow other companies to use its Media Manager so that artists can signal their preferences to multiple AI developers at once. “If not, it will just add further complexity to an already complex opt-out environment,” says Newton-Rex, who was formerly an executive at Stability AI, developer of the Stable Diffusion image generator.
OpenAI is not the first to look for ways for artists and other content creators to signal their preferences about the use of their work and personal data for AI projects. Other tech companies, from Adobe to Tumblr, also offer opt-out tools regarding data collection and machine learning. The startup Spawning launched a registry called Do Not Train nearly two years ago, and creators have already added their preferences for 1.5 billion works.
Jordan Meyer, CEO of Spawning, says the company is not working with OpenAI on its Media Manager project, but is open to doing so. “If OpenAI is able to make registering or respecting universal opt-outs easier, we’ll happily incorporate their work into our suite,” he says.
By Wired, May 7, 2024