| The Internet Engineering Task Force (IETF) is working on a standardized preference signal for AI agents and crawlers: "building blocks that allow for the expression of preferences about how content is collected and processed for Artificial Intelligence (AI) model development, deployment, and use." | datatracker.ietf.org/wg/aipref/about | Nov 30, 2025: Vocabulary draft updated | Preference signal |
| Asset-level protocol for binding machine-readable text and data mining (TDM) and AI-training preferences to digital works. Uses IETF AI Preferences (AIPref). | tdmai.org | Nov 03, 2025: Usage vocabulary updated | Preference signal; Also uses: New infrastructure; Pipeline: Collect -> Train -> Fine-tune |
| Local Contexts labels that let Indigenous communities express culturally specific conditions for access and reuse of knowledge and data. | localcontexts.org/labels/traditional-knowledge-labels | Sep 17, 2025: Guide to using TK Labels and Notices updated | Preference signal; Pipeline: Collect -> Retrieve -> Train |
| `robots.txt` can be used to express AI crawler access preferences; see OpenAI's bots documentation for an example. The `X-Robots-Tag` header additionally lets servers send crawler directives in HTTP responses. | platform.openai.com/docs/bots | Sep 16, 2025: OpenAI bots documentation | Preference signal |
| Registry and opt-out workflow for marking works that should not be used in future AI training datasets. | haveibeentrained.com | Sep 14, 2025: Face Reveal launched for Have I Been Trained | Preference signal; Also uses: New infrastructure; Pipeline: Collect -> Train |
| Content Credentials-based preference system for signaling that generative AI should not train on or use a creator's files. | helpx.adobe.com/creative-cloud/apps/adobe-content-authenticity/generative-ai-training-preferences.html | Sep 01, 2025: Generative AI training and usage preference documentation | Preference signal; Pipeline: Train -> Generate |
| A proposed convention for AI-specific crawler directives via an `ai.txt` file. | site.spawning.ai/spawning-ai-txt | Aug 27, 2025: Improved crawler-control post published | Preference signal |
| Proposed signals for communicating reuse preferences to AI and web agents. | github.com/creativecommons/cc-signals | Aug 26, 2025: Response to feedback | Preference signal |
| Proposed AT Protocol mechanism for users to declare data-reuse preferences such as generative-AI training. Uses IETF AI Preferences (AIPref). | demo.user-intents.org | Mar 07, 2025: Proposal discussion opened | Preference signal; Pipeline: Collect -> Train -> Retrieve |
| Embedded image and video metadata fields for expressing whether assets may be used in data-mining and generative-AI training datasets. | pluscoalition.org/about-ai-and-ml-image-rights-standards | Feb 19, 2025: IPTC and PLUS explain metadata stance on GenAI training | Preference signal; Pipeline: Collect -> Train -> Fine-tune |
| W3C Community Group specification for expressing text and data mining permissions via a well-known JSON file, designed for EU DSM Directive compliance. | w3.org/community/tdmrep | Aug 08, 2024: Version 3 final report listed | Preference signal; Also uses: Formal license |
| Proposal to add a `noml` directive so content can stay searchable but not be used for machine learning. | noml.info | May 28, 2024: Project featured in scholarly article on search and AI opt-out | Preference signal; Pipeline: Collect -> Train |
| Platform-level HTML and HTTP directives that tell external AI datasets and models not to use artists' work unless they opt in. | deviantart.com/team/journal/UPDATE-All-Deviations-Are-OptedOut-of-AI-Datasets-934500371 | Apr 16, 2024: NoAI labels added to DeviantArt Studio | Preference signal; Pipeline: Collect -> Train |
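
Several of the entries above rely on crawler-facing signals: `robots.txt` rules keyed by user agent, and directives carried in the `X-Robots-Tag` response header. As a rough illustration of how a well-behaved crawler might honor both, here is a minimal Python sketch using only the standard library. The `robots.txt` body, the example URL, and the helper names `crawler_allowed` and `header_opts_out` are hypothetical. The default directive names (`noai`, `noimageai`, `noml`) are assumptions based on the DeviantArt and `noml` entries and should be checked against each project's own documentation. `GPTBot` is the user agent OpenAI documents for its crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block OpenAI's GPTBot, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def header_opts_out(x_robots_tag: str,
                    directives=("noai", "noimageai", "noml")) -> bool:
    """Return True if a comma-separated X-Robots-Tag value contains any
    of the given opt-out directives (names are assumptions, see above)."""
    tokens = {token.strip().lower() for token in x_robots_tag.split(",")}
    return any(directive in tokens for directive in directives)


if __name__ == "__main__":
    url = "https://example.com/art/1"
    print(crawler_allowed(ROBOTS_TXT, "GPTBot", url))
    print(crawler_allowed(ROBOTS_TXT, "SomeBrowser", url))
    print(header_opts_out("noai, noimageai"))
```

The sketch treats the header as a flat comma-separated list; real `X-Robots-Tag` values can also scope directives to a specific user agent, which this simplified check does not handle.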