Catalog

Browse initiatives shaping AI data access, licensing, and enforcement. Search by name or evidence, filter by status, and use the approach sections below when you want the table view.

54 initiatives

Showing current initiatives. Archived entries are hidden by default.

Browse by primary approach

Primary approach type

Preference signal

Signals that express whether AI systems may crawl, train on, or reuse content, usually through metadata, headers, or other machine-readable notices.

Initiative	Website	Latest update	Approach type
Creative Commons Signals In progress Creative Commons framework for communicating expectations and building governance infrastructure around AI use of shared knowledge. More context Evidence trail Creative Commons outlines signals-to-infrastructure plan May 13, 2026 Creative Commons describes shift from signals to agency Apr 23, 2026 Year-end CC Signals progress update Dec 15, 2025 Response to feedback Aug 27, 2025	creativecommons.org/cc-signals	May 13, 2026 Creative Commons outlines signals-to-infrastructure plan	Preference signal Also uses New infrastructure
IETF AI Preferences (AIPref) In progress Internet Engineering Task Force is working on a standardized preference signal for AI agents and crawlers ("building blocks that allow for the expression of preferences about how content is collected and processed for Artificial Intelligence (AI) model development, deployment, and use.") More context Evidence trail Vocabulary draft 06 updated Apr 27, 2026 Milestone for protocol specifications Aug 01, 2025	datatracker.ietf.org/wg/aipref/about	Apr 27, 2026 Vocabulary draft 06 updated	Preference signal
IPTC + PLUS Data Mining Metadata Live Embedded image and video metadata fields for expressing whether assets may be used in data-mining and generative-AI training datasets. More context Evidence trail IPTC publishes version 2.0 of AI opt-out guidelines Mar 30, 2026 IPTC and PLUS submit response to US National Science Foundation AI Action Plan Mar 14, 2025 IPTC and PLUS finalize Data Mining field in Photo Metadata Standard 2023.1 Oct 12, 2023 Data types Images, Video	ns.useplus.org/LDF/ldf-XMPSpecification	Mar 30, 2026 IPTC publishes version 2.0 of AI opt-out guidelines	Preference signal Pipeline: Collect -> Train -> Fine-tune
TDM·AI In progress Asset-level protocol for binding machine-readable TDM and AI-training preferences to digital works. More context Evidence trail Usage vocabulary updated Nov 04, 2025 Data types Multimodal Uses IETF AI Preferences (AIPref)	tdmai.org	Nov 04, 2025 Usage vocabulary updated	Preference signal Also uses New infrastructure Pipeline: Collect -> Train -> Fine-tune
TDMRep (W3C) Live W3C specification for expressing text and data mining permissions via a well-known JSON file, designed for EU DSM Directive compliance. More context Evidence trail Community group notes outline 2025 alignment work with AI-Pref Oct 01, 2025 Community group notes discuss standardization path and vocabulary work Apr 22, 2025 Version 3 final report listed Aug 09, 2024 Data types Web content	w3.org/community/tdmrep	Oct 01, 2025 Community group notes outline 2025 alignment work with AI-Pref	Preference signal Also uses Formal license
DIY robots handling (robots.txt++) Live robots.txt can be used to express AI crawler access preferences. See example from OpenAI. Additionally, the X-Robots-Tag response header allows servers to send crawler directives via HTTP response headers. More context Evidence trail OpenAI bots documentation Sep 17, 2025 Data types Web content	developers.openai.com/api/docs/bots	Sep 17, 2025 OpenAI bots documentation	Preference signal
Adobe Content Authenticity Live Content Credentials-based preference system for signaling that generative AI should not train on or use a creator's files. More context Evidence trail Generative AI training and usage preference documentation Sep 02, 2025 Public beta launch for Adobe Content Authenticity Apr 24, 2025 Data types Images, Video	helpx.adobe.com/creative-cloud/apps/adobe-content-authenticity/generative-ai-training-preferences.html	Sep 02, 2025 Generative AI training and usage preference documentation	Preference signal Pipeline: Train -> Generate
Spawning ai.txt In progress A proposed convention for AI-specific crawler directives via an `ai.txt` file. More context Evidence trail Improved crawler-control post published Aug 28, 2025 User guide post Mar 24, 2024 Data types Web content	site.spawning.ai/spawning-ai-txt	Aug 28, 2025 Improved crawler-control post published	Preference signal
User Intents In progress Proposed AT Protocol mechanism for users to declare data-reuse preferences such as generative-AI training. More context Evidence trail Proposal discussion opened Mar 08, 2025 Uses IETF AI Preferences (AIPref)	demo.user-intents.org	Mar 08, 2025 Proposal discussion opened	Preference signal Pipeline: Collect -> Train -> Retrieve
trust.txt Live Publisher-oriented trust file that can also declare whether AI training is allowed through a machine-readable `datatrainingallowed=` field. More context Evidence trail Browser extension launch described the network at about 3,000 publishers Feb 21, 2025 trust.txt spec added the datatrainingallowed field Apr 04, 2024 Data types Web content Signals Users about 3,000 participating publishers	journallist.net	Feb 21, 2025 Browser extension launch described the network at about 3,000 publishers	Preference signal Also uses New infrastructure Pipeline: Train
DeviantArt NoAI / NoImageAI Live Platform-level HTML and HTTP directives that tell external AI datasets and models not to use artists' work unless they opt in. More context Evidence trail New DeviantArt Studio supports NoAI label presets Apr 17, 2024 DeviantArt rolls out default opt-out and publishes noai/noimageai directives Nov 11, 2022 Data types Images	deviantart.com/team/journal/UPDATE-All-Deviations-Are-OptedOut-of-AI-Datasets-934500371	Apr 17, 2024 New DeviantArt Studio supports NoAI label presets	Preference signal Pipeline: Collect -> Train
NoML Archived Proposed robots-style directive for keeping content searchable while asking crawlers not to use it for machine learning. More context Evidence trail Mojeek publishes NoML proposal and open letter Oct 25, 2023 Data types Web content	noml.info	Oct 25, 2023 Mojeek publishes NoML proposal and open letter	Preference signal Pipeline: Collect -> Train
TK Labels Live Local Contexts labels that let Indigenous communities express culturally specific conditions for access and reuse of knowledge and data.	localcontexts.org/labels/traditional-knowledge-labels	Jun 17, 2022 Oriana TV case study shows TK Labels in use	Preference signal Pipeline: Collect -> Retrieve -> Train
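
Several rows above rely on the same two web mechanisms: robots.txt rules addressed to an AI crawler, and directives sent in the X-Robots-Tag response header. The Python sketch below shows how a crawler operator might check both before collecting a page. It is illustrative only: the robots.txt body and URLs are invented, `may_crawl` and `header_opts_out` are hypothetical helper names, and treating `noai` / `noimageai` as header directives follows the DeviantArt entry rather than any formal standard. `GPTBot` is the crawler token that OpenAI documents for its training crawler.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt body for illustration: block GPTBot, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# Directive names taken from the DeviantArt NoAI / NoImageAI entry (assumption:
# they are sent as comma-separated values, like other X-Robots-Tag directives).
AI_OPT_OUT_DIRECTIVES = {"noai", "noimageai"}


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def header_opts_out(x_robots_tag: str) -> bool:
    """Return True if an X-Robots-Tag value carries an AI opt-out directive."""
    directives = {part.strip().lower() for part in x_robots_tag.split(",")}
    return bool(directives & AI_OPT_OUT_DIRECTIVES)
```

A crawler that honors these signals would skip a page when `may_crawl` returns False or `header_opts_out` returns True. Neither check is enforcement: both depend on the crawler choosing to comply, which is why the catalog lists technical blocking and tollgates as separate approaches.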

Primary approach type

Formal license

Formal legal terms or license language that grant, restrict, or condition AI-related reuse of content, datasets, or model inputs.

Initiative	Website	Latest update	Approach type
Really Simple Licensing (RSL) Live A machine-readable licensing schema for clearly signaling reuse permissions and conditions (including payment or use restriction). More context Evidence trail Technical standards released Dec 10, 2025 RSL Standard and Collective launched Sep 10, 2025 Data types Web content Signals Users 1,500+ organizations Data billions of web pages	rslstandard.org	Dec 10, 2025 Technical standards released	Formal license Also uses Preference signal
copyright.sh Live Machine-readable licensing layer that lets websites declare AI usage terms and pricing. More context Evidence trail WordPress plugin launched Oct 28, 2025 Data types Web content	copyright.sh	Oct 28, 2025 WordPress plugin launched	Formal license
AI-Ready Licenses In progress Research-backed proposal for modular standard data licenses tailored to AI data sharing.	mlcommons.org/2025/03/unlocking-data-collab	Mar 17, 2025 Research findings published	Formal license Pipeline: Collect -> Train -> Fine-tune -> Retrieve

Primary approach type

Licensing collective

Shared bargaining, aggregation, or rights-management structures that let many publishers or creators negotiate AI access together.

Initiative	Website	Latest update	Approach type
SPUR (Standards for Publisher Usage Rights) In progress Publisher coalition building shared standards and licensing frameworks for responsible AI use of journalism. More context Evidence trail Mediahuis joins SPUR as a founding member May 11, 2026 Guardian announced SPUR coalition launch Feb 26, 2026 Data types Text Signals Users 6 founding publisher members	spurcoalition.org	May 11, 2026 Mediahuis joins SPUR as a founding member	Licensing collective Also uses New infrastructure Pipeline: Train -> Retrieve
Publishers' Rights Organization Live Coalition pursuing licensing, compensation, and enforcement for publisher content used by AI systems. More context Evidence trail Business model page describes enforcement and future AI licensing May 02, 2026 Data types Text	publishersrights.org	May 02, 2026 Business model page describes enforcement and future AI licensing	Licensing collective Pipeline: Train -> Retrieve
CCC AI Licensing Suite Live Voluntary collective licensing program covering internal AI reuse, external AI training, and transactional AI uses for copyrighted works. More context Evidence trail Four AI licensing options announced Mar 03, 2026 AI Systems Training License announced Mar 04, 2025 Internal AI reuse rights launched in the Annual Copyright License Jul 16, 2024 Data types Text	copyright.com/solutions-rightsholders-ai	Mar 03, 2026 Four AI licensing options announced	Licensing collective Also uses Formal license Pipeline: Train -> Fine-tune -> Retrieve -> Generate
CLA Generative AI Training Licence Live UK collective licensing offer for AI model training, fine-tuning, and RAG over published text content. More context Evidence trail PLS launches first opt-in stage for collective AI licensing Mar 01, 2026 CLA says the opt-in collective AI training solution is targeting 2026 launch Dec 18, 2025 CLA announces development of Generative AI Training Licence Apr 23, 2025 Data types Text	cla.co.uk/ai-and-copyright	Mar 01, 2026 PLS launches first opt-in stage for collective AI licensing	Licensing collective Also uses Formal license Pipeline: Train -> Fine-tune -> Retrieve
Dataset Providers Alliance (DPA) Live Trade alliance of dataset licensors pushing for legal clarity, ethical sourcing, and scalable licensing markets for AI training data. More context Evidence trail DPA welcomed five new members Dec 10, 2024 Position paper on AI data licensing released Sep 04, 2024 Alliance launch announced Jun 26, 2024 Signals Users 12 announced members	thedpa.ai	Dec 10, 2024 DPA welcomed five new members	Licensing collective

Primary approach type

Marketplace

Commercial platforms or brokers that package, list, or sell access to datasets, content libraries, or licensing opportunities for AI use.

Initiative	Website	Latest update	Approach type
Stack Data Licensing Live Licensed access to Stack Overflow's developer knowledge corpus for AI training, fine-tuning, RAG, and agentic use cases. More context Evidence trail Current product page cites 83M+ questions and answers May 14, 2026 Product renamed from OverflowAPI to Stack Data Licensing Sep 04, 2025 Stack Overflow joined Databricks Marketplace Jun 09, 2025 OverflowAPI won Best AI API award Nov 06, 2024 Data types Text, Code Signals Data 83M+ human-verified questions and answers	stackoverflow.co/data-licensing	May 14, 2026 Current product page cites 83M+ questions and answers	Marketplace Also uses Formal license Pipeline: Train -> Fine-tune -> Retrieve
SourceAudio AI Dataset Licensing Live Opt-in music dataset licensing program that pays rights holders for AI training use of tracks and catalogs. More context Evidence trail April recap covers AI training data economy panel May 06, 2026 Symphonic partnership expands AI dataset licensing marketplace Mar 06, 2026 Symphonic partnership press release cites marketplace scale Feb 19, 2026 SourceAudio outlines AI dataset licensing program scale Jun 05, 2025 Data types Music Signals Users 3,000+ music catalogs Data 14M+ opted-in songs Payments nearly $10M annual revenue from eight contracts	sourceaudio.com/blog/2025/06/05/a-new-chapter-in-music-licensing	May 06, 2026 April recap covers AI training data economy panel	Marketplace Pipeline: Train -> Fine-tune
Created by Humans Live Rights-cleared book licensing platform for AI training, reference, and transformative use. More context Evidence trail About page describes mission and company story May 02, 2026 Author overview documents AI Rights licensing and opt-out flows Jan 18, 2025 Data types Text Signals Users 100+ bestselling authors	createdbyhumans.ai	May 02, 2026 About page describes mission and company story	Marketplace Pipeline: Train -> Retrieve
Protege Live AI training data platform for compliant exchange of proprietary, real-world datasets across sectors. More context Evidence trail HC1 partnership adds large de-identified lab data repository Feb 12, 2026 Series A extension cites hundreds of data partners and cross-vertical growth Jan 07, 2026 Seed launch of the AI training data platform Sep 10, 2024 Data types Multimodal Signals Users hundreds of organizations	withprotege.ai	Feb 12, 2026 HC1 partnership adds large de-identified lab data repository	Marketplace Also uses New infrastructure Pipeline: Train -> Fine-tune
Microsoft Publisher Content Marketplace Live Paid marketplace routing premium publisher content into Microsoft Copilot, MSN, and Discover experiences. More context Evidence trail Building Toward a Sustainable Content Economy for the Agentic Web Feb 03, 2026 Data types Text Signals Users 7 launch publisher partners	about.ads.microsoft.com/en/blog/post/february-2026/building-toward-a-sustainable-content-economy-for-the-agentic-web	Feb 03, 2026 Building Toward a Sustainable Content Economy for the Agentic Web	Marketplace Pipeline: Retrieve -> Generate
Defined.ai Live Marketplace for ethically sourced, annotated datasets used to train and fine-tune AI systems. More context Evidence trail Defined.ai reports 2025 marketplace growth Jan 27, 2026 Data types Multimodal Signals Payments several partners generate $1M+/year	defined.ai	Jan 27, 2026 Defined.ai reports 2025 marketplace growth	Marketplace Pipeline: Train -> Fine-tune
Human Native Live AI data marketplace for licensed multimedia datasets, now being integrated into Cloudflare's AI crawl and content-access stack. More context Evidence trail Cloudflare acquisition and integration announcement Jan 15, 2026 Data types Multimodal	humannative.ai	Jan 15, 2026 Cloudflare acquisition and integration announcement	Marketplace Also uses New infrastructure Pipeline: Train
DataSeeds.AI Live Rights-cleared image dataset marketplace that uses Zedge and GuruShots creator networks to supply AI training data. More context Evidence trail Enterprise customer base growth update Oct 20, 2025 Sample dataset released Jun 09, 2025 DataSeeds.AI launched Jun 05, 2025 Data types Images, Video Signals Data approximately 30 million rights-cleared images	dataseeds.ai	Oct 20, 2025 Enterprise customer base growth update	Marketplace Pipeline: Train -> Fine-tune
Bria Artist Program / Licensed Training Catalog Live Contributor compensation and licensed visual training-data program tied to Bria's commercially safe generative AI stack. More context Evidence trail Platform release highlights rights-clear models Sep 18, 2025 Licensed training catalog published Apr 30, 2025 Data types Images Signals Users 30+ data partners	bria.ai/artist-program	Sep 18, 2025 Platform release highlights rights-clear models	Marketplace Pipeline: Train
ProRata / Gist Live A 50/50 revenue-share platform connecting publishers with AI companies, with 700+ publishers signed up including major news outlets. More context Evidence trail Gist Answers launched Sep 05, 2025 500+ publication milestone announced Jun 06, 2025 Gist.ai announced Dec 09, 2024 Data types Text Signals Users 700+ publishers	prorata.ai	Sep 05, 2025 Gist Answers launched	Marketplace Also uses Licensing collective / Tollgate
Dappier Live Rights-cleared content marketplace and monetization layer for RAG, assistants, and other AI applications. More context Evidence trail Licensing program launch announced Aug 18, 2025 DPA welcomed Dappier as a member Dec 10, 2024 Data types Text	dappier.com/marketplace	Aug 18, 2025 Licensing program launch announced	Marketplace Also uses Tollgate Pipeline: Retrieve -> Generate
GCX (Global Copyright Exchange) Live Music licensing platform for rights-cleared AI training data and audio assets. More context Evidence trail Rightsify says Hydra grew out of work on the GCX dataset service May 15, 2025 Rightsify rollout describes GCX large-scale music datasets Aug 22, 2023 Data types Music Signals Data 4.4M+ hours of audio / 32B metadata text pairs / 3PB music data	gcx.co	May 15, 2025 Rightsify says Hydra grew out of work on the GCX dataset service	Marketplace Pipeline: Train -> Fine-tune
Gloo AI Licensing Live Licensed-content marketplace for the faith ecosystem, seeded with a pooled guarantee for AI assistants and search experiences. More context Evidence trail Gloo launches AI Licensing with pooled guarantee Feb 20, 2025 Signals Payments $5M pooled guarantee	docs.gloo.com/product-guides/licensing	Feb 20, 2025 Gloo launches AI Licensing with pooled guarantee	Marketplace Pipeline: Retrieve -> Generate
vAIsual Archived Marketplace for rights-managed visual and biometric datasets tailored to AI training and evaluation. More context Evidence trail Biometric video dataset release describes original Dataset Shop scale May 01, 2023 Data types Images Signals Data 600,000+ high-quality images	vaisual.com	May 01, 2023 Biometric video dataset release describes original Dataset Shop scale	Marketplace Pipeline: Train -> Fine-tune

Primary approach type

Tollgate

Access layers that require payment, metering, or authenticated entry before content can be fetched, queried, or reused for AI workflows.

Initiative	Website	Latest update	Approach type
Cloudflare AI Crawl Control Live A set of tools to block or charge for scraping; includes AI Audit dashboard, managed robots.txt, and pay-per-crawl marketplace. More context Evidence trail Redirects for AI Training launched Apr 17, 2026 Content Signals Policy launched Sep 24, 2025 AI Crawl Control general availability announced Aug 28, 2025 AI Audit and marketplace features launched Jul 01, 2025 Data types Web content Signals Users 3.8M+ domains on managed robots.txt Data 1B+ 402 responses/day	blog.cloudflare.com/control-content-use-for-ai-training	Apr 17, 2026 Redirects for AI Training launched	Tollgate Also uses Technical blocking / Preference signal
TollBit Live Lets publishers add subdomains that make content accessible to AI systems, with blocking and monetization controls. More context Evidence trail Imperva integration announced Dec 16, 2025 TollBit Japan launched Oct 16, 2025 State of the Bots report Sep 08, 2025 Data types Text Signals Users 4,000+ premium publishers	tollbit.com	Dec 16, 2025 Imperva integration announced	Tollgate Also uses Marketplace
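
The tollgate pattern described in this section can be pictured from the crawler's side: a request either succeeds, is refused outright, or comes back with HTTP 402 Payment Required and a quoted price. The Python sketch below is a minimal illustration of that decision, not any vendor's protocol. The header name `x-example-price`, the `FetchDecision` type, and the `decide` function are all hypothetical; real services define their own headers and payment flows.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical header carrying the quoted price per request (assumption,
# not taken from any real tollgate product).
PRICE_HEADER = "x-example-price"


@dataclass
class FetchDecision:
    action: str                    # "use", "pay", or "skip"
    price: Optional[float] = None  # quoted price when action is "pay"


def decide(status: int, headers: Mapping[str, str], budget: float) -> FetchDecision:
    """Decide what a compliant crawler does with a tollgated response."""
    if status == 200:
        return FetchDecision("use")
    if status == 402:
        # Header names are case-insensitive in HTTP, so normalize first.
        normalized = {key.lower(): value for key, value in headers.items()}
        raw = normalized.get(PRICE_HEADER)
        try:
            price = float(raw) if raw is not None else None
        except ValueError:
            price = None
        if price is not None and price <= budget:
            return FetchDecision("pay", price)
    # Blocked, unpriced, unparseable, or over budget: do not collect the content.
    return FetchDecision("skip")
```

For example, `decide(402, {"X-Example-Price": "0.005"}, budget=0.01)` yields a "pay" decision, while the same response with a budget below the quoted price yields "skip".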

Primary approach type

Technical blocking

Technical controls that deny, rate-limit, or otherwise constrain crawling, downloading, or automated collection unless a requester meets specific conditions.

Initiative	Website	Latest update	Approach type
Fastly AI Bot Management Live Edge bot-management layer for detecting and blocking AI crawlers and fetchers that scrape website content. More context Evidence trail Threat Insights post covers bot traffic and AI-bot access Apr 16, 2026 Threat research on AI bot traffic released Aug 19, 2025 AI Bot Management launch announced Apr 15, 2025 Data types Web content	fastly.com/products/fastly-ai-bot-management	Apr 16, 2026 Threat Insights post covers bot traffic and AI-bot access	Technical blocking Pipeline: Collect -> Retrieve
Akamai Content Protector Live Enterprise anti-scraping product that detects and blocks persistent content scrapers, now positioned as part of broader AI and LLM bot management. More context Evidence trail Publishing-focused AI bot report released Apr 08, 2026 AI and LLM bot management framed as business-critical Jul 15, 2025 Content Protector launch explained Feb 06, 2024 Data types Web content	akamai.com/products/content-protector	Apr 08, 2026 Publishing-focused AI bot report released	Technical blocking Pipeline: Collect -> Retrieve
easy-dataset-share Live A simple anti-scraping tool intended to protect datasets from basic crawlers/scrapers. More context Evidence trail easy-dataset-share paper published Jan 09, 2026 Project launch post published Sep 13, 2025 Uses Cloudflare AI Crawl Control	github.com/Responsible-Dataset-Sharing/easy-dataset-share	Jan 09, 2026 easy-dataset-share paper published	Technical blocking Pipeline: Collect
Hugging Face Gated Datasets Live Hugging Face Hub access-control feature that requires users to request approval before downloading a dataset. More context Evidence trail Hugging Face docs updated with EU-specific gated-dataset guidance Aug 18, 2025 Hugging Face docs updated access-request rejection guidance for gated repositories Jan 17, 2025	huggingface.co/docs/hub/datasets-gated	Aug 18, 2025 Hugging Face docs updated with EU-specific gated-dataset guidance	Technical blocking Also uses New infrastructure Pipeline: Collect -> Train -> Fine-tune

Primary approach type

New infrastructure

New registries, protocols, hosting patterns, or coordination layers that make governed data access, compliance, or contribution easier to operate.

Initiative	Website	Latest update	Approach type
IETF Web Bot Auth In progress Working group standardizing cryptographic authentication for bots and AI agents on the web. More context Evidence trail Use cases draft updated Apr 01, 2026 Charter approved Oct 23, 2025 Data types Web content	datatracker.ietf.org/wg/webbotauth/about	Apr 01, 2026 Use cases draft updated	New infrastructure Pipeline: Collect -> Retrieve
IAB Tech Lab CoMP In progress Standards initiative for machine-readable commercial agreements, access policies, and monetization workflows before AI crawling or content use. More context Evidence trail CoMP v1.0 opened for public comment Mar 10, 2026 CoMP working group launched Aug 12, 2025 Data types Web content, Text	iabtechlab.com/standards/comp-content-monetization-protocols-initiative	Mar 10, 2026 CoMP v1.0 opened for public comment	New infrastructure Also uses Tollgate / Marketplace Pipeline: Collect -> Train -> Retrieve
CommonsDB Live Registry for public-domain and openly licensed works using verifiable rights declarations and content-derived identifiers. More context Evidence trail Feasibility study part 2 published Jan 20, 2026 Explorer launched Oct 31, 2025 Signals Data 300,000+ declarations	commonsdb.org	Jan 20, 2026 Feasibility study part 2 published	New infrastructure Pipeline: Collect -> Train -> Retrieve
Wikimedia Enterprise Live Enterprise-grade APIs and structured dumps for Wikipedia and sister projects, designed for large-scale reuse in AI, search, and knowledge graphs. More context Evidence trail New enterprise partners announced Jan 15, 2026 Data types Text, Structured data Signals Users 10+ announced partners Data 920+ datasets / 300M+ unique project pages	enterprise.wikimedia.com	Jan 15, 2026 New enterprise partners announced	New infrastructure Pipeline: Retrieve -> Train
Amlet Live AI content registry for publishers and authors that links ownership proof, TDM registration, and licensing rules for AI reuse. More context Evidence trail TDM registry case made for AI licensing workflows Dec 15, 2025 StreetLib partnership announced Oct 17, 2025 Amlet launched as an AI content registry Oct 13, 2025 Data types Text	amlet.ai	Dec 15, 2025 TDM registry case made for AI licensing workflows	New infrastructure Also uses Formal license Pipeline: Collect -> Train -> Retrieve
Mozilla Data Collective Live Community-centered dataset platform for sharing AI-relevant data under contributor-controlled licenses, access rules, and governance terms. More context Evidence trail Exclusive-hosting FAQ describes dataset protections and management controls Nov 25, 2025 Alpha launch announcement Sep 17, 2025 FAQ describes custom constraints, compensation, and authenticated access Aug 07, 2025 Data types Multimodal Signals Data 470+ datasets	mozilladatacollective.com	Nov 25, 2025 Exclusive-hosting FAQ describes dataset protections and management controls	New infrastructure Also uses Technical blocking Pipeline: Train -> Fine-tune
European Books Data Commons In progress Proposal for a commons-based infrastructure for large-scale access to digitized European books with conditional commercial access. More context Evidence trail Outline paper published Nov 20, 2025 Data types Text	openfuture.eu/publication/outline-for-a-european-books-data-commons	Nov 20, 2025 Outline paper published	New infrastructure Pipeline: Collect -> Train
SyftBox Live Open-source protocol for privacy-preserving AI and analytics across distributed datasets without centralizing the underlying data.	openmined.org/syftbox	Nov 12, 2025 syft-flwr release demonstrates active federated learning workflows on SyftBox	New infrastructure Pipeline: Train -> Retrieve
Attribution-based control In progress OpenMined architecture for permissioned data contribution and attribution-based access in AI systems.	openmined.org/attribution-based-control	Oct 06, 2025 OpenMined explains attribution-based control	New infrastructure
Spawning Do Not Train Registry In progress Registry and opt-out workflow for marking works that should not be used in future AI training datasets. More context Evidence trail Face Reveal launched for Have I Been Trained Sep 15, 2025 Project status update published Aug 28, 2025 Data types Images	haveibeentrained.com	Sep 15, 2025 Face Reveal launched for Have I Been Trained	New infrastructure Also uses Preference signal Pipeline: Collect -> Train
FlexOlmo In progress Distributed language-model training approach that lets data owners contribute experts without sharing raw data or giving up opt-out control. More context Evidence trail Ai2 introduces FlexOlmo and invites organizations with sensitive data to participate Jul 09, 2025 Data types Text	allenai.org/blog/flexolmo	Jul 09, 2025 Ai2 introduces FlexOlmo and invites organizations with sensitive data to participate	New infrastructure Pipeline: Train
Social License for Data Reuse In progress Participatory governance framework for communities to define conditions for data reuse, including AI training.	blog.thegovlab.org/reimagining-data-governance-for-ai-operationalizing-social-licensing-for-data-reuse	May 13, 2025 Operationalization report released	New infrastructure Pipeline: Collect -> Train -> Fine-tune -> Retrieve
Credtent Live Independent creative registry for opting out of AI use, licensing content, and certifying human-created work. More context Evidence trail Credtent marketplace and opt-out announcement Mar 31, 2025 Data types Multimodal Signals Users thousands of creators	credtent.org	Mar 31, 2025 Credtent marketplace and opt-out announcement	New infrastructure Also uses Marketplace / Certification Pipeline: Train
Spawning Data Diligence Live Python package and API helpers for checking whether works are opted out before model training.	github.com/Spawning-Inc/datadiligence	Oct 09, 2024 PyPI release 0.1.7 published	New infrastructure Pipeline: Collect -> Train -> Fine-tune

Primary approach type

Certification

Third-party review, badges, or verification programs that signal whether a model, company, or dataset follows stated sourcing or licensing requirements.

Initiative	Website	Latest update	Approach type
Fairly Trained Live Certification program for AI models and companies that meet stated consent and licensing criteria for training data. More context Evidence trail Individual model certification updated Aug 01, 2024 Five more companies received certification Mar 20, 2024 Certification launched for generative AI companies Jan 17, 2024 Signals Users 16+ announced certified entities	fairlytrained.org	Aug 01, 2024 Individual model certification updated	Certification
