Browse initiatives shaping AI data access, licensing, and enforcement. Search by name
or evidence, filter by status, and use the approach sections below when you want the
table view.
Catalog scope
Archived entries stay on the site for historical context, but they are hidden by
default because no newer public activity was found during review.
Approach types
Approach types describe the main mechanism in play, such as licenses, registries,
marketplaces, tollgates, blocking, or certification.
Data types
Data types help you focus on the kinds of inputs an initiative covers, like text,
images, audio, video, or code.
54 initiatives
Showing current initiatives. Archived entries are hidden by default.
Primary approach type
Preference signal
Signals that express whether AI systems may crawl, train on, or reuse content, usually through metadata, headers, or other machine-readable notices.
The Internet Engineering Task Force (IETF) is working on a standardized preference signal for AI agents and crawlers: "building blocks that allow for the expression of preferences about how content is collected and processed for Artificial Intelligence (AI) model development, deployment, and use."
robots.txt can be used to express AI crawler access preferences. See example from OpenAI. Additionally, the X-Robots-Tag response header allows servers to send crawler directives via HTTP response headers.
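As a rough illustration of how a robots.txt preference might be evaluated, the sketch below uses Python's standard `urllib.robotparser`. The robots.txt content, the `may_crawl` helper, the example URL, and the choice of `GPTBot` as the user-agent token are assumptions made for this example; they do not come from any catalog entry.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallow one AI crawler, allow everyone else.
# "GPTBot" is only an example user-agent token chosen for this sketch.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"
    print(may_crawl(ROBOTS_TXT, "GPTBot", url))
    print(may_crawl(ROBOTS_TXT, "ExampleOtherBot", url))
```

A preference expressed this way is advisory: it only has an effect when the crawler chooses to read and honor the file.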
Technical controls that deny, rate-limit, or otherwise constrain crawling, downloading, or automated collection unless a requester meets specific conditions.
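One way such controls could be combined is sketched below: a deny list checked first, then a token-bucket rate limit. This is a minimal illustration and not the method of any listed initiative; `DENIED_AGENTS`, `TokenBucket`, `decide`, and the agent names are hypothetical.

```python
# Hypothetical deny list; the agent name is a placeholder.
DENIED_AGENTS = {"examplebot"}


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `refill_per_sec`."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity      # start with a full bucket
        self.updated_at = 0.0       # timestamp of the last refill, in seconds

    def take(self, now: float) -> bool:
        """Consume one token if available; return whether the request may pass."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def decide(user_agent: str, bucket: TokenBucket, now: float) -> int:
    """Return an HTTP status: 403 denied, 429 rate-limited, 200 allowed."""
    if user_agent.lower() in DENIED_AGENTS:
        return 403
    if not bucket.take(now):
        return 429
    return 200
```

Denied agents are rejected before the bucket is consulted, so they do not use up capacity meant for permitted requesters.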
New infrastructure
Pipeline: Collect -> Train -> Fine-tune
Primary approach type
Certification
Third-party review, badges, or verification programs that signal whether a model, company, or dataset follows stated sourcing or licensing requirements.
Technical controls that deny, rate-limit, or otherwise constrain crawling, downloading, or automated collection unless a requester meets specific conditions.
---
gated: true
extra_gated_prompt: "You agree to not use the dataset to conduct experiments that cause harm to human subjects."
extra_gated_fields:
  Company: text
  Country: country
---
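To make the gating idea concrete, the sketch below checks whether an access request accepts the prompt and fills in every field named in the front matter above. It is a hypothetical illustration and not how any hosting platform implements gating; `GATED_FIELDS`, `missing_fields`, and `grant_access` are names invented for this example.

```python
# Mirrors the field names in the front matter above; values name the input type.
GATED_FIELDS = {"Company": "text", "Country": "country"}


def missing_fields(request: dict, required: dict) -> list:
    """Return the required field names that are absent or blank in request."""
    return [name for name in required if not str(request.get(name, "")).strip()]


def grant_access(request: dict, accepted_prompt: bool, required: dict = GATED_FIELDS) -> bool:
    """Grant access only if the prompt was accepted and every field is filled in."""
    return accepted_prompt and not missing_fields(request, required)


if __name__ == "__main__":
    complete = {"Company": "Example Labs", "Country": "Canada"}
    incomplete = {"Company": "Example Labs", "Country": " "}
    print(grant_access(complete, accepted_prompt=True))
    print(missing_fields(incomplete, GATED_FIELDS))
```

This sketch only checks that fields are present; a real gate would also need to validate values and record who accepted the terms.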
Primary approach type
New infrastructure
New registries, protocols, hosting patterns, or coordination layers that make governed data access, compliance, or contribution easier to operate.