AI Topic Space — Project Data, Coverage & Risk-Card Quality

Project Data, Coverage & Risk-Card Quality Overview

Descriptive statistics for the AI knowledge-ecosystem project, which builds a single semantic space (the Science–Technology–Policy Interface, STPI) over Korean and international AI academic papers, patents, and policy / institutional documents (1990–2026), plus a global AI-risk overlay. The Science domain is read broadly as all AI-related research and therefore includes the AI-risk research literature; the Policy domain likewise includes AI-risk governance standards (NIST AI RMF, OECD, ISO, OWASP, MITRE, NHTSA).

Document accounting: Total literature = Science + Patents + Policy + Risk materials, where Risk materials = risk research papers + reports + simple sources (plus risk-related policy reports). Risk records that are academic papers and not already present in Science are deduplicated and folded into Science (26,760 added, so Science = 34,664 + 26,760 = 61,424); 3,491 risk-related policy reports are reassigned from Policy to Risk materials.

The 1990–2026 STPI reference space itself is built from the 111,384 paper + patent + policy documents, with the AI-risk literature projected onto it as an overlay. The AI-risk charts, evidence-status counts, risk-card quality audit, and seminal-reference list are computed from the published data.
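
The document accounting above can be checked with a few lines of arithmetic. The sketch below only re-derives the totals from figures quoted on this page; the variable names are illustrative and are not taken from the project's own code, and the last value is an inferred difference rather than a figure the page states.

```python
# Figures quoted on this page; variable names are illustrative assumptions.
science_base = 34_664        # Science before risk papers are folded in
risk_papers_added = 26_760   # deduplicated risk papers folded into Science
policy_total = 8_993         # policy documents collected (v5 collection)
policy_to_risk = 3_491       # risk-related policy reports reassigned to Risk materials
risk_academic = 26_936       # academic papers among the risk records
risk_other = 7               # reports / simple sources among the risk records

science_inclusive = science_base + risk_papers_added   # Science incl. AI-risk research
policy_general = policy_total - policy_to_risk         # policy left as general policy
risk_records = risk_academic + risk_other              # records retrieved for the overlay
risk_materials = risk_records + policy_to_risk         # incl. risk-related policy reports

# Inferred, not stated on the page: risk papers that were not added to Science.
risk_papers_not_added = risk_academic - risk_papers_added

print(science_inclusive, policy_general, risk_records, risk_materials)
print(risk_papers_not_added)
```

Running it reproduces the quoted totals (61,424; 5,502; 26,943; 30,434), which suggests that the counts given in the overview and in the chart captions are mutually consistent.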

Corpus composition by domain

Documents by domain; Science shown inclusive of the AI-risk research literature (light segment)

Reference space hierarchy

L2 clusters and L3 key-phrase nodes per domain (1,930 L3 total after card deduplication)

Policy documents by period (v5 collection)

Domestic (Korean) vs. international policy documents across eight 5-year periods, showing the temporal growth of the policy collection (8,993 total: 3,491 risk-related reports reassigned to Risk materials, 5,502 remaining as general policy)

AI-risk nodes by L1 risk domain

Operational L4 risk nodes in the Global AI Risks overlay (live)

AI-risk evidence by type

Evidence document type backing each risk node (live)

AI-risk nodes by source framework

Standards / repositories cited as a node's source (live; nodes may cite several)

Risk-card quality audit

Non-exclusive curation flags for L4 risk cards: definition depth, direct evidence, and verification status
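
Because the curation flags are non-exclusive, per-flag counts can sum to more than the number of flagged cards. A minimal sketch of that tally, assuming a hypothetical record layout in which each card carries a list of flags; the card ids and flag names below are invented for illustration and are not the project's actual labels:

```python
from collections import Counter

# Hypothetical risk-card records; ids and flag names are illustrative only.
cards = [
    {"id": "card-a", "flags": ["thin_definition", "unverified"]},
    {"id": "card-b", "flags": []},
    {"id": "card-c", "flags": ["unverified"]},
]

# Count each flag once per card that carries it.
flag_counts = Counter(flag for card in cards for flag in card["flags"])

# Count cards that carry at least one flag.
flagged_cards = sum(1 for card in cards if card["flags"])

print(dict(flag_counts))
print(sum(flag_counts.values()), flagged_cards)
```

In this toy data the flag counts total 3 while only 2 cards are flagged, so bars in a chart built this way should not be added together to estimate the number of affected cards.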

Risk-card evidence status

Current evidence-status labels in the published L4 risk-card records (live)

AI-risk literature screened, by risk domain

Broad records retrieved for the risk overlay (26,943 risk records: 26,936 academic papers + 7 reports/sources; risk materials including the 3,491 risk-related policy reports = 30,434)

AI-risk literature, leading affiliation countries

Affiliation-ready records (17,685 with country metadata)

Top-venue seminal AI-risk references

Field-defining / highly-original papers at NeurIPS, ICML, ICLR, EMNLP, ACL, FAccT, IEEE S&P, USENIX Security, CCS — ranked and linked to related L4 risk nodes (live)

Physical AI risk data

Literature, sources, and institutions behind the Physical AI risk taxonomy (155 L4 risk cards organised along the perception → decision → action pipeline). Paper-corpus charts are computed from the relevance-screened Physical AI literature (5,168 records, after removing 54 off-topic records from an initial 5,222 by BGE-M3 semantic-similarity screening); framework and evidence charts are computed live from the published risk cards.
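
The relevance screening described above can be pictured as a similarity threshold over embeddings. The sketch below is an assumption-laden illustration, not the project's pipeline: it uses toy 3-dimensional vectors in place of real BGE-M3 embeddings, and the topic vector, the threshold value, and the `screen` helper are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def screen(records, topic_vec, threshold):
    """Split records into (kept, removed) by similarity to the topic vector."""
    kept, removed = [], []
    for rec in records:
        target = kept if cosine(rec["vec"], topic_vec) >= threshold else removed
        target.append(rec)
    return kept, removed

# Toy stand-ins for embeddings; a real run would embed titles/abstracts with a model.
topic = [1.0, 0.0, 0.0]
records = [
    {"id": "r1", "vec": [0.9, 0.1, 0.0]},  # close to the topic
    {"id": "r2", "vec": [0.0, 1.0, 0.0]},  # orthogonal, treated as off-topic
    {"id": "r3", "vec": [0.7, 0.7, 0.0]},  # moderately related
]

kept, removed = screen(records, topic, threshold=0.5)  # threshold is an assumed value
print([r["id"] for r in kept], [r["id"] for r in removed])
```

In the reported screening, the analogous split left 5,168 records kept and 54 removed out of the initial 5,222.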

Physical AI research collected, by year and type

5,168 relevance-screened OpenAlex records for the Physical AI overlay (1994–2026), segmented by work type

Korea vs. international research

Physical AI papers with affiliation metadata, by year (Korea-affiliated vs. international)

Leading research institutions

Top affiliations among the collected Physical AI papers

Leading affiliation countries

Country of contributing institutions (Physical AI literature)

Standards & frameworks grounding the taxonomy

Institutional sources cited by the 155 Physical AI risk cards (live)

Risk-card evidence by source type

Document type backing each Physical AI risk card: benchmark, survey, framework, standard, report (live)