Research Abstract, February 2026

Institutional Document Provenance Framework

A governance-based reference model for securing the provenance of AI-generated institutional documents

IDPF Research Abstract

Author: Jober Mogele Correa

Role: Chief Governance Officer, WINDI Publishing House

Context: Master's thesis, Hochschule Kempten

Date: February 2026

Protocol: WINDI-SOF-v1 - Secure Origin Framework

Problem Statement

The rapid development of generative AI systems makes it possible to produce institutional documents that are almost indistinguishable from authentic originals in language, layout, and formatting. This phenomenon, referred to here as Document Deepfakes, represents a novel threat category: forgery of institutional identity without any technical intrusion. Classical deepfake detection approaches focus on identifying forged content after the fact and reach their limits when a document is structurally perfect.

Research Gap

While extensive provenance standards already exist for audiovisual media (e.g., C2PA, Content Credentials), no comparable model exists for institutional documents in the public and regulated sectors. In particular, no existing solution combines institutional identity control, cryptographic anchoring of document structure, and governance-tiered verification.

Contribution and Paradigm Shift

The reference model presented here, the Institutional Document Provenance Framework (IDPF), takes a fundamentally different approach: the focus is not on detecting what is false but on securing the provenance of what is authentic. Rather than identifying forgeries after the fact, the framework equips every institutional document at creation with a cryptographically anchored provenance identity, a digital birth certificate that binds structure, context, and governance level inseparably to the document.

Paradigm Shift

Previous logic: "How do we detect forgeries?"

IDPF approach: "How do we create verifiable provenance for authentic documents?"

Architecture

The IDPF is based on four conceptual layers:

Layer	Function	Mechanism
Identity Control	Institutional Authentication	Identity Governance Layer with License Model
Structural Provenance	Cryptographic Provenance Proof	Canonical Hashes, Provenance Records
Verification	Authenticity Confirmation	Three-tier: VALID | UNKNOWN | TAMPERED
Resilience Assessment	Quantified Forgery Resistance	Deepfake Resilience Score (0–100)
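
The Structural Provenance and Verification layers can be illustrated with a minimal Python sketch. It is not the SOF implementation: the function names, the record fields, and the use of sorted-key JSON with SHA-256 as the canonical form are all assumptions made for illustration, since the abstract does not specify them.

```python
import hashlib
import hmac
import json


def canonical_hash(document: dict) -> str:
    """Hash a canonical serialization so that key order does not matter.

    Assumption: sorted-key, compact JSON stands in for whatever
    canonicalization the framework actually uses.
    """
    canonical = json.dumps(
        document, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def issue_record(registry: dict, doc_id: str, document: dict,
                 issuer: str, governance_level: str) -> dict:
    """Create a provenance record at document creation (hypothetical fields)."""
    record = {
        "doc_id": doc_id,
        "issuer": issuer,
        "governance_level": governance_level,
        "hash": canonical_hash(document),
    }
    registry[doc_id] = record
    return record


def verify(registry: dict, doc_id: str, document: dict) -> str:
    """Return one of the three verification states named in the table."""
    record = registry.get(doc_id)
    if record is None:
        return "UNKNOWN"    # no provenance record exists for this identifier
    if hmac.compare_digest(record["hash"], canonical_hash(document)):
        return "VALID"      # content matches the recorded hash
    return "TAMPERED"       # a record exists but the content differs
```

With a record issued for `{"title": "Notice", "body": "Text"}` under the identifier `doc-1`, verifying the same content returns `VALID` regardless of key order, verifying altered content returns `TAMPERED`, and verifying an identifier that was never registered returns `UNKNOWN`.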

The model defines three governance levels (HIGH, MEDIUM, LOW) with graduated security requirements and introduces the Deepfake Resilience Score, a novel quantified metric (0–100) for assessing how resistant an institutional document is to forgery.
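
The abstract does not state how the Deepfake Resilience Score is computed or which requirements apply at each level. The Python sketch below is therefore purely illustrative: the control names, their weights, and the per-level minimum scores are hypothetical values chosen only to show how a 0–100 score and graduated requirements could fit together.

```python
from enum import Enum
from typing import Iterable


class GovernanceLevel(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


# Hypothetical controls and weights (not from the source); weights sum to 100.
CONTROL_WEIGHTS = {
    "issuer_authenticated": 30,
    "canonical_hash_recorded": 30,
    "human_approval_logged": 25,
    "public_verification_available": 15,
}

# Hypothetical minimum score required at each governance level.
MIN_SCORE = {
    GovernanceLevel.HIGH: 85,
    GovernanceLevel.MEDIUM: 60,
    GovernanceLevel.LOW: 30,
}


def resilience_score(controls: Iterable[str]) -> int:
    """Sum the weights of the satisfied controls, giving a value in 0..100."""
    satisfied = set(controls)
    unknown = satisfied - set(CONTROL_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown controls: {sorted(unknown)}")
    return sum(CONTROL_WEIGHTS[name] for name in satisfied)


def meets_level(controls: Iterable[str], level: GovernanceLevel) -> bool:
    """Check whether the satisfied controls reach the level's minimum score."""
    return resilience_score(controls) >= MIN_SCORE[level]
```

Under these assumed weights, a document with only an authenticated issuer and a recorded canonical hash scores 60, which would satisfy MEDIUM but not HIGH. An additive weighting is just one possible design; the actual metric may be defined quite differently.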

Regulatory Relevance

The model directly addresses central requirements of the EU AI Act: record-keeping obligations (Art. 12), human oversight (Art. 14), and transparency obligations for AI-generated content (Art. 50). It thus offers a conceptual framework for organizations in digital administration, the financial sector (BaFin compliance), education, and public communication.

Validation

A prototype reference implementation hosted on German server infrastructure (Secure Origin Framework, SOF v1.0) demonstrates the model's practical feasibility with a complete provenance chain, deterministic hash verification, and tier-based governance. It was validated with 15 automated test cases covering all governance levels.
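
One common way to realize a provenance chain is a hash-linked log in which every entry commits to its predecessor. The Python sketch below assumes that reading; the abstract does not say how SOF v1.0 structures its chain, and the entry fields and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor hash for the first entry


def _digest(body: dict) -> str:
    """Deterministic SHA-256 over a sorted-key, compact JSON serialization."""
    data = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def append_event(chain: list, event: str, doc_hash: str) -> dict:
    """Append an entry that links to the hash of the previous entry."""
    prev_hash = chain[-1]["entry_hash"] if chain else GENESIS
    body = {
        "index": len(chain),
        "event": event,
        "doc_hash": doc_hash,
        "prev_hash": prev_hash,
    }
    entry = {**body, "entry_hash": _digest(body)}
    chain.append(entry)
    return entry


def chain_is_intact(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for index, entry in enumerate(chain):
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["index"] != index or entry["prev_hash"] != prev_hash:
            return False
        if _digest(body) != entry["entry_hash"]:
            return False
        prev_hash = entry["entry_hash"]
    return True
```

Because the digest is computed over a canonical serialization, recomputing it always yields the same value for the same entry, so a later change to any recorded event is detectable by walking the chain from the start.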

Keywords

Document Provenance · Document Deepfakes · Institutional Identity Governance · EU AI Act Compliance · Deepfake Resilience Score · Cryptographic Provenance Securing · Pre-AI Governance Layer

Classification

AI Governance · Information Security · Digital Trust Infrastructure · Regulatory Technology

Human Authorship Notice

This document was created and reviewed by human authors.

AI assistance was used under human supervision and control.

Final decisions and content approval: human responsibility.

"AI processes. Human decides. WINDI guarantees."

WINDI Governance Ledger | Human-authored: 100.0% | Template: IDPF Abstract v1.0 NOIR | Structure: VALID