Digital Library

Title:      TRACEABLE RDF DATA GENERATION FROM GERMAN LEGISLATIVE DOCUMENTS THROUGH KNOWLEDGE GRAPH TRANSFORMATIONS
Author(s):      Alexander Gashkov, Oliver Schmidtke and Andreas Both
ISBN:      978-989-8704-62
Editors:      Paula Miranda and Pedro Isaías
Year:      2024
Edition:      Single
Keywords:      Knowledge Graphs, RDF Data Transformation, Semantic Enrichment, Legislative Documents, German Federal Law
Type:      Full
First Page:      259
Last Page:      266
Language:      English
Paper Abstract:      The current shift toward a data-driven, web-based society is fueled by the data available. However, not just the availability of data matters but also its traceability, e.g., regarding the structure of documents and their semantics. In particular, because normative documents grow and change continuously, transforming them into a machine-readable and machine-interpretable form is crucial for trustworthy data-driven applications. However, the data formats provided (e.g., XML) often have several flaws, in particular missing semantics or limited metadata. For instance, German laws are accessible via a web portal as text and PDF files (for humans) and as XML files (a machine-readable format). However, an XML file downloaded from the site carries limited semantics, e.g., it contains no cross-references between laws or paragraphs. This paper presents an RDF-driven, step-by-step, extensible transformation process in which each transformation step yields a knowledge graph. Hence, each data transformation step can be validated, so that the validity and trustworthiness of the generated data are increased and, in the best case, can be guaranteed. As a use case, we generate a knowledge graph from the German federal laws. The transformation steps include cleaning up the data, adding types, and establishing links between laws and their paragraphs. The contributions of this paper are a process, the dataset representing the RDF data of the German federal law, and additional insights for the particular use case.
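
The idea of a stepwise transformation in which every step yields an inspectable graph can be illustrated with a minimal sketch. It is not the authors' implementation: the namespace `EX`, the predicates `text` and `refersTo`, the `Paragraph` class, and the sample paragraphs are all hypothetical, and a graph is modeled simply as a Python set of (subject, predicate, object) triples rather than with an RDF library.

```python
import re

# Hypothetical namespace and vocabulary, chosen for illustration only.
EX = "http://example.org/law/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def clean(graph):
    """Step 1: collapse runs of whitespace in every object value."""
    return {(s, p, " ".join(o.split())) for s, p, o in graph}


def add_types(graph):
    """Step 2: type every subject that carries text as a Paragraph."""
    typed = {(s, RDF_TYPE, EX + "Paragraph")
             for s, p, _ in graph if p == EX + "text"}
    return graph | typed


def link_references(graph):
    """Step 3: turn textual mentions such as '§ 2' into explicit links."""
    known = {s for s, p, _ in graph if p == EX + "text"}
    links = set()
    for s, p, o in graph:
        if p != EX + "text":
            continue
        for num in re.findall(r"§\s*(\d+)", o):
            target = EX + "par" + num
            if target in known and target != s:
                links.add((s, EX + "refersTo", target))
    return graph | links


def run_pipeline(graph, steps):
    """Apply the steps in order and keep every intermediate graph."""
    snapshots = [("input", frozenset(graph))]
    for step in steps:
        graph = step(graph)
        snapshots.append((step.__name__, frozenset(graph)))
    return snapshots


# Made-up input: two paragraphs, one mentioning the other in its text.
raw = {
    (EX + "par1", EX + "text", "See   § 2 for\n details."),
    (EX + "par2", EX + "text", "Definitions."),
}
snapshots = run_pipeline(raw, [clean, add_types, link_references])
final = snapshots[-1][1]
```

Because `run_pipeline` keeps a snapshot after every step, each intermediate graph can be checked on its own, for example by confirming that a step only added triples and removed none.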
   
