IN5800 – Declarative Data Engineering
This is a wiki for the course IN5800 – Declarative Data Engineering at the Department of Informatics at the University of Oslo.
This wiki gives an introduction to the technologies and techniques taught in the course, as well as useful pointers to guides, tutorials and other relevant web pages.
Lectures
- Intro
- Data structure
- Query languages
- Views, Triggers, Rules and Lore
- Semantics and Reasoning
- OTTR Templates
- Mappings
- Constraints
- Data Transformation
- Saturation
- Integration
- Ontology Engineering
- Data Cleaning and Validation
- Pipelines
Mandatory Exercise
The mandatory exercise can be found here.
Project Work
Information about the project work and presentation can be found here.
Technologies
The technologies used in this course can be organized into the following categories:
- Fundamental (glue code, documentation, version control)
- Relational technologies (relational databases, query languages)
- Semantic technologies (triple stores, semantics, query languages, ontology engineering)
See also the Complete list of technologies.
Techniques
Below is a list of techniques for declarative data engineering taught in the course. We realize these techniques using the technologies described above.
- Data Transformation
- Integration (syntactic and semantic)
- Saturation
- Cleaning
- Validation
- Pipelines
Other Relevant Articles and Resources
Communication
Lists and Collections
- Awesome Semantic Web (Large list of resources by category)
- DB-engines (Knowledge base of databases with comparisons)
Articles and Podcasts
- Data Engineering Podcast (not focused on declarative techniques, though)
- Use the Index, Luke
- A data engineer’s guide to semantic modelling
- We Don’t Need Data Scientists, We Need Data Engineers
- No, you don’t need ML/AI. You need SQL
Data Collections
- Felles datakatalog (Huge list of open Norwegian datasets)