IN5800 – Mappings

Leif Harald Karlsen

Mappings

Use of Mappings

  • Data migration: Map data from one source to another
    • Upgrade DBS, switch DBS, etc.
  • Data integration: Map multiple sources into one target source under common vocabulary
    • More on this later!

Mappings as Rules

Forward and Backwards Mappings

Mapping into RDF

person(id, name, age, works_at) -> ( ex:person{id} rdfs:label {name}^^xsd:string . )

Need to:

Mappings with Lore

-- Given:
CREATE RELATION student(
    id int PRIMARY KEY,
    name text NOT NULL,
    supervisor int REFERENCES employee(id)
);
CREATE RELATION employee(
    id int PRIMARY KEY,
    name text NOT NULL
);

-- and an ontology with the following:
--
-- @prefix ex: <http://example.org> .
--
-- ex:Student a owl:Class .
-- ex:Employee a owl:Class .
-- ex:supervisor a owl:ObjectProperty .

-- Mappings (forward):
IMPORT 'https://leifhka.org/lore/library/base_rdfs.lore';

student(i, n, s) -> ex.Student(qn('ex', 'student' || i)); -- ex:student1, ex:student2, ...
student(i, n, s) -> rdfs.label(qn('ex', 'student' || i), n);

employee(i, n) -> ex.Employee(qn('ex', 'employee' || i));
employee(i, n) -> rdfs.label(qn('ex', 'employee' || i), n);

student(i, n, s) -> ex.supervisor(qn('ex', 'student' || i),
                                  qn('ex', 'employee' || s));

If no ontology is made upfront, add following:

CREATE SCHEMA ex;
prefix('ex', 'http://example.org/');

CREATE RELATION ex.Student(individual text);
triplelore_class('ex.Student');
CREATE RELATION ex.Employee(individual text);
triplelore_class('ex.Employee');
CREATE RELATION ex.supervisor(subject text, object text);
triplelore_property('ex.supervisor', null);

Automated Mappings and Structure

Direct mappings

The direct mapping of relational data to RDF maps:

Direct mapping example

university.employee

 id |    name    |        email        
----+------------+---------------------
  0 | Mary Smith | mary@smith.no
  1 | Carl Green | carl_green@mail.net

university.student

 id |    name     | supervisor 
----+-------------+------------
  0 | Peter Swan  |           
  1 | Helen Brown |          0
  2 | Nathan Case |          1

Resulting triples:

@prefix uni: <http://example.org/university/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/university/student/id=2>
    a                           uni:student ;
    uni:student_id              "2"^^xsd:int ;
    uni:student_name            "Nathan Case" ;
    uni:student_ref_supervisor  <http://example.org/university/employee/id=1> ;
    uni:student_supervisor      "1"^^xsd:int .

<http://example.org/university/employee/id=0>
    a                   uni:employee ;
    uni:employee_email  "mary@smith.no" ;
    uni:employee_id     "0"^^xsd:int ;
    uni:employee_name   "Mary Smith" .

<http://example.org/university/student/id=0>
    a                 uni:student ;
    uni:student_id    "0"^^xsd:int ;
    uni:student_name  "Peter Swan" .

<http://example.org/university/employee/id=1>
    a                   uni:employee ;
    uni:employee_email  "carl_green@mail.net" ;
    uni:employee_id     "1"^^xsd:int ;
    uni:employee_name   "Carl Green" .

<http://example.org/university/student/id=1>
    a                           uni:student ;
    uni:student_id              "1"^^xsd:int ;
    uni:student_name            "Helen Brown" ;
    uni:student_ref_supervisor  <http://example.org/university/employee/id=0> ;
    uni:student_supervisor      "0"^^xsd:int .

Direct Mappings with Lore/Triplelore

Direct Mappings with Lore/Triplelore: Example

IMPORT 'http://leifhka.org/lore/library/direct_mappings.lore';

-- Schema, relation/view name, base-IRI, prefix, forward-rules
mappings.direct_mapping('university', 'employee', 'http://example.org/', 'uni', true);
mappings.direct_mapping('university', 'student', 'http://example.org/', 'uni', true);

If university.employee and university.student are Lore-relations or views:

IMPORT 'http://leifhka.org/lore/library/direct_mappings.lore';

mappings.direct_mapping('university', 'employee', 'http://example.org/', 'uni', true);
mappings.direct_mapping('university', 'student', 'http://example.org/', 'uni', true);

-- Schema, relation/view name, column
-- (use multiple lines for multi-column PK)
mappings.primary_key('university', 'employee', 'id');
mappings.primary_key('university', 'student', 'id');

-- Constraint name (make something up) but use same value for multi-column FK,
-- schema, relation/view name, fk-column, ref'ed schema and relation/view/table name, ref'ed column
mappings.foreign_key(1, 'university', 'student', 'supervisor', 'university', 'employee', 'id');

R2RML

R2RML Example

Lore:

IMPORT 'https://leifhka.org/lore/library/base_rdfs.lore';

prefix('ex', 'http://example.org/');

ex.Student(qn('ex', 'student' || i)) <- student(i, n, s);
rdfs.label(qn('ex', 'student' || i), n) <- student(i, n, s);

ex.Employee(qn('ex', 'employee' || i)) <- employee(i, n);
rdfs.label(qn('ex', 'employee' || i), n) <- employee(i, n);

ex.supervisor(qn('ex', 'student' || i), qn('ex', 'employee' || s))
    <- student(i, n, s);

R2RML:

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.org/>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<#StudentMap>
  rr:logicalTable [ rr:tableName "student" ];
  rr:subjectMap [
    rr:template "http://example.org/student{id}";
    rr:class ex:Student;
  ];
  rr:predicateObjectMap [
    rr:predicate rdfs:label;
    rr:objectMap [
        rr:column "name" ;
        rr:termType rr:Literal ;
        rr:datatype xsd:string
    ];
  ].
    
<#EmployeeMap>
  rr:logicalTable [ rr:tableName "employee" ];
  rr:subjectMap [
    rr:template "http://example.org/employee{id}";
    rr:class ex:Employee;
  ];
  rr:predicateObjectMap [
    rr:predicate rdfs:label;
    rr:objectMap [
        rr:column "name" ;
        rr:termType rr:Literal ;
        rr:datatype xsd:string
    ];
  ].
  
<#StudentMap>
  rr:predicateObjectMap [
    rr:predicate ex:supervisor;
    rr:objectMap [
      rr:parentTriplesMap <#EmployeeMap>;
      rr:joinCondition [ 
        rr:child "supervisor";
        rr:parent "id";
      ];
   ];
 ].

R2RML: Overview

Mappings consists of

  1. a logical table of data to be mapped
    • either a table/view name (rr:tableName);
    • or an SQL-query (rr:sqlQuery)
  2. a subject map (rr:subjectMap)
    • optionally a class as type (rr:class)
  3. one or more predicate-object-maps (rr:predicateObjectMap)
    • containing a predicate map (rr:predicateMap)
    • and one or more object-maps (rr:objectMap)
  • Each value in the triple can be made with:
    • rr:template: IRI-template containing column(s) from logical table
    • rr:column: give column directly (can then be typed with rr:termType)
    • rr:constant: give constant directly
      • x rr:predicate y short for x rr:predicateMap [ rr:constant y ]
      • Similarly for rr:subject and rr:object
  • Can be given a
    • rr:termType: One of rr:IRI, rr:Literal, rr:BlankNode
    • rr:datatype: E.g. xsd:string, xsd:int, etc.
  • One map can refer to another, and specify joins
    • rr:parentTriplesMap: states which map to refer to
    • rr:joinCondition: states which columns should be joined
    • output triples then have same value based on join

R2RML:

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.org/>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<#StudentMap>
  rr:logicalTable [ rr:tableName "student" ];
  rr:subjectMap [
    rr:template "http://example.org/student{id}";
    rr:class ex:Student;
  ];
  rr:predicateObjectMap [
    rr:predicate rdfs:label;
    rr:objectMap [
        rr:column "name" ;
        rr:termType rr:Literal ;
        rr:datatype xsd:string
    ];
  ].
    
<#EmployeeMap>
  rr:logicalTable [ rr:tableName "employee" ];
  rr:subjectMap [
    rr:template "http://example.org/employee{id}";
    rr:class ex:Employee;
  ];
  rr:predicateObjectMap [
    rr:predicate rdfs:label;
    rr:objectMap [
        rr:column "name" ;
        rr:termType rr:Literal ;
        rr:datatype xsd:string
    ];
  ].
  
<#StudentMap>
  rr:predicateObjectMap [
    rr:predicate ex:supervisor;
    rr:objectMap [
      rr:parentTriplesMap <#EmployeeMap>;
      rr:joinCondition [ 
        rr:child "supervisor";
        rr:parent "id";
      ];
   ];
 ].

Ontop VKG

R2RML:

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.org/>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<#StudentMap>
  rr:logicalTable [ rr:tableName "student" ];
  rr:subjectMap [
    rr:template "http://example.org/student{id}";
    rr:class ex:Student;
  ];
  rr:predicateObjectMap [
    rr:predicate rdfs:label;
    rr:objectMap [
        rr:column "name" ;
        rr:termType rr:Literal ;
        rr:datatype xsd:string
    ];
  ].
    
<#EmployeeMap>
  rr:logicalTable [ rr:tableName "employee" ];
  rr:subjectMap [
    rr:template "http://example.org/employee{id}";
    rr:class ex:Employee;
  ];
  rr:predicateObjectMap [
    rr:predicate rdfs:label;
    rr:objectMap [
        rr:column "name" ;
        rr:termType rr:Literal ;
        rr:datatype xsd:string
    ];
  ].
  
<#StudentMap>
  rr:predicateObjectMap [
    rr:predicate ex:supervisor;
    rr:objectMap [
      rr:parentTriplesMap <#EmployeeMap>;
      rr:joinCondition [ 
        rr:child "supervisor";
        rr:parent "id";
      ];
   ];
 ].

Ontop mappings:

[PrefixDeclaration]
ex:     http://example.org/
rdfs:   http://www.w3.org/2000/01/rdf-schema#
xsd:    http://www.w3.org/2001/XMLSchema#

[MappingDeclaration] @collection [[

mappingId   studentMap
target      ex:student{id} a ex:Student ; rdfs:label {name}^^xsd:string . 
source      SELECT * FROM student

mappingId   employeeMap
target      ex:employee{id} a ex:Employee ; rdfs:label {name}^^xsd:string . 
source      SELECT * FROM employee

mappingId   supervisorMap
target      ex:student{id} ex:supervisor ex:employee{supervisor} . 
source      SELECT * FROM student

]]

OTTR and Mappings

  • A template is in a way a mapping
  • Instances are tabular in structure
  • Result of expansion is RDF
  • However, OTTR also have mapping languages for mapping from different structures into instances
    • (which can then be expanded into RDF)
  • Can map relational databases, RDF graphs, CSV and Excel spreadsheets into OTTR instances
  • Only forward-mappings (i.e. materialized)

bOTTR

  • Mapping language for mapping relational tables and RDF graphs into OTTR template instances
  • Can also map CSV by using functions in (H2’s) SQL for reading CSV
  • Planned XML and JSON support
  • Uses RDF to specify the mapping (just like R2RML)

bOTTR Mappings: Relational Databases

bOTTR Example: People

Relational data

university.employee

 id |    name    |        email        
----+------------+---------------------
  0 | Mary Smith | mary@smith.no
  1 | Carl Green | carl_green@mail.net

university.student

 id |    name     | supervisor 
----+-------------+------------
  0 | Peter Swan  |           
  1 | Helen Brown |          0
  2 | Nathan Case |          1

Templates:

ex-t:Student[
    ottr:IRI ?student,
    xsd:string ?name,
    ? ottr:IRI ?supervisor
] :: {
    o-rdf:Type(?student, ex:Student),
    o-rdfs:Label(?student, ?name),
    ottr:Triple(?student, ex:supervisor, ?supervisor)
} .

ex-t:Employee[
    ottr:IRI ?employee,
    xsd:string ?name
] :: {
    o-rdf:Type(?employee, ex:Employee),
    o-rdfs:Label(?employee, ?name)
} .

bottr-map:

ex-m:Student a :InstanceMap ;
  :template ex-t:Student ;                                    
  :query """SELECT
                'http://example.org/student' || id,           
                name,
                'http://example.org/employee' || supervisor   
            FROM university.student;""" ;                     
  :argumentMaps (                                             
    [ :type :IRI ]
    [ :type xsd:string ]
    [ :type :IRI ]
    ) ;
  :source
    [ a :JDBCSource ;                                         
      :sourceURL "jdbc:postgresql://localhost/in5800" ;
      :jdbcDriver "org.postgresql.Driver" ; # Note: Need to include driver in classpath
      :username "test" ;
      :password "test" ] .
      
      
ex-m:Employee a :InstanceMap ;
  :template ex-t:Employee ;
  :query """SELECT
                'http://example.org/employee' || id,
                name
            FROM university.employee;""" ;
  :argumentMaps (
    [ :type :IRI ]
    [ :type xsd:string ]
    ) ;
  :source
    [ a :JDBCSource ;
      :sourceURL "jdbc:postgresql://localhost/in5800" ;
      :jdbcDriver "org.postgresql.Driver" ;
      :username "test" ;
      :password "test" ] .

Download bOTTR-map and template library here.

Result:

ex:employee0  rdf:type  ex:Employee ;
        rdfs:label  "Mary Smith" .

ex:employee1  rdf:type  ex:Employee ;
        rdfs:label  "Carl Green" .

ex:student0  rdf:type  ex:Student ;
        rdfs:label  "Peter Swan" .

ex:student1  ex:supervisor  ex:employee0 ;
        rdf:type       ex:Student ;
        rdfs:label     "Helen Brown" .

ex:student2  ex:supervisor  ex:employee1 ;
        rdf:type       ex:Student ;
        rdfs:label     "Peter Swan" .

bOTTR Mappings: CSV Files

bOTTR Example: Planets

CSV-file with lots of real planets from The Extrasolar Planets Encyclopaedia!

Can map (part of) it to RDF with this bOTTR-map and this template library.

tabOTTR

  • tabOTTR Excel-spreadsheets into RDF
  • Mapping added into the spreadsheet, before the data
  • Maps rows to template instances
  • Has commands for
    • defining prefixes
    • stating which template to map data into
    • which columns should correspond to which parameter of the template
    • specifying the type (IRI, stirng, integer, etc.) of the column

tabOTTR Example: More Planets

Excel-file with lots of more real planets from NASA Exoplanet Archive!

Can map it to RDF by adding a tabOTTR preamble like this and the same library as above.

Other direct and specific mappings

Foreign data wrappers

Foreign Data Wrappers: PostgreSQL Example

Customer-DB:

CREATE TABLE customer (
    email text PRIMARY KEY,
    name text,
    city text,
    country text
);

INSERT INTO customer VALUES
('erina22@somemail,com', 'Erin Amone', 'Oslo', 'Norway'),
('aliden@fmail,com', 'Ali Dent', 'Moss', 'Norway'),
('pennyla@mail,com', 'Penny Larsen', 'Aarhus', 'Denmark');
-- ...

Survey CSV:

sid, done_at, customer_email, happiness_score 
1, 2023-01-02, supa_user@mail.com, 4
2, 2023-01-05, erina22@somemail.com, 2
3, 2023-02-03, aliden@fmail.org, 1
4, 2022-12-17, peterin@mail.net, 6
...

Foreign data wrappers:

-- Create the foreign data wrapper extension for other PostgreSQL-DBs:
-- Documentation: https://www.postgresql.org/docs/14/postgres-fdw.html
-- (this is already done for you in your IFI-databases)
CREATE EXTENSION postgres_fwd;

-- Then create a reference to the other database.
-- Lets assume "customer_db" is on host "123.4.5.6"
-- and port 5432:
CREATE SERVER customer_db_server
FOREGIGN DATA WRAPPER postgres_fdw
OPTIONS (host '123.4.5.6', port '5432', dbname 'customer_db');

-- Now need to create a user-mapping to gain access
-- to the other_db:
-- Lets assume we can use user with username 'cuser' and password 'pwd123':
CREATE USER MAPPING FOR master_db
SERVER customer_db_server
OPTIONS (user 'cuser', password 'pwd123');

-- Create schema to map foreign source into
CREATE SCHEMA customer; -- Could also be named something different

-- Map the foreign source's public-schema into the created schema
IMPORT FOREIGN SCHEMA public
FROM SERVER customer_db_server
INTO customer;

-- Can now run queries over customer-db as if it were
-- a normal schame in this database.

-- Now want to include a CSV-file, use the file FDW:
-- Documentation: https://www.postgresql.org/docs/14/file-fdw.html
CREATE EXTENSION file_fwd;

-- Create a server using the file_fwd
CREATE SERVER survey FOREIGN DATA WRAPPER file_fdw;

-- Create a table from the CSV-file surveys/user_survey.csv
CREATE FOREIGN TABLE survey (
    sid int,
    done_at date,
    customer_email text,
    happiness_score int
)
SERVER survey
OPTIONS ( filename 'surveys/user_survey.csv', format 'csv');

-- Finally, maybe we want to know more about the city population of unhappy customers
-- Include data from a SPARQL-query over DBPedia via the sparql_fdw
-- Documentation: https://github.com/sjstoelting/sparql_fdw
-- Installation is done via multicorn, see above link for more info
CREATE EXTENSION multicorn;

CREATE SERVER dbpedia FOREIGN DATA WRAPPER multicorn
OPTIONS ( wrapper 'sparqlfdw.SparqlForeignDataWrapper', endpoint 'http://dbpedia.org/sparql' );

CREATE FOREIGN TABLE city_data ( city text, population int )
SERVER dbpedia OPTIONS ( sparql '
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?label ?population
WHERE {
  ?city a dbo:City ;
        rdfs:label ?label ;
        dbo:populationTotal ?population .
}');

-- Can now query for population of all unhappy customers:
SELECT ci.population
FROM customer.customer AS cu
     JOIN survey AS s ON (cu.email = s.customer_email)
     JOIN city_data AS ci USING (city)
WHERE s.happiness_score <= 3;