Solution to Exercises on Saturation

Exercise 1: Who Did It?

Clue 1: From this, we cannot deduce more than that the two thieves are listed somewhere in the urbania.citizen-table.

Clue 2: To use this clue, we first need to find the initials to all citizens. This can be done by simply replacing all lower-case and space characters in the names of citizens with nothing, that is regexp_replace(name, '[a-z\s]+', '', 'g'). To make it easy to use this information later on, we will make a new VIEW containing all information about each citizen together with the initials column as follows:

CREATE VIEW citizen_i(cid, urb_id, name, initials, apartment_number, apartment_building, works_at) AS
SELECT cid, urb_id, name,
    regexp_replace(name, '[a-z\s]+', '', 'g') AS initials,
    apartment_number, apartment_building, works_at
FROM urbania.citizen;

We now know that one of the theives are amongst the following:

SELECT *
FROM citizen_i
WHERE initials = 'DW' OR initials = 'YS';

Clue 3: To use the information in this clue, we need to know the size ('large' or 'small') for each building. This is only a matter of grouping and aggregating

CREATE VIEW citizen_ibs(cid, urb_id, name, initials, apartment_number, apartment_building, works_at, size) AS
WITH
    building_occurences(building) AS ( -- Each occurence of building ID is one worker or inhabitant
        SELECT apartment_building
        FROM urbania.citizen
        UNION ALL
        SELECT offices_in
        FROM urbania.organization AS o
             JOIN urbania.citizen AS c ON (o.oid = c.works_at)
        WHERE offices_in IS NOT NULL 
    ),
    bcount AS (
       SELECT building, count(*) AS nr
       FROM building_occurences
       GROUP BY building
    )
SELECT ci.*, -- All columns from citizen_i
    CASE WHEN nr > 50 THEN
        'large'
    ELSE
        'small'
    END AS size
FROM bcount AS bs
     JOIN citizen_i AS ci ON (bs.building = ci.apartment_building);

We can now find out who the initials on the score-card belongs to, by combining the information that they are living together in a small building, and has initials on the score card, as follows:

SELECT c1.name AS suspect1, c2.name AS suspect2
FROM citizen_ibs AS c1 -- The two
     JOIN citizen_ibs AS c2 USING (apartment_number, apartment_building)
WHERE c1.initials = 'DW' AND
      c2.initials = 'YS' AND
      c1.size = 'small';

giving the result:

   suspect1   | suspect2  
--------------+-----------
 Derek Winter | Yali Stad
(1 row)

One of these is thus one of the two thieves. We still need to determine which of these it is, and who the other thief is.

Clue 4: To find out who is in family (i.e. have the same surename), we need to extract the surename. This can be done with a simple splitting of the name on space, and then picking the second element as follows (regexp_split_to_array(name, ' '))[2]. We then make a VIEW containing all pairs of citizen’s IDs if they have the same surename via a simple self-join as follows:

CREATE VIEW in_family_with(cid1, cid2) AS
WITH
  surenames AS (
      SELECT cid, (regexp_split_to_array(name, ' '))[2] AS surename
      FROM urbania.citizen
  )
SELECT s1.cid, s2.cid
FROM surenames AS s1
     JOIN surenames AS s2 USING (surename)
WHERE s1.cid != s2.cid;

We can now combine all of the information from the clues into one (large) query, to obtain who the two thieves are, as follows:

WITH
  suspect_pair AS ( -- Pair found after clue 3
      SELECT c1.cid AS s1_cid, c1.name AS s1_name, c1.works_at AS s1_works_at,
             c2.cid AS s2_cid, c2.name AS s2_name, c2.works_at AS s2_works_at
      FROM citizen_ibs AS c1
           JOIN citizen_ibs AS c2 USING (apartment_number, apartment_building)
      where c1.initials = 'DW' AND
            c2.initials = 'YS' AND
            c1.size = 'small'
  )
SELECT sp.s1_name AS thief1, c3.name AS thief2
FROM suspect_pair AS sp
     JOIN in_family_with AS ifw ON (sp.s1_cid = ifw.cid1)
     JOIN citizen_ibs AS c3 ON (ifw.cid2 = c3.cid AND sp.s1_works_at = c3.works_at)
WHERE c3.size = 'large'
UNION
SELECT sp.s2_name AS thief1, c3.name AS thief2
FROM suspect_pair AS sp
     JOIN in_family_with AS ifw ON (sp.s2_cid = ifw.cid1)
     JOIN citizen_ibs AS c3 ON (ifw.cid2 = c3.cid AND sp.s2_works_at = c3.works_at)
WHERE c3.size = 'large';

giving as result the two thieves:

    thief1    |   thief2   
--------------+------------
 Derek Winter | Che Winter
(1 row)

Note that we do not know which of the two persons we found at the end of last clue is the thief, therefore we make a UNION of two queries that each assumes it is one of them. Luckily, one one of them have a coworker that is also a family member that lives in a large building :)

Exercise 2: Finding the Scroll

We start by extracting the streetname and numbers into seperate columns in the suburbia.home table as follows:

ALTER TABLE suburbia.home ADD COLUMN streetname text;
ALTER TABLE suburbia.home ADD COLUMN streetnumber int;

UPDATE suburbia.home
SET streetname = btrim(regexp_replace(home_address, '\d+', '')),
    streetnumber = btrim(regexp_replace(home_address, '[A-Za-z\s]+', ''))::int;

Here we get the streetname by removing all numbers ('\d+') from the address as well as all whitespace in the beginning and end. We get the streetnumber by removing all whitespace and letters.

We then directly map this into the triplestore just like in the oblig.

The Lore-relations and rules in semantics/directions.lore are as follows:

IMPORT 'https://leifhka.org/lore/library/prefix.lore';

-- Make relations for the raws:homeof-property
CREATE RELATION raws.homeof(subject text, object text);
triplelore_property('raws.homeof', qn('xsd', 'string'));

-- Make the ex-prefix
prefix('exd', 'http://example.org/directional/');
CREATE SCHEMA IF NOT EXISTS exd;

-- Make relations for the directional relationships
CREATE RELATION exd.northof(subject text, object text);
CREATE RELATION exd.southof(subject text, object text);
CREATE RELATION exd.eastof(subject text, object text);
CREATE RELATION exd.westof(subject text, object text);
CREATE RELATION exd.northeastof(subject text, object text);
CREATE RELATION exd.northwestof(subject text, object text);
CREATE RELATION exd.southeastof(subject text, object text);
CREATE RELATION exd.southwestof(subject text, object text);
-- Sparqlify mappings
triplelore_property('exd.northof', NULL);
triplelore_property('exd.southof', NULL);
triplelore_property('exd.westof', NULL);
triplelore_property('exd.eastof', NULL);
triplelore_property('exd.northeastof', NULL);
triplelore_property('exd.northwestof', NULL);
triplelore_property('exd.southeastof', NULL);
triplelore_property('exd.southwestof', NULL);

-- Extract the street name and number and homeof-relation
raws.citizen_ref_home(c, h), raws.citizen_urb_id(c, u)
    -> raws.homeof(h, u);

-- Directional rules for inferring the directional relations
raws.home_streetnumber(h1, n1), raws.home_streetnumber(h2, n2) : n1 < n2
    -> exd.westof(h1, h2);

raws.home_zip_code(h1, z1), raws.home_zip_code(h2, z2) : z1 < z2
    -> exd.northof(h1, h2);

raws.home_streetname(h1, n1), raws.home_zip_code(h1, z),
raws.home_streetname(h2, n2), raws.home_zip_code(h2, z) : n1 < n2
    -> exd.northof(h1, h2);

exd.westof(x, y), exd.westof(y, z) -> exd.westof(x, z);
exd.northof(x, y), exd.northof(y, z) -> exd.northof(x, z);

exd.westof(x, y) -> exd.eastof(y, x);
exd.northof(x, y) -> exd.southof(y, x);

exd.northof(x, y), exd.eastof(x, y) -> exd.northeastof(x, y);
exd.southof(x, y), exd.eastof(x, y) -> exd.southeastof(x, y);
exd.northof(x, y), exd.westof(x, y) -> exd.northwestof(x, y);
exd.southof(x, y), exd.westof(x, y) -> exd.southwestof(x, y);

This can either be executed directly with Lore, or add a rule to your Makefile:

directions:
    $(lore) semantics/directions.lore

and run make directions.

The query in queries/find_address.sparql is:

PREFIX exd: <http://example.org/directional/> 
PREFIX raws: <http://leifhka.org/in5800/urb/raw_data/suburbia/> 

SELECT ?address
WHERE {
    ?bang exd:southof     [ raws:homeof "s147" ] ;
          exd:westof      [ raws:homeof "s101" ] ;
          exd:northwestof [ raws:homeof "s16" ] ;
          exd:northof     [ raws:homeof "s309" ] ;
          exd:northeastof [ raws:homeof "s77" ] ;
          exd:southeastof [ raws:homeof "s252" ] ;
          exd:eastof      [ raws:homeof "s58" ] .
    ?bang raws:home_home_address ?address .
}

Giving the result:

  address   
------------
Bird alley 7

Looking more closely at the data, we see that this is in fact the address of Jan Winter, related to the other two thieves!

Bonus Exercise: Translate the Scroll (Difficult!)

The scroll is actually written using the “Rövarspråket” that replaces each consonant with the consonant followed by an ‘o’ followed by the consonant again. E.g. urb becomes urorbob.

So doing the translation manually gives the following text:

the treasure is burried
under the sea bottom 100
meters south of the
docks in the city of urb

However, the more tricky part is automating this translation with SQL. The idea is to first make a table matching each consonant to its translation, i.e. b to bob, c to coc, etc. This can be done with the following query:

WITH 
  cons(c) AS (
    VALUES ('b'), ('c'), ('d'), ('f'),
           ('g'), ('h'), ('j'), ('k'),
           ('l'), ('m'), ('n'), ('p'),
           ('q'), ('r'), ('s'), ('t'),
           ('v'), ('w'), ('x'), ('z')
  )
SELECT (row_number() OVER ())::int AS nr, c || 'o' || c AS from_c, c as to_c
FROM cons;

resulting in the following result:

 nr | from_c | to_c 
----+--------+------
  1 | bob    | b
  2 | coc    | c
  3 | dod    | d
  4 | fof    | f
  5 | gog    | g
  6 | hoh    | h
  7 | joj    | j
  8 | kok    | k
  9 | lol    | l
 10 | mom    | m
 11 | non    | n
 12 | pop    | p
 13 | qoq    | q
 14 | ror    | r
 15 | sos    | s
 16 | tot    | t
 17 | vov    | v
 18 | wow    | w
 19 | xox    | x
 20 | zoz    | z

Then, we iterate over the string replacing each bob with b, then each coc with c, and so on. A single replacement can simply be done with regexp_replace(s, from_c, to_c, 'g'). To make all replacements, we need to iterate over the result of the previous result. This can be done using recursive queries as follows:

WITH RECURSIVE
  cons(c) AS (
    VALUES ('b'), ('c'), ('d'), ('f'),
           ('g'), ('h'), ('j'), ('k'),
           ('l'), ('m'), ('n'), ('p'),
           ('q'), ('r'), ('s'), ('t'),
           ('v'), ('w'), ('x'), ('z')
  ),
  tcons AS (
    SELECT (row_number() OVER ())::int AS nr, c || 'o' || c AS from_c, c as to_c FROM cons
  ),
  trans(nr, s) AS (
    SELECT 1 AS nr,
           'tothohe totroreasosurore isos boburorroriedod unondoderor tothohe sosea bobotottotomom 100 mometoterorsos sosoutothoh ofof tothohe dodocockoksos inon tothohe cocitoty ofof urorbob' AS s
    UNION ALL
    SELECT nr+1, regexp_replace(s, from_c, to_c, 'g') AS s
    FROM trans JOIN tcons USING (nr)
  )
SELECT s
FROM trans
ORDER BY nr DESC
LIMIT 1;

Note that we number each line according to how many iterations we have, i.e. how many letters we have replaced/translated. We can then finally order by this number, and pick the largest giving the final result. Without the LIMIT 1, the result would be:

                                                                                          s                                                                                          
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 the treasure is burried under the sea bottom 100 meters south of the docks in the city of urb
 the treasure is burried under the sea bottom 100 meters south of the docks in the city of urb
 the treasure is burried under the sea bottom 100 meters south of the docks in the city of urb
 the treasure is burried under the sea bottom 100 meters south of the docks in the city of urb
 the treasure is burried under the sea bottom 100 meters south of the docks in the city of urb
 tothe totreasure is burried under tothe sea botottotom 100 metoters soutoth of tothe docks in tothe citoty of urb
 tothe totreasosure isos burried under tothe sosea botottotom 100 metotersos sosoutoth of tothe docksos in tothe citoty of urb
 tothe totroreasosurore isos burorroried underor tothe sosea botottotom 100 metoterorsos sosoutoth of tothe docksos in tothe citoty of urorb
 tothe totroreasosurore isos burorroried underor tothe sosea botottotom 100 metoterorsos sosoutoth of tothe docksos in tothe citoty of urorb
 tothe totroreasosurore isos burorroried underor tothe sosea botottotom 100 metoterorsos sosoutoth of tothe docksos in tothe citoty of urorb
 tothe totroreasosurore isos burorroried unonderor tothe sosea botottotom 100 metoterorsos sosoutoth of tothe docksos inon tothe citoty of urorb
 tothe totroreasosurore isos burorroried unonderor tothe sosea botottotomom 100 mometoterorsos sosoutoth of tothe docksos inon tothe citoty of urorb
 tothe totroreasosurore isos burorroried unonderor tothe sosea botottotomom 100 mometoterorsos sosoutoth of tothe docksos inon tothe citoty of urorb
 tothe totroreasosurore isos burorroried unonderor tothe sosea botottotomom 100 mometoterorsos sosoutoth of tothe dockoksos inon tothe citoty of urorb
 tothe totroreasosurore isos burorroried unonderor tothe sosea botottotomom 100 mometoterorsos sosoutoth of tothe dockoksos inon tothe citoty of urorb
 tothohe totroreasosurore isos burorroried unonderor tothohe sosea botottotomom 100 mometoterorsos sosoutothoh of tothohe dockoksos inon tothohe citoty of urorb
 tothohe totroreasosurore isos burorroried unonderor tothohe sosea botottotomom 100 mometoterorsos sosoutothoh of tothohe dockoksos inon tothohe citoty of urorb
 tothohe totroreasosurore isos burorroried unonderor tothohe sosea botottotomom 100 mometoterorsos sosoutothoh ofof tothohe dockoksos inon tothohe citoty ofof urorb
 tothohe totroreasosurore isos burorroriedod unondoderor tothohe sosea botottotomom 100 mometoterorsos sosoutothoh ofof tothohe dodockoksos inon tothohe citoty ofof urorb
 tothohe totroreasosurore isos burorroriedod unondoderor tothohe sosea botottotomom 100 mometoterorsos sosoutothoh ofof tothohe dodocockoksos inon tothohe cocitoty ofof urorb
 tothohe totroreasosurore isos boburorroriedod unondoderor tothohe sosea bobotottotomom 100 mometoterorsos sosoutothoh ofof tothohe dodocockoksos inon tothohe cocitoty ofof urorbob
(21 rows)