Lore

Lore (Logical Relations) is a small extension to SQL (PostgreSQL) that adds the notion of logical relations and implications. Lore simply translates a Lore-file using these extensions to standard PostgreSQL. Also, any valid PostgreSQL-statement (or query) is a valid Lore-statement (or query) and behaves the same (normal SQL-statements are executed unaltered over the database).

Lore adds RELATIONs, IMPLICATIONs and IMPORT-statements. In addition, Lore provides a simpler (Datalog-like) syntax for specifying implications, assertions and simple queries.

The newest release can be downloaded here. The source code is published on Gitlab under GPLv2, written in Scala, and available here. For usage and examples of the lore.jar-program, simply execute

java -jar lore.jar --help

There is also a triplestore with an ontology reasoner built on top of Lore called TripleLore.

Example

IMPORT 'http://example.org/family-schema.lore'; -- Import statements from other Lore-file (local or online)

CREATE RELATION fam.person(id int PRIMARY KEY, pname text);
CREATE RELATION fam.mother(person int, mother int);
CREATE RELATION fam.father(person int, father int);
CREATE RELATION fam.parent(person int, parent int);
CREATE RELATION fam.ancestor(person int, ancestor int);

CREATE FORWARD IMPLICATION fam.parent AS -- Forward chaining implication
SELECT * FROM fam.mother m;              -- using SQL-like syntax 

fam.father(x, y) -> fam.parent(x, y); -- Forward chaining implication
                                      -- using Datalog-like syntax

CREATE IMPLICATION fam.ancestor AS -- Backward chaining implication
SELECT * FROM fam.parent p;       -- using SQL-like syntax 

fam.ancestor(x, z) <- fam.parent(x, y), fam.ancestor(y, z); -- Backward chaining implication
                                                            -- using Datalog-like syntax
fam.person(1, 'Ole'); -- Datalog-like INSERT-statement
fam.person(2, 'Mary');
fam.person(3, 'Karl');
fam.person(4, 'Ida');
fam.person(5, 'Bob');
fam.person(6, 'Hannah');

fam.mother(1, 2);
fam.father(1, 3);
fam.father(4, 1);

fam.ancestor(1, 6); -- Can also insert into any relation, even if (partly) defined with implications

Supported features

Lore is a pure extension to PostgreSQL, so any legal PostgreSQL query and command is a legal Lore query and command. Lore adds the following new elements, detailed in each section below:

Relations
Implications
Imports
Datalog-like rules

Each section gives an overview of the feature with examples and high level description, followed by asection on the allowed syntax and notes on the supported features.

For a formal specification of the syntax, see the corresponding Antlr4-files in the source-code.

Relation

A relation is a generalization of both a regular database table and a view. A relation is created using the keywords CREATE RELATION, and is otherwise similar as CREATE TABLE. E.g.:

CREATE RELATION university.student(sid int PRIMARY KEY, pname text NOT NULL);

One can INSERT into a RELATION just like a table, but one can also define a relation in terms of other tables, views or relations via implications (see below).

There is one difference between a relation and tables/views, and that is that tuples in relations are unique, and multiple inserts of the same tuple will be ignored. Relations are also automatically indexed, with one index (typically btree, but gist for columns of geometrical types) per column.

The above statement will be rewritten to the following SQL-statements:

CREATE TABLE IF NOT EXISTS university.student (sid int PRIMARY KEY, pname text NOT NULL);
ALTER TABLE university.student RENAME TO student_stored;
ALTER TABLE university.student_stored ADD UNIQUE (sid, pname);

CREATE INDEX ON university.student_stored USING btree (pname);

CREATE VIEW university.student_virtual AS SELECT * FROM university.student_stored LIMIT 0;

CREATE VIEW university.student AS
SELECT * FROM university.student_stored
UNION ALL
SELECT * FROM university.student_virtual;

CREATE RULE university_student_insert_rule AS
ON INSERT TO university.student DO INSTEAD 
INSERT INTO university.student_stored (SELECT NEW.sid, NEW.pname)
ON CONFLICT (sid, pname) DO NOTHING;

Note that an existing regular table can “upgraded” to a relation by simply creating (i.e. CREATE RELATION) it as a relation.

Note that relations should not have foreign keys to other relations. More on the reason and solution to this in the section Foreign Keys below.

Syntax

CREATE RELATION [FROM TABLE] <relname>(<columns>);
DROP RELATION [IF EXISTS] [TO TABLE] <relname> [CASCADE];

where

<name> is the name of the relation, potentially prefixed with a schema (e.g. ex.person)
<columns> is a normal column declaration list (e.g. id int, name text, PRIMARY KEY (id))

Items listen in square braces are optional.

Notes on supported features

The keywords IF EXISTS and CASCADE behave the same as for normal PostgreSQL.

If a CREATE RELATION-command contains the keywords FROM TABLE, it transformes an already existing (regular) table into a relation. The old table becomes the table-part (i.e. the stored part suffixed with _stored) of the relation. The command still requires the columns-list, and these needs to be equal to the table’s columns. This command fails if the table does not exist.

If a DROP RELATION-command contains the keywords TO TABLE, it transformes a relation into a regular table, that is, it drops the two VIEWs and all implications created on the relation, and removes the suffix _stored. The command does not alter any data stored in the table.

Implication

An implication is similar to a view definitions, except that it functions just like an implication, i.e. it does not specify the full extent of the target relation and there can be possibly many implications to the same relation. The final relation consists of all tuples resulting from all implications and all inserts into it. An implication is created with the CREATE IMPLICATION-keywords, and is otherwise similar to CREATE VIEW, with a few exceptions (see below). E.g.:

CREATE IMPLICATION university.student AS
SELECT p.pid, p.pname
FROM person AS p
     JOIN takes_course AS t ON (p.pid = t.attendee);

The above statement will be rewritten to the following SQL-statements:

CREATE OR REPLACE VIEW university.student_virtual(sid, pname) AS
WITH RECURSIVE
  tmp_subquery(sid, pname) AS (
    SELECT p.pid, p.pname
    FROM person AS p
         JOIN takes_course AS t ON (p.pid = t.attendee)
  )
SELECT * FROM tmp_subquery;

If there are more than one implications, the union of these will be placed in the query tmp_subquery. Note also that if one of the queries are recursive, the recursive “call” will be replaced with tmp_subquery. In adition, the same (recurive) query will be added to the union with the recurisve “call” replaced with the name of the (stored) table, so that these are also included.

For example, if we have the following two implications definint the ancestor-relation:

CREATE RELATION parent(p1 int, p2 int);
CREATE RELATION ancestor(p1 int, p2 int);

CREATE IMPLICATION ancestor AS
SELECT * FROM parent;

CREATE IMPLICATION ancestor AS
SELECT p.p1, a.p2
FROM parent AS p
     JOIN ancestor AS a ON p.p2 = a.p1;

the implications will become the following VIEW-definition:

CREATE OR REPLACE VIEW ancestor_virtual(p1, p2) AS
WITH RECURSIVE
  tmp_subquery(p1, p2) AS (
    SELECT * FROM parent
    UNION ALL
    SELECT p.p1, a.p2
    FROM parent AS p
         JOIN ancestor_stored AS a ON p.p2 = a.p1
    UNION ALL
    SELECT p.p1, a.p2
    FROM parent AS p
         JOIN tmp_subquery AS a ON p.p2 = a.p1
  )
SELECT * FROM tmp_subquery;

Forward and backwards implications

By default, as can be seen above, implications are rewritten to VIEWs, and can therefore be seen as backward-chaning implications. In other words, the implication is only used when needed, that is, when a query requires it.

However, Lore also supports forward-chaining implications that are computed and materialized. To make an implication a forward-chaining implication, simply add the FORWARD-keyword before IMPLCATION, e.g.:

CREATE FORWARD IMPLICATION university.student AS
SELECT p.pid, p.pname
FROM person AS p
     JOIN takes_course AS t ON (p.pid = t.attendee);

Forward-implications are translated to triggers, so e.g. the implication above would be translated to:

CREATE FUNCTION takes_course_to_university_student_1641312109__insert_trigger_fnc() RETURNS trigger AS $body$
BEGIN
  INSERT INTO university.student
  SELECT p.pid, p.pname
  FROM person AS p
       JOIN (SELECT NEW.attendee, NEW.course) AS t ON (p.pid = t.attendee);
  RETURN NEW;
END;
$body$ LANGUAGE plpgsql;

CREATE TRIGGER takes_course_to_university_student_1641312109__insert_trigger AFTER INSERT
ON takes_course_stored FOR EACH ROW
EXECUTE PROCEDURE takes_course_to_university_student_1641312109__insert_trigger_fnc();

INSERT INTO university.student
SELECT p.pid, p.pname
FROM person AS p
     JOIN takes_course AS t ON (p.pid = t.attendee);

Note that we both make a trigger and in adition give a normal INSERT-command. This makes sure that also already inserted data gets used by the implication.

Syntax

CREATE [FORWARD] IMPLICATION <relname> AS <SELECT_query>;

where

<relname> is the name of the relation that is the target for the implication
<SELECT_query> is a regular PostgreSQL SELECT-query

Notes on supported features

If FORWARD is given the implication will be a forward-chaning implication implemented via triggers. However, if FORWARD is not given the implication will be a backward-chaning implication implemented via views.

If FORWARD is used, all tables in a FROM-clause must be given a local name with AS (e.g. FROM person AS p).

The SELECT-query can be recursive, i.e. use <relname> in its FROM-clause. For FORWARD-implications, there are no restrictions on the use of recursion. For backward-chaining implications on the other hand, there can be only one recursive call (i.e. only one use of <relname> in the query, and this needs to be directly in the queries FROM-klause (i.e. not in any subquery). These are the same restrictions that apply to use of recursion in normal WITH RECURSIVE-queries in PostgreSQL.

Import

Lore also introduces the keyword IMPORT which is used to import the statements from other Lore-scripts.

E.g. if we have a file students.lore:

CREATE RELATION university.student(sid int PRIMARY KEY, pname text NOT NULL); 

CREATE FORWARD IMPLICATION university.student AS
SELECT p.pid, p.pname
FROM person AS p
     JOIN takes_course AS t ON (p.pid = t.attendee);

and a different file, university.lore:

IMPORT 'file:students.lore';

CREATE RELATION university.employee(sid int PRIMARY KEY, pname text NOT NULL); 

CREATE FORWARD IMPLICATION university.employee AS
SELECT p.pid, p.pname
FROM person AS p
     JOIN teaches_course AS t ON (p.pid = t.professor);

then, before the translation to SQL, Lore will replace the IMPORT-statement with the contents of students.lore, i.e.:

CREATE RELATION university.student(sid int PRIMARY KEY, pname text NOT NULL); 

CREATE FORWARD IMPLICATION university.student AS
SELECT p.pid, p.pname
FROM person AS p
     JOIN takes_course AS t ON (p.pid = t.attendee);

CREATE RELATION university.employee(sid int PRIMARY KEY, pname text NOT NULL); 

CREATE FORWARD IMPLICATION university.employee AS
SELECT p.pid, p.pname
FROM person AS p
     JOIN teaches_course AS t ON (p.pid = t.professor);

One can import files based on any URL, thus one can also give a URL that points to a file published online, e.g.

IMPORT 'http://leifhka.org/lore/library/prefix.lore';

Syntax

IMPORT '<protocol>:<path>';

where

<protocol> is either http or file
<path> is the path to the file
- if <protocol> = http then <path> is a normal web-URL
- if <protocol> = file then <path> is a filepath (potentially relative to the working directory of the process, i.e. the folder lore is executed from)

Datalog-like syntax

The syntax for defining relations and implications above should feel familiar to anyone used to writing SQL.

However, if one wants to make large knowledge bases with Lore, the SQL-syntax might feel a bit verbose. Thus, Lore also introduces a Datalog-like syntax for implications. So instead of

CREATE IMPLICATION university.student AS
SELECT p.pid, p.pname
FROM person AS p
     JOIN takes_course AS t ON (p.pid = t.attendee);

one can write

university.student(pid, pname) <- person(pid, pname), takes_course(pid, course);

The part before the arrow is called the head and the part after the arrow is called the body. The analog forward implication can then be written by simply flipping the direction of the arrow and the placement of the head and body, i.e.:

person(pid, pname), takes_course(pid, course) -> university.student(pid, pname);

Note that the column-variables can be replaced with any identifier, so the following is equivalent to the implication above:

person(x, y), takes_course(x, z) -> university.student(x, y);

It is also possible to relate the variables using functions or constants (like in a normal SQL-WHERE-clause) by writing this after the body as follows:

person(x, y), takes_course(x, z) : z = 5 OR z >= 10 -> university.student(x, y);

In fact, anything possible in a normal SQL-WHERE-clause can be done after the colon.

The implication above is equivalent to the following:

CREATE FORWARD IMPLICATION university.student AS
SELECT x, y
FROM person AS t1(x, y) NATURAL JOIN takes_course AS t2(x, z)
WHERE z = 5 OR z >= 10;

Note that one can also use functions on the variables in the implication’s head, e.g.:

person(x, y), takes_course(x, z) : z = 5 OR z >= 10 -> university.student(10 * x, 'Student ' || y);

Lore also introduces a simpler syntax for INSERT-statements with a similar syntax as Datalog. So one can write

university.student(1, 'Mary');

which will then be translated to

INSERT INTO university.student VALUES (1, 'Mary');

Syntax

<relname>(<SELECT_clause>) <- <relname>(<cols>), ..., <relname>(<cols>) [: <WHERE_clause>];

<relname>(<cols>), ..., <relname>(<cols>) [: <WHERE_clause>] -> <relname>(<SELECT_clause>); 

<- <relname>(<cols>), ..., <relname>(<cols>) [: <WHERE_clause>];

where

<relname> is a relation name (potentially prefixed with a schema name)
<SELECT_clause> is a list of expressions that may use any variable from any <cols> (e.g. x+1, 'hello', y)
<cols> is a list of variable names that will denote the columns in the relation before the parenthesis (e.g. x, y, z)
<WHERE_clause> is any regular PostgreSQL WHERE-clause over the variables bound in any <cols>, and may contain regular SQL-subqueries (e.g. x > 5 AND y NOT IN (SELECT z FROM t))

Notes on supported features

The names used for variables in <cols> can be any legal column name in regular PostgreSQL. Different <cols> can contain variables with the same name, which will cause the relations to be joined on these columns.

The rules are translated into the CREATE IMPLICATION-syntax above, so the same restrictions and features applies. I.e.:

<relname1>(<SELECT_clause>) <- <relname2>(<cols2>), ..., <relnameN>(<colsN>) [: <WHERE_clause>];

becomes

CREATE IMPLICATION <relname> AS
SELECT <SELECT_clause>
FROM <relname2> AS t_2(<cols2>) NATURAL JOIN ... NATURAL JOIN <relnameN> AS tN(<colsN>)
WHERE <WHERE_clause>;

and

<relname2>(<cols2>), ..., <relnameN>(<colsN>) [: <WHERE_clause>] -> <relname1>(<SELECT_clause>);

becomes the same CREATE IMPLICATION-statement as above, except with the FORWARD-keyword included.

The final rule, without any head-clause becomes a normal SELECT-query, i.e.

<- <relname1>(<cols1>), ..., <relnameN>(<colsN>) [: <WHERE_clause>];

becomes

SELECT <SELECT_clause>
FROM <relname1> AS t_1(<cols1>) NATURAL JOIN ... NATURAL JOIN <relnameN> AS tN(<colsN>)
WHERE <WHERE_clause>;

Key Constraints

Problem

Relations does not support key constraints such as foreign keys. The reason for this is that the behaviour of such constraints is not clear. To illustrate this, assume we have the following two relations:

CREATE RELATION person(pid int PRIMARY KEY, pname text);
CREATE RELATION knows(p1 int REFERENCES person(pid), p2 int REFERENCES person(pid));

It is not clear if the foreign keys should be enforced only for the stored (table) part of the relation, or whether it should also be enforced for the enitre relations also with backwards-inferred tuples.

In the former case, the constraints might be violated or satisfied depending on the order we apply the implicatoins. E.g. if we have two implications:

employee(pid, pname, worksFor) -> person(pid, pname);
employee(p1, pname1, worksFor), employee(p2, pname2, worksFor) -> knows(p1, p2);

if we apply the second implication first, then we will insert values into knows that are not (yet) present in person, thus violating the constraints, whereas if we apply the first implication first, the inserted tuples would not violate the constraints. A solution to this could be to postpone constraint checking to after all implications have been executed, however this complicates the execution somewhat, and would also not check backwards-inferred tuples.

If the constraints are meant to also include the backwards-inferred tuples, i.e. the whole relation, then there is no clear translation of this into standard SQL.

The same holds for PRIMARY KEY-constraints.

Possible Solution

A possible solution to keys is to rather encode the constraint via rules via the following vocabulary (available here):

CREATE RELATION lore.error(msg text);

CREATE RELATION lore.unique_violation(schema_name text, rel_name text, cols text, vals text);
CREATE RELATION lore.fk_violation(schema_name text, rel_name text, cols text, fk_schema text, fk_rel text, fk_cols text, vals text);
CREATE RELATION lore.inconsistency(stmt1 text, stmt2 text);

lore.error(format('Value(s) %s violates unique constraint on %s.%s(%s).', vals, schema_name, rel_name, cols))
    <- lore.unique_violation(schema_name, rel_name, cols, vals);

lore.error(format('Value(s) %s in %s.%s(%s) violates foreign key constraint to %s.%s(%s).', vals, schema_name, rel_name, cols, fk_schema, fk_rel, fk_cols))
    <- lore.fk_violation(schema_name, rel_name, cols, fk_schema, fk_rel, fk_cols, vals);

lore.error(format('The statement %s contradicts with the statement %s.', stmt1, stmt2))
    <- lore.inconsistency(stmt1, stmt2);

Example use:

Given:

CREATE RELATION person(pid int, name text, works_for int);
CREATE RELATION company(cid int, name text);

then one can express that pid and cid are unique and that works_for is an FK referencing cid as follows:

lore.unique_violation('public', 'person', 'pid', x) <- person(x, n1, w1), person(x, n2, w2) : n1 != n2 OR w1 != w2;
lore.unique_violation('public', 'company', 'cid', x) <- company(x, n1), company(x, n2) : n1 != n2 ;
lore.fk_violation('public', 'person', 'works_for', 'public', 'company', 'cid', x) <- person(pid, n, x) : x NOT IN (SELECT cid FROM company);

One can then query the lore.errors-relation to check for constraint violations, when one thinks all data ought to be correct.

It is planned to include such a check after execution of each Lore-script, and fail, rollback and abort if errors are found. Note that this check can be quite expensive, so a flag to disable it will be included.