Lore
Lore (Logical Relations) is a small extension to SQL (PostgreSQL) that adds the notion of logical relations and implications. Lore simply translates a Lore-file using these extensions to standard PostgreSQL. Also, any valid PostgreSQL-statement (or query) is a valid Lore-statement (or query) and behaves the same (normal SQL-statements are executed unaltered over the database).
Lore adds RELATION
s, IMPLICATION
s and
IMPORT
-statements. In addition, Lore provides a simpler
(Datalog-like) syntax for specifying implications, assertions and simple
queries.
The newest release can be downloaded here. The source
code is published on Gitlab under GPLv2, written in Scala, and available
here. For usage and
examples of the lore.jar
-program, simply execute
java -jar lore.jar --help
There is also a triplestore with an ontology reasoner built on top of Lore called TripleLore.
Example
'http://example.org/family-schema.lore'; -- Import statements from other Lore-file (local or online)
IMPORT
CREATE RELATION fam.person(id int PRIMARY KEY, pname text);
CREATE RELATION fam.mother(person int, mother int);
CREATE RELATION fam.father(person int, father int);
CREATE RELATION fam.parent(person int, parent int);
CREATE RELATION fam.ancestor(person int, ancestor int);
CREATE FORWARD IMPLICATION fam.parent AS -- Forward chaining implication
SELECT * FROM fam.mother m; -- using SQL-like syntax
-> fam.parent(x, y); -- Forward chaining implication
fam.father(x, y) -- using Datalog-like syntax
CREATE IMPLICATION fam.ancestor AS -- Backward chaining implication
SELECT * FROM fam.parent p; -- using SQL-like syntax
<- fam.parent(x, y), fam.ancestor(y, z); -- Backward chaining implication
fam.ancestor(x, z) -- using Datalog-like syntax
1, 'Ole'); -- Datalog-like INSERT-statement
fam.person(2, 'Mary');
fam.person(3, 'Karl');
fam.person(4, 'Ida');
fam.person(5, 'Bob');
fam.person(6, 'Hannah');
fam.person(
1, 2);
fam.mother(1, 3);
fam.father(4, 1);
fam.father(
1, 6); -- Can also insert into any relation, even if (partly) defined with implications fam.ancestor(
Supported features
Lore is a pure extension to PostgreSQL, so any legal PostgreSQL query and command is a legal Lore query and command. Lore adds the following new elements, detailed in each section below:
- Relations
- Implications
- Imports
- Datalog-like rules
Each section gives an overview of the feature with examples and high level description, followed by asection on the allowed syntax and notes on the supported features.
For a formal specification of the syntax, see the corresponding Antlr4-files in the source-code.
Relation
A relation is a generalization of both a regular database table and a
view. A relation is created using the keywords
CREATE RELATION
, and is otherwise similar as
CREATE TABLE
. E.g.:
CREATE RELATION university.student(sid int PRIMARY KEY, pname text NOT NULL);
One can INSERT
into a RELATION
just like a
table, but one can also define a relation in terms of other tables,
views or relations via implications (see below).
There is one difference between a relation and tables/views, and that
is that tuples in relations are unique, and multiple inserts of the same
tuple will be ignored. Relations are also automatically indexed, with
one index (typically btree
, but gist
for
columns of geometrical types) per column.
The above statement will be rewritten to the following SQL-statements:
CREATE TABLE IF NOT EXISTS university.student (sid int PRIMARY KEY, pname text NOT NULL);
ALTER TABLE university.student RENAME TO student_stored;
ALTER TABLE university.student_stored ADD UNIQUE (sid, pname);
CREATE INDEX ON university.student_stored USING btree (pname);
CREATE VIEW university.student_virtual AS SELECT * FROM university.student_stored LIMIT 0;
CREATE VIEW university.student AS
SELECT * FROM university.student_stored
UNION ALL
SELECT * FROM university.student_virtual;
CREATE RULE university_student_insert_rule AS
ON INSERT TO university.student DO INSTEAD
INSERT INTO university.student_stored (SELECT NEW.sid, NEW.pname)
ON CONFLICT (sid, pname) DO NOTHING;
Note that an existing regular table can “upgraded” to a relation by
simply creating (i.e. CREATE RELATION
) it as a
relation.
Note that relations should not have foreign keys to other relations. More on the reason and solution to this in the section Foreign Keys below.
Syntax
CREATE RELATION [FROM TABLE] <relname>(<columns>);
DROP RELATION [IF EXISTS] [TO TABLE] <relname> [CASCADE];
where
<name>
is the name of the relation, potentially prefixed with a schema (e.g.ex.person
)<columns>
is a normal column declaration list (e.g.id int, name text, PRIMARY KEY (id)
)
Items listen in square braces are optional.
Notes on supported features
The keywords IF EXISTS
and CASCADE
behave
the same as for normal PostgreSQL.
If a CREATE RELATION
-command contains the keywords
FROM TABLE
, it transformes an already existing (regular)
table into a relation. The old table becomes the table-part (i.e. the
stored part suffixed with _stored
) of the relation. The
command still requires the columns-list, and these needs to be equal to
the table’s columns. This command fails if the table does not exist.
If a DROP RELATION
-command contains the keywords
TO TABLE
, it transformes a relation into a regular table,
that is, it drops the two VIEW
s and all implications
created on the relation, and removes the suffix _stored
.
The command does not alter any data stored in the table.
Implication
An implication is similar to a view definitions, except that it
functions just like an implication, i.e. it does not specify the full
extent of the target relation and there can be possibly many
implications to the same relation. The final relation consists of all
tuples resulting from all implications and all inserts into it. An
implication is created with the
CREATE IMPLICATION
-keywords, and is otherwise similar to
CREATE VIEW
, with a few exceptions (see below). E.g.:
CREATE IMPLICATION university.student AS
SELECT p.pid, p.pname
FROM person AS p
JOIN takes_course AS t ON (p.pid = t.attendee);
The above statement will be rewritten to the following SQL-statements:
CREATE OR REPLACE VIEW university.student_virtual(sid, pname) AS
WITH RECURSIVE
AS (
tmp_subquery(sid, pname) SELECT p.pid, p.pname
FROM person AS p
JOIN takes_course AS t ON (p.pid = t.attendee)
)SELECT * FROM tmp_subquery;
If there are more than one implications, the union of these will be
placed in the query tmp_subquery
. Note also that if one of
the queries are recursive, the recursive “call” will be replaced with
tmp_subquery
. In adition, the same (recurive) query will be
added to the union with the recurisve “call” replaced with the name of
the (stored) table, so that these are also included.
For example, if we have the following two implications definint the
ancestor
-relation:
CREATE RELATION parent(p1 int, p2 int);
CREATE RELATION ancestor(p1 int, p2 int);
CREATE IMPLICATION ancestor AS
SELECT * FROM parent;
CREATE IMPLICATION ancestor AS
SELECT p.p1, a.p2
FROM parent AS p
JOIN ancestor AS a ON p.p2 = a.p1;
the implications will become the following
VIEW
-definition:
CREATE OR REPLACE VIEW ancestor_virtual(p1, p2) AS
WITH RECURSIVE
AS (
tmp_subquery(p1, p2) SELECT * FROM parent
UNION ALL
SELECT p.p1, a.p2
FROM parent AS p
JOIN ancestor_stored AS a ON p.p2 = a.p1
UNION ALL
SELECT p.p1, a.p2
FROM parent AS p
JOIN tmp_subquery AS a ON p.p2 = a.p1
)SELECT * FROM tmp_subquery;
Forward and backwards implications
By default, as can be seen above, implications are rewritten to
VIEW
s, and can therefore be seen as backward-chaning
implications. In other words, the implication is only used when needed,
that is, when a query requires it.
However, Lore also supports forward-chaining implications that are
computed and materialized. To make an implication a forward-chaining
implication, simply add the FORWARD
-keyword before
IMPLCATION
, e.g.:
CREATE FORWARD IMPLICATION university.student AS
SELECT p.pid, p.pname
FROM person AS p
JOIN takes_course AS t ON (p.pid = t.attendee);
Forward-implications are translated to triggers, so e.g. the implication above would be translated to:
CREATE FUNCTION takes_course_to_university_student_1641312109__insert_trigger_fnc() RETURNS trigger AS $body$
BEGIN
INSERT INTO university.student
SELECT p.pid, p.pname
FROM person AS p
JOIN (SELECT NEW.attendee, NEW.course) AS t ON (p.pid = t.attendee);
RETURN NEW;
END;
$body$ LANGUAGE plpgsql;
CREATE TRIGGER takes_course_to_university_student_1641312109__insert_trigger AFTER INSERT
ON takes_course_stored FOR EACH ROW
EXECUTE PROCEDURE takes_course_to_university_student_1641312109__insert_trigger_fnc();
INSERT INTO university.student
SELECT p.pid, p.pname
FROM person AS p
JOIN takes_course AS t ON (p.pid = t.attendee);
Note that we both make a trigger and in adition give a normal
INSERT
-command. This makes sure that also already inserted
data gets used by the implication.
Syntax
CREATE [FORWARD] IMPLICATION <relname> AS <SELECT_query>;
where
<relname>
is the name of the relation that is the target for the implication<SELECT_query>
is a regular PostgreSQLSELECT
-query
Notes on supported features
If FORWARD
is given the implication will be a
forward-chaning implication implemented via triggers. However, if
FORWARD
is not given the implication will be a
backward-chaning implication implemented via views.
If FORWARD
is used, all tables in a
FROM
-clause must be given a local name with AS
(e.g. FROM person AS p)
.
The SELECT
-query can be recursive, i.e. use
<relname>
in its FROM
-clause. For
FORWARD
-implications, there are no restrictions on the use
of recursion. For backward-chaining implications on the other hand,
there can be only one recursive call (i.e. only one use of
<relname>
in the query, and this needs to be directly
in the queries FROM
-klause (i.e. not in any subquery).
These are the same restrictions that apply to use of recursion in normal
WITH RECURSIVE
-queries in PostgreSQL.
Import
Lore also introduces the keyword IMPORT
which is used to
import the statements from other Lore-scripts.
E.g. if we have a file students.lore
:
CREATE RELATION university.student(sid int PRIMARY KEY, pname text NOT NULL);
CREATE FORWARD IMPLICATION university.student AS
SELECT p.pid, p.pname
FROM person AS p
JOIN takes_course AS t ON (p.pid = t.attendee);
and a different file, university.lore
:
'file:students.lore';
IMPORT
CREATE RELATION university.employee(sid int PRIMARY KEY, pname text NOT NULL);
CREATE FORWARD IMPLICATION university.employee AS
SELECT p.pid, p.pname
FROM person AS p
JOIN teaches_course AS t ON (p.pid = t.professor);
then, before the translation to SQL, Lore will replace the
IMPORT
-statement with the contents of
students.lore
, i.e.:
CREATE RELATION university.student(sid int PRIMARY KEY, pname text NOT NULL);
CREATE FORWARD IMPLICATION university.student AS
SELECT p.pid, p.pname
FROM person AS p
JOIN takes_course AS t ON (p.pid = t.attendee);
CREATE RELATION university.employee(sid int PRIMARY KEY, pname text NOT NULL);
CREATE FORWARD IMPLICATION university.employee AS
SELECT p.pid, p.pname
FROM person AS p
JOIN teaches_course AS t ON (p.pid = t.professor);
One can import files based on any URL, thus one can also give a URL that points to a file published online, e.g.
'http://leifhka.org/lore/library/prefix.lore'; IMPORT
Syntax
'<protocol>:<path>'; IMPORT
where
<protocol>
is eitherhttp
orfile
<path>
is the path to the file- if
<protocol> = http
then<path>
is a normal web-URL - if
<protocol> = file
then<path>
is a filepath (potentially relative to the working directory of the process, i.e. the folderlore
is executed from)
- if
Datalog-like syntax
The syntax for defining relations and implications above should feel familiar to anyone used to writing SQL.
However, if one wants to make large knowledge bases with Lore, the SQL-syntax might feel a bit verbose. Thus, Lore also introduces a Datalog-like syntax for implications. So instead of
CREATE IMPLICATION university.student AS
SELECT p.pid, p.pname
FROM person AS p
JOIN takes_course AS t ON (p.pid = t.attendee);
one can write
<- person(pid, pname), takes_course(pid, course); university.student(pid, pname)
The part before the arrow is called the head and the part after the arrow is called the body. The analog forward implication can then be written by simply flipping the direction of the arrow and the placement of the head and body, i.e.:
-> university.student(pid, pname); person(pid, pname), takes_course(pid, course)
Note that the column-variables can be replaced with any identifier, so the following is equivalent to the implication above:
-> university.student(x, y); person(x, y), takes_course(x, z)
It is also possible to relate the variables using functions or
constants (like in a normal SQL-WHERE
-clause) by writing
this after the body as follows:
= 5 OR z >= 10 -> university.student(x, y); person(x, y), takes_course(x, z) : z
In fact, anything possible in a normal SQL-WHERE
-clause
can be done after the colon.
The implication above is equivalent to the following:
CREATE FORWARD IMPLICATION university.student AS
SELECT x, y
FROM person AS t1(x, y) NATURAL JOIN takes_course AS t2(x, z)
WHERE z = 5 OR z >= 10;
Note that one can also use functions on the variables in the implication’s head, e.g.:
= 5 OR z >= 10 -> university.student(10 * x, 'Student ' || y); person(x, y), takes_course(x, z) : z
Lore also introduces a simpler syntax for
INSERT
-statements with a similar syntax as Datalog. So one
can write
1, 'Mary'); university.student(
which will then be translated to
INSERT INTO university.student VALUES (1, 'Mary');
Syntax
<relname>(<SELECT_clause>) <- <relname>(<cols>), ..., <relname>(<cols>) [: <WHERE_clause>];
<relname>(<cols>), ..., <relname>(<cols>) [: <WHERE_clause>] -> <relname>(<SELECT_clause>);
<- <relname>(<cols>), ..., <relname>(<cols>) [: <WHERE_clause>];
where
<relname>
is a relation name (potentially prefixed with a schema name)<SELECT_clause>
is a list of expressions that may use any variable from any<cols>
(e.g.x+1, 'hello', y
)<cols>
is a list of variable names that will denote the columns in the relation before the parenthesis (e.g.x, y, z
)<WHERE_clause>
is any regular PostgreSQLWHERE
-clause over the variables bound in any<cols>
, and may contain regular SQL-subqueries (e.g.x > 5 AND y NOT IN (SELECT z FROM t)
)
Notes on supported features
The names used for variables in <cols>
can be any
legal column name in regular PostgreSQL. Different
<cols>
can contain variables with the same name,
which will cause the relations to be joined on these columns.
The rules are translated into the
CREATE IMPLICATION
-syntax above, so the same restrictions
and features applies. I.e.:
<relname1>(<SELECT_clause>) <- <relname2>(<cols2>), ..., <relnameN>(<colsN>) [: <WHERE_clause>];
becomes
CREATE IMPLICATION <relname> AS
SELECT <SELECT_clause>
FROM <relname2> AS t_2(<cols2>) NATURAL JOIN ... NATURAL JOIN <relnameN> AS tN(<colsN>)
WHERE <WHERE_clause>;
and
<relname2>(<cols2>), ..., <relnameN>(<colsN>) [: <WHERE_clause>] -> <relname1>(<SELECT_clause>);
becomes the same CREATE IMPLICATION
-statement as above,
except with the FORWARD
-keyword included.
The final rule, without any head-clause becomes a normal
SELECT
-query, i.e.
<- <relname1>(<cols1>), ..., <relnameN>(<colsN>) [: <WHERE_clause>];
becomes
SELECT <SELECT_clause>
FROM <relname1> AS t_1(<cols1>) NATURAL JOIN ... NATURAL JOIN <relnameN> AS tN(<colsN>)
WHERE <WHERE_clause>;
Key Constraints
Problem
Relations does not support key constraints such as foreign keys. The reason for this is that the behaviour of such constraints is not clear. To illustrate this, assume we have the following two relations:
CREATE RELATION person(pid int PRIMARY KEY, pname text);
CREATE RELATION knows(p1 int REFERENCES person(pid), p2 int REFERENCES person(pid));
It is not clear if the foreign keys should be enforced only for the stored (table) part of the relation, or whether it should also be enforced for the enitre relations also with backwards-inferred tuples.
In the former case, the constraints might be violated or satisfied depending on the order we apply the implicatoins. E.g. if we have two implications:
-> person(pid, pname);
employee(pid, pname, worksFor) -> knows(p1, p2); employee(p1, pname1, worksFor), employee(p2, pname2, worksFor)
if we apply the second implication first, then we will insert values
into knows
that are not (yet) present in
person
, thus violating the constraints, whereas if we apply
the first implication first, the inserted tuples would not violate the
constraints. A solution to this could be to postpone constraint checking
to after all implications have been executed, however this complicates
the execution somewhat, and would also not check backwards-inferred
tuples.
If the constraints are meant to also include the backwards-inferred tuples, i.e. the whole relation, then there is no clear translation of this into standard SQL.
The same holds for PRIMARY KEY
-constraints.
Possible Solution
A possible solution to keys is to rather encode the constraint via rules via the following vocabulary (available here):
CREATE RELATION lore.error(msg text);
CREATE RELATION lore.unique_violation(schema_name text, rel_name text, cols text, vals text);
CREATE RELATION lore.fk_violation(schema_name text, rel_name text, cols text, fk_schema text, fk_rel text, fk_cols text, vals text);
CREATE RELATION lore.inconsistency(stmt1 text, stmt2 text);
'Value(s) %s violates unique constraint on %s.%s(%s).', vals, schema_name, rel_name, cols))
lore.error(format(<- lore.unique_violation(schema_name, rel_name, cols, vals);
'Value(s) %s in %s.%s(%s) violates foreign key constraint to %s.%s(%s).', vals, schema_name, rel_name, cols, fk_schema, fk_rel, fk_cols))
lore.error(format(<- lore.fk_violation(schema_name, rel_name, cols, fk_schema, fk_rel, fk_cols, vals);
'The statement %s contradicts with the statement %s.', stmt1, stmt2))
lore.error(format(<- lore.inconsistency(stmt1, stmt2);
Example use:
Given:
CREATE RELATION person(pid int, name text, works_for int);
CREATE RELATION company(cid int, name text);
then one can express that pid and cid are unique and that works_for is an FK referencing cid as follows:
'public', 'person', 'pid', x) <- person(x, n1, w1), person(x, n2, w2) : n1 != n2 OR w1 != w2;
lore.unique_violation('public', 'company', 'cid', x) <- company(x, n1), company(x, n2) : n1 != n2 ;
lore.unique_violation('public', 'person', 'works_for', 'public', 'company', 'cid', x) <- person(pid, n, x) : x NOT IN (SELECT cid FROM company); lore.fk_violation(
One can then query the lore.errors
-relation to check for
constraint violations, when one thinks all data ought to be correct.
It is planned to include such a check after execution of each Lore-script, and fail, rollback and abort if errors are found. Note that this check can be quite expensive, so a flag to disable it will be included.