Introduction
We are currently in our third iteration of unification for rational trees. What first begun as a separate routine inside library(math), has now recently become a novel take on the host language routine unify() inside the Dogelog Player Prolog system.
We report how we replaced the union find map data structure inside unify() by some Prolog compound pointer swizzling as suggested by Jaxon Jaffar in 1984, arriving at a more competetive rational tree unification realization.
List Algorithm
The underlying union find algorithm is simpler to explain when we look at (==)/2
before we examine (=)/2
. This algorithm has its root in a 1971 publication by John Hopcroft and Richard Karp for a bisimulation equality algorithm among deterministic finite automaton (DFA). As a union find data structure we use a key value list:
sys_union_find(X, L, T) :-
member(Y-Z, L),
same_term(X, Y), !,
sys_union_find(Z, L, T).
sys_union_find(X, _, X).
The union find serves as a means to gradually create a coarser equivalence relation. In the above implementation it requires an extra built-in same_term/2 which tests two Prolog compounds for the same memory location. Merging two equivalence classes amounts to adding a further pair to the key value list:
equals(X, Y) :-
sys_equals(X, Y, [], _).
sys_equals(X, Y, L, R) :- compound(X), compound(Y), !,
(functor(X, F, N), functor(Y, G, M),
N/F \== M/G -> fail;
sys_union_find(X, L, Z),
sys_union_find(Y, L, T),
(same_term(Z, T) -> L = R;
Z =.. [_|P], T =.. [_|Q],
foldl(sys_equals, P, Q, [Z-T|L], R))).
sys_equals(X, Y, L, L) :- same_term(X, Y).
The algorithm is quite capable. It is conservative for acyclic terms, behaves the same as an inductive definition of syntactic Prolog term equality. Such inductive definitions for built-ins such as (==)/2 are for example found in the ISO core standard. We can verify that the following test cases pass:
?- X = f(a,g(Z,b),Z), Y = f(a,g(Z,b),Z), equals(X, Y).
X = f(a, g(Z, b), Z), Y = f(a, g(Z, b), Z).
?- X = f(a,g(Z,b),Z), Y = f(a,g(T,b),Z), equals(X, Y).
fail.
?- X = f(a,g(Z,b),Z), Y = f(a,g(Z,c),Z), equals(X, Y).
fail.
Map Algorithm
What makes the Hopcroft & Karp (1971) algorithm interesting for rational trees, is its ability to work with Prolog terms that are cyclic. When naively implemented, inductive definitions as found in the ISO core standard usually do not terminate for cyclic terms. On the other hand the previous 100% Prolog implementation gives:
?- X = f(f(X)), Y = f(f(f(Y))), equals(X, Y).
X = f(X), X = Y.
The only problem is that the union find lookup as realized by a key value list and via member/2 is relatively slow. It takes in the average O(N) time to lookup a value by a key, making the overall algorithm O(N^2), whereas Hopcroft & Karp (1971) already claimed quasi linear time complexity of their bisimulation equality.
To speed up the equals/2 implementation in release 1.3.6 of Dogelog Player we provided a native implementation of (==)/2 using a native hash map datastructure. A hash map data structure has amortized O(1) lookup. The extension of the hash map data structure has similarly O(1) time complexity.
public static Object union_find(Map map, Object obj) {
Object obj2 = map.get(obj);
while (obj2 != null) {
obj = obj2;
obj2 = map.get(obj);
}
return obj;
}
public static void union_add(Map map, Object from, Object to) {
map.put(from, to);
return map;
}
The above two utility routines where put into the core of release 1.3.6 of Dogelog Player. The above shows the Java target implementation, we found according implementations for the Python and the JavaScript target. They were then used in native implementations of (=)/2, (==)/2 and compare/3.
Pointer Algorithm
As it turns out the Hopcroft & Karp (1971) algorithm has not only the benefit of detecting loops, it can also detect sharing. To demonstrate this claim, Jaxon Jaffar (1984) already proposed Hydra test cases. They are based on the construction of Prolog terms, that take very less memory but are very large when written:
hydra(0, k) :- !.
hydra(N, h(X, X)):- N>0, N0 is N-1, hydra(N0, X).
?- hydra(3, X).
X = h(h(h(k, k), h(k, k)), h(h(k, k), h(k, k))).
Unfortunately our map implementation couldn't deal with them, since we used still some native stack in their implementation. Jaxon Jaffar (1984) also provided a solution to this problem. His contribution is two fold, using pointer to speed up union find and using iterative unify to solve the stack problem.
public static Compound union_find2(Compound obj) {
while (is_compound(obj.functor))
obj = (Compound)obj.functor;
return obj;
}
public static void union_add2(Compound src, Compound dst) {
src.functor = dst;
}
The above shows our variation of the union find pointers. While Jaxon Jaffar (1984) uses a separte field .term
in his Prolog compounds and represents the equivalence classes as cyclic lists. We used the .functor
field of a Prolog compound for the tree based union find data structure. But we are not yet there, we still saw:
?- hydra(1000000,_X), hydra(1000000,_Y), _X == _Y.
🚨 Fehler: Ausführung wegen Stapelüberlauf abgebrochen.
user auf 5
For the upcoming release 2.1.1 of Dogelog Player we made the unification iterative:
?- hydra(1000000,_X), hydra(1000000,_Y), _X == _Y.
true.
Benchmark Results
Jaxon Jaffar (1984) was doing similar tests with N = 5, 10, 25. Our previous run with N = 1'000'000 might seem monstrous. But a million Prolog compounds should only occupy some mega bytes (MB) of memory. A Prolog system that allocates a few giga bytes (GB) resident memory, should be able to deal with that many Prolog compounds.
In the following we perform a million Prolog compounds stress test for the built-in (=)/2. An implementation of (=)/2 follows the same line as (==)/2 except that it has a few more cases where Prolog variables are bound. Since the unification is for rational trees, there is no break by an occurs check that would slow down the unification.
Interestingly Java and JavaScript do not show much difference. While Python is a factor 3-4 slower than Java or JavaScript. Rust has a strange behaviour, in the smaller test cases N = 16'384, 65'536, 261'144 it has a minor advantage over Java and JavaScript. But for N = 1'048'576 it starts losing the game. See appendix for the test details.
Conclusions
Jaxon Jaffar (1984) already proposed Hydra test cases. They take very less memory but are very large when written. We conducted according tests with Dogelog Player and Scryer Prolog. For N = 16'384, 65'536, 261'144 the Rust Prolog has a minor advantage over the JavaScript Prolog, but interestingly losing it for N = 1'048'576.
Appendix
We used the following test harness:
hydra(0, _) :- !.
hydra(N, h(X, X)):- N>0, N0 is N-1, hydra(N0, X).
hydra(0, A, A) :- !.
hydra(N, h(X, X), A):- N>0, N0 is N-1, hydra(N0, X, A).
test(N) :- hydra(N, X), hydra(N, Y, Y),
time((between(1,10,_), X = Y, fail; true)).
The following Prolog systems:
- Dogelog Player 2.1.1, Oracle Corporation, Java 24.0.1
- Dogelog Player 2.1.1, node, JavaScript 24.6.0
- Dogelog Player 2.1.1, PyPy, Python 3.11.13
- Scryer Prolog 0.9.4-660, rustc 1.90.0
And the following mainboard and operating system:
- AMD Ryzen AI 7 350 CPU with 32 GB RAM,
- Windows 11 Home with WSL2 Ubuntu 24.04