Introduction
Dogelog Player is a Prolog system that currently runs on different backends such as JavaScript, Python and Java. We present a new Prolog dict library library(misc/dict).
The library is a spin off of our JSON data library library(misc/json). Both are pure Prolog implementations that we could also test on Trealla Prolog and Scryer Prolog.
Affine Dicts
Our original take of the JSON data library was guided by the idea of what we termed affine dicts. The affinity we were aspiring to, was a close resemblance between the Prolog terms for JSON and the JSON text serialization itself.
/* JSON */
{"a":123, "b":["f","g"]}
/* Prolog */
{a:123, b:[f,g]}
We made use of the {}/1
Prolog syntax form to introduce a dict and the ','/2
operator to separate the dict entries. What we couldn't borrow from the ISO core standard was the (:)/2
operator, but it is found in the ISO module standard.
Unlike SWI-Prologs approach, we neither require a String data type, since JSON keys and JSON string values are simply mapped to Prolog atoms. Nor do we require a new Prolog dict syntax neither a new Prolog dict data type.
Input Order
We could get away without a new Prolog dict syntax, since unlike SWI-Prolog we do not include a tag. So variable tags as in _{a:123}
or constant tags as in point{a:123}
are not in the scope of our Prolog dicts.
At the moment there is also no Prolog dict data type around. SWI-Prolog has its own Prolog dict data type, that is seen as a Prolog compound with a first tag argument followed by an alternation of value and key arguments.
Our first take on Prolog dict operations was a little bit ad-hoc and the operations were named json_object_XXX/n. We have meanwhile streamlined the operations into the naming dict_XXX/n and also settled on a clear input order semantics.
/* SWI-Prolog */
?- put_dict(a,_{d:2,a:1},9,X).
X = _{a:9, d:2}.
?- put_dict(b,_{d:2,a:1},9,X).
X = _{a:1, b:9, d:2}.
/* Dogelog Player */
?- dict_set({d:2,a:1},a,9,X).
X = {d:2, a:9}.
?- dict_set({d:2,a:1},b,9,X).
X = {d:2, a:1, b:9}.
The input order semantics is seen in the above samples where we compare SWI-Prolog put_dict/4 against our new dict_set/4. SWI-Prolog keysorted dicts have the advantage that they have O(log N) access time, whereas our dicts have O(N) access time.
On the other hand set and remove operations do copy a full dict, and we are back to O(N) modification time even for SWI-Prolog. In the above we tested modifying 1000000 dicts of size N=16,32,48,64,80. SWI-Prologs dict data type that compresses into Prolog compounds shows nevertheless an advantage.
Key Sorting
Our Prolog dicts together with these operations implement Pythons OrderedDict()
. It is a little misnomer, since one might be mislead that it refers to keysorted, but it refers to input order. Recently Pythons dict()
type obeys the same order.
SWI-Prologs decision to have keysorted dicts might not alone have been guided by access time but also by the insight that such dicts have an easier equality and lexical sorting. We opted for an explicit operation to keysort a dict.
dict_keysort({}, {}).
dict_keysort({C}, {D}) :-
sys_map_unpack(C, H),
keysort(H, [X|J]),
sys_map_pack(J, X, D).
The built-in keysort/2 is part of ISO core standard and sys_map_unpack/2 and sys_map_pack/3 are straight forward in pure Prolog for affine dicts. The shallow operation can also be extended to a deep operation on JSON data, making JSON data comparable.
In above we tested sorting 10 times a list with 10000 already key sorted Prolog dicts. We used the ISO core standard sort/2 built-in and we see a Prolog system performance variety. SWI-Prologs compressed compounds show again an advantage.
Optimize Later
This article refers to "Frozen Dicts". Although Python does not officially offer an additional "Frozen Dict" data structure. Only some proposals that indicate Python doesn't want to strip the input order from "Frozen Dicts".
Pythons "Frozen Dicts" seem to be preoccupied with the problem of bringing __eq__
and __hash__
to dicts. We get these for free, from Prolog built-ins such as (==)/2
and term_hash/2
. Otherwise our explicit operation is closer to "Frozen Set":
>>> frozenset([2,1,2,3])
frozenset({1, 2, 3})
The only difference is that our keysort/2 is more tolerant, and does also allow multiple keys. To keep things simple we shied away from a corresponding duplicate key check as of now. This very much sums up our current functionality:
dict_keysort({a:1,b:2,a:3}, X).
X = {a:1, a:3, b:2}.
We are currently exploring optimizations options. Our testing showed that SWI-Prolog compressed compounds have an advantage making the O(N) modification and sort/2 faster. We could maybe adopt this compression without sacrificing input order?
Conclusions
There is no new dict syntax, since unlike SWI-Prolog we do not include a tag. Our operations preserve input order and we have an explicit operation to keysort a dict. Testing indicates that a compression could boost modification and sorting.
User Manual: library(misc/dict)
https://www.novacuor.ch/doctab/doclet/en/docs/12_lang/07_libraries/04_misc/01_dict.html