A Forth-flavored language for matrix work, set/array manipulation,
string/regex processing, and logic programming. A compact C
interpreter built with clang -O3.
make # builds ./logicforth
make test # runs the golden-output test suite
./logicforth # REPL
Links PCRE2 (libpcre2-8.a, statically) for the regex engine — install it
first (brew install pcre2). Vendoring PCRE2 to restore the self-contained,
zero-dependency build is planned.
\ Arithmetic
3 4 + . \ 7
\ Matrices: * is element-wise; matrix multiply is dgemm (αAB + βC)
[ 1 2 3 4 ] 2 2 matrix dup transpose * \ element-wise product of M and Mᵀ
\ Sets and set algebra
< 1 2 3 > < 2 3 4 > + . \ < 1 2 3 4 > (union via polymorphic +)
\ Set-builder { x² | x ∈ 1..10, even x } — literal + filter/map + destruct
< 1 10 range [: 2 mod 0= :] filter [: fsq :] map destruct > . \ < 4 16 36 64 100 >
\ Frames — symbol-keyed nested maps
{ :a 1 :b { :c 2 } } /b/c @ . \ 2
\ JSON: parse to frames/arrays, serialize back
"[1, 2, 3]" json>frame frame>json . \ [1, 2, 3]
\ Higher-order operations
[ 1 2 3 4 5 ] [: dup * :] map . \ [ 1 4 9 16 25 ]
\ Strings and regex (PCRE2)
"x=42" "(\w+)=(\d+)" match . \ [ "x=42" "x" "42" ]
"hello world" "o" "0" replace . \ hell0 w0rld
\ Exceptions
[: "missing" throw :]
[: "got " . . cr :] try-catch \ prints "got missing"
\ Coroutines via delimited continuations
: yield shift ;
: producer 1 yield 2 yield 3 ;
reset producer \ leaves (1, k) — next value via resume
\ Logic: unify binds variables; amb is a committed choice
[ 1 2 3 ] [ X Y Z ] ~ drop X $ . Y $ . Z $ . cr \ 1 2 3
[: fail :] [: "fallback" :] amb . \ fallback- Tagged Vals — floats, strings, symbols, sets, arrays, frames, matrices, execution tokens, dictionary addresses, continuations, internal marks. A single 8-byte NaN-boxed representation; the tag determines interpretation.
- Direct-threaded inner interpreter — each dictionary cell is a handler function pointer, dispatched by an indirect tail call (
musttail); a colon call, literal, or branch carries its operand in the cell(s) right after the handler. The dictionary is the threaded code — no separate bytecode. - Compile-time instruction fusion (superwords) — adjacent variable-reads and float ops collapse into single instructions (
var var f+→ one op;… var f+!fuses the store), andf*+/f*-are fused multiply-add/subtract. - Per-interpreter state — all mutable state lives in an
Interpreter, which owns itsVocabulary(a growable dictionary plus name/source/symbol pools). Multiple independent instances can coexist in one process; the engine is embeddable. - Three stacks — data, return, and a side stack for stashing values that mustn't sit on the other two.
- Colon definitions —
: name body ;. The body is captured as source text forseeand the text-formsave. - Anonymous quotations —
[: ... :]pushes a fresh xt. Works at top level and inside colon defs. - Control flow —
if/else/then, thebegin/until/againandbegin/while/repeatloops, countedtimes/i-times,exit, and>r/r>/r@for return-stack access. - Tick and execute —
' word executefor first-class invocation by name. forget— truncate the dictionary back to a named word; symbol identities survive.- Variables and symbols —
variable foodeclares a global; read it by bare name, assign with42 to foo(toalso auto-creates a global on first assignment at the REPL).symbol bardefines a symbol;:foois a symbol literal interned on use;string>symbolinterns a computed string. - Word-local variables —
| x y |at the head of a colon definition or quotation declares scoped slots (initialized to0.0); read by bare name, assign withto name.++ name/-- nameincrement/decrement a local in place (f++/f--the unsafe float-only forms). Locals nest through quotations and survive continuation capture. - Mark-and-sweep GC — walks data/return/side stacks, dictionary, and a small
gc_rootsarray for in-flight C-level temporaries.
- Polymorphic arithmetic —
+/-/*//dispatch on operand tags: floats compute, strings concatenate (+), sets union/difference/intersection, matrices element-wise, a scalar broadcasts over a matrix, and arrays concatenate (+). - In-place matrix ops —
+!/-!/*!//!mutate the left matrix in place (explicit; the programmer decides). Float-only fast paths (f+,f-,f*,f/,f^, …) skip the type dispatch when both operands are known floats. - Matrix construction —
R C 0-matrix(zeros),[ ... ] R C matrix,V N diagonal-matrix(N×N with V on the diagonal),N identity-matrix. - DGEMM —
dgemm-nn/tn/nt/tt(αAB + βC) for all four transpose variants. The non-transposednnpath is ikj-ordered withrestrictpointers for cache-friendly access; the transposed variants use a straightforward triple loop. - Indexing —
@i/@j/@i,jto read rows, columns, or single cells. - Shape —
dim,reshape,flatten,transpose,diagonal. - Reductions —
sum,row-sums,column-sums,max,min,row-maxes,row-mins,column-maxes,column-mins. Librarymean,row-means,column-meanson top. - Element-wise math —
abs,sqrt,exp,log,sin,cos,tan,tanh. Polymorphic over floats and matrices. - Total ordering —
=/lt/gtcompare matrices by shape then row-major contents, so matrices work as set members.
- Set literals —
< 1 2 3 >, set operations,member?,size. - Array literals —
[ 1 2 3 ], thearrayconstructor (gather N from the stack),array-of(fill), indexed access via@i. - Map, fold, zip-map, filter —
mapfor a single source,reducefor a left fold with an accumulator,mapnfor N-ary zip,filterto select by predicate, with anonymous quotations as the higher-order argument.
Symbol-keyed nested maps — the associative type, and the compound term the planned logic layer builds on. The three bracket families are distinct: [ ] arrays, { } frames, < > sets.
- Literals —
{ :a 1 :b 2 }; values may be any Val, including nested frames, arrays, and sets. - Builders —
frame( keys values -- frame ) from two parallel collections,>frame( kv-array -- frame ) from an alternating key/value array. - Path literals —
/a/b/cis a symbol array[ :a :b :c ], built once at compile time, used to address into the tree. - Access —
@( frame key/path -- value ) get,!( frame key/path value -- frame ) set with auto-vivified intermediates,has?existence test,delete-atremove,update-atapply a quotation to a leaf,mergecombine two frames (right wins), pluskeys/values/size. Each accessor takes a single:symbolkey or a/a/b/cpath. - Representation — sorted parallel key/value arrays with binary-search lookup; mutable in place, reference semantics. Structurally comparable, so frames work as set members and round-trip through their
{ }literal.
- String literals are raw (newlines allowed;
""is the one escape → a literal");formatfills{n}placeholders from the stack —"got {0} of {1}" format; polymorphic concatenation via+. - Regex on PCRE2 (Perl-compatible, JIT-compiled):
match(first match as a flat[ whole cap… ]),match-all(all matches, nested),replace(replace-all, with&/\1–\9backrefs), and thehas?string overload (does the pattern match?). Patterns are plain"..."literals — PCRE2 reads\d,\w,\n, lookaround,\p{...}. - Slicing / building —
substring(half-open byte range),join(concatenate an array of strings with a separator).
json>frame— parse a JSON string into native values: objects → frames (keys interned as symbols), arrays → arrays, strings → strings (escapes and\uXXXXdecoded to UTF-8), numbers → floats,true/false→ the reserved:1/:0boolean symbols,null→null(the none value). Recursive-descent, GC-safe, rejects trailing garbage.frame>json— serialize a value back to a JSON string: floats use a shortest round-trip representation, strings are escaped,:1/:0→true/false, none →null.
- Stdin REPL, rlwrap-friendly, with a
count|topprompt showing stack depth and the top value — green on a terminal, red on error..pretty-prints a nested array across lines with the opening brackets aligned; strings print quoted inside a collection and in.s, raw when printed bare. loadruns a source file as if typed.savewrites the user's vocabulary as a re-loadable.l4source file.save-image/load-image— binary image with full state preservation (dictionary, objects, stacks, continuations).reloadtruncates user state and re-runs every fileloaded this session, in order.
Drive external programs over pipes (fork/execvp/pipe/waitpid; binary-safe, no shell):
argv start-process— launch from an argv array; returns a frame{ :pid :in :out :err }with the child's pid and its stdin/stdout/stderr asT_STREAMvalues.write/read/close— write a string to a stream, read a stream to EOF, close one (closing:insends EOF).running?/wait/stop— non-blocking liveness check, block-until-exit, signal-and-reap.lib.l4conveniences:run(split a command line and start it),read-out/read-err/write-in.commands width parallel-run— run a batch of argv arrays concurrently, at mostwidthat a time, collecting{ :out :err :status }per command in input order (refills a slot as each child finishes). Process-level parallelism — e.g. firing off manycurlrequests at once.
A four-primitive substrate the rest of the control story is built on. See docs/continuations.md for the full treatment.
reset— installs a delimiter (a uniquely-tagged mark on the return stack).shift— captures the slice up to the nearest reset, removes the mark and captured frames, pushes the continuation as aT_CONTVal. Used for coroutines and generators.shift-with— same capture, but runs a handler xt in the outer context after the unwind. Used for exceptions and restarts.resume— re-enters a captured continuation. Multi-shot.
A third stack for stashing arbitrary Vals without disturbing the data or return stack: >side, side>, side-drop, side-depth.
Built in lib.l4 on top of the continuation primitives:
throw— non-local exit with a value.catch— wraps an xt; returns(result 0)on success,(exc 1)on throw.try-catch— wraps an xt with a recovery handler. Arity-agnostic.
The shift-with handler can also resume the captured continuation, giving the Common Lisp restart pattern — exceptions can recover rather than just abort.
Unification and committed choice, on the trail and the continuation machinery:
- Logic variables —
lvarmakes a fresh one; a capitalized identifier is a logic-var literal: a persistent global at the REPL, or a fresh per-call variable when declared in| X |inside a definition or quotation. unify(~) — unifies two terms, binding logic vars through a trail so they match: atoms by value, arrays element-wise, frames as open records (shared keys must unify, extras allowed); on a mismatch it fails.deref($) follows a variable's binding chain.amb/fail— committed choice: run the first branch; if it fails (aunifymismatch or an explicitfail), roll its bindings back through the trail and run the second, committing to whichever succeeds._— the anonymous wildcard: unifies with anything, binds nothing, and allocates nothing.matches?— a non-destructiveunifytest: marks the trail, unifies, rolls back, and pushes whether the two unified — so it composes in straight-line code.- Cons lists —
[( a b c )]builds cons pairs and[( H T )]is the[H|T]head/tail pattern underunify; withcons,head-tail, andarray↔consconversions. - Fact database —
relation/assert/query/retract. A relation is a frame of a row-set plus per-column indexes (declared symbol columns); rows are column-keyed frames that dedup;querymatches a pattern by unification, narrowing through the index when it can.
depth,roll— stack-manipulation primitives.copy— deep copy of a value (strings, arrays, sets, frames, matrices).now— current Unix time as a float (seconds since epoch).see— prints a word's source definition;see-compileddisassembles its threaded body.man—( xt -- fr ), returns a frame of a word's reference entry (stack effect, one-line summary, cost notes).help nameprints it for the named word.words,forget,bye,gc,clear,.s,.a— interpreter utilities.
Tracked in PLAN.md, with design notes for each.
- Time / dates — strftime/strptime formatting and parsing (
time-format,time-parse) over thenowfloat timestamps that already exist. - Random numbers — xoshiro256++ PRNG:
random-float,random-int,seed!,shuffle.
lib.l4wrappers over the regex layer —split,index-of,starts-with,ends-with,trim,lines.- UTF-8 / codepoint indexing — string ops are byte-indexed today; codepoint-indexed at the user level is planned.
- Vendor PCRE2 — bundle the PCRE2 sources to restore the self-contained build.
- TSV file I/O — the sole tabular format; other formats convert outside logicforth.
- SQLite integration — embedded relational storage via the vendored amalgamation. Queries return sets of column-keyed row frames — the same shape as a fact-database relation — or matrices when all columns are numeric.
- Sort —
sort,sort-with,sort-by. - stdin / env —
stdin/stdout/stderras streams (read/written with the subprocessread/write), environment variable access. - Functional primitives —
rangeremains in C;find,any?,all?,flat-map,sort-byinlib.l4. (map/mapn/filter/reduce/take/reverse/concatdone in C;skip/lastinlib.l4.)
Built in stages on the per-interpreter foundation already in place:
- Cooperative green threads —
spawn,yield,run-scheduleron top of continuations. Cheap interleaving for I/O-bound work, no parallelism. - OS-thread parallelism + mailboxes — Erlang-style isolated interpreters communicating via per-thread mailboxes. Real multi-core parallelism, no shared mutable state.
- Slicing,
hstack/vstack, norms, element-wise comparison. - SVD — one-sided Jacobi.
- Optional BLAS/LAPACK build — swap the hand-rolled kernels for BLAS/LAPACK behind a build switch; default stays zero-dependency.
src/c/logicforth.h — types, Interpreter/Vocabulary structs, prototypes
src/c/core.c — engine: interpreter, dictionary, GC, printing, image, REPL
src/c/words.c — arithmetic, stack, I/O, control flow, defining words, continuations
src/c/collections.c — sets, arrays, and frames
src/c/matrix.c — matrix words and numeric kernels
src/c/functional.c — higher-order operations (map, mapn, …)
src/c/superwords.c — compile-time instruction fusion (superwords)
src/c/strings.c — string and PCRE2 regex operations
src/forth/lib.l4 — standard library (auto-loaded at startup)
tests/ — golden-output test files
docs/ — design documents
examples/ — sample programs
PLAN.md — deferred work and design notes
See LICENSE.