A Relation Is an Operation

Jun 28, 2026

This is the first post in Knowledge Graphs as Geometry, a weekly series that builds one idea, chapter by chapter: a knowledge graph stores facts as relations, every relation is an operation in space, and predicting a missing fact is geometry. Every post runs on a real (if tiny) graph, and the full, executed code is one click away at the end.

Today’s AI is fluent but ungrounded. A language model can say almost anything, yet it has no stable place to keep what it knows and no ledger an agent can write to, check against or answer from. A knowledge graph is that place: the structured memory a system reasons over and the control plane a governance layer inspects, holding facts as explicit and typed relations rather than as weights no one can read. To see why that is more than a database, it helps to start small.

Here is a family, drawn as a graph. Ann and Bob are married and have two children, Carol and Dave. Carol married Eve, and the two of them have a child, Frank. Each fact is an arrow: Ann →spouse→ Bob, Bob →parentOf→ Carol, Carol →siblingOf→ Dave. A knowledge graph is nothing more exotic than this — entities as nodes, facts as labeled, directed arrows between them.

Now cover one arrow and ask the graph to put it back. Who are Ann’s children? Formally we ask the query (Ann, parentOf, ?) and want the graph to answer Carol and Dave, not Eve or Frank. This is link prediction, and it is the task this whole series is about. It sounds like a lookup, but it is not. A lookup can only return what is already stored, and the entire point of a knowledge graph in a modern system — the store an AI agent writes to, checks against and reads a confidence from — is to answer questions about facts that were never written down. The graph has to generalize.

So the question of this post is simple to state and surprisingly deep: how does a machine predict a fact it has never seen? The answer the book gives, and the lens for everything that follows, is one sentence. We place each entity at a point in space, and we make each relation an operation that moves through that space — so a fact is true when the relation’s operation carries the head to the tail. A relation is not a label on an arrow. It is a verb, an action, a map.

A prediction is a score, and a score is geometry

Give every entity a vector — a position in some space we get to learn. Ann is a point, Carol is a point, and so on. Now we need each relation to do something to those points. The simplest choice, the one we start the book with, is a translation: the relation parentOf is a fixed step 𝒓, and the fact (h, parentOf, t) should hold when

\(\mathbf{h} + \mathbf{r} \approx \mathbf{t}\)

Stepping from a parent by the parentOf vector should land you on a child. To score a candidate fact we measure how well it lands — how close 𝒉 + 𝒓 comes to 𝒕:

\(s(h, r, t) \;=\; -\,\lVert \mathbf{h} + \mathbf{r} - \mathbf{t}\rVert\)

A small distance means a high score, and a high score marks a plausible fact. To answer (Ann, parentOf, ?) we score every candidate tail and rank them. Carol and Dave should land near Ann + 𝒓 and score highest; Frank and Eve should land far away and score low.
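A minimal sketch of that ranking step, using only the Python standard library. The 2-D coordinates and the parentOf vector below are made up by hand for illustration; a real model would learn them, and the names `entities`, `translation_score` and `rank_tails` are hypothetical.

```python
import math

# Hypothetical 2-D positions, chosen by hand for illustration (not learned).
entities = {
    "Ann":   (0.0, 0.0),
    "Bob":   (0.2, 0.1),
    "Carol": (1.0, 0.1),
    "Dave":  (0.9, -0.1),
    "Eve":   (1.2, 1.5),
    "Frank": (2.0, 0.3),
}
parent_of = (1.0, 0.0)  # hypothetical translation vector for parentOf


def translation_score(h, r, t):
    """s(h, r, t) = -||h + r - t||: higher means the step lands closer to t."""
    return -math.dist((h[0] + r[0], h[1] + r[1]), t)


def rank_tails(head, r):
    """Score every entity as a candidate tail and sort best-first."""
    h = entities[head]
    scored = [(name, translation_score(h, r, t)) for name, t in entities.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_tails("Ann", parent_of)
for name, score in ranking:
    print(f"{name:6s} {score:+.3f}")
```

With these hand-picked coordinates Carol and Dave come out on top, because they sit closest to Ann + 𝒓. Nothing here is learned; the point is only that answering the query is a sort by distance.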

Translation is just the first instance. The general form — the one the rest of the series earns — replaces the fixed step with a learned operation Fᵣ that can rotate, stretch and shift:

\(s(h, r, t) \;=\; -\,\lVert F_r(\mathbf{h}) - \mathbf{t}\rVert\)

Read that as: apply the relation’s operation to the head, and see if you land on the tail. Everything in knowledge graph embedding — the dozens of models with their own names and notations — turns out to be a choice of what Fᵣ is allowed to do. That is the destination. For now hold onto the reframe: the score is geometric. A fact’s plausibility is a distance, and a prediction is a ranking by distance.
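To make "rotate, stretch and shift" concrete, here is a small 2-D sketch in plain Python. The parameterization (one angle, one scale, one shift vector) and the helper name `make_operation` are assumptions chosen for illustration, not necessarily the form the series settles on.

```python
import math


def make_operation(angle=0.0, scale=1.0, shift=(0.0, 0.0)):
    """Return F_r for a 2-D space: rotate by `angle`, stretch by `scale`, then shift."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)

    def apply(h):
        x = scale * (cos_a * h[0] - sin_a * h[1]) + shift[0]
        y = scale * (sin_a * h[0] + cos_a * h[1]) + shift[1]
        return (x, y)

    return apply


def score(operation, h, t):
    """s(h, r, t) = -||F_r(h) - t||."""
    return -math.dist(operation(h), t)


# Pure translation is the special case with no rotation and no stretch.
translate = make_operation(shift=(1.0, 0.0))
# A quarter-turn with a stretch of 2, as an example of a richer operation.
turn_and_stretch = make_operation(angle=math.pi / 2, scale=2.0)

h = (1.0, 0.0)
print(translate(h))         # lands at h + shift
print(turn_and_stretch(h))  # lands near (0, 2), up to floating-point error
```

The scoring function never changes; only what the operation is allowed to do changes. Setting the angle to zero and the scale to one recovers the translation score from above.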

Different relations need different operations

The reframe earns its keep the moment you look at what real relations do, because they do not all behave the same way, and the operation has to match.

Look at spouse. If Ann is married to Bob, then Bob is married to Ann — the relation is its own reverse. A translation can never capture that. If 𝒉 + 𝒓 ≈ 𝒕 means Ann + spouse ≈ Bob, then symmetry demands Bob + spouse ≈ Ann too, and adding the two forces 𝒓 ≈ 0 — the relation collapses to nothing. So a symmetric relation needs an operation that can be its own inverse without vanishing. A half-turn rotation is exactly that: do it twice and you are back where you started. Already the geometry is telling us that spouse wants a rotation where parentOf was content with a step.
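The argument can be checked numerically. In the sketch below the positions of Ann and Bob are hypothetical, and the half-turn is taken about the midpoint between them, which is one convenient choice for illustration rather than the construction the series necessarily uses.

```python
import math


def translate(x, r):
    """Step from x by the vector r."""
    return (x[0] + r[0], x[1] + r[1])


def half_turn(x, center):
    """Rotate x by 180 degrees about `center`: x -> 2*center - x."""
    return (2 * center[0] - x[0], 2 * center[1] - x[1])


ann, bob = (0.0, 0.0), (2.0, 1.0)  # hypothetical positions

# Translation: the step that carries Ann to Bob overshoots on the way back.
r = (bob[0] - ann[0], bob[1] - ann[1])
forward_gap = math.dist(translate(ann, r), bob)   # Ann + r versus Bob
backward_gap = math.dist(translate(bob, r), ann)  # Bob + r versus Ann

# Half-turn about the midpoint: the same operation works in both directions.
center = ((ann[0] + bob[0]) / 2, (ann[1] + bob[1]) / 2)
to_bob = half_turn(ann, center)
to_ann = half_turn(bob, center)

print(forward_gap, backward_gap)
print(to_bob == bob, to_ann == ann)
```

The translation fits one direction exactly and misses the other by twice the length of 𝒓, so the only way to fit both is to shrink 𝒓 to zero. The half-turn fits both directions with a single operation, and applying it twice returns the starting point.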

Now look at parentOf itself. Ann is a parent of both Carol and Dave — one head, many tails. And Carol has two parents, Ann and Bob — many heads, one tail. The relation is many-to-many, and that too is a demand on the operation: it cannot be a clean one-to-one map, because it has to send one point toward several and collapse several toward one. Contrast siblingOf, which is symmetric like spouse but carries no such fan-out in this small graph.

These are not quirks of one family. They are the four patterns that organize the entire field — symmetry, inversion, composition and cardinality — and the family graph already shows three of them. We can read them straight off the edges:

[Table: Each relation pattern is a demand on the operation]

This table is the seed of the book’s central claim. Each pattern is a requirement on the operation: symmetry asks for a half-turn, a many-to-one cardinality asks for a map that can collapse a direction, composition asks for operations that chain. Choosing a model is, underneath, choosing which of these the operation can express. We will spend the series making each requirement precise and watching the matching operation win on exactly the relations that need it.
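Two of these patterns, symmetry and cardinality, can be read off an edge list mechanically. The sketch below assumes an edge list transcribed from the family described at the top of this post, with spouse and siblingOf stored in both directions; that list and the helper names are illustrative assumptions, not the notebook's data.

```python
from collections import defaultdict

# Edge list assumed from the family described in this post; storing both
# directions of spouse and siblingOf is an assumption of this sketch.
triples = [
    ("Ann", "spouse", "Bob"), ("Bob", "spouse", "Ann"),
    ("Carol", "spouse", "Eve"), ("Eve", "spouse", "Carol"),
    ("Ann", "parentOf", "Carol"), ("Ann", "parentOf", "Dave"),
    ("Bob", "parentOf", "Carol"), ("Bob", "parentOf", "Dave"),
    ("Carol", "parentOf", "Frank"), ("Eve", "parentOf", "Frank"),
    ("Carol", "siblingOf", "Dave"), ("Dave", "siblingOf", "Carol"),
]


def is_symmetric(relation):
    """True when every (h, t) edge of the relation also appears as (t, h)."""
    pairs = {(h, t) for h, r, t in triples if r == relation}
    return all((t, h) in pairs for h, t in pairs)


def cardinality(relation):
    """Label a relation by its largest fan-in and fan-out, e.g. 'many-to-many'."""
    tails, heads = defaultdict(set), defaultdict(set)
    for h, r, t in triples:
        if r == relation:
            tails[h].add(t)  # tails reachable from each head
            heads[t].add(h)  # heads pointing at each tail
    head_side = "many" if max(map(len, heads.values())) > 1 else "one"
    tail_side = "many" if max(map(len, tails.values())) > 1 else "one"
    return f"{head_side}-to-{tail_side}"


patterns = {r: (is_symmetric(r), cardinality(r))
            for r in ("spouse", "parentOf", "siblingOf")}
for relation, (symmetric, card) in patterns.items():
    print(f"{relation:10s} symmetric={symmetric!s:5s} {card}")
```

On this edge list spouse and siblingOf come out symmetric and one-to-one, while parentOf comes out asymmetric and many-to-many. These are observations about the stored edges only; a symmetric relation recorded in just one direction would fail this check.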

Measuring a prediction is harder than it looks

Before we trust any score, we have to agree on how to grade it, and in this field the grading decides more than the model does. The trap is hiding in our own example.

Ask (Ann, parentOf, ?) and suppose the model ranks the six candidates by score. Carol is the answer we held out, but Dave is also a true child of Ann: (Ann, parentOf, Dave) is a correct fact that happens to sit in our training data. If Dave scores above Carol, do we punish the model for ranking Carol second? We should not. Dave is not a wrong answer; he is a different right one. The fix is the filtered protocol: before reading off Carol’s rank, we remove every other known-true tail from the list, so the model is graded only against genuinely false competitors. In this case filtering moves Carol from rank 2 to rank 1, and it does so for the right reason.
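A minimal sketch of the filter, with made-up scores in which Dave edges out Carol. The function name and the numbers are hypothetical, and ties are ignored here to keep the filtering step in focus.

```python
def filtered_rank(scores, gold, known_true):
    """Rank of `gold` after dropping other known-true answers from the candidates."""
    competitors = [c for c in scores if c != gold and c not in known_true]
    return 1 + sum(scores[c] > scores[gold] for c in competitors)


# Hypothetical scores for (Ann, parentOf, ?); Dave scores slightly above Carol.
scores = {"Dave": 2.1, "Carol": 2.0, "Bob": 0.3,
          "Frank": -0.2, "Eve": -0.5, "Ann": -1.0}

raw = filtered_rank(scores, "Carol", known_true=set())        # no filtering
filtered = filtered_rank(scores, "Carol", known_true={"Dave"})  # Dave removed
print(raw, filtered)
```

Only the candidate list changes between the two calls; the model's scores are untouched. The filter removes Dave because he is a known-true tail, not because he scored well.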

There is a second, subtler trap: ties. Suppose a lazy model gives every candidate the identical score. What rank does Carol get? If we let her slide to the top of the tied block she looks perfect, and a constant scorer that knows nothing would look like a champion. The book uses the realistic convention instead: a tied gold answer gets the average position over the tie, that is, the number of strictly better candidates, plus half the number of other candidates tied with it, plus one. With six candidates all tied, Carol’s rank is 0 + 5/2 + 1 = 3.5, the middle of the list. Under that rule the constant scorer earns exactly the mediocre score it deserves, and no model can game the metric by being uninformative. These two rules (filter the known truths, charge ties at their average rank) are the difference between numbers you can trust and numbers you cannot, and most of the embarrassing irreproducibility in this literature traces back to getting them wrong.
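The constant-scorer trap is easy to reproduce. The sketch below compares an optimistic rank, which resolves ties in the gold answer's favor, with the realistic rank described above; both function names are hypothetical and the candidate list is the six family members.

```python
def optimistic_rank(scores, gold):
    """Ties resolved in the gold answer's favor: only strictly better candidates count."""
    better = sum(s > scores[gold] for c, s in scores.items() if c != gold)
    return better + 1


def realistic_rank(scores, gold):
    """Average position over the tie: strictly better + half the other ties + 1."""
    others = [s for c, s in scores.items() if c != gold]
    better = sum(s > scores[gold] for s in others)
    tied = sum(s == scores[gold] for s in others)
    return better + tied / 2 + 1


candidates = ["Ann", "Bob", "Carol", "Dave", "Eve", "Frank"]
constant = {c: 0.0 for c in candidates}  # a scorer that knows nothing

optimistic = optimistic_rank(constant, "Carol")
realistic = realistic_rank(constant, "Carol")
print(optimistic, realistic)
print(1 / optimistic, 1 / realistic)  # reciprocal ranks under each convention
```

Under the optimistic convention the constant scorer gets a perfect rank for free; under the realistic one it lands in the middle of the list. When all scores are distinct the two functions agree, so the realistic rule costs an honest model nothing.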

Watch a real model learn the family

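None of this is hypothetical. Train an actual model on the family graph (DistMult, 16 dimensions, 300 epochs) and ask it (Ann, parentOf, ?). DistMult scores a fact with a three-way product of the head, relation and tail vectors rather than a negative distance, but the reading is the same: higher means more plausible. Here are its learned scores over all six candidates: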

[Figure: Trained DistMult scores for (Ann, parentOf, ?)]

Two things stand out. The true children, Carol and Dave, rise to the top, and everyone else falls away — the model generalized the parentOf operation from the other facts. And Carol and Dave score almost identically. That near-tie is not a bug; it is the model correctly learning that Ann parents both, so it treats them alike. It is also exactly why we needed filtering and realistic ties: without them, that near-tie between two true answers would scramble the score for no good reason.

That is the whole arc of this first chapter in miniature. We turned facts into geometry, saw that different relations demand different operations, fixed a principled way to grade predictions, and watched a real model recover the family. Everything after this is about the operation Fᵣ — how much it should be allowed to do, how to read which form a relation needs, and the surprise that all the named models are one operation wearing different clothes.

Run it yourself

The core is a dozen lines: build the graph, train a model, rank a query.

from kge.data import TripleFactory
from kge.models import DistMult
from kge.train import Trainer, TrainConfig
import torch

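# The family from the top of the post, including Eve, so all six people are candidates.
triples = [("Ann","spouse","Bob"), ("Bob","spouse","Ann"),
           ("Carol","spouse","Eve"), ("Eve","spouse","Carol"),
           ("Ann","parentOf","Carol"), ("Bob","parentOf","Carol"),
           ("Ann","parentOf","Dave"),  ("Carol","parentOf","Frank"),
           ("Eve","parentOf","Frank")]
g = TripleFactory(train=triples, valid=[], test=[], name="family")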

model = DistMult(g.n_entities, g.n_relations, dim=16)
Trainer(model, g, TrainConfig(mode="1vsall", epochs=300, lr=0.1), device="cpu").fit()

ann, parent = g.ent2id["Ann"], g.rel2id["parentOf"]
scores = model.score_all_tails(torch.tensor([ann]), torch.tensor([parent]))[0]
for e in sorted(range(g.n_entities), key=lambda i: -scores[i]):
    print(f"{g.id2ent[e]:6s} {scores[e]:+.3f}")   # Carol and Dave on top

The companion notebook builds the family graph, trains the model, ranks (Ann, parentOf, ?), and walks the filtered + realistic-tie protocol step by step — including the constant-scorer trap and why it gets caught.

▶ Run the notebook in Colab (no install): https://colab.research.google.com/github/asudjianto-xml/Knowledge-Graph-Geometry/blob/main/notebooks/ch01_knowledge_graphs.ipynb

Notebook: https://github.com/asudjianto-xml/Knowledge-Graph-Geometry/blob/main/notebooks/ch01_knowledge_graphs.ipynb
📦 Code: https://github.com/asudjianto-xml/Knowledge-Graph-Geometry — pip install "kge-geometric @ git+https://github.com/asudjianto-xml/Knowledge-Graph-Geometry.git"

A knowledge graph stores facts as relations, and a relation is an operation in space: a verb that moves a head toward a tail. Predicting a missing fact is then geometry — a score is a distance, a prediction is a ranking. The catch, and the opportunity, is that different relations demand different operations, and the right one wins on exactly the relations that need it. Next week: the simplest operation of all, translation, and the single thing it can never do.

Knowledge Graphs as Geometry is a free weekly series adapted from my book Knowledge Graph Embeddings as Geometric Operators. The posts carry the intuition and the runnable code; the book carries the full derivations. Subscribe to follow the whole argument — from a single translation to one operator that contains the entire model zoo.

Agus’s Substack
