GLOSSARY // ORGANIZED AI

Wiki

A–Z definitions for every term used across the guide and arch pages. Each entry has a short definition, a concrete example, and links to where the term appears in context. Skim it or use it as a lookup.

21 terms

3 sections

cross-linked

no jargon-as-decoration

// HOW TO USE

How to read this glossary

Every entry has the same three-part shape: definition (what the thing is, in plain English), example (a concrete instance you can point at), and see-also (where it appears in the other sites or related terms here).

def

plain English meaning

ex

concrete instance

see

where it appears

The color of a term name signals its primary domain: yellow marks a core retrieval concept that applies everywhere.

// TERMS A–E

Terms A–E

Agent

A software actor that can take actions on your behalf — make API calls, write to systems, decide what to do next — usually wrapped around an LLM with tools, memory, and guardrails.

ex: Orectic's "Oracle" is an agent. Penumbra doesn't ship one — you build your own on top of their substrate.

see → arch / Orectic

Chunk

A small slice of a document — a paragraph, a few sentences, sometimes a fixed token window — that gets embedded into a vector store as one unit of retrieval.

ex: a 500-token slice of an MSA contract, indexed alongside thousands of other slices, retrievable by similarity.

see → embedding, vector store, RAG
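A minimal sketch of fixed-window chunking, assuming whitespace-separated words stand in for real tokens. The function name `chunk_tokens` and the `overlap` parameter are illustrative choices, not something the guide prescribes.

```python
def chunk_tokens(text: str, window: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size windows of whitespace tokens, with overlap."""
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    tokens = text.split()  # crude stand-in for a model tokenizer
    step = window - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + window]))
        if start + window >= len(tokens):
            break  # this window already reached the end of the document
    return chunks


# A synthetic 1200-word document, split into 500-word windows.
doc = " ".join(f"w{i}" for i in range(1200))
chunks = chunk_tokens(doc, window=500, overlap=50)
print(len(chunks))  # 3
```

The overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk, at the cost of some duplicated storage.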

Domain model

A description of how a particular business actually works — its objects, the rules between them, the workflows that move them, the standards that judge them. The thing Penumbra has you write.

ex: for a private-equity firm: Deal, Memo, InvestmentCommittee, DiligenceFinding as objects; "every memo cites at least two findings" as a rule.

see → arch / Penumbra, ontology

Embedding

A vector of numbers — typically 384, 768, or 1536 floats — that represents the meaning of a piece of text. Similar meanings land close in vector space. The atomic unit of semantic search.

ex: the sentence "the dog barked" might embed to [0.12, -0.43, 0.91, ...]; "the puppy yelped" lands very close to it; "the car started" lands far away.

see → vector store, semantic search
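To make "close in vector space" concrete, here is a small sketch using cosine similarity. The three-dimensional vectors are made up for illustration; real embeddings come from a model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, not output from any real model.
dog_barked = [0.12, -0.43, 0.91]
puppy_yelped = [0.10, -0.40, 0.88]
car_started = [0.85, 0.35, -0.20]

print(cosine_similarity(dog_barked, puppy_yelped))  # close to 1
print(cosine_similarity(dog_barked, car_started))   # much lower
```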

Entity

A specific, named thing in your business — Customer Acme, Renewal Q3-2025, Decision DISC-2156. Knowledge graphs are made of entities (nodes) and the typed relationships between them (edges).

ex: Customer:Acme is one entity. Its identity is fixed across all the documents that mention it — that resolution is called entity linking.

see → knowledge graph, extraction
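Entity linking can be sketched as a lookup from surface mentions to one canonical ID. The alias table and the `link_entity` helper below are hypothetical; production systems usually combine lookups like this with fuzzy or model-based matching.

```python
from typing import Optional

# Hypothetical alias table: every known spelling maps to one canonical entity.
ALIASES = {
    "acme": "Customer:Acme",
    "acme corp": "Customer:Acme",
    "acme corporation": "Customer:Acme",
}


def link_entity(mention: str) -> Optional[str]:
    """Map a mention to its canonical entity ID, or None if it is unknown."""
    cleaned = mention.lower().replace(".", "").replace(",", "")
    key = " ".join(cleaned.split())  # collapse repeated whitespace
    return ALIASES.get(key)


print(link_entity("ACME Corp."))  # Customer:Acme
print(link_entity("Globex"))      # None
```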

Extraction

The process of pulling structured data — entities, relationships, dates, dollar amounts — out of unstructured text, audio, or video. The first half of Orectic's product.

ex: from a sales call transcript, extract Customer = Acme, Topic = renewal, NextStep = approval needed by Marta.

see → arch / Orectic, entity

// TERMS F–O

Terms F–O

GraphRAG

Hybrid retrieval that combines a vector store and a knowledge graph. The graph provides typed precision (the exact entity); the vector store provides recall (related unstructured chunks). The LLM gets both.

ex: "Why did we discount Acme?" → graph hops to Decision DISC-2156 (precise); vector finds the Slack thread debating it (texture); LLM synthesizes both.

see → guide / Combining, guide / Worked example

Guardrails

Rules and constraints that limit what an agent is allowed to do, see, or say. Often implemented as input/output filters, permission checks, or content classifiers.

ex: "this agent may read Customer records but never write to them" or "this response must cite at least one source from the graph."

see → arch / Penumbra (one of the generated components)
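The two example rules above could be sketched as a permission check plus an output filter. `Policy`, `is_allowed`, and `cites_source` are illustrative names, and the response shape (a dict with a `sources` list) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    can_read: frozenset
    can_write: frozenset


def is_allowed(policy: Policy, action: str, resource: str) -> bool:
    """Permission check: does the policy grant this action on this resource?"""
    granted = {"read": policy.can_read, "write": policy.can_write}
    return resource in granted.get(action, frozenset())


def cites_source(response: dict) -> bool:
    """Output filter: the response must carry at least one source."""
    return bool(response.get("sources"))


# "This agent may read Customer records but never write to them."
policy = Policy(can_read=frozenset({"Customer"}), can_write=frozenset())
print(is_allowed(policy, "read", "Customer"))   # True
print(is_allowed(policy, "write", "Customer"))  # False
```

Unknown actions fall through to an empty set, so anything not explicitly granted is denied.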

Hybrid retrieval

Any retrieval strategy that combines two or more methods — most commonly vector + graph, but also vector + keyword (BM25), or vector + structured SQL. GraphRAG is one popular instance.

ex: pure vector returns 50 candidate chunks; a graph traversal narrows to the 12 connected to the canonical entity; the LLM gets the intersection.

see → GraphRAG, guide / Worked example
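The narrowing step in the example can be sketched as an intersection that preserves the vector ranking. The chunk IDs and scores are invented, and `hybrid_filter` is a hypothetical helper rather than any library's API.

```python
def hybrid_filter(vector_hits, graph_chunk_ids):
    """Keep vector hits that the graph also links to the entity, in rank order."""
    allowed = set(graph_chunk_ids)
    return [(chunk_id, score) for chunk_id, score in vector_hits if chunk_id in allowed]


# Candidates from similarity search, already sorted best-first.
vector_hits = [("c7", 0.91), ("c2", 0.88), ("c9", 0.80), ("c4", 0.61)]
# Chunks the graph traversal found attached to the canonical entity.
graph_chunk_ids = ["c2", "c4", "c5"]

print(hybrid_filter(vector_hits, graph_chunk_ids))  # [('c2', 0.88), ('c4', 0.61)]
```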

Knowledge graph

A database of typed entities (nodes) and typed relationships (edges) between them. Lets you do multi-hop traversal — "find all decisions made by approvers in the Sales org affecting customers in EMEA last quarter."

ex: (Customer:Acme) -[placed]-> (Renewal:Q3-2025) -[decided]-> (Decision:DISC-2156).

see → ontology, GraphRAG, guide / Building blocks
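A multi-hop traversal over the example edges can be sketched with plain triples. A real deployment would use a graph database and its query language; the `build_index` and `traverse` helpers here are assumptions made for illustration.

```python
from collections import defaultdict

# (subject, edge type, object) triples taken from the example above.
TRIPLES = [
    ("Customer:Acme", "placed", "Renewal:Q3-2025"),
    ("Renewal:Q3-2025", "decided", "Decision:DISC-2156"),
]


def build_index(triples):
    """Index outgoing edges by (node, edge type) for fast hops."""
    index = defaultdict(list)
    for subject, edge, obj in triples:
        index[(subject, edge)].append(obj)
    return index


def traverse(index, start, path):
    """Follow a sequence of typed edges from start; return the nodes reached."""
    frontier = [start]
    for edge in path:
        frontier = [obj for node in frontier for obj in index.get((node, edge), [])]
    return frontier


index = build_index(TRIPLES)
print(traverse(index, "Customer:Acme", ["placed", "decided"]))  # ['Decision:DISC-2156']
```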

LLM

Large language model. The thing that takes a prompt (retrieved chunks + question) and writes the final answer. In all three retrieval approaches, the LLM is the synthesizer — the differences are what gets handed to it.

ex: Claude Sonnet 4.6, GPT-4o, Gemini 2.5 Pro. The retrieval method determines whether the model is doing reasoning or just paraphrasing.

see → RAG

Memory

Structured state an agent leaves behind so it (or another agent) can pick up where it stopped. Different from chat history — memory is typed and queryable, not just a transcript to scroll.

ex: after a triage agent processes a ticket, it writes {ticket: T-447, classification: refund, confidence: 0.92} to memory; the next agent reads that directly.

see → arch / Penumbra (one of the generated components)
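The difference between typed memory and a transcript can be sketched as a record type plus a store that supports lookups and filters. `TriageRecord` and `AgentMemory` are hypothetical names; the field values come from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TriageRecord:
    ticket: str
    classification: str
    confidence: float


class AgentMemory:
    """In-process stand-in for a shared, queryable memory store."""

    def __init__(self) -> None:
        self._records: dict[str, TriageRecord] = {}

    def write(self, record: TriageRecord) -> None:
        self._records[record.ticket] = record

    def read(self, ticket: str) -> Optional[TriageRecord]:
        return self._records.get(ticket)

    def by_classification(self, label: str, min_confidence: float = 0.0) -> list[TriageRecord]:
        return [
            r for r in self._records.values()
            if r.classification == label and r.confidence >= min_confidence
        ]


memory = AgentMemory()
memory.write(TriageRecord(ticket="T-447", classification="refund", confidence=0.92))
handoff = memory.read("T-447")  # the next agent reads typed fields directly
print(handoff.classification, handoff.confidence)  # refund 0.92
```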

Ontology

The schema of a knowledge graph — what entity types exist, what relationship types are valid between them, what attributes each entity carries. The thing Penumbra has you author. The thing Orectic infers from data.

ex: "a Decision always has an approver (Person) and a reason (string) and may have one or more citations (Document)."

see → knowledge graph, domain model
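The example rule can be sketched as types that reject invalid instances at construction time. This is one possible encoding using dataclasses, not how Penumbra or Orectic represent an ontology; the `id` field and the sample reason text are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Document:
    uri: str


@dataclass
class Decision:
    id: str
    approver: Person                  # always required
    reason: str                       # always required, non-empty
    citations: list[Document] = field(default_factory=list)  # zero or more

    def __post_init__(self) -> None:
        if not isinstance(self.approver, Person):
            raise TypeError("approver must be a Person")
        if not isinstance(self.reason, str) or not self.reason:
            raise ValueError("reason must be a non-empty string")
        if not all(isinstance(c, Document) for c in self.citations):
            raise TypeError("citations must be Documents")


decision = Decision(
    id="DISC-2156",
    approver=Person("Marta"),
    reason="hypothetical reason text",
    citations=[Document("slack#acme-deal")],
)
print(decision.approver.name)  # Marta
```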

// TERMS P–Z

Terms P–Z

Provenance

A traceable record of where an answer came from — which source documents, which graph entities, which prior decisions shaped it. Without provenance, an agent's output is unverifiable.

ex: the answer "Marta approved 15% off on 2025-08-14" cites DISC-2156 (typed record) and slack#acme-deal (chunk) as sources.

see → arch / Penumbra (one of the generated components)
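One way to carry provenance is to make sources part of the answer type itself, so an answer without them is easy to detect. The `Source` and `Answer` classes are illustrative assumptions; the values come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    kind: str  # for example "typed record" or "chunk"
    ref: str   # identifier of the record or chunk


@dataclass(frozen=True)
class Answer:
    text: str
    sources: tuple

    def is_verifiable(self) -> bool:
        """An answer with no sources cannot be traced back to evidence."""
        return len(self.sources) > 0


answer = Answer(
    text="Marta approved 15% off on 2025-08-14",
    sources=(Source("typed record", "DISC-2156"), Source("chunk", "slack#acme-deal")),
)
print([s.ref for s in answer.sources])  # ['DISC-2156', 'slack#acme-deal']
```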

RAG

Retrieval-augmented generation. The standard pattern: embed your documents into a vector store; at query time, find the nearest chunks; stuff them into the LLM's prompt; let the LLM answer with that context.

ex: nearly every "chat with your PDF" app released since 2023 is RAG. Cheap and common, but it hits a ceiling on multi-hop or precise-citation queries.

see → GraphRAG, guide / Combining
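The pattern above can be sketched end to end without any external service. Two loud assumptions: `embed` is a bag-of-words stand-in (lexical, not semantic) for a real embedding model, and the sketch stops at building the prompt instead of calling an LLM. The sample chunks are invented.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Stuff the retrieved chunks into the prompt ahead of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


chunks = [
    "the renewal discount for acme was approved in august",
    "the office moved to new premises in march",
    "acme asked for a discount on the renewal",
]
question = "why did acme get a discount"
prompt = build_prompt(question, retrieve(question, chunks))
print(prompt)
```

Swapping `embed` for a real model and sending `prompt` to an LLM turns the sketch into the full pattern; the retrieval and prompt assembly steps keep the same shape.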

Retrieval

The act of pulling relevant context out of a store (vector, graph, keyword, SQL) before handing it to an LLM. The "R" in RAG. Mostly invisible to end users but determines whether the answer is right.

ex: when you ask ChatGPT about a PDF you uploaded, the retrieval step decides which slices of the PDF make it into the prompt.

see → RAG, semantic search, hybrid retrieval

Schema

The shape of your structured data — the tables/types/fields/relations you allow. For graphs, schema is the ontology. For Orectic, schema is inferred. For Penumbra, schema is authored.

ex: "a Renewal has start_date (date), customer (Customer), amount (money), decision (Decision)."

see → ontology, domain model

Semantic search

The act of searching by meaning rather than keywords. Implemented by embedding the query, then finding nearest neighbors in a vector store. The thing that lets "I want a refund" match "cancel my order."

ex: query "puppy" retrieves chunks containing "dog," "canine," "lab mix," none of which share the literal word.

see → vector store, guide / Building blocks

Tools

Typed functions an agent can call — usually with structured input and output. Different from "any code the agent can write." A tool is a contract: a name, a JSON schema for arguments, a documented behavior.

ex: create_invoice(customer_id, amount, due_date) exposed to an agent so it can act on the AR system without hallucinating the API.

see → arch / Penumbra (one of the generated components)
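The "tool as contract" idea can be sketched as a spec (name, JSON schema, description) paired with a handler, plus a dispatcher that rejects calls the schema does not allow. The spec layout loosely follows common function-calling conventions but is an assumption, and the handler is hypothetical: a real one would call the AR system.

```python
from datetime import date

CREATE_INVOICE_TOOL = {
    "name": "create_invoice",
    "description": "Create an invoice in the AR system.",
    "parameters": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string"},
            "amount": {"type": "number"},
            "due_date": {"type": "string", "format": "date"},
        },
        "required": ["customer_id", "amount", "due_date"],
        "additionalProperties": False,
    },
}


def create_invoice(customer_id: str, amount: float, due_date: str) -> dict:
    """Hypothetical handler: validates inputs and echoes a result."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    date.fromisoformat(due_date)  # raises ValueError on a malformed date
    return {"status": "created", "customer_id": customer_id,
            "amount": amount, "due_date": due_date}


def call_tool(spec: dict, handler, arguments: dict) -> dict:
    """Check argument names against the schema, then invoke the handler."""
    schema = spec["parameters"]
    missing = [k for k in schema["required"] if k not in arguments]
    unknown = [k for k in arguments if k not in schema["properties"]]
    if missing or unknown:
        raise ValueError(f"bad arguments: missing={missing} unknown={unknown}")
    return handler(**arguments)


result = call_tool(
    CREATE_INVOICE_TOOL, create_invoice,
    {"customer_id": "acme", "amount": 1200.0, "due_date": "2025-09-30"},
)
print(result["status"])  # created
```

The dispatcher only checks argument names; a fuller version would also validate types against the schema before calling the handler.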

Vector

A list of floating-point numbers. In AI retrieval, it's the embedding of a chunk or query. The dimensionality (length) is fixed per embedding model.

ex: OpenAI's text-embedding-3-small produces 1536-dim vectors by default. Cohere's embed-english-v3.0 produces 1024-dim vectors.

see → embedding, vector store

Vector store

A database optimized for storing and similarity-searching vectors. Inputs go in as (id, vector, optional metadata). Queries return the top-k nearest neighbors, ranked by cosine similarity or dot product.

ex: Pinecone, Weaviate, Qdrant, Chroma, pgvector. All do roughly the same thing with different operational trade-offs.

see → semantic search, guide / Building blocks
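The (id, vector, metadata) in, top-k out contract can be sketched with a brute-force in-memory store. This is not the API of any product listed above; `InMemoryVectorStore` is a hypothetical class, and real stores typically use approximate nearest-neighbor indexes instead of scanning every row.

```python
import math
from typing import Optional


class InMemoryVectorStore:
    """Brute-force store: fine for a demo, too slow for large collections."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self._rows: dict = {}

    def upsert(self, id: str, vector: list, metadata: Optional[dict] = None) -> None:
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        self._rows[id] = (list(vector), metadata or {})

    def query(self, vector: list, k: int = 3) -> list:
        """Return up to k (id, score, metadata) tuples, best match first."""
        scored = [(id_, self._cosine(vector, v), meta)
                  for id_, (v, meta) in self._rows.items()]
        scored.sort(key=lambda row: row[1], reverse=True)
        return scored[:k]

    @staticmethod
    def _cosine(a: list, b: list) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0


# Made-up 3-dimensional vectors; real embeddings are far longer.
store = InMemoryVectorStore(dim=3)
store.upsert("dog", [0.12, -0.43, 0.91], {"text": "the dog barked"})
store.upsert("car", [0.85, 0.35, -0.20], {"text": "the car started"})

hits = store.query([0.10, -0.40, 0.88], k=1)
print(hits[0][0])  # dog
```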

// CROSS-REFERENCES

Where each term appears in context

If a term clicks into place better with surrounding prose than with a definition, follow these links.

Term	guide	arch	source
Vector store	building blocks	Orectic	—
Semantic search	building blocks	—	—
Knowledge graph	building blocks	Orectic	—
RAG	combining	—	—
GraphRAG	worked example	pipeline	—
Ontology	—	Penumbra	Penumbra refs
Extraction	—	Orectic	—
Provenance	worked example	Penumbra	—

// IMPLEMENTATION

Stack & conventions

Same single-file pattern as the rest of the hub. Glossary entries use a custom .term block — left-bordered, monospace term name in yellow, plain English definition, italicized example, see-also links in monospace.

Organized AI · Cloudflare Pages · wrangler 4.55 · single-file HTML

// DEPLOY & RUN

Deploy & run

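# One-time setup: create the Pages project. Errors (for example, the
# project already exists) are suppressed so the step is safe to re-run.
CLOUDFLARE_ACCOUNT_ID=691fe25d377abac03627d6a88d3eeac9 \
  wrangler pages project create orectic-penumbra-wiki \
  --production-branch main 2>/dev/null || true

# Deploy the contents of docs/wiki to the main (production) branch.
# --commit-dirty=true allows deploying with uncommitted changes.
cd docs/wiki
CLOUDFLARE_ACCOUNT_ID=691fe25d377abac03627d6a88d3eeac9 \
  wrangler pages deploy . \
  --project-name orectic-penumbra-wiki \
  --branch main \
  --commit-dirty=true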

GitHub

github.com/organized-ai/orectic-penumbra-docs

Siblings

guide · arch · source