The working studio glossary.
Real definitions, real examples, and how Vedwix uses each term in client work. 110 terms across 6 categories.
Retrieval-Augmented Generation: an LLM technique where the model retrieves relevant documents before generating a response.
Fine-tuning: The process of training a base LLM further on your own data to specialize its outputs.
Embedding: A vector representation of text, image, or other data used for similarity search.
Vector Database: A database optimized for storing and querying high-dimensional vectors (embeddings).
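Embeddings and vector databases both reduce to one operation: nearest-neighbor search by similarity. A minimal sketch in TypeScript, using toy 4-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```typescript
// Cosine similarity: the dot product of two vectors divided by the
// product of their magnitudes. 1 = same direction, 0 = orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// A toy "vector database": score every stored vector against the
// query and return the top-k matches. Real vector DBs use ANN
// indexes (HNSW, IVF) to avoid this full scan.
function topK(
  query: number[],
  docs: { id: string; vector: number[] }[],
  k: number
): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The brute-force scan is fine for a few thousand documents; the approximate indexes only start paying off at larger scale.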
Hybrid Search: A search technique combining keyword (BM25) and semantic (vector) retrieval.
Reranker: A second-stage model that reorders retrieved results by relevance to the query.
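One common way to merge the keyword and semantic result lists in hybrid search is reciprocal rank fusion (RRF), which needs only each document's rank in each list. A sketch, assuming the conventional smoothing constant of 60:

```typescript
// Reciprocal Rank Fusion: score each doc as the sum of
// 1 / (k + rank) over every ranked list it appears in.
// Docs ranked highly in both lists float to the top.
function rrf(rankedLists: string[][], k = 60): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, i) => {
      // i is zero-based, so rank = i + 1
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

RRF is popular precisely because it ignores the incomparable raw scores (BM25 vs. cosine) and fuses on rank alone.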
Eval Harness: A test suite for AI features that measures quality, regressions, and edge cases.
AI Agent: An LLM-powered system that autonomously chooses tools and takes multi-step actions.
Tool Use: An LLM's ability to call external functions (search, calculator, database, etc.) during a response.
Function Calling: A structured way for LLMs to invoke developer-defined functions with typed arguments.
Structured Output: LLM responses constrained to a JSON schema or specific format.
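Function calling and structured output hinge on the same contract: the model's text must parse into a shape your code can trust. A sketch of validating a hypothetical model response against an expected shape (a library like Zod does this more thoroughly):

```typescript
// The shape we asked the model to produce (hypothetical example).
interface ProductSummary {
  name: string;
  price: number;
  inStock: boolean;
}

// Parse a raw model response and verify it matches the expected
// shape. Returns null instead of throwing so the caller can retry
// the LLM call on malformed output.
function parseProductSummary(raw: string): ProductSummary | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  const obj = data as Record<string, unknown> | null;
  if (
    typeof obj?.name === "string" &&
    typeof obj?.price === "number" &&
    typeof obj?.inStock === "boolean"
  ) {
    return { name: obj.name, price: obj.price, inStock: obj.inStock };
  }
  return null;
}
```

Treating malformed output as a retryable condition, rather than an exception, is what makes structured output reliable enough for production pipelines.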
LLM-as-Judge: Using one LLM to evaluate the outputs of another LLM (or itself) against criteria.
Prompt Engineering: The practice of crafting LLM inputs to produce better, more reliable outputs.
Chain of Thought: A prompting technique where the model is asked to reason step by step before answering.
Prompt Caching: API-level caching of prompt prefixes to reduce cost and latency on repeated calls.
Context Window: The maximum number of tokens an LLM can process in a single call.
Token: The unit of text an LLM processes — typically 3-4 characters or about 0.75 of a word.
Temperature: A parameter (0-2) controlling how random or deterministic an LLM's output is.
Top-p / Nucleus Sampling: A sampling parameter that limits LLM output to the smallest set of tokens whose probabilities sum to p.
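Temperature and top-p both act on the same object: the model's next-token probability distribution. A toy sketch (hypothetical logits, not from a real model) of how temperature reshapes the distribution and how the nucleus filter truncates it:

```typescript
// Apply temperature via softmax over raw logits. Temperature < 1
// sharpens the distribution; > 1 flattens it toward uniform.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Nucleus (top-p) filter: keep the smallest set of tokens whose
// cumulative probability reaches p, then renormalize so the kept
// probabilities sum to 1. Sampling happens from this reduced set.
function topPFilter(probs: number[], p: number): { index: number; prob: number }[] {
  const sorted = probs
    .map((prob, index) => ({ index, prob }))
    .sort((a, b) => b.prob - a.prob);
  const kept: { index: number; prob: number }[] = [];
  let cumulative = 0;
  for (const t of sorted) {
    kept.push(t);
    cumulative += t.prob;
    if (cumulative >= p) break;
  }
  const total = kept.reduce((a, t) => a + t.prob, 0);
  return kept.map((t) => ({ index: t.index, prob: t.prob / total }));
}
```

In practice you tune one or the other, rarely both: temperature changes how peaked the distribution is, top-p changes how much of its tail survives.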
LoRA: Low-Rank Adaptation, a lightweight fine-tuning method that trains small adapter layers on top of a frozen base model.
Supervised Fine-Tuning (SFT): Fine-tuning a model on labeled input-output pairs.
RLHF: Reinforcement Learning from Human Feedback, training a model based on human preference rankings of outputs.
DPO (Direct Preference Optimization): A simpler alternative to RLHF that trains directly on preference pairs without a reward model.
QLoRA: Quantized LoRA, which combines LoRA with 4-bit quantization to fine-tune large models on consumer GPUs.
Quantization: Reducing the numerical precision of model weights to make inference cheaper and faster.
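In its simplest form, quantization maps float weights onto a small integer grid with a shared scale factor. A toy symmetric int8 sketch (real schemes, like the 4-bit grids QLoRA uses, work per-block rather than per-tensor):

```typescript
// Symmetric int8 quantization: map floats in [-max|w|, +max|w|]
// onto integers in [-127, 127] using a single scale factor.
function quantize(weights: number[]): { q: number[]; scale: number } {
  const maxAbs = Math.max(...weights.map(Math.abs));
  const scale = maxAbs === 0 ? 1 : maxAbs / 127; // avoid divide-by-zero
  const q = weights.map((w) => Math.round(w / scale));
  return { q, scale };
}

// Dequantize: multiply back by the scale. The difference from the
// original weights is the quantization error, bounded by scale / 2.
function dequantize(q: number[], scale: number): number[] {
  return q.map((v) => v * scale);
}
```

The storage win is the point: each weight drops from 4 bytes (float32) to 1 byte, at the cost of that bounded rounding error.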
Small Language Model (SLM): A compact LLM (typically 1-15B parameters) optimized for specific tasks or constrained environments.
Mixture of Experts (MoE): An architecture where each forward pass routes tokens through a subset of "expert" sub-networks.
Transformer: The neural network architecture underlying virtually every modern LLM.
Attention Mechanism: The transformer component that lets each token in a sequence attend to other tokens.
Pretraining: The initial training of a foundation model on massive amounts of unlabeled text.
Foundation Model: A large pretrained model that can be adapted to many downstream tasks.
Frontier Model: The most capable AI models — currently models like Claude Opus, GPT-4, and Gemini Ultra.
Reasoning Model: An LLM trained or post-trained to perform multi-step reasoning, often using extended hidden thinking tokens.
Multimodal Model: An LLM that can process more than text — images, audio, video, or structured inputs.
AI Observability: Logging, tracing, and monitoring of LLM calls in production.
Red-Teaming: Adversarial testing of an AI system to find harmful, biased, or wrong outputs.
Jailbreak: A prompt that bypasses an LLM's safety training to make it produce restricted content.
Prompt Injection: An attack where malicious instructions in user input or retrieved data hijack the LLM's behavior.
Hallucination: When an LLM generates plausible-sounding but factually incorrect information.
Benchmarks: Standardized evaluation suites for comparing AI models on common tasks.
MCP (Model Context Protocol): An open protocol for connecting LLMs to external tools, data sources, and contexts.
A2A (Agent-to-Agent): Protocols and patterns for AI agents to discover, communicate, and coordinate with each other.
Multi-Agent System: A system where multiple AI agents collaborate or compete on a task.
Inference: The process of running an already-trained model to produce predictions or generations.
KV Cache: A runtime cache of attention key/value tensors that speeds up sequential token generation.
Speculative Decoding: An inference technique using a smaller "draft" model to propose tokens that a larger model verifies.
GGUF: A quantized model file format used for efficient CPU and GPU inference, popularized by llama.cpp.
Edge AI: Running AI models on devices (phones, browsers, IoT) rather than in the cloud.
Context Rot: A degradation in LLM quality as context length grows, even within the model's stated window.
Guardrails: Runtime checks that validate or filter LLM inputs and outputs against policies.
Total Cost of Ownership (TCO): The full cost of running an AI feature in production — inference, eval, observability, ops, and people.
Programmatic SEO (pSEO): Generating large numbers of SEO-optimized pages from templates and structured data.
Long-Tail Keywords: Specific, lower-volume search queries that collectively make up most search demand.
Keyword Research: Identifying search queries to target based on volume, difficulty, intent, and business fit.
Search Intent: The underlying goal behind a user's search query — informational, navigational, commercial, transactional.
E-E-A-T: Experience, Expertise, Authoritativeness, Trustworthiness — Google's framework for content quality.
YMYL: Your Money or Your Life — content where accuracy can affect health, finances, or wellbeing.
Helpful Content System: A Google ranking system that demotes pages written primarily for SEO rather than humans.
Thin Content: Pages with little or no unique value — typically targeted in pSEO penalties.
SERP: Search Engine Results Page — what users see after a search.
AI Overviews: Google's AI-generated summaries that appear at the top of SERPs for many queries.
llms.txt: A proposed standard file telling LLM crawlers how to find and use a site's content.
AI Search Optimization: Optimizing content for AI-powered search interfaces (ChatGPT, Perplexity, Claude, AI Overviews).
Featured Snippet: A boxed answer at the top of a SERP, pulled from one ranking page.
Schema Markup: Structured data added to a page to help search engines understand its content.
JSON-LD: A JSON-based format for embedding structured data (like schema markup) in web pages.
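Schema markup is usually shipped as a JSON-LD object inside a `<script type="application/ld+json">` tag. A sketch of generating one for a hypothetical FAQ page, using the schema.org `FAQPage` type:

```typescript
// Build a schema.org FAQPage object as plain data, then serialize
// it for embedding in a <script type="application/ld+json"> tag.
function faqJsonLd(faqs: { question: string; answer: string }[]): string {
  return JSON.stringify({
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: faqs.map((f) => ({
      "@type": "Question",
      name: f.question,
      acceptedAnswer: { "@type": "Answer", text: f.answer },
    })),
  });
}
```

Generating the object from the same data that renders the visible page keeps the markup and the content in sync, which matters because search engines cross-check the two.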
Core Web Vitals: Google's page experience metrics, comprising LCP, INP (formerly FID), and CLS.
XML Sitemap: A file listing all important URLs on a site, submitted to search engines.
Crawl Budget: The number of URLs a search engine will crawl on a site within a given timeframe.
Indexation: The process of search engines storing a page in their database for retrieval in search results.
Canonical Tag: A link tag indicating the preferred URL when multiple URLs serve the same content.
Meta Title (Title Tag): The HTML title element — the primary signal for what a page is about and the headline shown in SERPs.
Meta Description: A 150-160 character summary of a page that often appears under the title in SERPs.
On-Page SEO: Optimizations applied directly to a page — content, headings, links, schema, meta tags.
Internal Linking: Links between pages on the same site — a major signal of topical authority and crawl depth.
Backlink: A link from another site pointing to yours — a major off-page ranking signal.
Domain Authority: A Moz/Ahrefs-style aggregate score (0-100) estimating a domain's ability to rank.
Site Architecture: The hierarchical structure of a site — how URLs and internal links are organized.
URL Structure: The pattern of URL paths — affects SEO, UX, and shareability.
Pillar Page: A comprehensive top-level page on a topic, supporting and linked to by many narrower cluster pages.
Topic Cluster: A pillar page plus its supporting cluster pages — a coordinated set on one topic.
Pillar / Cluster Strategy: A content architecture pairing one comprehensive pillar page with many supporting cluster pages on a topic.
Design System: A coordinated set of design components, tokens, patterns, and guidelines used across a product.
Design Tokens: Named, semantic values for design properties — colors, spacing, typography — shared across platforms.
Brand System: The complete set of brand assets, rules, and applications — beyond just a logo.
Visual Identity: The visual elements — logo, type, color, imagery — that represent a brand.
Logo Family: A coordinated set of logo variants — primary, secondary, monogram, wordmark, lockups — for different uses.
Type System: A coordinated set of typefaces, weights, and rules for typography across a brand.
Motion System: Rules for how UI elements animate — duration, easing, choreography — applied consistently across a product.
Storybook: A tool for building and documenting UI components in isolation.
Server-Side Rendering (SSR): Rendering web pages on the server before sending HTML to the client.
Static Site Generation (SSG): Pre-rendering web pages to HTML at build time, served as static files.
Incremental Static Regeneration (ISR): A hybrid pattern where static pages are regenerated on-demand or on a timer.
Edge Rendering: Running render or API logic in CDN-edge locations close to the user, not in a single origin server.
CDN: Content Delivery Network — globally distributed servers that cache and serve content close to users.
LCP (Largest Contentful Paint): Core Web Vital measuring how fast the largest visible element loads — should be under 2.5s.
INP (Interaction to Next Paint): Core Web Vital (since March 2024) measuring responsiveness — should be under 200ms.
CLS (Cumulative Layout Shift): Core Web Vital measuring unexpected layout shifts during page load — should be under 0.1.
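The three thresholds above can be turned into a simple field-data classifier. A sketch using Google's published cutoffs: "good" at or below 2.5s / 200ms / 0.1, "poor" above 4s / 500ms / 0.25, "needs improvement" in between:

```typescript
type Rating = "good" | "needs-improvement" | "poor";

// Thresholds per metric: [good upper bound, poor lower bound].
// LCP and INP are in milliseconds; CLS is unitless.
const THRESHOLDS = {
  LCP: [2500, 4000],
  INP: [200, 500],
  CLS: [0.1, 0.25],
} as const;

function rate(metric: keyof typeof THRESHOLDS, value: number): Rating {
  const [good, poor] = THRESHOLDS[metric];
  if (value <= good) return "good";
  if (value <= poor) return "needs-improvement";
  return "poor";
}
```

In practice you would feed this real-user measurements (for example, values reported by the `web-vitals` library) rather than lab numbers, since Google ranks on field data.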
Next.js: A React framework for production-grade web apps — by Vercel.
React: A JavaScript library for building component-based UIs — by Meta.
TypeScript: A typed superset of JavaScript that compiles to plain JavaScript.
Tailwind CSS: A utility-first CSS framework that scales by composing small classes.
JAMstack: An architecture pattern built on client-side JavaScript, reusable APIs, and pre-rendered Markup served from a CDN.
A/B Testing: Comparing two variants of a UI or experience by randomly assigning users to each.
Product-Market Fit (PMF): When a product satisfies a market deeply enough that demand pulls it forward.
Cohort Analysis: Tracking groups of users (cohorts) over time to measure retention, behavior, and lifetime value.
LTV:CAC: The ratio of customer lifetime value to customer acquisition cost — a core SaaS health metric.
Jobs To Be Done (JTBD): A framework focusing on the underlying job a customer is trying to get done, rather than demographics.