Glossary
The terms, defined plainly.
No jargon for its own sake. Plain definitions of the words our market leans on, with the honest version where it matters.
- Agentic workflow
- An agentic workflow is a process where an AI agent — not a human — plans the steps, calls tools or APIs, checks intermediate results and iterates toward a goal. It is fundamentally different from a chatbot that just answers a single question.
- AI agent
- An AI agent is a system that uses a language model to plan and take actions toward a goal, calling tools, APIs or other steps rather than only returning text. Agents can automate multi-step workflows, but they add reliability and oversight challenges that production engineering has to handle.
- Build vs buy
- Build vs buy is the decision to develop a capability in-house or acquire it from a vendor or SaaS product. The answer turns on whether the capability is your real differentiation, the total cost over its realistic lifetime, the time-to-market, and the maintenance load you can sustain afterwards.
- Context window
- A context window is the maximum amount of text, measured in tokens, that a large language model can handle in a single call, covering both the input you send and the output it generates. Once the conversation, documents and answer together exceed that budget, something has to be cut, summarised or retrieved on demand.
- Dedicated development team
- A dedicated development team is a self-contained group of engineers that works only for you on a body of work, with its own day-to-day coordination. You set the direction and priorities; the team handles the internal mechanics of delivering against them, so you rent outcomes rather than hours.
- Embeddings
- Embeddings are numeric representations — vectors — of text, images or audio that place similar meanings close together in a high-dimensional space. They turn fuzzy human concepts into coordinates, so questions like 'is this paragraph about the same thing as that one' become a measurable distance rather than a guess.
- Evals
- Evals are automated tests for LLM-powered features that measure accuracy, safety and regression on a curated set of representative inputs. They are the thing that turns prompt engineering into production engineering: without evals you cannot tell whether your last change made the system better or worse.
- Fine-tuning
- Fine-tuning is further training a pre-trained AI model on a smaller, specific dataset so it adapts to a particular task, tone or domain. It is more involved than prompting or retrieval-augmented generation, and for most products those two solve the problem at lower cost and risk.
- Foundation model
- A foundation model is a large AI model trained on broad data that can be adapted to many downstream tasks. Foundation models exist for language, images, audio and multimodal use — the language ones are called LLMs.
- Function calling
- Function calling is a mechanism where a large language model, instead of answering with plain text, picks a developer-defined tool (a function or API) and returns a structured request to call it, which the application then executes. It is what turns a language model into an agent that can read a database, send an email or trigger a workflow.
- Hallucination
- Hallucination is when a large language model confidently produces information that is wrong or fabricated: names, citations and facts that look right but are not. It is not a bug; it is a property of statistical text generation.
- Human in the loop
- Human in the loop is designing an AI system so a person reviews, approves or edits the key steps before they take effect. It is the honest answer to how you make an agentic system safe enough to ship: you keep a human on the high-stakes decisions and automate the cheap ones.
- Inference
- Inference is running a trained model to produce an output, as opposed to training the model in the first place. In production AI features, inference is where the runtime cost lives — per token, per call, every time a user hits the system.
- IP assignment
- IP assignment is the set of contract clauses that transfer intellectual property in created work (code, designs, content) from the vendor and its engineers to the buyer. It is the plain-English 'work-for-hire' outcome that buyers expect by default, but it does not happen automatically in every jurisdiction.
- Knowledge transfer
- Knowledge transfer is the deliberate process of moving knowledge from one team or person to another so the work survives the move: onboarding sessions, documentation, pair programming, recorded decisions and a clear map of who knows what. Skipping it is the real risk in any outsourcing engagement or staff rotation, because the work is only as durable as the knowledge that backs it.
- Large language model (LLM)
- A large language model (LLM) is an AI model trained on vast amounts of text to predict and generate language. It powers chatbots, code assistants and the AI answer engines that increasingly sit between people and websites. The GPT, Claude and Gemini families are well-known examples.
- Managed services
- Managed services is an outsourcing model where the vendor owns the outcome end-to-end — people, process and delivery — instead of just supplying engineers. You buy a result against a defined scope and SLA, not hours, and the vendor is on the hook for hitting it.
- MLOps
- MLOps (machine-learning operations) is the practice of deploying, monitoring and maintaining AI and machine-learning models reliably in production. It covers versioning, evaluation, cost, latency and the pipelines that keep a model working after launch. This is the unglamorous work that decides whether an AI feature is dependable.
- Model Context Protocol (MCP)
- Model Context Protocol (MCP) is an open protocol introduced by Anthropic that lets AI agents connect to tools, data sources and applications in a standardised way. It is effectively a common plug-shape between language models and the world they need to read from and act on.
- Nearshoring
- Nearshoring is outsourcing software work to a country in a nearby time zone, so the team's working hours overlap with yours. For a Dutch or German company that usually means Central European Time, where a question is answered the same hour instead of overnight.
- Offshoring
- Offshoring is outsourcing software work to a distant country, typically with a six-to-twelve-hour time difference. It can deliver excellent work at a low hourly rate, but the time gap turns each clarification into a lost day, and you tend to discover quality issues only after the work is done.
- Onshoring
- Onshoring is keeping or moving software work entirely within your own country. It is usually the highest-cost option, but it has the shortest distance: no time-zone gap and no culture gap. That makes it the default when regulation, security or physical proximity dictates where the work has to sit.
- Prompt engineering
- Prompt engineering is the craft of writing instructions, examples and structure for a large language model so it produces reliable, useful outputs. The model is fixed; the prompt is the lever, which is why most production LLM work actually happens here rather than in the model itself.
- Retrieval-augmented generation (RAG)
- Retrieval-augmented generation (RAG) is a technique that gives a language model access to your own documents or data at query time, so answers are grounded in your information rather than only what the model learned in training. It is the most common way to build an accurate, up-to-date AI feature on top of a general model.
- Staff augmentation
- Staff augmentation is adding individual external engineers to your own team. They join your standups, take tickets from your board and are managed by your tech lead, so you rent skill while keeping full control of how the work is run.
- Statement of Work (SoW)
- A Statement of Work (SoW) is the document defining scope, deliverables, milestones, acceptance criteria and assumptions for a fixed engagement between buyer and vendor. It is the artifact that separates 'we have a contract' from 'we will figure it out as we go'.
- Time and materials
- Time and materials is a pricing model where the buyer pays for hours actually worked plus any materials used, rather than a fixed price for a fixed scope. It suits work whose shape is still emerging — research, ongoing maintenance, AI features where the unknowns are larger than the knowns.
- Time to hire
- Time to hire is the number of calendar days from the moment a team decides it needs a skill to the moment a productive engineer is actually doing the work. It is the hidden cost of in-house hiring, because every month a role stays open is a month of opportunity cost on whatever that engineer would have been building.
- Time zone overlap
- Time zone overlap is the number of working hours two teams share. Full overlap, such as a nearshore team on Central European Time working with a Dutch client, means questions resolve in minutes and work can be corrected mid-build. It is the single biggest predictor of whether a distributed team feels close.
- Total cost of ownership
- In software outsourcing, total cost of ownership is the real cost of an engagement beyond the hourly rate: onboarding and ramp, management overhead, rework from miscommunication, and churn. The hourly rate is usually only 65 to 75 percent of it, so a useful rule is to add roughly a third to a half (33 to 54 percent) to a quoted blended rate.
- Vector database
- A vector database is a database optimised for storing and querying embeddings, so that semantic similarity becomes a query you can run. It is the storage layer underneath retrieval-augmented generation and semantic search, where finding the closest meaning matters more than matching the exact words.
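To make the Embeddings, Retrieval-augmented generation and Vector database entries above concrete, here is a minimal Python sketch of the retrieval step. Everything in it is an assumption for illustration: the three-dimensional vectors are invented by hand (real embeddings come from an embedding model and have hundreds or thousands of dimensions), the in-memory dictionary stands in for a vector database, and `retrieve` and `build_prompt` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-made embeddings. A vector database would store these
# and answer "closest vectors" queries at scale.
DOCUMENTS = {
    "Invoices are due within 30 days.": [0.9, 0.1, 0.0],
    "Our office is in Rotterdam.": [0.1, 0.9, 0.1],
    "Late payments incur a 2% fee.": [0.8, 0.2, 0.1],
}


def retrieve(query_vector, top_k=2):
    """Return the top_k document texts whose embeddings are closest to the query."""
    ranked = sorted(
        DOCUMENTS,
        key=lambda text: cosine_similarity(query_vector, DOCUMENTS[text]),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, query_vector):
    """Ground the question in retrieved text before it is sent to a model."""
    context = "\n".join(retrieve(query_vector))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical embedding of the question below.
query_vector = [0.85, 0.15, 0.05]
print(build_prompt("When do I have to pay?", query_vector))
```

The two payment-related sentences score far higher than the office address, so only they end up in the prompt. That is the whole idea of retrieval: the model sees the relevant slice of your data rather than all of it, which also keeps the request inside the context window.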
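Terms by category
- AI: Agentic workflow, AI agent, Context window, Embeddings, Evals, Fine-tuning, Foundation model, Function calling, Hallucination, Human in the loop, Inference, Large language model (LLM), MLOps, Model Context Protocol (MCP), Prompt engineering, Retrieval-augmented generation (RAG), Vector database
- Contracts: Build vs buy, IP assignment, Statement of Work (SoW), Time and materials, Total cost of ownership
- Teams: Dedicated development team, Knowledge transfer, Managed services, Nearshoring, Offshoring, Onshoring, Staff augmentation, Time to hire, Time zone overlap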
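As a worked example for the Total cost of ownership entry, the Python sketch below adds the overheads that entry lists to a quoted hourly rate. The rate of 80 per hour and the individual overhead percentages are made-up figures chosen for illustration, and `total_cost_of_ownership` is a hypothetical helper, not a standard formula.

```python
def total_cost_of_ownership(hourly_rate, overheads):
    """Return (effective hourly cost, share of that cost covered by the quoted rate).

    overheads maps each cost driver to a fraction of the hourly rate.
    """
    uplift = sum(overheads.values())
    effective_rate = hourly_rate * (1 + uplift)
    rate_share = hourly_rate / effective_rate
    return effective_rate, rate_share


# Assumed overheads, each expressed as a fraction of the quoted rate.
overheads = {
    "onboarding_and_ramp": 0.10,
    "management_overhead": 0.12,
    "rework_from_miscommunication": 0.08,
    "churn": 0.05,
}

effective, share = total_cost_of_ownership(80.0, overheads)
print(f"Effective cost: {effective:.2f} per hour")       # Effective cost: 108.00 per hour
print(f"Quoted rate is {share:.0%} of the real cost")    # Quoted rate is 74% of the real cost
```

With these assumed numbers the overheads add 35 percent to the quote, which leaves the hourly rate at about 74 percent of what the engagement really costs. Swapping in your own estimates for each driver is more useful than relying on any single rule of thumb.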