Preview Feature — This feature is currently in preview and under active development. APIs and functionality may change. We recommend testing thoroughly before using in production.

Using Vectors with Agents

Vector databases give AI agents long-term memory and semantic retrieval capabilities. This guide shows how to integrate vectors with Ductape agents.

Overview

There are two ways to use vectors with agents:

  1. Long-Term Memory - Automatic storage and retrieval of conversation context
  2. Direct Access - Tools that query vectors for specific information

Method 1: Long-Term Memory

Configure an agent to automatically store and retrieve relevant memories using vectors.

Setup

```typescript
import { VectorDBType, DistanceMetric } from '@ductape/sdk';

// 1. Create a vector database for agent memory
await ductape.vector.create({
  product: 'my-product',
  name: 'Agent Memory',
  tag: 'agent-memory',
  dbType: VectorDBType.PINECONE,
  dimensions: 1536,
  metric: DistanceMetric.COSINE,
  envs: [
    {
      slug: 'dev',
      endpoint: 'https://dev-index.pinecone.io',
      apiKey: process.env.PINECONE_API_KEY,
      index: 'agent-memories',
      namespace: 'support-agent',
    },
  ],
});

// 2. Define an agent with long-term memory
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'support-agent',
  name: 'Customer Support Agent',
  model: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
  },
  systemPrompt: `You are a helpful customer support agent.
Use your memory to remember previous conversations and provide personalized support.`,
  memory: {
    shortTerm: {
      maxMessages: 50,
      truncationStrategy: 'summarize',
    },
    longTerm: {
      enabled: true,
      vectorStore: 'agent-memory', // Reference to the vector config tag
      retrieveTopK: 5,
      minSimilarity: 0.7,
      autoStore: true, // Automatically store important interactions
      namespace: 'support-agent',
    },
  },
  tools: [/* your tools */],
});
```

How It Works

With long-term memory enabled:

  1. Automatic Retrieval - Before each response, the agent queries the vector store for relevant past interactions
  2. Context Injection - Retrieved memories are added to the system prompt as context
  3. Automatic Storage - Important interactions are automatically stored for future retrieval

```typescript
// Run the agent - memories are handled automatically
const result = await ductape.agents.run({
  product: 'my-product',
  env: 'dev',
  tag: 'support-agent',
  input: 'I had an issue with my order last week, can you help?',
  sessionId: 'user-123', // Session ID for memory isolation
});

// The agent automatically recalls relevant memories
// from the user's previous conversations
```

Memory Configuration Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | false | Enable long-term memory |
| vectorStore | string | - | Tag of the vector config to use |
| retrieveTopK | number | 5 | Number of memories to retrieve |
| minSimilarity | number | 0.7 | Minimum similarity threshold |
| autoStore | boolean | false | Auto-store interactions |
| namespace | string | - | Namespace for memory isolation |
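Put together, a fully specified longTerm block looks like this (values are illustrative; documented defaults are noted in the comments):

```typescript
memory: {
  longTerm: {
    enabled: true,               // default: false
    vectorStore: 'agent-memory', // tag of the vector config created earlier
    retrieveTopK: 5,             // default: 5
    minSimilarity: 0.7,          // default: 0.7
    autoStore: true,             // default: false
    namespace: 'support-agent',  // optional isolation key
  },
},
```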

Method 2: Direct Vector Access

Give agents tools that directly query vector databases.

Knowledge Base Tool

```typescript
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'knowledge-agent',
  name: 'Knowledge Base Agent',
  model: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
  },
  systemPrompt: `You are a helpful assistant with access to a knowledge base.
Use the search-knowledge tool to find relevant information before answering questions.`,
  tools: [
    {
      tag: 'search-knowledge',
      name: 'Search Knowledge Base',
      description: 'Search the knowledge base for relevant documents',
      parameters: {
        query: {
          type: 'string',
          description: 'The search query',
          required: true,
        },
        category: {
          type: 'string',
          description: 'Optional category filter',
          enum: ['faq', 'documentation', 'tutorials'],
        },
        limit: {
          type: 'number',
          description: 'Maximum number of results',
          default: 5,
        },
      },
      handler: async (ctx, params) => {
        // Generate an embedding for the query
        const queryVector = await generateEmbedding(params.query);

        // Query the vector database
        const results = await ctx.recall({
          vector: queryVector,
          topK: params.limit || 5,
          filter: params.category ? { category: params.category } : undefined,
          includeMetadata: true,
        });

        // Return formatted results
        return results.matches.map((match) => ({
          title: match.metadata?.title,
          content: match.metadata?.content,
          relevance: match.score,
        }));
      },
    },
  ],
});
```

RAG (Retrieval-Augmented Generation)

Build a full RAG pipeline with vectors and agents:

```typescript
const ragAgent = await ductape.agents.define({
  product: 'my-product',
  tag: 'rag-agent',
  name: 'RAG Assistant',
  model: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
  },
  systemPrompt: `You are an AI assistant that answers questions based on retrieved documents.
Always search for relevant documents before answering.
Cite your sources by referencing document titles.
If no relevant documents are found, say so.`,
  tools: [
    {
      tag: 'search-documents',
      description: 'Search for relevant documents to answer the question',
      parameters: {
        query: {
          type: 'string',
          description: 'Search query based on the user question',
          required: true,
        },
      },
      handler: async (ctx, params) => {
        const embedding = await generateEmbedding(params.query);

        const results = await ctx.recall({
          vector: embedding,
          topK: 5,
          minScore: 0.7,
          includeMetadata: true,
        });

        if (results.matches.length === 0) {
          return { found: false, message: 'No relevant documents found' };
        }

        return {
          found: true,
          documents: results.matches.map((match) => ({
            title: match.metadata?.title,
            content: match.metadata?.content,
            source: match.metadata?.source,
            score: match.score,
          })),
        };
      },
    },
  ],
});

// Use the RAG agent
const result = await ductape.agents.run({
  product: 'my-product',
  env: 'dev',
  tag: 'rag-agent',
  input: 'How do I configure authentication in Ductape?',
});
```
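The handler above assumes documents are already in the vector store; ingestion is not covered in this guide. Before recall can find anything, source documents must be split into chunks, embedded, and written to the store. A minimal fixed-size chunker with overlap, shown as a hypothetical helper (not part of the SDK):

```typescript
// Hypothetical pre-processing helper (not part of the Ductape SDK):
// split a document into overlapping fixed-size chunks before embedding.
// The overlap keeps sentences that straddle a chunk boundary retrievable
// from either side.
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached
  }
  return chunks;
}
```

Each chunk would then be embedded and stored with metadata (title, source) so the search-documents handler can cite it.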

Remember and Recall

Use the built-in ctx.remember and ctx.recall context methods for memory operations:

```typescript
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'memory-agent',
  name: 'Memory Agent',
  model: { provider: 'anthropic', model: 'claude-sonnet-4-20250514' },
  systemPrompt: 'You are an assistant that remembers user preferences.',
  tools: [
    {
      tag: 'save-preference',
      description: 'Save a user preference for future reference',
      parameters: {
        preference: {
          type: 'string',
          description: 'The preference to remember',
          required: true,
        },
        category: {
          type: 'string',
          description: 'Category of preference',
          required: true,
        },
      },
      handler: async (ctx, params) => {
        // Store in the vector database
        await ctx.remember({
          content: params.preference,
          metadata: {
            type: 'preference',
            category: params.category,
            userId: ctx.sessionId,
            timestamp: new Date().toISOString(),
          },
        });

        return { saved: true };
      },
    },
    {
      tag: 'get-preferences',
      description: 'Retrieve user preferences',
      parameters: {
        category: {
          type: 'string',
          description: 'Category to search',
        },
      },
      handler: async (ctx, params) => {
        const results = await ctx.recall({
          query: params.category || 'user preferences',
          filter: {
            type: 'preference',
            userId: ctx.sessionId,
          },
          topK: 10,
        });

        return results.matches.map((m) => m.metadata?.content);
      },
    },
  ],
});
```

Using handlerRef for Portable Tools

For tools that should work when loaded from the database, use handlerRef:

```typescript
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'portable-agent',
  name: 'Portable Agent',
  model: { provider: 'anthropic', model: 'claude-sonnet-4-20250514' },
  systemPrompt: 'You have access to various tools.',
  tools: [
    {
      tag: 'search-docs',
      description: 'Search documentation',
      parameters: {
        query: { type: 'string', required: true },
      },
      // Handler reference - resolved at runtime
      handlerRef: 'feature:doc-search',
    },
    {
      tag: 'get-customer',
      description: 'Get customer information',
      parameters: {
        customerId: { type: 'string', required: true },
      },
      handlerRef: 'database:customers-db:find-customer',
    },
  ],
});
```

The handlerRef format is type:tag:event, where the event segment is omitted when not needed (as with features):

  • action:app-tag:event-name - Call an app action
  • feature:feature-tag - Run a feature
  • database:db-tag:event-name - Database operation
  • graph:graph-tag:action - Graph operation
  • storage:storage-tag:event - Storage operation
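A ref in this format can be validated before an agent definition is saved. The sketch below is a hypothetical helper, not an SDK function:

```typescript
// Hypothetical helper (not part of the Ductape SDK): split a handlerRef
// string into its type/tag/event parts, e.g. for validation or logging.
type HandlerRefType = 'action' | 'feature' | 'database' | 'graph' | 'storage';

const REF_TYPES: HandlerRefType[] = ['action', 'feature', 'database', 'graph', 'storage'];

function parseHandlerRef(ref: string): { type: HandlerRefType; tag: string; event?: string } {
  const [type, tag, event] = ref.split(':');
  if (!REF_TYPES.includes(type as HandlerRefType) || !tag) {
    throw new Error(`Invalid handlerRef: ${ref}`);
  }
  return { type: type as HandlerRefType, tag, ...(event !== undefined && { event }) };
}

// parseHandlerRef('database:customers-db:find-customer')
// → { type: 'database', tag: 'customers-db', event: 'find-customer' }
```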

Best Practices

1. Use Appropriate Embedding Models

Match your embedding model to your use case:

| Use Case | Recommended Model | Dimensions |
| --- | --- | --- |
| General text | OpenAI text-embedding-ada-002 | 1536 |
| High quality | OpenAI text-embedding-3-large | 3072 |
| Multilingual | Cohere embed-multilingual-v3.0 | 1024 |
| Fast/cheap | all-MiniLM-L6-v2 | 384 |
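The examples in this guide call a generateEmbedding helper without defining it. Any implementation works as long as the returned vector matches the dimensions the vector database was created with. One possible sketch against OpenAI's embeddings endpoint (assumes an OPENAI_API_KEY environment variable; the request-building step is separated from the network call):

```typescript
// Sketch of the generateEmbedding helper used throughout this guide.
// Any embedding provider works as long as the returned vector's length
// matches the `dimensions` value of the vector database.

function buildEmbeddingRequest(text: string, model = 'text-embedding-3-small') {
  return {
    url: 'https://api.openai.com/v1/embeddings',
    body: { model, input: text },
  };
}

async function generateEmbedding(text: string): Promise<number[]> {
  const { url, body } = buildEmbeddingRequest(text);
  const res = await fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Embedding request failed: ${res.status}`);
  const data = await res.json();
  return data.data[0].embedding; // 1536 numbers for text-embedding-3-small
}
```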

2. Namespace by Context

```typescript
memory: {
  longTerm: {
    enabled: true,
    vectorStore: 'agent-memory',
    namespace: `user-${userId}`, // Isolate by user
  },
}
```

3. Set Appropriate Similarity Thresholds

```typescript
memory: {
  longTerm: {
    enabled: true,
    vectorStore: 'agent-memory',
    minSimilarity: 0.75, // Higher threshold = more relevant results
    retrieveTopK: 3, // Fewer results = less noise
  },
}
```

4. Combine Short and Long-Term Memory

```typescript
memory: {
  shortTerm: {
    maxMessages: 20,
    truncationStrategy: 'summarize',
  },
  longTerm: {
    enabled: true,
    vectorStore: 'agent-memory',
    retrieveTopK: 5,
  },
}
```

Next Steps