> **Preview** — This feature is currently in preview and under active development. APIs and functionality may change. We recommend testing thoroughly before using it in production.
# Using Vectors with Agents

Vector databases give AI agents long-term memory and semantic retrieval capabilities. This guide shows how to integrate vectors with Ductape agents.
## Overview
There are two ways to use vectors with agents:
- **Long-Term Memory** - Automatic storage and retrieval of conversation context
- **Direct Access** - Tools that query vectors for specific information
## Method 1: Long-Term Memory
Configure an agent to automatically store and retrieve relevant memories using vectors.
### Setup
```typescript
import { VectorDBType, DistanceMetric } from '@ductape/sdk';

// 1. Create a vector database for agent memory
await ductape.vector.create({
  product: 'my-product',
  name: 'Agent Memory',
  tag: 'agent-memory',
  dbType: VectorDBType.PINECONE,
  dimensions: 1536,
  metric: DistanceMetric.COSINE,
  envs: [
    {
      slug: 'dev',
      endpoint: 'https://dev-index.pinecone.io',
      apiKey: process.env.PINECONE_API_KEY,
      index: 'agent-memories',
      namespace: 'support-agent',
    },
  ],
});

// 2. Define an agent with long-term memory
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'support-agent',
  name: 'Customer Support Agent',
  model: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
  },
  systemPrompt: `You are a helpful customer support agent.
Use your memory to remember previous conversations and provide personalized support.`,
  memory: {
    shortTerm: {
      maxMessages: 50,
      truncationStrategy: 'summarize',
    },
    longTerm: {
      enabled: true,
      vectorStore: 'agent-memory', // References the vector config tag above
      retrieveTopK: 5,
      minSimilarity: 0.7,
      autoStore: true, // Automatically store important interactions
      namespace: 'support-agent',
    },
  },
  tools: [/* your tools */],
});
```
### How It Works
With long-term memory enabled:
- **Automatic Retrieval** - Before each response, the agent queries the vector store for relevant past interactions
- **Context Injection** - Retrieved memories are added to the system prompt as context
- **Automatic Storage** - Important interactions are automatically stored for future retrieval
```typescript
// Run the agent - memories are handled automatically
const result = await ductape.agents.run({
  product: 'my-product',
  env: 'dev',
  tag: 'support-agent',
  input: 'I had an issue with my order last week, can you help?',
  sessionId: 'user-123', // Session ID for memory isolation
});

// The agent will automatically recall relevant memories
// from the user's previous conversations
```
### Memory Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `false` | Enable long-term memory |
| `vectorStore` | string | - | Tag of the vector config to use |
| `retrieveTopK` | number | `5` | Number of memories to retrieve |
| `minSimilarity` | number | `0.7` | Minimum similarity threshold |
| `autoStore` | boolean | `false` | Auto-store interactions |
| `namespace` | string | - | Namespace for memory isolation |
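To make `retrieveTopK` and `minSimilarity` concrete, here is a small sketch of how retrieved matches might be narrowed before being injected as context. This is not the SDK's internal code; the `Match` shape and `selectMemories` helper are illustrative.

```typescript
// Hypothetical shape of a vector-store match; field names are illustrative.
interface Match {
  id: string;
  score: number; // cosine similarity in [0, 1] for normalized embeddings
  content: string;
}

// Keep only matches at or above minSimilarity, then take the top K by
// score - the same narrowing that retrieveTopK / minSimilarity describe.
function selectMemories(matches: Match[], topK: number, minSimilarity: number): Match[] {
  return matches
    .filter((m) => m.score >= minSimilarity)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Raising `minSimilarity` and lowering `topK` both trade recall for precision: fewer, more relevant memories reach the prompt.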
## Method 2: Direct Vector Access
Give agents tools that directly query vector databases.
### Knowledge Base Tool
```typescript
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'knowledge-agent',
  name: 'Knowledge Base Agent',
  model: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
  },
  systemPrompt: `You are a helpful assistant with access to a knowledge base.
Use the search_knowledge tool to find relevant information before answering questions.`,
  tools: [
    {
      tag: 'search-knowledge',
      name: 'Search Knowledge Base',
      description: 'Search the knowledge base for relevant documents',
      parameters: {
        query: {
          type: 'string',
          description: 'The search query',
          required: true,
        },
        category: {
          type: 'string',
          description: 'Optional category filter',
          enum: ['faq', 'documentation', 'tutorials'],
        },
        limit: {
          type: 'number',
          description: 'Maximum number of results',
          default: 5,
        },
      },
      handler: async (ctx, params) => {
        // Generate an embedding for the query
        const queryVector = await generateEmbedding(params.query);

        // Query the vector database
        const results = await ctx.recall({
          vector: queryVector,
          topK: params.limit || 5,
          filter: params.category ? { category: params.category } : undefined,
          includeMetadata: true,
        });

        // Return formatted results
        return results.matches.map((match) => ({
          title: match.metadata?.title,
          content: match.metadata?.content,
          relevance: match.score,
        }));
      },
    },
  ],
});
```
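The handler above calls a `generateEmbedding` helper that the snippet leaves undefined. In practice you would call your embedding provider's API here; as a minimal, deterministic placeholder for local experimentation (a hashing trick, not a real embedding model), it might look like:

```typescript
// Placeholder embedding: hashes whitespace-separated tokens into a
// fixed-size vector and L2-normalizes it. Swap this for a real embedding
// API call in production - the hash carries no semantic meaning.
async function generateEmbedding(text: string, dimensions = 1536): Promise<number[]> {
  const vector = new Array<number>(dimensions).fill(0);
  for (const token of text.toLowerCase().split(/\s+/)) {
    let hash = 0;
    for (let i = 0; i < token.length; i++) {
      hash = (hash * 31 + token.charCodeAt(i)) >>> 0; // simple rolling hash
    }
    vector[hash % dimensions] += 1;
  }
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0)) || 1;
  return vector.map((v) => v / norm);
}
```

Whatever implementation you use, its output length must match the `dimensions` you configured for the index (1536 in the setup above).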
## RAG (Retrieval-Augmented Generation)
Build a full RAG pipeline with vectors and agents:
```typescript
const ragAgent = await ductape.agents.define({
  product: 'my-product',
  tag: 'rag-agent',
  name: 'RAG Assistant',
  model: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
  },
  systemPrompt: `You are an AI assistant that answers questions based on retrieved documents.
Always search for relevant documents before answering.
Cite your sources by referencing document titles.
If no relevant documents are found, say so.`,
  tools: [
    {
      tag: 'search-documents',
      description: 'Search for relevant documents to answer the question',
      parameters: {
        query: {
          type: 'string',
          description: 'Search query based on the user question',
          required: true,
        },
      },
      handler: async (ctx, params) => {
        const embedding = await generateEmbedding(params.query);
        const results = await ctx.recall({
          vector: embedding,
          topK: 5,
          minScore: 0.7,
          includeMetadata: true,
        });

        if (results.matches.length === 0) {
          return { found: false, message: 'No relevant documents found' };
        }

        return {
          found: true,
          documents: results.matches.map((match) => ({
            title: match.metadata?.title,
            content: match.metadata?.content,
            source: match.metadata?.source,
            score: match.score,
          })),
        };
      },
    },
  ],
});

// Use the RAG agent
const result = await ductape.agents.run({
  product: 'my-product',
  env: 'dev',
  tag: 'rag-agent',
  input: 'How do I configure authentication in Ductape?',
});
```
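A RAG agent is only as good as the documents indexed into its vector store. Before embedding, long documents are usually split into overlapping chunks so each vector stays within the embedding model's context and sentences that straddle a boundary survive in at least one chunk. A minimal sketch (the `chunkText` helper and its parameters are illustrative, not part of the Ductape SDK):

```typescript
// Split text into chunks of roughly chunkSize characters, repeating
// `overlap` characters between consecutive chunks so content at a
// boundary appears intact in at least one chunk.
function chunkText(text: string, chunkSize = 800, overlap = 100): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    start += chunkSize - overlap;
  }
  return chunks;
}
```

Each chunk would then be embedded and upserted with metadata (`title`, `source`) so the retrieval tool above can cite it.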
## Remember and Recall
Use the built-in context methods `ctx.remember` and `ctx.recall` for memory operations:
```typescript
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'memory-agent',
  name: 'Memory Agent',
  model: { provider: 'anthropic', model: 'claude-sonnet-4-20250514' },
  systemPrompt: 'You are an assistant that remembers user preferences.',
  tools: [
    {
      tag: 'save-preference',
      description: 'Save a user preference for future reference',
      parameters: {
        preference: {
          type: 'string',
          description: 'The preference to remember',
          required: true,
        },
        category: {
          type: 'string',
          description: 'Category of preference',
          required: true,
        },
      },
      handler: async (ctx, params) => {
        // Store in the vector database
        await ctx.remember({
          content: params.preference,
          metadata: {
            type: 'preference',
            category: params.category,
            userId: ctx.sessionId,
            timestamp: new Date().toISOString(),
          },
        });
        return { saved: true };
      },
    },
    {
      tag: 'get-preferences',
      description: 'Retrieve user preferences',
      parameters: {
        category: {
          type: 'string',
          description: 'Category to search',
        },
      },
      handler: async (ctx, params) => {
        const results = await ctx.recall({
          query: params.category || 'user preferences',
          filter: {
            type: 'preference',
            userId: ctx.sessionId,
          },
          topK: 10,
        });
        return results.matches.map((m) => m.metadata?.content);
      },
    },
  ],
});
```
## Using `handlerRef` for Portable Tools

For tools that should work when loaded from the database, use `handlerRef` instead of an inline handler:
```typescript
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'portable-agent',
  name: 'Portable Agent',
  model: { provider: 'anthropic', model: 'claude-sonnet-4-20250514' },
  systemPrompt: 'You have access to various tools.',
  tools: [
    {
      tag: 'search-docs',
      description: 'Search documentation',
      parameters: {
        query: { type: 'string', required: true },
      },
      // Handler reference - resolved at runtime
      handlerRef: 'feature:doc-search',
    },
    {
      tag: 'get-customer',
      description: 'Get customer information',
      parameters: {
        customerId: { type: 'string', required: true },
      },
      handlerRef: 'database:customers-db:find-customer',
    },
  ],
});
```
The `handlerRef` format is `type:tag:event` (the `event` segment is omitted for features):

- `action:app-tag:event-name` - Call an app action
- `feature:feature-tag` - Run a feature
- `database:db-tag:event-name` - Database operation
- `graph:graph-tag:action` - Graph operation
- `storage:storage-tag:event` - Storage operation
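To illustrate how such a reference decomposes (this is not the SDK's resolver, only a sketch of the string format), splitting on `:` yields the type, tag, and optional event:

```typescript
// Split a handlerRef string into its type, tag, and optional event.
// Illustrative only - the SDK resolves these references internally.
function parseHandlerRef(ref: string): { type: string; tag: string; event?: string } {
  const [type, tag, event] = ref.split(':');
  if (!type || !tag) throw new Error(`Invalid handlerRef: ${ref}`);
  return event ? { type, tag, event } : { type, tag };
}
```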
## Best Practices

### 1. Use Appropriate Embedding Models
Match your embedding model to your use case:
| Use Case | Recommended Model | Dimensions |
|---|---|---|
| General text | OpenAI text-embedding-ada-002 | 1536 |
| High quality | OpenAI text-embedding-3-large | 3072 |
| Multilingual | Cohere embed-multilingual-v3.0 | 1024 |
| Fast/cheap | all-MiniLM-L6-v2 | 384 |
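Whichever model you choose, the index's `dimensions` setting must match the model's output size, and with `DistanceMetric.COSINE` matches are ranked by cosine similarity. For intuition, the metric itself is straightforward:

```typescript
// Cosine similarity between two equal-length vectors: 1 means identical
// direction, 0 means orthogonal (unrelated), -1 means opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

The `Dimension mismatch` error is the same failure you hit when querying an index with vectors from a model whose output size differs from the index's `dimensions`.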
### 2. Namespace by Context

```typescript
memory: {
  longTerm: {
    enabled: true,
    vectorStore: 'agent-memory',
    namespace: `user-${userId}`, // Isolate memories by user
  },
}
```
### 3. Set Appropriate Similarity Thresholds

```typescript
memory: {
  longTerm: {
    enabled: true,
    vectorStore: 'agent-memory',
    minSimilarity: 0.75, // Higher threshold = more relevant results
    retrieveTopK: 3, // Fewer results = less noise
  },
}
```
### 4. Combine Short- and Long-Term Memory

```typescript
memory: {
  shortTerm: {
    maxMessages: 20,
    truncationStrategy: 'summarize',
  },
  longTerm: {
    enabled: true,
    vectorStore: 'agent-memory',
    retrieveTopK: 5,
  },
}
```
## Next Steps

- **Agents Overview** - Learn more about building agents
- **Best Practices** - Vector database optimization
- **Workflows** - Combine agents with workflows