> **Preview** — This feature is currently in preview and under active development. APIs and functionality may change. We recommend testing thoroughly before using it in production.
# Using Vectors with Agents

Vector databases give AI agents long-term memory and semantic retrieval capabilities. This guide shows how to integrate vectors with Ductape agents.
## Overview
There are two ways to use vectors with agents:
- **Long-Term Memory** - Automatic storage and retrieval of conversation context
- **Direct Access** - Tools that query vectors for specific information
## Method 1: Long-Term Memory
Configure an agent to automatically store and retrieve relevant memories using vectors.
### Setup
```typescript
import { VectorDBType, DistanceMetric } from '@ductape/sdk';

// 1. Create a vector database for agent memory
await ductape.vector.create({
  product: 'my-product',
  name: 'Agent Memory',
  tag: 'agent-memory',
  dbType: VectorDBType.PINECONE,
  dimensions: 1536,
  metric: DistanceMetric.COSINE,
  envs: [
    {
      slug: 'dev',
      endpoint: 'https://dev-index.pinecone.io',
      apiKey: process.env.PINECONE_API_KEY,
      index: 'agent-memories',
      namespace: 'support-agent',
    },
  ],
});

// 2. Define an agent with long-term memory
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'support-agent',
  name: 'Customer Support Agent',
  model: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
  },
  systemPrompt: `You are a helpful customer support agent.
Use your memory to remember previous conversations and provide personalized support.`,
  memory: {
    shortTerm: {
      maxMessages: 50,
      truncationStrategy: 'summarize',
    },
    longTerm: {
      enabled: true,
      vectorStore: 'agent-memory', // References the vector config tag above
      retrieveTopK: 5,
      minSimilarity: 0.7,
      autoStore: true, // Automatically store important interactions
      namespace: 'support-agent',
    },
  },
  tools: [/* your tools */],
});
```
### How It Works
With long-term memory enabled:
- **Automatic Retrieval** - Before each response, the agent queries the vector store for relevant past interactions
- **Context Injection** - Retrieved memories are added to the system prompt as context
- **Automatic Storage** - Important interactions are automatically stored for future retrieval
```typescript
// Run the agent - memories are handled automatically
const result = await ductape.agents.run({
  product: 'my-product',
  env: 'dev',
  tag: 'support-agent',
  input: 'I had an issue with my order last week, can you help?',
  sessionId: 'user-123', // Session ID for memory isolation
});

// The agent will automatically recall relevant memories
// from the user's previous conversations
```
### Memory Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `false` | Enable long-term memory |
| `vectorStore` | string | - | Tag of the vector config to use |
| `retrieveTopK` | number | `5` | Number of memories to retrieve |
| `minSimilarity` | number | `0.7` | Minimum similarity threshold |
| `autoStore` | boolean | `false` | Auto-store interactions |
| `namespace` | string | - | Namespace for memory isolation |
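To make `retrieveTopK` and `minSimilarity` concrete, here is a small sketch of how retrieved matches might be narrowed before being injected as context. This is not the SDK's internal code; the `Match` shape and `selectMemories` helper are illustrative.

```typescript
// Hypothetical shape of a vector-store match; field names are illustrative.
interface Match {
  id: string;
  score: number; // cosine similarity in [0, 1] for normalized embeddings
  content: string;
}

// Keep only matches at or above minSimilarity, then take the top K by
// score - the same narrowing that retrieveTopK / minSimilarity describe.
function selectMemories(matches: Match[], topK: number, minSimilarity: number): Match[] {
  return matches
    .filter((m) => m.score >= minSimilarity)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Raising `minSimilarity` and lowering `topK` both trade recall for precision: fewer, more relevant memories reach the prompt.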
## Method 2: Direct Vector Access
Give agents tools that directly query vector databases.
### Knowledge Base Tool
```typescript
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'knowledge-agent',
  name: 'Knowledge Base Agent',
  model: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
  },
  systemPrompt: `You are a helpful assistant with access to a knowledge base.
Use the search_knowledge tool to find relevant information before answering questions.`,
  tools: [
    {
      tag: 'search-knowledge',
      name: 'Search Knowledge Base',
      description: 'Search the knowledge base for relevant documents',
      parameters: {
        query: {
          type: 'string',
          description: 'The search query',
          required: true,
        },
        category: {
          type: 'string',
          description: 'Optional category filter',
          enum: ['faq', 'documentation', 'tutorials'],
        },
        limit: {
          type: 'number',
          description: 'Maximum number of results',
          default: 5,
        },
      },
      handler: async (ctx, params) => {
        // Generate an embedding for the query
        const queryVector = await generateEmbedding(params.query);

        // Query the vector database
        const results = await ctx.recall({
          vector: queryVector,
          topK: params.limit || 5,
          filter: params.category ? { category: params.category } : undefined,
          includeMetadata: true,
        });

        // Return formatted results
        return results.matches.map((match) => ({
          title: match.metadata?.title,
          content: match.metadata?.content,
          relevance: match.score,
        }));
      },
    },
  ],
});
```
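The handler above calls a `generateEmbedding` helper that the snippet leaves undefined. In practice you would call your embedding provider's API here; as a minimal, deterministic placeholder for local experimentation (a hashing trick, not a real embedding model), it might look like:

```typescript
// Placeholder embedding: hashes whitespace-separated tokens into a
// fixed-size vector and L2-normalizes it. Swap this for a real embedding
// API call in production - the hash carries no semantic meaning.
async function generateEmbedding(text: string, dimensions = 1536): Promise<number[]> {
  const vector = new Array<number>(dimensions).fill(0);
  for (const token of text.toLowerCase().split(/\s+/)) {
    let hash = 0;
    for (let i = 0; i < token.length; i++) {
      hash = (hash * 31 + token.charCodeAt(i)) >>> 0; // simple rolling hash
    }
    vector[hash % dimensions] += 1;
  }
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0)) || 1;
  return vector.map((v) => v / norm);
}
```

Whatever implementation you use, its output length must match the `dimensions` you configured for the index (1536 in the setup above).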
## RAG (Retrieval-Augmented Generation)
Build a full RAG pipeline with vectors and agents:
```typescript
const ragAgent = await ductape.agents.define({
  product: 'my-product',
  tag: 'rag-agent',
  name: 'RAG Assistant',
  model: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
  },
  systemPrompt: `You are an AI assistant that answers questions based on retrieved documents.
Always search for relevant documents before answering.
Cite your sources by referencing document titles.
If no relevant documents are found, say so.`,
  tools: [
    {
      tag: 'search-documents',
      description: 'Search for relevant documents to answer the question',
      parameters: {
        query: {
          type: 'string',
          description: 'Search query based on the user question',
          required: true,
        },
      },
      handler: async (ctx, params) => {
        const embedding = await generateEmbedding(params.query);
        const results = await ctx.recall({
          vector: embedding,
          topK: 5,
          minScore: 0.7,
          includeMetadata: true,
        });

        if (results.matches.length === 0) {
          return { found: false, message: 'No relevant documents found' };
        }

        return {
          found: true,
          documents: results.matches.map((match) => ({
            title: match.metadata?.title,
            content: match.metadata?.content,
            source: match.metadata?.source,
            score: match.score,
          })),
        };
      },
    },
  ],
});

// Use the RAG agent
const result = await ductape.agents.run({
  product: 'my-product',
  env: 'dev',
  tag: 'rag-agent',
  input: 'How do I configure authentication in Ductape?',
});
```
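A RAG agent is only as good as the documents indexed into its vector store. Before embedding, long documents are usually split into overlapping chunks so each vector stays within the embedding model's context and sentences that straddle a boundary survive in at least one chunk. A minimal sketch (the `chunkText` helper and its parameters are illustrative, not part of the Ductape SDK):

```typescript
// Split text into chunks of roughly chunkSize characters, repeating
// `overlap` characters between consecutive chunks so content at a
// boundary appears intact in at least one chunk.
function chunkText(text: string, chunkSize = 800, overlap = 100): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    start += chunkSize - overlap;
  }
  return chunks;
}
```

Each chunk would then be embedded and upserted with metadata (`title`, `source`) so the retrieval tool above can cite it.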
## Remember and Recall
Use the built-in context methods `ctx.remember` and `ctx.recall` for memory operations:
```typescript
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'memory-agent',
  name: 'Memory Agent',
  model: { provider: 'anthropic', model: 'claude-sonnet-4-20250514' },
  systemPrompt: 'You are an assistant that remembers user preferences.',
  tools: [
    {
      tag: 'save-preference',
      description: 'Save a user preference for future reference',
      parameters: {
        preference: {
          type: 'string',
          description: 'The preference to remember',
          required: true,
        },
        category: {
          type: 'string',
          description: 'Category of preference',
          required: true,
        },
      },
      handler: async (ctx, params) => {
        // Store in the vector database
        await ctx.remember({
          content: params.preference,
          metadata: {
            type: 'preference',
            category: params.category,
            userId: ctx.sessionId,
            timestamp: new Date().toISOString(),
          },
        });
        return { saved: true };
      },
    },
    {
      tag: 'get-preferences',
      description: 'Retrieve user preferences',
      parameters: {
        category: {
          type: 'string',
          description: 'Category to search',
        },
      },
      handler: async (ctx, params) => {
        const results = await ctx.recall({
          query: params.category || 'user preferences',
          filter: {
            type: 'preference',
            userId: ctx.sessionId,
          },
          topK: 10,
        });
        return results.matches.map((m) => m.metadata?.content);
      },
    },
  ],
});
```
## Using `handlerRef` for Portable Tools

For tools that should work when loaded from the database, use `handlerRef` instead of an inline handler:
```typescript
const agent = await ductape.agents.define({
  product: 'my-product',
  tag: 'portable-agent',
  name: 'Portable Agent',
  model: { provider: 'anthropic', model: 'claude-sonnet-4-20250514' },
  systemPrompt: 'You have access to various tools.',
  tools: [
    {
      tag: 'search-docs',
      description: 'Search documentation',
      parameters: {
        query: { type: 'string', required: true },
      },
      // Handler reference - resolved at runtime
      handlerRef: 'feature:doc-search',
    },
    {
      tag: 'get-customer',
      description: 'Get customer information',
      parameters: {
        customerId: { type: 'string', required: true },
      },
      handlerRef: 'database:customers-db:find-customer',
    },
  ],
});
```
The `handlerRef` format is `type:tag:event` (the `event` segment is omitted for features):

- `action:app-tag:event-name` - Call an app action
- `feature:feature-tag` - Run a feature
- `database:db-tag:event-name` - Database operation
- `graph:graph-tag:action` - Graph operation
- `storage:storage-tag:event` - Storage operation
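To illustrate how such a reference decomposes (this is not the SDK's resolver, only a sketch of the string format), splitting on `:` yields the type, tag, and optional event:

```typescript
// Split a handlerRef string into its type, tag, and optional event.
// Illustrative only - the SDK resolves these references internally.
function parseHandlerRef(ref: string): { type: string; tag: string; event?: string } {
  const [type, tag, event] = ref.split(':');
  if (!type || !tag) throw new Error(`Invalid handlerRef: ${ref}`);
  return event ? { type, tag, event } : { type, tag };
}
```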
## Best Practices

### 1. Use Appropriate Embedding Models
Match your embedding model to your use case:
| Use Case | Recommended Model | Dimensions |
|---|---|---|
| General text | OpenAI text-embedding-ada-002 | 1536 |
| High quality | OpenAI text-embedding-3-large | 3072 |
| Multilingual | Cohere embed-multilingual-v3.0 | 1024 |
| Fast/cheap | all-MiniLM-L6-v2 | 384 |
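Whichever model you choose, the index's `dimensions` setting must match the model's output size, and with `DistanceMetric.COSINE` matches are ranked by cosine similarity. For intuition, the metric itself is straightforward:

```typescript
// Cosine similarity between two equal-length vectors: 1 means identical
// direction, 0 means orthogonal (unrelated), -1 means opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

The `Dimension mismatch` error is the same failure you hit when querying an index with vectors from a model whose output size differs from the index's `dimensions`.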
### 2. Namespace by Context

```typescript
memory: {
  longTerm: {
    enabled: true,
    vectorStore: 'agent-memory',
    namespace: `user-${userId}`, // Isolate memories by user
  },
}
```
### 3. Set Appropriate Similarity Thresholds

```typescript
memory: {
  longTerm: {
    enabled: true,
    vectorStore: 'agent-memory',
    minSimilarity: 0.75, // Higher threshold = more relevant results
    retrieveTopK: 3, // Fewer results = less noise
  },
}
```
### 4. Combine Short- and Long-Term Memory

```typescript
memory: {
  shortTerm: {
    maxMessages: 20,
    truncationStrategy: 'summarize',
  },
  longTerm: {
    enabled: true,
    vectorStore: 'agent-memory',
    retrieveTopK: 5,
  },
}
```
## Next Steps

- **Agents Overview** - Learn more about building agents
- **Best Practices** - Vector database optimization
- **Workflows** - Combine agents with workflows