Memory
Introduction
Canso Memory is a powerful memory abstraction within the Canso AI Agentic System that enables AI agents to store, retrieve, update, and delete information from vector databases. This system serves as a long-term memory for AI agents, allowing them to maintain context across conversations and tasks. This guide will help you set up and start using Canso Memory with your Canso AI agents.
CansoMemory provides:
A standardized interface for interacting with vector databases
Automatic embedding generation for text content
Support for multiple embedding types and vector database backends
Collection management for organizing different types of memory
Seamless integration with Canso AI agents
Key Features
Persistent Memory Storage: Store information that persists across agent restarts and sessions
Semantic Search: Retrieve information based on semantic similarity rather than exact matches
Flexible Data Organization: Organize information into collections for different use cases
Simple API: Store and retrieve memory with just a few lines of code
Integration with AI Agents: Seamlessly incorporate memory capabilities into your Canso AI agents
Supported Technologies
Vector Databases
Milvus
Embedding Models
OpenAI Embeddings
What Is a Collection in Memory?
A collection in Milvus DB is similar to a table in traditional databases. It's a logical grouping of data entities that serves as the basic unit for data management. Collections help organize and store related information in a structured way that allows for efficient vector similarity search, which is essential for AI Agent memory.
Why Do We Need Collections?
For now, these collections are used for Text to SQL: they help the agent convert natural language questions into SQL queries by leveraging structured knowledge. Collections allow the AI agent to:
Store and organize different types of information - Each collection is designed to hold specific types of data relevant to the agent's operations.
Perform semantic search - By storing vector embeddings alongside text data, the agent can find information based on meaning, not just keywords.
Maintain context awareness - Collections help the agent understand the database structure, domain knowledge, and query patterns needed to generate accurate responses.
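To make the semantic search idea concrete, here is a minimal sketch in plain Python. It is not the CansoMemory API: the `search` helper, the entry layout, and the 3-dimensional vectors are hypothetical stand-ins (real embeddings have far more dimensions). It ranks stored entries by L2 distance to a query vector, which is the metric the collections below use.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query_vector, entries, top_k=1):
    """Return the top_k entries closest to query_vector, nearest first."""
    ranked = sorted(entries, key=lambda e: l2_distance(query_vector, e["embedding"]))
    return ranked[:top_k]


# Hypothetical toy embeddings, chosen by hand for illustration only.
entries = [
    {"text": "customers table", "embedding": [0.9, 0.1, 0.0]},
    {"text": "orders table", "embedding": [0.1, 0.9, 0.0]},
    {"text": "alerts table", "embedding": [0.0, 0.1, 0.9]},
]

# A query vector that lies near the "customers table" entry.
nearest = search([0.8, 0.2, 0.1], entries, top_k=2)
print([e["text"] for e in nearest])  # ['customers table', 'orders table']
```

The match is found by vector proximity alone, so a query phrased with different words can still land on the right entry as long as its embedding is close.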
Supported Collections
Below is a summary of the collections we currently support, with detailed field specifications for each:
1. Table Metadata Collection
Purpose: Stores information about your database structure to help the agent understand the schema and generate accurate SQL queries.
Collection Name: canso_table_metadata
table_name (VARCHAR): Primary key; max length 200 characters; e.g., "customers"
schema (VARCHAR): JSON representation of the table schema; max length 65535 characters; e.g., {"columns": [{"name": "customer_id", "type": "INT"}, {"name": "name", "type": "VARCHAR"}]}
schema_embeddings (FLOAT_VECTOR): Vector embeddings of table_name and the schema information; dimension depends on the embedding model; e.g., [0.1, 0.2, ..., 0.5]
Index: IVF_FLAT with L2 metric type on the schema_embeddings field
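As an illustration of the field constraints above, the following sketch assembles the text fields of a table metadata entry and checks them against the documented limits. The function name `build_table_metadata_entry` is hypothetical and not part of CansoMemory. The sketch counts characters, as the field descriptions do, and leaves out schema_embeddings, which would come from the embedding model.

```python
import json

# Limits taken from the field descriptions above.
MAX_TABLE_NAME_LEN = 200
MAX_SCHEMA_LEN = 65535


def build_table_metadata_entry(table_name, columns):
    """Return the text fields of a table metadata entry (hypothetical helper)."""
    schema = json.dumps({"columns": columns})
    if len(table_name) > MAX_TABLE_NAME_LEN:
        raise ValueError(f"table_name exceeds {MAX_TABLE_NAME_LEN} characters")
    if len(schema) > MAX_SCHEMA_LEN:
        raise ValueError(f"schema exceeds {MAX_SCHEMA_LEN} characters")
    return {"table_name": table_name, "schema": schema}


table_entry = build_table_metadata_entry(
    "customers",
    [{"name": "customer_id", "type": "INT"}, {"name": "name", "type": "VARCHAR"}],
)
print(table_entry["schema"])
```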
2. Domain Knowledge Collection
Purpose: Contains context-specific information about your business domain to enable the agent to understand domain-specific concepts and translate them into SQL.
Collection Name: canso_domain_knowledge
fact (VARCHAR): Primary key; max length 200 characters; e.g., "Premium customers receive 10% discount"
explanation (VARCHAR): Detailed explanation of the domain fact; max length 65535 characters; e.g., "Our loyalty program offers a 10% discount to all customers with Premium status"
logic (VARCHAR): Business logic related to the fact; max length 65535 characters; e.g., "IF customer.status = 'Premium' THEN apply_discount(0.1)"
embeddings (FLOAT_VECTOR): Vector embeddings of the domain knowledge (fact, explanation, logic); dimension depends on the embedding model; e.g., [0.4, 0.1, 0.8, ..., 0.3]
Index: IVF_FLAT with L2 metric type on the embeddings field
3. Example Queries Collection
Purpose: Stores successful query patterns and examples to help the agent learn from past interactions and improve future query generation.
Collection Name: canso_examples
name (VARCHAR): Primary key; max length 200 characters; e.g., "monthly_sales_report"
description (VARCHAR): Description of the example query; max length 65535 characters; e.g., "Query to generate monthly sales report by product category"
content (VARCHAR): Actual query content; max length 65535 characters; e.g., "SELECT category, SUM(amount) FROM sales GROUP BY category ORDER BY SUM(amount) DESC"
embeddings (FLOAT_VECTOR): Vector embeddings of the example query; dimension depends on the embedding model; e.g., [0.7, 0.2, 0.1, ..., 0.6]
Index: IVF_FLAT with L2 metric type on the embeddings field
4. Column Metadata Collection
Purpose: Stores metadata for the columns in your database, including the possible values of low-cardinality columns. This helps the agent generate more accurate queries when a query must use exact column values. The collection also has an optional metadata field that can hold additional information such as aliases or synonyms for column values.
Collection Name: canso_column_metadata
table_column_composite_key (VARCHAR): Primary key; a composite of table_name and column_name for identification; max length 400 characters; e.g., "alerts|severity"
table_name (VARCHAR): Name of the table; max length 200 characters; e.g., "alerts"
column_name (VARCHAR): Name of the column; max length 200 characters; e.g., "severity"
candidate_values (ARRAY<VARCHAR>): Array of candidate values; each element is a VARCHAR with max_length=100; max_capacity=2048; e.g., ["high","low","medium"]
metadata (JSON): Stores additional metadata in JSON format. Because this is a JSON field, it can serve as a catch-all for any additional information; e.g., {"aliases": ["acceptable", "mid"]}
embeddings (FLOAT_VECTOR): Vector embeddings of a string concatenation of table_name and column_name; dimension depends on the embedding model; e.g., [0.7, 0.2, 0.1, ..., 0.6]
Index: IVF_FLAT with L2 metric type on the embeddings field
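The sketch below shows one way a column metadata entry could be assembled and validated before storage. It is an illustration, not the CansoMemory implementation: the helper name is hypothetical, and the "|" separator is an assumption inferred from the "alerts|severity" example. The embeddings field is omitted because it would be produced by the embedding model.

```python
# Limits taken from the field descriptions above.
MAX_KEY_LEN = 400
MAX_NAME_LEN = 200
MAX_VALUE_LEN = 100
MAX_VALUES = 2048


def build_column_metadata_entry(table_name, column_name, candidate_values, metadata=None):
    """Return the non-vector fields of a column metadata entry (hypothetical helper)."""
    for label, name in (("table_name", table_name), ("column_name", column_name)):
        if len(name) > MAX_NAME_LEN:
            raise ValueError(f"{label} exceeds {MAX_NAME_LEN} characters")
    # Assumed separator, inferred from the "alerts|severity" example.
    key = f"{table_name}|{column_name}"
    if len(key) > MAX_KEY_LEN:
        raise ValueError(f"composite key exceeds {MAX_KEY_LEN} characters")
    if len(candidate_values) > MAX_VALUES:
        raise ValueError(f"more than {MAX_VALUES} candidate values")
    for value in candidate_values:
        if len(value) > MAX_VALUE_LEN:
            raise ValueError(f"candidate value exceeds {MAX_VALUE_LEN} characters")
    return {
        "table_column_composite_key": key,
        "table_name": table_name,
        "column_name": column_name,
        "candidate_values": list(candidate_values),
        "metadata": metadata if metadata is not None else {},
    }


column_entry = build_column_metadata_entry(
    "alerts", "severity", ["high", "low", "medium"], {"aliases": ["acceptable", "mid"]}
)
print(column_entry["table_column_composite_key"])  # alerts|severity
```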
Collection Limitations and Some Notes
Field Length: The maximum length of any VARCHAR field in Milvus DB is 65535 characters. Similarly, an ARRAY field in Milvus DB can hold a maximum of 2048 elements.
Primary Keys: Each collection has a designated primary key field for unique identification.
Vector Embeddings: Each collection contains at least one vector field that stores the semantic representation of text data.
Index Types: Collections use the IVF_FLAT index type, which balances search speed and recall rate.
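The speed and recall trade-off of an inverted-file index such as IVF_FLAT can be shown with a toy model. This is a simplified sketch, not Milvus code: a real index learns its centroids by clustering, whereas here they are fixed by hand, and all names and vectors are hypothetical. Vectors are grouped into buckets by nearest centroid, and a query scans only the `nprobe` closest buckets.

```python
import math


def l2(a, b):
    """Euclidean (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(vectors, centroids):
    """Assign each vector id to the bucket of its nearest centroid."""
    buckets = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: l2(vec, centroids[i]))
        buckets[nearest].append(vec_id)
    return buckets


def ivf_search(query, vectors, centroids, buckets, nprobe=1, top_k=1):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    probe_order = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))
    candidates = [vid for i in probe_order[:nprobe] for vid in buckets[i]]
    candidates.sort(key=lambda vid: l2(query, vectors[vid]))
    return candidates[:top_k]


# Hand-picked 2-dimensional vectors and centroids, for illustration only.
vectors = {
    "a": [0.0, 0.0],
    "b": [0.1, 0.2],
    "c": [1.0, 1.0],
    "d": [0.9, 1.1],
    "e": [0.55, 0.6],
}
centroids = [[0.0, 0.0], [1.0, 1.0]]
buckets = build_index(vectors, centroids)

query = [0.4, 0.45]
print(ivf_search(query, vectors, centroids, buckets, nprobe=1))  # ['b']
print(ivf_search(query, vectors, centroids, buckets, nprobe=2))  # ['e']
```

With `nprobe=1` the search touches fewer vectors but misses "e", the true nearest neighbor, because "e" sits in the other bucket. Probing both buckets recovers it at the cost of scanning every vector.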
Memory Tool Integration
CansoMemory integrates with various tools including:
Memory Retrieval Tool: Allows agents to search and retrieve relevant memory during conversations
Text to SQL Tool: Uses memory to generate user-specific SQL queries based on schema information and examples stored in memory
Prerequisites
Access to a supported vector database (currently Milvus)
Appropriate credentials for embedding generation (OpenAI API key)
Setting Up the Vector Database
Before using CansoMemory, you need a vector database to store the embeddings. Canso currently supports Milvus as its vector database backend. You can deploy one as described here or use an external vector database.
Configure the Vector Database
Create a configuration file (e.g., config.yaml) with the following content:
Deploy the Vector Database
Use the Canso CLI to deploy the vector database to your cluster:
This command will provision a Milvus instance in your cluster with the specified configuration.
Basic Setup
1. Initialize a Memory Instance
2. Connect Memory to an Agent
Deploying the Agent
Using the CLI for Memory Management
CansoMemory can also be managed using the Canso CLI:
Storing Memory
Where data_file.json contains:
Updating Memory
Deleting Memory
Converse with an Agent Using Memory-Enhanced Context
Detailed Documentation
For more detailed information about CansoMemory, please refer to: