Skip to main content

Knowledge Base Commands

Commands for managing the AI knowledge base, vector search, and pgvector integration.

Setup Commands

setup_knowbase

Setup Knowledge Base with pgvector extension and migrations.

python manage.py setup_knowbase [OPTIONS]

Options:

  • --skip-extensions - Skip creating PostgreSQL extensions

Complete Setup

First-Time Setup

python manage.py setup_knowbase

What it does:

  1. ✅ Creates pgvector extension for vector similarity search
  2. ✅ Creates pg_trgm extension for text search
  3. ✅ Runs required database migrations
  4. ✅ Sets up vector indexes

Output:

🚀 Setting up Knowledge Base...
📦 Creating PostgreSQL extensions...
✓ pgvector extension created
✓ pg_trgm extension created
✅ Knowledge Base setup completed!

Skip Extensions (Already Created)

python manage.py setup_knowbase --skip-extensions

Use when:

  • Extensions already created manually
  • Running as non-superuser
  • Extensions exist from previous setup

Statistics Commands

knowbase_stats

Display knowledge base statistics and metrics.

python manage.py knowbase_stats [OPTIONS]

Options:

  • --format [json|yaml|table] - Output format (default: table)
  • --detailed - Show detailed statistics

Viewing Statistics

Basic Statistics

python manage.py knowbase_stats

Output:

📚 Knowledge Base Statistics
==================================================
Documents: 1,234
Embeddings: 1,234
Average Embedding Size: 1536
Storage Used: 45.2 MB
Last Updated: 2025-10-01 14:30:00

Detailed Statistics

python manage.py knowbase_stats --detailed

Additional output:

Document Types:
Text: 856 (69.4%)
PDF: 234 (19.0%)
Markdown: 144 (11.6%)

Index Performance:
Average Query Time: 12ms
Cache Hit Rate: 87.3%
Total Queries (24h): 4,567

Storage Breakdown:
Documents: 12.3 MB
Embeddings: 32.9 MB
Indexes: 0.8 MB

JSON Export

python manage.py knowbase_stats --format json

Output:

{
"documents": 1234,
"embeddings": 1234,
"average_embedding_size": 1536,
"storage_used_mb": 45.2,
"last_updated": "2025-10-01T14:30:00Z",
"document_types": {
"text": 856,
"pdf": 234,
"markdown": 144
}
}

YAML Export

python manage.py knowbase_stats --format yaml

Output:

documents: 1234
embeddings: 1234
average_embedding_size: 1536
storage_used_mb: 45.2
last_updated: "2025-10-01T14:30:00Z"
document_types:
text: 856
pdf: 234
markdown: 144

PostgreSQL Extensions

pgvector Extension

Purpose: Enables vector similarity search

Features:

  • Store embeddings as vector data types
  • Similarity search (cosine, L2, inner product)
  • Indexing for fast retrieval
  • Works with OpenAI, Cohere, Hugging Face embeddings

Example query:

SELECT * FROM knowbase_document
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;

pg_trgm Extension

Purpose: Enables fuzzy text search

Features:

  • Trigram-based text search
  • Fuzzy matching
  • Fast text similarity queries
  • Works with LIKE, ILIKE, similarity()

Example query:

SELECT * FROM knowbase_document
WHERE content % 'search query'
ORDER BY similarity(content, 'search query') DESC;

Common Tasks

Initial Setup Workflow

# 1. Setup Knowledge Base
python manage.py setup_knowbase

# 2. Verify setup
python manage.py knowbase_stats

# 3. Check database extensions
psql -d your_database -c "\dx"

Manual Extension Creation

If setup_knowbase fails due to permissions:

# Connect as PostgreSQL superuser
psql -U postgres -d your_database

# Create extensions manually
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;

# Verify extensions
\dx

Then run setup without extensions:

python manage.py setup_knowbase --skip-extensions

Monitor Knowledge Base Growth

# Create script to track growth
#!/bin/bash
while true; do
echo "$(date)"
python manage.py knowbase_stats --format json | jq '.documents'
sleep 3600 # Every hour
done

Export Statistics for Monitoring

# Export to monitoring system
python manage.py knowbase_stats --format json > /var/log/knowbase_stats.json

# Schedule with cron
0 * * * * cd /path/to/project && python manage.py knowbase_stats --format json > /var/log/knowbase_stats_$(date +\%Y\%m\%d_\%H).json

Configuration

Database Requirements

Minimum PostgreSQL Version: 11+ Recommended Version: 14+

pgvector Configuration

Add to postgresql.conf:

# Increase shared_buffers for better vector performance
shared_buffers = 256MB

# Optimize for vector operations
max_parallel_workers_per_gather = 4

Django Configuration

# config.py
from django_cfg import DjangoConfig, DatabaseConfig

class MyConfig(DjangoConfig):
# Enable Knowledge Base
enable_knowbase: bool = True

# Configure database
databases: dict[str, DatabaseConfig] = {
"default": DatabaseConfig(
engine="postgresql",
name="mydb",
host="localhost",
port=5432,
)
}

Troubleshooting

Extension Creation Failed

Error:

❌ Failed to create extensions: permission denied

Solution:

# Option 1: Run as superuser
psql -U postgres -d your_database -c "CREATE EXTENSION vector;"
psql -U postgres -d your_database -c "CREATE EXTENSION pg_trgm;"

# Option 2: Grant permissions
GRANT CREATE ON DATABASE your_database TO your_user;

Missing pgvector Extension

Error:

could not open extension control file: No such file or directory

Solution:

# Install pgvector
# Ubuntu/Debian
sudo apt-get install postgresql-14-pgvector

# macOS (Homebrew)
brew install pgvector

# From source
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install

Performance Issues

Slow queries?

-- Create indexes for better performance
CREATE INDEX ON knowbase_document USING ivfflat (embedding vector_cosine_ops);
CREATE INDEX ON knowbase_document USING gin (content gin_trgm_ops);

Best Practices

1. Setup Early in Development

# Run setup immediately after project creation
django-cfg create-project "My Project" && \
cd my_project && \
python manage.py migrate_all && \
python manage.py setup_knowbase

2. Monitor Storage Usage

# Check storage regularly
python manage.py knowbase_stats --detailed

# Set up alerts for storage limits

3. Regular Statistics Export

# Daily statistics export
0 0 * * * python manage.py knowbase_stats --format json > /backups/kb_stats_$(date +\%Y\%m\%d).json

4. Backup Extensions Configuration

# Backup PostgreSQL extensions
pg_dump --schema-only --create your_database > extensions_backup.sql

5. Test Vector Search Performance

# In Django shell
from django_cfg.apps.knowbase.models import Document
import time

start = time.time()
results = Document.objects.vector_search("test query", limit=10)
duration = time.time() - start

print(f"Query took {duration*1000:.2f}ms")


Knowledge Base powers semantic search! 📚