diff --git a/src/Templates/Boilerplate/Bit.Boilerplate/.docs/20- .NET Aspire.md b/src/Templates/Boilerplate/Bit.Boilerplate/.docs/20- .NET Aspire.md
index 650f3f11be..a6fb010de2 100644
--- a/src/Templates/Boilerplate/Bit.Boilerplate/.docs/20- .NET Aspire.md
+++ b/src/Templates/Boilerplate/Bit.Boilerplate/.docs/20- .NET Aspire.md
@@ -333,6 +333,70 @@ This ensures **consistency** between development and production environments!
 
 ---
 
+## ☁️ Going Production: Switching to Azure Managed Services
+
+Local containers (Docker) work well for development. For production, consider switching to fully managed Azure services for scalability, security, and high availability.
+
+.NET Aspire keeps this transition small: you swap the hosting resources in your `AppHost/Program.cs`, and your application code stays unchanged.
+
+### 1. Prerequisite Packages
+
+Make sure the Azure integration packages are installed in your **AppHost** project:
+
+```xml
+
+
+
+
+```
+
+### 2. Updating AppHost (`Program.cs`)
+
+Replace your local container definitions with Azure resources.
+
+#### 🛢️ Azure SQL Database
+
+Instead of `AddSqlServer` (container), use `AddAzureSqlServer`:
+
+```csharp
+// ❌ Local Container
+// var sqlServer = builder.AddSqlServer("sqlserver");
+
+// ✅ Azure Managed Service
+var sqlServer = builder.AddAzureSqlServer("sqlserver")
+    .AddDatabase("sqldb");
+
+```
+
+#### ⚡ Azure Cache for Redis
+
+Instead of `AddRedis` (container), use `AddAzureRedis`:
+
+```csharp
+// ❌ Local Container
+// var redis = builder.AddRedis("redis");
+
+// ✅ Azure Managed Service
+var redis = builder.AddAzureRedis("redis");
+
+```
+
+#### 🐘 Azure Database for PostgreSQL (Flexible Server)
+
+To leverage high-performance vector search in production, use Azure Database for PostgreSQL (Flexible Server):
+
+```csharp
+// ❌ Local Container
+// var postgres = builder.AddPostgres("postgres");
+
+// ✅ Azure Managed Service
+var postgres = builder.AddAzurePostgresFlexibleServer("postgres")
+    .AddDatabase("pgdb");
+
+```
+
+---
+
 ## Additional Resources
 
 - 📚 **Official Documentation**: https://aspire.dev
diff --git a/src/Templates/Boilerplate/Bit.Boilerplate/.docs/25- RAG - Semantic Search with Vector Embeddings (Advanced).md b/src/Templates/Boilerplate/Bit.Boilerplate/.docs/25- RAG - Semantic Search with Vector Embeddings (Advanced).md
index 2515858915..cf8479c373 100644
--- a/src/Templates/Boilerplate/Bit.Boilerplate/.docs/25- RAG - Semantic Search with Vector Embeddings (Advanced).md
+++ b/src/Templates/Boilerplate/Bit.Boilerplate/.docs/25- RAG - Semantic Search with Vector Embeddings (Advanced).md
@@ -350,5 +350,91 @@ A hybrid approach can offer a balance between speed and accuracy. You can first
 
 ---
 
+## 7. Performance Optimization: Azure DiskANN Index & Reranking
+
+When moving to production with large datasets (e.g., millions of vectors) in **SQL Server** or **PostgreSQL**, it is strongly recommended to use the **DiskANN** index. DiskANN provides high-performance, disk-based approximate nearest neighbor (ANN) search.
+
+To use DiskANN efficiently while keeping accuracy (recall) high, adjust your query strategy: instead of a simple "Order By & Take", use a **two-step Reranking** approach.
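+
+The two-step idea can be tried without a database. The following is a minimal, self-contained sketch (not part of the template): `Quantize` is a hypothetical, crude stand-in for the compression a DiskANN index applies, and `CandidateCount`, `FinalResultCount`, and the random 8-dimensional data are illustrative assumptions.
+
+```csharp
+using System;
+using System.Linq;
+
+// Hypothetical values chosen for illustration only.
+const int CandidateCount = 50;
+const int FinalResultCount = 10;
+
+// Exact cosine distance between two vectors of equal length.
+static double CosineDistance(float[] a, float[] b)
+{
+    double dot = 0, normA = 0, normB = 0;
+    for (var i = 0; i < a.Length; i++)
+    {
+        dot += a[i] * b[i];
+        normA += a[i] * a[i];
+        normB += b[i] * b[i];
+    }
+    return 1 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
+}
+
+// Crude stand-in for index compression: round each component to one decimal place.
+static float[] Quantize(float[] v) => v.Select(x => MathF.Round(x, 1)).ToArray();
+
+// Random test data; components are kept away from zero so no vector has zero length.
+var random = new Random(42);
+float[] NextVector() => Enumerable.Range(0, 8).Select(_ => (float)(random.NextDouble() + 0.1)).ToArray();
+
+var items = Enumerable.Range(0, 1000)
+    .Select(id => (Id: id, Embedding: NextVector()))
+    .ToArray();
+var query = NextVector();
+
+var topResults = items
+    .OrderBy(item => CosineDistance(Quantize(item.Embedding), query)) // Step 1: approximate ranking on compressed vectors
+    .Take(CandidateCount)
+    .OrderBy(item => CosineDistance(item.Embedding, query))           // Step 2: exact rerank of the candidates only
+    .Take(FinalResultCount)
+    .ToList();
+
+Console.WriteLine(string.Join(", ", topResults.Select(r => r.Id)));
+```
+
+With a real index, step 1 is served by the approximate DiskANN search, and step 2 recomputes exact distances for the small candidate set only.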
+
+### Steps to enable DiskANN
+
+#### Step A: Enable Extensions in `DbContext`
+
+In your `AppDbContext.cs` (Server.Api project), make sure the required extensions are registered.
+The example below targets PostgreSQL:
+
+```csharp
+protected override void OnModelCreating(ModelBuilder modelBuilder)
+{
+    // Enable pgvector extension for vector operations
+    modelBuilder.HasPostgresExtension("vector");
+
+    // Enable Azure's high-performance DiskANN extension
+    modelBuilder.HasPostgresExtension("pg_diskann");
+}
+```
+
+#### Step B: Configure the Index (`EntityTypeConfiguration`)
+
+In your Entity Configuration file (e.g., `ProductConfiguration.cs`), define the DiskANN index. This tells EF Core to create the specialized index:
+
+```csharp
+public void Configure(EntityTypeBuilder<Product> builder)
+{
+    // Define the vector column
+    builder.Property(p => p.Embedding)
+        .HasColumnType("vector(384)"); // Adjust size based on your embedding model
+
+    // Define the DiskANN Index
+    builder.HasIndex(p => p.Embedding)
+        .HasMethod("diskann") // Explicitly use Azure's DiskANN
+        .HasOperators("vector_cosine_ops") // Use cosine similarity
+        .HasStorageParameter("product_quantized", true); // Enable Product Quantization (PQ) to support high-dimensional vectors
+}
+```
+
+### The Query Strategy
+
+1. **Fetch Candidates**: Request more results than you need (e.g., 50) using the approximate index.
+2. **Rerank**: Re-sort those candidates by exact distance and return the top results (e.g., 10).
+
+Reference: https://learn.microsoft.com/en-us/azure/postgresql/extensions/how-to-use-pgdiskann#improve-accuracy-when-using-pq-with-vector-reranking
+
+Instead of the standard query, refactor your LINQ query to use a subquery structure. This encourages the database engine to use the index for coarse filtering before performing fine-grained sorting.
+
+**Standard Query (Less Optimized for DiskANN):**
+
+```csharp
+// Single-step query: might be slow or less accurate with heavily compressed indexes
+var value = new Pgvector.Vector(embeddedSearchQuery.Vector);
+return dbContext.Products
+    .Where(p => p.Embedding!.CosineDistance(value!) < DISTANCE_THRESHOLD)
+    .OrderBy(p => p.Embedding!.CosineDistance(value!))
+    .Take(10);
+```
+
+**Recommended Reranking Query:**
+
+```csharp
+var value = new Pgvector.Vector(embeddedSearchQuery.Vector);
+return dbContext.Products
+    // Apply any other filters (e.g. Price > X) here: filters the caller adds to the returned IQueryable would run after Take.
+    .Where(p => p.Embedding!.CosineDistance(value!) < DISTANCE_THRESHOLD)
+    .OrderBy(p => p.Embedding!.CosineDistance(value!))
+    .Take(CANDIDATE_COUNT) // Step 1: Approximate search. Fetch extra candidates, since DiskANN indexes may sacrifice some accuracy for speed
+    .OrderBy(p => p.Embedding!.CosineDistance(value!))
+    .Take(FINAL_RESULT_COUNT); // Step 2: Reranking (refine to the top results with exact distances)
+```
+
+**Important**: Check the database execution plan to confirm that it reflects the reranking strategy and that the DiskANN index is actually used.
+
+**Why this matters:**
+
+* **DiskANN** uses compression (quantization) to be fast.
+* By fetching a larger "candidate list" (50) first, you compensate for potential approximation errors.
+* Sorting the small list (50 items) by exact distance ensures your final Top 10 are highly accurate.
+
+---
+
 Happy coding! 🚀