diff --git a/_includes/ai-blueprints.html b/_includes/ai-blueprints.html new file mode 100644 index 00000000000..0728d2a8ba7 --- /dev/null +++ b/_includes/ai-blueprints.html @@ -0,0 +1,33 @@ +
+
+
+ AI blueprints icon + AI blueprints icon +
+
+

Enterprise AI Blueprints for Java with Quarkus & LangChain4j

+

The following three blueprints are conceptual, infrastructure-agnostic reference architectures. Each stands on its own and shows how to structure a Java solution with Quarkus (runtime, APIs, orchestration) and LangChain4j (LLM access, embeddings, tools, chains).

+

Quarkus provides the foundation for building secure, cloud-native, and AI-infused applications. Quarkus applications integrate with external model runtimes through LangChain4j, which offers rich abstractions for connecting to LLM providers, managing embeddings, defining tools, and orchestrating agentic workflows. This keeps AI where it belongs, as a capability embedded in enterprise applications, while Quarkus ensures performance, scalability, and operational reliability.

+

These blueprints demonstrate practical patterns and best practices for developing enterprise-grade AI solutions using a combination of these technologies. They aim to simplify the process of using AI in Java applications and to guide software architects along the way. Whether you're building intelligent chatbots, recommendation engines, or sophisticated data analysis tools, these blueprints provide a solid starting point for your next AI project. Explore each blueprint to discover how Quarkus and LangChain4j can enrich your Java applications with advanced AI capabilities.

+
+ +
+

Frozen RAG (Retrieval-Augmented Generation)

+

Improve LLM accuracy with RAG by grounding answers in enterprise data. Quarkus handles the entire RAG pipeline, including data ingestion, query execution, embedding, context retrieval, and LLM communication.

+

Learn the basics of Frozen RAG

+
+ +
+

Contextual RAG (Multi-Sources, Rerank, Injection)

+

Advanced Contextual RAG improves frozen RAG by adding multi-source retrieval, reranking, and content injection. This makes it ideal for complex enterprise scenarios, ensuring accuracy, relevance, and explainability across distributed information. It enables dynamic information handling, complex queries, and clear lineage for auditable, high-stakes decisions.

+

Learn about Contextual RAG

+
+ +
+

Chain-of-Thought (CoT) Reasoning

+

Chain-of-Thought (CoT) guides LLMs through explicit intermediate steps to solve complex problems. This systematic approach breaks tasks into manageable sub-problems for sequential processing and solution building. CoT improves LLM accuracy and makes the model's reasoning easier to understand and debug, especially for multi-step tasks such as mathematical problem-solving, code generation, and logical inference.

+

Learn about Chain-of-Thought Reasoning

+
+ +
+
diff --git a/_includes/ai-breadcrumb.html b/_includes/ai-breadcrumb.html new file mode 100644 index 00000000000..910a4e39695 --- /dev/null +++ b/_includes/ai-breadcrumb.html @@ -0,0 +1,7 @@ +
+
+ +
+
\ No newline at end of file diff --git a/_includes/ai-chainofthought.html b/_includes/ai-chainofthought.html new file mode 100644 index 00000000000..3aa122978d8 --- /dev/null +++ b/_includes/ai-chainofthought.html @@ -0,0 +1,54 @@ +
+
+
+

Chain-of-Thought (CoT) Reasoning

+

The architecture of the Chain-of-Thought (CoT) blueprint focuses on guiding a Large Language Model (LLM) through explicit intermediate steps to solve complex problems, improve reasoning, and provide transparency in its decision-making.

+

Main Use-Cases

+
    +
  • Improved Reasoning: Decompose complex problems to reduce logical errors.
  • +
  • Transparency: Provide optional explanations for decisions.
  • +
  • Training & Enablement: Illustrate the "why" behind concepts, not just the "what."
  • +
  • Decision Support: Aid in investments, vendor selection, and risk assessments.
  • +
  • Troubleshooting: Facilitate structured diagnostics in operations and engineering.
  • +
  • Policy Application: Apply multi-clause rules with traceable steps.
  • +
+

Architecture Overview

+

The CoT architecture starts with a "User Query" that initiates the process. This query is received by the "Quarkus CoT Service," which serves as the orchestrator for the entire reasoning flow. Within the Quarkus service, the core Chain-of-Thought logic, powered by LangChain4j, is executed.

+ CoT architecture image + CoT architecture image +

The "LangChain4j" package encapsulates the sequential steps of the CoT process:

+
+
+
+
Step 1: Analyze Factors:
+
This initial step involves the LLM breaking down the complex user query into its constituent parts, identifying key factors, and performing an initial analysis. This could involve understanding the problem, identifying relevant data points, or defining the scope of the task.
+
+
+
+
+
Step 2: Synthesize Options:
+
Building on the analysis from Step 1, the LLM then synthesizes various options, potential solutions, or different perspectives related to the query. This step demonstrates the model's ability to explore different avenues of thought before arriving at a conclusion.
+
+
+
+
+
Step 3: Recommendation:
+
In the final step, the LLM formulates a "Recommendation" or a definitive answer based on the analysis and synthesis performed in the preceding steps. This recommendation is the ultimate output of the CoT process.
+
+
+
+

Finally, the Response is returned to the user, with the option to include the intermediate reasoning steps when transparency is required. Quarkus orchestrates the execution of single- or multi-prompt chains, while LangChain4j supplies the abstractions for building prompts and capturing reasoning outputs at each step. This structured flow improves the LLM’s performance on complex tasks and, when needed, provides an auditable record of how the answer was derived.

+
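To make the three-step flow concrete, here is a minimal sketch of how such a chain might look with plain LangChain4j AI services. The interface, prompt wording, and class names are illustrative assumptions, not part of the blueprint itself.

```java
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.V;

interface CotSteps {

    @UserMessage("Step 1 - Analyze: identify the key factors in this problem:\n{{problem}}")
    String analyzeFactors(@V("problem") String problem);

    @UserMessage("Step 2 - Synthesize: given these factors:\n{{factors}}\nlist viable options with trade-offs.")
    String synthesizeOptions(@V("factors") String factors);

    @UserMessage("Step 3 - Recommend: based on these options:\n{{options}}\ngive one recommendation with a brief justification.")
    String recommend(@V("options") String options);
}

public class CotService {

    private final CotSteps steps;

    public CotService(ChatLanguageModel model) {
        this.steps = AiServices.create(CotSteps.class, model);
    }

    // Each step's output feeds the next; keep the intermediate results
    // if the caller asked for a transparent reasoning trace.
    public String answer(String problem) {
        String factors = steps.analyzeFactors(problem);
        String options = steps.synthesizeOptions(factors);
        return steps.recommend(options);
    }
}
```

In a Quarkus application the orchestrating class would typically be a CDI bean, with the intermediate strings optionally collected into the response when transparency is requested.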

Further Patterns

+

Further patterns in Chain-of-Thought reasoning extend beyond basic single-prompt approaches to offer more sophisticated control and integration. "Single-prompt CoT" provides a concise way to elicit reasoning, where a single instruction like "think step by step" guides the LLM to return both its thought process and the final answer.

+
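As a sketch, single-prompt CoT can be expressed with LangChain4j's PromptTemplate; the instruction text and the two-section output convention are assumptions you would tune for your model.

```java
import dev.langchain4j.model.input.Prompt;
import dev.langchain4j.model.input.PromptTemplate;

import java.util.Map;

public class SinglePromptCot {

    private static final PromptTemplate TEMPLATE = PromptTemplate.from("""
            Think step by step. First write your reasoning under "Reasoning:",
            then give the final answer under "Answer:".

            Question: {{question}}
            """);

    public static Prompt build(String question) {
        // The resulting prompt text can be sent to any ChatLanguageModel;
        // the "Answer:" section is then parsed out of the response.
        return TEMPLATE.apply(Map.of("question", question));
    }
}
```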

More advanced scenarios benefit from "Program-of-Thought," which involves multiple chained prompts, where the output of one step feeds into the next, often including optional verification steps for enhanced accuracy.

+

Lastly, a "Hybrid" approach combines CoT with Retrieval-Augmented Generation (RAG) to ground the reasoning process in factual information, ensuring that the LLM's logical steps are supported by relevant data. These patterns provide flexibility in how CoT is applied, allowing architects to choose the level of control and factual grounding necessary for their specific enterprise AI applications.

+

Guardrails & Privacy

+

Architecting Chain-of-Thought (CoT) solutions for enterprise environments necessitates careful consideration of guardrails and privacy. The following points represent an initial excerpt of critical aspects that software architects must account for to ensure responsible and secure AI deployment. These considerations are vital to manage the transparency of reasoning, maintain answer consistency, and control data exposure within the CoT process.

+
    +
  • Reasoning Exposure: Decide whether to reveal the Chain of Thought (CoT) or keep it internal.
  • +
  • Consistency Checks: Implement a final verifier prompt or apply deterministic post-rules.
  • +
  • Token Budgeting: Limit intermediate verbosity and summarize between steps.
  • +
+
+
+
diff --git a/_includes/ai-contextualrag.html b/_includes/ai-contextualrag.html new file mode 100644 index 00000000000..53aee1d93d3 --- /dev/null +++ b/_includes/ai-contextualrag.html @@ -0,0 +1,45 @@ +
+
+
+

Contextual RAG (Multi-Sources, Rerank, Injection)

+

Advanced Contextual RAG extends the core frozen RAG pattern by incorporating multi-source retrieval, reranking, and content injection techniques. This is designed for more complex enterprise scenarios where information might be spread across various systems, requiring more sophisticated methods to ensure accuracy, relevance, and explainability. It allows for dynamic information handling, complex query processing, and provides clearer lineage for auditable decisions, making it ideal for high-stakes applications.

+

Main Use-Cases

+
    +
  • Complex Queries: Addresses intricate questions requiring synthesis from multiple sources.
  • +
  • Dynamic Information: Handles rapidly changing data environments by incorporating real-time updates.
  • +
  • High-Accuracy Needs: Reranking and injection ensure more precise and relevant answers.
  • +
  • Auditable Decisions: Provides clear lineage and context for generated responses, crucial for compliance and debugging.
  • +
+

Architecture Overview

+

The process begins with a User Query, which is first processed by a Query Transformer to refine or enhance it for more effective retrieval. The transformed query is then passed to a Query Router that decides which knowledge sources to target. For unstructured data, the ingestion pipeline remains the same as in the foundational RAG architecture (documents split, embedded, and stored in a vector store), but contextual RAG extends retrieval to multiple sources such as structured databases, APIs, and search indexes.

+

The Query Router is responsible for directing the query to multiple retrieval sources simultaneously. These sources include:

+
    +
  • Vector Retriever: Retrieves information based on semantic similarity from a vector store.
  • +
  • Web/Search Retriever: Gathers information from the web or external search engines.
  • +
  • Database Retriever: Extracts relevant data from structured databases.
  • +
  • Full-Text Retriever: Performs keyword-based searches across a corpus of documents.
  • +
+

All the information retrieved from these diverse sources is then fed into an Aggregator/Reranker. This component combines and prioritizes the retrieved content based on relevance to the original query.

+

The aggregated and reranked content is passed to a Content Injector (Prompt Builder). This component constructs an Enhanced Prompt for the Large Language Model (LLM) by incorporating the retrieved context alongside the original user query.

+

Finally, the LLM processes the Enhanced Prompt, using the provided context to generate an answer. Alongside the answer, the system can return the retrieved source segments for transparency and verification, though these should be considered supporting context rather than strict citations.

+ Contextual RAG query image + Contextual RAG query image +
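LangChain4j ships building blocks that map closely onto this flow. The sketch below wires them together under stated assumptions: the retriever, chat-model, and scoring-model instances are placeholders you would construct for your own sources.

```java
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.scoring.ScoringModel;
import dev.langchain4j.rag.DefaultRetrievalAugmentor;
import dev.langchain4j.rag.RetrievalAugmentor;
import dev.langchain4j.rag.content.aggregator.ReRankingContentAggregator;
import dev.langchain4j.rag.content.injector.DefaultContentInjector;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.query.router.DefaultQueryRouter;
import dev.langchain4j.rag.query.transformer.CompressingQueryTransformer;

public class ContextualRagConfig {

    public static RetrievalAugmentor augmentor(ChatLanguageModel chatModel,
                                               ScoringModel scoringModel,
                                               ContentRetriever vectorRetriever,
                                               ContentRetriever webRetriever,
                                               ContentRetriever sqlRetriever) {
        return DefaultRetrievalAugmentor.builder()
                // Query Transformer: compress/refine the user query for retrieval
                .queryTransformer(new CompressingQueryTransformer(chatModel))
                // Query Router: fan the query out to the configured sources
                .queryRouter(new DefaultQueryRouter(vectorRetriever, webRetriever, sqlRetriever))
                // Aggregator/Reranker: merge and reorder hits with a scoring model
                .contentAggregator(new ReRankingContentAggregator(scoringModel))
                // Content Injector: assemble the enhanced prompt from top results
                .contentInjector(DefaultContentInjector.builder().build())
                .build();
    }
}
```

The augmentor would then be attached to an AI service via AiServices.builder(...).retrievalAugmentor(augmentor), so every call runs the transform, route, rerank, and inject steps automatically.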

Scalability & Performance

+

Efficiently scaling and optimizing the performance of your AI solutions are crucial for enterprise adoption and operational success. While this blueprint only gives you high-level guidance, we strongly recommend also looking into the non-functional aspects of your solution and ways to address the following concerns:

+
    +
  • Domain/Tenant Sharding: Partition vector indexes and stores by domain or tenant to keep retrieval fast and data isolated.
  • +
  • Caching: Cache query vectors and top-K hits for improved performance.
  • +
  • Asynchronous Ingestion: Utilize asynchronous ingestion to batch embeddings and stream deltas.
  • +
  • Lean Prompts: Prioritize token budget for context, keeping prompts concise.
  • +
+

Security

+

Architecting secure enterprise AI solutions demands a proactive approach to safeguard sensitive data and preserve organizational integrity. Below are some first thoughts about critical security considerations and architectural patterns you should further investigate when building your solution.

+
    +
  • Authorization at retrieval: Before injecting context, filter by user/tenant claims (see the sketch below).
  • +
  • Audit lineage: Store the chunk→document→source linkage with timestamps.
  • +
  • PII controls: Redact or mask sensitive spans before embedding and prompting.
  • +
  • Guard responses: Post-filter for data leakage and policy violations.
  • +
+
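As one example of authorization at retrieval, LangChain4j's EmbeddingStoreContentRetriever can apply a per-request metadata filter. The tenant metadata key and the claim-derived tenant id below are assumptions for illustration.

```java
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.store.embedding.EmbeddingStore;

import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;

public class TenantScopedRetriever {

    public static EmbeddingStoreContentRetriever create(EmbeddingStore<TextSegment> store,
                                                        EmbeddingModel embeddingModel,
                                                        String tenantId) {
        return EmbeddingStoreContentRetriever.builder()
                .embeddingStore(store)
                .embeddingModel(embeddingModel)
                .maxResults(5)
                // Only segments tagged with the caller's tenant (e.g. taken from
                // a verified OIDC claim) can ever reach the prompt.
                .dynamicFilter(query -> metadataKey("tenant").isEqualTo(tenantId))
                .build();
    }
}
```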
+
+
diff --git a/_includes/ai-frozenrag.html b/_includes/ai-frozenrag.html new file mode 100644 index 00000000000..5a61662410d --- /dev/null +++ b/_includes/ai-frozenrag.html @@ -0,0 +1,40 @@ +
+
+ +
+

Frozen RAG (Retrieval-Augmented Generation)

+

Integrate RAG to anchor Large Language Model (LLM) responses in your enterprise data, with Quarkus handling ingestion pipelines, query execution, embedding generation, context retrieval, and seamless LLM interaction. This blueprint focuses on the foundational RAG pattern (also called frozen RAG); more advanced contextual RAG variants, including multi-source routing and reranking, are covered separately.

+

Main Use-Cases

+
    +
  • Reduced Hallucinations: RAG ensures that LLM answers are explicitly tied to enterprise-specific sources such as policies, manuals, or knowledge bases. This grounding reduces the risk of fabricated or misleading responses and increases trust in AI-assisted decision-making.
  • +
  • Up-to-Date Information: Because the retrieval step pulls directly from current document repositories and databases, responses adapt automatically as content evolves. There is no need to retrain or fine-tune the underlying model whenever business data changes.
  • +
  • Cost Efficiency: By retrieving only the most relevant context chunks, prompts stay concise. This reduces token usage in LLM calls, which directly lowers cost while preserving accuracy and completeness.
  • +
  • Java-Native Enterprise Integration: Quarkus provides a first-class runtime for embedding RAG workflows into existing enterprise systems. Developers can secure RAG services with OIDC or LDAP, expose them through familiar REST or Kafka APIs, and monitor them with Prometheus and OpenTelemetry. Because RAG runs inside the same application fabric as other Java services, it fits naturally into existing authentication, authorization, deployment, and observability workflows. This ensures AI augmentation is not just added, but part of the enterprise architecture.
  • +
+

Architecture Overview

+

This foundational blueprint focuses on integrating Retrieval-Augmented Generation (RAG) to ground Large Language Model (LLM) responses in organizational data.

+

The architecture is divided into two main phases:

+

Ingestion: This phase prepares enterprise knowledge for retrieval. In a frozen RAG setup, data typically originates from unstructured document sources such as manuals, PDFs, or reports.

+
    +
  • Documents are processed by a "Text Splitter" to break them into smaller chunks.
  • +
  • These chunks are then converted into numerical representations (embeddings) using an "Embedding Model".
  • +
  • The embeddings are stored in a "Vector Store" for semantic similarity searches.
  • +
  • Metadata about the documents, such as lineage and other relevant information, is stored in a "Metadata Store".
  • +
+ Frozen RAG ingestion image + Frozen RAG ingestion image +
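A minimal ingestion sketch with LangChain4j follows; the document path and chunking parameters are illustrative assumptions.

```java
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.document.splitter.DocumentSplitters;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;

import java.nio.file.Path;
import java.util.List;

public class RagIngestion {

    public static void ingest(EmbeddingModel embeddingModel,
                              EmbeddingStore<TextSegment> embeddingStore) {
        List<Document> documents = FileSystemDocumentLoader.loadDocuments(Path.of("/data/manuals"));

        EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
                .documentSplitter(DocumentSplitters.recursive(500, 50)) // chunk size / overlap
                .embeddingModel(embeddingModel)   // turns chunks into vectors
                .embeddingStore(embeddingStore)   // persists vectors for similarity search
                .build();

        ingestor.ingest(documents);
    }
}
```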

Query: This phase handles user queries and generates grounded answers.

+
    +
  • A "User Query" is received and processed by a "Query Embedding" component to create an embedding of the query.
  • +
  • The query embedding is used in a "Similarity Search" against the "Vector Store" to retrieve the most relevant document chunks.
  • +
  • The retrieved chunks, along with metadata from the "Metadata Store" (which acts as the "source of truth"), are assembled into a "Context Pack".
  • +
  • The "Context Pack" is used by a "Prompt Assembly" component to construct an "Enhanced Prompt" that includes the relevant context.
  • +
  • The "Enhanced Prompt" is fed into an "LLM (LangChain4j)".
  • +
  • The LLM generates a "Grounded Answer" based on the provided context.
  • +
+ Frozen RAG query image + Frozen RAG query image +
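On the query side, the same flow can be sketched with an AI service wired to a content retriever; the Assistant interface and the example question are assumptions.

```java
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.EmbeddingStore;

public class RagQuery {

    interface Assistant {
        String chat(String userQuery);
    }

    public static Assistant create(ChatLanguageModel chatModel,
                                   EmbeddingModel embeddingModel,
                                   EmbeddingStore<TextSegment> embeddingStore) {
        // Embeds the query, runs the similarity search, and injects the
        // retrieved chunks into the prompt before calling the LLM.
        EmbeddingStoreContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel) // must match the ingestion model
                .maxResults(4)
                .build();

        return AiServices.builder(Assistant.class)
                .chatLanguageModel(chatModel)
                .contentRetriever(retriever)
                .build();
    }
}
```

Calling assistant.chat("What is our refund policy?") would then return a grounded answer built from the retrieved context.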

This two-phase approach allows for reduced hallucinations in LLM responses, up-to-date information without retraining, cost efficiency by retrieving only relevant information, and seamless integration with existing enterprise Java services and workflows.

+
+
+
diff --git a/_includes/ai-java-for-ai.html b/_includes/ai-java-for-ai.html new file mode 100644 index 00000000000..00f70a13996 --- /dev/null +++ b/_includes/ai-java-for-ai.html @@ -0,0 +1,85 @@ +
+
+
+ Java for AI icon +
+
+

Artificial Intelligence is reshaping enterprise software, and Java remains the backbone. Its long-standing reliability, security, and scalability make it ideal for building AI-infused applications.

+
+ +
+

Java in Data Preparation

+

Many AI initiatives begin with robust data processing pipelines.

+

Effective AI requires two distinct pipelines:

+
+
+
+
Training Data Preparation
+
Build predictive models with DeepLearning4J (DL4J). Java’s ecosystem, including Apache Kafka, Apache Flink, and Apache Camel, supports large-scale ETL, data cleansing, and preprocessing across enterprise data.
+
+
+
+
+
RAG Data Preparation
+
For Retrieval-Augmented Generation, ingest documents and business knowledge, compute embeddings, and index in vector stores. Java seamlessly connects to databases, message brokers, and vector stores, enabling RAG pipelines that are both fast and reliable.
+
+
+
+

This dual capability, supporting both model training and RAG processes, makes Java uniquely powerful in AI architectures.

+
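As a small illustration of the training side, here is a minimal DL4J network definition; the layer sizes and the binary-classification task are assumptions made for the example.

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class ChurnModel {

    // A tiny feed-forward network: 10 input features -> 16 hidden units -> 1 output.
    public static MultiLayerNetwork build() {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(42)
                .list()
                .layer(new DenseLayer.Builder()
                        .nIn(10).nOut(16)
                        .activation(Activation.RELU).build())
                .layer(new OutputLayer.Builder(LossFunctions.LossFunction.XENT)
                        .nIn(16).nOut(1)
                        .activation(Activation.SIGMOID).build())
                .build();

        MultiLayerNetwork model = new MultiLayerNetwork(conf);
        model.init(); // ready for model.fit(trainingData)
        return model;
    }
}
```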
+ + + +
+

Java for AI-Infused and Intelligent Applications

+

Once data and models are ready, enterprises need applications that:

+
    +
  • Embed AI in customer-facing workflows (chatbots, fraud detection, document assistants).
  • +
  • Interact with predictive and generative AI models (in-process, locally or remotely).
  • +
  • Scale from prototype to production on cloud or on-prem.
  • +
+

The JVM ecosystem ensures consistency, portability, and performance across deployments.

+
+ +
+

Enterprise-Grade Security, Governance & Observability

+

Java brings full enterprise readiness to AI systems:

+
+
+
+
Security by Design:
+
Supports traceable inputs, audit trails, and governance-ready execution from day one.
+
+
+
+
+
Proven at Scale:
+
Decades of enterprise deployment ensure predictable performance and long-term maintainability.
+
+
+
+
+
Stable Innovation:
+
Embraces modern tools like LangChain4j while preserving strong typing, backward compatibility, and developer familiarity.
+
+
+
+
+
Observability:
+
Java applications, especially those built with Quarkus, incorporate metrics, tracing, and logs, simplifying day-to-day operations. This unified telemetry enables real-time monitoring of AI pipelines and live troubleshooting.
+
+
+ +
+

Java & Emerging AI Protocols

+

The Java ecosystem has long been foundational for interoperability.

+

So, it’s not surprising to see Java client and server implementations for the emerging AI protocols, such as:

+
    +
  • MCP (Model Context Protocol): Enables building servers and clients that integrate LLMs with enterprise systems.
  • +
  • A2A (Agent-to-Agent): Supports reliable autonomous agent communication across ecosystems.
  • +
+

Java’s mature networking and concurrency capabilities make it ideal for implementing these agentic, AI-driven architectures.

+
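For example, with the Quarkiverse MCP server extension (quarkus-mcp-server), exposing an enterprise capability as an MCP tool might look like the sketch below; the annotation names and the order-lookup logic are assumptions to verify against the current extension documentation.

```java
import io.quarkiverse.mcp.server.Tool;
import io.quarkiverse.mcp.server.ToolArg;

public class OrderTools {

    // Advertised to connected MCP clients (e.g. an LLM assistant) as a callable tool.
    @Tool(description = "Look up the status of a customer order")
    public String orderStatus(@ToolArg(description = "The order id") String orderId) {
        return lookupStatus(orderId); // delegate to an existing enterprise service
    }

    private String lookupStatus(String orderId) {
        return "SHIPPED"; // placeholder for a real backend call
    }
}
```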
+ +
+
diff --git a/_includes/ai-overview.html b/_includes/ai-overview.html new file mode 100644 index 00000000000..8ed745441f6 --- /dev/null +++ b/_includes/ai-overview.html @@ -0,0 +1,59 @@ +
+
+
+ Java for AI icon +
+
+

Why use Java for AI?

+

Java is ideal for AI development due to its platform independence, robust memory management, and rich open-source libraries like Deeplearning4j, Weka, and Apache Spark. Its mature ecosystem and strong community support enable scalable AI applications. With recent JVM optimizations, Java's performance handles demanding machine learning tasks and large datasets efficiently.

+

Learn more about using Java for AI

+
+ +
+

Why use Quarkus for AI-Infused Applications

+

Quarkus is ideal for AI applications due to its performance, agility, and developer experience. It offers native Generative AI integration via LangChain4j, supporting declarative AI services, various LLMs, and advanced prompt engineering. It also handles predictive AI and data pipeline automation with ML toolkits for scalable ETL and embedding workflows. Quarkus's "AI-Enhanced Developer Experience" provides fast startup, low memory usage, and a reactive core for cloud-native AI. It boosts developer velocity with live coding, a unified Java stack, and robust observability and security for reliable AI services.

+

Learn more about using Quarkus for AI

+
+ +
+

Benefits of Quarkus for AI-Infused Applications

+
+
+ Native Integration with Generative AI icon + Native Integration with Generative AI icon +

Native Integration with Generative AI

+

Easily build AI workflows with minimal code, leveraging top LLM providers to create features like chatbots and summarizers.

+
+
+ Predictive AI and Data Pipelines icon + Predictive AI and Data Pipelines icon +

Predictive AI and Data Pipelines

+

Enable predictive AI, model training, and scalable data workflows, connecting seamlessly to ML tools, message brokers, databases, and files.

+
+
+ Enhanced Developer Experience icon + Enhanced Developer Experience icon +

Enhanced Developer Experience

+

Incorporate AI directly into the development workflow with instant feedback, code explanations, and documentation, speeding up iterations.

+
+
+ Enterprise-Grade icon + Enterprise-Grade icon +

Enterprise-Grade AI

+

Create reliable, secure AI applications using Quarkus’s built-in observability, security, and governance—ensuring AI services are safe, accountable, and performance-ready from the beginning.

+
+
+
+ +
+ AI blueprints icon + AI blueprints icon +
+
+

Enterprise AI Blueprints for Java with Quarkus & LangChain4j

+

AI blueprints offer conceptual, infrastructure-agnostic reference architectures for developing enterprise-grade AI solutions in Java. They simplify AI integration in Java applications, guiding software architects in building intelligent chatbots, recommendation engines, and data analysis tools. These blueprints, leveraging Quarkus and LangChain4j, provide a solid starting point for advanced AI capabilities in your Java projects.

+

Learn more about the Enterprise AI Blueprints

+
+ +
+
diff --git a/_includes/ai-quarkus-for-ai.html b/_includes/ai-quarkus-for-ai.html new file mode 100644 index 00000000000..756d86ff821 --- /dev/null +++ b/_includes/ai-quarkus-for-ai.html @@ -0,0 +1,79 @@ +
+
+ +
+

Generative AI with Quarkus LangChain4j

+
+ +
+ Native Integration with Generative AI icon + Native Integration with Generative AI icon +
+ +
+

Quarkus integrates GenAI natively through the LangChain4j extension, empowering developers with:

+
    +
  • Declarative AI services via the Java programming model: Write minimal code using CDI and annotations to define AI workflows.
  • +
  • Rich LLM ecosystem support: Access OpenAI, Watsonx, Ollama, and vector databases like Pinecone, Chroma, or Redis.
  • +
  • Advanced prompt engineering tools, chains, and agents: Seamlessly build intelligent workflows in Java.
  • +
+

This enables teams to develop sophisticated AI-infused features, such as chatbots or dynamic summarizers, without requiring a stack shift.

+
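A minimal declarative AI service with the quarkus-langchain4j extension might look like this sketch; the prompt text and service name are illustrative, and the model provider is configured separately in application.properties.

```java
import io.quarkiverse.langchain4j.RegisterAiService;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;

@RegisterAiService
public interface SummarizerService {

    @SystemMessage("You summarize internal documents for business users.")
    @UserMessage("Summarize the following text in three bullet points: {text}")
    String summarize(String text);
}
```

The interface becomes a CDI bean, so any Quarkus component can simply @Inject SummarizerService and call summarize(...).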
+ +
+

Predictive AI and Seamless Data Preparation

+
+ +
+ Predictive AI and Data Pipelines icon + Predictive AI and Data Pipelines icon +
+ +
+

Quarkus is comprehensive: it supports both predictive AI

and data pipeline automation:

+
    +
  • Integrate directly with ML toolkits such as DL4J to train or run models within a single Quarkus application.
  • +
  • Build scalable, production-grade ETL and embedding workflows that connect to message brokers, databases, and file systems easily - a must-have for preparing data for both model training and RAG patterns.
  • +
+
+ +
+

AI-Enhanced Developer Experience

+
+ +
+ Enhanced Developer Experience icon + Enhanced Developer Experience icon +
+ +
+

Quarkus is extending AI into the developer workflow through the Chappie initiative:

+
    +
  • Embeds AI features directly into Dev Mode and Dev UI for efficiency without disruption.
  • +
  • Allows features such as exception diagnosis, test and documentation generation, and code explanation (AI Assistant).
  • +
  • Pushes Quarkus application knowledge into AI-based assistants using the Dev-UI MCP server.
  • +
  • Provides chatbot access to the documentation and guides for the specific versions you are using.
  • +
+

This becomes more than a powerful coding assistant; it augments the whole developer experience.

+
+ +
+

Enterprise-Grade AI

+
+ +
+ Enterprise-Grade icon + Enterprise-Grade icon +
+ +
+

Build secure, trustworthy AI applications with Quarkus’s comprehensive observability, security, and governance features, providing automatic metrics, logs, traces, data sanitization, and access controls to ensure AI services are secure, accountable, and performance-optimized from day one.

+
    +
  • Automatically instrument AI services with metrics, logs, and traces for each call and tool invocation via OpenTelemetry.
  • +
  • Enable AI services that are accountable, observable, traceable, and testable, enforcing auditability from day one.
  • +
  • Built-in guardrails, like data sanitization, prompt anonymization, and context-aware user access, are available out of the box (see the sketch below).
  • +
+
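As one guardrail sketch: the quarkus-langchain4j guardrail API is evolving, so treat the InputGuardrail type and the regex check below as assumptions to confirm against the extension documentation.

```java
import dev.langchain4j.data.message.UserMessage;
import io.quarkiverse.langchain4j.guardrails.InputGuardrail;
import io.quarkiverse.langchain4j.guardrails.InputGuardrailResult;
import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class NoSsnGuard implements InputGuardrail {

    @Override
    public InputGuardrailResult validate(UserMessage userMessage) {
        // Block prompts that appear to contain a US social security number.
        if (userMessage.singleText().matches("(?s).*\\b\\d{3}-\\d{2}-\\d{4}\\b.*")) {
            return failure("Prompt appears to contain sensitive personal data");
        }
        return success();
    }
}
```

The guardrail would then be attached to an AI service method with an annotation such as @InputGuardrails(NoSsnGuard.class), so every prompt is screened before reaching the model.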
+ +
+
diff --git a/_includes/header-navigation.html b/_includes/header-navigation.html index f5f3d8ee2eb..9d60cdd378b 100644 --- a/_includes/header-navigation.html +++ b/_includes/header-navigation.html @@ -11,7 +11,7 @@