Quarkus for AI
Harness the power of AI with Quarkus. Experience unmatched performance, agility, and a superior developer experience, specifically designed for next-generation AI applications.
The following three blueprints are conceptual, infrastructure-agnostic reference architectures. Each stands on its own and shows how to structure a Java solution with Quarkus (runtime, APIs, orchestration) and LangChain4j (LLM access, embeddings, tools, chains).
Quarkus provides the foundation for building secure, cloud-native, and AI-infused applications. Quarkus applications integrate with external model runtimes through LangChain4j, which offers rich abstractions for connecting to LLM providers, managing embeddings, defining tools, and orchestrating agentic workflows. This keeps AI where it belongs, as a capability embedded in enterprise applications, while Quarkus ensures performance, scalability, and operational reliability.
These blueprints demonstrate practical patterns and best practices for developing enterprise-grade AI solutions using a combination of these technologies. They aim to simplify the process of using AI in Java applications and to guide software architects along the way. Whether you're building intelligent chatbots, recommendation engines, or sophisticated data analysis tools, these blueprints provide a solid starting point for your next AI project. Explore each blueprint to discover how Quarkus and LangChain4j can enrich your Java applications with advanced AI capabilities.
Improve LLM accuracy with RAG, leveraging enterprise data. Quarkus handles RAG's entire process, including data ingestion, query execution, embedding, context retrieval, and LLM communication.
Advanced Contextual RAG improves frozen RAG by adding multi-source retrieval, reranking, and content injection. This makes it ideal for complex enterprise scenarios, ensuring accuracy, relevance, and explainability across distributed information. It enables dynamic information handling, complex queries, and clear lineage for auditable, high-stakes decisions.
Chain-of-Thought (CoT) guides LLMs through explicit intermediate steps to solve complex problems. This systematic approach breaks tasks into manageable sub-problems for sequential processing and solution building. CoT enhances LLM accuracy and makes the model's reasoning easier to understand and debug, especially for multi-step reasoning in mathematical problem-solving, code generation, and logical inference.
The architecture of the Chain-of-Thought (CoT) blueprint focuses on guiding a Large Language Model (LLM) through explicit intermediate steps to solve complex problems, improve reasoning, and provide transparency in its decision-making.
The CoT architecture starts with a "User Query" that initiates the process. This query is received by the "Quarkus CoT Service," which serves as the orchestrator for the entire reasoning flow. Within the Quarkus service, the core Chain-of-Thought logic, powered by LangChain4j, is executed.
+The "LangChain4j" package encapsulates the sequential steps of the CoT process:
Finally, the Response is returned to the user, with the option to include the intermediate reasoning steps when transparency is required. Quarkus orchestrates the execution of single- or multi-prompt chains, while LangChain4j supplies the abstractions for building prompts and capturing reasoning outputs at each step. This structured flow improves the LLM’s performance on complex tasks and, when needed, provides an auditable record of how the answer was derived.
Further patterns in Chain-of-Thought reasoning extend beyond basic single-prompt approaches to offer more sophisticated control and integration. "Single-prompt CoT" provides a concise way to elicit reasoning, where a single instruction like "think step by step" guides the LLM to return both its thought process and the final answer.
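As a minimal sketch of single-prompt CoT, assuming the quarkus-langchain4j extension (the interface name and prompt wording are illustrative):

```java
import dev.langchain4j.service.SystemMessage;
import io.quarkiverse.langchain4j.RegisterAiService;

// Hypothetical single-prompt CoT service: one system message instructs the
// model to expose its intermediate reasoning before the final answer.
@RegisterAiService
public interface StepByStepReasoner {

    @SystemMessage("""
            You are a careful analyst. Think step by step:
            list your numbered intermediate reasoning steps, then give
            the final answer on a separate line prefixed with "Answer:".
            """)
    String solve(String problem);
}
```

The single String parameter becomes the user message, and the returned text contains both the reasoning trace and the final answer, which the caller can surface or strip depending on transparency requirements.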
More advanced scenarios benefit from "Program-of-Thought," which involves multiple chained prompts, where the output of one step feeds into the next, often including optional verification steps for enhanced accuracy.
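A sketch of such a multi-prompt chain, assuming an injectable LangChain4j ChatLanguageModel bean (the class, prompts, and verification step are illustrative):

```java
import dev.langchain4j.model.chat.ChatLanguageModel;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

// Hypothetical Program-of-Thought chain: each step's output feeds the next,
// ending with an optional verification prompt for enhanced accuracy.
@ApplicationScoped
public class ProgramOfThoughtChain {

    @Inject
    ChatLanguageModel model;

    public String run(String problem) {
        // Step 1: plan the sub-steps.
        String plan = model.generate(
                "Break this problem into numbered sub-steps:\n" + problem);
        // Step 2: solve by following the plan.
        String draft = model.generate(
                "Solve the problem by following this plan:\n" + plan
                        + "\nProblem:\n" + problem);
        // Step 3 (optional): verify and correct the draft.
        return model.generate(
                "Check this solution for errors and correct it if needed:\n" + draft);
    }
}
```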
+Lastly, a "Hybrid" approach combines CoT with Retrieval-Augmented Generation (RAG) to ground the reasoning process in factual information, ensuring that the LLM's logical steps are supported by relevant data. These patterns provide flexibility in how CoT is applied, allowing architects to choose the level of control and factual grounding necessary for their specific enterprise AI applications.
Architecting Chain-of-Thought (CoT) solutions for enterprise environments necessitates careful consideration of guardrails and privacy. What follows is an initial overview of critical aspects that software architects must account for to ensure responsible and secure AI deployment: managing the transparency of reasoning, maintaining answer consistency, and controlling data exposure within the CoT process.
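For illustration only, assuming the quarkus-langchain4j guardrail API, an input guardrail might screen prompts for obvious PII before they ever reach the model (the class name and pattern are illustrative):

```java
import dev.langchain4j.data.message.UserMessage;
import io.quarkiverse.langchain4j.guardrails.InputGuardrail;
import io.quarkiverse.langchain4j.guardrails.InputGuardrailResult;
import jakarta.enterprise.context.ApplicationScoped;

// Hypothetical input guardrail: reject prompts that look like they contain
// a US social security number, so the data is never sent to the LLM.
@ApplicationScoped
public class PiiInputGuard implements InputGuardrail {

    @Override
    public InputGuardrailResult validate(UserMessage userMessage) {
        if (userMessage.singleText().matches("(?s).*\\b\\d{3}-\\d{2}-\\d{4}\\b.*")) {
            return failure("Request appears to contain PII and was blocked.");
        }
        return success();
    }
}
```

A guardrail like this would then be attached to the relevant AI service methods so every request passes through it before prompting the model.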
Advanced Contextual RAG extends the core frozen RAG pattern by incorporating multi-source retrieval, reranking, and content injection techniques. This is designed for more complex enterprise scenarios where information might be spread across various systems, requiring more sophisticated methods to ensure accuracy, relevance, and explainability. It allows for dynamic information handling, complex query processing, and provides clearer lineage for auditable decisions, making it ideal for high-stakes applications.
The process begins with a User Query, which is first processed by a Query Transformer to refine or enhance it for more effective retrieval. The transformed query is then passed to a Query Router that decides which knowledge sources to target. For unstructured data, the ingestion pipeline remains the same as in the foundational RAG architecture (documents split, embedded, and stored in a vector store), but contextual RAG extends retrieval to multiple sources such as structured databases, APIs, and search indexes.
The Query Router is responsible for directing the query to multiple retrieval sources simultaneously. These sources include the vector store used for unstructured content, as well as structured databases, APIs, and search indexes.
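Any of these sources can be presented to the router as a LangChain4j ContentRetriever. The following sketch wraps a hypothetical structured lookup (the OrderRepository abstraction and its query logic are illustrative):

```java
import dev.langchain4j.rag.content.Content;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.query.Query;
import java.util.List;

// Hypothetical retriever exposing a structured system (e.g. an order database)
// to the Query Router alongside the vector-store retriever.
public class OrderDatabaseRetriever implements ContentRetriever {

    /** Illustrative data-access abstraction; replace with your own repository. */
    public interface OrderRepository {
        List<String> search(String naturalLanguageQuery);
    }

    private final OrderRepository repository;

    public OrderDatabaseRetriever(OrderRepository repository) {
        this.repository = repository;
    }

    @Override
    public List<Content> retrieve(Query query) {
        // Translate the natural-language query into a domain lookup
        // (full-text search, SQL, or an API call) and wrap each hit as Content.
        return repository.search(query.text()).stream()
                .map(Content::from)
                .toList();
    }
}
```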
All the information retrieved from these diverse sources is then fed into an Aggregator/Reranker. This component combines and prioritizes the retrieved content based on relevance to the original query.
The aggregated and reranked content is passed to a Content Injector (Prompt Builder). This component constructs an Augmented Prompt for the Large Language Model (LLM) by incorporating the retrieved context alongside the original user query.
Finally, the LLM processes the Augmented Prompt, using the provided context to generate an answer. Alongside the answer, the system can return the retrieved source segments for transparency and verification, though these should be considered supporting context rather than strict citations.
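Putting the pieces together, the whole pipeline might be wired with LangChain4j's RAG building blocks roughly as follows (the concrete transformer, router, reranker, and injector choices are illustrative, not prescriptive):

```java
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.scoring.ScoringModel;
import dev.langchain4j.rag.DefaultRetrievalAugmentor;
import dev.langchain4j.rag.RetrievalAugmentor;
import dev.langchain4j.rag.content.aggregator.ReRankingContentAggregator;
import dev.langchain4j.rag.content.injector.DefaultContentInjector;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.query.router.DefaultQueryRouter;
import dev.langchain4j.rag.query.transformer.CompressingQueryTransformer;

// Hypothetical wiring of the contextual RAG flow described above.
public class ContextualRagPipeline {

    public static RetrievalAugmentor build(ChatLanguageModel chatModel,
                                           ScoringModel scoringModel,
                                           ContentRetriever vectorStoreRetriever,
                                           ContentRetriever databaseRetriever) {
        return DefaultRetrievalAugmentor.builder()
                // Query Transformer: rewrite/compress the user query for retrieval.
                .queryTransformer(new CompressingQueryTransformer(chatModel))
                // Query Router: fan the query out to several sources.
                .queryRouter(new DefaultQueryRouter(vectorStoreRetriever, databaseRetriever))
                // Aggregator/Reranker: merge results and reorder them by relevance.
                .contentAggregator(new ReRankingContentAggregator(scoringModel))
                // Content Injector: build the augmented prompt for the LLM.
                .contentInjector(DefaultContentInjector.builder().build())
                .build();
    }
}
```

With quarkus-langchain4j, a RetrievalAugmentor built this way can be exposed as a CDI bean and picked up by a declarative AI service.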
Scaling your AI solutions efficiently and optimizing their performance are crucial for enterprise adoption and operational success. While this blueprint only gives you high-level guidance, we strongly recommend also looking into the non-functional aspects of your solution and ways to address them.
Architecting secure enterprise AI solutions demands a proactive approach to safeguard sensitive data and preserve organizational integrity. Below are some initial thoughts on critical security considerations and architectural patterns you should investigate further when building your solution.
Integrate RAG to anchor Large Language Model (LLM) responses in your enterprise data, with Quarkus handling ingestion pipelines, query execution, embedding generation, context retrieval, and seamless LLM interaction. This blueprint focuses on the foundational RAG pattern (also called frozen RAG); more advanced contextual RAG variants, including multi-source routing and reranking, are covered separately.
This blueprint focuses on integrating Retrieval-Augmented Generation (RAG) to ground Large Language Model (LLM) responses in organizational data.
The architecture is divided into two main phases:
Ingestion: This phase prepares enterprise knowledge for retrieval. In a frozen RAG setup, data typically originates from unstructured document sources such as manuals, PDFs, or reports.
Query: This phase handles user queries and generates grounded answers.
This two-phase approach allows for reduced hallucinations in LLM responses, up-to-date information without retraining, cost efficiency by retrieving only relevant information, and seamless integration with existing enterprise Java services and workflows.
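As a compact sketch of both phases, assuming LangChain4j's ingestion and retrieval building blocks and an in-memory embedding store for brevity (names, chunk sizes, and result counts are illustrative):

```java
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.splitter.DocumentSplitters;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import java.util.List;

public class FrozenRag {

    interface Assistant {
        String chat(String question);
    }

    public static Assistant build(List<Document> documents,
                                  EmbeddingModel embeddingModel,
                                  ChatLanguageModel chatModel) {
        InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();

        // Phase 1 - Ingestion: split, embed, and store enterprise documents.
        EmbeddingStoreIngestor.builder()
                .documentSplitter(DocumentSplitters.recursive(500, 50))
                .embeddingModel(embeddingModel)
                .embeddingStore(store)
                .build()
                .ingest(documents);

        // Phase 2 - Query: retrieve relevant segments and ground the LLM's answer.
        return AiServices.builder(Assistant.class)
                .chatLanguageModel(chatModel)
                .contentRetriever(EmbeddingStoreContentRetriever.builder()
                        .embeddingStore(store)
                        .embeddingModel(embeddingModel)
                        .maxResults(4)
                        .build())
                .build();
    }
}
```

In production, the in-memory store would typically be replaced by a persistent vector store, with ingestion running as a separate pipeline.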
Artificial Intelligence is reshaping enterprise software, and Java remains the backbone. Its long-standing reliability, security, and scalability make it ideal for building AI-infused applications.
Many AI initiatives begin with robust data processing pipelines.
This dual capability, supporting both model training and RAG processes, makes Java uniquely powerful in AI architectures.
Once data and models are ready, enterprises need applications that can put them to work in production.
The JVM ecosystem ensures consistency, portability, and performance across deployments.
Java brings full enterprise readiness to AI systems.
The Java ecosystem has long been foundational for interoperability.
So, it’s not surprising to see Java client and server implementations for the emerging AI protocols, such as the Model Context Protocol (MCP) and Agent2Agent (A2A).
Java’s mature networking and concurrency capabilities make it ideal for implementing these agentic, AI-driven architectures.
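For example, assuming the Quarkiverse quarkus-mcp-server extension, a plain Java method can be exposed as an MCP tool that agents may invoke (the tool and its return value are illustrative):

```java
import io.quarkiverse.mcp.server.Tool;
import io.quarkiverse.mcp.server.ToolArg;

// Hypothetical MCP tool: with the extension on the classpath, annotated
// methods are advertised to connected MCP clients as callable tools.
public class InventoryTools {

    @Tool(description = "Look up the current stock level for a product SKU")
    String stockLevel(@ToolArg(description = "Product SKU, e.g. ABC-123") String sku) {
        // Illustrative lookup; a real implementation would query inventory data.
        return "SKU " + sku + ": 42 units in stock";
    }
}
```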
Java is ideal for AI development due to its platform independence, robust memory management, and rich open-source libraries like Deeplearning4j, Weka, and Apache Spark. Its mature ecosystem and strong community support enable scalable AI applications. With recent JVM optimizations, Java's performance handles demanding machine learning tasks and large datasets efficiently.
Quarkus is ideal for AI applications due to its performance, agility, and developer experience. It offers native Generative AI integration via LangChain4j, supporting declarative AI services, various LLMs, and advanced prompt engineering. It also handles predictive AI and data pipeline automation with ML toolkits for scalable ETL and embedding workflows. Quarkus's "AI-Enhanced Developer Experience" provides fast startup, low memory usage, and a reactive core for cloud-native AI. It boosts developer velocity with live coding, a unified Java stack, and robust observability and security for reliable AI services.
Easily build AI workflows with minimal code, leveraging top LLM providers to create features like chatbots and summarizers.
Enable predictive AI, model training, and scalable data workflows, connecting seamlessly to ML tools, message brokers, databases, and files.
Incorporate AI directly into the development workflow with instant feedback, code explanations, and documentation, speeding up iterations.
Create reliable, secure AI applications using Quarkus’s built-in observability, security, and governance, ensuring AI services are safe, accountable, and performance-ready from the beginning.
AI blueprints offer conceptual, infrastructure-agnostic reference architectures for developing enterprise-grade AI solutions in Java. They simplify AI integration in Java applications, guiding software architects in building intelligent chatbots, recommendation engines, and data analysis tools. These blueprints, leveraging Quarkus and LangChain4j, provide a solid starting point for advanced AI capabilities in your Java projects.
Quarkus integrates GenAI natively through the LangChain4j extension, empowering developers with declarative AI services, support for a wide range of LLM providers, and advanced prompt engineering.
This enables teams to develop sophisticated AI-infused features, such as chatbots or dynamic summarizers, without requiring a stack shift.
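A minimal sketch of such a declarative AI service with the quarkus-langchain4j extension (the interface name and prompt are illustrative):

```java
import dev.langchain4j.service.SystemMessage;
import io.quarkiverse.langchain4j.RegisterAiService;

// Hypothetical summarizer: Quarkus generates the implementation and routes
// calls to the configured LLM provider.
@RegisterAiService
public interface Summarizer {

    @SystemMessage("Summarize the supplied text in three concise bullet points.")
    String summarize(String text);
}
```

Injecting Summarizer into a REST resource and calling summarize(...) is all the application code required; the extension supplies the plumbing to the configured model provider.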
Quarkus is comprehensive: it supports both predictive AI and data pipeline automation.
Quarkus is also extending AI into the developer workflow through the Chappie initiative.
This becomes more than a powerful coding assistant; it augments the whole developer experience.
Build secure, trustworthy AI applications with Quarkus’s comprehensive observability, security, and governance features, providing automatic metrics, logs, traces, data sanitization, and access controls to ensure AI services are secure, accountable, and performance-optimized from day one.