
Commit 71b15f4
llama3.java
1 parent 2172d70 commit 71b15f4

2 files changed (+32 -21 lines)

adoc/articles/javaspektrum-llama3-java.adoc (+5 -4)
@@ -64,7 +64,7 @@ Now we can run the model we just downloaded and ask it a question
 As mentioned, the small model is not particularly good; it often gives very questionable answers.
 Therefore, even in critical practical applications, only the language capabilities of the LLMs should be used, but preferably not their "knowledge"; the latter should instead come from trustworthy sources such as databases (via Retrieval Augmented Generation - RAG).
 
-.Listing {listing}
+.Listing {listing} - Test with jbang, first question
 [source,shell]
 ----
 jbang Llama3.java --model ../$MODEL --prompt "Kurz: Wie funktioniert physikalisch ein Induktionsherd?"
@@ -231,13 +231,14 @@ Fortunately, with GGUF/GGMF the token vocabulary is integrated directly into the model file
 
 The main problems for tokenization occur in Asian languages, e.g. with Kanji, and, interestingly, with emoji.
 
-The `Tokenizer` class handles the conversion between text and the token ids:
+The `Tokenizer` class (Listing {counter:listing}) handles the conversion between text and the token ids:
 
 * Implements the "Byte Pair Encoding" (BPE) algorithm
 * Handling of special tokens
 * Efficient text splitting using regular expressions
 
-[source,java]
+.Listing {listing} - Tokenizer implementation
+[source,java]
 ----
 class Tokenizer {
     private final Pattern compiledPattern;
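The BPE merge loop named in the bullet list above can be sketched briefly (a simplified Python illustration with a toy merge table, not the article's Java `Tokenizer`):

```python
# Simplified sketch of the BPE merge loop (illustration only, not the
# article's Java `Tokenizer`): repeatedly merge the adjacent token pair
# with the best (lowest) merge rank until no mergeable pair remains.
def bpe_encode(text, ranks):
    tokens = list(text)  # start from single characters (real BPE starts from bytes)
    while len(tokens) > 1:
        candidates = [(ranks[(a, b)], i)
                      for i, (a, b) in enumerate(zip(tokens, tokens[1:]))
                      if (a, b) in ranks]
        if not candidates:
            break
        _, i = min(candidates)  # highest-priority (lowest-rank) pair first
        tokens = tokens[:i] + [tokens[i] + tokens[i + 1]] + tokens[i + 2:]
    return tokens

# Toy merge table (hypothetical ranks, not a real model vocabulary)
ranks = {("l", "l"): 0, ("h", "e"): 1, ("ll", "o"): 2, ("he", "llo"): 3}
print(bpe_encode("hello", ranks))  # → ['hello']
```

A real tokenizer operates on UTF-8 bytes and applies the regex-based pre-splitting mentioned above before merging, but the greedy rank-ordered merge is the core of the algorithm.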
@@ -397,7 +398,7 @@ Model
 
 Configurable selection of the next token from the vector of the probability distribution (logits), depending on temperature and top-p, but also grammar- or function-signature-driven selection.
 
-There are different sampling strategies:
+There are different sampling strategies, see Listing {counter:listing}:
 
 * `Sampler`: base sampler strategy interface
 * `CategoricalSampler`: samples globally according to the probability distribution
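The temperature and top-p selection described above can be sketched as follows (a simplified Python illustration, not the article's Java sampler classes):

```python
import math
import random

def sample_next_token(logits, temperature=0.8, top_p=0.9, rng=None):
    """Simplified sketch: temperature-scaled softmax, nucleus (top-p)
    filtering, then a categorical draw. Illustration only, not the
    article's Java `CategoricalSampler`."""
    rng = rng or random.Random()
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    # keep the smallest set of most likely tokens with cumulative prob >= top_p
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # renormalize over the kept tokens and draw one categorically
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]

# With a strongly peaked distribution and top_p=0.5 only token 0 survives
print(sample_next_token([10.0, 0.0, 0.0], temperature=1.0, top_p=0.5))  # → 0
```

Lower temperatures sharpen the distribution toward greedy selection; top-p discards the improbable tail, which in practice suppresses many degenerate continuations.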

adoc/mcp-neo4j.adoc (+27 -17)
@@ -394,6 +394,7 @@ MCP follows a https://modelcontextprotocol.io/docs/concepts/architecture[client-
 The protocol layer handles message framing, request/response linking, notifications and high-level communication patterns.
 
 The MCP allows for different transport protocols; currently supported are HTTPS (with Server-Sent Events (SSE) for server->client messages and HTTP POST for client->server) and STDIO for local servers, where the server is started by the client and communicates via stdin/stdout.
+The protocol has a lifecycle of initialization, message exchange and termination.
 
 All transport message exchanges are based on a https://spec.modelcontextprotocol.io/specification/[specification^] using JSON-RPC 2.0.
 This encourages implementing the protocol in other languages or transport layers.
@@ -407,6 +408,15 @@ The base message types are:
 
 With additional relevant aspects being configuration, progress tracking, cancellation, error reporting and logging.
 
+Message types are:
+
+* Client->Server: Requests (which expect a response) and Notifications, with a method name and parameters.
+* Server->Client: Notifications, Results (a dictionary), and Errors (code, message, data), with some error codes from the JSON-RPC spec and others from the application/SDKs.
+
+The MCP site also documents sample client and server implementations in Python and TypeScript to enable implementers to get started quickly.
+Additionally, it provides a list of good practices to adhere to for security, error handling and request processing.
+
 The protocol spec also considers *security and trust*, an important aspect when allowing LLMs to access external data sources: especially with write access to databases and filesystems, with servers running locally, and with the potential for malicious code execution, security is a top priority.
 Foundation models are known to be vulnerable to adversarial attacks and hallucinations.
 Often LLM users are non-technical and might not be aware of the risks involved in allowing an AI model to access their data.
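The request/notification/result/error framing listed above can be made concrete with plain JSON-RPC 2.0 payloads (field values are illustrative; `tools/call` is an MCP method name and the tool name is taken from the mcp-neo4j server discussed in this article):

```python
import json

# Sketch of the JSON-RPC 2.0 framing used by MCP (illustrative values).
request = {                      # client -> server, expects a response
    "jsonrpc": "2.0",
    "id": 1,                     # responses are matched to requests by id
    "method": "tools/call",
    "params": {"name": "get-neo4j-schema", "arguments": {}},
}
notification = {                 # no id, no response expected
    "jsonrpc": "2.0",
    "method": "notifications/progress",
    "params": {"progress": 0.5},
}
result = {"jsonrpc": "2.0", "id": 1, "result": {"content": []}}
error = {"jsonrpc": "2.0", "id": 1,
         "error": {"code": -32601, "message": "Method not found"}}

print(json.dumps(request))
```

The `-32601` code is one of the reserved JSON-RPC error codes; applications and SDKs layer their own codes on top, as noted above.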
@@ -491,23 +501,23 @@ There is a small check that we only allow read statements in the read tool and v
 
 [source,python]
 ----
-@server.call_tool()
-async def handle_call_tool(
-    name: str, arguments: dict[str, Any] | None
-) -> list[types.TextContent | types.ImageContent | types.EmbeddedResource]:
-    """Handle tool execution requests"""
-    try:
-        if name == "get-neo4j-schema":
-            results = db._execute_query(
-                """
-                CALL apoc.meta.data() yield label, property, type, other, unique, index, elementType
-                WHERE elementType = 'node'
-                RETURN label,
-                collect(case when type <> 'RELATIONSHIP' then [property, type] end) as attributes,
-                collect(case when type = 'RELATIONSHIP' then [property, head(other)] end) as relationships
-                """
-            )
-            return [types.TextContent(type="text", text=str(results))]
+@server.call_tool()
+async def handle_call_tool(
+    name: str, arguments: dict[str, Any] | None
+) -> list[types.TextContent | types.ImageContent | types.EmbeddedResource]:
+    """Handle tool execution requests"""
+    try:
+        if name == "get-neo4j-schema":
+            results = db._execute_query(
+                """
+                CALL apoc.meta.data() yield label, property, type, other, unique, index, elementType
+                WHERE elementType = 'node'
+                RETURN label,
+                collect(case when type <> 'RELATIONSHIP' then [property, type] end) as attributes,
+                collect(case when type = 'RELATIONSHIP' then [property, head(other)] end) as relationships
+                """
+            )
+            return [types.TextContent(type="text", text=str(results))]
 ----
 
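The "small check that we only allow read statements" mentioned in the hunk above could look roughly like this (a simplified sketch; the actual check in the mcp-neo4j server may differ):

```python
import re

# Simplified sketch of a read-only guard for a Cypher read tool
# (illustration only; the real mcp-neo4j check may differ). It rejects
# statements containing write or schema-modifying clauses.
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP)\b", re.IGNORECASE)

def is_read_query(cypher):
    return WRITE_CLAUSES.search(cypher) is None

print(is_read_query("MATCH (n:Person) RETURN n LIMIT 5"))  # → True
print(is_read_query("MERGE (n:Person {name: 'x'})"))       # → False
```

A keyword blocklist like this is only a first line of defense; combining it with a read-only database user is the safer design.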
== Conversational Memory as a Graph
