Qwen QA Android example using ONNX Runtime #521

Open · wants to merge 11 commits into base: main
15 changes: 15 additions & 0 deletions mobile/examples/Qwen_QA/Android/.gitignore
@@ -0,0 +1,15 @@
*.iml
.gradle
/local.properties
/.idea/caches
/.idea/libraries
/.idea/modules.xml
/.idea/workspace.xml
/.idea/navEditor.xml
/.idea/assetWizardSettings.xml
.DS_Store
/build
/captures
.externalNativeBuild
.cxx
local.properties
119 changes: 119 additions & 0 deletions mobile/examples/Qwen_QA/Android/README.md
@@ -0,0 +1,119 @@
# Local Qwen LLM on Android

This example shows how to run Qwen2.5-0.5B-Instruct and Qwen3-0.6B entirely on an Android device using ONNX Runtime.
All tokens are generated offline on the phone: no network calls, no telemetry.

---

## Key features

- On-device inference with the official `onnxruntime-android` package.
- Tokenizer compatibility: reads the Hugging Face-standard `tokenizer.json` shipped with Qwen.
- Prompt formatting for Qwen 2.5 and Qwen 3, including the **Thinking Mode** toggle supported by Qwen 3 (a prompt sketch follows this list).
- Streaming generation with past-KV caching for smooth, low-latency text output (see [OnnxModel.kt](app/src/main/java/com/example/local_llm/OnnxModel.kt)).
- Output supports Markdown, so you can copy and reuse formatted answers anywhere.
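
For reference, the ChatML-style prompt Qwen expects looks roughly like the sketch below, including the Qwen 3 thinking-mode soft switch. The function and flag names are illustrative, not the exact code in this app:

```kotlin
// Minimal sketch of Qwen's ChatML-style prompt format. Names are
// illustrative; the app's actual prompt building may differ.
fun buildPrompt(system: String, user: String, thinkingMode: Boolean): String {
    // Qwen 3 honors a soft switch: appending "/no_think" to the user turn
    // asks the model to skip its <think> reasoning block.
    val userTurn = if (thinkingMode) user else "$user /no_think"
    return buildString {
        append("<|im_start|>system\n$system<|im_end|>\n")
        append("<|im_start|>user\n$userTurn<|im_end|>\n")
        append("<|im_start|>assistant\n") // generation continues from here
    }
}
```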


---

## 📸 Inference Preview

<p align="center">
<img src="demo/Demo.gif" alt="Model Output 2" width="25%" style="margin: 1%"/>
<img src="demo/Demo2.gif" alt="Input Prompt" width="25%" style="margin: 1%"/>
<img src="demo/Qwen3demo.gif" alt="Input Prompt" width="25%" style="margin: 1%"/>
</p>

<p align="center">
<em>Figure: App interface showing prompt input and generated answers using the local LLM.</em>
</p>

---

## Model Info

This app supports both **Qwen2.5-0.5B-Instruct** and **Qwen3-0.6B** — optimized for instruction-following, QA, and reasoning tasks.

### Option 1: Use Preconverted ONNX Model

Download `model.onnx` and `tokenizer.json` from Hugging Face:

- 🔹 [Qwen2.5](https://huggingface.co/onnx-community/Qwen2.5-0.5B-Instruct)
- 🔹 [Qwen3](https://huggingface.co/onnx-community/Qwen3-0.6B-ONNX)

- You can also use quantized models (e.g., `model_q4fp16.onnx`) for faster, lighter inference with minimal accuracy loss.

### ⚙️ Option 2: Convert Model Yourself

```bash
pip install "optimum[onnxruntime]"
# or install the latest version from source
python -m pip install git+https://github.com/huggingface/optimum.git
```

Export the model:

```bash
optimum-cli export onnx --model Qwen/Qwen2.5-0.5B-Instruct qwen2.5-0.5B-onnx/
```

- You can also convert any fine-tuned variant by specifying the model path.
- Learn more about [Optimum here](https://huggingface.co/docs/optimum/main/en/index).

---

## ⚙️ Requirements

- [Android Studio](https://developer.android.com/studio)
- [ONNX Runtime for Android](https://github.com/microsoft/onnxruntime-genai/releases) (the required ONNX Runtime GenAI AAR is already included in this repo).
- A physical Android device for deployment and testing, ≥ 4 GB RAM for FP16 / Q4 models, ≥ 6 GB RAM for FP32 models.
- Real hardware is preferred; emulators are acceptable for UI checks only.

---
## Choose which Qwen model to run

In [MainActivity.kt](app/src/main/java/com/example/local_llm/MainActivity.kt) you will find two predefined `ModelConfig` objects:

```kotlin
val modelconfigqwen25 = … // Qwen 2.5-0.5B
val modelconfigqwen3 = … // Qwen 3-0.6B
```
Right below them is a single line that tells the app which one to use:

```kotlin
val config = modelconfigqwen25 // ← change to modelconfigqwen3 for Qwen 3
```
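
Both objects are instances of a small config class. If you adapt the example to another model, the shape is roughly the sketch below; the field names here are hypothetical, so check [MainActivity.kt](app/src/main/java/com/example/local_llm/MainActivity.kt) for the actual definition:

```kotlin
// Hypothetical shape of a model config; the real class in MainActivity.kt
// may differ. Asset names must match files under app/src/main/assets/.
data class ModelConfig(
    val modelAsset: String,           // e.g. "model.onnx"
    val tokenizerAsset: String,       // e.g. "tokenizer.json"
    val defaultSystemPrompt: String,  // assistant role and tone
    val supportsThinkingMode: Boolean // true for Qwen 3
)
```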

## How to Build & Run

1. Open Android Studio and create a new project (Empty Activity).
2. Name your app `local_llm`.
3. Copy all the project files from `Qwen_QA/Android` into the appropriate folders.
4. Place your `model.onnx` and `tokenizer.json` in (a session-loading sketch follows this list):
```
app/src/main/assets/
```
5. Connect your Android phone using wireless debugging or USB.
6. To install:
- Press Run ▶️ in Android Studio, **or**
- Go to **Build → Generate Signed Bundle / APK** to export the `.apk` file.
7. Once installed, look for the **Pocket LLM** icon&nbsp;
<img src="demo/pocket_llm_icon.png" alt="Pocket LLM icon" width="28" style="vertical-align:middle;border-radius:100%"/>
on your home screen.
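
If the app cannot find the model at startup, double-check the asset names from step 4. For reference, loading the bundled model into an ONNX Runtime session looks roughly like the sketch below; this is a simplified outline, not the exact code in [OnnxModel.kt](app/src/main/java/com/example/local_llm/OnnxModel.kt):

```kotlin
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtSession
import android.content.Context

// Simplified outline of loading the bundled model from app assets.
// Asset names must match the files placed in step 4.
fun createSession(context: Context): OrtSession {
    val env = OrtEnvironment.getEnvironment()
    val modelBytes = context.assets.open("model.onnx").readBytes()
    return env.createSession(modelBytes, OrtSession.SessionOptions())
}
```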

**Note**: All Kotlin files are declared in the package `com.example.local_llm`, and the Gradle script sets `applicationId "com.example.local_llm"`.
If you name the app (or change the package) to anything other than `local_llm`, you must refactor:
- the directory structure in `app/src/main/java/...`,
- every `package com.example.local_llm` line, and
- the `applicationId` in `app/build.gradle.kts`.

Otherwise, Android Studio will raise “package … does not exist” errors and the project will fail to compile.
---

## Customize Your App Experience
- Define the assistant’s tone and role by setting `defaultSystemPrompt` (in your model config).
- Adjust `TEMPERATURE` to control response randomness: lower for accuracy, higher for creativity ([OnnxModel.kt](app/src/main/java/com/example/local_llm/OnnxModel.kt)).
- Use `REPETITION_PENALTY` to avoid repetitive answers and improve fluency ([OnnxModel.kt](app/src/main/java/com/example/local_llm/OnnxModel.kt)).
- Change `MAX_TOKENS` to limit or expand the length of generated replies (see the sketch after this list).
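
As a rough guide to how these knobs interact, here is a hedged sketch of a single sampling step; the constant names mirror the ones above, but the actual loop in [OnnxModel.kt](app/src/main/java/com/example/local_llm/OnnxModel.kt) may differ:

```kotlin
import kotlin.math.exp
import kotlin.random.Random

// Illustrative values; tune to taste.
const val TEMPERATURE = 0.7f
const val REPETITION_PENALTY = 1.1f
const val MAX_TOKENS = 256

// Picks the next token id from raw logits. A sketch of the usual recipe:
// repetition penalty, then temperature scaling, then multinomial sampling.
fun sampleNextToken(logits: FloatArray, generated: List<Int>): Int {
    val adjusted = logits.copyOf()
    // Repetition penalty: dampen tokens that were already emitted.
    for (id in generated.toSet()) {
        adjusted[id] =
            if (adjusted[id] > 0) adjusted[id] / REPETITION_PENALTY
            else adjusted[id] * REPETITION_PENALTY
    }
    // Temperature: below 1 sharpens the distribution, above 1 flattens it.
    for (i in adjusted.indices) adjusted[i] /= TEMPERATURE
    // Numerically stable softmax, then sample from the distribution.
    val maxLogit = adjusted.maxOrNull() ?: 0f
    val exps = adjusted.map { exp((it - maxLogit).toDouble()) }
    var r = Random.nextDouble() * exps.sum()
    for (i in exps.indices) {
        r -= exps[i]
        if (r <= 0) return i
    }
    return exps.lastIndex
}
```

The generation loop then stops once `MAX_TOKENS` tokens have been emitted or the tokenizer's end-of-sequence id is sampled.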

### 📄 License Notice
Note: These ONNX models are based on Qwen, which is licensed under the [Apache License 2.0](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct/blob/main/LICENSE).
1 change: 1 addition & 0 deletions mobile/examples/Qwen_QA/Android/app/.gitignore
@@ -0,0 +1 @@
/build
65 changes: 65 additions & 0 deletions mobile/examples/Qwen_QA/Android/app/build.gradle.kts
@@ -0,0 +1,65 @@
plugins {
alias(libs.plugins.android.application)
alias(libs.plugins.kotlin.android)
alias(libs.plugins.kotlin.compose)
}

android {
namespace = "com.example.local_llm"
compileSdk = 35

defaultConfig {
applicationId = "com.example.local_llm"
minSdk = 24
targetSdk = 35
versionCode = 1
versionName = "1.0"

testInstrumentationRunner = "androidx.test.runner.AndroidJUnitRunner"
}

buildTypes {
release {
isMinifyEnabled = false
proguardFiles(
getDefaultProguardFile("proguard-android-optimize.txt"),
"proguard-rules.pro"
)
}
}
compileOptions {
sourceCompatibility = JavaVersion.VERSION_11
targetCompatibility = JavaVersion.VERSION_11
}
kotlinOptions {
jvmTarget = "11"
}
buildFeatures {
compose = true
viewBinding = true
}
}

dependencies {

implementation(libs.androidx.core.ktx)
implementation(libs.androidx.lifecycle.runtime.ktx)
implementation(libs.androidx.activity.compose)
implementation(platform(libs.androidx.compose.bom))
implementation(libs.androidx.ui)
implementation(libs.androidx.ui.graphics)
implementation(libs.androidx.ui.tooling.preview)
implementation(libs.androidx.material3)
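// ONNX Runtime for Android: executes model.onnx on-device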
implementation(libs.onnxruntime.android)
implementation(libs.androidx.appcompat)
testImplementation(libs.junit)
androidTestImplementation(libs.androidx.junit)
androidTestImplementation(libs.androidx.espresso.core)
androidTestImplementation(platform(libs.androidx.compose.bom))
androidTestImplementation(libs.androidx.ui.test.junit4)
debugImplementation(libs.androidx.ui.tooling)
implementation(libs.json.json)
implementation("androidx.constraintlayout:constraintlayout:2.1.4")
implementation(files("libs/onnxruntime-genai-android-0.7.1.aar"))
implementation("io.noties.markwon:core:4.6.2")
}
Binary file not shown.
21 changes: 21 additions & 0 deletions mobile/examples/Qwen_QA/Android/app/proguard-rules.pro
@@ -0,0 +1,21 @@
# Add project specific ProGuard rules here.
# You can control the set of applied configuration files using the
# proguardFiles setting in build.gradle.
#
# For more details, see
# http://developer.android.com/guide/developing/tools/proguard.html

# If your project uses WebView with JS, uncomment the following
# and specify the fully qualified class name to the JavaScript interface
# class:
#-keepclassmembers class fqcn.of.javascript.interface.for.webview {
# public *;
#}

# Uncomment this to preserve the line number information for
# debugging stack traces.
#-keepattributes SourceFile,LineNumberTable

# If you keep the line number information, uncomment this to
# hide the original source file name.
#-renamesourcefileattribute SourceFile
29 changes: 29 additions & 0 deletions mobile/examples/Qwen_QA/Android/app/src/main/AndroidManifest.xml
@@ -0,0 +1,29 @@
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools">

<application
android:allowBackup="true"
android:dataExtractionRules="@xml/data_extraction_rules"
android:fullBackupContent="@xml/backup_rules"
android:icon="@mipmap/ic_launcher_2"
android:label="@string/app_name"
android:roundIcon="@mipmap/ic_launcher_2_round"
android:supportsRtl="true"
android:theme="@style/Theme.local_llm"
tools:targetApi="31">
<activity
android:name=".MainActivity"
android:exported="true"
android:label="@string/app_name"
android:theme="@style/Theme.local_llm">
<intent-filter>
<action android:name="android.intent.action.MAIN" />

<category android:name="android.intent.category.LAUNCHER" />
</intent-filter>
</activity>
</application>

</manifest>
@@ -0,0 +1 @@
### Add model.onnx and tokenizer.json in this folder