qodo-benchmark · ofir-frd · Jan 4, 2026
diff --git a/AGENTS.md b/AGENTS.md
@@ -1,54 +1,253 @@
-# AGENTS.md
+# Compliance Rules
 
-## Project Overview
+This file contains the compliance and code quality rules for this repository.
 
-Dify is an open-source platform for developing LLM applications with an intuitive interface combining agentic AI workflows, RAG pipelines, agent capabilities, and model management.
+## 1. Python Functions Must Include Type Annotations
 
-The codebase is split into:
+**Objective:** Ensure type safety and improve code maintainability by requiring explicit type annotations for all function parameters and return values in Python code, as enforced by basedpyright type checking
 
-- **Backend API** (`/api`): Python Flask application organized with Domain-Driven Design
-- **Frontend Web** (`/web`): Next.js 15 application using TypeScript and React 19
-- **Docker deployment** (`/docker`): Containerized deployment configurations
+**Success Criteria:** All Python function definitions include type hints for parameters and return types using Python 3.12+ syntax (e.g., list[str] instead of List[str], int | None instead of Optional[int])
 
-## Backend Workflow
+**Failure Criteria:** Python function definitions lack type annotations for parameters or return values
 
-- Run backend CLI commands through `uv run --project api <command>`.
+---
 
-- Before submission, all backend modifications must pass local checks: `make lint`, `make type-check`, and `uv run --project api --dev dev/pytest/pytest_unit_tests.sh`.
+## 2. Python Code Must Follow Ruff Linting Rules
 
-- Use Makefile targets for linting and formatting; `make lint` and `make type-check` cover the required checks.
+**Objective:** Maintain consistent code quality and style by adhering to the project's ruff configuration, which enforces rules for code formatting, import ordering, security checks, and best practices
 
-- Integration tests are CI-only and are not expected to run in the local environment.
+**Success Criteria:** All Python code passes ruff format and ruff check --fix without errors, following the rules defined in .ruff.toml including line length of 120 characters, proper import sorting, and security rules (S102, S307, S301, S302, S311)
 
-## Frontend Workflow
+**Failure Criteria:** Python code violates ruff linting rules such as improper formatting, incorrect import ordering, use of print() instead of logging, or security violations like using exec/eval/pickle
 
-```bash
-cd web
-pnpm lint:fix
-pnpm type-check:tsgo
-pnpm test
-```
+---
 
-## Testing & Quality Practices
+## 3. Backend Code Must Use Logging Instead of Print Statements
 
-- Follow TDD: red → green → refactor.
-- Use `pytest` for backend tests with Arrange-Act-Assert structure.
-- Enforce strong typing; avoid `Any` and prefer explicit type annotations.
-- Write self-documenting code; only add comments that explain intent.
+**Objective:** Enable proper observability and debugging by requiring all output to use the logging module rather than print statements, with logger instances declared at module level
 
-## Language Style
+**Success Criteria:** All logging is performed using logger = logging.getLogger(__name__) declared at module top, with no print() statements in production code (tests are exempt)
 
-- **Python**: Keep type hints on functions and attributes, and implement relevant special methods (e.g., `__repr__`, `__str__`).
-- **TypeScript**: Use the strict config, rely on ESLint (`pnpm lint:fix` preferred) plus `pnpm type-check:tsgo`, and avoid `any` types.
+**Failure Criteria:** Code contains print() statements outside of test files, or logging is performed without proper logger initialization
 
-## General Practices
+---
 
-- Prefer editing existing files; add new documentation only when requested.
-- Inject dependencies through constructors and preserve clean architecture boundaries.
-- Handle errors with domain-specific exceptions at the correct layer.
+## 4. Python Code Must Use Modern Type Syntax for Python 3.12+
 
-## Project Conventions
+**Objective:** Leverage modern Python type system features for better code clarity and type safety by using the latest type annotation syntax
 
-- Backend architecture adheres to DDD and Clean Architecture principles.
-- Async work runs through Celery with Redis as the broker.
-- Frontend user-facing strings must use `web/i18n/en-US/`; avoid hardcoded text.
+**Success Criteria:** Type annotations use Python 3.12+ syntax: list[T] instead of List[T], dict[K,V] instead of Dict[K,V], int | None instead of Optional[int], and str | int instead of Union[str, int]
+
+**Failure Criteria:** Code uses legacy typing imports like List, Dict, Optional, Union from the typing module when modern syntax is available
+
+---
+
+## 5. Python Backend Files Must Not Exceed 800 Lines
+
+**Objective:** Maintain code readability and modularity by keeping individual Python files under 800 lines, promoting proper code organization and separation of concerns
+
+**Success Criteria:** All Python files in the backend (api/) contain fewer than 800 lines of code
+
+**Failure Criteria:** Python files in the backend exceed 800 lines and should be split into multiple files
+
+---
+
+## 6. SQLAlchemy Sessions Must Use Context Managers
+
+**Objective:** Ensure proper database connection management and prevent resource leaks by requiring all SQLAlchemy sessions to be opened with context managers
+
+**Success Criteria:** All database operations use 'with Session(db.engine, expire_on_commit=False) as session:' pattern for session management
+
+**Failure Criteria:** Database sessions are created without context managers or sessions are not properly closed
+
+---
+
+## 7. Database Queries Must Include tenant_id Scoping
+
+**Objective:** Ensure data isolation and security in multi-tenant architecture by requiring all database queries to be scoped by tenant_id to prevent cross-tenant data access
+
+**Success Criteria:** All database queries that access tenant-scoped resources include WHERE clauses filtering by tenant_id
+
+**Failure Criteria:** Database queries access tenant-scoped tables without tenant_id filtering, creating potential data leakage
+
+---
+
+## 8. Python Tests Must Follow pytest AAA Pattern
+
+**Objective:** Maintain clear and maintainable test structure by requiring all pytest tests to follow the Arrange-Act-Assert pattern for better readability and understanding
+
+**Success Criteria:** Test functions are structured with three distinct sections: Arrange (setup), Act (execution), Assert (verification), with clear separation between phases
+
+**Failure Criteria:** Test functions mix setup, execution, and assertion logic without clear separation or organization
+
+---
+
+## 9. TypeScript Must Avoid any Type Annotations
+
+**Objective:** Maintain type safety in the frontend codebase by avoiding the any type, which bypasses TypeScript's type checking and can lead to runtime errors
+
+**Success Criteria:** TypeScript code uses specific types or unknown instead of any, with ts/no-explicit-any warnings addressed
+
+**Failure Criteria:** Code contains 'any' type annotations without justified exceptions or proper type definitions
+
+---
+
+## 10. TypeScript Must Use Type Definitions Instead of Interfaces
+
+**Objective:** Maintain consistency in type declarations across the codebase by preferring type definitions over interfaces, as enforced by ts/consistent-type-definitions rule
+
+**Success Criteria:** All TypeScript type declarations use 'type' keyword instead of 'interface' keyword, following the pattern: type MyType = { ... }
+
+**Failure Criteria:** Code uses 'interface' declarations instead of 'type' definitions
+
+---
+
+## 11. Frontend User-Facing Strings Must Use i18n Translations
+
+**Objective:** Enable internationalization and localization by requiring all user-facing text in the frontend to be retrieved from i18n translation files rather than hardcoded
+
+**Success Criteria:** All user-facing strings are defined in web/i18n/en-US/ translation files and accessed via useTranslation hook with proper namespace options, following dify-i18n/require-ns-option rule
+
+**Failure Criteria:** User-facing strings are hardcoded in component files instead of using translation keys
+
+---
+
+## 12. TypeScript Files Must Follow Strict TypeScript Configuration
+
+**Objective:** Ensure type safety and catch potential errors at compile time by enabling strict TypeScript compiler options
+
+**Success Criteria:** All TypeScript code compiles successfully with strict mode enabled in tsconfig.json, including strict type checking and consistent casing enforcement
+
+**Failure Criteria:** Code contains type errors or inconsistent casing that would fail strict TypeScript compilation
+
+---
+
+## 13. Backend Configuration Must Be Accessed via configs Module
+
+**Objective:** Centralize configuration management and prevent direct environment variable access by requiring all configuration to be retrieved through the configs module
+
+**Success Criteria:** Configuration values are accessed through configs.dify_config or related config modules, not via direct os.environ or os.getenv calls
+
+**Failure Criteria:** Code directly reads environment variables using os.environ or os.getenv instead of using the configs module
+
+---
+
+## 14. Python Backend Must Use Pydantic v2 for Data Validation
+
+**Objective:** Ensure consistent data validation and serialization using Pydantic v2 models with proper configuration for DTOs and request/response validation
+
+**Success Criteria:** All data transfer objects use Pydantic v2 BaseModel with ConfigDict(extra='forbid') by default, and use @field_validator/@model_validator for domain rules
+
+**Failure Criteria:** Data validation uses Pydantic v1 syntax, allows undeclared fields without explicit configuration, or uses custom validation logic instead of Pydantic validators
+
+---
+
+## 15. Backend Errors Must Use Domain-Specific Exceptions
+
+**Objective:** Provide clear error handling and appropriate HTTP responses by raising domain-specific exceptions from services and translating them to HTTP responses in controllers
+
+**Success Criteria:** Business logic raises exceptions from services/errors or core/errors modules, and controllers handle these exceptions to return appropriate HTTP responses
+
+**Failure Criteria:** Services return HTTP responses directly, or generic exceptions are raised without domain context
+
+---
+
+## 16. Python Code Must Use Snake Case for Variables and Functions
+
+**Objective:** Maintain consistent naming conventions across the Python codebase by using snake_case for variables and functions, PascalCase for classes, and UPPER_CASE for constants
+
+**Success Criteria:** All Python variables and functions use snake_case naming (e.g., user_name, get_user_data), classes use PascalCase (e.g., UserService), and constants use UPPER_CASE (e.g., MAX_RETRIES)
+
+**Failure Criteria:** Python code uses camelCase, PascalCase for variables/functions, or inconsistent naming patterns
+
+---
+
+## 17. Frontend ESLint Sonarjs Rules Must Be Followed
+
+**Objective:** Maintain code quality and prevent common bugs by adhering to SonarJS cognitive complexity and maintainability rules configured in the project
+
+**Success Criteria:** TypeScript code passes SonarJS linting rules including no-dead-store (error level), max-lines warnings (1000 line limit), and no-variable-usage-before-declaration (error level)
+
+**Failure Criteria:** Code violates SonarJS rules such as dead stores, files exceeding 1000 lines, or variables used before declaration
+
+---
+
+## 18. Backend Architecture Must Follow Import Layer Constraints
+
+**Objective:** Maintain clean architecture boundaries by enforcing layer separation through import-linter rules that prevent circular dependencies and upward imports
+
+**Success Criteria:** Code adheres to import-linter contracts defined in .importlinter, including workflow layer separation (graph_engine → graph → nodes → entities) and domain isolation rules
+
+**Failure Criteria:** Imports violate architectural layers by importing from higher layers or creating circular dependencies not explicitly allowed in .importlinter
+
+---
+
+## 19. Backend Storage Access Must Use Abstraction Layer
+
+**Objective:** Ensure consistent and secure storage operations by requiring all storage access to go through extensions.ext_storage.storage abstraction
+
+**Success Criteria:** All file storage operations use extensions.ext_storage.storage interface instead of direct filesystem or cloud storage APIs
+
+**Failure Criteria:** Code directly accesses filesystem, S3, Azure Blob, or other storage without using the storage abstraction layer
+
+---
+
+## 20. Backend HTTP Requests Must Use SSRF Proxy Helper
+
+**Objective:** Prevent Server-Side Request Forgery attacks by requiring all outbound HTTP requests to use the SSRF proxy helper for validation and protection
+
+**Success Criteria:** All outbound HTTP fetches use core.helper.ssrf_proxy instead of direct httpx, requests, or urllib calls
+
+**Failure Criteria:** Code makes outbound HTTP requests without using the SSRF proxy helper, potentially exposing internal resources
+
+---
+
+## 21. Python Code Must Not Override Dunder Methods Unnecessarily
+
+**Objective:** Prevent subtle bugs and maintain expected Python object behavior by avoiding unnecessary overrides of special methods like __init__, __iadd__, etc.
+
+**Success Criteria:** Special methods (__init__, __iadd__, __str__, __repr__) are only overridden when necessary with proper implementation of relevant special methods documented in coding_style.md
+
+**Failure Criteria:** Code overrides dunder methods without clear justification or without implementing complementary methods
+
+---
+
+## 22. Backend Must Avoid Security-Risky Functions
+
+**Objective:** Prevent remote code execution vulnerabilities by prohibiting the use of dangerous built-in functions that can execute arbitrary code
+
+**Success Criteria:** Python code does not use exec(), eval(), pickle, or marshal, as enforced by ruff security rules S102, S307, S301, S302
+
+**Failure Criteria:** Code uses exec(), eval(), pickle.loads(), or marshal.loads(), which can execute arbitrary code or unsafely deserialize untrusted data
+
+---
+
+## 23. Python Backend Must Use Deterministic Control Flow
+
+**Objective:** Optimize for observability and debugging by maintaining deterministic control flow with clear logging and actionable errors
+
+**Success Criteria:** Code avoids clever hacks, maintains readable control flow, includes tenant/app/workflow identifiers in log context, and logs retryable events at warning level and terminal failures at error level
+
+**Failure Criteria:** Code uses obfuscated logic, lacks proper logging context, or mixes logging levels inappropriately
+
+---
+
+## 24. Backend Async Tasks Must Be Idempotent
+
+**Objective:** Ensure reliability of background processing by requiring all async tasks to be idempotent and log relevant object identifiers for debugging
+
+**Success Criteria:** All background tasks in tasks/ are idempotent (can be safely retried), log the relevant object identifiers (tenant_id, app_id, etc.), and specify explicit queue selection
+
+**Failure Criteria:** Background tasks are not idempotent, lack proper logging identifiers, or don't specify queue configuration
+
+---
+
+## 25. Frontend Code Must Not Use console Statements
+
+**Objective:** Prevent debug code from reaching production by treating console statements as warnings that should be removed or replaced with proper logging
+
+**Success Criteria:** Production frontend code avoids console.log, console.warn, console.error statements as enforced by no-console warning rule
+
+**Failure Criteria:** Code contains console statements that should be removed or replaced with proper logging mechanisms
+
+---
diff --git a/api/core/workflow/nodes/node_factory.py b/api/core/workflow/nodes/node_factory.py
@@ -8,9 +8,15 @@
 from core.helper.code_executor.code_node_provider import CodeNodeProvider
 from core.workflow.enums import NodeType
 from core.workflow.graph import NodeFactory
+from core.workflow.graph_engine.error_handler import ErrorHandler
 from core.workflow.nodes.base.node import Node
 from core.workflow.nodes.code.code_node import CodeNode
 from core.workflow.nodes.code.limits import CodeNodeLimits
+from core.workflow.nodes.template_transform.template_renderer import (
+    CodeExecutorJinja2TemplateRenderer,
+    Jinja2TemplateRenderer,
+)
+from core.workflow.nodes.template_transform.template_transform_node import TemplateTransformNode
 from libs.typing import is_str, is_str_dict
 
 from .node_mapping import LATEST_VERSION, NODE_TYPE_CLASSES_MAPPING
@@ -37,6 +43,7 @@ def __init__(
         code_executor: type[CodeExecutor] | None = None,
         code_providers: Sequence[type[CodeNodeProvider]] | None = None,
         code_limits: CodeNodeLimits | None = None,
+        template_renderer: Jinja2TemplateRenderer | None = None,
     ) -> None:
         self.graph_init_params = graph_init_params
         self.graph_runtime_state = graph_runtime_state
@@ -54,6 +61,7 @@ def __init__(
             max_string_array_length=dify_config.CODE_MAX_STRING_ARRAY_LENGTH,
             max_object_array_length=dify_config.CODE_MAX_OBJECT_ARRAY_LENGTH,
         )
+        self._template_renderer = template_renderer or CodeExecutorJinja2TemplateRenderer()
 
     @override
     def create_node(self, node_config: dict[str, object]) -> Node:
@@ -107,6 +115,15 @@ def create_node(self, node_config: dict[str, object]) -> Node:
                 code_limits=self._code_limits,
             )
 
+        if node_type == NodeType.TEMPLATE_TRANSFORM:
+            return TemplateTransformNode(
+                id=node_id,
+                config=node_config,
+                graph_init_params=self.graph_init_params,
+                graph_runtime_state=self.graph_runtime_state,
+                template_renderer=self._template_renderer,
+            )
+
         return node_class(
             id=node_id,
             config=node_config,

diff --git a/api/core/workflow/nodes/template_transform/template_renderer.py b/api/core/workflow/nodes/template_transform/template_renderer.py
@@ -0,0 +1,40 @@
+from __future__ import annotations
+
+from collections.abc import Mapping
+from typing import Any, Protocol
+
+from core.helper.code_executor.code_executor import CodeExecutionError, CodeExecutor, CodeLanguage
+
+
+class TemplateRenderError(ValueError):
+    """Raised when rendering a Jinja2 template fails."""
+
+
+class Jinja2TemplateRenderer(Protocol):
+    """Render Jinja2 templates for template transform nodes."""
+
+    def render_template(self, template: str, variables: Mapping[str, Any]) -> str:
+        """Render a Jinja2 template with provided variables."""
+        raise NotImplementedError
+
+
+class CodeExecutorJinja2TemplateRenderer(Jinja2TemplateRenderer):
+    """Adapter that renders Jinja2 templates via CodeExecutor."""
+
+    _code_executor: type[CodeExecutor]
+
+    def __init__(self, code_executor: type[CodeExecutor] | None = None) -> None:
+        self._code_executor = code_executor or CodeExecutor
+
+    def render_template(self, template: str, variables: Mapping[str, Any]) -> str:
+        try:
+            result = self._code_executor.execute_workflow_code_template(
+                language=CodeLanguage.JINJA2, code=template, inputs=variables
+            )
+        except CodeExecutionError as exc:
+            raise TemplateRenderError(str(exc)) from exc
+
+        rendered = result.get("result")
+        if not isinstance(rendered, str):
+            raise TemplateRenderError("Template render result must be a string.")
+        return rendered
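
Reviewer note (illustrative, not part of the patch): because `NodeFactory` now accepts any object matching the `Jinja2TemplateRenderer` protocol, a test could presumably pass in a stub instead of going through `CodeExecutor`. The sketch below is standalone and mirrors the protocol rather than importing it; `StaticTemplateRenderer` and `render_greeting` are hypothetical names that do not exist in the repository.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any, Protocol


class Jinja2TemplateRenderer(Protocol):
    """Local mirror of the protocol added in template_renderer.py."""

    def render_template(self, template: str, variables: Mapping[str, Any]) -> str: ...


class StaticTemplateRenderer:
    """Hypothetical test double: returns a canned string and records each call."""

    def __init__(self, rendered: str) -> None:
        self._rendered = rendered
        self.calls: list[tuple[str, dict[str, Any]]] = []

    def render_template(self, template: str, variables: Mapping[str, Any]) -> str:
        # No Jinja2 evaluation happens here; the stub only records its inputs.
        self.calls.append((template, dict(variables)))
        return self._rendered


def render_greeting(renderer: Jinja2TemplateRenderer, name: str) -> str:
    """Hypothetical caller standing in for a node that depends on the protocol."""
    return renderer.render_template("Hello {{ name }}", {"name": name})


stub = StaticTemplateRenderer("Hello Ada")
output = render_greeting(stub, "Ada")
```

The stub satisfies the protocol structurally, so it does not need to subclass `Jinja2TemplateRenderer`; the same would apply to a renderer handed to `NodeFactory(template_renderer=...)`.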