Installing Texera on a Kubernetes Cluster
This document describes the five main parts that users are expected to configure when deploying Texera with this Helm chart. All other values should generally be left at their defaults unless you need a specific customization.
Before configuring and deploying the Texera platform, ensure the following prerequisites are met:
- Kubernetes Cluster: A working Kubernetes cluster (e.g., local minikube or a cloud-based cluster) with at least 16 free CPU cores and 8 GB of RAM available.
- Helm Installed: Helm v3 or later must be installed on your system to deploy the chart.
- Custom Hostnames:
  - In a production environment with HTTPS support, two valid hostnames must be available: one for the Texera services and another for MinIO access. For example, `texera.my.org` for Texera services and `minio.my.org` for MinIO.
  - In a testing environment (e.g., localhost, or exposing services via HTTP), one valid hostname (i.e., the server's IP address or `localhost`) is enough. Ports `30080` and `31000` will be occupied by default in this setting. To change these ports, see the instructions below.
- TLS Configuration: You should either:
  - have a pre-created TLS secret (see the example after this list), or
  - use cert-manager with a valid Issuer.
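If you go the pre-created route, a minimal sketch of creating such a secret with standard kubectl. The secret name `texera-tls`, the file paths, and the namespace are illustrative placeholders, not values mandated by the chart:

```sh
# Create a TLS secret from an existing certificate/key pair.
# Names and paths below are placeholders.
kubectl create secret tls texera-tls \
  --cert=path/to/tls.crt \
  --key=path/to/tls.key \
  --namespace texera-dev
```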
All configuration options mentioned in this guide are defined in the `values.yaml` file located under the `deployment/k8s/texera-helmchart` directory.
Credentials are used across different components such as PostgreSQL, MinIO, and LakeFS.
```yaml
postgresql:
  auth:
    postgresPassword: root_password
```
- `postgresPassword`: The superuser password used during database initialization. Required by LakeFS and the Texera backend services.
```yaml
minio:
  auth:
    rootUser: texera_minio
    rootPassword: password
```
- `rootUser` and `rootPassword`: Credentials used to access MinIO. These must match the S3 credentials provided to LakeFS.
```yaml
lakefs:
  secrets:
    authEncryptSecretKey: random_string_for_lakefs
    databaseConnectionString: postgres://postgres:root_password@texera-postgresql:5432/texera_lakefs?sslmode=disable
  auth:
    username: texera-admin
    accessKey: AKIAIOSFOLKFSSAMPLES
    secretKey: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```
- Ensure `databaseConnectionString` includes the correct PostgreSQL credentials.
- `auth` determines the credentials used to initialize LakeFS's admin user and to access LakeFS via API calls.
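Rather than editing the chart's `values.yaml` directly, you can keep your overrides in a separate file and pass it to Helm. A minimal sketch, assuming a hypothetical file named `my-values.yaml` that overrides only the credentials:

```yaml
# my-values.yaml -- hypothetical override file; keys mirror the chart's values.yaml
postgresql:
  auth:
    postgresPassword: a-strong-password            # choose your own value
minio:
  auth:
    rootUser: texera_minio
    rootPassword: a-strong-minio-password
lakefs:
  secrets:
    # Must embed the same PostgreSQL password chosen above.
    databaseConnectionString: postgres://postgres:a-strong-password@texera-postgresql:5432/texera_lakefs?sslmode=disable
```

Pass it at install time with `helm install texera texera-helmchart -f my-values.yaml ...`.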
```yaml
texeraEnvVars:
  - name: STORAGE_JDBC_USERNAME
    value: postgres
  - name: USER_SYS_ENABLED
    value: "true"
  - name: MAX_NUM_OF_RUNNING_COMPUTING_UNITS_PER_USER
    value: "10"
```
- `STORAGE_JDBC_USERNAME` must match the PostgreSQL user you configured above.
- The other variables control the behavior of the Texera system.
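If you change one of these values in an override file, note that Helm replaces lists wholesale rather than merging them element by element, so restate the entire list. A sketch that lowers the per-user limit:

```yaml
# In your override file: the whole list must be restated.
texeraEnvVars:
  - name: STORAGE_JDBC_USERNAME
    value: postgres
  - name: USER_SYS_ENABLED
    value: "true"
  - name: MAX_NUM_OF_RUNNING_COMPUTING_UNITS_PER_USER
    value: "5"    # lowered from the default of 10
```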
Customize the domain names used for accessing Texera services via Ingress. TLS is optional but recommended for production.
```yaml
ingressPaths:
  enabled: true
  hostname: "localhost"
  tlsSecretName: ""   # Optional TLS secret
  issuer: ""          # Optional cert-manager issuer
```
- `hostname`: Set this to the custom domain for your Texera deployment (e.g., `texera.example.com`).
- `tlsSecretName`: Optional. Set to the name of a Kubernetes TLS secret if using HTTPS.
- `issuer`: Optional. Set to a cert-manager issuer if certificates should be managed automatically.
```yaml
minio:
  ingress:
    hostname: "localhost"
    tlsSecretName: ""
    issuer: ""
```
- These settings follow the same rules as Texera's ingress. Configure `hostname`, `tlsSecretName`, and `issuer` as needed; a combined production-style example is shown below.
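For illustration only, a production-style override using the example hostnames from the prerequisites. The secret names are placeholders you would create yourself:

```yaml
ingressPaths:
  enabled: true
  hostname: "texera.my.org"        # your Texera domain
  tlsSecretName: "texera-tls"      # pre-created TLS secret; or leave "" and set issuer
minio:
  ingress:
    hostname: "minio.my.org"       # your MinIO domain
    tlsSecretName: "minio-tls"
```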
Adjust resource requests to fit your cluster's capacity.
```yaml
postgresql:
  primary:
    resources:
      requests:
        cpu: "4"
        memory: "4Gi"
```
- Tune based on the expected database workload.
Other components (e.g., webserver, file service, envoy, and language servers) use default resource settings and are not expected to be changed.
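To check whether the requests are sized sensibly once the system is running, you can inspect actual consumption (this assumes the metrics-server addon is installed in your cluster):

```sh
# Show current CPU/memory usage per pod in the Texera namespace.
kubectl top pods --namespace texera-dev
```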
The chart defaults to the `local-path` StorageClass, which may not be suitable for all clusters.
```yaml
postgresql:
  primary:
    persistence:
      enabled: true
      size: 10Gi
      storageClass: local-path

minio:
  persistence:
    enabled: true
    size: 20Gi
    storageClass: local-path
```
- Replace `local-path` with your cluster's preferred StorageClass (e.g., `gp2`, `standard`, etc.).
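To list the StorageClasses available in your cluster:

```sh
# Shows each StorageClass, its provisioner, and which one is the default.
kubectl get storageclass
```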
You can scale some components by changing their replica count using the parameters below:
```yaml
webserver:
  numOfPods: 1
yWebsocketServer:
  replicaCount: 1
pythonLanguageServer:
  replicaCount: 8
envoy:
  replicas: 1
workflowComputingUnitManager:
  numOfPods: 1
workflowCompilingService:
  numOfPods: 1
fileService:
  numOfPods: 1
```
- Increase these values to scale each component horizontally based on workload needs.
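As a sketch, you can also scale an already-deployed release from the command line without editing any file (the keys are the ones shown above; the numbers are arbitrary examples):

```sh
helm upgrade texera texera-helmchart --namespace texera-dev \
  --set webserver.numOfPods=2 \
  --set pythonLanguageServer.replicaCount=16
```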
By default, Texera services will occupy port 30080 and MinIO will occupy port 31000. To change them, edit the corresponding sections in `values.yaml`:
```yaml
minio:
  service:
    type: NodePort
    nodePorts:
      api: 31000   # change here

ingress-nginx:
  controller:
    replicaCount: 1
    service:
      type: NodePort
      nodePorts:
        http: 30080   # change here
```
A command-line alternative is sketched after the summary below.

By configuring these five areas (credentials, hostnames/TLS, resources, storage classes, and replica counts), you can tailor the deployment to suit your environment while relying on sane defaults for all other settings.
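For the port change just described, a sketch of overriding both NodePorts at install time instead of editing the file (NodePort values must fall within the cluster's NodePort range, 30000-32767 by default):

```sh
helm install texera texera-helmchart --namespace texera-dev --create-namespace \
  --set minio.service.nodePorts.api=32000 \
  --set ingress-nginx.controller.service.nodePorts.http=32080
```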
If you are installing Texera in a local Kubernetes environment, the default configuration above usually works without changes.
Run the following command from the root directory of the repository:
```sh
helm install texera texera-helmchart --namespace texera-dev --create-namespace
```
This will:
- Create a Helm release named `texera`
- Create a namespace named `texera-dev`
- Deploy all Texera components under that namespace
Please wait about 1-3 minutes for all pods to be ready. Once the deployment is complete, Texera should be accessible at:
http://<your-hostname-for-texera>
Note: If you're using a non-default kubeconfig file, append `--kubeconfig /path/to/your/kubeconfig` to the Helm command.
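To monitor readiness while you wait, a standard way is to watch the pods in the namespace:

```sh
# Lists all pods in the namespace and streams status updates.
kubectl get pods --namespace texera-dev --watch
```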
To uninstall Texera and clean up all related resources:
```sh
helm uninstall texera --namespace texera-dev
```
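Note that PersistentVolumeClaims created through StatefulSet volume claim templates (for example PostgreSQL's) typically survive `helm uninstall`. If you also want to delete the stored data, remove any leftover claims explicitly:

```sh
# Irreversibly deletes all persistent data left in the namespace.
kubectl delete pvc --all --namespace texera-dev
```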