TiSpark 1.0 GA
TiSpark provides distributed computing of TiDB data using Apache Spark.
- Provide a gRPC communication framework to read data from TiKV
- Provide encoding and decoding of TiKV component data and communication protocol
- Provide calculation pushdown, which includes:
    - Aggregate pushdown
    - Predicate pushdown
    - TopN pushdown
    - Limit pushdown
- Provide index-related support
    - Transform predicates into Region key ranges or secondary index lookups
    - Optimize Index Only queries
    - Adaptively downgrade index scan to table scan per Region
- Provide cost-based optimization
    - Support statistics
    - Select index
    - Estimate broadcast table cost
- Provide support for multiple Spark interfaces
    - Support Spark Shell
    - Support ThriftServer/JDBC
    - Support Spark-SQL interaction
    - Support PySpark Shell
    - Support SparkR
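
To make the pushdown items in the list above more concrete, the following is a minimal toy model in plain Python. It is not TiSpark code and does not use any TiSpark or TiKV API. The `REGIONS` data, the function names, and the `price_over_6` predicate are all hypothetical, chosen only to illustrate the general idea: filtering, TopN, and partial aggregation run next to the data in each Region, so only small intermediate results reach the compute layer.

```python
import heapq

# Hypothetical toy data: each inner list stands in for the rows of one Region.
REGIONS = [
    [{"id": 1, "price": 30}, {"id": 2, "price": 5}],
    [{"id": 3, "price": 12}, {"id": 4, "price": 48}],
    [{"id": 5, "price": 7}, {"id": 6, "price": 21}],
]


def region_topn(rows, predicate, key, n):
    """Storage-side step: apply the predicate, then keep the n smallest rows by key."""
    return heapq.nsmallest(n, (r for r in rows if predicate(r)), key=key)


def region_partial_agg(rows, predicate):
    """Storage-side step: return a partial (count, sum) instead of raw rows."""
    matched = [r["price"] for r in rows if predicate(r)]
    return len(matched), sum(matched)


def topn_with_pushdown(regions, predicate, key, n):
    """Compute-side step: merge per-Region candidates and apply TopN once more."""
    candidates = [
        row for rows in regions for row in region_topn(rows, predicate, key, n)
    ]
    return heapq.nsmallest(n, candidates, key=key)


def avg_with_pushdown(regions, predicate):
    """Compute-side step: combine partial (count, sum) pairs into an average."""
    partials = [region_partial_agg(rows, predicate) for rows in regions]
    count = sum(c for c, _ in partials)
    total = sum(s for _, s in partials)
    return total / count if count else None


def price_over_6(row):
    # Hypothetical predicate, comparable to WHERE price > 6.
    return row["price"] > 6


top2 = topn_with_pushdown(REGIONS, price_over_6, key=lambda r: r["price"], n=2)
avg = avg_with_pushdown(REGIONS, price_over_6)
```

In this sketch each Region returns at most `n` rows for the TopN query and a single `(count, sum)` pair for the aggregate, regardless of how many rows it holds. That reduction in transferred data is the reason for pushing these operations down.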