Ginkgo is an in-memory distributed data management and processing system for big data applications. It runs on clusters of commodity servers and aims to provide real-time data analytics on relational datasets.
Ginkgo relies on highly parallel query processing to accelerate data analysis. Query evaluation is not only distributed across the cluster to use its aggregate computing power, but is also executed in a multi-threaded fashion to exploit modern many-core hardware. Because data distributions are unpredictable, a static scheduling policy can leave computing resources on the cluster idle. To maximize resource utilization, Ginkgo employs elastic execution on the pipelines of the query DAG, which fills resource bubbles by resizing the pipeline width on the fly.
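The idea of resizing a pipeline's width while it runs can be illustrated with a small sketch. This is not Ginkgo's scheduler: the class `ElasticPipeline`, its `resize` and `join` methods, and the slot-based retirement rule are all hypothetical names and choices made for illustration. It assumes one pipeline stage that applies a function to independent data blocks, with workers that pick up a new width between blocks.

```python
import queue
import threading


class ElasticPipeline:
    """Applies `fn` to data blocks using a worker pool whose width can change mid-run.

    Hypothetical illustration only; not Ginkgo's actual implementation.
    """

    def __init__(self, fn, blocks, width=1):
        self._fn = fn
        self._todo = queue.Queue()
        for block in blocks:
            self._todo.put(block)
        self._lock = threading.Lock()
        self._width = 0      # target number of worker slots
        self._alive = {}     # slot index -> running worker thread
        self.results = []    # per-block outputs, in completion order
        self.resize(width)

    def resize(self, width):
        """Expand by starting workers for empty slots; shrink by lowering the target."""
        with self._lock:
            self._width = width
            for slot in range(width):
                if slot not in self._alive:
                    worker = threading.Thread(target=self._run, args=(slot,))
                    self._alive[slot] = worker
                    worker.start()

    def _run(self, slot):
        while True:
            with self._lock:
                # A worker whose slot is beyond the target retires between blocks.
                if slot >= self._width:
                    del self._alive[slot]
                    return
            try:
                block = self._todo.get_nowait()
            except queue.Empty:
                # No work left: release the slot and exit.
                with self._lock:
                    del self._alive[slot]
                return
            out = self._fn(block)
            with self._lock:
                self.results.append(out)

    def join(self):
        """Blocks until every worker has exited."""
        while True:
            with self._lock:
                workers = list(self._alive.values())
            if not workers:
                return
            for worker in workers:
                worker.join()
```

A caller might start with `ElasticPipeline(sum, blocks, width=1)`, call `resize(4)` when spare cores appear, and later `resize(2)` to give cores back. Shrinking only takes effect at block boundaries, so no block is abandoned halfway. Results arrive in completion order, not input order.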
Many analytical systems perform bulk loading with a long delay, so queries end up scanning stale data. Ginkgo employs a real-time data ingestion module, which continuously ingests fresh external data into the partitions of the in-memory cluster and then asynchronously flushes it to HDFS for persistence. To resolve the resulting read/write conflicts on the cluster, Ginkgo introduces a metadata-based protocol, which converts each distributed transaction into multiple single-site transactions on raw data and metadata respectively. As a result, Ginkgo can produce a lightweight snapshot for query execution.
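One way to picture how separating raw data from metadata yields cheap snapshots is the sketch below. It is an interpretation under stated assumptions, not Ginkgo's actual protocol: the names `Partition`, `Metadata`, `ingest`, and `snapshot_scan` are hypothetical, partitions are assumed to be append-only with a single writer each, and the metadata is assumed to be nothing more than a visible row count per partition.

```python
import threading


class Partition:
    """Append-only raw data held at one site (hypothetical)."""

    def __init__(self):
        self.rows = []

    def append(self, batch):
        # Single-site write: rows land here but are not yet visible to queries.
        self.rows.extend(batch)
        return len(self.rows)


class Metadata:
    """Holds the number of visible rows per partition (hypothetical)."""

    def __init__(self, num_partitions):
        self._lock = threading.Lock()
        self._visible = [0] * num_partitions

    def commit(self, new_lengths):
        # Single-site metadata transaction: publish all new lengths atomically.
        with self._lock:
            for pid, length in new_lengths.items():
                self._visible[pid] = max(self._visible[pid], length)

    def snapshot(self):
        # A snapshot is just a copy of the visible lengths, so it is cheap to take.
        with self._lock:
            return tuple(self._visible)


def ingest(partitions, metadata, batches):
    """Writes raw data to each partition, then makes it visible with one commit."""
    lengths = {pid: partitions[pid].append(rows) for pid, rows in batches.items()}
    metadata.commit(lengths)


def snapshot_scan(partitions, snapshot):
    """Reads only the rows that were visible when the snapshot was taken."""
    return [row for part, n in zip(partitions, snapshot) for row in part.rows[:n]]
```

In this sketch a query that holds a snapshot keeps seeing the same rows even while ingestion appends more, because appended rows stay invisible until a later metadata commit.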
Ginkgo employs a large set of optimization techniques to achieve efficient in-memory data processing, including batch-at-a-time processing, cache-sensitive operators, column pruning, data compression, SIMD-based optimization, code generation, and lock-free, concurrent processing structures. These optimizations work together and enable Ginkgo to process up to gigabytes of data per second per thread.
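Two of the techniques listed above, batch-at-a-time processing and column pruning, can be sketched in a few lines. The example assumes a column-oriented table stored as a dict of equal-length lists; the function names `scan_batches`, `filter_batches`, and `sum_column` are hypothetical and are not taken from Ginkgo's code base.

```python
def scan_batches(table, columns, batch_size):
    """Yields batches that contain only the requested columns (column pruning)."""
    num_rows = len(next(iter(table.values())))
    for start in range(0, num_rows, batch_size):
        yield {c: table[c][start:start + batch_size] for c in columns}


def filter_batches(batches, column, predicate):
    """Keeps the rows of each batch whose value in `column` satisfies `predicate`."""
    for batch in batches:
        keep = [i for i, value in enumerate(batch[column]) if predicate(value)]
        yield {c: [values[i] for i in keep] for c, values in batch.items()}


def sum_column(batches, column):
    """Aggregates one column, consuming one batch at a time."""
    return sum(sum(batch[column]) for batch in batches)
```

A query such as "sum of `price` where `id` is even" would chain the three operators: `sum_column(filter_batches(scan_batches(table, ["id", "price"], 4), "id", lambda v: v % 2 == 0), "price")`. Columns the query never touches are not copied into any batch, and each operator handles a whole batch per call, which amortizes per-call overhead over many rows.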
We are currently developing Ginkgo at East China Normal University. If you have any questions about this project, please contact us.
Email: ginkgo.bigdata@gmail.com
To try Ginkgo, please follow the Quick Start. For more information, please see the Wiki.
Chuliang Weng, Professor.
Zhifang Li, Ph.D. Student.
Shangwei Wu, Ph.D. Student.
Xiaoshuang Peng, Ph.D. Student.
Xiaopeng Fan, Ph.D. Student.
Zewen Sun, Ph.D. Student.
Yingtong Xiong, Postgraduate Student.
Zeyu He, Postgraduate Student.
Beicheng Peng, Postgraduate Student.
Qiuli Huang, Zhuhe Fang, Zhenhui Zhao, Tingting Sun, Minqi Zhou, Li Wang, Lei Zhang, Shaochan Dong, Xinzhou Zhang, Yu Kai, Yongfeng Li, Lin Gu