Any Tutorial / Example for writing a Presto-ElasticSearch connector ? #3057
Comments
Elasticsearch has several Java clients available; use the scroll method for large result sets. Crate.io has an SQL adapter on top of the Elasticsearch code base, which could be of some help. |
Crate does not support JOINs.
|
I would start by forking the https://github.com/facebook/presto/tree/master/presto-example-http plugin and adapting it to read from Elasticsearch. Last time I used Elasticsearch, the APIs were all REST based, so the presto-example-http plugin should be pretty close to what you need. Once you get that working, you'll want to work on predicate pushdown, but I'd start by just getting it to read at all. -dain
|
In the legacy SPI that the example connector implements, a table is logically divided into partitions, and partitions are divided into splits. A partition can provide a Presto will enumerate and filter the partitions and then enumerate the splits for those partitions. Presto then reads data from the splits in parallel. If your system does not support parallel reading, simply return a single Partition and a single Split. If your system has a more sophisticated physical layout, you will want to use the new TableLayouts SPI so that Presto can take advantage of the data organization. |
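The single-partition, single-split fallback described above can be sketched roughly as follows. This is only an illustration of the idea, not the Presto SPI: DemoPartition, DemoSplit, listPartitions, and listSplits are hypothetical names made up for this example, and the endpoint URL is a placeholder.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SingleSplitSketch {
    // Hypothetical stand-in for a partition: here it simply covers the whole table.
    static final class DemoPartition {
        final String tableName;

        DemoPartition(String tableName) {
            this.tableName = tableName;
        }
    }

    // Hypothetical stand-in for a split: one unit of work that a single reader consumes.
    static final class DemoSplit {
        final String tableName;
        final String endpoint;

        DemoSplit(String tableName, String endpoint) {
            this.tableName = tableName;
            this.endpoint = endpoint;
        }
    }

    // No physical layout is exposed, so the whole table is one partition.
    static List<DemoPartition> listPartitions(String tableName) {
        return Collections.singletonList(new DemoPartition(tableName));
    }

    // One split per partition; with a single partition this means no parallel reads.
    static List<DemoSplit> listSplits(List<DemoPartition> partitions, String endpoint) {
        List<DemoSplit> splits = new ArrayList<>();
        for (DemoPartition partition : partitions) {
            splits.add(new DemoSplit(partition.tableName, endpoint));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<DemoPartition> partitions = listPartitions("logs");
        List<DemoSplit> splits = listSplits(partitions, "http://localhost:9200");
        System.out.println(partitions.size() + " partition(s), " + splits.size() + " split(s)");
    }
}
```

Starting this way keeps the connector simple while it is only expected to read data at all; finer-grained splits can be introduced later if the backing store can serve independent slices.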
I wrote a basic connector with the necessary classes implemented. I also added a .properties file in 'etc/catalog' and
but I get this error :
The problem is here :
The size of plugins when loading this new plugin is 0, whereas for the other existing plugins it is 1.
Can you please help me understand why the first 2 lines of this code are not working as expected? Can you also elaborate on this part of the Developer Docs, which I could not understand properly?
How should I provide the class name of my new plugin to Presto? |
The above problem was solved after I added a file named 'com.facebook.presto.spi.Plugin' in the 'META-INF/services' directory :
But I observed that for the other connectors (except tpch), the same file is present in a different directory :
Then how is the ServiceLoader loading the kafka and raptor connectors? |
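For readers hitting the same "0 plugins" symptom, here is a small self-contained sketch of how java.util.ServiceLoader discovery depends on the provider-configuration file under META-INF/services. It does not use Presto at all: DemoPlugin stands in for com.facebook.presto.spi.Plugin, DemoElasticsearchPlugin is a made-up provider, and the temporary directory plays the role of a plugin's class path root. All of these names are assumptions for illustration only.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class ServiceLoaderSketch {
    // Hypothetical stand-in for the plugin service interface.
    public interface DemoPlugin {
        String name();
    }

    // Hypothetical provider; it is public and has a public no-argument constructor.
    public static class DemoElasticsearchPlugin implements DemoPlugin {
        public DemoElasticsearchPlugin() {
        }

        @Override
        public String name() {
            return "demo-elasticsearch";
        }
    }

    // Returns the names of all providers visible through a loader rooted at pluginDir.
    static List<String> discover(Path pluginDir) throws IOException {
        URL[] urls = {pluginDir.toUri().toURL()};
        try (URLClassLoader loader =
                new URLClassLoader(urls, ServiceLoaderSketch.class.getClassLoader())) {
            List<String> names = new ArrayList<>();
            for (DemoPlugin plugin : ServiceLoader.load(DemoPlugin.class, loader)) {
                names.add(plugin.name());
            }
            return names;
        }
    }

    // Writes META-INF/services/<interface name> containing the provider class name.
    static void register(Path pluginDir) throws IOException {
        Path services = pluginDir.resolve("META-INF").resolve("services");
        Files.createDirectories(services);
        Files.write(
                services.resolve(DemoPlugin.class.getName()),
                DemoElasticsearchPlugin.class.getName().getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("plugin-sketch");
        System.out.println("before registration: " + discover(dir).size());
        register(dir);
        System.out.println("after registration: " + discover(dir).size());
    }
}
```

The file name is the fully qualified name of the service interface and its content is the provider class name, one per line. ServiceLoader only needs that file to be reachable as the resource META-INF/services/<interface name> through the class loader in use, so where the file sits in a source tree matters less than where it ends up on the class path or inside the packaged jar.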
Please tell me how to correctly implement these 3 functions in the 'RecordCursor' interface when writing a connector
Please help. |
Those functions are only for stats. If they don't mean anything for your connector or that info is not available just return 0. |
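The thread does not show which three methods were being asked about. Assuming they are the progress counters (named here getTotalBytes, getCompletedBytes, and getReadTimeNanos, which is an assumption rather than something quoted above), the "just return 0" advice might look like the sketch below. StatsMethods is a hypothetical local interface used so the example compiles on its own; it is not the real RecordCursor.

```java
class ZeroStatsCursorSketch {
    // Hypothetical subset of a cursor interface; the method names are assumed.
    interface StatsMethods {
        long getTotalBytes();

        long getCompletedBytes();

        long getReadTimeNanos();
    }

    // A cursor whose backing store exposes no size or timing information.
    static final class DemoCursor implements StatsMethods {
        @Override
        public long getTotalBytes() {
            return 0; // total size unknown
        }

        @Override
        public long getCompletedBytes() {
            return 0; // progress not tracked
        }

        @Override
        public long getReadTimeNanos() {
            return 0; // read time not measured
        }
    }

    public static void main(String[] args) {
        StatsMethods cursor = new DemoCursor();
        System.out.println("total bytes reported: " + cursor.getTotalBytes());
    }
}
```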
@electrum, @dain: Does Presto support dynamic columns in tables (for example, data stores that contain JSON documents, where new properties can be added to a JSON doc residing in an index/type)? |
What's the status on this one? Any progress made? Thanks! |
Slightly off topic but still relevant for people looking into this topic: we have released a first version of a JDBC driver for Elasticsearch called sql4es. It supports most common SQL statements and can be used from any system supporting the JDBC interface. |
Wouldn't it be better to have a connector to Apache Lucene instead of the Elasticsearch HTTP API? BTW, you could use a Hive external table backed by Elasticsearch (via the official elasticsearch-hadoop Hive connector) and query it from PrestoDB. |
Well, first of all, the driver uses the transport API and not the HTTP one. I think you actually do want to use Elasticsearch because it provides distributed query execution and high availability. The Hive connection you mention should work, I think, although I must admit I have never used it. |
@sumanth232 Are you still working on this? Any progress? |
I found this other fork today : https://github.com/ebyhr/presto, with an |
This issue is obsolete now, closing. As for the Elasticsearch connector, if the above-mentioned implementations are applicable to a general audience, it would be valuable to have one in the Presto codebase. |
I want to write an Elasticsearch connector to perform JOINs in Elasticsearch using Presto.
Can anyone please suggest how to start? Any guidance would be a lot of help.