-
|
Thank you for reaching out, your motivating words, and your ideas! To be honest, I don't have any detailed plans for how we're going to do persistent storage, so your idea definitely comes at the right time :). Parquet-based storage, mentioned in #4, was just the one that I've encountered often and for which DataFusion already has great support. But there is also a Vortex DataFusion integration that we could use. From my current, naïve standpoint, I think RDF Fusion's storage format should be
I have not researched enough how well Parquet and Vortex cater to these needs, but I think that 1. and 2. are fulfilled by both of them. I am not that sure about 3. yet; hopefully I can find the time to research that. If we can integrate new file formats with relatively little effort (re-using existing implementations), I think these two options are not mutually exclusive. RDF Fusion also aims to be a platform that allows for prototyping new ideas and doing further research. Basically, we hope to enable this by making DataFusion's extensibility more accessible to the SPARQL / Semantic Web community. I think evaluating the suitability of open file formats for SPARQL processing would be a cool thing to do (maybe there is already a comparison out there). :) I'll link this discussion in the relevant ticket and copy some of my thoughts in there. Feel free to add your thoughts there! I've also added the paper to my reading list. I am a big fan of Wasm, so that alone makes it interesting for me.
Have you by chance read the documentation of the rdf-fusion crate? If you have any further input on how I can make it more accessible, please don't hesitate to open an issue! My goal was that people with basic knowledge of relational databases get enough information on SPARQL / DataFusion to get an idea of what this project is. |
-
|
I had to edit the title - I mixed up Vortex and Velox - too many similar names out there ;-) I think this is a great idea and a step in the right direction. I read and was deeply influenced by The Composable Data Management System Manifesto last year. A more accessible version would be The Composable Codex. I think that the SPARQL community and the documentation might benefit from some highlighted sections from this, as more of an explainer for the Rust/Arrow/DataFusion design choice. Regarding SPARQL-side documentation, I still don't know what I don't know and am still very ignorant. What you had written was enlightening! I am looking forward to reading your paper when it is published - "RDF Fusion: An Extensible SPARQL Engine for Hybrid Data Models" - I found that reference online but am not sure it is correct. I have a geospatial background - hence the interest in high-performance engines. Will definitely be following the project. |
-
This is more to say 'hello' - I am sure that you already have some great ideas for storage formats.
First of all, I want to say that this is such an intriguing project/vision - I think it has serious potential.
I came across this project last week when I was exploring SPARQL (ignorance) and looking around for a Rust/Arrow/DF implementation (beginner).
I have been aware of Vortex as an alternative to Parquet for a little while now and, having just finished reading F3: The Open-Source Data File Format for the Future, I realise that Vortex might be the first to implement F3 capabilities and might be worth exploring.
Regards,
Jonathan