Dynamic reads #132
Looking good overall, thanks @robinbernon! Would you be able to fix a couple of other minor issues so that the linting step in the tests passes? e.g. https://dev.azure.com/alecmocatta/amadeus/_build/results?buildId=1591&view=logs&jobId=4420cb5a-3e60-5d7c-f139-f152148f0805&j=6471ded8-2f96-5fe2-f49c-aa706746e11a&t=0c335986-3956-5fa5-937e-2ddc0500a0f1
@@ -1,4 +1,4 @@
use hashlink::LinkedHashMap;
use hashlink::linked_hash_map::LinkedHashMap;
What's the reason for this?
Was getting some strange issues due to double struct use in hashlink lib.rs. Found this fix was necessary to stop the given issue when compiling.
/// Predicate for [`Group`]s
pub struct GroupPredicate(
/// Map of field names to predicates for the fields in the group
pub(super) LinkedHashMap<String, Option<<Value as ParquetData>::Predicate>, FxBuildHasher>,
Sorry, I realise I overlooked your message about this. This would cause a perf regression. It looks like it might be a simple fix to hashlink; would you be able to make that PR? We can patch until your PR is merged and a new release is published:
[patch.crates-io]
hashlink = { version = "0.6", git = "...", branch = "..." }
Have made the fix for this and submitted a PR: kyren/hashlink#7
Would you like to evaluate ritelinked? This is a version derived from hashlink. We use griddle to reduce the tail delay and have solved the serialization problem.
Yeh sure, happy to have a look :) Do you have any example code available for a common use case you might encounter that wouldn't work directly with LinkedHashMap or BTreeMap?
@robinbernon You can directly replace hashlink with ritelinked without any additional changes; the APIs of the two are compatible. In fact, ritelinked is a hashlink that only provides LinkedHashMap and LinkedHashSet; we just did some work to make it more usable.
@@ -1,11 +1,6 @@
use amadeus::prelude::*;
use serde::{Deserialize, Serialize};

#[derive(Data, Clone, PartialEq, Debug)]
How come this is removed?
Sorry have put it back now, currently experiencing an issue with the predicate derives - Have got the specific issue highlighted in the following branch: https://github.com/robinbernon/amadeus/blob/temp2/amadeus-derive/src/lib.rs#L279-L290.
Pull request has been modified.
2c25e93 to 1972489
The helpers here unfortunately break amadeus on stable. Ideally they impl amadeus/amadeus-serde/src/json.rs Lines 45 to 51 in 7f071aa
serde_closure to make this easier but haven't had a chance yet.
The other alternative is to make
a19053b to b312e54
b312e54 to bc57c1e
The issue with impl serde closure instead of a normal closure is that all adapter operations for parallel streams would have to change to using serde closures to support it. There is currently a noticeable lack of intellisense when working with macros, even in the best available IDEs. Because of this, I'd personally prefer to keep the standard closures in parallel streams, as I like having intellisense when working on the various adapter operations etc. before moving them into the required serde macros for distributed processing. Any chance I can include the helpers here as a nightly feature then?
Oh right, I see the issue. Yes agreed it's strongly desirable to keep
Adding ability to read a subset of parquet data dynamically.
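As a rough illustration of the idea discussed in this PR (an ordered map of field names to optional predicates that selects a subset of fields), here is a minimal std-only sketch. Everything in it is hypothetical: `Value`, `FieldPredicate`, `GroupSelection`, and `project` are stand-ins invented for this example and are not amadeus types or APIs. A `Vec` of pairs stands in for `LinkedHashMap` to keep insertion order without extra crates, and treating a `None` predicate as "read this field unconditionally" is an assumption, not something the PR confirms.

```rust
// Hypothetical stand-in for a dynamically typed value; not amadeus's `Value`.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
}

// Hypothetical per-field predicate.
enum FieldPredicate {
    IntAtLeast(i64),
}

impl FieldPredicate {
    // True if the value is an integer that is at least the given minimum.
    fn matches(&self, value: &Value) -> bool {
        match (self, value) {
            (FieldPredicate::IntAtLeast(min), Value::Int(v)) => v >= min,
            _ => false,
        }
    }
}

// Ordered (field name, optional predicate) pairs.
// Assumption: `None` means "read the field with no filtering".
struct GroupSelection(Vec<(String, Option<FieldPredicate>)>);

impl GroupSelection {
    // Returns the selected fields in selection order, or `None` if a
    // selected field is missing from the row or fails its predicate.
    fn project(&self, row: &[(String, Value)]) -> Option<Vec<(String, Value)>> {
        let mut out = Vec::with_capacity(self.0.len());
        for (name, predicate) in &self.0 {
            let (_, value) = row.iter().find(|(field, _)| field == name)?;
            if let Some(p) = predicate {
                if !p.matches(value) {
                    return None;
                }
            }
            out.push((name.clone(), value.clone()));
        }
        Some(out)
    }
}
```

Selecting `score` (with a predicate) and then `id` (with none) from a row holding `id`, `name`, and `score` would yield just those two fields, in the order they were selected; `name` is never touched.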