(Do not merge) Data serialization #3275

Closed

duizendnegen
Contributor

@duizendnegen duizendnegen commented Nov 9, 2016

Issue

This PR addresses #2242: I/O usage and enabling cross-platform data (de)serialization.

Tasklist

  • replace ifstream / ofstream / fstream with internal binary-read/write File system
  • decide on where to read / write from and using which serialization method
  • plug in serialization method

Relevant documents

@duizendnegen
Contributor Author

Different file system usage across the project is marked in the current PR - hopefully this can be useful as a guide as to where to start shifting out code.

I started charting what kinds of data types were being read / written, but halfway through I gave up and resorted to merely placing TODO comments wherever data access was used. I think the preliminary findings already indicate what type of data we're dealing with.

Knowing what data types we're dealing with (and, going further, normalizing them) should prove useful if we opt for a data serialization technique that uses schemas. This will be a real challenge, as there are a lot of slightly different types around.

The TL;DR is: there are many types around and they're all some vector of structs, int32s and the like.

util::io usage

  • writeFingerprint - FingerPrint
    • extractor.hpp
  • readAndCheckFingerprint (like above)
    • internal_datafacade.hpp, storage.cpp
  • serializeVector
    • extractor.hpp: EdgeWeight, unsigned int, EntryClass (= int32_t based flag collection); Mask
    • util-tests io.cpp: int
  • deserializeVector
    • contractor.cpp: EdgeWeight
    • internal_datafacade.hpp: unsigned int, EntryClass
    • util-tests io.cpp: int
    • storage.cpp: unsigned int; EntryClass
  • serializeVectorIntoAdjacencyArray (no references?)
  • deserializeAdjacencyArray
    • internal_datafacade.hpp: Mask
    • storage.cpp: Mask
  • serializeFlags / deserializeFlags (candidate for deletion? only used in test files, not in actual code)
    • util-tests io.cpp: bool
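
To illustrate the kind of data the list above describes (vectors of plain structs and integers), here is a minimal sketch of a count-prefixed binary dump. The names `writeVector` and `readVector` are hypothetical and are not the actual util::io functions; the 64-bit count prefix is an assumption as well. Note that dumping raw bytes like this depends on the platform's endianness and struct padding, which is exactly the cross-platform concern this PR is about.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical helper (not the real util::io API): writes a 64-bit element
// count followed by the raw bytes of the elements.
template <typename T> void writeVector(std::ostream &out, const std::vector<T> &data)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte dumps only make sense for trivially copyable types");
    const std::uint64_t count = data.size();
    out.write(reinterpret_cast<const char *>(&count), sizeof(count));
    if (count > 0)
    {
        out.write(reinterpret_cast<const char *>(data.data()),
                  static_cast<std::streamsize>(count * sizeof(T)));
    }
    if (!out)
        throw std::runtime_error("write failed");
}

// Hypothetical helper: reads back what writeVector produced.
template <typename T> std::vector<T> readVector(std::istream &in)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte dumps only make sense for trivially copyable types");
    std::uint64_t count = 0;
    in.read(reinterpret_cast<char *>(&count), sizeof(count));
    if (!in)
        throw std::runtime_error("could not read element count");
    std::vector<T> data(count);
    if (count > 0)
    {
        in.read(reinterpret_cast<char *>(data.data()),
                static_cast<std::streamsize>(count * sizeof(T)));
    }
    if (!in)
        throw std::runtime_error("unexpected end of input");
    return data;
}
```

Round-tripping a `std::vector<std::int32_t>` through a `std::stringstream` with these two functions should give back the same elements on the same machine; a schema-based format would be needed to guarantee that across platforms.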

direct ifstream / ofstream usage

  • contractor.cpp (partly through boost::spirit::qi): SegmentSpeedSource, TurnPenaltySource, QueryNode, geometries (three vectors of uint32_t, int32_t, int32_t), float
  • extractor.cpp: FingerPrint, TurnRestriction (includes some bool flag struct), Nodes (includes QueryNode)
  • internal_datafacade.hpp: ProfileProperties, LaneTupleIdPair, std::string, ...

@ghoshkaj
Member

@duizendnegen hi! I'd like to help you on this. I've started profiling the direct ifstream and ofstream usages. Will post them here in a bit.

@ghoshkaj
Member

Hi @duizendnegen! This is currently where we're at: @danpat wrote a simple FileReader. So the next step is to convert all the reads and writes to use this FileReader. This is happening here: #3321
Please feel free to jump in and help with this if you like.

The FileReader is not complete and will need to be modified to meet all goals.
The plan after converting all raw ifstreams is to adapt that FileReader class to be more like @daniel-j-h outlined here.
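
For readers following along, a wrapper of this kind could look roughly like the sketch below. This is not the FileReader from #3321; the class name `SketchFileReader` and its methods `ReadInto` and `ReadOne` are hypothetical, chosen only to show the idea of routing all binary reads through one place that checks for errors.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical sketch of a thin ifstream wrapper; names are illustrative only.
class SketchFileReader
{
  public:
    explicit SketchFileReader(const std::string &path)
        : path_(path), stream_(path, std::ios::binary)
    {
        if (!stream_)
            throw std::runtime_error("Cannot open " + path_);
    }

    // Reads `count` raw elements into `dest`, throwing on short reads so that
    // callers do not each have to check the stream state themselves.
    template <typename T> void ReadInto(T *dest, std::size_t count)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "raw reads need a trivially copyable type");
        if (count == 0)
            return;
        stream_.read(reinterpret_cast<char *>(dest),
                     static_cast<std::streamsize>(count * sizeof(T)));
        if (!stream_)
            throw std::runtime_error("Unexpected end of file in " + path_);
    }

    // Convenience for reading a single value.
    template <typename T> T ReadOne()
    {
        T value;
        ReadInto(&value, 1);
        return value;
    }

  private:
    std::string path_;
    std::ifstream stream_;
};
```

The benefit of centralizing reads like this is that a later switch to a portable serialization format would only need to touch the wrapper, not every call site.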

@daniel-j-h
Member

Now with #3321 merged what are your next actions here @ghoshkaj @danpat?

@ghoshkaj
Member

I think this branch can be closed. I believe the next steps are to refactor the FileReader and FileWriter classes.

@duizendnegen
Contributor Author

Absolutely - next steps are to convert all ofstreams to use the FileWriter (just like FileReader now wraps all ifstreams)

I'll try and see whether I have some time to get at that in the coming weeks.
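
As a rough illustration of what wrapping an ofstream could look like, here is a minimal sketch. It is not the project's FileWriter; the class name `SketchFileWriter` and the methods `WriteFrom` and `WriteOne` are hypothetical and only show the shape of the conversion.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical sketch of a thin ofstream wrapper; names are illustrative only.
class SketchFileWriter
{
  public:
    explicit SketchFileWriter(const std::string &path)
        : path_(path), stream_(path, std::ios::binary | std::ios::trunc)
    {
        if (!stream_)
            throw std::runtime_error("Cannot open " + path_ + " for writing");
    }

    // Writes `count` raw elements starting at `src`, throwing on failure.
    template <typename T> void WriteFrom(const T *src, std::size_t count)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "raw writes need a trivially copyable type");
        if (count == 0)
            return;
        stream_.write(reinterpret_cast<const char *>(src),
                      static_cast<std::streamsize>(count * sizeof(T)));
        if (!stream_)
            throw std::runtime_error("Write failed for " + path_);
    }

    // Convenience for writing a single value.
    template <typename T> void WriteOne(const T &value) { WriteFrom(&value, 1); }

  private:
    std::string path_;
    std::ofstream stream_; // flushed and closed when the writer is destroyed
};
```

A call site that currently does `out.write(reinterpret_cast<const char *>(&x), sizeof(x))` on a raw ofstream would become `writer.WriteOne(x)` under this sketch.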
