Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve I/O error handling #3321

Merged
merged 1 commit into from
Dec 1, 2016
Merged

Improve I/O error handling #3321

merged 1 commit into from
Dec 1, 2016

Conversation

danpat
Copy link
Member

@danpat danpat commented Nov 16, 2016

Issue

This PR is to address #3047. It adds some basic wrappers around our file I/O which include consistent error handling and useful exception messages on error.

No file formats are changed.

Tasklist

  • add regression / cucumber cases (see docs/testing.md)
  • review
  • adjust for comments

@danpat danpat added this to the 5.5.0 milestone Nov 16, 2016
@TheMarex TheMarex changed the title [wip] Improve I/O error handling Improve I/O error handling Nov 17, 2016
@TheMarex
Copy link
Member

@duizendnegen
Copy link
Contributor

duizendnegen commented Nov 26, 2016

I'll try and open a branch to create PRs against this PR, not sure if I'll manage to find time in the coming days though. (Edit: it looks like all the ifstream usage is already implemented; is the next step then a FileWriter?)
Just to align: what about the usage of libosmium io? Will this also be refactored into using the FileReader classes? e.g. https://github.com/Project-OSRM/osrm-backend/blob/master/src/extractor/extractor.cpp#L131

@ghoshkaj
Copy link
Member

@danpat for review please!!

@ghoshkaj
Copy link
Member

Benchmark on my machine, extracting California:

FileReader Branch

real       7m6.943s
user       13m11.849s
sys        1m53.742s

Master

real       7m0.821s
user       12m53.339s
sys        1m49.112s

Benchmark on @danpat's machine:

FileReader Branch

real     6m8.743s
user     10m9.099s
sys      0m18.222s

Master

real     6m18.034s
user     10m16.862s
sys      0m18.979s

@ghoshkaj ghoshkaj force-pushed the fix/3047 branch 2 times, most recently from 38d08ce to 64151ee Compare November 30, 2016 00:14
@ghoshkaj
Copy link
Member

@danpat review por favor?

@ghoshkaj
Copy link
Member

ooor, because Dan and I have been working on this pretty much the whole time together, anybody else want to review? :D @TheMarex mayhaps?


BOOST_ASSERT(buffer.size() > 1);
stream.read(&buffer[0], static_cast<std::streamsize>(buffer.size()));

file_reader.ReadInto(&buffer[0], buffer.size());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the pattern

string b;
b.resize(reader.Size())
reader.ReadInto(&b[0], b.size());

why not abstract over this to let the reader already return a type filled with the contents?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ticketing and will tackle in a different PR.

{
namespace storage
{
namespace dedicated_io
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is dedicated io different to the storage::io namespace? and how is it related to util::io?

Copy link
Member

@ghoshkaj ghoshkaj Nov 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@daniel-j-h dedicated_io could have a better name. It's basically the parts of FileReader that is reading a specialized type of input. For example, readHSGRHeader which "Reads the checksum, number of nodes and number of edges written in the header file of a .hsgr file and returns them in a HSGRHeader struct". This is not your typical filestream that simply opens up a file, verifies a fingerprint or not and returns the FileReader object.

This file has readHSGRHeader, readPropertiesCount, readDatasourceIndexes, readEdges, readNodes, readDatasourceNames and the assoicated structs.

It strikes me that these are all related to reading graphs and graph components. So perhaps I can rename this header to graph_io or something similar?

{
util::SimpleLogger().Write(logWARNING) << ".hsgr was prepared with different build.\n"
"Reprocess to get rid of this warning.";
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression the FileReader is now doing all the checks and we don't have to do this ourselves?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that the FileReader doesn't do this specific fingerprint test: if (!fingerprint_loaded.TestGraphUtil(fingerprint_valid)) so we have to do it here.

input_file.ReadInto(header.number_of_edges);

BOOST_ASSERT_MSG(0 != header.number_of_nodes, "number of nodes is zero");
// number of edges can be zero, this is the case in a few test fixtures
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about checking: if (num edges > 0) assert num nodes > 0; ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@daniel-j-h I changed this to what you suggested above, but I'm missing context. Would you please explain? I'd like to know why BOOST_ASSERT_MSG(0 != header.number_of_nodes, "number of nodes is zero"); made sense at all.

BOOST_ASSERT(node_buffer);
BOOST_ASSERT(edge_buffer);
input_file.ReadInto(node_buffer, number_of_nodes);
input_file.ReadInto(edge_buffer, number_of_edges);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the reader handles a count of zero?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup! By returning:

if (count == 0)

if (!turn_penalty_file)
throw util::exception{"Unable to open turn penalty file " + filename};

storage::io::FileReader turn_penalty_file_reader(std::string(filename),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this is weird. Earlier, both this and the segment_speed_filenames[idx] were failing to compile because filename was a char* and not a string. And therefore would not compile without being cast as a string. Now it seems to be compiling just fine. So I'll remove it.

TurnPenaltySourceFlatMap local;

std::uint64_t from_node_id{};
std::uint64_t via_node_id{};
std::uint64_t to_node_id{};
double penalty{};

for (std::string line; std::getline(turn_penalty_file, line);)
{
std::for_each(turn_penalty_file_reader.GetLineIteratorBegin(), turn_penalty_file_reader.GetLineIteratorEnd(), [&](const std::string &line) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -28,4 +30,77 @@ BOOST_AUTO_TEST_CASE(io_data)
BOOST_CHECK_EQUAL_COLLECTIONS(data_out.begin(), data_out.end(), data_in.begin(), data_in.end());
}

BOOST_AUTO_TEST_CASE(io_nonexistent_file)
{
static std::string IO_NONEXISTENT_FILE = "non_existent_test_io.tmp";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

} catch (const osrm::util::exception &e) {
std::cout << e.what() << std::endl;
BOOST_REQUIRE(std::string(e.what()) == "Error opening non_existent_test_io.tmp:No such file or directory");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a macro CHECK_THROWS of something like that for this exact use-case

data_in.resize(153);

for (std::size_t i = 0; i < data_in.size(); ++i)
data_in[i] = i;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check out std::iota

vector<int> v(n);
iota(begin(v), end(v));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool! done!

@ghoshkaj ghoshkaj force-pushed the fix/3047 branch 2 times, most recently from 4f4c6b3 to 025cbb6 Compare November 30, 2016 18:26
@danpat danpat force-pushed the fix/3047 branch 2 times, most recently from 3cf65f8 to b63f8b6 Compare December 1, 2016 01:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants