-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace dump
functions with operator<<
overloads
#5026
Conversation
8f974d8
to
2c08b88
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these functions that implement a "dump" should be implemented as nonmember functions. This is in part to assure correctness. There's nothing in such a function for a schema that should ever need to rely on private member variables or private member functions. Using nonmember functions assures that they can't be called.
For historical reasons, our serialization and deserialization functions, both for storage and network, are implemented as member functions, but it would be better if they were moved out themselves. That's out of scope of this PR, but all the new code going in can be done this way.
We don't even want to write functions called dump
internally; that's a C idiom; we will reserve that term for the C API only. Since we're dealing with text output, everything here can be done better by overloading operator<<
for std::ostream
. We don't want functions that return strings.
Changes required:
- No new
dump
functions. - New overloads of
operator<<
. - Implement C API functions with
stringstream
andfstream
internally.
For an quick reference for the write function signatures, see the section "Stream Insertion and Extraction" in https://stackoverflow.com/questions/4421706/what-are-the-basic-rules-and-idioms-for-operator-overloading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The various T::dump(FILE *)
functions should all be inlined into their call sites in the C API implementation functions. The class-specific parts are not in operator<<
functions and all that's remaining in these is a bit of interface code. Also in all of these function fwrite
is better than fprintf
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even without dealing with the issues with Filter
, there are a number of mostly-small changes to be made.
For class Filter
, it looks like the following would work:
- A single definition of
operator<<
for filters. - A pure virtual function
Filter::output
that (1) calls. This should beprotected
. - An override of
Filter::output
in each class derived from filter. Alsoprotected
. - A friend declaration for
operator<<
so thatFilter::output
is available.
@eric-hughes-tiledb, making it a friend causes the need to be in |
Friend declarations can be for namespace-qualified identifiers. |
f5451fe
to
67d7ed4
Compare
91bbef2
to
7baa8fd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't approve this until ArraySchema::dump
is gone, along with all of its ilk. Now that we have proper stream output, there's nothing left of value in these functions. All of the
All the assert
statements here have to go. If the need for [[maybe_unused]]
wasn't clear enough, assert
does not always check the return value of an I/O function. (It does so only in some compiles, but not all.) We always need to check the return value of I/O functions.
The signature of Filter::output
has to change. And we don't need Filter::dump
any longer after this change; it and all its overrides can just go.
There are a number of places that use \n
explicitly. These should be replaced with std::endl
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't resolve all your suggestions
I have. The problem with the regression test seems to have been a preexisting defect; there was missing inclusion of the external dimension header.
I went ahead and fixed up all my last comments as well, since I already was building the branch locally.
LGTM.
The result is a solid improvement over the prior.
std::string*
argumentdump
functions with operator<<
overloads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These APIs do not exist.
700a974
to
e72a39c
Compare
0813e4b
to
7d71079
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
[sc-32959] In TileDB-Py `schema.dump()` is called without any argument, leading writes to C `stdout` which Jupyter does not capture: https://github.com/TileDB-Inc/TileDB/blob/76dbda43d98bff7b71527fbffd0dfd657437b00f/tiledb/sm/array_schema/array_schema.cc#L693-L694 Implementing a function that captures the output of the `dump()` to a string and then prints it to `sys.stdout` will fix the problem. Of course, `ArraySchema::dump(std::string*)` calls the `dump()` functions of other classes, so those need to be implemented as well. **But** we don't even want to write functions called dump internally; that's a C idiom; we will reserve that term for the C API only. Since we're dealing with text output, everything here can be done better by overloading `operator<<` for `std::ostream`. --- TYPE: IMPROVEMENT DESC: Output of `schema.dump()` in TileDB-Py is not captured by Jupyter. Replacing `dump` functions with `operator<<` overloads will give the ability to print the resulted string from Python. --------- Co-authored-by: Eric Hughes <eric.hughes@tiledb.com> (cherry picked from commit f40530f)
[sc-32959] Following #5026, `dump` C++ APIs were mistakenly deleted. This PR reimplements them as deprecated. `operator<<` remains the canonical interface. --- TYPE: NO_HISTORY DESC: Reimplement `dump` C++ APIs as deprecated. --------- Co-authored-by: Luc Rancourt <lucrancourt@gmail.com> Co-authored-by: KiterLuc <67824247+KiterLuc@users.noreply.github.com> Co-authored-by: Theodore Tsirpanis <theodore.tsirpanis@tiledb.com>
…dump(FILE*)` with `operator<<` overload. (#5266) After #5026, the only `dump` API that remains without a string counterpart is on `FragmentInfo`. This PR adds the `tiledb_fragment_info_dump_str` C API and replaces `FragmentInfo::dump(FILE*)` with an `operator<<` overload. [sc-50605] --- TYPE: C_API | CPP_API DESC: Add `tiledb_fragment_info_dump_str` C API and replace `FragmentInfo::dump(FILE*)` with `operator<<` overload. --------- Co-authored-by: Theodore Tsirpanis <theodore.tsirpanis@tiledb.com>
[sc-32959]
In TileDB-Py
schema.dump()
is called without any argument, leading writes to Cstdout
which Jupyter does not capture:TileDB/tiledb/sm/array_schema/array_schema.cc
Lines 693 to 694 in 76dbda4
Implementing a function that captures the output of the
dump()
to a string and then prints it tosys.stdout
will fix the problem. Of course,ArraySchema::dump(std::string*)
calls thedump()
functions of other classes, so those need to be implemented as well.But we don't even want to write functions called dump internally; that's a C idiom; we will reserve that term for the C API only. Since we're dealing with text output, everything here can be done better by overloading
operator<<
forstd::ostream
.TYPE: IMPROVEMENT
DESC: Output of
schema.dump()
in TileDB-Py is not captured by Jupyter. Replacingdump
functions withoperator<<
overloads will give the ability to print the resulted string from Python.