-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Restart files contain incorrect types for ArrayOfArrays #3434
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #3434 +/- ##
===========================================
- Coverage 57.58% 57.55% -0.04%
===========================================
Files 1135 1135
Lines 97797 97857 +60
===========================================
+ Hits 56316 56318 +2
- Misses 41481 41539 +58 ☔ View full report in Codecov by Sentry. |
ArrayOfArraysView< T const, INDEX_TYPE > const & var = var2.toViewConst(); | ||
internal::logOutputType( LvArray::system::demangleType( var ), "Output array via external pointer: " ); | ||
|
||
// ArrayOfArray::m_numArrays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// ArrayOfArray::m_numArrays | |
// ArrayOfArray::m_numArrays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe you want it as a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
// Push the data into conduit | ||
constexpr int conduitTypeID = conduitTypeInfo< T >::id; | ||
constexpr int sizeofConduitType = conduitTypeInfo< T >::sizeOfConduitType; | ||
conduit::DataType const dtype( conduitTypeID, offsets[numArrays] * sizeof( T ) / sizeofConduitType ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand the size here. What' s in offsets[numArrays]
?
Shouldn't this be the size of each array * numOfArrays
* sizeof(T)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
offsets
is the location in the buffer that a subarray starts. The next offset
is where the next one starts. So the last value offsets[numArrays]
is the end of the buffers as there are numArrays+1
values in offsets
. So `offsets[numArrays] is the length of the buffer if I am not mistaken. @corbett5 @wrtobin ??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I see. I would write this exact comment in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From ArrayOfArraysView.hpp
:
/// The number of arrays contained.
INDEX_TYPE_NC m_numArrays = 0;
/// Holds the offset of each array, of length m_numArrays + 1. Array i begins at
/// m_offsets[ i ] and has capacity m_offsets[i+1] - m_offsets[ i ].
BUFFER_TYPE< INDEX_TYPE > m_offsets;
/// Holds the size of each array.
BUFFER_TYPE< SIZE_TYPE > m_sizes;
/// Holds the values of each array. Values in the range [m_offsets[ i ], m_offsets[ i ] + m_sizes[ i ])
/// are valid. All other entries contain uninitialized values.
BUFFER_TYPE< T > m_values;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Randy. The last offset is one measure of the total length of the underlying memory. Other measures are the capacity from the underlying buffer, or offsets(numArrays - 1) + sizeOfArray(numArrays - 1) which would be the end of the valid values. For io you should only need to write up to this last value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@corbett5 I did notice it would be nice for a bit different nomenclature for this object....for example:
- there is no accessor function that returns the actual size of the data buffer (offset[m_numArrays])
- Perhaps language of calling the "arrays" is confusing. I think "subArrays" would be better...however if we want to do an generic
NestedArrays
arrays object for an arbitrary nesting, it would be nice to come up with something extendible. - The accessors for
m_offset
andm_sizes
protected inArrayOfArraysView
. So that means thatArrayOfArrays
can't provide public access. I am sure this is by design, but there needs to be more data helper functions to avoid needing direct access to these internal data arrays IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1
If you're accessing the underlying data (ie through m_offsets
and m_sizes
) you could be accessing uninitialized data. Which isn't a problem if you have an ArrayOfArrays< double >
, but is very much a problem (or at least you need to know what you're doing) if you have an ArrayOfArrays< string >
. So yeah I didn't make this easy on purpose.
There is a way to get the total underlying buffer capacity
/**
* @return Return the total number values that can be stored before reallocation.
*/
LVARRAY_HOST_DEVICE constexpr inline
INDEX_TYPE_NC valueCapacity() const
{ return m_values.capacity(); }
One way to do IO for the whole buffer would be
std::copy(arrayOfArryays[1].begin(), arrayOfArrays[arrayOfArrays.size() - 1].end(), dst_iter);
2
Certainly open for a name change. I thought I tried to call them inner arrays in the docs, but it's certainly confusing.
3
I agree it's strange that you can getValues()
becomes accessible once you convert an ArrayOfArrays
to an ArrayOfArraysView
. I don't think there would be any harm in making these methods accessible directly from ArrayOfArrays
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll give it a look over this evening if it hasn't been merged by then
// should preallocate var.m_values with estimated sizes | ||
INDEX_TYPE const arraySizeEstimate = (*numArrays)==0 ? 0 : valuesSize / (*numArrays); | ||
var.resize( *numArrays, arraySizeEstimate ); | ||
var.reserveValues( valuesSize ); | ||
|
||
// correctly set the sizes and capacities of each sub-array | ||
localIndex allocatedSize = 0; | ||
for( INDEX_TYPE i = 0; i < *numArrays; ++i ) | ||
{ | ||
INDEX_TYPE const arrayAllocation = offsets[i+1] - offsets[i]; | ||
var.setCapacityOfArray( i, arrayAllocation ); | ||
var.resizeArray( i, sizes[ i ] ); | ||
allocatedSize += arrayAllocation; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@corbett5
I have been trying a few iterations with this section of code. I can't seem to get it right. Can you have a close look at the attempt to pre-allocate values and the subsequent capacity/resize of each array to make sure it makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this whole function I think I would do this
var.resizeFromOffsets( *numArrays, offsets );
for (INDEX_TYPE i = 0; i < numArrays; ++i )
{
T const * begin = valuesNode.data_ptr() + offsets[i];
T const * end = begin + sizes[ i ];
var.appendToArray( i, begin, end );
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@corbett5 Is the name resizeFromOffsets
accurate? Looking that the function should it be reserveFromOffsets
?
@corbett5 I can't seem to pass integrated tests after baselines are set. There are diffs in the
I would think that we don't care about initialized values...but then we can't just dump the values out and read them back in. We either have to avoid writing them out, or set them to some value. Is there a way to set the initial values of uninitialized array entries? |
I think the best solution would be to modify the python diff code to realize when it's comparing an ArrayOfArrays. The easiest thing would be to set the uninitialized values in the ArrayOfArrays before writing out. T * const values = const_cast< T * >( var.getValues() );
INDEX_TYPE const * const offsets = var.getOffsets();
for( INDEX_TYPE i = 0; i < numArrays; ++i)
{
INDEX_TYPE const curOffset = offsets[ i ];
INDEX_TYPE const nextOffset = offsets[ i + 1 ];
for ( j = curOffset + var.sizeOfArray( i ); j < nextOffset; ++j )
{
values[ j ] = T() // Probably should be a NaN or something.
}
} |
INDEX_TYPE const nextOffset = offsets[ i + 1 ]; | ||
for( INDEX_TYPE j = curOffset + var.sizeOfArray( i ); j < nextOffset; ++j ) | ||
{ | ||
if constexpr ( std::is_arithmetic< T >::value ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this redundant since T() = 0
if std::is_arithmetic< T >::value == true
? Also, why do you choose to make a copy instead of changing the values in the array? If we're going to make a copy anyways I'd suggest compressing out the uninitialized values. The only "problem" with this is that when we read the array back in it is compressed as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this redundant since
T() = 0
ifstd::is_arithmetic< T >::value == true
? Also, why do you choose to make a copy instead of changing the values in the array? If we're going to make a copy anyways I'd suggest compressing out the uninitialized values. The only "problem" with this is that when we read the array back in it is compressed as well.
- I wasn't sure that
T() = 0
was actually true. If we have a variable declared asint a;
, isa
guaranteed to be equal to zero? - I choose to make a copy because I didn't want to break the
const
promise on the write function. It is of course a double edge sword, because you are not reading in what was passed to the write function...but then changing theconst
object that was passed to the write function seems somewhat illegal?...nevermind...can't make a local copy. Have to break constness. - I did not want to compress it because I wanted to make sure that the behavior of a restart run is identical to the original run. Compressing the
ArrayOfArrays
would result in behavioral changes as adding new values would require different operations after restart....which of course should be OK...unless there is some sort of allocation error that occurs when you try to insert new values in theArrayOfArrays
...etc. This is of course ignoring the fact that the array you read back in will not be exactly identical to the one you wrote out in the "unallocated" values....but those shouldn't matter ever or we have bigger problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int a;
is different fromint a = int()
. The first is uninitialized and can be anything, the second calls the default constructor which for numeric types always returns zero.- Yeah I think breaking correctness is fine here.
- I get that, if you want as similar behavior as possible then setting the uninitialized values is definitely the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@corbett5
I should know that. thanks!
This PR adds proper conduit IO for ArrayOfArrays. Previously the IO routines output ArrayOfArrays as 8-bit integers/char. So diffs were incomparable....example: