Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Restart files contain incorrect types for ArrayOfArrays #3434

Merged
merged 24 commits into from
Nov 12, 2024

Conversation

rrsettgast
Copy link
Member

@rrsettgast rrsettgast commented Nov 9, 2024

This PR adds proper conduit IO for ArrayOfArrays. Previously the IO routines output ArrayOfArrays as 8-bit integers/char. So diffs were incomparable....example:

********************************************************************************
Error: /Problem/domain/MeshBodies/mesh1/meshLevels/Level0/ElementRegions/elementRegionsGroup/Fracture/elementSubRegions/FractureSubRegion/nodeList/__values__
	Arrays of types int8 and int8 have 244 values of which 90 have differing values.
	Statistics of the differences greater than 0:
		max_index = (np.int64(216),), max = 111, mean = 28.666666666666668, std = 36.53856410242015
	Differing valid characters (substrings reordering detected):
	          !  !  ,  ,  7  7  B  B  
	          !  !  ,  ,  7  B  7  B  
	                         ^  ^     
********************************************************************************

@rrsettgast rrsettgast added ci: run CUDA builds Allows to triggers (costly) CUDA jobs ci: run integrated tests Allows to run the integrated tests in GEOS CI ci: run code coverage enables running of the code coverage CI jobs labels Nov 9, 2024
@rrsettgast rrsettgast self-assigned this Nov 9, 2024
@rrsettgast rrsettgast requested a review from CusiniM November 9, 2024 04:53
Copy link

codecov bot commented Nov 9, 2024

Codecov Report

Attention: Patch coverage is 0% with 61 lines in your changes missing coverage. Please review.

Project coverage is 57.55%. Comparing base (3b0d907) to head (fcfa719).
Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
...c/coreComponents/dataRepository/wrapperHelpers.hpp 0.00% 61 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3434      +/-   ##
===========================================
- Coverage    57.58%   57.55%   -0.04%     
===========================================
  Files         1135     1135              
  Lines        97797    97857      +60     
===========================================
+ Hits         56316    56318       +2     
- Misses       41481    41539      +58     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ArrayOfArraysView< T const, INDEX_TYPE > const & var = var2.toViewConst();
internal::logOutputType( LvArray::system::demangleType( var ), "Output array via external pointer: " );

// ArrayOfArray::m_numArrays
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// ArrayOfArray::m_numArrays
// ArrayOfArray::m_numArrays

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe you want it as a comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

// Push the data into conduit
constexpr int conduitTypeID = conduitTypeInfo< T >::id;
constexpr int sizeofConduitType = conduitTypeInfo< T >::sizeOfConduitType;
conduit::DataType const dtype( conduitTypeID, offsets[numArrays] * sizeof( T ) / sizeofConduitType );
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand the size here. What' s in offsets[numArrays]?

Shouldn't this be the size of each array * numOfArrays * sizeof(T) ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

offsets is the location in the buffer that a subarray starts. The next offset is where the next one starts. So the last value offsets[numArrays] is the end of the buffers as there are numArrays+1 values in offsets. So `offsets[numArrays] is the length of the buffer if I am not mistaken. @corbett5 @wrtobin ??

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I see. I would write this exact comment in the code.

Copy link
Member Author

@rrsettgast rrsettgast Nov 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From ArrayOfArraysView.hpp:

  /// The number of arrays contained.
  INDEX_TYPE_NC m_numArrays = 0;

  /// Holds the offset of each array, of length m_numArrays + 1. Array i begins at
  /// m_offsets[ i ] and has capacity m_offsets[i+1] - m_offsets[ i ].
  BUFFER_TYPE< INDEX_TYPE > m_offsets;

  /// Holds the size of each array.
  BUFFER_TYPE< SIZE_TYPE > m_sizes;

  /// Holds the values of each array. Values in the range [m_offsets[ i ], m_offsets[ i ] + m_sizes[ i ])
  /// are valid. All other entries contain uninitialized values.
  BUFFER_TYPE< T > m_values;

Copy link
Contributor

@corbett5 corbett5 Nov 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Randy. The last offset is one measure of the total length of the underlying memory. Other measures are the capacity from the underlying buffer, or offsets(numArrays - 1) + sizeOfArray(numArrays - 1) which would be the end of the valid values. For io you should only need to write up to this last value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@corbett5 I did notice it would be nice for a bit different nomenclature for this object....for example:

  • there is no accessor function that returns the actual size of the data buffer (offset[m_numArrays])
  • Perhaps language of calling the "arrays" is confusing. I think "subArrays" would be better...however if we want to do an generic NestedArrays arrays object for an arbitrary nesting, it would be nice to come up with something extendible.
  • The accessors for m_offset and m_sizes protected in ArrayOfArraysView. So that means that ArrayOfArrays can't provide public access. I am sure this is by design, but there needs to be more data helper functions to avoid needing direct access to these internal data arrays IMO.

Copy link
Contributor

@corbett5 corbett5 Nov 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1

If you're accessing the underlying data (ie through m_offsets and m_sizes) you could be accessing uninitialized data. Which isn't a problem if you have an ArrayOfArrays< double >, but is very much a problem (or at least you need to know what you're doing) if you have an ArrayOfArrays< string >. So yeah I didn't make this easy on purpose.

There is a way to get the total underlying buffer capacity

  /**
   * @return Return the total number values that can be stored before reallocation.
   */
  LVARRAY_HOST_DEVICE constexpr inline
  INDEX_TYPE_NC valueCapacity() const
  { return m_values.capacity(); }

One way to do IO for the whole buffer would be

std::copy(arrayOfArryays[1].begin(),  arrayOfArrays[arrayOfArrays.size() - 1].end(), dst_iter);

2

Certainly open for a name change. I thought I tried to call them inner arrays in the docs, but it's certainly confusing.

3

I agree it's strange that you can getValues() becomes accessible once you convert an ArrayOfArrays to an ArrayOfArraysView. I don't think there would be any harm in making these methods accessible directly from ArrayOfArrays.

src/coreComponents/dataRepository/wrapperHelpers.hpp Outdated Show resolved Hide resolved
Copy link
Contributor

@corbett5 corbett5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll give it a look over this evening if it hasn't been merged by then

Comment on lines +557 to +570
// should preallocate var.m_values with estimated sizes
INDEX_TYPE const arraySizeEstimate = (*numArrays)==0 ? 0 : valuesSize / (*numArrays);
var.resize( *numArrays, arraySizeEstimate );
var.reserveValues( valuesSize );

// correctly set the sizes and capacities of each sub-array
localIndex allocatedSize = 0;
for( INDEX_TYPE i = 0; i < *numArrays; ++i )
{
INDEX_TYPE const arrayAllocation = offsets[i+1] - offsets[i];
var.setCapacityOfArray( i, arrayAllocation );
var.resizeArray( i, sizes[ i ] );
allocatedSize += arrayAllocation;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@corbett5
I have been trying a few iterations with this section of code. I can't seem to get it right. Can you have a close look at the attempt to pre-allocate values and the subsequent capacity/resize of each array to make sure it makes sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this whole function I think I would do this

var.resizeFromOffsets( *numArrays, offsets );

for (INDEX_TYPE i = 0; i < numArrays; ++i )
{
  T const * begin = valuesNode.data_ptr() + offsets[i];
  T const * end = begin + sizes[ i ];
  var.appendToArray( i, begin, end );
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@corbett5 Is the name resizeFromOffsets accurate? Looking that the function should it be reserveFromOffsets?

@rrsettgast
Copy link
Member Author

@corbett5 I can't seem to pass integrated tests after baselines are set. There are diffs in the ArrayOfArraysView::m_values in the excess capacity of each array. The values appear to be uninitialized. Here an example:

dataset: </Problem/domain/MeshBodies/mesh1/meshLevels/Level0/ElementRegions/elementRegionsGroup/Domain/elementSubRegions/cb1/ToEmbeddedSurfaces/__values__> and </Problem/domain/MeshBodies/mesh1/meshLevels/Level0/ElementRegions/elementRegionsGroup/Domain/elementSubRegions/cb1/ToEmbeddedSurfaces/__values__>
size:           [10]           [10]
position        __values__      __values__      difference          
------------------------------------------------------------
[ 1 ]          22053           22034           19             
[ 5 ]          22053           22034           19             
[ 7 ]          22048           22039           9              
[ 9 ]          22048           22039           9              
4 differences found

I would think that we don't care about initialized values...but then we can't just dump the values out and read them back in. We either have to avoid writing them out, or set them to some value.

Is there a way to set the initial values of uninitialized array entries?

@corbett5
Copy link
Contributor

corbett5 commented Nov 11, 2024

@corbett5 I can't seem to pass integrated tests after baselines are set. There are diffs in the ArrayOfArraysView::m_values in the excess capacity of each array. The values appear to be uninitialized. Here an example:

dataset: </Problem/domain/MeshBodies/mesh1/meshLevels/Level0/ElementRegions/elementRegionsGroup/Domain/elementSubRegions/cb1/ToEmbeddedSurfaces/__values__> and </Problem/domain/MeshBodies/mesh1/meshLevels/Level0/ElementRegions/elementRegionsGroup/Domain/elementSubRegions/cb1/ToEmbeddedSurfaces/__values__>
size:           [10]           [10]
position        __values__      __values__      difference          
------------------------------------------------------------
[ 1 ]          22053           22034           19             
[ 5 ]          22053           22034           19             
[ 7 ]          22048           22039           9              
[ 9 ]          22048           22039           9              
4 differences found

I would think that we don't care about initialized values...but then we can't just dump the values out and read them back in. We either have to avoid writing them out, or set them to some value.

Is there a way to set the initial values of uninitialized array entries?

I think the best solution would be to modify the python diff code to realize when it's comparing an ArrayOfArrays. The easiest thing would be to set the uninitialized values in the ArrayOfArrays before writing out.

T * const values = const_cast< T * >( var.getValues() );
INDEX_TYPE const * const offsets = var.getOffsets();
for( INDEX_TYPE i = 0; i < numArrays; ++i)
{
  INDEX_TYPE const curOffset = offsets[ i ];
  INDEX_TYPE const nextOffset = offsets[ i + 1 ];
  
   for ( j = curOffset + var.sizeOfArray( i ); j < nextOffset; ++j )
   {
      values[ j ] = T() // Probably should be a NaN or something.
   }
}

INDEX_TYPE const nextOffset = offsets[ i + 1 ];
for( INDEX_TYPE j = curOffset + var.sizeOfArray( i ); j < nextOffset; ++j )
{
if constexpr ( std::is_arithmetic< T >::value )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this redundant since T() = 0 if std::is_arithmetic< T >::value == true? Also, why do you choose to make a copy instead of changing the values in the array? If we're going to make a copy anyways I'd suggest compressing out the uninitialized values. The only "problem" with this is that when we read the array back in it is compressed as well.

Copy link
Member Author

@rrsettgast rrsettgast Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this redundant since T() = 0 if std::is_arithmetic< T >::value == true? Also, why do you choose to make a copy instead of changing the values in the array? If we're going to make a copy anyways I'd suggest compressing out the uninitialized values. The only "problem" with this is that when we read the array back in it is compressed as well.

  • I wasn't sure that T() = 0 was actually true. If we have a variable declared as int a;, is a guaranteed to be equal to zero?
  • I choose to make a copy because I didn't want to break the const promise on the write function. It is of course a double edge sword, because you are not reading in what was passed to the write function...but then changing the const object that was passed to the write function seems somewhat illegal?...nevermind...can't make a local copy. Have to break constness.
  • I did not want to compress it because I wanted to make sure that the behavior of a restart run is identical to the original run. Compressing the ArrayOfArrays would result in behavioral changes as adding new values would require different operations after restart....which of course should be OK...unless there is some sort of allocation error that occurs when you try to insert new values in the ArrayOfArrays...etc. This is of course ignoring the fact that the array you read back in will not be exactly identical to the one you wrote out in the "unallocated" values....but those shouldn't matter ever or we have bigger problems.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • int a; is different from int a = int(). The first is uninitialized and can be anything, the second calls the default constructor which for numeric types always returns zero.
  • Yeah I think breaking correctness is fine here.
  • I get that, if you want as similar behavior as possible then setting the uninitialized values is definitely the way to go.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@corbett5
I should know that. thanks!

@rrsettgast rrsettgast merged commit 1c89619 into develop Nov 12, 2024
23 of 25 checks passed
@rrsettgast rrsettgast deleted the bugfix/conduitTypeError branch November 12, 2024 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci: run code coverage enables running of the code coverage CI jobs ci: run CUDA builds Allows to triggers (costly) CUDA jobs ci: run integrated tests Allows to run the integrated tests in GEOS CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants