
Flaky tests #496

Closed
18 of 19 tasks
mehertz opened this issue Jun 16, 2023 · 9 comments

Labels
epic Parent issue

@mehertz
Collaborator

mehertz commented Jun 16, 2023

Tracking ticket for flaky tests

@poodlewars
Collaborator

test_append_with_cont_mem_problem, added about a month ago, is flaky.

Code for it

Example job run

1: ================================== FAILURES ===================================
1: ______________________ test_append_with_cont_mem_problem ______________________
1: 
1: sym = 'test2023-06-21T14_56_11_375132'
1: lmdb_version_store_tiny_segment_dynamic = NativeVersionStore: Library: local.test_698_2023-06-21T14_56_11_375132, Primary Storage: lmdb_storage.
1: 
1:     def test_append_with_cont_mem_problem(sym, lmdb_version_store_tiny_segment_dynamic):
1:         set_config_int("SymbolDataCompact.SegmentCount", 1)
1:         df0 = pd.DataFrame({"0": ["01234567890123456"]}, index=[pd.Timestamp(0)])
1:         df1 = pd.DataFrame({"0": ["012345678901234567"]}, index=[pd.Timestamp(1)])
1:         df2 = pd.DataFrame({"0": ["0123456789012345678"]}, index=[pd.Timestamp(2)])
1:         df3 = pd.DataFrame({"0": ["01234567890123456789"]}, index=[pd.Timestamp(3)])
1:         df = pd.concat([df0, df1, df2, df3])
1:     
1:         for _ in range(100):
1:             lib = lmdb_version_store_tiny_segment_dynamic
1:             lib.write(sym, df0).version
1:             lib.append(sym, df1).version
1:             lib.append(sym, df2).version
1:             lib.append(sym, df3).version
1: >           lib.version_store.defragment_symbol_data(sym, None)
1: E           arcticdb_ext.storage.NoDataFoundException: test2023-06-21T14_56_11_375132
1: 

@poodlewars
Collaborator

poodlewars commented Jun 27, 2023

test_dynamic_strings_with_all_nones

Windows & Python 3.7

Stack

```
1: ================================== FAILURES ===================================
1: _____________________ test_dynamic_strings_with_all_nones _____________________
1: 
1: lmdb_version_store = NativeVersionStore: Library: local.test_973_2023-06-27T15_10_01_939197, Primary Storage: lmdb_storage.
1: 
1:     def test_dynamic_strings_with_all_nones(lmdb_version_store):
1:         df = pd.DataFrame({"x": [None, None]})
1: >       lmdb_version_store.write("strings", df, dynamic_strings=True)
1: 
1: tests\integration\arcticdb\version_store\test_basic_version_store.py:776:
1: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
1: 
1: self = NativeVersionStore: Library: local.test_973_2023-06-27T15_10_01_939197, Primary Storage: lmdb_storage.
1: symbol = 'strings', data =       x
1: 0  None
1: 1  None, metadata = None
1: prune_previous_version = False, pickle_on_failure = False
1: validate_index = False, kwargs = {'dynamic_strings': True}
1: proto_cfg = dynamic_strings: true
1: , dynamic_strings = True
1: recursive_normalizers = False, parallel = False, incomplete = False
1: coerce_columns = None, sparsify_floats = False
1: 
1:     def write(
1:         self,
1:         symbol: str,
1:         data: Any,
1:         metadata: Optional[Any] = None,
1:         prune_previous_version: Optional[bool] = None,
1:         pickle_on_failure: Optional[bool] = None,
1:         validate_index: bool = False,
1:         **kwargs,
1:     ) -> Optional[VersionedItem]:
1:         """
1:         Write `data` to the specified `symbol`. If `symbol` already exists then a new version will be created to
1:         reference the newly written data. For more information on versions see the documentation for the `read`
1:         primitive.
1: 
1:         Pandas DataFrames, Pandas Series and Numpy NDArrays will be normalised into a common structure suitable for
1:         storage. Data that cannot be normalised can be written by pickling the data, however pickled data
1:         consumes more storage space, is less performant for reads and writes and does not support advanced query
1:         features. Pickling is therefore only supported via the `pickle_on_failure` flag.
1: 
1:         Normalised data will be divided into segments that are deduplicated against storage prior to write. As a result,
1:         if `data` contains only slight changes compared to pre-existing versions only the delta will be written.
1: 
1:         Note that `write` is not designed for multiple concurrent writers over a single symbol.
1: 
1:         Note: ArcticDB will use the 0-th level index of the Pandas DataFrame for its on-disk index.
1: 
1:         Any non-`DatetimeIndex` will converted into an internal `RowCount` index. That is, ArcticDB will assign each
1:         row a monotonically increasing integer identifier and that will be used for the index.
1: 
1:         Parameters
1:         ----------
1:         symbol : `str`
1:             Symbol name. Limited to 255 characters. The following characters are not supported in symbols:
1:             "*", "&", "<", ">"
1:         data : `Union[pd.DataFrame, pd.Series, np.array]`
1:             Data to be written.
1:         metadata : `Optional[Any]`, default=None
1:             Optional metadata to persist along with the symbol.
1:         prune_previous_version : `bool`, default=True
1:             Removes previous (non-snapshotted) versions from the database.
1:         pickle_on_failure: `bool`, default=False
1:             Pickle `data` if it can't be normalized.
1:         validate_index: bool, default=False
1:             If True, will verify that the index of `data` supports date range searches and update operations. This in effect tests that the data is sorted in ascending order.
1:             ArcticDB relies on Pandas to detect if data is sorted - you can call DataFrame.index.is_monotonic_increasing on your input DataFrame to see if Pandas believes the
1:             data to be sorted
1:         kwargs :
1:             passed through to the write handler
1: 
1:         Returns
1:         -------
1:         Optional[VersionedItem]
1:             Structure containing metadata and version number of the written symbol in the store.
1:             The data attribute will not be populated.
1: 
1:         Raises
1:         ------
1:         UnsortedDataException
1:             If data is unsorted, when validate_index is set to True.
1: 
1:         Examples
1:         --------
1: 
1:         >>> df = pd.DataFrame({'column': [5,6,7]})
1:         >>> lib.write("symbol", df, metadata={'my_dictionary': 'is_great'})
1:         >>> lib.read("symbol").data
1:            column
1:         0       5
1:         1       6
1:         2       7
1:         """
1:         self.check_symbol_validity(symbol)
1:         proto_cfg = self._lib_cfg.lib_desc.version.write_options
1: 
1:         dynamic_strings = self._resolve_dynamic_strings(kwargs)
1: 
1:         pickle_on_failure = self.resolve_defaults(
1:             "pickle_on_failure", proto_cfg, global_default=False, existing_value=pickle_on_failure, **kwargs
1:         )
1:         prune_previous_version = self.resolve_defaults(
1:             "prune_previous_version", proto_cfg, global_default=False, existing_value=prune_previous_version, **kwargs
1:         )
1:         recursive_normalizers = self.resolve_defaults(
1:             "recursive_normalizers", proto_cfg, global_default=False, uppercase=False, **kwargs
1:         )
1:         parallel = self.resolve_defaults("parallel", proto_cfg, global_default=False, uppercase=False, **kwargs)
1:         incomplete = self.resolve_defaults("incomplete", proto_cfg, global_default=False, uppercase=False, **kwargs)
1: 
1:         # TODO remove me when dynamic strings is the default everywhere
1:         if parallel:
1:             dynamic_strings = True
1: 
1:         coerce_columns = kwargs.get("coerce_columns", None)
1:         sparsify_floats = kwargs.get("sparsify_floats", False)
1: 
1:         _handle_categorical_columns(symbol, data, False)
1: 
1:         log.debug(
1:             "Writing with pickle_on_failure={}, prune_previous_version={}, recursive_normalizers={}",
1:             pickle_on_failure,
1:             prune_previous_version,
1:             recursive_normalizers,
1:         )
1: 
1:         # Do a multi_key write if the structured is nested and is not trivially normalizable via msgpack.
1:         if recursive_normalizers:
1:             vit = self.try_flatten_and_write_composite_object(
1:                 symbol, data, metadata, pickle_on_failure, dynamic_strings
1:             )
1:             if isinstance(vit, VersionedItem):
1:                 return vit
1: 
1:         udm, item, norm_meta = self._try_normalize(
1:             symbol, data, metadata, pickle_on_failure, dynamic_strings, coerce_columns
1:         )
1:         # TODO: allow_sparse for write_parallel / recursive normalizers as well.
1:         if isinstance(item, NPDDataFrame):
1:             if parallel:
1:                 self.version_store.write_parallel(symbol, item, norm_meta, udm)
1:                 return None
1:             elif incomplete:
1:                 self.version_store.append_incomplete(symbol, item, norm_meta, udm)
1:                 return None
1:             else:
1:                 vit = self.version_store.write_versioned_dataframe(
1: >                   symbol, item, norm_meta, udm, prune_previous_version, sparsify_floats, validate_index
1:                 )
1: E               arcticdb_ext.exceptions.InternalException: (mdb_dbi_open: MDB_INCOMPATIBLE: Operation and DB incompatible, or DB flags changed)
```

Looking at the errors for this one, my recent LMDB fixes, which are currently in a PR, might help resolve it.

@mehertz mehertz assigned qc00 and unassigned vasil-pashov Jun 30, 2023
qc00 added a commit that referenced this issue Jul 13, 2023
The cause was that type_arithmetic_promoted_type would return int64 as the common type for uint64 and any signed int. When we then do the static_cast<WideType>(*ptr++) before calling the Is[Not]InOperator, the uint64 is converted to int64 and the special overloads in the Operators are never used.

+ Ability to only_test_encoding_version_v1 in a test/class/module

qc00 added a commit that referenced this issue Jul 13, 2023
qc00 added a commit that referenced this issue Jul 13, 2023
qc00 added a commit that referenced this issue Jul 13, 2023
qc00 added a commit that referenced this issue Jul 14, 2023
qc00 added a commit that referenced this issue Jul 14, 2023
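
The commit message above describes a signed/unsigned promotion pitfall. A minimal sketch of the effect, using numpy in Python as a stand-in for the C++ static_cast (the values here are illustrative):

```python
import numpy as np

# A uint64 value above 2**63 - 1 has no exact int64 representation.
big = np.array([2**64 - 1], dtype=np.uint64)

# Casting to the promoted "common" signed type wraps the value around,
# mirroring the static_cast<WideType> described in the commit message.
as_signed = big.astype(np.int64)

print(big[0])        # 18446744073709551615
print(as_signed[0])  # -1 -- the sign flips, so a membership check against
                     # signed values can produce a spurious match
```
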
@mehertz
Collaborator Author

mehertz commented Jul 14, 2023

@qc00 To summarise, after what I'm going to call round 1 of the test fixing, we'll be left with:

  1. test_engine.py::test_partial_write_hashed
  2. test_hypothesis_mean_agg_dynamic
  3. All the LMDB tests on Windows

Is that about right?

@qc00
Contributor

qc00 commented Jul 14, 2023

There's also test_append_with_cont_mem_problem from Alex's comment above.

@qc00 qc00 linked a pull request Jul 17, 2023 that will close this issue
qc00 added a commit that referenced this issue Jul 17, 2023
The cause was that type_arithmetic_promoted_type would return int64 as the
common type for uint64 and any signed int. When we then do the
static_cast<WideType>(*ptr++) before calling the Is[Not]InOperator, the
uint64 is converted to int64 and the special overloads in the Operators
are never used.

Additional changes:

+ Ability to only_test_encoding_version_v1 in a test/class/module
+ Attempt to fix another flaky test `test_read_ts`
qc00 added a commit that referenced this issue Jul 25, 2023
qc00 added a commit that referenced this issue Jul 25, 2023
qc00 added a commit that referenced this issue Jul 26, 2023
qc00 added a commit that referenced this issue Jul 26, 2023
qc00 added a commit that referenced this issue Jul 27, 2023
qc00 added a commit that referenced this issue Jul 27, 2023
qc00 added a commit that referenced this issue Aug 2, 2023
qc00 added a commit that referenced this issue Aug 2, 2023
@qc00 qc00 closed this as completed in b13ee7a Aug 7, 2023
octogenary pushed a commit that referenced this issue Aug 9, 2023
octogenary pushed a commit that referenced this issue Aug 9, 2023
@mehertz mehertz added the epic Parent issue label Aug 10, 2023
@poodlewars poodlewars reopened this Sep 20, 2023
@poodlewars
Collaborator

poodlewars commented Oct 16, 2023

@G-D-Petrov
Collaborator

G-D-Petrov commented Nov 21, 2023

PR #1087 is adding fixes for:

It is also adding xfails for:

We will continue to monitor the following and will not xfail them:

G-D-Petrov added a commit that referenced this issue Nov 21, 2023
#### Reference Issues/PRs
issue #496

#### What does this implement or fix?
As agreed today, I am adding xfail to the tests that need to be
investigated as part of issue #496.
The xfails will be removed when the tests are no longer flaky.
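
For reference, a minimal sketch of the marking pattern (the test body and reason string are illustrative, not the actual tests from the PR):

```python
import random

import pytest


# strict=False (the pytest default) reports a failure as XFAIL instead of
# breaking the build, while a lucky pass is reported as XPASS.
@pytest.mark.xfail(reason="Flaky, tracked in issue #496", strict=False)
def test_sometimes_fails():
    assert random.random() < 0.5  # stand-in for the flaky behaviour
```
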
G-D-Petrov added a commit that referenced this issue Nov 22, 2023
#### Reference Issues/PRs
Fix for issue #496 

#### What does this implement or fix?
Change tmpdir to tmp_path because, according to the pytest docs, tmpdir
is deprecated and tmp_path is the way to create temporary paths that are
safe to use in multiprocessing setups such as pytest-split and
pytest-xdist
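
Roughly, the change swaps pytest's legacy py.path-based fixture for its pathlib-based replacement (the test bodies here are illustrative):

```python
def test_write_legacy(tmpdir):
    # Legacy fixture: a py.path.local object, deprecated per the pytest docs.
    p = tmpdir.join("data.bin")
    p.write_binary(b"\x00")
    assert p.check()


def test_write(tmp_path):
    # tmp_path is a pathlib.Path unique to each test invocation, which is
    # what makes it safe with pytest-xdist / pytest-split workers.
    p = tmp_path / "data.bin"
    p.write_bytes(b"\x00")
    assert p.exists()
```
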
G-D-Petrov added a commit that referenced this issue Nov 23, 2023
#### Reference Issues/PRs
Part of issue #496 

#### What does this implement or fix?
Looks like some unneeded xfails have crept into master, probably due
to a bad merge on my part.
This PR removes them.
@jjerphan
Collaborator

test_hypothesis_mean_agg can fail. See these logs:

>   ???
E   AssertionError: DataFrame.iloc[:, 0] (column name="a") are different
E   
E   DataFrame.iloc[:, 0] (column name="a") values are different (100.0 %)
E   [index]: [0]
E   [left]:  [inf]
E   [right]: [nan]
E   At positional index 0, first diff: inf != nan
E   Falsifying example: test_hypothesis_mean_agg(
E       lmdb_version_store=NativeVersionStore: Library: local.test.553_2023-12-18T18_04_05_970933_v2, Primary Storage: lmdb_storage.,
E       df=
E             grouping_column              a
E           0               0  1.586038e+307
E           1               0  1.797693e+308
E       ,
E   )
E   
E   You can reproduce this example by temporarily adding @reproduce_failure('6.72.4', b'AAICAAIAAAC/xpX//////QIAAgAAAL////////6jAAEAAA==') as a decorator on your test case
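
The falsifying example hints at float64 overflow: the two generated values sum past the float64 maximum, so a sum-then-divide mean yields inf where another code path can propagate nan. A minimal sketch of just the overflow, with the values taken from the log above:

```python
import numpy as np

# The sum of these two values exceeds the float64 maximum
# (~1.7976931348623157e308), so the intermediate sum overflows to inf.
values = np.array([1.586038e307, 1.797693e308])

print(values.sum())                  # inf
print(values.sum() / len(values))    # inf -- a sum-then-divide mean
print((values / len(values)).sum())  # ~9.78e307 -- dividing first stays finite
```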

@G-D-Petrov
Collaborator

@reproduce_failure('6.72.4', b'AAICAAIAAAC/xpX//////QIAAgAAAL////////6jAAEAAA==')

It looks like this build was using a version of the test that doesn't contain the fix for exactly this issue.
The next version has the fix, so the problem should be resolved with it.
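
For anyone replaying this locally, the Hypothesis mechanism works roughly like this (the test shown is a stand-in; the decorator only replays correctly when attached to the test that produced the payload, otherwise Hypothesis raises DidNotReproduce):

```python
from hypothesis import given, reproduce_failure, strategies as st


# Version string and payload copied from the CI log above; the decorator
# should be removed again once the failure is reproduced and fixed.
@reproduce_failure('6.72.4', b'AAICAAIAAAC/xpX//////QIAAgAAAL////////6jAAEAAA==')
@given(st.data())
def test_hypothesis_mean_agg_stand_in(data):
    ...
```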

@poodlewars poodlewars added the bug Something isn't working label Dec 19, 2023
@poodlewars
Collaborator

Closing this epic (which is hard to track) in favour of the flaky test label.

@poodlewars poodlewars closed this as not planned Dec 29, 2023
@poodlewars poodlewars removed the bug Something isn't working label Dec 29, 2023