Investigate safekeeper eviction errors #8758
Is the check over-sensitive?
Context: generally, non-zero bytes after the last record are not impossible (you can get them by killing a compute while it has sent half a record to the safekeepers), but they are expected to disappear eventually, because the elected walproposer zeros out WAL after its initial position, which is a record boundary. So we wanted to understand how this could happen. I collected debug dumps and did a more thorough analysis. Overall there are 23 timelines like that (querying non-evicted timelines which were last modified more than a couple of days ago, don't have a lagging remote_consistent_lsn, and don't have other issues):
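A sketch of that filtering over the collected debug dumps; the JSON field names here are illustrative assumptions, not the actual dump schema:

```python
# Hedged sketch: filter safekeeper debug-dump JSON for suspicious timelines.
# Assumes one dump file per safekeeper with a "timelines" array; the field
# names ("evicted", "last_modified_ts", etc.) are illustrative only.
import json
import sys
import time

COUPLE_OF_DAYS = 2 * 24 * 3600

def suspicious_timelines(dump_path: str):
    with open(dump_path) as f:
        dump = json.load(f)
    now = time.time()
    for tli in dump.get("timelines", []):
        # Skip already-evicted timelines and recently active ones.
        if tli.get("evicted"):
            continue
        if now - tli.get("last_modified_ts", now) < COUPLE_OF_DAYS:
            continue
        # Skip timelines whose remote_consistent_lsn still lags commit_lsn.
        if tli.get("remote_consistent_lsn") != tli.get("commit_lsn"):
            continue
        yield tli["timeline_id"]

if __name__ == "__main__":
    for tid in suspicious_timelines(sys.argv[1]):
        print(tid)
```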
Last modification times of these: the latest is at
I looked more closely at ~5 of these, and they expose the same pattern:
So, the 7719-byte record size hints that the partial record is one carrying a full page image of some pg_proc page. pg_waldump doesn't give anything useful for partial/invalid WAL records. Moreover, it is in principle hard to extract info from corrupted records because of the WAL record structure (xlogrecord.h): first goes XLogRecord, then XLogRecordBlockHeader(s), then XLogRecordDataHeader[Short|Long], then the blocks (page images), and only then the main record data. In our case, when we have only ~300 bytes of the whole 7719, we most likely don't have the main record data at all (most of those 7719 bytes is likely the page image). I patched pg_waldump to provide at least
Here are its results and xxd dumps for some timelines:
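For reference, the fixed 24-byte XLogRecord header is decodable even from a truncated record, which is how the zero xid and the valid xl_prev below can be read off a hexdump. A minimal Python sketch (little-endian layout per xlogrecord.h; the file/offset handling is illustrative):

```python
# Hedged sketch: decode the fixed XLogRecord header from the first 24 bytes
# of a partial record -- enough to see xl_tot_len (7719 here), the zero
# xl_xid, and the xl_prev pointer.
import struct
import sys

# Layout per PostgreSQL xlogrecord.h: uint32 xl_tot_len, TransactionId xl_xid,
# XLogRecPtr xl_prev, uint8 xl_info, RmgrId xl_rmid, 2 bytes padding,
# pg_crc32c xl_crc.
XLOG_RECORD = struct.Struct("<IIQBBxxI")

def decode_header(raw: bytes) -> dict:
    tot_len, xid, prev, info, rmid, crc = XLOG_RECORD.unpack_from(raw)
    return {
        "xl_tot_len": tot_len,
        "xl_xid": xid,  # 0 for xid-less records, e.g. heap2 vacuum records
        "xl_prev": f"{prev >> 32:X}/{prev & 0xFFFFFFFF:X}",  # LSN of previous record
        "xl_info": hex(info),
        "xl_rmid": rmid,  # 9 = Heap2, 10 = Heap, per rmgrlist.h
        "xl_crc": hex(crc),
    }

if __name__ == "__main__":
    # Usage: decode_hdr.py <segment file> <byte offset of the partial record>
    with open(sys.argv[1], "rb") as f:
        f.seek(int(sys.argv[2], 0))
        print(decode_header(f.read(24)))
```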
Initially I was confused by the zero xid, thinking it looked like corruption in the middle (xl_prev follows it and is definitely valid), but looking at heapdesc.c and the vacuum code, there are xid-less heap2 records. Moreover, looking at the last operations for the affected timelines, it was check_availability. That means there was no usual 5 minutes of idleness: just some check (transaction) and a shutdown, so such a race between commit (XLogFlush) and vacuum is even more likely. Why don't we have more occurrences of this? On 11.03 we merged #6712, which started flushing all outstanding WAL to safekeepers before shutdown when pg is stopped in non-immediate mode. And we do

So I conclude that these bytes are not corruption and are quite normal. However, given that we shouldn't see more of this, I'm thinking of leaving this over-sensitive check in place and force-waking the affected tenants to remove the tail. I automated segment fetching / cmp a bit with
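A rough reconstruction of what that fetch/cmp automation could look like (hostnames, on-disk paths, and the scp transport are all assumptions):

```python
# Hedged sketch: copy the same WAL segment from every safekeeper and
# byte-compare the copies, like a scripted `cmp`.
import filecmp
import subprocess
import sys
import tempfile

def fetch_and_compare(timeline_id: str, segname: str, safekeepers: list[str]) -> bool:
    with tempfile.TemporaryDirectory() as tmp:
        local = []
        for i, host in enumerate(safekeepers):
            dst = f"{tmp}/{i}_{segname}"
            # Assumed on-disk layout <data dir>/<timeline_id>/<segment>.
            subprocess.run(
                ["scp", f"{host}:/storage/safekeeper/{timeline_id}/{segname}", dst],
                check=True,
            )
            local.append(dst)
        # shallow=False forces an actual byte-by-byte comparison.
        return all(filecmp.cmp(local[0], other, shallow=False) for other in local[1:])

if __name__ == "__main__":
    ok = fetch_and_compare(sys.argv[1], sys.argv[2], sys.argv[3:])
    print("segments identical" if ok else "segments DIFFER")
```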
Force started these 23 timelines with a script.
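Waking a tenant can be as simple as opening a connection to its endpoint; a hypothetical sketch of what such a script might look like (psycopg2 and the DSN list are assumptions, not the actual script):

```python
# Hedged sketch: force-wake affected timelines so the elected walproposer
# zeros out the garbage tail after the last record boundary.
import psycopg2  # assumed available

ENDPOINTS = [
    # one connection string per affected timeline; placeholder value
    "postgres://user:pass@ep-example-123.eu-west-1.aws.neon.tech/neondb",
]

for dsn in ENDPOINTS:
    # Opening a connection starts the compute; the elected walproposer then
    # truncates WAL after its initial position, removing the partial record.
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute("SELECT 1")
    print("woke", dsn)
```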
Repeated the same with staging eu-west-1.
AFAIS there is no such problem in staging eu-central-1.
https://neondb.slack.com/archives/C0756RKTCNR/p1723624642168319
https://neondb.slack.com/archives/C04KGFVUWUQ/p1725512770462649