Duplicate images with different IDs in annotations file + Valid and Train set not mutually exclusive #26

Open
OasisArtisan opened this issue Apr 8, 2023 · 0 comments


Hello, thanks for the great dataset and amazing experiment documentation in the paper.

While working on this dataset, I noticed that the training annotations in livecell_coco_train.json contain 65 duplicate images, i.e. the same file name registered under two different image IDs. They are listed below by file name, with the two IDs shown as k1 and k2.

name: Huh7_Phase_A10_2_00d16h00m_4.tif, k1: 742297, k2: 1012330
name: Huh7_Phase_A11_2_01d20h00m_4.tif, k1: 841377, k2: 1013044
name: Huh7_Phase_A10_2_00d12h00m_4.tif, k1: 749820, k2: 1013129
name: Huh7_Phase_A10_2_00d00h00m_2.tif, k1: 724219, k2: 1013993
name: Huh7_Phase_A10_2_00d04h00m_3.tif, k1: 796609, k2: 1014395
name: Huh7_Phase_A10_2_00d20h00m_4.tif, k1: 742103, k2: 1015006
name: Huh7_Phase_A11_2_01d04h00m_3.tif, k1: 721606, k2: 1016083
name: Huh7_Phase_A10_2_00d12h00m_2.tif, k1: 704129, k2: 1016350
name: Huh7_Phase_A10_2_00d08h00m_1.tif, k1: 749375, k2: 1016420
name: Huh7_Phase_A11_2_01d16h00m_3.tif, k1: 634448, k2: 1017442
name: Huh7_Phase_A11_2_02d00h00m_1.tif, k1: 627369, k2: 1017877
name: Huh7_Phase_A11_2_02d04h00m_1.tif, k1: 652928, k2: 1018687
name: Huh7_Phase_A10_2_00d20h00m_2.tif, k1: 814719, k2: 1019584
name: Huh7_Phase_A10_2_00d08h00m_2.tif, k1: 672481, k2: 1019719
name: Huh7_Phase_A10_2_00d16h00m_2.tif, k1: 850872, k2: 1020466
name: Huh7_Phase_A11_2_01d04h00m_2.tif, k1: 789231, k2: 1020928
name: Huh7_Phase_A10_2_00d12h00m_3.tif, k1: 652541, k2: 1021135
name: Huh7_Phase_A11_2_01d08h00m_1.tif, k1: 760798, k2: 1021356
name: Huh7_Phase_A11_2_02d00h00m_3.tif, k1: 739786, k2: 1021512
name: Huh7_Phase_A10_2_00d16h00m_1.tif, k1: 656984, k2: 1021646
name: Huh7_Phase_A11_2_01d12h00m_1.tif, k1: 752315, k2: 1021783
name: Huh7_Phase_A11_2_01d16h00m_4.tif, k1: 868707, k2: 1021965
name: Huh7_Phase_A11_2_01d04h00m_4.tif, k1: 851926, k2: 1023848
name: Huh7_Phase_A11_2_01d08h00m_2.tif, k1: 695916, k2: 1024548
name: Huh7_Phase_A10_2_00d20h00m_3.tif, k1: 838294, k2: 1025518
name: Huh7_Phase_A10_2_00d00h00m_4.tif, k1: 641150, k2: 1025582
name: Huh7_Phase_A11_2_01d12h00m_2.tif, k1: 729341, k2: 1026409
name: Huh7_Phase_A10_2_00d08h00m_3.tif, k1: 772941, k2: 1026686
name: Huh7_Phase_A11_2_01d04h00m_1.tif, k1: 652633, k2: 1027067
name: Huh7_Phase_A11_2_01d08h00m_3.tif, k1: 724680, k2: 1027318
name: Huh7_Phase_A11_2_01d12h00m_4.tif, k1: 846332, k2: 1027421
name: Huh7_Phase_A10_2_00d04h00m_1.tif, k1: 658739, k2: 1028667
name: Huh7_Phase_A11_2_01d12h00m_3.tif, k1: 736444, k2: 1028962
name: Huh7_Phase_A10_2_01d00h00m_3.tif, k1: 729266, k2: 1029265
name: Huh7_Phase_A10_2_00d00h00m_1.tif, k1: 627203, k2: 1029675
name: Huh7_Phase_A10_2_00d04h00m_4.tif, k1: 712511, k2: 1031290
name: Huh7_Phase_A11_2_01d20h00m_1.tif, k1: 865318, k2: 1033010
name: Huh7_Phase_A10_2_00d08h00m_4.tif, k1: 781168, k2: 1033353
name: SKOV3_Phase_G4_1_01d08h00m_2.tif, k1: 1164153, k2: 1328453
name: SKOV3_Phase_H4_1_00d12h00m_3.tif, k1: 1162274, k2: 1333757
name: SKOV3_Phase_H4_1_00d08h00m_3.tif, k1: 1168650, k2: 1333805
name: SKOV3_Phase_H4_1_00d00h00m_4.tif, k1: 1127550, k2: 1334497
name: SKOV3_Phase_H4_1_00d04h00m_3.tif, k1: 1185349, k2: 1334666
name: SKOV3_Phase_H4_1_00d12h00m_2.tif, k1: 1070449, k2: 1336398
name: SKOV3_Phase_G4_1_01d00h00m_1.tif, k1: 1074361, k2: 1336972
name: SKOV3_Phase_G4_1_01d08h00m_4.tif, k1: 1151307, k2: 1337945
name: SKOV3_Phase_H4_1_00d04h00m_4.tif, k1: 1059759, k2: 1338111
name: SKOV3_Phase_H4_1_00d08h00m_1.tif, k1: 1186839, k2: 1339549
name: SKOV3_Phase_G4_1_01d12h00m_1.tif, k1: 1177546, k2: 1343198
name: SKOV3_Phase_G4_1_00d20h00m_2.tif, k1: 1143762, k2: 1344901
name: SKOV3_Phase_G4_1_00d20h00m_1.tif, k1: 1064934, k2: 1345366
name: SKOV3_Phase_H4_1_00d12h00m_4.tif, k1: 1161177, k2: 1345619
name: SKOV3_Phase_H4_1_00d00h00m_3.tif, k1: 1199735, k2: 1347882
name: SKOV3_Phase_G4_1_01d08h00m_3.tif, k1: 1089743, k2: 1349510
name: SKOV3_Phase_H4_1_00d16h00m_4.tif, k1: 1051423, k2: 1351950
name: SKOV3_Phase_G4_1_01d04h00m_2.tif, k1: 1161030, k2: 1354744
name: SKOV3_Phase_G4_1_01d00h00m_2.tif, k1: 1189326, k2: 1356664
name: SKOV3_Phase_G4_1_01d12h00m_3.tif, k1: 1180678, k2: 1357317
name: SKOV3_Phase_G4_1_01d00h00m_4.tif, k1: 1109637, k2: 1357524
name: SKOV3_Phase_H4_1_00d00h00m_1.tif, k1: 1179907, k2: 1358532
name: SKOV3_Phase_G4_1_01d04h00m_4.tif, k1: 1182744, k2: 1358579
name: SKOV3_Phase_G4_1_01d12h00m_2.tif, k1: 1076656, k2: 1360349
name: SKOV3_Phase_H4_1_00d16h00m_2.tif, k1: 1182700, k2: 1363743
name: SKOV3_Phase_G4_1_01d04h00m_1.tif, k1: 1081917, k2: 1364498
name: SKOV3_Phase_H4_1_00d16h00m_3.tif, k1: 1058930, k2: 1364779

The validation file livecell_coco_val.json has 1 duplicate.

name: Huh7_Phase_A10_2_00d04h00m_2.tif, k1: 876543, k2: 1037056

The test file livecell_coco_test.json has 52 duplicates.

name: Huh7_Phase_A12_1_03d16h00m_2.tif, k1: 918641, k2: 1038567
name: Huh7_Phase_A12_1_03d16h00m_3.tif, k1: 993627, k2: 1038726
name: Huh7_Phase_A12_1_04d00h00m_1.tif, k1: 983296, k2: 1040165
name: Huh7_Phase_A12_1_04d00h00m_3.tif, k1: 991794, k2: 1041802
name: Huh7_Phase_A12_1_03d20h00m_2.tif, k1: 1001871, k2: 1041846
name: Huh7_Phase_A12_1_04d00h00m_2.tif, k1: 973163, k2: 1042788
name: Huh7_Phase_A12_1_03d20h00m_1.tif, k1: 933656, k2: 1043211
name: Huh7_Phase_A12_1_03d12h00m_4.tif, k1: 980864, k2: 1044008
name: Huh7_Phase_A12_1_03d20h00m_4.tif, k1: 921415, k2: 1044060
name: Huh7_Phase_A12_1_03d16h00m_1.tif, k1: 989701, k2: 1044900
name: Huh7_Phase_A12_1_03d12h00m_2.tif, k1: 942839, k2: 1045154
name: Huh7_Phase_A12_1_03d20h00m_3.tif, k1: 921474, k2: 1045838
name: Huh7_Phase_A12_1_04d00h00m_4.tif, k1: 921623, k2: 1047906
name: Huh7_Phase_A12_1_03d16h00m_4.tif, k1: 926145, k2: 1048250
name: Huh7_Phase_A12_1_03d12h00m_3.tif, k1: 983072, k2: 1048800
name: Huh7_Phase_A12_1_03d12h00m_1.tif, k1: 940824, k2: 1049447
name: SKOV3_Phase_F4_2_02d12h00m_1.tif, k1: 1317430, k2: 1372042
name: SKOV3_Phase_F4_2_02d16h00m_1.tif, k1: 1253492, k2: 1374590
name: SKOV3_Phase_F4_2_02d20h00m_1.tif, k1: 1302023, k2: 1375091
name: SKOV3_Phase_E4_2_02d08h00m_1.tif, k1: 1261690, k2: 1376035
name: SKOV3_Phase_F4_2_03d00h00m_2.tif, k1: 1274826, k2: 1376998
name: SKOV3_Phase_E4_2_02d04h00m_2.tif, k1: 1271389, k2: 1377636
name: SKOV3_Phase_F4_2_02d12h00m_4.tif, k1: 1278267, k2: 1377860
name: SKOV3_Phase_E4_2_02d00h00m_1.tif, k1: 1253296, k2: 1381348
name: SKOV3_Phase_E4_2_02d00h00m_3.tif, k1: 1272299, k2: 1383099
name: SKOV3_Phase_E4_2_01d20h00m_4.tif, k1: 1270397, k2: 1383876
name: SKOV3_Phase_E4_2_02d08h00m_4.tif, k1: 1296305, k2: 1384630
name: SKOV3_Phase_E4_2_01d16h00m_2.tif, k1: 1310723, k2: 1385515
name: SKOV3_Phase_F4_2_03d00h00m_3.tif, k1: 1292182, k2: 1385652
name: SKOV3_Phase_F4_2_02d16h00m_3.tif, k1: 1273771, k2: 1386312
name: SKOV3_Phase_E4_2_02d04h00m_4.tif, k1: 1273342, k2: 1389189
name: SKOV3_Phase_F4_2_03d00h00m_1.tif, k1: 1309765, k2: 1391353
name: SKOV3_Phase_F4_2_02d20h00m_2.tif, k1: 1274245, k2: 1396914
name: SKOV3_Phase_E4_2_01d20h00m_3.tif, k1: 1274465, k2: 1397578
name: SKOV3_Phase_E4_2_02d00h00m_4.tif, k1: 1289650, k2: 1399030
name: SKOV3_Phase_E4_2_02d04h00m_1.tif, k1: 1326698, k2: 1399796
name: SKOV3_Phase_F4_2_02d16h00m_2.tif, k1: 1276830, k2: 1400740
name: SKOV3_Phase_F4_2_02d12h00m_2.tif, k1: 1274032, k2: 1401700
name: SKOV3_Phase_F4_2_03d00h00m_4.tif, k1: 1276578, k2: 1402311
name: SKOV3_Phase_E4_2_01d20h00m_1.tif, k1: 1313123, k2: 1402853
name: SKOV3_Phase_E4_2_01d16h00m_1.tif, k1: 1256038, k2: 1405711
name: SKOV3_Phase_F4_2_02d20h00m_4.tif, k1: 1291633, k2: 1406930
name: SKOV3_Phase_F4_2_02d20h00m_3.tif, k1: 1287160, k2: 1407530
name: SKOV3_Phase_F4_2_02d12h00m_3.tif, k1: 1275261, k2: 1408023
name: SKOV3_Phase_E4_2_01d20h00m_2.tif, k1: 1285056, k2: 1409962
name: SKOV3_Phase_E4_2_02d08h00m_3.tif, k1: 1298971, k2: 1412065
name: SKOV3_Phase_E4_2_01d16h00m_4.tif, k1: 1298787, k2: 1412266
name: SKOV3_Phase_E4_2_02d08h00m_2.tif, k1: 1306741, k2: 1412807
name: SKOV3_Phase_E4_2_02d04h00m_3.tif, k1: 1315158, k2: 1413202
name: SKOV3_Phase_E4_2_02d00h00m_2.tif, k1: 1299804, k2: 1414706
name: SKOV3_Phase_F4_2_02d16h00m_4.tif, k1: 1322903, k2: 1415863
name: SKOV3_Phase_E4_2_01d16h00m_3.tif, k1: 1301377, k2: 1418523
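
For anyone who wants to reproduce the lists above, here is a minimal sketch of one way to find such duplicates. It is not necessarily the script that produced the lists. It assumes each entry under "images" carries the standard COCO "file_name" and "id" fields, and it accepts the section either as a list or as a dict keyed by ID, since I have not verified which layout every copy of the files uses. The function names are made up for illustration.

```python
import json
from collections import defaultdict


def load_coco(path):
    """Read a COCO-style annotation file into a dict."""
    with open(path) as f:
        return json.load(f)


def find_duplicate_images(coco):
    """Map each file name that occurs under more than one image ID to its IDs."""
    section = coco["images"]
    # Assumption: "images" is a list (standard COCO) or a dict keyed by ID.
    records = section.values() if isinstance(section, dict) else section

    ids_by_name = defaultdict(set)
    for image in records:
        ids_by_name[image["file_name"]].add(image["id"])

    return {
        name: sorted(ids)
        for name, ids in ids_by_name.items()
        if len(ids) > 1
    }


# Example usage (file name taken from the issue text):
# duplicates = find_duplicate_images(load_coco("livecell_coco_train.json"))
# for name, ids in duplicates.items():
#     print(f"name: {name}, ids: {ids}")
```

Grouping by file name rather than by ID is the point here: the IDs are all unique, so the duplication only shows up once entries are keyed on the name.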

I checked a few of the duplicates to see whether the two copies carry different annotations, but they appear to be identical.
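
A spot check like this could be automated along the following lines. This is only a sketch under assumptions: annotations are taken to have the standard COCO "image_id", "category_id", "bbox" and "segmentation" fields, the "annotations" section may be a list or a dict keyed by ID, and two copies count as identical when those fields match (annotation IDs are ignored because they are expected to differ). The helper names are hypothetical.

```python
from collections import defaultdict


def annotations_by_image(coco):
    """Group annotation records by the image ID they point to."""
    section = coco["annotations"]
    # Assumption: "annotations" is a list (standard COCO) or a dict keyed by ID.
    records = section.values() if isinstance(section, dict) else section

    grouped = defaultdict(list)
    for ann in records:
        grouped[ann["image_id"]].append(ann)
    return grouped


def annotation_signature(anns):
    """Order-independent summary of the category and geometry of annotations."""
    return sorted(
        (ann["category_id"], tuple(ann["bbox"]), repr(ann["segmentation"]))
        for ann in anns
    )


def same_annotations(coco, id_a, id_b):
    """True if both image IDs carry the same set of annotations."""
    grouped = annotations_by_image(coco)
    sig_a = annotation_signature(grouped.get(id_a, []))
    sig_b = annotation_signature(grouped.get(id_b, []))
    return sig_a == sig_b
```

Calling `same_annotations(coco, k1, k2)` for every pair in the lists above would turn "a few of the duplicates" into an exhaustive check.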

Furthermore, I'm not sure whether this is intended, but the validation set is not mutually exclusive with the training set. The two sets have 30 images in common:

{'Huh7_Phase_A10_2_00d00h00m_3.tif',
 'Huh7_Phase_A10_2_00d12h00m_1.tif',
 'Huh7_Phase_A10_2_00d16h00m_3.tif',
 'Huh7_Phase_A10_2_00d20h00m_1.tif',
 'Huh7_Phase_A10_2_01d00h00m_1.tif',
 'Huh7_Phase_A10_2_01d00h00m_2.tif',
 'Huh7_Phase_A10_2_01d00h00m_4.tif',
 'Huh7_Phase_A11_2_01d08h00m_4.tif',
 'Huh7_Phase_A11_2_01d16h00m_1.tif',
 'Huh7_Phase_A11_2_01d16h00m_2.tif',
 'Huh7_Phase_A11_2_01d20h00m_2.tif',
 'Huh7_Phase_A11_2_01d20h00m_3.tif',
 'Huh7_Phase_A11_2_02d00h00m_2.tif',
 'Huh7_Phase_A11_2_02d00h00m_4.tif',
 'Huh7_Phase_A11_2_02d04h00m_2.tif',
 'Huh7_Phase_A11_2_02d04h00m_3.tif',
 'Huh7_Phase_A11_2_02d04h00m_4.tif',
 'SKOV3_Phase_G4_1_00d20h00m_3.tif',
 'SKOV3_Phase_G4_1_00d20h00m_4.tif',
 'SKOV3_Phase_G4_1_01d00h00m_3.tif',
 'SKOV3_Phase_G4_1_01d04h00m_3.tif',
 'SKOV3_Phase_G4_1_01d08h00m_1.tif',
 'SKOV3_Phase_G4_1_01d12h00m_4.tif',
 'SKOV3_Phase_H4_1_00d00h00m_2.tif',
 'SKOV3_Phase_H4_1_00d04h00m_1.tif',
 'SKOV3_Phase_H4_1_00d04h00m_2.tif',
 'SKOV3_Phase_H4_1_00d08h00m_2.tif',
 'SKOV3_Phase_H4_1_00d08h00m_4.tif',
 'SKOV3_Phase_H4_1_00d12h00m_1.tif',
 'SKOV3_Phase_H4_1_00d16h00m_1.tif'}
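
The overlap can be computed as a set intersection of file names. Again this is a sketch rather than the exact code used for the list above; it assumes each image entry has a "file_name" field and that "images" is either a list or a dict keyed by ID. The function names are hypothetical.

```python
def file_names(coco):
    """Collect the set of image file names in a COCO-style dict."""
    section = coco["images"]
    # Assumption: "images" is a list (standard COCO) or a dict keyed by ID.
    records = section.values() if isinstance(section, dict) else section
    return {image["file_name"] for image in records}


def shared_images(coco_a, coco_b):
    """Return the file names that appear in both annotation sets."""
    return file_names(coco_a) & file_names(coco_b)
```

With the training and validation files loaded via `json.load`, `shared_images(train, val)` gives the common file names, and an empty result would mean the splits are disjoint.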