multiple negative for MNRL #3139

OnAnd0n opened this issue Dec 19, 2024 · 5 comments

OnAnd0n commented Dec 19, 2024

@tomaarsen
Where in your code is a positive plus multiple negatives mapped to n columns? I can't find it...


I also have the following questions:

  • Should all data samples have the same number of negatives for MNRL?

  • How is the sampling of hard negatives guaranteed?
    e.g. {a1, p1, n1-1, n1-2, n1-3}, {a2, p2, n2-1, n2-2, n2-3} => a train sample in the batch: {a1, p1, n1-1, n1-2, n1-3, n2-2, n3-1, ...}

    I read your explanation in Understanding how (hard) negatives in MNRL are used #3097

@Jakobhenningjensen

All your hard negatives are used for all your positives, i.e. say you have 10 (anchor, positive) pairs; then each (anchor, positive) pair would use all the other "positives" as negatives.

You can then decide to add negatives as well, but they are used for all (anchor, positive) pairs, i.e. if you have 10 (anchor, positive) pairs and 5 negatives, then for each (anchor, positive) pair you would have 9 + 5 = 14 negatives.

That means that your negatives are not specific to an (anchor, positive) pair (if you want that, you should use the TripletLoss).
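For reference, here is a minimal sketch of that setup with one explicit negative column, assuming the sentence-transformers v3 SentenceTransformerTrainer API; the model name, column names, and toy strings are only illustrative. The in-batch behaviour described above (other positives acting as negatives) comes from MultipleNegativesRankingLoss itself, not from anything extra in the snippet.

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("all-MiniLM-L6-v2")

# Toy (anchor, positive, negative) triplets; real training data would be much larger.
train_dataset = Dataset.from_dict({
    "anchor":   ["query 1", "query 2"],
    "positive": ["relevant doc 1", "relevant doc 2"],
    "negative": ["hard negative 1", "hard negative 2"],
})

# For each anchor, MNRL scores its own positive against the other in-batch
# positives plus the listed negatives in the batch.
loss = MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```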


OnAnd0n commented Dec 20, 2024

@Jakobhenningjensen
Thank you for your kind explanation.

> You can then decide to add negatives as well, but they are used for all (anchor, positive) pairs, i.e. if you have 10 (anchor, positive) pairs and 5 negatives, then for each (anchor, positive) pair you would have 9 + 5 = 14 negatives.

=> You mean: if I train with 32 data samples, each consisting of {anchor, positive, 3 negatives}, then there are 127 (= 31 positives + 96 negatives) negatives for each anchor,
so {anchor, positive, 127 negatives} is the in-batch sample in MNRL. Is that right?


          "data shape"                                                                   "In-batch input"
{a1, p1, n1-1, n1-2, n1-3}                                             {a1, p1,  [127 negatives for anchor1] }
{a2, p2, n2-1, n2-2, n2-3}                                             {a2, p2,  [127 negatives for anchor2] }
                      ...                        =>                                       ...
{a32, p32, n32-1, n32-2, n32-3}                                     {a32, p32,  [127 negatives for anchor32] }
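A quick back-of-the-envelope check of the 127 figure, assuming every other in-batch positive and every listed negative in the batch counts as a candidate for each anchor:

```python
batch_size = 32           # samples of the form {anchor, positive, 3 negatives}
negatives_per_sample = 3

other_positives = batch_size - 1                       # 31
listed_negatives = batch_size * negatives_per_sample   # 96

print(other_positives + listed_negatives)              # 127
```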

@Jakobhenningjensen

In MNRL (if I'm correct) you don't have negatives for each anchor/positive pair; you just provide a list of general hard negatives, i.e. if your "negative" column has 5 elements, then you get 5 additional negatives (which are all the same) for each sample, since those 5 elements are just added to the hard negatives for each batch.

@Jakobhenningjensen

I thought it worked as if you provided hard negatives for that specific anchor/positive pair, but you don't. I'm currently looking at implementing a loss that does exactly that.


OnAnd0n commented Dec 23, 2024

@Jakobhenningjensen
Thank you for your kind explanation.

But in the docs, the input shape for MNRL is (anchor, positive, negative_1, …, negative_n).
So are the hard negatives provided for a specific anchor/positive pair used in the loss for that pair?

[image: Loss Overview table from the docs]
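For what it's worth, that documented (anchor, positive, negative_1, …, negative_n) shape corresponds to one dataset column per extra negative, consumed by the loss in column order; here is a minimal sketch with n = 2, where the column names and strings are only illustrative:

```python
from datasets import Dataset

# Documented layout with n = 2 negative columns: the first column is the anchor,
# the second the positive, and every remaining column an additional negative.
train_dataset = Dataset.from_dict({
    "anchor":     ["query 1", "query 2"],
    "positive":   ["relevant doc 1", "relevant doc 2"],
    "negative_1": ["hard negative 1a", "hard negative 2a"],
    "negative_2": ["hard negative 1b", "hard negative 2b"],
})
```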
