Commit

update
wutaiqiang committed Apr 5, 2024
1 parent 57b54ff commit cb6a32f
Showing 1 changed file with 277 additions and 0 deletions.
277 changes: 277 additions & 0 deletions project/FRKL.html
@@ -88,8 +88,285 @@ <h2>FKL vs RKL</h2>
KL divergence is widely used in knowledge distillation (KD).
However, KL divergence is not symmetric: the forward KL is generally not equal to the reverse KL.
<a href="https://dibyaghosh.com/blog/probability/kldivergence.html" target="_blank">This blog post</a> explains the difference well.
<br>
Formally, minimizing the forward KL over &#x03B8; is equivalent to:
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true">
<mtr>
<mtd>
<mi>arg</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<munder>
<mo movablelimits="true" form="prefix">min</mo>
<mrow class="MJX-TeXAtom-ORD">
<mi>&#x03B8;<!-- θ --></mi>
</mrow>
</munder>
<msub>
<mi>D</mi>
<mrow class="MJX-TeXAtom-ORD">
<mi>K</mi>
<mi>L</mi>
</mrow>
</msub>
<mo stretchy="false">(</mo>
<mi>P</mi>
<mo fence="false" stretchy="false">&#x2016;<!-- ‖ --></mo>
<msub><mi>Q</mi><mi>&#x03B8;<!-- θ --></mi></msub>
<mo stretchy="false">)</mo>
</mtd>
<mtd>
<mi></mi>
<mo>=</mo>
<mi>arg</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<munder>
<mo movablelimits="true" form="prefix">min</mo>
<mrow class="MJX-TeXAtom-ORD">
<mi>&#x03B8;<!-- θ --></mi>
</mrow>
</munder>
<msub>
<mrow class="MJX-TeXAtom-ORD">
<mi mathvariant="double-struck">E</mi>
</mrow>
<mrow class="MJX-TeXAtom-ORD">
<mi>x</mi>
<mo>&#x223C;<!-- ∼ --></mo>
<mi>P</mi>
</mrow>
</msub>
<mo stretchy="false">[</mo>
<mo>&#x2212;<!-- − --></mo>
<mi>log</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<msub>
<mi>Q</mi>
<mi>&#x03B8;<!-- θ --></mi>
</msub>
<mo stretchy="false">(</mo>
<mi>x</mi>
<mo stretchy="false">)</mo>
<mo stretchy="false">]</mo>
<mo>&#x2212;<!-- − --></mo>
<mrow class="MJX-TeXAtom-ORD">
<mi class="MJX-tex-caligraphic" mathvariant="script">H</mi>
</mrow>
<mo stretchy="false">(</mo>
<mi>P</mi>
<mo stretchy="false">(</mo>
<mi>X</mi>
<mo stretchy="false">)</mo>
<mo stretchy="false">)</mo>
</mtd>
</mtr>
<mtr>
<mtd />
<mtd>
<mi></mi>
<mo>=</mo>
<mi>arg</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<munder>
<mo movablelimits="true" form="prefix">min</mo>
<mrow class="MJX-TeXAtom-ORD">
<mi>&#x03B8;<!-- θ --></mi>
</mrow>
</munder>
<msub>
<mrow class="MJX-TeXAtom-ORD">
<mi mathvariant="double-struck">E</mi>
</mrow>
<mrow class="MJX-TeXAtom-ORD">
<mi>x</mi>
<mo>&#x223C;<!-- ∼ --></mo>
<mi>P</mi>
</mrow>
</msub>
<mo stretchy="false">[</mo>
<mo>&#x2212;<!-- − --></mo>
<mi>log</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<msub>
<mi>Q</mi>
<mi>&#x03B8;<!-- θ --></mi>
</msub>
<mo stretchy="false">(</mo>
<mi>x</mi>
<mo stretchy="false">)</mo>
<mo stretchy="false">]</mo>
</mtd>
</mtr>
<mtr>
<mtd />
<mtd>
<mi></mi>
<mo>=</mo>
<mi>arg</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<munder>
<mo movablelimits="true" form="prefix">max</mo>
<mrow class="MJX-TeXAtom-ORD">
<mi>&#x03B8;<!-- θ --></mi>
</mrow>
</munder>
<msub>
<mrow class="MJX-TeXAtom-ORD">
<mi mathvariant="double-struck">E</mi>
</mrow>
<mrow class="MJX-TeXAtom-ORD">
<mi>x</mi>
<mo>&#x223C;<!-- ∼ --></mo>
<mi>P</mi>
</mrow>
</msub>
<mo stretchy="false">[</mo>
<mi>log</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<msub>
<mi>Q</mi>
<mi>&#x03B8;<!-- θ --></mi>
</msub>
<mo stretchy="false">(</mo>
<mi>x</mi>
<mo stretchy="false">)</mo>
<mo stretchy="false">]</mo>
</mtd>
</mtr>
</mtable>
</math>
while minimizing the reverse KL over &#x03B8; is equivalent to:
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mtable columnalign="right left right left right left right left right left right left" rowspacing="3pt" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true">
<mtr>
<mtd>
<mi>arg</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<munder>
<mo movablelimits="true" form="prefix">min</mo>
<mrow class="MJX-TeXAtom-ORD">
<mi>&#x03B8;<!-- θ --></mi>
</mrow>
</munder>
<msub>
<mi>D</mi>
<mrow class="MJX-TeXAtom-ORD">
<mi>K</mi>
<mi>L</mi>
</mrow>
</msub>
<mo stretchy="false">(</mo>
<msub><mi>Q</mi><mi>&#x03B8;<!-- θ --></mi></msub>
<mo fence="false" stretchy="false">&#x2016;<!-- ‖ --></mo>
<mi>P</mi>
<mo stretchy="false">)</mo>
</mtd>
<mtd>
<mi></mi>
<mo>=</mo>
<mi>arg</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<munder>
<mo movablelimits="true" form="prefix">min</mo>
<mrow class="MJX-TeXAtom-ORD">
<mi>&#x03B8;<!-- θ --></mi>
</mrow>
</munder>
<msub>
<mrow class="MJX-TeXAtom-ORD">
<mi mathvariant="double-struck">E</mi>
</mrow>
<mrow class="MJX-TeXAtom-ORD">
<mi>x</mi>
<mo>&#x223C;<!-- ∼ --></mo>
<msub>
<mi>Q</mi>
<mi>&#x03B8;<!-- θ --></mi>
</msub>
</mrow>
</msub>
<mo stretchy="false">[</mo>
<mo>&#x2212;<!-- − --></mo>
<mi>log</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<mi>P</mi>
<mo stretchy="false">(</mo>
<mi>x</mi>
<mo stretchy="false">)</mo>
<mo stretchy="false">]</mo>
<mo>&#x2212;<!-- − --></mo>
<mrow class="MJX-TeXAtom-ORD">
<mi class="MJX-tex-caligraphic" mathvariant="script">H</mi>
</mrow>
<mo stretchy="false">(</mo>
<msub>
<mi>Q</mi>
<mi>&#x03B8;<!-- θ --></mi>
</msub>
<mo stretchy="false">(</mo>
<mi>X</mi>
<mo stretchy="false">)</mo>
<mo stretchy="false">)</mo>
</mtd>
</mtr>
<mtr>
<mtd />
<mtd>
<mi></mi>
<mo>=</mo>
<mi>arg</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<munder>
<mo movablelimits="true" form="prefix">max</mo>
<mrow class="MJX-TeXAtom-ORD">
<mi>&#x03B8;<!-- θ --></mi>
</mrow>
</munder>
<msub>
<mrow class="MJX-TeXAtom-ORD">
<mi mathvariant="double-struck">E</mi>
</mrow>
<mrow class="MJX-TeXAtom-ORD">
<mi>x</mi>
<mo>&#x223C;<!-- ∼ --></mo>
<msub>
<mi>Q</mi>
<mi>&#x03B8;<!-- θ --></mi>
</msub>
</mrow>
</msub>
<mo stretchy="false">[</mo>
<mi>log</mi>
<mo>&#x2061;<!-- ⁡ --></mo>
<mi>P</mi>
<mo stretchy="false">(</mo>
<mi>x</mi>
<mo stretchy="false">)</mo>
<mo stretchy="false">]</mo>
<mo>+</mo>
<mrow class="MJX-TeXAtom-ORD">
<mi class="MJX-tex-caligraphic" mathvariant="script">H</mi>
</mrow>
<mo stretchy="false">(</mo>
<msub>
<mi>Q</mi>
<mrow class="MJX-TeXAtom-ORD">
<mi>&#x03B8;<!-- θ --></mi>
</mrow>
</msub>
<mo stretchy="false">(</mo>
<mi>X</mi>
<mo stretchy="false">)</mo>
<mo stretchy="false">)</mo>
</mtd>
</mtr>
</mtable>
</math>
<br>

</p>
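<p>
As a rough numerical check of the asymmetry and of the two decompositions above, here is a minimal Python sketch.
The distributions <code>P</code> and <code>Q</code> are made-up three-class examples (an assumption, not taken from this page), and the helper names <code>entropy</code>, <code>cross_entropy</code>, and <code>kl</code> are illustrative only.
It computes both directions of the KL divergence and verifies that each equals a cross-entropy term minus an entropy term.
</p>

```python
import math


def entropy(p):
    """Shannon entropy H(p) in nats; terms with p_i = 0 contribute 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """E_{x~p}[-log q(x)] for discrete distributions on the same support."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl(p, q):
    """D_KL(p || q) = sum_x p(x) * log(p(x) / q(x))."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical target distribution P and model distribution Q_theta
# over three classes (illustrative values only).
P = [0.7, 0.2, 0.1]
Q = [0.4, 0.4, 0.2]

forward_kl = kl(P, Q)  # D_KL(P || Q_theta): expectation taken under P
reverse_kl = kl(Q, P)  # D_KL(Q_theta || P): expectation taken under Q_theta

# Forward KL = E_{x~P}[-log Q(x)] - H(P)
forward_decomposed = cross_entropy(P, Q) - entropy(P)
# Reverse KL = E_{x~Q}[-log P(x)] - H(Q)
reverse_decomposed = cross_entropy(Q, P) - entropy(Q)

print(f"forward KL = {forward_kl:.4f} (decomposed: {forward_decomposed:.4f})")
print(f"reverse KL = {reverse_kl:.4f} (decomposed: {reverse_decomposed:.4f})")
```

<p>
Running it should print two different positive values, one per direction.
In the forward KL, H(P) does not depend on &#x03B8;, so only the cross-entropy term matters for the optimization; in the reverse KL, the entropy of Q<sub>&#x03B8;</sub> does depend on &#x03B8; and therefore stays in the objective.
</p>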


</div> <!-- col -->
</div> <!--row -->
</body>