Commit

fix space
Hou9612 committed May 28, 2024
1 parent 53b105f commit 416c7ca
Showing 1 changed file with 13 additions and 55 deletions.
68 changes: 13 additions & 55 deletions index.html
@@ -150,6 +150,8 @@ <h3 class="section-heading text-uppercase">What is the multisensory temporal eve
</div>
</div>

<p></p>

<div class="row">
<div class="col-lg-12">
<h3 class="section-heading text-uppercase">What is LFAV dataset?</h3>
@@ -233,6 +235,9 @@ <h3 class="section-subheading text-muted">Illustrations of our LFAV dataset stat


<div class="row justify-content-md-center text-center">
<div class="col-md-12" style="text-align:left">
<h4 class="section-subheading" style="text-align:left; margin-left:-14px">Dataset information</h4>
</div>
<div class="col-md centered" style="padding:1rem;">
<img src="{{ site.baseurl }}/static/img/stats-figures/stat1.png" style="width: 100%" class="img-responsive"/>
</div>
@@ -244,17 +249,11 @@ <h3 class="section-subheading text-muted">Illustrations of our LFAV dataset stat
the temporal proportion of events that occur on two modalities at the same time.
<b>(e)</b> Second-order interactions between all labels, the thicker the line, the closer the association.
<b>(f)</b> Distribution of dataset labels black of each category.
</p>
<!-- <b>(a-d)</b> statistical analysis of the videos and QA pairs. <b>(e)</b> Question formulas. <b>(f)</b> Distribution of question templates, where the dark color indicates the number of QA pairs generated from real videos while the light-colored area on the upper part of each bar means that from synthetic videos. <b>(g)</b> Distribution of first n-grams in questions. Our QA-pairs need <b>fine-grained scene understanding</b> and <b>spatio-temporal reasoning</b> over audio and visual modalities to be solved. For example, existential and location questions require spatial reasoning, and temporal questions require temporal reasoning. Best viewed in color.</p> -->
</div>


<!-- <div class="row justify-content-md-center text-center">
<div class="col-md centered" style="padding:1rem;">
<img src="{{ site.baseurl }}/static/img/stats-figures/matrix.png" style="width: 100%" class="img-responsive"/>
</div>
<p class="text-muted" style="text-align:left"><b>Left: </b>Number of combinations of different types of instruments, where the lighter the color, the more the number. And instruments outside the 22 instrument categories are denoted by other. The confusion matrix shows that the combination of different instruments is diversified. <b>Right-top: </b> According to Wikipedia, 22 kinds of instruments are divided into 4 categories: <i>String, Wind, Percussion</i> and <i>Keyboard</i>. <b>Right-bottom: </b>9 question types in different scenarios.</p>
</div> -->


<div class="row justify-content-md-center text-center">
<div class="col-md centered" style="padding:1rem;">
@@ -274,46 +273,7 @@ <h3 class="section-subheading text-muted">Illustrations of our LFAV dataset stat
</p>
</div>

<br/>


<!-- <div class="row justify-content-md-center text-center">
<div class="col-md-12" style="text-align:left">
<h4 class="section-subheading" id="downloadFiles" style="text-align:left; margin-left:-14px">How was LFAV dataset made?</h4>
</div>
<hr/>
<p class="text-muted" style="text-align:left">
We design an audio-visual question answering labeling system to collect questions, and all QA pairs are collected with this system. The flow chart of the labeling system is shown in below figure.
</p>
<center>
<img src="{{ site.baseurl }}/static/img/stats-figures/annotation_framework.png" alt="" style="width:88%; margin-top:10px; margin-bottom:10px;">
</center>
<p class="text-muted" style="text-align:left">
Labeling system contains <b>questioning</b> and <b>answering</b>. In the questioning section, the annotator is required to select the performance type of the video and the included instruments, and then scene types, question types, and question templates, and finally one question is automatically generated based on the previous selection. In the answering part, the annotator to judge whether the question is reasonable, and if it is unreasonable, the question will be labeled again. Then, the annotator answering the question according to video content, and finally one QA pair is produced.
</p>
</div>
<br/> -->


<!-- <div class="row justify-content-md-center text-center">
<div class="col-md-12" style="text-align:left">
<h4 class="section-subheading" style="text-align:left; margin-left:-14px">QA pairs samples</h4>
</div>
<hr/>
<p class="text-muted" style="text-align:left">
<b>Demo.</b> The large-scale spatial-temporal audio-visual dataset that focuses on question-answering task, as shown in below figure that different audio-visual scene types and their annotated QA pairs in the AVQA dataset.
</p>
<hr/>
<div class="col-md centered" style="padding:0.3rem;">
<img src="{{ site.baseurl }}/static/img/stats-figures/st_avqa_pairs.png" style="width: 100%" class="img-responsive"/>
</div>
<p class="text-muted", style="text-align:left">
In the first row, <b>a)</b>, <b>b)</b>, and <b>c)</b> represent real musical performance videos, namely solo, ensemble of the same instrument, and ensemble of different instruments. In the second row, <b>d)</b>, <b>e)</b>, and <b>f)</b> represent the synthetic video, which are audio and video random matching, audio overlay, and video stitching, respectively.
</p>
</div> -->
<p></p>


<div class="row justify-content-md-center text-center">
@@ -1097,14 +1057,7 @@ <h4 class="section-subheading" style="text-align:left; margin-left:-14px">Video
</tr>

</table>









</p>

</div>
@@ -1433,6 +1386,9 @@ <h4 class="section-subheading" id="downloadFiles">Main experiment results </h4>
</p>
</div>
</div>

<br/>

<div class="row">
<div class="col-md-12">
<h4 class="section-subheading" id="downloadFiles">Visualization results </h4>
@@ -1896,6 +1852,8 @@ <h6 class="text-muted">Renmin University of China</h6>
</div>
</div>

<br/>

<div class="col-lg-12">
<h2 class="section-heading text-uppercase">Acknowledgement</h2>
<div class="text-muted">
