Add more tips
NielsRogge committed Sep 29, 2022
1 parent 9680740 commit 738a608
Showing 1 changed file with 17 additions and 5 deletions.
docs/source/en/model_doc/markuplm.mdx

The MarkupLM model was proposed in [MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document
Understanding](https://arxiv.org/abs/2110.08518) by Junlong Li, Yiheng Xu, Lei Cui, Furu Wei. MarkupLM is BERT, but
applied to HTML pages instead of raw text documents. The model incorporates additional embedding layers to improve
performance, similar to [LayoutLM](layoutlm).

The model can be used for tasks like question answering on web pages or information extraction from web pages (see the
sketch after the list below). It obtains state-of-the-art results on 2 important benchmarks:
- [WebSRC](https://x-lance.github.io/WebSRC/), a dataset for Web-Based Structural Reading Comprehension (a bit like SQuAD but for web pages)
- [SWDE](https://www.researchgate.net/publication/221299838_From_one_tree_to_a_forest_a_unified_solution_for_structured_web_data_extraction), a dataset
for information extraction from web pages (basically named-entity recognition on web pages)
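As a quick illustration of the question answering use case, here is a minimal sketch using [`MarkupLMForQuestionAnswering`]. The checkpoint name `microsoft/markuplm-base-finetuned-websrc`, the toy HTML string, and the question are assumptions for illustration; any MarkupLM QA checkpoint works the same way.

```python
from transformers import MarkupLMProcessor, MarkupLMForQuestionAnswering
import torch

# assumed checkpoint: a MarkupLM base model fine-tuned on WebSRC
processor = MarkupLMProcessor.from_pretrained("microsoft/markuplm-base-finetuned-websrc")
model = MarkupLMForQuestionAnswering.from_pretrained("microsoft/markuplm-base-finetuned-websrc")

html_string = "<html><body><h1>My name is Niels.</h1></body></html>"
question = "What's his name?"

# the processor pairs the question with the parsed HTML
encoding = processor(html_string, questions=question, return_tensors="pt")

with torch.no_grad():
    outputs = model(**encoding)

# decode the highest-scoring answer span
start = outputs.start_logits.argmax(-1).item()
end = outputs.end_logits.argmax(-1).item()
print(processor.decode(encoding.input_ids[0, start : end + 1]))
```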

The abstract from the paper is the following:

*... the pre-trained MarkupLM significantly outperforms the existing strong baseline models on several document understanding
tasks. The pre-trained model and code will be publicly available.*

Tips:
- In addition to `input_ids`, [`~MarkupLMModel.forward`] expects 2 additional inputs, namely `xpath_tags_seq` and `xpath_subs_seq`.
These are the XPath tags and subscripts, respectively, for each token in the input sequence (see the sketch after this list).
- One can use [`MarkupLMProcessor`] to prepare all data for the model. Refer to the [usage guide](#usage-markuplmprocessor) for more info.
- Demo notebooks can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/MarkupLM).
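To make the two tips above concrete, here is a minimal sketch that runs a raw HTML string through [`MarkupLMProcessor`] and then through the base model. The tiny HTML string is a stand-in; `microsoft/markuplm-base` is the base checkpoint.

```python
from transformers import MarkupLMProcessor, MarkupLMModel

processor = MarkupLMProcessor.from_pretrained("microsoft/markuplm-base")
model = MarkupLMModel.from_pretrained("microsoft/markuplm-base")

html_string = "<html><body><h1>Hello world</h1></body></html>"

# the processor extracts nodes and xpaths, then tokenizes them
encoding = processor(html_string, return_tensors="pt")
print(encoding.keys())
# dict_keys(['input_ids', 'token_type_ids', 'attention_mask', 'xpath_tags_seq', 'xpath_subs_seq'])

# xpath_tags_seq and xpath_subs_seq are passed to the model along with input_ids
outputs = model(**encoding)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```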

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/model_doc/markuplm_architecture.jpg"
alt="drawing" width="600"/>

<small> MarkupLM architecture. Taken from the <a href="https://arxiv.org/abs/2110.08518">original paper.</a> </small>

This model was contributed by [nielsr](https://huggingface.co/nielsr). The original code can be found [here](https://github.com/microsoft/unilm/tree/master/markuplm).

