From 738a608e06534e0d1edb6695013aa52433b87c8e Mon Sep 17 00:00:00 2001
From: NielsRogge
Date: Thu, 29 Sep 2022 09:17:48 +0000
Subject: [PATCH] Add more tips

---
 docs/source/en/model_doc/markuplm.mdx | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/docs/source/en/model_doc/markuplm.mdx b/docs/source/en/model_doc/markuplm.mdx
index a7b2a95f6027cc..314ef750ce3ff4 100644
--- a/docs/source/en/model_doc/markuplm.mdx
+++ b/docs/source/en/model_doc/markuplm.mdx
@@ -16,7 +16,14 @@ specific language governing permissions and limitations under the License.
 
 The MarkupLM model was proposed in [MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document
 Understanding](https://arxiv.org/abs/2110.08518) by Junlong Li, Yiheng Xu, Lei Cui, Furu Wei. MarkupLM is BERT, but
-applied to HTML pages instead of raw text documents.
+applied to HTML pages instead of raw text documents. The model incorporates additional embedding layers to improve
+performance, similar to [LayoutLM](layoutlm).
+
+The model can be used for tasks like question answering on web pages or information extraction from web pages. It obtains
+state-of-the-art results on 2 important benchmarks:
+- [WebSRC](https://x-lance.github.io/WebSRC/), a dataset for Web-Based Structural Reading Comprehension (a bit like SQuAD but for web pages)
+- [SWDE](https://www.researchgate.net/publication/221299838_From_one_tree_to_a_forest_a_unified_solution_for_structured_web_data_extraction), a dataset
+for information extraction from web pages (basically named-entity recognition on web pages)
 
 The abstract from the paper is the following:
 
@@ -30,10 +37,15 @@ pre-trained MarkupLM significantly outperforms the existing strong baseline mode
 tasks. The pre-trained model and code will be publicly available.*
 
 Tips:
-- One can use [`MarkupLMProcessor`] to prepare all data for the model. This processor internally combines a [`MarkupLMFeatureExtractor`] to first
-extract all nodes and xpaths from one or more HTML strings, which are then fed to [`MarkupLMTokenizerFast`], which will turn them into token-level
-`input_ids`, `attention_mask`, `token_type_ids` etc. Optionally, one can also provide `node_labels`, which the tokenizer will turn into token-level
-`labels`.
+- In addition to `input_ids`, [`~MarkupLMModel.forward`] expects 2 additional inputs, namely `xpath_tags_seq` and `xpath_subs_seq`.
+These are the XPath tags and subscripts respectively for each token in the input sequence.
+- One can use [`MarkupLMProcessor`] to prepare all data for the model. Refer to the [usage guide](#usage-markuplmprocessor) for more info.
+- Demo notebooks can be found [here](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/MarkupLM).
+
+<small> MarkupLM architecture. Taken from the original paper. </small>
 
 This model was contributed by [nielsr](https://huggingface.co/nielsr). The original code can be found [here](https://github.com/microsoft/unilm/tree/master/markuplm).
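
As a quick illustration of the flow the tips describe, here is a minimal sketch of preparing an HTML string with [`MarkupLMProcessor`] and feeding the resulting `xpath_tags_seq` and `xpath_subs_seq` to the model. The `microsoft/markuplm-base` checkpoint name is an assumption (it is not named in this patch); any MarkupLM checkpoint should work the same way.

```python
from transformers import MarkupLMProcessor, MarkupLMModel

# Checkpoint name is an assumption; substitute any MarkupLM checkpoint.
processor = MarkupLMProcessor.from_pretrained("microsoft/markuplm-base")
model = MarkupLMModel.from_pretrained("microsoft/markuplm-base")

html_string = "<html><body><h1>Welcome</h1><p>Hello world</p></body></html>"

# The processor extracts nodes and their XPaths from the HTML, then tokenizes
# them, producing input_ids, attention_mask, token_type_ids, xpath_tags_seq
# and xpath_subs_seq in a single call.
encoding = processor(html_string, return_tensors="pt")
print(encoding.keys())

# xpath_tags_seq and xpath_subs_seq encode, for each token, the tag names and
# subscripts along its XPath (e.g. /html/body/p[1]).
outputs = model(**encoding)
print(outputs.last_hidden_state.shape)
```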