Invalid document when html contains images and existing headers/footers contain images #113

@toneb

Word reports "unreadable content" when the document contains a header or footer with images and the inserted HTML also contains images.

The problem is duplicated DocProperties.Id values:

new wp.DocProperties() { Id = drawingObjId, Name = "Picture " + imageObjId, Description = String.Empty },

The calculation of drawingObjId in the above method considers only elements in the document body, not the other parts of the document (headers and footers).
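
To make the collision concrete, here is a simplified model that uses plain integers in place of DocProperties.Id values and does not touch OpenXML at all. It assumes the next id is allocated as one more than the largest id found, which may not be exactly what the library does; the DocPropIdModel class, its method names, and the part names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified model: plain uints stand in for DocProperties.Id values.
// The class, method names and part names are hypothetical.
public static class DocPropIdModel
{
    // Next id when only the body is scanned (the behaviour described above).
    public static uint NextIdBodyOnly(IReadOnlyDictionary<string, uint[]> idsByPart) =>
        idsByPart["body"].DefaultIfEmpty(0u).Max() + 1;

    // Next id when body, headers and footers are all scanned.
    public static uint NextIdAllParts(IReadOnlyDictionary<string, uint[]> idsByPart) =>
        idsByPart.Values.SelectMany(ids => ids).DefaultIfEmpty(0u).Max() + 1;

    // True if some part of the document already uses this id.
    public static bool IsTaken(IReadOnlyDictionary<string, uint[]> idsByPart, uint id) =>
        idsByPart.Values.Any(ids => ids.Contains(id));
}
```

With an empty body and a header that already holds id 1, NextIdBodyOnly returns 1, which IsTaken reports as already used, while NextIdAllParts returns 2, which is free.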

My workaround is to renumber the generated DocProperties ids, taking elements in the other document parts into account:

using System.Linq;
using DocumentFormat.OpenXml.Drawing.Wordprocessing;
using HtmlToOpenXml;

// get existing max docProp id across body, headers and footers
// (DefaultIfEmpty guards against documents with no drawings at all)
var mainPart = doc.MainDocumentPart!;
var maxDocPropId = new[] {
    mainPart.Document.Body!.Descendants<DocProperties>().Select(p => p.Id?.Value ?? 0),
    mainPart.HeaderParts.SelectMany(part => part.Header.Descendants<DocProperties>().Select(p => p.Id?.Value ?? 0)),
    mainPart.FooterParts.SelectMany(part => part.Footer.Descendants<DocProperties>().Select(p => p.Id?.Value ?? 0)),
}.SelectMany(ids => ids).DefaultIfEmpty(0u).Max();

// convert html to openxml
var converter = new HtmlConverter(mainPart);
var parsed = converter.Parse(html);

// renumber docProp ids, continuing from the document-wide maximum
parsed.SelectMany(e => e.Descendants<DocProperties>()).ToList().ForEach(p => p.Id = ++maxDocPropId);

// ... use generated elements ...
