Update EditEngine code to use 32 bit paragraph storage #164
which is limited to 16-bit indices and 2^16 entries. For now, use our own std::vector-based compatible class. Patch by: me
Get the ParaPortionList to use 32 bit indices too. Patch by: me
…ices. Patch by: me
Change the public API of the EditEngine class to 32 bit paragraph indices. Patch by: me
MoveParagraphsInfo, PasteOrDropInfos, EENotify, DeletedNodeInfo, EditUndoDelContent, EditUndoSplitPara, and EditUndoMoveParagraphs classes with 32 bit paragraph indices. Patch by: me
to 32 bit integers. Factor out the BaseList class into a separate file and use it as the ContentInfoList. Also increment the binary stream version to 603, as it now stores a 32 bit paragraph count. Patch by: me
to 32 bit that were missed before. Patch by: me
…asses to 32 bit. Patch by: me
Fix one wrongly converted variable that should have been left as 16 bit. Patch by: me
Audit all usage of "node" (sometimes used as a name for variables storing paragraph indices) and start converting and fixing the problems. Patch by: me
Patch by: me
to use 32 bit paragraph indices, as per the compiler warnings. Also convert EBulletInfo to 32 bit paragraph indices. Patch by: me
indices, and update enough of the affected code to be able to build editeng. Patch by: me
…hs(). Patch by: me
OLUndoExpand to 32 bit indices. Patch by: me
…class and its related code. Patch by: me
in outlin2.cxx. Patch by: me
tools module's 64 bit macros. Patch by: me
to return a sal_uInt32 or sal_uLong, and fix any related problems in all their callers. Patch by: me
…codebase. Patch by: me
take a 32 bit paragraph index. Patch by: me
…ragraph index. Patch by: me
…paragraph index. Patch by: me
… paragraph index. Patch by: me
…RangeEnumeration to use 32 bit paragraph indices. Patch by: me
…agraph indices. Patch by: me
fix the calling code. Patch by: me
in editeng/source/accessibility. Patch by: me
and convert these to 32 bit constants. Add new EE_PARA_MAX and EE_INDEX_MAX constants for this purpose. Patch by: me
take 32 bit paragraph index parameters too. Patch by: me
…archReplaceShape::Search(). Patch by: me
Hi Damjan,
Hi Damjan, [1] https://home.apache.org/~cmarcum/test-files/20000-row-html-table.html
Nice catch. Yes, apparently there is still a problem somewhere in Writer. I only tested copying from a web browser and pasting to Calc.
I believe the Writer table limit of 64K cells is unrelated to this issue. For example in main/sw/source/filter/html/htmltab.cxx:
That's not EditEngine related, that's a separate issue specific to Writer. Please test only copying from a web browser and pasting into Calc, and opening through the "File type" of "HTML document (OpenOffice Calc) (*.html, *.htm)".
My latest testing: trunk got to row 13106 Copy/Paste from Firefox to Calc Copy/Paste from Firefox to Writer
There is no way all 20000 pasted successfully over HTML on trunk. Either the transfer format wasn't HTML, or you tested something other than trunk.
Hi Damjan, I can confirm that copying from Firefox and pasting into AOO within the same VM results in:
my test was successful
Is everyone happy? Should we merge this to trunk? And 420?
Sorry, I forgot to build this one... My Windows build is now running, looks good so far. Here it is:
My test on Windows with this table: was successful!
I am having problems with an xlsx file containing Cyrillic characters. Only the numbers are imported. It may only be my machine. Update:
Update EditEngine code to use 32 bit paragraph storage (cherry picked from commit d5edfd0)
Our EditEngine, in main/editeng, has a number of containers for paragraphs, paragraph portions, lines, etc. These are based on svl's "PTRARR" type classes, which are just arrays of some caller-defined class that grow as needed, like ::std::vector but hardcoded to 16 bit limits. Worse, a lot of code, even outside of main/editeng, is written under these assumptions: it passes 16 bit types and constants to EditEngine classes and retrieves 16 bit results.
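To make the 16 bit assumption concrete, here is a minimal sketch of what happens when a caller keeps a larger paragraph index in a 16 bit variable. The function name `StoreIndexLegacy` is hypothetical and not taken from the OpenOffice sources; the sketch only relies on standard fixed-width integer conversion.

```cpp
#include <cstdint>

// Hypothetical helper: models a legacy caller that keeps a paragraph
// index in a 16 bit variable. This is not real EditEngine code.
std::uint16_t StoreIndexLegacy(std::uint32_t paragraphIndex) {
    // Converting to an unsigned 16 bit type keeps only the low 16 bits,
    // so any index above 65535 wraps around without a runtime error.
    return static_cast<std::uint16_t>(paragraphIndex);
}
```

With this sketch, index 70000 would be stored as 4464 (70000 - 65536), so such a caller would end up addressing the wrong paragraph.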
One particularly notorious bug emerging from this design is that when you try to paste an HTML table of more than 65534 cells into Calc, the first 65534 cells are pasted, but all further cells are silently lost. This has been reported in at least 3 bugs in the last 18 years (57176, 110486 and 117225). It apparently happens because our HTML import, similar to XML DOM parsing, is done in 2 phases: first parsing the input into memory, then processing those in-memory results to populate the spreadsheet (ScHTMLImport::WriteToDocument() in main/sc/source/filter/html/htmlimp.cxx). When parsing, each HTML cell becomes a new EditEngine paragraph, and after 65534 paragraphs all further cells are ignored, so when it's time to populate the spreadsheet, the data isn't there to add.
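The two-phase failure described above can be sketched roughly as follows. Everything here is a hypothetical model: `Para16List`, `kMaxParas` and `ParseCells` are invented names, and the cap of 65534 is simply the limit observed in the bug reports, not a constant copied from the real code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a paragraph container with a 16 bit limit.
class Para16List {
public:
    // Assumption: cap taken from the observed behaviour (65534 cells).
    static constexpr std::size_t kMaxParas = 65534;

    // Returns false when the list is full and the paragraph is dropped.
    bool Append(const std::string& text) {
        if (items_.size() >= kMaxParas)
            return false;
        items_.push_back(text);
        return true;
    }

    std::size_t Count() const { return items_.size(); }

private:
    std::vector<std::string> items_;
};

// Phase 1: "parse" every cell into a paragraph. The result of Append()
// is ignored, which models the silent loss. Returns how many paragraphs
// are available for phase 2 (populating the spreadsheet).
std::size_t ParseCells(std::size_t cellCount) {
    Para16List paragraphs;
    for (std::size_t i = 0; i < cellCount; ++i)
        paragraphs.Append("cell");
    return paragraphs.Count();
}
```

In this model, parsing 70000 cells leaves only 65534 paragraphs in memory, so the second phase has nothing to write for the remaining cells.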
This enormous patchset changes the container for paragraphs to a new BaseList class, which wraps ::std::vector and uses 32 bit integers to access it. All EditEngine methods are changed to take 32 bit paragraph index parameters, all EditEngine classes store any paragraph indices as 32 bit fields, all 16 bit paragraph index constants like 0xffff are changed to 32 bit 0xffffffff (and to a more readable constant like EE_PARA_NOT_FOUND), and all (known) calling code everywhere in OpenOffice is updated to take these changes into account.
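As a rough illustration of the approach, a vector-backed list with 32 bit indices could look like the sketch below. It is not the BaseList class from the patch: `BaseListSketch`, its method names, and `kParaNotFound` are assumptions made for this example, with `kParaNotFound` mirroring the 0xffffffff sentinel mentioned above.

```cpp
#include <cstdint>
#include <vector>

// Assumption: 32 bit "not found" sentinel, analogous to EE_PARA_NOT_FOUND.
constexpr std::uint32_t kParaNotFound = 0xFFFFFFFFu;

// Hypothetical sketch of a std::vector wrapper addressed by 32 bit indices.
template <typename T>
class BaseListSketch {
public:
    std::uint32_t Count() const {
        return static_cast<std::uint32_t>(items_.size());
    }

    // Inserts at pos; positions past the end are clamped to an append.
    void Insert(const T& item, std::uint32_t pos) {
        if (pos > Count())
            pos = Count();
        items_.insert(items_.begin() + pos, item);
    }

    const T& GetObject(std::uint32_t pos) const { return items_.at(pos); }

    // Linear search; returns kParaNotFound when the item is absent.
    std::uint32_t GetPos(const T& item) const {
        for (std::uint32_t i = 0; i < Count(); ++i)
            if (items_[i] == item)
                return i;
        return kParaNotFound;
    }

private:
    std::vector<T> items_;
};
```

Because the sentinel is the full 32 bit value, a valid index such as 65535 no longer collides with "not found", which is the kind of collision the 16 bit 0xffff constant invited.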
The bug is definitely fixed: all sample documents from Bugzilla paste fully and correctly. What is harder to prove is that nothing else broke. Much had to change: 136 files in 9 modules. OpenGrok helped tremendously in finding obscure places where EditEngine methods were being called from. I've really tried to avoid introducing any new bugs, but it is hard to be sure with a change of this size. I am making a PR instead of pushing directly to allow you to test a lot ;).