This library is intended to parse and create OneNote® revision store files according to the [MS-ONESTORE] and [MS-ONE] specifications.
Currently, this is mainly an exercise in learning software development by doing. As a result, the current state does not provide a lot of useful features beyond parsing some elements of a OneNote® file. The current structure is messy, and the parsing functionality is not yet separated from the document model.
However, one goal of writing this library is to add it as a filter to the Calligra project, BasKet, or something new within the KDE realm. These projects can be expected to already require Qt and to be built with CMake. For this reason, this library uses these frameworks as well, even if they offer more functionality than required.
If you are looking for working OneNote parsers, go to:
- one2html, comprehensive library to convert files from OneDrive
- Apache Tika, library to parse documents and meta data
- Interop-TestSuites, an OfficeDev repository by Microsoft to test a number of protocols
- Qt 5.12+ libraries
- CMake
- libmspack will likely be required as well in the near future (used to unpack .onepkg files)
From the root directory of this repository, run:
mkdir build && cd build
cmake ..
make
Any comments and contributions are welcome! Just open an issue, or drop me an email.
If you run into any issues while trying to parse a file, consider opening an issue and attaching the respective file. That way I can figure out which part of the code is the likely culprit.
At some point this library should have a public API which masks the RevisionStoreFile components and lets you interact with the actual content of the notebooks/sections/pages only. If you imagine a specific use case, I would be glad to hear your ideas.
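As a rough illustration of that idea, the sketch below shows what a content-level facade could look like. It is not the library's API: `Section`, `Page`, and the simplified `RevisionStoreFile` stand-in with its `rawPages` member are hypothetical names invented for this example, and it uses only the C++ standard library to stay self-contained, whereas the library itself builds on Qt.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the parsed revision store. The real
// RevisionStoreFile in this library is far more complex.
struct RevisionStoreFile {
    // (title, text) pairs, assumed to be already extracted by the parser
    std::vector<std::pair<std::string, std::string>> rawPages;
};

// Hypothetical content-level value type handed to users of the facade.
struct Page {
    std::string title;
    std::string text;
};

// Facade: callers work with pages and never touch file nodes,
// fragments, or other revision store internals.
class Section {
public:
    explicit Section(RevisionStoreFile store) : m_store(std::move(store)) {}

    // Returns the pages as plain value objects, detached from the store.
    std::vector<Page> pages() const {
        std::vector<Page> result;
        result.reserve(m_store.rawPages.size());
        for (const auto& raw : m_store.rawPages) {
            result.push_back(Page{raw.first, raw.second});
        }
        return result;
    }

private:
    RevisionStoreFile m_store; // kept private, not part of the public interface
};
```

In a real API the constructor would more likely take a file path or a stream, so that `RevisionStoreFile` does not appear in the public interface at all.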
- Make parsing more robust when encountering errors and fail safely; study other libraries to extract patterns.
- Develop an API which can be used as a public facade masking all of the RevisionStoreFile components.
- Create an XmlWriter which builds an XML file according to MS-ONE.
- Object spaces and maps for GUIDs/ExtendedGUIDs are not yet created.
- TransactionLogFragment parsing is buggy. Sometimes parsing has to end before the given number of TransactionEntries can be parsed.
- SOLID principles are not considered enough. Most classes contain functionality which is a mixture of file operations, data structure, and string writer.
- No graph is implemented which describes the relationships between components, such as FileNodeList->FileNodeListFragment->FileNode, or Notebook->Page->Outline.
- Deduce the serialized format of Shapes.
- Analyze audio/video embedded into a file.
- Create XMLs according to OneNote's XSD.
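One way to approach the "fail safely" item and the TransactionLogFragment issue from the list above is to check the remaining buffer size before every read and report a partial result instead of reading past the end. The following is a minimal sketch, not the current implementation. It assumes each TransactionEntry consists of two little-endian 32-bit fields (8 bytes in total), which should be verified against MS-ONESTORE. `ParseResult`, `readU32LE`, and `parseTransactionEntries` are hypothetical names, and the sketch uses only the standard library rather than Qt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed entry layout: two 32-bit fields, 8 bytes in total.
struct TransactionEntry {
    std::uint32_t srcID;
    std::uint32_t transactionEntrySwitch;
};

// Hypothetical result type: the entries that could be read, plus a flag
// telling the caller whether the expected count was reached.
struct ParseResult {
    std::vector<TransactionEntry> entries;
    bool complete;
};

// Reads a little-endian uint32 starting at `offset`.
static std::uint32_t readU32LE(const std::vector<std::uint8_t>& buf, std::size_t offset) {
    assert(offset + 4 <= buf.size()); // caller must have checked the bounds
    return static_cast<std::uint32_t>(buf[offset])
         | static_cast<std::uint32_t>(buf[offset + 1]) << 8
         | static_cast<std::uint32_t>(buf[offset + 2]) << 16
         | static_cast<std::uint32_t>(buf[offset + 3]) << 24;
}

// Parses up to `expected` entries and stops cleanly if the buffer runs out.
ParseResult parseTransactionEntries(const std::vector<std::uint8_t>& buf, std::size_t expected) {
    constexpr std::size_t kEntrySize = 8;
    ParseResult result{{}, true};
    std::size_t offset = 0;
    for (std::size_t i = 0; i < expected; ++i) {
        if (buf.size() - offset < kEntrySize) {
            result.complete = false; // fewer bytes than one full entry remain
            break;
        }
        result.entries.push_back({readU32LE(buf, offset), readU32LE(buf, offset + 4)});
        offset += kEntrySize;
    }
    return result;
}
```

With this shape, the caller can keep the entries that were parsed and decide separately how to handle `complete == false`, for example by logging the mismatch instead of aborting the whole file.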
This project is neither related to nor endorsed by Microsoft in any way. The author does not have any affiliation with Microsoft. Information which is not specified in MS-ONESTORE was obtained by 'clean room reverse engineering' only, mostly with code found in this project. Third party software binaries have NOT been analyzed (or disassembled). Only .one, .onetoc2, and .onepkg files have been used to deduce unspecified information. This also means the validity of the findings cannot be guaranteed.
Third party projects from which functionality has been derived are listed in LICENSE.3rdparty.md. This currently includes:
- Qt Open Source Edition - 5.15
- OfficeDev's Interop-TestSuites
- pablospe's cmake-example-library
- KMess's LibISF-Qt