Opening MSG files that contain MS Teams messages throws UnrecognizedMSGTypeError: Could not recognize MSG class type "IPM.SkypeTeams.Message" #401

Closed
Witiko opened this issue Jan 10, 2024 · 7 comments

Comments

@Witiko
Contributor

Witiko commented Jan 10, 2024

Bug Metadata

  • Version of extract_msg: 0.0.47
  • Your python version: Python 3.10.12
  • How did you launch extract_msg?
    • I used the extract_msg package

Describe the bug

Opening MSG files that contain MS Teams messages causes the following exception to be thrown:

extract_msg.exceptions.UnrecognizedMSGTypeError: Could not recognize MSG class type "IPM.SkypeTeams.Message". As such, there is a high chance that support may be impossible, but you should contact the developers to find out more.

What code did you use or can we use to reproduce this error?

Calling the function extract_msg.openMsg() immediately causes the exception to be thrown.
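
A minimal reproduction sketch, assuming only that `extract_msg.openMsg()` takes a file path as described above. The helper name `report_open` and its `opener` parameter are hypothetical, added so the snippet can also be exercised without the package installed:

```python
def report_open(path, opener=None):
    """Try to open an MSG file; return (ok, detail) instead of raising."""
    if opener is None:
        # Imported lazily so the sketch loads even without extract_msg installed.
        import extract_msg
        opener = extract_msg.openMsg
    try:
        msg = opener(path)
    except Exception as exc:
        # On affected versions this is expected to report UnrecognizedMSGTypeError.
        return False, f"{type(exc).__name__}: {exc}"
    return True, type(msg).__name__
```

Calling `report_open('client-file.msg')` on an affected file should return `False` together with the exception text quoted above.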

Traceback

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/extract_msg/open_msg.py", line 170, in openMsg
    raise UnrecognizedMSGTypeError(f'Could not recognize MSG class type "{ct}". As such, there is a high chance that support may be impossible, but you should contact the developers to find out more.')
@TheElementalOfDestruction
Collaborator

Apparently the response I had to this never got sent, and I only realized now. Apologies for that.

Unfortunately, that type isn't actually documented, so I'll need examples to implement code that properly understands it. Any that you can provide would be great, but I can try to see if I can generate my own.

@Witiko
Contributor Author

Witiko commented Oct 11, 2024

@TheElementalOfDestruction: Thanks for the response! I am sorry that I did not reply for almost 9 months. 🙏

Unfortunately, that type isn't actually documented, so I'll need examples to implement code that properly understands it. Any that you can provide would be great, but I can try to see if I can generate my own.

I cannot share client data, so I looked for ways to create an example MSG file without any sensitive information. Apparently, MSG files of this class are produced by exporting search results from Microsoft Purview eDiscovery. However, the MSG format for MS Teams messages seems to be unavailable for new searches since 2022.

Instead of trying to produce example MSG files, I forked the library and in commit Witiko@4b5ff13, I updated the library to recognize the class "IPM.SkypeTeams.Message" as a message type. I verified that this fixes the problem on 107 different client MSG files with the class "IPM.SkypeTeams.Message". Therefore, I opened PR #440 that should close this issue.
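
To illustrate the idea of recognizing a new class type as a message, here is a self-contained sketch of a class-type lookup. It is not the code from the commit or the library's actual dispatch logic; the table `HANDLERS`, the handler labels, and the function `resolve_handler` are all hypothetical:

```python
# Hypothetical mapping from MSG class-type prefixes to handler labels.
HANDLERS = {
    'ipm.note': 'message',
    'ipm.skypeteams.message': 'message',  # the newly recognized class type
    'ipm.contact': 'contact',
}

def resolve_handler(class_type):
    """Return the handler label for a class type, matching case-insensitively."""
    ct = class_type.lower()
    # Check longer prefixes first so the most specific entry wins.
    for prefix in sorted(HANDLERS, key=len, reverse=True):
        if ct == prefix or ct.startswith(prefix + '.'):
            return HANDLERS[prefix]
    raise LookupError(f'Could not recognize MSG class type "{class_type}".')
```

With the extra table entry, `resolve_handler('IPM.SkypeTeams.Message')` resolves to the same handler as a regular note, while an unknown class type still raises an error.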

@TheElementalOfDestruction
Collaborator

Have you confirmed that there doesn't appear to be any special data entries we should have implemented in a custom type? I'd rather make absolutely sure we get this as fully implemented as possible. One way I tend to go about this is to try to open the file in Outlook and print it if possible. If a print option works to bring up print preview, I take a look at what headers appear at the top and see if there are any that are different from regular Outlook messages.

If it does have differences you can quickly identify, implementing a new class for it is easy since Message and MessageBase are functionally identical, with Message just being a subclass of MessageBase which adds no functionality and is only used for identifying that an MSG file is specifically of type message. If this format doesn't follow along with how standard messages work, I'd rather it have its own class.
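
The subclassing pattern described here can be sketched as follows. The classes below are simplified stand-ins written for illustration, not the library's real `MessageBase` or `Message`, and `TeamsMessage` is a hypothetical name for a dedicated class:

```python
class MessageBase:
    """Stand-in base class holding the shared message behaviour."""

    def __init__(self, headers):
        self.headers = dict(headers)

    def header_names(self):
        return sorted(self.headers)


class Message(MessageBase):
    """Marker subclass: adds nothing, only identifies a regular message."""


class TeamsMessage(MessageBase):
    """Hypothetical dedicated class for the Teams flavour.

    It inherits everything from MessageBase, so format-specific
    fields could be added here later without touching Message.
    """
```

Because both subclasses share the base, callers can test `isinstance(obj, MessageBase)` for common behaviour and `isinstance(obj, TeamsMessage)` only where the flavour matters.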

@Witiko
Contributor Author

Witiko commented Oct 11, 2024

One way I tend to go about this is to try and open the file in outlook and print it if possible. If a print option works to bring up print preview, I take a look at what headers at the top appear and see if there are any that are different from regular outlook messages.

I took one of the client files at random and I opened it in Outlook from Office 365. Here is the redacted print preview:

image

The rest of the first printed page lists dozens of other recipients, followed by the text of the message on the second page:

image

Therefore, the only headers seem to be From, Date, and To.

Opening the same file with my patched version of extract_msg from PR #440 produces these same headers and also the standard header Message-Id, which Outlook does not seem to display:

$ python3 -m venv venv
$ source venv/bin/activate
(venv) $ pip install -U pip wheel setuptools
(venv) $ pip install git+https://github.com/Witiko/msg-extractor.git@feat/skypeteams-message
(venv) $ python3
Python 3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import extract_msg
>>> message = extract_msg.openMsg('client-file.msg')
>>> sorted([key for key, value in message.header.items() if value])
['Date', 'From', 'Message-Id', 'To']

There are also empty headers Authentication-Results, Bcc, and Cc:

>>> sorted(message.header.keys())
['Authentication-Results', 'Bcc', 'Cc', 'Date', 'From', 'Message-Id', 'To']
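
The two checks above can be combined into one helper. This is a sketch that assumes only that the header object exposes `items()`, as used in the session above; `split_headers` is a hypothetical name and the sample values are made up:

```python
from email.message import Message as EmailMessage

def split_headers(header):
    """Return (filled, empty): sorted header names with and without a value."""
    filled = sorted(key for key, value in header.items() if value)
    empty = sorted(key for key, value in header.items() if not value)
    return filled, empty

# Stand-in header object with made-up values, mirroring the names seen above.
sample = EmailMessage()
for name, value in [('From', 'sender@example.com'), ('To', 'rcpt@example.com'),
                    ('Date', 'Fri, 11 Oct 2024 10:00:00 +0000'),
                    ('Message-Id', '<id@example.com>'),
                    ('Cc', ''), ('Bcc', ''), ('Authentication-Results', '')]:
    sample[name] = value

filled, empty = split_headers(sample)
```

On a real file, `split_headers(message.header)` would give both lists in a single call.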

Does this seem like sufficient evidence that there are no special data entries, or would you like me to try something else?

@TheElementalOfDestruction
Collaborator

Aside from checking that all of that data appears in the body itself, that all looks good enough for me to consider it just a new flavor of Message and call it a day. If you discover a problem with that implementation, you can just submit a new pull request while I accept the original one.

@TheElementalOfDestruction
Collaborator

I just finished submitting the current code to a new release, 0.51.0. If everything looks good, you can close this issue.

@Witiko
Contributor Author

Witiko commented Oct 11, 2024

Looks good to me, thanks!
