Replace ctypes.DllGetClassObject and remove DllCanUnloadNow #127369

Open
encukou opened this issue Nov 28, 2024 · 12 comments

@encukou
Member

encukou commented Nov 28, 2024

As far as I can tell, these functions are hooks: third-party code is meant to replace them.

Their implementation in ctypes (i.e. their default behaviour) is to import and call the same-named functions from a third-party library, comtypes.server.inprocserver. This is not good: the standard library should not depend on a third-party package. comtypes should instead register its hook on import.

Here's a possible plan to make the API boundary better without breaking users.

DllCanUnloadNow

While the Python interpreter is running, it is not safe to unload the shared library that contains _ctypes. Therefore:

  • The C function DllCanUnloadNow exported from _ctypes should be changed to always return S_FALSE. We should change that now, without a deprecation period. (Note that the comtypes hook already does this.)
  • We should stop importing and calling comtypes.server.inprocserver. I'm not sure about the necessary deprecation period, but I think that it should be a non-breaking change and can also be done immediately. Or is someone relying on it for side effects? O_o
  • Setting and getting the hook should be deprecated. In about Python 3.18 we should stop calling it, and remove it.
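A minimal sketch of the first bullet, written as a Python-level function for readability; the actual change would be made in the C function exported from _ctypes. The constant values are the standard HRESULT codes; nothing here is real ctypes code.

```python
S_OK = 0      # HRESULT meaning "yes, the DLL may be unloaded"
S_FALSE = 1   # HRESULT meaning "no, keep the DLL loaded"


def DllCanUnloadNow():
    """Always refuse unloading while the Python interpreter is running."""
    return S_FALSE
```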

DllGetClassObject

This one, on the other hand, sounds like a useful hook. It also looks like an in-process COM server needs a special build, so it's not useful to allow multiple hooks -- replacing a global one is enough. Is that so?
If yes:

  • ctypes.DllGetClassObject (the default implementation) should raise a DeprecationWarning. In about Python 3.18, it should be changed to do nothing except return CLASS_E_CLASSNOTAVAILABLE.
  • comtypes should be changed: on import, it should replace ctypes.DllGetClassObject with its own hook.

This should ensure that old versions of comtypes still work as before (until after the deprecation period).

Does that sound reasonable?
cc @junkmd

@zooba
Member

zooba commented Nov 28, 2024

I suspect these may be here because comtypes calls back through ctypes, and so to COM it looks like all the calls are coming from _ctypes.pyd? I've never noticed them in our code before (all my COM work with Python has been in my own extension modules).

No doubt it's a useful and important hack at some point, but without a more concrete use case they're probably not so necessary.

Changing default behaviour to not import comtypes is a good move, but it may not be possible for comtypes to hook them itself if it doesn't get imported before these are called. DllGetClassObject in particular is meant as an entry point, but I'm not sure how much gets executed before it may be called. It probably depends on how Python is registered as a local server, which is something that's definitely outside of our upstream support, and so I think it's reasonable for comtypes (or whoever) to figure out a preferred way to set that up.

If comtypes doesn't need these, then I'd prefer to deprecate and remove. I guess that's what @junkmd can tell us.

@junkmd
Contributor

junkmd commented Nov 30, 2024

For those interested in this discussion, I would like to introduce some technical references related to DllCanUnloadNow and DllGetClassObject.

A notable reference book on OLE and related COM topics is "Inside OLE, 2nd Edition, Kraig Brockschmidt, Microsoft Press, 1995, ISBN: 1-55615-843-2".
The author, kraigb, has made the book publicly available under the MIT License.

Regarding DllGetClassObject, page 183 of the PDF states the following:

In-Process Server

Every DLL server must implement and export a function named DllGetClassObject with the following form:

STDAPI DllGetClassObject(REFCLSID rclsid, REFIID riid, void **ppv);

When a client asks COM to create an object and COM finds that an in-process server is available, COM will pull the DLL into memory with the COM API function CoLoadLibrary. COM then calls GetProcAddress looking for DllGetClassObject; if successful, it calls DllGetClassObject, passing the same CLSID and IID that the client passed to COM. This function creates a class factory for the CLSID and returns the appropriate interface pointer for the requested IID, usually IClassFactory or IClassFactory2, although the design of this function allows new interfaces to be used in the future. No straitjackets here.

DllGetClassObject is structurally similar to IClassFactory::CreateInstance, and as we'll see later in the sample code for this chapter, the two functions are almost identical: the difference is that CreateInstance creates the component's root object, whereas DllGetClassObject creates the class factory. Both query whatever object they create for the appropriate interface pointer to return, which conveniently calls AddRef as well.

Because DllGetClassObject is passed a CLSID, a single DLL server can provide different class factories for any number of different classes; that is, a single module can be the server for any number of component types. The OLE DLL is itself an example of such a server; it provides most of the internally used object classes of OLE from one DLL.

Be Sure to Export DllGetClassObject

When creating an in-process server or handler, be sure to export DllGetClassObject as well as DllCanUnloadNow. (See "Unloading Mechanisms" later in this chapter.) Failure to do so will cause a myriad of really strange and utterly confusing bugs. I guarantee that you'll hate tracking these bugs down. Save yourself the trouble and write yourself a really big, hot-pink fluorescent Post-it note and stick it in the middle of your monitor so you'll remember.
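The CLSID-based dispatch described in the quoted passage can be modelled in a few lines of Python. This is only an illustration of the idea, not COM code: CLSIDs are plain strings, the `riid` argument is left out, a list stands in for the `ppv` out-parameter, and `register_class` and `dll_get_class_object` are hypothetical names.

```python
S_OK = 0
CLASS_E_CLASSNOTAVAILABLE = -2147221231  # 0x80040111 as a signed 32-bit int

# Hypothetical registry: CLSID string -> callable that builds a class factory.
_class_factories = {}


def register_class(clsid, make_factory):
    """Associate a CLSID with the callable that creates its class factory."""
    _class_factories[clsid.upper()] = make_factory


def dll_get_class_object(clsid, out):
    """Find the factory for `clsid` and append it to `out` (stand-in for ppv)."""
    make_factory = _class_factories.get(clsid.upper())
    if make_factory is None:
        return CLASS_E_CLASSNOTAVAILABLE
    out.append(make_factory())
    return S_OK
```

One module can serve any number of classes this way: each registered CLSID gets its own factory, and unknown CLSIDs fall through to the error code.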

Regarding DllCanUnloadNow, page 186 of the PDF states the following:

In-Process Server

Being rather passive, the DLL unloading mechanism is fairly trivial. Every now and then (primarily when a client calls the function CoFreeUnusedLibraries) COM attempts to call an exported function named DllCanUnloadNow in the same manner that it calls DllGetClassObject. This function takes no arguments and returns an HRESULT:

STDAPI DllCanUnloadNow(void)

When COM calls this function, it is essentially asking "Can I unload you now?" DllCanUnloadNow returns S_FALSE if any objects or any locks exist, in which case COM doesn't do anything more. If there are no locks or objects, the function returns NOERROR (or S_OK), and COM follows by calling CoFreeLibrary to reverse the CoLoadLibrary function call that COM used to load the DLL in the first place.

Note: Early 16-bit versions of OLE did not implement the CoFreeUnusedLibraries, so DllCanUnloadNow was never called. This led to the belief that it was not necessary to implement this function or IClassFactory::LockServer because the DLL would stay in memory for the duration of the client process. This is no longer true because the COM function is fully functional and will call all DLLs in response. To support proper unloading, you must always implement IClassFactory::LockServer as well as DllCanUnloadNow.

A DLL's response to a DllCanUnloadNow call does not take into consideration the existence or reference counts on any class factory object. Such references are not sufficient to keep a server in memory, which is why LockServer exists. To be perfectly honest, an in-process server could include a count of class factories in its global object count if it wanted to, but this doesn't work with a local server, as the following section illustrates.
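The counting rule in the quoted passage can be sketched as follows. `ServerState` and its attribute names are hypothetical; the sketch only models the bookkeeping (objects and locks count, class factories do not) and makes no COM calls.

```python
S_OK = 0
S_FALSE = 1


class ServerState:
    """Counters an in-process server would keep (illustrative only)."""

    def __init__(self):
        self.object_count = 0  # live COM objects handed out to clients
        self.lock_count = 0    # outstanding LockServer(TRUE) calls

    def lock_server(self, lock):
        # Mirrors IClassFactory::LockServer(fLock): TRUE adds a lock, FALSE removes one.
        self.lock_count += 1 if lock else -1
        return S_OK

    def can_unload_now(self):
        # S_FALSE while any object or lock exists, S_OK otherwise.
        if self.object_count or self.lock_count:
            return S_FALSE
        return S_OK
```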

@junkmd
Contributor

junkmd commented Nov 30, 2024

Below is documentation for using comtypes (as of version 0.5) ¹⁾ as a COM server:
https://pythonhosted.org/comtypes/server.html


¹⁾ The comtypes documentation has not been updated for many years. This is due to a combination of factors: pythonhosted can no longer be updated; I am not yet well-versed in Sphinx; I am unfamiliar with how to migrate documentation to hosting platforms like ReadTheDocs; and I lack case studies on how external collaborators without ownership, like myself, can work with other maintainers to maintain documentation. One reason I started contributing to the cpython documentation was to familiarize myself with modern Sphinx practices.

@junkmd

This comment was marked as outdated.

@junkmd
Contributor

junkmd commented Nov 30, 2024

Here are my current thoughts on this:

  1. Change the implementation of these ctypes APIs so they no longer depend on comtypes.
  2. Keep these APIs but modify them to make the hooks in ctypes easier for third-party libraries and applications to use.
  3. In any case, implement a COM server using ctypes to verify its behavior during the process.

I am not against changing ctypes to remove its dependency on comtypes. Considering the proper dependency relationship, the current implementation is suboptimal.

However, just as COM interfaces can be implemented by projects other than comtypes, changes to ctypes should leave room for implementing COM servers without relying on comtypes.

I think the hooks defined in ctypes and the callbacks defined in _ctypes (DllCanUnloadNow and DllGetClassObject) should remain.
I suspect these are essential when implementing a COM server using ctypes.

Regarding hook handling, I propose creating a relationship similar to sys.excepthook and sys.__excepthook__.
An implementation like the one below would let application developers easily specify the behavior invoked by the hooks when combining ctypes with COM packages such as comtypes.
It would also make it possible to call the original function after it has been overridden.

def DllCanUnloadNow():
    return __dll_can_unload_now__()


def __dll_can_unload_now__():
    ...


def DllGetClassObject(rclsid, riid, ppv):
    return __dll_get_class_object__(rclsid, riid, ppv)


def __dll_get_class_object__(rclsid, riid, ppv):
    ...
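For comparison, this is how the existing sys.excepthook / sys.__excepthook__ pair is typically used: a replacement hook does its own work and then defers to the preserved original. `logging_excepthook` and `seen` are illustrative names; the proposed `__dll_get_class_object__` would play the role that `sys.__excepthook__` plays here.

```python
import sys

seen = []  # exception type names observed by the custom hook


def logging_excepthook(exc_type, exc, tb):
    """Custom hook: record the exception, then defer to the original."""
    seen.append(exc_type.__name__)
    sys.__excepthook__(exc_type, exc, tb)


previous_hook = sys.excepthook
sys.excepthook = logging_excepthook    # install the replacement
try:
    # Invoke the hook by hand, as the interpreter would for an uncaught error.
    sys.excepthook(ValueError, ValueError("demo"), None)
finally:
    sys.excepthook = previous_hook     # restore whatever was installed before
```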

The main challenge is that we have yet to find modern use cases for implementing COM servers using ctypes or comtypes.

To make production changes, I believe comprehensive testing that covers actual use cases is essential.

The comtypes community likely includes developers who have implemented COM servers using comtypes or even developers familiar with implementing COM servers in languages other than Python.

If agreeable, I plan to reach out to them and invite them to participate in this discussion to provide their insights and cooperation.
I am also reaching out to developers interested in this topic in enthought/comtypes#671, but directly contacting developers who could be stakeholders in this change would be more effective.

cc: @encukou, @zooba

@zooba
Member

zooba commented Dec 2, 2024

> The main challenge is that we have yet to find modern use cases for implementing COM servers using ctypes or comtypes.

Agreed. There is no shortage of use cases for using COM servers, but very few that require implementing them (especially when they're then going to be activated through COM's global interfaces, as opposed to being directly passed in).

I assume this only applies to in-process activation as well? If you are registering a COM server for out-of-proc activation then I'm pretty sure ctypes isn't going to work anyway. So it's really only able to handle in-process activation, which is also not widely used.

If anyone has any such cases for implementing/registering a COM server in Python, modern or legacy, it would be helpful to list them here.

If we can't find sufficient modern cases to motivate this (and I emphasise modern here because we need to justify users who can update their CPython to a later version but somehow can't modify their own code or update other parts of their system - legacy cases where nobody has touched it in 20 years don't count, because you couldn't update CPython in that case), then I think deprecation and complete removal is on the table.

@encukou
Member Author

encukou commented Dec 2, 2024

> I suspect these may be here because comtypes calls back through ctypes, and so to COM it looks like all the calls are coming from _ctypes.pyd? I've never noticed them in our code before (all my COM work with Python has been in my own extension modules).

But, it's not safe to unload _ctypes.pyd (or any other extension's DLL) while a Python interpreter is running. So, we should make the DllCanUnloadNow hook a no-op.

> Regarding hook handling, I propose creating a relationship similar to sys.excepthook and sys.__excepthook__.

IMO, that would make sense if __dll_get_class_object__ was complicated. But here the base implementation would be two lines:

def __dll_get_class_object__(rclsid, riid, ppv):
    return -2147221231  # CLASS_E_CLASSNOTAVAILABLE

(There's a magic number because ctypes doesn't have a good way to work with HRESULT error constants. But a library dealing with COM should make them easy to work with.)
It could even be no lines -- the C function can do this in case the Python hook does not exist at all.
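The magic number above is the HRESULT 0x80040111 reinterpreted as a signed 32-bit integer. A small helper (the name `hresult` is hypothetical, not part of ctypes) makes the relationship explicit:

```python
import ctypes


def hresult(unsigned_code):
    """Reinterpret an unsigned 32-bit HRESULT as a signed 32-bit integer."""
    return ctypes.c_int32(unsigned_code).value


CLASS_E_CLASSNOTAVAILABLE = hresult(0x80040111)
print(CLASS_E_CLASSNOTAVAILABLE)  # -2147221231
```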


If we need something more complex than a replaceable hook, I was thinking about a list, named for example ctypes.DllGetClassObject_hooks, to which users could add their hooks. The C function DllGetClassObject would call Python functions in this list in order, until one returns something other than CLASS_E_CLASSNOTAVAILABLE. It would return the result of the last call.

(We could instead make the list internal, and add a “register” function, but then we'd also need “unregister” and “clear” and so on -- a list sounds more convenient.)
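A sketch of the list-based dispatch, in Python rather than the C it would actually be written in. The list name follows the suggestion above and is not an existing ctypes attribute; `call_hooks` is a hypothetical stand-in for the exported C function. Returning CLASS_E_CLASSNOTAVAILABLE for an empty list is an assumption.

```python
CLASS_E_CLASSNOTAVAILABLE = -2147221231  # 0x80040111 as a signed 32-bit int

# Name taken from the proposal; not a real ctypes attribute.
DllGetClassObject_hooks = []


def call_hooks(rclsid, riid, ppv):
    """Try each hook in order; stop at the first that recognises the class."""
    result = CLASS_E_CLASSNOTAVAILABLE  # assumed result when no hook is registered
    for hook in DllGetClassObject_hooks:
        result = hook(rclsid, riid, ppv)
        if result != CLASS_E_CLASSNOTAVAILABLE:
            break
    return result
```

Since the hooks live in an ordinary list, users can append, remove, or clear entries without any extra register/unregister API.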

Unlike with sys.excepthook, it makes sense to me that there could be multiple extensions that want to export COM objects. But, I have no idea if it's even something to keep in mind as a future possibility.

@bennyrowland

I am one of the developers using comtypes to make servers rather than clients that @junkmd asked to weigh in on this discussion. I was going to say that I know absolutely nothing about this subject because I have only used local servers rather than inproc ones, but that felt a bit underwhelming, so I have done a bit of extra digging to try and understand at least a little bit that I might be able to contribute, apologies if none of this is new or interesting.

Local servers made intuitive sense to me because registering them just sets the Python executable and a script to run when the server is created, but it didn't make sense to me how an inproc server could work - how would the interpreter get set up, etc? So I looked at the comtypes registration code for inproc servers here which requires a frozendllhandle attribute to exist to register as inproc.

From somewhere in my distant memories that reminded me that comtypes had been associated with py2exe, so I checked that out and sure enough turned up this page on how to generate a COM DLL server. Some further digging through the code turned up the file source/run_ctypes_dll.c and particularly these lines. Basically what Py2exe is doing is compiling this C file into a DLL and embedding a Python interpreter into it to run the comtypes server code. As the DLL being used to provide the inproc server it has to implement DllGetClassObject() and DllCanUnloadNow, but it does so by loading the implementations out of ctypes (which of course then goes on to get them from comtypes).

I know that @theller was heavily involved in the early days of ctypes, comtypes, and py2exe and could probably add more insight into this situation, but I assume that the issue is that because comtypes is a pure Python package it doesn't have a DLL which could be imported by run_ctypes_dll.c, so it was convenient to make use of _ctypes.pyd to provide exported C functions which call the Python functions in turn.

From my perspective, there doesn't seem to be any reason to keep these functions in ctypes at all. Although they are nominally standard COM functions as explained above by @junkmd, there is no process to invoke them except via Py2exe and run_ctypes_dll.c. It would therefore seem much more logical to move the logic contained in the callbacks directly to Py2exe, but rather than invoking ctypes.DllGetClassObject it could directly invoke comtypes.DllGetClassObject (comtypes is guaranteed to be installed via Py2exe's freezing process anyway). Arguably, the inproc server code in comtypes should even be moved into Py2exe anyway given that it can only be used via Py2exe in the first place.

I will confess that I personally have little appetite for taking on the responsibility of updating Py2exe to no longer depend on these functions from ctypes. I have never used Py2exe, and it is sufficiently complex (and insufficiently documented) that I don't fancy setting up the necessary development environment to get it building. But it is actively being developed and I think that the actual fix is probably fairly simple for a maintainer to implement (mainly just copying code from ctypes/callbacks.c). I think that with a fairly long deprecation cycle before removing them from ctypes, combined with a bug report to Py2exe, anybody that is interested in using inproc servers in this way will have plenty of opportunity to implement the fix.

@encukou
Member Author

encukou commented Dec 3, 2024

It seems that py2exe has special support for comtypes.server.inprocserver that it runs in a special ctypes_comdll target. It's not clear to me what an “inprocess COM server” is; is this it?

If it is, looks like it will call comtypes.server.register.register before Windows will make any calls to DllGetClassObject, so that's a function where comtypes can ensure that its hook is installed.

@bennyrowland

@encukou an inproc COM server is one which is contained in a DLL and so can be instantiated in the calling process, and COM objects so produced can be used directly via function pointers. In the more common case (at least in my experience) the COM server is a separate process and methods are called via RPC with Windows messages.

There are different hooks for different purposes in the DLL. DllRegisterServer() and DllUnregisterServer() are called by regsvr32 in order to create the relevant registry entries to allow the server to then be discoverable by potential clients (or to remove the entries, respectively). These functions would not normally be called when a registered server is being instantiated - that is when DllGetClassObject() is called.

I think the most important point to emphasise is that the DllGetClassObject hook which Windows calls is in the DLL which is registered for the inproc server. This is not the _ctypes.pyd but rather the custom DLL created by Py2exe. The DllGetClassObject in ctypes is therefore not actually a hook. AFAICT there is no reason for any of this code to be in ctypes at all - all of these functions in ctypes will only ever be called from Py2exe DLLs, there is no other mechanism to use them, so they really belong in Py2exe.

You can see here that the Py2exe DLL implementations of DllRegisterServer() and DllUnregisterServer() call into the Python interpreter that is provisioned in DllMain(), which has already imported boot_ctypes_com_server.py and so can directly call the Python versions of the functions here. For reasons that are not clear to me, the Py2exe DLL does not do the same thing for DllGetClassObject(), but instead imports the C functions from _ctypes.pyd, which then use the same paradigm of calling into the Python code in ctypes. I think it should be very easy to move the Python DllGetClassObject() and DllCanUnloadNow() code from ctypes into boot_ctypes_com_server.py and then just change the way that the Py2exe DLL forwards the calls to match Dll(Un)RegisterServer().

In fact, I think there is definitely a case to be made for moving all of comtypes.server.inprocserver into Py2exe. The implementation in comtypes is custom designed for Py2exe to use (and cannot be used in any other way), and there is plenty of messy stuff like this where Py2exe is overwriting a private comtypes member because really all that logic belongs in Py2exe.

@junkmd
Contributor

junkmd commented Dec 4, 2024

Based on previous discussions, I think the ideal approach is to implement the hooks directly in freezing tools like py2exe, with ctypes and comtypes providing supporting utility functions as needed.

Considering cases where COM servers are implemented with win32com from pywin32, it is neither reasonable nor feasible for ctypes or comtypes to take responsibility for properly handling inproc COM server hooks for every possible COM implementation.

Moreover, as seen in py2exe/py2exe#24 (and mhammond/pywin32#868), it seems that registering COM servers with py2exe does not work well for either win32com or comtypes.

It might also be worthwhile to invite the maintainers of py2exe to participate in this discussion to help investigate the technical challenges and determine the required deprecation timeline (i.e., the duration of the deprecation period to be set).

@encukou
Member Author

encukou commented Dec 4, 2024

I opened py2exe/py2exe#217.
