-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Creds serialization and deserialization #45
Conversation
|
The bulk of the time in CI is compiling all the macOS wheels. In the past there was no arm macOS image to use so compiling was quite time consuming. I believe GHA now has arm images we can use so there's probably some work that can be done to improve those times. Still I'm not seeing 55 minutes, the summary from the latest run is 10 minutes 22 seconds. Are you saying that adding that assertion line causes CI to run for that time? If so you might have to investigate what that is doing and why it might be taking so long locally. |
Jordan, I appreciate your responsiveness and apologize for not being clear enough. The CI duration is indeed shown as 10m 22s (astronomical time between start and end?). Aren't these 55 minutes representing the total computational cost billed by Github? Many thanks |
you'd probably want to cut away the unsupported Python 3.7 right now, and plan for 3.8 EOL after October 2024. |
Ah no worries, thanks for your concern and thinking about this. I haven’t hit the limits, or received notifications that I’m close to them, for the free open source tier so commit away. I believe public repos don’t even have limits but I’m not 100% confident on that. I probably should look at making the build process a bit more efficient but for now it is what it is. |
I added some more test assertions. It appears that Python (3.11 in my case) is interning byte sequences returned by krb5 calls, at least the deserialized ticket byte object appears the same as it was before serialization. However I assume the initial copies still remain in the runtime process memory and will get freed eventually in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing this and adding the tests. I only have two things that really need addressing;
- Keeping the API aligned with the
krb5_*
function names - Looking at a different free/deallocation option for unmarshal_credentials
err = krb5_unmarshal_credentials(context.raw, &data_raw, &creds_raw) | ||
|
||
if creds_raw: | ||
# creds_raw was calloc'ed for krb5_creds structure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm somewhat weary about using free
here explicitly. The docs for krb5_unmarshal_credentials state that
Use krb5_free_creds() to free creds_out when it is no longer needed.
I can see the actual implementation essentially calls krb5_free_creds_content
and then calls free
. https://github.com/krb5/krb5/blob/354f176ba6d6cc544e1c15712a13f9c006ca605d/src/lib/krb5/krb/kfree.c#L213-L219
While this replicates that same logic that's more of an implementation detail and not a public API. We should try to avoid freeing data using free
if we don't control how it was allocated.
Instead I'm wondering if we should add another cdef
field on Creds
that can be used to control how the struct is deallocated to work with this new scenario. For example (the field it actually defined in the pxd file, the comment is more for documentation)
cdef class Creds:
# cdef Context ctx
# cdef krb5_creds raw
# cdef int needs_free
# cdef int free_struct
def __dealloc__(Creds self):
if self.needs_free:
if self.free_struct:
krb5_free_creds(self.ctx.raw, &self.raw)
else:
krb5_free_cred_contents(self.ctx.raw, &self.raw)
self.needs_free = 0
The krb5_free_creds
would also need to be defined in the file but seems like Heimdal has it so it shouldn't be a problem.
The other benefit of this is we don't need to copy the data we can just do something like
creds = Creds(context)
...
err = krb5_unmarshal_credentials(context.raw, &data_raw, &creds.raw)
if err:
raise Krb5Error(context, err)
creds.free_struct = 1
creds.needs_free = 1
return creds
Do you see any problem with this approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree I stepped onto thin ice here... The point is a pointer-to-pointer used by krb5_unmarshal_credentials()
, it unconditioanaly allocates the structure using its k5alloc()
wrapper and writes the allocated pointer to creds_out
. There is just no way to not allocate some heap memory, so it has to be free'd.
There are might be two ways to avoid shortcuts like mine, right after krb5_unmarshal_credentials()
- Deep copy the new creds to our
self.raw
and deallocate everything just returned withkrb5_free_creds()
as original API prescribes. Our existingraw
will contain a deep copy to be free'd withkrb5_free_cred_contents()
in__deallocate__
. - Shallow copy the new creds to our
self.raw
, save the alloc'ed pointer as_raw_ptr
and indeed callkrb5_free_creds
instead ofkrb5_free_cred_contents
in__deallocate__
. Our existingraw
will remain the shallow copy to keep other things working.
The problem with (1) is that deep copy function krb5_copy_creds()
behaves the same as krb5_unmarshal_credentials()
and allocates the new structure.
Are you ok with having additional _raw_ptr
member in Creds
class?
Or any better ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, sorry I didn't see it was a krb5_creds**
argument. I thought it was just krb5_creds*
, now the API makes more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I might revisit this in the future to try and avoid the memcpy
and just handle raw
based on whether the underlying Creds
class was created with the krb5_creds
or krb5_creds*
value. That's a pretty large change as it'll require changes to everything that calls creds.raw
so best to leave it to another PR.
But that shouldn't hold off this PR as what you have works and passes in CI. Appreciate you working on this and contributing to this library.
Creds
(de)serialization to/from the FILE ccache format (vesion 4) and KCM ccache protocol.Use case: loading/storing of
Creds
objects in implementation-external cache memory, with TTL aligned to ticket validity.MIT only:
krb5_marshal_credentials
krb5_unmarshal_credentials
Example usage: