Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaw in recursive conversion in gap_to_julia #275

Open
fingolfin opened this issue Jul 8, 2019 · 5 comments · May be fixed by #777
Open

Flaw in recursive conversion in gap_to_julia #275

fingolfin opened this issue Jul 8, 2019 · 5 comments · May be fixed by #777
Labels
kind: bug Something isn't working topic: conversion Issue related to conversion

Comments

@fingolfin
Copy link
Member

This is a hypothetical bug, though I am fairly certain one can construct actual problematic examples: gap_to_julia caches conversion results in an IdDict with name recursion_dict. This is meant to ensure that if an object occurs multiple times in the input, it is each time mapped to the same / identical object in the output.

But at the same time, we allow the user some degree of control over outputs, and so in some cases a GAP Obj may not be converted at all (i.e., left as a Julia object of type MPtr), while in other cases it could be converted to something else, e.g. a Julia array. Now it is conceivable that for some inputs, the same object is converted twice, but in each case is supposed to be converted to different output types. E.g. perhaps we ask Julia to produce a Tuple{MPtr, String} and give it as input a GAP list [ x, x ] where x is a GAP strings.

I think (w/o having tried it) that in this case, things will go wrong: because we end up storing in recursion_dict that x is mapped to itself (because for the first tuple element, we asked it to be left as a GAP string), and then when it comes to set the second entry, we are supposed to convert the GAP string to a Julia string -- but because a conversion for x is already cached in recursion_dict, we don't, and so instead return the GAP string once again.

Trying to do this (see below) right now actually does not produce a failure. But that's just because our tuple conversion function is buggy and does not use recursion_dict :-). But note that this problem is not necessarily exclusive to tuples.

julia> l = GAP.EvalString("[\"test\",~[1]]")
GAP: [ "test", "test" ]

julia> l[1]
GAP: "test"

julia> l[1] === l[2]
true

julia> objectid(l[1])
0xa7af7c8f44c756ec

julia> objectid(l[2])
0xa7af7c8f44c756ec

julia> GAP.gap_to_julia(Tuple{GapObj, GapObj}, l)
(GAP: "test", GAP: "test")

julia> GAP.gap_to_julia(Tuple{GapObj, String}, l)
(GAP: "test", "test")

To fix this, I think we need to cache not just the object we converted, but also the type we asked it to be converted to. But that opens a new can of works: If we once ask the object to be converted to String and once AbstractString, we may very well get two objects of type String. But should the conversion result here be two identical Julia strings, or not? If yes: how do we implement this?

In any case, it seems we really really need to document the (intended) semantics of this, and add far more tests to check for other such problems systematically.

@fingolfin fingolfin added the kind: bug Something isn't working label Jul 8, 2019
@fingolfin fingolfin added the topic: conversion Issue related to conversion label Jan 22, 2020
@fingolfin
Copy link
Member Author

Here is an updated version of the example which actually demonstrates the issue (converting to String is a bit misleading for various reasons).

julia> l = @gap """ ["test", ~[1]] """
GAP: [ "test", "test" ]

julia> m = GAP.gap_to_julia(Tuple{Vector{Char}, NTuple{4,Char}}, l)
ERROR: MethodError: Cannot `convert` an object of type Array{Char,1} to an object of type NTuple{4,Char}
Closest candidates are:
  convert(::Type{T}, ::T) where T<:Tuple{Any,Vararg{Any,N} where N} at essentials.jl:309
  convert(::Type{T}, ::Tuple{Any,Vararg{Any,N} where N}) where T<:Tuple{Any,Vararg{Any,N} where N} at essentials.jl:310
  convert(::Type{T}, ::CartesianIndex) where T<:Tuple at multidimensional.jl:134
  ...
Stacktrace:
 [1] _totuple at ./tuple.jl:243 [inlined] (repeats 2 times)
 [2] Tuple{Array{Char,1},NTuple{4,Char}}(::Array{Array{Char,1},1}) at ./tuple.jl:230
 [3] gap_to_julia(::Type{Tuple{Array{Char,1},NTuple{4,Char}}}, ::Main.ForeignGAP.MPtr, ::IdDict{Any,Any}; recursive::Bool) at /Users/mhorn/Projekte/OSCAR/GAP.jl/src/gap_to_julia.jl:255
 [4] gap_to_julia at /Users/mhorn/Projekte/OSCAR/GAP.jl/src/gap_to_julia.jl:245 [inlined] (repeats 2 times)
 [5] top-level scope at REPL[32]:1

So what happens here is that we convert l[1] happily to a Vector{Char}. But then when we want to produce an NTuple{4,Char}, we fetch that Vector{Char} from the conversion dict and try to return it -- then the code one level above wants to put it into the second slot of a 2-tuple, but it won't fit as we returned something of type Vector{Char} when we were asked to return something of type NTuple{4,Char}.

But at least we got an error, that's good! Perhaps we should change all conversion methods to use the ::T syntax to ensure they return something of the requested type, that should help catch similar issues in the future.

That would then actually be one way to safely resolve the issue, in that we provide minimal consistent behavior; in particular we remove buggy behavior, before people start relying on it.

Then if somebody requests it, we can think about whether we can allow some or all of these cases. Some approaches that come to mind:

  1. we add the target type to the lookup key; then all we guarantee is that every input pair type, obj will always produce the same (identical) output; but e.g. String, obj and AbstractString, obj might still return different objects (even though both might be of type String)
  2. as a variant, we could do the same as in 1., but as type we don't use the requested type, but rather the actual type of the object. The problem then becomes how to perform lookup? One way to solve this is by only allowing concrete (non-abstract) output types...
  3. alternatively, if want to allow abstract types: we could modify the recursion_dict to store lists of conversion results; then when a have a hit in recursion_dict, we iterate over all possible conversion results, and take the first which has a type that's a subtype of the given (abstract) type. Since most objects will only converted to one type, that shouldn't hurt performance in the general case too much.

@rfourquet
Copy link
Contributor

I would lean towards option 3., I think this should be almost free performance-wise, and probably doesn't add more complexity than other options.

@ThomasBreuer
Copy link
Member

Yes, I would also prefer option 3. I will provide a pull request.

@fingolfin
Copy link
Member Author

I wonder whether option 3 is really a good idea, though, and that it wouldn't produce weird or surprising results. I don't have an example ready, and they might in fact not exist, but I worry about something like the following:

julia> l = @gap """ [[1,2], ~[1], ~[1]] """
GAP: [ [ 1, 2 ], [ 1, 2 ], [ 1, 2 ] ]

julia> GAP.gap_to_julia(Tuple{UnitRange, StepRange, AbstractRange}, l)
... what should the 3rd output be? 

Of course this actually produces an error right now, as AbstractRange has no constructor. But e.g. AbstractVector has -- and I think there other concrete subtypes of AbstractArray than that. So, there will be multiple matches in recursion_dict. Of course we can just say we always take the first match (as I wrote), but it's now easy to imagine this leading to really surprising behavior, where changing the order of something produces rather different results.

This is kinda a very special situation; my point just is that once we started to accept this and process it in a certain way, people will eventually start to rely on it and then it'll become difficult to change again.

@rfourquet
Copy link
Contributor

That's a good point; it could be made into an error. But I now lean towards option 1. :)
2) has more guarantees but is more restrictive than 1). With 1), I can get the guarantee of 2) by just restricting the target types to concrete types. On the other hand, 2) is more conservative and can be evolved into 1) (without breaking) if there is request for it.

@fingolfin fingolfin linked a pull request Jan 24, 2022 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: bug Something isn't working topic: conversion Issue related to conversion
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants