-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interop between generated bindings #746
Comments
Approach 1: type_map a la ffigen pseudo-config
Approach 2: similar to above, but infer library names by package names.
Advantage: simpler config |
Cross linking the ffigen issue: cc @mannprerak2 any suggestions from thinking about this in the ffigen context? |
As detailed in #521 , We don't have transitive dependency resolution at least now. Only specified packages are traversed. So in the scenario, But yeah, the fundamental issue persists even if the dependencies aren't implicitly included. If In jni there's no structs, everything is a fragile, weakly typed pointer so you can cast. But for the fundamental issue, I think there's no solution. In such cases, manually refactoring |
Interaction with output formatWith a mapping from Java type to generated dart binding (both approaches), the API gets tied to whether we chose to generate single file or multi file. ConfigI like approach 2 better. I don't think we care about the imported library name in the generated file. How about we make the keys regexes or prefixes? That way we can match all types in a package: # Map<QualifiedJavaType, DartLibraryUrl>
imports:
'com.apache.pdfbox.*': 'package:pdfbox/pdfbox.dart' |
Agreed. That is the only way that this will work. (Same for ffigen.) |
I suppose regex is not required, if we specify
and jnigen needs to resolve it should first check for Any pitfalls of this approach? |
Yeah in the context of Java a prefix might make more sense than a regex. 👌 |
For ffigen, libclang provides USR (id), which are supposed to be consistent across translation unit (individual parsing). Not sure if this is useful for jnigen. |
@mannprerak2 It's same in Java actually. The fully qualified name is supposed to be same across all libraries loaded in single process. Does ffigen have any mechanism to export USRs automatically, or is it manually specified through type map? |
Not yet, But it should be relatively easy to add to ffigen. |
(note to self: USR = Unified Symbol Resolution) We should design a format. What information should be in there?
If we have multiple USIs for one Uri, we should probably nest the USIs in Uris.
symbols:
'package:pdfbox/pdfbox.dart':
- 'com.apache.pdfbox.*' or symbols:
'package:pdfbox/pdfbox.dart':
- 'com.apache.pdfbox.pdmodel.PDDocument'
# ... For jnigen it might be nicer to just do symbols:
'com.apache.pdfbox' : 'package:pdfbox/pdfbox.dart' But if we spell out all symbols symbols:
'com.apache.pdfbox.pdmodel.PDDocument' : 'package:pdfbox/pdfbox.dart'
'com.apache.pdfbox.pdmodel.Foo' : 'package:pdfbox/pdfbox.dart'
'com.apache.pdfbox.pdmodel.Bar' : 'package:pdfbox/pdfbox.dart' because we're unlikely to have multiple regexes in a single However, for ffigen, it is a bit more likely. One question is: should we output an exact list of symbols, or a regex (or prefix)? A prefix from the config in jnigen is probably fine, it is unlikely to collide with other names. However, in the ffigen context, taking the method/class include filters is likely to not work. So there we should probably output the full list of symbols. If we'd like to keep the ffigen and jnigen symbols:
# Map<DartPackageUri, List<UniqueSymbolIdentifer>>
'package:pdfbox/pdfbox.dart':
- 'com.apache.pdfbox.pdmodel.PDDocument'
- 'com.apache.pdfbox.pdmodel.Foo'
- 'com.apache.pdfbox.pdmodel.Bar'
# ... wdyt? |
In java, I am not familiar with USRs.
I think it should also be possible to semi-manually work around this, by having the user provide a mapping which asserts X.h in config2 is equivalent to X.h in config1. In the rest of this comment I am assuming we have a way to identify each binding symbol in string format, and there's some hierarchical structure. Fortunately this is almost always true in Java. Re: YAML formatI think it's logical to have a mapping from required symbol to required import and not vice versa, because we lookup by symbol, and it's generated file && read by program, even if it looks tidier to have a tree with each import listing what it provides. Of course you can convert to different internal representation with a simple for loop. So matter of preference. Should this file be human readable? Does this file's size matter? Re: lookupAssuming a hierarchical structure among bindings, i.e given an identifier you can tell it's parent (eg java package or header file), we can easily implement lookup in most-specific to most-general order. (As I gave an example above).
Assuming b encompasses all subpackages as well, a.b.c.D will match the above import spec. Maybe if we don't want this behavior, we can use a '/' or other symbol to explicitly specify that 'a.b' is recursively encompassing, and change the lookup code accordingly. That's not a problem. Now, assuming some classes were filtered out using config, we need to explicitly write the filtered out class with null mapping, so an explicitly skipped class a.b.X won't match the above spec. Code should stop at seeing null. There are more implementation details to still flesh out, but I think the basic idea is sufficient. I think regular expressions may be harder to do constant time lookup, unless maybe you sort the list of all regexes and binary search the ones closest to what you are looking for. with prefixes there will be a single digit number of combinations you have to check in a map. Is there any specific advantage of using regex in ffigen context? But yeah above all is leetcode. If it's only a machine read-write file I don't mind writing out every type either, it will still be smaller than generated binding file. And reading some JSON/YAML should be pretty fast compared to parsing C/Java. https://en.wikipedia.org/wiki/List_of_unsolved_problems_in_computer_science |
Yes to all. |
C libraries usually add a prefix to all the symbols in a library so as to prevent name collision. (E.g libclang uses Edit: If you meant just using it in symbol.yaml context, then I don't think regexp is of much use. |
Re: Re: YAML format
It makes it easier for people to understand the whole setup. Also, people will see it in diffs when committing and in PRs etc.
Likely not much, I presume that everything (including git) compresses files. And repeated strings are relatively easy to compress. However, more text also means it is hard for users to read through if trouble shooting is needed. So smaller is usually better.
Exactly. So I prefer optimizing for simple for humans to understand in diffs etc.
That would be slightly tidier than a map indeed. Something like symbols:
'package:pdfbox/pdfbox.dart':
'com.apache.pdfbox.pdmodel' :
'PDDocument' : true # equals 'com.apache.pdfbox.pdmodel.PDDocument' : true
'Foo' : true
'Bar' : true
'nested' :
'Baz' : true
'nested2' : true
# ... But that requires organizing all the symbols that jnigen binds to into a tree of common-prefixes. I'm not sure if that is worth it over a flat list: symbols:
'package:pdfbox/pdfbox.dart':
'com.apache.pdfbox.pdmodel.PDDocument' : true
'com.apache.pdfbox.pdmodel.Foo' : true
'com.apache.pdfbox.pdmodel.Bar' : true
# ... We'd like to have a map (or set) in the Dart runtime for quick lookup, but in yaml it just creates noise. So probably the list that I suggested is nicer. symbols:
'package:pdfbox/pdfbox.dart':
- 'com.apache.pdfbox.pdmodel.PDDocument'
- 'com.apache.pdfbox.pdmodel.Foo'
- 'com.apache.pdfbox.pdmodel.Bar'
# ... |
This is a reason to not use prefixes at all in our yaml file., but only generate mappings for the full path. |
Based on the discussions for ffigen:
format_version: 1.0.0
files:
"../generated/base_gen.dart" :
used_config:
<some-symbol-that-infuences-importing-bindings>: <some-value>
symbols:
'package:pdfbox/pdfbox.dart':
- 'com.apache.pdfbox.pdmodel.PDDocument'
- 'com.apache.pdfbox.pdmodel.Foo'
- 'com.apache.pdfbox.pdmodel.Bar' Note that compared to #539, we don't have to include a |
Yes. symbols can be renamed, both for conflict resolution and probably in future, manually. |
Few notes / questions:
So this means the above classes are in lib/src/ under java-esque folder tree, and imports should be resolved as such. I am still catching upto discussions on ffigen. I may be missing few points from the discussion. This also raises a question of lots of small imports vs few large imports (eg per package |
Okay, so we need name as well. (Oops, I copy pasted too much. We had two file paths in there.) I believe format_version: 1.0.0
files:
'package:pdfbox/pdfbox.dart': # Everything exported from some lib/src/generated/_all.dart
used_config: # Optional.
# Probably empty for now.
symbols:
'com.apache.pdfbox.pdmodel.PDDocument':
name : 'PDDocument'
'com.apache.pdfbox.pdmodel.Foo':
name : 'Foo'
'com.apache.pdfbox.pdmodel.Bar':
name : 'BarRenamed' format_version: 1.0.0
files:
'package:pdfbox/com/apache/pdfbox/pdmodel/PDDocument.dart': # Directly generated into the public API of the package.
symbols:
'com.apache.pdfbox.pdmodel.PDDocument':
name : 'PDDocument'
'package:pdfbox/com/apache/pdfbox/pdmodel/Foo.dart':
symbols:
'com.apache.pdfbox.pdmodel.Foo':
name : 'Foo'
'package:pdfbox/com/apache/pdfbox/pdmodel/Bar.dart':
symbols:
'com.apache.pdfbox.pdmodel.Bar':
name : 'BarRenamed' Both of these would work when being imported from another binding generation.
We don't have to, we can keep it simple, just list the full path of all files in the yaml as I did above. |
I'll try to specify everything here before implementing the feature. Generating the export fileYaml pathThe default path will be a top-level output:
export:
path: a_export.yaml Excluding classes to exportThis can be specified in export:
exclude:
- com.example.A
- com.example.B The export file's structureversion: 1.0.0
files:
'package:x/y.dart':
classes:
'com.example.A':
name: A
type_class: $AType
super_count: 2
type_params:
T:
'java.lang.Object': DECLARED
methods:
abc: 2 # in this case the method got the 2 suffix, so when extending this class, we need to make sure the corresponding method that overrides this, get the 2 suffix
ImportThis can be specified in import:
- 'package:x' # will look for the default top-level jnigen_export.yaml
- 'package:y/y_export.yaml'
- 'path/to/z_export.yaml' If the same class comes in a subsequent import, that one will be overriden with a warning log.
If the same class is generated by the library itself, that takes precedence over the imported ones. Granular control
import:
- 'package:java_collections' # even though this has a class for java.util.List
- 'package:jni': # this takes precedence
show: 'java.util.List' Or import:
- 'package:java_collections': # You can hide individual ones
hide: 'java.util.List' Am I missing something? Different structure or naming suggestions? |
That does mean that if the imported library has methods that take that class as an argument, they will still use the definition of that library. So we end up having to do casts on that boundary. Maybe it's better to have an error, and having to explicitly
We use "symbol-file" as the name for the yaml file in FFIgen. And Also, why have the package name in there? That is superfluous.
Oh I like! (Though we could consider not doing that and just doing the non-magic thing. E.g.
Are there any other things than (Just a large config for my own sake to see where imports fit in.) preamble: |
// ..
output:
c:
path: 'src/'
subdir: 'third_party/'
library_name: 'foo'
dart:
path: 'lib/src/third_party/'
symbols:
path: 'lib/symbols.yaml' # So that it gets a library uri.
classes:
- 'org.example.foo.Foo'
maven_downloads:
source_deps:
- 'org.example....'
import:
- 'package:java_collections/symbols.yaml' # Using a library uri. Again, 'symbols' might not be the right term to describe this. But 'import' and 'export' describes the action with the "thing" not the "thing", not the contents of that yaml file. If we have nothing else than 'classes', maybe it should be |
Good idea.
Agreed, I think that's better. No default and having to explicitly naming it is good.
Tangential: I think it's also useful to have some way of breaking down the generated "symbols" files into multiple ones, maybe adding an option to add sub-exports per generated library. So if a library generates the entire java stdlib, we could import only `package:java/util/util_symbols.yaml'. Also including multiple symbols files into one. Not necessary for closing this issue but could be useful in the future. |
That's a good idea. But indeed let's just build the simple thing first, and bump the version if we want to cover subsequent use cases. |
If this is relative path, you don't have to take package name I have been inclined against package name in config, although it might have made few things easier, because jnigen should work from anywhere in hierarchy, then you will be also adding the relative path to config from package root. (Correct me if I missed something).
Nice! |
It's better to change the format to json instead of yaml. The argument is the same as with |
My glorious infestation of yaml is coming to an end... |
We should support interop between 2 or more jnigen-generated packages.
_Originally posted by @mahesh-hegde in #761
The text was updated successfully, but these errors were encountered: