Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor to use the Spoon library #109

Merged
merged 33 commits into from
Mar 5, 2024
Merged

Refactor to use the Spoon library #109

merged 33 commits into from
Mar 5, 2024

Conversation

sanderploegsma
Copy link
Contributor

@sanderploegsma sanderploegsma commented Jan 26, 2024

This refactors the implementation of the Java representer, mainly by replacing the JavaParser library with the Spoon library. Main reason for doing so is that JavaParser makes it pretty hard to accurately track usage of methods, variables etc when renaming, while Spoon gives us this for free.

It also seems to clean up a little bit of complexity which is nice. I also added/updated CONTRIBUTING.md and README.md to contain information similar to their counterparts in the Java analyzer.

Next to that, this PR adds the following fixes/improvements:

Fix recursion in placeholder generation

The previous implementation apparently contained a bug where it would create new placeholders for existing ones, probably because the normalizer makes changes to the representation in-place while walking the AST. This causes an interesting side-effect to say the least; and solving it brings us one step closer to the normalized representation still being valid Java.

The bug was easily spotted in tests/multiple-solution-files/expected_mapping.json, as the mapping contains entries mapping one placeholder to another.

Write output to given output folder instead of solution folder

For some reason this tool was still writing the output to the solution folder instead of to the output folder passed as CLI argument. This is now fixed.

I also removed the validation logic used to check whether the CLI arguments for the input and output folders end with a /, as the implementation no longer relies on that being the case.

Fix error in logging output

Previously the representer output contained a large error on startup:

ERROR StatusConsoleListener Could not determine local host name
 java.net.UnknownHostException: 2bc26557f4b6: 2bc26557f4b6: Temporary failure in name resolution
	at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1936)
	at org.apache.logging.log4j.core.util.NetUtils.getLocalHostname(NetUtils.java:56)
	at org.apache.logging.log4j.core.LoggerContext.lambda$setConfiguration$0(LoggerContext.java:625)
	at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
	at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:625)
	at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:713)
	at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:735)
	at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:260)
	at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:154)
	at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:46)
	at org.apache.logging.log4j.LogManager.getContext(LogManager.java:197)
	at org.apache.logging.log4j.LogManager.getLogger(LogManager.java:611)
	at representer.RepresenterCli.<clinit>(RepresenterCli.java:21)
Caused by: java.net.UnknownHostException: 2bc26557f4b6: Temporary failure in name resolution
	at java.base/java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
	at java.base/java.net.Inet4AddressImpl.lookupAllHostAddr(Inet4AddressImpl.java:43)
	at java.base/java.net.InetAddress$PlatformResolver.lookupByName(InetAddress.java:1211)
	at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1828)
	at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:1139)
	at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1818)
	at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1931)
	... 12 more

This was being caused by the log4j library trying to determine the system host name in an environment without a network (Docker). I tried to see if I could suppress it by changing the logger output format, but they insist on resolving the host name even if it's not used in the log output. To that end, I replaced log4j with slf4j + logback and now the error is gone.

Ordered keys in mapping.json

While testing locally, it turned out that sometimes the order of the keys in mapping.json changes, causing the smoke tests to fail. So, now the mapping.json has keys ordered alphabetically. They are now also keyed by placeholder instead of by identifier, following the example in the docs.

Fixes #18
Fixes #19

@sanderploegsma sanderploegsma force-pushed the spoon branch 2 times, most recently from a61393c to 687a763 Compare January 29, 2024 10:20
…tch expressions

Note that support for Java 20 and 21 is not yet added to Spoon.
@sanderploegsma sanderploegsma marked this pull request as ready for review January 31, 2024 20:00
@sanderploegsma sanderploegsma requested a review from a team as a code owner January 31, 2024 20:00
Copy link
Member

@ErikSchierboom ErikSchierboom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanderploegsma Am I correct in seeing that the same code might get a different representation in this version?

@sanderploegsma
Copy link
Contributor Author

Yes, that is correct. The most notable changes are:

  • The amount of whitespace changes, specifically the blank lines before the first method and after the last method
  • The order in which placeholders are generated is probably different
  • Built-in types are printed as FQDNs: java.lang.String instead of `String. This one can be switched off though, to remain more backwards-compatible if we want.

Note however that a lot of changes are for the better. It also seems to fix a few bugs, like the Signal identifier here not being replaced originally: https://github.com/exercism/java-representer/pull/109/files#diff-1d9349fa320ad0502948e97c81e3d9301c7a37f47a44ad08193a9843d34d2167L11-L12

I get that under normal circumstances we would like to prevent the same code generating a different representation, but in this case we may want to accept that consequence. Or are there reasons to prevent this that I may not be aware of?

@ErikSchierboom
Copy link
Member

I get that under normal circumstances we would like to prevent the same code generating a different representation, but in this case we may want to accept that consequence. Or are there reasons to prevent this that I may not be aware of?

The main downside is that we have to re-run the representer on all existing solutions, which is a time-consuming and somewhat costly operation. Are you planning on doing more work on the representer? E.g. if you want to implement more representation changes it is best for us to wait for those to not having to re-run multiple times.

@sanderploegsma
Copy link
Contributor Author

sanderploegsma commented Feb 1, 2024

In that case we may want to hold off until a new version of Spoon is released that includes my fix for the Enum formatting: INRIA/spoon#5649.

After that, I don't expect too many new changes.

The main downside is that we have to re-run the representer on all existing solutions, which is a time-consuming and somewhat costly operation.

Oof, yeah I see. And is that something that happens automatically when updates to a representer are pushed? Or is it a manual trigger invoked by an admin in situations like these?

@ErikSchierboom
Copy link
Member

We manually invoke it, but once we merge, new solutions will start getting the new representation.

@sanderploegsma
Copy link
Contributor Author

@ErikSchierboom the enum formatting has been fixed, so IMO this PR is finished!

- No fully qualified types when printing
- Use AOSP style for formatting
@sanderploegsma
Copy link
Contributor Author

@ErikSchierboom could you also review this one?

@ErikSchierboom
Copy link
Member

@ErikSchierboom the enum formatting has been fixed, so IMO this PR is finished!

Before I approve, this means that this is the definitive version of the new representation, correct?

@sanderploegsma
Copy link
Contributor Author

@ErikSchierboom the enum formatting has been fixed, so IMO this PR is finished!

Before I approve, this means that this is the definitive version of the new representation, correct?

Yes, no more breaking changes 👍

@sanderploegsma sanderploegsma merged commit 6856315 into main Mar 5, 2024
2 checks passed
@sanderploegsma sanderploegsma deleted the spoon branch March 5, 2024 10:14
@ErikSchierboom
Copy link
Member

We'll re-run the representations shortly (it'll take a while).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
x:size/large Large amount of work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Analyze types of method parameters for substitution Analyze MyClass.class for substitution
3 participants