Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Which kinds of PR's would you like to get? #648

Closed
nikcio opened this issue Oct 11, 2022 · 36 comments
Closed

Which kinds of PR's would you like to get? #648

nikcio opened this issue Oct 11, 2022 · 36 comments
Labels
is:question Further information is requested

Comments

@nikcio
Copy link
Contributor

nikcio commented Oct 11, 2022

Hi I'm new to this repository.

Therefore to avoid PR spamming I would like to hear what kinds of PR's you'd like to get on this repository. In my own fork I've tried implementing the following tools/fixes/updates that might be useful to have in the main repository:

Other fixes/updates I have in mind

  • Fix to other vulnerabilities (SonarCloud reported)
  • Fix SonarCloud reported Bugs
  • Review and possibility fix security hotspots reported by SonarCloud
  • Fix code smells in the project
  • Into guide to setting up the project in VS 2022 (This was quite a challenge because of the old frameworks used. BTW I also experience VS tools/analyzers crashing when I open the project 😅 So this might also be a reason the repository doesn't get many new contributors)

Please let me know what you'd like PR's on.

@NightOwl888 @rclabo you guys seems to be some of the more active contributors 😄

@eladmarg
Copy link
Contributor

@nikcio , thank you so much for your efforts.

we have many issues because we wanted to stay as close as possible to the java implementation.
(in order to support easy upgrade path in the future)

this is why sonar cloud is finding so many issues.
so yes, packages upgrades are welcome, but if its not something critical (you shouldn't analyze the tests projects) I wouldn't change it.

@rclabo
Copy link
Contributor

rclabo commented Oct 11, 2022

Welcome. We are always looking to grow the LuceneNET developer community.

It's interesting that you had a lot of challenges getting the project up and running in VS 2022. I started with LuceneNET on VS 2019 and later upgraded to VS 2022 and don't recall having much issue, although the solution configuration is certainly much more complex than any solution/project files I've worked with in the past.

But maybe I did have some challenges and asked Shad for help and just forgot. At any rate, your recent experience is the best indicator of what it's like for a developer new to LuceneNET to try to pull it down and build it. I think creating an intro guide to setting up the project in VS 2022 would be a nice addition to the website.

I'll let @NightOwl888 reply to most of your suggestions as his responses will be much more thought out then mine ;-)

@rclabo
Copy link
Contributor

rclabo commented Oct 11, 2022

Wow, I just pulled down LuceneNET master and tried to build it in VS2022 for the first time in a few months. I totally get where you are coming from @nikcio Earlier this year it worked great for me in VS2022, but now it does not. I get this dialog:

image

And if I say that I want to update the framework and click continue, the website solution still does not load. It's as though it doesn't really update the framework used by the project. This problem didn't use to Exist in VS 2022. Apparently it's still not an issue in VS 2019. But of course I want to use VS 2022. @nikcio How did you resolve the issue?

@nikcio
Copy link
Contributor Author

nikcio commented Oct 11, 2022

@rclabo I think the dialog you're seeing is broken I saw it too and it doesn't seem to do anything. I think the issue is that VS 2022 has dropped support for the older versions of .NET v4 - 4.6.1 (I think). This leads me to the next issue. That VS 2022 has dropped support for older versions. Therefore, you can't use the VS installer to install the older versions on your system. Then to get the older versions you have to do a hacky workaround where you download the SDK from Nuget and move it into the correct folder. You can see this StackOverflow answer on how to: https://stackoverflow.com/questions/70022194/open-net-framework-4-5-project-in-vs-2022-is-there-any-workaround

I found I needed to install:

  • .Net 4
  • .Net 451
  • .Net 461

To have the project running.

@jeme
Copy link
Contributor

jeme commented Oct 12, 2022

@nikcio I think the VS2022 installer only installs .NET 4.8 by default when you choose workflows that require .NET, you need to select older SDK's if you require them under "Individual Components" in the installer. - There are SDK's down to 4.6.1.

Considering that 4.6.1 and older (with 3.5 SP 1 as an exception) are all EOL, I guess it makes sense they don't make them available.

@nikcio
Copy link
Contributor Author

nikcio commented Oct 12, 2022

@jeme It was the same conclusion I came to which lead me to the workaround above 😄

@jeme
Copy link
Contributor

jeme commented Oct 12, 2022

@nikcio Ahh, I read too fast and missed that you specified "older versions" as .NET v4 - 4.6.1 - Sorry.

@nikcio
Copy link
Contributor Author

nikcio commented Oct 14, 2022

Under "Other ways to help" in: https://github.com/apache/lucenenet/blob/master/CONTRIBUTING.md

There's written:

But I'm wondering does any form of document exist of files that has already been looked through? Or does files normally have a tell to show that they've been looked through?

@NightOwl888
Copy link
Contributor

Under "Other ways to help" in: https://github.com/apache/lucenenet/blob/master/CONTRIBUTING.md

There's written:

But I'm wondering does any form of document exist of files that has already been looked through? Or does files normally have a tell to show that they've been looked through?

A little history. For quite a while we had a few failing tests that were difficult to track down. This is primarily because of the randomized testing that is used in the tests and the fact that we don't have a 1:1 port of the test framework. So, to expedite the process of finding these bugs, we requested help reviewing the code line-by-line, since at the time it was easier to find them that way.

However, since then we have fixed the test framework so randomized tests can be repeated (thus debugged) in #547. This allowed us to finally track down several issues that were primarily hidden and caused intermittent test failures across several tests. As a result, at present there are no remaining test failures in the Lucene.Net (core) assembly, which was primarily what we wanted to address with this review.

We have also made reviewing tests a priority for certain older projects in #259, since during the port there were often missing pieces when working on a given code file and sometimes the solution was to comment out parts of the file to get it to compile. Some of these commented lines may still exist, meaning we may be missing test conditions. If you are looking for a way to narrow this down, I would suggest reviewing the remaining test projects in #259.

Do note that most of the projects under test that are not part of #259 were ported from 4.8.1 unless otherwise noted with a comment like // Lucene version compatibility level 8.2.0 at the top of the code file. We shouldn't be bringing in code from any other version elsewhere. When we upgrade the application, it will be all at once rather than one piece at a time.

@NightOwl888
Copy link
Contributor

NightOwl888 commented Oct 15, 2022

I have added some issues from the scans above, although, there is a lot in there and I could use some help reviewing and creating issues. Of course, if it is a change that would cause us to diverge greatly from the Java source, we don't want to do it. Some things that I noticed that we definitely do not want to change:

  1. goto statements / Refactor the containing loop to do more than one iteration - These are mostly workarounds for the lack of support for the labeled continue statement in .NET. This is the best way to keep the execution order the same in Java and .NET when stepping through the code.
  2. Cognitive Complexity - we want to remain in sync with Java Lucene here.
  3. Associated TODO - it is detecting the TODO statements from Lucene. All TODO statements that apply to Lucene.NET are marked with either LUCENENET TODO: or LUCENE TO-DO. If we can change the scan to look for those instead, it would be great. NOTE: A PR that would be accepted would be to change them all (except for the plain TODOs from Lucene) to be LUCENENET TODO.
  4. Remove this commented out code. - If the commented code existed in Lucene, we want to keep it. If not, please point it out so we can decide case-by-case.
  5. Rename class 'SOMEClass' to match pascal case naming rules, consider using 'SomeClass'. - We are generally better off using the original class names.
  6. Remove the unnecessary Boolean literal(s) - If these exist in Lucene, we should keep them.

Some things we can change, but are not that critical:

  1. redundant jump statements (if they exist in Java) - The ones I noticed were just return statements at the end of a method. We can fix these, but we should comment them out in the source code and leave a // LUCENENET: comment after them why they were removed. A link to a related Code Analysis rule (preferred) or SonarCloud issue is fine.
  2. unused private fields - These are in code that is auto-generated in Java (we don't have auto-generation setup yet in .NET). Again, we can fix them, but should comment out the unused field and leave a // LUCENENET: comment after them why they were removed.

If you could adjust the scan to exclude the issues we don't want to change and the issues I have already opened so we can see what remains, that would be great.

@nikcio
Copy link
Contributor Author

nikcio commented Oct 15, 2022

I'll have a look on excluding the cases above 😄

@nikcio
Copy link
Contributor Author

nikcio commented Oct 15, 2022

@NightOwl888 I've now excluded the rules causing detection on the things we don't want to change.

All the issues you've created I've added linked some more filtered links and I've added a link to the GitHub issues on SonarCloud. I've also marked all issues "confirmed" that you have created issues for on GitHub. You can filter the remaining "open" issues like this:
Note: It's not possible to mark issues reported by Roslyn "confirmed" so they only have a comment with a link to a given GitHub issue

image

How to find filtered links to issues (Non-Roslyn reported)

image

Press "Why is this an issue" this will show you a description. Then use the title here "Utility classes should not have public constructors" (This is the name of the rule).

image

The rule can be used to filer by rule in the section shown above.

How to find filtered links to issues (Roslyn reported)

Same as before we need to find the rule name but here the rule name isn't the same as before.

image

Therefore, we need to be on the "Issues" tab and press the filter button (The triangle) and then the rule name will be "Roslyn:CA1822" in this case. Here we can just click the text and the filter will be set.

@NightOwl888
Copy link
Contributor

NightOwl888 commented Oct 15, 2022

More issues we don't want to change:

  1. Consider reworking this 'switch' to reduce the number of 'case's from xxx to at most 30
  2. Merge this if statement with the enclosing one.
  3. Take the required action to fix the issue indicated by this 'FIXME' comment. - These were the original TagSoup comments and we can ignore them.
  4. Change return type to 'void'; not a single caller uses the returned value.
  5. All 'Xxx' method overloads should be adjacent.
  6. Loops should be simplified with "LINQ" expressions. - This would need to be benchmarked to get the okay. In general, LINQ is slow and almost certainly performs inconsistently across target frameworks.
  7. Loop should be simplified by calling Select(r => r.value)). - This would need to be benchmarked to get the okay. In general, LINQ is slow and almost certainly performs inconsistently across target frameworks.
  8. Replace this 'for' loop with a 'while' loop.
  9. Extract the assignment of 'lastType' from this expression.
  10. Use 'string.Contains' instead of 'string.IndexOf' to improve readability
  11. Make this an auto-implemented property and remove its backing field.
  12. Use the opposite operator ('>') instead.
  13. Use the opposite operator ('!=') instead.
  14. Extract this nested code block into a separate method.
  15. Constructor has x parameters, which is greater than the 7 authorized.
  16. Remove this unnecessary null check; 'is' returns false for nulls. - This is a performance enhancement. It is far faster to check for null first before calling is.
  17. Remove this silly bit operation.
  18. Remove this unnecessary cast to 'float'. - Where noted, this is a floating point precision workaround for 32 bit .NET Framework.

More issues we can change:

  1. Replace this type-check-and-cast sequence with an 'as' and a null check. - No comment needed. This is more readable and is understood to be a better translation from Java to .NET.
  2. Remove the ToString call in order to use a strongly-typed StringBuilder overload - Add a comment to indicate code analysis or other issue description
  3. Remove the unused internal class 'FSTEntry'. - This can be removed and replaced with a comment // LUCENENET specific - removed FSTEntry because it is not in use.
  4. Remove this call from a constructor to the overridable 'xxx' method/property. - This is a difference in initialization order between Java and .NET. In Java it is safe to make a call to a virtual member from a constructor, but in .NET it is not. Same issue in Fix initialization issue with DirectoryTaxonomyWriter #408. I still haven't found a good solution to this, but it would be nice to have a list of all of the places where it is a problem so we can consider what solution(s) will work.
  5. Remove this empty statement. - When these are extra ; characters, they are most likely just typos. For labeled continue statements, we inconsistently use ; and { }, both which trigger this warning. For those, we should probably use the latter approach and add an inline comment /* LUCENENET: intentionally blank */ between the curly brackets.
  6. Either remove or fill this block of code. - The comment associated with the line can be moved inside of the curly brackets to make it non-empty. If there is no comment, just add // LUCENNET: intentionally empty. Exception: empty finally blocks - we can remove the entire try/finally out of the methods in JavascriptLexer. Leave a comment // LUCENENET specific - removed unnecessary try/finally block in each method where they are removed.
  7. Correct this '&' to '&&'. - Check the Java source on this to see whether this is a translation bug or a bug carried over from Lucene, and report appropriately.
  8. Use 'StringBuilder.Append(char)' instead of 'StringBuilder.Append(string)' when the input is a constant unit string - No comment needed. This is more efficient and is understood to be a better translation from Java to .NET.
  9. Prefer 'AsSpan' over 'Substring' when span-based overloads are available - We can conditionally reference the package System.Memory in net462 and netstandard2.0 to fix this. Other targets don't need to reference the package. No comment needed. This is more efficient and is understood to be a better translation from Java to .NET. Do note the overload of Substring(int, int) differs between Java (second parameter is end) and .NET (second parameter is length). To correct use length = end - start. We should have done this already.
  10. Fix this implementation of 'IDisposable' to conform to the dispose pattern. - Note this is part of API: Review to ensure IDisposable is being used correctly and disposable pattern implemented correctly #265. Most of the issues I have spotted were failure to call base.Dispose(disposing) after exiting the if (disposing) block. Others are because they are on a private class that should be sealed, but it is not (Enumerators).
  11. Use nameof in place of string literal 'xxx'
  12. Change the visibility of this constructor to 'protected'. - These should all be abstract classes. Add a comment // LUCENENET specific - changed from public to protected after each changed line. Exception: some classes are noted that the constructor must be public for Reflection calls to work.
  13. Rename 'xxx' which hides the field with the same name. - We can fix this for the static readonly fields. In regular instance methods, we should leave it alone unless it diverges from Lucene.
  14. Change the visibility of 'EOS_FLAG_BIT' or make it 'const' or 'readonly'. - Make it const.
  15. Replace the control character at position 2 by its escape sequence '\u3000'. - This should be marked with a // LUCENENET specific - changed from invisible whitespace character to '\u3000' for readability.
  16. Add a nested comment explaining why this method is empty, throw a 'NotSupportedException' or complete the implementation. - Looks like the TagSoup members weren't all reviewed to ensure that members were marked virtual in .NET if they aren't explicitly marked final in Java: https://android.googlesource.com/platform/external/tagsoup/+/donut-release/src/org/ccil/cowan/tagsoup/PYXWriter.java.
  17. The parameter name 'WeightedPhraseInfo' is not declared in the argument list. - issue link. This is a bug - we need to use the correct parameter name.
  18. When implementing IComparable, you should also override ==, !=, <, <=, >, and >=. - If the class is public, we should definitely implement these. If not public, it probably doesn't matter.

@nikcio
Copy link
Contributor Author

nikcio commented Oct 15, 2022

I've now excluded the rules causing detection on the new things we don't want to change.

Only Consider using 'string.Contains' instead of 'string.IndexOf' couldn't be ignored. This is because it's the ROSLYN analyzer which picks it up. So if we're going to ignore it we need to make that analyzer ignore it. (In the project - I think)

The other can change points now have a linked issue with a link to SonarCloud of where to find the issues.

@NightOwl888
Copy link
Contributor

I have added a rule to .editorconfig to completely disable CA2249. That should disable the rule for the entire solution.

@nikcio
Copy link
Contributor Author

nikcio commented Oct 15, 2022

I've experienced some issues opening the Lucene project after updating VS 2022 to 17.3.6 but it seems from what I can gather that it's a bug in VS 2022 so I've reported it here: https://developercommunity.visualstudio.com/t/VS-2022-1736-Process-is-terminated-due/10173885?entry=problem

Just if anybody else encounters the same problems 😄

@NightOwl888
Copy link
Contributor

@nikcio - Please leave a comment on any issues you are interested in tackling (or currently working on) so we don't duplicate efforts. We will assign them to you specifically (if I am not mistaken, assigning GitHub issues is only available for Apache Lucene.NET committers).

@rclabo
Copy link
Contributor

rclabo commented Oct 17, 2022

I've experienced some issues opening the Lucene project after updating VS 2022 to 17.3.6 but it seems from what I can gather that it's a bug in VS 2022 so I've reported it here: https://developercommunity.visualstudio.com/t/VS-2022-1736-Process-is-terminated-due/10173885?entry=problem

Just if anybody else encounters the same problems 😄

Thank you! That was on my list to do this morning and to let you know that even though I had been using VS2022 for many months with LuceneNET without issue, when I opened the solution in VS2022 after your message a few days ago I realized it was blowing up with a StackOverflowException. I spent two days digging into that and I too believe it's a new VS2022 issue that didn't previously exist. I verified this by pulling down the Beta16 tag and trying to compile that and it now throws the StackOverflowException but I had previously used older versions of VS2022 to compile that version of the code base with no issue.

@NightOwl888
Copy link
Contributor

@nikcio - Could we run the SonarCloud scan again? Several of the issues that were already addressed are still there.

@nikcio
Copy link
Contributor Author

nikcio commented Oct 20, 2022

@NightOwl888 I've started a new scan now it takes about ~30 min to run. But I think if you see the value of SonarCloud it should maybe be setup on the main lucene.NET repository instead of only my fork? I can draft a PR which adds it but it of course requires that you have enough permission to set it up in SonarCloud. It would also help future development by annotating PR's and warning about possible problems with PR's.

(SonarCloud is free for open source projects)

@nikcio nikcio mentioned this issue Oct 20, 2022
2 tasks
@nikcio
Copy link
Contributor Author

nikcio commented Oct 20, 2022

@NightOwl888 A how to guide has been created at: #709 😄

@NightOwl888
Copy link
Contributor

@nikcio

Here is a puzzle to work out. The scan is telling us to use an immutable collection. The problem is, this is an immutable collection, just not one from System.Collection.Immutable. Those collections have very different behaviors than what is expected in Lucene, so we also made additional read-only collections in J2N to behave like the ones in Java. Here are a list of all of our immutable collections.

  1. CharArraySet (after calling CharArraySet.UnmodifiableSet()). This just makes a regular CharArraySet instance that is backed by a CharArrayMap.UnmodifiableCharArrayMap<TValue>.
  2. CharArrayMap.UnmodifiableCharArrayMap<TValue>
  3. J2N.Collections.ObjectModel.ReadOnlyCollection<T>
  4. J2N.Collections.ObjectModel.ReadOnlyDictionary<TKey, TValue>
  5. J2N.Collections.ObjectModel.ReadOnlyList<T>
  6. J2N.Collections.ObjectModel.ReadOnlySet<T>

So we need to adjust the scan to ignore these, while picking up any others that we might have missed.

Since there are only a few dozen, we could suppress them with an attribute, but since this isn't a Roslyn error, I am not sure how to suppress it conditionally.

Thoughts?

One other issue is the fact that J2N currently doesn't support ImmutableArray<T> for structural equality comparisons (we assumed that Microsoft would never create a struct collection, and then they made this one), so ideally we wouldn't return it from any of our public APIs just yet. We want to fix this eventually, but mark these issues with a LUCENENET TODO: Change to ImmutableArray<T> when structural equality support for it is added in J2N and suppress it from the scan.

@nikcio
Copy link
Contributor Author

nikcio commented Oct 20, 2022

@NightOwl888

I've looked a bit around and unfortunately it seems that you cannot customize the rules in SonarCloud. If you need to have this ability you need to host SonarQube yourself then it can be done. But I did find that it's possible to extend a rule description. So We could add the classes listed above to that description. I'm unsure if this affects everyone using the rule or is specific to a Quality profile. Therefore I think we should mark the current false positives as "Resolved as Won't fix" with a link to the description above. Although if you'd like to add SonarCloud to the main repository I would wait until then so we don't have to do it twice.

There's also the possibility that Sonar themselves could fix the issue with the rule but I'm not sure how to make that request.

@nikcio
Copy link
Contributor Author

nikcio commented Oct 20, 2022

As for the other issue I would properly also mark issues related to that with a "Resolved won't fix" because as long as you keep track of the issues with the TODO marker they won't be lost. If there's a specific rule in SonarCloud that triggers the issue it could be even easier because we can just deactivate the rule and reactivate it when it's time to make the change then there's no manual labor.

@nikcio
Copy link
Contributor Author

nikcio commented Oct 20, 2022

Random thought

Maybe by using SonarCloud on the J2N library it could tell us more about why it marks the collections as mutable. It's very possible that it's just a specific pattern and/or hardcoded list that Sonar uses in this rule and therefore marks them wrong so it's a long shot.

As long as there's some source code or test code set up in a similar faction as where it registers the issue in the Lucenenet project we could see if the issue is also registered there.

@NightOwl888
Copy link
Contributor

I guess it depends. What happens if you create a sample app that has an API that returns an instance (declared in a static readonly variable) of a readonly collection from System.Collections.ObjectModel and scan it? If it doesn't even consider those to be "immutable", I doubt there is much chance we can make it identify the J2N collections. But I agree that getting it to recognize those collections as being immutable for the general public would be ideal, since J2N is a general library that others might use outside of Lucene.NET.

That being said, the code analyzer infrastructure is flexible in that it allows you to define your own IDs to suppress with the same old [SuppressMessage] attribute, so if we can work out what the ID is for that issue, we can deal with this conditionally (assuming it is built on top of the .NET code analysis infrastructure).

@nikcio
Copy link
Contributor Author

nikcio commented Oct 20, 2022

@NightOwl888

I looked in the logs from the GitHub actions workflow and found the ID here (It's the Sxxxx identifier in the message):

2022-10-20T08:04:15.4461319Z ##[warning]/github/workspace/src/Lucene.Net.Analysis.Common/Analysis/Core/StopAnalyzer.cs(44,32): warning S3887: Use an immutable collection or reduce the accessibility of the non-private readonly field 'ENGLISH_STOP_WORDS_SET'. [/github/workspace/src/Lucene.Net.Analysis.Common/Lucene.Net.Analysis.Common.csproj]
2022-10-20T08:04:15.4593352Z ##[warning]/github/workspace/src/Lucene.Net.Analysis.Common/Analysis/Core/StopAnalyzer.cs(44,32): warning S2386: Use an immutable collection or reduce the accessibility of the public static field 'ENGLISH_STOP_WORDS_SET'. [/github/workspace/src/Lucene.Net.Analysis.Common/Lucene.Net.Analysis.Common.csproj]

And also found this: https://stackoverflow.com/questions/63191169/is-there-a-possibility-to-suppress-warnings-of-sonarcloud-in-source-code

Some issues can also be found directly in Visual Studio with SonarLint but as my VS 2022 installation is unusable at the moment I didn't try that 😄

But when VS doesn't crash SonarLint will show most of the current Sonar warning in the Errors tab for the files you have open. It's a little unclear what issues it doesn't show but I think the ones that won't should would take to much compute so they left them out to ensure a smooth experience.

@NightOwl888
Copy link
Contributor

Just FYI, these warning numbers are also used in the URL, so can easily be used to filter the list and/or get the ID.

https://sonarcloud.io/project/issues?resolved=false&rules=csharpsquid%3AS3887&id=nikcio_lucenenet

Just ignore the charpsquid%3A and grab/edit everything after that.

@jeme
Copy link
Contributor

jeme commented Oct 22, 2022

To be fair, the design behind how CharArraySet becomes readonly is not all that obvious as far as I can tell (The class it self is not immutable, its how it's initialized that makes it so), so it's very understandable that it's flagged in this scenario IMO and it should just be marked as "resolved: won't fix" in SonarQube with a comment why. AFAIK Sonar SHOULD remember this for subsequent scans unless the code changes (not sure how significant a change is required before sonar will re-raise the issue)

If there is any rules that the project does not wish to use or if we wish to flag rules differently, a custom profile can be created and the project can be changed to that, to exclude rules from a profile you need to copy their profile and then modify it from there, unfortunately inheriting one will not allow one to exclude rules only include additional ones, (I did not keep up with the entire thread here so don't know if you have already been working on that) - (Rules from Roslyn have to be ignored there instead)

I can see that allot of Apache projects already use SonarCloud, if the LuceneNET project wishes to use it as well maybe it's worth figuring out who and how to get it registeret there so a workflow can be added in the main project instead. However being a Java port, I think it should be considered very carefully, as mentioned again an again, certain decisions is made to stick close to Java which is likely to lead to more flags by tools such as Sonar.

It is also possible to make sonar only report from a Delta, either by days, version, data etc. This can be really useful as you otherwise has this huge burden of thousands of red flags in old code which would take days to fix, instead with this feature it's possible to focus on only fixing these issues in new code or as old code changes.

@eladmarg
Copy link
Contributor

Guys you're doing a great job but I think we better put our efforts in a new version and align with the latest Java Lucene.

Lucene is an awesome library, one of the best, however if today I would design it completely different with modern architecture, less allocations and much better performance in mind.

We choose this path with the known issues because we wanted to stay as close as possible to the Java implementation. Without this constraints like I said, I would completely refactor many parts.

I tried to write a tool that compares Java 4.8 to the latest, mark the .net files to be changed and what's the difference.

I can estimate that in 2-3 months of hard work I can get to compiling latest aligned version. (Of course this is only the first part then fix the failing tests)

If someone know a big company that will benefit from it and might have the funds, I'm willing to take this small project

@nikcio
Copy link
Contributor Author

nikcio commented Oct 22, 2022

@eladmarg I believe this conversation is very similar to what's written here: #437 (comment)

I found this issue when first looking into the project and I've come to agree that a great first step would be to make the 4.8.0 version stable (out of beta) before taking on too much of the new additions that have happened in the Lucene project and I think many of the issues we're solving currently are moving us closer to that target. I also would love to have Lucenenet be feature equivalent to the Lucene project but having a solid foundation will make the process more smooth.

Also when I first came to the project a couple of weeks ago I didn't know very much about how the project functions and still don't now 99% of the project so I found it very difficult starting to look into the issues already opened. Therefore, I find the newly opened issues great because they give me a chance to help out and slowly get more familiar with the project.

And about the funding element as I read in the different conversations in the issues it's a common problem that this project doesn't get the love it deserves. And therefore moves more slowly than it otherwise would be. Even though I've tried to help out in the past weeks @NightOwl888 have been doing like 90% of the real work that has been going into the project. So if we can come up with some kind of strategy to pull more attention to the project it would be great. And I'm sure very appreciated by the maintainers that have used countless hours on this project.

@NightOwl888
Copy link
Contributor

@nikcio - I found what appears to be a bug in the scan: https://sonarcloud.io/project/issues?issues=AYPAuOCLhbfJOGLOoaG2&open=AYPAuOCLhbfJOGLOoaG2&id=nikcio_lucenenet

The scan is recommending to convert from protected internal to private protected. The former allows internal (public-like) access within the assembly and protected access outside of the assembly. The latter allows protected access within the assembly and no access outside of the assembly.

I double-checked, and indeed it is possible to inherit FieldCacheRangeFilter<T> outside of the assembly.

    public class Foo : FieldCacheRangeFilter<double>
    {
        public Foo()
            : base("foo", FieldCache.NUMERIC_UTILS_DOUBLE_PARSER, lowerVal: 444.444, upperVal: 555.555, includeLower: true, includeUpper: true)
        {
        }

        public override DocIdSet GetDocIdSet(AtomicReaderContext context, IBits acceptDocs)
        {
            throw new NotImplementedException();
        }
    }

Clearly the advice given by the scan is incorrect if we want this to remain a protected constructor outside of the assembly.

That being said, checking with the Lucene 4.8.0 source the constructor is actually declared private, so this is also a Java to C# translation bug and it should ultimately be made private protected as the scan indicated.

NOTE: protected internal was initially put on all protected APIs because in Java it is possible to call a protected member within the same package like it is public. However, protected internal is a big mess when it is combined with toggling on and off InternalsVisibleToAttribute as required. Once InternalsVisibleToAttribute is enabled, all of the subclasses need to be changed from protected to protected internal to match accessibility. Ideally, we would avoid protected internal and use protected when we can, but since some internal places use these APIs like they are public, we need it in those specific cases.

@NightOwl888
Copy link
Contributor

@nikcio - I found what appears to be a bug in the scan: https://sonarcloud.io/project/issues?issues=AYPAuOCLhbfJOGLOoaG2&open=AYPAuOCLhbfJOGLOoaG2&id=nikcio_lucenenet

The scan is recommending to convert from protected internal to private protected. The former allows internal (public-like) access within the assembly and protected access outside of the assembly. The latter allows protected access within the assembly and no access outside of the assembly.

I double-checked, and indeed it is possible to inherit FieldCacheRangeFilter<T> outside of the assembly.

    public class Foo : FieldCacheRangeFilter<double>
    {
        public Foo()
            : base("foo", FieldCache.NUMERIC_UTILS_DOUBLE_PARSER, lowerVal: 444.444, upperVal: 555.555, includeLower: true, includeUpper: true)
        {
        }

        public override DocIdSet GetDocIdSet(AtomicReaderContext context, IBits acceptDocs)
        {
            throw new NotImplementedException();
        }
    }

Clearly the advice given by the scan is incorrect if we want this to remain a protected constructor outside of the assembly.

That being said, checking with the Lucene 4.8.0 source the constructor is actually declared private, so this is also a Java to C# translation bug and it should ultimately be made private protected as the scan indicated.

NOTE: protected internal was initially put on all protected APIs because in Java it is possible to call a protected member within the same package like it is public. However, protected internal is a big mess when it is combined with toggling on and off InternalsVisibleToAttribute as required. Once InternalsVisibleToAttribute is enabled, all of the subclasses need to be changed from protected to protected internal to match accessibility. Ideally, we would avoid protected internal and use protected when we can, but since some internal places use these APIs like they are public, we need it in those specific cases.

I guess I am also going to have to state that this bug is probably going to be helpful to find all of the protected APIs that were translated from Java to C# wrong. So, we should wait until #677 is closed before we report it to SonarCloud.

@nikcio
Copy link
Contributor Author

nikcio commented Nov 2, 2022

@NightOwl888 I've just looked a bit on it and I don't think the rule is wrong. If you read the title it says it's because the class is marked abstract and therefore can't be instantiated: https://sonarcloud.io/organizations/nikcio/rules?open=csharpsquid%3AS3442&rule_key=csharpsquid%3AS3442

Therefore, in the example it would make most sense to change the modifier to protected and not private. If you would maintain the same level of access and is not taking the Java source into account (as you stated the modifier is wrong and should be private)

Do note that the rule specifics that the modifier shouldn't be public/internal not that it should always be private.😉

@nikcio
Copy link
Contributor Author

nikcio commented Nov 5, 2022

As #709 has been merged and seems to be working I've updated the links in the open issues to link to the new sonar report: https://sonarcloud.io/project/overview?id=apache_lucenenet if you find links that have yet to be updated please help out updating them. 😄

Mostly the link is the same with a change in the id= query parameter to id=apache_lucenenet instead of id=nikcio_lucenenet if the link isn't specific to a single issue.

@nikcio
Copy link
Contributor Author

nikcio commented Nov 5, 2022

@NightOwl888 As I have no power to change the states of the issues on the official sonarcloud issues it would be great if you could mark these two issues as won't fix as was done in prior conversations.

The issues viewed from my fork: https://sonarcloud.io/project/issues?resolutions=WONTFIX&id=nikcio_lucenenet

Issue 1: https://sonarcloud.io/project/issues?issues=AYRH0UVF_qq9ReJdi5Au&open=AYRH0UVF_qq9ReJdi5Au&id=apache_lucenenet

Issue 2: https://sonarcloud.io/project/issues?issues=AYRH0UO0_qq9ReJdi4_S&open=AYRH0UO0_qq9ReJdi4_S&id=apache_lucenenet

@nikcio nikcio closed this as completed Jun 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
is:question Further information is requested
Projects
None yet
Development

No branches or pull requests

5 participants