Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exclude all transitive dependencies of ssj besides commons-math3 and replace linear regression implementation #638

Merged
merged 1 commit into from
Aug 19, 2024

Conversation

dwnusbaum
Copy link
Member

#625 added a dependency on ssj, which in turn picked up a few additional dependencies. I think we can avoid some of these dependencies to reduce the size of the HPI, avoid unusual licenses, and reduce supply chain risk by excluding all dependencies of ssj except for commons-math3.

[INFO] +- ca.umontreal.iro.simul:ssj:jar:3.3.2:compile
[INFO] |  +- colt:colt:jar:1.2.0:runtime
[INFO] |  |  \- concurrent:concurrent:jar:1.3.4:runtime
[INFO] |  +- com.github.rwl:optimization:jar:1.3:runtime
[INFO] |  \- org.apache.commons:commons-math3:jar:3.6.1:runtime

While reviewing the update, some of these dependencies stood out to me as being a bit unusual.

  • colt:colt
    • The main homepage seems to be https://dst.lbl.gov/ACSSoftware/colt/, which offers jar downloads but does not mention anything about Maven, and from some quick searches it is not clear to me who owns colt:colt on Maven central.
    • The library has a split license, where the hep.aida.* packages (which are unused by ssj, but due to the nature of Jenkins plugins are exposed to plugins that depend on junit) are LGPL "with the exception that any usage related to military applications is expressly forbidden", which means that it is definitely not an OSI-approved license (which is a requirement for plugins hosted by jenkinsci), so that seems bad. There are different versions of the library under other groupIds where the problematic package has been removed, but none that have clear ownership on Maven central traceable to a clear and current owner.
  • concurrent:concurrent

All we need from ssj is linear least squares regression (LeastSquares) and some kind of smoothed spline interpolation (SmoothingCubicSpline).

  • LeastSquares makes use of classes from the colt library, which I would prefer to avoid, but commons-math3 has a direct replacement in its SimpleRegression class.
  • I did not see any direct equivalent to SmoothingCubicSpline in commons-math3. There is SplineInterpolator, but it does not have smoothing support. Probably smoothing support good enough for our purposes (just a visual aid in some charts) could be implemented on top of SplineInterpolator with some preprocessing on our side if someone is interested and wants to spend some time on it.

Note:

Really I would prefer to not depend on ssj at all, and instead just copy the source code of SmoothingCubicSpline into this repo with proper attribution. I doubt we care at all about tracking updates for the code we are using, and there is a good chance the code will never change anyway. I am also a bit concerned that the 3.3.2 version that we are using and that is in Maven central is only mentioned in the README in https://github.com/umontreal-simul/ssj, and there is no corresponding release or tag. However, I did not find any good way to report licensing information correctly via https://github.com/jenkinsci/license-maven-plugin and https://www.mojohaus.org/license-maven-plugin/ without having the library that you want to include the license for actually be a dependency.

Testing done

I ran the plugin, created a job that created random test data and recorded it with this plugin, built the job around 50 times, took a screenshot of the charts on the test history page with and without my changes, and verified that the relevant trend lines in the screenshots were the same.

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests - that demonstrates feature works or fixes the issue

@dwnusbaum dwnusbaum requested a review from a team as a code owner August 19, 2024 19:15
@@ -104,6 +96,11 @@
<groupId>io.jenkins.plugins</groupId>
<artifactId>plugin-util-api</artifactId>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-math3</artifactId>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was already a dependency via ssj, I think it was just unused previously, but now we use it for SimpleRegression.

import org.kohsuke.accmod.Restricted;
import org.kohsuke.accmod.restrictions.NoExternalUse;
import org.kohsuke.stapler.bind.JavaScriptMethod;
import umontreal.ssj.functionfit.LeastSquares;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uses libraries from colt, which we can avoid by using SimpleRegression instead.


SimpleRegression sr = new SimpleRegression(true);
for (int i = 0; i < lrX.length; i++) {
sr.addData(lrX[i], lrY[i]);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit awkward, but the SimpleRegression methods that accept arrays of data expect double[n][2] instead of double[2][n], so I did not see a simpler approach.

Copy link
Member

@basil basil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

@basil basil requested a review from timja August 19, 2024 19:48
Copy link
Member

@timja timja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@dwnusbaum
Copy link
Member Author

dwnusbaum commented Aug 19, 2024

Probably smoothing support good enough for our purposes (just a visual aid in some charts) could be implemented on top of SplineInterpolator with some preprocessing on our side if someone is interested and wants to spend some time on it.

Also though, looking at the uses of SmoothingCubicSpline a bit more, I think the build duration distribution chart should be a histogram, not a trend line, and we could implement that relatively easily.

The uses of SmoothingCubicSpline in the test results chart to produce a nonlinear trend line when you have at least 200 builds could perhaps be replaced with a moving average to be much simpler, but I also wonder whether it's even worth adding a special trend line for this case.

(with changes like these we could drop the ssj dependency completely)

@timja timja merged commit 6077688 into jenkinsci:master Aug 19, 2024
16 checks passed
@dwnusbaum dwnusbaum deleted the reduce-dependencies branch August 19, 2024 20:46
@basil
Copy link
Member

basil commented Aug 20, 2024

CC @mdealer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants