Fixes for apparent build errors #154

Merged
merged 5 commits into from
Jun 8, 2016

Conversation

anjackson
Collaborator

Under Java 1.7 (u51), on a Mac (see full details below), I find the current build to be broken.

One set of errors arises because the unit test assumes it knows how the JVM will serialise HashMaps, and how this relates to the ordering of the HashMap. However, this ordering is not guaranteed, and since some point on the 1.7 line these tests fail.
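
A minimal sketch of the ordering problem described above, not code from this pull request: the class name `MapOrderExample`, the helper `canonicalForm`, and the sample keys are all hypothetical. It shows that two `HashMap`s holding the same entries compare equal through `equals` whatever their iteration order, and that copying into a `TreeMap` gives a stable rendering when a test really does need to compare text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example class; nothing here is taken from the Heritrix test suite.
final class MapOrderExample {

    /** Renders a map with its keys sorted, so the text does not depend on HashMap iteration order. */
    static String canonicalForm(Map<String, String> map) {
        return new TreeMap<>(map).toString();
    }

    public static void main(String[] args) {
        Map<String, String> first = new HashMap<>();
        first.put("uri", "dns:localhost");
        first.put("status", "1");
        first.put("host", "localhost");

        // Same entries, different insertion order and initial capacity.
        // Iteration order may or may not match the first map; a test should not care.
        Map<String, String> second = new HashMap<>(64);
        second.put("host", "localhost");
        second.put("uri", "dns:localhost");
        second.put("status", "1");

        // Map.equals compares entries, not iteration order.
        System.out.println(first.equals(second)); // prints true

        // Sorted rendering is identical for both maps.
        System.out.println(canonicalForm(first).equals(canonicalForm(second))); // prints true
    }
}
```

Asserting on `equals` or on a sorted view keeps a test independent of how a particular JVM happens to lay out its hash buckets.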

The other discrepancy is strange: one of the selftest crawls run during the build checks that the number of novel bytes is as expected. For some reason, this value is 1 lower than it was before. In particular, the novel bytes associated with the localhost DNS request appear to be 48 rather than the expected 49. Perhaps this is some kind of odd platform dependence?

$ mvn --v
Apache Maven 3.3.3 (7994120775791599e205a5524ec3e0dfe41d4a06; 2015-04-22T12:57:37+01:00)
Maven home: /usr/local/Cellar/maven/3.3.3/libexec
Java version: 1.7.0_51, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.11.3", arch: "x86_64", family: "mac"

@ruebot
Collaborator

ruebot commented Apr 2, 2016

@anjackson I'm getting a failed build as well on master, full output here.

Ubuntu 15.10, 64-bit

[nruest@simian:heritrix3] (git)-[master]-$ java -version
java version "1.8.0_77"
Java(TM) SE Runtime Environment (build 1.8.0_77-b03)
Java HotSpot(TM) 64-Bit Server VM (build 25.77-b03, mixed mode)
[nruest@simian:heritrix3] (git)-[master]-$ mvn -v
Apache Maven 3.3.3
Maven home: /usr/share/maven
Java version: 1.7.0_80, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-7-oracle/jre
Default locale: en_CA, platform encoding: UTF-8
OS name: "linux", version: "4.2.0-34-generic", arch: "amd64", family: "unix"

And, your branch isn't building for me either. Output here.

I also set up TravisCI for this repo here. More than happy to put in a pull request as well. Maybe it'll help out for issues like this in the future.

@anjackson
Collaborator Author

Thanks @ruebot - your builds hit heap space problems, but this bit suggests there are still unresolved issues:

Failed tests: testSomething(org.archive.crawler.selftest.StatisticsSelfTest): expected:<9775> but was:<9771>

Note that IA have their own build server which seems perfectly happy, so perhaps this is a subtle platform/locale issue?

I've attached the arc.gz from the last selftest run (from /tmp/heritrix-junit-tests/selftest/StatisticsSelfTest/jobs/selftest-job/arcs/) - can you attach yours so we can compare the contents?

WEB-20160402195540108-00000-551192.168.99.18443.arc.gz

And here's the WARC version from /tmp/heritrix-junit-tests/selftest/StatisticsSelfTest/jobs/selftest-job/20160402195538/warcs/

WEB-20160402195539599-00000-551192.168.99.18443.warc.gz

@ruebot
Collaborator

ruebot commented Apr 2, 2016

@anjackson ah, cool. Didn't know they had a Jenkins setup.

I just realized my mvn output said it was using Java 7 instead of Java 8, so that might be worth noting.

Here is the arc from my last selftest run, which would have been your branch:
WEB-20160402223629621-00000-23206simian8443.arc.gz

@anjackson
Collaborator Author

Oh that's weird. Yours doesn't have a DNS response record, and indeed no results from 'localhost' at all. Quite confused now. Maybe running out of heap space killed it midway through!?

@ruebot
Collaborator

ruebot commented Apr 2, 2016

I'll do an export MAVEN_OPTS="-Xmx3000m" and rebuild. See if that helps.

@ruebot
Collaborator

ruebot commented Apr 2, 2016

Hrm. Both still fail, and I don't see a /tmp/heritrix-junit-tests/selftest directory this time around. But, I have some arcs in /tmp/heritrix-junit-tests.

@ruebot
Collaborator

ruebot commented Apr 3, 2016

Here is a second build with Java 7, and more memory. It looks like it got further along and failed at Heritrix 3: 'engine' subproject as opposed to Heritrix 3: 'modules' subproject (reusable components).

...and the sample arc: https://www.dropbox.com/s/gcnvq15s0wsdxl9/WEB-20160403002631073-00000-25360~simian~8443.warc.gz?dl=0

@anjackson
Collaborator Author

Okay, I'm totally confused now. That build should not have failed like that. Maybe I should also set up a Travis build with lots of JVM combinations?

@ruebot
Collaborator

ruebot commented Apr 3, 2016

set up a Travis build with lots of JVM combinations

👍

@ruebot
Collaborator

ruebot commented Apr 3, 2016

TravisCI - master - https://travis-ci.org/ruebot/heritrix3/builds/120423334
TravisCI - ukwa:fix-test-errors - https://travis-ci.org/ruebot/heritrix3/builds/120423589

Looks like openjdk8 isn't set up on the TravisCI machines.

@anjackson
Collaborator Author

I eventually got Travis builds running (e.g.). Java 7 is fine although I kept hitting a race condition in one test and had to introduce a delay.
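
The comment above does not say how the delay was added. As a hedged sketch of one common alternative to a fixed sleep, the hypothetical helper `awaitCondition` below polls a condition until it holds or a timeout expires. None of these names come from Heritrix, and this is not the change made in this pull request.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical test helper; an illustration only.
final class AwaitExample {

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition held, false on timeout or interruption.
     */
    static boolean awaitCondition(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "the crawl has finished": becomes true on the third check.
        AtomicInteger checks = new AtomicInteger();
        boolean ready = awaitCondition(() -> checks.incrementAndGet() >= 3, 2000, 10);
        System.out.println(ready);
    }
}
```

Compared with a fixed delay, polling returns as soon as the condition holds and fails explicitly when it never does.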

Java 8 hits more problems due to relying on comparing the serialised binary forms of objects. This is probably something we should try to work out how to avoid, as the serialisation format is not a fixed part of the spec, and can be expected to change from time to time and across Java implementations. However, I'm not sure how best to do this.
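
One possible way to avoid comparing serialised bytes, offered as a sketch under assumptions rather than as what this pull request does: serialise the object, read it back, and compare the result with `equals`. The class `RoundTripExample` and the sample keys are hypothetical; only standard `java.io` calls are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

// Hypothetical example class; not part of Heritrix.
final class RoundTripExample {

    /** Serialises a value to a byte array using standard Java serialisation. */
    static byte[] serialize(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    /** Reads a single object back from a byte array. */
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    /** True if the value is equal to its own serialised-then-deserialised copy. */
    static boolean survivesRoundTrip(Serializable value) {
        try {
            return value.equals(deserialize(serialize(value)));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative keys and values only.
        HashMap<String, Integer> counts = new HashMap<>();
        counts.put("novelBytes", 9775);
        counts.put("dupByHashBytes", 0);

        // The assertion is on object equality, never on the raw byte layout.
        System.out.println(survivesRoundTrip(counts)); // prints true
    }
}
```

A check like this still exercises serialisation, but it passes on any JVM whose `equals` contract holds, whatever byte layout that JVM writes.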

@nlevitt
Contributor

nlevitt commented Apr 13, 2016

I'm kinda letting you guys sort out these issues :)

No .travis.yml in the pull request?

@anjackson
Collaborator Author

I thought you'd rather not have one, as you already run Jenkins? Very happy to patch it in if you don't mind, as it would make it much easier for me to run Travis CI on our fork.

@nlevitt
Contributor

nlevitt commented Apr 14, 2016

I think it's fine to have both jenkins and travis builds. Jenkins makes it easier to download build artifacts and it populates our maven repo. We can use travis for testing with different versions of java and stuff like that.

Setup TravisCI
@anjackson
Collaborator Author

Thanks to @ruebot for finishing off the Travis config.

@ruebot
Collaborator

ruebot commented Apr 14, 2016

np!

@nlevitt do you know how to set up the TravisCI hook in GitHub?

@kris-sigur
Collaborator

Been using the travis build against one of my forks. The OpenJDK build seems to occasionally fail (different unit tests) for no (seemingly) good reason? The OracleJDK, however, has only failed when it should have.

@nlevitt
Contributor

nlevitt commented May 3, 2016

Sorry this slipped off my radar. @ruebot I cherry-picked the .travis.yml commit so we can have a baseline before merging these fixes. https://travis-ci.org/internetarchive/heritrix3

@ruebot
Collaborator

ruebot commented May 3, 2016

Cool. You know how to enable it on your end in GitHub settings? Also, do you want a README badge for it? Happy to toss in another pull request for that if need be.

@nlevitt
Contributor

nlevitt commented May 3, 2016

I didn't have to do anything in github settings. (Maybe there was some account level setting that was already enabled though.) I just enabled it in travis.

Sure, feel free to send a PR for the readme badge.

@ruebot
Collaborator

ruebot commented May 3, 2016

Yeah, you can get at it two different ways, so sounds like we're all good.

@nlevitt
Contributor

nlevitt commented Jun 6, 2016

After running into more issues with those byte count checks in StatisticsSelfTest.java I finally took the time to examine the issue. Turns out the size of the DNS record depends on the local environment (DNS server, I guess). Specifically, the TTL varies. For example:

20160606204018
localhost.              0       IN      A       127.0.0.1

vs.

20160606200119
localhost.              604800  IN      A       127.0.0.1

On #164 I extended @anjackson's branch to exclude dns records from the size checks.
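
A rough sketch of the idea, not the code from #164: the class `NovelBytesExample`, its methods, the sample URIs, and the sizes are all hypothetical. It assumes DNS records are identified by a URI starting with `dns:` and that fields in the answer line are separated by single tabs. It shows both why the TTL changes the record size and how a size total can skip DNS records.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example class; not the actual StatisticsSelfTest change.
final class NovelBytesExample {

    /** Sums record sizes, skipping any record whose URI uses the dns: scheme (an assumption). */
    static long sumExcludingDns(Map<String, Long> sizeByUri) {
        long total = 0;
        for (Map.Entry<String, Long> entry : sizeByUri.entrySet()) {
            if (!entry.getKey().startsWith("dns:")) {
                total += entry.getValue();
            }
        }
        return total;
    }

    /** Byte length of a DNS answer line when written as ASCII. */
    static int answerLength(String answerLine) {
        return answerLine.getBytes(StandardCharsets.US_ASCII).length;
    }

    public static void main(String[] args) {
        // The TTL field alone changes the length of the record text.
        int zeroTtl = answerLength("localhost.\t0\tIN\tA\t127.0.0.1");
        int weekTtl = answerLength("localhost.\t604800\tIN\tA\t127.0.0.1");
        System.out.println(weekTtl - zeroTtl); // prints 5

        // Illustrative sizes only; the dns: entry is left out of the total.
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("dns:localhost", 49L);
        sizes.put("http://localhost/index.html", 1200L);
        sizes.put("http://localhost/robots.txt", 30L);
        System.out.println(sumExcludingDns(sizes)); // prints 1230
    }
}
```

With the DNS record excluded, the expected total no longer depends on what TTL the local resolver reports.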

@nlevitt nlevitt merged commit 8437a27 into internetarchive:master Jun 8, 2016
nlevitt added a commit that referenced this pull request Jun 8, 2016
Fixes for apparent build errors (extends #154)