-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
t/proj_transform2.t test failures #498
Comments
That 145 vs 141 is likely due to some rounding/quantisation difference between Windows and Linux and/or compiler versions. Sorry about this, and thanks for the report. The tests that have been used up till now, using |
@shawnlaffan Could you try the latest I'm still going to switch to doing the coordinate transformation directly, which will be a smaller, quicker, and more accurate test, but I'd like to see if the pthread thing is the cause of this issue. |
Just built and tested with Windows will not be the issue as this is an Ubuntu instance (it just happens to be running via WSL2). gcc details are below.
|
Could you try with the latest |
I found that the test errors disappear if one adds Tested in openSUSE Tumbleweed, Perl 5.40.0 |
Good heavens! That does not sound like it should work. Are you sure it's not just intermittent? Also, @dhdeangelis could you also try the latest |
I am sure it is not intermittent. But I do not completely understand the reason. Short summary: PDL-2.092: built and test OK PDL-2.093: built OK, failed test for proj_transform2:
Upon investigation I saw baffling things like if printing $map to stdout before $ortho the test passes:
Apart from simplification, perhaps the main change from 2.092 to 2.093 was the use of PDL-2.093: with a modified proj_transform2, adding use PDL::IO::FITS;
built OK, test OK:
Now, master does not have a proj_transform2.t test anymore. It builds and tests correctly:
All the steps above are reproducible in my system. Perl (built by me, not the OS's):
System info:
Hope this adds light. Perhaps it is not necessary anymore, although it would be interesting to know more about the root cause. |
Yes, I'd like to know why also. I'm currently tracking two further bugs in IO::FITS (the RICE compression doesn't seem to round-trip in at least one circumstance), and the A thing I suspect is that the test used Could you do a |
@mohawk2 here comes the results of the tests you asked for. I separate in two comments for clarity. Spoiler alert: this is confusing.
Suggested test:
Output: failure
But, just if we do only dumping or printing the test passes:
Resulting in:
|
Suggested test:
Result: failure (as in the previous case)
But, as in the previous case, dumping or printing causes the test to pass
I hope you find some clarity here. |
Thank you for your efforts so far on this! I don't have clarity yet, sadly. One observation is that inherently by the nature of a byte's range, and that it's a It would be very valuable to see when So, could I ask you to modify (I also note that between the IO::FITS being included and not, there is at least a |
I did (not calling PDL::IO::FITS here):
got this, posting only tests 6-10 since it is very long otherwise:
Lots of fishy behavior if for example, I do not print $g it fails, if I print $g it passes. I am not sure how exactly should I use indexND or what to expect. |
Unfortunately all the above does is a sub pdl_cmp {
my ($g, $e, $l) = @_;
print "- - - - - - - - - - - - - - - - - - - - - - - - - - - - \n";
$_ = PDL->topdl($_) for $g, $e;
local $Test::Builder::Level = $Test::Builder::Level + 1;
my $which_close = approx($g, $e, 1.1);
print "result:\n$which_close\n";
fail("$l: result was BAD value"), return if any $which_close->isbad;
(my $bool = ok all($which_close), $l) or diag "got:\n$g\nexpected:\n$e";
print "whichND of unequal: ",whichND($which_close) if !$bool;
print "\n";
} |
Enabling only tests 6-10 passes with 4x4 piddles of ones. Tests 6-10 fail if other tests enabled.
|
So it looks like the effect only happens in This is likely to be a failure of the handling of FITS headers when A:F:H isn't loaded, so it will be valuable to track it down. |
Exactly
with:
In tests 6-10:
gives:
All test pass then. If 1-5 are disabled, 6-10 pass, but 11-15 fail. As before: if one chooses to print $ortho, in 6-10, then all tests pass. |
@dhdeangelis Thank you! Could you also show the |
Here it comes. I do not see any difference sadly: with:
gives:
My general observation is: if we do not use IO::FITS tests will fail, but will pass if we print $map or $ortho. However ... now, for the first time, I am seeing intermittent behavior in the test results. |
That intermittent behaviour might actually be because the PROJ-binding code until extremely recently was using a non-thread-safe API, but for large data inputs (>1e6 elements, such as the 2e6-pixel Earth image) would nevertheless use POSIX threads, which is a recipe for intermittent problems. To see if I'm right on this, could you first of all try running the test lots of times as it is now, to get a feel for how often it fails? Then, could you set the environment variable |
@shawnlaffan Do you feel this problem is now solved? If so, are you happy to close the issue? |
I cannot test on the original system due to an Alien::proj/Alien::Build issue with mixed system/share installs. Despite Alien::proj being a share install, PDL is being linked to the system installation of Proj due to how the libs are stored and reported by Alien::Build/Base, and now won't build.
This is because the linked Proj library on the system is version 6.3.1, and Fixing the dynamic libs is an Alien::proj/Alien::Build problem, but PDL will still break for a system Alien::proj where Proj is earlier than 7.1. Either PDL needs to check for Also FWIW, the tests pass on an Ubuntu 24.04 instance without a system install of Alien::proj, but also passed before all the related PDL changes. |
I could also bump the minimum version for Alien::proj to be Proj 7 or higher. That won't fix the dynamic libs issue but will ensure |
I feel it's pretty important to get the units correct, and the EDIT Or, I could make the |
I've set it to 7.1 in shawnlaffan/perl-alien-proj@7d1296e Once I get CI sorted I'll release a new Alien::proj |
Alien::Proj 1.28 is now on CPAN. |
Looping back to the build issues (edit: in #498 (comment)), I now think that's caused by PDL or EUMM. The rpath of the Proj4.so file links directly to
This seems to be due to a line in the generated Makefile that sets the LD_RUNPATH:
That variable should point to I am not sure how or where to set that, though, and there is only one place in the PDL code base that uses LD_RUN_PATH: Line 204 in 899a615
Another approach is to add One could also set |
I'm trying to figure out a 6.3-compatible solution, having asked for help on OSGeo/PROJ#4217 - if that works, then it might solve the immediate problem. |
The build issue should be fixed by shawnlaffan/perl-alien-proj@ed072d4 That stops system directories being added to the lib search path, allowing the Alien share builds to be found before any system versions. |
Now released as part of Alien::proj 1.29: https://metacpan.org/release/SLAFFAN/Alien-proj-1.29 |
Does that mean you consider this issue solved? |
@mohawk2 this did not make a difference. Still printing interim results is the way to avoid failures. Even after updating Alien::proj to 1.29. |
Thank you. Is there a way I could login to that machine to investigate more directly? |
No |
Then my ability to debug this further will be limited. Can you confirm whether the latest |
Using the latest master the test proj_transform2.t in 2.093 passes without failure. This does not seem to be intermittent success. I guess the issue can be closed. Thank you @mohawk2 and @shawnlaffan . |
The tests all pass for me now so on that basis it can be closed. The one remaining component is your idea to support Proj 6.3.1. Is that still on the cards? |
Let's give the OSGeo folks a bit longer. If it turns out it is possible, we could remove the increased need for share installs. |
@shawnlaffan I think they've had long enough. Unfortunately we'll have to do a bit more share installing on older Ubuntu! Could you close this? That way you can reopen if further problems arise. |
t/proj_transform2.t is failing tests on my Ubuntu system (Ubuntu 20.04.6 LTS via WSL2) for PDL 2.093 and HEAD.
Alien::proj is a share install, so it is not using the system version of Proj. Proj is at version 9.4.1.
First failing commit is c3bf22f
FWIW, the system version of Proj is 6.3.1. I have not tested with it, but there have been many changes since that was released.
The text was updated successfully, but these errors were encountered: