Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for #292 Date range queries #293

Merged
merged 1 commit into from
Nov 9, 2015

Conversation

johnerikhalse
Copy link
Contributor

This PR prevents the nullifying of start/end dates and also enables queries on exact dates.

The existing unit test required that both start and end date should be set, otherwise it treated it as none of them where set. I loosened that by allowing open-ended dates.

This PR renders #225 obsolete, and solves #292 and partially #190.

@johnerikhalse johnerikhalse added this to the 2.3.0 Minor Release milestone Oct 7, 2015
@ldko
Copy link
Member

ldko commented Oct 9, 2015

This works for me except in the case when querying with an exact date that is a full 14 digit timestamp. In this case, with STAR appended to exactDateTimestamp, all date instances are returned--not just the one specified. Thanks @johnerikhalse

@johnerikhalse
Copy link
Contributor Author

@ldko Thanks for testing
Yes, I am aware of the case with 14 digit timestamp. Without STAR you will be redirected to the replay page and not to the search result page. I didn't change the behavior to accept 14 digits with STAR because there is a unit test failing when sending 14 digits with STAR. I am not sure if such a change will break anything somewhere else.

I could change the code, if you would like to test if that change has any side effects.

}
return toString(datespec,url);

String exactDateTimestamp = getEmptyStringIfNull(wbRequest.get(WaybackRequest.REQUEST_EXACT_DATE));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

60-68 should be indented

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, but as a lot of other places in the code, it is a mixture of tabs and spaces.

@ldko
Copy link
Member

ldko commented Oct 14, 2015

If the change you would make is something like removing the else if (dateStr.length() == 14) block from PathDatePrefixQueryRequestParser.java then the test failing is in org.archive.wayback.archivalurl.ArchivalUrlRequestParserTest where without that block we get:

Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.078 sec <<< FAILURE!
testDatePrefix(org.archive.wayback.archivalurl.ArchivalUrlRequestParserTest)  Time elapsed: 0.007 sec  <<< FAILURE!
junit.framework.ComparisonFailure: expected:<1996...> but was:<2010...>
        at junit.framework.Assert.assertEquals(Assert.java:81)
        at junit.framework.Assert.assertEquals(Assert.java:87)
        at org.archive.wayback.archivalurl.ArchivalUrlRequestParserTest.testDatePrefix(ArchivalUrlRequestParserTest.java:204)

With the test failing on assertEquals(EXPECTED_START_TIMESTAMP, wbr3.getStartTimestamp());

If we are not handling the 14 digit timestamp differently (by removing the elseif), then EXPECTED_START_TIMESTAMP becomes our exact date and not the earliest timestamp possible, so the test fails (in an expected way) with the change.

I noticed in the test though, the comment:

        // full 14-digit timestamp with "*": entire time range, highlight the
        // closest to the specified date.

So I guess the expected behavior of an exact 14 digit timestamp search has been not to return a single date result (or none) as I was thinking, but to return all date range results with the closest one highlighted--though I can't tell how it is supposed to be highlighted looking at all the results.

I guess the question is, do we want to only return an exact timestamp result if it actually matches the exact timestamp or do we want the functionality of what the comment indicates--that we return all matches and highlight the closest one--in which case we may need to better indicate the closest one.

On another note, what do you think of adding some help text to the advanced search page? Something along the lines of (for English string) Enter dates as partial or full timestamps, yyyymmddHHMMSS. Examples: 2010, 201201, 20081105042020

@johnerikhalse
Copy link
Contributor Author

No, the change I suggest is in PathPrefixDatePrefixQueryRequestParser:

From:

    private final static Pattern TIMESTAMP_REGEX = Pattern.compile("(\\d{0,13})\\*");

To:

    private final static Pattern TIMESTAMP_REGEX = Pattern.compile("(\\d{0,14})\\*");

And then the test testPathPrefixDatePrefix in ArchivalUrlRequestParserTest must be changed to not expecting null.

But I am not sure if this has any side effects. The rule will only be applied if the timestamp ends with *

@ldko
Copy link
Member

ldko commented Nov 3, 2015

Okay, that makes sense. You are changing it only for URL prefix queries and not exact URL queries. Your change works for the "beginning with" searches with 14 digit timestamps while the "exactly matching" searches still return all dates (which historically seems to be what is expected?).

I didn't notice any side effects with allowing 14 digit timestamps with PathPrefixDatePrefixQueryRequestParser when using Archival replay. PathPrefixDatePrefixQueryRequestParser is also referenced in ProxyArchivalRequestParser, but honestly I'm not clear on how that should be used, so I can't tell if it is affected negatively.

kris-sigur pushed a commit that referenced this pull request Nov 9, 2015
@kris-sigur kris-sigur merged commit 54eba3d into iipc:master Nov 9, 2015
johnerikhalse added a commit that referenced this pull request Dec 9, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants