Skip to content

Conversation

@ajbozarth
Copy link
Member

What changes were proposed in this pull request?

Adding long values for each Date in the ApplicationAttemptInfo API for easier use in code

How was the this patch tested?

Tested with dev/run-tests

@ajbozarth
Copy link
Member Author

Two questions for discussion:

  1. Do we want/need this addition to the API?
  2. What are better names for the new params?

@SparkQA
Copy link

SparkQA commented Feb 23, 2016

Test build #51783 has finished for PR 11326 at commit 54f8ea4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ajbozarth ajbozarth changed the title Added long values for dates in ApplicationAttemptInfo API [SPARK-13459] [Web UI] Added long values for dates in ApplicationAttemptInfo API Feb 23, 2016
@ajbozarth ajbozarth changed the title [SPARK-13459] [Web UI] Added long values for dates in ApplicationAttemptInfo API [SPARK-13241] [Web UI] Added long values for dates in ApplicationAttemptInfo API Feb 23, 2016
@ajbozarth
Copy link
Member Author

Sorry forgot the Jira name in the title, then mixed it up with the other Jira I'm working on

@srowen
Copy link
Member

srowen commented Feb 24, 2016

I can imagine the use for this, but what's your use case? is this read by code often?
I'm not sure about the name ending with "L". Ideally I'd call them "startTime" and "formattedStartTime" but we can't change the existing field (or shouldn't). Maybe "startTimeTimestamp" or something? I don't have a great idea.

@ajbozarth
Copy link
Member Author

@steveloughran originally said he wanted it for testing purposes in the jira. As for the naming I agree the L isn't ideal, I just put it down as a placeholder until I had other ideas. I also though of *Timestamp, *Long, *Num, or *Raw. I agree the better naming would change the current Date fields, but since we can't... What do you think Steve?

@steveloughran
Copy link
Contributor

@srowen I'd like this as the REST API publishes the times as strings, not long values which can be consistently deserialized across all platforms.

It's not critical to me, but will assist testing, especially that incomplete work is being updated and picked up.

@srowen
Copy link
Member

srowen commented Feb 26, 2016

Yeah it seems useful. I am just stuck on the name. Anything reasonable that >1 person likes seems fine.

@steveloughran
Copy link
Contributor

maybe use Epoch as the suffice; it is unix epoch values, after all.

@ajbozarth
Copy link
Member Author

I like Epoch, I'll make the change and push it when I get in to the office

@SparkQA
Copy link

SparkQA commented Feb 26, 2016

Test build #52060 has finished for PR 11326 at commit d0e87c5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@steveloughran
Copy link
Contributor

LGTM. @srowen ?

"attemptId" : "1",
"startTimeEpoch" : 1426533910242,
"endTimeEpoch" : 1426533945177,
"lastUpdatedEpoch" : "",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, is this right? it seems like this becomes a string when it has no value. Is that just how JSON works or do we need to have it omit this value when not defined?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not seen that. It should be a 0, shoudn' it. Or send it over as a number in a string, where "" means not set? Client gets to parse

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The empty string is actually hard-coded in HistorySeverSuite, I just copied how it was done with lastUpdated for lastUpdatedEpoch

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue is that lastUpdated is a formatting string, so an empty string makes sense for no value, but not for a numeric value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding correctly, your issue is not with my changes but the way to test is currently working. If that's the case I can change the hard-coded test values from "" to 0 with this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed this was some kind of golden file test, where the output is intended to match exactly this file. Is this not what the output would be with this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll detail it better just to make sure we're on the same page: HistoryServerSuite looks at a test json and before comparing it to the exception json file (this file) checks if lastUpdated (and now also lastUpdatedEpoch) exists, if so it over writes it's value with an empty string. I'm not sure why they decide to do it the sway but it is detailed in a comment on line 157 in HistoryServerSuite.scala

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srowen Did that explanation make sense? Do we want to make more changes to update how the tests work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I get it, it blanks the lastUpdated field so this is effectively not compared. (Would it just be easier to strip these?) Although I appreciate it doesn't matter in the sense that it's just for purposes of a diff, it seems like it's worth making the value it substitutes valid, like a 0. Or just removing it if possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I changed "" to 0 and am testing now, will push once it passes.

@SparkQA
Copy link

SparkQA commented Mar 21, 2016

Test build #53697 has finished for PR 11326 at commit 6c1998b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 21, 2016

Test build #53709 has finished for PR 11326 at commit ed1012a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ajbozarth
Copy link
Member Author

@srowen This good to go now?

if (subStrings(i).indexOf("lastUpdatedEpoch") >= 0) {
subStrings(i) = "\"lastUpdatedEpoch\":0"
} else if (subStrings(i).indexOf("lastUpdated") >= 0) {
subStrings(i) = "\"lastUpdated\":0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but now you've made a string-valued property take on a numeric value here. I don't think that needs to change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a numeric value being changed to a string before. I thought that was what you were referencing above when you said it was "becoming a string" but I may have misunderstood.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lastUpdated is a string (representing a date). lastUpdatedEpoch is an integer. They should be treated differently. The dummy value of a string here is "" and for an integer is 0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, shoot, messed that up. I'll go fix it.

@SparkQA
Copy link

SparkQA commented Mar 25, 2016

Test build #54202 has finished for PR 11326 at commit c20035c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

attempts = app.attempts.map { internalAttemptInfo =>
new ApplicationAttemptInfo(
attemptId = internalAttemptInfo.attemptId,
startTimeEpoch = internalAttemptInfo.startTime,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajbozarth thank you for your patience, this is pretty close now. I apologize, I haven't really been paying attention to the rest of this change. It is conceptually fine but actually, this part of the change is not necessary. You've already got Date objects encoding the the time in this object, which is really a fancy wrapper around ms since the epoch. getTime gets you the same value you're adding here already. So I don't think it's necessary to actually store all these values a second time.

Of course you still need to expose these values as an integer Long as properties of the class, so that JSON will contain the second ms representation of them. I think that only requires adding getter methods with the right name to the class objects below, not actually adding redundant fields. How about that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats ok, and I originally thought the same thing about the redundancy and getter methods. I even spent a couple hours this morning combing through the related api code to double check my understanding.

From my understanding of the api (at least ApplicationInfo and ApplicationAttemptInfo), the api classes are created when the api is called using these functions I edited. The returned classes are then automatically converted to json, so there is no getters for the classes or place to write them. The only data converted to json is the data given in params.

I don't fully understand the original choice for the api classes to work this way (no methods) but, unless I am missing something, adding the extra Epoch values as params is the only way to pass the ms info to the api json output. Also as I mentioned before, it seems the values are only stored in the class for the time it take to process the api call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you would certainly have to expose getters for the new values. You just add them to the class declaration. This should work?

class ApplicationAttemptInfo private[spark](
  ...) {
  ...
  def getStartTimeEpoch: Long = startTime.getTime
  ...
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given there's no example of that in the api I (apparently wrongly) assumed that I couldn't add methods like getters to the api classes. I'll add them, test and push when I have time later today.

@SparkQA
Copy link

SparkQA commented Mar 28, 2016

Test build #54353 has finished for PR 11326 at commit 95af3a6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 30, 2016

Test build #54478 has finished for PR 11326 at commit bfbe7e2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ajbozarth
Copy link
Member Author

@srowen I updated to getters and tests have passed, does this look good to go?

@srowen
Copy link
Member

srowen commented Apr 1, 2016

Nice and clean now. Merged to master

@asfgit asfgit closed this in abc6c42 Apr 1, 2016
@ajbozarth ajbozarth deleted the spark13241 branch April 1, 2016 23:22
@steveloughran
Copy link
Contributor

thx for this; helps with cross platform dev & test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants