
Send notifications for feed expirations #126

Merged (10 commits) on Dec 11, 2018

Conversation

evansiroky

Implementation to resolve #121.

Will send a one-week warning email and another email on the day of expiration.

Also includes some refactoring of commonly shared code and fixes #123.

@landonreed (Member) left a comment

This looks great. Comment summary: some methods need javadoc, and we should probably consolidate a lot of this scheduling logic into a single class.

import java.time.ZoneId;

public class Utils {
    public static ZoneId getTimezone(String tzid) {
Member

This new method should have javadoc.
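
For example, a javadoc'd version of this helper might look something like the following sketch (the America/New_York fallback mirrors the catch block replaced elsewhere in this diff; this is not necessarily the exact implementation):

/**
 * Parse a timezone identifier into a ZoneId, falling back to America/New_York if the
 * identifier is missing or unrecognized.
 */
public static ZoneId getTimezone(String tzid) {
    try {
        return ZoneId.of(tzid);
    } catch (Exception e) {
        return ZoneId.of("America/New_York");
    }
}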

@@ -90,6 +94,8 @@
public static Map<String, ScheduledFuture> autoFetchMap = new HashMap<>();
// Scheduled executor that handles running scheduled jobs.
public final static ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
// multimap for keeping track of notification expirations
public final static ListMultimap<FeedSource, Future> scheduledNotificationExpirations = ArrayListMultimap.create();
Member

Can we just store the feed source's ID here instead of the feed source? Actually, I was thinking we should probably keep all scheduled tasks in a single spot (i.e., adapt the autoFetchMap into a general-purpose data structure for these).
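
A minimal sketch of that suggestion (the field name is illustrative, not the merged code): key the multimap by ID strings so one structure can hold scheduled jobs for any entity.

// General-purpose registry of scheduled jobs, keyed by feed source or project ID.
public final static ListMultimap<String, ScheduledFuture> scheduledJobsForId =
    ArrayListMultimap.create();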

@@ -126,6 +132,8 @@ public static void main(String[] args) throws IOException {
}
}

scheduleFeedNotificationExpirations();
Member

I would like to see this method generalized to include the project auto fetch scheduled tasks above.

@@ -60,6 +60,7 @@
<filtering>true</filtering>
<includes>
Member

What is the purpose of this filtering? I don't love the idea of having to explicitly include each item here. Are we excluding something? If so, could we just define an excludes filter?

Author

Looks like I copy-pasted it all in when I added the server info API endpoint. Agree that filtering should be off here.

Author

Just tried a whole bunch of different things and none of them worked, so I think this is actually necessary.

A third participant commented:

There should be comments here showing why we are filtering resources though. Filtering means replacing placeholders like ${key} with variables defined by Maven (often in the POM). In resources/public/index.html I see <link href="https://${S3BUCKET}.s3.amazonaws.com/dist/index.css" rel="stylesheet">, so filtering makes sense here. But I don't think we want to be performing variable substitution on the Logback configuration or YAML config files.
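
One way to scope the filtering, sketched here assuming the resources live in the default src/main/resources directory (not necessarily how the POM ended up): enable filtering only for the HTML entry point and copy everything else verbatim.

<resources>
  <!-- Substitute ${S3BUCKET} and similar placeholders only in the web entry point. -->
  <resource>
    <directory>src/main/resources</directory>
    <filtering>true</filtering>
    <includes>
      <include>public/index.html</include>
    </includes>
  </resource>
  <!-- Copy logback.xml, YAML config, and everything else without variable substitution. -->
  <resource>
    <directory>src/main/resources</directory>
    <filtering>false</filtering>
    <excludes>
      <exclude>public/index.html</exclude>
    </excludes>
  </resource>
</resources>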

@@ -159,10 +167,26 @@ static void initializeApplication(String[] args) throws IOException {
feedBucket = getConfigPropertyAsText("application.data.gtfs_s3_bucket");
bucketFolder = FeedStore.s3Prefix;

// create application gtfs folder if it doesn't already exist
new File(getConfigPropertyAsText("application.data.gtfs")).mkdirs();
Member

The other approach would be to have the application exit if no folder exists here. I'm kind of inclined to do that unless there's a reason why this is better. Curious to hear your thoughts.

Author

When I'm using the dev configuration, the gtfs directory is set to a location under tmp, which I guess gets deleted every so often. Therefore, I added this so the folder will always be created if it doesn't exist. The current code results in the application noting an error on startup but not exiting, and then all kinds of errors happen once a web service is called. I think creating this folder if it doesn't exist would be good default behavior.

Member

What if someone is using a shared config file and the path is set to /Users/landon/nested/dir/that/is/very/deeply/buried? Do we really want to create those directories on the server? If the application isn't properly failing fast, let's just add that behavior. We can also just change the default gtfs directory if that behavior is annoying.

Author

Regarding creating supposedly random folders on the server, if it's what the user specified, why wouldn't datatools-server do that? I don't even know what goes in that folder in the first place. I actually like that the files are saved to a tmp directory so that they get wiped out locally so I don't have to periodically clean things up. I'm also not a fan of the server failing because I haven't done even more manual setup of things. It should just work.

A third participant commented:

It is arbitrary how a server reacts when it's misconfigured. It can fail fast or attempt to recover by creating missing directories. The important thing is that we choose a behavior and stick to it. I'm increasingly a fan of failing fast and showing clear error messages. Every error recovery code path is another thing that needs to be maintained and can have unexpected consequences, e.g. if someone misspells the intended path and we silently create it, or attempt to create it and it fails with a seemingly irrelevant permissions error.
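
For comparison, a minimal fail-fast sketch (LOG and the exit behavior are illustrative assumptions, not the code in this PR):

File gtfsDir = new File(getConfigPropertyAsText("application.data.gtfs"));
if (!gtfsDir.isDirectory()) {
    // Refuse to start with a clear message rather than silently creating the directory.
    LOG.error("Configured GTFS directory does not exist: {}", gtfsDir.getAbsolutePath());
    System.exit(1);
}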

scheduledNotificationExpirations.removeAll(this);
}

public void scheduleExpirationNotifications () {
Member

Needs javadoc explaining what this is.
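
For example, something along these lines (wording based only on the behavior described in this PR):

/**
 * Schedules jobs that notify subscribers of this feed source about upcoming expiration:
 * a warning email one week before the latest version's last calendar date and another
 * email on the day of expiration.
 */
public void scheduleExpirationNotifications () {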

if (parentProject == null) {
    // parent project has been deleted, but feed source/version have not
    // abort the setting up of the notification and figure out why the database has been
    // allowed to devolve to this state
Member

Should we not log a warning or error here?

Author

Logging a warning does seem like a good idea here.
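
For instance (assuming an SLF4J logger named LOG on this class and an id field on the feed source; the early return mirrors the "abort" described in the comment above):

if (parentProject == null) {
    LOG.warn("Parent project of feed source {} no longer exists; skipping expiration notification scheduling.", id);
    return;
}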

long expirationEpochSeconds = latest
    .validationResult
    .lastCalendarDate
    .atTime(0, 0)
Member

Is this time set to midnight? From a user experience perspective, this might not be ideal. Why don't we schedule them for like 9am instead?

Author

9am is too late in the day because some people start working before that. Maybe 4am is best, since it wouldn't usually conflict with other jobs running on the server. This would be a good thing to be one of the first items someone sees in their inbox, perhaps along with a nightly status report / summary.
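
That change would only touch the time-of-day argument in the chain above. A sketch, assuming the chain continues with atZone/toEpochSecond (the continuation is not shown in the excerpt):

long expirationEpochSeconds = latest
    .validationResult
    .lastCalendarDate
    .atTime(4, 0)        // 04:00 local time instead of midnight
    .atZone(timezone)
    .toEpochSecond();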

@@ -538,4 +544,88 @@ public FeedSource clone () throws CloneNotSupportedException {
    return (FeedSource) super.clone();
}

@Override
Member

If we just stash the feed ID in the multimap, we wouldn't need this equality method.

Author

True

} catch (Exception e) {
    timezone = ZoneId.of("America/New_York");
}
ZoneId timezone = getTimezone(project.defaultTimeZone);
Member

Perhaps this scheduleAutoFeedFetch and the FeedSource#scheduleExpirationNotifications (and the respective cancel methods) should all go in the same class. It makes me uncomfortable that this logic is getting spread out and I suspect there is some duplicated code that could be factored out. Maybe this could all go in the datatools.common.util package as ScheduleUtils? Some of the logic in DataManager.java might belong here also.
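
A hypothetical outline of that consolidation (names and signatures are illustrative, not the merged implementation):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.ListMultimap;

public class ScheduleUtils {
    // Single scheduler plus a single ID-keyed registry for every scheduled job.
    private static final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    private static final ListMultimap<String, ScheduledFuture> tasksForId = ArrayListMultimap.create();

    /** Schedule a one-off job (auto fetch, expiration notification, etc.) for the given entity ID. */
    public static void schedule (String entityId, Runnable job, long delay, TimeUnit unit) {
        tasksForId.put(entityId, scheduler.schedule(job, delay, unit));
    }

    /** Cancel and forget all jobs scheduled for the given feed source or project ID. */
    public static void cancelAll (String entityId) {
        tasksForId.removeAll(entityId).forEach(task -> task.cancel(false));
    }
}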

Author

I suppose that makes sense. I'll try to do that.

A few other things I saw: it looks like the logic to schedule auto-fetch of specific feed sources is missing in datatools-server. Also, it looks like we need to reconfigure project creation to automatically schedule fetches for a project upon creation, in light of the changes in catalogueglobal/datatools-ui#41.

Author

Wait, never mind. I see that project auto-fetch must be enabled for a feed to be fetched automatically.

landonreed assigned evansiroky and unassigned landonreed on Nov 28, 2018

@landonreed (Member) commented on Nov 29, 2018

@evansiroky, one other comment I have: can you run the cancel notifications method when a feed source is deleted (i.e., within FeedSource#delete)? While you're at it, please do the same for cancelling auto fetch when a project is deleted. Thanks!

@evansiroky (Author)

@landonreed The method for deleting a project calls the feed source delete method, which calls the feed version delete method, which now recalculates the feed expiration notifications, so the additions to the delete methods you requested shouldn't be needed.

@evansiroky (Author)

Requested changes implemented. Should be ready for review and/or approval.

evansiroky assigned landonreed and unassigned evansiroky on Dec 4, 2018

@landonreed (Member)

OK, #130 has been merged into this PR. I think we can call this complete now.

Successfully merging this pull request may close these issues:
Pom blurb is filtering out logback.xml
Send feed expiration notification to users subscribed to feed source

3 participants