potential rosters and employee assignment spec #81

Draft

skyqrose
Contributor

@skyqrose skyqrose commented Sep 20, 2024

Rendered

Here's an attempt at unifying #28 and #45. It includes both the roster and employee assignments, and both the typical schedule and exceptions. It does not include vehicles.

It includes 4 files:

  • rosters.txt: A typical week for a roster. Pretty directly corresponds to add rosters.txt #45.
  • roster_dates.txt: Exceptions to rosters.txt on specific dates. This makes it possible to describe holidays or vacations that are built into the work, before they're assigned to an employee. It also allows minor disruptions like track work, to say that someone is still doing mostly the same run/roster, but the schedule has changed slightly.
  • employee_rosters.txt: Which roster did an employee get scheduled to do? For North American scheduling, this would refer to most of the rating at once. For UK scheduling, where employees rotate through rosters, there would be one row per employee per week.
  • employee_run_dates.txt: Exceptions to employee_rosters.txt, or allows describing employee assignments without representing rosters, as in Allocating drivers and vehicles in ODS #28. If rosters are used, this allows vacations that are scheduled but not considered part of the roster, or leaves of absence.

None of the files require a row for every date + person pair, and it's intended to cover every crew scheduling use case that I remember coming up in working group discussions.

that's a lot of files

Unfortunately I think it necessarily takes 4 files. To describe what a roster is doing on a date and what an employee is doing on a date requires a (calendar.txt, calendar_dates.txt) pair for both rosters and employees. (You could do it in 1 file each by listing every date separately as in #28, but that's duplication that people wanted to avoid, and means you can't describe a typical week.)
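To make the two (typical week, date exceptions) layers concrete, here is a rough sketch of how a consumer might resolve "what is this employee doing on this date". All field names, IDs, and the in-memory layout are assumptions for illustration, not the columns proposed in the PR.

```python
from datetime import date

# Hypothetical stand-ins for the four files; every name and value here is illustrative.
# rosters.txt: typical week per roster (index 0 = Monday), None = day off.
ROSTERS = {"r1": ["10", "10", "20", "10", "10", None, None]}
# roster_dates.txt: date-specific override of the run a roster performs.
ROSTER_DATES = {("r1", date(2024, 1, 1)): "555"}
# employee_rosters.txt: which roster an employee holds over a date range.
EMPLOYEE_ROSTERS = [("alice", "r1", date(2024, 1, 1), date(2024, 1, 31))]
# employee_run_dates.txt: date-specific override per employee (None = not working).
EMPLOYEE_RUN_DATES = {("alice", date(2024, 1, 3)): None}


def roster_run(roster_id, day):
    """Run a roster performs on a date: exception first, then the typical week."""
    if (roster_id, day) in ROSTER_DATES:
        return ROSTER_DATES[(roster_id, day)]
    return ROSTERS[roster_id][day.weekday()]


def employee_run(employee_id, day):
    """Run an employee performs on a date: employee exception first, then their roster."""
    if (employee_id, day) in EMPLOYEE_RUN_DATES:
        return EMPLOYEE_RUN_DATES[(employee_id, day)]
    for emp, roster_id, start, end in EMPLOYEE_ROSTERS:
        if emp == employee_id and start <= day <= end:
            return roster_run(roster_id, day)
    return None  # no assignment covers this date
```

Neither layer needs a row for every date: a date with no exception row falls through to the typical week.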

I also considered whether we could cut down on the number of files by using service_ids so the exceptions could be defined in calendar_dates.txt (after all, run_events didn't need its own _dates file because it could use service IDs), but I don't think that's possible because rosters need to associate a different run with each day.

Vehicles were originally part of the plan in #28. I cut them from the scope of this PR to focus on crew. One concern people had was that if crew required too many files, maybe vehicles will, too. I don't think that's the case, because vehicles don't have human-specific concerns like rosters and vacations. So having 4 crew files doesn't mean it'll get doubled to 8 crew+vehicle files. I think we should make crew rosters first, and then have a separate discussion for vehicles.

this is a draft

This is just a rough draft to give us something to discuss during working group meetings. It's got some missing pieces and TODOs.

I have a few examples that I've thought about, but don't have them written up yet. If there's any you'd like to see in order to discuss during the working group, let me know, otherwise, I'll put them off til later.

@skyqrose
Contributor Author

Some things from the working group meeting to think about / work on:

  1. Writing examples would help us be sure that this can represent everything we need to. @BTollison mentioned wanting to do some of this? If you need inspiration, there's a list of examples I think would be useful in examples.md.
  2. Could this lean more on the existing calendar.txt and service_ids instead of defining its own calendar+date system for these files? I tried and couldn't figure it out, but someone else could give it a shot.
  3. When a calendar exception happens (like minor time changes due to track work), you have to copy that exception into roster_dates.txt because it refers to runs using a service_id. Is there a way to prevent that, and say it still refers to the same run on a different service, without opening the can of worms of having non-unique run ids?
  4. Vocabulary: There's a ton of new terms in rostering, which we should define so we're all on the same page.
  5. Specifically, this proposal uses roster_id to refer to work that one person might do over the course of a week. Usually "Roster" means a bunch of people's schedules, and a single person's schedule might be a "job" or "roster position". What words does your agency use for this? What should we name the files + fields?
  6. @BTollison raised a case where agencies have roster positions that last multiple weeks (e.g. a two-week rotation). add rosters.txt #45 handled this with a week_sequence field. Should we try to add that feature into this proposal, and if so, how?

@BTollison
Collaborator

BTollison commented Oct 15, 2024

Alright, I finally (sorry) got to this... here are my thoughts:

In the proposed rosters.txt we have service_id fields per day and a start_date and end_date. I think that maybe we can get rid of these. We had discussions earlier about not needing service_id because run_id is tied to a service_id, and I think this idea makes sense now. But to counter a situation where you have multiple sets of rosters, or rosters that do not align with the service_id, I propose we create a roster_group.txt, which I'll describe in more detail below.

I'd like to propose adding the following to rosters.txt

  • roster_week_sequence in the case that a roster has multiple weeks within it, for example a 2 week roster. We can number the sequence of the weeks to define which week is week 1 and which is week 2.
  • roster_group_id ties back to roster_group.txt
  • roster_name this is necessary because roster_id is a primary key, but rosters often have specific numbering sequences like blocks or runs do. So, for example, it is possible to have a roster 101 that is for the whole of January but then replace it with a new roster set that has a roster 101 starting in February in the same data set.

So in practice you end up with:

calendar.txt

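| service_id | monday | tuesday | wednesday | thursday | friday | saturday | sunday | start_date | end_date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mon-Wed | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 20240101 | 20240131 |
| Tue-Thur | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 20240101 | 20240131 |
| Friday | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 20240101 | 20240131 |
| Saturday | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 20240101 | 20240131 |
| Sunday | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 20240101 | 20240131 |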

calendar_dates.txt -- We replace Monday service with a Holiday.

| service_id | date | exception_type |
| --- | --- | --- |
| holiday | 20240101 | 1 |
| Mon-Wed | 20240101 | 2 |
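For readers less familiar with how calendar.txt and calendar_dates.txt combine, here is a small sketch of the standard GTFS resolution (exception_type 1 adds a service on a date, 2 removes it) applied to the example rows in this comment. The in-memory layout and the function names are assumptions for illustration.

```python
from datetime import date, datetime

# calendar.txt rows from the example: service_id -> (Monday..Sunday flags, start, end)
CALENDAR = {
    "Mon-Wed": ((1, 0, 1, 0, 0, 0, 0), "20240101", "20240131"),
    "Tue-Thur": ((0, 1, 0, 1, 0, 0, 0), "20240101", "20240131"),
    "Friday": ((0, 0, 0, 0, 1, 0, 0), "20240101", "20240131"),
    "Saturday": ((0, 0, 0, 0, 0, 1, 0), "20240101", "20240131"),
    "Sunday": ((0, 0, 0, 0, 0, 0, 1), "20240101", "20240131"),
}
# calendar_dates.txt rows from the example: (service_id, date, exception_type)
CALENDAR_DATES = [("holiday", "20240101", 1), ("Mon-Wed", "20240101", 2)]


def parse(yyyymmdd):
    """Parse a GTFS-style YYYYMMDD string into a date."""
    return datetime.strptime(yyyymmdd, "%Y%m%d").date()


def active_services(day):
    """Return the set of service_ids in effect on the given date."""
    active = set()
    for service_id, (flags, start, end) in CALENDAR.items():
        if parse(start) <= day <= parse(end) and flags[day.weekday()]:
            active.add(service_id)
    for service_id, exception_date, exception_type in CALENDAR_DATES:
        if parse(exception_date) == day:
            if exception_type == 1:    # service added on this date
                active.add(service_id)
            elif exception_type == 2:  # service removed on this date
                active.discard(service_id)
    return active
```

With these rows, Monday 2024-01-01 resolves to only the holiday service, while Wednesday 2024-01-03 still resolves to Mon-Wed.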

roster_group.txt -- In this example, you can see that we have defined two types of rosters for the same period: one for a 40 hour work week and another for a 36 hour work week.

| roster_group_id | start_date | end_date | roster_set_name |
| --- | --- | --- | --- |
| 40_hour_prod | 20240101 | 20240131 | Production |
| 36_hour_prod | 20240101 | 20240131 | Production |

  • roster_group_id defines the primary key per roster set
  • start_date start date of the roster set
  • end_date end date of the roster set
  • roster_set_name human readable name of the roster set

rosters.txt -- In this roster we have 3 roster entries, with a 2 week rotation for each, from 2 different sets of rosters that attach to the same run_id's. So, they stay independent of the service_id's but are still linked to them.

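| roster_group_id | roster_id | roster_name | roster_week_sequence | monday_run_id | tuesday_run_id | wednesday_run_id | thursday_run_id | friday_run_id | saturday_run_id | sunday_run_id |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 40_hour_prod | 1 | 101 | 1 | 10 | 10 | 20 | 10 | 10 | | |
| 40_hour_prod | 1 | 101 | 2 | 20 | 20 | 10 | 20 | | | 210 |
| 40_hour_prod | 2 | 102 | 1 | 11 | 11 | 21 | | | 111 | 211 |
| 40_hour_prod | 2 | 102 | 2 | 21 | 21 | | | 21 | 111 | 211 |
| 36_hour_prod | 3 | 201 | 1 | 20 | | | 20 | 20 | 110 | |
| 36_hour_prod | 3 | 201 | 2 | | | 20 | 10 | 10 | 110 | |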

Just to make things easy to follow, this is a summary of the changes to rosters.txt again.

  • roster_week_sequence Add to deal with the case that a roster has multiple weeks within it, for example a 2 week roster. We can number the sequence of the weeks to define which week is week 1 and which is week 2.
  • roster_group_id add to tie back to roster_group.txt
  • roster_name add because roster_id is a primary key, but rosters often have specific numbering / naming conventions like blocks or runs do. So, for example, it is possible to have a roster 101 that is for the whole of January but then replace it with a new roster set that has a roster 101 starting in February in the same data set.
  • service_id remove because run_id ties back to service_id
  • start_date remove, shifted this to roster_group.txt
  • end_date remove, shifted this to roster_group.txt

roster_dates.txt -- We complete the cycle of service replacement here by replacing the run_id. When coupled with calendar_dates.txt that has replaced the service_id, then you can now fully replace both the public facing trips and the internal work. For clarity, run_id 555 is added, and run_id 10 is removed. I suppose technically replacing roster_name is also possible, but I'm not so sure on this one yet.

| roster_group_id | date | roster_id | roster_name | run_id | exception_type |
| --- | --- | --- | --- | --- | --- |
| 40_hour_prod | 20240101 | 1 | 101 | 555 | 1 |
| 40_hour_prod | 20240101 | 1 | 101 | 10 | 2 |

  • We delete service_id in the original proposal because run_id is tied to it to remove complexity.
  • We add roster_group_id to tie us back to a roster set.
  • We can either have roster_name here or not, I need to think this through more.
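As a sanity check on the example, here is a minimal sketch of applying roster_dates.txt on top of rosters.txt for roster 1. It assumes exception_type mirrors calendar_dates.txt (1 adds a run, 2 removes one), treats blank run cells as days off (the blank positions are read from the example table), and takes the week of the rotation as an explicit argument, since how a date maps to a roster_week_sequence is not defined here.

```python
from datetime import datetime

# rosters.txt rows for roster_id 1: (roster_id, roster_week_sequence) -> Monday..Sunday run_ids
ROSTERS = {
    (1, 1): [10, 10, 20, 10, 10, None, None],
    (1, 2): [20, 20, 10, 20, None, None, 210],
}
# roster_dates.txt rows: (roster_id, date, run_id, exception_type)
ROSTER_DATES = [
    (1, "20240101", 555, 1),  # run 555 added
    (1, "20240101", 10, 2),   # run 10 removed
]


def runs_on(roster_id, week_sequence, yyyymmdd):
    """Return the set of run_ids a roster performs on a date, after exceptions."""
    day = datetime.strptime(yyyymmdd, "%Y%m%d").date()
    base = ROSTERS[(roster_id, week_sequence)][day.weekday()]
    runs = set() if base is None else {base}
    for rid, exception_date, run_id, exception_type in ROSTER_DATES:
        if rid == roster_id and exception_date == yyyymmdd:
            if exception_type == 1:
                runs.add(run_id)
            elif exception_type == 2:
                runs.discard(run_id)
    return runs
```

On 20240101 roster 1 ends up with only run 555, and any date without exception rows falls back to the weekly pattern.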

It took me a few days to think about this overall, what do you guys think?

@skyqrose
Contributor Author

Oooh thanks for the proposal, there are a ton of good new ideas in here.

> We had discussions earlier about not needing service_id because run_id is tied to a service_id

We had discussions about trips being tied to a service_id. But at least at my agency, run_ids are re-used between services, so run_id does need a service_id next to it. It is possible to work around this, but it'd be hard. (More details collapsed cuz it's a bit of a tangent.)

How to get unique run IDs
  • Add to the spec that run_id must be unique across the whole dataset, like trip_id is.
    • (this is backwards incompatible for anyone already using run_events.txt, not sure if that applies to anyone yet)
  • Translate all our runs from their existing names "500" to new unique names "2024-Fall-weekday-500"
  • Have a way to translate "2024-Fall-weekday-500" into the human-readable name "500"
    • Possibly a de-normalized run_name column in run_events.txt
    • Possibly a new file runs.txt with two columns run_id and run_name.

That's all possible, but in my opinion referring to runs everywhere as a (service_id, run_id) pair is easier than renaming the run IDs.
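A quick sketch of the two options side by side, in case it helps the comparison. The trip IDs, the "saturday" service, and the dict layout are made up for illustration; runs.txt is the hypothetical file from the list above.

```python
# Option 1: keep existing run IDs and key everything on a (service_id, run_id) pair.
trips_by_service_and_run = {
    ("weekday", "500"): ["trip_a", "trip_b"],
    ("saturday", "500"): ["trip_c"],
}

# Option 2: dataset-unique run_id, plus a hypothetical runs.txt (run_id, run_name)
# that maps each unique ID back to the name people actually use.
RUNS = {
    "2024-Fall-weekday-500": "500",
    "2024-Fall-saturday-500": "500",
}
trips_by_run = {
    "2024-Fall-weekday-500": ["trip_a", "trip_b"],
    "2024-Fall-saturday-500": ["trip_c"],
}


def display_name(unique_run_id):
    """Translate a dataset-unique run_id into its human-readable run name."""
    return RUNS[unique_run_id]
```

Both shapes carry the same information; the difference is whether every reference to a run has two columns or one column plus a lookup.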

  1. I like the roster group abstraction. This is the first time GTFS or TODS have been able to represent the concept of "The whole spring schedule", and it means we don't have to denormalize the start/end date across every roster in the group. It does mean an extra file, though, not sure if that's worth it.

  2. Is roster name meant to be used to match related rosters together by checking which rosters have the same name, or is it just for displaying to humans? Are January 101 and February 101 necessarily related?

  3. How do you know which week of a multi-week roster a date belongs to? Is Wednesday 2024-01-31 part of week 1 or week 2?

  4. What is roster_set_name used for?

  5. The example roster_dates.txt edits roster 1, but not 2 or 3. This implies that 1 is doing the special holiday run 555, but 2 and 3 are still doing the normal weekday runs 11 and 20. Does this proposal solve the problem of needing to add every roster to roster_dates even if there's only a minor schedule change (note 3 in my last list)? I was hoping the new roster_groups concept could do that somehow but I don't see how. Does run 11 work on trips on both services Mon-Wed and holiday?

@BTollison
Collaborator

> We had discussions about trips being tied to a service_id. But at least at my agency, run_ids are re-used between services, so run_id does need a service_id next to it. It is possible to work around this, but it'd be hard. (More details collapsed cuz it's a bit of a tangent.)

Is this because, for you, run_id is the literal run number? I realized this is a problem in the current situation and that we probably need to propose a run_name field to allow run_id to be a proper primary key.

>   2. Is roster name meant to be used to match related rosters together by checking which rosters have the same name, or is it just for displaying to humans? Are January 101 and February 101 necessarily related?

Yes they can be related in that it's for garage 101 or something like this. I find that in this industry we are often renumbering things in a particular way for a particular business reason. So you are extremely likely to end up with overlapping names, that are meant to be human readable. So it can be useful for looking up as well, especially if you are troubleshooting with someone in the field.

>   3. How do you know which week of a multi-week roster a date belongs to? Is Wednesday 2024-01-31 part of week 1 or week 2?

Yeah, this is something that we can either have a field for or assume that the operations software can control it. I had thought about having a field that basically indicates that week X is the start week of this dataset. So, if you sent an update but didn't change the start_date or end_date, it's possible to compensate with this field as well. Perhaps we can make this an optional field?
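One way the "week X is the start week" field could work, sketched under stated assumptions: weeks are counted in 7-day blocks from the roster group's start_date, and start_week is a hypothetical optional field that defaults to 1. Neither rule is settled in this thread.

```python
from datetime import date


def roster_week(day, group_start, num_weeks, start_week=1):
    """Return the 1-based week of a multi-week roster that a date falls in.

    Assumes 7-day blocks counted from group_start; start_week (hypothetical)
    says which week of the cycle the first block represents.
    """
    weeks_elapsed = (day - group_start).days // 7
    return (start_week - 1 + weeks_elapsed) % num_weeks + 1
```

Under these assumptions, with a 2-week roster starting on 2024-01-01 as week 1, `roster_week(date(2024, 1, 31), date(2024, 1, 1), 2)` gives week 1. Sending an update with `start_week=2` and the same dates would shift every date to the other week.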

>   4. What is roster_set_name used for?

I have seen that often there are multiple roster sets for a garage, so roster_set_name can be used to group them together in some way or allow for whatever business related naming convention is necessary in the field.

>   5. The example roster_dates.txt edits roster 1, but not 2 or 3. This implies that 1 is doing the special holiday run 555, but 2 and 3 are still doing the normal weekday runs 11 and 20. Does this proposal solve the problem of needing to add every roster to roster_dates even if there's only a minor schedule change (note 3 in my last list)? I was hoping the new roster_groups concept could do that somehow but I don't see how. Does run 11 work on trips on both services Mon-Wed and holiday?

I lean towards thinking we probably can't solve this, because it's hard to see how we aren't stuck trying to tell the programs what needs to happen when we no longer do the normal work that was scheduled. We need to know who does what in place of it. By the way, this is a good time to say I totally forgot about the employee_rosters.txt. But now that I see it again, I think we can just incorporate the employee_id into rosters.txt and roster_dates.txt.

@skyqrose
Contributor Author

skyqrose commented Nov 7, 2024

To break some (not all) of the changes down into smaller topics that could be discussed separately:

  • Unique run IDs: Should we refer to runs as (service_id, non_unique_run_id), or should we say that run_id must be unique, and add a separate run_name field?
  • Multi-week rotations: Do we need to support them? If so, then we need to figure out the last details. If not, then agencies who use multi-week rotations would have to reassign people to a different roster position each week.
  • Roster groups: Should we add this level of abstraction and add another file? Is normalizing the dates and having a roster set name useful enough to be worth the extra file?

My opinions on them:

  • Unique run IDs: Non-unique Run IDs keep causing problems, so I guess it's worth the effort to fix, but it'd probably be backwards incompatible. I'm on the fence.
  • Multi-week rotations: It would be nice to represent the data in a way that's more semantic, but I don't see a way to do it without exploding the complexity of the calendar, so I'm leaning opposed to it.
  • Roster groups: The benefits to normalizing this data seem small enough now that it doesn't seem worth an extra file, but I could be convinced that it's important.

@safrazier17
Contributor

@skyqrose I have been having trouble consolidating the proposal because of these questions. Thanks for putting it so neatly. My thoughts:

I don't think we should do another backwards incompatible change at this point. Can you clarify for me: you're saying MBTA would use the same run id twice on a monday and have it assigned to two separate services and containing different trips? How do you differentiate between them in dispatch? Are they for different modes?

multi-week and roster groups: I am inclined to leave out at this point. We are adding a lot of complexity to the spec as is and I think it makes sense to defer these to a future update unless it is actively blocking an in-progress implementation.

@jeffkessler-keolis
Contributor

Okay, apologies for the delay in weighing in on this.

> Unique run IDs: Should we refer to runs as (service_id, non_unique_run_id), or should we say that run_id must be unique, and add a separate run_name field?

I'm in favor of run_id not being forced to a dataset-unique value given the use case of having a consistent run_id spanning multiple service_ids.

  • Those who want to have crew members assigned to a single run regardless of the service_ids in effect could do so by leaving the service_id definition blank; this is something of which we'd make tremendous use. Others could specify runs based on the specific service_id definition.

  • Those of us who have a crew-facing job number can use that figure without needing to use or include an internal identifier, but an internal/unique identifier could always be populated therein.
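A rough sketch of how a blank service_id could act as "any service" when looking up who works a run. The row layout, names, and the rule that an exact service_id match wins over a blank one are all assumptions for illustration.

```python
# Hypothetical assignment rows: (employee_id, service_id, run_id).
# An empty service_id means the assignment holds whatever service is in effect.
ASSIGNMENTS = [
    ("alice", "", "500"),
    ("bob", "weekday", "501"),
    ("carol", "holiday", "501"),
]


def employee_for(service_id, run_id):
    """Find who works a run under a service; exact service match beats a blank one."""
    wildcard = None
    for employee_id, row_service, row_run in ASSIGNMENTS:
        if row_run != run_id:
            continue
        if row_service == service_id:
            return employee_id
        if row_service == "":
            wildcard = employee_id
    return wildcard  # None if nobody is assigned
```

Here run 500 stays with the same person on every service, while run 501 changes hands between the weekday and holiday services.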

> Multi-week rotations: Do we need to support them? If so, then we need to figure out the last details. If not, then agencies who use multi-week rotations would have to reassign people to a different roster position each week.

  • There are some railroads in the US (e.g. Amtrak) that have positions operate on an alternating cycle (e.g. everyone works one weekend day every-other weekend).

  • Many, if not most, foreign operations rely on a months-long roster cycle, whereby the positions are rotated through on a regular basis (and probably common for @BTollison's operations, too).

  • I see a few ways this approach could be implemented:

    • Every multi-week roster is itemized out with weekly roster dates, each of which have to be assigned to their specific dates.
    • The roster is listed with each week in succession, with the start listed in a date-based assignment. (This is still the same amount of data if over a long period of time, but the date-interpretation becomes implicit vs explicit, and could be repeated for multi-week cycles.)
    • The roster is listed with each week in succession, and individuals could be assigned to both a roster position AND a starting week. (This reduces the amount of data required to convey the applicable information, and although implicit with dates, is explicit in operation and intentionality.)
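To illustrate the last option (a roster position plus a starting week per person), here is a minimal sketch. The position, run IDs, employee names, and the 7-day counting from the assignment start date are all hypothetical.

```python
from datetime import date

# Hypothetical multi-week roster position: one Monday..Sunday list of run_ids
# per week of the cycle, None = day off.
ROTATION = {
    "pos_a": [
        ["10", "10", "10", "10", "10", None, None],   # week 1
        [None, None, "20", "20", "20", "110", None],  # week 2
    ],
}
# Hypothetical assignments: employee -> (position, starting week, assignment start date)
EMPLOYEES = {
    "alice": ("pos_a", 1, date(2024, 1, 1)),
    "bob": ("pos_a", 2, date(2024, 1, 1)),
}


def run_for(employee_id, day):
    """Return the run an employee works on a date, cycling through the rotation."""
    position, starting_week, start = EMPLOYEES[employee_id]
    weeks = ROTATION[position]
    index = (starting_week - 1 + (day - start).days // 7) % len(weeks)
    return weeks[index][day.weekday()]
```

Two people share one position definition and simply enter the cycle at different weeks, so the rotation is written once no matter how many employees rotate through it.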

> Roster groups: Should we add this level of abstraction and add another file? Is normalizing the dates and having a roster set name useful enough to be worth the extra file?

I'm admittedly not sure how much is needed here / not sure I get the use case. Happy to have this abstraction as an optional grouping, but I don't know if I am understanding a reason to make it a mandatory structural element at this point.

@safrazier17
Contributor

Summary of decisions from today's meeting:

  • We will add a roster_groups file that will not include start and end date, which will instead remain in the rosters.txt file (@skyqrose, can you incorporate this into the PR?)
  • We will defer consideration of multi-week rosters; this is likely a high priority for Euro operators, but adds a level of complexity that we are not currently prepared to consider in v2.1
  • No final decision has been made on run_id uniqueness, but we are leaning toward continuing to key on the unique run_id/service_id pair rather than introducing a new run_name concept
  • @antrim will build a vehicle assignments proposal based on @BTollison's add Vehicles.txt  #30; will report on whether this is a blocker for current stage implementation

@antrim antrim mentioned this pull request Dec 13, 2024
antrim added a commit to antrim/operational-data-standard that referenced this pull request Dec 15, 2024