Monthly buckets of data #215

Open
sabahfromlondon opened this issue May 10, 2022 · 8 comments
Labels: component: detail, enhancement (New feature or request)

Comments

@sabahfromlondon

sabahfromlondon commented May 10, 2022

The registry can currently only provide users with data in yearly buckets.

This is not ideal for two main reasons:

  • In the interface, users are asked to provide start and end dates for the data they are searching for (i.e. start month and year, and end month and year), so there is a mismatch between what they request and what they receive.
  • We end up giving users too much data, which makes for a poor user experience: users may have to go through the dataset and delete large swathes of it. For example, if a user selects a start date of Dec 2018 and an end date of Jan 2019 (so data spanning only two months), they currently receive data for all of 2018 and all of 2019!

As a result, we would like there to be monthly buckets of data, so that a user can receive only the data from the start month and year and up to only the end month and year. For example:

  • IF a user selects a start date of Dec 2018 and end date of Jan 2019, THEN they will only receive data for those two months, BUT all in one sheet for CSV or one JSON file.
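As a rough illustration of the bucket selection described above, here is a minimal sketch that lists the monthly buckets between a start and end date, inclusive. The function name `months_in_range` and the `YYYY-MM` label format are assumptions for illustration, not the registry's actual code.

```python
from datetime import date
from typing import List


def months_in_range(start: date, end: date) -> List[str]:
    """Return "YYYY-MM" labels for every month from start to end, inclusive."""
    if (start.year, start.month) > (end.year, end.month):
        raise ValueError("start must not be after end")
    labels = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        labels.append(f"{year:04d}-{month:02d}")
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return labels


print(months_in_range(date(2018, 12, 1), date(2019, 1, 31)))
# ['2018-12', '2019-01']
```

The monthly files for these labels would then be combined into one CSV sheet or one JSON file for download.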

The UI options in the screenshot below will need to be updated as a result of this change to monthly buckets.

image

  • IF the user selected a date range in the faceted search, THEN the first option, automatically selected for the user, should now present that date range. IF the user sticks with this option and clicks "Download data", THEN they should be provided with the appropriate monthly buckets of data. We can still keep the current download options as the second and third options for the user.
  • IF the user did not select any date range in the faceted search (the default would then be "All"), THEN the current UI options should remain.

IMPORTANT: We should have the same set of options for JSON and CSV while the Flatten tool is not ready to be launched.

@jpmckinney jpmckinney added the enhancement New feature or request label Sep 16, 2022
@jpmckinney jpmckinney added this to the V2 milestone Sep 16, 2022
@yolile yolile mentioned this issue Sep 21, 2022
@jpmckinney
Member

Noting that there are a couple places in the code that test for the presence of "_" in a filename (separator between year and month): in the flattener callback and files_available.
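For readers unfamiliar with the convention, a minimal sketch of that kind of check follows. It assumes file stems of the form `2018` (yearly) and `2018_12` (monthly); the exact naming and the helper `parse_bucket` are hypothetical, not taken from the codebase.

```python
from typing import Optional, Tuple


def parse_bucket(stem: str) -> Tuple[int, Optional[int]]:
    """Split a file stem into (year, month); month is None for a yearly file."""
    # Assumption: "_" separates year and month, e.g. "2018_12".
    if "_" in stem:
        year, month = stem.split("_", 1)
        return int(year), int(month)
    return int(stem), None


print(parse_bucket("2018_12"))  # (2018, 12)
print(parse_bucket("2018"))     # (2018, None)
```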

When we're ready to convert monthly files, we can update the flattener callback to not convert files if the output already exists. That way, we can just publish messages to the queue to fill in the month files.
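A minimal sketch of that "skip if the output already exists" behavior, assuming the conversion is a callable taking a source path and an output path. `convert_if_missing` and its parameters are hypothetical names, not the flattener's real interface.

```python
from pathlib import Path
from typing import Callable


def convert_if_missing(
    source: Path, output: Path, convert: Callable[[Path, Path], None]
) -> bool:
    """Run convert(source, output) unless output already exists.

    Returns True if a conversion was performed, False if it was skipped.
    """
    if output.exists():
        # Already converted, so republishing the same message does no extra work.
        return False
    convert(source, output)
    return True
```

With a guard like this, messages could be published for every file in a job, and only the missing month files would actually be converted.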


As for the frontend, if I understand @sabahfromlondon, users can:

  • Download data for all time
  • Download data for a specific year
  • Specify a start date and end date, and download all intervening months as one file

We can't simply display all months like we do years, because, for example, job 713 has 210 files (2005-12 to 2022-06). #235 (comment)

I figure users would just want all the data from start date to end date. If they need access to individual months, then we would need a new design.

For reference, here is the current design at https://data.open-contracting.org/en/publication/22:

Screen Shot 2022-09-23 at 4 16 23 PM

Users can get all-time or a year in one click. For start/end, I figure we can have a small form with start date, end date and "download" button. The date fields can be pre-populated according to the search filters.

In this way, if a user searches without any date filters, they still have the opportunity to set a specific range.

The form can enforce a minimum/maximum according to the known date range, and it can perhaps repeat the date range, so that users don't need to scroll to the top to remember.
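To make the pre-population and min/max idea concrete, here is a minimal sketch, assuming the known coverage and the optional search filters are available as dates. `initial_range` is a hypothetical helper, and falling back to the full coverage when the filter does not overlap it is just one possible choice.

```python
from datetime import date
from typing import Optional, Tuple


def initial_range(
    coverage_start: date,
    coverage_end: date,
    filter_start: Optional[date] = None,
    filter_end: Optional[date] = None,
) -> Tuple[date, date]:
    """Pre-populate the form from the search filters, clamped to known coverage."""
    start = coverage_start if filter_start is None else max(filter_start, coverage_start)
    end = coverage_end if filter_end is None else min(filter_end, coverage_end)
    if start > end:
        # Assumption: if the filter misses the coverage entirely, show the full range.
        return coverage_start, coverage_end
    return start, end


# No search filter: the form shows the whole known range.
print(initial_range(date(2005, 12, 1), date(2022, 6, 30)))
# A filter starting before the coverage is clamped to the coverage start.
print(initial_range(date(2005, 12, 1), date(2022, 6, 30), date(2000, 1, 1), date(2006, 3, 31)))
```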

@sabahfromlondon
Author

@jpmckinney I'm a bit confused about what has happened, because the requirement was for the user to select the date range as part of their search.

In the requirements document as part of the faceted search I had the options: Past month, Past 6 months, Past year, Past 5 years, All time - as part of a drop down. Currently the UI is only showing: All, Past year, Past six months using radio buttons - which I can live with.

There was also a requirement for a custom date range option in the design using a calendar selector. I'm pretty sure this was developed but is no longer available. I'm not sure what happened to it!

The date range for the download should be for what the user selected back on the search page. It's why we added the feedback labels on the datasets in case there was a partial coverage issue.

I think it's odd for the user to have to re-select the date range here. Is it because of a technical difficulty?

@jpmckinney
Member

jpmckinney commented Sep 28, 2022

The UI for a date range needs to be added back to the search page. It is temporarily missing. I had to rip out Vue in order to fix a variety of bugs, and didn't have time to add that functionality back yet. Now tracked in #249

We can add Past month and Past 5 years once #234 is closed. Now tracked in #250.

I'm suggesting that if the user did not indicate a date range on the search page, then they can have the opportunity to set one on the detail page. Also, if they change their mind, they can do so without going back to the search page. If they did set a date range on the search page, it would be the same on the detail page. The change I'm suggesting is to make it editable.

@sabahfromlondon
Author

I do see where you are coming from, and editing the date range on the details page is not unprecedented, but we did do extra work to support the user on the search page with the feedback labels for the coverage. The goal was for users to feel confident that they could access data for the date range they wanted. I hoped that there would be no need for the user to select again, except in the rarest of circumstances.

If the change, as you say, is to make the date range editable in the Access data section, then the options for the user should mirror what is on the search page.

@sabahfromlondon
Author

Thanks for adding the additional tickets :)

@jpmckinney
Member

Yup, the partial overlap logic will be restored along with the filter, and the options will be synced to avoid data re-entry for the common case.
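For context, a minimal sketch of what a partial overlap check could look like, comparing the requested range with a dataset's known coverage. The function `coverage_label` and the label values `"full"`, `"partial"` and `"none"` are assumptions for illustration, not the registry's actual logic.

```python
from datetime import date


def coverage_label(
    requested_start: date,
    requested_end: date,
    data_start: date,
    data_end: date,
) -> str:
    """Classify how much of the requested range the dataset covers."""
    if requested_end < data_start or requested_start > data_end:
        return "none"
    if data_start <= requested_start and requested_end <= data_end:
        return "full"
    # The ranges intersect, but part of the request falls outside the data.
    return "partial"


print(coverage_label(date(2018, 12, 1), date(2019, 1, 31), date(2005, 12, 1), date(2022, 6, 30)))
# full
print(coverage_label(date(2022, 1, 1), date(2022, 12, 31), date(2005, 12, 1), date(2022, 6, 30)))
# partial
```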

@sabahfromlondon
Author

Sounds perfect!

@jpmckinney jpmckinney modified the milestones: V2, Priority Feb 8, 2024
@jpmckinney
Member

Relevant for large annual Excel files.
