[FEATURE]: Create Train and Test Datasets from User-Uploaded Dataset in S3 for /training #913

dwu359 · 2023-08-20T22:24:20Z

Feature Name

Create Train and Test Datasets from S3 for /training

Your Name

Daniel Wu

Description

Currently, the training backend can only handle default datasets for /tabular. Allow user-uploaded datasets to be used for tabular training by implementing a dataset creator in training/dataset.py, so that the /tabular endpoint can read a file from S3, given its filename, and split it into train and test datasets.

Right now, datasets are stored in S3 in the dlp-upload-bucket bucket under the key {uid}/{trainspace_type}/{filename}.
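To make the key layout concrete, here is a minimal sketch of how the object key could be assembled. The function name `build_dataset_key` and the constant `UPLOAD_BUCKET` are hypothetical and not taken from the repo; only the bucket name and the `{uid}/{trainspace_type}/{filename}` pattern come from the description above.

```python
# Hypothetical helper: names here are illustrative, not from the DLP codebase.
UPLOAD_BUCKET = "dlp-upload-bucket"  # bucket name as stated in this issue


def build_dataset_key(uid: str, trainspace_type: str, filename: str) -> str:
    """Return the S3 object key for a user-uploaded dataset.

    Follows the {uid}/{trainspace_type}/{filename} layout described above.
    Rejects empty parts and parts containing "/" so that a caller cannot
    accidentally address another user's prefix.
    """
    parts = (uid, trainspace_type, filename)
    for part in parts:
        if not part or "/" in part:
            raise ValueError(f"invalid key component: {part!r}")
    return "/".join(parts)


if __name__ == "__main__":
    print(build_dataset_key("some-uid", "tabular", "iris.csv"))
```

Whether the backend should validate the components this strictly is a design choice left to the implementer.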

You can upload files to the bucket using the SST prod endpoint https://em9iri9g4j.execute-api.us-west-2.amazonaws.com/ and the /datasets/user/{type}/{filename}/presigned_upload_url route.
EDIT: The above statement is not true, see below

You will also need a bearer token, which can be obtained using the backend CLI. For more info, run cd training && poetry run python cli.py --help.


github-actions · 2023-08-20T22:24:31Z

Hello @dwu359! Thank you for submitting the Feature Request Form. We appreciate your contribution. 👋

We will look into it and provide a response as soon as possible.

To work on this feature request, you can follow these branch setup instructions:

Check out the nextjs branch:

```
 git checkout nextjs
```

Pull the latest changes from the remote nextjs branch:

```
 git pull origin nextjs
```

Create a new branch specific to this feature request using the issue number:

```
 git checkout -b feature-913
```

Feel free to make the necessary changes in this branch and submit a pull request when you're ready.

Best regards,
Deep Learning Playground (DLP) Team

karkir0003 · 2023-09-06T22:02:06Z

@NMBridges you're doing this task

dwu359 · 2023-09-16T15:48:43Z

@NMBridges My bad, this task should deal with reading the dataset files from S3 into training, not writing files to S3.

karkir0003 · 2023-09-16T15:56:47Z

https://github.com/DSGT-DLP/Deep-Learning-Playground/blob/nextjs/training/training/core/dataset.py

should be the file to implement this endpoint in @NMBridges

karkir0003 · 2023-09-16T15:59:57Z

@NMBridges also, assume the scope of this use case to be for tabular (so reading CSV from S3 and then building train/test dataset). See example dataset creator class in the linked file
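For illustration, the tabular flow described here (read a CSV from S3, then build train/test sets) could be sketched roughly as follows. This is not the dataset creator interface from the linked file, which is not shown in this thread; `read_csv_from_s3` and `split_csv_rows` are hypothetical names, and the split uses only the standard library where the real implementation may prefer pandas or scikit-learn. The only third-party call is boto3's `get_object`, imported lazily so the split logic can be tried without AWS credentials.

```python
import csv
import io
import random


def read_csv_from_s3(bucket: str, key: str) -> str:
    """Download an object from S3 and return its contents as text."""
    import boto3  # lazy import: only needed when actually talking to S3

    response = boto3.client("s3").get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode("utf-8")


def split_csv_rows(csv_text: str, target_col: str, test_size: float = 0.2, seed: int = 0):
    """Parse CSV text and return (train_rows, test_rows) as lists of dicts.

    Rows are shuffled with a fixed seed so the split is reproducible.
    At least one row always goes to the test set.
    """
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("dataset has no rows")
    if target_col not in rows[0]:
        raise KeyError(f"target column {target_col!r} not found")
    random.Random(seed).shuffle(rows)
    n_test = max(1, round(len(rows) * test_size))
    return rows[n_test:], rows[:n_test]


if __name__ == "__main__":
    # Local demo with in-memory CSV text; no S3 access required.
    sample = "x,y\n" + "\n".join(f"{i},{i % 2}" for i in range(10))
    train, test = split_csv_rows(sample, target_col="y")
    print(len(train), len(test))
```

In the backend, the text passed to `split_csv_rows` would come from `read_csv_from_s3` with the bucket and key for the user's upload.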

dwu359 added the enhancement New feature or request label Aug 20, 2023

karkir0003 added this to Backend Improvements Aug 21, 2023

karkir0003 moved this to Todo in Backend Improvements Aug 21, 2023

dwu359 added the backend backend tasks label Aug 21, 2023

dwu359 assigned NMBridges Sep 6, 2023

karkir0003 added this to DLP Project Board Sep 10, 2023

github-project-automation bot moved this to Todo in DLP Project Board Sep 10, 2023

karkir0003 moved this from Backlog to Todo in DLP Project Board Sep 10, 2023

NMBridges mentioned this issue Sep 16, 2023

Implemented custom dataset creator class #962

Open

karkir0003 moved this from Todo to Review in DLP Project Board Sep 17, 2023

noah-iversen moved this from Review to Todo in DLP Project Board Feb 10, 2024
