[Feature]: Data Structure Update #124

Closed
1 task done
wincowgerDEV opened this issue Mar 6, 2023 · 7 comments · Fixed by #173
Assignees
Labels
documentation Improvements or additions to documentation enhancement New feature or request reviewer comment feedback from manuscript reviewers, should be prioritized

Comments

@wincowgerDEV
Collaborator

wincowgerDEV commented Mar 6, 2023

Guidelines

  • I agree to follow this project's Contributing Guidelines.

Description

Reviewer Said:
I agree with @luxaritas that the written documentation could be expanded. What I miss the most is a detailed explanation of the JSON data structure for post-processing software. Which fields are expected and what information do they contain? Are there fields which are not available for all data? As I am not familiar with trash research, I was also wondering whether there is some standardized format for data exchange on trash location data and labelling which could be used here (I think the Hapich et al 2022 paper is elaborating on this, so it would be nice to read some more details on how this trash taxonomy connects to the data format of TrashAI). Also, I understand that your targeted audience is less technical, so dealing with JSON data may be a barrier. I think it would also be good to offer a PDF-based overview (like an analysis page which could just be printed to a PDF, so users could directly have a map overview of their trash expedition).

Problem

The JSON schema isn't well defined. The JSON format can be challenging to work with, and summary statistics may be preferred.

Proposed Solution

Create a JSON schema file and some simple-to-read documentation on the JSON format. Also create a download with summary information in CSV or a similar flat format.

Alternatives Considered

Expect that folks are savvy in JSON.

@wincowgerDEV wincowgerDEV added the enhancement New feature or request label Mar 6, 2023
@wincowgerDEV
Collaborator Author

wincowgerDEV commented Mar 6, 2023

Adding in reviewer's recommendation: I was missing information on which software people typically use for researching trash images. Since I'm not familiar with the field I may not be aware that such software is not really available, but I think it would be useful to give a glimpse into the working process around your tool (like: what do I actually do with the downloaded JSON data?). It would also be interesting to know whether there are some widely known databases where you can put your trash data, as I think the useful part of such information is putting a lot of data from different researchers together to give a broad overview of littering places. I think it would be especially nice to have some guidance on what to do with the JSON data downloaded by your tool. Is there some analysis software for that?

@epierotti3 epierotti3 added documentation Improvements or additions to documentation reviewer comment feedback from manuscript reviewers, should be prioritized labels Apr 5, 2023
@epierotti3 epierotti3 moved this to To-Do in TrashAI Apr 5, 2023
@epierotti3 epierotti3 added this to the Complete Manuscript milestone Apr 5, 2023
@shollingsworth
Collaborator

I'm not sure what this is asking, might need to get with Win on this one.

@wincowgerDEV
Collaborator Author

> I'm not sure what this is asking, might need to get with Win on this one.

@shollingsworth, yeah this one is in my court. I can do all of this.

@wincowgerDEV
Collaborator Author

Here is the JSON schema for the summary file. I think it should be added to the download.

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "detected_objects": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "count": { "type": "integer", "minimum": 0 },
          "hashes": { "type": "array", "items": { "type": "string" } }
        },
        "required": ["name", "count", "hashes"]
      }
    },
    "no_detection_hashes": { "type": "array", "items": { "type": "string" } },
    "unique_detections": { "type": "integer", "minimum": 0 },
    "total_detections": { "type": "integer", "minimum": 0 },
    "gps": {
      "type": "object",
      "properties": {
        "list": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "coordinate": {
                "type": "object",
                "properties": {
                  "lat": { "type": "number" },
                  "lng": { "type": "number" }
                },
                "required": ["lat", "lng"]
              },
              "hash": { "type": "string" }
            },
            "required": ["coordinate", "hash"]
          }
        }
      },
      "required": ["list"]
    }
  },
  "required": ["detected_objects", "no_detection_hashes", "unique_detections", "total_detections", "gps"]
}

Here is a description file:

  1. detected_objects (Array): An array of objects where each object represents a detected item with the following properties:

    • name (String): The name of the detected item (e.g., "Plastic film").
    • count (Integer): The count of the detected item. This number should be a non-negative integer.
    • hashes (Array): An array of strings where each string represents a hash associated with the detected item.
  2. no_detection_hashes (Array): An array of strings where each string is a hash associated with an image in which no items were detected.

  3. unique_detections (Integer): The count of unique detections. This number should be a non-negative integer.

  4. total_detections (Integer): The total count of detections. This number should be a non-negative integer.

  5. gps (Object): An object that contains GPS information with the following properties:

    • list (Array): An array of objects where each object represents a GPS location with the following properties:
      • coordinate (Object): An object that contains the GPS coordinates with the following properties:
        • lat (Number): The latitude of the location.
        • lng (Number): The longitude of the location.
      • hash (String): A hash associated with the location.

Please note that all of the properties described above are required. If any property is missing or if a property's value is of an incorrect type, the JSON will not validate against the schema.
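
For anyone who wants a quick sanity check without a JSON Schema library, here is a minimal Python sketch that tests only the top-level required properties and the required keys of each detected object, as described above. The function name `check_summary` is hypothetical and not part of TrashAI, and this is not a full schema validator.

```python
# Hypothetical helper, not part of TrashAI: checks only the required keys and
# basic types listed in the description above, not the full JSON Schema.
REQUIRED_TYPES = {
    "detected_objects": list,
    "no_detection_hashes": list,
    "unique_detections": int,
    "total_detections": int,
    "gps": dict,
}


def check_summary(summary):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    for key, expected in REQUIRED_TYPES.items():
        if key not in summary:
            problems.append(f"missing required property: {key}")
            continue
        value = summary[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{key} should be of type {expected.__name__}")

    objects = summary.get("detected_objects")
    if isinstance(objects, list):
        for i, obj in enumerate(objects):
            for key in ("name", "count", "hashes"):
                if not isinstance(obj, dict) or key not in obj:
                    problems.append(f"detected_objects[{i}] is missing {key}")
    return problems
```

Calling `check_summary(json.load(f))` on a downloaded summary file should return an empty list when these properties are all present with the expected basic types.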

@wincowgerDEV
Collaborator Author

For single image json we will also want this schema and definition file:
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "hash": { "type": "string" },
    "filename": { "type": "string" },
    "exifdata": {
      "type": "object",
      "properties": {
        "Make": { "type": "string" },
        "Model": { "type": "string" },
        "DateTimeOriginal": { "type": "integer" },
        "ModifyDate": { "type": "integer" },
        "CreateDate": { "type": "integer" },
        "GPSLatitudeRef": { "type": "string" },
        "GPSLatitude": { "type": "number" },
        "GPSLongitudeRef": { "type": "string" },
        "GPSLongitude": { "type": "number" },
        "GPSAltitudeRef": { "type": "integer" },
        "GPSAltitude": { "type": "number" },
        "GPSTimeStamp": { "type": "array", "items": { "type": "integer" } },
        "GPSDateStamp": { "type": "string" },
        "ExifImageWidth": { "type": "integer" },
        "ExifImageHeight": { "type": "integer" }
      },
      "required": ["Make", "Model", "DateTimeOriginal", "ModifyDate", "CreateDate", "GPSLatitudeRef", "GPSLatitude", "GPSLongitudeRef", "GPSLongitude", "GPSAltitudeRef", "GPSAltitude", "GPSTimeStamp", "GPSDateStamp", "ExifImageWidth", "ExifImageHeight"]
    },
    "metadata": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "score": { "type": "string" },
          "correction": { "type": "string" },
          "remove": { "type": "boolean" },
          "is_tf": { "type": "boolean" },
          "id": { "type": "string" },
          "label": { "type": "string" },
          "area": {
            "type": "object",
            "properties": {
              "x1": { "type": "number" },
              "y1": { "type": "number" },
              "x2": { "type": "number" },
              "y2": { "type": "number" }
            },
            "required": ["x1", "y1", "x2", "y2"]
          }
        },
        "required": ["score", "correction", "remove", "is_tf", "id", "label", "area"]
      }
    }
  },
  "required": ["hash", "filename", "exifdata", "metadata"]
}

Description:

  1. hash (String): A hash associated with the image file.

  2. filename (String): The filename of the image.

  3. exifdata (Object): An object that contains EXIF data with the following properties:

    • Make (String): The manufacturer of the camera that took the image.
    • Model (String): The camera model that took the image.
    • DateTimeOriginal (Integer): The original date and time the image was taken, represented as a Unix timestamp.
    • ModifyDate (Integer): The date and time the image was last modified, represented as a Unix timestamp.
    • CreateDate (Integer): The date and time the image was created, represented as a Unix timestamp.
    • GPSLatitudeRef (String): The reference for latitude (usually "N" or "S").
    • GPSLatitude (Number): The latitude coordinate of where the image was taken.
    • GPSLongitudeRef (String): The reference for longitude (usually "E" or "W").
    • GPSLongitude (Number): The longitude coordinate of where the image was taken.
    • GPSAltitudeRef (Integer): The reference for altitude (usually 0).
    • GPSAltitude (Number): The altitude where the image was taken.
    • GPSTimeStamp (Array): The GPS timestamp of when the image was taken, represented as an array of integers [hours, minutes, seconds].
    • GPSDateStamp (String): The GPS date when the image was taken.
    • ExifImageWidth (Integer): The width of the original image.
    • ExifImageHeight (Integer): The height of the original image.
  4. metadata (Array): An array of objects where each object represents a detected item with the following properties:

    • score (String): The confidence score of the AI model for the detection.
    • correction (String): A correction applied to the AI model's prediction. This is usually empty if no correction is applied.
    • remove (Boolean): A flag indicating whether this detection should be removed. False means the detection is valid.
    • is_tf (Boolean): A flag indicating if the detection is a true positive (true) or false positive (false).
    • id (String): A unique identifier for the detection.
    • label (String): The label of the detected item (e.g., "Plastic film").
    • area (Object): An object that contains the coordinates of the detected area with the following properties:
      • x1 (Number): The x-coordinate of the top left corner of the detected area.
      • y1 (Number): The y-coordinate of the top left corner of the detected area.
      • x2 (Number): The x-coordinate of the bottom right corner of the detected area.
      • y2 (Number): The y-coordinate of the bottom right corner of the detected area.

Please note that all of the properties described above are required. If any property is missing or if a property's value is of an incorrect type, the JSON will not validate against the schema.
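
As a rough illustration of what someone could do with a single-image file, here is a Python sketch that turns the `metadata` array into flat rows, one per detection. The function name `detection_rows` is hypothetical. Two behaviors are my assumptions rather than anything the schema requires: detections flagged with `remove` are skipped, and a non-empty `correction` is used in place of `label`.

```python
def detection_rows(image):
    """Flatten one single-image JSON object into a list of row dictionaries."""
    rows = []
    for det in image["metadata"]:
        # Assumption: detections flagged for removal are left out entirely.
        if det["remove"]:
            continue
        area = det["area"]
        rows.append({
            "filename": image["filename"],
            "hash": image["hash"],
            # Assumption: a non-empty correction overrides the model's label.
            "label": det["correction"] or det["label"],
            "score": det["score"],  # kept as a string, as in the schema
            "box_width": abs(area["x2"] - area["x1"]),
            "box_height": abs(area["y2"] - area["y1"]),
        })
    return rows
```

The resulting rows can be passed to `csv.DictWriter` or loaded into a spreadsheet, which may be easier to handle than nested JSON.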

@wincowgerDEV
Collaborator Author

Adding analysis page:
This already exists at https://www.trashai.org/summary

@wincowgerDEV
Collaborator Author

wincowgerDEV commented May 16, 2023

Todo:

  • Add JSON schema and definition files to the download
  • Add an option to download data as csv instead of json.
  • Add video discussion about things a user can do with the data.
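
To give a sense of what the CSV option might look like, here is a minimal Python sketch that flattens the summary JSON into one row per detected item and image hash, with coordinates attached where available. The function name `summary_to_csv` and the column layout are hypothetical. It also assumes that the hashes under `detected_objects` and under `gps.list` refer to the same images, which the schema does not state.

```python
import csv
import io


def summary_to_csv(summary):
    """Return CSV text with one row per (detected item name, hash) pair."""
    # Assumption: a hash in gps.list identifies the same image as the
    # matching hash in detected_objects.
    coords = {g["hash"]: g["coordinate"] for g in summary["gps"]["list"]}

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "hash", "lat", "lng"])
    for obj in summary["detected_objects"]:
        for image_hash in obj["hashes"]:
            coordinate = coords.get(image_hash, {})
            # Images without GPS data get empty lat/lng cells.
            writer.writerow([
                obj["name"],
                image_hash,
                coordinate.get("lat", ""),
                coordinate.get("lng", ""),
            ])
    return buffer.getvalue()
```

A flat file like this opens directly in spreadsheet or GIS software, which may suit the less technical audience the reviewer mentioned.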

Projects
Status: Done
4 participants