Custom media fields #572

Closed
Natkeeran opened this issue Mar 2, 2023 · 19 comments
Assignees
Labels
enhancement New feature or request

Comments

@Natkeeran

Natkeeran commented Mar 2, 2023

After reviewing the code, it seems that Islandora Workbench currently does not support ingesting custom/additional media fields. There are various use cases where we need to add them.

Further, it is much easier to use one main ingest sheet to ingest/update repository content. It would therefore help if the custom media fields could also be specified within that one main ingest sheet, for example by listing the media fields in the config or by marking them with a prefix.

We have the following specific use case in the starter site development. We need to specify the field_display_mode_for_viewer at the media level.

@Natkeeran Natkeeran changed the title from "Custom media fields (within a single ingest csv file)" to "Custom media fields" Mar 2, 2023
@mjordan
Owner

mjordan commented Mar 2, 2023

Thanks @Natkeeran. I agree, we need to be able to add custom field data to media. I will make this my next priority.

@mjordan mjordan self-assigned this Mar 2, 2023
@mjordan mjordan added the enhancement New feature or request label Mar 2, 2023
@mjordan
Owner

mjordan commented Mar 3, 2023

Following the pattern established for creating media track files, I'd like to propose that the CSV column headers for media fields in node create and update CSV look like this:

media:video:field_display_mode_for_viewer
foo

Within add_media tasks, we can probably drop the media:video namespace since the media type is already inferred from the file's extension:

node_id,file,field_display_mode_for_viewer
100,test.mp4,foo
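
To illustrate the header convention proposed above, here is a minimal Python sketch of how media:[type]:[field] headers could be separated from ordinary node field headers. It is not Workbench code: the function name split_headers and the sample CSV row are assumptions made up for this example.

```python
import csv
import io

MEDIA_PREFIX = "media"


def split_headers(headers):
    """Separate plain node field headers from 'media:<type>:<field>' headers.

    Hypothetical helper for illustration; not part of Workbench.
    """
    node_fields = []
    media_fields = {}  # media type -> list of (csv_header, field_name)
    for header in headers:
        parts = header.split(":")
        if len(parts) == 3 and parts[0] == MEDIA_PREFIX:
            _, media_type, field_name = parts
            media_fields.setdefault(media_type, []).append((header, field_name))
        else:
            node_fields.append(header)
    return node_fields, media_fields


# Made-up sample data using the proposed header style.
sample = (
    "id,file,title,media:video:field_display_mode_for_viewer\n"
    "1,test.mp4,A video,foo\n"
)
reader = csv.DictReader(io.StringIO(sample))
node_fields, media_fields = split_headers(reader.fieldnames)
print(node_fields)
print(media_fields)
```

In an add_media task, where the namespace is dropped, every header other than node_id and file would simply be treated as a media field.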

@Natkeeran
Author

Sounds good.

@mjordan
Owner

mjordan commented Mar 3, 2023

One thing we need to account for is that the media type can vary within the CSV on a row-by-row basis, so we'll need to validate that any required fields for each media type represented in the 'file' column are in the CSV. Also, I'm not sure how we should approach 'additional_files' media... or whether they are even in scope for this feature.
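
A rough Python sketch of the per-row validation described above, assuming the media:[type]:[field] header style. The two lookup tables are hypothetical stand-ins: in practice the extension-to-media-type mapping and the list of required fields would come from the Workbench configuration and from Drupal, not from hard-coded dictionaries.

```python
import csv
import io
import os

# Hypothetical mappings, for illustration only.
EXTENSION_TO_MEDIA_TYPE = {".mp4": "video", ".jpg": "image", ".png": "image"}
REQUIRED_MEDIA_FIELDS = {"video": ["field_display_mode_for_viewer"], "image": []}


def missing_required_media_fields(csv_text):
    """Return (csv_line_number, file, message) tuples for rows that fail the check."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    # The header is line 1, so the first data row is line 2.
    for line_number, row in enumerate(reader, start=2):
        filename = row.get("file") or ""
        extension = os.path.splitext(filename)[1].lower()
        media_type = EXTENSION_TO_MEDIA_TYPE.get(extension)
        if media_type is None:
            problems.append((line_number, filename, "unknown media type"))
            continue
        for field in REQUIRED_MEDIA_FIELDS[media_type]:
            header = f"media:{media_type}:{field}"
            # A missing column and an empty cell are both treated as missing.
            value = (row.get(header) or "").strip()
            if not value:
                problems.append((line_number, filename, f"missing {header}"))
    return problems


# Made-up sample data: the second video row lacks its required field.
sample = (
    "id,file,title,media:video:field_display_mode_for_viewer\n"
    "1,test.mp4,A video,foo\n"
    "2,clip.mp4,Another video,\n"
    "3,dog.jpg,My dog,\n"
)
problems = missing_required_media_fields(sample)
print(problems)
```

Because the media type is looked up per row, an image row is not penalized for leaving a video-only field empty.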

mjordan added a commit that referenced this issue Mar 7, 2023
@mjordan
Owner

mjordan commented Mar 8, 2023

Looks like we'll need to be careful not to stomp on work done in #373 - Media Track Files.

mjordan added a commit that referenced this issue Mar 10, 2023
@mjordan
Owner

mjordan commented Mar 15, 2023

The more I look into how we can handle adding field data to files identified in the additional_files config setting, the more I am leaning toward using a secondary task. In that case, instead of values in the secondary task CSV's parent_id column pointing to the id column in the primary CSV, we'd add a new column to the secondary task CSV, additional_file_path_[columnname], that would point to the value in the primary task CSV column registered in the additional_files config setting.

In fact, it might be worth considering replacing the way the current additional_files works with an add_media secondary task, where there is a secondary task for each additional media type. In this case, using media:[type]:[field] in the primary CSV would apply only to files named in the file column. Populating fields in additional media would be done via columns in the secondary media-specific CSVs.

(Edit: if we use secondary tasks to create additional files, we can use cleaner column headers in each media type's secondary CSV than I suggest above. They could probably be the same as ordinary node field headers currently are: just the human or machine field name.)

@mjordan
Owner

mjordan commented Mar 15, 2023

Using a secondary task to create media track files might also make sense.

@Natkeeran
Author

"additional_files works with an add_media secondary task" could streamline the logic, and may help organize media info in another sheet, for example if the user wants to provide a different title for the media.

@mjordan
Owner

mjordan commented Mar 15, 2023

Yes, I like this separation of tasks better than overloading the primary node CSV. It's already deviating from what I consider to be an important Workbench design principle - you don't need to be a developer to use it.

Also, since secondary tasks are just ordinary tasks, we could focus on adding custom fields to media in add_media tasks, which would be useful in its own right. This approach would also reduce Workbench's overall amount of code/maintenance/complexity compared to adding media fields in the primary node CSV.

@mjordan
Owner

mjordan commented Mar 16, 2023

Here is a simple example of CSVs to illustrate how this would work. The node CSV is for a create task and the media CSV is for an add_media task. Two changes from how add_media works now are:

  1. you can include custom fields on the media in the add_media CSV
  2. node_id currently needs to contain the node ID of the node to attach the media to. Since this is a secondary task, we will need to allow using the target node's id value from the primary CSV. Not a huge change, and we can validate that when used in ordinary add_media tasks, the value of node_id needs to be an existing node ID.

Node CSV (primary task)

id,file,title,field_description
test1,smile.jpg,Smile - you're on camera,I took a selfie!
test2,dog.jpg,My dog,He's a good boi.
test3,cat.jpg,My cat,She likes to smack her brother when he hassles her.

Media CSV (secondary task), for custom thumbnails with a custom (completely made up for this example) field on the media of "TN Source"

node_id,file,field_media_use,field_tn_source
test1,smile_tn.png,Thumbnail Image,https://cuteicons.com
test2,dognose.png,Thumbnail Image,https://funnydogs.com

The secondary task that adds a custom thumbnail does the same thing as the current additional_files config setting does. You can have as many secondary tasks as you want, so if you wanted to add another media to each node, you would just add a second secondary task.
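
As a sketch of how the two CSVs above could be linked, the Python below replaces each node_id value in the media CSV with the Drupal node ID created for the matching id in the primary CSV. This is an illustration under assumptions, not Workbench's implementation: the function name resolve_media_rows is invented, and the node IDs 101 to 103 are made-up stand-ins for whatever Drupal returns when the primary task creates the nodes.

```python
import csv
import io

# Hypothetical result of the primary task: CSV id -> newly created node ID.
id_to_node_id = {"test1": 101, "test2": 102, "test3": 103}

media_csv = """node_id,file,field_media_use,field_tn_source
test1,smile_tn.png,Thumbnail Image,https://cuteicons.com
test2,dognose.png,Thumbnail Image,https://funnydogs.com
"""


def resolve_media_rows(media_csv_text, id_map):
    """Swap each row's primary-CSV id for the node ID it maps to."""
    rows = []
    for row in csv.DictReader(io.StringIO(media_csv_text)):
        csv_id = row["node_id"]
        if csv_id not in id_map:
            raise ValueError(f"No node was created for id '{csv_id}'")
        resolved = dict(row)
        resolved["node_id"] = id_map[csv_id]
        rows.append(resolved)
    return rows


rows = resolve_media_rows(media_csv, id_to_node_id)
for row in rows:
    print(row["node_id"], row["file"], row["field_tn_source"])
```

Note that test3 has no row in the media CSV, so that node simply gets no custom thumbnail.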

@mjordan
Owner

mjordan commented Mar 17, 2023

I think the order of development operations to accomplish this is:

  1. resolve #574 ("Convert secondary tasks data file to an SQLite database")
  2. add the ability for add_media tasks to support custom fields in the media CSV (currently only node_id and file are supported) and allow node "id" values in the node_id CSV field
  3. add --check validation of custom media fields for add_media tasks
  4. allow add_media config files to be registered in the secondary_tasks config setting

We would retain the current additional_files functionality for a time to allow users who are using it to shift over to the new secondary_tasks approach, and at the end of this deprecation period, remove the additional_files functionality.
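
Steps 2 and 3 imply that node_id is checked differently depending on how the add_media task is run. A minimal Python sketch of that distinction follows, with assumptions stated plainly: validate_node_id is a hypothetical helper, the standalone branch only checks that the value looks like a numeric node ID, and confirming that the node actually exists would need a request to Drupal, which is left out here.

```python
def validate_node_id(value, primary_csv_ids=None):
    """Return an error message, or None if the value is acceptable.

    Hypothetical helper for illustration; not part of Workbench.
    When primary_csv_ids is given, the task is treated as a secondary task.
    """
    value = value.strip()
    if primary_csv_ids is not None:
        # Secondary task: node_id must match an id in the primary CSV.
        if value in primary_csv_ids:
            return None
        return f"node_id '{value}' does not match any id in the primary CSV"
    # Ordinary add_media task: node_id must look like a numeric node ID.
    if value.isdigit():
        return None
    return f"node_id '{value}' is not a numeric node ID"


print(validate_node_id("test1", primary_csv_ids={"test1", "test2", "test3"}))
print(validate_node_id("test1"))
print(validate_node_id("100"))
```

The same value, test1, passes when the task is registered as a secondary task and fails when the task is run on its own.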

mjordan added a commit that referenced this issue Mar 21, 2023
mjordan added a commit that referenced this issue Mar 22, 2023
@mjordan
Owner

mjordan commented Mar 24, 2023

Related issue - #144.

@dmer

dmer commented Mar 31, 2023

Hi Mark - thanks for working on this! Looking forward to testing it. I have a question - will I be able to attach a file to a media using this method? (obv. presuming I have the file field on the media already)
We have an upcoming need to ingest TIF images with their HOCR and we'll want the HOCR (text file) to be attached to the TIF media object.

@mjordan
Owner

mjordan commented Mar 31, 2023

Yes, there will be no loss of current functionality. Now that the "additional_files" feature seems to be working fairly well, I don't see a need to deprecate it in the near/medium term. The new way of ingesting media will allow you to add custom fields to every media you ingest, so in that sense, it will expand existing functionality.

@dmer

dmer commented Mar 31, 2023

Thanks Mark!

@dmer

dmer commented Mar 31, 2023

Does "additional_files" allow a file to be attached to another media? Sorry if I'm asking an obvious question. I went and re-read the section on that config in the docs, and it seems like it adds additional media connected to a node, but I didn't see anything about attaching a file to a media. Maybe it works the same?

@mjordan
Owner

mjordan commented Mar 31, 2023

No, currently additional_files creates media (one per additional file named in those columns), which are attached to the node described in the CSV. The only type of file that can be attached to another media is a track file. On rereading your last question, it's clear that you want to attach a new file to an existing media. Is that correct?

If so, we'll need to make workbench do that. New use case for me, other than attaching a track file to a media. Can you write up a structured use case so I can take this into account?

@dmer

dmer commented Mar 31, 2023

Yes - I'll do that in a new issue. Thanks for the clarifications!

@mjordan
Owner

mjordan commented Jan 15, 2024

Closing, I believe all of these issues have been addressed. If not, let me know.

@mjordan mjordan closed this as completed Jan 15, 2024
3 participants