Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to update Media Use values? #461

Closed
Natkeeran opened this issue Aug 22, 2022 · 38 comments
Closed

How to update Media Use values? #461

Natkeeran opened this issue Aug 22, 2022 · 38 comments
Labels
enhancement New feature or request

Comments

@Natkeeran
Copy link

Is it possible to update media use values after creating the media?

@mjordan
Copy link
Owner

mjordan commented Aug 22, 2022

Not at the moment: #76. But, adding an "update_media" task will be relatively easy now that field handlers are abstracted out. It would look like https://github.com/mjordan/islandora_workbench/blob/main/workbench#L339, and there are already two utility functions in workbench_utils.py that are relevant,

def patch_media_fields(config, media_id, media_type, node_csv_row):
and
def patch_media_use_terms(config, media_id, media_type, media_use_tids):
. The latter is only used to add media use terms in addition to the first one.

@mjordan mjordan added the enhancement New feature or request label Aug 31, 2022
@Natkeeran
Copy link
Author

Natkeeran commented May 1, 2023

@mjordan

@hassanelsheikha is working on update_media function. We need some clarifications about how best to do this.

As part of updating media, if we need to update the files, can we delete the files, then create the new files, and update the respective file field.

Currently, patch_media_fields only handles two field types. We would need to expand that to include other fields, including possible custom fields, right?

patch_media_use_terms is useful if we are only updating the media use terms, otherwise, we should include the changes as part of the media_json in the patch_media_use_terms, right?

Should we infer the media _type from config or should we make a GET request to get that info?

Also, what factors we need to consider with respect to media versions?

Thank you.

@mjordan
Copy link
Owner

mjordan commented May 1, 2023

@hassanelsheikha is working on update_media function. We need some clarifications about how best to do this.

Excellent! Thanks for opening an issue first.

As part of updating media, if we need to update the files, can we delete the files, then create the new files, and update the respective file field.

Yes, that's what I'd do, but the deleted file won't be available as a previous version (but see below)

Currently, patch_media_fields only handles two field types. We would need to expand that to include other fields, including possible custom fields, right?

Correct.

patch_media_use_terms is useful if we are only updating the media use terms, otherwise, we should include the changes as part of the media_json in the patch_media_use_terms, right?

I think we can deprecate patch_media_use_terms if we expand patch_media_fields, since field_media_use is just another field on the media.

Should we infer the media _type from config or should we make a GET request to get that info?

Use set_media_type to set a file's media type.

Also, what factors we need to consider with respect to media versions?

Workbench doesn't currently account for media revisions. Islandora/documentation#1035 might be relevant. However, there is a larger issue with versions/revisions during REST and Migrations: they aren't created. See Islandora/documentation#1485 for more info. I'm not sure what we can do about this.

Also, most/half of the work in adding a new task to Workbench is the --check part. Any likely factor that could interfere with a successful add/update/delete should be checked during --check. In this case, he should check for the presence of the file, and that its extension is registered with the target file field. If supporting remote files (which he should), check its MIME type from the remote server, convert that into an extension, and check that the extension is registered with the media type's target file field. Usually you can reuse existing validation functions, etc. but it still adds a lot of work.

@Natkeeran
Copy link
Author

Thank you for the feedback, we should be able to incorporate the necessary checks.

@Natkeeran
Copy link
Author

if we put most of the logic into a method that returns the media_json, we would be able to more easily test that function with mockup. We can put the actual PATCH request into a wrapper method.

@mjordan
Copy link
Owner

mjordan commented May 1, 2023

@mjordan
Copy link
Owner

mjordan commented May 1, 2023

Re. testing, so far all HTTP requests that create or update entities are tested using a live Islandora. There are a lot of tests that do use what is essentially a mockup, but not for entities.

@Natkeeran
Copy link
Author

https://github.com/mjordan/islandora_workbench/blob/main/workbench_utils.py#L3193 actually includes the request to a site. Thats ok, this can be refactored later on.

In the case where a file is not provided for media update, can we use a GET request to get the media_type?

@mjordan
Copy link
Owner

mjordan commented May 1, 2023

In the case where a file is not provided for media update, can we use a GET request to get the media_type?

Absolutely. Workbench does a lot of this type of request. Be sure to use `https://github.com/mjordan/islandora_workbench/blob/main/workbench_utils.py#L168 since it adds all the config stuff to the request.

@hassanelsheikha
Copy link
Contributor

@mjordan Of course! We plan on using something like
media_type = issue_request(config, 'GET', media_json_url).links['describes']['type'].split('/')[0]
or similar. Would that work?

@mjordan
Copy link
Owner

mjordan commented May 1, 2023

Yes, but that function returns the response object. You'll need to get the body and parse out the media type yourself.

@hassanelsheikha
Copy link
Contributor

@mjordan How would the user indicate in the CSV file that they want to leave the field as is? How would they indicate that they want to clear the current value in that field?

@mjordan
Copy link
Owner

mjordan commented May 1, 2023

We'll want to make updating media work the same way as updating nodes currently does. https://mjordan.github.io/islandora_workbench_docs/updating_nodes/ provides details. Currently there is no way to indicate "leave the value as it is"; Workbench assumes that all rows in the CSV are to be updated. The reason there is no "leave it as is" is that whether the update job appends, replaces, or deletes is defined in the config file and not at the row level in the CSV. Can you provide a use case for "leaving the field as is"?

@hassanelsheikha
Copy link
Contributor

hassanelsheikha commented May 1, 2023

I think this serves as a use-case:

media_id,field1,field2
1,AS-IS,57
2,57,AS-IS 

So that we don't have to call the update routine multiple times.

@mjordan
Copy link
Owner

mjordan commented May 1, 2023

Looking at https://github.com/mjordan/islandora_workbench/blob/main/workbench_fields.py#L112, currently an empty CSV value during append and replace update tasks leaves the field value as is. During delete update tasks, it deletes the field value. Maybe we don't need to do anything to "leave as is"?

@hassanelsheikha
Copy link
Contributor

I see. That does make more sense indeed. Thanks for clarifying!

@mjordan
Copy link
Owner

mjordan commented May 1, 2023

Sorry I didn't remember that earlier. Also, I will update the docs to indicate that an empty CSV value in append and replace update tasks leaves does nothing.

@mjordan
Copy link
Owner

mjordan commented May 2, 2023

@hassanelsheikha confirming that blank/empty CSV values tell Workbench to leave the field value as is. https://mjordan.github.io/islandora_workbench_docs/updating_nodes/ updated.

@hassanelsheikha
Copy link
Contributor

Awesome! Thank you for your confirmation.

@mjordan
Copy link
Owner

mjordan commented May 2, 2023

Work on #367 will require that I modify the following parts of the Workbench codebase:

I don't expect you will need to touch those parts of the code in your work but I would like to avoid Git merge conflicts at any cost. I will likely merge my work into main before you do, so you will be the one who hits the merge conflicts. Changes should be minor, only a couple of lines in each place except for the new method in the Config class. Is it OK for me to proceed?

@hassanelsheikha
Copy link
Contributor

The interface of https://github.com/mjordan/islandora_workbench/blob/main/workbench_utils.py#L689-L917 will remain the same, correct? If so, all good.

Also, is there any utility function which allows attaching a file to an existing media without a node_id?

@mjordan
Copy link
Owner

mjordan commented May 2, 2023

The interface of https://github.com/mjordan/islandora_workbench/blob/main/workbench_utils.py#L689-L917 will remain the same, correct? If so, all good.

Yes, but the return value will have an additional key that is unrelated to the work you are doing.

Also, is there any utility function which allows attaching a file to an existing media without a node_id?

No, but you may be able to use create_file(). See https://github.com/mjordan/islandora_workbench/blob/main/workbench_utils.py#L3130-L3158 for an example of it being used with a POST to add a file while creating a media.

@hassanelsheikha
Copy link
Contributor

@mjordan It seems like updating the file for a media will be very tedious and make the code very complex. Also, it may require that the media be attached to a node and the user will have to specify the ID of that node in the CSV file. I think it is simpler for the user to just delete the entire media then create a new one in the case where they want to replace the media file. This will keep the code minimal and simple. What do you think?

@mjordan
Copy link
Owner

mjordan commented May 2, 2023

I'm not in favor of that approach, since we can't just delete all the entity field data that is on the media. If we delete the entire media in order to replace a file, how do we ensure that this data is not lost? Also, links to the media would break since the replacement media would get a new media ID.

@hassanelsheikha
Copy link
Contributor

Ok, do you have any suggestions for tackling this issue then? In particular, the fact that a node ID is required for almost everything related to adding a file is difficult to work around.

@mjordan
Copy link
Owner

mjordan commented May 2, 2023

I don't have time to look at it right now but I will this evening. Can you summarize what you are trying to accomplish so I am sure I am focusing on what you need?

@mjordan
Copy link
Owner

mjordan commented May 2, 2023

Don't forget, you might need to do one or more intermediate GET requests to get data that is otherwise missing. For example, if you know the media ID, you can get the node it is attached to by requesting https://example.com/media/50740/edit?_format=json, then inspecting the value of the returned JSON's field_media_of.

Yes, it's tedious.

@hassanelsheikha
Copy link
Contributor

hassanelsheikha commented May 2, 2023

No worries at all, sorry for the inconvenience. In summary: I have made good progress on update_media but am halted by the use case where the user wants to update a file belonging to a certain media (does this use case exist?). Deleting the existing file belonging to the media is not a problem, but adding the new file to Drupal and attaching it to the media causes an issue, as useful functions like create_file depend on a node ID, which requires either the media to be attached to a node (which is not always the case; consider a newly created site with media but no nodes yet) or the development of a new set of functions that don't depend on a node ID (this will complicate the code heavily). This is where I'm being blocked right now.

@mjordan
Copy link
Owner

mjordan commented May 2, 2023

No problem. Workbench might seem very "organic" and without a clear structure, because it is. I, and others, have added functionality as it has been requested by the Islandora community and it shows. If we need to create new methods, that's fine. I would like to make the code a lot cleaner and easier for developers to understand and use, so if we need to refactor existing functions, that's fine too.

Updating just a file attached to a media is most likely a use case someone will want eventually.

Refactoring would be easier if writing tests was easier. Tests have been a challenge. For functionality that interacts with Drupal, running tests requires a live Islandora at http://localhost:8000. The reason Workbench tests don't use mock HTTP responses is that early on, the structure of a response body was a moving target. Maybe it's time to start using mock response objects so we can write more integration tests. But that's off topic from what you're trying to do....

@hassanelsheikha
Copy link
Contributor

Understood. So what would you say is appropriate for the testing of update_media?

@mjordan
Copy link
Owner

mjordan commented May 4, 2023

The closest integration test that exists is https://github.com/mjordan/islandora_workbench/blob/main/tests/islandora_tests.py#L628. For now, don't worry about it. The existing tests will at least let us know if your work has introduced any regression errors. We can add a test for your work later.

@mjordan
Copy link
Owner

mjordan commented May 5, 2023

FYI I'd like to merge https://github.com/mjordan/islandora_workbench/tree/issue_367 into main. That OK with you?

@hassanelsheikha
Copy link
Contributor

Please go ahead. I don't think there should be conflicts there.

@mjordan
Copy link
Owner

mjordan commented May 5, 2023

OK, merged with e087e18.

@hassanelsheikha
Copy link
Contributor

Thanks! Merged with mine now.

@hassanelsheikha
Copy link
Contributor

Hi @mjordan. Can you please give me a rundown of the logging/error system for Workbench? In particular, when should we skip a row in the CSV file and when should we terminate all together (in the case of a "bad" input)? Also, when are errors printed to the console, and when are they exclusively printed to the .log file?

@mjordan
Copy link
Owner

mjordan commented May 9, 2023

I have been leaning towards not printing every error to the console, since people tend to miss them. Better to just say an entity has/has not been created/updated/deleted but to log all detail to the log.

@mjordan
Copy link
Owner

mjordan commented May 22, 2023

@hassanelsheikha thanks again! This is excellent!

@mjordan mjordan closed this as completed May 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants