Error parsing recipe: Cannot parse recipe: Unknown object found during flattening of instructions. #1297

Closed
MarcelRobitaille opened this issue Oct 28, 2022 · 7 comments · Fixed by #1300
Labels
Backend (issue or PR related to the backend code), bug (something isn't working), website support

Comments

@MarcelRobitaille
Collaborator

Description
I'm getting an error importing a recipe. The page seems to have invalid HTML but shouldn't it still be imported?

Reproduction

  1. Try to import https://ilovevegan.com/spinach-avocado-and-marinated-tofu-salad/

Expected behavior
Recipe imported successfully.

Actual behavior
There is a popup saying "Cannot parse recipe: Unknown object found during flattening of instructions."

The logs report:

  Warning  cookbook  libxml: Error 513 occurred 8 times while parsing https://ilovevegan.com/spinach-avocado-and-marinated-tofu-salad/. First time it occurred in line 523 and column 829: ID mmmlogo already defined  2022-10-28T10:48:44+00:00

  Warning  cookbook  libxml: Error 800 occurred while parsing https://ilovevegan.com/spinach-avocado-and-marinated-tofu-salad/. First time it occurred in line 499 and column 1: Misplaced DOCTYPE declaration  2022-10-28T10:48:44+00:00

  Warning  cookbook  libxml: Error 801 occurred 156 times while parsing https://ilovevegan.com/spinach-avocado-and-marinated-tofu-salad/. First time it occurred in line 495 and column 595: Tag header invalid  2022-10-28T10:48:44+00:00

  Warning  cookbook  libxml: Error 23 occurred 5 times while parsing https://ilovevegan.com/spinach-avocado-and-marinated-tofu-salad/. First time it occurred in line 59 and column 185: htmlParseEntityRef: expecting ';'  2022-10-28T10:48:44+00:00

  Warning  cookbook  libxml: Error 68 occurred 54 times while parsing https://ilovevegan.com/spinach-avocado-and-marinated-tofu-salad/. First time it occurred in line 14 and column 108: htmlParseEntityRef: no name  2022-10-28T10:48:44+00:00

Browser
Which browser are you using? Firefox 106.0.1

Versions
Nextcloud server version: 24.0.6
Cookbook version: 7e17b69
Database system: MariaDB

@MarcelRobitaille added the bug and Backend labels on Oct 28, 2022
@MarcelRobitaille
Collaborator Author

Should we add a tag for these import/parse errors?

@christianlupus
Collaborator

There is already one 😉.

The errors/warnings are mostly ignored and not of much importance. We might want to remove them from the NC logs altogether, once #1067 is implemented/merged.

I will have a look at the page you mentioned. The message means, however, that the XML parsing had already succeeded and the JSON was being built. Something in it was offending; I do not yet know what.

@MarcelRobitaille
Collaborator Author

There is already one 😉

I don't see it. What is the name? Also, I saw a ruby tag in there. We don't have any ruby, do we?

I will have a look at the page you mentioned.

Thanks. I had a poke around the code. I thought those logs meant something was failing already upon parsing the HTML. I'm still not very familiar with the backend.

@christianlupus
Collaborator

I don't see it. What is the name? Also, I saw a ruby tag in there. We don't have any ruby, do we?

OK, just to be sure we are talking about the same thing: there is a tag, website support, that I already added to this issue. It is intended for problems with importing from various sites. Is this what you meant?

Thanks. I had a poke around the code. I thought those logs meant something was failing already upon parsing the HTML. I'm still not very familiar with the backend.

I am still working on documenting the backend code better, but this often involves a refactoring cycle.

The messages mainly mean that the pages do not strictly stick to the XML standard. Note that HTML is not 100% XML: it should be compatible, but HTML is more forgiving. Thus, the XML parser complains about various issues in the imported sites. Many are not critical. Most sites do not serve valid HTML (in the sense of the standard), which causes the errors and warnings.
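To illustrate the difference between strict and forgiving parsing, here is a minimal Python sketch. It is not the Cookbook backend code, and it substitutes Python's standard-library parsers for libxml. The markup sample and the helper names (`BROKEN_HTML`, `strict_xml_accepts`, `lenient_text`) are made up for this example; the sample only imitates the kinds of defects listed in the log (a duplicated ID, a bare `&`, an unclosed tag).

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical markup with defects similar to those in the log:
# a duplicated id, a bare "&" and an unclosed <p> tag.
BROKEN_HTML = '<div id="mmmlogo"><p>Tofu & spinach<div id="mmmlogo"></div></div>'


class TextCollector(HTMLParser):
    """Collects non-empty text nodes and ignores structural problems."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def strict_xml_accepts(markup):
    """Return True only if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def lenient_text(markup):
    """Extract the text with a forgiving HTML parser."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return collector.chunks


if __name__ == "__main__":
    print("well-formed XML:", strict_xml_accepts(BROKEN_HTML))
    print("text recovered by the lenient parser:", lenient_text(BROKEN_HTML))
```

The strict parser rejects the document outright, while the forgiving one still recovers the content. That matches the situation described above: the warnings record what was wrong with the page, but they do not by themselves mean the import failed.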

If you want, we can have a look at the complete backend together, if this is something you want to get a grasp on. Just an offer.

@MarcelRobitaille
Collaborator Author

OK, just to be sure we are talking about the same thing: there is a tag, website support, that I already added to this issue. It is intended for problems with importing from various sites. Is this what you meant?

Yes, that's what I meant. Sorry, I didn't expect it to be called "website support" and I didn't notice you added a tag.

The messages mainly mean that the pages do not strictly stick to the XML standard. Note that HTML is not 100% XML: it should be compatible, but HTML is more forgiving. Thus, the XML parser complains about various issues in the imported sites. Many are not critical. Most sites do not serve valid HTML (in the sense of the standard), which causes the errors and warnings.

That makes sense. I just didn't see anything else in the logs and assumed the XML parsing errors were the cause.

If you want, we can have a look at the complete backend together, if this is something you want to get a grasp on. Just an offer.

Sure!

@christianlupus
Collaborator

I saw a ruby tag in there. We don't have any ruby, do we?

I just noticed that I forgot to answer. Yes, we do use Ruby: the GitHub Pages site is built with the static site generator Jekyll, which is written in Ruby.

@christianlupus
Collaborator

I just checked the reason why the recipe does not parse successfully. We are not capable of parsing HowToSections correctly at the moment. The offending recipe instructions are
[screenshot: the offending recipe instructions]

The exception is thrown in this line.
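For readers unfamiliar with the problem, the following Python sketch shows one way nested schema.org `HowToSection` objects could be flattened into a plain list of steps. It is an illustration under assumptions, not the Cookbook backend (which is not shown in this thread) and not necessarily the approach taken in #1300. The function name `flatten_instructions` and the sample data are hypothetical; `HowToSection`, `HowToStep`, `itemListElement`, `name` and `text` are schema.org terms. The sketch assumes `@type` is a single string.

```python
def flatten_instructions(node):
    """Flatten recipeInstructions into a flat list of strings.

    Accepts plain strings, lists, HowToStep objects and HowToSection
    objects. Anything else raises ValueError.
    """
    if isinstance(node, str):
        text = node.strip()
        return [text] if text else []

    if isinstance(node, list):
        steps = []
        for item in node:
            steps.extend(flatten_instructions(item))
        return steps

    if isinstance(node, dict):
        node_type = node.get("@type")
        if node_type == "HowToStep":
            return flatten_instructions(node.get("text", ""))
        if node_type == "HowToSection":
            # Design choice of this sketch: keep the section name as its
            # own entry so the grouping is not lost entirely.
            steps = flatten_instructions(node.get("name", ""))
            steps.extend(flatten_instructions(node.get("itemListElement", [])))
            return steps

    raise ValueError("Unknown object found during flattening of instructions.")


# Hypothetical instructions, grouped into a section like on the failing page.
SAMPLE = [
    {
        "@type": "HowToSection",
        "name": "Marinated tofu",
        "itemListElement": [
            {"@type": "HowToStep", "text": "Press the tofu."},
            {"@type": "HowToStep", "text": "Marinate the tofu."},
        ],
    },
    "Toss the salad.",
]

if __name__ == "__main__":
    for step in flatten_instructions(SAMPLE):
        print(step)
```

A flattener that only knows strings and `HowToStep` would reach the final `raise` for the section object, which is the kind of failure reported in this issue.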


2 participants