Add keukenliefde parser #877
@jayaddison I think it is better now: I merged the two loops into one larger loop. |
Thanks! Looking pretty good - do you have an example recipe where the |
I was just testing a bit more and found some recipes that didn't work because their HTML is different. Can I add a second test somehow to cover the other situations? |
Yep, you certainly can - for that I'd recommend copying the approach taken by an existing multi-test scraper. For example, two tests for The Clever Carrot (note the different |
Thanks, will have a look. I found out that some old recipes use a different HTML format, which will make it a challenge to get this working. But that's mostly my issue, as this site seems to be a mess :-P |
The site handles different formats depending on the age of the recipe, so add a second test case that covers this behavior.
The oldest recipes do not have the convenient CSS classes for finding elements, so we need text matching on the headers.
I added 2 more test cases, which brings the total to 3:
I tried to extract methods where possible to reduce the number of duplicated lines in the code. It should also no longer crash when something is a bit different from what we expected. Please let me know what you think of this :-). Should I extract the legacy format to another class, or is it OK like this? |
Hmm, good question. Roughly speaking: I think the current approach with multiple fallbacks in a single class is fine here. Trying to find a reason why that is / general guidance: firstly it's often down to what makes the code easiest to manage, with a large amount of personal preference. And by personal preference, I mean the scraper author (you in this case :)). So if you want, experiment with the alternative, and if you find that you have a clear preference, we can go with that. The other consideration would be how much of the structure of the HTML page is shared. Generally I'd say that if most of the page is the same, then re-using a single class probably makes more sense. If there are three completely different page structures, then three different classes might be more likely to make sense. |
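For illustration, the "multiple fallbacks in a single class" idea could look roughly like the sketch below. It is not the code from this pull request: the class name `HypotheticalScraper`, the `ingredients` CSS class, the header text `Ingrediënten` and the sample HTML are all assumptions made up for the example, and it uses only the standard library's `html.parser` instead of the project's own scraper base class. The class-based lookup is tried first, and header text matching is used only when that finds nothing.

```python
from html.parser import HTMLParser

HEADER_TAGS = {"h1", "h2", "h3", "h4"}


class _Collector(HTMLParser):
    """Collect (tag, ancestor classes, text) for every text node, in document order."""

    def __init__(self):
        super().__init__()
        self._stack = []  # currently open elements as (tag, [classes])
        self.nodes = []   # (innermost tag, classes on it or any ancestor, text)

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append((tag, classes))

    def handle_endtag(self, tag):
        # Ignore stray end tags; otherwise pop back to the matching open tag.
        if any(open_tag == tag for open_tag, _ in self._stack):
            while self._stack.pop()[0] != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            classes = {c for _, cls in self._stack for c in cls}
            self.nodes.append((self._stack[-1][0], classes, text))


class HypotheticalScraper:
    """One class, several fallbacks (all selectors here are assumed, not real)."""

    def __init__(self, html):
        collector = _Collector()
        collector.feed(html)
        self.nodes = collector.nodes

    def ingredients(self):
        # Newer pages: rely on a CSS class. Older pages: match the header text.
        return self._by_class("ingredients") or self._after_header("Ingrediënten")

    def _by_class(self, class_name):
        return [text for tag, classes, text in self.nodes
                if tag == "li" and class_name in classes]

    def _after_header(self, title):
        # Collect text that follows the matching header, up to the next header.
        items, collecting = [], False
        for tag, _, text in self.nodes:
            if tag in HEADER_TAGS:
                collecting = text.lower() == title.lower()
            elif collecting:
                items.append(text)
        return items


MODERN_HTML = ('<h2>Ingrediënten</h2><ul class="ingredients">'
               '<li>200 g bloem</li><li>2 eieren</li></ul>')
LEGACY_HTML = ('<h3>Ingrediënten</h3><p>200 g bloem</p><p>2 eieren</p>'
               '<h3>Bereiding</h3><p>Mix alles.</p>')

modern_result = HypotheticalScraper(MODERN_HTML).ingredients()
legacy_result = HypotheticalScraper(LEGACY_HTML).ingredients()
missing_result = HypotheticalScraper("<p>Geen recept</p>").ingredients()
```

Because both formats share the same entry point, one test per sample page can call `ingredients()` and compare the result, and a page that matches neither format yields an empty list rather than an exception.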
I think this is done for now; please let me know if other changes are required. |
Yep, looks good to me @jaapio - thank you for this contribution. If I had one minor nitpick it would be to add the same |
Here you go sir, thanks for this project and your assistance! As I'm working through my bookmarked recipes you can expect more contributions from my side. |
Excellent, thanks! And you're welcome. Please do, and any feedback on improving the development process here would be gratefully received too. |
This patch adds support for keukenliefde.nl