WIP: docs(academy-advanced-crawling): comit my unfinished first articles #490

Draft
wants to merge 1 commit into
base: master
from

Conversation

metalwarrior665
Member

No description provided.

@honzajavorek honzajavorek added the t-academy Issues related to Web Scraping and Apify academies. label May 22, 2024
@@ -0,0 +1,221 @@
---
title: How to parse compressed sitemaps
Member Author

This might be solved in Crawlee already

Member

Yeah, I believe we already handle compression in the sitemap helper, cc @janbuchar
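
For context on what the article would otherwise have to cover by hand, here is a minimal sketch of parsing a gzipped sitemap using only Node's built-in `zlib`. The helper names `isGzipped` and `extractSitemapUrls` are hypothetical and are not part of Crawlee; the regex-based `<loc>` extraction is an assumption made to keep the sketch short, not how the sitemap helper is implemented.

```typescript
import { gunzipSync, gzipSync } from 'node:zlib';

// Gzip streams start with the magic bytes 0x1f 0x8b, so the body can be
// checked directly instead of trusting the ".gz" extension or headers.
function isGzipped(body: Buffer): boolean {
    return body.length >= 2 && body[0] === 0x1f && body[1] === 0x8b;
}

// Hypothetical helper (not a Crawlee API): decompress if needed,
// then collect the contents of every <loc> element.
function extractSitemapUrls(body: Buffer): string[] {
    const xml = (isGzipped(body) ? gunzipSync(body) : body).toString('utf8');

    // A regex is enough for a sketch. It does not decode XML entities
    // such as &amp;, so real code should use a proper XML parser.
    const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
    const urls: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = locPattern.exec(xml)) !== null) {
        urls.push(match[1]);
    }
    return urls;
}

// Usage: simulate a downloaded sitemap.xml.gz by compressing a sample in memory.
const sampleXml = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    '  <url><loc>https://example.com/a</loc></url>',
    '  <url><loc>https://example.com/b</loc></url>',
    '</urlset>',
].join('\n');

const urls = extractSitemapUrls(gzipSync(Buffer.from(sampleXml)));
console.log(urls);
```

If the sitemap helper already does this, the article could shrink to a short pointer to it, with something like the above kept only as an explanation of what happens under the hood.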


### [](#facets-vs-filters) Facets vs filters
There probably isn't a single global definition of what a facet is and what a filter is. In this framework, we will use the following definitions:
**Facet** - A lit of options (or range) that you can apply to get a specific filter. You get facets from the API response.
Contributor

Suggested change
**Facet** - A lit of options (or range) that you can apply to get a specific filter. You get facets from the API response.
**Facet** - A list of options (or range) that you can apply to get a specific filter. You get facets from the API response.

```

## [](#the-framework) The framework
> We are working on a TypeScript library that will provide this framework and also auto-implement it for popular search APIs like Algolia.
Contributor

Crawlee?


4 participants