Content being ignored by Crawlee #109

Open
crptopool opened this issue Oct 17, 2023 · 0 comments
crptopool commented Oct 17, 2023

While crawling, Crawlee ignores the text inside certain tags. For example, everything between <aside> and </aside> in the markup below is completely ignored:

<aside class="content tip astro-duqfclob" aria-label="Tip">
	<p class="title astro-duqfclob" aria-hidden="true">
		<span class="icon astro-duqfclob">
			<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 18 18" width="16" height="16" class="astro-duqfclob">
				<path fill-rule="evenodd" d="M14 0a8.8 8.8 0 0 0-6 2.6l-.5.4-.9 1H3.3a1.8 1.8 0 0 0-1.5.8L.1 7.6a.8.8 0 0 0 .4 1.1l3.1 1 .2.1 2.4 2.4.1.2 1 3a.8.8 0 0 0 1 .5l2.9-1.7a1.8 1.8 0 0 0 .8-1.5V9.5l1-1 .4-.4A8.8 8.8 0 0 0 16 2v-.1A1.8 1.8 0 0 0 14.2 0h-.1zm-3.5 10.6-.3.2L8 12.3l.5 1.8 2-1.2a.3.3 0 0 0 .1-.2v-2zM3.7 8.1l1.5-2.3.2-.3h-2a.3.3 0 0 0-.3.1l-1.2 2 1.8.5zm5.2-4.5a7.3 7.3 0 0 1 5.2-2.1h.1a.3.3 0 0 1 .3.3v.1a7.3 7.3 0 0 1-2.1 5.2l-.5.4a15.2 15.2 0 0 1-2.5 2L7.1 11 5 9l1.5-2.3a15.3 15.3 0 0 1 2-2.5l.4-.5zM12 5a1 1 0 1 1-2 0 1 1 0 0 1 2 0zm-8.4 9.6a1.5 1.5 0 1 0-2.2-2.2 7 7 0 0 0-1.1 3 .2.2 0 0 0 .3.3c.6 0 2.2-.4 3-1.1z" class="astro-duqfclob"></path>
			</svg>
		</span>
		Tip
	</p>
	<section class="astro-duqfclob">
		<p>A common pattern in Astro is to import global CSS inside a <a href="/en/core-concepts/layouts/">Layout component</a>. Be sure to import the Layout component before other imports so that it has the lowest precedence.</p>
	</section>
</aside>

The markup above produces the output shown in the screenshot below, and it can also be seen in action at this link:

image

All text inside <aside></aside> is ignored. Please advise.
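
As a reference point for what the crawl would be expected to capture, here is a minimal sketch that pulls the readable text out of each <aside> block in a raw HTML string. It is not Crawlee's API: `extractAsideText` and `sample` are hypothetical names made up for illustration, the sample is a shortened copy of the markup above, and the regex-based tag stripping is a rough approximation rather than a real HTML parser.

```typescript
// Hypothetical helper (not part of Crawlee): returns the readable text
// of every <aside>...</aside> block found in an HTML string.
function extractAsideText(html: string): string[] {
  // Non-greedy match so that each <aside> block is captured separately.
  const asides = html.match(/<aside\b[^>]*>[\s\S]*?<\/aside>/gi) ?? [];
  return asides.map((block) =>
    block
      .replace(/<svg\b[\s\S]*?<\/svg>/gi, " ") // drop inline icon markup
      .replace(/<[^>]+>/g, " ")                // strip the remaining tags
      .replace(/\s+/g, " ")                    // collapse whitespace
      .trim(),
  );
}

// Shortened version of the markup from this issue (assumption: class names trimmed).
const sample = `<aside class="content tip" aria-label="Tip">
  <p class="title" aria-hidden="true">Tip</p>
  <section>
    <p>A common pattern in Astro is to import global CSS inside a
    <a href="/en/core-concepts/layouts/">Layout component</a>.</p>
  </section>
</aside>`;

console.log(extractAsideText(sample));
```

If a check like this finds the text in the fetched HTML while the crawl output lacks it, the content is probably being dropped at the extraction step rather than never being downloaded.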
