links/urls are not appearing using extract #636

Closed
alroythalus opened this issue Jul 1, 2024 · 1 comment
Labels
feedback Feedback from users requested

Comments

@alroythalus

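    from trafilatura import extract

    # web_content (the downloaded HTML) and config are defined elsewhere
    result = extract(
        web_content,
        include_formatting=False,
        include_tables=True,
        include_comments=False,
        include_links=True,  # keep link targets in the output
        output_format="xml",
        favor_recall=True,
        config=config,
    )  # type: ignore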


With this config, URLs are not showing up in the output. What is the issue, and how can it be fixed?

Sites tested on:
https://openai.com/policies/privacy-policy/
https://docs.github.com/en/site-policy/privacy-policies/github-general-privacy-statement
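
One thing that may be worth ruling out first is whether the downloaded HTML contains any anchors at all, since links that are absent from the input cannot appear in the output. The sketch below uses only the Python standard library; `LinkCollector`, `collect_links`, and `sample_html` are hypothetical names for illustration and are not part of trafilatura.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in an HTML string."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchors with a non-empty href count as links.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def collect_links(html):
    """Return all href values found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical stand-in for web_content
sample_html = (
    '<p>See the <a href="https://example.org/policy">policy</a> '
    'and <a name="top">top</a>.</p>'
)
print(collect_links(sample_html))
```

If this returns an empty list for the real `web_content`, the problem probably lies in the download step rather than in `extract`.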


adbar added the question label (Further information is requested) on Jul 16, 2024
adbar added the feedback label (Feedback from users requested) and removed the question label (Further information is requested) on Jul 25, 2024
Owner

adbar commented Jul 25, 2024

@alroythalus I just tested the GitHub example and the links are present in the XML output. Here is a small example:

To remove content or information you have publicly posted, please submit a <ref target="https://support.github.com/contact/private-information">Private Information Removal request</ref>.

I cannot reproduce the bug. Can you check whether it works for you, or provide more information?
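
A quick way to check the output on your side is to parse the XML string and list every `ref` element with its `target` attribute. This is only a sketch using the standard library: `list_refs` is a hypothetical helper, and the `<doc><main><p>` wrapper around the snippet quoted above is an assumption about the surrounding structure, as is the absence of an XML namespace.

```python
import xml.etree.ElementTree as ET


def list_refs(xml_output):
    """Return (target, text) pairs for every <ref> element in the XML string."""
    root = ET.fromstring(xml_output)
    return [
        (ref.get("target"), "".join(ref.itertext()))
        for ref in root.iter("ref")
    ]


# Assumed wrapper elements around the sentence quoted above
sample_xml = (
    "<doc><main><p>To remove content or information you have publicly posted, "
    'please submit a <ref target="https://support.github.com/contact/private-information">'
    "Private Information Removal request</ref>.</p></main></doc>"
)

for target, text in list_refs(sample_xml):
    print(target, "->", text)
```

An empty list for the real output would confirm that no links were kept, which would be useful information to add to the report.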

adbar closed this as not planned on Oct 1, 2024