How to get currency name(USD,IND) with the DocQuery? #51
Comments
Can you share the document and value you are trying to extract?
Document links: the documents below contain the symbols $ and €. Based on those, can we get a currency name such as USD or EUR?
9240725072.pdf
DocQuery is extractive -- meaning it will only return text that is present in the document. Neither of the first two documents contains the string "USD" or "EUR", so there's no way to get DocQuery to return that value. I may suggest using something simpler, like a regular expression that searches for terms like "USD" or "$" to suggest "USD" and "EUR" or "€" to suggest "EUR", unless there is a consistent pattern for where this information lies in the document itself.
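To make the regular-expression suggestion concrete, here is a minimal sketch in Python. It assumes the document text has already been extracted into a plain string by some other means; the names `CURRENCY_PATTERNS` and `guess_currency` are hypothetical and not part of DocQuery. Note that "$" is ambiguous (it is also used for CAD, AUD, and others), so treating it as USD is only a heuristic.

```python
import re

# Hypothetical mapping from ISO 4217 codes to the markers that suggest them.
# Extend this with other currencies as needed.
CURRENCY_PATTERNS = {
    "USD": re.compile(r"\bUSD\b|\$"),
    "EUR": re.compile(r"\bEUR\b|€"),
}


def guess_currency(text):
    """Return the code whose markers occur most often in text, or None."""
    counts = {
        code: len(pattern.findall(text))
        for code, pattern in CURRENCY_PATTERNS.items()
    }
    best = max(counts, key=counts.get)
    # If no marker matched at all, report that nothing was found.
    return best if counts[best] > 0 else None
```

Calling `guess_currency` on the extracted text of an invoice would then return a code such as "USD" when dollar signs or the literal string dominate, and `None` when no known marker is present.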
In the PDF file above (the loaded file), we do have the string "USD", right?
@ankrgyl
Yes, but not the other two.
Unfortunately not, but if you'd be willing to contribute one, we'd be happy to include it!
@ankrgyl
Not at this point. |
@ankrgyl
I have tried multiple ways to extract the currency name, but no luck.
Can you suggest a matching query (question) to extract the currency from the doc?