(Enhancement): Analytics - Updated regex for entity-detection logic of phone_number v2 #557
JIRA Ticket Number AN-4218
JIRA TICKET: https://hello-haptik.atlassian.net/browse/AN-4218
Description of change
Yesterday's regex r'[-(),.+\s{}]{9,12}' did not work for ticket numbers, so it had to be updated.
Updated it to support numbers: r'[0-9-().+\s]{9,12}'
Later, while testing, found that #2 (the new one) does not work for numbers that are 13 digits long.
It is limited to the range 9 to 12 only, so a 13-digit number will not be considered a phone number.
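A minimal sketch of how the two character-class patterns behave, for illustration only. It assumes the pattern is applied to the whole candidate string via re.fullmatch; the helper name is_phone_like and the sample strings are hypothetical and not taken from the codebase.

```python
import re

# Patterns copied from the description above.
OLD_PATTERN = re.compile(r'[-(),.+\s{}]{9,12}')   # yesterday's regex: no digits in the class
NEW_PATTERN = re.compile(r'[0-9-().+\s]{9,12}')   # updated regex: digits allowed, length 9 to 12


def is_phone_like(text, pattern=NEW_PATTERN):
    """Return True if the whole string matches the pattern (assumed usage)."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    # Hypothetical samples: 10 digits, 12 characters with separators,
    # 13 characters, and a ticket-style identifier.
    for sample in ["9876543210", "987-654-3210", "+919876543210", "AN-4218"]:
        print(sample, is_phone_like(sample, OLD_PATTERN), is_phone_like(sample))
```

Under this assumed usage, the length bound {9,12} is what excludes the 13-character input, which is the limitation noted above.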
Handling 13 digits made the regex complex, like r'^(?:\+?\d{1,3})?[-.\s]?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,4}$'
That complexity is not worth it for a single case, so the 13-digit case should be ignored. Before deciding, did load testing on 1 million entries.
Found that between #2 and #5 the time taken doubles, so we need to go with #2 (the simpler one, but it won't handle 13 digits if no brackets are given), and I think that should be fine.
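The comparison described above could be reproduced roughly as follows. This is a sketch under stated assumptions, not the load test that was actually run: the entries are synthetic digit strings, the helper names make_entries and time_pattern are hypothetical, and the complex pattern assumes the backslashes before +, ( and ) were lost when the description was rendered.

```python
import random
import re
import time

# Simple pattern from the description; complex pattern with escapes assumed restored.
SIMPLE = re.compile(r'[0-9-().+\s]{9,12}')
COMPLEX = re.compile(
    r'^(?:\+?\d{1,3})?[-.\s]?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,4}$'
)


def make_entries(n, seed=0):
    """Build n synthetic digit strings of length 6 to 14 (hypothetical test data)."""
    rng = random.Random(seed)
    return [
        ''.join(rng.choice('0123456789') for _ in range(rng.randint(6, 14)))
        for _ in range(n)
    ]


def time_pattern(pattern, entries):
    """Return (number of full matches, elapsed seconds) for one pattern."""
    start = time.perf_counter()
    hits = sum(1 for entry in entries if pattern.fullmatch(entry))
    return hits, time.perf_counter() - start


if __name__ == "__main__":
    entries = make_entries(100_000)  # raise to 1_000_000 to mirror the load test size
    for name, pattern in (("simple", SIMPLE), ("complex", COMPLEX)):
        hits, elapsed = time_pattern(pattern, entries)
        print(f"{name}: {hits} matches in {elapsed:.3f}s")
```

Absolute timings depend on the machine and on the input distribution, so only the ratio between the two patterns is meaningful here.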
Slack thread for discussion on AGCT Group - https://haptik.slack.com/archives/C04NKEXCAHH/p1727158979642939?thread_ts=1727070015.417459&cid=C04NKEXCAHH
Slack thread for adding testcases in Chat-bot NER - https://haptik.slack.com/archives/G01NG4CSL8P/p1727159738581209?thread_ts=1727080081.909739&cid=G01NG4CSL8P