Optional Argument Validation Fails for non-latin characters #99
Comments
I wonder whether this library should be validating or filtering input to the API at all.
In this commit I changed the way we're filtering and validating characters so that we allow non-Latin characters as we have in Germany, Poland, Greece and other countries.
Hello @robmeek, I tested the hypothesis you provided here, using the following data set for testing:

    'Greek Trader Name' => [
        [
            'countryCode' => 'EL',
            'vatNumber' => '999645865',
            'requesterCountryCode' => 'EL',
            'requesterVatNumber' => '999645865',
            'traderName' => 'ΤΡΑΙΝΟΣΕ',
            'traderCompanyType' => 'AE',
            'traderStreet' => 'ΚΑΡΟΛΟΥ 1-3',
            'traderPostcode' => '10437',
            'traderCity' => 'ΑΘΗΝΑ',
        ],
    ],

I've modified the filter:

    private function filterArgument(string $argumentValue): string
    {
        $argumentValue = str_replace(['"', '\''], '', $argumentValue);
        return filter_var($argumentValue, FILTER_SANITIZE_SPECIAL_CHARS, FILTER_FLAG_STRIP_LOW);
    }

And modified the validation regex:

    private function validateArgument(string $argumentValue): bool
    {
        if (false === filter_var($argumentValue, FILTER_VALIDATE_REGEXP, [
            'options' => ['regexp' => '/^[a-zA-Z0-9\s\.\-,\pL]+$/u']
        ])) {
            return false;
        }
        return true;
    }

Now the tests are running OK and we didn't break anything in the process, so that's always good. Please have a look at my PR #100 and see if it solves your issue.
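For anyone who wants to try the new pattern outside the library, here is a minimal standalone sketch. The function name `isValidArgument` is made up for this example and is not part of the library; only the regular expression is copied from the comment above.

```php
<?php
// Hypothetical standalone helper; only the pattern comes from the thread.
function isValidArgument(string $value): bool
{
    // \pL matches any Unicode letter. The /u modifier makes PCRE read the
    // pattern and the subject as UTF-8 rather than as raw bytes.
    return 1 === preg_match('/^[a-zA-Z0-9\s\.\-,\pL]+$/u', $value);
}

var_dump(isValidArgument('ΤΡΑΙΝΟΣΕ'));     // Greek capitals only
var_dump(isValidArgument('ΚΑΡΟΛΟΥ 1-3'));   // letters, space, digits, hyphen
var_dump(isValidArgument('Äß'));            // accented Latin letters
var_dump(isValidArgument(''));              // empty input: "+" needs at least one character
```

The first three calls should report `true` and the last one `false`, since the quantifier `+` rejects an empty string.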
@DragonBe Yes, that looks good to me.
Awaiting a code review from @krzaczek to make sure… once approved, it will be merged in.
@DragonBe can you please add "+" to the whitelist too?
Example of a valid company name in the VIES database: ACME GmbH + Co. KG
Thanks a lot!
I just discovered that parentheses need to be allowed too. Maybe it is better to remove this filter, as @robmeek already suggested. I can see no need for it.
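To make the request concrete, the sketch below compares the pattern from the pull request with a hypothetical extension that also accepts "+", "(" and ")". The extended pattern is only an illustration of the request, not the rule the library ended up with, and the second company name is invented.

```php
<?php
// Pattern from the pull request discussed in this thread.
const PR_PATTERN = '/^[a-zA-Z0-9\s\.\-,\pL]+$/u';
// Hypothetical extension: "+", "(" and ")" are literal inside a character class.
const EXTENDED_PATTERN = '/^[a-zA-Z0-9\s\.\-,+()\pL]+$/u';

$names = ['ACME GmbH + Co. KG', 'Example (Holding) AG'];

foreach ($names as $name) {
    printf(
        "%s: pr=%d extended=%d\n",
        $name,
        preg_match(PR_PATTERN, $name),
        preg_match(EXTENDED_PATTERN, $name)
    );
}
```

Both names are rejected by the first pattern and accepted by the second. Every newly reported character needs another change of this kind, which is the maintenance cost being pointed out here.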
I'm not really a fan of letting all sorts of input through, because the filter is there to prevent malicious code from being executed. Given the feedback up to this point, what I might do instead is allow all characters except those I consider harmful by using
Thanks for your feedback.
I was just reviewing this bit of code again, and there are two steps in the validation process:
I could add the
Hmm, up to this point I have most of the issues cleared, except for the ampersand issue. Test case:
Double checked the SoapCall result:
Looks to me like the validation is OK, but the filtering should not filter the special characters.
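One plausible reading of the ampersand problem, assuming the filter is still the `FILTER_SANITIZE_SPECIAL_CHARS` call quoted earlier in the thread: that filter HTML-encodes `&`, so the value passed on is no longer the name that was entered. The trader name below is invented for illustration and is not the test case from the thread.

```php
<?php
// Hypothetical trader name, used only to show the effect of the filter.
$name = 'Smith & Sons Ltd';

// FILTER_SANITIZE_SPECIAL_CHARS HTML-encodes ' " < > and & as numeric
// entities; FILTER_FLAG_STRIP_LOW additionally removes control characters.
$filtered = filter_var($name, FILTER_SANITIZE_SPECIAL_CHARS, FILTER_FLAG_STRIP_LOW);

var_dump($filtered);            // the "&" is now the entity "&#38;"
var_dump($filtered === $name);  // bool(false): the value was altered
```

If the encoded string is what goes into the SOAP call, the remote service would see `&#38;` in place of the ampersand, which fits the observation that validation passes while filtering still changes the input.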
After some discussion on issue #99 I improved the way we filter arguments, especially for trader company names and added a couple of additional test cases to make sure we allow special characters inside the trader names.
OK, in 7ef8962 I was able to improve the filtering with additional test cases and was even able to remove a potential security risk. @fidelo-software can you have a look at it to see if your use cases are now covered? If not, please add some examples here in the thread or directly to the test data provider.
Thank you for responding so quickly!
Speak & Fun España SL
Correct, abbreviations in Spanish are often marked with a
I know it's a bit annoying, but until I find a permanent solution I think I need to take it case by case. I've adjusted the validation rule in b971690.
The addOptionalArguments method throws an InvalidArgumentException if an argument consists exclusively of non-ASCII characters.
For example, if the traderName is
ΕΛΛΑΣ
– note that the E and the A are Greek, and not Latin, Unicode characters here. For test purposes, something like
Äß
also throws the exception.
The filterArgument method strips away all Greek letters (and accented characters such as ä and é). In the case of ΕΛΛΑΣ, this means an empty string is sent to validateArgument.
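The library's original filter code is not quoted in this thread, so the following is only a sketch of the effect described, under the assumption that the filter drops every byte above 0x7F (here via PHP's `FILTER_FLAG_STRIP_HIGH`). The helper name `stripHighBytes` is made up.

```php
<?php
// Assumption: a sanitising step that removes all bytes above 0x7F.
// This is NOT the library's code, only an illustration of the symptom.
function stripHighBytes(string $value): string
{
    return filter_var($value, FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_HIGH);
}

var_dump(stripHighBytes('ΕΛΛΑΣ'));  // every UTF-8 byte is above 0x7F, nothing is left
var_dump(stripHighBytes('Äß'));     // same for these two characters
var_dump(stripHighBytes('Café'));   // only the ASCII part "Caf" survives
```

With such a filter, a name written entirely in Greek reaches the validator as an empty string, and a pattern that requires at least one character then fails.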
The regex in validateArgument is probably also problematic. Changing it to this:
/^[a-zA-Z0-9α-ωΑ-Ω\s\.\-,]+$/
– will allow the Greek alphabet, but I don’t know whether that’s sufficient.
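A quick way to check whether the range is sufficient is to run it against a few more names. The sketch below adds the `/u` modifier to the proposed pattern (an assumption, so that the Greek ranges are read as code points rather than bytes) and compares it with a variant built on the `\p{L}` Unicode letter property. The variable names and sample strings are chosen for this example.

```php
<?php
// Proposed pattern from the issue text, with /u added (assumption).
$rangePattern = '/^[a-zA-Z0-9α-ωΑ-Ω\s\.\-,]+$/u';
// Alternative: accept any Unicode letter via the \p{L} property.
$letterPattern = '/^[0-9\s\.\-,\p{L}]+$/u';

$names = [
    'ΕΛΛΑΣ',               // unaccented Greek capitals
    "Αθ\u{03AE}να",        // contains eta with tonos (U+03AE)
    "\u{00C4}\u{00DF}",    // Äß
];

foreach ($names as $name) {
    printf(
        "%s: range=%d letter=%d\n",
        $name,
        preg_match($rangePattern, $name),
        preg_match($letterPattern, $name)
    );
}
```

Only the first name passes the range-based pattern, because accented Greek letters such as U+03AE lie outside both `α-ω` and `Α-Ω`, and `Äß` is not Greek at all. All three pass the `\p{L}` variant.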