Language locale values do not match web standards #2909

Closed
billynoah opened this issue Apr 14, 2015 · 5 comments

@billynoah
Contributor

The default language codes for English which are inserted on DB creation do not appear to follow web standards, and therefore will not find a match in many cases.

Case:

As of OpenCart 2.0.2.0 the database sets the following string for the English language locale:

en_US.UTF-8,en_US,en-gb,english

This is used to match against an array of languages produced from the browser's transmitted 'HTTP_ACCEPT_LANGUAGE'. In my case, using Firefox v37 on OS X, that string happens to be "en-US,en;q=0.5". Exploded on commas, this produces the array:

array(
  [0] => 'en-US'
  [1] => 'en;q=0.5'
)

You can see that the first element of the array follows the standard described here: http://www.metamodpro.com/browser-language-codes, setting the default language to 'en-US'. The alternative would be 'en'.

In the first case it will not match our array, since OpenCart uses '_' but the browser transmits '-' as the separator. In the second it could match, were it not for the relative quality factor string (';q=0.5') following 'en'. So in either case we will not find a match.

Additionally, in my testing, some browsers add space in the string so that exploded array elements could end up as ' en-US' which will cause additional difficulties matching.
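
The mismatch described above can be reproduced in isolation. This is only a minimal sketch, not OpenCart's actual index.php: the variable names are hypothetical, while the locale string and header value are the ones quoted in this issue.

```php
<?php
// Locale string stored for English, as quoted in this issue.
$locale = explode(',', 'en_US.UTF-8,en_US,en-gb,english');

// Accept-Language header sent by Firefox 37 on OS X, as quoted in this issue.
$browser_languages = explode(',', 'en-US,en;q=0.5');

// Naive comparison: look for each raw browser entry in the locale list.
$matched = false;
foreach ($browser_languages as $browser_language) {
    if (in_array($browser_language, $locale)) {
        $matched = true;
        break;
    }
}

var_dump($matched); // bool(false): neither 'en-US' nor 'en;q=0.5' is in the list
```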

My suggestion would be to add 'en-US' to the locale field (or amend the field to include it) and possibly revise index.php to remove the relative quality factor string when creating the "browser languages" array, like this:

$browser_languages = preg_replace(array('/;.*/','/\s/'),'',$browser_languages);
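
To show the effect of that cleanup end to end, here is a rough sketch. The function name normalize_accept_language() is hypothetical and not part of OpenCart; the lowercasing step and the locale string with 'en-US' added are assumptions based on the suggestions in this thread.

```php
<?php
// Hypothetical helper, not part of OpenCart.
function normalize_accept_language($header) {
    $tags = array();
    foreach (explode(',', $header) as $entry) {
        // Drop everything from the first ';' onward (the q-value),
        // remove all whitespace, then lowercase the remaining tag.
        $tag = strtolower(preg_replace(array('/;.*/', '/\s/'), '', $entry));
        if ($tag !== '') {
            $tags[] = $tag;
        }
    }
    return $tags;
}

// Header with the extra space some browsers add after the comma.
$tags = normalize_accept_language('en-US, en;q=0.5');
// $tags is array('en-us', 'en')

// Assumed locale field after adding 'en-US', lowercased for comparison.
$locale = explode(',', strtolower('en-US,en_US.UTF-8,en_US,en-gb,english'));
$matched = count(array_intersect($tags, $locale)) > 0; // true
```

Lowercasing both sides keeps the comparison independent of how a given browser capitalizes the region part.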

Another solution would involve rewriting the language detection portion to use php's http_negotiate_language() function.

@atnaples

The bad news is that it was the same in 1.6.5... and it's taken from Google's documentation/recommendation.
BUT all browsers (Windows, Mac, Linux) report it as you said: en-*. Worse: it could be upper case or lower case...

@billynoah
Contributor Author

Can you provide a link to the Google doc you mentioned? The case issue can be easily rectified by converting everything to lowercase.

@atnaples

Unfortunately, I could not find Google's paper, but I found http://www.w3.org/International/questions/qa-lang-priorities.en.php and they only talk about '-', not '_'. So, for example:

  • en-US.UTF-8,en-US,en-gb - English
  • es-ES.UTF-8,es-ES,UTF-8 - Spanish
  • ru-RU.UTF-8,ru-RU,russian - Russian

... and in index.php something like this (for automatic detection):

$detect = '';

if (!empty($request->server['HTTP_ACCEPT_LANGUAGE'])) {
    // change everything to lowercase, because of apple :-(
    $accept_languages = strtolower($request->server['HTTP_ACCEPT_LANGUAGE']);
    $result = array();

    foreach (explode(',', $accept_languages) as $accept_language) {
        // some browsers add a space after the comma
        $accept_language = trim($accept_language);

        if (preg_match('/^(([a-z]+)(-([a-z]+))?)(;q=([0-9.]+))?/', $accept_language, $matches)) {
            $result[$matches[1]] = array(
                'lng_base' => $matches[2],
                // groups 4 and 6 are absent when there is no region or no q-value
                'lng_ext'  => isset($matches[4]) ? $matches[4] : '',
                'lng'      => $matches[1],
                'priority' => !empty($matches[6]) ? $matches[6] : 1,
                '_str'     => $accept_language,
            );
        }
    }

    foreach ($result as $browser_language) {
        // there could be more than one language supported by the browser/system. tested on linux (ru/en)
        if ($detect != '') break;

        foreach ($languages as $key => $value) {
            if ($value['status']) {
                $locale = explode(',', strtolower($value['locale']));

                if (in_array($browser_language['lng'], $locale)) {
                    // it would be correct to check 'priority' for the language... we are taking the first one
                    $detect = $key;
                    break;
                }
            }
        }
    }
}
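
The code above takes the first match rather than checking 'priority'. One possible way to honour the q-values is sketched below; parse_accept_language() is a hypothetical helper, not something in OpenCart, and it assumes that entries with equal weight should keep the order in which the browser sent them.

```php
<?php
// Hypothetical helper, not part of OpenCart.
// Returns lowercase language tags ordered from most to least preferred.
function parse_accept_language($header) {
    $entries = array();
    foreach (explode(',', strtolower($header)) as $position => $entry) {
        $parts = explode(';', trim($entry));
        $tag = trim($parts[0]);
        if ($tag === '') {
            continue;
        }
        $q = 1.0; // default weight when no q parameter is present
        foreach (array_slice($parts, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $entries[] = array('tag' => $tag, 'q' => $q, 'position' => $position);
    }

    // Highest q first; ties keep the order the browser sent.
    usort($entries, function ($a, $b) {
        if ($a['q'] == $b['q']) {
            return $a['position'] - $b['position'];
        }
        return ($a['q'] > $b['q']) ? -1 : 1;
    });

    $tags = array();
    foreach ($entries as $e) {
        $tags[] = $e['tag'];
    }
    return $tags;
}

$ordered = parse_accept_language('en;q=0.5, ru-RU, en-US;q=0.8');
// $ordered is array('ru-ru', 'en-us', 'en')
```

The ordered tags could then be compared against each enabled language's lowercased locale list, stopping at the first hit.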

@danielkerr
Member

There are no issues here. You can put whatever values you want to detect different languages.

@billynoah
Contributor Author

@danielkerr are you familiar with the relative quality factor? How can a user possibly account for the presence of this, given that it's added by the browser and impossible to predict? It needs to be removed from the string before you will be able to detect a language.

If a given browser uses "en" and you simply add "en" to your locale strings, you still will not be able to match the header, because the entry could be any of:

en;q=0.5
en;q=0.8
en;q=0.1

Etc, etc... Are users expected to add a seemingly infinite number of possible numeric values after each language string in their system settings? I think you'll agree, this is unreasonable.

While I agree it is not the responsibility of OpenCart to predict what languages will be used by a given site, at the very least, the relative quality factor and any additional whitespace should be stripped.
