Charset related changes #132

Merged: 2 commits, Mar 24, 2024
42 changes: 20 additions & 22 deletions src/EDI/Parser.php
@@ -81,9 +81,9 @@ class Parser
private $stringSafe = '§SS§';

/**
* @var string|null Syntax identifier
* @var string UNB Syntax identifier
*/
private $syntaxID = 'UNOB';
private string $syntaxID = '';

/**
* @var string|null Message format from UNH
@@ -132,9 +132,9 @@ class Parser
private $unbChecked = false;

/**
* Optionally disable workarounds
* Optionally disable workarounds.
*/
private $strict = false;
private bool $strict = false;

/**
* Parse EDI array.
@@ -273,21 +273,25 @@ public function analyseUNH(array $line): void
}

/**
* Check if the encoding of the text actually matches the one declared by the UNB syntax identifier.
* Check if the file's character encoding actually matches the one declared in the UNB header.
*
* @throws \LogicException
* @throws \RuntimeException
*/
public function checkEncoding(): bool
{
if (empty($this->parsedfile)) {
throw new \RuntimeException('No text has been parsed yet');
throw new \LogicException('No text has been parsed yet');
}

if (! isset(self::$charsets[$this->syntaxID])) {
throw new \RuntimeException('Unsupported syntax identifier: ' . $this->syntaxID);
}

return mb_check_encoding($this->parsedfile, self::$charsets[$this->syntaxID]);
$check = mb_check_encoding($this->parsedfile, self::$charsets[$this->syntaxID]);
if(!$check)
$this->errors[] = 'Character encoding does not match declaration in UNB interchange header';

return $check;
}

/**
@@ -299,9 +303,9 @@ public function errors(): array
}

/**
* Set Strict
* (Un)Set strict parsing.
*/
public function isStrict($strict)
public function setStrict(bool $strict):void
{
$this->strict = $strict;
}
@@ -329,21 +333,15 @@ public function getRawSegments(): array
}

/**
* Get character encoding extracted from UNB header
* Get syntax identifier from the UNB header.
* Does not necessarily mean that the text is actually encoded as such.
*
* @return string
* @throws \RuntimeException
*/
public function getCharset(): string
public function getSyntaxIdentifier(): string
{
if (empty($this->parsedfile)) {
throw new \RuntimeException('No text has been parsed yet');
}

if (! isset(self::$charsets[$this->syntaxID])) {
throw new \RuntimeException('Unsupported syntax identifier: ' . $this->syntaxID);
}

return self::$charsets[$this->syntaxID];
return $this->syntaxID;
}

/**
@@ -457,7 +455,7 @@ private function resetUNA(): void
*/
private function resetUNB(): void
{
$this->syntaxID = 'UNOB';
$this->syntaxID = '';
$this->unbChecked = false;
}
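
For readers who want to try the encoding check from `checkEncoding()` outside the parser, here is a minimal standalone sketch. It is not the library's code: the `CHARSETS` constant is an illustrative subset standing in for `Parser::$charsets` (whose full contents are not shown in this diff), `declaredEncodingMatches()` is a hypothetical helper name, and the input is assumed to be a plain string. The exception types mirror the ones used in the diff.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative subset of a syntax-identifier-to-charset map.
 * Assumption: the real Parser::$charsets table may contain other entries.
 */
const CHARSETS = [
    'UNOA' => 'ASCII',
    'UNOB' => 'ASCII',
    'UNOC' => 'ISO-8859-1',
];

/**
 * Hypothetical helper: report whether $text is valid in the charset
 * declared by the UNB syntax identifier $syntaxId.
 *
 * @throws LogicException   when there is no text to check
 * @throws RuntimeException when the syntax identifier is not in the map
 */
function declaredEncodingMatches(string $text, string $syntaxId): bool
{
    if ($text === '') {
        throw new LogicException('No text has been parsed yet');
    }

    if (!array_key_exists($syntaxId, CHARSETS)) {
        throw new RuntimeException('Unsupported syntax identifier: ' . $syntaxId);
    }

    // mb_check_encoding() returns true only if every byte sequence in
    // $text is valid in the given encoding.
    return mb_check_encoding($text, CHARSETS[$syntaxId]);
}

// Pure ASCII text declared as UNOB passes the check.
var_dump(declaredEncodingMatches('UNB+UNOB:1', 'UNOB'));    // bool(true)

// UTF-8 bytes for "ü" are not valid ASCII, so the declaration does not match.
var_dump(declaredEncodingMatches("M\xC3\xBCller", 'UNOB')); // bool(false)
```

Note that a single-byte charset such as ISO-8859-1 accepts any byte sequence, so a passing check for `UNOC` does not prove the text was really written in that encoding. The check can only detect clear mismatches, which fits the diff's choice to record a failure in `errors` rather than throw.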
