Revcheck library and QA tools for translation synchronization  #111

Merged 59 commits on Jun 14, 2024.
9120926
First part of revcheck algorithm.
Jul 11, 2023
0bf8620
Hash and timestamp details for source files.
Jul 13, 2023
2eb3e3c
Revtag parser.
Jul 13, 2023
7b74f59
Rename for future use.
Jul 14, 2023
9fc530c
Implementation of revcheck algorithm.
Jul 14, 2023
7c5cc0c
Export some calculated status to easily use on other tools.
Jul 16, 2023
a5c212f
qaxml.a.check.php, tool to check tag-attribute-values.
Jul 17, 2023
ecb3f1e
Print diff in conventional order.
Jul 27, 2023
117660a
qaxml.a -- Documentation and small fixes.
Aug 2, 2023
6cf1423
Initial support for [skip-revcheck].
Aug 3, 2023
03b9760
[skip-revcheck] compatibility.
Aug 3, 2023
ea00f5e
Compact output.
Aug 3, 2023
7058ea8
Script to check entities usage in translatoins.
Aug 4, 2023
1f3c5f1
qaxml.p and qaxml.t tools.
Aug 4, 2023
e3a7609
Rename compare hashes to head and diff.
Aug 5, 2023
f3e45aa
Tools documentation.
Aug 5, 2023
45d1126
Merge branch 'php:master' into master
alfsb Aug 5, 2023
997af1f
Ignore order in comparation.
Aug 5, 2023
e07946b
Simplify output.
Aug 5, 2023
6d4a3fb
Small fixes.
Aug 7, 2023
66d9630
Simple and complex output.
Aug 7, 2023
40edf10
`bookinfo.xml` is expectd to tag differ.
Aug 7, 2023
7cc43f5
Case insentivity on `<type>`.
Aug 8, 2023
84818f6
Consistent output volunteer myself for revisions.
Aug 11, 2023
817fa07
Consistent output volunteer myself for revisions.
Aug 11, 2023
a933c9e
Merge branch 'php:master' into master
alfsb Aug 12, 2023
0cd98f5
Missing ;
Aug 12, 2023
afebb28
Merge branch 'php:master' into master
alfsb Aug 13, 2023
918bfc6
Fix revtag capture and error reporting.
Aug 13, 2023
60c9cae
Organize require for all files in `lib/`
Aug 14, 2023
cb1eb76
New detail output for `qaxml.t.php`.
Aug 14, 2023
bd4846d
New detail output for `qaxml.t.php`.
Aug 14, 2023
efbefa7
Detect and ignore expected outputs.
Aug 15, 2023
258bc50
New `qarvt.php` to validate revtag expected format.
Aug 17, 2023
34f2e3d
Offer ignore comment only if checking tag's contents.
Aug 18, 2023
aa9b422
Save ignore locally, outside of global XML.
Aug 18, 2023
2ab0f2f
Enhance ignore marking to consider arguments and intenal state.
Aug 18, 2023
b1d1205
Ignore usage in all cases.
Aug 18, 2023
2b3d81e
Merge branch 'php:master' into master
alfsb Aug 18, 2023
1292b1f
Generic algorithm to detect count and contents mismatch.
Aug 22, 2023
cda8129
Count filtered.
Aug 22, 2023
e8dcd15
Generalize case insensitivity on basic types
Aug 22, 2023
ae34a2e
Configurable ignore output.
Aug 22, 2023
99d9e30
Fix wrong hash on unused message.
Aug 22, 2023
1c1c59b
Simplify output.
Aug 23, 2023
43624a2
Add ignore capacity on qaxml.e.php.
Aug 24, 2023
2e3585a
Convert qaxml.t.php to use OutputIgnoreBuffer.
Aug 24, 2023
beec19b
Fix instable output on qaxml.t.php.
Aug 25, 2023
467d573
Merge branch 'php:master' into master
alfsb Oct 9, 2023
da1a6d9
Merge branch 'php:master' into master
alfsb Oct 17, 2023
92964f2
Merge branch 'php:master' into master
alfsb Nov 2, 2023
2614f14
Merge branch 'php:master' into master
alfsb Nov 21, 2023
ed29a72
License code to PHP License
Dec 14, 2023
3a9dd37
Merge branch 'php:master' into master
alfsb Dec 14, 2023
0861659
Fix dynamic property creation into temporary
Dec 15, 2023
ed0a911
Usage and help info
Dec 15, 2023
07eee25
Documentation fixes
Dec 15, 2023
821023f
Pass by reference is not necessary with classes
Dec 15, 2023
bfa2f52
Encapsulate private property
Dec 15, 2023
11 changes: 11 additions & 0 deletions CODEOWNERS
@@ -0,0 +1,11 @@
# The following volunteers have self-identified as subject matter experts
# or interested parties over a particular area of this repository.
# While requesting a review from someone does not obligate that person to
# review a pull request, these reviewers might have valuable knowledge of
# the problem area and could aid in deciding whether a pull request is ready
# for merging.
#
# For more information, see the GitHub CODEOWNERS documentation:
# https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners

/scripts/translation/ @alfsb
5 changes: 5 additions & 0 deletions README.md
@@ -143,3 +143,8 @@ and find issues with it, they are located in the `scripts/qa/` directory.
There might be some more just in `scripts/` but they need to be checked if they
are still relevant and/or given some love.

# Translation Tools

There are also various scripts to ensure the quality and synchrony of
documentation translations, located in the `scripts/translation/` directory.

2 changes: 2 additions & 0 deletions scripts/translation/.gitignore
@@ -0,0 +1,2 @@
# Persistent data shared between scripts
.cache/
111 changes: 111 additions & 0 deletions scripts/translation/README.md
@@ -0,0 +1,111 @@
# Useful scripts for maintaining translation consistency of the manual

Some of these scripts only test file contents or the XML structure
of translated files against their equivalents in the `en/` directory.
Others will try to modify the translations in place, changing the
translated files. Use with care.

Not all translations are identical, or use the same conventions.
So not all scripts will be of use for all translations. The
assumptions of each script are described in each file.

The `lib/` directory contains common code and functionality
across these scripts.

Before using the scripts, they need to be configured:
```
php doc-base/scripts/translation/configure.php $LANG_DIR
```

## qarvt.php

`qarvt.php` checks if all translated files have revtags in the
expected format.

## qaxml.a.php

`qaxml.a.php` checks if all updated translated files have
the same tag-attribute-value triples. Tag attributes are used extensively
in the manual for linking and XIncluding. Translated files with
missing or mistyped attributes may cause build failures, or parts of the
manual may go missing because XIncludes could not copy them.
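
The idea behind this check can be sketched as follows. This is only an illustration, assuming plain well-formed XML without the manual's entities and PHP's DOM extension; `attributeTriples()` is a hypothetical helper, not the code of `qaxml.a.php`:

```php
<?php
// Hypothetical helper (not part of these scripts): collects one
// "tag attribute value" string per attribute found in the XML text.
function attributeTriples( string $xml ) : array
{
    $doc = new DOMDocument();
    $doc->loadXML( $xml );
    $triples = [];
    foreach ( $doc->getElementsByTagName( '*' ) as $node )
        foreach ( $node->attributes as $attr )
            $triples[] = "{$node->nodeName} {$attr->nodeName} {$attr->nodeValue}";
    sort( $triples );
    return $triples;
}

// Made-up sample fragments: the translation mistypes one linkend value.
$source = '<para><link linkend="language.types">Types</link> and <xref linkend="ref.array"/></para>';
$target = '<para><link linkend="language.type">Tipos</link> e <xref linkend="ref.array"/></para>';

// Triples present only in the source, and only in the translation.
$missing = array_diff( attributeTriples( $source ) , attributeTriples( $target ) );
$extra   = array_diff( attributeTriples( $target ) , attributeTriples( $source ) );

print_r( $missing );
print_r( $extra );
```

Note that `array_diff()` ignores how many times a triple occurs, so a real check would also need to compare counts.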

## qaxml.e.php

`qaxml.e.php` checks if all updated translated files have
the same external entities as the original files. Unbalanced entities
may indicate mistyped or wrongly translated parts.

## qaxml.p.php

`qaxml.p.php` checks if all updated translated files have
the same processing instructions as the original files. Unbalanced
processing instructions may cause compilation errors, as they are used
by the manual's build process.

## qaxml.t.php

`qaxml.t.php` checks if all updated translated files have
the same tags as the original files. A different number of tags between
source texts and target translations may cause compilation errors.

Usage: `php qaxml.t.php [--detail] [tag[,tag]]`

`[tag[,tag]]` is a comma-separated list of tags whose contents are
checked, as the contents of some tags are expected *not* to be translated.

`--detail` will also print line definitions of each mismatched tag,
to facilitate bisecting.

## Suggested execution

Structural checks:

```
php doc-base/scripts/translation/configure.php $LANG_DIR

php doc-base/scripts/translation/qarvt.php

php doc-base/scripts/translation/qaxml.a.php
php doc-base/scripts/translation/qaxml.e.php
php doc-base/scripts/translation/qaxml.p.php
php doc-base/scripts/translation/qaxml.t.php
```
Tags where no translation is expected:
```
php doc-base/scripts/translation/qaxml.t.php acronym
php doc-base/scripts/translation/qaxml.t.php classname
php doc-base/scripts/translation/qaxml.t.php constant
php doc-base/scripts/translation/qaxml.t.php envar
php doc-base/scripts/translation/qaxml.t.php function
php doc-base/scripts/translation/qaxml.t.php interfacename
php doc-base/scripts/translation/qaxml.t.php parameter
php doc-base/scripts/translation/qaxml.t.php type
php doc-base/scripts/translation/qaxml.t.php classsynopsis
php doc-base/scripts/translation/qaxml.t.php constructorsynopsis
php doc-base/scripts/translation/qaxml.t.php destructorsynopsis
php doc-base/scripts/translation/qaxml.t.php fieldsynopsis
php doc-base/scripts/translation/qaxml.t.php funcsynopsis
php doc-base/scripts/translation/qaxml.t.php methodsynopsis
```
Tags where few translations are expected:
```
php doc-base/scripts/translation/qaxml.t.php code
php doc-base/scripts/translation/qaxml.t.php computeroutput
php doc-base/scripts/translation/qaxml.t.php filename
php doc-base/scripts/translation/qaxml.t.php literal
php doc-base/scripts/translation/qaxml.t.php varname
```

# Migration

## Maintainers with spaces

The regex in `RevtagParser` was narrowed so that it no longer accepts
maintainer names with spaces. This needs to be confirmed on all active
translations, or the regex must be modified to accept spaces again.
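
To illustrate the difference, here is a small sketch. Both patterns are hypothetical (the regex actually used by `RevtagParser` is not shown in this README), and the sample revtag assumes the usual `EN-Revision: ... Maintainer: ... Status: ...` layout:

```php
<?php
// Hypothetical patterns for illustration only.
// $strict rejects whitespace inside the maintainer name, $relaxed allows it.
$strict  = '/EN-Revision:\s*(\S+)\s+Maintainer:\s*(\S+)\s+Status:\s*(\S+)/';
$relaxed = '/EN-Revision:\s*(\S+)\s+Maintainer:\s*(.+?)\s+Status:\s*(\S+)/';

// Made-up revtag whose maintainer name contains a space.
$revtag = '<!-- EN-Revision: 9120926 Maintainer: some name Status: ready -->';

$strictOk  = preg_match( $strict , $revtag , $m1 );
$relaxedOk = preg_match( $relaxed , $revtag , $m2 );

var_dump( $strictOk , $relaxedOk , $m2[2] ?? null );
```

Running a comparison like this over every translated file would show which translations depend on names with spaces.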

## en/chmonly

`en/chmonly` is ignored on revcheck, but it appears translatable. If it is an
`en/`-only directory, this should be uncommented in `RevcheckIgnore`.
29 changes: 29 additions & 0 deletions scripts/translation/configure.php
@@ -0,0 +1,29 @@
<?php
/**
* +----------------------------------------------------------------------+
* | Copyright (c) 1997-2023 The PHP Group |
* +----------------------------------------------------------------------+
* | This source file is subject to version 3.01 of the PHP license, |
* | that is bundled with this package in the file LICENSE, and is |
* | available through the world-wide-web at the following url: |
* | https://www.php.net/license/3_01.txt. |
* | If you did not receive a copy of the PHP license and are unable to |
* | obtain it through the world-wide-web, please send a note to |
* | license@php.net, so we can mail you a copy immediately. |
* +----------------------------------------------------------------------+
* | Authors: André L F S Bacci <ae php.net> |
* +----------------------------------------------------------------------+
* | Description: Generate cached data for revcheck and QA tools. |
* +----------------------------------------------------------------------+
*/

require_once __DIR__ . '/lib/all.php';

if ( count( $argv ) < 2 || in_array( '--help' , $argv ) || in_array( '-h' , $argv ) )
{
    fwrite( STDERR , "Usage: {$argv[0]} [lang_dir]\n\n" );
    fwrite( STDERR , "See https://github.com/php/doc-base/tree/master/scripts/translation#readme for more info.\n" );
    return;
}

new RevcheckRun( 'en' , $argv[1] , true );
57 changes: 57 additions & 0 deletions scripts/translation/lib/CacheFile.php
@@ -0,0 +1,57 @@
<?php
/**
* +----------------------------------------------------------------------+
* | Copyright (c) 1997-2023 The PHP Group |
* +----------------------------------------------------------------------+
* | This source file is subject to version 3.01 of the PHP license, |
* | that is bundled with this package in the file LICENSE, and is |
* | available through the world-wide-web at the following url: |
* | https://www.php.net/license/3_01.txt. |
* | If you did not receive a copy of the PHP license and are unable to |
* | obtain it through the world-wide-web, please send a note to |
* | license@php.net, so we can mail you a copy immediately. |
* +----------------------------------------------------------------------+
* | Authors: André L F S Bacci <ae php.net> |
* +----------------------------------------------------------------------+
* | Description: Class to handle data persistence. |
* +----------------------------------------------------------------------+
*/

require_once __DIR__ . '/all.php';

class CacheFile
{
    const CACHE_DIR = __DIR__ . '/../.cache';

    private string $filename;

    function __construct( string $file )
    {
        $this->filename = CacheFile::prepareFilename( $file , true );
    }

    // Returns the cached data, or $init when the cache file does not exist.
    public function load( mixed $init = null )
    {
        if ( file_exists( $this->filename ) == false )
            return $init;
        $data = file_get_contents( $this->filename );
        return unserialize( gzdecode( $data ) );
    }

    // Serializes and gzip-compresses $data into the cache file.
    public function save( $data )
    {
        $contents = gzencode( serialize( $data ) );
        file_put_contents( $this->filename , $contents );
    }

    // Absolute paths are used as is; relative ones are placed under CACHE_DIR.
    public static function prepareFilename( string $file , bool $createCacheDirs = false )
    {
        if ( str_starts_with( $file , '/' ) )
            return $file;
        $outPath = CacheFile::CACHE_DIR . '/' . dirname( $file );
        // basename() avoids repeating the directory part already in $outPath
        $outFile = rtrim( $outPath , '/' ) . '/' . basename( $file );
        if ( $createCacheDirs && file_exists( $outPath ) == false )
            mkdir( $outPath , 0777 , true );
        return $outFile;
    }
}
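
The persistence format used by `load()` and `save()` above is a serialized value compressed with gzip. A minimal round trip, assuming the zlib extension is available and using a made-up temporary file and sample array rather than the real `.cache/` directory, could look like this:

```php
<?php
// Sketch only: the same serialize + gzencode pipeline as CacheFile,
// applied to a temporary file instead of the scripts' cache directory.
$file = tempnam( sys_get_temp_dir() , 'cache' );

// Made-up sample payload.
$data = [ 'en/index.xml' => [ 'head' => '9120926' , 'size' => 1234 ] ];

file_put_contents( $file , gzencode( serialize( $data ) ) );
$loaded = unserialize( gzdecode( file_get_contents( $file ) ) );
unlink( $file );

var_dump( $loaded === $data );
```

Since `unserialize()` can instantiate arbitrary classes, this format is only suitable for cache files the scripts wrote themselves.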
51 changes: 51 additions & 0 deletions scripts/translation/lib/CacheUtil.php
@@ -0,0 +1,51 @@
<?php
/**
* +----------------------------------------------------------------------+
* | Copyright (c) 1997-2023 The PHP Group |
* +----------------------------------------------------------------------+
* | This source file is subject to version 3.01 of the PHP license, |
* | that is bundled with this package in the file LICENSE, and is |
* | available through the world-wide-web at the following url: |
* | https://www.php.net/license/3_01.txt. |
* | If you did not receive a copy of the PHP license and are unable to |
* | obtain it through the world-wide-web, please send a note to |
* | license@php.net, so we can mail you a copy immediately. |
* +----------------------------------------------------------------------+
* | Authors: André L F S Bacci <ae php.net> |
* +----------------------------------------------------------------------+
* | Description: Common functions to load and save to cache files. |
* +----------------------------------------------------------------------+
*/

require_once __DIR__ . '/all.php';

class CacheUtil
{
    const CACHE_DIR = __DIR__ . '/../.cache';

    public static function load( string $path , string $file )
    {
        $filename = CacheUtil::prepareFilename( $path , $file , true );
        if ( file_exists( $filename ) == false )
            return null;
        $data = file_get_contents( $filename );
        return unserialize( $data );
    }

    public static function save( string $path , string $file , $data )
    {
        $outFile = CacheUtil::prepareFilename( $path , $file , true );
        $contents = serialize( $data );
        file_put_contents( $outFile , $contents );
    }

    public static function prepareFilename( string $path , string $file , bool $createDirs = false )
    {
        $baseDir = CacheUtil::CACHE_DIR;
        $outPath = rtrim( $baseDir , '/' ) . '/' . $path;
        $outFile = rtrim( $outPath , '/' ) . '/' . $file;
        if ( $createDirs && file_exists( $outPath ) == false )
            mkdir( $outPath , 0777 , true );
        return $outFile;
    }
}
26 changes: 26 additions & 0 deletions scripts/translation/lib/GitDiffParser.php
@@ -0,0 +1,26 @@
<?php
/**
* +----------------------------------------------------------------------+
* | Copyright (c) 1997-2023 The PHP Group |
* +----------------------------------------------------------------------+
* | This source file is subject to version 3.01 of the PHP license, |
* | that is bundled with this package in the file LICENSE, and is |
* | available through the world-wide-web at the following url: |
* | https://www.php.net/license/3_01.txt. |
* | If you did not receive a copy of the PHP license and are unable to |
* | obtain it through the world-wide-web, please send a note to |
* | license@php.net, so we can mail you a copy immediately. |
* +----------------------------------------------------------------------+
* | Authors: André L F S Bacci <ae php.net> |
* +----------------------------------------------------------------------+
* | Description: Parse `git diff` to complement file state. |
* +----------------------------------------------------------------------+
*/

require_once __DIR__ . '/all.php';

class GitDiffParser
{
public static function parseNumstatInto( string $dir , RevcheckFileInfo $file )
{}
}
93 changes: 93 additions & 0 deletions scripts/translation/lib/GitLogParser.php
@@ -0,0 +1,93 @@
<?php
/**
* +----------------------------------------------------------------------+
* | Copyright (c) 1997-2023 The PHP Group |
* +----------------------------------------------------------------------+
* | This source file is subject to version 3.01 of the PHP license, |
* | that is bundled with this package in the file LICENSE, and is |
* | available through the world-wide-web at the following url: |
* | https://www.php.net/license/3_01.txt. |
* | If you did not receive a copy of the PHP license and are unable to |
* | obtain it through the world-wide-web, please send a note to |
* | license@php.net, so we can mail you a copy immediately. |
* +----------------------------------------------------------------------+
* | Authors: André L F S Bacci <ae php.net> |
* +----------------------------------------------------------------------+
* | Description: Parse `git log` to complement file state. |
* +----------------------------------------------------------------------+
*/

require_once __DIR__ . '/all.php';

class GitLogParser
{
    static function parseInto( string $lang , RevcheckFileList & $list )
    {
        $cwd = getcwd();
        chdir( $lang );
        $fp = popen( "git log --name-only" , "r" );
        $hash = "";
        $date = "";
        $skip = false;
        while ( ( $line = fgets( $fp ) ) !== false )
        {
            // new commit block
            if ( substr( $line , 0 , 7 ) == "commit " )
            {
                $hash = trim( substr( $line , 7 ) );
                $date = "";
                $skip = false;
                continue;
            }
            // datetime of commit
            if ( strpos( $line , 'Date:' ) === 0 )
            {
                $line = trim( substr( $line , 5 ) );
                $date = strtotime( $line );
                continue;
            }
            // commit message, tested before the generic header check so that
            // messages containing ': ' are still searched for the skip mark
            if ( str_starts_with( $line , ' ' ) )
            {
                // commits with this mark are ignored
                if ( stristr( $line, '[skip-revcheck]' ) !== false )
                    $skip = true;
                continue;
            }
            // other headers
            if ( strpos( $line , ': ' ) > 0 )
                continue;
            // empty lines
            if ( trim( $line ) == "" )
                continue;
            // otherwise, a filename
            $filename = trim( $line );
            $info = $list->get( $filename );

            // untracked file (deleted, renamed)
            if ( $info == null )
                continue;

            // the head commit: the newest commit touching the file
            if ( $info->head == "" )
            {
                $info->head = $hash;
                $info->date = $date;
            }

            // after, only tracks non skipped commits
            if ( $skip )
                continue;

            // the diff commit: the newest commit not marked [skip-revcheck]
            if ( $info->diff == "" )
            {
                $info->diff = $hash;
                $info->date = $date;
            }
        }

        pclose( $fp );
        chdir( $cwd );
    }
}
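
The head/diff selection above can be tried without a repository. The sketch below is a simplified restatement of the loop under stated assumptions: `headAndDiff()` is a hypothetical function, the log text is made up, dates are ignored, and results go into a plain array instead of `RevcheckFileInfo` objects.

```php
<?php
// Hypothetical, simplified version of the parsing loop, fed with canned
// `git log --name-only` text instead of reading from popen().
function headAndDiff( string $log ) : array
{
    $files = [];
    $hash = '';
    $skip = false;
    foreach ( explode( "\n" , $log ) as $line )
    {
        if ( str_starts_with( $line , 'commit ' ) )
        {
            $hash = trim( substr( $line , 7 ) );
            $skip = false;
            continue;
        }
        // git indents commit message lines with four spaces
        if ( str_starts_with( $line , '    ' ) )
        {
            if ( stripos( $line , '[skip-revcheck]' ) !== false )
                $skip = true;
            continue;
        }
        // blank lines and Author:/Date: headers
        if ( trim( $line ) == '' || strpos( $line , ': ' ) > 0 )
            continue;
        // anything else is a file name
        $name = trim( $line );
        $files[$name] ??= [ 'head' => '' , 'diff' => '' ];
        if ( $files[$name]['head'] == '' )
            $files[$name]['head'] = $hash;
        if ( ! $skip && $files[$name]['diff'] == '' )
            $files[$name]['diff'] = $hash;
    }
    return $files;
}

// Made-up log: the newest commit is marked [skip-revcheck].
$log = <<<LOG
commit bbbbbbb
Author: Someone <someone@example.com>
Date:   Thu Aug 3 10:00:00 2023 +0000

    Fix typo [skip-revcheck]

reference/a.xml

commit aaaaaaa
Author: Someone <someone@example.com>
Date:   Wed Aug 2 10:00:00 2023 +0000

    Document new parameter

reference/a.xml
reference/b.xml
LOG;

$result = headAndDiff( $log );
print_r( $result );
```

For `reference/a.xml` the head hash is the skipped commit while the diff hash is the older, non-skipped one; for `reference/b.xml` both point to the same commit.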