[#148] Generate CONTRIBUTORS file using Git history (#150)
* feat(#148): generate CONTRIBUTORS file using Git history

Signed-off-by: Pierre-Yves Lapersonne <pierreyves.lapersonne@orange.com>

* refactor(#148): metrics

Signed-off-by: Pierre-Yves Lapersonne <pierreyves.lapersonne@orange.com>

* refactor(#148): review

Signed-off-by: Pierre-Yves Lapersonne <pierreyves.lapersonne@orange.com>

---------

Signed-off-by: Pierre-Yves Lapersonne <pierreyves.lapersonne@orange.com>
pylapp authored Apr 3, 2024
1 parent ff582f0 commit f8c363d
Showing 5 changed files with 187 additions and 2 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- [Diver] Generate CONTRIBUTORS file using Git history ([#148](https://github.com/Orange-OpenSource/floss-toolbox/issues/148))
- [Utils] Apply SPDX headers to sources with REUSE tool ([#146](https://github.com/Orange-OpenSource/floss-toolbox/issues/146))
- [Diver] Check headers of sources files ([#101](https://github.com/Orange-OpenSource/floss-toolbox/issues/101))

63 changes: 62 additions & 1 deletion toolbox/diver/README.md
@@ -303,4 +303,65 @@ check_for_sources($arguments[:folder], $arguments[:template], $arguments[:exclud
# Software description: A SwiftUI components library with code examples for Orange Design System
```

This case is more interesting if you use the same symbol for first, last and intermediate lines (or if you use only monoline comment symbol).

### Generate a CONTRIBUTORS file

We may want a CONTRIBUTORS.txt or AUTHORS.txt file listing the names of all the people who worked, or still work, on the project.
To do so, we can use the VCS history as a source of truth, e.g. the Git history.
The *generate-contributors-file.py* Python script uses Git commands to get the logs and builds a CONTRIBUTORS file in the project (named *CONTRIBUTORS.new.txt* so that an existing file is not overwritten), containing a notice
and the list of all entities (first name, uppercased last name, email).
However, the user still has to deal with namesakes and might want to remove some bot accounts.
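
The formatting rule applied to each Git author can be sketched in plain Python as follows. This is only an illustration, assuming entries shaped like `First Last <email>`; the `format_entry` helper is hypothetical and not part of the script, which relies on shell tools for this step.

```python
def format_entry(entry: str) -> str:
    """Uppercase the second word of an entry, unless that word is the email.

    Hypothetical helper, for illustration only.
    """
    words = entry.split()
    # With a single-word name (e.g. "BarryAllen <...>"), the second word is
    # the email address and must be left untouched
    if len(words) >= 2 and "@" not in words[1]:
        words[1] = words[1].upper()
    return " ".join(words)


print(format_entry("Bruce Wayne <batman@gmail.com>"))
```

Names made of more than two words are not handled specifically here, so such entries would still need a manual check.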

To run the script:

```shell
python3.8 generate-contributors-file.py --target /path/to/my/git/based/project
```

For example, it outputs a file with content such as:

```text
# This is the official list of people have contributed code to
# this repository.
#
# Names should be added to this file like so:
# Individual's name <submission email address>
# Individual's name <submission email address> <email2> <emailN>
#
# An entry with multiple email addresses specifies that the
# first address should be used in the submit logs and
# that the other addresses should be recognized as the
# same person.
# Please keep the list sorted.
renovate[bot] <29139666+renovate[bot]@users.noreply.github.com>
BarryAllen <barry.allen@star.labs>
Lex LUTHOR <100863844+lluthor@users.noreply.github.com>
Bruce WAYNE <batman@gmail.com>
Bruce WAYNE <bruce.waybe@wayneenterprise.com>
```

In the example above, the commit of the Renovate bot has been processed (a line we may want to remove), *BarryAllen* misconfigured his Git environment (we can suppose he types too fast on his keyboard), the commit *Lex Luthor* made from the GitHub web UI has been picked up, and *Bruce WAYNE* used two addresses.
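
Entries of this kind can be flagged before the manual review. The sketch below is an assumption about what is worth checking (bot names, GitHub noreply addresses, single-word names); the `review_hints` helper and its messages are hypothetical and are not provided by the script.

```python
import re
from typing import List

# Splits "Some Name <email>" into a name and its first email address
ENTRY_PATTERN = re.compile(r"^(?P<name>.*?)\s*<(?P<email>[^>]+)>")


def review_hints(entry: str) -> List[str]:
    """Return the reasons why an entry may need a manual fix (hypothetical helper)."""
    match = ENTRY_PATTERN.match(entry)
    if match is None:
        return ["unexpected format"]
    name, email = match.group("name"), match.group("email")
    hints = []
    if "[bot]" in name:
        hints.append("bot account")
    if email.endswith("users.noreply.github.com"):
        hints.append("GitHub noreply address")
    if len(name.split()) < 2:
        hints.append("single-word name")
    return hints


for line in ["BarryAllen <barry.allen@star.labs>", "Bruce WAYNE <batman@gmail.com>"]:
    print(line, "->", review_hints(line))
```

An empty list only means that none of these heuristics matched, not that the entry is correct.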

After manual cleaning, a better file could be:

```text
# This is the official list of people have contributed code to
# this repository.
#
# Names should be added to this file like so:
# Individual's name <submission email address>
# Individual's name <submission email address> <email2> <emailN>
#
# An entry with multiple email addresses specifies that the
# first address should be used in the submit logs and
# that the other addresses should be recognized as the
# same person.
# Please keep the list sorted.
Barry ALLEN <barry.allen@star.labs>
Lex LUTHOR <lex.luthor@lex.corp>
Bruce WAYNE <bruce.wayne@wayneenterprise.com> <batman@gmail.com>
```
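
Grouping the addresses of a contributor on a single line, as done for *Bruce WAYNE* above, can be partly automated. The following sketch assumes that two entries with exactly the same name belong to the same person, which is precisely what must be verified by hand in case of namesakes; `merge_same_names` is a hypothetical helper, not part of the script.

```python
from typing import Dict, List


def merge_same_names(entries: List[str]) -> List[str]:
    """Group the email addresses of entries sharing the same name (hypothetical helper)."""
    emails_by_name: Dict[str, List[str]] = {}
    for entry in entries:
        name, separator, rest = entry.partition("<")
        emails = emails_by_name.setdefault(name.strip(), [])
        email = separator + rest.strip()
        # Keep the first address in first position, and skip duplicates
        if email and email not in emails:
            emails.append(email)
    # Dictionaries keep insertion order, so the order of the names is preserved
    return [" ".join([name, *emails]) for name, emails in emails_by_name.items()]


merged = merge_same_names([
    "Bruce WAYNE <bruce.wayne@wayneenterprise.com>",
    "Bruce WAYNE <batman@gmail.com>",
])
print(merged)
```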
6 changes: 5 additions & 1 deletion toolbox/diver/dry-run.sh
@@ -12,7 +12,7 @@

# Since...............: 10/03/2023
# Description.........: Make a dry-run of the diver module to check if everything is ready to use
# Version.............: 1.2.0
# Version.............: 1.3.0

set -eu

@@ -116,6 +116,7 @@ CheckIfFileExists "./find-credits-in-files.sh"
CheckIfFileExists "./find-missing-developers-in-git-commits.sh"
CheckIfFileExists "./list-contributors-in-history.sh"
CheckIfFileExists "./lines-count.sh"
CheckIfFileExists "./generate-contributors-file.py"

echo -e "\nCheck utilitary scripts..."
CheckIfFileExists "./utils/extract-contributors-lists.rb"
@@ -148,6 +149,9 @@ CheckIfRuntimeExists "git" "git --version" "2.32.0"
echo -e "\nCheck for cloc..."
CheckIfRuntimeExists "cloc" "cloc --version" "1.88"

echo -e "\nCheck for Python3..."
CheckIfRuntimeExists "Python3" "python3 --version" "3.8.5"

# Conclusion
# ----------

118 changes: 118 additions & 0 deletions toolbox/diver/generate-contributors-file.py
@@ -0,0 +1,118 @@
#!/usr/bin/python3
# Software Name: floss-toolbox
# SPDX-FileCopyrightText: Copyright (c) Orange SA
# SPDX-License-Identifier: Apache-2.0
#
# This software is distributed under the Apache 2.0 license,
# the text of which is available at https://opensource.org/license/apache-2-0
# or see the "LICENSE.txt" file for more details.
#
# Authors: See CONTRIBUTORS.txt
# Software description: A toolbox of scripts to help work of forges admins and open source referents

# Version.............: 1.0.0
# Since...............: 03/04/2023
# Description.........: Using the Git history, generates a CONTRIBUTORS.new.txt file

import argparse
import os
import shutil
import subprocess
import sys
import time

# Configuration
# -------------

NORMAL_EXIT_CODE = 0
BAD_ARGUMENTS_EXIT_CODE = 1
BAD_PRECONDITION_EXIT_CODE = 2

# Check arguments
# ---------------

parser = argparse.ArgumentParser(description='Using the Git history, generates a CONTRIBUTORS.new.txt file')
required_args = parser.add_argument_group('Required arguments')
required_args.add_argument('-t', '--target', help='The path to the folder (Git based project) where scan must be done')
args = parser.parse_args()

target = args.target
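if not target or not os.path.isdir(target):
    print("❌ Error: the target to scan is not defined or not a directory")
    sys.exit(BAD_ARGUMENTS_EXIT_CODE)
else:
    # Use an absolute path: the shell command run later changes directory to the target,
    # which would break the paths built from a relative target
    target = os.path.abspath(target)
    print(f'🆗 Target to scan and update is: "{target}"')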

# Service
# -------

start_computation_time = time.time()

# Temporary log file for "git log" command
TEMP_FOLDER = ".floss-toolbox-temp"
TEMP_FOLDER_FULL_PATH = target + "/" + TEMP_FOLDER
GIT_LOG_TEMP_FILE = "git-logs.txt"
GIT_LOG_TEMP_FILE_PATH = TEMP_FOLDER_FULL_PATH + "/" + GIT_LOG_TEMP_FILE
print(f"✏️ Creating folder '{TEMP_FOLDER}' with internal stuff in target")
os.makedirs(TEMP_FOLDER_FULL_PATH, exist_ok=True)

# Check if Git repository is empty (check if there is at least 1 commit in the logs)
# The command is run in the target; "git log" fails or prints nothing if there is no commit
command_result = subprocess.run(["git", "log", "--oneline", "-1"], cwd=target, capture_output=True, text=True)
if command_result.returncode != 0 or not command_result.stdout.strip():
    print("💥 Error: Target is a git repository without any commit, that's weird")
    sys.exit(BAD_PRECONDITION_EXIT_CODE)
else:
    print("🆗 It seems there are commits in this repository, cool!")

# Dump Git logs
print("✏️ Dumping Git logs")
# Create the log file, go to the target and run the git command
# Format the output to have first name, last name (uppercased) and email, sorted alphabetically in ascending order
# Also deal with the case where we only have one value between first and last name
git_log_command = """
touch {log_file} && cd {target} && git log --all --format="%aN <%aE>" | sort | uniq | awk '{{if ($2 !~ /@/) {{print $1, toupper($2), $3}} else {{print $1, $2, $3}}}}' | sort -k2 > {log_file}
""".format(target=target, log_file=GIT_LOG_TEMP_FILE_PATH)
os.system(git_log_command)

contributors_count_output = subprocess.check_output("cat {log_file} | wc -l".format(log_file=GIT_LOG_TEMP_FILE_PATH), shell=True)
contributors_count = int(contributors_count_output.decode().strip())
print(f"👉 Found maybe {contributors_count} contributors")

# Add notice in file
final_file_name = "CONTRIBUTORS.new.txt"
final_file_path = target + "/" + final_file_name # .new just to avoid overwriting a previously existing file
print(f"✏️ Preparing final file at '{final_file_path}'")
with open(GIT_LOG_TEMP_FILE_PATH, 'r') as logs_file:
    contributors = logs_file.read()

notice = """# This is the official list of people have contributed code to
# this repository.
#
# Names should be added to this file like so:
# Individual's name <submission email address>
# Individual's name <submission email address> <email2> <emailN>
#
# An entry with multiple email addresses specifies that the
# first address should be used in the submit logs and
# that the other addresses should be recognized as the
# same person.
# Please keep the list sorted.
"""
final_content = notice + contributors

with open(final_file_path, 'w') as contributors_file:
    contributors_file.write(final_content)

print("🧹 Cleaning")
shutil.rmtree(TEMP_FOLDER_FULL_PATH)

end_computation_time = time.time()

print(f"🎉 The contributors file '{final_file_name}' has been generated (in {end_computation_time - start_computation_time} seconds)!")
print('!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!')
print('✋ You MUST have a look at it, deal with namesakes and ensure the file is well filled')
print('!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!')
sys.exit(NORMAL_EXIT_CODE)
1 change: 1 addition & 0 deletions toolbox/diver/list-contributors-in-history.sh
@@ -247,4 +247,5 @@ echo "Reports available in $REPORT_METRIC_FILE:"
cat $REPORT_METRIC_FILE

echo -e "\nEnd of $SCRIPT_NAME\n"
echo "NOTE: Maybe the script called generate-contributors-file.py is more interesting for you"
NormalExit
