Merge pull request #32 from MikeMeliz/coveragereport
AddCoverageReports: Tox & SonarQube Cloud
MikeMeliz authored Nov 3, 2024
2 parents 568a859 + ca78258 commit 5f9ed77
Showing 5 changed files with 79 additions and 24 deletions.
27 changes: 27 additions & 0 deletions .github/workflows/build.yml
@@ -0,0 +1,27 @@
name: Build
on:
  push:
    branches:
      - master
  pull_request:
    types: [opened, synchronize, reopened]
jobs:
  sonarcloud:
    name: SonarCloud
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python }}
      - name: Install tox and any other packages
        run: pip install tox
      - name: Run tox
        run: tox -e py
      - name: SonarCloud Scan
        uses: SonarSource/sonarcloud-github-action@master
        env:
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
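The job runs tox before the SonarCloud scan, so `coverage.xml` already exists when the scanner uploads it. One caveat visible in the hunk: `python-version: ${{ matrix.python }}` references a strategy matrix that the workflow never defines. A rough local equivalent of the test steps, assuming Python 3 and the repository root as the working directory:

```shell
# Reproduce the workflow's test steps locally (a sketch, not part of the commit):
pip install tox
tox -e py    # runs the [testenv] commands from tox.ini (shown below), writing coverage.xml
```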
49 changes: 26 additions & 23 deletions README.md
@@ -14,8 +14,9 @@
[![Release][release-version-shield]][releases-link]
[![Last Commit][last-commit-shield]][commit-link]
![Python][python-version-shield]
+[![Quality Gate Status][quality-gate-shield]][quality-gate-link]
[![license][license-shield]][license-link]

</div>

### What makes it simple and easy to use?
@@ -43,7 +44,7 @@ $ torcrawl -v -u http://www.github.com/ -c -d 2 -p 2
```

> [!TIP]
-> Crawling is not illegal, but violating copyright *is*. It’s always best to double check a website’s T&C before start crawling them. Some websites set up what’s called `robots.txt` to tell crawlers not to visit those pages.
+> Crawling is not illegal, but violating copyright *is*. It’s always best to double-check a website’s T&C before start crawling them. Some websites set up what’s called `robots.txt` to tell crawlers not to visit those pages.
> <br>This crawler *will* allow you to go around this, but we always *recommend* respecting robots.txt.
<hr>
@@ -62,34 +63,34 @@ $ torcrawl -v -u http://www.github.com/ -c -d 2 -p 2
1. **Debian/Ubuntu**: <br>
`apt-get install tor`<br>
`service tor start`
-3. **Windows**: Download [`tor.exe`][tor-download], and:<br>
+2. **Windows**: Download [`tor.exe`][tor-download], and:<br>
`tor.exe --service install`<br>
`tor.exe --service start`
-5. **MacOS**: <br>
+3. **MacOS**: <br>
`brew install tor`<br>
`brew services start tor`
-6. For different distros, visit:<br>
+4. For different distros, visit:<br>
[TOR Setup Documentation][tor-docs]

## Arguments
-**arg** | **Long** | **Description**
-----|------|------------
-**General**: | |
--h |--help| Help message
--v |--verbose| Show more information about the progress
--u |--url *.onion| URL of Webpage to crawl or extract
--w |--without| Without using TOR Network
--f |--folder| The directory which will contain the generated files
-**Extract**: | |
--e |--extract| Extract page's code to terminal or file (Default: Terminal)
--i |--input filename| Input file with URL(s) (separated by line)
--o |--output [filename]| Output page(s) to file(s) (for one page)
--y |--yara | Perform yara keyword search:<br>h = search entire html object,<br>t = search only text
-**Crawl**: | |
--c |--crawl| Crawl website (Default output on website/links.txt)
--d |--depth| Set depth of crawler's travel (Default: 1)
--p |--pause| Seconds of pause between requests (Default: 0)
--l |--log| Log file with visited URLs and their response code
+| **arg** | **Long** | **Description** |
+|--------------|---------------------|----------------------------------------------------------------------------------------|
+| **General**: | | |
+| -h | --help | Help message |
+| -v | --verbose | Show more information about the progress |
+| -u | --url *.onion | URL of Webpage to crawl or extract |
+| -w | --without | Without using TOR Network |
+| -f | --folder | The directory which will contain the generated files |
+| **Extract**: | | |
+| -e | --extract | Extract page's code to terminal or file (Default: Terminal) |
+| -i | --input filename | Input file with URL(s) (separated by line) |
+| -o | --output [filename] | Output page(s) to file(s) (for one page) |
+| -y | --yara | Perform yara keyword search:<br>h = search entire html object,<br>t = search only text |
+| **Crawl**: | | |
+| -c | --crawl | Crawl website (Default output on website/links.txt) |
+| -d | --depth | Set depth of crawler's travel (Default: 1) |
+| -p | --pause | Seconds of pause between requests (Default: 0) |
+| -l | --log | Log file with visited URLs and their response code |

## Usage & Examples

@@ -240,6 +241,8 @@ v1.2:
[last-commit-shield]: https://img.shields.io/github/last-commit/MikeMeliz/TorCrawl.py?logo=github&label=Last%20Commit&style=plastic
[release-version-shield]: https://img.shields.io/github/v/release/MikeMeliz/TorCrawl.py?logo=github&label=Release&style=plastic
[python-version-shield]: https://img.shields.io/badge/Python-v3-green.svg?style=plastic&logo=python&label=Python
+[quality-gate-shield]: https://sonarcloud.io/api/project_badges/measure?project=MikeMeliz_TorCrawl.py&metric=alert_status
+[quality-gate-link]: https://sonarcloud.io/summary/new_code?id=MikeMeliz_TorCrawl.py
[license-shield]: https://img.shields.io/github/license/MikeMeliz/TorCrawl.py.svg?style=plastic&logo=gnu&label=License
[commit-link]: https://github.com/MikeMeliz/TorCrawl.py/commits/main
[releases-link]: https://github.com/MikeMeliz/TorCrawl.py/releases
2 changes: 1 addition & 1 deletion modules/tests/test_checker.py
@@ -15,7 +15,7 @@ def setUp(cls) -> None:
    def tearDownClass(cls):
        """ Test Suite Teardown. """
        # Remove test folder.
-        os.rmdir('torcrawl')
+        os.rmdir('output/torcrawl')

    def test_url_canon_001(self):
        """ url_canon unit test.
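Per the new path, the suite's generated folder now lives under `output/`, so the teardown must remove `output/torcrawl` rather than the old top-level directory. To exercise just this module with the same coverage flags the CI uses (a sketch; paths are taken from the diff above):

```shell
pytest modules/tests/test_checker.py --cov=. --cov-report=xml --cov-branch
```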
15 changes: 15 additions & 0 deletions sonar-project.properties
@@ -0,0 +1,15 @@
sonar.projectKey=MikeMeliz_TorCrawl.py
sonar.organization=mikemeliz

# This is the name and version displayed in the SonarCloud UI.
sonar.projectName=TorCrawl.py
sonar.projectVersion=1.0

# Path is relative to the sonar-project.properties file. Replace "\" by "/" on Windows.
sonar.sources=.

# Encoding of the source code. Default is default system encoding
sonar.sourceEncoding=UTF-8

# Adding the coverage analysis path
sonar.python.coverage.reportPaths=coverage.xml
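With these properties in place, the same analysis can also be run outside CI using SonarSource's standalone `sonar-scanner` CLI, which picks up `sonar-project.properties` from the working directory. A sketch, assuming the CLI is installed and that a token has been generated in SonarCloud (the placeholder below is hypothetical):

```shell
# Assumes coverage.xml already exists, e.g. from a prior tox run
export SONAR_TOKEN="<your-sonarcloud-token>"   # hypothetical placeholder
sonar-scanner
```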
10 changes: 10 additions & 0 deletions tox.ini
@@ -0,0 +1,10 @@
[tox]
envlist = py39
skipsdist = True

[testenv]
deps =
    -r{toxinidir}/requirements.txt
    pytest
    pytest-cov
commands = pytest --cov=. --cov-report=xml --cov-config=tox.ini --cov-branch
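With `skipsdist = True`, tox never builds the package itself; it just installs `requirements.txt` plus pytest and runs the coverage command. After a run, the branch-coverage XML lands in the project root, which is exactly where `sonar.python.coverage.reportPaths` points. A quick local check (assuming Python 3.9 is available to satisfy the envlist):

```shell
tox -e py39               # run the tests with coverage
head -n 3 coverage.xml    # sanity check that the report was written
```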
