A Python library for Comments and Source Code Extraction
✨ 🍰 ✨
A source code file usually contains vital information such as license text, function/class documentation, and explanations of code and logic, all in the form of comments (block and line).
Nirjas is a Python library dedicated to extracting comments and source code from your file(s). The extracted comments can be processed in various ways, for example to detect licenses, generate documentation, or gather other information.
In addition, the library provides metadata about your code, comments, and file(s).
For more details, read our paper
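To make the idea of comment extraction concrete, here is a minimal sketch that pulls line comments out of Python source using the standard library's `tokenize` module. It is only an illustration of the concept: it does not use Nirjas, does not reflect how Nirjas works internally, and handles Python only. The function name `line_comments` and the sample text are made up for this example.

```python
import io
import tokenize


def line_comments(source: str):
    """Return a list of (line_number, comment_text) for Python source text."""
    comments = []
    # tokenize understands string literals, so a '#' inside quotes is not
    # mistaken for the start of a comment.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comments.append((tok.start[0], tok.string.lstrip("#").strip()))
    return comments


sample = (
    "# License: see LICENSE file\n"
    "x = 1  # the answer\n"
    "print('# not a comment')\n"
)
print(line_comments(sample))
# prints [(1, 'License: see LICENSE file'), (2, 'the answer')]
```

The third line of the sample shows why a tokenizer is preferable to a plain text search for `#`: the hash sign inside the string literal is correctly ignored.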
- Python 3
Installing Python on Linux machines:
$ sudo apt-get install python3 python3-pip
For macOS and Windows, packages are available at Python.org
We support almost all major programming languages. If you would like another language to be added, feel free to raise an issue.
Languages supported so far:
- C
- C#
- C++
- CSS
- Dart
- Go
- Haskell
- HTML
- Java
- JavaScript
- Julia
- JSX
- Kotlin
- MATLAB
- Perl
- PHP
- Python
- R
- Ruby
- Rust
- Scala
- SCSS
- Shell
- Swift
- SQL
- TypeScript
- TSX
You’ll need to make sure you have pip available. You can check this by running:
pip --version
If you installed Python from source or with an installer from python.org, you should already have pip. If you’re on Linux and installed Python using your OS package manager, you may have to install pip separately.
Haven’t installed pip? Visit: https://pip.pypa.io/en/stable/installing/
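The same check can be scripted. The sketch below asks the current Python interpreter to run pip as a module, which avoids confusion when several Python versions are installed. The helper name `pip_version` is hypothetical and chosen only for this example.

```python
import subprocess
import sys


def pip_version():
    """Return pip's version line for this interpreter, or None if pip is missing."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True,
        text=True,
    )
    # A non-zero exit code means the pip module could not be run.
    return result.stdout.strip() if result.returncode == 0 else None


print(pip_version() or "pip is not available for this interpreter")
```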
Install the latest official release via pip. This is the best approach for most users: it provides a stable version and is available for most platforms.
- Update pip to the latest stable version
pip3 install --upgrade pip
- Install Nirjas
pip3 install nirjas
- Upgrading Nirjas
Upgrade an already installed Nirjas library to the latest version from PyPI.
pip3 install --upgrade Nirjas
If you are interested in contributing to Nirjas development, running the latest source code, or simply like to build everything yourself, installing and building Nirjas from source is straightforward.
- Fork the repo
- Clone on your local system
git clone https://github.com/fossology/Nirjas.git
- Change directory
cd Nirjas/
- Install the package
pip3 install .
This will install Nirjas on your system.
- To check that Nirjas is installed correctly, or to get help, run:
nirjas -h
or
nirjas --help
Nirjas also hosts Docker images on Docker Hub. They can be pulled using:
docker pull fossology/nirjas:latest
To scan with the Docker image, mount the directory you want to analyze and pass the path as an argument.
docker run --rm -v $(pwd):/opt/ fossology/nirjas:latest /opt/<file_to_analyze>
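If you prefer to launch the container from a script, the command above can be assembled as an argument list. This is a sketch only: `docker_scan_command` is a hypothetical helper, `example.py` is a placeholder file name, and the image name and `/opt/` mount point are taken from the command shown above.

```python
import shlex
from pathlib import Path

IMAGE = "fossology/nirjas:latest"


def docker_scan_command(host_dir, file_name, image=IMAGE):
    """Build the `docker run` argument list for scanning one file in host_dir."""
    # Docker bind mounts need an absolute host path, like $(pwd) in the shell.
    host_dir = Path(host_dir).resolve()
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:/opt/",
        image,
        f"/opt/{file_name}",
    ]


# Show the equivalent shell command; pass the list to subprocess.run to execute it.
print(shlex.join(docker_scan_command(".", "example.py")))
```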
- For help
nirjas -h
- To extract comments from a single file
nirjas <path to file>
- To extract strings that are assigned to variables in a source code file (not yet implemented)
nirjas <path to source code file>
- To extract comments from all the files in a directory and its sub-directories
nirjas <path to directory>
- To extract only source code (excludes commented part) out of a file
nirjas -i <target file> <new file name including extension>
or, to write to the default output file (source.txt):
nirjas -i <target file>
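The commands above can also be driven from a Python script. The sketch below only wraps the command-line invocations listed in this section; the function names are hypothetical and are not part of Nirjas's Python API. The `run` helper returns `None` when the `nirjas` executable is not on the PATH, so the script degrades gracefully on machines where Nirjas is not installed.

```python
import shutil
import subprocess


def extract_comments_cmd(path):
    """Command to extract comments from a file or a directory."""
    return ["nirjas", str(path)]


def extract_source_cmd(target, output=None):
    """Command to write the comment-free source of `target` to `output`.

    When `output` is omitted, the CLI writes to its default file (source.txt).
    """
    cmd = ["nirjas", "-i", str(target)]
    if output is not None:
        cmd.append(str(output))
    return cmd


def run(cmd):
    """Run a command and return its standard output, or None if it is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, capture_output=True, text=True).stdout


print(extract_source_cmd("example.py", "code_only.py"))
# prints ['nirjas', '-i', 'example.py', 'code_only.py']
```

Passing arguments as a list, rather than a single shell string, avoids quoting problems with paths that contain spaces.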
To run the tests for Nirjas, execute the following script:
python3 testScript.py
This will download all the test files into the nirjas/languages/tests/TestFiles folder and then run the tests.
We maintain our entire documentation on the GitHub wiki. Feel free to switch from the code tab to the wiki tab, or go directly to the Nirjas Documentation.
All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.
A detailed overview of how to contribute can be found in the contributing guide.
Feel free to ask questions or discuss suggestions on Slack.
This repository is licensed under the terms of LGPL-2.1. Check the LICENSE file for more details.
If you find this project useful, please consider giving a star ⭐ and please cite as:
@INPROCEEDINGS{9734222,
author={Bhardwaj, Ayush and Sahil and Pratap, Kaushlendra and Mishra, Gaurav},
booktitle={2022 12th International Conference on Cloud Computing, Data Science & Engineering (Confluence)},
title={Nirjas: An open source framework for extracting metadata from the source code},
year={2022},
pages={47-52},
doi={10.1109/Confluence52989.2022.9734222}}