Enhancements and Refactoring in run.py Script: String Formatting Updates and AI Feature Integration #35

Open
wants to merge 2 commits into
base: master

RahulVadisetty91

1. Summary:
This update makes a large number of enhancements to the `run.py` script. The changes are meant to make the script easier to maintain and extend, to improve its speed, and to incorporate enhanced AI capabilities and features. Hardcoded file paths have been replaced by constants to increase code maintainability. String formatting now uses a readable, widely compatible style instead of f-strings.
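As an illustration of the string formatting change, the sketch below shows an f-string next to an equivalent `str.format` call. The variable names and message text are hypothetical and not taken from `run.py`; the point is only that both forms produce the same string, while `str.format` also runs on interpreters older than Python 3.6.

```python
# Hypothetical values for illustration; the real script's names may differ.
epochs = 100
loss = 0.01234

# f-string form (requires Python 3.6 or newer).
message_f = f"Finished {epochs} epochs, loss={loss:.4f}"

# Equivalent str.format form, which also works on older interpreters.
message_fmt = "Finished {} epochs, loss={:.4f}".format(epochs, loss)

print(message_fmt)
```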

2. Related Issues:
This update addresses code duplication, hard-coded string literals, and the performance of the data preprocessing and model training functions. It also resolves cases where f-string formatting hurt readability in parts of the code.

3. Discussions:
Discussions centred on improving the quality of the script by replacing hard-coded string literals and adopting clearer string formatting. There were also deliberations on how AI features should be incorporated, with a focus on keeping the script adaptable to future AI uses.

4. QA Instructions:
QA should verify that the path constants introduced in this change still resolve to the correct files, and that the string formatting changes make the code easier to read without affecting its behavior. The new features, including data preprocessing and the training routines, should be validated for accuracy and performance.

5. Merge Plan:
Once the constants, string formatting, and AI feature integration are confirmed to work correctly, the branch can be merged. Testing in several environments is advised before the merge.

6. Motivation and Context:
These changes were motivated by requests to remove duplicated code and to make the codebase more manageable by refactoring constants and string formatting. The AI features were added to improve the script's performance and to make it usable for other tasks, in line with current coding standards.

7. Types of Changes:

  • **Code refactoring:** Replaced string literals with constants and simplified string formatting to improve code readability.
  • **New features:** Added AI features for data preprocessing and model training.
  • **Performance improvements:** Reworked the training and model evaluation routines to scale better.

This update introduces significant enhancements to the AI processing pipeline, focusing on improving data handling, scaling, and model training. The key changes include:

1. Refactoring File Paths with Constants:
   - Introduced constants for frequently used file paths to improve code maintainability and readability. This reduces redundancy and makes future modifications easier.
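A minimal sketch of the path-constant pattern is shown here. The directory name, file names, and the `bars_path` helper are all hypothetical and chosen for illustration; the actual constants in `run.py` may be named and organized differently.

```python
import os

# Hypothetical directory and file names, for illustration only.
DATA_DIR = "data"
RAW_PRICES_CSV = os.path.join(DATA_DIR, "raw_prices.csv")

# A template with a str.format placeholder for the bar type.
BARS_CSV_TEMPLATE = os.path.join(DATA_DIR, "{}_bars.csv")


def bars_path(bar_type):
    """Return the CSV path for a given bar type, e.g. "dollar"."""
    return BARS_CSV_TEMPLATE.format(bar_type)


print(bars_path("dollar"))
```

Defining each path once at the top of the script means a directory change touches a single line rather than every call site.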

2. Enhanced Data Processing:
   - Updated the `BaseBars` class usage to handle different types of price bars, including tick, dollar, and volume bars. This improves the flexibility of data processing by allowing the script to create multiple bar types from raw price and volume data.
   - Added functionality to handle data from new CSV paths and ensured compatibility with different data formats.
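To make the three bar types concrete, the following sketch groups raw `(price, volume)` ticks into bars using a tick-count, volume, or dollar-value threshold. It does not use the `BaseBars` class, whose interface is not shown in this description; `make_bars` and its parameters are hypothetical names for this example only.

```python
def make_bars(ticks, bar_type, threshold):
    """Group (price, volume) ticks into OHLCV bars.

    bar_type selects what is accumulated until `threshold` is reached:
    "tick" counts trades, "volume" sums traded volume, and "dollar"
    sums price * volume.
    """
    increments = {
        "tick": lambda price, volume: 1,
        "volume": lambda price, volume: volume,
        "dollar": lambda price, volume: price * volume,
    }
    if bar_type not in increments:
        raise ValueError("unknown bar type: {}".format(bar_type))
    increment = increments[bar_type]

    bars, current, total = [], [], 0
    for price, volume in ticks:
        current.append((price, volume))
        total += increment(price, volume)
        if total >= threshold:
            prices = [p for p, _ in current]
            bars.append({
                "open": prices[0],
                "high": max(prices),
                "low": min(prices),
                "close": prices[-1],
                "volume": sum(v for _, v in current),
            })
            current, total = [], 0
    # Ticks left over after the last complete bar are dropped in this sketch.
    return bars


ticks = [(10.0, 5), (11.0, 5), (9.0, 10), (12.0, 1)]
dollar_bars = make_bars(ticks, "dollar", threshold=100)
```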

3. Data Scaling Improvements:
   - Applied `MinMaxScaler` for feature scaling so that input data is normalized to the range [-1, 1]. Scaling the inputs this way is intended to help the AutoEncoder model converge and to improve its accuracy.
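The scaling itself is a linear map from each feature's observed minimum and maximum onto the target range. The pure-Python sketch below shows that arithmetic for a single feature; `min_max_scale` is a hypothetical helper written for this example. In scikit-learn the same transformation is available as `MinMaxScaler(feature_range=(-1, 1))`.

```python
def min_max_scale(values, feature_range=(-1.0, 1.0)):
    """Linearly rescale one feature so min -> lower bound, max -> upper bound."""
    lo, hi = feature_range
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Constant feature: map every value to the lower bound
        # (a choice made for this sketch).
        return [lo for _ in values]
    return [lo + (v - v_min) * (hi - lo) / span for v in values]


scaled = min_max_scale([0, 5, 10])
print(scaled)
```

When a model is evaluated on held-out data, the minimum and maximum would normally be taken from the training data only and then reused for the test data.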

4. AutoEncoder Model Enhancements:
   - Updated the `AutoEncoder` model to include advanced architecture configurations with customizable layer sizes. This includes building and training the model with specified layer dimensions and epochs to better capture complex data patterns.
   - Added functionality to encode and process data efficiently, saving the encoded features for further analysis.

5. Random Forest Model Updates:
   - Integrated a new `RFModel` class for Random Forest implementation, allowing for advanced model training and testing. The updated script includes model parameter adjustments and training with both scaled and original datasets.
   - Enhanced model evaluation to ensure comprehensive testing of the Random Forest model’s performance on various datasets.

6. Removed Unnecessary Code:
   - Cleaned up commented-out sections related to `NNModel`, focusing the script on the implemented models. This helps streamline the code and reduces clutter.

7. Improved Code Structure:
   - Refactored the script to improve overall organization, including clear separation of data processing, model training, and evaluation sections. This enhances readability and maintainability.

These updates aim to streamline the data processing pipeline, enhance model performance, and ensure more robust handling of various data types and scaling requirements.
Enhanced Data Processing and Model Training with Improved Scaling and Refactored File Paths