diff --git a/README.MD b/README.MD index b76dfc6..d4c7691 100644 --- a/README.MD +++ b/README.MD @@ -1,5 +1,6 @@ [![Build](https://github.com/Samyssmile/edux/actions/workflows/gradle.yml/badge.svg?branch=main)](https://github.com/Samyssmile/edux/actions/workflows/gradle.yml) [![CodeQL](https://github.com/Samyssmile/edux/actions/workflows/codeql-analysis.yml/badge.svg?branch=main)](https://github.com/Samyssmile/edux/actions/workflows/codeql-analysis.yml) + # EDUX - Java Machine Learning Library EDUX is a user-friendly library for solving problems with a machine learning approach. @@ -8,18 +9,22 @@ EDUX is a user-friendly library for solving problems with a machine learning app EDUX supports a variety of machine learning algorithms including: -- **Multilayer Perceptron (Neural Network):** Suitable for regression and classification problems, MLPs can approximate non-linear functions. +- **Multilayer Perceptron (Neural Network):** Suitable for regression and classification problems, MLPs can approximate + non-linear functions. - **K Nearest Neighbors:** A simple, instance-based learning algorithm used for classification and regression. - **Decision Tree:** Offers visual and explicitly laid out decision making based on input features. - **Support Vector Machine:** Effective for binary classification, and can be adapted for multi-class problems. - **RandomForest:** An ensemble method providing high accuracy through building multiple decision trees. ### Battle Royale - Which algorithm is the best? + We run all algorithms on the same dataset and compare the results. [Benchmark](https://github.com/Samyssmile/edux/discussions/42) ## Goal -The main goal of this project is to create a user-friendly library for solving problems using a machine learning approach. The library is designed to be easy to use, enabling the solution of problems with just a few lines of code. + +The main goal of this project is to create a user-friendly library for solving problems using a machine learning +approach. The library is designed to be easy to use, enabling the solution of problems with just a few lines of code. ## Features @@ -31,43 +36,122 @@ The library currently supports: - Support Vector Machine - RandomForest -## Installation +## Get started Include the library as a dependency in your Java project file. ### Gradle ``` - implementation 'io.github.samyssmile:edux:1.0.5' + implementation 'io.github.samyssmile:edux:1.0.6' ``` ### Maven + ``` <dependency> <groupId>io.github.samyssmile</groupId> <artifactId>edux</artifactId> - <version>1.0.5</version> + <version>1.0.6</version> </dependency> ``` -## How to use this library +## Getting started tutorial + +This section guides you through using EDUX to process your dataset, configure a multilayer perceptron (Multilayer Neural +Network), and perform training and evaluation. + +A multi-layer perceptron (MLP) is a feedforward artificial neural network that generates a set of outputs from a set of +input features. An MLP is characterized by several layers of input nodes connected as a directed graph between the input +and output layers.
+ +![Neural Network](https://hc-linux.eu/github/iris-nn.png) + +### Step 1: Data Processing + +First, we load and prepare the IRIS dataset: + +| sepal.length | sepal.width | petal.length | petal.width | variety | +|--------------|-------------|--------------|-------------|---------| +| 5.1 | 3.5 | 1.4 | 0.2 | Setosa | + +```java +var featureColumnIndices=new int[]{0,1,2,3}; // Specify your feature columns + var targetColumnIndex=4; // Specify your target column + + var dataProcessor=new DataProcessor(new CSVIDataReader()); + var dataset=dataProcessor.loadDataSetFromCSV( + "path/to/your/data.csv", // Replace with your CSV file path + ',', // CSV delimiter + true, // Whether to skip the header + featureColumnIndices, + targetColumnIndex + ); + dataset.shuffle(); + dataset.normalize(); + dataProcessor.split(0.8); // Replace with your train-test split ratio +``` + +### Step 2: Preparing Training and Test Sets - NetworkConfiguration networkConfiguration = new NetworkConfiguration(...ActivationFunction.LEAKY_RELU, ActivationFunction.SOFTMAX, LossFunction.CATEGORICAL_CROSS_ENTROPY, Initialization.XAVIER, Initialization.XAVIER); - MultilayerPerceptron multilayerPerceptron = new MultilayerPerceptron(features, labels, testFeatures, testLabels, networkConfiguration); - multilayerPerceptron.train(); +Extract the features and labels for both training and test sets: - multilayerPerceptron.predict(...); +```java + var trainFeatures=dataProcessor.getTrainFeatures(featureColumnIndices); + var trainLabels=dataProcessor.getTrainLabels(targetColumnIndex); + var testFeatures=dataProcessor.getTestFeatures(featureColumnIndices); + var testLabels=dataProcessor.getTestLabels(targetColumnIndex); +``` + +### Step 3: Network Configuration + +```java +var networkConfiguration=new NetworkConfiguration( + trainFeatures[0].length, // Number of input neurons + List.of(128,256,512), // Number of neurons in each hidden layer + 3, // Number of output neurons + 0.01, // Learning rate + 300, // Number of epochs + ActivationFunction.LEAKY_RELU, // Activation function for hidden layers + ActivationFunction.SOFTMAX, // Activation function for output layer + LossFunction.CATEGORICAL_CROSS_ENTROPY, // Loss function + Initialization.XAVIER, // Weight initialization for hidden layers + Initialization.XAVIER // Weight initialization for output layer + ); +``` + +### Step 4: Training and Evaluation + +```java +MultilayerPerceptron multilayerPerceptron=new MultilayerPerceptron( + networkConfiguration, + testFeatures, + testLabels + ); + multilayerPerceptron.train(trainFeatures,trainLabels); + multilayerPerceptron.evaluate(testFeatures,testLabels); +``` + +### Results + +```output +... +MultilayerPerceptron - Best accuracy after restoring best MLP model: 98,56% +``` ### Working examples -You can find working examples for all algorithms in the [examples](https://github.com/Samyssmile/edux/tree/main/example/src/main/java/de/example) folder. -In all examples the IRIS or Seaborn Pinguins datasets are used. +You can find fully working examples for all algorithms in +the [examples](https://github.com/Samyssmile/edux/tree/main/example/src/main/java/de/example) folder.
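After Step 4 you can also classify individual, unseen samples. The following is a minimal sketch, not part of the original README: it reuses the `multilayerPerceptron` trained above and the `predict(double[])` method from the `Classifier` interface (documented in the Javadocs further down); the feature vector is a made-up example and should be scaled the same way as the training data.

```java
// Sketch only: classify one hypothetical flower with the model trained in Step 4.
// predict(double[]) is part of the Classifier interface; the input values are invented
// and should be normalized consistently with the training data.
double[] unseenFlower = {5.1, 3.5, 1.4, 0.2}; // sepal.length, sepal.width, petal.length, petal.width
double[] output = multilayerPerceptron.predict(unseenFlower);

// With a SOFTMAX output layer, the predicted class is the index of the largest output value.
int predictedClass = 0;
for (int i = 1; i < output.length; i++) {
    if (output[i] > output[predictedClass]) {
        predictedClass = i;
    }
}
System.out.println("Predicted class index: " + predictedClass);
```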
-#### Iris Dataset -The IRIS dataset is a multivariate dataset introduced by the British statistician and biologist Ronald Fisher in his 1936 paper *The use of multiple measurements in taxonomic problems* +The examples use the following datasets: -![Neural Network](https://hc-linux.eu/github/iris-nn.png) -## Contributions +* [IRIS dataset](https://archive.ics.uci.edu/ml/datasets/iris). +* [Seaborn Penguins dataset](https://seaborn.pydata.org/archive/0.11/tutorial/function_overview.html). +## Contributions -Contributions are warmly welcomed! If you find a bug, please create an issue with a detailed description of the problem. If you wish to suggest an improvement or fix a bug, please make a pull request. Also checkout the [Rules and Guidelines](https://github.com/Samyssmile/edux/wiki/Rules-&-Guidelines-for-New-Developers) page for more information. +Contributions are warmly welcomed! If you find a bug, please create an issue with a detailed description of the problem. +If you wish to suggest an improvement or fix a bug, please make a pull request. Also check out +the [Rules and Guidelines](https://github.com/Samyssmile/edux/wiki/Rules-&-Guidelines-for-New-Developers) page for more +information. diff --git a/docs/javadocs/allclasses-index.html b/docs/javadocs/allclasses-index.html index a5ccedf..fb38206 100644 --- a/docs/javadocs/allclasses-index.html +++ b/docs/javadocs/allclasses-index.html @@ -57,53 +57,72 @@

Alle Klassen und Schni
Beschreibung
ActivationFunction
-
Enumerates common activation functions used in neural networks and similar machine learning architectures.
+
Enumerates common activation functions used in neural networks and similar machine learning + architectures.
-
Classifier
-
+ +
+
Implements the IImputationStrategy interface to provide an average value imputation.
+
+ +
Provides a common interface for machine learning classifiers within the Edux API.
- -
 
- -
 
- -
 
- -
 
- -
 
- -
 
- -
 
- -
 
- -
-
A Decision Tree classifier for predictive modeling.
+ +
 
+ +
 
+ +
+
The Dataloader interface defines a method for loading datasets from CSV files.
- + +
 
+ +
+
The DataPostProcessor interface defines a set of methods for post-processing data.
+
+ +
 
+
 
- + +
 
+ +
+
A Decision Tree classifier for predictive modeling.
+
+
 
- +
 
+ +
 
+ +
+
Defines a strategy interface for imputing missing values within a column of data.
+
-
 
+
+
Enumerates the available imputation strategies for handling missing values in datasets.
+
 
 
-
 
+
+
Enumerates strategies for initializing weights in neural network layers, providing methods to + apply these strategies to given weight arrays.
+
 
 
-
The KnnClassifier class provides an implementation of the k-Nearest Neighbors algorithm for classification tasks.
+
The KnnClassifier class provides an implementation of the k-Nearest Neighbors algorithm + for classification tasks.
 
@@ -113,44 +132,48 @@

Alle Klassen und Schni
 
 
- +
 
- +
 
 
- +
-
The MultilayerPerceptron class represents a simple feedforward neural network, - which consists of input, hidden, and output layers.
+
Implements the IImputationStrategy interface to provide a mode value imputation.
- -
 
- -
 
- +
-
RandomForest Classifier - RandomForest is an ensemble learning method, which constructs a multitude of decision trees - at training time and outputs the class that is the mode of the classes output by - individual trees, or a mean prediction of the individual trees (regression).
+
The MultilayerPerceptron class represents a simple feedforward neural network, which + consists of input, hidden, and output layers.
- +
 
- -
-
The SupportVectorMachine class is an implementation of a Support Vector Machine (SVM) classifier, utilizing the one-vs-one strategy for multi-class classification.
+ +
 
+ +
+
RandomForest Classifier RandomForest is an ensemble learning method, which constructs a multitude + of decision trees at training time and outputs the class that is the mode of the classes output + by individual trees, or a mean prediction of the individual trees (regression).
- -
 
- -
 
- + +
 
+ +
+
The SupportVectorMachine class is an implementation of a Support Vector Machine (SVM) + classifier, utilizing the one-vs-one strategy for multi-class classification.
+
+ +
 
+
 
- +
 
- +
 
+ +
 

diff --git a/docs/javadocs/allpackages-index.html b/docs/javadocs/allpackages-index.html index 82545ac..5bb42c9 100644 --- a/docs/javadocs/allpackages-index.html +++ b/docs/javadocs/allpackages-index.html @@ -70,33 +70,36 @@

Alle Packages

de.edux.functions.loss
 
de.edux.math
-
 
-
de.edux.math.entity
-
 
-
de.edux.ml.decisiontree
+
Provides the classes necessary for mathematical operations and data manipulation within the Edux + framework.
+
+
de.edux.ml.decisiontree
+
Decision tree implementation.
-
de.edux.ml.knn
-
 
-
de.edux.ml.nn.config
-
 
-
de.edux.ml.nn.network
-
 
-
de.edux.ml.nn.network.api
+
de.edux.ml.knn
 
-
de.edux.ml.randomforest
+
de.edux.ml.nn.config
-
Random Forest implementation.
+
Classes for the configuration of the neural network.
-
de.edux.ml.svm
+
de.edux.ml.nn.network
+
 
+
de.edux.ml.nn.network.api
+
 
+
de.edux.ml.randomforest
+
Random Forest implementation.
+
+
de.edux.ml.svm
+
Support Vector Machine (SVM) implementation.
-
de.edux.util
-
 
-
de.edux.util.math
+
de.edux.util
 
+
de.edux.util.math
+
 
diff --git a/docs/javadocs/de/edux/api/Classifier.html b/docs/javadocs/de/edux/api/Classifier.html index abaae8f..cf8f8cc 100644 --- a/docs/javadocs/de/edux/api/Classifier.html +++ b/docs/javadocs/de/edux/api/Classifier.html @@ -93,25 +93,29 @@

Schnittstelle Classifier

public interface Classifier
Provides a common interface for machine learning classifiers within the Edux API. -

The Classifier interface is designed to encapsulate a variety of machine learning models, offering - a consistent approach to training, evaluating, and utilizing classifiers. Implementations are expected to handle - specifics related to different types of classification algorithms, such as neural networks, decision trees, - support vector machines, etc.

+

The Classifier interface is designed to encapsulate a variety of machine learning + models, offering a consistent approach to training, evaluating, and utilizing classifiers. + Implementations are expected to handle specifics related to different types of classification + algorithms, such as neural networks, decision trees, support vector machines, etc. -

Each classifier must implement methods for training the model on a dataset, evaluating its performance, - and making predictions on new, unseen data. This design allows for interchangeability of models and promotes - a clean separation of concerns between the data processing and model training phases.

+

Each classifier must implement methods for training the model on a dataset, evaluating its + performance, and making predictions on new, unseen data. This design allows for + interchangeability of models and promotes a clean separation of concerns between the data + processing and model training phases. + +

Typical usage involves: -

Typical usage involves:

Implementing classes should ensure that proper validation is performed on the input data and - any necessary pre-processing or feature scaling is applied consistent with the model's requirements.

+ any necessary pre-processing or feature scaling is applied consistent with the model's + requirements.
@@ -212,8 +213,7 @@

predict

Parameter:
feature - a single set of input values to be evaluated by the model.
Gibt zurück:
-
a double array representing the predicted output values for the - provided input values.
+
a double array representing the predicted output values for the provided input values.
Löst aus:
IllegalArgumentException - if feature is empty.
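A short illustrative sketch of the usage pattern described above. This is an assumption, not taken from the Javadoc: the exact signatures are presumed to follow the README example, i.e. `train(double[][], double[][])`, `evaluate(double[][], double[][])` and `predict(double[])`. Because every model implements `Classifier`, the same helper works for any EDUX classifier.

```java
// Sketch only: written against the Classifier interface so any EDUX model can be passed in.
// Method signatures are assumed to match the README's MultilayerPerceptron example.
static void trainAndReport(Classifier model,
                           double[][] trainFeatures, double[][] trainLabels,
                           double[][] testFeatures, double[][] testLabels) {
    model.train(trainFeatures, trainLabels);   // fit on the training split
    model.evaluate(testFeatures, testLabels);  // measure performance on unseen data
    double[] firstPrediction = model.predict(testFeatures[0]); // predict a single sample
    System.out.println(java.util.Arrays.toString(firstPrediction));
}
```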
diff --git a/docs/javadocs/de/edux/data/provider/DataPostProcessor.html b/docs/javadocs/de/edux/data/provider/DataPostProcessor.html index 1982f57..2798279 100644 --- a/docs/javadocs/de/edux/data/provider/DataPostProcessor.html +++ b/docs/javadocs/de/edux/data/provider/DataPostProcessor.html @@ -91,6 +91,10 @@

Schnittstelle DataPost
public interface DataPostProcessor
+
The DataPostProcessor interface defines a set of methods for post-processing data. This + typically includes normalizing data, shuffling, handling missing values, and splitting datasets. + Implementations should ensure that the data is properly processed to be ready for subsequent + analysis or machine learning tasks.
  • @@ -163,18 +208,53 @@

    imputation

    imputation

    DataPostProcessor imputation(int columnIndex, ImputationStrategy imputationStrategy)
    +
    Performs imputation on missing values in a specified column index using the provided imputation + strategy.
    +
    +
    Parameter:
    +
    columnIndex - the index of the column to apply imputation
    +
    imputationStrategy - the strategy to use for imputing missing values
    +
    Gibt zurück:
    +
    the DataPostProcessor instance with imputed data for method chaining
    +
    + +
  • +
  • +
    +

    performListWiseDeletion

    +
    void performListWiseDeletion()
    +
    Performs list-wise deletion on the dataset. This involves removing any rows with missing values + to ensure the dataset is complete. This method modifies the dataset in place and does not + return a value.
  • getDataset

    List<String[]> getDataset()
    +
    Retrieves the processed dataset as a list of string arrays. Each string array represents a row + in the dataset.
    +
    +
    Gibt zurück:
    +
    a list of string arrays representing the dataset
    +
  • split

    DataProcessor split(double splitRatio)
    +
    Splits the dataset into two separate datasets according to the specified split ratio. The split + ratio determines the proportion of data to be used for the first dataset (e.g., training set).
    +
    +
    Parameter:
    +
splitRatio - the ratio for splitting the dataset, where 0 < splitRatio < 1
    +
    Gibt zurück:
    +
    a DataProcessor instance containing the first portion of the dataset according + to the split ratio
    +
    Löst aus:
    +
    IllegalArgumentException - if the split ratio is not between 0 and 1
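Taken together, these methods are designed for chaining. A hedged sketch follows: the CSV path and column name are placeholders, the loading call mirrors the README example, and the post-processing calls mirror the methods documented above.

```java
// Sketch only: placeholder file path and column name; DataProcessor implements DataPostProcessor.
var dataProcessor = new DataProcessor(new CSVIDataReader());
var dataset = dataProcessor.loadDataSetFromCSV("path/to/your/data.csv", ',', true, new int[]{0, 1, 2, 3}, 4);

dataset.imputation("sepal.length", ImputationStrategy.AVERAGE) // replace missing values with the column average
       .shuffle()                                              // remove any ordering bias
       .normalize();                                           // scale numeric attributes to a common range

var trainPortion = dataProcessor.split(0.8); // DataProcessor holding the first 80% of the rows
```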
    +
  • diff --git a/docs/javadocs/de/edux/data/provider/DataProcessor.html b/docs/javadocs/de/edux/data/provider/DataProcessor.html index 9cd12d2..30d88ea 100644 --- a/docs/javadocs/de/edux/data/provider/DataProcessor.html +++ b/docs/javadocs/de/edux/data/provider/DataProcessor.html @@ -90,12 +90,12 @@

    Klasse DataProcessor

    Alle implementierten Schnittstellen:
    -
    DataloaderV2, DataPostProcessor, Dataset
    +
    Dataloader, DataPostProcessor, Dataset

    public class DataProcessor extends Object -implements DataPostProcessor, Dataset, DataloaderV2
    +implements DataPostProcessor, Dataset, Dataloader
    @@ -232,9 +283,15 @@

    loadDataSetFromCSV

    normalize

    public DataPostProcessor normalize()
    +
    Beschreibung aus Schnittstelle kopiert: DataPostProcessor
    +
    Normalizes the dataset. This typically involves scaling the values of numeric attributes so + that they share a common scale, often between 0 and 1, without distorting differences in the + ranges of values.
    Angegeben von:
    normalize in Schnittstelle DataPostProcessor
    +
    Gibt zurück:
    +
    the DataPostProcessor instance with normalized data for method chaining
    @@ -242,9 +299,14 @@

    normalize

    shuffle

    public DataPostProcessor shuffle()
    +
    Beschreibung aus Schnittstelle kopiert: DataPostProcessor
    +
    Shuffles the dataset randomly. This is usually done to ensure that the data does not carry any + inherent bias in the order it was collected or presented.
    Angegeben von:
    shuffle in Schnittstelle DataPostProcessor
    +
    Gibt zurück:
    +
    the DataPostProcessor instance with shuffled data for method chaining
    @@ -252,9 +314,14 @@

    shuffle

    getDataset

    public List<String[]> getDataset()
    +
    Beschreibung aus Schnittstelle kopiert: DataPostProcessor
    +
    Retrieves the processed dataset as a list of string arrays. Each string array represents a row + in the dataset.
    Angegeben von:
    getDataset in Schnittstelle DataPostProcessor
    +
    Gibt zurück:
    +
    a list of string arrays representing the dataset
    @@ -291,13 +358,51 @@

    getClassMap

  • +
    +

    getIndexOfColumn

    +
    public Optional<Integer> getIndexOfColumn(String columnName)
    +
    +
    Angegeben von:
    +
    getIndexOfColumn in Schnittstelle Dataset
    +
    +
    +
  • +
  • +
    +

    getColumnDataOf

    +
    public String[] getColumnDataOf(String columnName)
    +
    +
    Angegeben von:
    +
    getColumnDataOf in Schnittstelle Dataset
    +
    +
    +
  • +
  • +
    +

    getColumnNames

    +
    public String[] getColumnNames()
    +
    +
    Angegeben von:
    +
    getColumnNames in Schnittstelle Dataset
    +
    +
    +
  • +
  • imputation

    public DataPostProcessor imputation(String columnName, ImputationStrategy imputationStrategy)
    +
    Beschreibung aus Schnittstelle kopiert: DataPostProcessor
    +
    Performs imputation on missing values in a specified column using the provided imputation + strategy. Imputation is the process of replacing missing data with substituted values.
    Angegeben von:
    imputation in Schnittstelle DataPostProcessor
    +
    Parameter:
    +
    columnName - the name of the column to apply imputation
    +
    imputationStrategy - the strategy to use for imputing missing values
    +
    Gibt zurück:
    +
    the DataPostProcessor instance with imputed data for method chaining
  • @@ -306,9 +411,31 @@

    imputation

    imputation

    public DataPostProcessor imputation(int columnIndex, ImputationStrategy imputationStrategy)
    +
    Beschreibung aus Schnittstelle kopiert: DataPostProcessor
    +
    Performs imputation on missing values in a specified column index using the provided imputation + strategy.
    Angegeben von:
    imputation in Schnittstelle DataPostProcessor
    +
    Parameter:
    +
    columnIndex - the index of the column to apply imputation
    +
    imputationStrategy - the strategy to use for imputing missing values
    +
    Gibt zurück:
    +
    the DataPostProcessor instance with imputed data for method chaining
    +
    + + +
  • +
    +

    performListWiseDeletion

    +
    public void performListWiseDeletion()
    +
    Beschreibung aus Schnittstelle kopiert: DataPostProcessor
    +
    Performs list-wise deletion on the dataset. This involves removing any rows with missing values + to ensure the dataset is complete. This method modifies the dataset in place and does not + return a value.
    +
    +
    Angegeben von:
    +
    performListWiseDeletion in Schnittstelle DataPostProcessor
  • diff --git a/docs/javadocs/de/edux/data/provider/DataloaderV2.html b/docs/javadocs/de/edux/data/provider/Dataloader.html similarity index 78% rename from docs/javadocs/de/edux/data/provider/DataloaderV2.html rename to docs/javadocs/de/edux/data/provider/Dataloader.html index 4229e32..bc5f4bf 100644 --- a/docs/javadocs/de/edux/data/provider/DataloaderV2.html +++ b/docs/javadocs/de/edux/data/provider/Dataloader.html @@ -2,10 +2,10 @@ -DataloaderV2 (lib 1.0.5 API) +Dataloader (lib 1.0.5 API) - + @@ -82,7 +82,7 @@
    -

    Schnittstelle DataloaderV2

    +

    Schnittstelle Dataloader

    @@ -90,7 +90,10 @@

    Schnittstelle DataloaderV2<
    DataProcessor


    -
    public interface DataloaderV2
    +
    public interface Dataloader
    +
    The Dataloader interface defines a method for loading datasets from CSV files. + Implementations of this interface should handle the parsing of CSV files and configuration of + data processing according to the provided parameters.
    diff --git a/docs/javadocs/de/edux/data/provider/Dataset.html b/docs/javadocs/de/edux/data/provider/Dataset.html index 4e0ca28..28a0977 100644 --- a/docs/javadocs/de/edux/data/provider/Dataset.html +++ b/docs/javadocs/de/edux/data/provider/Dataset.html @@ -108,26 +108,35 @@

    Methodenübersicht

    Map<String,Integer>
    getClassMap()
     
    -
    double[][]
    -
    getInputs(List<String[]> dataset, - int[] inputColumns)
    +
    String[]
    +
    getColumnDataOf(String columnName)
    +
     
    +
    String[]
    +
    getColumnNames()
    +
     
    +
    Optional<Integer>
    +
    getIndexOfColumn(String columnName)
     
    double[][]
    -
    getTargets(List<String[]> dataset, - int targetColumn)
    +
    getInputs(List<String[]> dataset, + int[] inputColumns)
     
    double[][]
    -
    getTestFeatures(int[] inputColumns)
    +
    getTargets(List<String[]> dataset, + int targetColumn)
     
    double[][]
    -
    getTestLabels(int targetColumn)
    +
    getTestFeatures(int[] inputColumns)
     
    double[][]
    -
    getTrainFeatures(int[] inputColumns)
    +
    getTestLabels(int targetColumn)
     
    double[][]
    -
    getTrainLabels(int targetColumn)
    +
    getTrainFeatures(int[] inputColumns)
     
    +
    double[][]
    +
    getTrainLabels(int targetColumn)
    +
     
    @@ -186,6 +195,24 @@

    getTestFeatures

    double[][] getTestFeatures(int[] inputColumns)
    +
  • +
    +

    getIndexOfColumn

    +
    Optional<Integer> getIndexOfColumn(String columnName)
    +
    +
  • +
  • +
    +

    getColumnDataOf

    +
    String[] getColumnDataOf(String columnName)
    +
    +
  • +
  • +
    +

    getColumnNames

    +
    String[] getColumnNames()
    +
    +
  • diff --git a/docs/javadocs/de/edux/data/provider/package-summary.html b/docs/javadocs/de/edux/data/provider/package-summary.html index 31791de..bd77c03 100644 --- a/docs/javadocs/de/edux/data/provider/package-summary.html +++ b/docs/javadocs/de/edux/data/provider/package-summary.html @@ -77,12 +77,16 @@

    Package de.edux.data.pro
    Klasse
    Beschreibung
    - -
     
    + +
    +
    The Dataloader interface defines a method for loading datasets from CSV files.
    +
     
    -
     
    +
    +
    The DataPostProcessor interface defines a set of methods for post-processing data.
    +
     
    diff --git a/docs/javadocs/de/edux/data/provider/package-tree.html b/docs/javadocs/de/edux/data/provider/package-tree.html index 54125e3..c4b1fba 100644 --- a/docs/javadocs/de/edux/data/provider/package-tree.html +++ b/docs/javadocs/de/edux/data/provider/package-tree.html @@ -59,7 +59,7 @@

    Klassenhierarchie

  • java.lang.Object
  • @@ -67,7 +67,7 @@

    Klassenhierarchie

    Schnittstellenhierarchie

      -
    • de.edux.data.provider.DataloaderV2
    • +
    • de.edux.data.provider.Dataloader
    • de.edux.data.provider.DataPostProcessor
    • de.edux.data.provider.Dataset
    • de.edux.data.provider.Normalizer
    • diff --git a/docs/javadocs/de/edux/functions/activation/ActivationFunction.html b/docs/javadocs/de/edux/functions/activation/ActivationFunction.html index 285c04f..09fdd45 100644 --- a/docs/javadocs/de/edux/functions/activation/ActivationFunction.html +++ b/docs/javadocs/de/edux/functions/activation/ActivationFunction.html @@ -97,33 +97,40 @@

      Enum-Klasse ActivationF
      public enum ActivationFunction extends Enum<ActivationFunction>
      -
      Enumerates common activation functions used in neural networks and similar machine learning architectures. +
      Enumerates common activation functions used in neural networks and similar machine learning + architectures. -

      Each member of this enum represents a distinct type of activation function, a critical component in - neural networks. Activation functions determine the output of a neural network layer for a given set of - input, and they help normalize the output of each neuron to a specific range, usually between 1 and -1 or - between 1 and 0.

      +

      Each member of this enum represents a distinct type of activation function, a critical + component in neural networks. Activation functions determine the output of a neural network layer + for a given set of input, and they help normalize the output of each neuron to a specific range, + usually between 1 and -1 or between 1 and 0. -

      This enum simplifies the process of selecting and utilizing an activation function. It provides an - abstraction where the user can easily switch between different functions, making it easier to experiment - with neural network design. Additionally, each function includes a method for calculating its derivative, - which is essential for backpropagation in neural network training.

      +

      This enum simplifies the process of selecting and utilizing an activation function. It + provides an abstraction where the user can easily switch between different functions, making it + easier to experiment with neural network design. Additionally, each function includes a method + for calculating its derivative, which is essential for backpropagation in neural network + training. + +

      Available functions include: -

      Available functions include:

        -
      • SIGMOID: Normalizes inputs between 0 and 1, crucial for binary classification.
      • -
      • RELU: Addresses the vanishing gradient problem, allowing for faster and more effective training.
      • -
      • LEAKY_RELU: Variation of RELU, prevents "dying neurons" by allowing a small gradient when the unit is not active.
      • -
      • TANH: Normalizes inputs between -1 and 1, a scaled version of the sigmoid function.
      • -
      • SOFTMAX: Converts a vector of raw scores to a probability distribution, typically used in multi-class classification.
      • +
      • SIGMOID: Normalizes inputs between 0 and 1, crucial for binary classification. +
      • RELU: Addresses the vanishing gradient problem, allowing for faster and more + effective training. +
      • LEAKY_RELU: Variation of RELU, prevents "dying neurons" by allowing a small gradient + when the unit is not active. +
      • TANH: Normalizes inputs between -1 and 1, a scaled version of the sigmoid function. +
      • SOFTMAX: Converts a vector of raw scores to a probability distribution, typically + used in multi-class classification.
      -

      Each function overrides the calculateActivation and calculateDerivative methods, providing the - specific implementation for the activation and its derivative based on input. These are essential for the forward - and backward passes through the network, respectively.

      +

      Each function overrides the calculateActivation and calculateDerivative + methods, providing the specific implementation for the activation and its derivative based on + input. These are essential for the forward and backward passes through the network, respectively. -

      Note: The SOFTMAX function additionally overrides calculateActivation for an array input, - facilitating its common use in output layers of neural networks for classification tasks.

      +

      Note: The SOFTMAX function additionally overrides calculateActivation + for an array input, facilitating its common use in output layers of neural networks for + classification tasks.
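A small usage sketch of the API described above. Assumption: the scalar methods take and return `double`, and the SOFTMAX overload takes a `double[]`; the exact signatures are not shown in this excerpt.

```java
// Sketch only: assumed signatures calculateActivation(double), calculateDerivative(double)
// and, for SOFTMAX, calculateActivation(double[]).
double activated = ActivationFunction.LEAKY_RELU.calculateActivation(-0.5); // small negative slope instead of 0
double slope = ActivationFunction.LEAKY_RELU.calculateDerivative(-0.5);     // used during backpropagation

double[] probabilities = ActivationFunction.SOFTMAX.calculateActivation(new double[] {2.0, 1.0, 0.1});
// probabilities sum to 1 and can be read as class probabilities.
```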

      diff --git a/docs/javadocs/de/edux/functions/activation/package-summary.html b/docs/javadocs/de/edux/functions/activation/package-summary.html index fca1ac5..d932adc 100644 --- a/docs/javadocs/de/edux/functions/activation/package-summary.html +++ b/docs/javadocs/de/edux/functions/activation/package-summary.html @@ -71,21 +71,23 @@

      Package de.edux.f
      Provides the classes necessary to define various activation functions used in neural networks. -

      This package is part of the larger Edux framework for educational purposes in the realm of machine learning. - Within this package, you will find enumerations and possibly classes that represent a variety of standard - activation functions, such as Sigmoid, TanH, ReLU, and others. These functions are fundamental components - in the construction of neural networks, as they dictate how signals are processed as they pass from one - neuron (or node) to the next, essentially determining the output of each neuron.

      +

      This package is part of the larger Edux framework for educational purposes in the realm of + machine learning. Within this package, you will find enumerations and possibly classes that + represent a variety of standard activation functions, such as Sigmoid, TanH, ReLU, and others. + These functions are fundamental components in the construction of neural networks, as they + dictate how signals are processed as they pass from one neuron (or node) to the next, essentially + determining the output of each neuron. -

      Each activation function contained within this package has distinct characteristics and is useful in - different scenarios, depending on the nature of the input data, the specific architecture of the network, - and the learning task at hand. For instance, some functions are better suited for dealing with issues like - the vanishing gradient problem, while others might normalize input values into a certain range to aid with - the convergence of the learning algorithm.

      +

      Each activation function contained within this package has distinct characteristics and is + useful in different scenarios, depending on the nature of the input data, the specific + architecture of the network, and the learning task at hand. For instance, some functions are + better suited for dealing with issues like the vanishing gradient problem, while others might + normalize input values into a certain range to aid with the convergence of the learning + algorithm. -

      This package is designed to offer flexibility and ease of use for those constructing machine learning - models, as it allows for easy switching between different activation strategies, facilitating experimentation - and learning.

      +

      This package is designed to offer flexibility and ease of use for those constructing machine + learning models, as it allows for easy switching between different activation strategies, + facilitating experimentation and learning.

      @@ -97,7 +99,8 @@

      Package de.edux.f
      Beschreibung
      -
      Enumerates common activation functions used in neural networks and similar machine learning architectures.
      +
      Enumerates common activation functions used in neural networks and similar machine learning + architectures.
      diff --git a/docs/javadocs/de/edux/functions/imputation/AverageImputation.html b/docs/javadocs/de/edux/functions/imputation/AverageImputation.html new file mode 100644 index 0000000..e65b0ed --- /dev/null +++ b/docs/javadocs/de/edux/functions/imputation/AverageImputation.html @@ -0,0 +1,200 @@ + + + + +AverageImputation (lib 1.0.5 API) + + + + + + + + + + + + + +
      + +
      +
      + +
      + +

      Klasse AverageImputation

      +
      +
      java.lang.Object +
      de.edux.functions.imputation.AverageImputation
      +
      +
      +
      +
      Alle implementierten Schnittstellen:
      +
      IImputationStrategy
      +
      +
      +
      public class AverageImputation +extends Object +implements IImputationStrategy
      +
      Implements the IImputationStrategy interface to provide an average value imputation. This + strategy calculates the average of the non-missing numeric values in a column and substitutes the + missing values with this average. + +

      It is important to note that this strategy is only applicable to columns with numeric data. + Attempting to use this strategy on categorical data will result in a RuntimeException.

      +
      +
      + +
      +
      +
        + +
      • +
        +

        Konstruktordetails

        +
          +
        • +
          +

          AverageImputation

          +
          public AverageImputation()
          +
          +
        • +
        +
        +
      • + +
      • +
        +

        Methodendetails

        +
          +
        • +
          +

          performImputation

          +
          public String[] performImputation(String[] datasetColumn)
          +
          Performs average value imputation on the provided dataset column. Missing values are identified + as blank strings and are replaced by the average of the non-missing values. If the column + contains categorical data, a runtime exception is thrown.
          +
          +
          Angegeben von:
          +
          performImputation in Schnittstelle IImputationStrategy
          +
          Parameter:
          +
          datasetColumn - an array of String representing the column data with potential + missing values.
          +
          Gibt zurück:
          +
          an array of String where missing values have been imputed with the average of + non-missing values.
          +
          Löst aus:
          +
          RuntimeException - if the column data contains categorical values which cannot be + averaged.
          +
          +
          +
        • +
        +
        +
      • +
      +
      + +
      +
      +
      + + diff --git a/docs/javadocs/de/edux/functions/imputation/IImputationStrategy.html b/docs/javadocs/de/edux/functions/imputation/IImputationStrategy.html new file mode 100644 index 0000000..f6a6327 --- /dev/null +++ b/docs/javadocs/de/edux/functions/imputation/IImputationStrategy.html @@ -0,0 +1,157 @@ + + + + +IImputationStrategy (lib 1.0.5 API) + + + + + + + + + + + + + +
      + +
      +
      + +
      + +

      Schnittstelle IImputationStrategy

      +
      +
      +
      +
      Alle bekannten Implementierungsklassen:
      +
      AverageImputation, ModeImputation
      +
      +
      +
      public interface IImputationStrategy
      +
      Defines a strategy interface for imputing missing values within a column of data. Implementations + of this interface should provide a concrete imputation method that can handle various types of + missing data according to specific rules or algorithms.
      +
      +
      +
        + +
      • +
        +

        Methodenübersicht

        +
        +
        +
        +
        +
        Modifizierer und Typ
        +
        Methode
        +
        Beschreibung
        + +
        performImputation(String[] columnData)
        +
        +
        Performs imputation on the provided column data array.
        +
        +
        +
        +
        +
        +
      • +
      +
      +
      +
        + +
      • +
        +

        Methodendetails

        +
          +
        • +
          +

          performImputation

          +
          String[] performImputation(String[] columnData)
          +
          Performs imputation on the provided column data array. Missing values within the array are + expected to be filled with substituted values determined by the specific imputation strategy + implemented.
          +
          +
          Parameter:
          +
          columnData - an array of String representing the data of a single column, where + missing values are to be imputed.
          +
          Gibt zurück:
          +
          an array of String representing the column data after imputation has been + performed.
          +
          +
          +
        • +
        +
        +
      • +
      +
      + +
      +
      +
      + + diff --git a/docs/javadocs/de/edux/functions/imputation/ImputationStrategy.html b/docs/javadocs/de/edux/functions/imputation/ImputationStrategy.html index 73b6c02..6ce5ebc 100644 --- a/docs/javadocs/de/edux/functions/imputation/ImputationStrategy.html +++ b/docs/javadocs/de/edux/functions/imputation/ImputationStrategy.html @@ -97,6 +97,9 @@

      Enum-Klasse ImputationS
      public enum ImputationStrategy extends Enum<ImputationStrategy>
      +
      Enumerates the available imputation strategies for handling missing values in datasets. Each + strategy is associated with a concrete implementation of IImputationStrategy that defines + the specific imputation behavior.
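A brief sketch of how a strategy might be applied to a single column. The column values are invented; `getImputation()` and `performImputation(String[])` are documented below, and blank strings mark missing entries as described for the concrete implementations.

```java
// Sketch only: invented column values; blank strings represent missing entries.
String[] rawColumn = {"5.1", "", "4.9", "5.0"};
String[] filledColumn = ImputationStrategy.AVERAGE.getImputation().performImputation(rawColumn);
// filledColumn[1] now holds the average of the non-missing values (5.0 here; formatting may vary).

String[] categories = {"Setosa", "", "Setosa", "Virginica"};
String[] filledCategories = ImputationStrategy.MODE.getImputation().performImputation(categories);
// MODE also works for categorical data: the blank entry becomes "Setosa", the most frequent value.
```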

      @@ -118,13 +121,15 @@

      Enum-Konstanten - Übersicht

      Enum-Konstante
      Beschreibung
      -
       
      - -
       
      - -
       
      +
      +
      Imputation strategy that replaces missing values with the average of the non-missing values in + the dataset column.
      +
      -
       
      +
      +
      Imputation strategy that replaces missing values with the most frequently occurring value + (mode) in the dataset column.
      +
    @@ -133,20 +138,26 @@

    Enum-Konstanten - Übersicht

    Methodenübersicht

    -
    +
    Modifizierer und Typ
    Methode
    Beschreibung
    - - -
    -
    Gibt die Enum-Konstante dieser Klasse mit dem angegebenen Namen zurück.
    + + +
    +
    Retrieves the IImputationStrategy implementation associated with the imputation + strategy.
    - - + +
    +
    Gibt die Enum-Konstante dieser Klasse mit dem angegebenen Namen zurück.
    +
    + + +
    Gibt ein Array mit den Konstanten dieser Enum-Klasse in der Reihenfolge ihrer Deklaration zurück.
    @@ -171,27 +182,20 @@

    Von Klasse geerbte Method

    Enum-Konstanten - Details

    • -
      -

      DUMMY

      -
      public static final ImputationStrategy DUMMY
      -
      -
    • -
    • -
      -

      MEAN

      -
      public static final ImputationStrategy MEAN
      -
      -
    • -
    • AVERAGE

      public static final ImputationStrategy AVERAGE
      +
      Imputation strategy that replaces missing values with the average of the non-missing values in + the dataset column. This strategy is suitable for numerical data only.
    • MODE

      public static final ImputationStrategy MODE
      +
      Imputation strategy that replaces missing values with the most frequently occurring value + (mode) in the dataset column. This strategy can be used for both numerical and categorical + data.
    @@ -233,6 +237,18 @@

    valueOf

    +
  • +
    +

    getImputation

    +
    public IImputationStrategy getImputation()
    +
    Retrieves the IImputationStrategy implementation associated with the imputation + strategy.
    +
    +
    Gibt zurück:
    +
    the imputation strategy implementation
    +
    +
    +
  • diff --git a/docs/javadocs/de/edux/functions/imputation/ModeImputation.html b/docs/javadocs/de/edux/functions/imputation/ModeImputation.html new file mode 100644 index 0000000..0c2d701 --- /dev/null +++ b/docs/javadocs/de/edux/functions/imputation/ModeImputation.html @@ -0,0 +1,194 @@ + + + + +ModeImputation (lib 1.0.5 API) + + + + + + + + + + + + + +
    + +
    +
    + +
    + +

    Klasse ModeImputation

    +
    +
    java.lang.Object +
    de.edux.functions.imputation.ModeImputation
    +
    +
    +
    +
    Alle implementierten Schnittstellen:
    +
    IImputationStrategy
    +
    +
    +
    public class ModeImputation +extends Object +implements IImputationStrategy
    +
    Implements the IImputationStrategy interface to provide a mode value imputation. This + strategy finds the most frequently occurring value, or mode, in a dataset column and substitutes + missing values with this mode.
    +
    +
    + +
    +
    +
      + +
    • +
      +

      Konstruktordetails

      +
        +
      • +
        +

        ModeImputation

        +
        public ModeImputation()
        +
        +
      • +
      +
      +
    • + +
    • +
      +

      Methodendetails

      +
        +
      • +
        +

        performImputation

        +
        public String[] performImputation(String[] datasetColumn)
        +
        Performs mode value imputation on the provided dataset column. Missing values are identified as + blank strings and are replaced by the mode of the non-missing values. If multiple modes are + found, the first encountered in the dataset is used.
        +
        +
        Angegeben von:
        +
        performImputation in Schnittstelle IImputationStrategy
        +
        Parameter:
        +
        datasetColumn - an array of String representing the column data with potential + missing values.
        +
        Gibt zurück:
        +
        an array of String where missing values have been imputed with the mode of + non-missing values.
        +
        +
        +
      • +
      +
      +
    • +
    +
    + +
    +
    +
    + + diff --git a/docs/javadocs/de/edux/functions/imputation/package-summary.html b/docs/javadocs/de/edux/functions/imputation/package-summary.html index ae84acb..adbc141 100644 --- a/docs/javadocs/de/edux/functions/imputation/package-summary.html +++ b/docs/javadocs/de/edux/functions/imputation/package-summary.html @@ -72,12 +72,28 @@

    Package de.edux.f
    +

    Klassenhierarchie

    + +
    +
    +

    Schnittstellenhierarchie

    + +
    +

    Enum-Klassenhierarchie

    • java.lang.Object diff --git a/docs/javadocs/de/edux/functions/initialization/Initialization.html b/docs/javadocs/de/edux/functions/initialization/Initialization.html index 37c04c0..acf6f12 100644 --- a/docs/javadocs/de/edux/functions/initialization/Initialization.html +++ b/docs/javadocs/de/edux/functions/initialization/Initialization.html @@ -97,6 +97,8 @@

      Enum-Klasse Initialization<
      public enum Initialization extends Enum<Initialization>
      +
      Enumerates strategies for initializing weights in neural network layers, providing methods to + apply these strategies to given weight arrays.

      @@ -118,9 +120,14 @@

      Enum-Konstanten - Übersicht

      Enum-Konstante
      Beschreibung
      -
       
      +
      +
      He initialization strategy for weights.
      +
      -
       
      +
      +
      Enumerates strategies for initializing weights in neural network layers, providing methods to + apply these strategies to given weight arrays.
      +
    @@ -174,12 +181,17 @@

    Enum-Konstanten - Details

    XAVIER

    public static final Initialization XAVIER
    +
    Enumerates strategies for initializing weights in neural network layers, providing methods to + apply these strategies to given weight arrays.
  • HE

    public static final Initialization HE
    +
    He initialization strategy for weights. This strategy is designed for layers with ReLU + activation, initializing the weights with variance scaled by the size of the previous layer, + aiming to reduce the vanishing gradient problem.
  • diff --git a/docs/javadocs/de/edux/functions/initialization/package-summary.html b/docs/javadocs/de/edux/functions/initialization/package-summary.html index b62c6c9..f0b92c4 100644 --- a/docs/javadocs/de/edux/functions/initialization/package-summary.html +++ b/docs/javadocs/de/edux/functions/initialization/package-summary.html @@ -77,7 +77,10 @@

    Package de.ed
    Klasse
    Beschreibung
    -
     
    +
    +
    Enumerates strategies for initializing weights in neural network layers, providing methods to + apply these strategies to given weight arrays.
    +
    diff --git a/docs/javadocs/de/edux/math/Entity.html b/docs/javadocs/de/edux/math/Entity.html index 2713cf7..5feaa0e 100644 --- a/docs/javadocs/de/edux/math/Entity.html +++ b/docs/javadocs/de/edux/math/Entity.html @@ -87,7 +87,7 @@

    Schnittstelle Entity<T>

    Alle bekannten Implementierungsklassen:
    -
    Matrix, Vector
    +
    Matrix, Vector

    public interface Entity<T>
    diff --git a/docs/javadocs/de/edux/math/entity/Matrix.MatrixIterator.html b/docs/javadocs/de/edux/math/Matrix.MatrixIterator.html similarity index 92% rename from docs/javadocs/de/edux/math/entity/Matrix.MatrixIterator.html rename to docs/javadocs/de/edux/math/Matrix.MatrixIterator.html index d5e5e93..bfab4d8 100644 --- a/docs/javadocs/de/edux/math/entity/Matrix.MatrixIterator.html +++ b/docs/javadocs/de/edux/math/Matrix.MatrixIterator.html @@ -5,16 +5,16 @@ Matrix.MatrixIterator (lib 1.0.5 API) - + - - - - - + + + + + -