Add Bayesian network and LLM integration for medical diagnosis #58
Conversation
…sing This commit refactors the DeepDreamer class in deepdream.py to improve model initialization and image preprocessing. Model initialization now uses the weights parameter, and preprocessing resizes the image while maintaining its aspect ratio, converts it to a tensor, and normalizes it.
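For reference, a minimal sketch of the preprocessing this commit describes, assuming torchvision's Inception_V3_Weights API; function names and the resize strategy are illustrative, not the exact deepdream.py code:

```python
import torch
from PIL import Image
from torchvision import models, transforms
from torchvision.models import Inception_V3_Weights

def preprocess_image(image_path: str, size: int = 512) -> torch.Tensor:
    """Resize while keeping the aspect ratio, then convert to a normalized tensor."""
    image = Image.open(image_path).convert("RGB")
    ratio = size / max(image.size)  # image.size is (width, height)
    new_size = (int(image.size[0] * ratio), int(image.size[1] * ratio))
    image = image.resize(new_size, Image.LANCZOS)
    transform = transforms.Compose([
        transforms.ToTensor(),
        # ImageNet statistics, matching the pretrained Inception weights
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    return transform(image).unsqueeze(0)  # add a batch dimension

# Model initialization via the weights parameter (replaces the deprecated pretrained=True)
model = models.inception_v3(weights=Inception_V3_Weights.DEFAULT)
model.eval()
```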
Add visual demonstration of the XOR problem showing:
- Why XOR is not linearly separable
- Neural network's learned decision boundaries
- Interactive visualization of the training process

This builds on 01_xor_network.py to provide visual insight into:
- Data point distribution
- Failed linear separation attempts
- Complex decision boundaries learned by the network

Technical additions (see the sketch below):
- Contour plot of network decisions
- Grid-based boundary visualization
- Improved training monitoring
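A minimal sketch of the grid-based boundary visualization described above; the function name, grid range, and plot styling are illustrative rather than the PR's exact code:

```python
import numpy as np
import torch
import matplotlib.pyplot as plt

def plot_decision_boundary(model: torch.nn.Module, resolution: int = 200) -> None:
    """Evaluate the trained network on a dense grid and draw its decision regions."""
    xs = np.linspace(-0.5, 1.5, resolution)
    xx, yy = np.meshgrid(xs, xs)
    grid = torch.tensor(np.column_stack([xx.ravel(), yy.ravel()]), dtype=torch.float32)
    with torch.no_grad():
        zz = model(grid).reshape(xx.shape).numpy()
    plt.contourf(xx, yy, zz, levels=20, cmap="RdBu", alpha=0.7)
    plt.colorbar(label="network output")
    # The four XOR points: class 0 on the diagonal, class 1 off it
    plt.scatter([0, 1], [0, 1], c="blue", edgecolors="k", label="class 0")
    plt.scatter([0, 1], [1, 0], c="red", edgecolors="k", label="class 1")
    plt.legend()
    plt.show()
```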
- Refactor the XORNetwork class to improve code readability and maintainability.
- Update the class initialization and activation functions for better learning.
- Add a new ModernXORNetwork class with batch normalization and leaky ReLU activation (sketched below).
- Ensure the output is between 0 and 1 using sigmoid activation.
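A sketch of what such a ModernXORNetwork might look like, assembled from the components named above; the hidden size and leaky-ReLU slope are illustrative:

```python
import torch.nn as nn

class ModernXORNetwork(nn.Module):
    """Sketch: batch normalization + leaky ReLU, with a sigmoid output in (0, 1)."""
    def __init__(self, hidden_size: int = 8):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(2, hidden_size),
            nn.BatchNorm1d(hidden_size),  # stabilizes activations across the batch
            nn.LeakyReLU(0.1),            # avoids dead neurons on negative inputs
            nn.Linear(hidden_size, 1),
            nn.Sigmoid(),                 # squashes the output into (0, 1)
        )

    def forward(self, x):
        return self.layers(x)
```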
…evolution This commit enhances the documentation of our neural network evolution demo by:
- Adding detailed historical context for each architecture (1980s-2010s)
- Explaining key innovations and their significance:
  * Sigmoid to Tanh activation transition
  * Introduction of ReLU and Batch Normalization
  * Early memory mechanisms leading to LSTM
- Documenting architectural decisions and their rationale
- Clarifying the progression of neural network development

The comments help developers understand:
- Why each architecture was significant
- What problems each innovation solved
- How different components work together
- The historical context of deep learning evolution
- Add test_lstm_model function for model evaluation
- Calculate key metrics (MAE, MSE, RMSE) on test data
- Update demo to include a testing phase
- Add proper model evaluation mode handling

This enables better assessment of model performance and generalization through standardized metrics; a sketch of such a function follows.
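A sketch of a test_lstm_model function matching the description above. It assumes the model maps x_test directly to predictions; the PR's exact metric code may differ:

```python
import torch

def test_lstm_model(model, x_test, y_test):
    """Evaluate on held-out data and report MAE, MSE, and RMSE."""
    model.eval()  # disable dropout/batch-norm training behavior
    with torch.no_grad():
        predictions = model(x_test)
        errors = predictions - y_test
        mae = errors.abs().mean().item()
        mse = (errors ** 2).mean().item()
        rmse = mse ** 0.5
    return {"MAE": mae, "MSE": mse, "RMSE": rmse}
```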
Adds detailed explanations and technical documentation to the LSTM-based text generation model, including:
- Architecture overview and component relationships
- Detailed explanations of LSTM memory and sequence processing
- Character embedding and vocabulary handling
- Temperature-controlled text generation sampling (see the sketch below)
- Tensor shape transformations and data flow

The comments follow the project's documentation standards and mirror the detailed explanations in the XOR network implementation, making the code more accessible for learning purposes.

Technical notes:
- Follows Google-style docstrings
- Includes implementation details and design decisions
- Explains the role of each hyperparameter
- Documents tensor shapes and transformations
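For readers unfamiliar with temperature-controlled sampling, a minimal sketch of the idea (not the PR's exact implementation):

```python
import torch

def sample_next_char(logits: torch.Tensor, temperature: float = 1.0) -> int:
    """Temperature-controlled sampling over a vocabulary.

    temperature < 1.0 sharpens the distribution (more conservative output);
    temperature > 1.0 flattens it (more surprising output).
    """
    scaled = logits / temperature
    probs = torch.softmax(scaled, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()
```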
- Add detailed docstrings explaining RNN concepts and historical significance
- Include architectural explanations for network components
- Document the relationship between mathematical operations and intuitive concepts
- Add inline comments explaining the purpose of each layer and transformation
- Follow an educational style similar to the XOR network example

Part of the "Journey to Transformer" tutorial series.
- Add detailed module docstring explaining Word2Vec theory and implementation
- Document the model architecture and training process
- Add explanatory comments for key algorithms and data structures
- Include examples of semantic relationships in embeddings
- Explain negative sampling and context window concepts (sketched below)
- Add inline comments for code clarity and maintainability

This documentation helps developers understand both the theoretical foundations and the practical implementation details of the Word2Vec model.
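A sketch of how context windows and negative sampling combine into the (target, context, label) training pairs the trainer uses; names and the uniform negative-sampling strategy are illustrative:

```python
import random

def make_training_pairs(words, vocab, window_size=2, num_negatives=2):
    """Skip-gram pairs: (target, context, 1.0) plus random negatives (target, noise, 0.0)."""
    pairs = []
    indices = [vocab[w] for w in words if w in vocab]
    for i, target in enumerate(indices):
        lo, hi = max(0, i - window_size), min(len(indices), i + window_size + 1)
        for j in range(lo, hi):
            if j == i:
                continue
            pairs.append((target, indices[j], 1.0))            # observed context word
            for _ in range(num_negatives):                     # sampled noise words
                pairs.append((target, random.randrange(len(vocab)), 0.0))
    return pairs
```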
- Add comprehensive module docstring explaining softmax history and importance
- Include detailed function docstrings with mathematical explanations
- Add scenario-based examples with explanatory comments
- Improve print statements with educational context
- Structure code sections with clear learning objectives

This commit improves the educational value of the softmax implementation by providing deeper context and clearer explanations of the concepts; a minimal implementation is sketched below.
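A minimal, numerically stable softmax for reference (the PR's version carries the fuller educational commentary described above):

```python
import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    """Numerically stable softmax: shift by the max before exponentiating."""
    shifted = scores - np.max(scores)  # prevents overflow in exp
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Example: three class scores become a probability distribution
print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.659, 0.242, 0.099]
```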
Add educational implementation of attention mechanisms showing:
- Word embeddings and vocabulary mapping
- Simple dot-product attention scoring
- Sentence-level search functionality

This commit provides a foundational example for understanding attention mechanisms, a key component of transformer architectures.

Key features:
- CoolAttention class with word embeddings
- Sentence encoding and attention search
- Visualization of attention scores
- Detailed documentation of concepts and history

Technical details (see the sketch below):
- 5-dimensional word embeddings
- Case-sensitive word matching
- Dot-product attention scoring
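A toy sketch of the dot-product attention scoring described above; the shapes mirror the 5-dimensional embeddings, but the code is illustrative rather than the CoolAttention implementation itself:

```python
import torch

def attention_scores(query: torch.Tensor, keys: torch.Tensor) -> torch.Tensor:
    """Dot-product attention: score each key by its similarity to the query."""
    scores = keys @ query                 # (num_words, dim) @ (dim,) -> (num_words,)
    return torch.softmax(scores, dim=0)   # normalize scores into attention weights

# Toy example: four words with 5-dimensional embeddings
embeddings = torch.randn(4, 5)
weights = attention_scores(embeddings[0], embeddings)
print(weights)  # the highest weight falls on the word most similar to the query
```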
Added a new file, tokenizer_vocab.json, which contains the vocabulary mapping for the tokenizer used in the project. This file includes character-to-index mappings and special tokens such as PAD, BOS, EOS, and UNK.
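A hypothetical reconstruction of how such a vocabulary file could be generated; the exact JSON layout of tokenizer_vocab.json is not shown in this PR excerpt, so keys and the character set below are assumptions:

```python
import json

# Special tokens first, then regular characters (layout is illustrative)
special_tokens = {"pad": "<PAD>", "bos": "<BOS>", "eos": "<EOS>", "unk": "<UNK>"}
chars = "abcdefghijklmnopqrstuvwxyz .,!?"

char_to_idx = {tok: i for i, tok in enumerate(special_tokens.values())}
for ch in chars:
    char_to_idx.setdefault(ch, len(char_to_idx))

with open("tokenizer_vocab.json", "w", encoding="utf-8") as f:
    json.dump({"special_tokens": special_tokens, "char_to_idx": char_to_idx}, f, indent=2)
```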
…tracting relationships
…tracting relationships
…tracting relationships
Reviewer's Guide by Sourcery

This PR introduces several new Python files implementing various neural network architectures and concepts, from basic neural networks to transformers. The implementation follows a journey through the evolution of neural networks, with each file building upon concepts from previous ones. The PR also adds a Bayesian network implementation for medical diagnosis and a DeepDream implementation.

Class diagram for BayesianLLM and DiagnosticReasoning
```mermaid
classDiagram
class BayesianLLM {
- str model_name
- ChatOllama llm
- Dict~str, List~str~~ nodes
- Optional~BayesianNetwork~ network
- str patient_story
- Path log_file
+ __init__(model_name: str)
+ _initialize_log_file()
+ log_diagnostic_process(evidence: Dict~str, str~, diagnosis: DiagnosticReasoning) -> None
+ create_node(description: str) -> Tuple~str, List~str~~
+ extract_relationships(text: str) -> List~Tuple~str, str~~
+ build_network()
+ extract_medical_concepts(story: str) -> List~str~
+ extract_evidence(story: str) -> Dict~str, str~
+ setup_medical_network(story: str)
+ generate_explanation(evidence: Dict~str, str~) -> str
+ generate_diagnostic_reasoning(evidence: Dict~str, str~) -> DiagnosticReasoning
+ explain_decision_path(diagnosis: DiagnosticReasoning) -> str
+ verify_log_file() -> bool
}
class DiagnosticReasoning {
- str conclusion
- float confidence
- List~str~ evidence_path
- List~Tuple~str, float~~ alternative_explanations
}
```

Class diagram for MiniGPT and related classes
```mermaid
classDiagram
class MiniGPT {
- GPTConfig config
- nn.Embedding token_embedding
- nn.Embedding position_embedding
- ModuleList blocks
- nn.LayerNorm ln_f
- nn.Linear lm_head
+ __init__(config: GPTConfig)
+ _init_weights(module)
+ forward(idx, targets=None)
+ generate(idx, max_new_tokens, temperature=1.0, sample_fn=None)
}
class TransformerBlock {
- MultiHeadAttention attention
- FeedForward feed_forward
- nn.LayerNorm ln1
- nn.LayerNorm ln2
+ __init__(config)
+ forward(x)
}
class MultiHeadAttention {
- int num_heads
- int head_size
- float dropout
- nn.Linear query
- nn.Linear key
- nn.Linear value
- nn.Linear proj
- mask
+ __init__(config)
+ forward(x)
}
class FeedForward {
- nn.Sequential net
+ __init__(config)
+ forward(x)
}
class GPTConfig {
- int vocab_size
- int block_size
- int n_layer
- int n_embd
- int num_heads
- int head_size
- float dropout
+ __init__(vocab_size, block_size, n_layer=6, n_embd=384, num_heads=6, dropout=0.1)
}
MiniGPT --> TransformerBlock
TransformerBlock --> MultiHeadAttention
TransformerBlock --> FeedForward
MiniGPT --> GPTConfig
```
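A hypothetical usage sketch based only on the signatures in this diagram; the PR's actual training and sampling code may differ:

```python
import torch

# Constructor arguments taken from the GPTConfig signature above
config = GPTConfig(vocab_size=65, block_size=128, n_layer=6, n_embd=384,
                   num_heads=6, dropout=0.1)
model = MiniGPT(config)

idx = torch.zeros((1, 1), dtype=torch.long)           # a single start token
out = model.generate(idx, max_new_tokens=50, temperature=0.8)
print(out.shape)  # expected (1, 51) if generate appends tokens to the prompt
```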
Class diagram for Word2Vec and Word2VecTrainer
```mermaid
classDiagram
class Word2Vec {
- nn.Embedding target_embeddings
- nn.Embedding context_embeddings
+ __init__(vocab_size, embedding_dim)
+ forward(target_word, context_word)
+ get_embedding(word_idx)
}
class Word2VecTrainer {
- int window_size
- Dict~str, int~ vocab
- Dict~int, str~ idx_to_word
- int vocab_size
- List~Tuple~int, int, float~~ training_pairs
- Word2Vec model
- optim.Adam optimizer
- nn.BCELoss criterion
+ __init__(text, embedding_dim=64, window_size=2, min_count=5)
+ _create_training_pairs(words)
+ train(epochs=100, batch_size=24)
+ get_similar_words(word, n=5)
}
Word2VecTrainer --> Word2Vec
```

Class diagram for MoodPredictor
```mermaid
classDiagram
class MoodPredictor {
- int hidden_size
- nn.Linear input_layer
- nn.RNNCell rnn_cell
- nn.Linear output_layer
- nn.Tanh tanh
- nn.Sigmoid sigmoid
+ __init__(input_size, hidden_size, output_size)
+ forward(x, hidden=None)
}
```

Class diagram for TextPredictor and TextProcessor
```mermaid
classDiagram
class TextPredictor {
- int hidden_size
- nn.Embedding embedding
- nn.LSTM lstm
- nn.Linear fc
+ __init__(vocab_size, embedding_dim=32, hidden_size=128)
+ forward(x, hidden=None)
}
class TextProcessor {
- str chars
- Dict~str, int~ char_to_idx
- Dict~int, str~ idx_to_char
- int vocab_size
+ __init__()
+ encode(text)
+ decode(indices)
}
```

Class diagram for BasicNetwork, ImprovedNetwork, ModernNetwork, and SimpleMemoryNetwork
```mermaid
classDiagram
class BasicNetwork {
- nn.Linear layer1
- nn.Sigmoid sigmoid
- nn.Linear layer2
+ __init__(input_size, hidden_size, output_size)
+ forward(x)
}
class ImprovedNetwork {
- nn.Linear layer1
- nn.Tanh tanh
- nn.Linear layer2
+ __init__(input_size, hidden_size, output_size)
+ forward(x)
}
class ModernNetwork {
- nn.Linear layer1
- nn.BatchNorm1d bn1
- nn.ReLU relu
- nn.Linear layer2
- nn.BatchNorm1d bn2
- nn.Linear layer3
+ __init__(input_size, hidden_size, output_size)
+ forward(x)
}
class SimpleMemoryNetwork {
- int hidden_size
- nn.Linear input_gate
- nn.Linear memory_transform
- nn.Linear output_gate
- nn.Linear output
- nn.Tanh tanh
- nn.Sigmoid sigmoid
+ __init__(input_size, hidden_size, output_size)
+ forward(x, hidden_state=None)
}
```

Class diagram for XORNetwork
```mermaid
classDiagram
class XORNetwork {
- nn.Sequential layers
+ __init__()
+ forward(x)
}
```

Class diagram for DeepDreamer
```mermaid
classDiagram
class DeepDreamer {
- Inception3 model
- Dict~str, Tensor~ activations
- str layer_name
+ __init__(model_name="inception_v3", layer_name="Mixed_5b")
+ _get_activation(name)
+ preprocess_image(image_path, size=512)
+ deprocess_image(tensor)
+ dream(image_path, num_iterations=20, lr=0.01, octave_scale=1.4, num_octaves=4)
}
```

Class diagram for SimpleRNN and SimpleLSTM
```mermaid
classDiagram
class SimpleRNN {
- int hidden_size
- nn.RNNCell rnn_cell
- nn.Linear output
+ __init__(input_size, hidden_size, output_size)
+ forward(x, hidden=None)
}
class SimpleLSTM {
- nn.LSTM lstm
- nn.Linear output
+ __init__(input_size, hidden_size, output_size)
+ forward(x)
}
```

Class diagram for CoolAttention
```mermaid
classDiagram
class CoolAttention {
- List~str~ story
- Dict~str, int~ word2idx
- nn.Embedding embeddings
+ __init__()
+ encode_sentence(sentence)
+ attention_search(person)
}
```

Class diagram for SimpleClassifier
```mermaid
classDiagram
class SimpleClassifier {
- nn.Linear layer
+ __init__()
+ forward(x)
}
```
Hey @leonvanbokhorst - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider splitting the larger files (especially 20_bayes_medical_explanability.py) into smaller, more focused modules for better maintainability.
- Add type hints to function parameters and return values throughout the codebase to improve code clarity and maintainability.
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 1 issue found
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
```python
        self.optimizer = optim.Adam(self.model.parameters())
        self.criterion = nn.BCELoss()

    def _create_training_pairs(self, words):
```
suggestion (performance): Implement size limit for training pairs to prevent memory issues
Consider implementing a maximum size limit for the pairs list to prevent potential memory issues with large input texts.
```python
def _create_training_pairs(self, words, max_pairs=1000000):
```
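A possible completion of that suggestion, assuming the vocab and window_size attributes from the Word2VecTrainer class diagram above; negative sampling is omitted for brevity, so this is a sketch of the cap rather than the PR's pair-generation logic:

```python
def _create_training_pairs(self, words, max_pairs=1_000_000):
    """Build skip-gram pairs, stopping once the cap is reached."""
    pairs = []
    for i, word in enumerate(words):
        if word not in self.vocab:
            continue
        start = max(0, i - self.window_size)
        end = min(len(words), i + self.window_size + 1)
        for j in range(start, end):
            if j == i or words[j] not in self.vocab:
                continue
            pairs.append((self.vocab[word], self.vocab[words[j]], 1.0))
            if len(pairs) >= max_pairs:  # cap memory use on large input texts
                return pairs
    return pairs
```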
] | ||
) | ||
|
||
def log_diagnostic_process( |
issue (complexity): Consider extracting log entry preparation logic into a separate method to improve code organization.
The logging implementation can be simplified while maintaining all functionality and error handling. Here's a suggested refactor:
```python
def _prepare_log_entry(self, evidence: Dict[str, str], diagnosis: DiagnosticReasoning) -> Dict[str, str]:
    """Prepare log entry data with error handling"""
    network_structure = (
        [f"{cause} → {effect}" for cause, effect in self.network.edges()]
        if self.network else []
    )
    return {
        "timestamp": datetime.now().isoformat(),
        "patient_story": self.patient_story.strip(),
        "extracted_evidence": json.dumps(evidence),
        "primary_conclusion": diagnosis.conclusion,
        "confidence": str(diagnosis.confidence),
        "evidence_path": json.dumps(diagnosis.evidence_path),
        "alternative_explanations": json.dumps(diagnosis.alternative_explanations),
        "network_structure": json.dumps(network_structure),
    }

def log_diagnostic_process(self, evidence: Dict[str, str], diagnosis: DiagnosticReasoning) -> None:
    """Log diagnostic process with simplified error handling"""
    try:
        log_entry = self._prepare_log_entry(evidence, diagnosis)
        with open(self.log_file, "a", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(log_entry.keys()))
            writer.writerow(log_entry)
        logger.info(f"Successfully logged diagnostic process to {self.log_file}")
    except Exception as e:
        logger.error(f"Failed to log diagnostic process: {e}", exc_info=True)
        raise
```
This refactor:
- Extracts log entry preparation to a separate method
- Flattens the error handling structure
- Maintains all functionality and error tracking
- Makes the code flow more linear and easier to follow
) | ||
|
||
try: | ||
chain = prompt | self.llm | StrOutputParser() |
issue (code-quality): Extract code out into method (extract-method)
```python
        logger.debug(f"Cleaned medical concepts response: {content}")

        try:
            concepts = json.loads(content)
```
issue (code-quality): Extract code out into method (extract-method)
```python
        explanation = chain.invoke({"nodes": nodes_str, "evidence": evidence_str})

        return explanation
```
suggestion (code-quality): Inline variable that is immediately returned (inline-immediately-returned-variable)
```python
# Before
explanation = chain.invoke({"nodes": nodes_str, "evidence": evidence_str})
return explanation

# Suggested
return chain.invoke({"nodes": nodes_str, "evidence": evidence_str})
```
network_structure = [ | ||
f"{cause} → {effect}" for cause, effect in self.network.edges() | ||
] | ||
nodes_states = {node: states for node, states in self.nodes.items()} |
issue (code-quality): We've found these issues:
- Replace identity comprehension with call to collection constructor (identity-comprehension)
- Extract code out into method (extract-method)
- Invert any/all to simplify comparisons (invert-any-all)
Explanation
Convert list/set/tuple comprehensions that do not change the input elements into calls to the collection constructor.

Before:
```python
# List comprehensions
[item for item in coll]
[item for item in friends.names()]

# Dict comprehensions
{k: v for k, v in coll}
{k: v for k, v in coll.items()}  # Only if we know coll is a `dict`

# Unneeded call to `.items()`
dict(coll.items())  # Only if we know coll is a `dict`

# Set comprehensions
{item for item in coll}
```

After:
```python
# List comprehensions
list(iter(coll))
list(iter(friends.names()))

# Dict comprehensions
dict(coll)
dict(coll)

# Unneeded call to `.items()`
dict(coll)

# Set comprehensions
set(coll)
```

All these comprehensions are just creating a copy of the original collection. They can all be simplified by constructing the new collection directly; the resulting code is easier to read and shows the intent more clearly.
} | ||
|
||
self.char_to_idx = {token: idx for idx, token in enumerate(self.special_tokens.values())} | ||
self.idx_to_char = {idx: token for idx, token in enumerate(self.special_tokens.values())} |
issue (code-quality): Replace identity comprehension with call to collection constructor (identity-comprehension)
Explanation
Convert list/set/tuple comprehensions that do not change the input elements into calls to the collection constructor.

Before:
```python
# List comprehensions
[item for item in coll]
[item for item in friends.names()]

# Dict comprehensions
{k: v for k, v in coll}
{k: v for k, v in coll.items()}  # Only if we know coll is a `dict`

# Unneeded call to `.items()`
dict(coll.items())  # Only if we know coll is a `dict`

# Set comprehensions
{item for item in coll}
```

After:
```python
# List comprehensions
list(iter(coll))
list(iter(friends.names()))

# Dict comprehensions
dict(coll)
dict(coll)

# Unneeded call to `.items()`
dict(coll)

# Set comprehensions
set(coll)
```

All these comprehensions are just creating a copy of the original collection. They can all be simplified by constructing the new collection directly; the resulting code is easier to read and shows the intent more clearly.
Summary by Sourcery
Introduce multiple new machine learning models and algorithms, including a Bayesian network-based medical diagnosis system, a MiniGPT transformer model, a Word2Vec embedding model, an RNN mood predictor, an LSTM for character prediction, and a simple attention mechanism. Enhance the project with a deep dreamer class for generating DeepDream images and update the documentation to include detailed explanations of the historical significance and key concepts of these models.