
Add Bayesian network and LLM integration for medical diagnosis #58

Merged
merged 18 commits into main from journey-to-transformer on Nov 23, 2024

Conversation

leonvanbokhorst
Owner

leonvanbokhorst commented on Nov 23, 2024

++

Summary by Sourcery

Introduce multiple new machine learning models and algorithms, including a Bayesian network-based medical diagnosis system, a MiniGPT transformer model, a Word2Vec embedding model, an RNN mood predictor, an LSTM for character prediction, and a simple attention mechanism. Enhance the project with a deep dreamer class for generating DeepDream images and update the documentation to include detailed explanations of the historical significance and key concepts of these models.

New Features:

  • Introduce a Bayesian network-based medical diagnosis system that integrates large language models (LLMs) for natural language understanding and probabilistic reasoning.
  • Implement a MiniGPT model, a simplified version of the GPT architecture, demonstrating core transformer concepts like multi-head self-attention and position embeddings (see the minimal sketch after this list).
  • Add a Word2Vec implementation using the skip-gram model with negative sampling to learn word embeddings from context.
  • Create a Recurrent Neural Network (RNN) mood predictor that processes sequences of events to predict mood based on past events.
  • Develop a Long Short-Term Memory (LSTM) network for next character prediction, showcasing its ability to learn long-term dependencies in text.
  • Implement a simple attention mechanism to encode sentences and calculate attention scores between words, demonstrating the concept of attention in neural networks.
  • Introduce a softmax function demonstration to convert raw scores into probabilities, explaining its significance in neural networks.
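
The multi-head self-attention mentioned in the MiniGPT item above boils down to scaled dot-product attention with a causal mask. Below is a minimal single-head sketch in PyTorch, with illustrative dimensions rather than the values from the PR's GPTConfig:

import torch
import torch.nn as nn
import torch.nn.functional as F

batch, seq_len, n_embd = 2, 8, 16                 # illustrative sizes only
x = torch.randn(batch, seq_len, n_embd)           # token + position embeddings, summed

query, key, value = (nn.Linear(n_embd, n_embd) for _ in range(3))
q, k, v = query(x), key(x), value(x)

scores = q @ k.transpose(-2, -1) / (n_embd ** 0.5)          # scaled dot products
mask = torch.tril(torch.ones(seq_len, seq_len)).bool()      # causal mask
scores = scores.masked_fill(~mask, float("-inf"))           # hide future positions

weights = F.softmax(scores, dim=-1)               # each row sums to 1
out = weights @ v                                 # weighted mix of value vectors
print(out.shape)                                  # torch.Size([2, 8, 16])

A full multi-head version splits n_embd across several heads and concatenates their outputs; the PR's MultiHeadAttention class wraps this same pattern.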

Enhancements:

  • Add pgmpy to the requirements to provide Bayesian network functionality.
  • Implement a deep dreamer class using a pre-trained Inception V3 model to generate DeepDream images.

Documentation:

  • Add comprehensive docstrings and comments explaining the historical significance and key concepts of various neural network architectures and algorithms, including Bayesian networks, transformers, RNNs, LSTMs, and attention mechanisms.

leonvanbokhorst and others added 18 commits November 18, 2024 17:18
…sing

This commit refactors the DeepDreamer class in deepdream.py to improve the model initialization and image preprocessing steps. The model initialization now uses the weights parameter, and the image preprocessing includes resizing while maintaining the aspect ratio and converting the image to a tensor and normalizing it.
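
A rough sketch of what that initialization and preprocessing can look like with torchvision (the function and parameter names here are illustrative and may differ from deepdream.py):

import torch
from PIL import Image
from torchvision import models, transforms

# Initialize the model via the weights parameter instead of the deprecated pretrained flag
weights = models.Inception_V3_Weights.DEFAULT
model = models.inception_v3(weights=weights)
model.eval()

def preprocess_image(image_path: str, size: int = 512) -> torch.Tensor:
    """Resize (shorter side to `size`, aspect ratio preserved), convert to tensor, normalize."""
    image = Image.open(image_path).convert("RGB")
    transform = transforms.Compose([
        transforms.Resize(size),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    return transform(image).unsqueeze(0)   # add a batch dimension
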
Add visual demonstration of the XOR problem showing:
- Why XOR is not linearly separable
- Neural network's learned decision boundaries
- Interactive visualization of training process

This builds on 01_xor_network.py to provide visual insights into:
- Data point distribution
- Failed linear separation attempts
- Complex decision boundaries learned by the network

Technical additions:
- Contour plot of network decisions
- Grid-based boundary visualization
- Improved training monitoring
- Refactor the XORNetwork class to improve code readability and maintainability.
- Update the class initialization and activation functions for better learning.
- Add a new ModernXORNetwork class with batch normalization and leaky ReLU activation.
- Ensure the output is between 0 and 1 using sigmoid activation.
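
For context, a minimal XOR network of this shape can be trained in a few lines; this is a sketch under assumed hyperparameters, not the refactored XORNetwork itself:

import torch
import torch.nn as nn

class TinyXORNet(nn.Module):
    def __init__(self, hidden: int = 8):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(2, hidden),
            nn.LeakyReLU(),          # leaky ReLU, as in the modern variant
            nn.Linear(hidden, 1),
            nn.Sigmoid(),            # keeps the output between 0 and 1
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layers(x)

X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

model = TinyXORNet()
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
loss_fn = nn.BCELoss()

for _ in range(2000):
    optimizer.zero_grad()
    loss_fn(model(X), y).backward()
    optimizer.step()

print(model(X).round().squeeze())    # should approach tensor([0., 1., 1., 0.])
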
…evolution

This commit enhances the documentation of our neural network evolution demo by:

- Adding detailed historical context for each architecture (1980s-2010s)
- Explaining key innovations and their significance:
  * Sigmoid to Tanh activation transition
  * Introduction of ReLU and Batch Normalization
  * Early memory mechanisms leading to LSTM
- Documenting architectural decisions and their rationale
- Clarifying the progression of neural network development

The comments help developers understand:
- Why each architecture was significant
- What problems each innovation solved
- How different components work together
- The historical context of deep learning evolution
- Add test_lstm_model function for model evaluation
- Calculate key metrics (MAE, MSE, RMSE) on test data
- Update demo to include testing phase
- Add proper model evaluation mode handling

This enables better assessment of model performance and generalization
capabilities through standardized metrics.
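
The standardized metrics mentioned here take only a few lines of PyTorch; this is a generic sketch rather than the test_lstm_model function added in the commit:

import torch

def regression_metrics(predictions: torch.Tensor, targets: torch.Tensor) -> dict:
    # Mean absolute error, mean squared error, and its square root
    errors = predictions - targets
    mae = errors.abs().mean().item()
    mse = (errors ** 2).mean().item()
    return {"MAE": mae, "MSE": mse, "RMSE": mse ** 0.5}

preds = torch.tensor([0.2, 0.7, 0.5])
actuals = torch.tensor([0.0, 1.0, 0.5])
print(regression_metrics(preds, actuals))   # approx. MAE 0.167, MSE 0.043, RMSE 0.208
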
Adds detailed explanations and technical documentation to the LSTM-based text
generation model, including:

- Architecture overview and component relationships
- Detailed explanations of LSTM memory and sequence processing
- Character embedding and vocabulary handling
- Temperature-controlled text generation sampling
- Tensor shape transformations and data flow

The comments follow the project's documentation standards and mirror the
detailed explanations in the XOR network implementation, making the code more
accessible for learning purposes.

Technical notes:
- Follows Google-style docstrings
- Includes implementation details and design decisions
- Explains the role of each hyperparameter
- Documents tensor shapes and transformations
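
Temperature-controlled sampling, mentioned above, typically divides the logits by a temperature before applying softmax; a minimal sketch (illustrative, not the commit's exact code):

import torch
import torch.nn.functional as F

def sample_next_char(logits: torch.Tensor, temperature: float = 1.0) -> int:
    # Lower temperature sharpens the distribution (safer choices);
    # higher temperature flattens it (more surprising choices).
    probs = F.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()

logits = torch.tensor([2.0, 1.0, 0.5, 0.1, -1.0])   # scores over a 5-character vocabulary
print(sample_next_char(logits, temperature=0.5))     # usually index 0
print(sample_next_char(logits, temperature=1.5))     # more varied picks
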
- Add detailed docstrings explaining RNN concepts and historical significance
- Include architectural explanations for network components
- Document relationship between mathematical operations and intuitive concepts
- Add inline comments explaining the purpose of each layer and transformation
- Follow educational style similar to XOR network example

Part of the "Journey to Transformer" tutorial series.
- Add detailed module docstring explaining Word2Vec theory and implementation
- Document model architecture and training process
- Add explanatory comments for key algorithms and data structures
- Include examples of semantic relationships in embeddings
- Explain negative sampling and context window concepts
- Add inline comments for code clarity and maintainability

This documentation helps developers understand both the theoretical
foundations and practical implementation details of the Word2Vec model.
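
Negative sampling and the context window come down to generating labelled (target, context) pairs. The sketch below is a simplified version of the idea; real implementations usually draw negatives from a smoothed unigram distribution and skip the true context word:

import random

def skipgram_pairs(words, vocab, window_size=2, num_negatives=2):
    # Positive pairs come from the context window; negatives are random words labelled 0.
    pairs, vocab_indices = [], list(vocab.values())
    for i, word in enumerate(words):
        target = vocab[word]
        lo, hi = max(0, i - window_size), min(len(words), i + window_size + 1)
        for j in range(lo, hi):
            if j == i:
                continue
            pairs.append((target, vocab[words[j]], 1.0))
            for _ in range(num_negatives):
                pairs.append((target, random.choice(vocab_indices), 0.0))
    return pairs

text = "the cat sat on the mat".split()
vocab = {w: i for i, w in enumerate(dict.fromkeys(text))}
print(skipgram_pairs(text, vocab)[:5])
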
- Add comprehensive module docstring explaining softmax history and importance
- Include detailed function docstrings with mathematical explanations
- Add scenario-based examples with explanatory comments
- Improve print statements with educational context
- Structure code sections with clear learning objectives

This commit improves the educational value of the softmax implementation
by providing deeper context and clearer explanations of the concepts.
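
For reference, the core of the softmax demonstration is only a few lines; this NumPy sketch (not the file's exact code) shows the conversion from raw scores to probabilities:

import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    # Shift by the max for numerical stability; softmax is invariant to this shift.
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

raw_scores = np.array([2.0, 1.0, 0.1])
probs = softmax(raw_scores)
print(probs, probs.sum())   # approx. [0.659 0.242 0.099], summing to 1.0
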
Add educational implementation of attention mechanisms showing:
- Word embeddings and vocabulary mapping
- Simple dot product attention scoring
- Sentence-level search functionality

This commit provides a foundational example for understanding
attention mechanisms, a key component of transformer architectures.

Key features:
- CoolAttention class with word embeddings
- Sentence encoding and attention search
- Visualization of attention scores
- Detailed documentation of concepts and history

Technical details:
- 5-dimensional word embeddings
- Case-sensitive word matching
- Dot product attention scoring
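
The dot-product scoring described above can be sketched as follows, using random embeddings and a hypothetical sentence (the actual CoolAttention class may differ):

import torch
import torch.nn as nn
import torch.nn.functional as F

sentence = "Alice met Bob at the market".split()
word2idx = {w: i for i, w in enumerate(sentence)}       # case-sensitive mapping
embeddings = nn.Embedding(len(word2idx), 5)             # 5-dimensional embeddings

def attention_search(query_word: str) -> torch.Tensor:
    # Score every word against the query via dot products, then normalize with softmax.
    vectors = embeddings(torch.tensor([word2idx[w] for w in sentence]))
    query = embeddings(torch.tensor(word2idx[query_word]))
    return F.softmax(vectors @ query, dim=0)

for word, weight in zip(sentence, attention_search("Alice")):
    print(f"{word:>8}: {weight.item():.3f}")            # untrained, so weights are arbitrary
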
Added a new file, tokenizer_vocab.json, which contains the vocabulary mapping for the tokenizer used in the project. This file includes character-to-index mappings and special tokens such as PAD, BOS, EOS, and UNK.
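
A vocabulary file of that shape could be produced like this; the layout here is hypothetical and the actual tokenizer_vocab.json may be organised differently:

import json
import string

special_tokens = {"pad": "<PAD>", "bos": "<BOS>", "eos": "<EOS>", "unk": "<UNK>"}

# Special tokens first, then ordinary characters
char_to_idx = {tok: i for i, tok in enumerate(special_tokens.values())}
for ch in string.ascii_lowercase + string.digits + " .,!?":
    char_to_idx.setdefault(ch, len(char_to_idx))

vocab = {
    "special_tokens": special_tokens,
    "char_to_idx": char_to_idx,
    "idx_to_char": {idx: ch for ch, idx in char_to_idx.items()},
}

with open("tokenizer_vocab.json", "w", encoding="utf-8") as f:
    json.dump(vocab, f, indent=2)   # note: JSON stores the idx_to_char keys as strings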

sourcery-ai bot commented Nov 23, 2024

Reviewer's Guide by Sourcery

This PR introduces several new Python files implementing various neural network architectures and concepts, from basic neural networks to transformers. The implementation follows a journey through the evolution of neural networks, with each file building upon concepts from previous ones. The PR also adds a Bayesian network implementation for medical diagnosis and a DeepDream implementation.

Class diagram for BayesianLLM and DiagnosticReasoning

classDiagram
    class BayesianLLM {
        - str model_name
        - ChatOllama llm
        - Dict~str, List~str~~ nodes
        - Optional~BayesianNetwork~ network
        - str patient_story
        - Path log_file
        + __init__(model_name: str)
        + _initialize_log_file()
        + log_diagnostic_process(evidence: Dict~str, str~, diagnosis: DiagnosticReasoning) -> None
        + create_node(description: str) -> Tuple~str, List~str~~
        + extract_relationships(text: str) -> List~Tuple~str, str~~
        + build_network()
        + extract_medical_concepts(story: str) -> List~str~
        + extract_evidence(story: str) -> Dict~str, str~
        + setup_medical_network(story: str)
        + generate_explanation(evidence: Dict~str, str~) -> str
        + generate_diagnostic_reasoning(evidence: Dict~str, str~) -> DiagnosticReasoning
        + explain_decision_path(diagnosis: DiagnosticReasoning) -> str
        + verify_log_file() -> bool
    }
    class DiagnosticReasoning {
        - str conclusion
        - float confidence
        - List~str~ evidence_path
        - List~Tuple~str, float~~ alternative_explanations
    }

Class diagram for MiniGPT and related classes

classDiagram
    class MiniGPT {
        - GPTConfig config
        - nn.Embedding token_embedding
        - nn.Embedding position_embedding
        - ModuleList blocks
        - nn.LayerNorm ln_f
        - nn.Linear lm_head
        + __init__(config: GPTConfig)
        + _init_weights(module)
        + forward(idx, targets=None)
        + generate(idx, max_new_tokens, temperature=1.0, sample_fn=None)
    }
    class TransformerBlock {
        - MultiHeadAttention attention
        - FeedForward feed_forward
        - nn.LayerNorm ln1
        - nn.LayerNorm ln2
        + __init__(config)
        + forward(x)
    }
    class MultiHeadAttention {
        - int num_heads
        - int head_size
        - float dropout
        - nn.Linear query
        - nn.Linear key
        - nn.Linear value
        - nn.Linear proj
        - mask
        + __init__(config)
        + forward(x)
    }
    class FeedForward {
        - nn.Sequential net
        + __init__(config)
        + forward(x)
    }
    class GPTConfig {
        - int vocab_size
        - int block_size
        - int n_layer
        - int n_embd
        - int num_heads
        - int head_size
        - float dropout
        + __init__(vocab_size, block_size, n_layer=6, n_embd=384, num_heads=6, dropout=0.1)
    }
    MiniGPT --> TransformerBlock
    TransformerBlock --> MultiHeadAttention
    TransformerBlock --> FeedForward
    MiniGPT --> GPTConfig

Class diagram for Word2Vec and Word2VecTrainer

classDiagram
    class Word2Vec {
        - nn.Embedding target_embeddings
        - nn.Embedding context_embeddings
        + __init__(vocab_size, embedding_dim)
        + forward(target_word, context_word)
        + get_embedding(word_idx)
    }
    class Word2VecTrainer {
        - int window_size
        - Dict~str, int~ vocab
        - Dict~int, str~ idx_to_word
        - int vocab_size
        - List~Tuple~int, int, float~~ training_pairs
        - Word2Vec model
        - optim.Adam optimizer
        - nn.BCELoss criterion
        + __init__(text, embedding_dim=64, window_size=2, min_count=5)
        + _create_training_pairs(words)
        + train(epochs=100, batch_size=24)
        + get_similar_words(word, n=5)
    }
    Word2VecTrainer --> Word2Vec

Class diagram for MoodPredictor

classDiagram
    class MoodPredictor {
        - int hidden_size
        - nn.Linear input_layer
        - nn.RNNCell rnn_cell
        - nn.Linear output_layer
        - nn.Tanh tanh
        - nn.Sigmoid sigmoid
        + __init__(input_size, hidden_size, output_size)
        + forward(x, hidden=None)
    }

Class diagram for TextPredictor and TextProcessor

classDiagram
    class TextPredictor {
        - int hidden_size
        - nn.Embedding embedding
        - nn.LSTM lstm
        - nn.Linear fc
        + __init__(vocab_size, embedding_dim=32, hidden_size=128)
        + forward(x, hidden=None)
    }
    class TextProcessor {
        - str chars
        - Dict~str, int~ char_to_idx
        - Dict~int, str~ idx_to_char
        - int vocab_size
        + __init__()
        + encode(text)
        + decode(indices)
    }

Class diagram for BasicNetwork, ImprovedNetwork, ModernNetwork, and SimpleMemoryNetwork

classDiagram
    class BasicNetwork {
        - nn.Linear layer1
        - nn.Sigmoid sigmoid
        - nn.Linear layer2
        + __init__(input_size, hidden_size, output_size)
        + forward(x)
    }
    class ImprovedNetwork {
        - nn.Linear layer1
        - nn.Tanh tanh
        - nn.Linear layer2
        + __init__(input_size, hidden_size, output_size)
        + forward(x)
    }
    class ModernNetwork {
        - nn.Linear layer1
        - nn.BatchNorm1d bn1
        - nn.ReLU relu
        - nn.Linear layer2
        - nn.BatchNorm1d bn2
        - nn.Linear layer3
        + __init__(input_size, hidden_size, output_size)
        + forward(x)
    }
    class SimpleMemoryNetwork {
        - int hidden_size
        - nn.Linear input_gate
        - nn.Linear memory_transform
        - nn.Linear output_gate
        - nn.Linear output
        - nn.Tanh tanh
        - nn.Sigmoid sigmoid
        + __init__(input_size, hidden_size, output_size)
        + forward(x, hidden_state=None)
    }

Class diagram for XORNetwork

classDiagram
    class XORNetwork {
        - nn.Sequential layers
        + __init__()
        + forward(x)
    }

Class diagram for DeepDreamer

classDiagram
    class DeepDreamer {
        - Inception3 model
        - Dict~str, Tensor~ activations
        - str layer_name
        + __init__(model_name="inception_v3", layer_name="Mixed_5b")
        + _get_activation(name)
        + preprocess_image(image_path, size=512)
        + deprocess_image(tensor)
        + dream(image_path, num_iterations=20, lr=0.01, octave_scale=1.4, num_octaves=4)
    }

Class diagram for SimpleRNN and SimpleLSTM

classDiagram
    class SimpleRNN {
        - int hidden_size
        - nn.RNNCell rnn_cell
        - nn.Linear output
        + __init__(input_size, hidden_size, output_size)
        + forward(x, hidden=None)
    }
    class SimpleLSTM {
        - nn.LSTM lstm
        - nn.Linear output
        + __init__(input_size, hidden_size, output_size)
        + forward(x)
    }

Class diagram for CoolAttention

classDiagram
    class CoolAttention {
        - List~str~ story
        - Dict~str, int~ word2idx
        - nn.Embedding embeddings
        + __init__()
        + encode_sentence(sentence)
        + attention_search(person)
    }

Class diagram for SimpleClassifier

classDiagram
    class SimpleClassifier {
        - nn.Linear layer
        + __init__()
        + forward(x)
    }

File-Level Changes

Change | Details | Files
Added a series of neural network implementations showing the evolution from basic networks to transformers
  • Implemented XOR problem solution with neural networks
  • Created neural network evolution demonstration with different architectures
  • Added RNN mood predictor implementation
  • Implemented RNN vs LSTM memory comparison
  • Created Word2Vec implementation for word embeddings
  • Added LSTM character prediction model
  • Implemented attention mechanism demonstration
  • Created mini-GPT transformer implementation
src/journey_to_transformer/01_xor_network.py
src/journey_to_transformer/02_neural_net_evolution.py
src/journey_to_transformer/03_rnn_mood_pred.py
src/journey_to_transformer/04_rnn_vs_lstm_mem.py
src/journey_to_transformer/05_lstm_next_char_pred.py
src/journey_to_transformer/06_word2vec.py
src/journey_to_transformer/07_softmax.py
src/journey_to_transformer/08_attention.py
src/journey_to_transformer/09_mini_gpt.py
Added a Bayesian network implementation for medical diagnosis with LLM integration
  • Implemented BayesianLLM class for medical diagnosis
  • Created diagnostic reasoning system with probabilistic inference
  • Added logging system for diagnostic processes
  • Implemented natural language explanation generation
src/20_bayes_medical_explanability.py
Added DeepDream implementation for neural network visualization
  • Implemented DeepDreamer class using InceptionV3
  • Created image processing utilities for dream generation
  • Added multi-octave processing for enhanced visualization
src/poc/deepdream.py
Updated project dependencies
  • Added pgmpy for Bayesian Networks
requirements.txt
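
pgmpy provides discrete Bayesian networks through classes such as BayesianNetwork, TabularCPD, and VariableElimination. The toy two-node sketch below only illustrates the kind of inference this enables; the probabilities are made up and are not from the PR (recent pgmpy releases use the BayesianNetwork name, older ones BayesianModel):

from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Flu -> Fever, with made-up conditional probability tables
model = BayesianNetwork([("Flu", "Fever")])
model.add_cpds(
    TabularCPD("Flu", 2, [[0.9], [0.1]]),                 # P(Flu)
    TabularCPD("Fever", 2, [[0.8, 0.2], [0.2, 0.8]],      # P(Fever | Flu), one column per Flu state
               evidence=["Flu"], evidence_card=[2]),
)
model.check_model()

inference = VariableElimination(model)
print(inference.query(["Flu"], evidence={"Fever": 1}))    # belief about Flu after observing Fever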

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time. You can also use
    this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

leonvanbokhorst self-assigned this on Nov 23, 2024
leonvanbokhorst added the documentation (Improvements or additions to documentation) and enhancement (New feature or request) labels on Nov 23, 2024
sourcery-ai bot changed the title from "@sourcery-ai" to "Add Bayesian network and LLM integration for medical diagnosis" on Nov 23, 2024
sourcery-ai bot left a comment


Hey @leonvanbokhorst - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider splitting the larger files (especially 20_bayes_medical_explanability.py) into smaller, more focused modules for better maintainability.
  • Add type hints to function parameters and return values throughout the codebase to improve code clarity and maintainability.
Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

self.optimizer = optim.Adam(self.model.parameters())
self.criterion = nn.BCELoss()

def _create_training_pairs(self, words):

suggestion (performance): Implement size limit for training pairs to prevent memory issues

Consider implementing a maximum size limit for the pairs list to prevent potential memory issues with large input texts.

    def _create_training_pairs(self, words, max_pairs=1000000):

]
)

def log_diagnostic_process(

issue (complexity): Consider extracting log entry preparation logic into a separate method to improve code organization.

The logging implementation can be simplified while maintaining all functionality and error handling. Here's a suggested refactor:

def _prepare_log_entry(self, evidence: Dict[str, str], diagnosis: DiagnosticReasoning) -> Dict[str, str]:
    """Prepare log entry data with error handling"""
    network_structure = (
        [f"{cause}{effect}" for cause, effect in self.network.edges()]
        if self.network else []
    )

    return {
        "timestamp": datetime.now().isoformat(),
        "patient_story": self.patient_story.strip(),
        "extracted_evidence": json.dumps(evidence),
        "primary_conclusion": diagnosis.conclusion,
        "confidence": str(diagnosis.confidence),
        "evidence_path": json.dumps(diagnosis.evidence_path),
        "alternative_explanations": json.dumps(diagnosis.alternative_explanations),
        "network_structure": json.dumps(network_structure)
    }

def log_diagnostic_process(self, evidence: Dict[str, str], diagnosis: DiagnosticReasoning) -> None:
    """Log diagnostic process with simplified error handling"""
    try:
        log_entry = self._prepare_log_entry(evidence, diagnosis)

        with open(self.log_file, "a", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(log_entry.keys()))
            writer.writerow(log_entry)

        logger.info(f"Successfully logged diagnostic process to {self.log_file}")

    except Exception as e:
        logger.error(f"Failed to log diagnostic process: {e}", exc_info=True)
        raise

This refactor:

  1. Extracts log entry preparation to a separate method
  2. Flattens the error handling structure
  3. Maintains all functionality and error tracking
  4. Makes the code flow more linear and easier to follow

)

try:
chain = prompt | self.llm | StrOutputParser()

issue (code-quality): Extract code out into method (extract-method)

logger.debug(f"Cleaned medical concepts response: {content}")

try:
concepts = json.loads(content)

issue (code-quality): Extract code out into method (extract-method)

Comment on lines +547 to +550
explanation = chain.invoke({"nodes": nodes_str, "evidence": evidence_str})

return explanation


suggestion (code-quality): Inline variable that is immediately returned (inline-immediately-returned-variable)

Suggested change:
- explanation = chain.invoke({"nodes": nodes_str, "evidence": evidence_str})
- return explanation
+ return chain.invoke({"nodes": nodes_str, "evidence": evidence_str})

network_structure = [
f"{cause} → {effect}" for cause, effect in self.network.edges()
]
nodes_states = {node: states for node, states in self.nodes.items()}

issue (code-quality): We've found these issues:


Explanation: Convert list/set/tuple comprehensions that do not change the input elements into calls to the collection constructor.

Before

# List comprehensions
[item for item in coll]
[item for item in friends.names()]

# Dict comprehensions
{k: v for k, v in coll}
{k: v for k, v in coll.items()}  # Only if we know coll is a `dict`

# Unneeded call to `.items()`
dict(coll.items())  # Only if we know coll is a `dict`

# Set comprehensions
{item for item in coll}

After

# List comprehensions
list(iter(coll))
list(iter(friends.names()))

# Dict comprehensions
dict(coll)
dict(coll)

# Unneeded call to `.items()`
dict(coll)

# Set comprehensions
set(coll)

All these comprehensions are just creating a copy of the original collection.
They can all be simplified by simply constructing a new collection directly. The
resulting code is easier to read and shows the intent more clearly.

}

self.char_to_idx = {token: idx for idx, token in enumerate(self.special_tokens.values())}
self.idx_to_char = {idx: token for idx, token in enumerate(self.special_tokens.values())}

issue (code-quality): Replace identity comprehension with call to collection constructor (identity-comprehension)



leonvanbokhorst merged commit 6334c46 into main on Nov 23, 2024
1 check passed
leonvanbokhorst deleted the journey-to-transformer branch on November 23, 2024 at 13:16