Add Bayesian network and LLM integration for medical diagnosis #58
Conversation
…sing This commit refactors the DeepDreamer class in deepdream.py to improve model initialization and image preprocessing. Model initialization now uses the weights parameter, and preprocessing resizes the image while maintaining its aspect ratio, converts it to a tensor, and normalizes it.
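For reference, a minimal sketch of the preprocessing this commit describes, assuming torchvision's Inception_V3_Weights API; function names and the resize strategy are illustrative, not the exact deepdream.py code:

```python
import torch
from PIL import Image
from torchvision import models, transforms
from torchvision.models import Inception_V3_Weights

def preprocess_image(image_path: str, size: int = 512) -> torch.Tensor:
    """Resize while keeping the aspect ratio, then convert to a normalized tensor."""
    image = Image.open(image_path).convert("RGB")
    ratio = size / max(image.size)  # image.size is (width, height)
    new_size = (int(image.size[0] * ratio), int(image.size[1] * ratio))
    image = image.resize(new_size, Image.LANCZOS)
    transform = transforms.Compose([
        transforms.ToTensor(),
        # ImageNet statistics, matching the pretrained Inception weights
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    return transform(image).unsqueeze(0)  # add a batch dimension

# Model initialization via the weights parameter (replaces the deprecated pretrained=True)
model = models.inception_v3(weights=Inception_V3_Weights.DEFAULT)
model.eval()
```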
Add visual demonstration of the XOR problem showing:
- Why XOR is not linearly separable
- Neural network's learned decision boundaries
- Interactive visualization of the training process

This builds on 01_xor_network.py to provide visual insight into:
- Data point distribution
- Failed linear separation attempts
- Complex decision boundaries learned by the network

Technical additions (see the sketch below):
- Contour plot of network decisions
- Grid-based boundary visualization
- Improved training monitoring
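A minimal sketch of the grid-based boundary visualization described above; the function name, grid range, and plot styling are illustrative rather than the PR's exact code:

```python
import numpy as np
import torch
import matplotlib.pyplot as plt

def plot_decision_boundary(model: torch.nn.Module, resolution: int = 200) -> None:
    """Evaluate the trained network on a dense grid and draw its decision regions."""
    xs = np.linspace(-0.5, 1.5, resolution)
    xx, yy = np.meshgrid(xs, xs)
    grid = torch.tensor(np.column_stack([xx.ravel(), yy.ravel()]), dtype=torch.float32)
    with torch.no_grad():
        zz = model(grid).reshape(xx.shape).numpy()
    plt.contourf(xx, yy, zz, levels=20, cmap="RdBu", alpha=0.7)
    plt.colorbar(label="network output")
    # The four XOR points: class 0 on the diagonal, class 1 off it
    plt.scatter([0, 1], [0, 1], c="blue", edgecolors="k", label="class 0")
    plt.scatter([0, 1], [1, 0], c="red", edgecolors="k", label="class 1")
    plt.legend()
    plt.show()
```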
- Refactor the XORNetwork class to improve code readability and maintainability.
- Update the class initialization and activation functions for better learning.
- Add a new ModernXORNetwork class with batch normalization and leaky ReLU activation (sketched below).
- Ensure the output is between 0 and 1 using sigmoid activation.
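A sketch of what such a ModernXORNetwork might look like, assembled from the components named above; the hidden size and leaky-ReLU slope are illustrative:

```python
import torch.nn as nn

class ModernXORNetwork(nn.Module):
    """Sketch: batch normalization + leaky ReLU, with a sigmoid output in (0, 1)."""
    def __init__(self, hidden_size: int = 8):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(2, hidden_size),
            nn.BatchNorm1d(hidden_size),  # stabilizes activations across the batch
            nn.LeakyReLU(0.1),            # avoids dead neurons on negative inputs
            nn.Linear(hidden_size, 1),
            nn.Sigmoid(),                 # squashes the output into (0, 1)
        )

    def forward(self, x):
        return self.layers(x)
```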
…evolution This commit enhances the documentation of our neural network evolution demo by:
- Adding detailed historical context for each architecture (1980s-2010s)
- Explaining key innovations and their significance:
  * Sigmoid to Tanh activation transition
  * Introduction of ReLU and Batch Normalization
  * Early memory mechanisms leading to LSTM
- Documenting architectural decisions and their rationale
- Clarifying the progression of neural network development

The comments help developers understand:
- Why each architecture was significant
- What problems each innovation solved
- How different components work together
- The historical context of deep learning evolution
- Add test_lstm_model function for model evaluation
- Calculate key metrics (MAE, MSE, RMSE) on test data
- Update demo to include a testing phase
- Add proper model evaluation mode handling

This enables better assessment of model performance and generalization through standardized metrics; a sketch of such a function follows.
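A sketch of a test_lstm_model function matching the description above. It assumes the model maps x_test directly to predictions; the PR's exact metric code may differ:

```python
import torch

def test_lstm_model(model, x_test, y_test):
    """Evaluate on held-out data and report MAE, MSE, and RMSE."""
    model.eval()  # disable dropout/batch-norm training behavior
    with torch.no_grad():
        predictions = model(x_test)
        errors = predictions - y_test
        mae = errors.abs().mean().item()
        mse = (errors ** 2).mean().item()
        rmse = mse ** 0.5
    return {"MAE": mae, "MSE": mse, "RMSE": rmse}
```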
Adds detailed explanations and technical documentation to the LSTM-based text generation model, including:
- Architecture overview and component relationships
- Detailed explanations of LSTM memory and sequence processing
- Character embedding and vocabulary handling
- Temperature-controlled text generation sampling (see the sketch below)
- Tensor shape transformations and data flow

The comments follow the project's documentation standards and mirror the detailed explanations in the XOR network implementation, making the code more accessible for learning purposes.

Technical notes:
- Follows Google-style docstrings
- Includes implementation details and design decisions
- Explains the role of each hyperparameter
- Documents tensor shapes and transformations
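For readers unfamiliar with temperature-controlled sampling, a minimal sketch of the idea (not the PR's exact implementation):

```python
import torch

def sample_next_char(logits: torch.Tensor, temperature: float = 1.0) -> int:
    """Temperature-controlled sampling over a vocabulary.

    temperature < 1.0 sharpens the distribution (more conservative output);
    temperature > 1.0 flattens it (more surprising output).
    """
    scaled = logits / temperature
    probs = torch.softmax(scaled, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()
```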
- Add detailed docstrings explaining RNN concepts and historical significance
- Include architectural explanations for network components
- Document the relationship between mathematical operations and intuitive concepts
- Add inline comments explaining the purpose of each layer and transformation
- Follow an educational style similar to the XOR network example

Part of the "Journey to Transformer" tutorial series.
- Add detailed module docstring explaining Word2Vec theory and implementation
- Document the model architecture and training process
- Add explanatory comments for key algorithms and data structures
- Include examples of semantic relationships in embeddings
- Explain negative sampling and context window concepts (sketched below)
- Add inline comments for code clarity and maintainability

This documentation helps developers understand both the theoretical foundations and the practical implementation details of the Word2Vec model.
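A sketch of how context windows and negative sampling combine into the (target, context, label) training pairs the trainer uses; names and the uniform negative-sampling strategy are illustrative:

```python
import random

def make_training_pairs(words, vocab, window_size=2, num_negatives=2):
    """Skip-gram pairs: (target, context, 1.0) plus random negatives (target, noise, 0.0)."""
    pairs = []
    indices = [vocab[w] for w in words if w in vocab]
    for i, target in enumerate(indices):
        lo, hi = max(0, i - window_size), min(len(indices), i + window_size + 1)
        for j in range(lo, hi):
            if j == i:
                continue
            pairs.append((target, indices[j], 1.0))            # observed context word
            for _ in range(num_negatives):                     # sampled noise words
                pairs.append((target, random.randrange(len(vocab)), 0.0))
    return pairs
```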
- Add comprehensive module docstring explaining softmax history and importance
- Include detailed function docstrings with mathematical explanations
- Add scenario-based examples with explanatory comments
- Improve print statements with educational context
- Structure code sections with clear learning objectives

This commit improves the educational value of the softmax implementation by providing deeper context and clearer explanations of the concepts; a minimal implementation is sketched below.
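A minimal, numerically stable softmax for reference (the PR's version carries the fuller educational commentary described above):

```python
import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    """Numerically stable softmax: shift by the max before exponentiating."""
    shifted = scores - np.max(scores)  # prevents overflow in exp
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Example: three class scores become a probability distribution
print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.659, 0.242, 0.099]
```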
Add educational implementation of attention mechanisms showing:
- Word embeddings and vocabulary mapping
- Simple dot-product attention scoring
- Sentence-level search functionality

This commit provides a foundational example for understanding attention mechanisms, a key component of transformer architectures.

Key features:
- CoolAttention class with word embeddings
- Sentence encoding and attention search
- Visualization of attention scores
- Detailed documentation of concepts and history

Technical details (see the sketch below):
- 5-dimensional word embeddings
- Case-sensitive word matching
- Dot-product attention scoring
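A toy sketch of the dot-product attention scoring described above; the shapes mirror the 5-dimensional embeddings, but the code is illustrative rather than the CoolAttention implementation itself:

```python
import torch

def attention_scores(query: torch.Tensor, keys: torch.Tensor) -> torch.Tensor:
    """Dot-product attention: score each key by its similarity to the query."""
    scores = keys @ query                 # (num_words, dim) @ (dim,) -> (num_words,)
    return torch.softmax(scores, dim=0)   # normalize scores into attention weights

# Toy example: four words with 5-dimensional embeddings
embeddings = torch.randn(4, 5)
weights = attention_scores(embeddings[0], embeddings)
print(weights)  # the highest weight falls on the word most similar to the query
```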
Added a new file, tokenizer_vocab.json, which contains the vocabulary mapping for the tokenizer used in the project. This file includes character-to-index mappings and special tokens such as PAD, BOS, EOS, and UNK.
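A hypothetical reconstruction of how such a vocabulary file could be generated; the exact JSON layout of tokenizer_vocab.json is not shown in this PR excerpt, so keys and the character set below are assumptions:

```python
import json

# Special tokens first, then regular characters (layout is illustrative)
special_tokens = {"pad": "<PAD>", "bos": "<BOS>", "eos": "<EOS>", "unk": "<UNK>"}
chars = "abcdefghijklmnopqrstuvwxyz .,!?"

char_to_idx = {tok: i for i, tok in enumerate(special_tokens.values())}
for ch in chars:
    char_to_idx.setdefault(ch, len(char_to_idx))

with open("tokenizer_vocab.json", "w", encoding="utf-8") as f:
    json.dump({"special_tokens": special_tokens, "char_to_idx": char_to_idx}, f, indent=2)
```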
…tracting relationships
…tracting relationships
…tracting relationships
Reviewer's Guide by Sourcery

This PR introduces several new Python files implementing various neural network architectures and concepts, from basic neural networks to transformers. The implementation follows a journey through the evolution of neural networks, with each file building upon concepts from previous ones. The PR also adds a Bayesian network implementation for medical diagnosis and a DeepDream implementation.

Class diagram for BayesianLLM and DiagnosticReasoning
```mermaid
classDiagram
class BayesianLLM {
- str model_name
- ChatOllama llm
- Dict~str, List~str~~ nodes
- Optional~BayesianNetwork~ network
- str patient_story
- Path log_file
+ __init__(model_name: str)
+ _initialize_log_file()
+ log_diagnostic_process(evidence: Dict~str, str~, diagnosis: DiagnosticReasoning) -> None
+ create_node(description: str) -> Tuple~str, List~str~~
+ extract_relationships(text: str) -> List~Tuple~str, str~~
+ build_network()
+ extract_medical_concepts(story: str) -> List~str~
+ extract_evidence(story: str) -> Dict~str, str~
+ setup_medical_network(story: str)
+ generate_explanation(evidence: Dict~str, str~) -> str
+ generate_diagnostic_reasoning(evidence: Dict~str, str~) -> DiagnosticReasoning
+ explain_decision_path(diagnosis: DiagnosticReasoning) -> str
+ verify_log_file() -> bool
}
class DiagnosticReasoning {
- str conclusion
- float confidence
- List~str~ evidence_path
- List~Tuple~str, float~~ alternative_explanations
}
```

Class diagram for MiniGPT and related classes
```mermaid
classDiagram
class MiniGPT {
- GPTConfig config
- nn.Embedding token_embedding
- nn.Embedding position_embedding
- ModuleList blocks
- nn.LayerNorm ln_f
- nn.Linear lm_head
+ __init__(config: GPTConfig)
+ _init_weights(module)
+ forward(idx, targets=None)
+ generate(idx, max_new_tokens, temperature=1.0, sample_fn=None)
}
class TransformerBlock {
- MultiHeadAttention attention
- FeedForward feed_forward
- nn.LayerNorm ln1
- nn.LayerNorm ln2
+ __init__(config)
+ forward(x)
}
class MultiHeadAttention {
- int num_heads
- int head_size
- float dropout
- nn.Linear query
- nn.Linear key
- nn.Linear value
- nn.Linear proj
- mask
+ __init__(config)
+ forward(x)
}
class FeedForward {
- nn.Sequential net
+ __init__(config)
+ forward(x)
}
class GPTConfig {
- int vocab_size
- int block_size
- int n_layer
- int n_embd
- int num_heads
- int head_size
- float dropout
+ __init__(vocab_size, block_size, n_layer=6, n_embd=384, num_heads=6, dropout=0.1)
}
MiniGPT --> TransformerBlock
TransformerBlock --> MultiHeadAttention
TransformerBlock --> FeedForward
MiniGPT --> GPTConfig
```
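A hypothetical usage sketch based only on the signatures in this diagram; the PR's actual training and sampling code may differ:

```python
import torch

# Constructor arguments taken from the GPTConfig signature above
config = GPTConfig(vocab_size=65, block_size=128, n_layer=6, n_embd=384,
                   num_heads=6, dropout=0.1)
model = MiniGPT(config)

idx = torch.zeros((1, 1), dtype=torch.long)           # a single start token
out = model.generate(idx, max_new_tokens=50, temperature=0.8)
print(out.shape)  # expected (1, 51) if generate appends tokens to the prompt
```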
Class diagram for Word2Vec and Word2VecTrainer
```mermaid
classDiagram
class Word2Vec {
- nn.Embedding target_embeddings
- nn.Embedding context_embeddings
+ __init__(vocab_size, embedding_dim)
+ forward(target_word, context_word)
+ get_embedding(word_idx)
}
class Word2VecTrainer {
- int window_size
- Dict~str, int~ vocab
- Dict~int, str~ idx_to_word
- int vocab_size
- List~Tuple~int, int, float~~ training_pairs
- Word2Vec model
- optim.Adam optimizer
- nn.BCELoss criterion
+ __init__(text, embedding_dim=64, window_size=2, min_count=5)
+ _create_training_pairs(words)
+ train(epochs=100, batch_size=24)
+ get_similar_words(word, n=5)
}
Word2VecTrainer --> Word2Vec
```

Class diagram for MoodPredictor
```mermaid
classDiagram
class MoodPredictor {
- int hidden_size
- nn.Linear input_layer
- nn.RNNCell rnn_cell
- nn.Linear output_layer
- nn.Tanh tanh
- nn.Sigmoid sigmoid
+ __init__(input_size, hidden_size, output_size)
+ forward(x, hidden=None)
}
```

Class diagram for TextPredictor and TextProcessor
```mermaid
classDiagram
class TextPredictor {
- int hidden_size
- nn.Embedding embedding
- nn.LSTM lstm
- nn.Linear fc
+ __init__(vocab_size, embedding_dim=32, hidden_size=128)
+ forward(x, hidden=None)
}
class TextProcessor {
- str chars
- Dict~str, int~ char_to_idx
- Dict~int, str~ idx_to_char
- int vocab_size
+ __init__()
+ encode(text)
+ decode(indices)
}
```

Class diagram for BasicNetwork, ImprovedNetwork, ModernNetwork, and SimpleMemoryNetwork
```mermaid
classDiagram
class BasicNetwork {
- nn.Linear layer1
- nn.Sigmoid sigmoid
- nn.Linear layer2
+ __init__(input_size, hidden_size, output_size)
+ forward(x)
}
class ImprovedNetwork {
- nn.Linear layer1
- nn.Tanh tanh
- nn.Linear layer2
+ __init__(input_size, hidden_size, output_size)
+ forward(x)
}
class ModernNetwork {
- nn.Linear layer1
- nn.BatchNorm1d bn1
- nn.ReLU relu
- nn.Linear layer2
- nn.BatchNorm1d bn2
- nn.Linear layer3
+ __init__(input_size, hidden_size, output_size)
+ forward(x)
}
class SimpleMemoryNetwork {
- int hidden_size
- nn.Linear input_gate
- nn.Linear memory_transform
- nn.Linear output_gate
- nn.Linear output
- nn.Tanh tanh
- nn.Sigmoid sigmoid
+ __init__(input_size, hidden_size, output_size)
+ forward(x, hidden_state=None)
}
```

Class diagram for XORNetwork
```mermaid
classDiagram
class XORNetwork {
- nn.Sequential layers
+ __init__()
+ forward(x)
}
```

Class diagram for DeepDreamer
```mermaid
classDiagram
class DeepDreamer {
- Inception3 model
- Dict~str, Tensor~ activations
- str layer_name
+ __init__(model_name="inception_v3", layer_name="Mixed_5b")
+ _get_activation(name)
+ preprocess_image(image_path, size=512)
+ deprocess_image(tensor)
+ dream(image_path, num_iterations=20, lr=0.01, octave_scale=1.4, num_octaves=4)
}
```

Class diagram for SimpleRNN and SimpleLSTM
```mermaid
classDiagram
class SimpleRNN {
- int hidden_size
- nn.RNNCell rnn_cell
- nn.Linear output
+ __init__(input_size, hidden_size, output_size)
+ forward(x, hidden=None)
}
class SimpleLSTM {
- nn.LSTM lstm
- nn.Linear output
+ __init__(input_size, hidden_size, output_size)
+ forward(x)
}
```

Class diagram for CoolAttention
```mermaid
classDiagram
class CoolAttention {
- List~str~ story
- Dict~str, int~ word2idx
- nn.Embedding embeddings
+ __init__()
+ encode_sentence(sentence)
+ attention_search(person)
}
```

Class diagram for SimpleClassifier
```mermaid
classDiagram
class SimpleClassifier {
- nn.Linear layer
+ __init__()
+ forward(x)
}
```
Hey @leonvanbokhorst - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider splitting the larger files (especially 20_bayes_medical_explanability.py) into smaller, more focused modules for better maintainability.
- Add type hints to function parameters and return values throughout the codebase to improve code clarity and maintainability.
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 1 issue found
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
```python
        self.optimizer = optim.Adam(self.model.parameters())
        self.criterion = nn.BCELoss()

    def _create_training_pairs(self, words):
```
suggestion (performance): Implement size limit for training pairs to prevent memory issues
Consider implementing a maximum size limit for the pairs list to prevent potential memory issues with large input texts.
```python
def _create_training_pairs(self, words, max_pairs=1000000):
```
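A possible completion of that suggestion, assuming the vocab and window_size attributes from the Word2VecTrainer class diagram above; negative sampling is omitted for brevity, so this is a sketch of the cap rather than the PR's pair-generation logic:

```python
def _create_training_pairs(self, words, max_pairs=1_000_000):
    """Build skip-gram pairs, stopping once the cap is reached."""
    pairs = []
    for i, word in enumerate(words):
        if word not in self.vocab:
            continue
        start = max(0, i - self.window_size)
        end = min(len(words), i + self.window_size + 1)
        for j in range(start, end):
            if j == i or words[j] not in self.vocab:
                continue
            pairs.append((self.vocab[word], self.vocab[words[j]], 1.0))
            if len(pairs) >= max_pairs:  # cap memory use on large input texts
                return pairs
    return pairs
```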
] | ||
) | ||
|
||
def log_diagnostic_process( |
issue (complexity): Consider extracting log entry preparation logic into a separate method to improve code organization.
The logging implementation can be simplified while maintaining all functionality and error handling. Here's a suggested refactor:
```python
def _prepare_log_entry(self, evidence: Dict[str, str], diagnosis: DiagnosticReasoning) -> Dict[str, str]:
    """Prepare log entry data with error handling"""
    network_structure = (
        [f"{cause} → {effect}" for cause, effect in self.network.edges()]
        if self.network else []
    )
    return {
        "timestamp": datetime.now().isoformat(),
        "patient_story": self.patient_story.strip(),
        "extracted_evidence": json.dumps(evidence),
        "primary_conclusion": diagnosis.conclusion,
        "confidence": str(diagnosis.confidence),
        "evidence_path": json.dumps(diagnosis.evidence_path),
        "alternative_explanations": json.dumps(diagnosis.alternative_explanations),
        "network_structure": json.dumps(network_structure),
    }

def log_diagnostic_process(self, evidence: Dict[str, str], diagnosis: DiagnosticReasoning) -> None:
    """Log diagnostic process with simplified error handling"""
    try:
        log_entry = self._prepare_log_entry(evidence, diagnosis)
        with open(self.log_file, "a", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(log_entry.keys()))
            writer.writerow(log_entry)
        logger.info(f"Successfully logged diagnostic process to {self.log_file}")
    except Exception as e:
        logger.error(f"Failed to log diagnostic process: {e}", exc_info=True)
        raise
```
This refactor:
- Extracts log entry preparation to a separate method
- Flattens the error handling structure
- Maintains all functionality and error tracking
- Makes the code flow more linear and easier to follow
) | ||
|
||
try: | ||
chain = prompt | self.llm | StrOutputParser() |
issue (code-quality): Extract code out into method (extract-method)
```python
        logger.debug(f"Cleaned medical concepts response: {content}")

        try:
            concepts = json.loads(content)
```
issue (code-quality): Extract code out into method (extract-method)
```python
        explanation = chain.invoke({"nodes": nodes_str, "evidence": evidence_str})

        return explanation
```
suggestion (code-quality): Inline variable that is immediately returned (inline-immediately-returned-variable)
```python
# Before
explanation = chain.invoke({"nodes": nodes_str, "evidence": evidence_str})
return explanation

# Suggested
return chain.invoke({"nodes": nodes_str, "evidence": evidence_str})
```
network_structure = [ | ||
f"{cause} → {effect}" for cause, effect in self.network.edges() | ||
] | ||
nodes_states = {node: states for node, states in self.nodes.items()} |
issue (code-quality): We've found these issues:
- Replace identity comprehension with call to collection constructor (identity-comprehension)
- Extract code out into method (extract-method)
- Invert any/all to simplify comparisons (invert-any-all)
Explanation
Convert list/set/tuple comprehensions that do not change the input elements into calls to the collection constructor.

Before:
```python
# List comprehensions
[item for item in coll]
[item for item in friends.names()]

# Dict comprehensions
{k: v for k, v in coll}
{k: v for k, v in coll.items()}  # Only if we know coll is a `dict`

# Unneeded call to `.items()`
dict(coll.items())  # Only if we know coll is a `dict`

# Set comprehensions
{item for item in coll}
```

After:
```python
# List comprehensions
list(iter(coll))
list(iter(friends.names()))

# Dict comprehensions
dict(coll)
dict(coll)

# Unneeded call to `.items()`
dict(coll)

# Set comprehensions
set(coll)
```

All these comprehensions are just creating a copy of the original collection. They can all be simplified by constructing the new collection directly; the resulting code is easier to read and shows the intent more clearly.
} | ||
|
||
self.char_to_idx = {token: idx for idx, token in enumerate(self.special_tokens.values())} | ||
self.idx_to_char = {idx: token for idx, token in enumerate(self.special_tokens.values())} |
issue (code-quality): Replace identity comprehension with call to collection constructor (identity-comprehension)
Explanation
Convert list/set/tuple comprehensions that do not change the input elements into calls to the collection constructor.

Before:
```python
# List comprehensions
[item for item in coll]
[item for item in friends.names()]

# Dict comprehensions
{k: v for k, v in coll}
{k: v for k, v in coll.items()}  # Only if we know coll is a `dict`

# Unneeded call to `.items()`
dict(coll.items())  # Only if we know coll is a `dict`

# Set comprehensions
{item for item in coll}
```

After:
```python
# List comprehensions
list(iter(coll))
list(iter(friends.names()))

# Dict comprehensions
dict(coll)
dict(coll)

# Unneeded call to `.items()`
dict(coll)

# Set comprehensions
set(coll)
```

All these comprehensions are just creating a copy of the original collection. They can all be simplified by constructing the new collection directly; the resulting code is easier to read and shows the intent more clearly.
Summary by Sourcery
Introduce multiple new machine learning models and algorithms, including a Bayesian network-based medical diagnosis system, a MiniGPT transformer model, a Word2Vec embedding model, an RNN mood predictor, an LSTM for character prediction, and a simple attention mechanism. Enhance the project with a deep dreamer class for generating DeepDream images and update the documentation to include detailed explanations of the historical significance and key concepts of these models.