woody35545/BioInformatics


Predict non-small cell lung cancer using DNA sequences

[Available Sequence Data]

Index 1 to 1342: non-small cell lung cancer patients (1341 samples)
Index 1342 to 2542: healthy individuals (1201 samples)

The original data (about 240 samples) consists of data from about 120 non-small cell lung cancer patients, plus data for 120 healthy individuals generated by amplifying the Reference Sequence.
The original data was then expanded by adding noise (Data Augmentation), giving a total of 2542 training samples.

[Data Used for Training]

1200 patients and 1200 healthy individuals

Training on all 2400 DNA sequences at once caused GPU resource issues, so training was split into 3 stages that each use 800 samples. (In progress)

1st train: 400 patients + 400 healthy individuals

index: (1 ~ 400) + (1343 ~ 1742)

2nd train: 400 patients + 400 healthy individuals

index: (401 ~ 800) + (1743 ~ 2142)

3rd train: 400 patients + 400 healthy individuals

index: (801 ~ 1200) + (2143 ~ 2542)
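
The three-stage split above can be reproduced with a short helper. This is a sketch only: the constant and function names are hypothetical, and the indices are treated as 1-based and inclusive, as listed in the README.

```python
PATIENT_START = 1          # first patient index used for training
NORMAL_START = 1343        # first healthy-individual index used for training
PER_CLASS_PER_STAGE = 400  # samples per class in each stage
NUM_STAGES = 3


def stage_ranges(stage):
    """Return ((patient_lo, patient_hi), (normal_lo, normal_hi)) for a
    stage numbered 1..NUM_STAGES, with 1-based inclusive bounds."""
    if not 1 <= stage <= NUM_STAGES:
        raise ValueError(f"stage must be in 1..{NUM_STAGES}, got {stage}")
    offset = (stage - 1) * PER_CLASS_PER_STAGE
    patients = (PATIENT_START + offset,
                PATIENT_START + offset + PER_CLASS_PER_STAGE - 1)
    normals = (NORMAL_START + offset,
               NORMAL_START + offset + PER_CLASS_PER_STAGE - 1)
    return patients, normals


def stage_indices(stage):
    """Return the flat list of 800 sample indices for one stage."""
    (p_lo, p_hi), (n_lo, n_hi) = stage_ranges(stage)
    return list(range(p_lo, p_hi + 1)) + list(range(n_lo, n_hi + 1))


if __name__ == "__main__":
    for stage in range(1, NUM_STAGES + 1):
        print(stage, stage_ranges(stage))
```

Loading only the indices returned by `stage_indices(stage)` for each stage keeps a single stage's 800 samples in memory at a time, which is the motivation the README gives for splitting the training.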

About

Non-small cell lung cancer diagnosis project through DNA sequence analysis
