- 🤔 Researching artificial intelligence, focusing on multimodality.
- 🎓 Currently pursuing a dual bachelor's degree in Electronic and Computer Engineering at Zhejiang University and the University of Illinois Urbana-Champaign.
- 🌱 Learning more about cloud architecture, systems design, and network traffic measurement.
- 😄 Enjoy developing my independent game and watching cartoons in my leisure time.
Proposed a novel model named MSMedCap, based on BLIP-2, a classic multimodal model. We added a parallel Segment Anything encoder alongside the vision encoder to capture fine-grained information from medical images and generate more precise diagnoses. The paper was published at ICASSP 2024, with co-first authorship. Paper title: SAM-Guided Enhanced Fine-Grained Encoding with Mixed Semantic Learning for Medical Image Captioning.
Coursework for UIUC’s ECE391 (Computer Architecture). We designed an operating system with features including scheduling, dynamic memory allocation, a graphical user interface, signals, a dynamic file system, a mouse driver, and a sound card driver, among others. The project won first place in the course's final competition.
Coursework for ZJU’s ECE445 (Senior Design). We designed a clothes-folding machine that can detect, grab, and fold clothes. We used Mask R-CNN as our computer vision model to detect clothes and built our own training dataset. We also designed an open-loop control system that uses inverse kinematics to drive the robot arm to grab the clothes, which are then folded by an automatic folding machine. The video below shows how the robot works. https://www.youtube.com/watch?v=VXa4wbYny8w