RollingWang

🎯

Focusing

Rolling RollingWang

A vision algorithm engineer~

2 followers · 19 following

Meituan
Beijing
https://rollingwang.github.io/rolling.github.io/

Starred repositories

mannaandpoem / OpenManus

No fortress, purely open ground. OpenManus is Coming.

Python 10,774 1,527 Updated Mar 7, 2025

camel-ai / camel

🐫 CAMEL: Finding the Scaling Law of Agents. The first and the best multi-agent framework. https://www.camel-ai.org

Python 7,071 813 Updated Mar 7, 2025

microsoft / TinyTroupe

LLM-powered multiagent persona simulation for imagination enhancement and business insights.

Python 6,050 483 Updated Feb 28, 2025

QwenLM / Qwen-Agent

Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.

Python 6,060 542 Updated Mar 7, 2025

feizc / FluxMusic

Text-to-Music Generation with Rectified Flow Transformers

Python 1,671 133 Updated Dec 10, 2024

InternLM / xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Python 4,340 329 Updated Feb 21, 2025

mlfoundations / open_clip

An open source implementation of CLIP.

Python 11,157 1,057 Updated Mar 1, 2025

QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,583 425 Updated Aug 7, 2024

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.

Python 7,189 553 Updated Feb 26, 2025

sunanhe / MKT

Official implementation of "Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer".

Python 125 6 Updated Nov 7, 2024

feizc / Diffusion-RWKV

Scaling RWKV-Like Architectures for Diffusion Models

Python 124 5 Updated Apr 12, 2024

yformer / EfficientSAM

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

Jupyter Notebook 2,277 153 Updated Dec 24, 2024

Q-Future / Q-Align

③[ICML2024] [IQA, IAA, VQA] All-in-one Foundation Model for visual scoring. Can efficiently fine-tune to downstream datasets.

Python 376 25 Updated Aug 12, 2024

zwx8981 / LIQE

[CVPR2023] Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective

Python 214 12 Updated Jan 10, 2025

cloneofsimo / lora

Using Low-rank adaptation to quickly fine-tune diffusion models.

Jupyter Notebook 7,231 486 Updated Mar 22, 2024

microsoft / LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 11,443 718 Updated Dec 17, 2024

lllyasviel / ControlNet

Let us control diffusion models!

Python 31,650 2,834 Updated Feb 25, 2024

CompVis / latent-diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Jupyter Notebook 12,454 1,585 Updated Feb 29, 2024

tgxs002 / HPSv2

Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

Jupyter Notebook 456 16 Updated May 24, 2024

stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image …

Python 2,096 278 Updated Mar 6, 2025

yuvalkirstain / PickScore

Python 481 29 Updated Dec 21, 2024

tgxs002 / align_sd

Better Aligning Text-to-Image Models with Human Preference. ICCV 2023

Python 274 9 Updated Jul 14, 2023

THUDM / ImageReward

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation

Python 1,318 68 Updated Jan 24, 2025

xinyu1205 / recognize-anything

Open-source and strong foundation image recognition models.

Jupyter Notebook 3,101 291 Updated Feb 18, 2025

AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI

Python 149,005 27,828 Updated Mar 4, 2025

alibaba / MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…

C++ 9,970 1,772 Updated Mar 7, 2025

UX-Decoder / Segment-Everything-Everywhere-All-At-Once

[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"

Python 4,523 421 Updated Aug 19, 2024

Eurus-Holmes / Awesome-Multimodal-Research

A curated list of Multimodal Related Research.

Python 1,336 149 Updated Aug 5, 2023

willard-yuan / awesome-cbir-papers

📝Awesome and classical image retrieval papers

1,749 291 Updated Oct 31, 2023

amusi / awesome-ai-awesomeness

A curated list of awesome awesomeness about artificial intelligence

890 118 Updated Aug 31, 2023
