
Name 'o1'? #19

Closed
huanhuan6666 opened this issue Dec 3, 2024 · 1 comment

Comments

@huanhuan6666

While the authors haven’t said outright that they’re replicating O1, the use of “o1” in the name and the strawberry icon (which really screams OpenAI's O1 logo) makes it seem like it’s closely related. In my opinion, that's a bit misleading, especially since the actual methodology doesn't seem to align with what’s known about O1.

The report uses the Open-O1 CoT Dataset (Filtered) for most of its data—over two-thirds of it, in fact. They also mention generating some additional data using MCTS, but the details on how this is done are a bit sparse. Specifically, using a "confidence score" to guide MCTS data generation seems risky, since it might just amplify the model’s inherent biases, as discussed in issue #13. It’d be great to see more transparency on how data quality is handled here.
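To make the concern concrete, here is a minimal sketch of what confidence-score-guided MCTS for reasoning-path generation might look like. This is not the authors' actual implementation — all names (`Node`, `confidence`, `mcts`) are illustrative, and the confidence function is a placeholder for a score derived from the model's own token probabilities, which is exactly where self-reinforcing bias could creep in: the search maximizes the model's confidence in its own outputs, not their correctness.

```python
import math
import random

class Node:
    """One candidate reasoning step in the search tree."""
    def __init__(self, step, parent=None):
        self.step = step        # reasoning-step text at this node
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0        # accumulated confidence reward

    def ucb(self, c=1.4):
        # Standard UCT: exploit average confidence, explore rarely-visited nodes.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits
        )

def confidence(step):
    # Placeholder: in the report this would come from the model's token
    # probabilities. Here we return a deterministic pseudo-score, so the
    # "reward" reflects only the scorer's own preferences -- the bias risk.
    random.seed(step)
    return random.random()

def mcts(root_step, candidate_steps, iterations=100):
    root = Node(root_step)
    for _ in range(iterations):
        # 1. Selection: descend by UCB until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.ucb)
        # 2. Expansion: add candidate next steps once a leaf has been visited.
        if node.visits > 0 or node is root:
            for s in candidate_steps:
                node.children.append(Node(s, parent=node))
            node = random.choice(node.children)
        # 3. Evaluation: score the node by the model's confidence alone.
        reward = confidence(node.step)
        # 4. Back-propagation: push the confidence reward up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # The most-visited child becomes the "chosen" next reasoning step.
    return max(root.children, key=lambda n: n.visits).step
```

Under this setup, `mcts("solve 12*7", ["use distributivity", "expand terms", "guess"])` would always converge toward whichever step the scorer happens to rate highest — with no external check on whether that step is actually sound, which is the transparency gap raised above.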

Fine-tuning Qwen2-7B-Instruct on CoT data is a fairly conventional approach. And using MCTS to guide reasoning steps has already been explored in many past works, such as AlphaMath and AlphaZero-like Tree-Search, which may offer valuable insights for future updates.

In conclusion, while the current work presents some interesting ideas, the name "Marco-o1" and the use of the strawberry icon could easily lead to misunderstandings about its relationship to OpenAI’s O1 model. I’m hopeful that the team will continue to refine their approach and release more innovative updates in the future. Looking forward to seeing where this work goes!

@longyuewangdcu
Collaborator

Hi Huanhuan,

Thanks for your interest.

As emphasized in the Limitations Section, this research work is inspired by OpenAI's o1 (from which the name is also derived). This work aims to explore potential approaches to shed light on the currently unclear technical roadmap for large reasoning models. Besides, our focus is on open-ended questions, and we have observed interesting phenomena in multilingual applications. However, we must acknowledge that the current model primarily exhibits o1-like reasoning characteristics and its performance still falls short of a fully realized "o1" model. This is not a one-time effort, and we remain committed to continuous optimization and ongoing improvement.

We will let you know of any updates. If you have any ideas about the "actual methodology", please share them with us (you are also welcome to join this research project).

Cheers,
Vincent
