Thoughts on going forward with adding additional model specifications to the table

## Summary

Currently there is the idea to add the following to the table:

- length of context window
- number of tokens trained

This issue is to discuss if we want to do so and what the implications are. I believe this is an important decision to make moving forward, so I would like to bring this to our attention here.

## Implications

If we want to add these, we could have one separate row per published model version. Model version here indicates the standalone model variant published by the authors. This could either be due to different model sizes (see LlaMA-7B, 13B, 33B, 65B) or due to different training procedures (MPT-7B-base, vs -instruct, -chat, -storywriter). This will have an effect on the assigned properties in our table (model size, number of tokens trained, context window, ...)

In short, including more information inside the table would lead to:
- more columns, for more properties
- more rows, as we need to differentiate between each model version (alternatively, one could indicate the span for all models in one single row, e.g 1T - 10T for number of tokens or 1024 - 4096 for context width)

with the following consequences to the audience:
- more complete information, greater level of detail
- more difficult to get a quick overview, might damage the table clarity

---
What are your thoughts on this?



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Thoughts on going forward with adding additional model specifications to the table #7

Summary

Implications

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Thoughts on going forward with adding additional model specifications to the table #7

Description

Summary

Implications

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions