
How good is the 65B model? Anyone tested it? #157

@elephantpanda

I have tried the 7B model, and while it's definitely better than GPT-2, it is not quite as good as any of the GPT-3 models. This is somewhat subjective.
How do the other models (13B, ..., 65B, etc.) compare?

For example, the 7B model succeeds with the prompt:

The expected response for a highly intelligent computer to the input "What is the capital of France?" is "

but fails with the trickier:

The expected response for a highly intelligent computer to the input "Write the alphabet backwards" is "

Has anyone got examples that show the difference between the models?
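
For anyone who wants to try the same prompts on the larger checkpoints, here is a minimal sketch. It assumes the generator object returned by example.py's load() and its generate(prompts, max_gen_len, temperature, top_p) method; the label argument and the generation settings are just placeholders:

```python
# Rough sketch for comparing checkpoints on the same prompts.
# ASSUMPTION: `generator` is the object returned by this repo's example.py
# load(), i.e. it exposes generate(prompts, max_gen_len, temperature, top_p).
# Run under torchrun with the model-parallel size matching the checkpoint
# (a single GPU for 7B, more shards for the larger models).

PROMPTS = [
    'The expected response for a highly intelligent computer to the input '
    '"What is the capital of France?" is "',
    'The expected response for a highly intelligent computer to the input '
    '"Write the alphabet backwards" is "',
]


def compare(generator, label: str) -> None:
    # Generate a short completion for each prompt and print it next to the
    # checkpoint label (e.g. "7B", "65B") so outputs can be read side by side.
    outputs = generator.generate(PROMPTS, max_gen_len=64, temperature=0.8, top_p=0.95)
    for prompt, completion in zip(PROMPTS, outputs):
        print(f"[{label}] {prompt}")
        print(f"    -> {completion}")
```

Running compare() once per checkpoint directory and diffing the printed completions would be enough to see where the bigger models start getting the trickier prompt right.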

P.S.
Is there a better place to discuss these things than the issues section of GitHub? We need a Discord server.
