
Aurora

Introduction

Aurora is a powerful large language model optimized for efficiency and low computation, built on the transformer architecture. Aurora is trained on large datasets and then fine-tuned to produce useful answers. As a low-computation, open-source model, Aurora makes LLMs simpler for everyone, both to train and to use. (NOTE: Aurora is based on highly mathematical algorithms that learn from data; because of this, Aurora can make mistakes and is not fully reliable.)

How it Works

Explaining the full architecture behind Aurora is involved, but here is a brief overview. Aurora is a decoder-only transformer and largely follows the GPT-2 architecture. First, the input text is tokenized, that is, represented as a sequence of integer token ids. The tokens then pass through an embedding layer, which assigns a vector to every token; tokens with similar meanings end up with similar embedding values. In parallel, a positional embedding encodes where each token sits in the sequence, so word order is taken into account. The token and positional embeddings are added together to form the input block. This block goes through layer normalization, then through multi-head self-attention, which scores how strongly different words relate to one another and captures long-range interactions between them. The attention output goes through layer normalization once again, and these normalized values pass through a two-layer feedforward network with a GELU activation. The final output is fed into a last feedforward layer that produces scores over the vocabulary and determines the next word.

That is generation. Training then computes the loss of the network's predictions against the actual next tokens and backpropagates it through the transformer. Once the network is fully trained, it can generate coherent, human-like text. The final step is fine-tuning, which turns the base model into a conversational chatbot. (NOTE: This process is repeated and varied depending on hyperparameters, for example the number of layers or the number of epochs. This explanation is only a brief outline that disregards hyperparameters.)
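To make the forward pass above concrete, here is a minimal sketch of such a decoder-only block in PyTorch. All names and hyperparameter values (AuroraBlock, AuroraLM, n_embd=256, and so on) are illustrative assumptions for this README, not Aurora's actual code.

```python
# A minimal sketch of a GPT-2-style decoder-only transformer, assuming PyTorch.
# Class names and hyperparameters are illustrative, not Aurora's real implementation.
import torch
import torch.nn as nn

class AuroraBlock(nn.Module):
    def __init__(self, n_embd=256, n_head=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)                      # layer norm before attention
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)                      # layer norm before the feedforward
        self.ffn = nn.Sequential(                            # two-layer feedforward with GELU
            nn.Linear(n_embd, 4 * n_embd),
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        # Causal mask: each position may only attend to itself and earlier positions.
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out                                     # residual connection
        x = x + self.ffn(self.ln2(x))                        # residual connection
        return x

class AuroraLM(nn.Module):
    def __init__(self, vocab_size=50257, block_size=128, n_embd=256, n_layer=4, n_head=4):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)      # token embedding
        self.pos_emb = nn.Embedding(block_size, n_embd)      # learned positional embedding
        self.blocks = nn.Sequential(*[AuroraBlock(n_embd, n_head) for _ in range(n_layer)])
        self.ln_f = nn.LayerNorm(n_embd)                     # final layer norm
        self.head = nn.Linear(n_embd, vocab_size, bias=False)  # scores over the vocabulary

    def forward(self, idx):
        # idx: (batch, time) tensor of token ids produced by the tokenizer.
        pos = torch.arange(idx.size(1), device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)            # add token and positional embeddings
        x = self.blocks(x)
        return self.head(self.ln_f(x))                       # next-token logits
```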

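The training step described above (compute the next-token loss, then backpropagate it through the transformer) can be sketched in a few lines. This again assumes PyTorch and reuses the illustrative AuroraLM class from the previous snippet; the optimizer choice and learning rate are assumptions as well.

```python
# A minimal sketch of one next-token training step, not Aurora's actual training code.
import torch
import torch.nn.functional as F

model = AuroraLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

def train_step(idx):
    # idx: (batch, time+1) tensor of token ids from the tokenized training corpus.
    inputs, targets = idx[:, :-1], idx[:, 1:]        # shift by one: each token predicts the next
    logits = model(inputs)                           # (batch, time, vocab_size)
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()                                  # backpropagate through the transformer
    optimizer.step()
    return loss.item()

# Example call with random token ids, just to show the shapes involved.
dummy_batch = torch.randint(0, 50257, (2, 65))
print(train_step(dummy_batch))
```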

Conclusion

Aurora leverages state-of-the-art techniques for text generation. It draws on the GPT-2 paper to choose effective yet efficient, low-weight parameters, while modifying GPT-2 to achieve stronger results. Aurora is not yet available to the public; it is currently being implemented and trained. While Aurora is not the strongest model, Sociloc plans to expand its computing power and build a larger language model.
