Not quite finished yet. Progress is currently spread across:
Dataset: https://www.kaggle.com/datasets/matthewweinberger/long-discord
Model: https://www.kaggle.com/models/matthewweinberger/longtext
Training code: https://www.kaggle.com/code/matthewweinberger/llm-v2
Gradio app (with all the code needed to deploy): https://huggingface.co/spaces/mattyhew/mattgpt