This is going to be a continuation of my ideas from last year in NaNoGenMo/2019#65
Treating vocabularies as numbering systems, and works composed from them as large numbers, to be manipulated.
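To make the idea concrete, here is a minimal sketch, not the code from last year's entry. It assumes whitespace tokenisation and assigns digits in order of first appearance; the function names `build_vocab`, `text_to_int` and `int_to_text` are hypothetical. The vocabulary size is the base, each token is a digit, and the whole text becomes one Python integer.

```python
def build_vocab(tokens):
    """Assign each distinct token a digit, in order of first appearance."""
    vocab = []   # digit -> token
    index = {}   # token -> digit
    for tok in tokens:
        if tok not in index:
            index[tok] = len(vocab)
            vocab.append(tok)
    return vocab, index


def text_to_int(tokens, index):
    """Read the token sequence as a number written in base len(index)."""
    base = len(index)
    n = 0
    for tok in tokens:
        n = n * base + index[tok]
    return n


def int_to_text(n, vocab, length):
    """Write n in base len(vocab), padded to `length` digits, as tokens.

    The length is needed because the first token seen maps to digit 0,
    and leading zeros vanish when the text is stored as an integer.
    """
    base = len(vocab)
    out = []
    for _ in range(length):
        n, digit = divmod(n, base)
        out.append(vocab[digit])
    return out[::-1]


tokens = "the cat sat on the mat".split()   # assumption: whitespace tokens
vocab, index = build_vocab(tokens)
number = text_to_int(tokens, index)
print(number)
print(" ".join(int_to_text(number * 2, vocab, len(tokens))))
```

Once the text is a number, any arithmetic on it (doubling, as in the last line) decodes back to a sequence of words from the same vocabulary.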
Following some very good advice last year, I switched focus towards the end of the month to making sure I actually had 50k words in a readable format, rather than bug-free code that was pure and true to a half-baked concept that only I was judging. It was a good exercise in project management: focus on the results that matter.
I was happy enough with the results last year. Some of the bugs and issues with the tokenisation of the source material seemed to make the output more interesting, and my attempts to fix them resulted in (if I remember correctly) less interesting output, so I embraced the glitches and accomplished the goal of producing a generated novel using a simple arithmetic operation on a text.
This round I want to:
Generalise the tokenisation to be robust against many kinds of input (I'll be using a mix of properly edited text and some OCR'd source content)
Work on formalising the tokenisation algorithm so it is repeatable / comprehensible
Overcome the challenge of converting a text of more than 100K words, like Pride and Prejudice, into an integer. With the current code this requires more than 4 GB of RAM.
Work on a shared vocab across more than one source work (4) and do some more interesting averaging or combinations.
Figure out if there is a conceptually pure way to make the text output interesting, or whether the output will really be as interesting as reading a large integer.
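As a rough illustration of the shared-vocabulary and averaging goals, the sketch below builds one sorted vocabulary over several works, encodes each work as an integer in that common base, and decodes their integer mean. Everything here is an assumption made for illustration: the helper names, the whitespace tokenisation, and the choice of a plain mean as the "combination". The `estimated_bits` helper gives a back-of-envelope size for the resulting integer; the vocabulary size passed to it below is a guess, not a measured figure.

```python
import math


def shared_vocab(works):
    """Build one sorted vocabulary covering every work (each a token list)."""
    vocab = sorted({tok for work in works for tok in work})
    index = {tok: i for i, tok in enumerate(vocab)}
    return vocab, index


def encode(tokens, index):
    """Read tokens as digits of a number in base len(index)."""
    base = len(index)
    n = 0
    for tok in tokens:
        n = n * base + index[tok]
    return n


def decode(n, vocab):
    """Write n in base len(vocab) as tokens (leading zero digits are lost)."""
    base = len(vocab)
    out = []
    while n > 0:
        n, digit = divmod(n, base)
        out.append(vocab[digit])
    return out[::-1]


def estimated_bits(n_tokens, vocab_size):
    """Approximate bit length of an n_tokens-digit number in base vocab_size."""
    return math.ceil(n_tokens * math.log2(vocab_size))


# Two toy "works" standing in for full novels.
a = "pride and prejudice".split()
b = "fear and loathing".split()
vocab, index = shared_vocab([a, b])

mean = (encode(a, index) + encode(b, index)) // 2
print(" ".join(decode(mean, vocab)))

# Hypothetical sizes: 100,000 tokens over an 8,000-word vocabulary.
print(estimated_bits(100_000, 8_000) // 8, "bytes")
```

Under those assumed sizes the integer itself comes to roughly 1.3 million bits, about 160 KB, so if the estimate is in the right range the number alone would not account for several gigabytes; this sketch cannot say where the current code spends its memory.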
hornc changed the title from "Naked Fear, Loathing, Pride, Predjudice, and Brunch at Tiffany's (in Las Vegas)." to "Naked Fear, Loathing, Pride, Prejudice, and Brunch at Tiffany's (in Las Vegas)." on Nov 4, 2020