"Where I'm From" poem & novel generator #49
he's such a card |
You can download CDs and DVDs of Project Gutenberg books here: I didn't know there is a Google Books API, I'll have to check it. |
DAY ONE

In my teaching years, this poem was everywhere:

Where I'm From

I am from clothespins,
I'm from fudge and eyeglasses,
I'm from Artemus and Billie's Branch,
Under my bed was a dress box

For my first trick, I'll be working on a poem generator (I know I know, we're building a novel, stay tuned ok) to identify the parts of speech at work here and generate new "I'm From" poems that mimic parts of speech and important sound patterns. This should be good practice in working with natural language processors in order to generate poem-length memoir-esque bits of text -- which I can then use as the base for further novel expansions. |
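The plan above (find the parts of speech, then refill those slots with new words) can be sketched roughly as a template with named slots. This is only an illustration in plain JavaScript, not the project's actual code: the slot names, the tiny `wordBanks` object, and the `fillTemplate` helper are all made up for the example, and no part-of-speech tagger is involved.

```javascript
// Hypothetical word banks; a real run would draw on much larger lists.
const wordBanks = {
  pluralNoun: ['clothespins', 'birthdays', 'nightclubs'],
  massNoun: ['fudge', 'celery', 'parsnip'],
};

// Each {slot} names the word bank to draw from.
const template = "I am from {pluralNoun},\nI'm from {massNoun} and {pluralNoun},";

// Replace every {slot} with a random word from the matching bank.
// `rng` is injectable so the output can be made repeatable.
function fillTemplate(text, banks, rng = Math.random) {
  return text.replace(/\{(\w+)\}/g, (match, slot) => {
    const bank = banks[slot];
    if (!bank) return match; // leave unknown slots untouched
    return bank[Math.floor(rng() * bank.length)];
  });
}

console.log(fillTemplate(template, wordBanks));
```

Passing a fixed `rng` (for example `() => 0`) always picks the first word in each bank, which is handy when checking the template itself.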
Not a bad start! I got RiTa loaded and working, so that's a huge step in the right direction. Next I think I need to find some word banks / corpora for specific parts of the poem (example: nature words). RiTa's proper nouns are kind of cringe-y but I'll run it more times and see if I need to substitute something else there. FYI for anyone getting started with RiTa, here's a list of the part-of-speech abbreviations: |
Yes! One of my to-do items is to make a pull request to make the PoS list
more prominent in the RiTa documentation...
|
DAY TWO

I spent a few hours this evening working on linking up random choices from custom word lists. I forked Darius's corpora repo linked in the NaNoGenMo resources and also found some good word lists on the internet for what I am looking for.

Fun fact: as a middle school English teacher, I loved word lists, or "word pools" we would sometimes call them. The walls of my classroom were plastered with posters of color words, verbs, adjectives, sensory words, etc. (until mandatory testing took over the entire Spring and they had to be covered up).

Sticking with the corpora format, the word lists are in JSON. JavaScript isn't my first programming language, so I had to google "how do I link a local JSON file to my javascript" and Y'ALL this should be a lot easier, doncha think? I did not want to involve HTML files or Ajax requests (eeek) or jQuery (no!), at least not yet, so I cheated by just making my word list files .js files and then requiring them. Like this:

Fear the Repo

Shush, You, I'll DRY it up later.

I'm pretty happy with how it's shaping up, I love using RiTa to be able to control syllable length. As a reminder, the source poem is here. I'm hoping to finish assembling the poem tomorrow, then I can figure out where I want to take it from there. |
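For anyone hitting the same JSON question: when the script runs under Node with CommonJS modules, `require()` will load and parse a `.json` file directly, so the rename to `.js` is optional. Here is a minimal sketch under that assumption. The file name, the sample words, and the `pickRandom` helper are invented for illustration (the file is written to a temp directory only so the example runs on its own).

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a made-up word list in the corpora-style JSON shape,
// standing in for a file that would normally live in the repo.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'wordlists-'));
const file = path.join(dir, 'nature.json');
fs.writeFileSync(
  file,
  JSON.stringify({ description: 'Nature words (sample)', words: ['fern', 'creek', 'moss'] })
);

// Node parses .json files passed to require() and returns the object.
const nature = require(file);

// Pick one random entry from a list.
function pickRandom(list) {
  return list[Math.floor(Math.random() * list.length)];
}

console.log(pickRandom(nature.words));
```

In a project the call would look more like `require('./wordlists/nature.json')`, with a path relative to the requiring file.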
DAY THIRD -- oh it is very late make that DAY FORTH

Just checking in with some sample output. I wasn't happy with the trees and bushes lists available to me, so I'm just inventing some instead. :D

Done through second stanza, two to go! Names list is 1,000 randomly generated names from list of random names -- if anyone wants to be added, I'm happy to add you! (ps repo is here)

I am from nightclubs,
I'm from parsnip and statistics, |
DAY FOUR (FOR REAL)

We have a completed poem!

Where I'm From

I am from birthdays,
I'm from celery and byproducts,
I'm from South Gate and Beaverton,
Above my tea cart was a aft box

For the next step I can go one of (at least) two ways:
|
I like this. |
DAY ... TEN?

Ok, after taking some time off to learn all the data structures and algorithms (or not learn, as the case may be), I needed a quick win so I came back to this and was able to publish a version of the poem generator! It's not very fancy, and probably breaks all the Node/Express rules (I am a very proficient Ruby on Rails developer seriously you should hire me), but it meets the prime objective of generating a new poem on demand.

I like this so much I am not sure how to translate it into a novel... but let's not call it "done" yet, because I'm going to sleep on that.

I found a couple open-source texts that work well for "memoir" style (Anne of Green Gables is the frontrunner), so I played with using RiTa to Markov it up. My idea was to start with the base text, and then see if there's any way to prioritize the keywords generated in the 'Where I'm From' poem (so it would be a poem followed by a short vignette featuring terms mentioned in that poem, and then more in that pattern). It's interesting, but it isn't very readable in paragraph form. So I think I need to consider another method for text generation. Which puts me back at the starting line. :)

Maybe I'll just write more poems...? #NaPoGenMo! I'm not 100% invested in the novel form, at least not for my first experiment this year, but I'm shooting to adhere to the 50,000 word count... |
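One way the "Markov it up, but prioritize the poem's keywords" idea could work is sketched below. This is a hand-rolled word-level Markov chain in plain JavaScript, swapped in for RiTa's Markov support so the example stays self-contained; `buildModel`, `generate`, and the one-line source text are all hypothetical, and the keyword rule (prefer a keyword whenever one is a legal next word) is just one possible reading of "prioritize".

```javascript
// Build a word-level Markov model: each word maps to the words seen after it.
function buildModel(text) {
  const words = text.split(/\s+/).filter(Boolean);
  const model = new Map();
  for (let i = 0; i < words.length - 1; i++) {
    if (!model.has(words[i])) model.set(words[i], []);
    model.get(words[i]).push(words[i + 1]);
  }
  return model;
}

// Walk the model from `start`. When any candidate next word is a keyword,
// choose among the keywords only; otherwise choose among all candidates.
function generate(model, start, length, keywords = new Set(), rng = Math.random) {
  const out = [start];
  let current = start;
  while (out.length < length) {
    const candidates = model.get(current);
    if (!candidates) break; // dead end: this word was never followed by anything
    const preferred = candidates.filter((w) => keywords.has(w));
    const pool = preferred.length > 0 ? preferred : candidates;
    current = pool[Math.floor(rng() * pool.length)];
    out.push(current);
  }
  return out.join(' ');
}

// Stand-in source text; a real run would load a full public-domain book.
const source = 'I am from fudge and I am from celery and I am from clothespins';
const model = buildModel(source);
console.log(generate(model, 'I', 8, new Set(['celery'])));
```

With a source this small the keyword wins every time it is reachable; on a full-length text the bias would be much gentler, and a weighted choice might read better than a hard preference.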
+1 NaAnOfGreGaGenMo! Ah, if only I wasn't already overcommitted... (There really is a NaPoGenMo too btw, but it's held in April.) |
DAY ELEVENTHEN

Some quick text to share, I'm playing with the RiTa RiLexicon to find near replacement words for a classic poem (again with the poems!!! she just won't stop...). My goal here is to generate output that is clearly recognizable, but sounds bananas.

You might be curious, what is the difference between RiTa's RiLexicon methods similarBySound(), similarByLetter(), similarBySoundAndLetter(), and rhymes()? So glad you asked... let's take a look at each of these at play! Each method returns an array of matches, so the computer is choosing a random match (or the original word) each time.

Similar by Sound

Compares the phonemes of the input word (using a version of the min-edit distance algorithm) to each word in the lexicon, returning the set of closest matches.

Two reeds divert in a yell good,

Similar by Letter

Compares the characters of the input string (using a version of the min-edit distance algorithm) to each word in the lexicon, returning the set of closest matches.

Two loads diverged in a fellow wood,

Similar by Sound and Letter

First calls similarBySound(), then filters the result set by the algorithm used in similarByLetter().

Two rods diverge in a bellow good,

Rhyme

Two words are considered as rhyming if their final stressed vowel and all following phonemes are identical.

Two episodes diverged in a mellow likelihood,

Verdict

I hadn't tried by letter before this little exercise (thinking the sound would be more important) but I actually like that output the best, here. It does seem to be keeping the sound and rhythm of the word as well. Linguistical coincidence? Edit-distance magick? Rhyme is clearly varying the most from the source text -- this could be fun to play with for replacing end words (or generating new rhyme words) but I won't use it in this "replace nearly every word" exercise.
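To make the "min-edit distance" part less mysterious, here is a rough sketch of a by-letter comparison in plain JavaScript. It is not RiTa's implementation (the descriptions above only say RiTa uses "a version of" the algorithm): `editDistance` is the textbook Levenshtein distance, and `closestByLetter` and the eight-word lexicon are made up for the example.

```javascript
// Classic Levenshtein (min-edit) distance between two strings:
// the fewest single-character inserts, deletes, or substitutions.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return every lexicon word tied for the smallest non-zero distance.
function closestByLetter(word, lexicon) {
  let best = Infinity;
  let matches = [];
  for (const candidate of lexicon) {
    const d = editDistance(word, candidate);
    if (d === 0) continue; // skip the input word itself
    if (d < best) {
      best = d;
      matches = [candidate];
    } else if (d === best) {
      matches.push(candidate);
    }
  }
  return matches;
}

// Tiny hypothetical lexicon, borrowing words from the sample lines above.
const lexicon = ['roads', 'loads', 'reeds', 'rods', 'wood', 'good', 'yellow', 'fellow'];
console.log(closestByLetter('roads', lexicon)); // [ 'loads', 'rods' ]
```

A by-sound version would run the same distance over phoneme sequences instead of characters, which needs a pronunciation dictionary and is left out here.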
Just for fun:

Alliteration

Finds alliterations by comparing the phonemes of the input string to those of each word in the lexicon.

Two razor diverged in a abuse wings,

^^Yikes, that's dark, RiTa! I won't be using this but watch this:

Two [roads organizational] diverged in a [yellow impugning] [wood whittle],

These are the word pairs it's claiming for alliteration. Some are truly weird. I feel like this would need some human editing if you were to use it in text generation, or else I might just throw out anything that doesn't start with the same letter as the base word (those all seem to work well!).

Signing off for now, I'm going to keep working on Bob Frost then see what else I can do in RiTa. |
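The "throw out anything that doesn't start with the same letter" idea is a one-line filter. A minimal sketch, with a made-up helper name and candidate lists taken from the sample output above:

```javascript
// Keep only candidates that start with the same letter as the base word.
function sameInitialLetter(base, candidates) {
  if (!base) return [];
  const initial = base[0].toLowerCase();
  return candidates.filter((w) => w.length > 0 && w[0].toLowerCase() === initial);
}

console.log(sameInitialLetter('wood', ['whittle', 'wings'])); // [ 'whittle', 'wings' ]
console.log(sameInitialLetter('roads', ['organizational', 'razor'])); // [ 'razor' ]
```

This compares spelling, not sound, so it would also drop legitimate alliterations like "knight" for "night".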
DAY THE LAST

After debating what to do with my poor poem-that-is-not-a-novel I decided to go ahead and use RiTa's Markov functionality, but use it on the poem as source material. What results is an epic memoir poem that doesn't have much plot but generates some interesting language. Not bad for a first attempt!

My #NaNovGenMo2015 Submission

And here is the source code

How I made it:
I was going to serve up the results through Express and Node just like with my poem generator, but as soon as I got close, I ran into a 'Maximum call stack size exceeded' error. So, eff that. Markdown it is!

An interesting aspect of Markdown is that it doesn't preserve all the line breaks. I played with this and ultimately decided that I liked the paragraphs/prose poem format for such a long text document, so I left it alone (for a formatted version, see my earlier attempt which does preserve line breaks).

I did discover that RiTa will occasionally generate language I wouldn't want to use in an app, so I'm curious if anyone (Darius) has already made a filter for this.

This was fun! I still have Bob Frost to play with, and coincidentally a little project I'm working on called "Walk or Not" fits well with my Ritafied poem. I learned a bunch about natural language processing this month and feel much more comfortable working with RiTa and JavaScript.

Questions or comments? I will answer what I can... if I do it again, I'll be purposeful about chapter headings or something that can break up the 50,000 words to help the flow. At this point, though, I can tinker no more. Thanks for the opportunity and see you next year! |
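For anyone who does want the line breaks to survive Markdown: a single newline inside a paragraph is rendered as a space, but a backslash at the end of a line is a hard line break in CommonMark (GitHub's flavor included). A small sketch of a converter, assuming stanzas are separated by blank lines; `toMarkdownPoem` is a hypothetical helper, not part of the submission.

```javascript
// Add a trailing backslash (a CommonMark hard line break) to every poem line
// that is followed by another line in the same stanza. Blank lines and the
// last line of each stanza are left alone so stanzas stay separate paragraphs.
function toMarkdownPoem(text) {
  const lines = text.split('\n');
  return lines
    .map((line, i) => {
      const next = lines[i + 1];
      const isLastInStanza = next === undefined || next.trim() === '';
      return line.trim() === '' || isLastInStanza ? line : line + '\\';
    })
    .join('\n');
}

console.log(toMarkdownPoem("I am from birthdays,\nI'm from celery and byproducts,"));
```

Two trailing spaces at the end of a line do the same job, but they are invisible and easy for an editor to strip.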
Some considerations:
- I'll be coding in ~~Ruby~~ JavaScript
- I'd like to try using the Goodreads API / Google Books API (or something similar) in some way
- Use text from Gutenberg or scrape from internet? I've yet to try scraping so that could be interesting (plus Gutenberg has a pretty strict anti-robot policy so texts would need to be downloaded)
- My husband's idea: find an appropriate sci-fi novel and replace all instances of "snake people" with "millennials" (I am not making this, but somebody should)