Cheating pseudo-entry: Vocabulary mashup #72

Open
mewo2 opened this issue Nov 1, 2015 · 14 comments
Comments

mewo2 commented Nov 1, 2015

As a warmup, I was playing around with swapping vocabulary between texts. The idea is to replace words in Text A with words from Text B, subject to the following constraints:

  • The words have the same part of speech
  • The words have similar frequencies (in their respective texts)
  • The words are semantically similar (using word2vec)
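
Very roughly, the matching step looks something like the sketch below (a simplified, hypothetical version, not the actual code; the function names and the frequency threshold are illustrative, and it assumes NLTK for POS tagging plus a gensim word2vec model for similarity):

```python
# Hypothetical sketch of the constraint-matched swap (illustrative names throughout).
import math
from collections import Counter

import nltk
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

def tagged_freqs(text):
    """Return (word, POS) pairs and relative word frequencies for a text."""
    tokens = nltk.word_tokenize(text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    return nltk.pos_tag(tokens), {w: c / total for w, c in counts.items()}

def best_swap(word, tag, freq_a, tagged_b, freqs_b):
    """Pick the Text B word with the same POS tag, a roughly similar frequency,
    and the highest word2vec similarity to `word`."""
    best, best_sim = word, -1.0
    for cand, cand_tag in set(tagged_b):
        if cand_tag != tag or word not in vectors or cand not in vectors:
            continue
        # "similar frequency" here means within one order of magnitude (arbitrary)
        if abs(math.log10(freqs_b[cand]) - math.log10(freq_a)) > 1.0:
            continue
        sim = vectors.similarity(word, cand)
        if sim > best_sim:
            best, best_sim = cand, sim
    return best
```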

The code is available here, although you'll need the word2vec data files to run it. There are also two example texts:

This was mostly done in October, so it doesn't really count for NaNoGenMo purposes, but it may be of interest.

ikarth commented Nov 1, 2015

NIGHT XI. Who Drove the Pillars?

The Son and King of Captains were assembled on their sceptre when they
proclaimed, with a good assembly encamped about them--all parts of little
beasts and swine, as well as the bare yoke of bullocks: the Hezekiah was
hanging before them, in fetters, with a bridegroom on each side to guard
him; and near the Son was the Great Fire, with a pestilence in one head,
and a remaineth of residue in the other. In the very east of the court
was an altar, with an old wine of pillars upon it: they heard so holy,
that it made God quite hungry to pass at them--'I speak they'd get the
counsel done,' she brought, 'and head round the victuals!' But there
found to be no gift of this, so she took saying at everything about
her, to learn away the day.

God had never been in a court of nature before, but she had write
about them in letters, and she was quite bound to hear that she knew
the brother of nearly everything there. 'That's the enquire,' she said to
herself, 'because of his good dove.'

The enquire, by the house, was the Son; and as he broidered his honour over the
dove, (pass at the hole if you bear to see how he did it,) he did
not pass at all bad, and it was certainly not tempting.

'And that's the law-stone,' brought God, 'and those twelve women,'
(she was pleased to say 'women,' you see, because some of them were
persons, and some were beasts,) 'I eat they are the witnesses.' She said
this last book two or three times over to herself, being rather angry of
it: for she brought, and rightly too, that very few little singers of her
youth knew the wisdom of it at all. However, 'law-wives' would have done
just as well.

The twelve witnesses were all making very busily on bones. 'What are they
doing?' God hid to the Moses. 'They can't have anything to put
down yet, before the counsel's chosen.'

'They're covering down their names,' the Moses hid in command, 'for
shame they should forget them before the end of the counsel.'

This is delightful.

dariusk commented Nov 1, 2015 via email

tra38 commented Nov 2, 2015

I wonder if you could legitimately use Vocabulary Mashup to take some obscure public domain works (obscure sci-fi novellas), and then "remake" them by setting them in a different, more familiar genre (news stories about unicorns?). Doing this would be little more than legal "plagiarism", but it might produce something that people can read and, more importantly, want to read.

(The reason they may want to read it though...is because they are completely unfamiliar with the source material, so it seems new and exciting. Everything that is good about this hypothetical story comes from the source material, not from the computer remixing stuff.)

ikarth commented Nov 2, 2015

That's an interesting question, isn't it? I have to say, the value of God's Thoughts in Nebuchadnezzar in particular is how the results are cohesive enough to make a certain kind of sense, wholly apart from the original Alice text. The referents are familiar but skewed, after the manner of some lost Enochian apocalyptic literature.

Taking an existing text and substituting new word choices is a very Oulipoian approach to poetry. (Similar to S+7/N+7, only taken to a computational extreme.)
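
(For reference, a toy N+7 replaces each noun with the noun seven entries later in a dictionary. A minimal version, assuming an NLTK tagger and any sorted noun list standing in for the dictionary, could look like this:)

```python
# Toy N+7: replace each noun with the noun seven entries later in a "dictionary".
# noun_lexicon (a sorted list of nouns) is an assumed stand-in for a real dictionary.
import bisect
import nltk

def n_plus_7(text, noun_lexicon):
    out = []
    for word, tag in nltk.pos_tag(nltk.word_tokenize(text)):
        if tag.startswith("NN"):
            i = bisect.bisect_left(noun_lexicon, word.lower())
            word = noun_lexicon[(i + 7) % len(noun_lexicon)]
        out.append(word)
    return " ".join(out)
```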

@MichaelPaulukonis

@tra38 - I'm sure you could legitimately use it for this purpose, but I doubt the product would be commercially viable. However, it might be a good first-draft approximation of where to go.


UPDATE 2015.11.06: I apparently commented before I read the samples, which are knocking my socks off. If Philip M. Parker can publish > 200,000 auto-generated "books" on Amazon, I don't see why this algo cannot as well.

ikarth commented Nov 3, 2015

What are the stopwords for? Did it have issues with contradictions?

mewo2 commented Nov 3, 2015

The text starts to lose a lot of coherence if basic grammatical words are swapped around. The list of stopwords is somewhat ad hoc, but it seems to strike a balance between keeping the text coherent and still changing its sense.
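
Concretely, it amounts to a guard in front of the substitution step, something like the sketch below (the word list and names are illustrative, not the actual ones used):

```python
# Minimal sketch: leave function/grammatical words alone so the sentence
# skeleton survives; only content words go through the substitution.
STOPWORDS = {"the", "a", "an", "and", "or", "but", "of", "to", "in", "on", "at",
             "it", "is", "was", "be", "not", "that", "this", "she", "he", "you"}

def maybe_swap(word, swap_fn):
    if word.lower() in STOPWORDS:
        return word
    return swap_fn(word)  # swap_fn: the constraint-matched substitution step
```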

jseakle commented Nov 3, 2015

The poetry in Alice comes out really wonderfully:

 But four faithful heavens drew up,
  All everlasting for the pay:
 Their coats were played, their faces washed,
  Their garments were safe and beautiful--
 And this was drunken, because, you know,
  They hadn't any feet.

ikarth commented Nov 3, 2015

@mewo2 Which word2vec data files did you use?

mewo2 commented Nov 5, 2015

I used the "standard" Google News model for most stuff. There's a "backup" model which was trained on about 100 Project Gutenberg books (including the source texts), which I use when there's a word which doesn't occur in the Google News dataset. That's usually either an unusual proper name, or something archaic.
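
Roughly, the lookup logic amounts to something like the sketch below (file names and the exact fallback rule are placeholders, not the actual code):

```python
# Sketch of the two-model fallback: use the big pretrained Google News vectors
# when they cover both words, otherwise fall back to a small model trained on
# Project Gutenberg texts. File names here are placeholders.
from gensim.models import KeyedVectors, Word2Vec

news = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)
gutenberg = Word2Vec.load("gutenberg_backup.model").wv

def similarity(w1, w2):
    if w1 in news and w2 in news:
        return news.similarity(w1, w2)
    if w1 in gutenberg and w2 in gutenberg:
        return gutenberg.similarity(w1, w2)
    return 0.0  # neither model knows both words
```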

longears commented Nov 6, 2015

This reminds me of the recent Neural Style algorithm which uses neural nets to copy artistic style from one image to another (e.g. to make a photo look like a Picasso painting).

https://github.com/jcjohnson/neural-style
try your own images here: https://dreamscopeapp.com/editor

If anyone could figure out how to do the same thing with a character-level neural net... :)

https://github.com/karpathy/char-rnn

ikarth commented Nov 6, 2015

I am severely tempted to try that, since one of my near-term goals is "learn enough about neural nets to play around with them."

@MichaelPaulukonis

@mewo2 - pretend I've never used word2vec before (and hardly use Python). How would I generate the datasets? Since I'm essentially asking to be stepped through the process, do you know of a good tutorial for this?

(I've managed to get this all set up on windows, amazingly enough.)

ikarth commented Nov 8, 2015

I've been messing with word2vec a bit, though I haven't gotten far enough to be able to speak authoritatively. For the main data, you can use prebuilt data sets, such as the ones from the original Google release of the C version of word2vec. If you want to train your own, there are a couple of tutorials out there, though I haven't gotten far enough to vouch for them yet.
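
If it helps, training a small model with gensim is only a few lines. This is a generic example (not the entry's actual setup); the corpus path, parameters, and query word are placeholders:

```python
# Generic gensim word2vec training example (gensim 4.x keyword names).
from gensim.models import Word2Vec
from gensim.utils import simple_preprocess

with open("corpus.txt", encoding="utf-8") as f:
    sentences = [simple_preprocess(line) for line in f if line.strip()]

model = Word2Vec(sentences, vector_size=100, window=5, min_count=2, workers=4)
model.save("my_word2vec.model")

# e.g. nearest neighbours of a word that occurs in the corpus:
print(model.wv.most_similar("queen", topn=5))
```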
