<html><head><title>parody v1.2.9 - Parody some source text with a first order Markov chain and synonym mutation from a thesaurus.</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" >
</head>
<body class='pod'>
<!--
generated by Pod::Simple::HTML v3.14,
using Pod::Simple::PullParser v3.14,
under Perl v5.012003 at Mon Nov 7 12:22:42 2011 GMT.
If you want to change this HTML document, you probably shouldn't do that
by changing it directly. Instead, see about changing the calling options
to Pod::Simple::HTML, and/or subclassing Pod::Simple::HTML,
then reconverting this document from the Pod source.
When in doubt, email the author of Pod::Simple::HTML for advice.
See 'perldoc Pod::Simple::HTML' for more info.
-->
<!-- start doc -->
<a name='___top' class='dummyTopAnchor' ></a>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="NAME"
>NAME</a></h1>
<p>parody v1.2.9 - Parody some source text with a first order Markov chain and synonym mutation from a thesaurus.</p>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="SYNOPSIS"
>SYNOPSIS</a></h1>
<p>parody [options] &lt;file|url&gt;...</p>
<pre> Basic Options:
-h --help Get full man page output
-v --verbose Verbose output with details of mutations
-d --debug Debug output
-c --colour Add colour highlights to word mutations
 --html            Output &lt;span&gt; tag colours instead of terminal colours, use with --colour
-i --input-model Load a stored model from file
-o --output-model Store a model to disk for later use with -i
-n --dont-generate Don't generate output from model
-s --start Word to start generating from
-l --length Number of words to output
-? --punctuation Include punctuation as model symbols
-u --passthru Just pass the text straight through applying text replacement operators
-g --graph-viz Print out a graph viz file of the model
Advanced Options:
-m --mutation Probability of mutation from thesaurus
-r --replace Replace a set of match words with replacements
-e --emotion Alter the emotional content of the text
-p --parts-of-speech Mutate only on words forming parts of speech
--prior-model Use a stored model as a prior for creating a new chain from input sources
--prior-weight How much should the old model be used compared to the new</pre>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="DESCRIPTION"
>DESCRIPTION</a></h1>
<p>This program can be used to produce output text that is statistically similar to its input, whilst introducing novelty in the form of per word mutations from a thesaurus.</p>
<p>A <b>Markov chain</b>, named for <i>Andrey Markov</i>, is a mathematical system that undergoes transitions from one state to another, like following the links in a chain. In this program, states and words from the text are synonymous. From a given word we calculate the expectation of each word that follows it in the text, and nothing more. This is termed the <b>Markov property</b> of the chain and refers to the memoryless property of a stochastic process. To generate the parody output we start at a word and select a new word stochastically, based on the expectation of each possible word that followed in the original text. We then move (transition) to this selected word, discarding any information about what we have previously generated. This walk of the chain continues until the specified output length is reached.</p>
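<p>As an illustration of the walk described above, the following is a minimal Perl sketch. It is not the parody source itself, and the names used (<i>%chain</i>, <i>next_word</i>) are invented for this example; it simply builds a first-order trailing-word chain from a short text and then walks it in the way the paragraph describes.</p>
<pre> # Minimal illustrative sketch (not the parody source): build a first-order
 # trailing-word Markov chain from some text and walk it to generate output.
 use strict;
 use warnings;
 use List::Util qw(sum);

 my $text  = "the cat sat on the mat and the cat slept";
 my @words = split /\s+/, $text;

 # Count how often each word follows each other word in the source.
 my %chain;
 for my $i (0 .. $#words - 1) {
     $chain{ $words[$i] }{ $words[$i + 1] }++;
 }

 # From the current word, pick the next word with probability proportional
 # to how often it followed that word in the source text.
 sub next_word {
     my ($word) = @_;
     my $counts = $chain{$word} or return undef;
     my $total  = sum values %$counts;
     my $pick   = rand($total);
     for my $candidate (keys %$counts) {
         $pick -= $counts->{$candidate};
         return $candidate if $pick &lt;= 0;
     }
 }

 my $current = 'the';    # in the spirit of --start=the
 my @output  = ($current);
 for (1 .. 20) {         # in the spirit of --length=20
     $current = next_word($current) // last;
     push @output, $current;
 }
 print "@output\n";</pre>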
<p>Mutation of output words is achieved with the use of the public domain Moby Thesaurus, and ANEW emotional content data (academic use only). These data allow for mutation of a word based on part-of-speech specific (Noun, Verb etc.) synonyms and psychological valence score (positive or negative emotion).</p>
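<p>The sketch below, again illustrative only, shows the idea of per-word mutation: with a given probability (see <b>-m, --mutation</b>) a word is looked up in a thesaurus table and replaced by a synonym, optionally filtered by an emotional valence score (see <b>-e, --emotion</b>). The <i>%thesaurus</i> and <i>%valence</i> tables here are tiny hypothetical stand-ins for the Moby Thesaurus and ANEW data, not the script's real data structures.</p>
<pre> # Illustrative sketch only: probabilistic synonym mutation with an optional
 # emotional (valence) filter. The %thesaurus and %valence tables are
 # hypothetical stand-ins for the Moby Thesaurus and ANEW data.
 use strict;
 use warnings;

 my %thesaurus = (
     dreary => [qw(bleak cheerless gloomy)],
     weak   => [qw(feeble faint frail)],
 );
 my %valence = (    # 1 (negative) .. 9 (positive), ANEW-style ratings
     bleak  => 2.2, cheerless => 2.4, gloomy => 2.5,
     feeble => 3.0, faint     => 4.0, frail  => 3.5,
 );

 sub mutate {
     my ($word, $probability, $emotion) = @_;
     return $word if rand() >= $probability;       # -m, --mutation
     my @synonyms = @{ $thesaurus{$word} // [] } or return $word;
     if (defined $emotion) {                       # -e, --emotion
         my @filtered = grep {
             defined $valence{$_}
                 and ($emotion eq 'positive' ? $valence{$_} > 5
                                             : $valence{$_} &lt; 5)
         } @synonyms;
         @synonyms = @filtered if @filtered;
     }
     return $synonyms[ rand @synonyms ];
 }

 # With --length=100 and --mutation=0.2 roughly 20 of the output words
 # would be expected to pass through mutate() and be replaced.
 print mutate('dreary', 0.5, 'negative'), "\n";</pre>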
<p>You may wish to use this program to create novel text from known inputs for various reasons. The following are some historic uses of Markov chain text parody:</p>
<p>In 1984 <i>Mark V Shaney</i>, a fake Usenet user, began making postings generated with a Markov chain technique. This script can be used to the same ends, and is especially suited to short, forced-grammar situations such as Twitter posts.</p>
<pre> Example posts: http://en.wikipedia.org/wiki/Mark_V_Shaney</pre>
<p>The well-received aleatoric poems <b>Postmortem Series</b> and <b>Accuracy</b> by the American poet <i>Jeffrey Harrison</i> were Markov chain inspired. This script is especially suited to this kind of inspired verse work, as the words of a given stanza can be mutated and their emotional content transformed. For example, transforming the works of Edgar Allan Poe to have a positive outlook is an especially successful use case.</p>
<pre> Markov poem examples: http://www.moriapoetry.com/harrison.html
Harrison's Bio: http://home.comcast.net/~jeffrey.harrison/bio.htm</pre>
<p>For examples of how to use this program with your own input, please see the <i>EXAMPLES</i> section.</p>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="OPTIONS"
>OPTIONS</a></h1>
<dl>
<dt><a name="-h,_--help"
><b>-h, --help</b></a></dt>
<dd>
<p>Print this brief help message from the command line.</p>
<dt><a name="-d,_--debug"
><b>-d, --debug</b></a></dt>
<dd>
<p>Print debug output showing how the text is being mutated with thesaurus usage.</p>
<dt><a name="-v,_--verbose"
><b>-v, --verbose</b></a></dt>
<dd>
<p>Verbose output showing how the text is changing.</p>
<dt><a name="-c,_--colour"
><b>-c, --colour</b></a></dt>
<dd>
<p>Use colour to highlight text substitutions being made. Vanilla thesaurus substitution is in blue, emotional content substitution is in green for positive change and red for negative change.</p>
<dt><a name="--html"
><b>--html</b></a></dt>
<dd>
<p>Use HTML &lt;span&gt; tags instead of ANSI terminal codes to colour the text output; only useful with the --colour option.</p>
<dt><a name="-i,_--input-model=file"
><b>-i, --input-model</b>=<i>file</i></a></dt>
<dd>
<p>Pass in a previously stored Markov Chain model.</p>
<dt><a name="-o,_--output-model=file"
><b>-o, --output-model</b>=<i>file</i></a></dt>
<dd>
<p>Pass in a location where you would like the Markov Chain for this session stored.</p>
<dt><a name="-n,_--dont-generate"
><b>-n, --dont-generate</b></a></dt>
<dd>
<p>Don't generate output; useful if you just wish to batch merge or create models.</p>
<dt><a name="-s,_--start=word"
><b>-s, --start</b>=<i>word</i></a></dt>
<dd>
<p>Specify a word to start generating text from. The default action is to pick a word at random from the source text.</p>
<dt><a name="-l,_--length=magnitude"
><b>-l, --length</b>=<i>magnitude</i></a></dt>
<dd>
<p>Define how many output words should be generated from the Markov chain.</p>
<dt><a name="-?,_--punctuation"
><b>-?, --punctuation</b></a></dt>
<dd>
<p>Force punctuation to be model symbols regardless of whitespace. This is sometimes useful for improving whitespace when parodying verse.</p>
<dt><a name="-u,_--passthru"
><b>-u, --passthru</b></a></dt>
<dd>
<p>If you wish to just change the emotional content of the source text, this flag will allow you to do that.</p>
<dt><a name="-g,_--graph-viz"
><b>-g, --graph-viz</b></a></dt>
<dd>
<p>Output to <i>STDOUT</i> a GraphViz DOT file of the Markov chain; this can then be used to visualize the model.</p>
<dt><a name="-m,_--mutation=probability"
><b>-m, --mutation</b>=<i>probability</i></a></dt>
<dd>
<p>Allow per-word mutation from the thesaurus with a given expectation 0.0 &lt; probability &lt;= 1.0. For example, if you ask for an output length of 100 words and specify a probability of 0.2, then roughly 20 words are expected to be mutated.</p>
<dt><a name="-r,_--replace_match1=replacement1_match2=replacement2_..."
><b>-r, --replace</b> <i>match1</i>=<i>replacement1</i> <i>match2</i>=<i>replacement2</i> ...</a></dt>
<dd>
<p>Define your own replacement map, for example: --replace <i>he=she</i> <i>his=hers</i> <i>him=her</i></p>
<dt><a name="-e,_--emotion=positive|negative"
><b>-e, --emotion</b>=<i>positive|negative</i></a></dt>
<dd>
<p>Force mutations with a given emotional feeling. Current options include <i>positive</i> or <i>negative</i> emotional content.</p>
<dt><a name="-p,_--parts-of-speech=list_of_parts"
><b>-p, --parts-of-speech</b>=<i>list of parts</i></a></dt>
<dd>
<p>Declare which types of words should be substituted in thesaurus mutation. By default only Nouns, Noun Phrases and Adjectives will be swapped. The default substitutions would be specified with: <b>--parts-of-speech</b>=<i>NhA</i></p>
<dl>
<dt><a name="Noun_N"
><i>Noun</i> <b>N</b></a></dt>
<dd>
<dt><a name="Plural_p"
>Plural <b>p</b></a></dt>
<dd>
<dt><a name="Noun_Phrase_h"
><i>Noun Phrase</i> <b>h</b></a></dt>
<dd>
<dt><a name="Verb_(usu_participle)_V"
>Verb (usu participle) <b>V</b></a></dt>
<dd>
<dt><a name="Verb_(transitive)_t"
>Verb (transitive) <b>t</b></a></dt>
<dd>
<dt><a name="Verb_(intransitive)_i"
>Verb (intransitive) <b>i</b></a></dt>
<dd>
<dt><a name="Adjective_A"
><i>Adjective</i> <b>A</b></a></dt>
<dd>
<dt><a name="Adverb_v"
>Adverb <b>v</b></a></dt>
<dd>
<dt><a name="Conjunction_C"
>Conjunction <b>C</b></a></dt>
<dd>
<dt><a name="Preposition_P"
>Preposition <b>P</b></a></dt>
<dd>
<dt><a name="Interjection_!"
>Interjection <b>!</b></a></dt>
<dd>
<dt><a name="Pronoun_r"
>Pronoun <b>r</b></a></dt>
<dd>
<dt><a name="Definite_Article_D"
>Definite Article <b>D</b></a></dt>
<dd>
<dt><a name="Indefinite_Article_I"
>Indefinite Article <b>I</b></a></dt>
<dd>
<dt><a name="Nominative_o"
>Nominative <b>o</b></a></dt>
</dl>
<dt><a name="--prior-model=file"
><b>--prior-model</b>=<i>file</i></a></dt>
<dd>
<p>Specify a stored model as a prior for creating a merged model from input source text. For example, you might make models of various genres of text and then use this to weight input text towards those genres.</p>
<dt><a name="--prior-weight=weight"
><b>--prior-weight</b>=<i>weight</i></a></dt>
<dd>
<p>The weighting between 0.0 and 1.0 (default 0.5) for how strongly the old model should affect the new. Old evidence will be included regardless: if a word existed in the old model but not in the new, it will still be included in the new model.</p>
<p><i>Formal description:</i> P(prior-model)*(weight) + P(input-model)*(1-weight) = P(output-model). A minimal sketch of this merge is given after this list.</p>
</dd>
</dl>
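<p>The weighted merge behind <b>--prior-model</b> and <b>--prior-weight</b> can be sketched as follows. This only illustrates the formal description above; the <i>merge_models</i> name and the nested hash layout are assumptions, not the script's own <i>loadmc</i>/<i>storemc</i> internals.</p>
<pre> # Illustrative sketch of the formal description above,
 #   P(output-model) = weight * P(prior-model) + (1 - weight) * P(input-model),
 # applied per source word. The nested hash layout is an assumption and not
 # necessarily how parody stores its models.
 use strict;
 use warnings;

 sub merge_models {
     my ($prior, $input, $weight) = @_;    # $weight as in --prior-weight
     my %merged;
     my %words = map { $_ => 1 } (keys %$prior, keys %$input);
     for my $word (keys %words) {
         my $p = $prior->{$word} // {};
         my $q = $input->{$word} // {};
         my %next = map { $_ => 1 } (keys %$p, keys %$q);
         for my $follower (keys %next) {
             $merged{$word}{$follower} =
                 $weight * ($p->{$follower} // 0)
               + (1 - $weight) * ($q->{$follower} // 0);
         }
     }
     return \%merged;
 }

 # A word seen only in the prior model still appears in the merged model,
 # as described above, just scaled down by the weight.
 my $prior = { raven => { quoth => 1.0 } };
 my $input = { raven => { flew  => 1.0 } };
 my $out   = merge_models($prior, $input, 0.3);    # biased towards the input
 printf "P(quoth|raven)=%.2f P(flew|raven)=%.2f\n",
     $out->{raven}{quoth}, $out->{raven}{flew};</pre>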
<h1><a class='u' href='#___top' title='click to go to top of document'
name="EXAMPLES"
>EXAMPLES</a></h1>
<p>Create some fake quotes from an online list:</p>
<pre> parody -c -l 200 -m 0.2 http://www.cs.ubc.ca/~bsd/quotes.txt</pre>
<p>Take some depressing text and make a happy parody:</p>
<pre> parody -c -l 200 -m 0.5 --emotion=positive edgar_allan_poe/the_raven.txt</pre>
<p>To do the same as above but instead pass through the original poem and make it more negative:</p>
<pre> parody -c --passthru --emotion=negative edgar_allan_poe/the_raven.txt</pre>
<p>Create some interesting instructions with unusual verb use:</p>
<pre> parody --start=The --length=1000 --mutation=0.01 --replace spanner=spork --parts-of-speech=Vti build_instructions.txt</pre>
<p>Create two models and merge them with bias to the second source:</p>
<pre> parody -n --output-model source1.model source1.txt
parody -n --output-model source2.model source2.txt
parody -n --output-model merged.model --prior-model source1.model --prior-weight 0.3 --input-model source2.model</pre>
<p>Visualize the model as a directed graph in SVG format with GraphViz:</p>
<pre> parody --graph-viz --input-model text.model | dot -Tsvg -o text.svg /dev/stdin</pre>
<p>Dealing with PDFs or Postscript and piped data:</p>
<pre> Use ps2ascii from the Ghostscript tools suite:
ps2ascii input.pdf | parody -l 200 /dev/stdin</pre>
<p>Produce spoken esoteric sagely advice from the Tao Te Ching:</p>
<pre> On GNU/Linux, use the Festival project:
parody -s sage -l 10 http://www.gutenberg.org/files/216/old/taote10.txt | festival --tts /dev/stdin
 On Mac OS X, use the built-in say command:
parody -s sage -l 10 http://www.gutenberg.org/files/216/old/taote10.txt | say</pre>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="AUTHOR"
>AUTHOR</a></h1>
<p><b>Matt Oates</b> - <i>mattoates@gmail.com</i></p>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="CONTRIBUTORS"
>CONTRIBUTORS</a></h1>
<p><b>Ward, G.</b> (2002). <i>Moby thesaurus II</i> http://icon.shef.ac.uk/Moby</p>
<p><b>Bradley, M.M., & Lang, P.J.</b> (1999). <i>Affective norms for English words (ANEW): Stimuli, instruction manual and affective ratings.</i> Technical report C-1, Gainesville, FL. The Center for Research in Psychophysiology, University of Florida.</p>
<p><b>Thomas Gorochowski</b> (2011) tested HTTP get with 404 and blank files.</p>
<p><b>Stephen Paulger</b> (2011) typo checking.</p>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="LICENSE_AND_COPYRIGHT"
>LICENSE AND COPYRIGHT</a></h1>
<p><b>Copyright 2011 Matt Oates</b></p>
<p>This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.</p>
<p>This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.</p>
<p>You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.</p>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="TODO"
>TODO</a></h1>
<dl>
<dt><a name="Handle_closures_and_whitespace_around_quotes_and_braces_correctly."
>Handle closures and whitespace around quotes and braces correctly.</a></dt>
<dd>
<dt><a name="Use_generative_grammar_rules_to_constrain_model_output..._this_will_break_grammatical_style_present_in_the_model."
>Use generative grammar rules to constrain model output... this will break grammatical style present in the model.</a></dt>
</dl>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="FUNCTIONS_DEFINED"
>FUNCTIONS DEFINED</a></h1>
<dl>
<dt><a name="trailingmc_-_Build_a_trailing_word_Markov_Chain_of_a_set_of_files"
><i>trailingmc</i> - Build a trailing word Markov Chain of a set of files</a></dt>
<dd>
<dt><a name="storemc_-_Store_a_Markov_Chain_Model_to_file"
><i>storemc</i> - Store a Markov Chain Model to file</a></dt>
<dd>
<dt><a name="loadmc_-_Load_stored_Markov_Chain_Model_from_file"
><i>loadmc</i> - Load stored Markov Chain Model from file</a></dt>
<dd>
<dt><a name="loadthes_-_Load_a_thesaurus_from_file"
><i>loadthes</i> - Load a thesaurus from file</a></dt>
<dd>
<dt><a name="randmut_-_Random_mutation_from_thesaurus"
><i>randmut</i> - Random mutation from thesaurus</a></dt>
<dd>
<dt><a name="emomut_-_Emotive_mutation_from_thesaurus_if_possible_otherwise_random"
><i>emomut</i> - Emotive mutation from thesaurus if possible otherwise random</a></dt>
<dd>
<dt><a name="passthru_-_Pass_the_text_through_only_performing_word_replacement"
><i>passthru</i> - Pass the text through only performing word replacement</a></dt>
</dl>
<!-- end doc -->
</body></html>