<html><head><title>parody v1.2.9 - Parody some source text with a first order Markov chain and synonym mutation from a thesaurus.</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" >
</head>
<body class='pod'>
<!--
generated by Pod::Simple::HTML v3.14,
using Pod::Simple::PullParser v3.14,
under Perl v5.012003 at Mon Nov 7 12:22:42 2011 GMT.
If you want to change this HTML document, you probably shouldn't do that
by changing it directly. Instead, see about changing the calling options
to Pod::Simple::HTML, and/or subclassing Pod::Simple::HTML,
then reconverting this document from the Pod source.
When in doubt, email the author of Pod::Simple::HTML for advice.
See 'perldoc Pod::Simple::HTML' for more info.
-->
<!-- start doc -->
<a name='___top' class='dummyTopAnchor' ></a>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="NAME"
>NAME</a></h1>
<p>parody v1.2.9 - Parody some source text with a first order Markov chain and synonym mutation from a thesaurus.</p>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="SYNOPSIS"
>SYNOPSIS</a></h1>
<p>parody [options] &lt;file|url&gt;...</p>
<pre> Basic Options:
-h --help Get full man page output
-v --verbose Verbose output with details of mutations
-d --debug Debug output
-c --colour Add colour highlights to word mutations
 --html            Output &lt;span&gt; tag colours instead of terminal colours, use with --colour
-i --input-model Load a stored model from file
-o --output-model Store a model to disk for later use with -i
-n --dont-generate Don't generate output from model
-s --start Word to start generating from
-l --length Number of words to output
-? --punctuation Include punctuation as model symbols
-u --passthru Just pass the text straight through applying text replacement operators
-g --graph-viz Print out a graph viz file of the model
Advanced Options:
-m --mutation Probability of mutation from thesaurus
-r --replace Replace a set of match words with replacements
-e --emotion Alter the emotional content of the text
-p --parts-of-speech Mutate only on words forming parts of speech
--prior-model Use a stored model as a prior for creating a new chain from input sources
--prior-weight How much should the old model be used compared to the new</pre>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="DESCRIPTION"
>DESCRIPTION</a></h1>
<p>This program can be used to produce output text that is statistically similar to its input, whilst introducing novelty in the form of per word mutations from a thesaurus.</p>
<p>A <b>Markov chain</b>, named for <i>Andrey Markov</i>, is a mathematical system that undergoes transitions from one state to another, like following the links in a chain. In this program, states and words from the text are synonymous. From a given word we calculate the expectation of each word that follows it in the text, and nothing more. This is termed the <b>Markov property</b> of the chain and refers to the memoryless property of a stochastic process. To generate the parody output we start at a word and select a new word stochastically, based on the expectation of each possible word that followed in the original text. We then move (transition) to this selected word, discarding any information about what we have previously generated. This walk of the chain continues until the specified output length is reached.</p>
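<p>As an illustration of the walk described above, the following is a minimal Perl sketch. It is not the parody source itself, and the names used (<i>%chain</i>, <i>next_word</i>) are invented for this example; it simply builds a first-order trailing-word chain from a short text and then walks it in the way the paragraph describes.</p>
<pre> # Minimal illustrative sketch (not the parody source): build a first-order
 # trailing-word Markov chain from some text and walk it to generate output.
 use strict;
 use warnings;
 use List::Util qw(sum);

 my $text  = "the cat sat on the mat and the cat slept";
 my @words = split /\s+/, $text;

 # Count how often each word follows each other word in the source.
 my %chain;
 for my $i (0 .. $#words - 1) {
     $chain{ $words[$i] }{ $words[$i + 1] }++;
 }

 # From the current word, pick the next word with probability proportional
 # to how often it followed that word in the source text.
 sub next_word {
     my ($word) = @_;
     my $counts = $chain{$word} or return undef;
     my $total  = sum values %$counts;
     my $pick   = rand($total);
     for my $candidate (keys %$counts) {
         $pick -= $counts->{$candidate};
         return $candidate if $pick &lt;= 0;
     }
 }

 my $current = 'the';    # in the spirit of --start=the
 my @output  = ($current);
 for (1 .. 20) {         # in the spirit of --length=20
     $current = next_word($current) // last;
     push @output, $current;
 }
 print "@output\n";</pre>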
<p>Mutation of output words is achieved with the use of the public domain Moby Thesaurus, and ANEW emotional content data (academic use only). These data allow for mutation of a word based on part-of-speech specific (Noun, Verb etc.) synonyms and psychological valence score (positive or negative emotion).</p>
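<p>The sketch below, again illustrative only, shows the idea of per-word mutation: with a given probability (see <b>-m, --mutation</b>) a word is looked up in a thesaurus table and replaced by a synonym, optionally filtered by an emotional valence score (see <b>-e, --emotion</b>). The <i>%thesaurus</i> and <i>%valence</i> tables here are tiny hypothetical stand-ins for the Moby Thesaurus and ANEW data, not the script's real data structures.</p>
<pre> # Illustrative sketch only: probabilistic synonym mutation with an optional
 # emotional (valence) filter. The %thesaurus and %valence tables are
 # hypothetical stand-ins for the Moby Thesaurus and ANEW data.
 use strict;
 use warnings;

 my %thesaurus = (
     dreary => [qw(bleak cheerless gloomy)],
     weak   => [qw(feeble faint frail)],
 );
 my %valence = (    # 1 (negative) .. 9 (positive), ANEW-style ratings
     bleak  => 2.2, cheerless => 2.4, gloomy => 2.5,
     feeble => 3.0, faint     => 4.0, frail  => 3.5,
 );

 sub mutate {
     my ($word, $probability, $emotion) = @_;
     return $word if rand() >= $probability;       # -m, --mutation
     my @synonyms = @{ $thesaurus{$word} // [] } or return $word;
     if (defined $emotion) {                       # -e, --emotion
         my @filtered = grep {
             defined $valence{$_}
                 and ($emotion eq 'positive' ? $valence{$_} > 5
                                             : $valence{$_} &lt; 5)
         } @synonyms;
         @synonyms = @filtered if @filtered;
     }
     return $synonyms[ rand @synonyms ];
 }

 # With --length=100 and --mutation=0.2 roughly 20 of the output words
 # would be expected to pass through mutate() and be replaced.
 print mutate('dreary', 0.5, 'negative'), "\n";</pre>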
<p>You may wish to use this program to create novel text from known inputs for various reasons. The following are some historic uses of Markov chain text parody:</p>
<p>In 1984 <i>Mark V Shaney</i>, a fake Usenet user, began making postings generated with a Markov chain technique. This script can be used to the same ends, and is especially suited to short, forced-grammar situations such as Twitter posts.</p>
<pre> Example posts: http://en.wikipedia.org/wiki/Mark_V_Shaney</pre>
<p>The well-received aleatoric poems <b>Postmortem Series</b> and <b>Accuracy</b> by the American poet <i>Jeffrey Harrison</i> were Markov chain inspired. This script is especially suited to this kind of inspired verse work, as the words of a given stanza can be mutated and their emotional content transformed. For example, transforming the works of Edgar Allan Poe to have a positive outlook is an especially successful use case.</p>
<pre> Markov poem examples: http://www.moriapoetry.com/harrison.html
Harrison's Bio: http://home.comcast.net/~jeffrey.harrison/bio.htm</pre>
<p>For examples of how to use this program with your own input, please see the <i>EXAMPLES</i> section.</p>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="OPTIONS"
>OPTIONS</a></h1>
<dl>
<dt><a name="-h,_--help"
><b>-h, --help</b></a></dt>
<dd>
<p>Print this brief help message from the command line.</p>
<dt><a name="-d,_--debug"
><b>-d, --debug</b></a></dt>
<dd>
<p>Print debug output showing how the text is being mutated with thesaurus usage.</p>
<dt><a name="-v,_--verbose"
><b>-v, --verbose</b></a></dt>
<dd>
<p>Verbose output showing how the text is changing.</p>
<dt><a name="-c,_--colour"
><b>-c, --colour</b></a></dt>
<dd>
<p>Use colour to highlight text substitutions being made. Vanilla thesaurus substitution is in blue, emotional content substitution is in green for positive change and red for negative change.</p>
<dt><a name="--html"
><b>--html</b></a></dt>
<dd>
<p>Use HTML &lt;span&gt; tags instead of ANSI terminal codes to colour the text output; only useful with the --colour option.</p>
<dt><a name="-i,_--input-model=file"
><b>-i, --input-model</b>=<i>file</i></a></dt>
<dd>
<p>Pass in a previously stored Markov Chain model.</p>
<dt><a name="-o,_--output-model=file"
><b>-o, --output-model</b>=<i>file</i></a></dt>
<dd>
<p>Pass in a location where you would like the Markov Chain for this session stored.</p>
<dt><a name="-n,_--dont-generate"
><b>-n, --dont-generate</b></a></dt>
<dd>
<p>Don't generate output; useful if you just wish to batch merge or create models.</p>
<dt><a name="-s,_--start=word"
><b>-s, --start</b>=<i>word</i></a></dt>
<dd>
<p>Specify a word to start generating text from. The default action is to pick a word at random from the source text.</p>
<dt><a name="-l,_--length=magnitude"
><b>-l, --length</b>=<i>magnitude</i></a></dt>
<dd>
<p>Define how many output words should be generated from the Markov chain.</p>
<dt><a name="-?,_--punctuation"
><b>-?, --punctuation</b></a></dt>
<dd>
<p>Force punctuation to be model symbols regardless of whitespace. This is sometimes useful for improving whitespace when parodying verse.</p>
<dt><a name="-u,_--passthru"
><b>-u, --passthru</b></a></dt>
<dd>
<p>If you wish to just change the emotional content of the source text, this flag will allow you to do that.</p>
<dt><a name="-g,_--graph-viz"
><b>-g, --graph-viz</b></a></dt>
<dd>
<p>Output to <i>STDOUT</i> a GraphViz DOT file of the Markov chain; this can then be used to visualize the model.</p>
<dt><a name="-m,_--mutation=probability"
><b>-m, --mutation</b>=<i>probability</i></a></dt>
<dd>
<p>Allow per-word mutation from the thesaurus with a given expectation 0.0 &lt; probability &lt;= 1.0. For example, if you ask for an output length of 100 words and specify a probability of 0.2, then roughly 20 words are expected to be mutated.</p>
<dt><a name="-r,_--replace_match1=replacement1_match2=replacement2_..."
><b>-r, --replace</b> <i>match1</i>=<i>replacement1</i> <i>match2</i>=<i>replacement2</i> ...</a></dt>
<dd>
<p>Define your own replacement map, for example: --replace <i>he=she</i> <i>his=hers</i> <i>him=her</i></p>
<dt><a name="-e,_--emotion=positive|negative"
><b>-e, --emotion</b>=<i>positive|negative</i></a></dt>
<dd>
<p>Force mutations with a given emotional feeling. Current options include <i>positive</i> or <i>negative</i> emotional content.</p>
<dt><a name="-p,_--parts-of-speech=list_of_parts"
><b>-p, --parts-of-speech</b>=<i>list of parts</i></a></dt>
<dd>
<p>Declare which types of words should be substituted in thesaurus mutation. By default only Nouns, Noun Phrases and Adjectives will be swapped. The default substitutions would be specified with: <b>--parts-of-speech</b>=<i>NhA</i></p>
<dl>
<dt><a name="Noun_N"
><i>Noun</i> <b>N</b></a></dt>
<dd>
<dt><a name="Plural_p"
>Plural <b>p</b></a></dt>
<dd>
<dt><a name="Noun_Phrase_h"
><i>Noun Phrase</i> <b>h</b></a></dt>
<dd>
<dt><a name="Verb_(usu_participle)_V"
>Verb (usu participle) <b>V</b></a></dt>
<dd>
<dt><a name="Verb_(transitive)_t"
>Verb (transitive) <b>t</b></a></dt>
<dd>
<dt><a name="Verb_(intransitive)_i"
>Verb (intransitive) <b>i</b></a></dt>
<dd>
<dt><a name="Adjective_A"
><i>Adjective</i> <b>A</b></a></dt>
<dd>
<dt><a name="Adverb_v"
>Adverb <b>v</b></a></dt>
<dd>
<dt><a name="Conjunction_C"
>Conjunction <b>C</b></a></dt>
<dd>
<dt><a name="Preposition_P"
>Preposition <b>P</b></a></dt>
<dd>
<dt><a name="Interjection_!"
>Interjection <b>!</b></a></dt>
<dd>
<dt><a name="Pronoun_r"
>Pronoun <b>r</b></a></dt>
<dd>
<dt><a name="Definite_Article_D"
>Definite Article <b>D</b></a></dt>
<dd>
<dt><a name="Indefinite_Article_I"
>Indefinite Article <b>I</b></a></dt>
<dd>
<dt><a name="Nominative_o"
>Nominative <b>o</b></a></dt>
</dl>
<dt><a name="--prior-model=file"
><b>--prior-model</b>=<i>file</i></a></dt>
<dd>
<p>Specify a stored model as a prior for creating a merged model from input source text. For example, you might make models of various genres of text and then use this to weight input text towards those genres.</p>
<dt><a name="--prior-weight=weight"
><b>--prior-weight</b>=<i>weight</i></a></dt>
<dd>
<p>The weighting between 0.0 and 1.0 (default 0.5) for how strongly the old model should affect the new. Old evidence will be included regardless: if a word existed in the old model but not in the new, it will still be included in the new model.</p>
<p><i>Formal description:</i> P(prior-model)*(weight) + P(input-model)*(1-weight) = P(output-model). A minimal sketch of this merge is given after this list.</p>
</dd>
</dl>
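<p>The weighted merge behind <b>--prior-model</b> and <b>--prior-weight</b> can be sketched as follows. This only illustrates the formal description above; the <i>merge_models</i> name and the nested hash layout are assumptions, not the script's own <i>loadmc</i>/<i>storemc</i> internals.</p>
<pre> # Illustrative sketch of the formal description above,
 #   P(output-model) = weight * P(prior-model) + (1 - weight) * P(input-model),
 # applied per source word. The nested hash layout is an assumption and not
 # necessarily how parody stores its models.
 use strict;
 use warnings;

 sub merge_models {
     my ($prior, $input, $weight) = @_;    # $weight as in --prior-weight
     my %merged;
     my %words = map { $_ => 1 } (keys %$prior, keys %$input);
     for my $word (keys %words) {
         my $p = $prior->{$word} // {};
         my $q = $input->{$word} // {};
         my %next = map { $_ => 1 } (keys %$p, keys %$q);
         for my $follower (keys %next) {
             $merged{$word}{$follower} =
                 $weight * ($p->{$follower} // 0)
               + (1 - $weight) * ($q->{$follower} // 0);
         }
     }
     return \%merged;
 }

 # A word seen only in the prior model still appears in the merged model,
 # as described above, just scaled down by the weight.
 my $prior = { raven => { quoth => 1.0 } };
 my $input = { raven => { flew  => 1.0 } };
 my $out   = merge_models($prior, $input, 0.3);    # biased towards the input
 printf "P(quoth|raven)=%.2f P(flew|raven)=%.2f\n",
     $out->{raven}{quoth}, $out->{raven}{flew};</pre>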
<h1><a class='u' href='#___top' title='click to go to top of document'
name="EXAMPLES"
>EXAMPLES</a></h1>
<p>Create some fake quotes from an online list:</p>
<pre> parody -c -l 200 -m 0.2 http://www.cs.ubc.ca/~bsd/quotes.txt</pre>
<p>Take some depressing text and make a happy parody:</p>
<pre> parody -c -l 200 -m 0.5 --emotion=positive edgar_allan_poe/the_raven.txt</pre>
<p>To do the same as above but instead pass through the original poem and make it more negative:</p>
<pre> parody -c --passthru --emotion=negative edgar_allan_poe/the_raven.txt</pre>
<p>Create some interesting instructions with unusual verb use:</p>
<pre> parody --start=The --length=1000 --mutation=0.01 --replace spanner=spork --parts-of-speech=Vti build_instructions.txt</pre>
<p>Create two models and merge them with bias to the second source:</p>
<pre> parody -n --output-model source1.model source1.txt
parody -n --output-model source2.model source2.txt
parody -n --output-model merged.model --prior-model source1.model --prior-weight 0.3 --input-model source2.model</pre>
<p>Visualize the model as a directed graph in SVG format with GraphViz:</p>
<pre> parody --graph-viz --input-model text.model | dot -Tsvg -o text.svg /dev/stdin</pre>
<p>Dealing with PDFs or Postscript and piped data:</p>
<pre> Use ps2ascii from the Ghostscript tools suite:
ps2ascii input.pdf | parody -l 200 /dev/stdin</pre>
<p>Produce spoken esoteric sagely advice from the Tao Te Ching:</p>
<pre> On GNU/Linux, use the Festival project:
parody -s sage -l 10 http://www.gutenberg.org/files/216/old/taote10.txt | festival --tts /dev/stdin
 On Mac OS X, use the built-in say command:
parody -s sage -l 10 http://www.gutenberg.org/files/216/old/taote10.txt | say</pre>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="AUTHOR"
>AUTHOR</a></h1>
<p><b>Matt Oates</b> - <i>mattoates@gmail.com</i></p>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="CONTRIBUTORS"
>CONTRIBUTORS</a></h1>
<p><b>Ward, G.</b> (2002). <i>Moby thesaurus II</i> http://icon.shef.ac.uk/Moby</p>
<p><b>Bradley, M.M., & Lang, P.J.</b> (1999). <i>Affective norms for English words (ANEW): Stimuli, instruction manual and affective ratings.</i> Technical report C-1, Gainesville, FL. The Center for Research in Psychophysiology, University of Florida.</p>
<p><b>Thomas Gorochowski</b> (2011) tested HTTP get with 404 and blank files.</p>
<p><b>Stephen Paulger</b> (2011) typo checking.</p>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="LICENSE_AND_COPYRIGHT"
>LICENSE AND COPYRIGHT</a></h1>
<p><b>Copyright 2011 Matt Oates</b></p>
<p>This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.</p>
<p>This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.</p>
<p>You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.</p>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="TODO"
>TODO</a></h1>
<dl>
<dt><a name="Handle_closures_and_whitespace_around_quotes_and_braces_correctly."
>Handle closures and whitespace around quotes and braces correctly.</a></dt>
<dd>
<dt><a name="Use_generative_grammar_rules_to_constrain_model_output..._this_will_break_grammatical_style_present_in_the_model."
>Use generative grammar rules to constrain model output... this will break grammatical style present in the model.</a></dt>
</dl>
<h1><a class='u' href='#___top' title='click to go to top of document'
name="FUNCTIONS_DEFINED"
>FUNCTIONS DEFINED</a></h1>
<dl>
<dt><a name="trailingmc_-_Build_a_trailing_word_Markov_Chain_of_a_set_of_files"
><i>trailingmc</i> - Build a trailing word Markov Chain of a set of files</a></dt>
<dd>
<dt><a name="storemc_-_Store_a_Markov_Chain_Model_to_file"
><i>storemc</i> - Store a Markov Chain Model to file</a></dt>
<dd>
<dt><a name="loadmc_-_Load_stored_Markov_Chain_Model_from_file"
><i>loadmc</i> - Load stored Markov Chain Model from file</a></dt>
<dd>
<dt><a name="loadthes_-_Load_a_thesaurus_from_file"
><i>loadthes</i> - Load a thesaurus from file</a></dt>
<dd>
<dt><a name="randmut_-_Random_mutation_from_thesaurus"
><i>randmut</i> - Random mutation from thesaurus</a></dt>
<dd>
<dt><a name="emomut_-_Emotive_mutation_from_thesaurus_if_possible_otherwise_random"
><i>emomut</i> - Emotive mutation from thesaurus if possible otherwise random</a></dt>
<dd>
<dt><a name="passthru_-_Pass_the_text_through_only_performing_word_replacement"
><i>passthru</i> - Pass the text through only performing word replacement</a></dt>
</dl>
<!-- end doc -->
</body></html>