Skip to content
mallyvai edited this page Sep 13, 2010 · 1 revision

Since we are creating, breeding, and interpreting programs, we first choose a representation language that meets the following criteria:

  1. Readable enough to manually program and interpret.
  2. Powerful enough to allow a wide range of hash functions to be efficiently and compactly represented.
  3. Compact enough that there are few-to-none overlapping or redundant operators.
  4. Flexible enough that syntax errors are impossible.
  5. Ability to implement all Rivest functions

Although there are a number of special-purpose assembly languages that fulfill one or two of these criteria, we were unable to find any that fulfilled all of them. We created our own assembly language with the following characteristics:

  • No registers – all operators work directly on memory.
    • Clearly impractical to implement in hardware, but this is not a design goal.
  • 2^15 words, each 68 bits large.
  • 15 operators, each allowing four 16-bit operands.
    • Every instruction does not utilize every operand.
    • MSB of every operand indicates if the remaining 15 bits refer to an immediate or an address.
  • Program counter stored in a fixed location in memory, allowing the program to modify it (i.e. branching).
  • A special operator “iterinput” that fetches the next input chunk and places it in a fixed memory location.
  • A fixed section of memory treated as an output buffer.

Although the algorithm itself views this language in terms of its machine code, the human-readable disassembly is straightforward, since instructions take the form “[ dest ] add [ op0 ] [ op1 ] [ op2 ]”. The machine code itself is just a series of binary strings. This makes programs trivial to

  • Generate – Given a template with gaps, randomly generate bitstrings of length 68 to fill the gaps
  • Breed – Simply join subsets of the parent chromsomes.
  • Mutate – AND some random number(s) with some random word(s).

Finally, it’s obvious that we can duplicate all Rivest functions in a straightforward way – we have access to all the bitwise operators used, as well as simple chunking and branching.

Clone this wiki locally