<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="assets/xml/rss.xsl" media="all"?><rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>shisaa.be</title><link>http://shisaa.be/</link><description>A blog about Programming, Unix, Japan and Photography</description><atom:link href="http://shisaa.be/rss.xml" type="application/rss+xml" rel="self"></atom:link><language>en</language><lastBuildDate>Mon, 05 Jan 2015 12:42:50 GMT</lastBuildDate><generator>http://getnikola.com/</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Postgis and PostgreSQL in Action - Timezones</title><link>http://shisaa.be/postset/postgis-and-postgresql-in-action-timezones.html</link><dc:creator>Tim van der Linden</dc:creator><description>&lt;div&gt;&lt;h3&gt;Preface&lt;/h3&gt;
&lt;p&gt;Recently, I was lucky to be part of an &lt;em&gt;awesome&lt;/em&gt; project called the &lt;a href="http://breakingboundariestour.com"&gt;Breaking Boundaries Tour&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This project is about two brothers, Omar and Greg Colin, who take their Stella scooters on a full round trip across the United States.
While they are at it, they try to raise funding for &lt;a href="http://surfershealing.org/"&gt;Surfer's Healing Folly Beach&lt;/a&gt;, an organization that does great work enhancing the lives of children with autism through surfing.
To accommodate this trip, they wished to have a site where visitors could follow their trail &lt;em&gt;live&lt;/em&gt;, as it happened.
A marker would travel across the map, with them, 24/7.&lt;/p&gt;
&lt;p&gt;Furthermore, they needed the ability to jump off their scooters, snap a few pictures, edit a video, write some side info and push it on the net, for the whole world to see.
Immediately after they made their post, it had to appear on the exact spot they were at when snapping their moments of beauty.&lt;/p&gt;
&lt;p&gt;To aid in the live tracking of their global position, they acquired a dedicated GPS tracking device which sends a latitude/longitude coordinate via a mobile data network every 5 minutes.&lt;/p&gt;
&lt;p&gt;Now, this (short) post is not about how I built the entire application, but rather about how I used PostGIS and PostgreSQL for a rather peculiar matter: deducing timezone information.&lt;/p&gt;
&lt;p&gt;For those who are interested though: the site is built entirely in Python using the Flask "micro framework" and, of course, PostgreSQL as the database.&lt;/p&gt;
&lt;h3&gt;Timezone information?&lt;/h3&gt;
&lt;p&gt;Yes. Time, dates, timezones: hairy worms in hairy cans which many developers hate to open, but have to sooner or later.&lt;/p&gt;
&lt;p&gt;In the case of Breaking Boundaries Tour, we had one major occasion where we needed the correct timezone information: where did the post happen?&lt;/p&gt;
&lt;h3&gt;Where did it happen?&lt;/h3&gt;
&lt;p&gt;A feature we wanted to implement was one to help visitors get a better view of when a certain post was written.
Seeing when a post was written in your local timezone is much more convenient than seeing the post time in some foreign zone.&lt;/p&gt;
&lt;p&gt;We are lazy and do not wish to count backward or forward to figure out when a post popped up in our frame of time.&lt;/p&gt;
&lt;p&gt;The reasoning is simple: always calculate all the times involved back to simple UTC (GMT). Then figure out the client's timezone using JavaScript, apply the time difference and done!&lt;/p&gt;
&lt;p&gt;Simple eh?&lt;/p&gt;
&lt;p&gt;Correct, except for one small detail in the feature request: in what zone was the post actually made?&lt;/p&gt;
&lt;p&gt;Well...damn.&lt;/p&gt;
&lt;p&gt;While your heart might be in the right place while thinking: "Simple, just look at the locale of the machine (laptop, mobile phone, ...) that was used to post!", this information is just too fragile. Remember, the brothers are &lt;em&gt;crossing&lt;/em&gt; the USA, riding through at least three major timezones.
You simply cannot expect all the devices involved in posting to always adjust their locale automatically depending on where they are.&lt;/p&gt;
&lt;p&gt;We need a more robust solution. We need PostGIS.&lt;/p&gt;
&lt;p&gt;But, how can a spatial database help us to figure out the timezone?&lt;/p&gt;
&lt;p&gt;Well, thanks to the hard labor delivered to us by Eric Muller from &lt;a href="http://efele.net"&gt;efele.net&lt;/a&gt;, we have a &lt;em&gt;complete&lt;/em&gt; and &lt;em&gt;maintained&lt;/em&gt; shapefile of the entire world, containing polygons that represent the different timezones accompanied by the official timezone declarations.&lt;/p&gt;
&lt;p&gt;This enables us to use the latitude and longitude information from the dedicated tracking device to pinpoint in which timezone they were while writing their post.&lt;/p&gt;
&lt;p&gt;So let me take you on a short trip to show you how I used the above data in conjunction with PostGIS and PostgreSQL.&lt;/p&gt;
&lt;h3&gt;Getting the data&lt;/h3&gt;
&lt;p&gt;The first thing to do, obviously, is to download the shapefile data and load it into our PostgreSQL database.
Navigate to the &lt;a href="http://efele.net/maps/tz/world/"&gt;Timezone World&lt;/a&gt; portion of the efele.net site and download the "tz_world" shapefile.&lt;/p&gt;
&lt;p&gt;This will give you a zip which you can extract:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;unzip tz_world.zip
&lt;/pre&gt;


&lt;p&gt;Unzipping will create a directory called "world" in which you can find the needed shapefile package files.&lt;/p&gt;
&lt;p&gt;Next you will need to make sure that your database is PostGIS ready. Connect to your desired database (let us call it &lt;em&gt;bar&lt;/em&gt;) &lt;em&gt;as a superuser&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;psql -U postgres bar
&lt;/pre&gt;


&lt;p&gt;And create the PostGIS extension:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;postgis&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now go back to your terminal and load the shapefile into your database using the original owner of the database (here called &lt;em&gt;foo&lt;/em&gt;):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;shp2pgsql -S -s &lt;span class="m"&gt;4326&lt;/span&gt; -I tz_world &lt;span class="p"&gt;|&lt;/span&gt; psql -U foo bar
&lt;/pre&gt;


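&lt;p&gt;As you might remember from the PostGIS series, this loads the geometry from the shapefile as simple geometry instead of "MULTI..." types (&lt;em&gt;-S&lt;/em&gt;), assigns it an SRID of 4326 (&lt;em&gt;-s 4326&lt;/em&gt;) and creates a GiST index on the geometry column (&lt;em&gt;-I&lt;/em&gt;).&lt;/p&gt;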
&lt;h3&gt;What have we got?&lt;/h3&gt;
&lt;p&gt;This will take a couple of seconds and will create one table and two indexes. If you describe your database (assuming you have not made any tables yourself):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;public &lt;span class="p"&gt;|&lt;/span&gt; geography_columns &lt;span class="p"&gt;|&lt;/span&gt; view     &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; geometry_columns  &lt;span class="p"&gt;|&lt;/span&gt; view     &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; raster_columns    &lt;span class="p"&gt;|&lt;/span&gt; view     &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; raster_overviews  &lt;span class="p"&gt;|&lt;/span&gt; view     &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; spatial_ref_sys   &lt;span class="p"&gt;|&lt;/span&gt; table    &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; tz_world          &lt;span class="p"&gt;|&lt;/span&gt; table    &lt;span class="p"&gt;|&lt;/span&gt; foo
public &lt;span class="p"&gt;|&lt;/span&gt; tz_world_gid_seq  &lt;span class="p"&gt;|&lt;/span&gt; sequence &lt;span class="p"&gt;|&lt;/span&gt; foo
&lt;/pre&gt;


&lt;p&gt;You will see the standard PostGIS bookkeeping and you will find the &lt;em&gt;tz_world&lt;/em&gt; table together with a &lt;em&gt;gid&lt;/em&gt; sequence.&lt;/p&gt;
&lt;p&gt;Let us describe the table:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="n"&gt;tz_world&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And get:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;Column &lt;span class="p"&gt;|&lt;/span&gt;          Type          &lt;span class="p"&gt;|&lt;/span&gt;                       Modifiers                        
--------+------------------------+--------------------------------------------------------
gid    &lt;span class="p"&gt;|&lt;/span&gt; integer                &lt;span class="p"&gt;|&lt;/span&gt; not null default nextval&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'tz_world_gid_seq'&lt;/span&gt;::regclass&lt;span class="o"&gt;)&lt;/span&gt;
tzid   &lt;span class="p"&gt;|&lt;/span&gt; character varying&lt;span class="o"&gt;(&lt;/span&gt;30&lt;span class="o"&gt;)&lt;/span&gt;  &lt;span class="p"&gt;|&lt;/span&gt; 
geom   &lt;span class="p"&gt;|&lt;/span&gt; geometry&lt;span class="o"&gt;(&lt;/span&gt;Polygon,4326&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; 
Indexes:
    &lt;span class="s2"&gt;"tz_world_pkey"&lt;/span&gt; PRIMARY KEY, btree &lt;span class="o"&gt;(&lt;/span&gt;gid&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="s2"&gt;"tz_world_geom_gist"&lt;/span&gt; gist &lt;span class="o"&gt;(&lt;/span&gt;geom&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;So we have:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;&lt;em&gt;gid&lt;/em&gt;: an arbitrary id column&lt;/li&gt;
&lt;li&gt;&lt;em&gt;tzid&lt;/em&gt;: holding the standards compliant textual timezone identification&lt;/li&gt;
&lt;li&gt;&lt;em&gt;geom&lt;/em&gt;: holding polygons in &lt;em&gt;SRID&lt;/em&gt; 4326.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Also notice we have two indexes made for us:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;&lt;em&gt;tz_world_pkey&lt;/em&gt;: a simple B-tree index on our gid&lt;/li&gt;
&lt;li&gt;&lt;em&gt;tz_world_geom_gist&lt;/em&gt;: a GiST index on our geometry&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;This is a rather nice set, would you not say?&lt;/p&gt;
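&lt;p&gt;Before relying on the import, it can be worth a quick sanity check. The query below is a minimal sketch that assumes the &lt;em&gt;tz_world&lt;/em&gt; table was loaded exactly as shown above; it reports the SRID and geometry type that actually ended up in the table, together with the number of polygons and distinct timezone identifiers.&lt;/p&gt;

```sql
-- Sanity check of the imported shapefile (assumes the tz_world table from above).
SELECT ST_SRID(geom)        AS srid,
       GeometryType(geom)   AS geometry_type,
       count(*)             AS polygons,
       count(DISTINCT tzid) AS timezones
FROM tz_world
GROUP BY ST_SRID(geom), GeometryType(geom);
```

&lt;p&gt;If the import went as described, a single row with SRID 4326 and type POLYGON should come back; the counts depend on the version of the shapefile you downloaded.&lt;/p&gt;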
&lt;h3&gt;Using the data&lt;/h3&gt;
&lt;p&gt;So how do we go about using this data?&lt;/p&gt;
&lt;p&gt;As I have said above, we need to figure out in which polygon (timezone) a certain point resides.&lt;/p&gt;
&lt;p&gt;Let us take an arbitrary point on the earth:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;latitude: 35.362852&lt;/li&gt;
&lt;li&gt;longitude: 140.196131&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;This is a spot in the Chiba prefecture, central Japan.&lt;/p&gt;
&lt;p&gt;Using the &lt;em&gt;Simple Features functions&lt;/em&gt; we have available in PostGIS, it is trivial to find out in which polygon a certain point resides:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;tzid&lt;/span&gt; 
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;tz_world&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ST_Intersects&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ST_GeomFromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POINT(140.196131 35.362852)'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4326&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;geom&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;    tzid    
------------
 Asia/Tokyo
&lt;/pre&gt;


&lt;p&gt;&lt;em&gt;Awesome!&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;In the above query I used the function &lt;em&gt;ST_Intersects&lt;/em&gt; which checks if a given piece of geometry (our point) &lt;em&gt;shares any space&lt;/em&gt; with another piece.
If we check the execution plan of this query:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;tzid&lt;/span&gt; 
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;tz_world&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ST_Intersects&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ST_GeomFromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POINT(140.196131 35.362852)'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4326&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;geom&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;                                                      QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
Index Scan using tz_world_geom_gist on tz_world  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.28..8.54 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;15&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.591..0.592 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
    Index Cond: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'0101000020E61000006BD784B446866140E3A430EF71AE4140'&lt;/span&gt;::geometry &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; geom&lt;span class="o"&gt;)&lt;/span&gt;
    Filter: _st_intersects&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'0101000020E61000006BD784B446866140E3A430EF71AE4140'&lt;/span&gt;::geometry, geom&lt;span class="o"&gt;)&lt;/span&gt;
Total runtime: 0.617 ms
&lt;/pre&gt;


&lt;p&gt;That is not bad at all: a runtime of a little over 0.6 milliseconds, and it is using our GiST index.&lt;/p&gt;
&lt;p&gt;But, if a lookup is using our GiST index, a small alarm bell should go off inside your head. Remember the last chapter of my PostGIS series?
I kept on babbling about index usage and how geometry functions or operators can only use GiST indexes when they perform &lt;em&gt;bounding box&lt;/em&gt; calculations.&lt;/p&gt;
&lt;p&gt;The latter might pose a problem in our case, for bounding boxes are a &lt;em&gt;very&lt;/em&gt; rough approximation of the actual geometry.
This means that when we arrive near timezone borders, our calculations might just give us the wrong timezone.&lt;/p&gt;
&lt;p&gt;So how can we fix this?&lt;/p&gt;
&lt;p&gt;This time, we do not need to.&lt;/p&gt;
&lt;p&gt;This is one of the few &lt;em&gt;blessed&lt;/em&gt; functions that both makes use of an index &lt;em&gt;and&lt;/em&gt; is very accurate.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;ST_Intersects&lt;/em&gt; first uses the index to perform bounding box calculations. This filters out the majority of available geometry.
Then it performs a more expensive, but more accurate calculation (on a small subset) to check if the given point is &lt;em&gt;really&lt;/em&gt; inside the returned matches.&lt;/p&gt;
&lt;p&gt;We can thus simply use this function without any more magic...life is simple!&lt;/p&gt;
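&lt;p&gt;To make the two steps visible, here is a small illustrative sketch (not part of the original application) that runs both tests side by side for the same point. It uses &lt;em&gt;ST_Envelope&lt;/em&gt; to compute each polygon's bounding box explicitly, so it will not use the GiST index and is only meant for exploration.&lt;/p&gt;

```sql
-- Compare the cheap bounding box test with the exact test for one point.
-- ST_Envelope(geom) is the bounding box of a timezone polygon.
SELECT tz.tzid,
       ST_Intersects(ST_Envelope(tz.geom), p.pt) AS in_bounding_box,
       ST_Intersects(tz.geom, p.pt)              AS in_polygon
FROM tz_world AS tz
CROSS JOIN (
    SELECT ST_SetSRID(ST_MakePoint(140.196131, 35.362852), 4326) AS pt
) AS p
WHERE ST_Intersects(ST_Envelope(tz.geom), p.pt);
```

&lt;p&gt;Rows where &lt;em&gt;in_bounding_box&lt;/em&gt; is true but &lt;em&gt;in_polygon&lt;/em&gt; is false are the candidates that the index step lets through and the exact step then rejects.&lt;/p&gt;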
&lt;h3&gt;Implementation&lt;/h3&gt;
&lt;p&gt;Now it is fair to say that we do not wish to perform this calculation every time a user views a post; that would be neither efficient nor smart.&lt;/p&gt;
&lt;p&gt;Rather, it is a good idea to generate this information at post time, and save it for later use.&lt;/p&gt;
&lt;p&gt;The way I have set up saving this information is twofold:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;I only save a UTC (GMT) generalized timestamp of when the post was made.&lt;/li&gt;
&lt;li&gt;I made an extra column in my so-called "posts" table where I only save the string that represents the timezone (Asia/Tokyo in the above case).&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;This keeps the date/time information in the database naive of any timezone and makes for easier calculations to give the time in either the client's timezone or in the timezone in which the post was originally written.
You simply have one "root" time which you can move around timezones.&lt;/p&gt;
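&lt;p&gt;As an illustration of moving that "root" time around, the following sketch converts a naive UTC timestamp into the wall clock time of a stored timezone string using PostgreSQL's &lt;em&gt;AT TIME ZONE&lt;/em&gt;. The literal values are examples, and the column name &lt;em&gt;posted_at&lt;/em&gt; in the second query is an assumption, since the actual column name is not given here.&lt;/p&gt;

```sql
-- A naive UTC timestamp plus a tzid string; the literal values are examples only.
-- The first AT TIME ZONE declares the naive value to be UTC,
-- the second one converts it to local time in the target zone.
SELECT (timestamp '2014-08-20 10:00:00' AT TIME ZONE 'UTC')
           AT TIME ZONE 'Asia/Tokyo' AS local_post_time;
-- local_post_time: 2014-08-20 19:00:00

-- The same against the posts table; "posted_at" is a hypothetical column name.
SELECT pid,
       (posted_at AT TIME ZONE 'UTC') AT TIME ZONE tzid AS local_post_time
FROM posts;
```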
&lt;p&gt;I have created a trigger that, on every insert of a new post, fetches the timezone and stores it in the designated column.
You could also fetch the timezone and update the post record using Python, but opting for an in-database solution saves you a few unneeded round trips and is most likely a lot faster.&lt;/p&gt;
&lt;p&gt;Let us see how we could create such a trigger.&lt;/p&gt;
&lt;p&gt;A trigger in PostgreSQL is a hook you can set to fire when a certain event happens on a table. The actions to perform when it fires have to be encapsulated inside a PostgreSQL function.
Let us thus first start by creating the function that will insert our timezone string.&lt;/p&gt;
&lt;h3&gt;Creating functions&lt;/h3&gt;
&lt;p&gt;In PostgreSQL you can write functions in either &lt;em&gt;C&lt;/em&gt;, &lt;em&gt;procedural&lt;/em&gt; languages (PL/pgSQL, PL/Perl, PL/Python) or plain &lt;em&gt;SQL&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Creating functions with plain SQL is the most straightforward and easiest way. However, since we want to write a function that is to be used inside a trigger, we have an even better option.
We can employ the power of the embedded PostgreSQL procedural language to easily access and manipulate our newly inserted data.&lt;/p&gt;
&lt;p&gt;First, let us see which query we would use to fetch the timezone and update our post record:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt;
  &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;tzid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tzid&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;tzid&lt;/span&gt;
          &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;tz_world&lt;/span&gt;
          &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ST_Intersects&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;ST_SetSRID&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="n"&gt;ST_MakePoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;140&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;196131&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;35&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;362852&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="mi"&gt;4326&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
          &lt;span class="n"&gt;geom&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;timezone&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This query will fetch the timezone string using a subquery and then update the correct record (a post with "pid" 1 in this example).&lt;/p&gt;
&lt;p&gt;How do we pour this into a function?&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;set_timezone&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt;
    &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;tzid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tzid&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;tzid&lt;/span&gt; 
            &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;tz_world&lt;/span&gt;
              &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ST_Intersects&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;ST_SetSRID&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                  &lt;span class="n"&gt;ST_MakePoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;longitude&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;latitude&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="mi"&gt;4326&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
              &lt;span class="n"&gt;geom&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;timezone&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;PLPGSQL&lt;/span&gt; &lt;span class="k"&gt;VOLATILE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;First we use the syntax &lt;em&gt;CREATE OR REPLACE FUNCTION&lt;/em&gt; to indicate we want to create (or replace) a custom function.
Then we tell PostgreSQL that this function will return type &lt;em&gt;TRIGGER&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;You might notice that we do not give this function any arguments. The reasoning here is that this function is "special".
Functions which are used as triggers automatically get access to information about the inserted data.&lt;/p&gt;
&lt;p&gt;Inside the function you can see we access our latitude and longitude prefixed with &lt;em&gt;NEW&lt;/em&gt;. The keywords &lt;em&gt;NEW&lt;/em&gt; and &lt;em&gt;OLD&lt;/em&gt; refer to the new and the previous version of the &lt;em&gt;record&lt;/em&gt; the trigger fires for.
In our case only &lt;em&gt;NEW&lt;/em&gt; is usable: an insert has no previous version of the row, so &lt;em&gt;OLD&lt;/em&gt; holds no data in an &lt;em&gt;INSERT&lt;/em&gt; trigger.
There are more keywords available (&lt;em&gt;TG_NAME&lt;/em&gt;, &lt;em&gt;TG_RELID&lt;/em&gt;, &lt;em&gt;TG_NARGS&lt;/em&gt;, ...) which refer to properties of the trigger itself, but that is beyond today's scope.&lt;/p&gt;
&lt;p&gt;The function body is wrapped between double dollar signs (&lt;em&gt;$$&lt;/em&gt;). This is called &lt;em&gt;dollar quoting&lt;/em&gt; and is the preferred way to quote your function body (as opposed to using single quotes).
The body of the function, which in our case is mostly the SQL statement, is surrounded with a &lt;em&gt;BEGIN&lt;/em&gt; and &lt;em&gt;END&lt;/em&gt; keyword.&lt;/p&gt;
&lt;p&gt;A trigger function always needs a &lt;em&gt;RETURN&lt;/em&gt; statement. For a row-level &lt;em&gt;BEFORE&lt;/em&gt; trigger the returned record is the row that will be stored; for an &lt;em&gt;AFTER&lt;/em&gt; trigger, as in our case, the returned value is ignored, but the statement is still required. This too has to reside in the body of the function.&lt;/p&gt;
&lt;p&gt;Near the end of our function we need to declare in which language this function was written, in our case &lt;em&gt;PLPGSQL&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Finally, the &lt;em&gt;VOLATILE&lt;/em&gt; keyword tells PostgreSQL that this function has side effects: it modifies the database, so the planner may not cache its result or optimize calls away.
&lt;em&gt;VOLATILE&lt;/em&gt; is the default, but stating it makes the intent explicit. A function that runs an &lt;em&gt;UPDATE&lt;/em&gt; cannot be marked &lt;em&gt;IMMUTABLE&lt;/em&gt; or &lt;em&gt;STABLE&lt;/em&gt;; PostgreSQL refuses to execute the &lt;em&gt;UPDATE&lt;/em&gt; inside such a function.&lt;/p&gt;
&lt;h3&gt;Creating triggers&lt;/h3&gt;
&lt;p&gt;Now that we have this functionality wrapped into a tiny PLPGSQL function, we can go ahead and create the trigger.&lt;/p&gt;
&lt;p&gt;First you have the events on which a trigger can execute:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;INSERT&lt;/li&gt;
&lt;li&gt;UPDATE&lt;/li&gt;
&lt;li&gt;DELETE&lt;/li&gt;
&lt;li&gt;TRUNCATE&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Next, for each event you can specify at what timing your trigger has to fire:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;BEFORE&lt;/li&gt;
&lt;li&gt;AFTER&lt;/li&gt;
&lt;li&gt;INSTEAD OF&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The last one is a special timing, available only on views, by which you can replace the default behavior of the mentioned events.&lt;/p&gt;
&lt;p&gt;For our use case, we are interested in executing our function &lt;em&gt;AFTER INSERT&lt;/em&gt;.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="n"&gt;set_timezone&lt;/span&gt;
    &lt;span class="k"&gt;AFTER&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt;
    &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;EACH&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;
    &lt;span class="k"&gt;EXECUTE&lt;/span&gt; &lt;span class="k"&gt;PROCEDURE&lt;/span&gt; &lt;span class="n"&gt;set_timezone&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will set up the trigger that fires after the insert of a new record.&lt;/p&gt;
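&lt;p&gt;For comparison, the same result can be reached with a &lt;em&gt;BEFORE INSERT&lt;/em&gt; trigger that fills the column on the row that is about to be stored, which avoids the extra &lt;em&gt;UPDATE&lt;/em&gt;. The following is a sketch of that alternative, not the setup used on the site; the names &lt;em&gt;set_timezone_before&lt;/em&gt; are made up for this example, and it assumes the same &lt;em&gt;posts&lt;/em&gt; columns (&lt;em&gt;longitude&lt;/em&gt;, &lt;em&gt;latitude&lt;/em&gt;, &lt;em&gt;tzid&lt;/em&gt;) as above.&lt;/p&gt;

```sql
-- Alternative sketch: set the timezone on the incoming row itself.
-- Function and trigger names are hypothetical.
CREATE OR REPLACE FUNCTION set_timezone_before() RETURNS TRIGGER AS $$
BEGIN
    -- Look up the timezone polygon that contains the new post's position.
    -- If no polygon matches (for example at sea), tzid simply stays NULL.
    NEW.tzid := (
        SELECT tz.tzid
        FROM tz_world AS tz
        WHERE ST_Intersects(
            ST_SetSRID(ST_MakePoint(NEW.longitude, NEW.latitude), 4326),
            tz.geom)
        LIMIT 1
    );
    -- In a BEFORE trigger the returned record is the row that gets stored.
    RETURN NEW;
END $$
LANGUAGE plpgsql;

CREATE TRIGGER set_timezone_before
    BEFORE INSERT ON posts
    FOR EACH ROW
    EXECUTE PROCEDURE set_timezone_before();
```

&lt;p&gt;You would use this instead of, not next to, the &lt;em&gt;AFTER INSERT&lt;/em&gt; trigger shown above.&lt;/p&gt;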
&lt;h3&gt;Wrapping it up&lt;/h3&gt;
&lt;p&gt;Good, that is all there is to it.&lt;/p&gt;
&lt;p&gt;We use a query, wrapped in a function, triggered by an insert event to inject the official timezone string which is deduced by PostGIS's spatial abilities.&lt;/p&gt;
&lt;p&gt;Now you can use this information to get the exact timezone of where the post was made and use this to present the surfing client both the post timezone time and their local time.&lt;/p&gt;
&lt;p&gt;For the curious ones out there: I used the &lt;a href="http://momentjs.com/" title="MomentJS JavaScript library"&gt;MomentJS&lt;/a&gt; library for the client side time parsing. This library offers a timezone extension which accepts these official timezone strings to calculate offsets. A lifesaver, so go check it out.&lt;/p&gt;
&lt;p&gt;Also, be sure to follow the bros while they scooter across the States!&lt;/p&gt;
&lt;p&gt;And as always...thanks for reading!&lt;/p&gt;&lt;/div&gt;</description><category>postgis</category><category>postgresql</category><category>timezone</category><guid>http://shisaa.be/postset/postgis-and-postgresql-in-action-timezones.html</guid><pubDate>Wed, 20 Aug 2014 10:00:00 GMT</pubDate></item><item><title>Postgis, PostgreSQL's spatial partner - Part 3</title><link>http://shisaa.be/postset/postgis-postgresqls-spatial-partner-part-3.html</link><dc:creator>Tim van der Linden</dc:creator><description>&lt;div&gt;&lt;p&gt;You have arrived at the final chapter of this PostGIS introduction series. Before continuing, I recommend you read &lt;a href="http://shisaa.be/postset/postgis-postgresqls-spatial-partner-part-1.html" title="Part one of this series."&gt;chapter one&lt;/a&gt; and &lt;a href="http://shisaa.be/postset/postgis-postgresqls-spatial-partner-part-2.html" title="Part two of this series."&gt;chapter two&lt;/a&gt; first.&lt;/p&gt;
&lt;p&gt;In the last chapter we finished by doing some real world distance measuring and we saw how different projections produced different results.&lt;/p&gt;
&lt;p&gt;Today I would like to take this practical approach a bit further and continue our work with real world data by showing you around the town of Kin in Okinawa. The town where I live.&lt;/p&gt;
&lt;h3&gt;A word before we start&lt;/h3&gt;
&lt;p&gt;In this chapter I want to do a few experiments together with you on real world data.
To gather this data, I would like to use OpenStreetMap because it is not only &lt;em&gt;open&lt;/em&gt; but also gives us handy tools to export map information.&lt;/p&gt;
&lt;p&gt;We will use a tool called &lt;em&gt;osm2pgsql&lt;/em&gt; to load our OSM data into PostGIS-enabled tables.&lt;/p&gt;
&lt;p&gt;However, it is more common to import and export real world GIS data by using the semi-closed ESRI standard &lt;em&gt;shapefile&lt;/em&gt; format.
OpenStreetMap does not support exporting to this shapefile format directly, but exports to a more open XML file (.osm) instead.&lt;/p&gt;
&lt;p&gt;Therefore, near the end of this post, we will briefly cover these shapefiles as well and see how we could import them into our PostgreSQL database.
But for the majority of our work today, I will focus on the OpenStreetMap approach.&lt;/p&gt;
&lt;h3&gt;The preparation&lt;/h3&gt;
&lt;p&gt;Let us commence with this adventure by first getting all the GIS data related to the whole of Okinawa.
We will only be interested in the data related to Kin town, but I need you to pull in a data set that is large enough (but still tiny in PostgreSQL terms) for us to experiment with indexing.&lt;/p&gt;
&lt;p&gt;Hop online and download the file being served at the following URL: &lt;a href="http://overpass-api.de/api/map?bbox=126.079,25.596,130.852,28.898"&gt;openstreetmap.org Okinawa island&lt;/a&gt;.
It is a file of roughly 180 MB and covers most of the Okinawan main island. Save the presented "map" file.&lt;/p&gt;
&lt;p&gt;Next we will need to install a third party tool which is specifically designed to import this OSM file into PostGIS.
This tool is called &lt;em&gt;osm2pgsql&lt;/em&gt; and is available in many Linux distributions.&lt;/p&gt;
&lt;p&gt;On a Debian system:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;apt-get install osm2pgsql
&lt;/pre&gt;


&lt;h3&gt;Loading foreign data&lt;/h3&gt;
&lt;p&gt;Now we are ready to load in this data. But first, let us clean our "gis" database we used before.&lt;/p&gt;
&lt;p&gt;Since all these import tools will create their own PostGIS enabled tables, we can delete our "shapes" table. Connect to your "gis" database and drop this table:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Using this new tool, repopulate the "gis" database with the data you just downloaded:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;osm2pgsql -s -U postgres -d gis map
&lt;/pre&gt;


&lt;p&gt;If everything went okay, you will get a small report containing the information about all the tables &lt;em&gt;and&lt;/em&gt; indexes that were created.&lt;/p&gt;
&lt;p&gt;Let us see what we just did. &lt;/p&gt;
&lt;p&gt;First we ran &lt;em&gt;osm2pgsql&lt;/em&gt; with the &lt;em&gt;-s&lt;/em&gt; flag. This flag enables &lt;em&gt;slim&lt;/em&gt; mode, which means it will use a database on disk, rather than processing all the GIS data in RAM.
Processing everything in RAM not only potentially slows down your machine for larger data sets, it also makes fewer features available.&lt;/p&gt;
&lt;p&gt;Next we tell the tool to connect as the user "postgres" and load the data into the "gis" database. The final argument is the "map" file you just downloaded.&lt;/p&gt;
&lt;h3&gt;What do we have now?&lt;/h3&gt;
&lt;p&gt;Open up a database console and let us describe our database to see what this tool just did:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As you can see, it inserted 7 new tables:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;Schema &lt;span class="p"&gt;|&lt;/span&gt;        Name        &lt;span class="p"&gt;|&lt;/span&gt; Type  &lt;span class="p"&gt;|&lt;/span&gt;  Owner   
--------+--------------------+-------+----------
public &lt;span class="p"&gt;|&lt;/span&gt; geography_columns  &lt;span class="p"&gt;|&lt;/span&gt; view  &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; geometry_columns   &lt;span class="p"&gt;|&lt;/span&gt; view  &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; planet_osm_line    &lt;span class="p"&gt;|&lt;/span&gt; table &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; planet_osm_nodes   &lt;span class="p"&gt;|&lt;/span&gt; table &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; planet_osm_point   &lt;span class="p"&gt;|&lt;/span&gt; table &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; planet_osm_polygon &lt;span class="p"&gt;|&lt;/span&gt; table &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; planet_osm_rels    &lt;span class="p"&gt;|&lt;/span&gt; table &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; planet_osm_roads   &lt;span class="p"&gt;|&lt;/span&gt; table &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; planet_osm_ways    &lt;span class="p"&gt;|&lt;/span&gt; table &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; raster_columns     &lt;span class="p"&gt;|&lt;/span&gt; view  &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; raster_overviews   &lt;span class="p"&gt;|&lt;/span&gt; view  &lt;span class="p"&gt;|&lt;/span&gt; postgres
public &lt;span class="p"&gt;|&lt;/span&gt; spatial_ref_sys    &lt;span class="p"&gt;|&lt;/span&gt; table &lt;span class="p"&gt;|&lt;/span&gt; postgres
&lt;/pre&gt;


&lt;p&gt;The other 5 views and tables are the good old PostGIS bookkeeping.&lt;/p&gt;
&lt;p&gt;It is also important, yet less relevant for our work here today, to know that these tables, or rather the way &lt;em&gt;osm2pgsql&lt;/em&gt; imports them, are optimized to work with &lt;em&gt;Mapnik&lt;/em&gt;.
Mapnik is an open-source map rendering software package used for both web and offline usage.&lt;/p&gt;
&lt;p&gt;The tables that are imported contain many different types of information. Let me quickly go over them to give you a basic feeling of how the import happened:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;planet_osm_line: holds all non-closed pieces of geometry (called &lt;em&gt;ways&lt;/em&gt;) at a high resolution. They mostly represent actual roads and are used when looking at a small, zoomed-in detail of a map.&lt;/li&gt;
&lt;li&gt;planet_osm_nodes: an intermediate table that holds the raw point data (points in lat/long) with a corresponding "osm_id" to map them to other tables&lt;/li&gt;
&lt;li&gt;planet_osm_point: holds all points-of-interest together with their OSM tags - tags that describe what they represent&lt;/li&gt;
&lt;li&gt;planet_osm_polygon: holds all closed pieces of geometry (also called &lt;em&gt;ways&lt;/em&gt;) like buildings, parks, lakes, areas, ...&lt;/li&gt;
&lt;li&gt;planet_osm_rels: an intermediate table that holds extra connecting information about polygons&lt;/li&gt;
&lt;li&gt;planet_osm_roads: holds lower resolution, non-closed pieces of geometry, in contrast with "planet_osm_line". This data is used when looking from a greater distance, covering a large area and thus not much detail about smaller, local roads.&lt;/li&gt;
&lt;li&gt;planet_osm_ways: an intermediate table which holds non-closed geometry in raw format&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;We will now continue working with a small subset of this data.&lt;/p&gt;
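&lt;p&gt;Before narrowing things down, a quick row count per table gives a feeling for how much data the import produced. This is only a sketch: the table names are the ones listed above, and the numbers you get depend on the state of OpenStreetMap on the day you downloaded the file.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Count the rows in the four tables that hold ready-to-use geometry
SELECT 'planet_osm_point' AS table_name, count(*) AS row_count FROM planet_osm_point
UNION ALL
SELECT 'planet_osm_line', count(*) FROM planet_osm_line
UNION ALL
SELECT 'planet_osm_roads', count(*) FROM planet_osm_roads
UNION ALL
SELECT 'planet_osm_polygon', count(*) FROM planet_osm_polygon;
&lt;/pre&gt;

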
&lt;p&gt;Let us take a peek at the polygon table, for example. First, let us see what we have available:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;That is quite a big list, but the major part of these columns are of mere TEXT type and contain human-readable information about the geometry stored.
These columns correspond with the way OpenStreetMap categorizes its data and with the way you could use the Mapnik software described above.&lt;/p&gt;
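&lt;p&gt;If you would rather not count the column types by eye, the standard &lt;em&gt;information_schema&lt;/em&gt; can summarize them. A small sketch, assuming the table lives in the default schema; the PostGIS column shows up under the type name &lt;em&gt;geometry&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Group the columns of planet_osm_polygon by their underlying type name
SELECT udt_name, count(*) AS column_count
    FROM information_schema.columns
    WHERE table_name = 'planet_osm_polygon'
    GROUP BY udt_name
    ORDER BY column_count DESC;
&lt;/pre&gt;

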
&lt;p&gt;Let us do a targeted query:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_AsText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'industrial'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Notice that I use the &lt;em&gt;output&lt;/em&gt; function &lt;em&gt;ST_AsText()&lt;/em&gt; to convert the geometry to a human readable WKT string.
Also, I am only interested in the industrial buildings, so I filter on the building type &lt;em&gt;industrial&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;                    name                      &lt;span class="p"&gt;|&lt;/span&gt;  building  &lt;span class="p"&gt;|&lt;/span&gt;                                                                     st_astext                                                                   
----------------------------------------------+------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------
 沖ハム &lt;span class="o"&gt;(&lt;/span&gt;Okiham&lt;span class="o"&gt;)&lt;/span&gt;                               &lt;span class="p"&gt;|&lt;/span&gt; industrial &lt;span class="p"&gt;|&lt;/span&gt; POLYGON&lt;span class="o"&gt;((&lt;/span&gt;14221927.83 3049797.01,14222009.77 3049839.68,14222074.84 3049714.68,14222028.9 3049690.76,14221996.33 3049753.33,14221960.34 3049734.58,14221927.83 3049797.01&lt;span class="o"&gt;))&lt;/span&gt;
Kin Thermal Power Plant Coal storage building &lt;span class="p"&gt;|&lt;/span&gt; industrial &lt;span class="p"&gt;|&lt;/span&gt; POLYGON&lt;span class="o"&gt;((&lt;/span&gt;14239931.42 3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08 3053984.84,14239931.42 3054117.72&lt;span class="o"&gt;))&lt;/span&gt;
Kin Thermal Power Plant Exhaust tower         &lt;span class="p"&gt;|&lt;/span&gt; industrial &lt;span class="p"&gt;|&lt;/span&gt; POLYGON&lt;span class="o"&gt;((&lt;/span&gt;14240167.1 3054497.14,14240172.26 3054507.93,14240176.04 3054515.82,14240195.76 3054506.39,14240186.84 3054487.7,14240167.1 3054497.14&lt;span class="o"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We get back three records containing one industrial building each, described with a closed polygon. Cool.&lt;/p&gt;
&lt;p&gt;Now, I can assure you that Okinawa has more than three industrial buildings, but do remember that we are looking at a rather rural island.
OpenStreetMap relies greatly on user generated content and there simply are not many users who have felt the need to index the industrial buildings here in this neck of the woods.&lt;/p&gt;
&lt;p&gt;The &lt;em&gt;planet_osm_polygon&lt;/em&gt; table does contain little over 6000 buildings of various types, which is still a small number, but for our purpose today I am only interested in the latter two, which both lie here in Kin town.&lt;/p&gt;
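&lt;p&gt;To see for yourself how those buildings are spread over the various types, you could group on the "building" column. A minimal sketch; the exact types and counts depend on what contributors have mapped:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- How many polygons are tagged with each building type?
SELECT building, count(*) AS total
    FROM planet_osm_polygon
    WHERE building IS NOT NULL
    GROUP BY building
    ORDER BY total DESC;
&lt;/pre&gt;

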
&lt;p&gt;Also, if you would, for example, take a chunk of Tokyo, where there are hundreds of active OpenStreetMap contributors, you will find that many buildings are present and are sometimes even more accurately represented than in some proprietary online mapping solutions offered by some famous search engines. Ahum.&lt;/p&gt;
&lt;p&gt;Before continuing, though, I would like to delete two GiST indexes that "osm2pgsql" made for us, purely to be able to demonstrate the importance of an index.&lt;/p&gt;
&lt;p&gt;For now, just take my word and delete the indexes on all the geometry columns of the tables we will use today:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line_index&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon_index&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Then perform a VACUUM:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;VACUUM&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;VACUUM&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;&lt;em&gt;VACUUM&lt;/em&gt; together with &lt;em&gt;ANALYZE&lt;/em&gt; cleans up the table and refreshes the statistics the query planner relies on, so PostgreSQL has an up-to-date picture of both tables now that the indexes are gone.&lt;/p&gt;
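&lt;p&gt;If you want to confirm that the two GiST indexes are really gone, the &lt;em&gt;pg_indexes&lt;/em&gt; system view lists whatever indexes remain. A small sketch; depending on your osm2pgsql version other indexes may still show up here:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- List the indexes that still exist on the two tables we will query
SELECT tablename, indexname, indexdef
    FROM pg_indexes
    WHERE tablename IN ('planet_osm_line', 'planet_osm_polygon')
    ORDER BY tablename, indexname;
&lt;/pre&gt;

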
&lt;p&gt;The first thing I would like to find out is how large these buildings actually are.
We cannot measure how tall they are, for we are working with two dimensional data here, but we can measure their footprint on the map.&lt;/p&gt;
&lt;p&gt;Since PostGIS makes all of our work easy, we could simply employ a function to tell us this information:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Area&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'industrial'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;    st_area      
------------------
10155.3935499731
33381.1043500491
452.9464999972
&lt;/pre&gt;


&lt;p&gt;As we know from the previous chapter, to be able to know what these numbers mean, we have to find out in which SRID this data was saved.
You could either describe the table again and look at the geometry column description, or use an &lt;em&gt;accessor&lt;/em&gt; function &lt;em&gt;ST_SRID()&lt;/em&gt;, to find it:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_SRID&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'industrial'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt; &lt;span class="n"&gt;st_srid&lt;/span&gt; 
&lt;span class="c1"&gt;---------&lt;/span&gt;
&lt;span class="mi"&gt;900913&lt;/span&gt;
&lt;span class="mi"&gt;900913&lt;/span&gt;
&lt;span class="mi"&gt;900913&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You could also query the PostGIS bookkeeping directly and look in the &lt;em&gt;geometry_columns&lt;/em&gt; view:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;f_table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f_geometry_column&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coord_dimension&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;srid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;type&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;geometry_columns&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This view holds information about all the geometry columns in our PostGIS enabled database.
Our above query will return a list containing all the GIS describing information we saw in the previous chapter.&lt;/p&gt;
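&lt;p&gt;Since the view describes every geometry column in the database, you may want to narrow it down to the tables we care about today. A sketch, using the &lt;em&gt;f_table_name&lt;/em&gt; column of the view:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Only show the bookkeeping for the polygon and line tables
SELECT f_table_name, f_geometry_column, srid, type
    FROM geometry_columns
    WHERE f_table_name IN ('planet_osm_polygon', 'planet_osm_line');
&lt;/pre&gt;

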
&lt;p&gt;Nice. Both our buildings are stored in a geometry column and have an SRID of &lt;em&gt;900913&lt;/em&gt;. We can now use our &lt;em&gt;spatial_ref_sys&lt;/em&gt; table to look up this ID:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;srid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;auth_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;auth_srid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;srtext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;proj4text&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;spatial_ref_sys&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;srid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;900913&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As you can see, this is basically a Mercator projection used by OpenStreetMap.
In the "proj4text" column we can see that its units are meters.&lt;/p&gt;
&lt;p&gt;This thus means that the information we get back is in &lt;em&gt;square meters&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;In this map (only looking at the latter two Kin buildings) we thus have a building with a footprint of roughly 33,000 &lt;em&gt;square meters&lt;/em&gt; (about 0.033 square kilometers) and a more modest building of around 452 &lt;em&gt;square meters&lt;/em&gt;.
The former is a coal storage facility belonging to the &lt;em&gt;Kin Thermal Power Plant&lt;/em&gt; and is indeed &lt;em&gt;huge&lt;/em&gt;.
The second building represents the exhaust tower of that same plant.&lt;/p&gt;
&lt;p&gt;You have just measured the area these buildings occupy, very neat right?&lt;/p&gt;
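&lt;p&gt;One caveat worth keeping in mind: a Mercator projection stretches shapes the further you move away from the equator, so areas measured directly in this SRID come out larger than they are on the ground. As a rough sketch, and assuming SRID 900913 is present in your &lt;em&gt;spatial_ref_sys&lt;/em&gt; table as shown above, you could compare the projected area with an area computed on the spheroid through the &lt;em&gt;geography&lt;/em&gt; type:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Compare the area in projected Mercator units with the area on the spheroid.
-- ST_Transform reprojects to WGS 84 (SRID 4326) so the cast to geography is valid.
SELECT name,
       ST_Area(way) AS mercator_area,
       ST_Area(ST_Transform(way, 4326)::geography) AS ground_area_m2
    FROM planet_osm_polygon
    WHERE building = 'industrial';
&lt;/pre&gt;


&lt;p&gt;The second number should come out noticeably smaller than the first at Okinawa's latitude. For the purpose of this chapter the projected values are fine to work with.&lt;/p&gt;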
&lt;p&gt;Now, let us find out which road runs next to this power plant, just in case we wish to drive there.
It is important to note that OSM (and many other mapping solutions) divide roads into different types.&lt;/p&gt;
&lt;p&gt;You have trunk roads, highways, secondary roads, tertiary roads, etc.
I am now interested to find the nearest &lt;em&gt;secondary&lt;/em&gt; road.&lt;/p&gt;
&lt;p&gt;To get a list of all the secondary roads in Okinawa, simply query the &lt;em&gt;planet_osm_line&lt;/em&gt; table:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_AsText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We now get back all the linestring objects together with their reference inside of OSM.
The reference refers to the actual route number each road has.&lt;/p&gt;
&lt;p&gt;The total count should be around &lt;em&gt;3215&lt;/em&gt; pieces of geometry, which is already a nice list to work with.&lt;/p&gt;
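&lt;p&gt;To check that count without scrolling through the result, you could let PostgreSQL do the counting. The number may differ from mine, depending on when you downloaded the data:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Count the pieces of geometry tagged as secondary road
SELECT count(*)
    FROM planet_osm_line
    WHERE highway = 'secondary';
&lt;/pre&gt;

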
&lt;p&gt;Let us now see which of these roads is closest to our coal storage building.&lt;/p&gt;
&lt;p&gt;To find out how far something is (nearest neighbor search) we could use our &lt;em&gt;ST_Distance()&lt;/em&gt; function we used in the previous chapter and perform the following lookup:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;highway&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Kin Thermal Power Plant Coal storage building'&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will bring us:&lt;/p&gt;
&lt;pre class="code literal-block"&gt; highway  &lt;span class="p"&gt;|&lt;/span&gt; ref &lt;span class="p"&gt;|&lt;/span&gt;     distance     
-----------+-----+------------------
secondary &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="m"&gt;329&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; 417.374986575458
secondary &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="m"&gt;104&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; 2258.90394593648
secondary &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="m"&gt;104&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; 2709.00178089638
secondary &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="m"&gt;104&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; 2745.76782385198
secondary &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="m"&gt;234&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; 5897.78205314507
...
&lt;/pre&gt;


&lt;p&gt;Cool, secondary route 329 is the closest to our coal storage building with a distance of &lt;em&gt;417 meters.&lt;/em&gt;&lt;/p&gt;
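&lt;p&gt;If you only care about the single nearest road rather than the whole sorted list, a &lt;em&gt;LIMIT&lt;/em&gt; on the same query will do. This is the exact same lookup as above with one extra line, so it still computes every distance before sorting:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Same query as before, but keep only the closest secondary road
SELECT road.highway, road.ref, ST_Distance(road.way, building.way) AS distance
    FROM planet_osm_polygon AS building, planet_osm_line AS road
    WHERE road.highway = 'secondary' AND building.name = 'Kin Thermal Power Plant Coal storage building'
    ORDER BY distance
    LIMIT 1;
&lt;/pre&gt;

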
&lt;p&gt;While this will return quite accurate results, there is one problem with this query. Indexes are not being used.
And every time an index is potentially left alone, you should start to worry, especially with larger data sets.&lt;/p&gt;
&lt;p&gt;How do I know they are ignored? Simple, we did not make any indexes (and we deleted the ones made by "osm2pgsql")...which makes me pretty sure we cannot use them.&lt;/p&gt;
&lt;p&gt;I refer you to &lt;a href="http://shisaa.be/postset/postgresql-full-text-search-part-3.html"&gt;chapter three&lt;/a&gt; of my PostgreSQL Full Text series where I talk a bit more about GiST and B-Tree index types.
And, as I also say in that chapter, I highly recommend reading Markus Winand's &lt;a href="http://use-the-index-luke.com/" title="Use The Index, Luke series written by Markus Winand."&gt;Use The Index, Luke&lt;/a&gt; series, which explains in great detail how database indexes work.&lt;/p&gt;
&lt;p&gt;The first thing to realize is that an index will only be used if the data set on which it is built is of sufficient size.
PostgreSQL has a built-in cost-based optimizer, called the &lt;em&gt;query planner&lt;/em&gt;, which decides whether or not to use an index.&lt;/p&gt;
&lt;p&gt;If your data set is small enough, a more traditional &lt;em&gt;Sequential Scan&lt;/em&gt; will be just as fast or faster.&lt;/p&gt;
&lt;p&gt;To know what is going on &lt;em&gt;exactly&lt;/em&gt; and to know &lt;em&gt;how fast&lt;/em&gt; our query runs, we have the &lt;em&gt;EXPLAIN&lt;/em&gt; command at our disposal.&lt;/p&gt;
&lt;h3&gt;Speeding things up&lt;/h3&gt;
&lt;p&gt;Let us &lt;em&gt;EXPLAIN&lt;/em&gt; the query we have just run:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;highway&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Kin Thermal Power Plant Coal storage building'&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We simply put the keyword &lt;em&gt;EXPLAIN&lt;/em&gt; (and &lt;em&gt;ANALYZE&lt;/em&gt; to give us total runtime) right in front of our normal query.&lt;/p&gt;
&lt;p&gt;The result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;Sort  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5047.50..5055.32 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3129&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;391&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;41.481..41.815 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3215&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
    Sort Key: &lt;span class="o"&gt;(&lt;/span&gt;st_distance&lt;span class="o"&gt;(&lt;/span&gt;road.way, building.way&lt;span class="o"&gt;))&lt;/span&gt;
    Sort Method: quicksort  Memory: 348kB
    -&amp;gt;  Nested Loop  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.00..4309.34 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3129&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;391&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.188..38.617 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3215&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
        -&amp;gt;  Seq Scan on planet_osm_polygon building  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.00..279.01 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;207&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.981..1.409 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
            Filter: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Kin Thermal Power Plant Coal storage building'&lt;/span&gt;::text&lt;span class="o"&gt;)&lt;/span&gt;
            Rows Removed by Filter: 6320
        -&amp;gt;  Seq Scan on planet_osm_line road  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.00..3216.79 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3129&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;184&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.166..26.524 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3215&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
            Filter: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt;::text&lt;span class="o"&gt;)&lt;/span&gt;
            Rows Removed by Filter: 73488
Total runtime: 42.153 ms
&lt;/pre&gt;


&lt;p&gt;That is a lot of output, but it shows you how the internal planner executes our query and which decisions it makes along the way.&lt;/p&gt;
&lt;p&gt;To fully interpret a query plan (this is still a simple one), a lot more knowledge is needed and this would easily deserve its own &lt;em&gt;series&lt;/em&gt;.
I am by far not an expert in the query planner (though it is an interesting study topic), but I will do my best to extract the important bits we need for our direct performance tuning.&lt;/p&gt;
&lt;p&gt;A query plan is always made up of nested nodes, with each parent node containing the accumulated information (costs, rows, ...) of its child nodes.&lt;/p&gt;
&lt;p&gt;Inside the nested loop parent node we see above, we can find that the planner decided to use two filters, which correspond to the &lt;em&gt;WHERE&lt;/em&gt; clause conditions of our query (building.name and road.highway).
You can see that both child nodes are of &lt;em&gt;Seq Scan&lt;/em&gt; type, which means &lt;em&gt;Sequential Scan&lt;/em&gt;. These types of nodes read the whole table, simply from top to bottom, and check every row against the filter.&lt;/p&gt;
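&lt;p&gt;If you are curious whether those table pages came out of PostgreSQL's buffer cache or had to be fetched from disk, &lt;em&gt;EXPLAIN&lt;/em&gt; accepts a &lt;em&gt;BUFFERS&lt;/em&gt; option that adds this information to every node. A sketch using the same query; the counts you see depend on what is cached on your machine:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Same plan as above, with an extra "Buffers:" line per node
EXPLAIN (ANALYZE, BUFFERS)
SELECT road.highway, road.ref, ST_Distance(road.way, building.way) AS distance
    FROM planet_osm_polygon AS building, planet_osm_line AS road
    WHERE road.highway = 'secondary' AND building.name = 'Kin Thermal Power Plant Coal storage building'
    ORDER BY distance;
&lt;/pre&gt;

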
&lt;p&gt;Another important thing to note is the total time this query takes, which is &lt;em&gt;42.153 ms&lt;/em&gt;.
The time reported here is the time on my local machine. Depending on how decent your computer is, this time could vary.&lt;/p&gt;
&lt;p&gt;A detail not to forget when looking at this timing, is the fact that it is slightly skewed if compared to real-world application use:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;We neglect network/client traffic. This query now runs internally and does not need to communicate with a client driver (which almost always brings extra overhead)&lt;/li&gt;
&lt;li&gt;The time measurement itself also introduces overhead.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The total runtime from our above plan does not sound like a big number, but we are working with a rather small data set - the area of Okinawa is large, but the geometry is rather sparse.&lt;/p&gt;
&lt;p&gt;So our first reaction should be: this can be better.&lt;/p&gt;
&lt;p&gt;First, let us try to get rid of these sequential scans, for they are a clear indication that the planner does not use an index.&lt;/p&gt;
&lt;h4&gt;Creating indexes&lt;/h4&gt;
&lt;p&gt;In our case we want to make two types of indexes:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Indexes on our "meta" data, the names and other attributes describing our geometrical data&lt;/li&gt;
&lt;li&gt;Indexes that actually index our geometrical data itself&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Let us start with our attributes columns.&lt;/p&gt;
&lt;p&gt;These are all simple VARCHAR, TEXT or INT columns, so the good old Balanced Tree or &lt;em&gt;B-Tree&lt;/em&gt; can be used here.
In our query above we use "road.highway" and "building.name" in our lookup, so let us make a couple of indexes that adhere to this query.
Remember, an index only makes sense if it is built the same way your queries question your data.&lt;/p&gt;
&lt;p&gt;First, the "highway" column of the "planet_osm_line" table:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line_highway_index&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;highway&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The syntax is trivial. You simply tell PostgreSQL to create an index, give it a name, and tell it on which column(s) of which table you want it to be built.
PostgreSQL will always default to the &lt;em&gt;B-Tree&lt;/em&gt; index type.&lt;/p&gt;
&lt;p&gt;Next, the name column:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon_name_index&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;
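&lt;p&gt;If you want to double check that both indexes exist and were built as B-Trees, you could look them up in the &lt;em&gt;pg_indexes&lt;/em&gt; system view. This is just a small sketch that assumes the index names used above:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- List the attribute indexes we just created, together with their definitions.
-- The "indexdef" column should mention "USING btree" for both of them.
SELECT tablename, indexname, indexdef
    FROM pg_indexes
    WHERE indexname IN ('planet_osm_line_highway_index', 'planet_osm_polygon_name_index');
&lt;/pre&gt;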


&lt;p&gt;Now perform another &lt;em&gt;VACUUM ANALYZE&lt;/em&gt; on both tables:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;VACUUM&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;VACUUM&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Let us run explain again on the exact same query:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;Sort&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4058&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;73&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;4066&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;56&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3129&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;394&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;817&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;149&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3215&lt;/span&gt; &lt;span class="n"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;Sort&lt;/span&gt; &lt;span class="k"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st_distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;Sort&lt;/span&gt; &lt;span class="k"&gt;Method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;quicksort&lt;/span&gt;  &lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;348&lt;/span&gt;&lt;span class="n"&gt;kB&lt;/span&gt;
    &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;  &lt;span class="n"&gt;Nested&lt;/span&gt; &lt;span class="n"&gt;Loop&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;72&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;95&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;3310&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;07&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3129&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;394&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;356&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;17&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;743&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3215&lt;/span&gt; &lt;span class="n"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;  &lt;span class="k"&gt;Index&lt;/span&gt; &lt;span class="n"&gt;Scan&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon_name_index&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;207&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;054&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;056&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;Index&lt;/span&gt; &lt;span class="n"&gt;Cond&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Kin Thermal Power Plant Coal storage building'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;  &lt;span class="n"&gt;Bitmap&lt;/span&gt; &lt;span class="n"&gt;Heap&lt;/span&gt; &lt;span class="n"&gt;Scan&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;72&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;67&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;2488&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;23&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3129&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;187&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;258&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;661&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3215&lt;/span&gt; &lt;span class="n"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;Recheck&lt;/span&gt; &lt;span class="n"&gt;Cond&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;  &lt;span class="n"&gt;Bitmap&lt;/span&gt; &lt;span class="k"&gt;Index&lt;/span&gt; &lt;span class="n"&gt;Scan&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line_highway_index&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;71&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;89&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3129&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;864&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;864&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3215&lt;/span&gt; &lt;span class="n"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                   &lt;span class="k"&gt;Index&lt;/span&gt; &lt;span class="n"&gt;Cond&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;Total&lt;/span&gt; &lt;span class="n"&gt;runtime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;527&lt;/span&gt; &lt;span class="n"&gt;ms&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You can see that we now traded our &lt;em&gt;Seq Scan&lt;/em&gt; for &lt;em&gt;Index Scan&lt;/em&gt; and &lt;em&gt;Bitmap Heap Scan&lt;/em&gt;, which indicates that our attribute indexes are being used, yay!&lt;/p&gt;
&lt;p&gt;The so-called &lt;em&gt;Bitmap Heap Scan&lt;/em&gt;, instead of a &lt;em&gt;Sequential Scan&lt;/em&gt;, is performed when the planner decides it can use the index to gather the locations of all the rows it thinks it needs, sort them in physical (on-disk) order and then fetch the data from the table in the most optimized way possible (trying to open each disk page only once).&lt;/p&gt;
&lt;p&gt;The order in which the &lt;em&gt;Bitmap Heap Scan&lt;/em&gt; arranges the data is directed by its child node, the &lt;em&gt;Bitmap Index Scan&lt;/em&gt;. This latter type of node is the one doing the actual searching &lt;em&gt;inside&lt;/em&gt; the index. Because our &lt;em&gt;WHERE&lt;/em&gt; clause has a condition which tells PostgreSQL to limit the rows to the ones of "highway" type "secondary", the &lt;em&gt;Bitmap Index Scan&lt;/em&gt; looks up the matching row locations in the &lt;em&gt;B-Tree&lt;/em&gt; index we just made and passes them to its parent, the &lt;em&gt;Bitmap Heap Scan&lt;/em&gt;, which then goes on to order the geometry rows to be fetched.&lt;/p&gt;
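&lt;p&gt;To see this choice for yourself, you could temporarily discourage bitmap scans in your session and compare the plans. A minimal sketch, using a simplified single table query rather than our full query; &lt;em&gt;enable_bitmapscan&lt;/em&gt; is a regular planner setting, and which alternative plan you get will depend on your data and PostgreSQL version:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Default behaviour: the planner is free to pick a Bitmap Heap Scan
EXPLAIN SELECT ref FROM planet_osm_line WHERE highway = 'secondary';

-- Discourage bitmap scans for this session only and look at the plan again
SET enable_bitmapscan = off;
EXPLAIN SELECT ref FROM planet_osm_line WHERE highway = 'secondary';

-- Put the setting back to its default
RESET enable_bitmapscan;
&lt;/pre&gt;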
&lt;p&gt;The attribute indexes already helped a lot, for our query runtime dropped to half. Now, let us make the indexes for our actual geometry, and see the effect:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line_way&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon_way&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Creating a &lt;em&gt;GiST&lt;/em&gt; index is quite similar to creating a normal &lt;em&gt;B-Tree&lt;/em&gt; index. The only difference here is that you specify the index to be built with &lt;em&gt;GiST&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Vacuum:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;VACUUM&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;VACUUM&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now poke it again with the same query and see our new plan:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;Sort  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4038.82..4046.54 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3089&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;395&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;21.137..21.479 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3215&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
    Sort Key: &lt;span class="o"&gt;(&lt;/span&gt;st_distance&lt;span class="o"&gt;(&lt;/span&gt;road.way, building.way&lt;span class="o"&gt;))&lt;/span&gt;
    Sort Method: quicksort  Memory: 348kB
    -&amp;gt;  Nested Loop  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;72.64..3299.76 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3089&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;395&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.382..17.858 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3215&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
        -&amp;gt;  Index Scan using planet_osm_polygon_name_index on planet_osm_polygon building  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.28..8.30 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;207&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.041..0.044 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
            Index Cond: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Kin Thermal Power Plant Coal storage building'&lt;/span&gt;::text&lt;span class="o"&gt;)&lt;/span&gt;
        -&amp;gt;  Bitmap Heap Scan on planet_osm_line road  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;72.36..2488.32 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3089&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;188&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.297..4.726 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3215&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
            Recheck Cond: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt;::text&lt;span class="o"&gt;)&lt;/span&gt;
            -&amp;gt;  Bitmap Index Scan on planet_osm_line_highway_index  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.00..71.59 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3089&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.866..0.866 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3215&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
                  Index Cond: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt;::text&lt;span class="o"&gt;)&lt;/span&gt;
Total runtime: 21.873 ms
&lt;/pre&gt;


&lt;p&gt;Hmm, the plan did not change at all, and our runtime is roughly identical. Why is our performance still the same?&lt;/p&gt;
&lt;p&gt;The culprit here is &lt;em&gt;ST_Distance()&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;As it turns out, this function is unable to use the &lt;em&gt;GiST&lt;/em&gt; index and is therefore not a good candidate to set loose on your whole result set. The same goes for the &lt;em&gt;ST_Area()&lt;/em&gt; function, by the way.&lt;/p&gt;
&lt;p&gt;So we need a way to limit the number of records we do this expensive calculation on.&lt;/p&gt;
&lt;h4&gt;ST_DWithin()&lt;/h4&gt;
&lt;p&gt;We introduce a new function: &lt;em&gt;ST_DWithin()&lt;/em&gt;. This function could be our savior in this case, for it does use the &lt;em&gt;GiST&lt;/em&gt; index.&lt;/p&gt;
&lt;p&gt;Whether or not a function (or operator) can use the &lt;em&gt;GiST&lt;/em&gt; index depends on whether it uses &lt;em&gt;bounding boxes&lt;/em&gt; when performing calculations.
The reason is that &lt;em&gt;GiST&lt;/em&gt; indexes mainly store bounding box information and not the exact geometry itself.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;ST_DWithin()&lt;/em&gt; checks whether a given piece of geometry is within a radius of another piece of geometry and simply returns &lt;em&gt;TRUE&lt;/em&gt; or &lt;em&gt;FALSE&lt;/em&gt;.
We can thus use it in our &lt;em&gt;WHERE&lt;/em&gt; clause to filter out geometry for which it returns &lt;em&gt;FALSE&lt;/em&gt; (and which thus does not fall within the radius).
It first narrows down the candidates using bounding boxes, and is thus able to retrieve this information from our &lt;em&gt;GiST&lt;/em&gt; index.&lt;/p&gt;
&lt;p&gt;Let me present you with a query that limits the result set based on what &lt;em&gt;ST_DWithin()&lt;/em&gt; finds:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;highway&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_polygon&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt;
        &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Kin Thermal Power Plant Coal storage building'&lt;/span&gt;
        &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;ST_DWithin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;building&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As you can see, we simply added one more condition to our &lt;em&gt;WHERE&lt;/em&gt; clause to limit the returned geometry by radius.
This will result in the following plan:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;Sort  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;45.66..45.67 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;395&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;6.048..6.052 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;27&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
    Sort Key: &lt;span class="o"&gt;(&lt;/span&gt;st_distance&lt;span class="o"&gt;(&lt;/span&gt;road.way, building.way&lt;span class="o"&gt;))&lt;/span&gt;
    Sort Method: quicksort  Memory: 27kB
    -&amp;gt;  Nested Loop  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4.63..45.65 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;395&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3.157..6.005 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;27&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
        -&amp;gt;  Index Scan using planet_osm_polygon_name_index on planet_osm_polygon building  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.28..8.30 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;207&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.051..0.054 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
            Index Cond: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Kin Thermal Power Plant Coal storage building'&lt;/span&gt;::text&lt;span class="o"&gt;)&lt;/span&gt;
        -&amp;gt;  Bitmap Heap Scan on planet_osm_line road  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4.34..37.09 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;188&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3.090..5.771 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;27&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
            Recheck Cond: &lt;span class="o"&gt;(&lt;/span&gt;way &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; st_expand&lt;span class="o"&gt;(&lt;/span&gt;building.way, 10000::double precision&lt;span class="o"&gt;))&lt;/span&gt;
            Filter: &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nv"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt;::text&lt;span class="o"&gt;)&lt;/span&gt; AND &lt;span class="o"&gt;(&lt;/span&gt;building.way &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; st_expand&lt;span class="o"&gt;(&lt;/span&gt;way, 10000::double precision&lt;span class="o"&gt;))&lt;/span&gt; AND _st_dwithin&lt;span class="o"&gt;(&lt;/span&gt;way, building.way, 10000::double precision&lt;span class="o"&gt;))&lt;/span&gt;
            Rows Removed by Filter: 4838
                -&amp;gt;  Bitmap Index Scan on planet_osm_line_way  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.00..4.34 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;8&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.978..1.978 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;4865&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
                    Index Cond: &lt;span class="o"&gt;(&lt;/span&gt;way &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; st_expand&lt;span class="o"&gt;(&lt;/span&gt;building.way, 10000::double precision&lt;span class="o"&gt;))&lt;/span&gt;
Total runtime: 6.181 ms
&lt;/pre&gt;


&lt;p&gt;Good. We have just gone down to only &lt;em&gt;6.181 ms&lt;/em&gt;. That seems to be much more efficient.&lt;/p&gt;
&lt;p&gt;As you can see, our query plan got a few new lines. The main thing to notice is the fact that our &lt;em&gt;Bitmap Heap Scan&lt;/em&gt; got another &lt;em&gt;Recheck Cond&lt;/em&gt;, our expanded &lt;em&gt;ST_DWithin()&lt;/em&gt; condition.
Further down, you can see that the condition is being answered from the &lt;em&gt;GiST&lt;/em&gt; index:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;Index Cond: &lt;span class="o"&gt;(&lt;/span&gt;way &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; st_expand&lt;span class="o"&gt;(&lt;/span&gt;building.way, 10000::double precision&lt;span class="o"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This seems to be a much more desirable and scalable query.&lt;/p&gt;
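&lt;p&gt;The plan also reveals how &lt;em&gt;ST_DWithin()&lt;/em&gt; works under the hood: a cheap bounding box test using &lt;em&gt;&amp;amp;&amp;amp;&lt;/em&gt; and &lt;em&gt;ST_Expand()&lt;/em&gt;, followed by an exact distance check. Written out by hand, a roughly equivalent query could look like the sketch below. It is only an illustration based on the plan above: it swaps the internal &lt;em&gt;_st_dwithin()&lt;/em&gt; call for a plain &lt;em&gt;ST_Distance()&lt;/em&gt; comparison, and in practice you would simply call &lt;em&gt;ST_DWithin()&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;SELECT road.highway, road.ref, ST_Distance(road.way, building.way) AS distance
    FROM planet_osm_polygon AS building, planet_osm_line AS road
    WHERE road.highway = 'secondary'
        AND building.name = 'Kin Thermal Power Plant Coal storage building'
        -- Cheap pre-filter on bounding boxes, this is the part the GiST index can answer
        AND road.way &amp;amp;&amp;amp; ST_Expand(building.way, 10000)
        -- Exact check, only evaluated for rows that survive the pre-filter
        AND ST_Distance(road.way, building.way) &amp;lt;= 10000
    ORDER BY distance;
&lt;/pre&gt;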
&lt;p&gt;But there is a drawback: though &lt;em&gt;ST_DWithin()&lt;/em&gt; makes for speedy results, it only works if you give it a fixed radius.&lt;/p&gt;
&lt;p&gt;As you can see from our usage, we call the function as follows: ST_DWithin(road.way, building.way, 10000).
The last argument, "10000", tells us how big the search radius is. In this case our geometry is in meters, so this means we search in a radius of 10 km.&lt;/p&gt;
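&lt;p&gt;If you are unsure which unit your own data uses, you could look up the spatial reference system of the geometry column. A small sketch, assuming the table and column names used in this post and assuming the SRID in question is registered in the &lt;em&gt;spatial_ref_sys&lt;/em&gt; table:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Which SRID(s) does our road geometry use?
SELECT DISTINCT ST_SRID(way) FROM planet_osm_line;

-- Look up the definition of that reference system,
-- the UNIT entry in "srtext" names the unit of measure
SELECT srid, auth_name, srtext
    FROM spatial_ref_sys
    WHERE srid IN (SELECT DISTINCT ST_SRID(way) FROM planet_osm_line);
&lt;/pre&gt;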
&lt;p&gt;This static radius number is quite arbitrary and might not always be desirable. What other options do we have without compromising performance too much?&lt;/p&gt;
&lt;h4&gt;Operators&lt;/h4&gt;
&lt;p&gt;Another PostGIS addition we have not talked about much up until now is the set of spatial &lt;em&gt;operators&lt;/em&gt; we have available.
You have a total of 16 operators you can use to perform matches on your GIS data.&lt;/p&gt;
&lt;p&gt;You have straightforward operators like &lt;em&gt;&amp;amp;&amp;amp;&lt;/em&gt;, which returns &lt;em&gt;TRUE&lt;/em&gt; if one piece of geometry intersects with another (bounding box calculation) or the &lt;em&gt;&amp;lt;&amp;lt;&lt;/em&gt; which returns &lt;em&gt;TRUE&lt;/em&gt; if one object is fully to the left of another object.&lt;/p&gt;
&lt;p&gt;But there are more interesting ones like the &lt;em&gt;&amp;lt;-&amp;gt;&lt;/em&gt; and the &lt;em&gt;&amp;lt;#&amp;gt;&lt;/em&gt; operators.&lt;/p&gt;
&lt;p&gt;The first operator, &lt;em&gt;&amp;lt;-&amp;gt;&lt;/em&gt;, returns the distance between two points. If you feed it other types of geometry (like a linestring or polygon) it will first draw a bounding box around that geometry and perform a point calculation by using the bounding box &lt;em&gt;centroids&lt;/em&gt;. A centroid is the calculated center of a piece of geometry (the drawn bounding box in our case).&lt;/p&gt;
&lt;p&gt;The second, &lt;em&gt;&amp;lt;#&amp;gt;&lt;/em&gt;, acts in much the same way, but measures directly between the bounding boxes of the given geometry. In our case, since we are not working with points, it would make more sense to use this operator.&lt;/p&gt;
&lt;p&gt;The big advantage of this distance calculation operator is, once more, the fact that it too calculates using a bounding box and is thus able to use a &lt;em&gt;GiST&lt;/em&gt; index.
However, the &lt;em&gt;ST_Distance()&lt;/em&gt; function calculates distances by finding the two points on the given geometry that are closest to each other, which gives the most &lt;em&gt;accurate&lt;/em&gt; result.
The &lt;em&gt;&amp;lt;#&amp;gt;&lt;/em&gt; operator, as said before, stretches a &lt;em&gt;bounding box&lt;/em&gt; around each piece of geometry and therefore deforms our objects, making for less accurate distance measuring.&lt;/p&gt;
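&lt;p&gt;To make the difference tangible, here is a tiny sketch with two made-up geometries: a diagonal line and a point that sits on a corner of the line's bounding box. The geometries and the column aliases are purely illustrative and not part of our OpenStreetMap data:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;WITH shapes AS (
    SELECT ST_GeomFromText('LINESTRING(0 0, 10 10)') AS line,
           ST_GeomFromText('POINT(10 0)') AS spot
)
SELECT ST_Distance(line, spot) AS exact_distance, -- measured against the line itself
       line &amp;lt;#&amp;gt; spot AS box_distance            -- measured against the line's bounding box
    FROM shapes;
&lt;/pre&gt;
&lt;p&gt;Because the point touches the bounding box of the line, the box distance should come out as zero, while the exact distance to the line itself is clearly larger.&lt;/p&gt;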
&lt;p&gt;It is therefor not wise to use &lt;em&gt;&amp;lt;#&amp;gt;&lt;/em&gt; to calculate accurate distances, but it is a life saver to &lt;em&gt;sort away&lt;/em&gt; geometry that is too far away for our interest.&lt;/p&gt;
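&lt;p&gt;To get a feel for how much a bounding box can distort a measurement, here is a small sketch on two made-up, parallel linestrings (they are not part of our Okinawa data and carry no SRID):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Two parallel diagonal lines, chosen purely for illustration.
-- Their bounding boxes overlap, but the lines themselves never touch.
SELECT ST_GeomFromText('LINESTRING(0 0, 10 10)') &amp;lt;#&amp;gt; ST_GeomFromText('LINESTRING(0 2, 8 10)') AS box_distance,
       ST_Distance(ST_GeomFromText('LINESTRING(0 0, 10 10)'), ST_GeomFromText('LINESTRING(0 2, 8 10)')) AS true_distance;
&lt;/pre&gt;


&lt;p&gt;Because the two boxes overlap, the operator should report a distance of &lt;em&gt;0&lt;/em&gt;, whereas &lt;em&gt;ST_Distance()&lt;/em&gt; should come out at roughly &lt;em&gt;1.41&lt;/em&gt; units, the real gap between the two lines.&lt;/p&gt;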
&lt;p&gt;So a proper usage would be to first &lt;em&gt;roughly&lt;/em&gt; limit the result set using the &lt;em&gt;&amp;lt;#&amp;gt;&lt;/em&gt; operator and then more accurately measure the distance of, say, the first 50 matches with our famous &lt;em&gt;ST_Distance()&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Before we can continue, it is important to point out that both the &lt;em&gt;&amp;lt;-&amp;gt;&lt;/em&gt; and &lt;em&gt;&amp;lt;#&amp;gt;&lt;/em&gt; operators can only use the &lt;em&gt;GiST&lt;/em&gt; index when either the left or right hand side of the operator is a &lt;em&gt;constant&lt;/em&gt; or &lt;em&gt;fixed&lt;/em&gt; piece of geometry. This means we have to provide actual geometry using a constructor function.&lt;/p&gt;
&lt;p&gt;There are ways around this limitation, for example, as Alexandre Neto points out on the PostGIS mailing list, by providing your own function which converts our "dynamic" geometry into a constant.&lt;/p&gt;
&lt;p&gt;But this would make this post run way past its initial focus.
Let us simply try by providing a fixed piece of geometry.
The fixed piece is, of course, still our "Kin Thermal Power Plant Coal storage building", but converted into WKT:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;way&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;
        &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt;
        &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt;
        &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;ST_GeomFromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POLYGON((14239931.42 3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08 3053984.84,14239931.42 3054117.72))'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;900913&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;#&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;way&lt;/span&gt;
        &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ST_GeomFromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POLYGON((14239931.42 3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08 3053984.84,14239931.42 3054117.72))'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;900913&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;road&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;true_distance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;true_distance&lt;/span&gt;
    &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This query uses a &lt;em&gt;Common Table Expression&lt;/em&gt; or &lt;em&gt;CTE&lt;/em&gt; (you could also use a simpler subquery) to first get a rough result set of about 50 rows based on what &lt;em&gt;&amp;lt;#&amp;gt;&lt;/em&gt; finds.
Then &lt;em&gt;only&lt;/em&gt; on those 50 rows do we perform our more expensive, index-agnostic distance calculation.&lt;/p&gt;
&lt;p&gt;This results in the following plan and runtime:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;Limit  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;274.57..274.57 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11.236..11.237 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
    CTE distance
        -&amp;gt;  Limit  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.28..260.82 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;173&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.389..10.764 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
            -&amp;gt;  Index Scan using planet_osm_line_way on planet_osm_line  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.28..16362.19 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;3140&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;173&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.389..10.745 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
                Order By: &lt;span class="o"&gt;(&lt;/span&gt;way &amp;lt;&lt;span class="c"&gt;#&amp;gt; '010300002031BF0D000100000005000000D7A3706D17296B41C3F528DC124D47417B14AECF1E296B4100000020484D4741CDCCCCC43C296B410AD7A3B0054D4741295C8F6235296B41B81E856BD04C4741D7A3706D17296B41C3F528DC124D4741'::geometry)&lt;/span&gt;
                Filter: &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'secondary'&lt;/span&gt;::text&lt;span class="o"&gt;)&lt;/span&gt;
                Rows Removed by Filter: 4562
            -&amp;gt;  Sort  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;13.75..13.88 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11.234..11.234 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
                Sort Key: &lt;span class="o"&gt;(&lt;/span&gt;st_distance&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'010300002031BF0D000100000005000000D7A3706D17296B41C3F528DC124D47417B14AECF1E296B4100000020484D4741CDCCCCC43C296B410AD7A3B0054D4741295C8F6235296B41B81E856BD04C4741D7A3706D17296B41C3F528DC124D4741'&lt;/span&gt;::geometry, distance.road&lt;span class="o"&gt;))&lt;/span&gt;
                Sort Method: top-N heapsort  Memory: 25kB
    -&amp;gt;  CTE Scan on distance  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.00..13.50 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt; &lt;span class="nv"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;actual &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.412..11.188 &lt;span class="nv"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;50&lt;/span&gt; &lt;span class="nv"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
Total runtime: 11.268 ms
&lt;/pre&gt;


&lt;p&gt;As you can see, we are now using the &lt;em&gt;GiST&lt;/em&gt; index "planet_osm_line_way", which was what we were after.&lt;/p&gt;
&lt;p&gt;This yields roughly the same runtime as with our &lt;em&gt;ST_DWithin()&lt;/em&gt;, but without the arbitrary distance setting.
We do have a somewhat arbitrary limiter of 50, but this is much less severe than a distance limiter.&lt;/p&gt;
&lt;p&gt;Even if the closest secondary road is 100 km from our building, the above query would still find it, whereas our previous query would return nothing.&lt;/p&gt;
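&lt;p&gt;The CTE is not essential to this two-step approach. As a sketch of the subquery variant mentioned earlier (the alias "candidates" is just a name chosen for this example), the same idea could be written as:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Step 1 (inner query): rough, index-assisted ordering on bounding boxes.
-- Step 2 (outer query): exact distance on the 50 remaining candidates only.
SELECT ST_Distance(ST_GeomFromText('POLYGON((14239931.42 3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08 3053984.84,14239931.42 3054117.72))', 900913), road) AS true_distance, route
    FROM (
        SELECT way AS road, ref AS route
            FROM planet_osm_line
            WHERE highway = 'secondary'
            ORDER BY ST_GeomFromText('POLYGON((14239931.42 3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08 3053984.84,14239931.42 3054117.72))', 900913) &amp;lt;#&amp;gt; way
            LIMIT 50
        ) AS candidates
    ORDER BY true_distance
    LIMIT 1;
&lt;/pre&gt;


&lt;p&gt;I have not benchmarked this variant here, so run it through &lt;em&gt;EXPLAIN ANALYZE&lt;/em&gt; yourself before trusting it over the CTE version.&lt;/p&gt;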
&lt;h3&gt;One more for the road home&lt;/h3&gt;
&lt;p&gt;Let us do a few more fun calculations on our Okinawa data, before I let you off the island.&lt;/p&gt;
&lt;p&gt;Next I would like to find the longest &lt;em&gt;trunk&lt;/em&gt; road that runs through this prefecture:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;highway&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_Length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;length&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;highway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'trunk'&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;length&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We have a new function &lt;em&gt;ST_Length()&lt;/em&gt; which simply returns the length, given that the geometry is a linestring or multilinestring.
The only index that will be used is our "planet_osm_line_highway_index" &lt;em&gt;B-Tree&lt;/em&gt; index to perform our &lt;em&gt;Bitmap Index Scan&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;ST_Length()&lt;/em&gt; obviously does not work with bounding boxes and therefore cannot use the geometrical &lt;em&gt;GiST&lt;/em&gt; index. This is yet another function you should use carefully.&lt;/p&gt;
&lt;p&gt;When looking at the result set that was returned to us, you will see that some routes show up multiple times.
Take route &lt;em&gt;58&lt;/em&gt;, which is the longest and most famous route in Okinawa. It shows up around &lt;em&gt;769&lt;/em&gt; times. Why?&lt;/p&gt;
&lt;p&gt;This is because, especially for a database prepared for mapping, these pieces of geometry are divided over different tiles.&lt;/p&gt;
&lt;p&gt;We thus need to accumulate the length of all the linestrings we find that represent pieces of route 58.
First, we could try to accomplish this with plain SQL:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;road_pieces&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;length&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'58'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_length&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;road_pieces&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will return:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;536468.804010367
&lt;/pre&gt;


&lt;p&gt;Meaning a total length of &lt;em&gt;536.469 Kilometers&lt;/em&gt;. This query will run in about &lt;em&gt;19.375 ms&lt;/em&gt;.
Let us add an index to our "ref" column:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line_ref_index&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Perform vacuum:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;VACUUM&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This index creation will speed up the query and make it run in a little over &lt;em&gt;3.524 ms&lt;/em&gt;. Nice runtime.&lt;/p&gt;
&lt;p&gt;You could also perform almost the exact same query, but instead of using an SQL sum() function, you could use &lt;em&gt;ST_Collect()&lt;/em&gt;, which creates collections of geometry out of all the separate pieces you feed it.
In our case we feed it separate linestrings, which will make this function output a single &lt;em&gt;multilinestring&lt;/em&gt;. We would then only have to perform one length calculation.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;road_pieces&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Collect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;geom&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'58'&lt;/span&gt;    
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;geom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;length&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;road_pieces&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This query will run around &lt;em&gt;1 ms&lt;/em&gt; faster than the former and it returns &lt;em&gt;the exact&lt;/em&gt; same distance of &lt;em&gt;536.469 Kilometers&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Now that we have this one multilinestring which represents route 58, we could check how close this route comes to our famous Kin building (which we will statically feed):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;road_pieces&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Collect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;way&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;geom&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;planet_osm_line&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'58'&lt;/span&gt;    
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;geom&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_GeomFromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POLYGON((14239931.42 3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08 3053984.84,14239931.42 3054117.72))'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;900913&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;road_pieces&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Which would give us:&lt;/p&gt;
&lt;pre class="code literal-block"&gt; 7900.58662432767
&lt;/pre&gt;


&lt;p&gt;In other words: Route 58 is, at its closest point, &lt;em&gt;7.9 Kilometers&lt;/em&gt; from our coal storage building.
This query now took about &lt;em&gt;5 ms&lt;/em&gt; to complete. A rather nice runtime.&lt;/p&gt;
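&lt;p&gt;Coming back to the question that started this section, the lengths of the separate pieces can also be summed for every trunk road at once. A minimal sketch, assuming that all pieces of a route share the same "ref" value and that pieces without a reference can be ignored:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Sum the lengths of all pieces per route reference,
-- longest trunk road first.
SELECT ref, sum(ST_Length(way)) AS total_length
    FROM planet_osm_line
    WHERE highway = 'trunk'
    AND ref IS NOT NULL
    GROUP BY ref
    ORDER BY total_length DESC;
&lt;/pre&gt;


&lt;p&gt;Note that this only counts pieces tagged as &lt;em&gt;trunk&lt;/em&gt;, so the total for a route can differ from the numbers above, which filtered on "ref" alone.&lt;/p&gt;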
&lt;p&gt;Okay, enough exploring for today.&lt;/p&gt;
&lt;p&gt;We took a brief look at indexing our spatial data, and what benefits we could gain from it.
And, as you can imagine, a lack of indexes and improper use of the GIS functions could lead to dramatic slowdowns, certainly on larger data sets.&lt;/p&gt;
&lt;h3&gt;Shapefiles&lt;/h3&gt;
&lt;p&gt;Before I let you go, I want to take a brief look at another mechanism of carrying around GIS data: the &lt;em&gt;shapefile&lt;/em&gt;.
Probably more widely used than the OSM XML format, but less open. It is almost the standard way of exchanging data between GIS systems.&lt;/p&gt;
&lt;p&gt;We can import shapefiles by using a tool called "shp2pgsql" which comes shipped with PostGIS.
This tool will attempt to load &lt;em&gt;ESRI&lt;/em&gt; shape data into your PostGIS-enabled database.&lt;/p&gt;
&lt;h4&gt;ESRI?&lt;/h4&gt;
&lt;p&gt;&lt;em&gt;ESRI&lt;/em&gt; stands for &lt;em&gt;Environmental Systems Research Institute&lt;/em&gt; and is yet another organization that taps into the world of digital cartography.&lt;/p&gt;
&lt;p&gt;They have defined a (somewhat open) file format standard that allows the GIS world to save their data in a so called &lt;em&gt;shapefile&lt;/em&gt;.
These files hold GIS primitives (polygons, linestrings, points, ...) together with a bunch of descriptive information that tells us what each primitive represents.&lt;/p&gt;
&lt;p&gt;It was once developed for ESRI's own, proprietary software package (ArcGIS), but was quickly picked up by the rest of the GIS community.
Today, almost all serious GIS packages have the ability to read and/or write to such shapefiles.&lt;/p&gt;
&lt;h4&gt;Shapefile build-up&lt;/h4&gt;
&lt;p&gt;Let us take a peek at the guts of such a shapefile.&lt;/p&gt;
&lt;p&gt;First, contrary to what the name suggests, a shapefile is not a single file. To be spec compliant, it is a bundle of at least three files:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;&lt;em&gt;.shp&lt;/em&gt;: the first mandatory file has the extension &lt;em&gt;.shp&lt;/em&gt; and holds the GIS primitives themselves.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;.shx&lt;/em&gt;: the second important file is an index of the geometry &lt;/li&gt;
&lt;li&gt;&lt;em&gt;.dbf&lt;/em&gt;: the last needed file is a database file with geometry attributes&lt;/li&gt;
&lt;/ul&gt;&lt;h3&gt;Getting shapefile data&lt;/h3&gt;
&lt;p&gt;There are many organizations who offer shapefiles of all areas of the globe, either free or for a small fee.
But since we already have data in our database we are familiar with, we could create our own shapefiles.&lt;/p&gt;
&lt;h4&gt;Exporting with pgsql2shp&lt;/h4&gt;
&lt;p&gt;Besides "shp2pgsql", which is used to import or &lt;em&gt;load&lt;/em&gt; shapefiles, we also got shipped a reverse tool called "pgsql2shp", which can export to or &lt;em&gt;dump&lt;/em&gt; shapefiles based on geometry in your database.&lt;/p&gt;
&lt;p&gt;So let us, as an experiment, create a shapefile containing all secondary roads of Okinawa.&lt;/p&gt;
&lt;p&gt;First we need to prepare an empty directory where this tool can dump our data. Since it will create multiple files, it is best to put them in their own spot.
Open up a terminal window and go to your favorite directory-making place and create a directory called "okinawa-roads":&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;mkdir okinawa-roads
&lt;/pre&gt;


&lt;p&gt;Next enter that directory.&lt;/p&gt;
&lt;p&gt;The "pgsql2shp" tool needs a few parameters to be able to successfully complete. We will be using the following flags:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;-f, tells the tool which file name to use for the output files&lt;/li&gt;
&lt;li&gt;-u, the database user to connect with&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;After these flags we need to input the database we wish to take a chunk out of and the query which will determine the actual data to be dumped.&lt;/p&gt;
&lt;p&gt;The above will result in the following command:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="nv"&gt;$ &lt;/span&gt; pgsql2shp -f secundairy_roads -u postgres gis &lt;span class="s2"&gt;"select way, ref from planet_osm_line where highway = 'secondary';"&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As you can see we construct a query which only gets the road reference and the geometry "way" column from the secondary road types.&lt;/p&gt;
&lt;p&gt;After some processing it will have created 4 files, the 3 mandatory ones mentioned above, and a new one called a &lt;em&gt;projection&lt;/em&gt; file.
This file contains the coordinate system and other projection information in WKT format.&lt;/p&gt;
&lt;p&gt;This bundle of 4 files is now our shapefile format which you could easily exchange between GIS aware software packages.&lt;/p&gt;
&lt;h4&gt;Importing with shp2pgsql&lt;/h4&gt;
&lt;p&gt;Let us now import this shapefile back into PostgreSQL and see what happens.&lt;/p&gt;
&lt;p&gt;For this we will ignore our "gis" database, and simply create a new database to keep things separated.
Connect to a PostgreSQL terminal, create the database and make it PostGIS aware:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;DATABASE&lt;/span&gt; &lt;span class="n"&gt;gisshape&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="n"&gt;gisshape&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;postgis&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now go back to your terminal window to do some importing.&lt;/p&gt;
&lt;p&gt;The import tool works by writing the SQL statements to &lt;em&gt;stdout&lt;/em&gt;, which you can redirect to a SQL dump file if preferred.
If you do not wish to work with such a dump file, you have to pipe the output to the &lt;em&gt;psql&lt;/em&gt; command to be able to load in the data.&lt;/p&gt;
&lt;p&gt;From the directory where you saved the shapefile dump, run the "shp2pgsql" tool:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;shp2pgsql -S -s &lt;span class="m"&gt;900913&lt;/span&gt; -I secundairy_roads &lt;span class="p"&gt;|&lt;/span&gt; psql -U postgres gisshape
&lt;/pre&gt;


&lt;p&gt;Let me go over the flags we used:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;-S: is used to keep the geometry &lt;em&gt;simple&lt;/em&gt;. The tool otherwise will convert all geometry to its &lt;em&gt;MULTI...&lt;/em&gt; counterpart&lt;/li&gt;
&lt;li&gt;-s: is needed to set the correct SRID&lt;/li&gt;
&lt;li&gt;-I: specifies that we wish the tool to create &lt;em&gt;GiST&lt;/em&gt; indexes on the geometry columns&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Note that the &lt;em&gt;-S&lt;/em&gt; flag will only work if all of your geometry is actually simple and does not contain true MULTI... types of geometry with multiple linestrings, points or polygons in them.&lt;/p&gt;
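&lt;p&gt;If you prefer to go through a dump file instead of a pipe, a sketch of the same import in two steps could look like this (the file name "secundairy_roads.sql" is simply a name picked for this example):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;$ shp2pgsql -S -s 900913 -I secundairy_roads &amp;gt; secundairy_roads.sql
$ psql -U postgres -d gisshape -f secundairy_roads.sql
&lt;/pre&gt;


&lt;p&gt;The intermediate file lets you inspect the generated SQL before anything touches the database.&lt;/p&gt;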
&lt;p&gt;An annoying fact is that you &lt;em&gt;have&lt;/em&gt; to tell the loader which SRID your geometry is in. There is a &lt;em&gt;.prj&lt;/em&gt; file in our shapefile bundle, but it only contains the WKT projection information, not the SRID.
One trick to find the SRID based on the information in the projection file is to use &lt;em&gt;OpenGeo&lt;/em&gt;'s &lt;a href="http://prj2epsg.org"&gt;Prj2EPSG&lt;/a&gt; website, which does quite a good job at looking up the EPSG ID (which most of the time is the SRID). However, it fails to find the SRID of our OSM projection.&lt;/p&gt;
&lt;p&gt;Another way of finding out about the SRID is by using the PostGIS &lt;em&gt;spatial_ref_sys&lt;/em&gt; table itself:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;srid&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;spatial_ref_sys&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;srtext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'PROJCS["Popular Visualisation CRS / Mercator (deprecated)",GEOGCS["Popular Visualisation CRS",DATUM["Popular_Visualisation_Datum",SPHEROID["Popular Visualisation Sphere",6378137,0,AUTHORITY["EPSG","7059"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["EPSG","6055"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4055"]],UNIT["metre",1,AUTHORITY["EPSG","9001"]],PROJECTION["Mercator_1SP"],PARAMETER["central_meridian",0],PARAMETER["scale_factor",1],PARAMETER["false_easting",0],PARAMETER["false_northing",0],AUTHORITY["EPSG","3785"],AXIS["X",EAST],AXIS["Y",NORTH]]'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will give us:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;900913
&lt;/pre&gt;


&lt;p&gt;Perfect!&lt;/p&gt;
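&lt;p&gt;Matching on the complete WKT string is rather brittle, since a single differing character makes the comparison fail. A looser sketch, assuming the projection name from your &lt;em&gt;.prj&lt;/em&gt; file also appears in the "srtext" column, is a pattern match:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Look for the projection name anywhere inside the stored WKT.
SELECT srid, auth_name, auth_srid
    FROM spatial_ref_sys
    WHERE srtext LIKE '%Popular Visualisation CRS / Mercator%';
&lt;/pre&gt;


&lt;p&gt;This may well return more than one row, so you still have to check which candidate fits your data.&lt;/p&gt;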
&lt;p&gt;If you now connect to your database and query its structure:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="n"&gt;gisshape&lt;/span&gt;
&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You will see we have a new table called "secundairy_roads", named after our shapefile. This table now holds only the information we dumped into the shapefile, being our road route numbers and their geometry. Neat!&lt;/p&gt;
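&lt;p&gt;A quick sanity check on the import could look like the sketch below. It assumes the loader named the table after the shapefile's base name ("secundairy_roads" in the commands above) and used its default geometry column name, which should be "geom" for the PostGIS 2.x loader; adjust both if your setup differs:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Peek at a few imported rows: route number, geometry type and SRID.
SELECT ref, ST_GeometryType(geom) AS type, ST_SRID(geom) AS srid
    FROM secundairy_roads
    LIMIT 5;
&lt;/pre&gt;


&lt;p&gt;With the &lt;em&gt;-S&lt;/em&gt; and &lt;em&gt;-s&lt;/em&gt; flags we used, you would expect simple linestrings carrying SRID 900913.&lt;/p&gt;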
&lt;h3&gt;The end&lt;/h3&gt;
&lt;p&gt;Good.&lt;/p&gt;
&lt;p&gt;We are done folks. I hope I have given you enough firepower to be able to commence with your own GIS work, using PostGIS.
As I said at the beginning of this series, the past three chapters are merely an introduction to the capabilities of PostGIS, so, as I expect you will do every time: go out and explore!&lt;/p&gt;
&lt;p&gt;Try to load in different areas of the world, either with OpenStreetMap or by using shapefiles. Experiment with all the different GIS functions and operators that PostGIS makes available.&lt;/p&gt;
&lt;p&gt;And above all, have fun!&lt;/p&gt;
&lt;p&gt;And as always...thanks for reading!&lt;/p&gt;
&lt;/div&gt;</description><category>postgis</category><category>postgresql</category><guid>http://shisaa.be/postset/postgis-postgresqls-spatial-partner-part-3.html</guid><pubDate>Wed, 25 Jun 2014 10:00:00 GMT</pubDate></item><item><title>Postgis, PostgreSQL's spatial partner - Part 2</title><link>http://shisaa.be/postset/postgis-postgresqls-spatial-partner-part-2.html</link><dc:creator>Tim van der Linden</dc:creator><description>&lt;div&gt;&lt;p&gt;Welcome to the second part of our spatial story. If you have not done so, I advise you to go and read &lt;a href="http://shisaa.be/postset/postgis-postgresqls-spatial-partner-part-1.html" title="Part one of this series."&gt;part one&lt;/a&gt; first.&lt;/p&gt;
&lt;p&gt;The first part of this series gives you some basic knowledge about the GIS world (GIS Objects, WKT, Projections, ...).
This knowledge will come in handy in this chapter.&lt;/p&gt;
&lt;p&gt;Today we will finally take an actual peek at PostGIS and do some database work:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;We will see how we can create valid GIS objects and insert them into our database&lt;/li&gt;
&lt;li&gt;Next let PostGIS retrieve information about these inserted GIS objects&lt;/li&gt;
&lt;li&gt;Further down the line we will manipulate these objects a bit more&lt;/li&gt;
&lt;li&gt;Then we will leap from geometry into geography&lt;/li&gt;
&lt;li&gt;Finally we will be doing some real world measurements&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Let us get started right away!&lt;/p&gt;
&lt;h3&gt;Creating the database&lt;/h3&gt;
&lt;p&gt;Before we can do anything else, we need to make sure that we have the PostGIS extension installed.
PostGIS is most of the time packaged as a PostgreSQL contribution package.
On a Debian system, it can be installed as follows:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;apt-get install postgresql-9.3-postgis-2.1
&lt;/pre&gt;


&lt;p&gt;This will install PostGIS version 2.1 for the PostgreSQL 9.3 database.&lt;/p&gt;
&lt;p&gt;Next, fire up your database console and let us first create a new user and database:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;user&lt;/span&gt; &lt;span class="n"&gt;gis&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;PASSWORD&lt;/span&gt; &lt;span class="s1"&gt;'10gis10'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;DATABASE&lt;/span&gt; &lt;span class="n"&gt;gis&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="k"&gt;OWNER&lt;/span&gt; &lt;span class="n"&gt;gis&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Not very original names, I know, but they state their purpose.
Next, connect to the &lt;em&gt;gis&lt;/em&gt; database and enable the PostGIS extension:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="n"&gt;gis&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;postgis&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now our database is PostGIS aware, and we are ready to get our hands dirty!&lt;/p&gt;
&lt;p&gt;Notice that if you now describe your database:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;PostGIS has created a new table and a few new views. This is PostGIS's own bookkeeping and it will store which tables contain geometry or geography columns.&lt;/p&gt;
&lt;h3&gt;Fun with Polygons&lt;/h3&gt;
&lt;p&gt;Let us begin this adventure with creating a polygon that has one interior ring, similar to the one we saw in the previous chapter.&lt;/p&gt;
&lt;p&gt;Before we can create any polygons, though, we have to create a table that will hold their geometrical data:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now we have a table named "shapes" with only a column to store its name. But where do we store the geometry?&lt;/p&gt;
&lt;p&gt;Because PostGIS introduces new data types (geometry and geography) and needs to keep its bookkeeping up to date, you can create this column with a PostGIS function named &lt;em&gt;AddGeometryColumn()&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;AddGeometryColumn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'shapes'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'shape'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'POLYGON'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Let us do a breakdown.&lt;/p&gt;
&lt;p&gt;First, all the functions that PostGIS makes available to us are divided into groups that define their area of use. &lt;em&gt;AddGeometryColumn()&lt;/em&gt; falls in the "Management Functions" group.&lt;/p&gt;
&lt;p&gt;It is a function that creates a geometry column in a table of your choice and adds a reference to this column to the PostGIS bookkeeping. It accepts a number of arguments:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;The table name to where you wish to add the column&lt;/li&gt;
&lt;li&gt;The actual column name you wish to have&lt;/li&gt;
&lt;li&gt;The SRID&lt;/li&gt;
&lt;li&gt;The geometry type you wish to store, named as in WKT (POLYGON, for example)&lt;/li&gt;
&lt;li&gt;The coordinate dimension you desire (2 means XY)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;In the above case we thus wish to add a geometry column to the "shapes" table. The column will be named "shape". The geometry inserted there will get an SRID of 0 and will be of object type POLYGON and have a normal, two dimensional coordinate layout.&lt;/p&gt;
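&lt;p&gt;If you are curious what that bookkeeping looks like, you can peek at it yourself. A minimal sketch, assuming PostGIS 2.x, where the bookkeeping is exposed through the &lt;em&gt;geometry_columns&lt;/em&gt; view:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- List the geometry columns PostGIS knows about for our table
SELECT f_table_name, f_geometry_column, coord_dimension, srid, type
    FROM geometry_columns
    WHERE f_table_name = 'shapes';
&lt;/pre&gt;


&lt;p&gt;This should list a single row for the "shape" column, echoing the SRID, type and dimension we passed to the function.&lt;/p&gt;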
&lt;h4&gt;SRID?&lt;/h4&gt;
&lt;p&gt;One thing that you might not yet know from the above function definition is the &lt;em&gt;SRID&lt;/em&gt; or &lt;em&gt;Spatial Reference ID&lt;/em&gt;, a &lt;em&gt;very&lt;/em&gt; important number when working with spatial data.
Remember in the last chapter I kept on yapping about different projections we had and that each projection would yield different results?
Well, this is where all this information comes together: the SRID.&lt;/p&gt;
&lt;p&gt;Our famous OGC has created a lookup table containing a whopping &lt;em&gt;3911&lt;/em&gt; entries, each entry with a unique ID, the SRID.
This table is called &lt;em&gt;spatial_ref_sys&lt;/em&gt; and is, by default, installed into your PostgreSQL database when you enable PostGIS.&lt;/p&gt;
&lt;p&gt;But hold on, there is something I neglected to tell you in the previous chapter: the European Petroleum Survey Group or EPSG.
The following is something that confuses many people and makes them mix-and-match SRIDs and EPSG IDs. I will try my best not to add to that confusion.&lt;/p&gt;
&lt;h4&gt;EPSG&lt;/h4&gt;
&lt;p&gt;The EPSG, now called the OGP, is a group of organizations that, among other things, concern themselves with cartography.
They are the world's number one authority that &lt;em&gt;defines&lt;/em&gt; how spatial coordinates (projected or real world) should be calculated.
All the definitions they make get an accompanying ID called the EPSG ID.&lt;/p&gt;
&lt;p&gt;The OGC maintains a list to be used inside databases (GIS systems). They give all their entries a unique SRID.
These entries refer to &lt;em&gt;defined&lt;/em&gt; and &lt;em&gt;official&lt;/em&gt; projections, primarily maintained by the &lt;em&gt;EPSG&lt;/em&gt; which have their own EPSG ID and unique name.
Other projections (not maintained by the EPSG) are also accepted into the OGC SRID list as are your own projections (if you would feel the need).&lt;/p&gt;
&lt;p&gt;Let us poke the spatial reference table and see if we can get a clearer picture.&lt;/p&gt;
&lt;p&gt;If we query our table (sorry for the wildcard) and ask for a famous SRID (more on this one later):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;spatial_ref_sys&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;srid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4326&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We would get back one row containing:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;srid, which is the famous id&lt;/li&gt;
&lt;li&gt;auth_name, the name of the authority organization, in most cases EPSG&lt;/li&gt;
&lt;li&gt;auth_srid, the EPSG ID the authority organization introduced&lt;/li&gt;
&lt;li&gt;srtext, tells us how the spatial reference is built using WKT&lt;/li&gt;
&lt;li&gt;proj4text, commands that drive the proj4 library which is used to make the actual projections&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;And as you can see, the "srid" and "auth_srid" columns are identical. This will be the case with many entries.&lt;/p&gt;
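&lt;p&gt;To check this for yourself without the wildcard, a small sketch along these lines should do (it only uses the columns listed above):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Only the identifying columns of the famous entry
SELECT srid, auth_name, auth_srid
    FROM spatial_ref_sys
    WHERE srid = 4326;

-- How many entries does each authority contribute?
SELECT auth_name, count(*) AS entries
    FROM spatial_ref_sys
    GROUP BY auth_name
    ORDER BY entries DESC;
&lt;/pre&gt;


&lt;p&gt;The exact counts depend on the PostGIS version you have installed, so do not be alarmed if your totals differ from mine.&lt;/p&gt;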
&lt;p&gt;I should also tell you that this huge list of SRID entries mostly consists of dead or localized projections.
Many of the projections listed are not used anymore, but were popular at some time in history (they are marked deprecated), or are very localized.
In the previous chapter I mentioned that the general UTM system, for example, could be used as a framework for more localized UTM projections.
There are hundreds of these local projections that only make sense when used in the area they are intended for.&lt;/p&gt;
&lt;h4&gt;Simple Features Functions&lt;/h4&gt;
&lt;p&gt;As I have told you before, the functions that PostGIS makes available are divided into several defined groups. The functions themselves are also defined, not by PostGIS but by the Simple Features standard maintained by the &lt;em&gt;OGC&lt;/em&gt; (as we saw in the previous chapter).&lt;/p&gt;
&lt;p&gt;There are a total of 8 major categories available:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Management functions: functions which can manipulate the internal bookkeeping of PostGIS&lt;/li&gt;
&lt;li&gt;Geometry constructors: functions that can create or construct geometry and geography objects&lt;/li&gt;
&lt;li&gt;Geometry accessors: functions that let us access and ask questions about the GIS objects&lt;/li&gt;
&lt;li&gt;Geometry editors: functions that let us manipulate GIS objects&lt;/li&gt;
&lt;li&gt;Geometry outputs: functions that give us various means by which to transform and "export" GIS objects&lt;/li&gt;
&lt;li&gt;Operators: various SQL operators to query our geography and geometry&lt;/li&gt;
&lt;li&gt;Spatial relationships and measurements: functions that let us do calculations between different GIS objects&lt;/li&gt;
&lt;li&gt;Geometry processing: functions to perform basic operations on GIS objects&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;I have left a few categories out for they are either not part of the Simple Features standard (such as three dimensional manipulations) or beyond the scope.
To see a list of all of the functions and their categories, I advise you to visit the PostGIS reference, &lt;a href="http://postgis.net/docs/reference.html"&gt;section 8&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Let us now do some fun manipulations and use some of the functions from these categories, just to get a bit more familiar with how it all works together.&lt;/p&gt;
&lt;p&gt;If you ran the earlier SQL command that creates the geometry column, you should have gotten back the following result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt; &lt;span class="n"&gt;SRID&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;TYPE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;POLYGON&lt;/span&gt; &lt;span class="n"&gt;DIMS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This tells us we created the "shape" column in the "shapes" table and set the SRID to 0.
SRID 0 is a convention used to tell a GIS system that you currently do not care about the SRID and simply want to store geometry with an arbitrary X and Y value.&lt;/p&gt;
&lt;p&gt;Let us now insert the shape of our square. To insert a polygon into your column, you could use various functions. One of these functions is &lt;em&gt;ST_GeomFromText()&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'Square with hole'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ST_GeomFromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POLYGON ((8 1, 8 8, 1 8, 1 1, 8 1), (6 3, 6 6, 3 6, 3 3, 6 3))'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This now inserts our polygon and gives it a name. The &lt;em&gt;ST_GeomFromText()&lt;/em&gt; function enables us to enter our polygon object using WKT.
This function also accepts a second, optional parameter which is the SRID by which we wish to work.
The category of this function is called &lt;em&gt;Geometry Constructors&lt;/em&gt;.&lt;/p&gt;
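&lt;p&gt;If you want to double check what actually landed in the table, here is a quick sketch using a few accessor functions:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Inspect the SRID, type and validity of the stored polygon
SELECT ST_SRID(shape), GeometryType(shape), ST_IsValid(shape)
    FROM shapes
    WHERE name = 'Square with hole';
&lt;/pre&gt;


&lt;p&gt;For our square I would expect an SRID of 0, the type POLYGON and a validity of true.&lt;/p&gt;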
&lt;p&gt;You know this polygon has two rings, the exterior and the interior. Let us now ask PostGIS to return only the line that represents the exterior ring:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_ExteriorRing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Square with hole'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="mi"&gt;0102000020&lt;/span&gt;&lt;span class="n"&gt;E6100000050000000000000000002040000000000000F03F00000000000020400000000000002040000000000000F03F0000000000002040000000000000F03F000000000000F03F0000000000002040000000000000F03F&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Oh my... that is not what we expected, yet it is correct. This is the hex-encoded binary form in which PostGIS hands back geometry and geography values.
It is accurate, but unreadable to us humans.&lt;/p&gt;
&lt;p&gt;If we wish to get back a readable WKT string, we have to convert it using one of the conversion functions:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_AsText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ST_ExteriorRing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Square with hole'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we get:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;LINESTRING&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Aha, that is more like it! This we can read!&lt;/p&gt;
&lt;p&gt;We used the &lt;em&gt;ST_ExteriorRing()&lt;/em&gt; function, which falls under the &lt;em&gt;Geometry Accessors&lt;/em&gt; category, and the &lt;em&gt;ST_AsText()&lt;/em&gt; function, which resides in the category &lt;em&gt;Geometry Outputs&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Okay, now we wish to know the interior ring:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_AsText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ST_InteriorRingN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Square with hole'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;LINESTRING&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Notice that the function &lt;em&gt;ST_InteriorRingN()&lt;/em&gt; requires the index of the ring you wish to get, counting from 1.
As we have seen before, polygon objects can have multiple interior rings, but only a single exterior one.&lt;/p&gt;
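&lt;p&gt;Should you not know up front how many interior rings a polygon has, you can simply ask. A small sketch:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Count the holes, then measure the surface that remains around them
SELECT ST_NumInteriorRings(shape), ST_Area(shape)
    FROM shapes
    WHERE name = 'Square with hole';
&lt;/pre&gt;


&lt;p&gt;For our square this should come back as 1 ring and, if my arithmetic holds, an area of 40: the 7 by 7 exterior minus the 3 by 3 hole.&lt;/p&gt;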
&lt;p&gt;Next, let us ask for a summary of what makes up the shape:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Summary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Square with hole'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;Polygon&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="n"&gt;rings&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;
  &lt;span class="n"&gt;ring&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;points&lt;/span&gt;  &lt;span class="o"&gt;+&lt;/span&gt;
  &lt;span class="n"&gt;ring&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;has&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="n"&gt;points&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Ohh, that is pretty cool. We get back a human readable string that explains to us how this particular piece of geometry is built.&lt;/p&gt;
&lt;p&gt;Let us now add another polygon:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'The intersecting one'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ST_GeomFromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POLYGON ((14 1, 15 8, 7 8, 7 1, 14 1))'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This is a polygon that will &lt;em&gt;intersect&lt;/em&gt; with part of our previous polygon.
Let us ask PostGIS if these polygons really intersect each other:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Intersects&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Square with hole'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'The intersecting one'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;If all is well, this will simply return &lt;em&gt;TRUE&lt;/em&gt; if they intersect and &lt;em&gt;FALSE&lt;/em&gt; if they do not. In this case, it will return &lt;em&gt;TRUE&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The counterpart of our intersect function is &lt;em&gt;ST_Disjoint()&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Disjoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Square with hole'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'The intersecting one'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Which will return &lt;em&gt;FALSE&lt;/em&gt; in our case.&lt;/p&gt;
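&lt;p&gt;Knowing &lt;em&gt;that&lt;/em&gt; two shapes intersect is one thing, knowing &lt;em&gt;where&lt;/em&gt; is another. As a sketch, the &lt;em&gt;ST_Intersection()&lt;/em&gt; function returns the geometry both shapes share:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Show the shared region as WKT, together with its surface
SELECT ST_AsText(ST_Intersection(a.shape, b.shape)),
       ST_Area(ST_Intersection(a.shape, b.shape))
    FROM shapes a, shapes b
    WHERE a.name = 'Square with hole' AND b.name = 'The intersecting one';
&lt;/pre&gt;


&lt;p&gt;Judging by the coordinates, the overlap should be the strip between X = 7 and X = 8, which would make the area 7.&lt;/p&gt;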
&lt;p&gt;Let us now add a third polygon which does not intersect our previous two:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'The solitary one'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ST_GeomFromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POLYGON ((20 20, 20 40, 1 40, 1 20 ,20 20))'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This polygon will reside well "above" the other two and does not share any space.
Let us now see how far this polygon resides from our first polygon, the one with the hole:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Square with hole'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'The solitary one'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This returns the number 12, which means they are 12 units apart.
And remembering the definition of both shapes, the first shape reaches up to Y = 8 and the second shape starts at Y = 20.
This indeed leaves a gap of 12.&lt;/p&gt;
&lt;p&gt;Nice! We have just measured the distance between two objects in a spatial database!&lt;/p&gt;
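&lt;p&gt;If you would like to see &lt;em&gt;where&lt;/em&gt; that distance was measured, here is a sketch using &lt;em&gt;ST_ShortestLine()&lt;/em&gt;, which returns the shortest line connecting two geometries:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- The connecting line as WKT, plus its length for comparison
SELECT ST_AsText(ST_ShortestLine(a.shape, b.shape)),
       ST_Length(ST_ShortestLine(a.shape, b.shape))
    FROM shapes a, shapes b
    WHERE a.name = 'Square with hole' AND b.name = 'The solitary one';
&lt;/pre&gt;


&lt;p&gt;The length of that line should match whatever &lt;em&gt;ST_Distance()&lt;/em&gt; reports for the same pair.&lt;/p&gt;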
&lt;p&gt;Hmmm, this may mean we are getting closer to knowing the distance to Tokyo...but not yet, we need to play a bit more first.&lt;/p&gt;
&lt;p&gt;PostGIS also has the ability to manipulate geometry. Let us, for example, try to move our solitary polygon even further away using the &lt;em&gt;ST_Translate()&lt;/em&gt; function under the &lt;em&gt;Geometry Editors&lt;/em&gt; category:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Translate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'The solitary one'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The &lt;em&gt;ST_Translate()&lt;/em&gt; function takes the geometry to move, followed by an X offset, a Y offset and an optional Z offset.&lt;/p&gt;
&lt;p&gt;Running this query will give us a binary representation of a &lt;em&gt;new&lt;/em&gt; piece of geometry. The original geometry is not altered.
So how can we actually move the geometry that resides in the database?&lt;/p&gt;
&lt;p&gt;Simply by using SQL:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;
    &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ST_Translate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'The solitary one'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Let us now check the new distance:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Square with hole'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'The solitary one'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="mi"&gt;22&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Aha! Nice! It has now moved ten units upwards.&lt;/p&gt;
&lt;p&gt;Now let us alter the distance once again, but this time we will scale the polygon down:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;
    &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ST_Scale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'The solitary one'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;If we now check the distance, we will get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="mi"&gt;7&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Wow, we have gone from 22 to 7, how did that happen?&lt;/p&gt;
&lt;p&gt;Well, it is important to know that the &lt;em&gt;ST_Scale()&lt;/em&gt; function currently only supports scaling by multiplying each coordinate. This means that the polygon will not only become smaller or bigger, but will also move towards or away from the origin as a result. To know exactly how our new, scaled version of our polygon looks, we can use the &lt;em&gt;ST_Boundary()&lt;/em&gt; function, which for a polygon without holes returns its outline as a linestring:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_AsText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ST_Boundary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'The solitary one'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we will get:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;LINESTRING&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;If we compare that to the same result before scaling (which I handily made ready for you):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;LINESTRING&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You can see that each value in each coordinate was simply divided by 2.
This also clarifies why our polygons are now only 7 units apart (the first square stops at 8, the scaled square starts at 15).&lt;/p&gt;
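&lt;p&gt;If you want to convince yourself of that halving without touching the table, a minimal sketch is to scale a literal copy of the square. I am assuming here that the scaling was done with &lt;em&gt;ST_Scale()&lt;/em&gt; and a factor of 0.5 on both axes; the polygon below is simply rebuilt from the unscaled boundary shown above:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Scale a literal polygon by 0.5 on both the X and the Y axis.
-- The coordinates are copied from the unscaled boundary above.
SELECT ST_AsText(
    ST_Scale(
        ST_GeomFromText('POLYGON((20 30,20 50,1 50,1 30,20 30))'),
        0.5, 0.5
    )
);
&lt;/pre&gt;
&lt;p&gt;Every X and Y value in the result should be half of the original one.&lt;/p&gt;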
&lt;p&gt;Okay, okay, I guess we have played enough now.
We have seen a small glimpse of the operations you can do on GIS data within PostGIS and seen that PostGIS makes all of this work fairly easy.&lt;/p&gt;
&lt;p&gt;I guess we can now take it one step further and start to actually look at some geography!&lt;/p&gt;
&lt;h3&gt;Fun with the earth&lt;/h3&gt;
&lt;p&gt;Up until now we have been working with an SRID of &lt;em&gt;0&lt;/em&gt;, which means &lt;em&gt;undefined&lt;/em&gt;, inside a &lt;em&gt;geometry&lt;/em&gt; column, meaning the data was of type "geometry".
Now we want to go out and explore the actual earth, which means we wish to continue in a &lt;em&gt;geographical coordinate system&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This brings us to a crossroads of choices. First, you will need to ask yourself the same question we pondered in chapter one: do you wish to work with geometry or geography?&lt;/p&gt;
&lt;p&gt;On the one hand we know that geographical measurements are expensive calculations, but also the most accurate, for they are unprojected.
On the other hand GIS convention tells us that in any case, we should continue in a geometrical or Cartesian system, simply because...well...it is a convention.&lt;/p&gt;
&lt;p&gt;So what do we do?&lt;/p&gt;
&lt;p&gt;It all depends on your specific use case.&lt;/p&gt;
&lt;p&gt;When working on a "small" scale, say, part of North America, it would make sense to not use geography.
Instead, you could (and should) work in a geometrical system using a very accurate local projection built on a regional datum such as &lt;em&gt;NAD27&lt;/em&gt; (SRID 4267) or &lt;em&gt;NAD83&lt;/em&gt; (SRID 4269). Note that these two SRIDs are themselves still geographical (latitude and longitude); the projected variants, such as the UTM zones for North America, build on those datums and carry their own SRIDs (NAD83 / UTM zone 18N, for example, is SRID 26918).&lt;/p&gt;
&lt;p&gt;Depending on which region you work in, chances are high you have several local projections with their own datum and coordinate system, ready to use.
They are very accurate and less expensive to use than direct geography.&lt;/p&gt;
&lt;p&gt;For us, however, we will be working on a large scale, for we want to measure a distance that covers much of the globe. You cannot use a local projection or local datum for that.&lt;/p&gt;
&lt;p&gt;In such a case you, again, are presented with two options.
You could either neglect the convention and simply use geographical data and functions or be nice and adhere to what is agreed upon and work in a Cartesian system.&lt;/p&gt;
&lt;p&gt;We will be doing both and we will use the common SRID &lt;em&gt;4326&lt;/em&gt;.
This &lt;em&gt;very&lt;/em&gt; popular SRID is at heart geographical, for it uses the geographical coordinate system, but can also be used with geometrical data. Confused?&lt;/p&gt;
&lt;p&gt;Join the club.&lt;/p&gt;
&lt;p&gt;Let me try to clarify.&lt;/p&gt;
&lt;p&gt;First, the authority of this SRID is the EPSG and the EPSG ID is identical to the SRID.
It uses a popular &lt;em&gt;datum&lt;/em&gt; (remember chapter one) called &lt;em&gt;WGS 84&lt;/em&gt; and is referred to as &lt;em&gt;unprojected&lt;/em&gt; for it is a geographical representation.
This datum is the one used by GPS systems and is often referred to as a &lt;em&gt;worldwide datum&lt;/em&gt;.&lt;/p&gt;
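&lt;p&gt;If you are curious what PostGIS itself knows about this SRID, you can peek into the &lt;em&gt;spatial_ref_sys&lt;/em&gt; table that ships with PostGIS. A small sketch:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Look up the authority and the PROJ definition of SRID 4326.
SELECT srid, auth_name, auth_srid, proj4text
    FROM spatial_ref_sys
    WHERE srid = 4326;
&lt;/pre&gt;
&lt;p&gt;The &lt;em&gt;auth_name&lt;/em&gt; and &lt;em&gt;auth_srid&lt;/em&gt; columns should show the EPSG authority and its ID, and the &lt;em&gt;proj4text&lt;/em&gt; column describes a longitude/latitude system on the WGS 84 datum.&lt;/p&gt;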
&lt;p&gt;When you store objects with an SRID of 4326, you are storing them using geographical coordinates aka latitude and longitude.
This is in contrast to projected systems, such as the UTM zones mentioned before, which store their coordinates as easting and northing values in meters.
When you do measurements between two objects carrying this SRID you have two options. You can either do a geographical or a geometrical measurement.&lt;/p&gt;
&lt;p&gt;With a geographical measurement there will be no projection and the system will use the WGS 84 datum (the spheroid) to calculate the distance, in three dimensional space.
As we have seen before, such a calculation is more expensive and unconventional.&lt;/p&gt;
&lt;p&gt;With a geometrical measurement, your geographical coordinates have to be &lt;em&gt;projected&lt;/em&gt; on to a flat Cartesian or &lt;em&gt;geometrical&lt;/em&gt; plane.
This is done automatically when you ask PostGIS to measure distance using one of the more common geometrical functions.
When projecting, all GIS systems will use the &lt;em&gt;Plate Carrée&lt;/em&gt; projection which means they will use the stored latitude and longitude coordinates directly as an X and Y value.&lt;/p&gt;
&lt;p&gt;Let us see this story in action. First we can take a look at the more native geography data. Let us clean our shape table first:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;DropGeometryColumn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'shapes'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'shape'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Here we use the &lt;em&gt;DropGeometryColumn()&lt;/em&gt; function to remove this column from our "shapes" table. Now clear the table:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;TRUNCATE&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Next add a new geography column:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="k"&gt;location&lt;/span&gt; &lt;span class="n"&gt;geography&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POINT'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We create a new &lt;em&gt;geography&lt;/em&gt; column with the name &lt;em&gt;location&lt;/em&gt; in our "shapes" table. We will only be storing Point types.&lt;/p&gt;
&lt;p&gt;Notice that the syntax is different and that here we use plain SQL as opposed to the &lt;em&gt;AddGeometryColumn()&lt;/em&gt; function from before.
Since PostGIS 2 it is possible to create and drop both geometry and geography columns with standard SQL syntax.&lt;/p&gt;
&lt;p&gt;If you wish to rewrite our "shape" column addition from the beginning of this chapter, you could write it like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;shape&lt;/span&gt; &lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POLYGON'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Looks more native and simple, no? Sorry to tell you this so late in the adventure, but now you know the existence of both the functions and the more native SQL syntax. Both will also keep the PostGIS bookkeeping in sync.&lt;/p&gt;
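&lt;p&gt;The bookkeeping I mention is exposed through views. As a small sketch, and assuming you have just added the "location" column as shown above, you could ask the &lt;em&gt;geography_columns&lt;/em&gt; view what it has registered for our table:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- List the geography columns PostGIS knows about for the "shapes" table.
SELECT f_table_name, f_geography_column, srid, type
    FROM geography_columns
    WHERE f_table_name = 'shapes';
&lt;/pre&gt;
&lt;p&gt;For geometry columns there is a sibling view called &lt;em&gt;geometry_columns&lt;/em&gt;.&lt;/p&gt;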
&lt;p&gt;Also, for fun, you could do a describe on the table:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt; Column  &lt;span class="p"&gt;|&lt;/span&gt;         Type          &lt;span class="p"&gt;|&lt;/span&gt; Modifiers 
----------+-----------------------+-----------
name     &lt;span class="p"&gt;|&lt;/span&gt; character varying     &lt;span class="p"&gt;|&lt;/span&gt; 
location &lt;span class="p"&gt;|&lt;/span&gt; geography&lt;span class="o"&gt;(&lt;/span&gt;Point,4326&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As you can see, the column is of type &lt;em&gt;geography&lt;/em&gt; and automatically gets the famous SRID 4326.&lt;/p&gt;
&lt;p&gt;Good, let us now try and find an answer to our famous question: how far is Tokyo from my current location? You will be surprised how trivial this will be.&lt;/p&gt;
&lt;p&gt;First, as you might suspect, since we are only interested in a point on the earth and not the shape of your location nor Tokyo, a Point object will suffice.
Next we will need to insert two points into our database, your location and the center of Tokyo, both in geographical coordinates.&lt;/p&gt;
&lt;h4&gt;Finding Your Location&lt;/h4&gt;
&lt;p&gt;This means you need to find out the exact latitude and longitude of the place you are at right now.&lt;/p&gt;
&lt;p&gt;This could, of course, be done in a myriad of ways: using your cell phone's GPS capabilities, using your dedicated GPS device or using an online map system.
I will choose the latter and will be using OpenStreetMap (what else?) to locate my current position.&lt;/p&gt;
&lt;p&gt;Open up your favorite web browser and surf to &lt;a href="http://openstreetmap.org"&gt;openstreetmap.org&lt;/a&gt;.
Once there, punch in your address or use the "Where Am I" function. This will give you a point on the map and, in the search bar on the left, your latitude and longitude coordinate.
Take this coordinate and save it as point data into your fresh column:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'My location'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_GeographyFromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POINT(127.6791949 26.2124702)'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The point I am inserting reflects central Naha, the main city of the Okinawa prefecture. Not my current location, but it serves as an illustrative point.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note&lt;/em&gt; that PostGIS expects longitude as X and latitude as Y. This is often the reverse of what you get back from other sources, which tend to list latitude first.&lt;/p&gt;
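&lt;p&gt;One way to make the order explicit, sketched here as an alternative to the text notation, is to build the point with &lt;em&gt;ST_MakePoint()&lt;/em&gt;, which takes X (longitude) first and Y (latitude) second, and then read both values back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Build the Naha point as (longitude, latitude) and read the axes back.
SELECT ST_X(p) AS longitude, ST_Y(p) AS latitude
    FROM (
        SELECT ST_SetSRID(ST_MakePoint(127.6791949, 26.2124702), 4326) AS p
    ) AS naha;
&lt;/pre&gt;
&lt;p&gt;If the longitude column shows the value you expected as latitude, the two were swapped somewhere along the way.&lt;/p&gt;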
&lt;p&gt;Now you can insert the location of Tokyo, which I conveniently looked up for you:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Tokyo'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_GeographyFromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POINT(139.7530053 35.6823815)'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Ah, nice! Okay, are you ready to finally, after all the rambling we went through, know the distance?
You already know the syntax, punch in the magic:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'My location'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Tokyo'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;If you happened to live in the exact cartographic center of Naha, you would get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   &lt;span class="n"&gt;st_distance&lt;/span&gt;    
&lt;span class="c1"&gt;------------------&lt;/span&gt;
 &lt;span class="mi"&gt;1557506&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;28103692&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Yeah! This, my lovely folk, is how far you are from Tokyo, at this very moment.&lt;/p&gt;
&lt;p&gt;But what is this number you get back? &lt;/p&gt;
&lt;p&gt;The result you see here is the distance returned in &lt;em&gt;Meters&lt;/em&gt;, meaning, from the point I inserted as "My location", I am 1557506.28 Meters or &lt;em&gt;1557.50628 Kilometers&lt;/em&gt; from Tokyo.&lt;/p&gt;
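&lt;p&gt;If you prefer to let the database do that conversion, a minimal sketch could look like this (the column alias &lt;em&gt;distance_km&lt;/em&gt; is just a name I picked):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Same measurement, expressed in Kilometers and rounded to two decimals.
-- The cast to numeric is needed because round() with a precision
-- argument is defined for numeric, not for double precision.
SELECT round((ST_Distance(a.location, b.location) / 1000)::numeric, 2) AS distance_km
    FROM shapes a, shapes b
    WHERE a.name = 'My location' AND b.name = 'Tokyo';
&lt;/pre&gt;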
&lt;p&gt;Very neat stuff, would you not say? PostgreSQL just told us how far we are from Tokyo, &lt;em&gt;awesome&lt;/em&gt;!&lt;/p&gt;
&lt;p&gt;But wait, we are not finished yet. We have now done the most accurate, real geographical distance measurement using expensive geographical calculations.&lt;/p&gt;
&lt;p&gt;There is an "in-between" solution before we jump to geometry. PostGIS gives us the ability to replace our spheroid datum with the more classical sphere.
The latter has much simpler calculations, but can still return more accurate results than some of the projections.&lt;/p&gt;
&lt;p&gt;To redo our calculation from above with a sphere, simply set the spheroid Boolean, a third and optional parameter to the &lt;em&gt;ST_Distance()&lt;/em&gt; function, to &lt;em&gt;False&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'My location'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Tokyo'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   &lt;span class="n"&gt;st_distance&lt;/span&gt;    
&lt;span class="c1"&gt;------------------&lt;/span&gt;
&lt;span class="mi"&gt;1557886&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;68227339&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Which is a total distance of &lt;em&gt;1557.886 Kilometers&lt;/em&gt;, a difference of around 380 Meters.&lt;/p&gt;
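&lt;p&gt;To see both results and the gap between them in one go, you could combine the two calls in a single query. A small sketch, with column aliases of my own choosing:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Spheroid (default) versus sphere (third argument false), side by side.
SELECT ST_Distance(a.location, b.location) AS spheroid_m,
       ST_Distance(a.location, b.location, false) AS sphere_m,
       ST_Distance(a.location, b.location, false)
           - ST_Distance(a.location, b.location) AS difference_m
    FROM shapes a, shapes b
    WHERE a.name = 'My location' AND b.name = 'Tokyo';
&lt;/pre&gt;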
&lt;p&gt;Let us now repeat this story, but use &lt;em&gt;geometry&lt;/em&gt; instead. Let us do it the GIS conventional way.&lt;/p&gt;
&lt;p&gt;We do not need to recreate our column as a geometry column and insert our data again. We could cheat a little.
PostGIS together with PostgreSQL has the unique capability of &lt;em&gt;casting&lt;/em&gt; data from one type to another.
So without recreating anything, we could simply cast our geography data into geometry &lt;em&gt;on the fly&lt;/em&gt; and see what happens.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note&lt;/em&gt; that casting, while very convenient for quick checks, can render an index on the original column useless.
It is therefore important to think ahead and decide if you want to work with geometry or geography, then create the correct column type and use this &lt;em&gt;without&lt;/em&gt; casting.&lt;/p&gt;
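&lt;p&gt;To make that note a bit more tangible, here is a hedged sketch of what such indexes could look like. The index names are made up for illustration, our tiny table does not really need them, and indexes get a proper treatment in the next chapter:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- An index on the geography column itself. Queries that filter on
-- "location" without a cast are the ones that can benefit from it.
CREATE INDEX shapes_location_idx ON shapes USING GIST (location);

-- If you really must query through a cast, an expression index on
-- the cast value is one possible workaround.
CREATE INDEX shapes_location_geom_idx ON shapes USING GIST ((location::geometry));
&lt;/pre&gt;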
&lt;p&gt;But for our quick and dirty queries, this is fine. Let us continue:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'My location'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Tokyo'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;In this query, we cast (&lt;em&gt;::&lt;/em&gt;) the geography data inside the "location" columns into geometry.
Now we get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   &lt;span class="n"&gt;st_distance&lt;/span&gt;    
&lt;span class="c1"&gt;------------------&lt;/span&gt;
&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3445794209231&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Hmm, that is a different result altogether. It looks like a much smaller number than before. What is happening?&lt;/p&gt;
&lt;p&gt;We just cast our geography to geometry, which means PostGIS will now use a Cartesian system or &lt;em&gt;projection&lt;/em&gt; to calculate the distance in a linear way.
When using the distance measuring function &lt;em&gt;ST_Distance()&lt;/em&gt; on geometry, it will return not meters but the distance expressed in the units the original data was stored in.
Since our data is stored with SRID 4326, its units are latitude and longitude. The value you get back is thus &lt;em&gt;degrees&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;In the case of the Naha location, this will be &lt;em&gt;15.344&lt;/em&gt; degrees from Tokyo.&lt;/p&gt;
&lt;p&gt;For our human brain this is difficult to imagine; a result in Meters is much easier to comprehend. So, let us transform this degree value into a metric value.&lt;/p&gt;
&lt;p&gt;It is an estimation that one planar degree (in our Cartesian system) equals 111 KM. So the distance now becomes 15.344 degrees times 111: &lt;em&gt;1703 Kilometers&lt;/em&gt;.
That is a difference of about 145 Kilometers. &lt;/p&gt;
&lt;p&gt;The reason this difference exists is the projection we are now using. As we have mentioned a few times before, when working with geometry data carrying SRID 4326, PostGIS will automatically use the infamous &lt;em&gt;Plate Carrée&lt;/em&gt; projection. This projection, as we have seen before, is the &lt;em&gt;least&lt;/em&gt; accurate for something like distance measuring.&lt;/p&gt;
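&lt;p&gt;One way to see where the error comes from: a degree of latitude is indeed about 111 Kilometers everywhere, but a degree of longitude shrinks with the cosine of the latitude as you move away from the equator, and Plate Carrée ignores that. The following back-of-the-envelope sketch applies that correction by hand, using the two coordinates from above and the latitude halfway between them. It is only a rough approximation of my own, not what PostGIS does internally:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- dx and dy are the differences in degrees between Tokyo and Naha.
-- Longitude degrees are shortened by cos(latitude), latitude degrees are not.
SELECT sqrt(
           power(dx * 111 * cos(radians(mid_lat)), 2)
         + power(dy * 111, 2)
       ) AS approx_km
    FROM (
        SELECT 139.7530053 - 127.6791949 AS dx,
               35.6823815 - 26.2124702 AS dy,
               (35.6823815 + 26.2124702) / 2 AS mid_lat
    ) AS d;
&lt;/pre&gt;
&lt;p&gt;The outcome should land a lot closer to the geographical measurement than the plain 1703 Kilometers did.&lt;/p&gt;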
&lt;p&gt;So let us poke this projection mechanism and try a different, more accurate one, the Lambert, which carries SRID &lt;em&gt;3587&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;To change the projection PostGIS will use, we can use the &lt;em&gt;ST_Transform()&lt;/em&gt; function, which reprojects objects into a different SRID.
Note that &lt;em&gt;ST_Transform()&lt;/em&gt; only works for geometry objects, so we have to continue to cast our geography location to be able to use them in this function.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ST_Transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3587&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;ST_Transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3587&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'My location'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Tokyo'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will give us:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   &lt;span class="n"&gt;st_distance&lt;/span&gt;    
&lt;span class="c1"&gt;------------------&lt;/span&gt;
&lt;span class="mi"&gt;1602392&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;18109279&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Meaning &lt;em&gt;1602.392 Kilometers&lt;/em&gt;, a difference of about 45 Kilometers. That is indeed in between the Plate Carrée and our native geographical measurement.&lt;/p&gt;
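&lt;p&gt;Since the outcome depends so heavily on the projection, it does not hurt to check what an SRID actually stands for, and which region it was designed for, before relying on it. A small sketch against the &lt;em&gt;spatial_ref_sys&lt;/em&gt; table:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Show the full definition of the projection used in the query above.
SELECT srid, auth_name, srtext
    FROM spatial_ref_sys
    WHERE srid = 3587;
&lt;/pre&gt;
&lt;p&gt;The &lt;em&gt;srtext&lt;/em&gt; column holds the well-known text description, including the name, the datum and the projection method.&lt;/p&gt;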
&lt;p&gt;Another, even more accurate and popular projection is our famous UTM. It cannot, however, be used on a world scale. You can only perform measurements within the same UTM zone.&lt;/p&gt;
&lt;p&gt;As mentioned in the previous chapter, there are roughly 60 World UTM zones on the earth, but each zone uses its own projection and its own coordinates.
This kind of projection is thus not fit for measuring distance on such a large scale.&lt;/p&gt;
&lt;p&gt;Let us therefore take this one step further before I leave you to rest. Let us do a measurement with such a UTM projection.
We will make a measurement inside of Japan's mainland UTM zone: &lt;em&gt;54N&lt;/em&gt; which has an SRID of &lt;em&gt;3095&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;First we will have to make another point in our database:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Aomori'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ST_GeographyFromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POINT(140.750616 40.788079)'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This point represents the city of Aomori in northern Japan, famous for its huge lantern parades.&lt;/p&gt;
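&lt;p&gt;As a quick sanity check that Tokyo and Aomori really share a zone, you can derive the zone number from the longitude. The sketch below uses the common rule of thumb that zones are 6 degrees wide and start counting at 180 degrees west; it ignores the few irregular zones around Norway and Svalbard:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Derive the UTM zone number from the longitude of each stored point.
SELECT name,
       floor((ST_X(location::geometry) + 180) / 6) + 1 AS utm_zone
    FROM shapes;
&lt;/pre&gt;
&lt;p&gt;Tokyo and Aomori should both report zone 54, while the Naha point falls in zone 52.&lt;/p&gt;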
&lt;p&gt;First let us measure with the native geographical calculations:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Aomori'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Tokyo'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This returns:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;st_distance&lt;/span&gt;    
&lt;span class="c1"&gt;------------------&lt;/span&gt;
&lt;span class="mi"&gt;573416&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;203868172&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Or &lt;em&gt;573.416 Kilometers&lt;/em&gt;, which is most accurate.&lt;/p&gt;
&lt;p&gt;Next, let us throw the good old &lt;em&gt;Plate Carrée&lt;/em&gt; projection at it:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Aomori'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Tokyo'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will yield:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;    &lt;span class="n"&gt;st_distance&lt;/span&gt;    
&lt;span class="c1"&gt;------------------&lt;/span&gt;
&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;20224702126502&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Which is in degrees again. Multiplying this by 111 Kilometers yields a total distance of about &lt;em&gt;577.449 Kilometers&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Then let us measure using the correct UTM projection:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ST_Distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ST_Transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3095&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;ST_Transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;geometry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3095&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Aomori'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Tokyo'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will give us:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;  &lt;span class="n"&gt;st_distance&lt;/span&gt;    
&lt;span class="c1"&gt;------------------&lt;/span&gt;
&lt;span class="mi"&gt;573228&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;002047378&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Or &lt;em&gt;573.228 Kilometers&lt;/em&gt; and thus only around 200 meters different, in contrast with the Plate Carrée, which was 4 Kilometers different.&lt;/p&gt;
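&lt;p&gt;If you would like to see the three measurements next to each other, the queries above can be folded into one. A minimal sketch, again using the rough 111 Kilometers per degree estimate for the Plate Carrée column and aliases of my own choosing:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Geography, Plate Carrée (degrees times 111000 meters) and UTM 54N, side by side.
SELECT ST_Distance(a.location, b.location) AS geography_m,
       ST_Distance(a.location::geometry, b.location::geometry) * 111000 AS plate_carree_m,
       ST_Distance(
           ST_Transform(a.location::geometry, 3095),
           ST_Transform(b.location::geometry, 3095)
       ) AS utm_m
    FROM shapes a, shapes b
    WHERE a.name = 'Aomori' AND b.name = 'Tokyo';
&lt;/pre&gt;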
&lt;p&gt;You can see that different projections will result in different measurements. It is therefore crucial to know which one to choose.
Some are better used on a local scale, like we just did for Japan, others are better on a global scale.&lt;/p&gt;
&lt;p&gt;Again, it all comes down to trade-offs and choices.&lt;/p&gt;
&lt;p&gt;Okay, yet another big chunk of PostGIS goodness is taken. I suggest a good rest of the mind.&lt;/p&gt;
&lt;p&gt;We have seen how we can insert various types of geometry and geography, we saw how to manipulate and question them and we looked at a few real world measurements.&lt;/p&gt;
&lt;p&gt;In the next and final chapter, we will be looking at loading some real GIS data from OpenStreetMap into our PostGIS database, take a quick look around my town here in Okinawa and take a deeper look at creating some important indexes.&lt;/p&gt;
&lt;p&gt;And as always...thanks for reading!&lt;/p&gt;
&lt;!--  LocalWords:  PostGIS PostgreSQL GIS
 --&gt;&lt;/div&gt;</description><category>postgis</category><category>postgresql</category><guid>http://shisaa.be/postset/postgis-postgresqls-spatial-partner-part-2.html</guid><pubDate>Wed, 18 Jun 2014 10:00:00 GMT</pubDate></item><item><title>Postgis, PostgreSQL's spatial partner - Part 1</title><link>http://shisaa.be/postset/postgis-postgresqls-spatial-partner-part-1.html</link><dc:creator>Tim van der Linden</dc:creator><description>&lt;div&gt;&lt;h3&gt;Preface&lt;/h3&gt;
&lt;p&gt;In Dutch we have an expression that says "Van hier tot Tokio", which literally translated means "From here to Tokyo" and is used to indicate that something is &lt;em&gt;very&lt;/em&gt; far or &lt;em&gt;very&lt;/em&gt; difficult.
Unless you live in Japan, like me, then Tokyo is not &lt;em&gt;that&lt;/em&gt; far actually....but you get the point. Tokyo is far, period.&lt;/p&gt;
&lt;p&gt;But the question for today is...&lt;em&gt;how&lt;/em&gt; far is it &lt;em&gt;exactly&lt;/em&gt;? From where you are reading this right now...how far is Tokyo from you? How can you know?
You could of course just hop online and question your favorite search engine for help, or use something like Open Street Map to figure it out.&lt;/p&gt;
&lt;p&gt;But that would be too simple, no? This would mean my post has to stop here, and, as some of you might know, it is difficult for me to write short blog posts. Sorry.&lt;/p&gt;
&lt;p&gt;Also, you would miss out on all of the fun that is actually happening behind the screen when you question spatial search engines and that is against my belief: &lt;em&gt;know how the tools you depend on actually work&lt;/em&gt;!&lt;/p&gt;
&lt;p&gt;And, as those same some of you might know, I love PostgreSQL.&lt;/p&gt;
&lt;p&gt;So, knowing that I cannot write short posts &lt;em&gt;and&lt;/em&gt; that I like PostgreSQL...what do you suspect would happen if you asked me how far Tokyo is from my current location?
You guessed it, I would simply use The Elephant to figure that out!&lt;/p&gt;
&lt;p&gt;As I have shown you &lt;a href="http://shisaa.be/postset/postgresql-full-text-search-part-1.html" title="PostgreSQL full text search, chapter one."&gt;before&lt;/a&gt;, PostgreSQL is capable of storing, matching and retrieving much more than boring VARCHAR or INT data types and it is designed to be extendable.
And extending is what the folks behind the &lt;em&gt;PostGIS&lt;/em&gt; project did. To summarize, the PostGIS project extends PostgreSQL to store, match, manipulate and retrieve &lt;em&gt;spatial&lt;/em&gt; data. It makes PostgreSQL a full-blown GIS.&lt;/p&gt;
&lt;p&gt;The purpose of this series is to get your feet wet with PostGIS and to learn a thing or two about GIS itself.
In the first chapter, the one you are reading now, I would like to show you some fundamental GIS concepts: GIS Objects, standardization of GIS, geography and projections. 
We will not be doing any database action today I am afraid.&lt;/p&gt;
&lt;p&gt;Then, starting from the second chapter, we will open up PostgreSQL, initiate a database to be PostGIS aware and start playing around.
We will look at a bunch of different database functions we have available and how the knowledge from this chapter maps to the actual database.
And we will of course be solving the question posed above: how far is Tokyo from your current location. &lt;/p&gt;
&lt;p&gt;Are you ready for a new PostgreSQL adventure?&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note:&lt;/em&gt; I will take you through all of the following information at lightning speed.
My intent is not to make you a GIS expert, but I do feel it is necessary to touch on a few important topics so you know why PostGIS is doing stuff the way it does.
This will hopefully make the actual database work from the next chapter clearer and spark some curiosity towards learning more about this topic.&lt;/p&gt;
&lt;h3&gt;The data&lt;/h3&gt;
&lt;p&gt;Before we can do anything GIS related, we need to take a look at what kind of data we will be working with: the GIS objects.&lt;/p&gt;
&lt;h4&gt;GIS Objects?&lt;/h4&gt;
&lt;p&gt;A geographic information system, or GIS for short, is merely the name of any system which can store, retrieve, generate, manipulate and visualize spatial data - the kind of data that represents objects in two or three dimensional space.&lt;/p&gt;
&lt;p&gt;The GIS world is a world of standards, as with most computer sciences. These standards define what spatial data is and how we can work with it, and are defined and maintained by the Open Geospatial Consortium or &lt;em&gt;OGC&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Every system that wishes to work with GIS data, including PostGIS, should adhere to these standards.&lt;/p&gt;
&lt;h4&gt;Simple Features&lt;/h4&gt;
&lt;p&gt;The OGC's standard for working with GIS data in SQL is defined in an OGC and &lt;em&gt;ISO&lt;/em&gt; specification called &lt;em&gt;Simple Features&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Simple Features defines how we can represent spatial objects, as you will see soon, but also defines how we can access and manipulate them.
You typically have available:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Functions to &lt;em&gt;create&lt;/em&gt; two dimensional spatial objects&lt;/li&gt;
&lt;li&gt;Functions to &lt;em&gt;alter&lt;/em&gt; these objects&lt;/li&gt;
&lt;li&gt;Functions to &lt;em&gt;retrieve&lt;/em&gt; and &lt;em&gt;describe&lt;/em&gt; single or multiple objects&lt;/li&gt;
&lt;li&gt;Functions to &lt;em&gt;compare&lt;/em&gt; and &lt;em&gt;measure&lt;/em&gt; single or multiple objects&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;PostGIS has been certified by the OGC for its wide support of the Simple Features set.&lt;/p&gt;
&lt;h4&gt;Well-known Text&lt;/h4&gt;
&lt;p&gt;The first and most important part that is defined in the Simple Features spec is the means by which we can represent spatial data.
I mean, we know how we can represent numbers or strings of text inside our database, but how do we represent something more abstract, such as a line or a square?&lt;/p&gt;
&lt;p&gt;Folks familiar with 2D drawing or 3D modeling software might already have a gut feeling of how to represent such data, and this gut feeling is right: you store coordinates.
If you wish to represent a line, you will only need to know the two end points of this line to be able to store, manipulate or visualize it.
The same goes for a square, though there you will need four coordinates.&lt;/p&gt;
&lt;p&gt;And, as is always the case with standards, the OGC has devised two famous ways of representing these objects and their coordinates: Well-known Text or &lt;em&gt;WKT&lt;/em&gt; and Well-known Binary or &lt;em&gt;WKB&lt;/em&gt;.
These two describe the same objects, only differing in the area of use.&lt;/p&gt;
&lt;p&gt;WKT is a markup language which you can use to simply write down your objects and use it in queries. It is human readable.
However, if you wish to store it in a database or wish to perform matches on the data, it has to be stored in a defined binary format, the WKB format that is.&lt;/p&gt;
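&lt;p&gt;To make the difference between the two a bit more tangible, here is a minimal Python sketch (Python is my own pick for illustration, it is not part of PostGIS) that packs a simple point into its binary form. It assumes the plain WKB layout for a two dimensional point: one byte for the byte order, a four byte type code and two eight byte floats. The function name point_to_wkb is hypothetical.&lt;/p&gt;

```python
import struct


def point_to_wkb(x, y):
    """Encode a 2D point as Well-known Binary (big-endian variant)."""
    byte_order = 0     # 0 = big-endian (XDR), 1 = little-endian (NDR)
    geometry_type = 1  # 1 is the type code for a Point
    # "!" means big-endian without padding:
    # B = 1 byte, I = 4 byte unsigned integer, d = 8 byte float
    return struct.pack("!BIdd", byte_order, geometry_type, x, y)


wkt = "POINT (4 1)"           # the human readable form
wkb = point_to_wkb(4.0, 1.0)  # the same point as 21 raw bytes

print(wkt)
print(wkb.hex())
```

&lt;p&gt;The text form is what you would type into a query, the hexadecimal dump is roughly what a database would rather keep around.&lt;/p&gt;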
&lt;p&gt;WKT can represent a wide range of objects from simple points to complex multi-polygons. The notation, however, stays roughly the same.
If you wish to represent a square, for example, you could use the &lt;em&gt;POLYGON&lt;/em&gt; object:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;POLYGON&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;For people unfamiliar with the term "polygon", a polygon is a &lt;em&gt;closed&lt;/em&gt;, &lt;em&gt;two dimensional&lt;/em&gt; object with only &lt;em&gt;straight&lt;/em&gt; lines. It has to have a minimum of three coordinates (points) thus giving it a minimum of three straight edges (making it, in that case, a triangle).&lt;/p&gt;
&lt;p&gt;Let us take a deeper look at what is happening here. First, you will see we define a polygon object, which you will need if you wish to represent closed shapes.
Next we define the four coordinates, the four corners of our square, laid out on a fictional grid of 4 by 4 units. There are two important notes to take about this coordinate listing:&lt;/p&gt;
&lt;p&gt;First, the coordinates are all two dimensional and represent an X and a Y coordinate respectively.&lt;/p&gt;
&lt;p&gt;Also, you do not see four but &lt;em&gt;five&lt;/em&gt; coordinates. This is another rule from the spec that tells us that all polygon shapes &lt;em&gt;must&lt;/em&gt; be closed.
To get a better visualization of this you could imagine a pen moving to each coordinate. To finish the loop you draw, the pen has to move back to the original coordinate.&lt;/p&gt;
&lt;p&gt;The last thing to note is that the drawing direction of these coordinates is &lt;em&gt;counterclockwise&lt;/em&gt;, as is the case with most computer defined drawing systems.
This means we put our pen on our grid at coordinate (4 1) and then draw &lt;em&gt;up&lt;/em&gt; in a straight line to (4 4). Next we go &lt;em&gt;left&lt;/em&gt; in a straight line to (1 4) and &lt;em&gt;down&lt;/em&gt; in a straight line to (1 1).
Finally, we close the loop by drawing a straight line &lt;em&gt;right&lt;/em&gt;, to the starting coordinate (4 1).&lt;/p&gt;
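&lt;p&gt;Both rules, the closed loop and the drawing direction, are easy to check yourself. The following is a small Python sketch, not something PostGIS ships; the helper names are made up for this example. It uses the so-called shoelace formula, which yields a positive area for a counterclockwise ring and a negative one for a clockwise ring.&lt;/p&gt;

```python
def parse_ring(text):
    """Turn "4 1, 4 4, 1 4" into a list of (x, y) tuples."""
    return [tuple(float(n) for n in pair.split()) for pair in text.split(",")]


def is_closed(ring):
    """A ring is closed when the pen ends where it started."""
    return ring[0] == ring[-1]


def signed_area(ring):
    """Shoelace formula: positive for counterclockwise, negative for clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


ring = parse_ring("4 1, 4 4, 1 4, 1 1, 4 1")
print(is_closed(ring))    # True: first and last coordinate are equal
print(signed_area(ring))  # 9.0: positive, so counterclockwise
```

&lt;p&gt;Reversing the coordinate list flips the sign of the area, which is a quick way to detect a ring that was drawn in the other direction.&lt;/p&gt;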
&lt;p&gt;It is also perfectly possible to define more than one coordinate set when defining a polygon object.
A definition like this is perfectly legal:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;POLYGON ((8 1, 8 8, 1 8, 1 1, 8 1), (6 3, 6 6, 3 6, 3 3, 6 3))
&lt;/pre&gt;


&lt;p&gt;This will create a square polygon with a size of 7 by 7, called the &lt;em&gt;exterior ring&lt;/em&gt;, and another square inside it with a size of 3 by 3.
Because this small square resides &lt;em&gt;inside&lt;/em&gt; the area of the big square we call it the &lt;em&gt;interior ring&lt;/em&gt; and, as a result, this small square will be interpreted by the standard as a hole in the bigger square.&lt;/p&gt;
&lt;p&gt;To bring this even further, you can define as many holes in your exterior ring as you like, you simply have to make sure that the interior rings never touch each other and never go outside of the exterior ring.
The exterior ring is always derived from the first set of coordinates in your object definition.&lt;/p&gt;
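&lt;p&gt;One practical consequence of holes: they are subtracted from the area of the polygon. A rough Python sketch of that idea, using the two rings from the definition above (the function names are hypothetical and the shoelace formula is my own choice of technique here):&lt;/p&gt;

```python
def ring_area(ring):
    """Absolute area of a closed ring via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_area(rings):
    """The first ring is the exterior, every further ring is a hole."""
    exterior, holes = rings[0], rings[1:]
    return ring_area(exterior) - sum(ring_area(hole) for hole in holes)


exterior = [(8, 1), (8, 8), (1, 8), (1, 1), (8, 1)]
interior = [(6, 3), (6, 6), (3, 6), (3, 3), (6, 3)]

print(polygon_area([exterior, interior]))  # 40.0
```

&lt;p&gt;The exterior ring on its own covers 49 square units and the hole removes 9 of them.&lt;/p&gt;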
&lt;p&gt;The POLYGON object in the WKT standard also has a &lt;em&gt;MULTIPOLYGON&lt;/em&gt; counterpart for when you wish to define a set of multiple, &lt;em&gt;non intersecting&lt;/em&gt; polygon objects which, in turn, can have as many interior rings as you like.&lt;/p&gt;
&lt;p&gt;Other objects we have available in the WKT standard are:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;POINT(0 0) to represent a point on a grid&lt;/li&gt;
&lt;li&gt;LINESTRING(0 0, 0 1) to represent a line. Note that a line can consist of more than two coordinates.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;These also have a &lt;em&gt;MULTIPOINT&lt;/em&gt; and a &lt;em&gt;MULTILINESTRING&lt;/em&gt; variant respectively.&lt;/p&gt;
&lt;p&gt;As we have seen before, all of these objects are two dimensional, but PostGIS also partly supports a three and a four dimensional version of some of these objects.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note&lt;/em&gt; that these extra dimensions are currently &lt;em&gt;not&lt;/em&gt; in the specification and are a PostGIS specific extension on top of the features defined by the OGC.
Furthermore, if the OGC decides to standardize three or four dimensional objects, PostGIS will have to adapt its syntax to stay compliant.
We thus refer to this extended format not as WKT or WKB but as &lt;em&gt;Extended&lt;/em&gt; WKT and &lt;em&gt;Extended&lt;/em&gt; WKB or simply &lt;em&gt;EWKT&lt;/em&gt; and &lt;em&gt;EWKB&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;To make our polygon object three dimensional, we could write it down like so:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;POLYGON&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You can see that we now have three numbers per coordinate, the third one adds a &lt;em&gt;Z&lt;/em&gt; or &lt;em&gt;depth&lt;/em&gt; value.&lt;/p&gt;
&lt;p&gt;A point gets even fancier. If we wish to place a point in three dimensional space, we could write it down the same as we did with our polygon:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;POINT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The third parameter here being, again, a place on the &lt;em&gt;Z&lt;/em&gt; axis.&lt;/p&gt;
&lt;p&gt;But points can also have a &lt;em&gt;fourth&lt;/em&gt; dimension, which sounds fancy, but is nothing more than an extra reference we can ship with our coordinates.
This reference, also called a &lt;em&gt;linear reference&lt;/em&gt;, is a number we can put in place that tells us where, along a linear path, the point we define resides.&lt;/p&gt;
&lt;p&gt;It can be written down like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;POINT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Here we have four numbers, the last one being the linear reference or &lt;em&gt;M&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;With EWKT you also have the possibility to define a two dimensional object with a linear reference:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;POINT&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Here we have again three numbers, but to distinguish between the last number being &lt;em&gt;Z&lt;/em&gt; or &lt;em&gt;M&lt;/em&gt;, we have to reference &lt;em&gt;M&lt;/em&gt; together with our point declaration.
There are more extensions defined in the EWKT and EWKB, but that is slightly off-topic, because, as I mentioned before, these are not standardized.
In most use cases you can simply use the standard WKT and WKB forms.&lt;/p&gt;
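&lt;p&gt;To recap how the numbers of a point are interpreted, here is a tiny Python sketch that names each ordinate. It is a toy parser written for this post, not the parser PostGIS uses, and it only understands the point notations shown above; parse_point is a hypothetical name.&lt;/p&gt;

```python
import re


def parse_point(text):
    """Parse POINT text with 2, 3 or 4 numbers into a dict of named ordinates."""
    match = re.fullmatch(r"\s*POINT\s*(M?)\s*\(([^)]*)\)\s*", text)
    if match is None:
        raise ValueError("not a POINT: " + text)
    has_m_flag = match.group(1) == "M"
    numbers = [float(n) for n in match.group(2).split()]
    if len(numbers) not in (2, 3, 4):
        raise ValueError("expected 2, 3 or 4 numbers: " + text)
    if len(numbers) == 3 and has_m_flag:
        # Three numbers plus the M marker: the last one is the linear reference
        names = ["x", "y", "m"]
    else:
        # Otherwise the numbers are read in the order X, Y, Z, M
        names = ["x", "y", "z", "m"][:len(numbers)]
    return dict(zip(names, numbers))


print(parse_point("POINT (4 1 2)"))
print(parse_point("POINT (4 1 2 6)"))
print(parse_point("POINT M(4 1 6)"))
```

&lt;p&gt;Without the M marker a third number is always read as Z, which is exactly why the marker has to be part of the declaration.&lt;/p&gt;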
&lt;h4&gt;What to use these objects for?&lt;/h4&gt;
&lt;p&gt;You now know what kind of objects we can represent using text and what we can, later along the road, insert into our PostGIS enabled PostgreSQL database.
But how do these points, lines and polygons help us measure distance or help us locate stuff?&lt;/p&gt;
&lt;p&gt;First it is important to understand that all of the objects we have available will act as &lt;em&gt;proxies&lt;/em&gt; to real world objects.
Take, for example, the point. A point can be used on a map to indicate a place, a spot so to speak, without defining shape or size.
When you wish to know where Tokyo is, a point will suffice on a global scale, you do not need nor want to know the exact shape of the metropolis.&lt;/p&gt;
&lt;p&gt;However, if you would zoom in on our fictional map and you wish to see a part of the city the size of a few city blocks, you might be interested in the shapes of buildings, lakes, parks, etc.
These items that take up &lt;em&gt;two dimensional space&lt;/em&gt; will be drawn with polygons that resemble the shape of the real world objects as closely as possible.&lt;/p&gt;
&lt;p&gt;Lines (or linestrings), finally, will almost always be used to represent roads, railroads, metro systems, etc. They often represent actual &lt;em&gt;paths&lt;/em&gt; one could travel along.&lt;/p&gt;
&lt;h3&gt;Geometry and Geography&lt;/h3&gt;
&lt;p&gt;So you know that you can represent a place in the world with a simple point.
And as you also know, a point is defined like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;POINT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This would create a point that sits at coordinate (10 20). But, what does that mean?
How do these numbers relate to the &lt;em&gt;real world&lt;/em&gt;? What &lt;em&gt;is&lt;/em&gt; 10 or 20 anyway?&lt;/p&gt;
&lt;p&gt;Well, first you will have to ask yourself the following question: Do I wish to be Cartesian or Geographical?&lt;/p&gt;
&lt;h4&gt;Cartesian or Geographical&lt;/h4&gt;
&lt;p&gt;As you may or may not remember from your boring math lessons, a Cartesian system is a two dimensional flat grid with an X and a Y axis.
These axes go both positive and negative with the origin sitting exactly in the middle of the flat plane.&lt;/p&gt;
&lt;p&gt;When working with GIS objects, we refer to this flat, Cartesian grid system as &lt;em&gt;Geometry&lt;/em&gt;.
When, however, we are working with measurements or objects related to the &lt;em&gt;real&lt;/em&gt; earth we, in PostGIS, refer to these measurements as &lt;em&gt;Geography&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Why? What is the difference? Well, to understand this, we have to take a step back, a step back into time that is.&lt;/p&gt;
&lt;p&gt;When the first maps of the world were crafted, people truly believed the earth was flat (which it is not...for your information).
This meant that all charts that were drawn assumed we could simply place a grid comprised of an X (length) and Y (height) axis across the drawing and from there measure distances between points. If you wish to know the distance between Paris and London, simply place two points on your map, take your &lt;em&gt;straight&lt;/em&gt; ruler and measure the distance indicated.
Then factor in the chart's scale and you have your distance. You use &lt;em&gt;Geometry&lt;/em&gt; or &lt;em&gt;Geometric measurements&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;However, once the idea took hold that the earth was &lt;em&gt;not&lt;/em&gt; flat (something Greek scholars had argued since antiquity), the chart drawing people gasped for air.
This meant their measuring technique was not correct. If the earth really was a sphere, then one could not simply wrap a grid around it and act as if everything was linear.
A sphere meant that there was a certain amount of distortion happening with their overlaying grid, and the measurements should account for those differences.&lt;/p&gt;
&lt;p&gt;Even later in time, the chart drawing folk, who had barely recovered from their first shock, were zapped again when people started to realize the earth was not a perfect sphere either.
The globe turned out to be slightly flattened at the poles and bulging around the equator, which, again, meant that measurement techniques had to be adjusted.&lt;/p&gt;
&lt;p&gt;This was the birth of the &lt;em&gt;geographical&lt;/em&gt; measurement system, for which cartographers devised a model called the &lt;em&gt;spheroid&lt;/em&gt;.
A spheroid is a three dimensional object on which we can most accurately place points and measure real earth distances.
Each point on such a spheroid is defined by a &lt;em&gt;latitude&lt;/em&gt; and a &lt;em&gt;longitude&lt;/em&gt;:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Latitude is measured from the center of the earth (the hot place) in an angle up or down towards the surface, north or south of the equator&lt;/li&gt;
&lt;li&gt;Longitude is measured from the same hot center in an angle left or right towards the surface, east or west of the prime meridian&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Because both latitude and longitude represent an angle, we express them in &lt;em&gt;degrees&lt;/em&gt; and we simply call them &lt;em&gt;geographical coordinates&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Now, it is not quite convenient to have to carry around a three dimensional spheroid to find out where you are or to measure distance.
A classical old paper map is still easier to bring along and easier to work with.
But how do we go from a spheroid, which has the correct distortion, back to our old, flat, two dimensional geometrical map?&lt;/p&gt;
&lt;p&gt;With &lt;em&gt;projection&lt;/em&gt; or &lt;em&gt;map projection&lt;/em&gt; to be more precise. We need to &lt;em&gt;project&lt;/em&gt; the three dimensional spheroid system onto our two dimensional map.
This projecting is roughly done in three steps.&lt;/p&gt;
&lt;p&gt;First we have to decide whether to take our spheroid as the base or a simpler sphere. A simpler sphere will yield less accurate results because it does not quite represent the correct curvature of the earth, but it does keep the maths behind the calculations simpler and thus can make for faster calculations. When choosing which shape we want, we also will have to define which &lt;em&gt;datum&lt;/em&gt; we would like.&lt;/p&gt;
&lt;p&gt;After choosing the base object and the datum that represents it, we have to transform the geographic system coordinates (latitude and longitude) to more standard X and Y coordinates to be used on a simple, flat, Cartesian plane. &lt;/p&gt;
&lt;p&gt;The last part is to find out to what ratio the final two dimensional surface is scaled compared to the original, base object (which represents the earth).&lt;/p&gt;
&lt;h4&gt;Datum?&lt;/h4&gt;
&lt;p&gt;Before continuing, a word about datums.&lt;/p&gt;
&lt;p&gt;As we said before, people agreed that the earth has a spheroid shape and that this model represents the earth most accurately.
We say "model" because the spheroid is something that is actually &lt;em&gt;defined&lt;/em&gt; with math.&lt;/p&gt;
&lt;p&gt;The math behind the spheroid model is what we call the &lt;em&gt;datum&lt;/em&gt;. It is nothing more than a mathematical formula describing the shape.&lt;/p&gt;
&lt;p&gt;Something we did not see is the fact that there actually are &lt;em&gt;many types&lt;/em&gt; of spheroids out there. Each serving their own purpose and each with their own math aka datum.
Some spheroids are better to do measurements on a global scale, others are better for a more local "zoomed-in" level (continent, country, ...).&lt;/p&gt;
&lt;p&gt;The reason we have to tell which datum (thus shape) our spheroid has, is because while latitude and longitude always represent degrees, they can have different meaning depending on the chosen datum.
If you use a datum that draws the spheroid a little bit "elongated" so to speak, then 1 degree longitude will cover slightly more distance than if the datum draws a more compact spheroid.&lt;/p&gt;
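&lt;p&gt;You can put numbers on that last statement. The Python sketch below takes the equatorial radius (the semi-major axis) of three well known spheroids and calculates how many meters one degree of longitude covers along the equator. Reducing a datum to this single radius is a simplification I make for the sake of the example, and the function name is made up.&lt;/p&gt;

```python
import math

# Semi-major axis (equatorial radius) in meters of a few reference spheroids
SPHEROIDS = {
    "WGS 84": 6378137.0,
    "Clarke 1866": 6378206.4,
    "Bessel 1841": 6377397.155,
}


def meters_per_degree_longitude_at_equator(semi_major_axis):
    """Arc length covered by one degree of longitude along the equator."""
    return semi_major_axis * math.pi / 180.0


for name, axis in SPHEROIDS.items():
    print(name, round(meters_per_degree_longitude_at_equator(axis), 2))
```

&lt;p&gt;The differences are in the order of meters per degree, which sounds small until you start adding up many degrees.&lt;/p&gt;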
&lt;p&gt;We will see more about datums in the next chapter, but it is an important part of GIS.&lt;/p&gt;
&lt;h3&gt;Types of Projections&lt;/h3&gt;
&lt;p&gt;Something that might not be as obvious right now is the fact that going from our three-dee globe to a flat surface is a process of choices.
In an ideal world you wish to keep every aspect of your spheroid intact, meaning the proportions of the objects on the map are accurate everywhere, the shape of these objects is correct, the area covered by the objects is true and the distance between these objects is retained.
However, as it turns out, this is impossible on a two dimensional surface. You have to give up some of these properties to preserve others.&lt;/p&gt;
&lt;p&gt;Throughout history there have been many attempts at creating projections that would keep as many of these aspects intact as possible.&lt;/p&gt;
&lt;h4&gt;Mercator projection&lt;/h4&gt;
&lt;p&gt;As a Belgian I should be most proud about this type of projection, since it was created by a fellow Flemish-man, around 450 years ago and it is a projection that is still being used today.
When a map is created with this type of projection we will get a comfortable and familiar view of the earth. 
A big advantage of this projection type is the fact that the shape of all objects is accurate.&lt;/p&gt;
&lt;p&gt;The Mercator projection is most accurate around the equator, but the further you travel up or down, the more the map goes out of proportion.
Mercator used a cylindrical projection to unwrap the earth into a flat plane. Because of the nature of such a cylindrical projection, the areas closer to the poles become blown up to fit in a two dimensional world.&lt;/p&gt;
&lt;p&gt;This distortion has caused quite some frowned foreheads in the last few decades and as a result people tend to abandon this projection, especially to project regions far from the equator.&lt;/p&gt;
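&lt;p&gt;How bad does this blowing up get? Below is a minimal Python sketch of the textbook Mercator formulas on a simple sphere. The radius of 6371 kilometers and the function names are assumptions for this example, this is not how PostGIS implements it.&lt;/p&gt;

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean radius of a spherical earth


def mercator(lat_deg, lon_deg, radius=EARTH_RADIUS_M):
    """Project latitude and longitude (in degrees) to Mercator X and Y in meters."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * lon
    y = radius * math.log(math.tan(math.pi / 4.0 + lat / 2.0))
    return x, y


def mercator_scale_factor(lat_deg):
    """How much lengths are stretched at a given latitude: 1 / cos(latitude)."""
    return 1.0 / math.cos(math.radians(lat_deg))


for lat in (0, 30, 60, 80):
    print(lat, round(mercator_scale_factor(lat), 2))
```

&lt;p&gt;At the equator the factor is 1, so nothing is stretched. At 60 degrees latitude every length is drawn twice as long as it really is, and it only gets worse from there.&lt;/p&gt;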
&lt;h4&gt;Mercator variants&lt;/h4&gt;
&lt;p&gt;To make up for the heavy distortions found in the original Mercator system, people have made two new Mercator projections.
The first that came about was called the &lt;em&gt;Transverse Mercator&lt;/em&gt; which fixes the distortions around the poles, but introduces the problem that it will make for incorrect distance measuring.&lt;/p&gt;
&lt;p&gt;To make up for this new problem, folks made yet another Mercator derivative: the &lt;em&gt;Universal Transverse Mercator&lt;/em&gt;. This type of projection takes a whole new approach and uses its own coordinate system.
It introduces the concept of UTM zones. The earth is divided into 60 zones, each 6 degrees of longitude wide (roughly 670 Km at the equator). The map that is rendered in each single zone uses the previous, Transverse Mercator projection to draw the actual map. A big advantage of this approach is the fact that we get a very constant distance measurement all across the globe.&lt;/p&gt;
&lt;p&gt;Such a UTM coordinate looks quite different from our classic latitude/longitude or our X/Y version. I will give you a random UTM coordinate:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="mi"&gt;54&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="mi"&gt;384524&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="mi"&gt;3948304&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The first number identifies one of the 60 UTM zones. The letter N shows us in which hemisphere we should look for this zone. Note that a second notation puts a latitude band letter in this position instead; those letters range from C to X (omitting I and O).
The first float tells us the &lt;em&gt;easting&lt;/em&gt;, or X value, the last float tells us the &lt;em&gt;northing&lt;/em&gt;, or Y value. Both these floats represent actual meters.&lt;/p&gt;
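&lt;p&gt;Finding the zone number for a given longitude is simple arithmetic. Here is a short Python sketch that assumes the regular 6 degree grid and ignores the handful of irregular zones around Norway and Svalbard. Both function names are hypothetical.&lt;/p&gt;

```python
import math


def utm_zone_number(lon_deg):
    """Regular 6 degree zone for a longitude from -180 up to (not including) 180."""
    return int(math.floor((lon_deg + 180.0) / 6.0)) + 1


def parse_utm(text):
    """Split a "zone hemisphere easting northing" string into named parts."""
    zone, hemisphere, easting, northing = text.split()
    return {
        "zone": int(zone),
        "hemisphere": hemisphere,
        "easting_m": float(easting),
        "northing_m": float(northing),
    }


print(utm_zone_number(139.69))  # a longitude close to Tokyo
print(parse_utm("54 N 384524.5 3948304.4"))
```

&lt;p&gt;A longitude near Tokyo lands in zone 54, the same zone as in the coordinate above.&lt;/p&gt;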
&lt;p&gt;Another important note to take about UTM is that it also acts as a framework for more localized UTM versions.
This means that each country or region could make its own maps, using smaller UTM zones to accurately represent their land, city, forest, etc.&lt;/p&gt;
&lt;h4&gt;Lambert Azimuthal&lt;/h4&gt;
&lt;p&gt;This projection (also called the Lambert Equal-Area) is yet another approach as it uses a &lt;em&gt;disc&lt;/em&gt; to map our spheroid to a flat surface.&lt;/p&gt;
&lt;p&gt;The big advantage of this type of projection is the fact that it represents the area of objects very accurately and is true regarding distance calculation.
However, it fails when it comes to accurate shape representation, because shapes get more and more distorted once you start moving away from the center of the disc.&lt;/p&gt;
&lt;p&gt;The Lambert projection is one of the more accepted projections, right after the UTM.&lt;/p&gt;
&lt;h4&gt;Plate Carrée&lt;/h4&gt;
&lt;p&gt;And then you have Plate Carrée.&lt;/p&gt;
&lt;p&gt;This is one of the oldest projections out there and was invented around 1800 years ago.
In our little history story above, this projection came about when people thought the earth was rather flat.&lt;/p&gt;
&lt;p&gt;It combines almost all disadvantages of previous projections:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;It does a terrible job in representing correct area&lt;/li&gt;
&lt;li&gt;It does not care about the shape of objects&lt;/li&gt;
&lt;li&gt;Distance measuring is way off&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Despite the fact that this projection turns out to be so terrible, it is still quite commonly used today.
Well, not for navigation or distance calculation, obviously, but for illustrative purposes.&lt;/p&gt;
&lt;p&gt;Many organizations across the globe use this simple projection to demonstrate statistical data, overlaid on this map.
Demographics, political info, zombie outbreak danger zones, ... .&lt;/p&gt;
&lt;p&gt;As we also saw in our history lesson, the first charts used the Cartesian system quite literally and without much conversion, because the earth was flat anyway.
So in GIS systems, this means that this projection maps latitude and longitude &lt;em&gt;directly&lt;/em&gt; to an X and Y coordinate without much conversion.&lt;/p&gt;
&lt;p&gt;Because the conversion math is simple and calculations are few, this projection is among the fastest, but as you know now, at great cost.&lt;/p&gt;
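&lt;p&gt;Just how simple is that conversion math? A Python sketch, assuming a spherical earth with a mean radius of 6371 kilometers (the radius and the function name are my own choices for illustration):&lt;/p&gt;

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean radius of a spherical earth


def plate_carree(lat_deg, lon_deg, radius=EARTH_RADIUS_M):
    """Longitude becomes X and latitude becomes Y, only rescaled to meters."""
    x = radius * math.radians(lon_deg)
    y = radius * math.radians(lat_deg)
    return x, y


print(plate_carree(35.69, 139.69))  # a coordinate close to Tokyo
```

&lt;p&gt;Two multiplications and you are done. There is no correction whatsoever for the fact that the meridians move closer together towards the poles.&lt;/p&gt;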
&lt;h4&gt;Other variants&lt;/h4&gt;
&lt;p&gt;There are numerous other variants out there that all have their advantages or disadvantages. To name a few more:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;The Robinson projection displays the earth not as a rectangle, but as a flattened oval with curved sides. It shows the world as a whole more accurately, but it fails when it comes to representing area and shape, especially near the poles.&lt;/li&gt;
&lt;li&gt;The Winkel Tripel projection is another popular projection type which has many parallels with the Robinson one, but has less distortion.&lt;/li&gt;
&lt;li&gt;The Peirce quincuncial projection uses a technique to unwrap the earth spheroid into a square, much like you would peel an orange. These maps are not used much, for they are very heavy in calculations, but the technique is now widely used to present a spherical image, unwrapped into a square.&lt;/li&gt;
&lt;li&gt;The Goode homolosine projection is a projection developed as a teaching instrument, in a frustrated answer to the heavily distorted Mercator projection. It is famous for its quite unique shape where the spheroid is unwrapped into a beast with four "legs".&lt;/li&gt;
&lt;li&gt;...&lt;/li&gt;
&lt;/ul&gt;&lt;h3&gt;What do projections mean to us?&lt;/h3&gt;
&lt;p&gt;There are many more types of projections, but that would bore you to tears.&lt;/p&gt;
&lt;p&gt;The fact that I keep going on about projections, geometry and geography is because later on, when working with PostGIS, you will need to make a decision about how you wish to combine all of these.&lt;/p&gt;
&lt;p&gt;First it is very important to understand that geometry and geography are &lt;em&gt;two different data types&lt;/em&gt; which PostGIS can store into PostgreSQL.&lt;/p&gt;
&lt;p&gt;PostGIS is quite unique in the fact that it gives you the ability to work &lt;em&gt;directly&lt;/em&gt; with our three dimensional spheroid (geography) and ignore the projections and their Cartesian Flat Land (geometry).
You will have the power to work with latitude and longitude and perform real world calculations, right out of the box.
This way of working, however, comes with a few trade-offs.&lt;/p&gt;
&lt;p&gt;The first, and most obvious one: real, three dimensional spheroid geographical calculations will cost more computing time than the simpler, two dimensional geometry counterparts.
Another disadvantage of geography over geometry is the fact that PostGIS simply has &lt;em&gt;far&lt;/em&gt; fewer native functions ready for you to use.&lt;/p&gt;
&lt;p&gt;So depending on your use case, it might be a good idea to convert all your geographical data into geometrical ones.
This, however, requires knowledge about the projections we just saw, because different projections will yield different results.&lt;/p&gt;
&lt;p&gt;If you have two points with a latitude and longitude coordinate (thus being geographical data) and wish to know the distance between them using geometrical functions, you have to project these points on a flat surface thus converting them into a Cartesian system (the whole projection story we saw so far). &lt;/p&gt;
&lt;p&gt;As we will see in the next chapter, if you simply convert geography into geometry, PostGIS will project the geometry coordinates using the Plate Carrée, which may not be very desirable when you wish to calculate distances, as we will be doing later on. We have the ability to tell PostGIS to use a different projection when converting, but each comes with its own merits and demerits.&lt;/p&gt;
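&lt;p&gt;To make the difference a bit more tangible, here is a minimal sketch of the same distance asked both ways. The coordinates are rough, illustrative values for Tokyo and Brussels and the column aliases are made up for this example, so treat it as a preview under those assumptions, not as the final word:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Geography: the distance follows the spheroid and comes back in meters.
SELECT ST_Distance(
    ST_GeogFromText('SRID=4326;POINT(139.6917 35.6895)'),  -- roughly Tokyo (longitude latitude)
    ST_GeogFromText('SRID=4326;POINT(4.3517 50.8503)')     -- roughly Brussels
) AS meters_on_spheroid;

-- Geometry: the same numbers treated as flat Cartesian X/Y,
-- so the result is expressed in degrees, not in meters.
SELECT ST_Distance(
    ST_GeomFromText('POINT(139.6917 35.6895)', 4326),
    ST_GeomFromText('POINT(4.3517 50.8503)', 4326)
) AS degrees_on_flat_plane;
&lt;/pre&gt;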
&lt;p&gt;You simply cannot do serious GIS work if you do not have at least a basic understanding of what is going on when projecting geography.
By reading through this chapter, I hope I have given you enough food-for-thought to go out and explore a bit more about these different projections.&lt;/p&gt;
&lt;h3&gt;What is next?&lt;/h3&gt;
&lt;p&gt;Okay, I think we have covered enough for today. I do apologize for the rather theoretical nature of this first chapter, but believe me, you will need the knowledge.&lt;/p&gt;
&lt;p&gt;Next time we will finally be looking at some actual PostGIS work and put some of this theory into practice:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;We will see how to GIS enable your PostgreSQL database&lt;/li&gt;
&lt;li&gt;We will look at how we can store geometry and geography&lt;/li&gt;
&lt;li&gt;We will actually put some points on the earth, draw some lines between them and perform some fun calculations&lt;/li&gt;
&lt;li&gt;We will take a look at how different projections will yield different results&lt;/li&gt;
&lt;li&gt;And finally, we will answer the question that started it all: How far is Tokyo from your current location?&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;And as always...thanks for reading!&lt;/p&gt;
&lt;!--  LocalWords:  PostGIS GIS PostgreSQL
 --&gt;&lt;/div&gt;</description><category>postgis</category><category>postgresql</category><guid>http://shisaa.be/postset/postgis-postgresqls-spatial-partner-part-1.html</guid><pubDate>Thu, 12 Jun 2014 10:00:00 GMT</pubDate></item><item><title>PostgreSQL: A full text search engine - Part 3</title><link>http://shisaa.be/postset/postgresql-full-text-search-part-3.html</link><dc:creator>Tim van der Linden</dc:creator><description>&lt;div&gt;&lt;p&gt;And so we arrive at the last part of the series.&lt;/p&gt;
&lt;p&gt;If you have not done so, please read &lt;a href="http://shisaa.be/postset/postgresql-full-text-search-part-1.html" title="First chapter introducing the full text search capabilities of PostgreSQL."&gt;part one&lt;/a&gt; and &lt;a href="http://shisaa.be/postset/postgresql-full-text-search-part-2.html" title="Second chapter introducing the full text search capabilities of PostgreSQL."&gt;part two&lt;/a&gt; before embarking.&lt;/p&gt;
&lt;p&gt;Today we will wrap up the introduction to PostgreSQL's full text capabilities by showing you a few aspects I have intentionally neglected in the previous parts, the most important ones being ranking and indexing.&lt;/p&gt;
&lt;p&gt;So let us take off right away!&lt;/p&gt;
&lt;h3&gt;Ranking&lt;/h3&gt;
&lt;p&gt;Up until now you have seen what full text is, how to use it and how to do a full custom setup. What you have not yet seen is how to &lt;em&gt;rank&lt;/em&gt; search results based on their relevance to the search query - a feature that most search engines offer and one that most users expect.&lt;/p&gt;
&lt;p&gt;However, there is a problem when it comes to ranking: it is something that is somewhat &lt;em&gt;undefined&lt;/em&gt;. It is a gray area left wide open to interpretation. It is almost...personal.&lt;/p&gt;
&lt;p&gt;At its core, ranking within full text means giving a document a place based on how many times certain words occur in it, or on how close these words are to each other. So let us start there.&lt;/p&gt;
&lt;h4&gt;Normal ranking&lt;/h4&gt;
&lt;p&gt;The first case, ranking based on how many times certain words occur, has an accompanying function ready to be used: &lt;em&gt;ts_rank()&lt;/em&gt;. It accepts a mandatory &lt;em&gt;tsvector&lt;/em&gt; and a &lt;em&gt;tsquery&lt;/em&gt; as its arguments and returns a float which represents how high the given document ranks. The function also accepts a &lt;em&gt;weights array&lt;/em&gt; and a &lt;em&gt;normalization integer&lt;/em&gt;, but that is for later down the road.&lt;/p&gt;
&lt;p&gt;Let us test out the basic functionality:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Elephants and dolphins do not live in the same habitat.'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
               &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'elephants'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This is a regular old 'on the fly' query where we feed a string which we convert to a tsvector and a &lt;em&gt;token&lt;/em&gt; which is converted to a tsquery. The ranking result of this is:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;0.0607927
&lt;/pre&gt;


&lt;p&gt;This does not say much, does it? Okay, let us throw a few more tokens in the mix:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Elephants and dolphins do not live in the same habitat.'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
               &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'elephants &amp;amp; dolphins'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now we want to query the two tokens &lt;em&gt;elephants&lt;/em&gt; and &lt;em&gt;dolphins&lt;/em&gt;. We chain them together in an AND (&lt;em&gt;&amp;amp;&lt;/em&gt;) formation. The ranking:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;0.0985009
&lt;/pre&gt;


&lt;p&gt;Hmm, getting higher, good. More tokens please:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Elephants and dolphins do not live in the same habitat.'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
               &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'elephants &amp;amp; dolphins &amp;amp; habitat &amp;amp; living'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Results in:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;0.414037
&lt;/pre&gt;


&lt;p&gt;Oooh, that is quite nice. Notice the word &lt;em&gt;living&lt;/em&gt;: the &lt;em&gt;to_tsquery()&lt;/em&gt; function automatically stems it to match &lt;em&gt;live&lt;/em&gt;, but that is, of course, all basic knowledge by now.&lt;/p&gt;
&lt;p&gt;The idea here is simple: the more tokens match the string, the higher the ranking will be. You can use this float to later on sort your results.&lt;/p&gt;
&lt;h4&gt;Normal ranking with weights&lt;/h4&gt;
&lt;p&gt;Okay, let us spice things up a little bit, let us look at the &lt;em&gt;weights array&lt;/em&gt; that could be set as an optional parameter.&lt;/p&gt;
&lt;p&gt;Do you remember the weights we saw in chapter one? A quick rundown: You can optionally give weights to lexemes in a tsvector to group them together. This is, most of the time, used to reflect the original document structure within a tsvector. We also saw that, actually, all lexemes contain a standard weight of '&lt;em&gt;D&lt;/em&gt;' unless specified otherwise.&lt;/p&gt;
&lt;p&gt;Weights, when ranking, define importance of words. The &lt;em&gt;ts_rank()&lt;/em&gt; function will automatically take these weights into account and use a &lt;em&gt;weights array&lt;/em&gt; to influence the ranking float. Remember that there are only four possible weights: &lt;em&gt;A&lt;/em&gt;, &lt;em&gt;B&lt;/em&gt;, &lt;em&gt;C&lt;/em&gt; and &lt;em&gt;D&lt;/em&gt;. &lt;/p&gt;
&lt;p&gt;The weights array has a default value of:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="o"&gt;{&lt;/span&gt;0.1, 0.2, 0.4, 1.0&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;These values correspond to the weight letters you can assign. Note that these are in reverse order, the array represents: {D,C,B,A}.&lt;/p&gt;
&lt;p&gt;Let us test that out. We take the same query as before, but now using the &lt;em&gt;setweight()&lt;/em&gt; function, we will apply a weight of &lt;em&gt;C&lt;/em&gt; to all lexemes:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Elephants and dolphins do not live in the same habitat.'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="s1"&gt;'C'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
               &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'elephants &amp;amp; dolphins &amp;amp; habitat &amp;amp; live'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;0.674972
&lt;/pre&gt;


&lt;p&gt;Wow, that is a lot higher than our last ranking (which had an implicit, default weight of &lt;em&gt;D&lt;/em&gt;).
The reason for this is that the floats in the weights array &lt;em&gt;influence&lt;/em&gt; the ranking calculation.
Just for fun, you can override the default weights array, simply by passing it in as the first argument.
Let us set all the weights equal to the default of &lt;em&gt;D&lt;/em&gt;, being &lt;em&gt;0.1&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;array&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
               &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Elephants and dolphins do not live in the same habitat.'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="s1"&gt;'C'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
               &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'elephants &amp;amp; dolphins &amp;amp; habitat &amp;amp; live'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;0.414037
&lt;/pre&gt;


&lt;p&gt;You can see that this is now back to the value we had before we assigned weights, or in other words, when the implicit weight was &lt;em&gt;D&lt;/em&gt;. You can thus influence what kind of an effect a certain weight has in your ranking. You can even reverse the lot and make a D have a more positive influence than an A, just to mess with people's heads.&lt;/p&gt;
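&lt;p&gt;A minimal sketch of such a reversal, reusing the elephant sentence with its implicit weight of &lt;em&gt;D&lt;/em&gt;. Writing the array as an explicitly cast literal is merely one way of passing it in, chosen here for illustration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- The array order is {D, C, B, A}: here D counts for 1.0 and A for only 0.1.
SELECT ts_rank('{1.0, 0.4, 0.2, 0.1}'::float4[],
               to_tsvector('Elephants and dolphins do not live in the same habitat.'),
               to_tsquery('elephants &amp;amp; dolphins &amp;amp; habitat &amp;amp; live'));
&lt;/pre&gt;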
&lt;h4&gt;Normal ranking, the fair way&lt;/h4&gt;
&lt;p&gt;Not that what we have seen up until now was unfair, but it does not take into account the length of the documents searched through.&lt;/p&gt;
&lt;p&gt;Document length is also an important factor when judging the relevance. A short document which matches on four or five tokens has a different relevance than a three times as long document which matches on the same number of tokens. The shorter one is probably more relevant than the longer one.&lt;/p&gt;
&lt;p&gt;The same ranking function &lt;em&gt;ts_rank()&lt;/em&gt; has an extra, final optional parameter that you can pass in called the &lt;em&gt;normalization integer&lt;/em&gt;. This integer can take seven different values, which you can use on their own or combine with a pipe (|) to pass in several at once.&lt;/p&gt;
&lt;p&gt;The default value is &lt;em&gt;0&lt;/em&gt; - meaning that it will ignore document length altogether, giving us the more "unfair" behavior. The next values you can give are &lt;em&gt;1&lt;/em&gt;, &lt;em&gt;2&lt;/em&gt;, &lt;em&gt;4&lt;/em&gt;, &lt;em&gt;8&lt;/em&gt;, &lt;em&gt;16&lt;/em&gt; and &lt;em&gt;32&lt;/em&gt; which stand for the following manipulations of the ranking float:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;1: Divides the ranking float by the sum of 1 and the logarithm of the document length. Because of the logarithm, a document that is twice as long is not punished twice as hard.&lt;/li&gt;
&lt;li&gt;2: Simply divides the ranking float by the length of the document.&lt;/li&gt;
&lt;li&gt;4: Divides the ranking float by the mean harmonic distance between matched tokens. This one is only used by the other ranking function &lt;em&gt;ts_rank_cd&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;8: Divides the ranking float by the number of &lt;em&gt;unique&lt;/em&gt; words that are found in the document.&lt;/li&gt;
&lt;li&gt;16: Divides the ranking float by the sum of 1 and the logarithm of the number of &lt;em&gt;unique&lt;/em&gt; words found in the document.&lt;/li&gt;
&lt;li&gt;32: Divides the ranking float by &lt;em&gt;itself&lt;/em&gt; plus one (rank / (rank + 1)), which scales the result into the range between 0 and 1.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;These are a lot of values and some of them are quite confusing. But all of these have only one purpose: to make ranking more "fair", based on various use cases.&lt;/p&gt;
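&lt;p&gt;Combining several of these integers with a pipe could look like the sketch below. It reuses the elephant sentence from earlier, and the combination of &lt;em&gt;2&lt;/em&gt; and &lt;em&gt;8&lt;/em&gt; is an arbitrary pick, purely for illustration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- 2|8 is a bitwise OR: normalize by document length AND by unique word count.
SELECT ts_rank(to_tsvector('Elephants and dolphins do not live in the same habitat.'),
               to_tsquery('elephants &amp;amp; dolphins'),
               2|8);
&lt;/pre&gt;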
&lt;p&gt;Take, for example, &lt;em&gt;1&lt;/em&gt; and &lt;em&gt;2&lt;/em&gt;. These calculate document length by taking into account the number of &lt;em&gt;words&lt;/em&gt; present in the document.
The &lt;em&gt;words&lt;/em&gt; here refer to the number of &lt;em&gt;pointers&lt;/em&gt; (positions) that are present in the tsvector.&lt;/p&gt;
&lt;p&gt;To illustrate, we will convert the sentence "These token are repeating on purpose. Bad tokens!" into a tsvector, resulting in:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'bad'&lt;/span&gt;:7 &lt;span class="s1"&gt;'purpos'&lt;/span&gt;:6 &lt;span class="s1"&gt;'repeat'&lt;/span&gt;:4 &lt;span class="s1"&gt;'token'&lt;/span&gt;:2,8
&lt;/pre&gt;


&lt;p&gt;The &lt;em&gt;length&lt;/em&gt; of this document is &lt;em&gt;5&lt;/em&gt;, because we have &lt;em&gt;five pointers&lt;/em&gt; in total.&lt;/p&gt;
&lt;p&gt;If you now look at the integers &lt;em&gt;8&lt;/em&gt; and &lt;em&gt;16&lt;/em&gt;, they use &lt;em&gt;uniqueness&lt;/em&gt; to calculate document length.
What that means is they do not count the pointers, but the actual &lt;em&gt;lexemes&lt;/em&gt;.
For the above tsvector, that results in a length of &lt;em&gt;4&lt;/em&gt;.&lt;/p&gt;
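&lt;p&gt;If you want to check the unique count yourself, the built-in &lt;em&gt;length()&lt;/em&gt; function counts the lexemes in a tsvector. A small sketch, assuming the &lt;em&gt;english&lt;/em&gt; configuration is the one that produced the tsvector above, in which case it should report its four lexemes:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- length() counts lexemes, not positions: 'token' occurs twice but is counted once.
SELECT length(to_tsvector('english', 'These token are repeating on purpose. Bad tokens!'));
&lt;/pre&gt;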
&lt;p&gt;All of these manipulations are just different ways of counting document length.
The ones summed up in the above integer list are mere educated guesses at what most people desire when ranking with a full text engine.
As I said in the beginning, it is a gray area, left open for interpretation.&lt;/p&gt;
&lt;p&gt;Let us try to see the different effects that such an integer can have.&lt;/p&gt;
&lt;p&gt;First we need to create a few documents (tsvectors) inside our famous phraseTable (from the previous chapters) that we will use throughout this chapter.
Connect to your phrase database, add a "title" column, truncate whatever we have stored there and insert a few variable length documents based on Edgar Allan Poe's "The Raven".
I have prepared the whole syntax below, this time you may copy-and-paste:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;TRUNCATE&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more."'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'Tiny Allan'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more." Ah, distinctly I remember it was in the bleak December, And each separate dying ember wrought its ghost upon the floor. Eagerly I wished the morrow - vainly I had sought to borrow From my books surcease of sorrow - sorrow for the lost Lenore - For the rare and radiant maiden whom the angels name Lenore - Nameless here for evermore.'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'Small Allan'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more." Ah, distinctly I remember it was in the bleak December, And each separate dying ember wrought its ghost upon the floor. Eagerly I wished the morrow - vainly I had sought to borrow From my books surcease of sorrow - sorrow for the lost Lenore - For the rare and radiant maiden whom the angels name Lenore - Nameless here for evermore. And the silken sad uncertain rustling of each purple curtain Thrilled me - filled me with fantastic terrors never felt before; So that now, to still the beating of my heart, I stood repeating, "Tis some visitor entreating entrance at my chamber door - Some late visitor entreating entrance at my chamber door - This it is, and nothing more."'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'Medium Allan'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more." Ah, distinctly I remember it was in the bleak December, And each separate dying ember wrought its ghost upon the floor. Eagerly I wished the morrow - vainly I had sought to borrow From my books surcease of sorrow - sorrow for the lost Lenore - For the rare and radiant maiden whom the angels name Lenore - Nameless here for evermore. And the silken sad uncertain rustling of each purple curtain Thrilled me - filled me with fantastic terrors never felt before; So that now, to still the beating of my heart, I stood repeating, "Tis some visitor entreating entrance at my chamber door - Some late visitor entreating entrance at my chamber door - This it is, and nothing more."  Presently my soul grew stronger; hesitating then no longer, "Sir," said I, "or Madam, truly your forgiveness I implore; But the fact is I was napping, and so gently you came rapping, And so faintly you came tapping, tapping at my chamber door, That I scarce was sure I heard you"- here I opened wide the door; - Darkness there, and nothing more.'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'Big Allan'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Nothing better than some good old Edgar to demonstrate a full text search ranking. Here we have four excerpts of the same poem, each one longer than the last, making for four documents of different lengths stored in our tsvector column. Now we would like to search through these documents and find the keywords &lt;em&gt;'door'&lt;/em&gt; and &lt;em&gt;'gently'&lt;/em&gt;, ranking them as we go.&lt;/p&gt;
&lt;p&gt;For later reference, let us first count how many times our keywords occur in each document:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Tiny Allan: "door" 2, "gently" 1&lt;/li&gt;
&lt;li&gt;Small Allan: "door" 2, "gently" 1&lt;/li&gt;
&lt;li&gt;Medium Allan: "door" 4, "gently" 1&lt;/li&gt;
&lt;li&gt;Big Allan: "door" 6, "gently" 2&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;First, let us simply rank the result with the default normalization of &lt;em&gt;0&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Before we go over the results, a little bit about this query for people who are not so familiar with this SQL syntax.
We do a simple &lt;em&gt;SELECT&lt;/em&gt; from a data set using &lt;em&gt;FROM&lt;/em&gt; filtering it with a &lt;em&gt;WHERE&lt;/em&gt; clause.
Going over it line by line:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We &lt;em&gt;SELECT&lt;/em&gt; the &lt;em&gt;title&lt;/em&gt; column we just made and an "on-the-fly" column named &lt;em&gt;rank&lt;/em&gt;, which we create for the result set and which contains the result of the &lt;em&gt;ts_rank()&lt;/em&gt; function.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;In the &lt;em&gt;FROM&lt;/em&gt; clause you can put a series of sources (tables, but also function calls) that will deliver the data for the query. In this case we take our normal database &lt;em&gt;table&lt;/em&gt; and the result of the &lt;em&gt;to_tsquery()&lt;/em&gt; function, which we name &lt;em&gt;keywords&lt;/em&gt; so we can use it throughout the query itself.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Here we filter the result set using the &lt;em&gt;WHERE&lt;/em&gt; clause and the &lt;em&gt;matching&lt;/em&gt; operator (@@). The @@ is a Boolean operator, meaning it will simply return &lt;em&gt;true&lt;/em&gt; or &lt;em&gt;false&lt;/em&gt;.
So in this case, we check whether the result of the &lt;em&gt;to_tsquery()&lt;/em&gt; function (named keywords, which returns lexemes) &lt;em&gt;matches&lt;/em&gt; the contents of the phrase &lt;em&gt;column&lt;/em&gt; from our table (which contains &lt;em&gt;tsvectors&lt;/em&gt; and thus lexemes). We want to rank only those phrases that actually contain our keywords.&lt;/p&gt;
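&lt;p&gt;To see the Boolean nature of @@ in isolation, here is a small on-the-fly sketch. The short phrase and the column aliases are made up for the occasion. The first column should come back &lt;em&gt;false&lt;/em&gt;, because "gently" is missing from the phrase, and the second one &lt;em&gt;true&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- An AND query only matches when every lexeme is present in the tsvector.
SELECT to_tsquery('door &amp;amp; gently') @@ to_tsvector('rapping at my chamber door') AS has_both,
       to_tsquery('door') @@ to_tsvector('rapping at my chamber door') AS has_door;
&lt;/pre&gt;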
&lt;p&gt;Now, back to our ranking. The result of this query will be:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   title     &lt;span class="p"&gt;|&lt;/span&gt;   rank    
--------------+-----------
Tiny Allan   &lt;span class="p"&gt;|&lt;/span&gt; 0.0906565
Small Allan  &lt;span class="p"&gt;|&lt;/span&gt; 0.0906565
Medium Allan &lt;span class="p"&gt;|&lt;/span&gt; 0.0906565
Big Allan    &lt;span class="p"&gt;|&lt;/span&gt;   0.10109
&lt;/pre&gt;


&lt;p&gt;Let us order the results first, so the most relevant document is always on top:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   title     &lt;span class="p"&gt;|&lt;/span&gt;   rank    
--------------+-----------
Big Allan    &lt;span class="p"&gt;|&lt;/span&gt;   0.10109
Tiny Allan   &lt;span class="p"&gt;|&lt;/span&gt; 0.0906565
Small Allan  &lt;span class="p"&gt;|&lt;/span&gt; 0.0906565
Medium Allan &lt;span class="p"&gt;|&lt;/span&gt; 0.0906565
&lt;/pre&gt;


&lt;p&gt;"Big Allan" is on top, for it has more occurrences of the keywords "door" and "gently".
But to be fair, in proportion "Tiny Allan" has almost the same number of occurrences of both keywords: roughly three times fewer, but it is also roughly three times as short.&lt;/p&gt;
&lt;p&gt;So let us take document length (based on &lt;em&gt;word count&lt;/em&gt;) into account, setting our &lt;em&gt;normalization&lt;/em&gt; to &lt;em&gt;1&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You will get:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   title     &lt;span class="p"&gt;|&lt;/span&gt;   rank    
--------------+-----------
Tiny Allan   &lt;span class="p"&gt;|&lt;/span&gt; 0.0181313
Small Allan  &lt;span class="p"&gt;|&lt;/span&gt; 0.0151094
Big Allan    &lt;span class="p"&gt;|&lt;/span&gt; 0.0145124
Medium Allan &lt;span class="p"&gt;|&lt;/span&gt;  0.013831
&lt;/pre&gt;


&lt;p&gt;This could be seen as a fairer ranking: "Tiny Allan" is now on top because, considering its &lt;em&gt;ratio&lt;/em&gt;, it is the most relevant. "Medium Allan" falls all the way down because it is almost as big as "Big Allan", but contains fewer occurrences of the keywords: five in total, in contrast to "Big Allan", which has eight but is only slightly bigger.&lt;/p&gt;
&lt;p&gt;Let us do the same, but count the document length based on the &lt;em&gt;unique&lt;/em&gt; occurrences using integer &lt;em&gt;8&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;Tiny Allan   &lt;span class="p"&gt;|&lt;/span&gt; 0.00335765
Small Allan  &lt;span class="p"&gt;|&lt;/span&gt; 0.00161887
Medium Allan &lt;span class="p"&gt;|&lt;/span&gt; 0.00119285
Big Allan    &lt;span class="p"&gt;|&lt;/span&gt; 0.00105303
&lt;/pre&gt;


&lt;p&gt;That is a very different result, but quite what you should expect.&lt;/p&gt;
&lt;p&gt;We are searching for only &lt;em&gt;two&lt;/em&gt; tokens here and, because uniqueness is now taken into account, all the extra occurrences of these words are ignored.
This means that for the ranking algorithm, all the documents we searched through (which all have at least one occurrence of each token) get normalized to only &lt;em&gt;2&lt;/em&gt; matching tokens.
In that case the shortest document wins hands down, for it is seen as the most relevant. As you can see in the result set, the documents are neatly ordered from tiny to big.&lt;/p&gt;
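&lt;p&gt;To see why the shortest document wins here, it can help to look at how many unique lexemes each tsvector holds. What follows is a minimal sketch, assuming the &lt;em&gt;phraseTable&lt;/em&gt; from above and using PostgreSQL's &lt;em&gt;length()&lt;/em&gt; function, which returns the number of lexemes in a tsvector:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Count the unique lexemes per document; this is the number
-- that normalization option 8 divides the rank by.
SELECT title, length(phrase) AS unique_lexemes
    FROM phraseTable
    ORDER BY unique_lexemes;
&lt;/pre&gt;


&lt;p&gt;The fewer unique lexemes a document holds, the smaller the number its rank is divided by, which fits the tiny-to-big ordering above.&lt;/p&gt;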
&lt;h4&gt;Ranking with density&lt;/h4&gt;
&lt;p&gt;Up until now we have seen the "normal" ranking function &lt;em&gt;ts_rank()&lt;/em&gt;, which is the one you will probably use the most.&lt;/p&gt;
&lt;p&gt;There is, however, one more function at our direct disposal called &lt;em&gt;ts_rank_cd()&lt;/em&gt;. The &lt;em&gt;cd&lt;/em&gt; stands for &lt;em&gt;Cover Density&lt;/em&gt; and is simply yet another way of considering relevance.
This function has exactly the same required and optional arguments; it simply counts relevancy differently.
For this function to work properly, it is very important that you do not let it operate on a &lt;em&gt;stripped&lt;/em&gt; tsvector.&lt;/p&gt;
&lt;p&gt;A stripped tsvector is one that has had its positional pointer information removed. If you know that you do not need this pointer information - you just need to match tsqueries against the lexemes in your tsvector - you can strip these pointers and thus reduce the footprint in your database.&lt;/p&gt;
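&lt;p&gt;As a small illustration of what stripping does, the sketch below runs PostgreSQL's &lt;em&gt;strip()&lt;/em&gt; function on an on-the-fly tsvector. The sentence is just a made-up example, and the default full text configuration is assumed to be English:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- The first column keeps the positions, the second one drops them.
SELECT to_tsvector('These tokens count for much') AS with_positions,
       strip(to_tsvector('These tokens count for much')) AS stripped;
&lt;/pre&gt;


&lt;p&gt;The stripped column should list the same lexemes, only without the position numbers after the colons.&lt;/p&gt;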
&lt;p&gt;Our cover density ranker needs this positional pointer information to see how &lt;em&gt;close&lt;/em&gt; the search tokens are to each other.
It makes sense that this ranking function only works on multiple tokens; on single tokens it is kind of pointless.&lt;/p&gt;
&lt;p&gt;In a way, this ranking function looks for &lt;em&gt;phrases&lt;/em&gt; rather than single tokens; the closer lexemes are together, the more positive influence they will have on the resulting ranking float.&lt;/p&gt;
&lt;p&gt;In our "Raven" examples this might be a little bit hard to see, so let me demonstrate this with a couple of new, on-the-fly queries.&lt;/p&gt;
&lt;p&gt;We wish to search for the tokens &lt;em&gt;'token'&lt;/em&gt; and &lt;em&gt;'count'&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;First, a sentence in which the searched-for tokens are wide apart: "These tokens are very wide apart and do not count as much.":&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'These tokens are very wide apart and do not count as much.'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'token &amp;amp; count'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Will have this tsvector:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'apart'&lt;/span&gt;:6 &lt;span class="s1"&gt;'count'&lt;/span&gt;:10 &lt;span class="s1"&gt;'much'&lt;/span&gt;:12 &lt;span class="s1"&gt;'token'&lt;/span&gt;:2 &lt;span class="s1"&gt;'wide'&lt;/span&gt;:5
&lt;/pre&gt;


&lt;p&gt;And this result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt; 0.008624
&lt;/pre&gt;


&lt;p&gt;Let us put these tokens closer together now: "These tokens count for much now that they are not so wide apart!":&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'These tokens count for much now that they are not so wide apart!'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'token &amp;amp; count'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The vector:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'apart'&lt;/span&gt;:13 &lt;span class="s1"&gt;'count'&lt;/span&gt;:3 &lt;span class="s1"&gt;'much'&lt;/span&gt;:5 &lt;span class="s1"&gt;'token'&lt;/span&gt;:2 &lt;span class="s1"&gt;'wide'&lt;/span&gt;:12
&lt;/pre&gt;


&lt;p&gt;The result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;0.0198206
&lt;/pre&gt;


&lt;p&gt;You can see that both vectors have &lt;em&gt;exactly&lt;/em&gt; the same lexemes, but different pointer information.
In the second vector, the tokens we searched for are next to each other, which results in a ranking float that is more than double that of the first result.&lt;/p&gt;
&lt;p&gt;This demonstrates the working of this function. The same optional manipulations can be passed in (weights and normalization) and they will have roughly the same effect.&lt;/p&gt;
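&lt;p&gt;Note that the two queries above call &lt;em&gt;ts_rank()&lt;/em&gt;. To try the same comparison with the cover density ranker itself, only the function name has to change. The sketch below reuses the two sentences from above in a single query; the &lt;em&gt;sentences&lt;/em&gt; alias is just a name picked for this example, and no output is shown because the numbers will differ from the ones printed earlier:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Rank both sentences with ts_rank_cd() instead of ts_rank().
SELECT ts_rank_cd(to_tsvector(sentence), keywords, 8) AS rank, sentence
    FROM (VALUES
            ('These tokens are very wide apart and do not count as much.'),
            ('These tokens count for much now that they are not so wide apart!')
         ) AS sentences(sentence),
         to_tsquery('token &amp;amp; count') keywords
    WHERE keywords @@ to_tsvector(sentence)
    ORDER BY rank DESC;
&lt;/pre&gt;


&lt;p&gt;Because the tsvectors are built on the fly with &lt;em&gt;to_tsvector()&lt;/em&gt;, they still carry their positions, which is what the cover density ranker needs.&lt;/p&gt;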
&lt;p&gt;Pick the ranking function that is best fit for your use case.&lt;/p&gt;
&lt;p&gt;It needs to be said that the two ranking functions we have seen so far are officially called &lt;em&gt;example functions&lt;/em&gt; by the PostgreSQL community.
They are functions devised to be fitting for most purposes, but also to demonstrate how you could write your own.&lt;/p&gt;
&lt;p&gt;If you have very specific use cases, it is advised to write your own ranking functions to fit your exact needs.
But this is considered beyond the scope of this series (and maybe also beyond the scope of your needs).&lt;/p&gt;
&lt;h3&gt;Highlight your results!&lt;/h3&gt;
&lt;p&gt;The next interesting thing we can do with the results of our full text search is to highlight the relevant words.&lt;/p&gt;
&lt;p&gt;As is the case with many search engines, users want to skim over an excerpt of each result to see if it is what they are searching for.
For this, PostgreSQL offers us yet another function: &lt;em&gt;ts_headline()&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;To demonstrate its use, we first have to make our small database a little bit bigger by inserting the original text of the "Raven" next to our tsvectors.
So, again, copy and paste this new set of queries (yes you may...):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;TRUNCATE&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more."'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'Tiny Allan'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more."'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more." Ah, distinctly I remember it was in the bleak December, And each separate dying ember wrought its ghost upon the floor. Eagerly I wished the morrow - vainly I had sought to borrow From my books surcease of sorrow - sorrow for the lost Lenore - For the rare and radiant maiden whom the angels name Lenore - Nameless here for evermore.'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'Small Allan'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more." Ah, distinctly I remember it was in the bleak December, And each separate dying ember wrought its ghost upon the floor. Eagerly I wished the morrow - vainly I had sought to borrow From my books surcease of sorrow - sorrow for the lost Lenore - For the rare and radiant maiden whom the angels name Lenore - Nameless here for evermore.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more." Ah, distinctly I remember it was in the bleak December, And each separate dying ember wrought its ghost upon the floor. Eagerly I wished the morrow - vainly I had sought to borrow From my books surcease of sorrow - sorrow for the lost Lenore - For the rare and radiant maiden whom the angels name Lenore - Nameless here for evermore. And the silken sad uncertain rustling of each purple curtain Thrilled me - filled me with fantastic terrors never felt before; So that now, to still the beating of my heart, I stood repeating, "Tis some visitor entreating entrance at my chamber door - Some late visitor entreating entrance at my chamber door - This it is, and nothing more."'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'Medium Allan'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more." Ah, distinctly I remember it was in the bleak December, And each separate dying ember wrought its ghost upon the floor. 
Eagerly I wished the morrow - vainly I had sought to borrow From my books surcease of sorrow - sorrow for the lost Lenore - For the rare and radiant maiden whom the angels name Lenore - Nameless here for evermore. And the silken sad uncertain rustling of each purple curtain Thrilled me - filled me with fantastic terrors never felt before; So that now, to still the beating of my heart, I stood repeating, "Tis some visitor entreating entrance at my chamber door - Some late visitor entreating entrance at my chamber door - This it is, and nothing more."'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more." Ah, distinctly I remember it was in the bleak December, And each separate dying ember wrought its ghost upon the floor. Eagerly I wished the morrow - vainly I had sought to borrow From my books surcease of sorrow - sorrow for the lost Lenore - For the rare and radiant maiden whom the angels name Lenore - Nameless here for evermore. And the silken sad uncertain rustling of each purple curtain Thrilled me - filled me with fantastic terrors never felt before; So that now, to still the beating of my heart, I stood repeating, "Tis some visitor entreating entrance at my chamber door - Some late visitor entreating entrance at my chamber door - This it is, and nothing more."  
Presently my soul grew stronger; hesitating then no longer, "Sir," said I, "or Madam, truly your forgiveness I implore; But the fact is I was napping, and so gently you came rapping, And so faintly you came tapping, tapping at my chamber door, That I scarce was sure I heard you"- here I opened wide the door; - Darkness there, and nothing more.'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'Big Allan'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore, While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "Tis some visitor," I muttered, "tapping at my chamber door - Only this, and nothing more." Ah, distinctly I remember it was in the bleak December, And each separate dying ember wrought its ghost upon the floor. Eagerly I wished the morrow - vainly I had sought to borrow From my books surcease of sorrow - sorrow for the lost Lenore - For the rare and radiant maiden whom the angels name Lenore - Nameless here for evermore. And the silken sad uncertain rustling of each purple curtain Thrilled me - filled me with fantastic terrors never felt before; So that now, to still the beating of my heart, I stood repeating, "Tis some visitor entreating entrance at my chamber door - Some late visitor entreating entrance at my chamber door - This it is, and nothing more."  Presently my soul grew stronger; hesitating then no longer, "Sir," said I, "or Madam, truly your forgiveness I implore; But the fact is I was napping, and so gently you came rapping, And so faintly you came tapping, tapping at my chamber door, That I scarce was sure I heard you"- here I opened wide the door; - Darkness there, and nothing more.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Good, we now have the same data, but this time we stored the text of the original document alongside the "vectorized" version.&lt;/p&gt;
&lt;p&gt;The reason for this is that the &lt;em&gt;ts_headline()&lt;/em&gt; function searches in the original documents (being our &lt;em&gt;article&lt;/em&gt; column) rather than in your tsvector column.
Two arguments are mandatory: the original article and the tsquery. The optional arguments are the full text configuration you wish to use and a string of additional, comma-separated options.&lt;/p&gt;
&lt;p&gt;But first, let us take a look at its most basic usage:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Will give you:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   title     |                                                      result                                                      
--------------+------------------------------------------------------------------------------------------------------------------
Tiny Allan   | &amp;lt;b&amp;gt;gently&amp;lt;/b&amp;gt; rapping, rapping at my chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt;. "Tis some visitor," I muttered, "tapping at my chamber
Small Allan  | &amp;lt;b&amp;gt;gently&amp;lt;/b&amp;gt; rapping, rapping at my chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt;. "Tis some visitor," I muttered, "tapping at my chamber
Medium Allan | &amp;lt;b&amp;gt;gently&amp;lt;/b&amp;gt; rapping, rapping at my chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt;. "Tis some visitor," I muttered, "tapping at my chamber
Big Allan    | &amp;lt;b&amp;gt;gently&amp;lt;/b&amp;gt; rapping, rapping at my chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt;. "Tis some visitor," I muttered, "tapping at my chamber
&lt;/pre&gt;


&lt;p&gt;As you can see, we get back a short excerpt of each verse with the tokens of interest surrounded by an HTML "&amp;lt;b&amp;gt;" tag.
That is actually all there is to this function: it returns the results with the tokens highlighted.&lt;/p&gt;
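&lt;p&gt;The optional full text configuration mentioned earlier goes in front of the other arguments. Here is a minimal sketch, assuming you want to name the built-in &lt;em&gt;english&lt;/em&gt; configuration explicitly instead of relying on the default:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Same query as above, but with the configuration passed as the first argument.
SELECT title, ts_headline('english', article, keywords) AS result
    FROM phraseTable, to_tsquery('english', 'door &amp;amp; gently') AS keywords
    WHERE phrase @@ keywords;
&lt;/pre&gt;


&lt;p&gt;If your default configuration is already English, this should behave the same as the shorter form.&lt;/p&gt;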
&lt;p&gt;However, there are some nice options you can set to alter this basic behavior.&lt;/p&gt;
&lt;p&gt;The first one up is the HTML tag you wish to put around your highlighted words. For this you have two variables: &lt;em&gt;StartSel&lt;/em&gt; and &lt;em&gt;StopSel&lt;/em&gt;.
If we wanted this to be an "&amp;lt;em&amp;gt;" tag instead, we could tell the function to change as follows:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'StartSel=&amp;lt;em&amp;gt;,StopSel=&amp;lt;/em&amp;gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And now we will get back an &amp;lt;em&amp;gt; instead of a &amp;lt;b&amp;gt; (including just one row this time):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   title     |                                                        result                                                        
--------------+----------------------------------------------------------------------------------------------------------------------
Tiny Allan   | &amp;lt;em&amp;gt;gently&amp;lt;/em&amp;gt; rapping, rapping at my chamber &amp;lt;em&amp;gt;door&amp;lt;/em&amp;gt;. "Tis some visitor," I muttered, "tapping at my chamber
&lt;/pre&gt;


&lt;p&gt;In fact, it does not need to be HTML at all, you can put (almost) any string there:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'StartSel=foobar&amp;gt;&amp;gt;,StopSel=&amp;lt;&amp;lt;barfoo '&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   title     |                                                               result                                                               
--------------+------------------------------------------------------------------------------------------------------------------------------------
Tiny Allan   | foobar&amp;gt;&amp;gt;gently&amp;lt;&amp;lt;barfoo rapping, rapping at my chamber foobar&amp;gt;&amp;gt;door&amp;lt;&amp;lt;barfoo. "Tis some visitor," I muttered, "tapping at my chamber
&lt;/pre&gt;


&lt;p&gt;Quite awesome!&lt;/p&gt;
&lt;p&gt;Another attribute you can tamper with is how many words should be included in the headline, by using the &lt;em&gt;MaxWords&lt;/em&gt; and &lt;em&gt;MinWords&lt;/em&gt; options:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'MaxWords=4,MinWords=1'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Which gives you:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   title     |             result             
--------------+--------------------------------
Tiny Allan   | &amp;lt;b&amp;gt;gently&amp;lt;/b&amp;gt; rapping, rapping
&lt;/pre&gt;


&lt;p&gt;To make the resulting headline a little bit more readable, there is an attribute in this options string called &lt;em&gt;ShortWord&lt;/em&gt;. Words of this length or shorter are dropped from the start and end of the headline, unless they are one of the search terms.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'ShortWord=8'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Will give you:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   title     |                                                                                   result                                                                                    
--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Tiny Allan   | &amp;lt;b&amp;gt;gently&amp;lt;/b&amp;gt; rapping, rapping at my chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt;. "Tis some visitor," I muttered, "tapping at my chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt; - Only this, and nothing more."
Small Allan  | &amp;lt;b&amp;gt;gently&amp;lt;/b&amp;gt; rapping, rapping at my chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt;. "Tis some visitor," I muttered, "tapping at my chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt; - Only this, and nothing more." Ah, distinctly
&lt;/pre&gt;


&lt;p&gt;Now it will try to set the headline boundaries on words of more than 8 letters. This time I included the second line of the result set. As you can see in the first row, the engine could not find such a long word in the remainder of the document, so it simply prints it until the end. The second row, "Small Allan", is a bit bigger and the word "distinctly" has more than 8 letters, so it is set as the boundary.&lt;/p&gt;
&lt;p&gt;So far the headline function has given us almost full sentences and not really fragments of text. This is because the optional &lt;em&gt;MaxFragments&lt;/em&gt; defaults to 0. If we up this variable, it will start to include fragments and not sentences. Let us try it out:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'MaxFragments=2,MaxWords=8,MinWords=1'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Gives you:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   title     |                                                     result                                                      
--------------+-----------------------------------------------------------------------------------------------------------------
Tiny Allan   | &amp;lt;b&amp;gt;gently&amp;lt;/b&amp;gt; rapping, rapping at my chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt;
...
Big Allan    | &amp;lt;b&amp;gt;gently&amp;lt;/b&amp;gt; rapping, rapping at my chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt; ... chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt; - This it is, and nothing more
&lt;/pre&gt;


&lt;p&gt;I include only the first and last line of this result set. As you can see on the last line, the result is now fragmented, and we get back different pieces of our result.
If, for instance, four or five tokens match in our document, setting the &lt;em&gt;MaxFragments&lt;/em&gt; to a higher value will show more of these matches glued together.&lt;/p&gt;
&lt;p&gt;Accompanying this &lt;em&gt;MaxFragments&lt;/em&gt; option is the &lt;em&gt;FragmentDelimiter&lt;/em&gt; variable which is used to define, well, the delimiter between the fragments. Short demo:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'MaxFragments=2,FragmentDelimiter=;,MaxWords=8,MinWords=1'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You will get:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;   title     |                                                   result                                                    
--------------+-------------------------------------------------------------------------------------------------------------
Big Allan    | &amp;lt;b&amp;gt;gently&amp;lt;/b&amp;gt; rapping, rapping at my chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt;;chamber &amp;lt;b&amp;gt;door&amp;lt;/b&amp;gt; - This it is, and nothing more
&lt;/pre&gt;


&lt;p&gt;Including only the last line, you will see we now have a semicolon (;) instead of an ellipsis (...). Neat.&lt;/p&gt;
&lt;p&gt;A final, less common option for the &lt;em&gt;ts_headline()&lt;/em&gt; function is to ignore all the word boundaries we set before and simply return the &lt;em&gt;whole&lt;/em&gt; document and highlight all the words of relevance.
This variable is called &lt;em&gt;HighlightAll&lt;/em&gt; and is a Boolean set to &lt;em&gt;false&lt;/em&gt; by default:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'HighlightAll=true'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The result would be too large to print here, but try it out. It will give you the whole text, but with the important tokens decorated with the element (or text) of choice.&lt;/p&gt;
&lt;h4&gt;A big word of &lt;em&gt;caution&lt;/em&gt;&lt;/h4&gt;
&lt;p&gt;It is very fun to play with highlighting your results, I will admit that. The only problem is, as you might have concluded yourself, this is a potential performance grinder.&lt;/p&gt;
&lt;p&gt;The problem here is that this function cannot use any indexes and it can also not use your stored tsvector. It &lt;em&gt;needs&lt;/em&gt; the original document text and it needs to not only parse the whole document text to a tsvector for matching, it also needs to parse the original document text a second time to find the substrings and &lt;em&gt;decorate&lt;/em&gt; them with the characters you have set. And this whole process has to happen &lt;em&gt;for every single record&lt;/em&gt; in your result set.&lt;/p&gt;
&lt;p&gt;Highlighting, with this function, is &lt;em&gt;very&lt;/em&gt; expensive to do.&lt;/p&gt;
&lt;p&gt;This does not mean that you have to avoid this function, if so I would have told you from the start and skipped this whole part. No, it is there to be used. But use it in a correct way.&lt;/p&gt;
&lt;p&gt;A correct way often seen is to use the highlighting only on the top results you are interested in - the top results the user has on their screen at the moment.
This could be achieved in SQL with a so called &lt;em&gt;subquery&lt;/em&gt;.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_headline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
          &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'door &amp;amp; gently'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
          &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;
          &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
          &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;alias&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;For those unfamiliar, a &lt;em&gt;subquery&lt;/em&gt; is nothing more than a query within a query (cue Inception drums...sorry).&lt;/p&gt;
&lt;p&gt;You evaluate the inner query and use the result set of that to perform the outer query. You can achieve the same with two queries, but that would prove not to be as elegant.
When PostgreSQL sees a subquery, it can plan and execute more efficiently than with separate queries, many times giving you better performance.&lt;/p&gt;
&lt;p&gt;The query you see above might look a bit frightening to beginning SQL folk, but simply see it as two separate ones and the beast becomes a tiny mouse.
Unless you are afraid of mice, let it become a...euhm...soft butterfly gliding on the wind instead.&lt;/p&gt;
&lt;p&gt;In the inner query we perform the actual matching and ranking as we have seen before. This inner query then only returns &lt;em&gt;two&lt;/em&gt; matching records, because of the &lt;em&gt;LIMIT&lt;/em&gt; clause.
The outer query takes those results and performs the expensive operation of highlighting.&lt;/p&gt;
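&lt;p&gt;A small variation on the query above, as a sketch: it assumes you want the &lt;em&gt;highest&lt;/em&gt; ranked documents first, so the inner query sorts descending. The outer query repeats the &lt;em&gt;ORDER BY&lt;/em&gt;, because the order of a subquery result is not guaranteed to survive in the outer query. The alias &lt;em&gt;top_hits&lt;/em&gt; is just a name picked for this example.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Inner query: match, rank and keep only the two best documents.
-- Outer query: run the expensive ts_headline() on those two rows only.
SELECT title, ts_headline(article, keywords) AS result, rank
    FROM (SELECT keywords, article, title, ts_rank(phrase, keywords) AS rank
          FROM phraseTable, to_tsquery('door &amp;amp; gently') AS keywords
          WHERE phrase @@ keywords
          ORDER BY rank DESC
          LIMIT 2) AS top_hits
    ORDER BY rank DESC;
&lt;/pre&gt;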
&lt;h3&gt;Indexing&lt;/h3&gt;
&lt;p&gt;Back to a more serious matter, the act of &lt;em&gt;indexing&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;If you do not know what an index is, you have to brush up real fast, for indexing is quite important for the performance of your queries.
In a very simplistic view, an index is like a chapter listing in a book. You can quickly skim over the chapters to find the page you are looking for, instead of having to flip over every single page.&lt;/p&gt;
&lt;p&gt;You typically put indexes on tables which are consulted often and you build the index in a way that is in parallel with how you query them.&lt;/p&gt;
&lt;p&gt;As indexing is a whole topic, or rather, a whole profession of its own, I will not go too deeply into the matter.
But I will try to give you some basic knowledge on the subject.&lt;/p&gt;
&lt;p&gt;Note that I will go over this matter at lightning speed and thus have to greatly cut down on the amount of detail.
A &lt;em&gt;very&lt;/em&gt; good place to learn about indexes is Markus Winand's &lt;a href="http://use-the-index-luke.com/" title="Use The Index, Luke series written by Markus Winand."&gt;Use The Index, Luke&lt;/a&gt; series.
I seriously suggest you read that stuff, it is golden knowledge for every serious developer working with databases.&lt;/p&gt;
&lt;h4&gt;B-tree&lt;/h4&gt;
&lt;p&gt;Before we can go to the PostgreSQL full text index types we first have to look at the most common index type, the &lt;em&gt;Balanced tree&lt;/em&gt; or &lt;em&gt;B-tree&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;em&gt;B-tree&lt;/em&gt; is a proven "computer science" concept that gives us a way to search certain types of data, fast.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;B-tree&lt;/em&gt; is a tree structure with a root, nodes and leaves (inverse from a natural tree).
The data that is within your table rows will be ordered and chopped up to fit within the tree.&lt;/p&gt;
&lt;p&gt;It is called &lt;em&gt;Balanced&lt;/em&gt;, meaning that every leaf sits at the same depth: each path from the root down to a leaf is equally long.&lt;/p&gt;
&lt;p&gt;Take this picture for example:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;            &lt;span class="p"&gt;|&lt;/span&gt;root&lt;span class="p"&gt;|&lt;/span&gt;
              &lt;span class="p"&gt;|&lt;/span&gt;
       ----------------
       &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt;
    &lt;span class="p"&gt;|&lt;/span&gt;node&lt;span class="p"&gt;|&lt;/span&gt;          &lt;span class="p"&gt;|&lt;/span&gt;node&lt;span class="p"&gt;|&lt;/span&gt;
       &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt;
  ----------       --------- 
  &lt;span class="p"&gt;|&lt;/span&gt;        &lt;span class="p"&gt;|&lt;/span&gt;       &lt;span class="p"&gt;|&lt;/span&gt;       &lt;span class="p"&gt;|&lt;/span&gt;
&lt;span class="p"&gt;|&lt;/span&gt;node&lt;span class="p"&gt;|&lt;/span&gt;  &lt;span class="p"&gt;|&lt;/span&gt;node&lt;span class="p"&gt;|&lt;/span&gt;  &lt;span class="p"&gt;|&lt;/span&gt;node&lt;span class="p"&gt;|&lt;/span&gt;  &lt;span class="p"&gt;|&lt;/span&gt;node&lt;span class="p"&gt;|&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;In &lt;em&gt;B-tree&lt;/em&gt; terms, we summarize this tree by saying:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;It has an &lt;em&gt;order&lt;/em&gt; of 2, meaning that each node has at most two children&lt;/li&gt;
&lt;li&gt;It has a &lt;em&gt;depth&lt;/em&gt; of 3, meaning it is three levels deep (including the root node)&lt;/li&gt;
&lt;li&gt;It has 4 &lt;em&gt;leaves&lt;/em&gt;, meaning that the number of nodes that do not have children is 4 (bottom row)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;If you set the &lt;em&gt;order&lt;/em&gt; of your tree to a higher number, each node can have more children, so more nodes fit onto a single level and you will end up with a smaller &lt;em&gt;depth&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Now the actual power of an index comes from an I/O perspective. As you know (or will know now) the thing that will slow down a program/computer the most is I/O.
This can be network I/O, disk I/O, etc. In case of our database we will speak of disk I/O. &lt;/p&gt;
&lt;p&gt;When a database has to go and &lt;em&gt;scan&lt;/em&gt; your table without an index it has to plow through all your rows to find a match.
Database rows are almost always &lt;em&gt;not&lt;/em&gt; I/O optimized: they do not fit well in the blocks of your physical disk structure.
This, in short, means that there is a lot of overhead in reading through that physical data.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;B-tree&lt;/em&gt; on the other hand, is &lt;em&gt;very&lt;/em&gt; optimized for I/O. Each node of a &lt;em&gt;B-tree&lt;/em&gt; will try and fit perfectly within one physical block on your disk.
If every node fits within one block, walking down the tree costs roughly one block read per level and has almost no overhead.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;B-trees&lt;/em&gt; work with the most common data types such as TEXT, INT, VARCHAR, ... .&lt;/p&gt;
&lt;p&gt;But because full text search in PostgreSQL is its own "thing" (using the @@ operator), the knowledge you may have gathered about regular indexes does not apply (or not in full anyway) to full text search.&lt;/p&gt;
&lt;p&gt;Full text search needs its own kind of index for a &lt;em&gt;tsquery&lt;/em&gt; to be able to use it.
And as we will see in a moment, indexing on full text in PostgreSQL is a dance of trade-offs.
When it comes to this matter we have two types of indexes available: &lt;em&gt;GiST&lt;/em&gt; and &lt;em&gt;GiN&lt;/em&gt; which are both closely related to the &lt;em&gt;B-tree&lt;/em&gt;.&lt;/p&gt;
&lt;h4&gt;GiST&lt;/h4&gt;
&lt;p&gt;GiST stands for &lt;em&gt;Generalized Search Tree&lt;/em&gt; and can be set on both &lt;em&gt;tsvector&lt;/em&gt; and &lt;em&gt;tsquery&lt;/em&gt; column types, though most of the time you will use it on the former.&lt;/p&gt;
&lt;p&gt;The &lt;em&gt;GiST&lt;/em&gt; itself is not something that is unique to PostgreSQL, it is a project on its own and its concept is laid out in a C library called &lt;em&gt;libGist&lt;/em&gt;.
You could go ahead and play around with &lt;em&gt;libGist&lt;/em&gt; to get a better understanding of how it works, it even comes shipped with some demo applications.&lt;/p&gt;
&lt;p&gt;Over time many new types of trees based on the &lt;em&gt;B-tree&lt;/em&gt; concept have appeared, but most of them are limited in how they can match.
A &lt;em&gt;B-tree&lt;/em&gt; and its direct descendants can only use basic match operators like "&amp;lt;", "&amp;gt;", "=", etc.
A &lt;em&gt;GiST&lt;/em&gt; index, however, has more advanced matching capabilities like "intersect" and in case of PostgreSQL's implementation: the &lt;em&gt;"@@"&lt;/em&gt; operator.&lt;/p&gt;
&lt;p&gt;Another big advantage of the &lt;em&gt;GiST&lt;/em&gt; is the fact that it can store arbitrary data types and therefore can be used for a wide range of applications.
The trade-off, at least for full text search, is that the index is &lt;em&gt;lossy&lt;/em&gt;: each document is stored as a compact, fixed-length signature, so &lt;em&gt;GiST&lt;/em&gt; will always return a &lt;em&gt;no&lt;/em&gt; if there is no match or a &lt;em&gt;maybe&lt;/em&gt; if there is.
There is no true &lt;em&gt;hit&lt;/em&gt; with this kind of index.&lt;/p&gt;
&lt;p&gt;Because of this behavior there is extra overhead in the case of full text search because PostgreSQL has to manually go and check all the &lt;em&gt;maybe&lt;/em&gt;'s that return and see if they are an actual match.&lt;/p&gt;
&lt;p&gt;The big advantages of &lt;em&gt;GiST&lt;/em&gt; are the fact that the index builds faster and the update of such an index is less expensive than with the next index type we will see.&lt;/p&gt;
&lt;h4&gt;GiN&lt;/h4&gt;
&lt;p&gt;The second index candidate we have at our disposal is the &lt;em&gt;Generalized Inverted Index&lt;/em&gt; or &lt;em&gt;GiN&lt;/em&gt; in short.&lt;/p&gt;
&lt;p&gt;Same as we saw with &lt;em&gt;GiST&lt;/em&gt;, &lt;em&gt;GiN&lt;/em&gt; also allows for arbitrary data types to be indexed and allows for more matching operators to be used.
But as opposed to &lt;em&gt;GiST&lt;/em&gt;, a &lt;em&gt;GiN&lt;/em&gt; index is &lt;em&gt;deterministic&lt;/em&gt; - it will always return a true match, cutting the checking overhead needed with &lt;em&gt;GiST&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Well, unless you wish to use &lt;em&gt;weights&lt;/em&gt; in your queries. A &lt;em&gt;GiN&lt;/em&gt; index does &lt;em&gt;not&lt;/em&gt; store lexeme weights. This means that, if weights need to be taken into account when querying, PostgreSQL still has to go and fetch the actual row(s) that return a true match, giving you somewhat of the same overhead as with a &lt;em&gt;GiST&lt;/em&gt; index.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;GiN&lt;/em&gt; tries to improve the &lt;em&gt;B-tree&lt;/em&gt; concept by minimizing the amount of redundancy within nodes and their leaves.
When you search for a number between 0 and 1000, it can be that your index has to go over 5 levels to find the desired entry.
This means that the four levels above the matching leaf could potentially contain an (implied) reference to the row id you wish to have fetched.
In a &lt;em&gt;GiN&lt;/em&gt; index, this is generalized by storing a single entry of the duplicates into so-called &lt;em&gt;posting trees&lt;/em&gt; and &lt;em&gt;posting lists&lt;/em&gt; and pointing to those lists instead of drilling down multiple levels.&lt;/p&gt;
&lt;p&gt;The downside of &lt;em&gt;GiN&lt;/em&gt; is the fact that this kind of index will slow down the bigger it gets.&lt;/p&gt;
&lt;p&gt;On a more positive note, &lt;em&gt;GiN&lt;/em&gt; indexes are most of the time smaller on disk (because of it trying to reduce duplicates). And, as of PostgreSQL 9.4, they will be &lt;em&gt;even&lt;/em&gt; smaller.
The soon-to-be version will introduce a so-called &lt;em&gt;varbyte&lt;/em&gt; version of &lt;em&gt;GiN&lt;/em&gt;. For now just take it from me that it will make these types of indexes &lt;em&gt;much&lt;/em&gt; smaller, and even more efficient.&lt;/p&gt;
&lt;p&gt;As you can see, there is no perfect index when it comes to full text. You will have to carefully look at what data you will save and how you wish to query the data.&lt;/p&gt;
&lt;p&gt;If you do not update your database much but you have a lot of querying going on, &lt;em&gt;GiN&lt;/em&gt; might be a better option for it is much faster with a lookup (if no weights are required).
If your data does not get read much, but is updated frequently, maybe a &lt;em&gt;GiST&lt;/em&gt; is a better choice for it allows for faster updating.&lt;/p&gt;
&lt;h4&gt;Making an index&lt;/h4&gt;
&lt;p&gt;We have (very roughly) seen what an index is and what we have available for full text, but how do you actually build such an index?&lt;/p&gt;
&lt;p&gt;Luckily for us, this too has been neatly abstracted and is very simple to do.&lt;/p&gt;
&lt;p&gt;If we wanted our phraseTable to contain an index, we simply could go about and create it with the following syntax:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;phrasetable_idx&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will create a &lt;em&gt;GiN&lt;/em&gt; index called &lt;em&gt;phrasetable_idx&lt;/em&gt; on the column &lt;em&gt;phrase&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Just like we did before, we will now re-populate our &lt;em&gt;phrase&lt;/em&gt; column, but this time we will fill it with the data we want to have indexed: article and title.
Let me show you what I mean.&lt;/p&gt;
&lt;p&gt;First, empty the &lt;em&gt;phrase&lt;/em&gt; column of the four rows in our tiny database:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Notice that I removed the &lt;em&gt;NOT NULL&lt;/em&gt; constraint.
Next we can populate it with a tsvector of both the &lt;em&gt;title&lt;/em&gt; and the &lt;em&gt;article&lt;/em&gt; columns:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The &lt;em&gt;coalesce&lt;/em&gt; function may be something that you are unfamiliar with.
This function simply returns the first argument which is &lt;em&gt;not&lt;/em&gt; NULL.
In this case we use:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Which means that if title would be NULL it will return the empty string &lt;em&gt;''&lt;/em&gt; which never is NULL.
We use &lt;em&gt;coalesce&lt;/em&gt; here to substitute a value for NULL, being the empty string.&lt;/p&gt;
&lt;p&gt;If we would not substitute NULL, then a NULL in either the title or the article column would turn the whole concatenated string into NULL, and the resulting &lt;em&gt;tsvector&lt;/em&gt; for that row would be NULL as well.&lt;/p&gt;
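&lt;p&gt;A quick way to see why the substitution matters, as a throwaway sketch that does not touch our table. The column names &lt;em&gt;without_coalesce&lt;/em&gt; and &lt;em&gt;with_coalesce&lt;/em&gt; are made up for this demo.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Concatenating anything with NULL yields NULL,
-- while coalesce() swaps the NULL for an empty string first.
SELECT 'The Raven' || ' ' || NULL::text AS without_coalesce,
       coalesce('The Raven', '') || ' ' || coalesce(NULL::text, '') AS with_coalesce;
&lt;/pre&gt;
&lt;p&gt;The first column comes back as NULL, the second one still holds the title text.&lt;/p&gt;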
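&lt;p&gt;Next we can create an index on that newly filled column (if you already ran the statement from the start of this section you can skip this step, index names have to be unique):&lt;/p&gt;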
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;phrasetable_idx&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we have magic, there now is a &lt;em&gt;GiN&lt;/em&gt; index on that column which will be used during full text search.&lt;/p&gt;
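&lt;p&gt;If you want to check whether the planner actually picks up the index, &lt;em&gt;EXPLAIN&lt;/em&gt; is your friend. The sketch below assumes the &lt;em&gt;GiN&lt;/em&gt; index from above exists. On a table as tiny as ours PostgreSQL will usually prefer a plain sequential scan, so for experimenting only we discourage that with the &lt;em&gt;enable_seqscan&lt;/em&gt; setting.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- For experimenting only: make sequential scans unattractive to the planner.
SET enable_seqscan = off;

-- Show the plan for a full text match on the indexed column.
EXPLAIN
SELECT title
    FROM phraseTable
    WHERE phrase @@ to_tsquery('english', 'door &amp;amp; gently');

-- Restore the default planner behaviour.
RESET enable_seqscan;
&lt;/pre&gt;
&lt;p&gt;In the plan you would typically look for a bitmap index scan that mentions &lt;em&gt;phrasetable_idx&lt;/em&gt;.&lt;/p&gt;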
&lt;p&gt;To create a &lt;em&gt;GiST&lt;/em&gt; index we could use exactly the same syntax (under a different name, since index names have to be unique):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;phrasetable_gist_idx&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now, the disk-space savvy readers will have noticed that our "phrase" column now contains some redundant information as we store the tsvector of the article and title column that was already in the database.
If you do not wish to have this extra column, you could create an &lt;em&gt;expression&lt;/em&gt; index (much like the on-the-fly queries we have seen before).&lt;/p&gt;
&lt;p&gt;The setup of such an expression index is trivial:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;phrasetable_exp_idx&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;gin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)));&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Instead of having this extra tsvector column around, we now have created an on-the-fly index using the same syntax as we employed when we populated the &lt;em&gt;phrase&lt;/em&gt; column a few lines back.&lt;/p&gt;
&lt;p&gt;One important thing to note when you use &lt;em&gt;expression&lt;/em&gt; indexes is the text search configuration you used. Here we specify that we wish the index to be created using the 'english' configuration set.
This results in an index which is configuration aware and will &lt;em&gt;only&lt;/em&gt; be used by a query that repeats the &lt;em&gt;same&lt;/em&gt; &lt;em&gt;to_tsvector&lt;/em&gt; expression, with the same configuration name, in its WHERE clause.&lt;/p&gt;
&lt;p&gt;You cannot simply omit the configuration here. The one-argument form falls back to the "default_text_search_config" variable we saw in the last chapter, and that setting &lt;em&gt;could&lt;/em&gt; be altered &lt;em&gt;after&lt;/em&gt; the index was created. An index built that way could silently return inaccurate results, which is why PostgreSQL only accepts immutable expressions in an index and expects the two-argument form.&lt;/p&gt;
&lt;p&gt;Also note that we may save on disk space when we use the &lt;em&gt;expression&lt;/em&gt; index, but we do not save on CPU. Now, instead of indexing data already parsed and ready in a column, the index has to compute the &lt;em&gt;to_tsvector&lt;/em&gt; on every index match. Again, a world of trade-offs.&lt;/p&gt;
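&lt;p&gt;To make the "same expression" requirement concrete, here is a sketch of a query that can use the expression index. It assumes the &lt;em&gt;phrasetable_exp_idx&lt;/em&gt; index from above; note how the WHERE clause repeats the indexed expression, configuration name included.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- The left-hand side mirrors the expression used in CREATE INDEX,
-- otherwise the planner cannot match the query to the index.
SELECT title
    FROM phraseTable
    WHERE to_tsvector('english', coalesce(title,'') || ' ' || coalesce(article,''))
          @@ to_tsquery('english', 'door &amp;amp; gently');
&lt;/pre&gt;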
&lt;h3&gt;Triggers&lt;/h3&gt;
&lt;p&gt;A final, small topic I want to briefly touch on before I let you go free is &lt;em&gt;update triggers&lt;/em&gt;. The way we have been populating our database so far does not need a trigger actually. Up until now we have been inserting records (or updating them) using the &lt;em&gt;to_tsvector()&lt;/em&gt; function. The negative aspect of going about it the way we did is that it is extremely &lt;em&gt;redundant&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;If we inserted a piece of Raven text into a record, we specified it &lt;em&gt;twice&lt;/em&gt;, one time for the &lt;em&gt;article&lt;/em&gt; column and one time for the &lt;em&gt;phrase&lt;/em&gt; column which holds the tsvector result.&lt;/p&gt;
&lt;p&gt;A better way to do this is to not let the insert query care about the tsvector &lt;em&gt;at all&lt;/em&gt;. We simply insert the text we like and let the database do the converting behind the curtains.&lt;/p&gt;
&lt;p&gt;This is where a &lt;em&gt;trigger&lt;/em&gt; comes in handy. Actually, PostgreSQL has a whole set of &lt;em&gt;trigger&lt;/em&gt; functions available that will fire when certain conditions are met, but when it comes to full text we have two functions at our disposal.&lt;/p&gt;
&lt;h4&gt;tsvector_update_trigger()&lt;/h4&gt;
&lt;p&gt;The first, and most used one, is called &lt;em&gt;tsvector_update_trigger()&lt;/em&gt; and can be fired whenever a row is inserted into or updated in your table (in our case &lt;em&gt;phraseTable&lt;/em&gt;).&lt;/p&gt;
&lt;p&gt;To setup such a trigger, we could use the following SQL:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="n"&gt;tsvectorupdate&lt;/span&gt; &lt;span class="k"&gt;BEFORE&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;
    &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;EACH&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;EXECUTE&lt;/span&gt; &lt;span class="k"&gt;PROCEDURE&lt;/span&gt;
    &lt;span class="n"&gt;tsvector_update_trigger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'pg_catalog.english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;That is all you need to setup such a trigger. Let us see what we just did.&lt;/p&gt;
&lt;p&gt;First, we have new syntax staring us in the face: &lt;em&gt;CREATE TRIGGER&lt;/em&gt;. This will create a trigger on certain &lt;em&gt;events&lt;/em&gt;. The events here are &lt;em&gt;BEFORE INSERT&lt;/em&gt; and &lt;em&gt;BEFORE UPDATE&lt;/em&gt; which are contracted to &lt;em&gt;BEFORE INSERT OR UPDATE&lt;/em&gt;. Then we specify on which &lt;em&gt;table&lt;/em&gt; this trigger has to act and for each &lt;em&gt;ROW&lt;/em&gt;. Then we say we want to &lt;em&gt;EXECUTE PROCEDURE&lt;/em&gt;, which, in our case, is the function &lt;em&gt;tsvector_update_trigger()&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The function itself needs a bit of explaining as well. This version takes three required arguments: the tsvector column name, the full text configuration name and the original text column name.
The latter can be multiple columns to concatenate them together. This concatenation is done with &lt;em&gt;coalesce&lt;/em&gt; under the hood, as we have seen before.&lt;/p&gt;
&lt;p&gt;In our case, we create a trigger that takes the &lt;em&gt;phrase&lt;/em&gt; tsvector column, the &lt;em&gt;english&lt;/em&gt; full text configuration and concatenates the text from both &lt;em&gt;title&lt;/em&gt; and &lt;em&gt;article&lt;/em&gt; to be normalized into lexemes.&lt;/p&gt;
&lt;p&gt;Note that instead of &lt;em&gt;english&lt;/em&gt; we say &lt;em&gt;pg_catalog.english&lt;/em&gt; when providing this function with the full text configuration.
In case of this function (and the next) we have to provide the schema-qualified path to the configuration.&lt;/p&gt;
&lt;h4&gt;tsvector_update_trigger_column()&lt;/h4&gt;
&lt;p&gt;The other of the two full text trigger functions we have is called &lt;em&gt;tsvector_update_trigger_column()&lt;/em&gt; and has only one difference to the former: the full text configuration used.
Here, the full text configuration can be read from a &lt;em&gt;column&lt;/em&gt; instead of given directly as a string.&lt;/p&gt;
&lt;p&gt;A possibility we have not seen in this series is one where you can have yet another column in your phraseTable where you store the name of the full text configuration you wish to use.
This way you can store multiple "languages" within the same table, specifying which configuration to use with each row.&lt;/p&gt;
&lt;p&gt;This trigger function can take these per-row differing configurations into account by reading them from the specified column.&lt;/p&gt;
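&lt;p&gt;As a minimal sketch of how this could look, assume our phraseTable gets an extra column of type &lt;em&gt;regconfig&lt;/em&gt; (the type this trigger function expects for the configuration column). The column name &lt;em&gt;config&lt;/em&gt; and the trigger name are made up for this illustration, and the trigger would be used instead of the previous one:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Hypothetical extra column holding the configuration to use for each row
ALTER TABLE phraseTable ADD COLUMN config regconfig DEFAULT 'pg_catalog.english';

-- Same idea as before, but the second argument is now a column name
CREATE TRIGGER phrase_update_by_config BEFORE INSERT OR UPDATE
    ON phraseTable FOR EACH ROW EXECUTE PROCEDURE
    tsvector_update_trigger_column(phrase, config, title, article);
&lt;/pre&gt;


&lt;p&gt;A row inserted with &lt;em&gt;config&lt;/em&gt; set to 'pg_catalog.dutch' would then be parsed with the dutch configuration, while rows relying on the column default keep using english.&lt;/p&gt;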
&lt;p&gt;But we have a trade-off once more. These two trigger functions, which are officially called &lt;em&gt;example functions&lt;/em&gt; again (remember our ranking functions?), do &lt;em&gt;not&lt;/em&gt; take into account weights.
If you have the need to store different weights in your tsvectors, you will have to write your own trigger function.&lt;/p&gt;
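&lt;p&gt;Such a function does not have to be big. The following is a sketch in PL/pgSQL, assuming the same phraseTable columns as above. The function and trigger names are invented for this example and the weights 'A' and 'D' are an arbitrary choice:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;CREATE FUNCTION phrase_weighted_trigger() RETURNS trigger AS $$
BEGIN
    -- Title lexemes get weight A, article lexemes get weight D;
    -- coalesce() keeps a NULL column from turning the whole vector into NULL
    NEW.phrase :=
        setweight(to_tsvector('pg_catalog.english', coalesce(NEW.title, '')), 'A') ||
        setweight(to_tsvector('pg_catalog.english', coalesce(NEW.article, '')), 'D');
    RETURN NEW;
END
$$ LANGUAGE plpgsql;

-- Hypothetical trigger that calls the function above for each changed row
CREATE TRIGGER phrase_weighted_update BEFORE INSERT OR UPDATE
    ON phraseTable FOR EACH ROW EXECUTE PROCEDURE phrase_weighted_trigger();
&lt;/pre&gt;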
&lt;h3&gt;The end&lt;/h3&gt;
&lt;p&gt;Okay, I guess this covers the &lt;em&gt;basics&lt;/em&gt; of full text within PostgreSQL.&lt;/p&gt;
&lt;p&gt;We have covered the most important parts and touched some segments deeply, others just with a soft lover's glove.
As I always say at the end of such lengthy chapters: go out and explore.&lt;/p&gt;
&lt;p&gt;I have tried to give you a solid, full text knowledge base to build further adventures on. I highly encourage you to pack your elephant, take your new ship for a maiden voyage, set high the sails and if certain blue whales try to swim next to your vessel, simply let the mammoth take a good relief down the ship's head, and let those turds float together with our squeaky finned friends!&lt;/p&gt;
&lt;p&gt;And as always...thanks for reading!&lt;/p&gt;
&lt;/div&gt;</description><category>full text search</category><category>postgresql</category><guid>http://shisaa.be/postset/postgresql-full-text-search-part-3.html</guid><pubDate>Wed, 14 May 2014 09:00:00 GMT</pubDate></item><item><title>PostgreSQL: A full text search engine - Part 2</title><link>http://shisaa.be/postset/postgresql-full-text-search-part-2.html</link><dc:creator>Tim van der Linden</dc:creator><description>&lt;div&gt;&lt;p&gt;Welcome to the second installment of our look into full text search within PostgreSQL.&lt;/p&gt;
&lt;p&gt;If this is the first time you have heard about full text search, I highly encourage you to go and read &lt;a href="http://shisaa.be/postset/postgresql-full-text-search-part-1.html" title="First chapter introducing the full text search capabilities of PostgreSQL."&gt;the first chapter&lt;/a&gt; in this series before continuing. This chapter builds on what we have seen previously.&lt;/p&gt;
&lt;h3&gt;A look back&lt;/h3&gt;
&lt;p&gt;In short, the previous chapter introduced the general concept of full text search, regardless of the software being used. It looked at how the idea of full text search was brought to computer software by breaking it up into roughly three steps: &lt;em&gt;case removal&lt;/em&gt;, &lt;em&gt;stop word removal&lt;/em&gt;, and normalizing with &lt;em&gt;synonyms&lt;/em&gt; and &lt;em&gt;stemming&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Next we delved into PostgreSQL's implementation and introduced the &lt;em&gt;tsvector&lt;/em&gt; and the &lt;em&gt;tsquery&lt;/em&gt; as two new data types together with a handful of new functions such as &lt;em&gt;to_tsvector()&lt;/em&gt;, &lt;em&gt;to_tsquery()&lt;/em&gt; and &lt;em&gt;plainto_tsquery()&lt;/em&gt;, which all extend PostgreSQL to support full text search. &lt;/p&gt;
&lt;p&gt;We saw how we could feed PostgreSQL a string of text which would then get &lt;em&gt;parsed&lt;/em&gt; into &lt;em&gt;tokens&lt;/em&gt; and processed even further into &lt;em&gt;lexemes&lt;/em&gt; which in turn got &lt;em&gt;stored&lt;/em&gt; into a &lt;em&gt;tsvector&lt;/em&gt;. We then queried that &lt;em&gt;tsvector&lt;/em&gt; using the &lt;em&gt;tsquery&lt;/em&gt; data type and the &lt;em&gt;@@&lt;/em&gt; matching operator.&lt;/p&gt;
&lt;p&gt;In this chapter, I want to flesh out an important topic we touched on previously: PostgreSQL's full text search &lt;em&gt;configurations&lt;/em&gt;.&lt;/p&gt;
&lt;h3&gt;Precaution&lt;/h3&gt;
&lt;p&gt;Let me be very clear: in &lt;em&gt;most&lt;/em&gt; cases the configurations shipped with PostgreSQL will suffice and you do not need to touch them at all, in which case this chapter could be considered a waste of time.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;However&lt;/em&gt;, I highly encourage you to read through this chapter and, as always, actually &lt;em&gt;run the queries&lt;/em&gt; with me. You need to know how the tools you use work under the hood.&lt;/p&gt;
&lt;p&gt;To be even more bold, someday you might even need to get your hands dirty and actually &lt;em&gt;build&lt;/em&gt; your own configuration. Why? Because a customer wanting full text search for their application might have specific requirements, or even deliver you specific dictionaries to use in the parsing stage. Such use cases may arise in very specific areas of conduct where much official, technical lingo is used which is not covered in a general dictionary.&lt;/p&gt;
&lt;p&gt;So, put on your favorite pants (or none if you like that better), turn down the lights, pull the computer close to you, open up a terminal window, put on some eerie music and let us get started.&lt;/p&gt;
&lt;h3&gt;Configuring PostgreSQL full text search&lt;/h3&gt;
&lt;p&gt;In the last chapter we saw that PostgreSQL uses a couple of tools like a &lt;em&gt;stop word list&lt;/em&gt; and &lt;em&gt;dictionaries&lt;/em&gt; to perform its parsing. We also saw that we did not need to tell PostgreSQL about which of these tools to use. It turned out that full text search comes with a set of default configurations for several languages. We also found out that, if no configuration was given, the database assumes that the document or string to be parsed is English and uses a configuration called &lt;em&gt;'english'&lt;/em&gt;. &lt;/p&gt;
&lt;p&gt;Beware of localized packages of PostgreSQL though. As I noted in the previous chapter, there is a small possibility that the default configuration in your PostgreSQL installation is &lt;em&gt;not&lt;/em&gt; set to 'english'.
If this is the case with your setup, be sure to include the 'english' configuration if not stated otherwise or &lt;em&gt;change&lt;/em&gt; it to be 'english'. We will see how to do that in a minute.&lt;/p&gt;
&lt;p&gt;Taking the small database we created last time, the syntax to feed a configuration set to PostgreSQL during parsing was the following:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'elephants &amp;amp; blue'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The string '&lt;em&gt;english&lt;/em&gt;' represents the &lt;em&gt;name&lt;/em&gt; of the configuration which we would like to use. As you know by now, this string can be omitted, which will make the database use the default configuration. PostgreSQL knows this default because it is set in the general &lt;em&gt;postgresql.conf&lt;/em&gt; configuration file. In that file you will find a variable called &lt;em&gt;default_text_search_config&lt;/em&gt; which, in most cases, is set to &lt;em&gt;pg_catalog.english&lt;/em&gt;. If you wish to have your own, custom configuration as the default, that is the place to set it.&lt;/p&gt;
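&lt;p&gt;If you only want to check the default or try out a different one, you do not need to edit the file. A short sketch using the standard SHOW, SET and RESET commands, which affect the current session only:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Which configuration is used when none is given?
SHOW default_text_search_config;

-- Make english the default, for this session only
SET default_text_search_config = 'pg_catalog.english';
SELECT plainto_tsquery('blue elephants');

-- Back to whatever postgresql.conf defines
RESET default_text_search_config;
&lt;/pre&gt;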
&lt;p&gt;Before hacking away at your own configuration, it may be of interest to see what PostgreSQL has to offer. To see which shipped configurations are available to you, use the &lt;em&gt;describe&lt;/em&gt; command (\d) together with the full text flag (F):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;dF&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will &lt;em&gt;describe&lt;/em&gt; the objects in the database that represent full text configurations. You see that by default you have quite a lot of language support. To see a different configuration in action, let us do a quick, fun test. &lt;/p&gt;
&lt;p&gt;First, take the Dutch string "Een blauwe olifant springt al dartelend over de kreupele dolfijn.", which is a rough translation of the "The blue elephant jumps over the crippled dolphin." example from the first chapter. If we feed this to PostgreSQL, using the default (english) configuration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Een blauwe olifant springt al dartelend over de kreupele dolfijn'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We would get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt; &lt;span class="s1"&gt;'al'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="s1"&gt;'blauw'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="s1"&gt;'dartelend'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="s1"&gt;'de'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="s1"&gt;'dolfijn'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="s1"&gt;'een'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="s1"&gt;'kreupel'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt; &lt;span class="s1"&gt;'olif'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="s1"&gt;'springt'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;It attempted to guess some words, as you can see from the lexeme 'olif', but, to a Dutch reader, this is &lt;em&gt;not&lt;/em&gt; stemmed correctly. Neither are the stop words removed: 'de' and 'een' are articles which, in Dutch, are considered of no value in a text search context. So let us try this again with the built-in &lt;em&gt;dutch&lt;/em&gt; configuration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'dutch'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Een blauwe olifant springt al dartelend over de kreupele dolfijn'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we get:&lt;/p&gt;
&lt;pre class="code literal-block"&gt; &lt;span class="s1"&gt;'blauw'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="s1"&gt;'dartel'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="s1"&gt;'dolfijn'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="s1"&gt;'kreupel'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt; &lt;span class="s1"&gt;'olifant'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="s1"&gt;'springt'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Aha! That is much shorter than the previous result, and it is also more correct. As you can see, the words 'de' and 'een' are now removed and the stemming is done correctly on 'dartel', 'olifant' and 'kreupel'.
The target of this series, however, is not to show you the Dutch language (for it will make you weep...), but you see the effect a different configuration set can have.&lt;/p&gt;
&lt;p&gt;But what is such a configuration set made of? To answer that, we can simply use the same describe, but ask for more detailed information with the &lt;em&gt;+&lt;/em&gt; flag:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;dF&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will return a list of &lt;em&gt;all&lt;/em&gt; the configurations and their details, so let us filter that and look at only the english version:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;dF&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;english&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The following result will be returned:&lt;/p&gt;
&lt;pre class="code literal-block"&gt; asciihword      &lt;span class="p"&gt;|&lt;/span&gt; english_stem
 asciiword       &lt;span class="p"&gt;|&lt;/span&gt; english_stem
 email           &lt;span class="p"&gt;|&lt;/span&gt; simple
 file            &lt;span class="p"&gt;|&lt;/span&gt; simple
 float           &lt;span class="p"&gt;|&lt;/span&gt; simple
 host            &lt;span class="p"&gt;|&lt;/span&gt; simple
 hword           &lt;span class="p"&gt;|&lt;/span&gt; english_stem
 hword_asciipart &lt;span class="p"&gt;|&lt;/span&gt; english_stem
 hword_numpart   &lt;span class="p"&gt;|&lt;/span&gt; simple
 hword_part      &lt;span class="p"&gt;|&lt;/span&gt; english_stem
 int             &lt;span class="p"&gt;|&lt;/span&gt; simple
 numhword        &lt;span class="p"&gt;|&lt;/span&gt; simple
 numword         &lt;span class="p"&gt;|&lt;/span&gt; simple
 sfloat          &lt;span class="p"&gt;|&lt;/span&gt; simple
 uint            &lt;span class="p"&gt;|&lt;/span&gt; simple
 url             &lt;span class="p"&gt;|&lt;/span&gt; simple
 url_path        &lt;span class="p"&gt;|&lt;/span&gt; simple
 version         &lt;span class="p"&gt;|&lt;/span&gt; simple
 word            &lt;span class="p"&gt;|&lt;/span&gt; english_stem
&lt;/pre&gt;


&lt;p&gt;All of these are &lt;em&gt;token categories&lt;/em&gt; that target the different groups of words that the PostgreSQL full text parser recognizes.
 For each category there are one or more dictionaries defined which will receive the token and try to return a lexeme.
 We also call this overview a configuration map, for it maps a category to one or more dictionaries.&lt;/p&gt;
&lt;p&gt;If the parser encounters a URL, for example, it will categorize it as a &lt;em&gt;url&lt;/em&gt; or &lt;em&gt;url_path&lt;/em&gt; token and as a result, PostgreSQL will consult the dictionaries &lt;em&gt;mapped&lt;/em&gt; to this category to try and create a single lexeme containing a URL pointing to the same path. Example:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;example.com&lt;/li&gt;
&lt;li&gt;example.com/index.html&lt;/li&gt;
&lt;li&gt;example.com/foo/../index.html&lt;/li&gt;
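&lt;/ul&gt;&lt;p&gt;The URLs all result in the same document being served, so it makes sense to only save one variant as a lexeme in the resulting vector. Be aware that the &lt;em&gt;simple&lt;/em&gt; dictionary mapped to these categories above only lowercases the token; collapsing such variants into one lexeme is a job for a dictionary written for that purpose.
The same kind of &lt;em&gt;normalization&lt;/em&gt; can be applied to file paths, version numbers, host names, units of measure, ... . A lot more than normal, English words.&lt;/p&gt;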
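&lt;p&gt;To see for yourself which category a token lands in and which dictionary handled it, PostgreSQL offers the &lt;em&gt;ts_debug()&lt;/em&gt; function. A small sketch, reusing one of the example URLs from above:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- One row per token: its category, the mapped dictionaries and the resulting lexemes
SELECT alias, token, dictionaries, lexemes
FROM ts_debug('english', 'example.com/foo/../index.html');
&lt;/pre&gt;


&lt;p&gt;Each row shows a token, the category (alias) the parser assigned to it, the dictionaries mapped to that category and the lexemes that came out.&lt;/p&gt;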
&lt;p&gt;There are 23 categories in total that the parser can recognize. Ones not included here are, for example, &lt;em&gt;tag&lt;/em&gt; for XML tags, &lt;em&gt;blank&lt;/em&gt; for whitespace or punctuation, etc.&lt;/p&gt;
&lt;p&gt;To see a description of the different token categories supported, use the 'p' flag together with '+' for more information:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;dFp&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;
&lt;/pre&gt;
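&lt;p&gt;The same information is also reachable from plain SQL through the &lt;em&gt;ts_token_type()&lt;/em&gt; function, which can be handy outside of psql. A minimal example, assuming the built-in parser named 'default':&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Lists every token category the parser knows, with a short description
SELECT tokid, alias, description FROM ts_token_type('default');
&lt;/pre&gt;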


&lt;p&gt;When parsing, the chain of command goes as follows:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;A string is fed to PostgreSQL's full text parser&lt;/li&gt;
&lt;li&gt;The parser crawls over the string and chops it into tokens of a certain type&lt;/li&gt;
&lt;li&gt;For each token category a list of dictionaries (or a single dictionary) is consulted&lt;/li&gt;
&lt;li&gt;If a dictionary list is used, the dictionaries are (generally) ordered from most precise (narrow) to most generic (wide)&lt;/li&gt;
&lt;li&gt;As soon as a dictionary returns a lexeme (single or in the form of an array), the flow for that token stops&lt;/li&gt;
&lt;li&gt;If no lexeme is proposed (a dictionary returns &lt;em&gt;NULL&lt;/em&gt;), the token is given to the next dictionary in line. If a stop word list returns a match (the dictionary returns an &lt;em&gt;empty array&lt;/em&gt;), the token is discarded&lt;/li&gt;
&lt;/ul&gt;&lt;h3&gt;Dictionary templates and dictionaries&lt;/h3&gt;
&lt;p&gt;In the list of token categories that were supported by the built-in 'english' configuration, you will find that only two &lt;em&gt;dictionaries&lt;/em&gt; are used: &lt;em&gt;simple&lt;/em&gt; and &lt;em&gt;english_stem&lt;/em&gt;, which in turn come from the &lt;em&gt;simple&lt;/em&gt; and &lt;em&gt;snowball&lt;/em&gt; dictionary &lt;em&gt;templates&lt;/em&gt; respectively.&lt;/p&gt;
&lt;p&gt;So, what exactly is the difference between a &lt;em&gt;dictionary template&lt;/em&gt; and a &lt;em&gt;dictionary&lt;/em&gt;?&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;dictionary template&lt;/em&gt; is the skeleton (hence template) of a dictionary. It defines the actual &lt;em&gt;C&lt;/em&gt; functions that will do the heavy lifting.
A &lt;em&gt;dictionary&lt;/em&gt; is an instantiation of that template - providing it with data to work with.&lt;/p&gt;
&lt;p&gt;Let me try to clear any confusion on this. &lt;/p&gt;
&lt;p&gt;Take, for example, the &lt;em&gt;simple&lt;/em&gt; dictionary &lt;em&gt;template&lt;/em&gt;. It does two things: it first checks a token against a &lt;em&gt;stop word&lt;/em&gt; list. If it finds a match it returns an &lt;em&gt;empty array&lt;/em&gt;, which will result in the token being discarded. If no match is found in the stop word list, the process will return the same token, but with &lt;em&gt;casing removed&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;All the checking and case removing is done by functions, under the hood. The stop word file, however, is something that the &lt;em&gt;dictionary&lt;/em&gt; (the instantiation) provides.
An instantiation of the &lt;em&gt;simple&lt;/em&gt; dictionary template, thus a &lt;em&gt;dictionary&lt;/em&gt; itself, could be defined as follows:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="k"&gt;DICTIONARY&lt;/span&gt; &lt;span class="k"&gt;simple&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;stopwords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;english&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;No need to run this SQL, for PostgreSQL already comes shipped with a &lt;em&gt;simple&lt;/em&gt; dictionary (one that has no stop word list configured), but I wish to show you how you &lt;em&gt;could&lt;/em&gt; create one.&lt;/p&gt;
&lt;p&gt;First, you will see that we &lt;em&gt;have&lt;/em&gt; to define the template, thus telling PostgreSQL which set of functions to use.
Next we feed it the data it is expecting; in the case of &lt;em&gt;simple&lt;/em&gt; that is only a stop word list.&lt;/p&gt;
&lt;p&gt;The reason for this separation is to act as a safeguard. Only a database user with &lt;em&gt;superuser&lt;/em&gt; privileges can create the actual template, because this template will contain functions that, if written incorrectly, could slow down or crash the database. You need someone who knows what they are doing and not your local script kiddie who has normal user access to your part of the database.&lt;/p&gt;
&lt;p&gt;Notice that we only give the stopwords attribute the word &lt;em&gt;english&lt;/em&gt; instead of a full file path.
This is because PostgreSQL has set a few standards in place for all dictionary types we will see in this chapter.&lt;/p&gt;
&lt;p&gt;First, in case of a stop word list, the file &lt;em&gt;must&lt;/em&gt; have the &lt;em&gt;.stop&lt;/em&gt; extension.&lt;/p&gt;
&lt;p&gt;Next, you only provide the base name of the file, without a directory or extension; a name made of lowercase letters, digits and underscores is the safe choice.
PostgreSQL will search for it inside a directory called &lt;em&gt;tsearch_data&lt;/em&gt; within PostgreSQL's portion of your system's user &lt;em&gt;shared&lt;/em&gt; directory.&lt;/p&gt;
&lt;p&gt;On a Debian system (using PostgreSQL 9.3) the path to this directory reads: "/usr/share/postgresql/9.3/tsearch_data".&lt;/p&gt;
&lt;p&gt;A dictionary like the &lt;em&gt;simple&lt;/em&gt; dictionary is most of the time put at the beginning of a &lt;em&gt;dictionary list&lt;/em&gt; to remove all the stop words before other dictionaries are consulted. However, in all the cases where we see &lt;em&gt;simple&lt;/em&gt; in the dictionary column of the token type list above, only this dictionary is used, meaning that tokens are merely stripped of casing, with stop words removed only if the dictionary has a stop word list configured.&lt;/p&gt;
&lt;h3&gt;Creating the "simple" dictionary&lt;/h3&gt;
&lt;p&gt;Say that we wanted to set up our own &lt;em&gt;simple&lt;/em&gt; dictionary based on the &lt;em&gt;simple&lt;/em&gt; dictionary template, but feed it our own list of stop words. Before setting up this new dictionary, we would first have to write a stop word file.&lt;/p&gt;
&lt;p&gt;Luckily for us, this is trivial. A stop word file is nothing more than a plain text file with one word on each line. Empty lines and trailing whitespace are ignored. We would then have to save this file with the &lt;em&gt;.stop&lt;/em&gt; extension. Let us try just that.&lt;/p&gt;
&lt;p&gt;Open up your editor and punch in the words "dolphin" and "the", both on their own line. Write the file out as "shisaa_stop.stop", inside the &lt;em&gt;tsearch_data&lt;/em&gt; directory mentioned above.&lt;/p&gt;
&lt;p&gt;Next we need to set up our dictionary. Connect to the "phrases" database from chapter one and run the following SQL:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="k"&gt;DICTIONARY&lt;/span&gt; &lt;span class="n"&gt;shisaa_simple&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;simple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;stopwords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shisaa_stop&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;
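&lt;p&gt;Before wiring this dictionary into a configuration, you can poke at it in isolation with the &lt;em&gt;ts_lexize()&lt;/em&gt; function, which feeds a single token to a single dictionary. A quick sketch, assuming the stop word file from above is in place:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- A stop word: the dictionary answers with an empty array
SELECT ts_lexize('shisaa_simple', 'dolphin');

-- Any other word: the lowercased token comes back as a lexeme
SELECT ts_lexize('shisaa_simple', 'Elephant');
&lt;/pre&gt;


&lt;p&gt;Note that the second call returns a lexeme rather than &lt;em&gt;NULL&lt;/em&gt;. With its default settings, a dictionary based on the simple template recognizes every token that is not a stop word. The template's &lt;em&gt;Accept&lt;/em&gt; option can be set to false if it should pass such tokens on to the next dictionary instead.&lt;/p&gt;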


&lt;h3&gt;Setting up a configuration&lt;/h3&gt;
&lt;p&gt;Now, the dictionary by itself is not very helpful. As we have seen before, we need to map it to token categories before we can actually use it for parsing.
This means that we need to make our own configuration.&lt;/p&gt;
&lt;p&gt;Let us set up an empty configuration (not based on an existing one like 'english'):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="n"&gt;CONFIGURATION&lt;/span&gt; &lt;span class="n"&gt;shisaa&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'default'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This statement will create a new configuration for us which is completely empty: it has no mappings. The argument we have to give here can be either &lt;em&gt;parser&lt;/em&gt; or &lt;em&gt;copy&lt;/em&gt;. With parser you define which parser to use and it will create an empty configuration. PostgreSQL has only one parser by default which is named...&lt;em&gt;default&lt;/em&gt;. If you choose &lt;em&gt;copy&lt;/em&gt; then you will have to provide an &lt;em&gt;existing&lt;/em&gt; configuration name (like english) from which you would like to make a copy.&lt;/p&gt;
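&lt;p&gt;For comparison, a sketch of the &lt;em&gt;copy&lt;/em&gt; variant. The name shisaa_copy is just a throwaway for this illustration and is dropped again right away, so we can continue with the empty shisaa configuration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Start from the mappings of the built-in english configuration
CREATE TEXT SEARCH CONFIGURATION shisaa_copy (copy = english);

-- Clean up the throwaway copy again
DROP TEXT SEARCH CONFIGURATION shisaa_copy;
&lt;/pre&gt;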
&lt;p&gt;To verify that the configuration is empty, run our describe on it:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;dF&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;shisaa&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And marvel at its emptiness.&lt;/p&gt;
&lt;p&gt;Now, let us add the &lt;em&gt;shisaa_simple&lt;/em&gt; dictionary we created before:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="n"&gt;CONFIGURATION&lt;/span&gt; &lt;span class="n"&gt;shisaa&lt;/span&gt;
    &lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="n"&gt;MAPPING&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="n"&gt;asciiword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asciihword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword_asciipart&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword_part&lt;/span&gt;
    &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;shisaa_simple&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As you will see throughout this (and the next) chapter, full text extends not only the data types and functions we have available, but also extends PostgreSQL's SQL syntax with a handful of new statements.
I need to note that &lt;em&gt;none&lt;/em&gt; of these statements are part of the SQL standard (for SQL has no full text standard) and thus cannot be easily ported to a different database.
But then again...what is this folly...who would even need a different database!&lt;/p&gt;
&lt;p&gt;The new statements introduced here (and in the previous SQL blocks) are:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;&lt;em&gt;CREATE TEXT SEARCH DICTIONARY&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;CREATE TEXT SEARCH CONFIGURATION&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;ALTER TEXT SEARCH CONFIGURATION&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Just remember that these are not part of the SQL standard (something which PostgreSQL holds very dear, in high contrast with many other databases).&lt;/p&gt;
&lt;p&gt;Did it work? Well, describe it:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;dF&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;shisaa&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt; asciihword      &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple
 asciiword       &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple
 hword           &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple
 hword_asciipart &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple
 hword_part      &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple
 word            &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple
&lt;/pre&gt;


&lt;p&gt;Perfect!&lt;/p&gt;
&lt;p&gt;Here we mapped our fresh dictionary to the token groups "asciihword", "asciiword", "hword", "hword_asciipart", "hword_part", "word", because these will target most of a normal, English sentence.&lt;/p&gt;
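&lt;p&gt;Token categories that have no mapping are simply dropped by this configuration. A small sketch to make that visible, using the &lt;em&gt;ts_debug()&lt;/em&gt; function on a made-up sentence that contains a number:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Shows, per token, which category it got and which dictionaries are mapped to it
SELECT alias, token, dictionaries, lexemes
FROM ts_debug('shisaa', 'The elephant is 42 years old');
&lt;/pre&gt;


&lt;p&gt;The row for the token 42 should show the &lt;em&gt;uint&lt;/em&gt; category with an empty dictionary list, which means the number will not end up in the resulting tsvector.&lt;/p&gt;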
&lt;p&gt;It is time to try out this new search configuration! Punch in the same on-the-fly SQL as we had in the previous chapter, but this time with &lt;em&gt;our own&lt;/em&gt; configuration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'shisaa'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'The big blue elephant jumped over the crippled blue dolphin.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'big'&lt;/span&gt;:2 &lt;span class="s1"&gt;'blue'&lt;/span&gt;:3,9 &lt;span class="s1"&gt;'crippled'&lt;/span&gt;:8 &lt;span class="s1"&gt;'elephant'&lt;/span&gt;:4 &lt;span class="s1"&gt;'jumped'&lt;/span&gt;:5 &lt;span class="s1"&gt;'over'&lt;/span&gt;:6
&lt;/pre&gt;


&lt;p&gt;Ha! All squeaky flippers unite! The word &lt;em&gt;dolphin&lt;/em&gt; is &lt;em&gt;removed&lt;/em&gt;, because we defined it to be a stop word. A world as it should be.&lt;/p&gt;
&lt;p&gt;We now have a basic full text configuration with a &lt;em&gt;simple&lt;/em&gt; dictionary. To have a more real world full text search we will need more than just this dictionary though; we will at least need to take care of stemming.&lt;/p&gt;
&lt;h3&gt;Extending the configuration: stemming with the Snowball&lt;/h3&gt;
&lt;p&gt;Stemming, the process of reducing words to their basic form, is done by a special, dedicated kind of dictionary, the &lt;em&gt;Snowball&lt;/em&gt; dictionary. &lt;/p&gt;
&lt;p&gt;What?&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Snowball&lt;/em&gt; is a &lt;em&gt;well proven&lt;/em&gt; string processing language specially designed for stemming purposes and supports a wide range of languages. It originated from the &lt;em&gt;Porter stemming algorithm&lt;/em&gt; and uses a natural syntax to define stemming rules.&lt;/p&gt;
&lt;p&gt;And luckily for us, PostgreSQL has a &lt;em&gt;Snowball&lt;/em&gt; dictionary template ready to use. This template has the Snowball stemming rules embedded for a wide variety of languages. Let us create a &lt;em&gt;dictionary&lt;/em&gt; for our shisaa configuration, shall we?&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="k"&gt;DICTIONARY&lt;/span&gt; &lt;span class="n"&gt;shisaa_snowball&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;snowball&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;language&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;english&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Again, very easy to set up. The snowball dictionary &lt;em&gt;template&lt;/em&gt; accepts two parameters. The first, mandatory one is the language you wish to support. Without this, the template does not know which of the Snowball stemming rules to take.&lt;/p&gt;
&lt;p&gt;The next, optional one is, again, a stop word list. But...why can we feed this dictionary a stop word list? Did we not already do that with the &lt;em&gt;simple&lt;/em&gt; dictionary?&lt;/p&gt;
&lt;p&gt;That is correct, we did set up the &lt;em&gt;simple&lt;/em&gt; dictionary to remove stop words for us, but we are not required to use the &lt;em&gt;simple&lt;/em&gt; and the &lt;em&gt;snowball&lt;/em&gt; dictionary in tandem.
It is perfectly possible to &lt;em&gt;map&lt;/em&gt; only the &lt;em&gt;snowball&lt;/em&gt; dictionary for various token categories and ignore all other dictionaries.
If you do not tell the &lt;em&gt;snowball&lt;/em&gt; dictionary to remove stop words, things can become messy, because the Snowball stemmer will try to stem &lt;em&gt;all&lt;/em&gt; the words it finds.&lt;/p&gt;
&lt;p&gt;This stop word list can be the exact same list we fed the &lt;em&gt;simple&lt;/em&gt; dictionary.&lt;/p&gt;
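&lt;p&gt;As a rough sketch of that idea, a stand-alone Snowball dictionary with its own stop word list could look like the statement below. The name &lt;em&gt;shisaa_snowball_stop&lt;/em&gt; is made up for this illustration and is not used anywhere else in this chapter. The &lt;em&gt;stopwords&lt;/em&gt; value assumes you want the "english.stop" list that ships with PostgreSQL; swap in your own list name if you have one:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Hypothetical dictionary name, for illustration only
CREATE TEXT SEARCH DICTIONARY shisaa_snowball_stop (
    template = snowball,
    language = english,
    stopwords = english  -- assumes the shipped english.stop list
);
&lt;/pre&gt;

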
&lt;p&gt;Also, because a &lt;em&gt;snowball&lt;/em&gt; dictionary will try to parse &lt;em&gt;all&lt;/em&gt; the tokens it is fed, it is considered to be a &lt;em&gt;wide&lt;/em&gt; dictionary. Therefore, as we have seen earlier, it is a good idea to put this dictionary at the end of your chain when chaining dictionaries together.&lt;/p&gt;
&lt;p&gt;We now have our own version of the &lt;em&gt;snowball&lt;/em&gt; dictionary and need to extend our configuration and map this dictionary to the desired token categories:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="n"&gt;CONFIGURATION&lt;/span&gt; &lt;span class="n"&gt;shisaa&lt;/span&gt;
    &lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="n"&gt;MAPPING&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="n"&gt;asciiword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asciihword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword_asciipart&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword_part&lt;/span&gt;
    &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;shisaa_simple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shisaa_snowball&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Notice that in the &lt;em&gt;WITH&lt;/em&gt; clause we are now chaining the &lt;em&gt;simple&lt;/em&gt; and the &lt;em&gt;snowball&lt;/em&gt; dictionary together. The order is, of course, important.
Describe our configuration once more:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;dF&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;shisaa&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;asciihword      &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple,shisaa_snowball
asciiword       &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple,shisaa_snowball
hword           &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple,shisaa_snowball
hword_asciipart &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple,shisaa_snowball
hword_part      &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple,shisaa_snowball
word            &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple,shisaa_snowball
&lt;/pre&gt;


&lt;p&gt;Perfect, now the &lt;em&gt;simple&lt;/em&gt; dictionary will be consulted first followed by the &lt;em&gt;snowball&lt;/em&gt; dictionary.&lt;/p&gt;
&lt;p&gt;Note that throughout this chapter I will chain together dictionaries in order. This will &lt;em&gt;not&lt;/em&gt; always be the smartest or most desirable order, just an order to demonstrate &lt;em&gt;how&lt;/em&gt; you can chain dictionaries.&lt;/p&gt;
&lt;p&gt;To the test, throw a new query at it:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'shisaa'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'The big blue elephant jumped over the crippled blue dolphin.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'big'&lt;/span&gt;:2 &lt;span class="s1"&gt;'blue'&lt;/span&gt;:3,9 &lt;span class="s1"&gt;'crippled'&lt;/span&gt;:8 &lt;span class="s1"&gt;'elephant'&lt;/span&gt;:4 &lt;span class="s1"&gt;'jumped'&lt;/span&gt;:5 &lt;span class="s1"&gt;'over'&lt;/span&gt;:6
&lt;/pre&gt;


&lt;p&gt;Nice, that is very...oh wait. Something is not correct. I am getting back &lt;em&gt;exactly&lt;/em&gt; the same result as before. The words "crippled" and "elephant" are not stemmed at all. Why?&lt;/p&gt;
&lt;p&gt;Well, the &lt;em&gt;simple&lt;/em&gt; dictionary, as we defined it earlier, is set up to be a bit greedy. In its current state it will return an unmatched token as a lexeme with casing removed.
It does not return &lt;em&gt;NULL&lt;/em&gt;. And, as you know by now, &lt;em&gt;NULL&lt;/em&gt; is needed to give other dictionaries a chance to examine the token.&lt;/p&gt;
&lt;p&gt;So, we need to alter the &lt;em&gt;simple&lt;/em&gt; dictionary's behavior. For this, we can use the &lt;em&gt;ALTER&lt;/em&gt; syntax provided to us. And as it turns out, the &lt;em&gt;simple&lt;/em&gt; dictionary &lt;em&gt;template&lt;/em&gt; can accept one more variable: the &lt;em&gt;accept&lt;/em&gt; variable. If this is set to false, then it will return &lt;em&gt;NULL&lt;/em&gt; for every unmatched token. Let us alter that dictionary:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="k"&gt;DICTIONARY&lt;/span&gt; &lt;span class="n"&gt;shisaa_simple&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;accept&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt; &lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Run the to_tsvector query again, and look at the results:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'big'&lt;/span&gt;:2 &lt;span class="s1"&gt;'blue'&lt;/span&gt;:3,9 &lt;span class="s1"&gt;'crippl'&lt;/span&gt;:8 &lt;span class="s1"&gt;'eleph'&lt;/span&gt;:4 &lt;span class="s1"&gt;'jump'&lt;/span&gt;:5 &lt;span class="s1"&gt;'over'&lt;/span&gt;:6
&lt;/pre&gt;


&lt;p&gt;That is what we were looking for, nicely stemmed results!&lt;/p&gt;
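&lt;p&gt;If you want to check this hand-off on a single dictionary instead of on the whole configuration, PostgreSQL's &lt;em&gt;ts_lexize()&lt;/em&gt; function takes a dictionary name and one token. The following is only a small sketch, assuming &lt;em&gt;shisaa_simple&lt;/em&gt; has been altered as shown above and that "elephant" is not in its stop word list:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- An unmatched token: with accept = false the simple dictionary
-- should hand back NULL, so the next dictionary gets its turn
SELECT ts_lexize('shisaa_simple', 'elephant');

-- The same token given directly to the stemmer at the end of the chain
SELECT ts_lexize('shisaa_snowball', 'elephant');
&lt;/pre&gt;

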
&lt;h3&gt;Extending the configuration: fun with synonyms&lt;/h3&gt;
&lt;p&gt;By now we have seen the first and the last dictionary in our control chain, but at least one more important part is missing: synonyms are not removed.&lt;/p&gt;
&lt;p&gt;Let us extend our favorite sentence and add a few synonyms to it: "The big blue elephant, joined by its enormous blue mammoth friend, jumped over the crippled blue dolphin while smiling at the orca."&lt;/p&gt;
&lt;p&gt;Still perfectly possible.&lt;/p&gt;
&lt;p&gt;In the light of (cue dark and deep Batman voice) "science" (end Batman voice), let us first see what we get when we run it through our current configuration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'at'&lt;/span&gt;:20 &lt;span class="s1"&gt;'big'&lt;/span&gt;:2 &lt;span class="s1"&gt;'blue'&lt;/span&gt;:3,9,16 &lt;span class="s1"&gt;'by'&lt;/span&gt;:6 &lt;span class="s1"&gt;'crippl'&lt;/span&gt;:15 &lt;span class="s1"&gt;'eleph'&lt;/span&gt;:4 &lt;span class="s1"&gt;'enorm'&lt;/span&gt;:8 &lt;span class="s1"&gt;'friend'&lt;/span&gt;:11 &lt;span class="s1"&gt;'it'&lt;/span&gt;:7 &lt;span class="s1"&gt;'join'&lt;/span&gt;:5 &lt;span class="s1"&gt;'jump'&lt;/span&gt;:12 &lt;span class="s1"&gt;'mammoth'&lt;/span&gt;:10 &lt;span class="s1"&gt;'orca'&lt;/span&gt;:22 &lt;span class="s1"&gt;'over'&lt;/span&gt;:13 &lt;span class="s1"&gt;'smile'&lt;/span&gt;:19 &lt;span class="s1"&gt;'while'&lt;/span&gt;:18
&lt;/pre&gt;


&lt;p&gt;That is one big result set. Maybe we should cut the blue dolphin a little bit of slack and, before continuing, feed a real stop word list to our &lt;em&gt;simple&lt;/em&gt; dictionary by altering it:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="k"&gt;DICTIONARY&lt;/span&gt; &lt;span class="n"&gt;shisaa_simple&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;stopwords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;english&lt;/span&gt; &lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As you see you can simply use the same &lt;em&gt;ALTER&lt;/em&gt; syntax as before. The "english" here refers to the shipped "english.stop" stop word list.&lt;/p&gt;
&lt;p&gt;Querying again, we will get back a better, shorter list (including our dolphin friend):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'big'&lt;/span&gt;:2 &lt;span class="s1"&gt;'blue'&lt;/span&gt;:3,9,16 &lt;span class="s1"&gt;'crippl'&lt;/span&gt;:15 &lt;span class="s1"&gt;'dolphin'&lt;/span&gt;:17 &lt;span class="s1"&gt;'eleph'&lt;/span&gt;:4 &lt;span class="s1"&gt;'enorm'&lt;/span&gt;:8 &lt;span class="s1"&gt;'friend'&lt;/span&gt;:11 &lt;span class="s1"&gt;'join'&lt;/span&gt;:5 &lt;span class="s1"&gt;'jump'&lt;/span&gt;:12 &lt;span class="s1"&gt;'mammoth'&lt;/span&gt;:10 &lt;span class="s1"&gt;'orca'&lt;/span&gt;:22 &lt;span class="s1"&gt;'smile'&lt;/span&gt;:19
&lt;/pre&gt;


&lt;p&gt;Now we would like to reduce this result even further by compacting synonyms into one lexeme.&lt;/p&gt;
&lt;p&gt;Enter the &lt;em&gt;synonym&lt;/em&gt; dictionary &lt;em&gt;template&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This template requires you to have a so-called "synonym" file: a file containing lists of words with the same meaning. For the sake of learning, let us create our own synonym file. This file has to end with the &lt;em&gt;.syn&lt;/em&gt; extension.&lt;/p&gt;
&lt;p&gt;Open up your editor again and write out a file called "shisaa_syn.syn" with the following contents:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;big enormous
elephant mammoth
dolphin orca
&lt;/pre&gt;


&lt;p&gt;And let us set up the &lt;em&gt;dictionary&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="k"&gt;DICTIONARY&lt;/span&gt; &lt;span class="n"&gt;shisaa_synonym&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;synonym&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;synonyms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shisaa_syn&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And add the mapping for it:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="n"&gt;CONFIGURATION&lt;/span&gt; &lt;span class="n"&gt;shisaa&lt;/span&gt;
    &lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="n"&gt;MAPPING&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="n"&gt;asciiword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asciihword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword_asciipart&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword_part&lt;/span&gt;
    &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;shisaa_simple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shisaa_synonym&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shisaa_snowball&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Okay, time to test our big string again and see the results:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'blue'&lt;/span&gt;:3,9,16 &lt;span class="s1"&gt;'crippl'&lt;/span&gt;:15 &lt;span class="s1"&gt;'enorm'&lt;/span&gt;:8 &lt;span class="s1"&gt;'enormous'&lt;/span&gt;:2 &lt;span class="s1"&gt;'friend'&lt;/span&gt;:11 &lt;span class="s1"&gt;'join'&lt;/span&gt;:5 &lt;span class="s1"&gt;'jump'&lt;/span&gt;:12 &lt;span class="s1"&gt;'mammoth'&lt;/span&gt;:4,10 &lt;span class="s1"&gt;'orca'&lt;/span&gt;:17,22 &lt;span class="s1"&gt;'smile'&lt;/span&gt;:19
&lt;/pre&gt;


&lt;p&gt;Very neat. The words "elephant", "big" and "dolphin" are now removed and only their synonyms are kept.
Also notice that both "mammoth" and "orca" have two pointers each, one for every synonym.&lt;/p&gt;
&lt;p&gt;But look at the words 'enorm' and 'enormous', why is this happening?&lt;/p&gt;
&lt;p&gt;If you look at the pointers, you see that &lt;em&gt;enormous&lt;/em&gt; points to the second word in the string, being &lt;em&gt;big&lt;/em&gt;, while &lt;em&gt;enorm&lt;/em&gt; points to the original &lt;em&gt;enormous&lt;/em&gt; word.
This happens because our &lt;em&gt;synonym&lt;/em&gt; dictionary has priority over our &lt;em&gt;snowball&lt;/em&gt; one. The &lt;em&gt;synonym&lt;/em&gt; dictionary emits a lexeme as a synonym for &lt;em&gt;big&lt;/em&gt;, being &lt;em&gt;enormous&lt;/em&gt;, simply because we told it to do so in our &lt;em&gt;synonym file&lt;/em&gt;. Because it emits a lexeme, the original token, &lt;em&gt;big&lt;/em&gt;, is no longer available to the rest of the dictionary chain.&lt;/p&gt;
&lt;p&gt;The token &lt;em&gt;enormous&lt;/em&gt; itself has &lt;em&gt;no&lt;/em&gt; synonym because we did not define it in our synonym file. It is ignored by the &lt;em&gt;synonym&lt;/em&gt; dictionary and passed over to the &lt;em&gt;snowball&lt;/em&gt; dictionary which then stems the token into a lexeme resulting in &lt;em&gt;enorm&lt;/em&gt;.&lt;/p&gt;
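&lt;p&gt;To watch both paths side by side, a much shorter test string is enough. This is just a sketch, assuming the &lt;em&gt;shisaa&lt;/em&gt; configuration is mapped as above and the synonym file does not yet contain a line that starts with "enormous":&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- 'big' is caught by the synonym dictionary,
-- 'enormous' falls through to the snowball stemmer
SELECT to_tsvector('shisaa', 'big enormous');
&lt;/pre&gt;

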
&lt;p&gt;If you wish to prevent this from happening, you could add a self-pointing line to your synonym list:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;enormous enormous
&lt;/pre&gt;


&lt;p&gt;Now load in the file on disk to pull the changes into PostgreSQL:&lt;/p&gt;
&lt;pre class="code literal-block"&gt; &lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="k"&gt;DICTIONARY&lt;/span&gt; &lt;span class="n"&gt;shisaa_synonym&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;synonyms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;shisaa_syn&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And run the query again, the result should now read:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'blue'&lt;/span&gt;:3,9,16 &lt;span class="s1"&gt;'crippl'&lt;/span&gt;:15 &lt;span class="s1"&gt;'enormous'&lt;/span&gt;:2,8 &lt;span class="s1"&gt;'friend'&lt;/span&gt;:11 &lt;span class="s1"&gt;'join'&lt;/span&gt;:5 &lt;span class="s1"&gt;'jump'&lt;/span&gt;:12 &lt;span class="s1"&gt;'mammoth'&lt;/span&gt;:4,10 &lt;span class="s1"&gt;'orca'&lt;/span&gt;:17,22 &lt;span class="s1"&gt;'smile'&lt;/span&gt;:19
&lt;/pre&gt;


&lt;p&gt;Now &lt;em&gt;enorm&lt;/em&gt; is gone and both &lt;em&gt;big&lt;/em&gt; and &lt;em&gt;enormous&lt;/em&gt; are cast to the same lexeme.&lt;/p&gt;
&lt;p&gt;PostgreSQL does not ship a synonym list, so you will have to compile your own, just like we did above, but hopefully a little bit more useful.&lt;/p&gt;
&lt;h3&gt;Extending the configuration: phrasing with a Thesaurus&lt;/h3&gt;
&lt;p&gt;Next up is the &lt;em&gt;thesaurus&lt;/em&gt; dictionary, which is quite close to the &lt;em&gt;synonym&lt;/em&gt; dictionary, with one exception: &lt;em&gt;phrases&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;thesaurus&lt;/em&gt; dictionary is used to recognize phrases and convert them into lexemes with the same meaning. Again, this dictionary relies on a file containing the phrase conversions.
This time, the file has the &lt;em&gt;.ths&lt;/em&gt; extension. &lt;/p&gt;
&lt;p&gt;Open up your editor and write out a file called "shisaa_thesaurus.ths" with the following contents:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;big blue elephant : PostgreSQL
crippled blue dolphin : MySQL
&lt;/pre&gt;


&lt;p&gt;Before we can create the dictionary, there is one more required variable we have to set, the &lt;em&gt;subdictionary&lt;/em&gt; the &lt;em&gt;thesaurus&lt;/em&gt; dictionary can use.
This subdictionary will be &lt;em&gt;another&lt;/em&gt; dictionary you have defined before. Most of the time a stemmer is fed to this variable to let the thesaurus stem the input before comparing it with its thesaurus file.&lt;/p&gt;
&lt;p&gt;So let us feed it our &lt;em&gt;snowball&lt;/em&gt; dictionary and set it up:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="k"&gt;DICTIONARY&lt;/span&gt; &lt;span class="n"&gt;shisaa_thesaurus&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;TEMPLATE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;thesaurus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;DICTFILE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shisaa_thesaurus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;DICTIONARY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;shisaa_snowball&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Map it:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="n"&gt;CONFIGURATION&lt;/span&gt; &lt;span class="n"&gt;shisaa&lt;/span&gt;
    &lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="n"&gt;MAPPING&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="n"&gt;asciiword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asciihword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword_asciipart&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword_part&lt;/span&gt;
    &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;shisaa_simple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shisaa_thesaurus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shisaa_snowball&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Notice that I took out the &lt;em&gt;synonym&lt;/em&gt; dictionary. If we chain up too many dictionaries, the results might turn out to be undesirable in our demonstration use case.&lt;/p&gt;
&lt;p&gt;Querying will result in the following tsvector:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'blue'&lt;/span&gt;:7 &lt;span class="s1"&gt;'enorm'&lt;/span&gt;:6 &lt;span class="s1"&gt;'friend'&lt;/span&gt;:9 &lt;span class="s1"&gt;'join'&lt;/span&gt;:3 &lt;span class="s1"&gt;'jump'&lt;/span&gt;:10 &lt;span class="s1"&gt;'mammoth'&lt;/span&gt;:8 &lt;span class="s1"&gt;'mysql'&lt;/span&gt;:13 &lt;span class="s1"&gt;'orca'&lt;/span&gt;:18 &lt;span class="s1"&gt;'postgresql'&lt;/span&gt;:2 &lt;span class="s1"&gt;'smile'&lt;/span&gt;:15
&lt;/pre&gt;


&lt;p&gt;That is quite awesome, it now recognizes "big blue elephant" as PostgreSQL and "crippled blue dolphin" as MySQL. We have created a &lt;em&gt;pun-aware&lt;/em&gt; full text search configuration!&lt;/p&gt;
&lt;p&gt;As you can see, both the "MySQL" and "PostgreSQL" lexemes have &lt;em&gt;one&lt;/em&gt; pointer each, pointing to the first word of the substring that got converted.&lt;/p&gt;
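&lt;p&gt;A quick way to put the thesaurus to work is to match the document against a query for the converted lexeme. The statement below is only a sketch, assuming the &lt;em&gt;shisaa&lt;/em&gt; configuration is mapped with the thesaurus as shown above; it asks whether our original sentence matches a search for "postgresql", a word that never literally appears in it:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Both sides are normalized by the same configuration
SELECT to_tsvector('shisaa', 'The big blue elephant jumped over the crippled blue dolphin.')
       @@ to_tsquery('shisaa', 'postgresql');
&lt;/pre&gt;

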
&lt;h3&gt;Extending the configuration a last time: morphing with Ispell&lt;/h3&gt;
&lt;p&gt;Okay, we are almost at the end of the dictionary &lt;em&gt;templates&lt;/em&gt; that PostgreSQL supports.&lt;/p&gt;
&lt;p&gt;This last one is a fun one too. Many Unix and Linux systems come shipped with a spell checker called &lt;em&gt;Ispell&lt;/em&gt; or with the more modern variant called &lt;em&gt;Hunspell&lt;/em&gt;.
Besides your average spell checking, these dictionaries are very good at morphological lookups, meaning that they can link the different written forms of a word together.&lt;/p&gt;
&lt;p&gt;A synonym or thesaurus dictionary would not catch these, unless explicitly set with a huge number of lines in the &lt;em&gt;.syn&lt;/em&gt; or &lt;em&gt;.ths&lt;/em&gt; files, which is error-prone and inelegant.
The Ispell or Hunspell dictionaries &lt;em&gt;will&lt;/em&gt; capture these and try to make them into one lexeme.&lt;/p&gt;
&lt;p&gt;Before setting up the &lt;em&gt;dictionary&lt;/em&gt;, we first need to make sure that we have the Ispell or Hunspell dictionary files for the language we wish to support.
Normally you would want to download these files from the official OpenOffice pages. These pages, however, seem to be confusing and the correct files are very hard to find. I have found &lt;a href="http://fmg-www.cs.ucla.edu/geoff/ispell-dictionaries.html" title="OpenOffice Extension page."&gt;the following page&lt;/a&gt; to be of great help in getting the files you need.
Download the files for your desired language and place the &lt;em&gt;.dict&lt;/em&gt; and the &lt;em&gt;.affix&lt;/em&gt; files into the PostgreSQL shared directory.&lt;/p&gt;
&lt;p&gt;For now, let us just take the basic &lt;em&gt;english&lt;/em&gt; dict and affix files (both named &lt;em&gt;en_us&lt;/em&gt; and already shipped with PostgreSQL) and feed them to the configuration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="k"&gt;DICTIONARY&lt;/span&gt; &lt;span class="n"&gt;shisaa_ispell&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ispell&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;DictFile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;en_us&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AffFile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;en_us&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;StopWords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;english&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And chain it:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;SEARCH&lt;/span&gt; &lt;span class="n"&gt;CONFIGURATION&lt;/span&gt; &lt;span class="n"&gt;shisaa&lt;/span&gt;
    &lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="n"&gt;MAPPING&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="n"&gt;asciiword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asciihword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword_asciipart&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hword_part&lt;/span&gt;
    &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;shisaa_simple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shisaa_ispell&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shisaa_snowball&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Notice again I took out the &lt;em&gt;thesaurus&lt;/em&gt; dictionary, not to pile up too many dictionaries at once.&lt;/p&gt;
&lt;p&gt;Query it once more, and look at what we get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'big'&lt;/span&gt;:2 &lt;span class="s1"&gt;'blue'&lt;/span&gt;:3,9,16 &lt;span class="s1"&gt;'cripple'&lt;/span&gt;:15 &lt;span class="s1"&gt;'dolphin'&lt;/span&gt;:17 &lt;span class="s1"&gt;'elephant'&lt;/span&gt;:4 &lt;span class="s1"&gt;'enormous'&lt;/span&gt;:8 &lt;span class="s1"&gt;'friend'&lt;/span&gt;:11 &lt;span class="s1"&gt;'join'&lt;/span&gt;:5 &lt;span class="s1"&gt;'joined'&lt;/span&gt;:5 &lt;span class="s1"&gt;'jump'&lt;/span&gt;:12 &lt;span class="s1"&gt;'mammoth'&lt;/span&gt;:10 &lt;span class="s1"&gt;'orca'&lt;/span&gt;:22 &lt;span class="s1"&gt;'smile'&lt;/span&gt;:19 &lt;span class="s1"&gt;'smiling'&lt;/span&gt;:19
&lt;/pre&gt;


&lt;p&gt;Hmm, interesting. Notice that we now got &lt;em&gt;more&lt;/em&gt; lexemes than before, &lt;em&gt;smile&lt;/em&gt; and &lt;em&gt;smiling&lt;/em&gt; for example, and &lt;em&gt;join&lt;/em&gt; and &lt;em&gt;joined&lt;/em&gt;. Also, both these cases have the &lt;em&gt;same&lt;/em&gt; pointer. Why is that?&lt;/p&gt;
&lt;p&gt;What is happening here is a feature of the Ispell dictionary called &lt;em&gt;morphology&lt;/em&gt;, or as we have seen above, &lt;em&gt;morphological lookups&lt;/em&gt;.
One of the reasons why Ispell is such a powerful dictionary is that it can recognize and act upon the &lt;em&gt;structure&lt;/em&gt; of a word.&lt;/p&gt;
&lt;p&gt;In our case, Ispell recognizes &lt;em&gt;joined&lt;/em&gt; (or &lt;em&gt;smiling&lt;/em&gt;) and emits an array of &lt;em&gt;two&lt;/em&gt; lexemes, the original token converted to a lexeme &lt;em&gt;and&lt;/em&gt; the stemmed version of the token.&lt;/p&gt;
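&lt;p&gt;You can inspect that array for a single token by asking the dictionary directly with PostgreSQL's &lt;em&gt;ts_lexize()&lt;/em&gt; function. This is a small sketch, assuming the &lt;em&gt;shisaa_ispell&lt;/em&gt; dictionary was created as shown above; what comes back depends on the dict and affix files you loaded:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Ask only the Ispell dictionary what it emits for one token
SELECT ts_lexize('shisaa_ispell', 'joined');
SELECT ts_lexize('shisaa_ispell', 'smiling');
&lt;/pre&gt;

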
&lt;p&gt;This concludes our tour of the dictionaries that PostgreSQL ships with by default, which are also the ones you are most likely to ever need. What is next?&lt;/p&gt;
&lt;h3&gt;Debugging&lt;/h3&gt;
&lt;p&gt;Now that you have a good understanding of how to build your own configuration and set up your own dictionaries, I would like to introduce a few new functions that can come in handy when your configuration produces seemingly strange results.&lt;/p&gt;
&lt;h4&gt;ts_debug()&lt;/h4&gt;
&lt;p&gt;The first function I want to show you is a &lt;em&gt;very&lt;/em&gt; handy one that is built to test your &lt;em&gt;whole&lt;/em&gt; full text configuration. It helps you keep your mental condition to just mildly insane, so to speak.&lt;/p&gt;
&lt;p&gt;The function &lt;em&gt;ts_debug()&lt;/em&gt; accepts a configuration and a string of text you wish to test. As a result you will get back a set that contains an overview of how the parser chopped your string into tokens, which category it picked for each token, which dictionary was consulted and which lexeme(s) were emitted. Oh boy, this is too much fun, let us just try it out!&lt;/p&gt;
&lt;p&gt;Feed our original pun string and let us test the current &lt;em&gt;shisaa&lt;/em&gt; configuration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ts_debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'shisaa'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'The big blue elephant jumped over the crippled blue dolphin.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Hmm, that may not be very readable. Instead, use the wildcard selector and a FROM clause to include column names in our result set (one of the few times you may use this selector without getting smacked):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;ts_debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'shisaa'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'The big blue elephant jumped over the crippled blue dolphin.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Which will result in the following, huge set:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;  &lt;span class="nb"&gt;alias&lt;/span&gt;   &lt;span class="p"&gt;|&lt;/span&gt;   description   &lt;span class="p"&gt;|&lt;/span&gt;  token   &lt;span class="p"&gt;|&lt;/span&gt;                 dictionaries                  &lt;span class="p"&gt;|&lt;/span&gt;  dictionary   &lt;span class="p"&gt;|&lt;/span&gt;  lexemes   
-----------+-----------------+----------+-----------------------------------------------+---------------+------------
asciiword &lt;span class="p"&gt;|&lt;/span&gt; Word, all ASCII &lt;span class="p"&gt;|&lt;/span&gt; The      &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;shisaa_simple,shisaa_ispell,shisaa_snowball&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
blank     &lt;span class="p"&gt;|&lt;/span&gt; Space symbols   &lt;span class="p"&gt;|&lt;/span&gt;          &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;                                            &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt; 
asciiword &lt;span class="p"&gt;|&lt;/span&gt; Word, all ASCII &lt;span class="p"&gt;|&lt;/span&gt; big      &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;shisaa_simple,shisaa_ispell,shisaa_snowball&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; shisaa_ispell &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;big&lt;span class="o"&gt;}&lt;/span&gt;
blank     &lt;span class="p"&gt;|&lt;/span&gt; Space symbols   &lt;span class="p"&gt;|&lt;/span&gt;          &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;                                            &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt; 
asciiword &lt;span class="p"&gt;|&lt;/span&gt; Word, all ASCII &lt;span class="p"&gt;|&lt;/span&gt; blue     &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;shisaa_simple,shisaa_ispell,shisaa_snowball&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; shisaa_ispell &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;blue&lt;span class="o"&gt;}&lt;/span&gt;
blank     &lt;span class="p"&gt;|&lt;/span&gt; Space symbols   &lt;span class="p"&gt;|&lt;/span&gt;          &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;                                            &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt; 
asciiword &lt;span class="p"&gt;|&lt;/span&gt; Word, all ASCII &lt;span class="p"&gt;|&lt;/span&gt; elephant &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;shisaa_simple,shisaa_ispell,shisaa_snowball&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; shisaa_ispell &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;elephant&lt;span class="o"&gt;}&lt;/span&gt;
blank     &lt;span class="p"&gt;|&lt;/span&gt; Space symbols   &lt;span class="p"&gt;|&lt;/span&gt;          &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;                                            &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt; 
asciiword &lt;span class="p"&gt;|&lt;/span&gt; Word, all ASCII &lt;span class="p"&gt;|&lt;/span&gt; jumped   &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;shisaa_simple,shisaa_ispell,shisaa_snowball&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; shisaa_ispell &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;jump&lt;span class="o"&gt;}&lt;/span&gt;
blank     &lt;span class="p"&gt;|&lt;/span&gt; Space symbols   &lt;span class="p"&gt;|&lt;/span&gt;          &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;                                            &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt; 
asciiword &lt;span class="p"&gt;|&lt;/span&gt; Word, all ASCII &lt;span class="p"&gt;|&lt;/span&gt; over     &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;shisaa_simple,shisaa_ispell,shisaa_snowball&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
blank     &lt;span class="p"&gt;|&lt;/span&gt; Space symbols   &lt;span class="p"&gt;|&lt;/span&gt;          &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;                                            &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt; 
asciiword &lt;span class="p"&gt;|&lt;/span&gt; Word, all ASCII &lt;span class="p"&gt;|&lt;/span&gt; the      &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;shisaa_simple,shisaa_ispell,shisaa_snowball&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; shisaa_simple &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
blank     &lt;span class="p"&gt;|&lt;/span&gt; Space symbols   &lt;span class="p"&gt;|&lt;/span&gt;          &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;                                            &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt; 
asciiword &lt;span class="p"&gt;|&lt;/span&gt; Word, all ASCII &lt;span class="p"&gt;|&lt;/span&gt; crippled &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;shisaa_simple,shisaa_ispell,shisaa_snowball&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; shisaa_ispell &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;cripple&lt;span class="o"&gt;}&lt;/span&gt;
blank     &lt;span class="p"&gt;|&lt;/span&gt; Space symbols   &lt;span class="p"&gt;|&lt;/span&gt;          &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;                                            &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt; 
asciiword &lt;span class="p"&gt;|&lt;/span&gt; Word, all ASCII &lt;span class="p"&gt;|&lt;/span&gt; blue     &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;shisaa_simple,shisaa_ispell,shisaa_snowball&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; shisaa_ispell &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;blue&lt;span class="o"&gt;}&lt;/span&gt;
blank     &lt;span class="p"&gt;|&lt;/span&gt; Space symbols   &lt;span class="p"&gt;|&lt;/span&gt;          &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;                                            &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt; 
asciiword &lt;span class="p"&gt;|&lt;/span&gt; Word, all ASCII &lt;span class="p"&gt;|&lt;/span&gt; dolphin  &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;shisaa_simple,shisaa_ispell,shisaa_snowball&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; shisaa_ispell &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;dolphin&lt;span class="o"&gt;}&lt;/span&gt;
blank     &lt;span class="p"&gt;|&lt;/span&gt; Space symbols   &lt;span class="p"&gt;|&lt;/span&gt; .        &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;                                            &lt;span class="p"&gt;|&lt;/span&gt;               &lt;span class="p"&gt;|&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You now have a complete overview of the flow from string to vector of lexemes. Let me go over some interesting facts of this result set.&lt;/p&gt;
&lt;p&gt;First, notice how the tokens &lt;em&gt;the&lt;/em&gt; and &lt;em&gt;over&lt;/em&gt; got removed by the &lt;em&gt;simple&lt;/em&gt; dictionary. They were a hit in the stop word list, so the dictionary returned an &lt;em&gt;empty array&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Next you see the alias &lt;em&gt;blank&lt;/em&gt; between each &lt;em&gt;asciiword&lt;/em&gt;. &lt;em&gt;Blank&lt;/em&gt; is the category used for spaces and punctuation. A &lt;em&gt;space&lt;/em&gt; and a &lt;em&gt;.&lt;/em&gt; (full stop) are each considered a token, but no dictionary is mapped to this category (the empty &lt;em&gt;{}&lt;/em&gt; in the table), so they are dropped for they have no value in this context.&lt;/p&gt;
&lt;p&gt;And last, see that our &lt;em&gt;snowball&lt;/em&gt; dictionary was never consulted. This means that, in this string, &lt;em&gt;shisaa_ispell&lt;/em&gt; recognized every token that &lt;em&gt;shisaa_simple&lt;/em&gt; passed on to it.&lt;/p&gt;
&lt;h4&gt;ts_lexize()&lt;/h4&gt;
&lt;p&gt;The second function is &lt;em&gt;ts_lexize()&lt;/em&gt;. This little helper lets you test different &lt;em&gt;parts&lt;/em&gt; of your whole setup. Take the unexpected result of our last dictionary, where we got back multiple lexemes. As it turned out it is normal behavior, but you may want to verify that the result is coming from the dictionary and not from a side effect of how you chained your dictionaries together.&lt;/p&gt;
&lt;p&gt;To test our single, &lt;em&gt;shisaa_ispell&lt;/em&gt; dictionary, we could feed it to this new function, together with &lt;em&gt;one token&lt;/em&gt; we wish to test:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ts_lexize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'shisaa_ispell'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'joined'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will return:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="o"&gt;{&lt;/span&gt;joined,join&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Same as we had before, but now we know, for sure, that it is a feature of our Ispell dictionary. 
Notice that I stressed the fact that you can only feed this function &lt;em&gt;one token&lt;/em&gt;, not a string of text and not multiple tokens.&lt;/p&gt;
&lt;p&gt;You can use this function to test all your dictionaries individually, one token at a time.&lt;/p&gt;
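&lt;p&gt;If you have not built the custom dictionaries yet, the same kind of check can be sketched against &lt;em&gt;english_stem&lt;/em&gt;, the Snowball dictionary that I am assuming is present because it ships with a stock PostgreSQL install:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- One token per call. 'english_stem' is assumed to be the stock Snowball dictionary.
SELECT ts_lexize('english_stem', 'stars');  -- expected: {star}
SELECT ts_lexize('english_stem', 'a');      -- expected: {} (recognized as a stop word)
&lt;/pre&gt;


&lt;p&gt;An empty array means the dictionary knows the token but treats it as a stop word. A NULL result would mean the dictionary does not know the token at all, in which case the next dictionary in the chain gets a try.&lt;/p&gt;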
&lt;p&gt;Phew, that was a lot to take in, for we covered a lot of ground here today. You can turn the lights back up and go get some fresh air.
In the next chapter, I will round up this introduction by introducing you to the following, new material:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Ranking search results&lt;/li&gt;
&lt;li&gt;Highlighting words inside search results&lt;/li&gt;
&lt;li&gt;Creating special full text search indexes&lt;/li&gt;
&lt;li&gt;Setting up update triggers for tsvector records&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;And as always...thanks for reading!&lt;/p&gt;
&lt;/div&gt;</description><category>full text search</category><category>postgresql</category><guid>http://shisaa.be/postset/postgresql-full-text-search-part-2.html</guid><pubDate>Wed, 07 May 2014 13:00:00 GMT</pubDate></item><item><title>PostgreSQL: A full text search engine - Part 1</title><link>http://shisaa.be/postset/postgresql-full-text-search-part-1.html</link><dc:creator>Tim van der Linden</dc:creator><description>&lt;div&gt;&lt;h3&gt;Preface&lt;/h3&gt;
&lt;p&gt;PostgreSQL, the database of miracles, the RDBMS of wonders.&lt;/p&gt;
&lt;p&gt;People who have read my stuff before know that I am a fan of the blue-ish elephant and I greatly entrust it with my data. 
For reasons why, I invite you to read the "Dolphin ass-whopping" part of the &lt;a href="http://shisaa.be/postset/mailserver-2.html" title="Second chapter of the mail setup series."&gt;second chapter&lt;/a&gt; of my mail server setup series.&lt;/p&gt;
&lt;p&gt;But what some of you may not know is that PostgreSQL is capable of much more than simply storing and retrieving your data.
Well, that is actually not entirely correct...you are &lt;em&gt;always&lt;/em&gt; storing and retrieving data.
A more correct way to say it is that PostgreSQL is capable of storing all &lt;em&gt;kinds&lt;/em&gt; of data and gives you all &lt;em&gt;kinds&lt;/em&gt; of ways to retrieve it.
It is not limited to &lt;em&gt;storing&lt;/em&gt; boring stuff like "VARCHAR" or "INT". Neither is it limited to retrieving and &lt;em&gt;comparing&lt;/em&gt; with boring
operators like "=", "ILIKE" or "~". &lt;/p&gt;
&lt;p&gt;For instance, are you familiar with PostgreSQL's &lt;em&gt;"tsvector"&lt;/em&gt; data type? Or the &lt;em&gt;"tsquery"&lt;/em&gt; type? Or what these two represent? No?
Well, diddlydangeroo, then by all means, keep reading, because that is exactly what this series is all about!&lt;/p&gt;
&lt;p&gt;In the following three chapters I would like to show you how you can configure PostgreSQL to be a batteries included, blazing fast, competition crunching, full text search engine.&lt;/p&gt;
&lt;h3&gt;But, I can already search strings of text with PostgreSQL!&lt;/h3&gt;
&lt;p&gt;Hmm, that is very correct. But the basic operators you have at your disposal are limited. &lt;/p&gt;
&lt;p&gt;Let me demonstrate.&lt;/p&gt;
&lt;p&gt;Imagine we have a table called "phraseTable", containing thousands of strings, all saved in a regular, old VARCHAR column named "phrase".
Now we would like to find the string &lt;em&gt;"An elephant a day keeps the dolphins at bay."&lt;/em&gt;.
We do not fully remember the above string, but we do remember it had the word "elephant" in it.
With regular SQL you could use the "LIKE" operator to try and find a matching substring. The resulting query would look something like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'%elephant%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;It would work. You render any regular index on the column useless by using a leading &lt;em&gt;and&lt;/em&gt; a trailing wildcard, but it would work.
Now imagine a humble user who would like to find the same string, but whose memory is bad: they thought the word elephant was capitalized, because it may refer to PostgreSQL, of course.
The query would become this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'%Elephant%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And as a result, you get back zero records.&lt;/p&gt;
&lt;p&gt;"But wait!", you shout, "I am a smart ass, there is a solution to this!". And you are correct: the ILIKE operator.
The "I" stands for Insensitive...as in &lt;em&gt;Case Insensitive&lt;/em&gt;. So you change the query:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;ILIKE&lt;/span&gt; &lt;span class="s1"&gt;'%Elephant%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And now you will get back a result. Good for you.&lt;/p&gt;
&lt;p&gt;A day goes by and the same user comes back, wishing to find this string again. But, their memory still being bad and all, they thought there were multiple elephants keeping the dolphins at bay, because, you know, pun. So the query you altered yesterday now reads:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;ILIKE&lt;/span&gt; &lt;span class="s1"&gt;'%Elephants%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And...now the query will return zero results.&lt;/p&gt;
&lt;p&gt;"Ha!", you shout in my general direction. "I am a master of Regular Expressions! I shall fix thy query!".&lt;/p&gt;
&lt;p&gt;No, you shall &lt;em&gt;not&lt;/em&gt; fix my query. Never, ever go smart on my derrière by throwing a regular expression in the mix to solve a database lookup problem. It is unreadable, unscalable and fits only one solution perfectly-ish. And, not to forget, it is &lt;em&gt;slow as hell&lt;/em&gt;: not only does it ignore any regular index you have set, it also asks more of the database engine than a LIKE or ILIKE.&lt;/p&gt;
&lt;p&gt;Let me put an end to this and tell you that I am afraid there are no more (scalable) smart ass tricks left to perform and the possibilities to search text with regular, built-in operators are exhausted.&lt;/p&gt;
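&lt;p&gt;If you want to replay the attempts above without creating a table, here is a small sketch that matches the phrase as a plain string literal. Each expression simply returns a boolean:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Exact casing, singular: a match.
SELECT 'An elephant a day keeps the dolphins at bay.' LIKE '%elephant%';    -- true
-- LIKE is case sensitive, so the capital E breaks the match.
SELECT 'An elephant a day keeps the dolphins at bay.' LIKE '%Elephant%';    -- false
-- ILIKE ignores casing and rescues us once.
SELECT 'An elephant a day keeps the dolphins at bay.' ILIKE '%Elephant%';   -- true
-- The plural is a different substring; no operator setting helps here.
SELECT 'An elephant a day keeps the dolphins at bay.' ILIKE '%Elephants%';  -- false
&lt;/pre&gt;


&lt;p&gt;The last line is the one that matters: substring matching has no notion of a word and its variants.&lt;/p&gt;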
&lt;p&gt;You agree? Yes? Good! So, enter "&lt;em&gt;full text search&lt;/em&gt;"!&lt;/p&gt;
&lt;h3&gt;Full text search?&lt;/h3&gt;
&lt;p&gt;But before we delve into the details of the PostgreSQL implementation, let us take a step back and first see what exactly a full text search engine is.&lt;/p&gt;
&lt;p&gt;Short version: A full text search engine is a system that can retrieve documents, or parts of documents, based on natural language searching.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Natural language&lt;/em&gt; means the living, breathing language we humans use. And as you know, human language can be complex and above all &lt;em&gt;ambiguous&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Imagine you know, for sure, that you have read an interesting article about elephants in the latest edition of "Your Favorite Elephant Magazine".
You liked it so much that you want to show it to your best friend, who happens to be an elephant lover too.
The only bummer is, you cannot remember the title, but you do remember it has an interesting sentence in it.&lt;/p&gt;
&lt;p&gt;So what do you do? First you quote the sentence in your mind: "The best elephants have a blue skin color.".
Next, you pick up the latest edition and you start &lt;em&gt;searching&lt;/em&gt;, flipping through the pages, skimming for that sentence.&lt;/p&gt;
&lt;p&gt;After a minute or two you shout: "Dumbo!, I have found the article!". You read the sentence out loud: "The best Elephants bear a blue skin tone.".
You are happy with yourself, call up your friend and tell him that you will be over right away to show him that specific article.&lt;/p&gt;
&lt;p&gt;One thing you forgot to notice was that the sentence in your head and the sentence that was actually printed were &lt;em&gt;different&lt;/em&gt;, but your brain (which is trained in this natural stuff) sees them as the same.
How did that work? Well, your brain used its internal &lt;em&gt;synonym&lt;/em&gt; list and &lt;em&gt;thesaurus&lt;/em&gt; to link the different words together, making them the same thing, just written differently:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;"elephants" is the same as "Elephants"&lt;/li&gt;
&lt;li&gt;"have" is the same as "bear"&lt;/li&gt;
&lt;li&gt;"skin color" is the same as "skin tone"&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Without noticing it, you have just completed a full text search using natural language algorithms, your magazine as the database, your brain as the engine.&lt;/p&gt;
&lt;h3&gt;But how does such a natural language lookup work...on an unnatural computer?&lt;/h3&gt;
&lt;p&gt;What a perfectly timed question, I was just getting to that.&lt;/p&gt;
&lt;p&gt;Now that you have a basic understanding of what natural language searching is, how does one port this idea to a stupid, binary ticking tin box?
By dissecting the process our brains go through and laying it out in several programmable steps and concepts.
Such a process, run by computers, will never be as good on the &lt;em&gt;natural&lt;/em&gt; part as our brains are, but it is certainly a lot faster with flipping and skimming through the magazine pages.&lt;/p&gt;
&lt;p&gt;Let us look at how a computer, regardless of which program, platform or engine you use, would go about being "natural" when searching for strings of text.&lt;/p&gt;
&lt;p&gt;To speed up the search process, a full text search engine will never search through the actual document itself.
That is how we humans would do it, and that is slow and (for our eyes) error prone. Before a document can be searched through with a full text search engine, it has to be parsed into a list of words first.
The parsing is where the magic happens, this is our attempt at programming the natural language process. Once the document is parsed, the parsed state is saved. Depending on your database model, you can save the parsed state together with a reference to the original document for later retrieval.&lt;/p&gt;
&lt;p&gt;Note that a document, in this context, is simply a big collection of words contained within a file. The engine does not care, and most of the time does not know, about what kind of file (plain text, LibreOffice document, HTML file, ...) it is handling or what the file's structure is. It simply looks at all the readable words inside of the file.&lt;/p&gt;
&lt;p&gt;So how does the parsing work? Parsing, in this regard, is all about compressing the text found in a document. Cutting down the word count to the least possible, so later, when a user searches, the engine has to plow through fewer words. This compressing, in most engines, is done in roughly three steps.&lt;/p&gt;
&lt;h4&gt;Remove stop words&lt;/h4&gt;
&lt;p&gt;The first step is the removal of words that do not add any searchable value to the text and are seldom searched for.
These words are known as "stop words", a concept usually credited to Hans Peter Luhn, a renowned IBM computer scientist who specialized in the retrieval and indexing of information stored in computer systems.&lt;/p&gt;
&lt;p&gt;The list of stop words is not limited to simple ones like "and" or "the". There is an extensive list of hundreds and hundreds of words which are generally considered to be of little value in a search context.
A (very) short list of stop words: her, him, the, also, each, was, we, after, been, they, would, up, from, only, you, while, ... .&lt;/p&gt;
&lt;h4&gt;Eliminate casing&lt;/h4&gt;
&lt;p&gt;The following step in the compression process is the elimination of casing - keeping only the lower case versions of a word.
If you would keep a search case sensitive, then "The ElEphAnt" would not match "the elephant", but generally you do want a match to happen.
Users often do not care (or do not know) about casing in a full text search.&lt;/p&gt;
&lt;h4&gt;Remove synonyms, employ a thesaurus and perform stemming&lt;/h4&gt;
&lt;p&gt;The last part in the compacting of our to-be-indexed document is removing words that have the same meaning and performing stemming.
Synonym lookups are used for removing &lt;em&gt;words&lt;/em&gt; of the same meaning, whereas thesaurus lookups are used to compact whole &lt;em&gt;phrases&lt;/em&gt; with similar meaning.&lt;/p&gt;
&lt;p&gt;Only one instance of all the synonyms, thesaurus phrases and case variants is stored. The surviving word is referred to as a &lt;em&gt;lexeme&lt;/em&gt;, the smallest meaningful form of a word.
The lexemes that are stored usually (depending on the engine you use) get an accompanying list of (alpha)numeric values stored alongside. Two types of (alpha)numeric values can be stored in case of PostgreSQL:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;The first type is purely numerical and represents the pointer(s) to where the word occurs in the original document.&lt;/li&gt;
&lt;li&gt;The second type is purely alphabetical (actually only capital A, B, C, D) and represents the weight a certain lexeme has.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Do not worry too much about these two (alpha)numerical values for now, we will get to that later.&lt;/p&gt;
&lt;p&gt;Next, let us get practical and start to actually use PostgreSQL to see how all of this works. &lt;/p&gt;
&lt;h3&gt;The tsvector&lt;/h3&gt;
&lt;p&gt;As PostgreSQL is an &lt;em&gt;extensible&lt;/em&gt; database engine, two new data types were added to make full text search possible, as you have seen in the beginning.
One of them is called &lt;em&gt;tsvector&lt;/em&gt;, "ts" for &lt;em&gt;t&lt;/em&gt;ext &lt;em&gt;s&lt;/em&gt;earch and "vector", which is analogous with the generic programming data type "vector".
It is the container in which the result of the parsing is eventually stored.&lt;/p&gt;
&lt;p&gt;Let me show you an example of such a tsvector, as presented by PostgreSQL on querying.
Imagine a document with the following string of text inside: &lt;em&gt;"The big blue elephant jumped over the crippled blue dolphin."&lt;/em&gt;.
A perfectly normal sentence, elephants jump over dolphins all the time.&lt;/p&gt;
&lt;p&gt;Without bothering about how to do it, if we let PostgreSQL parse this string, we will get the following tsvector stored in our record:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'big'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="s1"&gt;'blue'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt; &lt;span class="s1"&gt;'crippl'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="s1"&gt;'dolphin'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="s1"&gt;'eleph'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="s1"&gt;'jump'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You will notice a few things about this vector, let me go over them one by one.&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;First, you recognize the structure of a vector-ish data type. Hence the name "tsvector".&lt;/li&gt;
&lt;li&gt;Next, the numbers behind the lexemes themselves, like I said before, represent the pointer(s) to that word. Notice the word "blue" in particular, it has two pointers for the two occurrences in the string.&lt;/li&gt;
&lt;li&gt;And last, notice how some lexemes do not even look like English words at all. The lexemes "crippl" and "eleph" do not mean anything, to us humans anyway. These are the surviving lexemes of "crippled" and "elephant". PostgreSQL has stemmed and reduced the words to match all possible variants. The lexeme "crippl", for example, matches "cripple", "crippled", "crippling", "cripples", ... .&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Note that the above example is the simplest of full text search parsing results: we did not add any weights, nor did we employ a thesaurus (or an advanced dictionary) to get a more efficient compression.&lt;/p&gt;
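&lt;p&gt;To see stop word removal and case folding happen in a single call, here is a small sketch using the &lt;em&gt;to_tsvector&lt;/em&gt; function, which is introduced properly further down. It assumes the stock 'english' configuration is available on your install:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Stop words disappear, but they still count when positions are handed out.
SELECT to_tsvector('english', 'in the list of stop words');
-- expected: 'list':3 'stop':5 'word':6

-- Casing is eliminated before the lexeme is stored, so both sides end up identical.
SELECT to_tsvector('english', 'ElEphAnt') = to_tsvector('english', 'elephant');
-- expected: true
&lt;/pre&gt;


&lt;p&gt;Notice how "words" was stemmed to "word" along the way, and how the surviving lexemes keep the positions they had in the original string.&lt;/p&gt;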
&lt;p&gt;Now that we are dwelling inside of PostgreSQL, I can elaborate a bit more about how the parsing works exactly.
As we have seen above, it happens in roughly three steps. But I intentionally neglected to say that with PostgreSQL, there is an intermediate state between the word and the resulting lexeme.&lt;/p&gt;
&lt;p&gt;When PostgreSQL parses the string of text, it goes over it and first &lt;em&gt;categorizes&lt;/em&gt; each word into sections like "word", "url", "int", "hword", "asciiword", ... .
Once the words are broken down into categories, we refer to them as &lt;em&gt;tokens&lt;/em&gt;. This is the intermediate state.
For a token to become a lexeme, PostgreSQL will consult a set of defined &lt;em&gt;dictionaries&lt;/em&gt; for each category to try and find a match.
If a match is found, the dictionary will propose a lexeme. This lexeme is the one that will finally be put in the vector as the parsed result.&lt;/p&gt;
&lt;p&gt;If none of the dictionaries recognizes the token, it is discarded. Stop words are the one twist here: a dictionary &lt;em&gt;does&lt;/em&gt; recognize them, but instead of proposing a lexeme it returns nothing, so they are discarded as well.&lt;/p&gt;
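&lt;p&gt;You can peek at this intermediate state yourself. The following is a minimal sketch, assuming the built-in parser named 'default' and the stock 'english' configuration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- List every token category the default parser knows about.
SELECT alias, description FROM ts_token_type('default');

-- Follow a few words from token to lexeme: the category each word lands in,
-- the dictionaries consulted for that category and the lexemes they proposed.
SELECT alias, token, dictionaries, lexemes
FROM ts_debug('english', 'The big blue elephant');
&lt;/pre&gt;


&lt;p&gt;In the second result set, the row for "The" should come back with an empty lexeme array, which is the stop word case described above.&lt;/p&gt;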
&lt;p&gt;Let us now get our hands dirty and set up a quick testing database and rig it up with the phraseTable table we have been using in our journey so far.
But instead of a varchar column, this table will contain a tsvector type, for we will unleash the power of Full Text Search!&lt;/p&gt;
&lt;p&gt;Note: I am assuming you have PostgreSQL 9.1 or higher. This post was written with PostgreSQL 9.3 in mind.&lt;/p&gt;
&lt;p&gt;So connect to your PostgreSQL install and create the database:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;DATABASE&lt;/span&gt; &lt;span class="n"&gt;phrases&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Do not worry too much about the ownership of this database nor the ownership of its tables, you can discard it whole later.
Now, switch over to the database:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="n"&gt;phrases&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And create the phraseTable table:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="n"&gt;tsvector&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Okay, simple enough. We now have a tiny database, with a table containing one column of type &lt;em&gt;tsvector&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Let us insert a parsed vector into the table.
Again, without employing a thesaurus or any other tools, we only use the built-in, default configuration to parse a string and save it as a vector.&lt;/p&gt;
&lt;p&gt;Let us insert the vector, shall we?&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'The big blue elephant jumped over the crippled blue dolphin.'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;That was easy enough. Most of what you see is simple, regular SQL with one new kid on the block: "&lt;em&gt;to_tsvector&lt;/em&gt;".
The latter is a &lt;em&gt;function&lt;/em&gt; that ships with PostgreSQL's built-in full text search and it does what its name suggests: it takes a string of text and converts it into a &lt;em&gt;tsvector&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;As a first argument to this function you can optionally pass the full text search &lt;em&gt;configuration&lt;/em&gt; you wish the parser to use. If you omit it, PostgreSQL falls back to its &lt;em&gt;default_text_search_config&lt;/em&gt; setting, which is &lt;em&gt;"english"&lt;/em&gt; on many installs, so I could have left it out of the argument list.
This configuration holds everything that PostgreSQL will employ to do all of the parsing, including a basic dictionary, stop word list, ... .
PostgreSQL has some default settings, which many times are good enough. The 'english' configuration is such an example.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note:&lt;/em&gt; As was pointed out by one of my observant readers, depending on how your PostgreSQL is packaged, it could be &lt;em&gt;localized&lt;/em&gt;. This means that the default 'english' configuration could be changed to reflect the language of the localized package. If this is the case with your install, be sure to &lt;em&gt;not omit&lt;/em&gt; the optional parameter and keep its value set to 'english' for all the tsvector and tsquery work we will do in this chapter. Otherwise your full text parsing will produce different, unpredictable results which will make this chapter difficult to follow.&lt;/p&gt;
&lt;p&gt;In the next chapter we will delve &lt;em&gt;deep&lt;/em&gt; into creating our own configuration, for now just take it for granted.&lt;/p&gt;
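&lt;p&gt;If you are unsure which configuration your install falls back to when the first argument is left out, a quick check along these lines should tell you. The output depends on how your cluster was initialized, so I am not showing any here:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- The configuration used when to_tsvector() gets only one argument.
SHOW default_text_search_config;

-- The same information, as a function you can use inside a query.
SELECT get_current_ts_config();

-- Every configuration that is available in this database.
SELECT cfgname FROM pg_ts_config;
&lt;/pre&gt;


&lt;p&gt;If the first two do not report an 'english' configuration, keep passing 'english' explicitly in the examples that follow.&lt;/p&gt;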
&lt;p&gt;If we query the result, with a simple select, it will return our newly created vector:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Will return:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'big'&lt;/span&gt;:2 &lt;span class="s1"&gt;'blue'&lt;/span&gt;:3,9 &lt;span class="s1"&gt;'crippl'&lt;/span&gt;:8 &lt;span class="s1"&gt;'dolphin'&lt;/span&gt;:10 &lt;span class="s1"&gt;'eleph'&lt;/span&gt;:4 &lt;span class="s1"&gt;'jump'&lt;/span&gt;:5
&lt;/pre&gt;


&lt;p&gt;Now remember that I talked about the second kind of value we could store alongside the numeric pointers, the &lt;em&gt;weights&lt;/em&gt;? Let us take a deeper look into that now.&lt;/p&gt;
&lt;p&gt;First, weights are not mandatory and only give you an extra tool for ranking the results afterwards.
They are nothing more than a label you can put on a lexeme to group it together. With weights you could, for example, reflect the structure the original document had.
You may wish to put a higher weight on lexemes that come from a title element and a lower weight on those from the body text.&lt;/p&gt;
&lt;p&gt;PostgreSQL knows four weight labels: &lt;em&gt;A&lt;/em&gt;, &lt;em&gt;B&lt;/em&gt;, &lt;em&gt;C&lt;/em&gt; and &lt;em&gt;D&lt;/em&gt;, with &lt;em&gt;D&lt;/em&gt; being the lowest in rank. In fact, if you do not assign any weights to the lexemes inside a tsvector, all of them will implicitly get a &lt;em&gt;D&lt;/em&gt;.
Because &lt;em&gt;D&lt;/em&gt; is the default, it is omitted from display when printing the tsvector, simply for readability.
The above query result could thus also be written as:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'big'&lt;/span&gt;:2D &lt;span class="s1"&gt;'blue'&lt;/span&gt;:3D,9D &lt;span class="s1"&gt;'crippl'&lt;/span&gt;:8D &lt;span class="s1"&gt;'dolphin'&lt;/span&gt;:10D &lt;span class="s1"&gt;'eleph'&lt;/span&gt;:4D &lt;span class="s1"&gt;'jump'&lt;/span&gt;:5D
&lt;/pre&gt;


&lt;p&gt;It is &lt;em&gt;exactly&lt;/em&gt; the same result, but unnecessarily verbose.&lt;/p&gt;
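&lt;p&gt;You can watch PostgreSQL drop the &lt;em&gt;D&lt;/em&gt; label by casting a hand-written literal straight to a tsvector. This is only a sketch: a plain cast does no parsing or stemming, it just stores and sorts the lexemes you give it:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Weights A, B and C are printed; the default weight D is silently left out.
SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;
-- expected: 'a':1A 'cat':5 'fat':2B,4C
&lt;/pre&gt;


&lt;p&gt;The lexeme "cat" went in with an explicit &lt;em&gt;D&lt;/em&gt; and comes back without one, while the other labels survive.&lt;/p&gt;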
&lt;p&gt;I told you, in the very beginning, that a full text engine does not know or care about the structure of a document, it only sees the words.
So how can it then put labels on lexemes based on a document structure that it does not know?&lt;/p&gt;
&lt;p&gt;It cannot.&lt;/p&gt;
&lt;p&gt;It is your job to provide PostgreSQL with label information when building the tsvector.
Up until now we have been working with simple text strings, which contain no hierarchy. 
If you wish to reflect your original document structure by using weights, you will have to preprocess the document and construct your &lt;em&gt;to_tsvector&lt;/em&gt; query manually.&lt;/p&gt;
&lt;p&gt;Just for demonstration purposes, we could, of course, assign weights to the lexemes inside a simple text string.
The process of weight assignment is trivial. PostgreSQL gives you the appropriately named &lt;em&gt;setweight&lt;/em&gt; function for this.
This function accepts a tsvector as the first argument and a weight label as the second.&lt;/p&gt;
&lt;p&gt;To demonstrate, let me update our record and give all the lexemes in our famous sentence an &lt;em&gt;A&lt;/em&gt; weight label:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'A'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;If we now query this table, the result will be this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'big'&lt;/span&gt;:2A &lt;span class="s1"&gt;'blue'&lt;/span&gt;:3A,9A &lt;span class="s1"&gt;'crippl'&lt;/span&gt;:8A &lt;span class="s1"&gt;'dolphin'&lt;/span&gt;:10A &lt;span class="s1"&gt;'eleph'&lt;/span&gt;:4A &lt;span class="s1"&gt;'jump'&lt;/span&gt;:5A
&lt;/pre&gt;


&lt;p&gt;Simple, right?&lt;/p&gt;
&lt;p&gt;One more for fun. What if you wanted to assign different weights to the lexemes?
For this, you have to concatenate the results of several &lt;em&gt;setweight&lt;/em&gt; calls with the &lt;em&gt;||&lt;/em&gt; operator.
An example query would look something like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;update&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;set&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'the big blue elephant'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'A'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
&lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'jumped over the'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'B'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
&lt;span class="n"&gt;setweight&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'crippled blue dolphin.'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'C'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The result:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s1"&gt;'big'&lt;/span&gt;:2A &lt;span class="s1"&gt;'blue'&lt;/span&gt;:3A,7C &lt;span class="s1"&gt;'crippl'&lt;/span&gt;:6C &lt;span class="s1"&gt;'dolphin'&lt;/span&gt;:8C &lt;span class="s1"&gt;'eleph'&lt;/span&gt;:4A &lt;span class="s1"&gt;'jump'&lt;/span&gt;:5B
&lt;/pre&gt;


&lt;p&gt;Not very useful, but it demonstrates the principle.&lt;/p&gt;
&lt;p&gt;If the documents you wish to index have a fixed structure, many times the table that will hold the tsvectors for these documents will reflect that structure with appropriately named columns.
For example, if your document would always have a title, body text and a footer, you could create a table which contains three tsvector type columns, named after each structure type.
When you parse the document and construct the query, you could assign all lexemes that will be stored in the title column with an &lt;em&gt;A&lt;/em&gt; label, in the body column with a &lt;em&gt;B&lt;/em&gt; and in the footer column with a &lt;em&gt;C&lt;/em&gt;.&lt;/p&gt;
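&lt;p&gt;A minimal sketch of such a table could look like this. The table name &lt;em&gt;documentTable&lt;/em&gt;, its column names and the sample texts are made up for illustration; only &lt;em&gt;setweight&lt;/em&gt;, &lt;em&gt;to_tsvector&lt;/em&gt; and the &lt;em&gt;||&lt;/em&gt; operator are the PostgreSQL tools we have seen above:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Hypothetical table: one tsvector column per structural part of the document.
CREATE TABLE documentTable (
    id     serial PRIMARY KEY,
    title  tsvector,
    body   tsvector,
    footer tsvector
);

-- Each part gets its own weight label while the vector is being built.
INSERT INTO documentTable (title, body, footer) VALUES (
    setweight(to_tsvector('english', 'The big blue elephant'), 'A'),
    setweight(to_tsvector('english', 'It jumped over the crippled blue dolphin.'), 'B'),
    setweight(to_tsvector('english', 'Written by a dolphin lover'), 'C')
);

-- Concatenate the columns when you need a single vector to search through.
SELECT title || body || footer AS document FROM documentTable;
&lt;/pre&gt;


&lt;p&gt;Whether you keep three columns or store one concatenated vector is a design choice; separate columns simply make it easier to rebuild one part of the document without touching the others.&lt;/p&gt;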
&lt;p&gt;Okay, that is enough about weights. Simply remember that they give you extra power to influence the search result ranking, if needed.&lt;/p&gt;
&lt;p&gt;We now have a table with a decent tsvector inside. The data is in, so to speak. But what can we do with it now?&lt;/p&gt;
&lt;p&gt;Well, let us try to retrieve and compare it, shall we!&lt;/p&gt;
&lt;h3&gt;The tsquery&lt;/h3&gt;
&lt;p&gt;You could, of course, simply retrieve the results stored in a &lt;em&gt;tsvector&lt;/em&gt; by doing a &lt;em&gt;SELECT&lt;/em&gt; on the column. However, you have no way of filtering out the results using the operators we have seen before (LIKE, ILIKE). Even if you could use them, you would still run into the same kind of problems as before. You would still have a user who will search for a synonym or search for a plural form of the stemmed lexeme actually stored in the vector.&lt;/p&gt;
&lt;p&gt;So how do we query it?&lt;/p&gt;
&lt;p&gt;Step in &lt;em&gt;tsquery&lt;/em&gt;. What is it? It is a data type that gives us extra tools to &lt;em&gt;query&lt;/em&gt; the full &lt;em&gt;text search&lt;/em&gt; vector.&lt;/p&gt;
&lt;p&gt;Pay attention to the fact that we do not call &lt;em&gt;tsquery&lt;/em&gt; a set of extra &lt;em&gt;operators&lt;/em&gt; but we call it a &lt;em&gt;data type&lt;/em&gt;. This is very important to understand.
With &lt;em&gt;tsquery&lt;/em&gt; we can construct search &lt;em&gt;predicates&lt;/em&gt;, which can search through a &lt;em&gt;tsvector&lt;/em&gt; type and can employ specially designed indexes to speed up the process. &lt;/p&gt;
&lt;p&gt;Let me throw a query at you that will try to find the word "elephants" in our favorite string using &lt;em&gt;tsquery&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'elephants'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Try it out, this will give you back the same result set we had before. Let me explain what just happened.&lt;/p&gt;
&lt;p&gt;As you can see, there is again a new function introduced: &lt;em&gt;to_tsquery&lt;/em&gt; and it is almost identical to its &lt;em&gt;to_tsvector&lt;/em&gt; counterpart.
The function &lt;em&gt;to_tsquery&lt;/em&gt; requires only one argument, a string containing the &lt;em&gt;tokens&lt;/em&gt; (not the words, not the lexemes) you wish to search for.
The first argument I give in this example, 'english', is again the full text configuration and is optional.&lt;/p&gt;
&lt;p&gt;Let us first look a bit more at this one. Say, for instance, you wish to find two tokens of the sentence inside your database.
Your first instinct would be to query this the following way:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'elephants blue'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Unfortunately, this will throw an error stating there is a syntax error. Why? Because the string you provided as an argument is malformed.
The &lt;em&gt;to_tsquery&lt;/em&gt; helper function does not accept a simple string of text, it needs a string of tokens &lt;em&gt;separated by operators&lt;/em&gt;.
The operators at your disposal are the regular &lt;em&gt;&amp;amp;&lt;/em&gt; (AND), &lt;em&gt;|&lt;/em&gt; (OR) and &lt;em&gt;!&lt;/em&gt; (NOT). Note that the &lt;em&gt;!&lt;/em&gt; operator only negates the token that follows it; to combine that token with another one you still &lt;em&gt;need&lt;/em&gt; the &lt;em&gt;&amp;amp;&lt;/em&gt; or the &lt;em&gt;|&lt;/em&gt; operator, as in 'elephants &amp;amp; !blue'.&lt;/p&gt;
&lt;p&gt;It then goes and creates a true &lt;em&gt;tsquery&lt;/em&gt; to retrieve the results. Let us try this query again, but with correct syntax this time:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'elephants &amp;amp; blue'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Perfect! This will return, once more, the same result as before. You could even use parentheses inside your string argument to enforce grouping if desired.
Like I said before, what this helper function does is translate its input (the tokens in the string) into actual lexemes. After that, it tries to match this result with the lexemes present in the tsvector.&lt;/p&gt;
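&lt;p&gt;To make the operators and the grouping a bit more tangible, here is a small sketch that combines all three of them. The tokens &lt;em&gt;whales&lt;/em&gt; and &lt;em&gt;red&lt;/em&gt; are only picked for illustration, they do not appear in our sentence:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Reads as: (elephants OR whales) AND NOT red.
-- The parentheses group the OR part before the AND is applied.
SELECT phrase FROM phraseTable
WHERE phrase @@ to_tsquery('english', '(elephants | whales) &amp;amp; !red');
&lt;/pre&gt;


&lt;p&gt;Our sentence contains an elephant and nothing red, so this query should still return our record.&lt;/p&gt;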
&lt;p&gt;We still have a problem if we would let a user type her or his search string into an interface search box and feed it to &lt;em&gt;to_tsquery&lt;/em&gt;, for a user does not know about the operators they need to use.
Luckily for us, there is another helper function, &lt;em&gt;plainto_tsquery&lt;/em&gt;, which takes care of exactly that problem: converting an arbitrary string of text into lexemes.&lt;/p&gt;
&lt;p&gt;Let me demonstrate:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;phraseTable&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;plainto_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'elephants blue'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Notice we did not separate the words with operators, now it is a simple search string. In fact, &lt;em&gt;plainto_tsquery&lt;/em&gt; converts it to a list of lexemes separated by an &lt;em&gt;&amp;amp;&lt;/em&gt; (AND) operator.
The only drawback is that this function can only separate the lexemes with an &lt;em&gt;&amp;amp;&lt;/em&gt; operator.
If you wish to have something other than the &lt;em&gt;&amp;amp;&lt;/em&gt; operator, you will have to stick to &lt;em&gt;to_tsquery&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;A word of caution though: &lt;em&gt;plainto_tsquery&lt;/em&gt; may seem interesting, but most of the time it is not a general solution for building a higher level search interface. When you are building, say, a web application that contains a full text search box, there are a few more steps between the string entered in that box and the final query that will be performed.&lt;/p&gt;
&lt;p&gt;Building a web application and safely handling user input that travels to the database is a separate story and &lt;em&gt;way&lt;/em&gt; beyond the scope of this post, but you will have to build your own parser that sits between the user input and the query.&lt;/p&gt;
&lt;p&gt;If you would play dumb and accept the fact that your interface would only allow to enter a string in the search box (no operators, no grouping, ...) then you still need to send over the user input using query parameters &lt;em&gt;and&lt;/em&gt; you need to make sure that the parameter sent over is a string. This, of course, is not really a parser, this is more basic, sane database handling on the web. &lt;/p&gt;
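&lt;p&gt;Staying inside the database for a moment, a prepared statement gives a rough idea of what sending the search string as a typed parameter looks like. The statement name &lt;em&gt;searchPhrase&lt;/em&gt; is made up for this sketch; in a real web application your database driver would normally take care of the parameter binding for you:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- $1 is declared as text, so the input is always handled as a value,
-- never as a piece of SQL.
PREPARE searchPhrase(text) AS
    SELECT phrase FROM phraseTable
    WHERE phrase @@ plainto_tsquery('english', $1);

-- The user input travels as a parameter.
EXECUTE searchPhrase('elephants blue');

-- Clean up the prepared statement when you are done.
DEALLOCATE searchPhrase;
&lt;/pre&gt;
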
&lt;p&gt;As tempting (and simple) as it might seem to build a query like that, it will probably frustrate your end users. The reason is that, as I mentioned before, &lt;em&gt;plainto_tsquery&lt;/em&gt; accepts a string, but will chop the string into separate lexemes and put the &lt;em&gt;&amp;amp;&lt;/em&gt; operator between them. This means that &lt;em&gt;all&lt;/em&gt; the words entered by the user (or at least their resulting lexemes) must be found in the stored tsvector.&lt;/p&gt;
&lt;p&gt;Many times, this may not be what you want. Users expect to have their search string interpreted as &lt;em&gt;|&lt;/em&gt; (OR) separated lexemes, or users may want the ability to define these operators themselves on the interface.&lt;/p&gt;
&lt;p&gt;So, one way or the other, you will have to write your own parser if you want a non-database user to work with your application. This parser looks at the options you present on your search form and will crawl over the user-entered string to interpret certain characters not as words to search for but as operators or grouping tokens to build your final query.&lt;/p&gt;
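&lt;p&gt;Just to give an idea of the simplest possible case, here is a naive sketch that turns a plain search string into an &lt;em&gt;|&lt;/em&gt; (OR) query. It is a shortcut, not a real parser: it lets &lt;em&gt;plainto_tsquery&lt;/em&gt; do the chopping, prints the result as text, swaps every &lt;em&gt;&amp;amp;&lt;/em&gt; for a &lt;em&gt;|&lt;/em&gt; and casts the text back to a &lt;em&gt;tsquery&lt;/em&gt;. The token &lt;em&gt;whales&lt;/em&gt; is again only there for illustration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Naive OR search: any of the entered words may match.
-- Assumes an ordinary search string; treat it as a starting point only.
SELECT phrase FROM phraseTable
WHERE phrase @@ replace(
    plainto_tsquery('english', 'elephants whales')::text,
    '&amp;amp;', '|'
)::tsquery;
&lt;/pre&gt;


&lt;p&gt;Anything beyond this, like user-defined operators or grouping, still calls for a proper parser in your application.&lt;/p&gt;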
&lt;p&gt;But enough about web applications, that is not our focus now. Let us continue.&lt;/p&gt;
&lt;p&gt;The next new item you will have seen in the last few queries is the &lt;em&gt;@@&lt;/em&gt; operator. This operator (also referred to as the text-search-matching operator) is also specific to a full text search context. It allows you to &lt;em&gt;match&lt;/em&gt; a &lt;em&gt;tsvector&lt;/em&gt; against a &lt;em&gt;tsquery&lt;/em&gt;. In our queries we matched a &lt;em&gt;tsquery&lt;/em&gt; against a column, but you could also match against a &lt;em&gt;tsvector&lt;/em&gt; built on the fly:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'The blue elephant.'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'blue &amp;amp; the'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;A nice little detail about the &lt;em&gt;@@&lt;/em&gt; operator is that it can also match against a &lt;em&gt;TEXT&lt;/em&gt; or &lt;em&gt;VARCHAR&lt;/em&gt; data type, giving you a poor man's full text capability. Let me demonstrate:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'The blue elephant.'&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'blue &amp;amp; the'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This 'on-the-fly' query casts the string to &lt;em&gt;VARCHAR&lt;/em&gt; (by using the &lt;em&gt;::&lt;/em&gt; or &lt;em&gt;cast&lt;/em&gt; operator), after which the &lt;em&gt;@@&lt;/em&gt; operator implicitly turns it into a &lt;em&gt;tsvector&lt;/em&gt; and tries to match the tokens &lt;em&gt;blue&lt;/em&gt; and &lt;em&gt;the&lt;/em&gt;. Because &lt;em&gt;the&lt;/em&gt; is a stop word, it is dropped from the query, so only &lt;em&gt;blue&lt;/em&gt; actually has to match. The result will be &lt;em&gt;t&lt;/em&gt;, meaning that a match is found.&lt;/p&gt;
&lt;p&gt;Before I continue, it is nice to know that you can always test the result of a &lt;em&gt;tsquery&lt;/em&gt;, meaning, inspect what it will use to find lexemes in the &lt;em&gt;tsvector&lt;/em&gt;.
To see that output, you simply call it with the helper function, the same way we called the &lt;em&gt;to_tsvector&lt;/em&gt; a while ago:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'elephants &amp;amp; blue'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will result in:&lt;/p&gt;
&lt;pre class="code literal-block"&gt; &lt;span class="s1"&gt;'eleph'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="s1"&gt;'blue'&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;It is also important to note that &lt;em&gt;to_tsquery&lt;/em&gt; (and &lt;em&gt;plainto_tsquery&lt;/em&gt;) uses a configuration of the same kind &lt;em&gt;to_tsvector&lt;/em&gt; uses, for it has to do the same parsing to find the lexemes of the string or tokens you feed it. So the first, optional argument to &lt;em&gt;to_tsquery&lt;/em&gt; is the configuration, which also defaults to "english". This means we could rewrite the query as such:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;to_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'elephants &amp;amp; blue'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we would get back the same results.&lt;/p&gt;
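&lt;p&gt;If you are curious which configuration your own server falls back to, or what difference the configuration makes, the following sketch may help. It uses the &lt;em&gt;default_text_search_config&lt;/em&gt; setting and the built-in &lt;em&gt;simple&lt;/em&gt; configuration, neither of which we have touched so far:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;-- Shows the configuration that is used when you omit the first argument.
SHOW default_text_search_config;

-- The 'simple' configuration does no stemming, so the tokens stay as they are.
SELECT to_tsquery('simple', 'elephants &amp;amp; blue');
&lt;/pre&gt;


&lt;p&gt;With &lt;em&gt;simple&lt;/em&gt; the second query should keep 'elephants' intact instead of reducing it to 'eleph', which is why the vector and the query should always be built with the same configuration.&lt;/p&gt;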
&lt;p&gt;Okay, I think this is enough to take in for now. You have got a basic understanding of what full text search means, you know how to construct a vector containing lexemes, pointers and weights. You also know how to build a query data type and perform basic matching to retrieve the text you desire.&lt;/p&gt;
&lt;p&gt;In part 2 we will look at how we can dig deeper and set up our own full text search configuration.
We will cover fun stuff like:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Looking deeper into PostgreSQL's guts&lt;/li&gt;
&lt;li&gt;Defining dictionaries&lt;/li&gt;
&lt;li&gt;Building Stop word lists&lt;/li&gt;
&lt;li&gt;Mapping token categories to our dictionaries&lt;/li&gt;
&lt;li&gt;Defining our own, super awesome full text configuration&lt;/li&gt;
&lt;li&gt;And, of course, more dolphin puns...&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;In the last part we will break open yet another can of full text search goodness and look at:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Creating special, full text search indexes&lt;/li&gt;
&lt;li&gt;Ranking search results&lt;/li&gt;
&lt;li&gt;Highlighting words inside search results&lt;/li&gt;
&lt;li&gt;Setting up update triggers for tsvector records&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Hang in there!&lt;/p&gt;
&lt;p&gt;And as always...thanks for reading!&lt;/p&gt;&lt;/div&gt;</description><category>full text search</category><category>postgresql</category><guid>http://shisaa.be/postset/postgresql-full-text-search-part-1.html</guid><pubDate>Wed, 30 Apr 2014 14:00:00 GMT</pubDate></item><item><title>Moving to Japan</title><link>http://shisaa.be/postset/moving-to-japan.html</link><dc:creator>Tim van der Linden</dc:creator><description>&lt;div&gt;&lt;h3&gt;Preface&lt;/h3&gt;
&lt;p&gt;If you have had the courage to read my
&lt;a href="http://shisaa.be/stories/about.html" title="About me"&gt;about page&lt;/a&gt; you will have noticed
that I have an interest in Japan and that I am married to a Citizen of the
Japanese Empire. People who have a spouse from another country, continent or
planet will have had to make the decision, sooner or later: who of the two will
leave their family, friends and history behind to join their loved one in that
strange new place. In the early days of the relationship I was lucky enough that
my wife had bigger coconuts than me: she left everything and everyone behind to
come and join me.&lt;/p&gt;
&lt;p&gt;But before I met my wife, there was another, simple reason I visited Japan: I love that
country. It is difficult (and beyond the scope here) to say why I like it so
much. Probably has something to do with the deep respect that is woven into the
fabric of their daily life. Or the sharp contrast with the former in which many
Japanese literally offer their &lt;em&gt;lives&lt;/em&gt; to work for the bigger, older Japanese
companies who in turn show no respect whatsoever to their employees.&lt;/p&gt;
&lt;p&gt;It could also be the ancient culture that is still so much alive today. Or the fact that
the Japanese have made their own isolated world with their own brands and models of almost
everything they use in daily life (cars, cellphones, food products, health
products, medicine, movies, ...). Maybe even, in harmony with the former, the almost
chauvinistic way the Japanese love Japan...because it has everything one ever
needs, and more.&lt;/p&gt;
&lt;p&gt;Or it could just be the food...because the Japanese make some goddamn tasty
food. Tempura, Onigiri, Yakiniku, Shabu-Shabu, Teriyaki, Okonomiyaki, Sushi,
Soba, Ramen, ... I'm so hungry right now. Okay, it definitely has something
to do with the food.&lt;/p&gt;
&lt;p&gt;As much as it startled me at first though, one cannot make a decision on moving
to the other side of the world solely based on the quality of the food. Really. One
cannot. So it is probably a mix of everything I scribbled above, &lt;em&gt;the&lt;/em&gt; most important
one being: my wife.&lt;/p&gt;
&lt;p&gt;Do not get me wrong, she loves Belgium, and she surely loves the food (which I
think is still a good reason to move). And she surely would be okay with living
out her life together with me here in this cold, foggy speck on the world
map. But there is a small detail I have come to know over the past years that
grows more apparent over time:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;"If you marry a Japanese, you marry with Japan."&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Aside from the high corny content of the above quote, it still is a fact. The
Japanese &lt;em&gt;love&lt;/em&gt; Japan, and the more I visited Japan, the more I got acquainted
with their customs and way of going about things, the more I realized why. Again
it is too elaborate and too difficult to pinpoint but it boils down, like I said
above, to the fact that Japan has everything the Japanese need.&lt;/p&gt;
&lt;p&gt;So after this brief introduction about the reasoning and sentimentalities behind
our decision, I would like to take you on a small tour of what it takes to move
to Japan.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; the procedures explained below are based on my personal experience in
  obtaining a legal visa to live and work in Japan. The documents, forms and
  procedures explained can vary in different countries, different visa types and
  are subject to change over time.&lt;/p&gt;
&lt;h3&gt;Types of Visa&lt;/h3&gt;
&lt;p&gt;When you simply visit Japan for recreational purposes, you will have a &lt;em&gt;tourist
visa&lt;/em&gt;. This visa is strictly for roaming around the country; you are not allowed
to do any paid (or unpaid) employment. The maximum length of stay is limited to
three months. If you want to actually work in Japan you will need a different
type of visa. Some of the most significant ones:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Diplomat visa&lt;/li&gt;
&lt;li&gt;Student exchange program visa&lt;/li&gt;
&lt;li&gt;Working visa&lt;/li&gt;
&lt;li&gt;Press agency visa&lt;/li&gt;
&lt;li&gt;Spouse or child of Japanese national visa&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Being married to a Japanese national, I have to choose the last: &lt;em&gt;Spouse
or child of Japanese national&lt;/em&gt;. Each visa has its own procedures and
surprises, but this post will, of course, only cover the latter type.&lt;/p&gt;
&lt;p&gt;First, what exactly is this visa thing?&lt;/p&gt;
&lt;p&gt;All of the above visa types are, in the first place, mere landing permits. Like
I said before, the standard visa is a tourist visa. You will get this automatically on entering Japan
(unless you are prohibited to enter Japan). The only thing you need to get such
a tourist visa is simply a valid passport (for most countries).&lt;/p&gt;
&lt;p&gt;You land in Japan, enter immigration, get your passport checked, get your
fingerprints and  mugshot taken and if all is well, you will get a stamp
in your passport that allows your entrance into Japan and a stay of three
months.&lt;/p&gt;
&lt;p&gt;If you want to live and work in Japan as a &lt;em&gt;spouse of a Japanese national&lt;/em&gt;, you
will get a special visa glued into your passport together with a special
document officially titled a &lt;em&gt;Certificate of Eligibility&lt;/em&gt;. These two will be
checked, together with the same checks a tourist will get, and if all is well
you will be granted a &lt;em&gt;Foreigner Card&lt;/em&gt; which will (together with a
possible driver's license) act as your Japanese ID card.&lt;/p&gt;
&lt;p&gt;But first things first, how do you get such a certificate and such a glued visa?&lt;/p&gt;
&lt;h3&gt;Certificate of Eligibility&lt;/h3&gt;
&lt;p&gt;Before you can do anything else, you have to get your own, genuine certificate.
As is the case in many countries, Japan is very wary of sham marriages:
two people getting married for the sole purpose of obtaining a visa.&lt;/p&gt;
&lt;p&gt;So in an attempt to counter these fake marriages the Japanese introduced a check
in which you will have to file a few forms of paperwork together with proof of
your love towards your spouse. After sending them all these documents a special
immigration officer will decide if you are eligible to get the certificate.
Let us look into this a bit more.&lt;/p&gt;
&lt;h4&gt;The Question Form&lt;/h4&gt;
&lt;p&gt;First, the paperwork. The first form you have to fill in is fittingly called
"Question form" and consists of about 7 pages crammed with Japanese / English
texts followed by dotted lines.&lt;/p&gt;
&lt;p&gt;On the first page you will be asked about your current details. You, the
&lt;em&gt;applicant&lt;/em&gt;, have to tell immigration where you are currently living and what your
current contact details are. They also want to know the basic layout
of your current residence.&lt;/p&gt;
&lt;p&gt;This layout has to be entered in the Japanese format. A Japanese layout always
consists of a number followed by letters representing predefined rooms. The
following letter codes are available:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;L = Living room&lt;/li&gt;
&lt;li&gt;D = Dining room&lt;/li&gt;
&lt;li&gt;K = Kitchen&lt;/li&gt;
&lt;li&gt;S = Service room&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;All of the remaining rooms can be summed up and put in the number preceding the letters.&lt;/p&gt;
&lt;p&gt;Example:&lt;/p&gt;
&lt;p&gt;2LDK = Living room, kitchen, dining room and 2 extra rooms.&lt;/p&gt;
&lt;p&gt;3LK = Living room, kitchen and 3 extra rooms.&lt;/p&gt;
&lt;p&gt;1ROOM = Single room (small studio or really small student flat) including everything.&lt;/p&gt;
&lt;p&gt;They also wish to know how much rent or loan you have to pay to live in your
current place.&lt;/p&gt;
&lt;p&gt;Still on page one they would like to know some employment details of your
Japanese spouse, referred to as the &lt;em&gt;Japanese national&lt;/em&gt;. Where does she or he
live and when did she or he start to work for that company.&lt;/p&gt;
&lt;p&gt;Page number two is dedicated to your relationship with your spouse and is the
page the immigration officer will use to see if you are actually really in love
with each other. This page has some general questions that you have to answer
with short stories.&lt;/p&gt;
&lt;p&gt;The most significant question is about how you met your spouse. You need to write a
short story explaining how and how many times you met each other before
you got married. They ask you to prove this with extra material you can send
together with this form. In our case we added copies of old plane tickets
proving I flew to Japan many times.&lt;/p&gt;
&lt;p&gt;We also included more than 30 photographs showing many highlights of our
relationship in the time before we married. Each photograph was numbered and we
used page number two together with some extra white sheets to describe each photo
(including date, time and place taken).&lt;/p&gt;
&lt;p&gt;On this page they actually only ask for about 3 photographs, but it speaks for
itself that the more material you can include, the better chance you will have
of being taken seriously.&lt;/p&gt;
&lt;p&gt;The next 5 pages consist of many small questions asking more details about you
as a couple and the family on both sides. These questions are:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;What language do you speak with your spouse?&lt;/li&gt;
&lt;li&gt;To what extent do you understand your spouse's main language?&lt;/li&gt;
&lt;li&gt;To what extent does your spouse understand your main language?&lt;/li&gt;
&lt;li&gt;How well do you understand Japanese and how did you learn to speak it?&lt;/li&gt;
&lt;li&gt;If you do not understand each other's language, how do you communicate?&lt;/li&gt;
&lt;li&gt;How many times did you marry before marrying your spouse?&lt;/li&gt;
&lt;li&gt;How many times did your spouse marry before marrying you?&lt;/li&gt;
&lt;li&gt;How many times, on which dates and why did you come to Japan before marrying?&lt;/li&gt;
&lt;li&gt;How many times, on which dates and why did your spouse come to your country
  before marrying?&lt;/li&gt;
&lt;li&gt;Before getting married, were you forced to leave Japan?&lt;/li&gt;
&lt;li&gt;List the age, full name, address, telephone number and relationship to you of each of your direct family members&lt;/li&gt;
&lt;li&gt;List the age, full name, address, telephone number and relationship to your spouse of each of your spouse's
  direct family members&lt;/li&gt;
&lt;li&gt;Which of the listed direct family members know about your marriage with your
  spouse (both sides)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;None of these questions are difficult to fill in and many of them are in multiple
choice style; you do not have to write many stories. Use common sense to fill
them in, you can be optimistic but be realistic. If
you can only speak very basic Japanese, be honest. If you say you speak it
fluently, chances are high the immigration officer at your port of entry will
talk to you in Japanese, without a translator and I do not think they will like
it if you cannot get any further than "Konnichiwa".&lt;/p&gt;
&lt;p&gt;This sums up the "Question form" papers. The next piece of dead tree is the &lt;em&gt;"Guarantor
form"&lt;/em&gt;. Before a foreigner can move to Japan they, at least, need a &lt;em&gt;Guarantor&lt;/em&gt;
to safeguard their stay in Japan.&lt;/p&gt;
&lt;p&gt;A guarantor is a person who, for at least the past 3 years, has been living &lt;em&gt;and&lt;/em&gt;
working in Japan. It is, of course, also preferred (though not required) that
that person is of
Japanese origin. They will be financially responsible in case you need to be
deported and cannot pay for your own means of travel or in case you commit
criminal acts while in Japan.&lt;/p&gt;
&lt;p&gt;This person can be anyone, a friend, a family member (in my case it was my
mother in law), your future boss in Japan, ... as long as they meet the
above requirements.&lt;/p&gt;
&lt;p&gt;On the "Guarantor form" they will ask the nationality, full name and address of
the person as well as the full employer details and your relationship to your
guarantor. It is also mandatory for the guarantor to include a paper from her or
his employer stating that she or he is actually working in that company.&lt;/p&gt;
&lt;p&gt;This one page form has to be signed with the regular &lt;em&gt;inkan&lt;/em&gt; that belongs to
the guarantor.&lt;/p&gt;
&lt;h4&gt;What is an inkan?&lt;/h4&gt;
&lt;p&gt;In Japan it is uncommon and in some cases even not allowed to sign forms
with a western style signature. They have to be signed with a special stamp
bearing the family or given name of the person it belongs to. This stamp is
called an &lt;em&gt;inkan&lt;/em&gt; or &lt;em&gt;hanko&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;You usually have three types of inkan:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Registered inkan: This inkan can be used for the whole direct family and
  carries the family name decorated with many lines to make it as unique as
  possible. This inkan has to be registered at your local city hall and can only
  be used to sign very official documents, e.g. buying a house, buying a car, ...&lt;/li&gt;
&lt;li&gt;Bank inkan: This stamp is used only inside a bank office to sign transactions
  or to open a new bank account.&lt;/li&gt;
&lt;li&gt;Regular inkan: This stamp, also referred to as &lt;em&gt;hanko&lt;/em&gt;, is used to sign every
  day things, e.g. receiving a parcel, signing something at your office, ...&lt;/li&gt;
&lt;/ul&gt;&lt;h4&gt;The Application for Certificate of Eligibility Form&lt;/h4&gt;
&lt;p&gt;The next sheets you have to fill in are called the &lt;em&gt;"Application for Certificate
of Eligibility form"&lt;/em&gt;. This is the &lt;em&gt;actual&lt;/em&gt; application form for getting your
certificate. Being a bit redundant, you have to fill in many details again. Part
one of the form consists of your personal details:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;a recent photo&lt;/li&gt;
&lt;li&gt;your full name, sex, age&lt;/li&gt;
&lt;li&gt;place and date of birth&lt;/li&gt;
&lt;li&gt;current place of residence&lt;/li&gt;
&lt;li&gt;current occupation&lt;/li&gt;
&lt;li&gt;address and contact details where you can be contacted in Japan (or
  details of friends/family who can be contacted in Japan). I gave my other
  in-laws' details.&lt;/li&gt;
&lt;li&gt;passport number, passport expiration date, the government who issued your
  current passport&lt;/li&gt;
&lt;li&gt;purpose of entry into Japan (in my case: living indefinitely)&lt;/li&gt;
&lt;li&gt;estimated date of entry into Japan&lt;/li&gt;
&lt;li&gt;how many times you came to Japan in the past&lt;/li&gt;
&lt;li&gt;persons who will accompany you while traveling to Japan&lt;/li&gt;
&lt;li&gt;whether you have a criminal record in your country of origin or in Japan&lt;/li&gt;
&lt;li&gt;if you have been deported out of Japan previously&lt;/li&gt;
&lt;li&gt;which immigration office you would like to apply in Japan&lt;/li&gt;
&lt;li&gt;details of your family in-law in Japan&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Part two mainly asks about your financial situation:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;which visa type are you applying for&lt;/li&gt;
&lt;li&gt;do you have any money in the bank (Japanese bank or foreign bank) to support
  yourself&lt;/li&gt;
&lt;li&gt;do you bring cash money into Japan and how much&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Part three of this form will ask, again, the details of your Guarantor.&lt;/p&gt;
&lt;p&gt;That is actually all the mandatory forms you have to fill in before your
application will be accepted. However, again, the more details you can ship with
your application the better.&lt;/p&gt;
&lt;p&gt;In my case, we added the following extra documents:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;A copy of the official "Marriage certificate" issued by our local government office here
  in Belgium.&lt;/li&gt;
&lt;li&gt;All the bank statements regarding the savings we built up over the past few years;
  money which we will use to bootstrap our Japan life.&lt;/li&gt;
&lt;/ul&gt;&lt;h3&gt;Trip to the immigration office&lt;/h3&gt;
&lt;p&gt;After you have filled in all these forms and supplied all the additional pieces
you are ready to submit this to your preferred immigration office in
Japan.&lt;/p&gt;
&lt;p&gt;In our case we lived in Belgium at the time of application, so we had to send
the whole package to my mother in law, who in turn did the application for us in
the local immigration office of Chiba (Japanese prefecture). When you go to the
office, you also need to bring an envelope they can use to send your certificate
to you if your application was accepted. &lt;/p&gt;
&lt;p&gt;When you apply, prepare a good book or any other waiting material because
waiting can take up to four hours or more.&lt;/p&gt;
&lt;p&gt;My mother in law went there in the early morning, just after opening hours,
and there was already a line of 20+ people. She had to wait for three
hours before getting to a clerk's desk. Also make sure you have filled in every
form; if not, they will check, point out your errors and send you back to the
end of the line for another few hours of waiting.&lt;/p&gt;
&lt;p&gt;If you have no direct errors in your forms they will assign you an application
number and give you a paper with the date of application and the immigration
office stamp together with a small leaflet of what will happen next.&lt;/p&gt;
&lt;h3&gt;The waiting&lt;/h3&gt;
&lt;p&gt;Good, you just applied and they inform you (like only a government can) that it
will take from two weeks up to three months for the application to be
processed. If the result is positive the envelope you provided will be filled
with a certificate and sent by special post to the Japanese address you provided
when applying.&lt;/p&gt;
&lt;p&gt;If it was negative they will call the Japanese contact person and tell the
reason why.&lt;/p&gt;
&lt;p&gt;After about seven weeks my mother in law received the certificate, bringing me
one step closer to living in Japan.&lt;/p&gt;
&lt;h3&gt;Applying for a visa (landing permission)&lt;/h3&gt;
&lt;p&gt;However, the certificate by itself is not worth a thing. It will need an
accompanying visa. This visa has to be provided by the Japanese embassy of the
country you are currently living in and thus moving away from.&lt;/p&gt;
&lt;p&gt;In my case this meant that my mother in law had to send the certificate via
secure post to my home here in Belgium, so another week passed.&lt;/p&gt;
&lt;p&gt;Please note that this step is heavily dependent on which country you are
living in. For most European countries this will be much the same. If you live in
countries like America or Russia, the local Japanese embassy will probably require
different items.&lt;/p&gt;
&lt;p&gt;Once your certificate has arrived you have to prepare the following items to bring
to your Japanese embassy:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Your current valid passport&lt;/li&gt;
&lt;li&gt;Recent photo&lt;/li&gt;
&lt;li&gt;Your certificate of eligibility&lt;/li&gt;
&lt;li&gt;Your ID card&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;When you arrive at your embassy you will get another application form, this time
to get your visa. This form is only one page with some general questions like
your full name, birth date and place, passport number, ...&lt;/p&gt;
&lt;p&gt;Give this form together with a recent photo, your passport and your certificate to an embassy
employee. In return you will get a small card with the date you can pick up your
passport containing the glued visa.&lt;/p&gt;
&lt;p&gt;This usually takes up to 3 business days. If you did not get any phone call in
those three days, you can go there, pick up everything and prepare your trip to
Japan.&lt;/p&gt;
&lt;h3&gt;Landing in Japan AKA The foreigner card&lt;/h3&gt;
&lt;p&gt;The next and final step is to actually say goodbye to your home country, take the airplane
and fly to your favorite airport in Japan. After getting off the airplane you usually have
to head over to the immigration booths where they will, as I mentioned in the beginning, 
sample some biometric data and stamp your passport.&lt;/p&gt;
&lt;p&gt;But for you it will be slightly different. In fact, there is a special waiting line
for people who enter Japan on the basis of a visa other than a tourist visa. When I landed
at Narita airport they even had different lines based on the types of visa you had
applied for.&lt;/p&gt;
&lt;p&gt;Once you reach the immigration officer she or he will check your glued passport visa and 
certificate of eligibility. If these are in order you have to wait for about 10 minutes
as they prepare your foreigner card. If your arrival is suspicious you will be taken
to a separate, private booth. What happens there is left to the imagination, but
 I think a rubber glove is included.&lt;/p&gt;
&lt;p&gt;I was lucky enough that everything checked out and I received my foreigner card together
with a few pamphlets telling me about the Japanese health care system, Japanese pension system, etc.&lt;/p&gt;
&lt;h3&gt;What is next?&lt;/h3&gt;
&lt;p&gt;Well, if you have come this far I think you have accomplished your goal: you moved to Japan.
Now everything is left up to you, learn the language, find a job, become a pupil of a bonsai
master, etc. Start your Japanese life.&lt;/p&gt;
&lt;p&gt;Some starting points:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Get your Japanese driver's license&lt;/li&gt;
&lt;li&gt;Go to your local city hall to enroll into the national health insurance&lt;/li&gt;
&lt;li&gt;Go to your local city hall to enroll into the national pension service&lt;/li&gt;
&lt;li&gt;Go to your local city hall to register your official Inkan&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;From time to time I will update you with enlightenments about my Japanese experiences, hopefully 
giving you useful tips and tricks to speed up your life over here or to help you consider 
(or reconsider) a possible move to Japan. Keep watching!&lt;/p&gt;
&lt;p&gt;And as always, thanks for reading!&lt;/p&gt;&lt;/div&gt;</description><category>japan</category><guid>http://shisaa.be/postset/moving-to-japan.html</guid><pubDate>Mon, 03 Mar 2014 10:00:00 GMT</pubDate></item><item><title>The Scheme programming language AKA The CHICKEN hens nest - Part 3</title><link>http://shisaa.be/postset/chicken-scheme-3.html</link><dc:creator>Tim van der Linden</dc:creator><description>&lt;div&gt;&lt;h3&gt;Chapter 3 - Wrapping the egg&lt;/h3&gt;
&lt;p&gt;And here we arrive at the final stage of our egg development.&lt;/p&gt;
&lt;p&gt;If you did not yet do so, please go and read &lt;a href="http://shisaa.be/postset/chicken-scheme-1.html" title="Chapter 1 of the CHICKEN series."&gt;chapter 1&lt;/a&gt; and &lt;a href="http://shisaa.be/postset/chicken-scheme-2.html" title="Chapter 2 of the CHICKEN series."&gt;chapter 2&lt;/a&gt; before you endeavor on this final hop to a published egg in CHICKEN.&lt;/p&gt;
&lt;p&gt;Let us find out what we will be dealing with:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Explain a little bit about &lt;em&gt;why the hell we need to write tests for our code&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Look at how we can actually write basic tests in CHICKEN&lt;/li&gt;
&lt;li&gt;While writing our tests, look at some new items like "let" and "apply"&lt;/li&gt;
&lt;li&gt;Create our setup file to automatically set up our egg when people install it&lt;/li&gt;
&lt;li&gt;Create the needed meta file for the CHICKEN egg system&lt;/li&gt;
&lt;li&gt;Create the release-info file for CHICKEN's code host independent deployment system &lt;/li&gt;
&lt;li&gt;Quickly compile and install our egg&lt;/li&gt;
&lt;li&gt;Submit the egg to CHICKEN&lt;/li&gt;
&lt;li&gt;Write the egg's documentation&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;You have got your battle axe ready? Then by all means...dive in!&lt;/p&gt;
&lt;h4&gt;Why tests?&lt;/h4&gt;
&lt;p&gt;First, let me explain to you the importance of writing tests for your code:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IT IS FRACKING IMPORTANT!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Ahum...sorry, got carried away there for a bit...let me rephrase that:&lt;/p&gt;
&lt;p&gt;Writing tests is rather smart to do.&lt;/p&gt;
&lt;p&gt;Why?&lt;/p&gt;
&lt;p&gt;Well, it is actually a cultural thing inside a programmer's mind or inside a community of programmers. Some (many) programmers find writing tests a cumbersome, time-consuming task.&lt;/p&gt;
&lt;p&gt;Their train of thought usually runs past &lt;em&gt;Lazy Ville&lt;/em&gt; and stops at &lt;em&gt;I'm Freaking Awesome - Don't Need Tests&lt;/em&gt; station.&lt;/p&gt;
&lt;p&gt;Phrases you often hear are: "My code works, screw your tests!" or "I can test my code better than a computer can" or "I wanted to write tests, but then my cat sat on my keyboard, switched on YouTube and watched other cats do silly things. It was awesome so I decided to not write tests tonight...I will do that first thing in the morning &lt;small&gt;after washing the car, letting out my cat, putting out the garbage, reading my awesome Facebook page, ...&lt;/small&gt;".&lt;/p&gt;
&lt;p&gt;The problem, however, is that we are all stupid dumb ass bags of watery flesh. Thinking that we never make mistakes and therefore do not need sufficient testing of our work is downright ignorant and even arrogant. We &lt;em&gt;all&lt;/em&gt; suck at programming. Some of us less, some of us more. We humans have a tendency to slack. We let our brains trick ourselves into thinking we always have the whole picture in our minds and therefore know what we are doing.&lt;/p&gt;
&lt;p&gt;In other engineering professions where folks build bridges or airplanes, there is a strong sense of not making the same mistake twice (and thus sparing a life or two). People learn from mistakes made in the past and invest a huge amount of time and money in testing their work before releasing it to the public. Somehow, in the programming world we dwell in, we tend to neglect all of these values as if we are some kind of super humans. We are not.&lt;/p&gt;
&lt;p&gt;Good programmers apply the same engineering values as our neighboring colleagues: they learn from others' mistakes and they test their work.&lt;/p&gt;
&lt;p&gt;Got it?&lt;/p&gt;
&lt;p&gt;Good...then let us write some tests!&lt;/p&gt;
&lt;h4&gt;Testing in CHICKEN&lt;/h4&gt;
&lt;p&gt;Because testing is so vital for every serious project, the CHICKEN community has put a lot of effort into making this as trivial as possible to set up.
Especially the work of &lt;a href="http://wiki.call-cc.org/users/alex-shinn" title="CHICKEN User page of Alex Shinn."&gt;Alex Shinn&lt;/a&gt; and  &lt;a href="http://wiki.call-cc.org/users/mario-domenech-goulart" title="CHICKEN User page of Mario Domenech Goulart."&gt;Mario Domenech Goulart&lt;/a&gt; made our CHICKEN testing lives a lot easier.&lt;/p&gt;
&lt;p&gt;Alex created an egg called "&lt;em&gt;Test&lt;/em&gt;" which gives us a bunch of handy procedures to easily write tests for our code. Mario has written an &lt;em&gt;immense&lt;/em&gt; egg neatly called "&lt;em&gt;Salmonella&lt;/em&gt;" that you can use to automatically test your egg. Once you submit your egg to CHICKEN, the server will run Salmonella automatically every night on all submitted eggs. It will generate a report that you can check at any time to see if your egg fails its tests.&lt;/p&gt;
&lt;p&gt;For my JSON-RPC egg I decided to write two different kinds of tests:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;A normal test where we tell CHICKEN which result we expect with the given code&lt;/li&gt;
&lt;li&gt;An error test where we explicitly test if something throws an error&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Let the writing commence!&lt;/p&gt;
&lt;p&gt;All of your tests go into one file called &lt;em&gt;run.scm&lt;/em&gt; and for Salmonella to find your tests that file has to be placed inside a directory of your egg called &lt;em&gt;tests&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;In the &lt;em&gt;run.scm&lt;/em&gt; the first thing you have to put is a line that will actually load the test suite for you:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt; &lt;span class="nv"&gt;test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Next we need to load in our egg itself, since this is an external test file. Once your egg is compiled you can load it using the &lt;em&gt;use&lt;/em&gt; procedure like so:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt; &lt;span class="nv"&gt;json-rpc-client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We can, of course, pull these two lines together to form:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt; &lt;span class="nv"&gt;test&lt;/span&gt; &lt;span class="nv"&gt;json-rpc-client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This will not yet work since we still have to compile and install our egg. First, let us write the test file.&lt;/p&gt;
&lt;p&gt;When testing it is a good idea to divide your tests into groups so that the reports that will roll out later can group your tests to make everything more readable.
For this we can use &lt;em&gt;test-group&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-group&lt;/span&gt; &lt;span class="s"&gt;"A name for your group"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You define a name for the group and then list all the tests you would like to perform.
In my case, I would like to test if the JSON-RPC call I do gets formatted to correct JSON-RPC. Therefore I named my first group "JSON-RPC string output checks".
We will first concentrate on writing some tests and later wrap this group around them.&lt;/p&gt;
&lt;p&gt;A normal test is very easy to set up and uses easy to read syntax:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt; &lt;span class="s"&gt;"Description of the test"&lt;/span&gt; &lt;span class="s"&gt;"The result you want to see"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;procedure&lt;/span&gt; &lt;span class="nv"&gt;to&lt;/span&gt; &lt;span class="nv"&gt;test&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We start a test by using the "test" procedure and we give it a description.
Next we set the result that we &lt;em&gt;expect&lt;/em&gt; to see if our code is correct. This can be a string, a list, anything that could be returned.
Finally we call the procedure to be tested.&lt;/p&gt;
&lt;p&gt;We want to test that the final JSON-RPC string, that will be sent to the server, is correctly formatted.
During testing we generally do not have access to a real JSON-RPC server, so we cannot set up any real world ports to communicate over. Instead we need to set up a different kind of port.
This kind of port is called a &lt;em&gt;string port&lt;/em&gt;:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;An &lt;em&gt;input&lt;/em&gt; string port holds a string that you define&lt;/li&gt;
&lt;li&gt;An &lt;em&gt;output&lt;/em&gt; string port captures strings written to it that you can read out&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Once we have these string ports set up we can use them to catch the string that normally would be sent to a JSON-RPC server and compare it with a string we defined.
Before testing, we have to define the ports and set up the JSON-RPC connection, in our case called "xbmc":&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open-input-string&lt;/span&gt; &lt;span class="s"&gt;"some-string"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open-output-string&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;xbmc&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;
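&lt;p&gt;To get a feeling for the input side, here is a minimal sketch that uses only the standard "read-char" procedure. The names "in", "c1", "c2" and "c3" are made up for this illustration and are not part of the egg:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; An input string port hands out the characters of the string you gave it.
(define in (open-input-string "ab"))
(define c1 (read-char in))   ; the character a
(define c2 (read-char in))   ; the character b
(define c3 (read-char in))   ; the end-of-file object, the string is used up
&lt;/pre&gt;
&lt;p&gt;Reading from such a port behaves like reading from a file or a socket, which is why it can stand in for a real server connection during testing.&lt;/p&gt;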


&lt;p&gt;Now we can throw something at the simulated server and check how it comes out the other end:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;xbmc&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt; &lt;span class="s"&gt;"Call with only a method"&lt;/span&gt; &lt;span class="s"&gt;"{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"id\":\"1\"}"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-output-string&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We call "xbmc" with only a method. Then we call "test", give it a description, give the string we are expecting and finally read out the output port that our xbmc call used with the build in "get-output-string" procedure.&lt;/p&gt;
&lt;p&gt;While this test works fine, the whole is not very flexible.
Let me illustrate this by showing you what happens when we create a second test in our "run.scm" file:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;xbmc&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt; &lt;span class="nv"&gt;playerid:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt; &lt;span class="s"&gt;"Call with a method and a one dimensional params"&lt;/span&gt; &lt;span class="s"&gt;"{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"params\":{\"playerid\":0},\"id\":\"1\"}"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-output-string&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;With this test, we not only check the method, but we also give a param and check how that turns out. This is also a perfectly legal and sensible test.&lt;/p&gt;
&lt;p&gt;The problem here is that the &lt;em&gt;output&lt;/em&gt; port we defined above will contain both the output of the first &lt;em&gt;and&lt;/em&gt; the second test. It is a characteristic of the string output port that it will accumulate the strings that it receives. So when we read from that port we will again receive the result of the first test followed by the result of the second test, which is the one we are actually interested in.&lt;/p&gt;
&lt;p&gt;This, of course, means that our second test will fail, because the string comparison will be #f (false).&lt;/p&gt;
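&lt;p&gt;You can see this accumulating behaviour without any JSON-RPC code at all. The following is just a small sketch with a plain output string port; the names "port", "after-one" and "after-two" are invented for the demonstration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; One shared output string port keeps everything that was ever written to it.
(define port (open-output-string))

(display "first" port)
(define after-one (get-output-string port))   ; "first"

(display "second" port)
(define after-two (get-output-string port))   ; "firstsecond"
&lt;/pre&gt;
&lt;p&gt;Reading the port does not empty it, so the second read still starts with what the first write left behind.&lt;/p&gt;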
&lt;p&gt;To solve this problem, we need to create our own little test procedure specifically tailored for our JSON-RPC server. This procedure will just be a small wrapper around the normal "test" procedure, but will take care of the accumulating output port problem described above. Let me present the code to you:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="nv"&gt;description&lt;/span&gt; &lt;span class="nv"&gt;expected&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;output&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open-output-string&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;
            &lt;span class="nv"&gt;message&lt;/span&gt;
            &lt;span class="nv"&gt;expected&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-output-string&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Okay, there are some new things in here, so let us break this down line by line.&lt;/p&gt;
&lt;p&gt;The first line:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="nv"&gt;description&lt;/span&gt; &lt;span class="nv"&gt;expected&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This is nothing new, we define a new procedure called "test-server" which takes three mandatory arguments followed by an optional "rest" argument.
The first argument will be the description we print in our test report, the second argument is the "datum" we expect.
The third is the familiar method, and everything after the dot is collected into the list "params", which holds the optional params we use in our testing.&lt;/p&gt;
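&lt;p&gt;If the dot in the argument list looks unfamiliar, this little sketch may help. The procedure "show-args" is purely hypothetical and only exists to show how the arguments after the dot end up in a list:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; "first" is mandatory, "rest" collects all remaining arguments in a list.
(define (show-args first . rest)
  (list first rest))

(define only-one (show-args 1))       ; (1 ())
(define several  (show-args 1 2 3))   ; (1 (2 3))
&lt;/pre&gt;
&lt;p&gt;When no extra arguments are given, the list after the dot is simply empty.&lt;/p&gt;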
&lt;p&gt;Next line:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;output&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open-output-string&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Here we have a new "thing" in sight: &lt;em&gt;let&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;"Let" is actually not a procedure, but a "special form" that is quite commonly used when programming in CHICKEN and is needed for &lt;em&gt;local scoping&lt;/em&gt; of your variables.
The difference between procedures and special forms is (way) beyond the scope of this chapter, just know there is a difference between the two.&lt;/p&gt;
&lt;h5&gt;Local scoping?&lt;/h5&gt;
&lt;p&gt;In almost all programming languages you have a concept called &lt;em&gt;"scoping"&lt;/em&gt; and it is nothing more than the name suggests. It is the process of keeping parts of your code (variables, procedures, ...) "hidden" from other parts of your code. When you scope variables, you make them available to only a certain part of your program. Other parts of your code do not even know that those variables exist and thus cannot read or overwrite them.&lt;/p&gt;
&lt;p&gt;In CHICKEN, and most other Scheme languages, "let" will do that job for you. How does "let" work? Simple:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;(let ((foo "a local string")
      (bar "another local string"))
     (cons foo bar))
&lt;/pre&gt;


&lt;p&gt;In a "let" you have to define your local variables first, in our case "foo" and "bar". These definitions form the first argument.
The second argument to "let" is the procedure body; the code that will be executed using your locally stored variables.
Code that resides outside of the "let" never knows the existence of these local variables "foo" and "bar".&lt;/p&gt;
&lt;p&gt;In our case, "let" can help us by making the output port a local one so that every time the "let" is called, a new output port will be created and thus will only contain one string at a time.&lt;/p&gt;
&lt;p&gt;The next line is the start of the body of the "let":&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Here we encounter yet another new player in town: &lt;em&gt;"apply"&lt;/em&gt;. This procedure takes an "infinite" amount of arguments; the first always being a procedure and the rest being the arguments that procedure will be applied to. A very important thing to note about the arguments is that the last argument &lt;em&gt;has&lt;/em&gt; to be a list, otherwise "apply" does not work.&lt;/p&gt;
&lt;p&gt;"Apply" will cons the arguments you put in between its first and last argument onto the last argument, before calling the given procedure with those arguments. Let me demonstrate:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="nv"&gt;some-procedure&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now "apply" simply has two arguments, a fictional procedure called "some-procedure" and a list of three arguments '(1 2 3). It is allowed to put additional arguments &lt;em&gt;between&lt;/em&gt; the two given here:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="nv"&gt;some-procedure&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now three extra arguments will be placed in between the original procedure and the original, last argument '(1 2 3). "Apply" will first cons every argument onto the last list given:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="nv"&gt;some-procedure&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Then it will take that list and call the procedure with every item, as if they were the actual arguments to that procedure.&lt;/p&gt;
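&lt;p&gt;To try this with real procedures instead of the fictional "some-procedure", here is a short sketch using the standard "+" and "list" procedures. The variable names are only there so you can inspect the results:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; Only a list: the same as calling (+ 1 2 3).
(define sum-a (apply + '(1 2 3)))                     ; 6

;; Extra arguments in between: the same as calling (+ 4 5 1 2 3).
(define sum-b (apply + 4 5 '(1 2 3)))                 ; 15

;; "list" shows the final argument list that the procedure receives.
(define args  (apply list 4 5 "string" '(1 2 3)))     ; (4 5 "string" 1 2 3)
&lt;/pre&gt;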
&lt;p&gt;In our case, we have the procedure "json-rpc-server" with its own arguments, one of which is the locally scoped output port. Because this procedure will return a lambda expecting a method and optional params, we can use "apply" to give these arguments to this returned lambda. "Apply" will in turn return the result of the lambda with the method and params applied.&lt;/p&gt;
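&lt;p&gt;The same pattern can be sketched without the egg. The procedure "make-caller" below is a hypothetical stand-in for "json-rpc-server"; it only mimics the shape (a procedure that returns a lambda taking a method and optional params) and does no real JSON-RPC work:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; Hypothetical stand-in: returns a lambda expecting a method and optional params.
(define (make-caller version)
  (lambda (method . params)
    (list version method params)))

;; "apply" hands the method and the params list to the returned lambda.
(define result (apply (make-caller "2.0") "Player.PlayPause" '(1 2)))
;; result is ("2.0" "Player.PlayPause" (1 2))
&lt;/pre&gt;
&lt;p&gt;The first argument to "apply" does not have to be a name; any expression that evaluates to a procedure will do.&lt;/p&gt;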
&lt;p&gt;Next we have the rest of our "let" body:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;
    &lt;span class="nv"&gt;description&lt;/span&gt;
    &lt;span class="nv"&gt;expected&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-output-string&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Here we simply do what we did before: we call the "test" procedure, give it a description, tell it what we expect to get and finally read out the local output string port to compare with.&lt;/p&gt;
&lt;p&gt;Let us use this newly created procedure in our test:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="s"&gt;"Call with only a method"&lt;/span&gt; &lt;span class="s"&gt;"{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"id\":\"1\"}"&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now we have a piece of code that is a little bit more readable and tailored to our JSON-RPC egg:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;We call the test-server procedure&lt;/li&gt;
&lt;li&gt;First argument is just a description of what we will be testing&lt;/li&gt;
&lt;li&gt;Second is the string we expect to get&lt;/li&gt;
&lt;li&gt;Last is, in this case, the method we want to test our "json-rpc-server" procedure with&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;One more thing to note is the escaping that you may have noticed in the string we expect. Because a double quote " means something in CHICKEN (it wraps a string) we need to escape it inside the string by using a backslash. The JSON-RPC valid JSON string we use here normally looks like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nt"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And with the double quotes escaped, it becomes this CHICKEN string:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="s"&gt;"{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"id\":\"1\"}"&lt;/span&gt;
&lt;/pre&gt;
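&lt;p&gt;A quick way to convince yourself that both forms describe the same text is to print the escaped string in "csi". This is only a small sketch, and the name "expected" is chosen purely for illustration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;(define expected "{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"id\":\"1\"}")

;; "display" prints the contents of the string: no surrounding quotes
;; and no backslashes, just the plain JSON
(display expected)
(newline)
&lt;/pre&gt;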


&lt;p&gt;Now we have a valid test for the case where we call our procedure with only a method. Let us now write one with a method and a single param:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="s"&gt;"Call with a method and a one dimensional params"&lt;/span&gt; &lt;span class="s"&gt;"{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"params\":{\"playerid\":0},\"id\":\"1\"}"&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt; &lt;span class="nv"&gt;playerid:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Nothing strange happening here. We give it a different description and we expect a different string, this time including params. At the end we pass the param itself: the keyword "playerid:" followed by its value 0.&lt;/p&gt;
&lt;p&gt;Let us now put these two tests in a group, like we said we would do a moment ago:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-group&lt;/span&gt; &lt;span class="s"&gt;"JSON-RPC string output checks"&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="s"&gt;"Call with only a method"&lt;/span&gt; &lt;span class="s"&gt;"{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"id\":\"1\"}"&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="s"&gt;"Call with a method and a one dimensional params"&lt;/span&gt; &lt;span class="s"&gt;"{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"params\":{\"playerid\":0},\"id\":\"1\"}"&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt; &lt;span class="nv"&gt;playerid:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;There, this does not look scary, right? We have a test group called "JSON-RPC string output checks" containing the two tests we have just written.&lt;/p&gt;
&lt;p&gt;You know what? That is all there is to basic testing. We have just written a few simple tests to verify the whole purpose of the JSON-RPC client side egg.&lt;/p&gt;
&lt;p&gt;The full code that we have in our "run.scm" up until now looks like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt; &lt;span class="nv"&gt;test&lt;/span&gt; &lt;span class="nv"&gt;json-rpc-client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open-input-string&lt;/span&gt; &lt;span class="s"&gt;"just-a-test-string"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open-output-string&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;xbmc&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="nv"&gt;description&lt;/span&gt; &lt;span class="nv"&gt;expected&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;output&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open-output-string&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;
            &lt;span class="nv"&gt;description&lt;/span&gt;
            &lt;span class="nv"&gt;expected&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-output-string&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-group&lt;/span&gt; &lt;span class="s"&gt;"JSON-RPC string output checks"&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="s"&gt;"Call with only a method"&lt;/span&gt; &lt;span class="s"&gt;"{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"id\":\"1\"}"&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="s"&gt;"Call with a method and a one dimensional params"&lt;/span&gt; &lt;span class="s"&gt;"{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"params\":{\"playerid\":0},\"id\":\"1\"}"&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt; &lt;span class="nv"&gt;playerid:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;A few things to note about the above snippet:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;We use "use" in the beginning, but this will not work until we actually compile and install our egg.&lt;/li&gt;
&lt;li&gt;We define a &lt;em&gt;global&lt;/em&gt; input and output port plus a JSON-RPC connection called xbmc. We will need these later on.&lt;/li&gt;
&lt;li&gt;The input port we defined globally is used in our "test-server" procedure, but the output port is locally scoped in that same procedure.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The next kind of tests we want to write are error tests. This sort of test does exactly what you would expect: it tests if something gives an error.
In our case we need these kinds of tests to see if our error handlers work correctly.&lt;/p&gt;
&lt;p&gt;The first error handler we wrote is the one for our main "json-rpc-server" procedure. So to write an error test for this, we simply have to do an erroneous call.
If we recall our "json-rpc-server" procedure, we need three things to correctly set up the connection:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;A valid input port&lt;/li&gt;
&lt;li&gt;A valid output port&lt;/li&gt;
&lt;li&gt;A correct version number (a string that equals "2.0")&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;If we would like to test if it fails when we use a faulty input port, we would write the test like so:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-error&lt;/span&gt; &lt;span class="s"&gt;"Non port call on input"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="s"&gt;"input"&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The syntax looks quite familiar. We call the procedure "test-error" instead of "test" and give it a description. Then we simply call our procedure with some kind of error in it.
In this case we give the string &lt;em&gt;"input"&lt;/em&gt; instead of the defined variable &lt;em&gt;input&lt;/em&gt;. This string is not a valid input port of course, so our error handler fires and we get an error.
Getting an error in this case means that the test will pass!&lt;/p&gt;
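&lt;p&gt;If you want to try "test-error" before the egg is installed, a self-contained sketch could look like this. The "always-fails" procedure is invented for the occasion and simply raises an error by calling "error":&lt;/p&gt;
&lt;pre class="code literal-block"&gt;(use test)

;; Hypothetical procedure that always raises an error
(define (always-fails)
    (error "Something went wrong"))

;; This test passes, because evaluating the expression raises an error
(test-error "Call that raises an error" (always-fails))
&lt;/pre&gt;

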
&lt;p&gt;In my test I include some more error checks:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-error&lt;/span&gt; &lt;span class="s"&gt;"Non port call on output"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="s"&gt;"output"&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-error&lt;/span&gt; &lt;span class="s"&gt;"Non correct version number call"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"3.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And of course, this can become a group, giving you this code:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-group&lt;/span&gt; &lt;span class="s"&gt;"Non-port or non-version calls"&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-error&lt;/span&gt; &lt;span class="s"&gt;"Non port call on input"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="s"&gt;"input"&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-error&lt;/span&gt; &lt;span class="s"&gt;"Non port call on output"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="s"&gt;"output"&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-error&lt;/span&gt; &lt;span class="s"&gt;"Non correct version number call"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"3.0"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;A tricky thing to note about error tests is that &lt;em&gt;any&lt;/em&gt; kind of failure will make the test pass. This can be very misleading because even real errors in your code can cause a failure to occur and the test to pass. To set up these error tests more correctly, I should also check if the error I get back is of the custom type I defined specifically for the JSON-RPC egg. This way I know for sure that it is &lt;em&gt;my&lt;/em&gt; custom error handler that raises the error, and not something else. But that kind of setup is beyond the scope of this chapter. For now just remember to be careful when writing error tests.&lt;/p&gt;
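&lt;p&gt;For the curious, here is a rough sketch of what such a stricter check could look like, using CHICKEN's "condition-case" to look at the kind of condition that was raised. The condition kind "json-rpc" and the procedures "faulty-call" and "error-kind" are assumptions made up for this illustration; they are not taken from the egg's source:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;(use test)

;; Hypothetical stand-in for a call that trips our own error handler:
;; it raises a composite condition of the kinds "exn" and "json-rpc"
(define (faulty-call)
    (abort (make-composite-condition
        (make-property-condition 'exn 'message "Not a valid input port")
        (make-property-condition 'json-rpc))))

;; Run a thunk and translate whatever it raises into a symbol
(define (error-kind thunk)
    (condition-case (begin (thunk) 'no-error)
        ((exn json-rpc) 'json-rpc-error)
        (() 'other-error)))

;; Only a condition of our own kind counts as the expected failure
(test "Custom condition is raised" 'json-rpc-error (error-kind faulty-call))
(test "Other errors are told apart" 'other-error (error-kind (lambda () (error "boom"))))
&lt;/pre&gt;


&lt;p&gt;The point of the design is that a plain "test" now compares against a specific symbol, so an unrelated failure no longer slips through as a pass.&lt;/p&gt;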
&lt;p&gt;A very important last step in your "run.scm" file is to end the file with the following code:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-exit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;If you omit this line the automated tests on the server will fail, so it is vital you end your file with this one.&lt;/p&gt;
&lt;p&gt;Good, we finished writing our tests!&lt;/p&gt;
&lt;p&gt;To recapitulate, here is the total code we have written so far:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt; &lt;span class="nv"&gt;test&lt;/span&gt; &lt;span class="nv"&gt;json-rpc-client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open-input-string&lt;/span&gt; &lt;span class="s"&gt;"some-string"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open-output-string&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;xbmc&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="nv"&gt;description&lt;/span&gt; &lt;span class="nv"&gt;expected&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;let &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;output&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open-output-string&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;apply &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;
            &lt;span class="nv"&gt;description&lt;/span&gt;
            &lt;span class="nv"&gt;expected&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-output-string&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-group&lt;/span&gt; &lt;span class="s"&gt;"JSON-RPC string output checks"&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="s"&gt;"Call with only a method"&lt;/span&gt; &lt;span class="s"&gt;"{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"id\":\"1\"}"&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-server&lt;/span&gt; &lt;span class="s"&gt;"Call with a method and a one dimensional params"&lt;/span&gt; &lt;span class="s"&gt;"{\"jsonrpc\":\"2.0\",\"method\":\"Player.PlayPause\",\"params\":{\"playerid\":0},\"id\":\"1\"}"&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt; &lt;span class="nv"&gt;playerid:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-group&lt;/span&gt; &lt;span class="s"&gt;"Non-port or non-version calls"&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-error&lt;/span&gt; &lt;span class="s"&gt;"Non port call on input"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="s"&gt;"input"&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-error&lt;/span&gt; &lt;span class="s"&gt;"Non port call on output"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="s"&gt;"output"&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-error&lt;/span&gt; &lt;span class="s"&gt;"Non correct version number call"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="s"&gt;"3.0"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-exit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;But before we can run our tests, we first have to build some extra files that in total will define our egg and make it able to be compiled.&lt;/p&gt;
&lt;h4&gt;The setup file&lt;/h4&gt;
&lt;p&gt;file: &lt;em&gt;"json-rpc.setup"&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;This file mainly contains information for the compiler and for the CHICKEN install system. Check the &lt;a href="http://wiki.call-cc.org/eggs%20tutorial#the-setup-file" title="Setup file wiki page on CHICKEN."&gt;wiki page&lt;/a&gt; for more information.&lt;/p&gt;
&lt;p&gt;The JSON-RPC egg's setup file looks like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt; &lt;span class="nv"&gt;-s&lt;/span&gt; &lt;span class="nv"&gt;-O2&lt;/span&gt; &lt;span class="nv"&gt;-d1&lt;/span&gt; &lt;span class="nv"&gt;json-rpc-client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;scm&lt;/span&gt; &lt;span class="nv"&gt;-j&lt;/span&gt; &lt;span class="nv"&gt;json-rpc-client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt; &lt;span class="nv"&gt;-s&lt;/span&gt; &lt;span class="nv"&gt;json-rpc-client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;import&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;scm&lt;/span&gt; &lt;span class="nv"&gt;-O2&lt;/span&gt; &lt;span class="nv"&gt;-d0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;install-extension&lt;/span&gt;
    &lt;span class="ss"&gt;'json-rpc&lt;/span&gt;
    &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"json-rpc-client.so"&lt;/span&gt; &lt;span class="s"&gt;"json-rpc-client.import.so"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"0.1.4"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;documentation&lt;/span&gt; &lt;span class="s"&gt;"json-rpc.html"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The first two lines are the lines the compiler will use to compile your "scm" files, by way of raw C code, into shared objects (the "so" files).
The top line contains the CHICKEN file we have been working on in chapters one and two. The flags that are set here are a sensible default; what they mean is beyond the scope of these posts. You can, in most cases, just use these flags as is.&lt;/p&gt;
&lt;p&gt;The second line contains a CHICKEN file called "json-rpc-client.import.scm" which we did not create but which will be automatically created for you when the first line is compiled (this is what the "-j" flag asks for).
This file will contain information for the CHICKEN module system and also needs to be compiled.&lt;/p&gt;
&lt;p&gt;The rest of the file is occupied by a list containing some meta data about your egg. Let me break this down for you:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;'json-rpc - This is the actual name of your egg&lt;/li&gt;
&lt;li&gt;'("json-rpc-client.so" "json-rpc-client.import.so") - These will be the compiled files: the two CHICKEN files we mentioned above, but with the extension "so" instead of "scm"&lt;/li&gt;
&lt;li&gt;'(version "0.1.4") - The version that you wish to compile. Every time you update the version you should &lt;em&gt;change&lt;/em&gt; this here as well&lt;/li&gt;
&lt;li&gt;'(documentation "json-rpc.html") - The place where CHICKEN can find your documentation. I have placed my documentation on the CHICKEN Wiki, which is the standard. If you place it on the wiki, all you need to do is to put the name of your egg (in this case "json-rpc") with the extension ".html" in there. It is important to also actually create that page on the wiki. We will do this in a bit.&lt;/li&gt;
&lt;/ul&gt;&lt;h4&gt;The meta file&lt;/h4&gt;
&lt;p&gt;file: &lt;em&gt;"json-rpc.meta"&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The meta file contains all kinds of information related to your egg. This information will later be displayed on the CHICKEN egg pages. Check the &lt;a href="http://wiki.call-cc.org/eggs%20tutorial#the-meta-file" title="Meta file wiki page on CHICKEN."&gt;wiki page&lt;/a&gt; for more information.
In the case of my JSON-RPC egg I input the following information:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;egg&lt;/span&gt; &lt;span class="s"&gt;"json-rpc.egg"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;synopsis&lt;/span&gt; &lt;span class="s"&gt;"JSON RPC client/server implementation"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;category&lt;/span&gt; &lt;span class="nv"&gt;web&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;needs&lt;/span&gt; &lt;span class="nv"&gt;medea&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;test-depends&lt;/span&gt; &lt;span class="nv"&gt;test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;doc-from-wiki&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;license&lt;/span&gt; &lt;span class="s"&gt;"BSD"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;author&lt;/span&gt; &lt;span class="s"&gt;"Tim van der Linden"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Let us go over them:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Give the egg's name, together with the extension ".egg"&lt;/li&gt;
&lt;li&gt;Write a short synopsis about the function of your egg&lt;/li&gt;
&lt;li&gt;Say which category the egg belongs to. The different categories are listed &lt;a href="http://wiki.call-cc.org/eggs%20tutorial#the-setup-file" title="Egg categories wiki page on CHICKEN."&gt;here&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Tell CHICKEN which &lt;em&gt;non-core&lt;/em&gt; dependencies your egg has. In the case of the JSON-RPC egg, we have "srfi-1", "medea" and "extras", but only "medea" is a non-core dependency.&lt;/li&gt;
&lt;li&gt;Tell CHICKEN which eggs are only needed to run your tests. In our case that is the "test" egg we use in "run.scm"&lt;/li&gt;
&lt;li&gt;Give the type of documentation you have. If your documentation resides on the CHICKEN Wiki, you need to put this line in&lt;/li&gt;
&lt;li&gt;Tell about the license you have included in your ".scm" files&lt;/li&gt;
&lt;li&gt;Tell them who the author is&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;That is all that goes in the meta file, save and close it.&lt;/p&gt;
&lt;h4&gt;The release-info file&lt;/h4&gt;
&lt;p&gt;file: &lt;em&gt;"release-info"&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;CHICKEN has the unique ability to be totally code host independent.&lt;/p&gt;
&lt;p&gt;You can put your code on CHICKEN's own SVN repositories or on your favorite code hosting site and the CHICKEN egg server will pull all information from there.
The "release-info" file contains information about where your code is hosted and which version numbers you have available.
Even when your third party code host is down, CHICKEN still has a local copy of the latest available version of your egg, so people installing your egg can still continue their development.&lt;/p&gt;
&lt;p&gt;Depending on which code hosting site you use there are different settings you have to configure. There is an &lt;a href="http://wiki.call-cc.org/releasing-your-egg#creating-a-release-info-file" title="Release-info file creation on the CHICKEN wiki."&gt;extensive page&lt;/a&gt; about how to set up your release-info for your code host.&lt;/p&gt;
&lt;p&gt;I personally resent the hype around GIT (among other things) and chose Mercurial as my version control system and Bitbucket as my host. So the settings in the release-info file became:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;repo&lt;/span&gt; &lt;span class="nv"&gt;hg&lt;/span&gt; &lt;span class="s"&gt;"https://bitbucket.org/Timusan/{egg-name}"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uri&lt;/span&gt; &lt;span class="nv"&gt;targz&lt;/span&gt; &lt;span class="s"&gt;"https://bitbucket.org/Timusan/{egg-name}/get/{egg-release}.tar.gz"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;release&lt;/span&gt; &lt;span class="s"&gt;"0.1"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;release&lt;/span&gt; &lt;span class="s"&gt;"0.1.1"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;What this means:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;The first line is the URL to the egg's main location&lt;/li&gt;
&lt;li&gt;Then we have the URL to the gzipped tar files for each egg release I make (using Mercurial tags)&lt;/li&gt;
&lt;li&gt;And finally we have the different versions listed that are released&lt;/li&gt;
&lt;/ul&gt;&lt;h4&gt;Doing a test install&lt;/h4&gt;
&lt;p&gt;Okay, we now have the correct environment to install our egg via the &lt;em&gt;chicken-install&lt;/em&gt; command we used in chapter one.
Make sure you are in the main directory of your egg and that you are root. Then simply call "chicken-install" without any arguments:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;chicken-install
&lt;/pre&gt;


&lt;p&gt;Or if you do not want to become root, simply use the "-s" switch to temporarily sudo the install:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;chicken-install -s
&lt;/pre&gt;


&lt;p&gt;The installer will now look in the current directory for all the needed files and install your egg.&lt;/p&gt;
&lt;p&gt;If all went well you will have some compiler output and your egg is compiled and installed in your local CHICKEN ecosystem.
This means you can now run your tests!&lt;/p&gt;
&lt;p&gt;You could run "Salmonella", which will not only run your "run.scm" tests file but will also check, among other things, the availability of the documentation.
Since our egg is not published yet, those checks would fail. By simply running the "run.scm" file by itself, however, we can run our tests without anything interfering.&lt;/p&gt;
&lt;p&gt;So go into your "tests" directory and run your CHICKEN file with "csi":&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;csi run.scm
&lt;/pre&gt;


&lt;p&gt;CHICKEN will now print a neat little report with the results of our tests. Cool!&lt;/p&gt;
&lt;h4&gt;Publishing &amp;amp; Documentation&lt;/h4&gt;
&lt;p&gt;You know what? You are ready!&lt;/p&gt;
&lt;p&gt;The only thing left for you to do now is to first commit your code to your favorite code hosting site, then announce the existence of your egg and finally write the documentation for it.&lt;/p&gt;
&lt;p&gt;Before you can finally publish your egg for the world to see, you &lt;em&gt;have&lt;/em&gt; to write some form of documentation.
Since it is common to write the documentation on the CHICKEN Wiki itself, you will first have to ask for an account so you can properly access the Wiki.&lt;/p&gt;
&lt;p&gt;After committing your code, send an email to the &lt;a href="https://lists.nongnu.org/mailman/listinfo/chicken-users" title="The CHICKEN Users mailing list."&gt;CHICKEN Users&lt;/a&gt; mailing list with the location to your egg so the CHICKEN peeps can add your egg into the system.&lt;/p&gt;
&lt;p&gt;The request for your Wiki account is best mailed to the private address of &lt;a href="http://wiki.call-cc.org/users/mario-domenech-goulart" title="CHICKEN User page of Mario Domenech Goulart."&gt;Mario Domenech Goulart&lt;/a&gt;. Check out &lt;a href="http://wiki.call-cc.org/contribute" title="The CHICKEN egg contribute page."&gt;this&lt;/a&gt; page to find his email address. Mail him your desired user name and your hashed password. To generate the hash for your password, use the "OpenSSL" program:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;openssl passwd -apr1 your-password-here
&lt;/pre&gt;


&lt;p&gt;Once you get confirmation of your account creation, you can start writing.&lt;/p&gt;
&lt;p&gt;The way this is usually done is to create a page on the Wiki with the same name as your egg.
For the JSON-RPC egg, this would be:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;http://wiki.call-cc.org/eggref/4/json-rpc
&lt;/pre&gt;


&lt;p&gt;Say, for the sake of example, that your egg's name is "foo-bar". You would then surf to:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;http://wiki.call-cc.org/eggref/4/foo-bar
&lt;/pre&gt;


&lt;p&gt;The Wiki will tell you that that page does not exist, but it also gives you the chance to create it.
Create the page and be sure to authenticate with your fresh, new credentials when saving.&lt;/p&gt;
&lt;p&gt;To get inspiration on how to write the documentation for your egg, check out the documentation pages of other contributors. Make sure that you always include the following items:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;A clear description of your egg&lt;/li&gt;
&lt;li&gt;The name of the author&lt;/li&gt;
&lt;li&gt;The non-core dependencies your egg has&lt;/li&gt;
&lt;li&gt;A link to the external repository where your egg's code is hosted&lt;/li&gt;
&lt;li&gt;Documentation about the procedures the users of your egg can use&lt;/li&gt;
&lt;li&gt;Some real-world examples&lt;/li&gt;
&lt;li&gt;Version history&lt;/li&gt;
&lt;li&gt;A copy of the license you use&lt;/li&gt;
&lt;/ul&gt;&lt;h4&gt;The end&lt;/h4&gt;
&lt;p&gt;Okay, that is it folks, we have a working, tested and published egg!&lt;/p&gt;
&lt;p&gt;I hope this three-chapter CHICKEN saga has brought you a little bit closer to CHICKEN or Scheme and that you can gradually find out the power behind this tiny language.&lt;/p&gt;
&lt;p&gt;Now go out and create some awesome eggs for the whole community to enjoy! And remember, if you get stuck, have questions or just want a little chat, there is a great CHICKEN community out there ready to help you out.&lt;/p&gt;
&lt;p&gt;And as always...thanks for reading!&lt;/p&gt;&lt;/div&gt;</description><category>chicken</category><category>json rpc</category><category>scheme</category><guid>http://shisaa.be/postset/chicken-scheme-3.html</guid><pubDate>Sat, 21 Sep 2013 10:00:00 GMT</pubDate></item><item><title>The Scheme programming language AKA The CHICKEN hens nest - Part 2</title><link>http://shisaa.be/postset/chicken-scheme-2.html</link><dc:creator>Tim van der Linden</dc:creator><description>&lt;div&gt;&lt;h3&gt;Chapter 2 - Laying the egg&lt;/h3&gt;
&lt;p&gt;Welcome to the second installment of this introduction to the CHICKEN programming language.&lt;/p&gt;
&lt;p&gt;If you haven't done so, I encourage you to go and read &lt;a href="http://shisaa.be/postset/chicken-scheme-1.html" title="Chapter 1 of the CHICKEN series."&gt;chapter 1&lt;/a&gt; before you continue.&lt;/p&gt;
&lt;p&gt;In the first chapter we talked about the programming language CHICKEN, which is a Scheme implementation. To introduce you to this language I decided to guide you through making your own egg.
This egg will become a translator to talk to a JSON-RPC server so you can send over commands and invoke remote procedure calls through JSON right out of CHICKEN.&lt;/p&gt;
&lt;p&gt;I left you with the beginning of our main procedure called &lt;em&gt;json-rpc-server&lt;/em&gt; which would do two things:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;It would set up the connection to the server&lt;/li&gt;
&lt;li&gt;It would then be used to translate the commands into JSON and send them over&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Before we dive in, let me give you a quick perspective of what we will be dealing with today:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Create an exception handler for the arguments of the lambda&lt;/li&gt;
&lt;li&gt;See how we can abstract out positional and named params in the arguments&lt;/li&gt;
&lt;li&gt;Build the actual list that will hold the CHICKEN JSON-RPC request object we wish to send&lt;/li&gt;
&lt;li&gt;While building this, we can see a little bit about proper and improper lists&lt;/li&gt;
&lt;li&gt;Still while building, we can also dive into recursion in CHICKEN for building the params list&lt;/li&gt;
&lt;li&gt;After the request object is built, actually send it over to Medea&lt;/li&gt;
&lt;li&gt;Make sure we export only the needed procedures&lt;/li&gt;
&lt;li&gt;Set the correct license for our egg&lt;/li&gt;
&lt;li&gt;Take a small peek at what lies ahead in chapter three.&lt;/li&gt;
&lt;/ul&gt;&lt;h4&gt;Let us dive in!&lt;/h4&gt;
&lt;p&gt;We were busy writing the second part of our procedure. Once the connection was set up we would return a &lt;em&gt;lambda&lt;/em&gt; or &lt;em&gt;anonymous procedure&lt;/em&gt; to the user which she or he could then use to send commands with.
This is how we left our procedure:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="nv"&gt;!key&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input-port? &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"input port"&lt;/span&gt; &lt;span class="s"&gt;"input-port"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-type&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
         &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;output-port? &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"output port"&lt;/span&gt; &lt;span class="s"&gt;"output-port"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-type&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
         &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is-valid-version?&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"version"&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;else&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As you can see, we started to return a lambda that accepts two arguments: &lt;em&gt;method&lt;/em&gt; and &lt;em&gt;params&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;As we have seen in the previous chapter, params are &lt;em&gt;optional&lt;/em&gt;. We thus want them to be optional too in our CHICKEN implementation.
To give optional arguments to a procedure we can use the &lt;em&gt;dot notation&lt;/em&gt;: everything left of the dot is required, and everything right of the dot is optional and gets collected into a list.&lt;/p&gt;
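&lt;p&gt;A minimal sketch of the dot notation at work. The name &lt;em&gt;show-args&lt;/em&gt; is a made-up helper that is not part of our egg; it only returns what it received, so you can see how everything right of the dot ends up in a list:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; show-args is a hypothetical helper, used only for illustration.
(define show-args
  (lambda (method . params)
    (list method params)))

(show-args "Player.Stop")          ; params is the empty list
(show-args "Player.Open" 'a 'b)    ; params is the list (a b)
&lt;/pre&gt;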
&lt;p&gt;And as you remember from the first chapter, we have to help the user when they input wrong information. In this case it means we have to check two arguments. But since both method and params are completely custom to our egg, we will have to build the predicates ourselves and then build a custom exception handler that will inform the user about her or his mistake.&lt;/p&gt;
&lt;p&gt;We know that the "method" argument must be a string, so the first predicate is quite simple:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is-valid-method?&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;string? &lt;/span&gt;&lt;span class="nv"&gt;method&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The "params" argument is a little bit different. Since we are implementing JSON-RPC in CHICKEN, we have the chance of also making some aspects more abstract and easier to use.
As we have seen in the JSON-RPC spec, params can be &lt;em&gt;positional&lt;/em&gt;, which in CHICKEN becomes a &lt;em&gt;vector&lt;/em&gt;, or they can be &lt;em&gt;named&lt;/em&gt;, which is an &lt;em&gt;alist&lt;/em&gt;.
So we could opt for the situation where the user has to input either an alist or a vector as a "params" argument, or we could try and find a simpler way of input.&lt;/p&gt;
&lt;p&gt;I chose the latter and opted to let the user either input &lt;em&gt;symbol arguments&lt;/em&gt; (positional params) or &lt;em&gt;keyword arguments&lt;/em&gt; (named params):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;'these&lt;/span&gt; &lt;span class="ss"&gt;'are&lt;/span&gt; &lt;span class="ss"&gt;'five&lt;/span&gt; &lt;span class="ss"&gt;'positional&lt;/span&gt; &lt;span class="ss"&gt;'params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;;symbols&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;these:&lt;/span&gt; &lt;span class="nv"&gt;are&lt;/span&gt; &lt;span class="nv"&gt;keyword:&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;;keywords&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Inside the procedure body which implements the params, both keywords and symbols become lists we can use, just as in the example above.
So to make the predicate for checking valid params input the only thing we really have to do is to check if the given params are a list.
Even when we do not give any params, they will be considered &lt;em&gt;null&lt;/em&gt;. And as we have seen in the previous chapter, &lt;em&gt;null&lt;/em&gt; in CHICKEN is the same as the &lt;em&gt;empty list&lt;/em&gt; or &lt;em&gt;'()&lt;/em&gt;, which, of course, is also a list!&lt;/p&gt;
&lt;p&gt;Cool, right?&lt;/p&gt;
&lt;p&gt;So our params predicate would look like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;are-valid-params?&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;
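&lt;p&gt;To get a feel for what this predicate accepts and rejects, here is a small sketch. The helper &lt;em&gt;collect-params&lt;/em&gt; is hypothetical and only hands back whatever the dot notation collected; the &lt;em&gt;playerid:&lt;/em&gt; keyword relies on CHICKEN's keyword syntax:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; Same predicate as above, repeated so this snippet runs on its own.
(define (are-valid-params? params)
  (list? params))

;; collect-params is a hypothetical helper that returns its rest arguments.
(define (collect-params method . params)
  params)

(are-valid-params? (collect-params "Player.Stop"))                  ; #t, no params gives '()
(are-valid-params? (collect-params "Player.PlayPause" playerid: 0)) ; #t, keywords arrive as a list
(are-valid-params? (cons 'id 1))                                    ; #f, a lone pair is not a proper list
&lt;/pre&gt;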


&lt;p&gt;Next we have to make a new exception handler. This one we will call &lt;em&gt;server-setup-data-error&lt;/em&gt; and it will be called if one of the predicates fails or, in other words, when the method or the params are invalid:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-data-error&lt;/span&gt; &lt;span class="nv"&gt;type&lt;/span&gt; &lt;span class="nv"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;signal&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;make-property-condition&lt;/span&gt;
      &lt;span class="ss"&gt;'exn&lt;/span&gt; &lt;span class="ss"&gt;'message&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;sprintf&lt;/span&gt; &lt;span class="s"&gt;"Cannot setup connection, the given ~S data is invalid. The ~S ~A"&lt;/span&gt;
      &lt;span class="nv"&gt;type&lt;/span&gt; &lt;span class="nv"&gt;type&lt;/span&gt; &lt;span class="nv"&gt;message&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Looks quite the same as the first one we wrote in chapter 1, no?&lt;/p&gt;
&lt;p&gt;The only difference is the message we print and the arguments we give to this procedure.
We can now implement the predicates and the exception handling the same way we did before, so this becomes:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="nv"&gt;!key&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input-port? &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"input port"&lt;/span&gt; &lt;span class="s"&gt;"input-port"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-type&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
         &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;output-port? &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"output port"&lt;/span&gt; &lt;span class="s"&gt;"output-port"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-type&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
         &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is-valid-version?&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"version"&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;else&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
             &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is-valid-method?&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-data-error&lt;/span&gt; &lt;span class="s"&gt;"method"&lt;/span&gt; &lt;span class="s"&gt;"can only be a string."&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                    &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;are-valid-params?&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-data-error&lt;/span&gt; &lt;span class="s"&gt;"params"&lt;/span&gt; &lt;span class="s"&gt;"can only be a vector or an alist."&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;else &lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="p"&gt;))))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;I also added the &lt;em&gt;else&lt;/em&gt; statement at the end which will carry the code that will be executed when the given arguments are correct.&lt;/p&gt;
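&lt;p&gt;If you are curious what a caller sees when one of these checks fails, the following sketch catches the condition and pulls out its message. It assumes CHICKEN 4 (where &lt;em&gt;sprintf&lt;/em&gt; lives in the &lt;em&gt;extras&lt;/em&gt; unit), and &lt;em&gt;error-message&lt;/em&gt; is a hypothetical helper that is not part of our egg:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; Assumes CHICKEN 4; sprintf comes from the extras unit.
(use extras)

;; Same handler as above, repeated so this snippet runs on its own.
(define (server-setup-data-error type message)
  (signal
    (make-property-condition
      'exn 'message
      (sprintf "Cannot setup connection, the given ~S data is invalid. The ~S ~A"
               type type message))))

;; error-message is a hypothetical helper: it runs a thunk and, if a
;; condition is signaled, returns the message property of that condition.
(define (error-message thunk)
  (handle-exceptions exn
    ((condition-property-accessor 'exn 'message) exn)
    (thunk)))

;; Evaluates to the message string built by sprintf.
(error-message
  (lambda () (server-setup-data-error "method" "can only be a string.")))
&lt;/pre&gt;
&lt;p&gt;Because we signal an &lt;em&gt;exn&lt;/em&gt; condition with a &lt;em&gt;message&lt;/em&gt; property, users of the egg can treat our errors like any other CHICKEN exception.&lt;/p&gt;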
&lt;p&gt;What's next?&lt;/p&gt;
&lt;p&gt;We now have to start looking at actually building the JSON-RPC request object in CHICKEN and sending it to the server.
First let us concentrate on the building part:&lt;/p&gt;
&lt;h4&gt;Building the request object&lt;/h4&gt;
&lt;p&gt;Let us assume that we have already finished our egg and want to define a connection to the "XBMC" box I described in the first chapter. We would call our &lt;em&gt;json-rpc-server&lt;/em&gt; procedure as follows (assuming we have a valid input-port and output-port):&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;xbmc&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The name "xbmc" is now bound to a new procedure (the lambda we are writing now) that accepts a method and optional params.
Given the example from chapter 1 (Player.PlayPause), let us see how we would call "xbmc" to send this to the server:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;xbmc&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt; &lt;span class="nv"&gt;playerid:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;&lt;em&gt;Player.PlayPause&lt;/em&gt; is the method, which is a string and &lt;em&gt;playerid: 0&lt;/em&gt; is a &lt;em&gt;keyword&lt;/em&gt; param where &lt;em&gt;playerid:&lt;/em&gt; is the keyword and &lt;em&gt;0&lt;/em&gt; the value of that keyword.
We now have to write the code that will take that and turn it into this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;jsonrpc&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;params&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;playerid&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The above code is a list, so we can also say that we want to &lt;em&gt;build&lt;/em&gt; a list containing the &lt;em&gt;"Player.PlayPause"&lt;/em&gt; and &lt;em&gt;playerid: 0&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;In chapter one we have seen that the &lt;em&gt;jsonrpc&lt;/em&gt; version is always &lt;em&gt;2.0&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;On top of that, we can always keep the &lt;em&gt;id&lt;/em&gt; at a fixed number.&lt;/p&gt;
&lt;p&gt;The reasoning behind the &lt;em&gt;id&lt;/em&gt; in a JSON-RPC call is that the server can use the same &lt;em&gt;id&lt;/em&gt; in its &lt;em&gt;response&lt;/em&gt; or &lt;em&gt;error&lt;/em&gt; object it returns so you know the server's feedback belongs to your &lt;em&gt;request&lt;/em&gt;. But because we only send one request at a time, it is overkill to generate a unique &lt;em&gt;id&lt;/em&gt; for each &lt;em&gt;request&lt;/em&gt; we send. In our egg, we will keep this &lt;em&gt;id&lt;/em&gt; at &lt;em&gt;1&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Let us concentrate on building the request object from only the &lt;em&gt;version&lt;/em&gt; and the &lt;em&gt;id&lt;/em&gt; for now. The &lt;em&gt;method&lt;/em&gt; and the &lt;em&gt;params&lt;/em&gt; will need some further attention later on.&lt;/p&gt;
&lt;p&gt;So the first thing we encounter in the &lt;em&gt;s-expression&lt;/em&gt; above is the &lt;em&gt;cons cell&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;jsonrpc&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The "version" is an argument to the "json-rpc-server" procedure, so to create this cons cell we can use &lt;em&gt;cons&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Mind the &lt;em&gt;'&lt;/em&gt; or &lt;em&gt;quote&lt;/em&gt; before "jsonrpc". This is needed because otherwise CHICKEN will see this as &lt;em&gt;code&lt;/em&gt; instead of &lt;em&gt;data&lt;/em&gt; and try to evaluate it.
The last cons cell is our "id", the same idea applies here:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Again, mind the &lt;em&gt;'&lt;/em&gt; in front of "id". We use the &lt;em&gt;number&lt;/em&gt; 1 here without any quoting because we want CHICKEN to treat this as an actual &lt;em&gt;number&lt;/em&gt; data type and not as a &lt;em&gt;symbol&lt;/em&gt; or &lt;em&gt;string&lt;/em&gt;.&lt;/p&gt;
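&lt;p&gt;A quick sketch to check what actually sits inside these two cons cells. The top-level &lt;em&gt;version&lt;/em&gt; definition is an assumption that stands in for the argument of our procedure:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; Stand-in for the version argument of json-rpc-server.
(define version "2.0")

(define version-cell (cons 'jsonrpc version))
(define id-cell (cons 'id 1))

(car version-cell)        ; the symbol jsonrpc
(cdr version-cell)        ; the string "2.0", because version was evaluated
(symbol? (car id-cell))   ; #t, the quote kept id as data
(number? (cdr id-cell))   ; #t, an unquoted 1 is simply a number
&lt;/pre&gt;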
&lt;p&gt;How can we put these two &lt;em&gt;cons cells&lt;/em&gt; together?&lt;/p&gt;
&lt;p&gt;Let us try it with cons and see what happens. Remember that cons only accepts two arguments and &lt;em&gt;conses&lt;/em&gt; them on to each other. So if we want to put the two cons cells together, it will look something like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This code will result in:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;jsonrpc&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;id&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Hmm, that is not really what we want, is it? It has now become a list containing one &lt;em&gt;cons cell&lt;/em&gt; and two &lt;em&gt;atoms&lt;/em&gt; separated by a dot.
Why is this happening and why do we not get the following code?&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;jsonrpc&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Good question, the answer is: You have just created an &lt;em&gt;improper list&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Improper list?&lt;/p&gt;
&lt;p&gt;To explain this we have to take another look at the primitive &lt;em&gt;cons&lt;/em&gt; procedure we saw in chapter 1. There we learned that &lt;em&gt;cons&lt;/em&gt; is used to &lt;em&gt;cons&lt;/em&gt;truct pairs, just as we did above. And with pairs, we can eventually construct &lt;em&gt;lists&lt;/em&gt;. But there was also one important detail we saw, the &lt;em&gt;null&lt;/em&gt;, &lt;em&gt;'()&lt;/em&gt; or &lt;em&gt;the empty list&lt;/em&gt;. I showed you that when we look at a list, we are actually looking at cons cells and the last item in the list is actually a cons cell that contains the empty list at the end. You do not &lt;em&gt;see&lt;/em&gt; the empty list, but it is there.&lt;/p&gt;
&lt;p&gt;An improper list, as opposed to a proper list, does &lt;em&gt;not&lt;/em&gt; end in the empty list. In the example above, we ended with the cons cell:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This is considered wrong; it is even said that an improper list is not a list at all. To turn this into a proper list, we simply have to add the empty list to our consing adventure:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And this will give us a proper list and thus the result we are after:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;jsonrpc&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Nice!&lt;/p&gt;
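&lt;p&gt;You do not have to take my word for it: the standard &lt;em&gt;list?&lt;/em&gt; predicate can tell a proper list from an improper one. A small sketch, in which the names &lt;em&gt;improper&lt;/em&gt; and &lt;em&gt;proper&lt;/em&gt; and the top-level &lt;em&gt;version&lt;/em&gt; are made up for illustration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; Stand-in for the version argument of json-rpc-server.
(define version "2.0")

;; Two cons cells consed together, without the empty list at the end.
(define improper
  (cons (cons 'jsonrpc version)
        (cons 'id 1)))

;; The same two cons cells, now terminated by the empty list.
(define proper
  (cons (cons 'jsonrpc version)
        (cons (cons 'id 1) '())))

(list? improper)     ; #f, the chain ends in the number 1
(list? proper)       ; #t, the chain ends in '()
(cdr (cdr proper))   ; the empty list you normally do not see
&lt;/pre&gt;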
&lt;p&gt;There is one annoying thing about building lists this way: it's verbose. Luckily for us, CHICKEN has a shorthand for building lists, amazingly called &lt;em&gt;list&lt;/em&gt;.
The above example would thus be translated to:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;That is a little bit less to type, and maybe even easier to read. The procedure &lt;em&gt;list&lt;/em&gt; can take any number of arguments and construct a proper list for you, which means that it will always automatically add the &lt;em&gt;empty list&lt;/em&gt; at the end.&lt;/p&gt;
&lt;p&gt;Let us now take this one step further and include a &lt;em&gt;method&lt;/em&gt; in our construction. If we would use &lt;em&gt;cons&lt;/em&gt;, the construction would look like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'method&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;())))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And if we used &lt;em&gt;list&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'method&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Both approaches give us:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;jsonrpc&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You are of course free to choose if you want to use "cons" or "list", the take-home message here is that you know &lt;em&gt;how&lt;/em&gt; to cons together a proper list and that you understand how "list" will create a &lt;em&gt;proper list&lt;/em&gt; for you by adding the empty list at the final cons cell.&lt;/p&gt;
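&lt;p&gt;As a sanity check, the sketch below builds the same request both ways and compares the results with &lt;em&gt;equal?&lt;/em&gt;. The names &lt;em&gt;with-cons&lt;/em&gt; and &lt;em&gt;with-list&lt;/em&gt; and the top-level &lt;em&gt;version&lt;/em&gt; are only there for illustration:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; Stand-in for the version argument of json-rpc-server.
(define version "2.0")

;; Built by hand, ending in the empty list.
(define with-cons
  (cons (cons 'jsonrpc version)
        (cons (cons 'method "Player.PlayPause")
              (cons (cons 'id 1) '()))))

;; Built with list, which adds the empty list for us.
(define with-list
  (list (cons 'jsonrpc version)
        (cons 'method "Player.PlayPause")
        (cons 'id 1)))

(equal? with-cons with-list)   ; #t, both are the same proper list
&lt;/pre&gt;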
&lt;p&gt;The next bit we have to deal with are our &lt;em&gt;optional&lt;/em&gt; params. The fact that they are optional means that we cannot simply add them in our consing goodness. We have to take the process of putting together our final list apart and build in the necessary checks to see if the params are given or not.&lt;/p&gt;
&lt;p&gt;One way to do this is to simply check if there are any params present and, if there are, just cons them onto the list we already built. While this would work fine, there is still a theoretical problem that could rear its head. When we cons the params onto the existing list, the params will end up at the &lt;em&gt;beginning&lt;/em&gt; of the list. You would get something like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;params&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;playerid&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;jsonrpc&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This is actually perfectly fine. The order of the &lt;em&gt;jsonrpc&lt;/em&gt;, &lt;em&gt;method&lt;/em&gt;, &lt;em&gt;params&lt;/em&gt; and &lt;em&gt;id&lt;/em&gt; is arbitrary. In the final JSON string the order of these keywords does not matter and the server you are sending your string to will perfectly understand your call. So the above CHICKEN code is correct.&lt;/p&gt;
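&lt;p&gt;To make the point about order a bit more tangible, here is a minimal sketch using the standard &lt;em&gt;assq&lt;/em&gt; procedure, which looks up a pair in an alist by its key. The names &lt;em&gt;call-a&lt;/em&gt; and &lt;em&gt;call-b&lt;/em&gt; are made up purely for illustration and are not part of the egg:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; Two alists holding the same members in a different order.
;; call-a and call-b are hypothetical names, used only for this sketch.
(define call-a
    (list (cons 'jsonrpc "2.0")
          (cons 'method "Player.PlayPause")
          (cons 'id 1)))

(define call-b
    (list (cons 'id 1)
          (cons 'method "Player.PlayPause")
          (cons 'jsonrpc "2.0")))

;; assq finds a pair by its key, wherever it sits in the list.
(cdr (assq 'method call-a)) ; "Player.PlayPause"
(cdr (assq 'method call-b)) ; "Player.PlayPause"
&lt;/pre&gt;


&lt;p&gt;Whichever position a member occupies, a lookup by key finds it. That is how the members of a JSON object are meant to be read as well.&lt;/p&gt;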
&lt;p&gt;However, like I said, there is a theoretical problem. In the JSON-RPC specification they use &lt;em&gt;exactly&lt;/em&gt; the same order throughout the documentation. The &lt;em&gt;jsonrpc&lt;/em&gt; version is first, then the &lt;em&gt;method&lt;/em&gt;, then optionally the &lt;em&gt;params&lt;/em&gt; and finally the &lt;em&gt;id&lt;/em&gt;. They do this for clarity, of course, but somebody building a JSON-RPC server could interpret this as part of the spec and only accept JSON-RPC calls that are formatted in that exact order.&lt;/p&gt;
&lt;p&gt;As assuming a particular order would be a wrong interpretation of the JSON-RPC spec, you could ignore this. If somebody used your egg to communicate with such a faulty JSON-RPC server, you could simply blame the programmer who built the server. You would be perfectly correct. Or not...?&lt;/p&gt;
&lt;p&gt;You could also think about it from a slightly different perspective. If 99 percent of the JSON-RPC servers do not care about the order, you might as well provide the same order as the spec demonstrates. This way even the faulty servers would still work and the users of your egg will have one frustration less to worry about.&lt;/p&gt;
&lt;p&gt;The latter is the path I chose with my egg: I construct my JSON-RPC call in exactly the same order as demonstrated in the specification.&lt;/p&gt;
&lt;p&gt;But by choosing this path, we bring forth another interesting challenge. In the order of the examples, the &lt;em&gt;params&lt;/em&gt; come not at the end, not at the beginning but in the middle of our list.&lt;/p&gt;
&lt;p&gt;Luckily for us we are using CHICKEN and this problem is actually quite trivial. We can simply put an &lt;em&gt;if&lt;/em&gt; expression around our &lt;em&gt;params&lt;/em&gt; while putting our list together. Let me demonstrate:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'method&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'params&lt;/span&gt; &lt;span class="ss"&gt;'theparams&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As you can see, we put a simple &lt;em&gt;if&lt;/em&gt; expression around our &lt;em&gt;params&lt;/em&gt; cons cell. When we do not give any &lt;em&gt;params&lt;/em&gt; we simply return the empty list. When we do have &lt;em&gt;params&lt;/em&gt; we return those. Note that in the example above I used &lt;em&gt;'theparams&lt;/em&gt; as a stand-in for the actual &lt;em&gt;params&lt;/em&gt; that we have to build later on.&lt;/p&gt;
&lt;p&gt;If you would like to try this example in the REPL, we first have to define &lt;em&gt;params&lt;/em&gt;, otherwise our &lt;em&gt;if&lt;/em&gt; expression will fail with an error saying that it does not know &lt;em&gt;params&lt;/em&gt;. So first, for the sake of testing, define &lt;em&gt;params&lt;/em&gt; in the REPL:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt; &lt;span class="no"&gt;#t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We have now defined it as &lt;em&gt;#t&lt;/em&gt;, or &lt;em&gt;true&lt;/em&gt;, which means it is &lt;em&gt;not the empty list&lt;/em&gt;. If you now type in "params" in the REPL, you will get back &lt;em&gt;#t&lt;/em&gt;.
Now, let us input our slightly modified statement into the REPL:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="ss"&gt;'version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'method&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'params&lt;/span&gt; &lt;span class="ss"&gt;'theparams&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Notice that I also quoted &lt;em&gt;version&lt;/em&gt;, because it is still unknown to the REPL at this stage of development.
If you punch this in, you will get the following output:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;jsonrpc&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;params&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;theparams&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Neat! We have four &lt;em&gt;cons cells&lt;/em&gt;, including our params. Seems to work so far, no?&lt;/p&gt;
&lt;p&gt;Okay, now let us try it when &lt;em&gt;params&lt;/em&gt; are not given.
We first need to redefine &lt;em&gt;params&lt;/em&gt; to not be &lt;em&gt;#t&lt;/em&gt; but be &lt;em&gt;the empty list&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And now run our code again:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="ss"&gt;'version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'method&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'params&lt;/span&gt; &lt;span class="ss"&gt;'theparams&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The output will now read:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;jsonrpc&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Hmmm, you notice the problem? There is an empty list right in the middle of our list.&lt;/p&gt;
&lt;p&gt;This, of course, makes perfect sense, since we told CHICKEN to insert the empty list if we had no &lt;em&gt;params&lt;/em&gt;.
But if we would later go and convert this into a valid &lt;em&gt;JSON-RPC&lt;/em&gt; call, we would run into trouble.&lt;/p&gt;
&lt;p&gt;We have to find a way to &lt;em&gt;remove&lt;/em&gt; this empty list from our resulting list.&lt;/p&gt;
&lt;p&gt;How?&lt;/p&gt;
&lt;p&gt;With &lt;em&gt;remove&lt;/em&gt; of course! &lt;em&gt;Remove&lt;/em&gt; is a procedure that is provided by the so-called &lt;em&gt;SRFI-1&lt;/em&gt;, which is part of CHICKEN core.&lt;/p&gt;
&lt;p&gt;What is &lt;em&gt;SRFI-1&lt;/em&gt; you ask?&lt;/p&gt;
&lt;p&gt;&lt;em&gt;SRFI&lt;/em&gt; stands for &lt;em&gt;S&lt;/em&gt;cheme &lt;em&gt;R&lt;/em&gt;equests &lt;em&gt;f&lt;/em&gt;or &lt;em&gt;I&lt;/em&gt;mplementation. SRFIs are well-defined Scheme libraries. The reasoning is that Schemers can use these libraries to extend the functionality of their Scheme. Somebody building a Scheme implementation can choose which SRFIs to include or implement in its core. If a library proves to be useful enough, it even stands a small chance of being included in a new RnRS spec. You can check out &lt;a href="http://srfi.schemers.org/" title="The official SRFI repository."&gt;the SRFI website&lt;/a&gt; containing all the available SRFIs.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;SRFI-1&lt;/em&gt;, aka the &lt;em&gt;List Library&lt;/em&gt;, contains several useful procedures for working with lists. One of these procedures is called &lt;em&gt;remove&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Remove&lt;/em&gt; accepts two arguments, a predicate and a list. Whatever matches the predicate will be, well, removed. So in our case, we want to remove everything that is &lt;em&gt;null&lt;/em&gt; or &lt;em&gt;the empty list&lt;/em&gt;.
In the previous chapter we already saw that we have a &lt;em&gt;null?&lt;/em&gt; predicate. Now we can add the procedure &lt;em&gt;remove&lt;/em&gt; together with that predicate to get rid of our &lt;em&gt;empty list&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;remove&lt;/span&gt;
    &lt;span class="nv"&gt;null?&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="ss"&gt;'version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'method&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'params&lt;/span&gt; &lt;span class="ss"&gt;'theparams&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Which will result in:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;jsonrpc&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"Player.PlayPause"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Good! The empty list is now gone!&lt;/p&gt;
&lt;p&gt;But before we can use stuff from SRFI-1 in our egg, we will have to &lt;em&gt;load in&lt;/em&gt; its procedures the same way we loaded in &lt;em&gt;Extras&lt;/em&gt; in chapter one.
To load in multiple libraries or eggs at once, simply list the names as arguments to &lt;em&gt;use&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt; &lt;span class="nv"&gt;extras&lt;/span&gt; &lt;span class="nv"&gt;srfi-1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;
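&lt;p&gt;With SRFI-1 loaded, &lt;em&gt;remove&lt;/em&gt; can also be tried on its own in the REPL. The following is only a small sketch with made-up lists, and it assumes a REPL in which the &lt;em&gt;use&lt;/em&gt; line from above works. The name &lt;em&gt;original&lt;/em&gt; is hypothetical:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;(use srfi-1)

;; remove keeps every element for which the predicate is false
(remove even? '(1 2 3 4 5))     ; gives (1 3 5)

;; the same trick we use on our JSON-RPC list, on a simpler list
(remove null? '(a () b ()))     ; gives (a b)

;; remove builds a new list, the original stays untouched
(define original (list 'a '() 'b))
(remove null? original)         ; gives (a b)
original                        ; still (a () b)
&lt;/pre&gt;


&lt;p&gt;Because &lt;em&gt;remove&lt;/em&gt; only drops the elements that match the predicate, a list without any empty list in it comes back unchanged. This is why we can wrap it around our list regardless of whether &lt;em&gt;params&lt;/em&gt; were given.&lt;/p&gt;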


&lt;p&gt;Let us take a look at the full code we have now for our &lt;em&gt;json-rpc-server&lt;/em&gt; procedure:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="nv"&gt;!key&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input-port? &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"input port"&lt;/span&gt; &lt;span class="s"&gt;"input-port"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-type&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
         &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;output-port? &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"output port"&lt;/span&gt; &lt;span class="s"&gt;"output-port"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-type&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
         &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is-valid-version?&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"version"&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
         &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;else&lt;/span&gt;
           &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is-valid-method?&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-data-error&lt;/span&gt; &lt;span class="s"&gt;"method"&lt;/span&gt; &lt;span class="s"&gt;"can only be a string."&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                  &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;are-valid-params?&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-data-error&lt;/span&gt; &lt;span class="s"&gt;"params"&lt;/span&gt; &lt;span class="s"&gt;"can only be a vector or an alist."&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;else&lt;/span&gt;
                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;remove&lt;/span&gt;
                        &lt;span class="nv"&gt;null?&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;
                            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'method&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'params&lt;/span&gt; &lt;span class="ss"&gt;'theparams&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="s"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;)))))))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;That starts to look like a nice little program!&lt;/p&gt;
&lt;p&gt;But as you can see, we still have &lt;em&gt;'theparams&lt;/em&gt; in there, which needs to be replaced with the actual params the user inputs.
At the beginning of the chapter I explained that the &lt;em&gt;params&lt;/em&gt; can be either an &lt;em&gt;alist&lt;/em&gt; or a &lt;em&gt;vector&lt;/em&gt; so that they can later be converted correctly into JSON. But we also saw that we don't want to bother the user with that. We want the user to input simple keywords or symbols.
This means we have to make a translation from keywords/symbols to alists/vectors.&lt;/p&gt;
&lt;p&gt;The first thing we need to do is to check what the user has input for &lt;em&gt;params&lt;/em&gt;. Are they keywords? Or are they symbols? Once we know what the user has input, we can build either an &lt;em&gt;alist&lt;/em&gt; or a &lt;em&gt;vector&lt;/em&gt;. This calls for three small procedures: one that checks the input, one that builds an &lt;em&gt;alist&lt;/em&gt; and one that builds a &lt;em&gt;vector&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Let us begin with the input checker. This procedure will check the input and call the appropriate procedure, then it will return that result. So let us call this procedure &lt;em&gt;build-params&lt;/em&gt;. It will take one argument: the params that the user has input. This little fellow will look something like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-params&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;keyword?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-alist&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list-&amp;gt;vector &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-vector&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As you can see, we have an &lt;em&gt;if&lt;/em&gt; expression that checks whether the &lt;em&gt;car&lt;/em&gt; of the &lt;em&gt;params&lt;/em&gt; is a &lt;em&gt;keyword&lt;/em&gt;. The &lt;em&gt;keyword?&lt;/em&gt; procedure is a built-in predicate that we have at our disposal. If the &lt;em&gt;car&lt;/em&gt; is a keyword we have to build an &lt;em&gt;alist&lt;/em&gt;, so it calls a custom procedure &lt;em&gt;build-alist&lt;/em&gt;.&lt;/p&gt;
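&lt;p&gt;If you want to get a feel for &lt;em&gt;keyword?&lt;/em&gt; first, the following sketch can be typed into the REPL. It assumes CHICKEN's default keyword style, where a trailing colon marks a keyword, and &lt;em&gt;example-params&lt;/em&gt; is a made-up name used only here:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;(keyword? first:)       ; #t, the trailing colon makes it a keyword
(keyword? 'first)       ; #f, a plain symbol
(keyword? "string")     ; #f, a string

;; the same check as in build-params, on a hypothetical argument list
(define example-params '(first: "string" second: "string"))
(keyword? (car example-params))     ; #t
&lt;/pre&gt;


&lt;p&gt;Checking only the &lt;em&gt;car&lt;/em&gt; is a deliberate shortcut: we look at the first element and trust that the rest of the list follows the same pattern.&lt;/p&gt;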
&lt;p&gt;Let us forget the rest for a moment and see how we can put together &lt;em&gt;build-alist&lt;/em&gt;.
Before we start writing code, let us do it in our minds first.&lt;/p&gt;
&lt;p&gt;If the user has input keywords, we can safely assume they will be in the following form:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;first:&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt; &lt;span class="nv"&gt;second:&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt; &lt;span class="nv"&gt;third:&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we also know the resulting alist we would like to see:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;second&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;third&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;What we need to do is take the keywords (first:, second: and third:) and make them the car of each pair, then take the value that follows each keyword and make it the cdr of that pair.&lt;/p&gt;
&lt;p&gt;We have already seen almost every tool we need to do this job of putting together these pairs. If we had, for example, three keywords like in the above example, the code to put this together would look something like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))))(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))))))&lt;/span&gt;
            &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;())))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Wow...that seems quite verbose, no? Look at all the car, cdr and cons procedures, not to mention the unscalability of such code. If we get, for example, four or five keywords, this gets twice as long! So we have to be a bit more lazy and let CHICKEN do that heavy lifting for us. We need some kind of loop.&lt;/p&gt;
&lt;h4&gt;Recursion&lt;/h4&gt;
&lt;p&gt;A loop in CHICKEN or any other Scheme is not like your average loop in Python or C. In fact, we will not use a dedicated loop construct at all. Instead, we use an idea called &lt;em&gt;recursion&lt;/em&gt;. This is quite a simple idea where a procedure calls itself over and over again until a certain condition is met. Let me demonstrate:&lt;/p&gt;
&lt;p&gt;Say for example, we have a list with three numbers:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we want to do the quite useless exercise of adding one to each of the numbers.&lt;/p&gt;
&lt;p&gt;We could do this by writing a tiny procedure that takes the first number, adds one to it, goes over to the next number and does this over and over until there are no numbers left.&lt;/p&gt;
&lt;p&gt;In code, this tiny &lt;em&gt;recursive&lt;/em&gt; procedure looks like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The + sign we use here is a built-in procedure that adds up numbers, just so you know. Remember, in CHICKEN you can use almost any character as a variable or procedure name.&lt;/p&gt;
&lt;p&gt;Then we call this newly made procedure:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And we get back:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;You have just seen recursion in action!&lt;/p&gt;
&lt;p&gt;To fully understand what just happened we have to dig in a little bit deeper and take a look at how the &lt;em&gt;recursion&lt;/em&gt; is happening behind the curtains.&lt;/p&gt;
&lt;p&gt;We created the procedure called "addone" that calls &lt;em&gt;itself&lt;/em&gt; until the so-called &lt;em&gt;base case&lt;/em&gt; is reached. The base case here being:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;An important detail to know is that every time the procedure calls itself, it is put on a waiting list until the newly called procedure finishes and gives back its result.
When the base case is reached, the final procedure call returns its result (in this case the &lt;em&gt;empty list&lt;/em&gt;), and this empty list is given to the second-to-last procedure, which can cons its own result onto it.
This second-to-last procedure then finishes and gives its result to the third-to-last procedure that was waiting. This continues until the first, original procedure gets its result and returns it to you (or to the procedure that called it in the first place).&lt;/p&gt;
&lt;p&gt;You can think of it as having a big box right in front of you. You open the box only to find another box, you open that one and you find yet another box.
This goes on until there are no boxes in boxes left (the base case is reached). But before you can close the big, original box, you will have to close every box, starting with the smallest inner box and working your way back to the big, original box.&lt;/p&gt;
&lt;p&gt;Let us look at each step of the recursion for our "addone" procedure.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;1&lt;/em&gt; - We call "addone" for the first time, the argument is not null, so the base case is not reached. This means we continue with the final cons line.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;which translates to&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Notice that the "addone" procedure is now called with the argument '(2 3).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;2&lt;/em&gt; - The base case is still not reached, because '(2 3) is not null, so we continue again with the last line.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;which translates to&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;The procedure "addone" is now called again, with '(3) as the argument.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;3&lt;/em&gt; - The base case is still not reached, because '(3) is not null, so we continue again with the last line.&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;which translates to&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now "addone" is called with the argument '(), because that is the &lt;em&gt;cdr&lt;/em&gt; of '(3).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;4&lt;/em&gt; - Aha! Now the base case &lt;em&gt;is&lt;/em&gt; reached because the argument is '(), or in other words: null.&lt;/p&gt;
&lt;p&gt;So the base case condition is executed:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;which simply translates to:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As you can see, this time we don't call the "addone" procedure again, but we return a value instead.
This means we now can walk back up the tree and start to give back a value to each procedure waiting.&lt;/p&gt;
&lt;p&gt;The procedure waiting for its result was the previous one before our base case was met:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We can now fill in the result of the &lt;em&gt;addone&lt;/em&gt;:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And the result of the sum:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Now that this procedure also has a value, the cons cell &lt;em&gt;(4 . '())&lt;/em&gt;, which is the one-item list '(4), we can go one level up.
The procedure waiting now is this one:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;So this one now becomes:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We have yet another value, so on to the next, final procedure:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;+ &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;addone&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Which now becomes:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This is the final, topmost procedure, so this one can simply return its value, becoming:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;And there we have our result!&lt;/p&gt;
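&lt;p&gt;If you want to watch this winding and unwinding happen, here is a minimal sketch of a traced variant. The name "addone-traced" and its extra "depth" argument are made up for this illustration and are not part of our egg; the sketch only assumes CHICKEN's &lt;em&gt;print&lt;/em&gt; procedure to show each call on the way down and each result on the way back up:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; Hypothetical traced variant of addone, for illustration only.
;; "depth" counts how many calls are currently waiting above this one.
(define (addone-traced lst depth)
    ;; Announce the call, indented by two spaces per waiting procedure
    (print (make-string (* 2 depth) #\space) "enter:  " lst)
    (let ((result (if (null? lst)
                      '()                                  ; base case
                      (cons (+ (car lst) 1)                ; recursive case
                            (addone-traced (cdr lst) (+ depth 1))))))
        ;; Announce the value handed back to the waiting caller
        (print (make-string (* 2 depth) #\space) "return: " result)
        result))

(addone-traced '(1 2 3) 0)
&lt;/pre&gt;


&lt;p&gt;Each "enter" line is indented one level deeper than the previous one until the empty list is reached, after which the "return" lines appear in the reverse order: the innermost box is closed first, the original one last.&lt;/p&gt;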
&lt;h4&gt;Back to our egg&lt;/h4&gt;
&lt;p&gt;So let us put this recursion idea into practice with our &lt;em&gt;alist&lt;/em&gt; builder. The procedure &lt;em&gt;build-alist&lt;/em&gt; would look like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-alist&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-alist&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This looks much like the simple &lt;em&gt;addone&lt;/em&gt; procedure, does it not? Okay, the last part is different of course, but the principle is the same.&lt;/p&gt;
&lt;p&gt;First we put in our &lt;em&gt;base case&lt;/em&gt;, which in many cases is to check for the empty list with the &lt;em&gt;null?&lt;/em&gt; predicate. If the given &lt;em&gt;params&lt;/em&gt; are not empty, the base case is not met and we perform a &lt;em&gt;cons&lt;/em&gt;.
Let us take this cons line apart and see where the recursion happens:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-alist&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;We are consing two things together:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;A &lt;em&gt;cons cell&lt;/em&gt; containing the &lt;em&gt;car&lt;/em&gt; of the params (the first keyword) and the &lt;em&gt;car of the cdr&lt;/em&gt; of the params (the value of the first keyword)&lt;/li&gt;
&lt;li&gt;A call to the &lt;em&gt;same procedure&lt;/em&gt; but with the first keyword and the value of the first keyword stripped off (cdr (cdr params))&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;If we call this procedure with a list of keyword arguments:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-alist&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;first:&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt; &lt;span class="nv"&gt;second:&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt; &lt;span class="nv"&gt;third:&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;CHICKEN will return us:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nf"&gt;first:&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;second:&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;third:&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;This is a list of cons cells instead of a flat list of keyword arguments, which is exactly what we want!&lt;/p&gt;
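&lt;p&gt;Written out by hand, the recursion for this call unwinds to the nested conses below. This is only a sketch for illustration: "expanded" is a made-up name, and it relies on the fact that keywords such as &lt;em&gt;first:&lt;/em&gt; evaluate to themselves in CHICKEN, so they need no quote here:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; Hand-expanded equivalent of the build-alist call above, for illustration.
;; The inner cons of each step is one (keyword . value) cell, the outer
;; cons chains that cell onto the result of the next recursive call.
(define expanded
    (cons (cons first: "string")
          (cons (cons second: "string")
                (cons (cons third: "string")
                      '()))))        ; the base case ends the chain
&lt;/pre&gt;


&lt;p&gt;The innermost '() is what the base case returns; each waiting call then conses its own (keyword . value) cell in front of it.&lt;/p&gt;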
&lt;p&gt;When the recursion ends, this procedure can return its value to the procedure that called it in the first place: our "build-params":&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-params&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;keyword?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-alist&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list-&amp;gt;vector &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-vector&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Which does nothing more than give this result back to the procedure that called it: the &lt;em&gt;if&lt;/em&gt; condition we were using to check if params existed while building our request object.&lt;/p&gt;
&lt;p&gt;That is all there is to it for building an alist, but what if we have to build a vector using the "build-vector" procedure? Well, this turns out to be very much the same.&lt;/p&gt;
&lt;p&gt;Let me present to you the code for "build-vector":&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-vector&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;symbol-&amp;gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-vector&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Pretty much the same, right? The only difference here is, again, the consing line at the end.
Since it is a vector we are building, we do not need a keyword and its value; we simply have to build a list of items, one by one.&lt;/p&gt;
&lt;p&gt;So the recursion here is the fact that we cons together the &lt;em&gt;car&lt;/em&gt; of the params and call the procedure again with the &lt;em&gt;cdr&lt;/em&gt;.
One odd thing you might notice is the &lt;em&gt;symbol-&amp;gt;string&lt;/em&gt; procedure that I have put in front of the &lt;em&gt;(car params)&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This is needed to convert the symbols we pass as arguments to the lambda into real strings.
If we did not do this, we would build a list containing symbols, which would make no sense at all to a JSON-RPC call.&lt;/p&gt;
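&lt;p&gt;As a small sketch of what this gives us, here is the procedure applied to a few symbols. The names "items" and "positional" and the symbols foo, bar and baz are made up for this example. Note that &lt;em&gt;symbol-&amp;gt;string&lt;/em&gt; only accepts symbols, so the sketch assumes every positional argument is one:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;;; Same build-vector as above, repeated so this sketch runs on its own
(define (build-vector params)
    (if (null? params)
        '()
        (cons
            (symbol-&amp;gt;string (car params))
            (build-vector (cdr params)))))

;; Illustrative input only: three symbols as positional parameters
(define items (build-vector '(foo bar baz)))    ; a plain list of strings
(define positional (list-&amp;gt;vector items))        ; the same strings as a vector
&lt;/pre&gt;


&lt;p&gt;The recursion itself only ever produces a list; it is the &lt;em&gt;list-&amp;gt;vector&lt;/em&gt; call in "build-params" that turns that list into the vector we want.&lt;/p&gt;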
&lt;p&gt;Good. We now have our code almost completely finished. The full code for our "json-rpc-server" procedure and the procedures for building alists or vectors looks like this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="nv"&gt;!key&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input-port? &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"input port"&lt;/span&gt; &lt;span class="s"&gt;"input-port"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-type&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;output-port? &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"output port"&lt;/span&gt; &lt;span class="s"&gt;"output-port"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-type&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is-valid-version?&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"version"&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;else&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is-valid-method?&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-data-error&lt;/span&gt; &lt;span class="s"&gt;"method"&lt;/span&gt; &lt;span class="s"&gt;"can only be a string."&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
       &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;are-valid-params?&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-data-error&lt;/span&gt; &lt;span class="s"&gt;"params"&lt;/span&gt; &lt;span class="s"&gt;"can only be a vector or an alist."&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;else&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;remove&lt;/span&gt; &lt;span class="nv"&gt;null?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'method&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'params&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-params&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
                            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="s"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
              &lt;span class="nv"&gt;input&lt;/span&gt; 
              &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))))))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-params&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;keyword?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-alist&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list-&amp;gt;vector &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-vector&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-alist&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-alist&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))))))&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-vector&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cons&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;symbol-&amp;gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;car &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-vector&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cdr &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;There we have our whole mechanism for creating our request object. We have the procedure that sets up a connection and returns another procedure that builds the actual object using the small recursive procedures we just wrote.&lt;/p&gt;
&lt;p&gt;The only problem now is that this code builds the request object...and that is it. We still need to hand it over to Medea, the JSON&amp;lt;-&amp;gt;CHICKEN converter we saw in chapter one.
Luckily for us (and thanks to &lt;a href="http://wiki.call-cc.org/users/moritz-heidkamp" title="Moritz Heidkamp user page on the CHICKEN wiki"&gt;Moritz Heidkamp&lt;/a&gt; who made the Medea egg), this is yet another trivial task.&lt;/p&gt;
&lt;h4&gt;Sending over the data&lt;/h4&gt;
&lt;p&gt;All we need to do is to call the sending procedure of Medea with our built request object and the output port we set when creating our connection with "json-rpc-server".
So first thing to do is to make sure we load in the Medea egg using the same &lt;em&gt;use&lt;/em&gt; procedure as we did when loading in the SRFI-1 and Extras:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt; &lt;span class="nv"&gt;extras&lt;/span&gt; &lt;span class="nv"&gt;srfi-1&lt;/span&gt; &lt;span class="nv"&gt;medea&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;As we are now used to doing, let us create a small procedure that will send the data over to Medea:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;send-request&lt;/span&gt; &lt;span class="nv"&gt;request&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;write-json&lt;/span&gt; &lt;span class="nv"&gt;request&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;That is how simple it is. The "write-json" is a procedure from Medea that, well, takes the given object or "datum", converts it into JSON and sends it over to the given output port.&lt;/p&gt;
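&lt;p&gt;To get a feel for it, here is a minimal sketch that hands a hand-written request object to "write-json" and lets it write to the standard output port. The method name "add" and its parameters are invented for this example, and the sketch assumes Medea's default mapping, where an alist with symbol keys becomes a JSON object and a vector becomes a JSON array:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;(use medea)

;; A hand-written request object in the shape we have been building.
;; The method "add" and its params are illustrative assumptions only.
(define request
    (list (cons 'jsonrpc "2.0")
          (cons 'method "add")
          (cons 'params (vector "1" "2"))
          (cons 'id "1")))

;; Convert the datum to JSON and write it to the given port
(write-json request (current-output-port))
(newline)
&lt;/pre&gt;


&lt;p&gt;In our egg the port will of course not be the standard output, but the output port that was handed to "json-rpc-server".&lt;/p&gt;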
&lt;p&gt;And now we need to call this procedure in our lambda, where the "request" argument will become the whole building process we have been seeing in this chapter:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;define &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt; &lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="nv"&gt;!key&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;version&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input-port? &lt;/span&gt;&lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"input port"&lt;/span&gt; &lt;span class="s"&gt;"input-port"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-type&lt;/span&gt; &lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;output-port? &lt;/span&gt;&lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"output port"&lt;/span&gt; &lt;span class="s"&gt;"output-port"&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;get-type&lt;/span&gt; &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is-valid-version?&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-arguments-error&lt;/span&gt; &lt;span class="s"&gt;"version"&lt;/span&gt; &lt;span class="s"&gt;"2.0"&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;else&lt;/span&gt;
   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;method&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;cond &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;is-valid-method?&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-data-error&lt;/span&gt; &lt;span class="s"&gt;"method"&lt;/span&gt; &lt;span class="s"&gt;"can only be a string."&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
       &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;not &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;are-valid-params?&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;server-setup-data-error&lt;/span&gt; &lt;span class="s"&gt;"params"&lt;/span&gt; &lt;span class="s"&gt;"can only be a vector or an alist."&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;else&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;send-request&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;remove&lt;/span&gt; &lt;span class="nv"&gt;null?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'jsonrpc&lt;/span&gt; &lt;span class="nv"&gt;version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'method&lt;/span&gt; &lt;span class="nv"&gt;method&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;null? &lt;/span&gt;&lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;'&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                            &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'params&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;build-params&lt;/span&gt; &lt;span class="nv"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;cons &lt;/span&gt;&lt;span class="ss"&gt;'id&lt;/span&gt; &lt;span class="s"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
                        &lt;span class="nv"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)))))))&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Voila, we now call the procedure "send-request" where its first argument is the &lt;em&gt;whole&lt;/em&gt; request object building process and the output port is the second argument.&lt;/p&gt;
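&lt;p&gt;To see what that first argument evaluates to, here is a minimal, standalone sketch of the same alist-building step. The procedure name "build-request" is hypothetical, and the params are passed in ready-made instead of going through "build-params":&lt;/p&gt;
&lt;pre class="code literal-block"&gt;(use srfi-1)

;; Hypothetical helper that mirrors the request building step above.
;; An absent params member is represented as '() and then dropped
;; by "remove", so it never shows up in the final alist.
(define (build-request version method params)
    (remove null?
            (list (cons 'jsonrpc version)
                  (cons 'method method)
                  (if (null? params) '()
                      (cons 'params params))
                  (cons 'id "1"))))

(build-request "2.0" "ping" '())
;; returns ((jsonrpc . "2.0") (method . "ping") (id . "1"))

(build-request "2.0" "sum" (vector 1 2))
;; returns ((jsonrpc . "2.0") (method . "sum") (params . #(1 2)) (id . "1"))
&lt;/pre&gt;


&lt;p&gt;The resulting alist is exactly the kind of datum that "write-json" can turn into a JSON object, which is why the "remove" call matters: without it, the empty list would end up as a stray element in the request.&lt;/p&gt;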
&lt;h4&gt;Defining and exporting&lt;/h4&gt;
&lt;p&gt;Good, we have finished all of the code, &lt;em&gt;congratulations&lt;/em&gt;!&lt;/p&gt;
&lt;p&gt;Next up is to actually tell CHICKEN that this file is the source code of an egg.&lt;/p&gt;
&lt;p&gt;First we load in the actual CHICKEN core at the beginning of our file:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;import&lt;/span&gt; &lt;span class="nv"&gt;chicken&lt;/span&gt; &lt;span class="nv"&gt;scheme&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Next we have to give this file a place inside our egg (eggs can consist of multiple source code files).&lt;/p&gt;
&lt;p&gt;This is done through the &lt;em&gt;module&lt;/em&gt; procedure. This procedure takes several arguments, the first being the actual name of the module. In our case, the file we have been creating over the past two chapters contains the code for the &lt;em&gt;client&lt;/em&gt; implementation. So I chose &lt;em&gt;json-rpc-client&lt;/em&gt; as the name for this module file.&lt;/p&gt;
&lt;p&gt;The next argument is the list of procedures you wish to export. When creating an egg with one or multiple files, you can choose which of the procedures the user of your egg has access to. In our case, the users do not need to use our predicate procedures or anything like that. There is only one procedure that they will need to use: our &lt;em&gt;json-rpc-server&lt;/em&gt; procedure. So to scope their access, we export only this procedure.&lt;/p&gt;
&lt;p&gt;The third argument is the &lt;em&gt;whole code&lt;/em&gt; of our egg. Everything we have been writing up to this point goes into that procedure.&lt;/p&gt;
&lt;p&gt;So, at the top of our egg file, we will end up with this:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;module&lt;/span&gt; &lt;span class="nv"&gt;json-rpc-client&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;json-rpc-server&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;import&lt;/span&gt; &lt;span class="nv"&gt;chicken&lt;/span&gt; &lt;span class="nv"&gt;scheme&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="c1"&gt;; the whole egg code ; ...&lt;/span&gt;

&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;


&lt;p&gt;Mind the ending parenthesis at the end of your CHICKEN file to close off the last argument.&lt;/p&gt;
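&lt;p&gt;To illustrate how the export list scopes access, here is a tiny, self-contained module written in the same CHICKEN 4 style. The module "greeter" and its procedures are invented for this example and are not part of our egg:&lt;/p&gt;
&lt;pre class="code literal-block"&gt;(module greeter
    (greet)                       ; only "greet" is exported
    (import chicken scheme)

    ;; Internal helper, invisible to users of the module.
    (define (decorate name)
        (string-append "Hello, " name "!"))

    ;; The one procedure users of the module can call.
    (define (greet name)
        (decorate name))

)

(import greeter)
(greet "egg") ; returns "Hello, egg!"
&lt;/pre&gt;


&lt;p&gt;After the import, "greet" is available, while calling "decorate" from outside the module results in an unbound variable error. Our predicate procedures stay hidden from users of the egg in the same way.&lt;/p&gt;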
&lt;h4&gt;Licensing&lt;/h4&gt;
&lt;p&gt;There is one more thing left before we can start dreaming of chapter three: setting the correct license.&lt;/p&gt;
&lt;p&gt;It is important to set a correct license on your work, both to protect it and to let people know whether they can include it in &lt;em&gt;their&lt;/em&gt; work.
The CHICKEN website has &lt;a href="http://wiki.call-cc.org/eggs-licensing" title="Licensing page on the CHICKEN website."&gt;a page&lt;/a&gt; containing all sorts of licenses you can use. Pick one carefully and put it at the top of your CHICKEN files.&lt;/p&gt;
&lt;h4&gt;Looking ahead&lt;/h4&gt;
&lt;p&gt;While we probably have a functioning CHICKEN Scheme file, it is still not a real egg. It is merely one file containing a bunch of CHICKEN code and a module wrapper to give it a name and export the needed procedures.
As we have seen in chapter one, we now need to create some extra files so that CHICKEN will recognize this as a real egg.&lt;/p&gt;
&lt;p&gt;One other &lt;em&gt;very important&lt;/em&gt; thing we have to set up before we can release an egg is a &lt;em&gt;testing suite&lt;/em&gt;.
It is not mandatory for releasing an egg, but it is highly recommended.
&lt;em&gt;Always&lt;/em&gt; write tests for your code. This takes a little bit of your time, but can be a huge benefit in further development of your code.&lt;/p&gt;
&lt;p&gt;Okay, I will leave you to rest now, take a deep breath and relax.&lt;/p&gt;
&lt;p&gt;And as always...thanks for reading!&lt;/p&gt;&lt;/div&gt;</description><category>chicken</category><category>json rpc</category><category>scheme</category><guid>http://shisaa.be/postset/chicken-scheme-2.html</guid><pubDate>Thu, 05 Sep 2013 19:00:00 GMT</pubDate></item></channel></rss>