Helping people store, retrieve and derive insights from data
is the essence of all software applications.
Like it or not, Relational Databases store
most of the world's structured data
and Structured Query Language (SQL)
is by far the most frequent way of retrieving the data.
According to the most recent surveys/statistics, SQL still dominates the world of databases.
https://insights.stackoverflow.com/survey/2018/#technology-databases
https://db-engines.com/en/ranking
Note: you should never adopt a technology based on its current popularity; beware of "argumentum ad populum" ("it's popular therefore you should use it"). Always pick the appropriate tool for the job based on the requirements, constraints and/or availability (of "skill", both on your existing team and in the wider community). We include these stats to show that relational databases are still by far the most widely used, so learning SQL is a very wise investment both for you as an individual and for your team or organisation.
Getting started with PostgreSQL is easy
(just follow the steps in this guide and try out the example queries!)
When you are ready to deploy your app, you are in safe hands.
PostgreSQL runs everywhere:
- Travis-CI (free) Integration Testing: https://docs.travis-ci.com/user/database-setup/#postgresql
- Heroku PostgreSQL (free for MVP: 10k rows): https://www.heroku.com/postgres
- AWS RDS Postgres (good value + high performance): https://aws.amazon.com/rds/postgresql/
- Google Cloud SQL: https://cloud.google.com/sql/
- DigitalOcean: https://www.digitalocean.com/products/managed-databases/
- Linode: https://www.linode.com/docs/databases/postgresql/create-a-highly-available-postgresql-cluster-using-patroni-and-haproxy/
- Azure: https://azure.microsoft.com/en-us/services/postgresql/
- Self-managed high availability cluster: https://github.com/sorintlab/stolon
Everyone building any application that stores data should learn SQL. SQL is ubiquitous in every field/industry and the sooner you learn/master it, the higher your life-time return on time investment.
Learning how to use a relational database is a foundational skill for all of computer science and application development.
Being proficient in SQL will open the door to Data Science with SQL-on-Hadoop, Apache Spark, Google BigQuery, Oracle and Teradata. In short, get really good at SQL! It's very useful.
This tutorial covers 5 areas:
- What is PostgreSQL?
- How do I get started with PostgreSQL? (a fully functioning example!)
- What is Structured Query Language (SQL)? (lots of example queries!)
- How do I write my own SQL Queries?
- How do I deploy my own PostgreSQL-based Application?
Once you have covered these areas, you will know if PostgreSQL is "right" for your needs, or if you need to keep looking for a different way to store data.
Let's dive in!
PostgreSQL (often shortened to simply "Postgres") is an advanced Relational DataBase Management System ("RDBMS"), that lets you efficiently and securely store any type of data. We will explain "Relational Database" in the context of our example below, so don't worry if it sounds like a buzzword soup.
Postgres has an emphasis on standards compliance and extensibility. This means there are many plugins you can use to enhance it, like PostGIS for mapping applications, and entire projects built on top of it, like TimescaleDB (a time-series database perfect for analytics) and AgensGraph (a graph database, great for modelling networks, e.g. a "social graph").
Structured Query Language (SQL)
is the preferred means of interacting with data at any scale.
The only reason MySQL is still more widely used than Postgres can be summarised in one word: WordPress. WordPress has a firm grip on the CMS-based website market and it shows no sign of slowing down. If your goal is to build CMS-based websites, or the company you already work for uses WordPress, you should go for it! If you prefer a more general introduction to SQL, follow this tutorial! The knowledge you will gain by learning Postgres is 95%+ "transferable" to other SQL databases so don't worry about the differences between MySQL and Postgres for now. If you're curious, read: https://hackr.io/blog/postgresql-vs-mysql
Before you get started with using PostgreSQL, you'll have to install it. Follow these steps to get started:
- There are a couple of ways to install PostgreSQL. One of the easier ways to get started is with Postgres.app. Navigate to https://postgresapp.com/ and then click "Download".
- Once it has finished downloading, double-click the file to unzip it, then move the PostgreSQL elephant icon into your applications folder. Double-click the icon to launch the application.
- You should now see a new window with a list of servers on the left side (on a fresh install, you should see one named PostgreSQL XX). If it shows anything else or an error pops up, make sure you don't have any other instances of Postgres on your computer and reinstall. To fully reinstall, follow these steps to delete data directories and preferences. Click the 'Initialize' button (or 'Start' if you had already installed previously).
- Run
sudo mkdir -p /etc/paths.d && echo /Applications/Postgres.app/Contents/Versions/latest/bin | sudo tee /etc/paths.d/postgresapp
(found here) to use psql in the terminal. Close and reopen the terminal.
- Postgres.app will by default create a role and database that match your current macOS username. You can connect straight away by running psql.
- You should then see something in your terminal that looks like this (with your macOS username in front of the prompt rather than 'postgres'):
- You should now be all set up to start using PostgreSQL. For documentation on command line tools etc. see https://postgresapp.com/documentation/
Digital Ocean have got a great article on getting started with postgres. A quick summary is below.
sudo apt-get update
sudo apt-get install postgresql postgresql-contrib
By default the only role created is the default 'postgres', so PostgreSQL will only respond to connections from an Ubuntu user called 'postgres'. We need to pretend to be that user and create a role matching our actual Ubuntu username:
sudo -u postgres createuser --interactive
This command means 'run the command createuser --interactive as the user called "postgres"'.
When asked for the name of the role, enter your Ubuntu username. If you're not sure, open a new Terminal tab and run whoami.
When asked if you want to make the role a superuser, type 'y'.
We now need to create the database matching the role name, as PostgreSQL expects this. Run:
sudo -u postgres createdb [your user name]
You can now connect to PostgreSQL by running psql.
- To open the PostgreSQL interactive terminal, type this command into the terminal:
psql
- Next type this command into the psql prompt:
CREATE DATABASE test;
NOTE: Don't forget the semicolon. If you do, psql will wait for more input instead of running the command.
- To check that our database has been created, type \l into the psql prompt. You should see something like this in your terminal:
- If you closed the psql session, open it again with:
psql
- To create a new user, type the following into the psql prompt:
CREATE USER testuser;
- Check that your user has been created. Type \du into the prompt. You should see something like this:
Users can be given certain permissions to access any given database you have created.
- Next we need to give our user permissions to access the test database we created above. Enter the following command into the psql prompt:
GRANT ALL PRIVILEGES ON DATABASE test TO testuser;
If you've installed Postgres App as in the example above, you can easily extend it to include PostGIS. Follow these steps to begin using PostGIS:
- Ensure that you're logged in as a user OTHER THAN postgres. Follow the steps above to enable your default user to access the psql prompt (installation step 7).
- Type the following into the psql prompt to add the extension:
CREATE EXTENSION postgis;
After you've extended PostgreSQL with PostGIS you can begin to use it. Type the following command into the psql command line:
SELECT ST_Distance(gg1, gg2) As spheroid_dist
FROM (SELECT
ST_GeogFromText('SRID=4326;POINT(-72.1235 42.3521)') As gg1,
ST_GeogFromText('SRID=4326;POINT(-72.1235 43.1111)') As gg2
) As foo ;
This should return spheroid_dist along with a value in meters. The example above returns 84315.42034614, which is roughly 84.3 km between the two points.
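As a rough sanity check on that figure, the same two points can be run through the haversine formula in plain Python. This is only a sketch: it treats the Earth as a sphere with an assumed mean radius, whereas the query above measures along a spheroid, so the result should be close to the PostGIS value but not identical. The function name haversine_m and the radius constant are illustrative choices, not part of PostGIS.

```python
import math

# Assumption: mean Earth radius in meters (a sphere, not the spheroid PostGIS uses).
EARTH_RADIUS_M = 6371008.8


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Same points as the ST_GeogFromText calls: POINT(longitude latitude).
approx = haversine_m(-72.1235, 42.3521, -72.1235, 43.1111)
print(round(approx))
```

The small gap between this spherical estimate and the spheroid result is expected, which is one reason to let PostGIS do the calculation.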
Once you are serving the database from your computer
- To change database
\connect database_name
- To see the tables in the database
\d
- To select (and show in the terminal) all rows in a table
SELECT * FROM table_name;
- To make a table
CREATE TABLE table_name (col_name1 data_type, col_name2 data_type);
- To add a row
INSERT INTO table_name (col_name) VALUES (col_value);
The column list is only required if only some of the columns are being filled out.
- To set a default value for a column in a table
ALTER TABLE table_name ALTER COLUMN column_name SET DEFAULT expression;
- To add a column to a table
ALTER TABLE table_name ADD COLUMN column_name data_type;
- To count the rows where the word 'Day' is present in the title column of a table
SELECT count(title) FROM table_name WHERE title LIKE '%Day%';
- To delete a row in a table
DELETE FROM table_name WHERE column_name = 'hello';
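The commands above can be tried end to end without a running server. The sketch below uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for PostgreSQL (an assumption made so the example is self-contained); the holidays table and its rows are hypothetical. The psql meta-commands such as \d have no equivalent here and are left out.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL so no server is needed.
conn = sqlite3.connect(":memory:")

# Make a table (hypothetical name and columns) and add some rows.
conn.execute("CREATE TABLE holidays (title TEXT, month TEXT)")
conn.executemany(
    "INSERT INTO holidays (title, month) VALUES (?, ?)",
    [("New Year's Day", "January"), ("Christmas Day", "December"), ("Easter", "April")],
)

# Add a column to the table; existing rows get NULL in it.
conn.execute("ALTER TABLE holidays ADD COLUMN country TEXT")

# Count the rows whose title contains 'Day'.
day_count = conn.execute(
    "SELECT count(title) FROM holidays WHERE title LIKE '%Day%'"
).fetchone()[0]

# Delete one row, then count what is left.
conn.execute("DELETE FROM holidays WHERE title = 'Easter'")
remaining = conn.execute("SELECT count(*) FROM holidays").fetchone()[0]

print(day_count, remaining)
```

One difference worth knowing: LIKE is case-sensitive in PostgreSQL but case-insensitive for ASCII text in SQLite by default, so results can differ on data with mixed case.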
PostgreSQL follows the SQL convention of calling relations TABLES, attributes COLUMNS and tuples ROWS.
Transaction: All or nothing; if something fails, the other commands are rolled back as if nothing had happened.
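A minimal sketch of that all-or-nothing behaviour, assuming an in-memory SQLite database (via Python's built-in sqlite3 module) as a stand-in for PostgreSQL. The accounts table, its CHECK constraint and the balances are hypothetical. The second UPDATE fails, so the first one is rolled back too.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL so no server is needed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance + 150 WHERE name = 'bob'")
        # Alice only has 100, so this breaks the CHECK constraint and raises.
        conn.execute("UPDATE accounts SET balance = balance - 150 WHERE name = 'alice'")
except sqlite3.IntegrityError:
    pass  # the whole transfer was undone, including bob's credit

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

In psql the same idea is written with BEGIN, the statements, and then COMMIT or ROLLBACK.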
Reference: When a table is being created, you can reference a column in another table to make sure that any value added to that column exists in the referenced table.
CREATE TABLE cities (
name text NOT NULL,
postal_code varchar(9) CHECK (postal_code <> ''),
country_code char(2) REFERENCES countries,
PRIMARY KEY (country_code, postal_code)
);
<> means not equal
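To see the reference doing its job, the sketch below creates a hypothetical countries table next to the cities table above and tries to insert a city with an unknown country code. It assumes an in-memory SQLite database (Python's built-in sqlite3 module) in place of PostgreSQL; the sample rows are made up.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL so no server is needed.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to; PostgreSQL always enforces them.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical parent table that cities.country_code refers to.
conn.execute("CREATE TABLE countries (country_code char(2) PRIMARY KEY, country_name text)")
conn.execute("""
    CREATE TABLE cities (
        name text NOT NULL,
        postal_code varchar(9) CHECK (postal_code <> ''),
        country_code char(2) REFERENCES countries,
        PRIMARY KEY (country_code, postal_code)
    )
""")
conn.execute("INSERT INTO countries VALUES ('gb', 'United Kingdom')")
conn.execute("INSERT INTO cities VALUES ('London', 'E1 6AN', 'gb')")  # 'gb' exists

try:
    # 'zz' is not in countries, so the reference rejects this row.
    conn.execute("INSERT INTO cities VALUES ('Atlantis', '00000', 'zz')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

city_count = conn.execute("SELECT count(*) FROM cities").fetchone()[0]
print(rejected, city_count)
```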
Join reads: You can join tables together when reading them.
Inner Join: Joins two tables together by specifying a column in each to join them by, e.g.
SELECT cities.*, country_name
FROM cities INNER JOIN countries
ON cities.country_code = countries.country_code;
This will select all of the columns in the cities table plus the country_name column from the countries table; the rows are matched up by country_code.
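The sketch below runs that join against a few made-up rows, again assuming an in-memory SQLite database (Python's built-in sqlite3 module) in place of PostgreSQL. The simplified table definitions and the sample data are hypothetical. A city whose country_code has no match in countries is left out of the result, which is what makes the join "inner".

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL so no server is needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (country_code TEXT PRIMARY KEY, country_name TEXT)")
conn.execute("CREATE TABLE cities (name TEXT, postal_code TEXT, country_code TEXT)")

conn.executemany(
    "INSERT INTO countries VALUES (?, ?)",
    [("gb", "United Kingdom"), ("fr", "France")],
)
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?)",
    [
        ("London", "E1 6AN", "gb"),
        ("Paris", "75001", "fr"),
        ("Atlantis", "00000", "zz"),  # no matching country, so it drops out of the join
    ],
)

rows = conn.execute("""
    SELECT cities.*, country_name
    FROM cities INNER JOIN countries
    ON cities.country_code = countries.country_code
    ORDER BY cities.name
""").fetchall()

for row in rows:
    print(row)
```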
Grouping: You can put rows into groups, where each group is defined by a shared value in a particular column.
SELECT venue_id, count(*)
FROM events
GROUP BY venue_id;
This will group the rows together by venue_id; count is then performed on each of the groups.
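Here is the grouping query run against a tiny, made-up events table, assuming an in-memory SQLite database (Python's built-in sqlite3 module) in place of PostgreSQL. The ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL so no server is needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (title TEXT, venue_id INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("Concert", 1), ("Play", 1), ("Lecture", 2)],  # hypothetical events
)

# One output row per distinct venue_id, with the number of events at that venue.
rows = conn.execute("""
    SELECT venue_id, count(*)
    FROM events
    GROUP BY venue_id
    ORDER BY venue_id
""").fetchall()

print(rows)
```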
- Node-hero: https://blog.risingstack.com/node-js-database-tutorial
- Pluralsight postgres getting started: https://www.pluralsight.com/courses/postgresql-getting-started
- Tech Republic Postgres setup: https://www.techrepublic.com/article/diy-a-postgresql-database-server-setup-anyone-can-handle/
- PostGIS installation: https://postgis.net/install
- PostGIS docs: https://postgis.net/docs/manual-2.3
- SQL Tutorials: https://www.scaler.com/topics/sql/
- PostGIS ST_Distance: https://postgis.net/docs/ST_Distance.html
- Foreign Key Constraints:
- Graphical Interface (GUI) tools: https://wiki.postgresql.org/wiki/Community_Guide_to_PostgreSQL_GUI_Tools