Skip to content
Andy Theuninck edited this page Jan 4, 2016 · 4 revisions

The HQ Data Model is one method for managing data across multiple stores. This model centralizes data management and is not designed for resiliency against connectivity problems. On-location data stores are provided for performance as well as minimum necessary functionality during outages as opposed to full redundancy.

Terms

  • A Database (DB) is a server running SQL, typically MySQL
  • A Web Server is a server running the Office UI, typically via Apache
    • In diagrams databases and web servers are shown as separate machines. This is for illustrative purposes and not strictly necessary.
  • A Lane is a machine running the POS software and checking out customers
  • Writes are SQL queries that add, remove, or alter records within the database or structures within the database.
  • Reads are SQL queries that are not Writes
  • Replication is an automated process where changes to one database are propogated to one or more slave DBs
  • Manual Syncs are a non-automated process where data is transferred between databases.

Diagram

The Databases

The Master Database is the primary data store for the entire organization. If there is any discrepancy between databases, the Master Database is correct. Directing the vast majority of Writes to the Master Database greatly reduces the potential for discrepancies and inconsistencies. The Master Database always resides at HQ. HQ might be a retail location with its own Lanes but does not have to be.

The Store Database is a local copy of the Master Database. Each non-HQ location should have a Store Database replicating the Master Database. The Store Database serves two purposes. First, it reduces load on the Master Database and latency for end users by handling a portion of Reads. Second, it provides a local data store for lane transaction data. Because transaction data is only INSERTed and never modified it's safe to handle this data out-of-band from replication. One named database on the Store Database server should be excluded from replication for this purpose.

Slave Databases are entirely optional. One or more slave databases at each location can also replicate the Master Database. Slave Database can be used to further spread out the load of Read queries, as hot spares for other hardware, or for taking snapshot backups without taking the Master Database offline.

The Web Servers

The web servers run the Office software and provide the end-user UI. The HQ web server simply Reads from and Writes to the Master Database. A non-HQ web server should still be configured with the Master Database as its primary data store. All Writes will be directed to the Master Database while Reads will be load-balanced across all available databases.

The Lanes

Lanes interact with their nearest database. HQ Lanes, if any, will interact with the Master Database. Non-HQ Lanes will interact with their Store Database. Writes to lanes are usually proxied through the location's Web Server which will read from an available database and copy the results out to the lanes. As mentioned above, non-HQ lanes should be configured to Read and Write from the local Store Database. Any loss of site-to-site connectivity will be entirely invisible to the Lanes.

Connectivity and Outages

Loss of connectivity between locations causes problems for non-HQ locations. The Store Web Server can no longer save any changes since all Writes depend on the Master Database and some fraction of Reads will fail as well (this could and should be mitigated in software detecting loss of Read DBs but is not currently). Possible mitigation strategies include:

  • Wait / Fix the connectivity problem. Depending on circumstances this may be perfectly reasonable.
  • Reconfigure Office to use the Store Database as its primary data store.
    • If you use an SQL user account that only has read access to the Store Database, changes cannot be saved but no data inconsistencies will be introduced.
    • If you use an SQL user account that has read/write access to the Store Database, you'll need to do some cleanup once connectivity is restored.
    • If you use an SQL user account that has read access to the entire Store Database but only has read/write access to one named database, e.g. core_op, the scope of cleanup will be reduced. Giving read/write access as needed on a per-table basis could be even better.

Load Balanced Reads

This setup makes a clear distinction between Reads and Writes. It's the developers responsibility to keep track of these. Calling FannieDB::getReadOnly will return a connection to an available database where as calling FannieDB::get will return a connection to the configured primary data store. Alertnate, read-only databases (both Store DBs and Slave DBs) are configured on Office's Stores tab. Using SQL user account with read-only access is recommended. Developers are human and may accidentally send Write queries to a read-only database.

Clone this wiki locally