% jason(7) | cloud and database-wrangler - Linux man page
jason tevnan - cloud and database-wrangler with a penchant for automation and sous vide cooking
jason [--store SQL|NOSQL] [--administrate] [--automate] [--monitor] [--virtualize] [--containerize] [--script] [--think]
I have worked in a variety of environments, sometimes outside my comfort zone, where I have been forced to learn and run technologies in production at the same time. I pursue issues with a passion (as any good DBRE/SRE would) and don’t settle for "it just works". I love learning new technologies and wrapping my head around new paradigms, ideally while enjoying dark-roast coffee at my standing desk.
- Verifying database workloads, identifying schema bottlenecks, and tuning InnoDB settings while providing an HA database can be a lot of fun
- Scaled Runtastic’s core database from 1M to 99M users, with the capacity to handle 100k QPS
- Migrated 90+ microservice DBs from Cloud SQL to RDS with 7 seconds of downtime
- Partitioning a table or verifying the InnoDB engine status are things I consider fun
- Experience finding and debugging DBaaS-provider-based issues while gathering all the metrics possible
- Designed and implemented an upgrade system to move 200+ MySQL 5.7 databases to 8.0 with 3 seconds of downtime
- Used it with all its geo-spatial features for OpenStreetMap
- Automated replication to another database with Slony, before 9.0 came out with log shipping
- Automated replication and fail-over with repmgrd/patroni
- Had lots of fun automating pgbouncer with consul as an HA write endpoint
- Have come to embrace MVCC and the pros and cons that it brings
- Have solidified my understanding of the pg_% tables and the plethora of information they provide
- Love (and sometimes hate) the pg_stat_statements extension
- Had to learn on highly performant, always-on database systems
- Learned what a missing index can do, which queries to avoid and how to find the meaning of all those v$ views
- The experience I gathered understanding B-Tree index look-ups helped me to understand look-ups across all database systems
- Enjoyed a love/hate relationship with Azure SQL server as a core database
- Love the depth of the query planner and the insight into live statistics
- Hate the fact that indexes are built around fragmentable structures and inline statistics calculation is enabled by default
- Attended admin training courses and came to understand that schema-less can also be a curse: defining a scalable, future-proof indexing concept is incredibly challenging
- Enjoy defining shard keys for collections to minimize scatter-gather queries, optimizing storage read-ahead settings, and playing with WiredTiger cache options
- Have lots of experience running the robust HA setup which is very easy to run from an operations standpoint
- I find the replication approach and ring setup amazingly robust, with anti-entropy read repairs and the ability to stream and receive data while accepting queries
- Scaled Runtastic’s cluster from 4 to 8 to 32 and finally to 64 nodes with almost no downtime
- Run instances under normal conditions as well as high-load conditions
- Have found the right balance between RAM and IOs to ensure that they can flush to disk
- Since Redis is the core of many Ruby queueing solutions, I have been forced to face my fears, tuning BGSAVE cycles and finding the optimal fsync to AOF ratios
- Was thrown into cold water with multiple clusters running at high load
- When building the Prometheus exporter for a CouchDBaaS provider, learned about many of the internal stats and their meanings
- Translated that knowledge into practical experience by building Kubernetes-powered scale-out deployments for load testing, while starting an ever-growing, comprehensive set of cluster administration documentation
(run, play, break, repeat)
- Traditional stack (e.g. apache2, nginx)
- Extended web servers (e.g. trinidad, passenger)
- Experience debugging performance bottlenecks
- Setup instances which handle > 80k rpm
- HAProxy, nginx
- LOTS of experience with the normal Linux stack (e.g. bind, dhcpd, ldap, openvpn, ssh, memcached ...)
- After > 15 years of experience with all aspects of the OS, I still think Linux is the best for servers
- zookeeper
- rabbitmq
- activemq
- nats (with jetstream)
- consul (for service discovery)
- All in clusters running at least 3 nodes
- Very interesting (i.e. challenging) to scale
- Run at scale (gitlab.com) as well as in company-wide implementations
- Experience with the joys and pains of CI implementation and administration.
(Automation, testing and auditing are inevitable in today’s world of highly fluctuating infrastructure)
- Wrote and deployed cookbooks for every aspect of Runtastic's infrastructure
- Try to ensure that all infrastructure code has full test coverage
- Test-Kitchen, inspec and chefspec are my friends
- Wrote and deployed roles to automate cache layer deployments
- Discovered a love/hate relationship with Python's Jinja2
- Compiled modules to simplify complex deployments
- Wrote a provider to interface with OpenNebula
- Used to deploy all aspects of Cabify and Fonoa's non-application-layer infrastructure
- flux - to run large and small infrastructure
- ArgoCD - and the app-of-app-of-apps
- Helm - the joys and the follies
(No observability means not knowing anything)
- Wrote and deployed numerous checks
- Running an NRPE-based deployment with full automation
- > 10k checks distributed across 1k servers
- Wrote and integrated checks for nfs-iostat and mongodb
- Running and fully automated with a graphite front end
- Collecting > 100k metrics an hour
- Implemented Percona’s graphing suite for MySQL
- Collect all core database metrics, from connections to InnoDB flush times
- Alert-manager, recording-rule, exporter - oh my. Very powerful solution with an ever growing community? Count me in.
- Wrote recording/alerting rules with unit tests
- Experience with some storage engine and memory shenanigans
- Visualization with Grafana
- Wrote exporters for databases and weather stations
- Very familiar with New Relic, Pingdom, Dynatrace, PagerDuty, VictorOps
(control your destiny - as much as you can)
- Experienced every phase of growth from 8 hypervisors to 60
- Have run OpenNebula as an EC2 replacement, as a native cloud (extensive API), and as a simple server manager
- In the process of automating setup and configuration via Terraform
- Qemu based
- NFS and Ceph storage backend
- Currently use it as a minikube virtualizer
- Runtastic’s pre-production system ran on vbox for a long time (hard to imagine)
- Mainly running older cookbook tests with vbox
- Automate Google Cloud Platform (GCP) and Azure instance deploys with Terraform
- Experience the joys (it’s so easy) and pains (why is the DB rebooting?) of not controlling your hypervisors
(run it like mike)
- Write Dockerfiles to encapsulate many applications
- Build typical applications as well as X-based, multi-arch, multi-stage ones
- Automated container builds with GitLab CI and Bats
- Wrote many manifests for different applications, ranging from banal to complex
- Run my own cluster on Raspberry Pis for all my home needs
- Gave a talk at SFSCon about using Flux to automate manifest deployments
- Experience running complex and simple jobs
- Integrated with other HashiCorp products (Consul, Hashiui)
- As a plugin for new test-kitchen deployments
- Played around a bit with LXD
- Wrote extensive Bash scripts for automation with unit tests (bats)
- Found out that Bash has its limits :)
- Enjoy writing and maintaining a Go backend for a research project
- Wrote gobench to benchmark schemas in MySQL and PostgreSQL for high throughput
- Learned about API design the hard way while using gRPC
- vim > emacs
- zsh > bash
- tmux > screen
- Built automation for seamless MySQL 5.7 to 8.0 live migrations
- Took on a more staff engineering role focused on mentoring and teaching
- Drove the redefinition of higher-level engineering roles throughout the company
- Re-imagine how database-related support is handled with data and ChatGPT
- Fully remote
- Automate, run, and manage all database-related technologies: MySQL/Postgres (Google Cloud SQL), MSSQL (Azure)
- Ensured that monitoring and alerting were available, with end-to-end testing
- Build an exporter for missing SQL Server metrics
- Create runbooks for oncall team members with little database context to ensure service continuity
- Write design guides to help developers understand their schema and engine decisions
- Fully remote
- Downsized out of a job :/
- Tasked with automating, managing, and running all database-related technologies: MySQL (Google Cloud SQL), CouchDB (Cloudant), Redis, Memcached, Elasticsearch
- Made fully monitored, highly available database creation self-service
- Build exporters for missing observability in the DBaaS platform
- Automate no-downtime, SQL-based, CI-powered schema changes
- Continually document and assist developers in making persistence decisions
- Support developers in identifying design bottlenecks in query patterns and database design
- Fully remote
- Member of a small, fully remote team
- Scale gitlab.com (millions of users) using GitLab (typically built for thousands of users) in a cloud environment
- Collaborate on developing an HA solution for PostgreSQL in the GitLab omnibus package
- Strove to fully automate environments from Terraform to multi-tiered HA stack
- Build a back-end-agnostic solution for secrets in Chef
- Use Chef to automate all-the-things
- Fully remote
- Define setup and strategy for each upcoming stack
- Ensure scalability of technologies and concepts
- Setup workflows for automation and deployments
- Organize a small team while fighting to stay ahead of growth
- Found it very challenging to lead a team of inexperienced ops while shaping our infrastructure
- Nested under the web development team
- Start automation
- Improve uptime through monitoring and derive future actions
- Conceptualize a private cloud based on OpenNebula
- Set up a ticketing workflow based on ITIL best practices
- Created automated master/slave setup with Slony for PostgreSQL 8.3/8.4
- Spent time training staff in the casino headquarters to be first level support techs
- Introduce metric collection to visualize hardware utilization for the customer
- Manage customer care projects
- Responsible for everything from planning to doing
- Largest project was a complete warehouse upgrade for a medium-sized 24x7 cosmetics distributor
- Organized and held numerous on-site training courses around the world
- Field production problems in a 24x7 environment
- Handle issues ranging from PLC (Siemens S7) to tablespace cleanups on a core Oracle instance
- Extra-occupational program
- Email: jason.tevnan@gmail.com
- Phone: +43.650.2167444
- LinkedIn: https://at.linkedin.com/in/jason-tevnan-5390b4a8
Prone to flu if left in rain.
Jason Tevnan (jason.tevnan@gmail.com)