| 1 | +{:.post-meta} |
| 2 | +*by [Roman Taranchenko][], Engineer at [Suredbits][]* |
| 3 | + |
| 4 | +After the first excitement of sending and, more importantly, receiving a |
| 5 | +payment over the Lightning Network has faded away, it’s always good to |
| 6 | +think about how to operate your node in a safe and reliable way. |
| 7 | +Failures almost always happen unexpectedly. How do you recover after a |
| 8 | +possible failure? How do you make backups reliable? How do you keep the |
| 9 | +seed in a secure location? Et cetera, et cetera… |
| 10 | + |
| 11 | +At [Suredbits][] we use Eclair for our nodes. Even though Eclair is |
| 12 | +pretty robust on its own, we took some steps to make it even more |
| 13 | +reliable, such as using PostgreSQL as a database backend [(there is a PR |
| 14 | +in review at the time of writing)][db pr] and [AWS Secrets Manager][] to |
| 15 | +store private keys. |
| 16 | + |
| 17 | +Eclair has a built-in online backup feature, but it requires manual |
| 18 | +setup and script writing to automate, which is error-prone and does |
| 19 | +not scale well. Running PostgreSQL on AWS RDS allows us to automate |
| 20 | +backups and replication in a way almost every DevOps engineer is |
| 21 | +familiar with, and to restore the database state more easily when |
| 22 | +needed. |
| 23 | + |
| 24 | +Using PostgreSQL as a remote database backend makes node failover |
| 25 | +simpler to implement, because if the node crashes for some reason |
| 26 | +there’s no need to restore the database from a backup; all you need is |
| 27 | +to point a new Eclair instance to the correct database server. [Here’s a |
| 28 | +quick demo of an automated failover implemented with two Eclair |
| 29 | +instances, and AWS RDS, ELB, and NAT Gateway.][failover demo] |
| 30 | + |
| 31 | +In the failover scenario depicted in the demo, we needed a secure way to |
| 32 | +share the node’s seed private key between the Eclair instances. Eclair |
| 33 | +stores it in a file on the local file system and the seed file should be |
| 34 | +backed up somewhere and restored when needed. The current implementation |
| 35 | +requires extra steps to do so in an automated fashion. AWS Secrets |
| 36 | +Manager is an encrypted storage service specifically designed to store |
| 37 | +various kinds of secrets, including database passwords and encryption |
| 38 | +keys. All you need to do to share the seed between the instances is to |
| 39 | +point them to the correct secret location in the config file. Once |
| 40 | +configured, the instance can be saved as an AMI and re-imaged as |
| 41 | +many times as needed without manual configuration. |
| 42 | + |
| 43 | +The measures we took are just the first steps toward building |
| 44 | +enterprise-grade Lightning nodes. Some problems still need to be |
| 45 | +solved: for example, which Hardware Security Module (HSM) can be |
| 46 | +used for a Lightning node, or how to fail over a Bitcoin Core node in a |
| 47 | +multi-instance setting. But we believe that our work is a |
| 48 | +solid base for scaling out Eclair and making it more fault-tolerant. |
| 49 | + |
| 50 | +More on this topic: <https://www.youtube.com/watch?v=tbwy9mJIrZE> |
| 51 | + |
| 52 | +Disclaimer: Since private keys are involved, don't use third-party cloud |
| 53 | +services without a thorough risk assessment. |
| 54 | + |
| 55 | +[Roman Taranchenko]: https://github.com/rorp |
| 56 | +[suredbits]: https://suredbits.com |
| 57 | +[db pr]: https://github.com/ACINQ/eclair/pull/1249 |
| 58 | +[aws secrets manager]: https://github.com/rorp/eclair/tree/aws_secretsmanager |
| 59 | +[failover demo]: https://youtu.be/L2DtolwS8ew |