Operability #2

Open
eldavido opened this issue Apr 15, 2014 · 4 comments

@eldavido

A few days ago, we tried to deploy a new revision of our Node.js-based data collection software to production. That revision declared this library as a dependency in package.json in the usual way and invoked it through the parse method.

Everything worked fine in local development.

When we went to deploy it, we found the library was downloading a new user agent strings file into our production environment. After reading the code, I understand this is intended behavior, and the library does this automatically to ensure freshness of the user-agent definitions.

However, during our deployment, we hit a snag where the user-agent strings site was temporarily unavailable, causing our deployment to fail. After discussing this with our director of operations, we decided it's too risky to have a core part of our UA parse logic depending on the uptime of a site that's out of our control, especially given that this library doesn't appear to fail gracefully if it can't update itself.

To summarize, the current strategy has the following problems:

  • Deployments can fail if the remote user-agent strings site isn't available
  • Our deployment packages don't fully capture the state of the system
    • Behavior can differ between instances of the code, if they've loaded different versions of the updates file (we run this code on a cluster of dozens of machines)
    • Bug reproduction becomes more complicated, as the behavior of the system won't stay constant over time, which complicates change management and issue tracking

Before I try to write a patch, I wonder how @GUI feels about any of the following approaches:

  1. Allow specification of a filesystem path from which the user-agent strings file will be loaded. Our operations team would be responsible for keeping that file up to date (probably via a deployment package or a cronjob, but let us handle that as part of our own deployment/change management procedures)
  2. Allow automatic updates to be disabled, instructing the code to use the snapshot of the data which ships with the module

I think my preference would be (1) but I'm just brainstorming here, any input would be appreciated.
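To make the two options concrete, here is a minimal sketch of how they could fit together. The function name `loadDefinitions` and the option names `dataPath` and `bundledPath` are hypothetical and are not part of this library's current API. The sketch also assumes a JSON definitions file purely for illustration; the real file format may differ.

```javascript
'use strict';
const fs = require('fs');

// Hypothetical loader: none of these names exist in the library today.
// It never touches the network, so a deployment cannot fail because a
// remote site is down.
function loadDefinitions(options) {
  const opts = options || {};

  // Option 1: a file managed by the operations team takes precedence.
  // Option 2: otherwise fall back to the snapshot shipped with the module.
  const source = opts.dataPath || opts.bundledPath;
  if (!source) {
    throw new Error('No user-agent definitions file configured');
  }

  // JSON is an assumption made for this sketch only.
  const raw = fs.readFileSync(source, 'utf8');
  return { source: source, data: JSON.parse(raw) };
}
```

With something like this, every machine in a cluster reads the same file that was shipped in the deployment package, so behavior stays constant until operations deliberately rolls out a new definitions file.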

@GUI
Owner

GUI commented Apr 15, 2014

Great points all around. Thanks for bringing this up. I like the sound of option 1 too. If you want to write a patch for that, I'm definitely open to pull requests.

@eldavido
Author

I chatted with the team. Unfortunately, we've decided to go with the useragent npm package instead of this one, as it already does some of these things. Good lessons to learn for future software dev though. - D

@GUI
Owner

GUI commented Apr 15, 2014

Thanks for raising these issues, in any case. I'm going to reopen this since I would still like to tackle these issues (cleaning up the auto-update mechanism, making it more configurable, and handling potential outages better). I'm not quite sure when I'll be able to get to this, but hopefully it will be sometime soonish.

Thanks again!
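As a rough illustration of what "handling potential outages better" could look like, the sketch below keeps serving the definitions that are already loaded whenever an update attempt fails. `refreshDefinitions` and the `fetchLatest` callback are hypothetical names invented for this example; they do not describe the library's existing update code.

```javascript
'use strict';

// Hypothetical helper: `fetchLatest` is any function(callback) that tries to
// download new definitions and reports back as callback(err, latest).
function refreshDefinitions(current, fetchLatest, done) {
  fetchLatest(function (err, latest) {
    if (err || !latest) {
      // Outage or empty response: keep the data already in use and report
      // the problem instead of failing the caller.
      return done(null, {
        data: current,
        updated: false,
        error: err || new Error('Update returned no data')
      });
    }
    done(null, { data: latest, updated: true, error: null });
  });
}
```

The point of the design is that a failed update becomes something to log and retry later, not a reason for parsing or a deployment to stop working.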

@GUI GUI reopened this Apr 15, 2014
@eldavido
Author

On behalf of ops teams everywhere, thanks. I wore the pager at my current place (Crittercism) for 1.5 years, and it was amazing how much I learned about deployment, safety, and operability.
