Skip to content

Latest commit

 

History

History
149 lines (82 loc) · 4.47 KB

RUNBOOK.md

File metadata and controls

149 lines (82 loc) · 4.47 KB

Origami Navigation Service

Provides consistent navigation for FT applications.

Code

origami-navigation-service

Service Tier

Platinum

Lifecycle Stage

Production

Primary URL

https://www.ft.com/__origami/service/navigation/v2

Host Platform

Heroku

Contains Personal Data

No

Contains Sensitive Data

No

Can Download Personal Data

No

Can Contact Individuals

No

Failover Architecture Type

ActiveActive

Failover Process Type

FullyAutomated

Failback Process Type

FullyAutomated

Data Recovery Process Type

Manual

Release Process Type

PartiallyAutomated

Rollback Process Type

PartiallyAutomated

Key Management Process Type

Manual

Architecture

This is mostly a Node.js application but with the following external components:

  • An S3 bucket which contains the navigation data as JSON

Loading Data

  1. When this service starts, the first thing it does is fetches a JSON file from an S3 bucket
  2. Having loaded the navigation data, the application can begin serving requests
  3. While running, this application polls the S3 bucket for new data at a regular interval. If the S3 bucket is not available when it's requested, then a stale in-memory copy is served

More Information

See the live service for more information on how to use.

Heroku Pipeline Name

origami-navigation-service

First Line Troubleshooting

There are a few things you can try before contacting the Origami team:

  1. Restart all of the dynos across the production EU and US Heroku apps (pipeline here)

Second Line Troubleshooting

If the application is failing entirely, you'll need to check a couple of things:

  1. Did a deployment just happen? If so, roll it back to bring the service back up (hopefully)
  2. Check the Heroku metrics page for both EU and US apps, to see what CPU and memory usage is like (pipeline here)
  3. Check the Splunk logs (see the monitoring section of this runbook for the link)

If only a few things aren't working, the Splunk logs (see monitoring) are the best place to start debugging. Always roll back a deploy if one happened just before the thing stopped working – this gives you the chance to debug in the relative calm of QA.

Monitoring

Failover Details

Our Fastly config automatically routes requests between the production EU and US Heroku applications. If one of those regions is down, Fastly will route all requests to the other region.

Data Recovery Details

If the data is lost, the S3 bucket would need to be recreated manually. This process should involve a single deploy of the Origami Navigation Data service, so it's relatively simple.

Release Details

The application is deployed to QA whenever a new commit is pushed to the main branch of this repo on GitHub. To release to production, the QA application must be manually promoted through the Heroku interface.

Key Management Details

This service uses two keys:

  1. GitHub (with read permissions only)
  2. AWS (read/write permissions for a single S3 bucket)

The process for rotating these keys is manual, via the GitHub and AWS interfaces.