2021-08-20 ipfs.io outage Post Mortem Tracking issue #469

Open
1 of 5 tasks
BigLep opened this issue Aug 27, 2021 · 4 comments

Comments

@BigLep
Contributor

BigLep commented Aug 27, 2021

The purpose of this issue is to publicly track the various action items we're taking as a result of the 2021-08-20 ipfs.io outage: https://blog.ipfs.io/2021-08-27-IPFS.io-gateway-outage-resolution/

  • Reduce the blast radius by separating the IPFS website from the ipfs.io gateway onto different domains (see https://github.com/protocol/bifrost-infra/issues/178).
  • Reduce the time to respond by paging engineers on sustained gateway inaccessibility.
  • Reduce time to mitigation by establishing and documenting direct human lines of communication for the registrars of domains of gateways operated by Protocol Labs.
  • Reduce the likelihood of a complete domain takedown by making it even easier for a concerned party to contact us directly about objectionable content. Beyond our pre-existing abuse takedown email and the resources on the Gateway FAQ and ipfs.io/legal, we will also publish and monitor email addresses in our domain records and ensure the root websites of our gateway domains provide links to our content policies.
  • Reduce recovery time by simplifying and better documenting the custom DNS resolution on our gateways.
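
To illustrate what "sustained" could mean for the paging item above, here is a minimal sketch and not a description of the actual monitoring setup. It assumes a periodic health probe against the gateway produces a pass/fail result; the class name `SustainedOutageDetector` and the `threshold` parameter are hypothetical. A page fires only after `threshold` consecutive failed probes, so a single blip does not wake anyone up.

```python
from collections import deque
from typing import Deque


class SustainedOutageDetector:
    """Signals a page only after `threshold` consecutive failed probes.

    Hypothetical helper for illustration; the probe itself (for example an
    HTTP request to the gateway) is assumed to run elsewhere.
    """

    def __init__(self, threshold: int = 5) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        # Only the most recent `threshold` probe results are kept.
        self._recent: Deque[bool] = deque(maxlen=threshold)
        self._paged = False

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result; return True when a page should be sent."""
        self._recent.append(probe_ok)
        sustained = len(self._recent) == self.threshold and not any(self._recent)
        if sustained and not self._paged:
            # Page once per outage instead of on every failed probe.
            self._paged = True
            return True
        if probe_ok:
            # A successful probe ends the outage and re-arms the detector.
            self._paged = False
        return False
```

With `threshold=3`, the third consecutive failure triggers a page, further failures in the same outage stay silent, and a successful probe re-arms the detector for the next outage.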
@ipfs ipfs deleted a comment from welcome bot Aug 27, 2021
@BigLep
Contributor Author

BigLep commented Aug 28, 2021

The blog post for this event is being published via ipfs/ipfs-blog#309

@BigLep
Contributor Author

BigLep commented Jul 8, 2022

For reference, the notes that were used to generate the public blog post and the list of action items are here: https://www.notion.so/pl-strflt/2021-08-20-ipfs-io-outage-Post-Mortem-d3937d6992fc4822be554807209f5fb9

@BigLep
Contributor Author

BigLep commented Jan 2, 2023

2023-01-02 update: I know the first action item was completed. I haven't checked with infra and techops on the latest for the other items. I'm not currently able to chase this.

@andyschwab and @JesseXie, are you able to verify here? Internal tracking of items was here: https://www.notion.so/pl-strflt/2021-08-20-ipfs-io-outage-Post-Mortem-d3937d6992fc4822be554807209f5fb9#56155b871db14a699f7e51afe375a866

@andyschwab
Member

Items 3 & 4 are being tracked by RSS. More to come this quarter.
