Mike Grauer edited this page Mar 25, 2021 · 1 revision

Girder 4 Wiki

Purpose of Girder 4

The purpose of Girder 4 is to allow experienced software engineers in Kitware's Data and Analytics (D&A) team to build cloud-native web applications that deal with heavyweight unstructured or semi-structured data, in order to support scientific and AI/ML workflows. It serves customers who want robust, customizable solutions that start from a strong foundation of tooling.

Girder 4 can be used for rapid prototyping, but it is primarily intended for creating software services in production.

Scope of Girder 4

Girder 4 is a set of libraries and best practices that can be used to build cloud-native Python web applications with support for:

  • authentication and user management
  • domain modeling
  • management of unstructured data
  • asynchronous processing

Differences with Girder 3

Philosophy

Girder 4 is not an application that can simply be started in the way that Girder 3 was, i.e. `pip install girder && girder build && girder serve`. Instead, it requires some amount of coding and data modeling to get started.

Girder 4 is built on top of well-documented third-party open source tools, so that we can outsource much of the undifferentiated heavy lifting of building web apps to those tools. This means that more prerequisites need to be understood to develop a Girder 4 application, but also that the underlying tools are better documented and engineers are better able to help themselves using e.g. Stack Overflow, rather than relying on a small number of D&A developers for answers.

With Girder 4, we have identified gaps in the market and filled them to address our common use cases, and we are creating artifacts, libraries, and patterns that are supported, generalized, and battle-tested.

Technical Base

Girder 4 is built on Django and Postgres, and uses the AWS S3 API. In the most common case, it is intended to be deployed on Heroku (a platform-as-a-service provider), backed by AWS for S3 and for other services such as sending email. There is no longer an assetstore abstraction over files from filesystems and S3-compliant object stores.
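As a rough illustration of what this technical base implies for configuration, the sketch below builds Django-style settings for Postgres and S3-backed file storage. It is a minimal sketch under assumptions, not the actual Girder 4 configuration: the function names and the `STORAGE_BUCKET_NAME` variable are hypothetical, and it assumes the third-party django-storages package provides the S3 backend. Heroku Postgres does expose the connection string as `DATABASE_URL`.

```python
from urllib.parse import unquote, urlparse


def database_settings_from_url(url: str) -> dict:
    """Translate a postgres:// URL into a Django-style DATABASES entry."""
    parts = urlparse(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unsupported database scheme: {parts.scheme!r}")
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parts.path.lstrip("/"),
        # Credentials may be percent-encoded inside the URL.
        "USER": unquote(parts.username or ""),
        "PASSWORD": unquote(parts.password or ""),
        "HOST": parts.hostname or "",
        "PORT": str(parts.port or ""),
    }


def build_settings(env: dict) -> dict:
    """Hypothetical helper: assemble the settings that depend on the platform."""
    return {
        "DATABASES": {"default": database_settings_from_url(env["DATABASE_URL"])},
        # Files go straight to an S3-compatible bucket; there is no assetstore
        # layer in between. Assumes django-storages; newer Django versions
        # configure this through the STORAGES setting instead.
        "DEFAULT_FILE_STORAGE": "storages.backends.s3boto3.S3Boto3Storage",
        # STORAGE_BUCKET_NAME is an assumed environment variable name.
        "AWS_STORAGE_BUCKET_NAME": env["STORAGE_BUCKET_NAME"],
    }
```

In a real project these values would live in the Django settings module; returning a plain dictionary here just keeps the sketch runnable without Django installed.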

TODO:

  • dev tooling
  • deployment tooling

Domain Modeling and Data Driven Design

TODO:

  • explain DDD
  • comparison with files/folders

Ecosystem of Repos

pass

Development Prerequisites

pass

Girder 4 in Operation

TODO:

  • Terraform
  • Terraform Cloud
  • CI/CD path
  • Sentry
  • Papertrail

Choice of Heroku

  • 12-factor app and guardrails
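One reading of the point above is that a 12-factor app keeps all deployment-specific configuration in environment variables, and that the application should refuse to start when that configuration is incomplete. The following is a minimal sketch of that idea, assuming hypothetical key names and a hypothetical `load_config` helper; it is not taken from Girder 4 itself.

```python
import os
from typing import Mapping

# Hypothetical set of settings that every deployment must provide.
REQUIRED_KEYS = ("DATABASE_URL", "SECRET_KEY", "STORAGE_BUCKET_NAME")


class MissingConfigError(RuntimeError):
    """Raised at startup when required environment variables are absent."""


def load_config(environ: Mapping[str, str] = os.environ) -> dict:
    """Read configuration from the environment, failing fast if incomplete."""
    # Collect every missing or empty key so the error reports all of them at once.
    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        raise MissingConfigError(
            "missing required settings: " + ", ".join(missing)
        )
    config = {key: environ[key] for key in REQUIRED_KEYS}
    # Optional flag with a safe default: debug stays off unless explicitly enabled.
    config["DEBUG"] = environ.get("DEBUG", "false").lower() in ("1", "true", "yes")
    return config
```

Calling `load_config()` once at process start means a misconfigured deployment fails immediately with a clear message, rather than at the first request that touches the missing setting.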

Limitations of Heroku

pass

EC2 Workers

pass

Moving off of Heroku

Girder 4 Architecture

pass