
DM-30446: Implement DMTN-183's alert database server #1

Merged
merged 10 commits into main from tickets/DM-30446
Jul 27, 2021

Conversation

spenczar
Collaborator

@spenczar spenczar commented Jul 13, 2021

Add a FastAPI-based server which provides alerts and schemas according to the routes laid out in DMTN-183.

This is done in a package structure which is separate from the LSST stack because I do not expect it to depend upon the stack, and the stack's complexity is not warranted here.

I'm trusting the integration test pretty heavily. I haven't really run the command line application because there's no data to surface.

@spenczar spenczar marked this pull request as draft July 13, 2021 19:53
Moved the Alert DB's backend implementations into alertdb/storage.py.

Fixed the server's responses so they provide raw bytes rather than
JSON-encoding the byte sequences.

Added an integration test.

Added requests dependency for the integration test.
Fix an invalid import, and correctly spell 'AssertionError'.
@spenczar spenczar marked this pull request as ready for review July 16, 2021 21:06
@spenczar spenczar changed the title Add a first sketch of an alert database server Implement DMTN-183's alert database server Jul 16, 2021
@spenczar spenczar changed the title Implement DMTN-183's alert database server DM-30446: Implement DMTN-183's alert database server Jul 16, 2021

@jonathansick jonathansick left a comment


This is fantastic. I really appreciate how this architecture cleanly abstracts the storage backend so that storage can be implemented multiple ways without needing to change the web service or orchestration layers.

I noticed that the server is built with FastAPI but everything about the app is synchronous, from the request handlers to the backend interface. Would you consider building around asyncio? Like, build the FileBackend with aiofiles. For the S3-compatible backends, I don't know, there is aiobotocore, aioboto3 and ways to use boto3 itself concurrently. I honestly don't know what sort of performance gain you'd get from using asyncio, but it might be something to look into.
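To make the suggestion concrete: an async FileBackend could also be written with just the standard library, offloading blocking reads to a thread so the event loop stays free. This is a minimal sketch; the class and method names are illustrative, not the actual backend interface, and aiofiles would replace the to_thread call.

```python
import asyncio
from pathlib import Path


class AsyncFileBackend:
    """Hypothetical async variant of a file-based backend; names are illustrative."""

    def __init__(self, root: str):
        self.root = Path(root)

    async def get_alert(self, alert_id: str) -> bytes:
        # Offload the blocking disk read to a worker thread so the
        # event loop can service other requests in the meantime.
        path = self.root / "alerts" / alert_id
        return await asyncio.to_thread(path.read_bytes)
```

With aiofiles, the body would instead be an `async with aiofiles.open(...)` read, but the shape of the interface is the same.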

I'd recommend getting some GitHub Actions-based CI up before going much further. In particular, linting with flake8 and type checking with mypy would be big wins (the type annotations are already in place). I also think it'd be beneficial to run tests against the FileBackend and have those run in GitHub Actions. You could either generate sample data on the fly or commit some sample data in a tests/data directory.

My last big comment is totally beyond the scope of this PR and more a comment for DMTN-183. I wonder if there's a use case for GET /v1/alerts/:alert_id to take an Accept header and either return the Confluent Wire Format or indeed decode it into JSON given an application/json mime type. I say this because this alert server and alert database now serve to abstract the Kafka infrastructure away, so perhaps it makes sense to go all in and not force the Avro encoding on consumers? You could even start to subset fields in the response, etc. This is sort of pie-in-the-sky thinking, and in fact you may not want the server to take on extra compute load doing this.

Everything else is sort of thoughts for making this service more amenable to containerization and/or opinions:

  • Rather than maintaining your app's dependencies in your setup.cfg, consider a system that lets you pin dependencies (requirements.txt) for repeatable container builds, while still allowing you to think only about your abstract dependencies. In SQuaRE we've got a system based on pip-tools that compiles our abstract dependencies in a requirements/main.in file into concrete dependencies in a requirements/main.txt file.
  • At some point, you may want a configuration object that gathers configurations from environment variables (we like pydantic.BaseSettings). You might find this more convenient and maintainable than gathering settings via the alertdb.py script. In this mode, your application startup module would build the config object first and then build the backend based on that configuration. Specifically, I think the "FastAPI way" of doing this would be as a dependency, and that dependency would be available as an argument on the web handler functions. It's definitely different, but that's sort of the explanation for why FastAPI recommends the app be a global singleton, as you mention in your server.py comment.
  • If you add more endpoints, you may want to split them out into another module. Something I also like to do is create a sub-app to encompass my versioned endpoints (so a sub-app for v1). This is something I'm still working out, and I don't know if that means all my dependencies and middleware need to be replicated between each sub-app.
  • I don't know whether you prefer unittest or not, but we write all our tests directly as pytest functions and make use of fixtures for things like getting a test client or setting up a test dataset.
  • Once you start running in Kubernetes, you'll want to add a health check endpoint that indicates whether the app is ready to take requests, and whether the application continues to be healthy (they can either be the same endpoint, or two different endpoints). The simplest way to start doing that is implementing a GET / endpoint that just returns 200. A more sophisticated set of health check endpoints would be one that also checks if the app can talk to the backend.
  • For logging, we use structlog in combination with our own FastAPI middleware in Safir to inject context about the request. This may be more relevant if the service layer becomes more complex and you need to piece together log statements from multiple places and correlate them by request.
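The configuration-object idea in the list above can be sketched without pydantic: a stdlib approximation of a settings object built from environment variables. All names here (Config, ALERTDB_BACKEND, ALERTDB_BUCKET) are illustrative assumptions, not the app's actual settings; pydantic.BaseSettings would replace the from_env boilerplate with automatic field population and validation.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Illustrative config object; field names are assumptions, not the real app's."""

    backend_type: str
    bucket_name: str

    @classmethod
    def from_env(cls) -> "Config":
        # pydantic.BaseSettings does this automatically; here it is by hand.
        return cls(
            backend_type=os.environ.get("ALERTDB_BACKEND", "local-files"),
            bucket_name=os.environ.get("ALERTDB_BUCKET", ""),
        )
```

The startup module would then call Config.from_env() once and pick the storage backend from the resulting object.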

>>> import io
>>> alert_payload = backend.get_alert("alert-id")
>>> alert_payload = gzip.decompress(alert_payload)
>>> alert_


Typo?

@spenczar
Collaborator Author

This is fantastic. I really appreciate how this architecture cleanly abstracts the storage backend so that storage can be implemented multiple ways without needing to change the web service or orchestration layers.

I noticed that the server is built with FastAPI but everything about the app is synchronous, from the request handlers to the backend interface. Would you consider building around asyncio?

Yeah, this is worth looking into. There's no official support in Google's client (googleapis/python-storage#478). There's a third-party library that provides it though (https://github.com/talkiq/gcloud-aio).

I don't know very much about how ASGI works, so I don't know how much this would matter. I would assume that the server will handle multiple concurrent requests in multiple processes; if so, then I don't think that an async backend would make much of a difference: storage is the only thing the request does, so the client has to wait for the storage backend work to be complete either way.
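One detail worth noting in this trade-off: with async handlers, a single process can overlap many in-flight storage waits on one event loop, rather than needing a process (or thread) per request. A stdlib sketch of that effect, with a sleep standing in for backend I/O latency (the function names are invented for illustration):

```python
import asyncio
import time


async def fake_storage_fetch(alert_id: str) -> bytes:
    # Stand-in for an async backend call; the sleep models I/O latency.
    await asyncio.sleep(0.1)
    return alert_id.encode()


async def serve_ten_requests() -> float:
    start = time.monotonic()
    # Ten overlapping "requests" finish in roughly 0.1s, not 1.0s,
    # because the event loop interleaves them in a single process.
    await asyncio.gather(*(fake_storage_fetch(f"a{i}") for i in range(10)))
    return time.monotonic() - start
```

Whether that matters in practice depends on how many worker processes the deployment runs and how long the storage calls actually block.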


I'd recommend getting some GitHub Actions-based CI up before going much further.

Definitely! Made https://jira.lsstcorp.org/browse/DM-31213.


this alert server and alert database now serve to abstract the Kafka infrastructure away, so perhaps it makes sense to go all in and not force the Avro encoding on consumers?

Right, I'm happy with the server being as simple as possible, and putting that extra logic in the client. I'd rather push the compute load onto clients, rather than have all the clients share the server's compute load.
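For a client doing that decoding, the Confluent wire format framing itself is small: a magic byte of 0, a 4-byte big-endian schema ID, then the Avro payload. A client-side sketch of splitting the frame (the Avro decode itself, which needs the schema fetched by ID, is omitted):

```python
import struct


def split_confluent_frame(data: bytes) -> tuple[int, bytes]:
    """Split a Confluent wire format message into (schema_id, avro_payload)."""
    # ">bI" = big-endian signed byte (magic) + unsigned 32-bit int (schema ID).
    magic, schema_id = struct.unpack(">bI", data[:5])
    if magic != 0:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, data[5:]
```

The client would then request /v1/schemas/:schema_id and hand the payload plus schema to an Avro library, keeping all of that compute off the server.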


Rather than maintaining your app's dependencies in your setup.cfg, consider a system that lets you pin dependencies (requirements.txt) for repeatable container builds, while still allowing you to think only about your abstract dependencies.

Yeah, I really like this. pip-tools looks awesome. https://jira.lsstcorp.org/browse/DM-31214


At some point, you may want a configuration object that gathers configurations from environment variables (we like pydantic.BaseSettings).

This seems like it might be worth doing... but I'll know more once I sit down and write some code to actually deploy this thing somewhere. It's not obvious to me yet whether env vars or command-line params are going to be more convenient.


Specifically, I think the "FastAPI way" of doing this would be as a dependency and that dependency would be available as an argument on the web hander functions. Its definitely different, but that's sort of the explanation for why FastAPI recommends the app be a global singleton as you mention in your server.py comment.

Yeah, I had a lot of trouble making sense of how FastAPI dependencies work. If I had a dependency on a backend, would it get instantiated on every request? Or would it be persisted across requests? I want it to persist, since the Google APIs use gRPC and expect persistence.

Anyway, I decided that it's at least easy to know how this server.py mechanism works. If I understood FastAPI better, I'd probably do something different.
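For what it's worth, the usual answer to the instantiated-per-request question is to cache the dependency function, so FastAPI calls it on every request but always gets back the same long-lived object. A stdlib sketch of the pattern (the backend class here is a stand-in, and the `Depends(get_backend)` wiring on the handler is omitted):

```python
from functools import lru_cache


class GoogleCloudBackend:
    """Stand-in for the real backend; imagine an expensive gRPC client inside."""

    def __init__(self):
        self.connected = True


@lru_cache
def get_backend() -> GoogleCloudBackend:
    # FastAPI would call this dependency on every request, but lru_cache
    # returns the same instance each time, so the client persists.
    return GoogleCloudBackend()
```

This gives the persistence the gRPC client wants while keeping handlers testable, since tests can clear the cache or override the dependency.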


If you add more endpoints, you may want to split them out into another module. Something I like to also do is create a sub-app to encompass my versioned endpoints (so a sub-app for v1).

Sounds good. For now, I think there won't be many more endpoints.


I don't know whether you prefer unittest or not, but we write all our tests directly as pytest functions and make use of fixtures for things like getting a test client or setting up a test dataset.

Eh, yeah, I think unittest is easier to reason about. pytest is so magical that I find it hard to follow sometimes. I like how unittest is just Python. pytest rewrites assertions at the AST level in its own compilation step, so it's really a different programming language, which has confused me too many times.


Once you start running in Kubernetes, you'll want to add a health check endpoint

Good point! https://jira.lsstcorp.org/projects/DM/issues/DM-31215


For logging, we use structlog in combination with our own FastAPI middleware in Safir to inject context about the request. This may be more relevant if the service layer becomes more complex and you need to piece together log statements from multiple places and correlated them by request.

Oh, this looks terrific. I'd like to use this, yes: https://jira.lsstcorp.org/projects/DM/issues/DM-31216

@spenczar spenczar merged commit e780649 into main Jul 27, 2021
@spenczar spenczar deleted the tickets/DM-30446 branch July 27, 2021 17:19