NoDB

NoDB isn't a database... but it sort of looks like one!

NoDB is an incredibly simple, Pythonic object store based on Amazon's S3 static file storage.

It's useful for prototyping, casual hacking, and (maybe) even low-traffic server-less backends for Zappa apps!

Features

  • Schema-less!
  • Server-less!
  • Uses S3 as a datastore.
  • Loads to native Python objects with cPickle
  • Can use JSON as a serialization format for untrusted data
  • Local filestore based caching
  • Cheap(ish)!
  • Fast(ish)! (Especially from Lambda)

Performance

An initial load test with Goad of 10,000 requests (500 concurrent), each a write followed by a read of the same index, showed an average time of 400 ms. This should be acceptable for many applications, including those whose data is not sparse, although sparse data is preferred.

Installation

NoDB can be installed easily via pip, like so:

$ pip install nodb

Warning!

NoDB is insecure by default! Do not use it for untrusted data before setting serializer to "json"!
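
To see why this matters: the default serializer is pickle (see Features), and unpickling can call any callable chosen by whoever produced the bytes. Here's a minimal sketch using only the standard library, not NoDB itself. The `Sneaky` class is hypothetical and its payload is deliberately harmless.

```python
import json
import pickle


class Sneaky(object):
    """Hypothetical attacker-controlled object; not part of NoDB."""

    def __reduce__(self):
        # pickle stores "call eval with this argument" instead of the object.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Sneaky())

# Unpickling runs eval() and hands back its result, not a Sneaky instance.
result = pickle.loads(payload)
print(result)

# JSON only ever produces plain data: dicts, lists, strings, numbers, booleans, None.
user = json.loads('{"name": "Jeff", "age": 19}')
print(user["age"])
```

A real attacker would swap the harmless expression for something destructive, which is why pickled data from untrusted sources should never be loaded.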

Usage

NoDB is super easy to use!

You simply make a NoDB object, point it to your bucket and tell it what field you want to index on.

from nodb import NoDB

nodb = NoDB("my-s3-bucket")
nodb.index = "name"

After that, you can save and load literally anything you want, whenever you want!

# Save an object!
user = {"name": "Jeff", "age": 19}
nodb.save(user) # True

# Load our object!
user = nodb.load("Jeff")
print(user['age']) # 19

# Delete our object
nodb.delete("Jeff") # True

By default, you can save and load any Python object.

Here's the same example, but with a class. Note that the import and configuration are the same!

class User(object):
    name = None
    age = None
    
    def print_name(self):
        print("Hi, I'm " + self.name + "!")
    
    def __repr__(self):
        """ show a human readable representation of this class """
        return "<%s: %s (%s)>" % (self.__class__.__name__, self.name, self.age)

new_user = User()
new_user.name = "Jeff"
new_user.age = 19
nodb.save(new_user) 
# True

jeff = nodb.load("Jeff")
jeff.print_name() 
# Hi, I'm Jeff!

You can return a list of all objects using the .all() method.

Here's an example following from the code above, adding some extra users to the database and then listing all.

newer_user = User()
newer_user.name = "Ben"
newer_user.age = 38
nodb.save(newer_user)
# True

newest_user = User()
newest_user.name = "Thea"
newest_user.age = 33
nodb.save(newest_user)
# True

nodb.all()
# [<User: Jeff (19)>, <User: Ben (38)>, <User: Thea (33)>]

Advanced Usage

Different Serializers

To use a safer, non-Pickle serializer, just set JSON as your serializer:

nodb = NoDB()
nodb.serializer = "json"

Note that for this to work, your object must be JSON-serializable.
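
Plain dicts, lists, strings, and numbers qualify; arbitrary class instances like the `User` example do not. Here's a quick way to check, sketched with the standard library only. The `is_json_serializable` helper is hypothetical and not part of NoDB.

```python
import json


def is_json_serializable(obj):
    """Return True if json.dumps accepts obj (hypothetical helper, not NoDB API)."""
    try:
        json.dumps(obj)
        return True
    except (TypeError, ValueError):
        return False


class User(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age


jeff = User("Jeff", 19)

print(is_json_serializable({"name": "Jeff", "age": 19}))  # True
print(is_json_serializable(jeff))                         # False

# One option is to store the instance's attributes as a plain dict instead.
print(is_json_serializable(vars(jeff)))                   # True
```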

Object Metadata

You can get metainfo (datetime and UUID) for a given object by passing metainfo=True to load, like so:

# Load our object and metainfo!
user, datetime, uuid = nodb.load("Jeff", metainfo=True)

You can also pass in a default argument for non-existent values.

user = nodb.load("Jeff", default={}) # {}

Human Readable Indexes

By default, the indexes are hashed. If you want to be able to debug through the AWS console, set human_readable_indexes to True:

nodb.human_readable_indexes = True
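
As a rough picture of what hashing an index value means, the sketch below derives a fixed-length object key from the value with SHA-256. The hash function, the `index_to_key` helper, and its parameters are assumptions for illustration; NoDB's actual key scheme may differ.

```python
import hashlib


def index_to_key(index_value, human_readable=False):
    """Map an index value to a storage key (hypothetical helper, not NoDB's API)."""
    text = str(index_value)
    if human_readable:
        # Easy to spot in the AWS console, but the raw value is exposed.
        return text
    # Assumption: SHA-256 hex digest; the real library may use another scheme.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


print(index_to_key("Jeff"))                       # 64 hex characters
print(index_to_key("Jeff", human_readable=True))  # Jeff
```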

Caching

You can enable local file caching, which keeps previously retrieved values in a local filestore so that later loads can be served locally rather than from the remote store.

nodb.cache = True
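
The general idea is a read-through cache: look in a local directory first and go to the remote store only on a miss. The sketch below illustrates that pattern with a dict standing in for S3. The `ReadThroughCache` class and its names are hypothetical, and NoDB's own implementation may work differently.

```python
import os
import tempfile


class ReadThroughCache(object):
    """Hypothetical read-through file cache; a dict stands in for S3."""

    def __init__(self, remote, cache_dir):
        self.remote = remote        # stand-in for the remote bucket
        self.cache_dir = cache_dir  # local directory holding cached bytes
        self.remote_reads = 0       # counts trips to the "remote" store

    def load(self, key):
        path = os.path.join(self.cache_dir, key)
        if os.path.exists(path):
            # Cache hit: serve the bytes from local disk.
            with open(path, "rb") as f:
                return f.read()
        # Cache miss: fetch remotely, then keep a local copy.
        self.remote_reads += 1
        data = self.remote[key]
        with open(path, "wb") as f:
            f.write(data)
        return data


remote = {"jeff": b'{"name": "Jeff", "age": 19}'}
cache = ReadThroughCache(remote, tempfile.mkdtemp())

first = cache.load("jeff")   # miss: read from the remote stand-in
second = cache.load("jeff")  # hit: read from the local file
print(first == second, cache.remote_reads)
```

A real cache also needs invalidation when an object is saved or deleted, which this sketch leaves out.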

AWS settings override

You can override your AWS profile or boto3 session by passing either one as a keyword argument when you create the NoDB object.

import boto3

nodb = NoDB(profile_name='my_aws_development_profile')
# or supply the session
# (ACCESS_KEY, SECRET_KEY and SESSION_TOKEN are placeholders for your own credentials)
session = boto3.Session(
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
    aws_session_token=SESSION_TOKEN,
)
nodb = NoDB(session=session)

TODO (Maybe?)

  • Tests with Placebo
  • Local file storage
  • Querying ranges (numeric IDs only), etc.
  • Different serializers
  • Custom serializers
  • Multiple indexes
  • Compression
  • Bucket management
  • Pseudo-locking
  • Performance/load testing

Related Projects

  • Zappa - Python's server-less framework!
  • K.E.V. - a Python ORM for key-value stores based on Redis, S3, and an S3/Redis hybrid backend.
  • s3sqlite - An S3-backed database engine for Django

Contributing

This project is still young, so there is still plenty to be done. Contributions are more than welcome!

Please file tickets for discussion before submitting patches. Pull requests should target master and should leave NoDB in a "shippable" state if merged.

If you are adding a non-trivial amount of new code, please include a functioning test in your PR. For AWS calls, we use the placebo library; its README explains how to use it. The test suite will be run by Travis CI once you open a pull request.

Please include, as a comment in the code, the URL of the GitHub issue or pull request where your changes were discussed. This greatly helps project maintainability, as it lets us trace back use cases and explain decisions.

License

(C) Rich Jones 2017, MIT License.


Made by Gun.io