This repository has been archived by the owner on May 17, 2023. It is now read-only.
dmyersturnbull/atom-tin


atom-tin

Tiny Scala API to stream, parse, and cache atom coordinates from Protein Data Bank files. Only ATOM and HETATM records are parsed. Uses ScalaCache as a facade to support virtually any caching backend, such as Ehcache, Redis, Caffeine, or a custom backend.
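To make the scope of "only ATOM and HETATM records" concrete, here is a minimal sketch of reading one such line from the fixed-width PDB coordinate format. It is not the library's PdbParser: the AtomRecord case class and the parseLine function are hypothetical names introduced for illustration, and only a subset of the columns is read.

```scala
import scala.util.Try

object PdbLineSketch {
  // Hypothetical model for illustration; the library's own PdbAtom may differ.
  final case class AtomRecord(
    isHetero: Boolean, serial: Int, atomName: String, residueName: String,
    chainId: Char, residueNumber: Int, x: Double, y: Double, z: Double)

  // Parses one fixed-width line; returns None for other record types or malformed lines.
  def parseLine(line: String): Option[AtomRecord] = {
    val record = line.take(6).trim
    if ((record != "ATOM" && record != "HETATM") || line.length < 54) None
    else Try {
      // PDB columns are 1-based and inclusive; substring is 0-based and end-exclusive.
      def col(from: Int, to: Int): String = line.substring(from - 1, to).trim
      AtomRecord(
        isHetero = record == "HETATM",
        serial = col(7, 11).toInt,
        atomName = col(13, 16),
        residueName = col(18, 20),
        chainId = line.charAt(21), // column 22
        residueNumber = col(23, 26).toInt,
        x = col(31, 38).toDouble,
        y = col(39, 46).toDouble,
        z = col(47, 54).toDouble)
    }.toOption
  }

  // Sample line assembled field by field so the column widths stay visible.
  val sampleLine: String =
    "ATOM  " + "    1" + " " + " N  " + " " + "PRO" + " " + "A" + "   1" + "    " +
      "  29.361" + "  39.686" + "   5.862"

  def main(args: Array[String]): Unit = {
    println(parseLine(sampleLine))
    println(parseLine("REMARK   2 RESOLUTION.    2.00 ANGSTROMS."))
  }
}
```

Returning an Option keeps malformed or irrelevant lines from aborting a streamed parse; a real parser would probably also read occupancy, temperature factor, and element.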

Provides a model (PdbAtom), a parser (PdbParser), and a cache (AtomTin). There are three SBT subprojects:

  • core, which includes only the parser, model, and cache
  • caffeinated, which uses Caffeine for in-memory caching (via CaffeinatedAtomTin)
  • pickled, which uses Pickling for on-disk serialization (via PickledAtomTin)

Examples

A simple example using pickled with synchronous lookups:

val tin = new PickledAtomTin() // directory for serialization defaults to ~/atom-tin-cache
val atoms: TraversableOnce[PdbAtom] = tin.loadAndWait("1hiv") // max wait defaults to infinite

Or the same with caffeinated:

val tin = new CaffeinatedAtomTin(_.maximumSize(100)) // alter Caffeine defaults for maximumSize
val atoms: TraversableOnce[PdbAtom] = tin.loadAndWait("1hiv")

For asynchronous lookups, call load, which returns a Future:

val atoms: Future[TraversableOnce[PdbAtom]] = tin.load("1hiv")
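A blocking lookup with a finite wait can be built from an asynchronous one with the standard library's Await. The sketch below assumes nothing about atom-tin's internals: load is a hypothetical stand-in that returns fake atom labels, and loadWithTimeout is an illustrative name, not a library method.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BlockingLookupSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-in for tin.load: produces fake atom labels asynchronously.
  def load(pdbId: String): Future[Seq[String]] =
    Future(Seq(s"$pdbId:N", s"$pdbId:CA"))

  // Blocks the calling thread until the future completes or maxWait elapses.
  // Await.result throws a TimeoutException if the deadline passes first.
  def loadWithTimeout(pdbId: String, maxWait: FiniteDuration): Seq[String] =
    Await.result(load(pdbId), maxWait)

  def main(args: Array[String]): Unit =
    println(loadWithTimeout("1hiv", 5.seconds))
}
```

Blocking is convenient in scripts and tests; in a server or pipeline it is usually better to keep composing the Future with map and flatMap.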

Here's a more complex example that asynchronously prints the coordinates of the arginine atoms in several PDB structures:

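import scala.concurrent.ExecutionContext.Implicits.global // AtomTin needs an implicit ExecutionContext

// Loads one structure asynchronously and prints the coordinates of its arginine atoms
def printArginines(pdbId: String): Unit = {
  tin.load(pdbId)
    .map { atoms =>
      // dot notation keeps filter and map in one expression across line breaks
      atoms.filter(a => a.residueName == Right(AminoAcid.Arginine))
        .map(_.coordinates)
        .toList // materialize so that println shows the values
    }
    .foreach(coordinates => println(coordinates)) // runs only if the lookup succeeds
}
Seq("1HIV", "5AYR", "2D26", "3JD6").foreach(printArginines)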

You can bypass the cache, delete items from the cache, or clear the cache:

AtomTin.download("1hiv") // skips the cache
cache.delete("1hiv") // removes 1hiv from the cache
cache.deleteAll() // clears the cache
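The difference between a cached load, a cache-skipping download, and the two delete operations can be illustrated with a small load-through cache. This is a sketch under stated assumptions, not atom-tin's or ScalaCache's implementation: LoadThroughCache, its method names, and the fetch function are hypothetical, and the storage is a plain standard-library TrieMap.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical load-through cache; fetch stands in for downloading and parsing.
final class LoadThroughCache[K, V](fetch: K => V)(implicit ec: ExecutionContext) {
  private val entries = TrieMap.empty[K, Future[V]]

  // Returns the stored future on a hit; on a miss, starts a fetch and stores it.
  // A failed future stays stored until it is deleted.
  def load(key: K): Future[V] = entries.getOrElseUpdate(key, Future(fetch(key)))

  // Always fetches; neither reads nor writes the cache.
  def download(key: K): Future[V] = Future(fetch(key))

  // Removes a single entry, if present.
  def delete(key: K): Unit = { entries.remove(key); () }

  // Removes every entry.
  def deleteAll(): Unit = entries.clear()
}

object LoadThroughCacheDemo {
  def main(args: Array[String]): Unit = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    val cache = new LoadThroughCache[String, Int](id => id.length)
    println(Await.result(cache.load("1hiv"), 5.seconds))     // fetches and stores
    println(Await.result(cache.load("1hiv"), 5.seconds))     // served from the cache
    println(Await.result(cache.download("1hiv"), 5.seconds)) // fetches again, stores nothing
    cache.deleteAll()
  }
}
```

Storing the Future rather than the finished value lets concurrent callers for the same key share one in-flight fetch.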

Notes

  • The parser is covered by ScalaCheck property tests. The caching currently lacks tests but seems to work.
  • AtomTin and its subclasses require an implicit ExecutionContext.
  • PickledAtomTin is currently limited to Gzipped JSON.
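
The implicit ExecutionContext mentioned above can be the global one (import scala.concurrent.ExecutionContext.Implicits.global) or a dedicated pool. The following sketch shows the second option using only the standard library; fetchLength is a hypothetical stand-in for any method that, like AtomTin's, takes an implicit ExecutionContext, and the pool size of 4 is an arbitrary choice.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ExecutionContextSketch {
  // Hypothetical stand-in for an API that requires an implicit ExecutionContext.
  def fetchLength(pdbId: String)(implicit ec: ExecutionContext): Future[Int] =
    Future(pdbId.length)

  // Runs all lookups on a dedicated fixed-size pool and shuts the pool down afterwards.
  def runWithPool(ids: Seq[String]): Seq[Int] = {
    val pool = Executors.newFixedThreadPool(4)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try Await.result(Future.sequence(ids.map(id => fetchLength(id))), 10.seconds)
    finally pool.shutdown()
  }

  def main(args: Array[String]): Unit =
    println(runWithPool(Seq("1hiv", "5ayr")))
}
```

A dedicated pool keeps slow downloads or disk reads from starving other work that shares the global context.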

License

The software is licensed under the Apache License, Version 2.0 by Douglas Myers-Turnbull.
