ZooPhy/zoophy-genbankfactory


GenBankFactory

Data extraction and normalization pipeline used for bi-monthly GenBank data dumps.

Javadocs should be kept up to date here.

Dependencies:

Setup:

  1. Import the project into an IDE as "Existing Maven Project"

  2. Create a GenBankFactory.local.properties file in the src folder with your SQL and Lucene details. Use GenBankFactory.local.properties.template as a reference.

  3. Run the build.sh script

  4. The build should run successfully and generate a runnable jar in the target folder.
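
Because the build depends on the properties file from step 2, a small pre-build check can catch a missing or empty file early. The following is a minimal sketch, not part of the project: the function name check_properties is hypothetical, and the default path assumes the file sits directly in the src folder and that the script runs from the project root.

```shell
#!/bin/sh
# Hypothetical pre-build check (not shipped with GenBankFactory).
# Assumption: run from the project root; the properties file lives in src/.
check_properties() {
  props="${1:-src/GenBankFactory.local.properties}"
  # -s is true only if the file exists and is not empty
  if [ ! -s "$props" ]; then
    echo "missing or empty: $props" >&2
    return 1
  fi
  echo "found: $props"
}
```

One possible use is to chain it in front of the build, for example `check_properties && sh build.sh`, so the build only starts once the file is in place.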

Usage

  • Always double- and triple-check parameters before building and running the .jar, as it may delete databases.
  • Any changes to GenBankFactory.local.properties will only be reflected after running a new build.
  • Allocate at least 6 GB of RAM, preferably 8 GB.
  • Typical usage scenario commands:
    • Fresh data dump: nohup java -Xms4G -Xmx8G -jar target/zoophy-genbank-factory-1.x.x-jar-with-dependencies.jar dump create -f gbvrl &
    • Re-Run data dump: nohup java -Xms4G -Xmx8G -jar target/zoophy-genbank-factory-1.x.x-jar-with-dependencies.jar dump clean -f gbvrl &
    • Rebuild Index: nohup java -Xms4G -Xmx8G -jar target/zoophy-genbank-factory-1.x.x-jar-with-dependencies.jar index &
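
Since a wrong parameter may delete databases, it can help to print the exact command and review it before launching. The sketch below only assembles and prints the command strings listed above; it does not run anything. The function name gbf_command and its mode labels (create, clean, index) are hypothetical conveniences, and the jar path is passed in by the caller because the version in the file name varies.

```shell
#!/bin/sh
# Hypothetical helper (not part of GenBankFactory): prints the java command
# for one of the usage scenarios above so it can be reviewed before running.
gbf_command() {
  jar="$1"   # path to the jar-with-dependencies produced by the build
  mode="$2"  # create | clean | index (labels chosen for this sketch)
  case "$mode" in
    create) args="dump create -f gbvrl" ;;  # fresh data dump
    clean)  args="dump clean -f gbvrl" ;;   # re-run data dump
    index)  args="index" ;;                 # rebuild index
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  echo "java -Xms4G -Xmx8G -jar $jar $args"
}
```

After checking the printed line, the same command can be started in the background with nohup and a trailing &, as in the scenarios above.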
