Cookiecutter for Science Projects

A cookiecutter template for science and data science projects that include data, code, and dissemination.

Optimized for data-based publications
Optimized for use with VS Code
Docker-based, version-controlled environment using VS Code Dev Containers
Conda-based environment inside the Dev Container: just add packages to environment.yml and rebuild. Same environment for the whole team
Use of Dev Container Features with pre-installed Python, oh-my-zsh and LaTeX
Optimized for use with Python, but could also be used with Julia and R
Make commands for collecting data, generating figures, typesetting LaTeX, cleaning temp files and cleaning demo files
Use of VS Code tasks to trigger data collection, plotting and paper compilation
LaTeX-based paper
Path definitions in the project_package Python module
Kedro-inspired data folder structure
Filled with a demo, which can be removed with "make delete_demo"
Used in at least 5 papers

For more detailed information, please see the README of the resulting project.

Quick Start

cookiecutter https://github.com/tgoelles/cookiecutter_science

File Structure

├── .devcontainer                      # Definition of the Docker container and environment for VS Code
│   ├── Dockerfile                     # Defines the Docker container
│   ├── devcontainer.json              # Defines the devcontainer settings for VS Code
│   └── noop.txt                       # Placeholder file to ensure the COPY instruction does not fail if no environment.yml exists
├── .gitattributes                     # Git attributes for handling line endings and merge strategies
├── .gitignore                         # Git ignore file to exclude files and directories from version control
├── Makefile                           # Makefile with commands like `make data` and `make clean`
├── README.md                          # Project readme
├── code                               # Source code and notebooks
│   ├── notebooks                      # Jupyter notebooks
│   │   └── exploratory                # Data explorations
│   │       └── 1.0-tg-example.ipynb   # Jupyter notebook following the naming convention; tg are the author's initials
│   ├── project_package                # Project-specific Python package
│   │   ├── __init__.py                # Makes project_package a Python module
│   │   ├── data                       # Scripts to download, generate and parse data
│   │   │   ├── __init__.py
│   │   │   ├── config.py              # Project-wide path definitions
│   │   │   ├── example.py             # Example script
│   │   │   ├── import_data.py         # Functions to read raw data
│   │   │   └── make_dataset.py        # Scripts to download or generate data (used in the Makefile)
│   │   ├── tools                      # Scripts and functions for general use
│   │   │   ├── __init__.py
│   │   │   └── convert_latex.py       # Functions to convert elements for use in LaTeX
│   │   └── visualization              # Scripts and functions to create visualizations
│   │       ├── __init__.py
│   │       ├── make_plots.py          # Scripts to make all plots for the publication
│   │       └── visualize.py           # Functions to produce final plots
│   └── pyproject.toml                 # Configuration file for the project
├── data                               # Data directories
│   ├── 01_raw                         # The original, immutable data dump
│   │   └── demo.csv                   # Example raw data file
│   ├── 02_intermediate                # Intermediate processed data
│   ├── 03_primary                     # Cleaned data, used for the publication
│   ├── 04_feature                     # For machine learning: features based on the primary data
│   ├── 05_model_input                 # The final data used for machine learning
│   ├── 06_models                      # Stored, serialized pre-trained machine learning models
│   ├── 07_model_output                # Output from trained machine learning models
│   └── 08_reporting                   # Reporting data like log files
├── dissemination                      # Materials for dissemination
│   ├── figures                        # Figures for paper generated with Python
│   │   └── demo.png                   # Example figure file
│   ├── presentations                  # All related PowerPoint files, especially for deliverables
│   └── papers                         # LaTeX-based papers
│       └── paper.tex                  # Example LaTeX paper
├── environment.yml                    # Conda environment configuration file
└── literature                         # References and explanatory materials
    └── references.bib                 # Bibliography file for LaTeX documents
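
The project-wide path definitions mentioned for config.py could look roughly like the sketch below. This is an illustration only, not the template's actual code: the names `PROJECT_ROOT`, `DATA_STAGES`, `data_paths`, `PATHS` and `FIGURES` are hypothetical, and only the folder names are taken from the tree above.

```python
from pathlib import Path

# Hypothetical project root; the real config.py may locate it differently.
PROJECT_ROOT = Path("my_project")

# Folder names copied from the file structure above.
DATA_STAGES = [
    "01_raw",
    "02_intermediate",
    "03_primary",
    "04_feature",
    "05_model_input",
    "06_models",
    "07_model_output",
    "08_reporting",
]


def data_paths(root: Path) -> dict[str, Path]:
    """Map each stage name (without its numeric prefix) to its folder."""
    return {stage.split("_", 1)[1]: root / "data" / stage for stage in DATA_STAGES}


PATHS = data_paths(PROJECT_ROOT)
FIGURES = PROJECT_ROOT / "dissemination" / "figures"

# Scripts and notebooks can then refer to data by stage instead of hard-coded strings.
demo_file = PATHS["raw"] / "demo.csv"
print(demo_file)
```

Keeping all paths in one module means a renamed folder only has to be changed in a single place.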

Tasks

VS Code tasks can be used to trigger data collection, plotting and paper compilation.

Requirements

Git: usually included with your OS; otherwise install it
GitHub account
GitHub CLI
Docker Desktop
VS Code
VS Code extension: Remote Development
Cookiecutter Python package, installed with:

pip install cookiecutter

For Mac users:

brew install cookiecutter
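
A quick way to see which of the command-line requirements are already on your PATH is sketched below. The executable names are assumptions (`gh` for the GitHub CLI, `code` for the VS Code launcher), and `missing_tools` is a hypothetical helper, not part of the template. It cannot check the GitHub account or the VS Code extension.

```python
import shutil
from typing import Callable, Iterable, Optional

# Assumed executable names for the requirements listed above.
REQUIRED_TOOLS = ("git", "gh", "docker", "code", "cookiecutter")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All command-line requirements were found.")
```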

Getting Started

Navigate to the folder where you want to create the project (on your local drive):
```
cookiecutter https://github.com/tgoelles/cookiecutter_science
```
Answer the questions prompted by cookiecutter.
A new VS Code window will open automatically.
Click "OK" to reopen the folder in a container (only asked the first time).
Read the README.md in the generated project folder.
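
If you prefer to script project creation instead of answering the prompts, the command can be assembled from Python as in the sketch below. It relies on the cookiecutter CLI options `--no-input` and `--output-dir` and on passing `key=value` pairs as extra context. `build_command` and `generate` are hypothetical helpers, and `repo_name` is assumed to be one of the template variables. Any variable you do not pass keeps the template's default, and the template's hooks still run.

```python
import subprocess

TEMPLATE_URL = "https://github.com/tgoelles/cookiecutter_science"


def build_command(output_dir: str = ".", **context: str) -> list[str]:
    """Build the cookiecutter call; extra context switches off the prompts."""
    cmd = ["cookiecutter", "--output-dir", output_dir]
    if context:
        # Answers are supplied on the command line, so skip the interactive questions.
        cmd.append("--no-input")
    cmd.append(TEMPLATE_URL)
    cmd.extend(f"{key}={value}" for key, value in context.items())
    return cmd


def generate(output_dir: str = ".", **context: str) -> None:
    """Run cookiecutter; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_command(output_dir, **context), check=True)


# Example (not executed here): generate("projects", repo_name="my_paper")
```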

Git and GitHub

Cookiecutter can generate a GitHub repository for you. This initializes the git repo and pushes it to GitHub. You can then invite your team members to join the project.

Each team member works on their local version of the project, regularly committing and pushing changes.
Avoid working on the same folder over a network.

Note for Windows Users

If you want to use git inside the container (recommended), clone the repo from WSL, because Windows may corrupt the .git folder. Git inside the container uses the same .gitconfig as Windows, which is copied into the container.

Ensure user.email and user.name are set (in PowerShell):

git config --global user.name "your_name"
git config --global user.email "your_email@gmail.com"
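
To verify that both settings are in place before opening the container, a small check like the following can help. It is a sketch using `git config --global --get`; `git_global` and `missing_identity` are hypothetical names, not part of the template.

```python
import subprocess
from typing import Callable, Optional


def git_global(key: str) -> Optional[str]:
    """Return the global git setting for key, or None if unset or git is missing."""
    try:
        result = subprocess.run(
            ["git", "config", "--global", "--get", key],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # git is not installed or not on the PATH.
        return None
    return result.stdout.strip() or None


def missing_identity(
    lookup: Callable[[str], Optional[str]] = git_global,
) -> list[str]:
    """List which of user.name and user.email are not configured."""
    return [key for key in ("user.name", "user.email") if not lookup(key)]


if __name__ == "__main__":
    print("Missing git settings:", missing_identity() or "none")
```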
