davidzajac1/davidzajac1

RE Data - Open Source Maintainer & Top Contributor

Stars PyPI - Downloads PyPI - Version MIT License Language Language

RE Data is an open source data reliability framework for the modern data stack. It consists of a DBT package, a Python library and a React UI. Adding the RE Data DBT package to a DBT project runs out-of-the-box data observability SQL queries in the background whenever dbt run is called. These queries calculate and store metrics such as standard deviation, mean and row count. The RE Data Python library can be called from the CLI to read the stored metrics and to build and serve the RE Data UI. RE Data is hosted across two GitHub repos. I am a top contributor to both, and I review and merge PRs and create releases.


packages:
  - package: re-data/re_data
    version: 0.11.0
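
As a rough illustration of the kind of metrics named above (row count, mean, standard deviation), the sketch below computes them for one column with a single SQL query. It is not RE Data's own query or schema: it uses an in-memory SQLite database, and the `orders` table, the `amount` column and the `table_metrics` helper are all hypothetical names chosen for the example.

```python
import math
import sqlite3


def table_metrics(conn, table, column):
    """Compute row count, mean and population standard deviation for a column.

    Hypothetical helper for illustration only. Table and column names are
    interpolated into the SQL text, so pass trusted identifiers only.
    """
    row_count, mean, mean_sq = conn.execute(
        f"SELECT COUNT({column}), AVG({column}), AVG({column} * {column}) "
        f"FROM {table}"
    ).fetchone()
    if row_count == 0:
        # AVG returns NULL on an empty column, so there is nothing to compute.
        return {"row_count": 0, "mean": None, "stddev": None}
    # Population variance: E[x^2] - E[x]^2, clamped to guard against
    # tiny negative values caused by floating point rounding.
    variance = max(mean_sq - mean * mean, 0.0)
    return {"row_count": row_count, "mean": mean, "stddev": math.sqrt(variance)}


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [(2.0,), (4.0,), (4.0,), (4.0,), (5.0,), (5.0,), (7.0,), (9.0,)],
)
metrics = table_metrics(conn, "orders", "amount")
print(metrics)
```

Storing one such record per table on every run is what would make it possible to compare metrics over time.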


Zillacode.com - The ultimate resource to become a modern Data Engineer


I created Zillacode.com, a B2C SaaS platform that helps people study for coding interviews. It is the only online platform that runs live PySpark, Spark, DBT, Snowflake and Pandas code in the browser. Zillacode is proprietary, but feel free to reach out with requests to view the underlying code base.

On the backend, Zillacode is hosted primarily on the AWS Serverless stack, with automated UI testing and deployments across environments, automated recurring billing and single sign-on from various providers. What makes Zillacode difficult to reproduce, and what sets it apart from other coding interview prep platforms, is the speed at which it runs Spark/PySpark and DBT code: both are repackaged so that they run quickly in a browser.



IAMScan - CLI tool that checks code for AWS IAM privileges

Language License Code style: black

IAMScan is an open source command line tool that reads your code and generates an AWS IAM policy with the permissions it needs. Keeping track of AWS IAM permissions is annoying and time-consuming. How often have you seen an update deployed to the cloud, followed by "The provided execution role does not have permissions to call CreateSomething on SomeService"? IAMScan addresses this by generating a least-privilege AWS IAM policy for all Python files, JavaScript files and shell scripts with a single command.

IAMScan is hosted on PyPI and can be installed using pip:


$ pip install iamscan
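
To give a feel for the general idea of deriving a policy from source code, here is a minimal sketch that is not IAMScan's implementation or API. It assumes Python code that assigns `boto3.client("<service>")` to a variable and then calls methods on it, and it guesses the IAM action by converting the method name to PascalCase. That guess is only approximate (for example, `list_objects_v2` does not map to an action named `ListObjectsV2`), and the helper names `guess_iam_actions` and `build_policy` are hypothetical.

```python
import ast
import json


def snake_to_pascal(name):
    """Convert a boto3-style method name, e.g. create_bucket -> CreateBucket."""
    return "".join(part.capitalize() for part in name.split("_"))


def guess_iam_actions(source):
    """Return a sorted list of guessed IAM actions used by the given source."""
    tree = ast.parse(source)

    # Pass 1: find assignments of the form  name = boto3.client("service").
    clients = {}
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Assign) and isinstance(node.value, ast.Call)):
            continue
        call = node.value
        func = call.func
        is_boto3_client = (
            isinstance(func, ast.Attribute)
            and func.attr == "client"
            and isinstance(func.value, ast.Name)
            and func.value.id == "boto3"
        )
        if not (is_boto3_client and call.args):
            continue
        first_arg = call.args[0]
        if isinstance(first_arg, ast.Constant) and isinstance(first_arg.value, str):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    clients[target.id] = first_arg.value

    # Pass 2: every method call on a known client variable becomes an action.
    actions = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            owner = node.func.value
            if isinstance(owner, ast.Name) and owner.id in clients:
                actions.add(f"{clients[owner.id]}:{snake_to_pascal(node.func.attr)}")
    return sorted(actions)


def build_policy(actions):
    """Wrap actions in an IAM policy document. Resource "*" is a placeholder;
    a real least-privilege policy would also narrow the resources."""
    return {
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": actions, "Resource": "*"}],
    }


# Hypothetical input; it is only parsed, never executed, so boto3 is not required.
SAMPLE = '''
import boto3
s3 = boto3.client("s3")
s3.create_bucket(Bucket="demo")
sqs = boto3.client("sqs")
sqs.send_message(QueueUrl="https://example.invalid/q", MessageBody="hi")
'''

actions = guess_iam_actions(SAMPLE)
policy = build_policy(actions)
print(json.dumps(policy, indent=2))
```

Parsing the syntax tree rather than running the code means the scan needs neither AWS credentials nor the scanned project's dependencies.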


Reptoro - A failed B2B SaaS Platform

Language Flask pandas SQLAlchemy dash

Reptoro was originally intended to be a paid analytics platform for exotic animal breeders; it was released as open source here due to lack of customer interest. It was a Dash/Flask app hosted on EC2 with graphs visualizing scraped industry data. On the backend, it had an ETL pipeline built on AWS Lambda, Athena, S3 and Apache Airflow that routinely scraped online reptile marketplaces.



ZOil - Generate random Oil and Gas Data

Language License Version

ZOil is a Python library used to generate random Oil and Gas data. Most Oil and Gas data is either proprietary or costly to acquire. ZOil lets you quickly generate an unlimited amount of production data that can be used for testing, anonymization and much more. ZOil was inspired by the Faker library.

ZOil is hosted on PyPI and can be installed using pip:


$ pip install zoil
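
For readers unfamiliar with synthetic production data, the sketch below shows one simple way such data could be generated using only the standard library. It does not use ZOil or reflect its API. The `random_production` function, the exponential decline model, the field names and the numeric ranges are all assumptions made for illustration.

```python
import math
import random


def random_production(months, seed=None):
    """Generate hypothetical monthly oil rates for one well.

    Assumes an exponential decline with a little multiplicative noise.
    All ranges below are made up for the example.
    """
    rng = random.Random(seed)  # a seed makes the output reproducible
    initial_rate = rng.uniform(200.0, 1500.0)  # barrels per day in month 0
    decline = rng.uniform(0.02, 0.10)  # nominal decline per month
    records = []
    for month in range(months):
        base = initial_rate * math.exp(-decline * month)
        noise = rng.uniform(0.95, 1.05)  # +/- 5% measurement-style jitter
        records.append({"month": month, "oil_bbl_per_day": round(base * noise, 2)})
    return records


well = random_production(12, seed=42)
for record in well[:3]:
    print(record)
```

Passing the same seed twice yields the same series, which is convenient when generated data is used in tests.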
