doitintl/cloud-catalog

Public Cloud Services

Unfortunately, none of the major cloud vendors provides a friendly API for listing all public cloud services and categories, as they appear on the AWS Products, GCP Products and Azure Services pages.

The idea is to have a unified JSON schema for all cloud services.

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "id": {
        "type": "string"
      },
      "name": {
        "type": "string"
      },
      "summary": {
        "type": "string"
      },
      "url": {
        "type": "string"
      },
      "categories": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "id": {
              "type": "string"
            },
            "name": {
              "type": "string"
            }
          },
          "required": [
            "id",
            "name"
          ]
        }
      },
      "tags": {
        "type": "array",
        "items": {
          "type": "string"
        }
      }
    },
    "required": [
      "id",
      "name",
      "summary",
      "url",
      "categories",
      "tags"
    ]
  }
}
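To make the shape of a conforming record concrete, here is a minimal standard-library sketch that checks a list of service records for the required fields and types named in the schema. It is not part of the repository: the helper names (`check_fields`, `validate_catalog`) and the sample record are hypothetical, and a full JSON Schema validator would be the more complete option.

```python
# Required keys and their expected Python types, mirroring the schema above.
SERVICE_FIELDS = {"id": str, "name": str, "summary": str, "url": str,
                  "categories": list, "tags": list}
CATEGORY_FIELDS = {"id": str, "name": str}


def check_fields(obj, fields, where):
    """Return a list of problems found in one JSON object."""
    if not isinstance(obj, dict):
        return [f"{where}: expected an object"]
    problems = []
    for key, expected in fields.items():
        if key not in obj:
            problems.append(f"{where}: missing '{key}'")
        elif not isinstance(obj[key], expected):
            problems.append(f"{where}: '{key}' should be {expected.__name__}")
    return problems


def validate_catalog(services):
    """Check every service record, its categories and its tags."""
    if not isinstance(services, list):
        return ["catalog: expected an array"]
    problems = []
    for i, service in enumerate(services):
        where = f"services[{i}]"
        problems += check_fields(service, SERVICE_FIELDS, where)
        if not isinstance(service, dict):
            continue
        categories = service.get("categories")
        if isinstance(categories, list):
            for j, category in enumerate(categories):
                problems += check_fields(
                    category, CATEGORY_FIELDS, f"{where}.categories[{j}]")
        tags = service.get("tags")
        if isinstance(tags, list):
            for j, tag in enumerate(tags):
                if not isinstance(tag, str):
                    problems.append(f"{where}.tags[{j}]: should be str")
    return problems


# Illustrative record only; the values are made up for this example.
example = [{
    "id": "example/compute",
    "name": "Example Compute",
    "summary": "Virtual machines on demand.",
    "url": "https://example.com/compute",
    "categories": [{"id": "compute", "name": "Compute"}],
    "tags": ["example", "example/compute"],
}]
```

Calling `validate_catalog(example)` returns an empty list when every record has the expected shape; otherwise each entry in the result names the offending record and field.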

Scraping AWS Cloud Services

The AWS Products page uses the undocumented https://aws.amazon.com/api/dirs/items/search endpoint to fetch paginated JSON records for the available cloud products.

# download AWS service JSON file and generate data/aws.json
pip install -r requirements.txt
python discovery/aws.py > data/aws.json
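The paging pattern can be sketched as follows. This is not the code in discovery/aws.py: because the endpoint is undocumented, the query parameter names and the `items` response key used here are assumptions, and the page fetcher is passed in as a function so that the loop does not depend on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_json(url, params):
    """GET url with the given query parameters and decode the JSON body."""
    with urlopen(f"{url}?{urlencode(params)}") as response:
        return json.load(response)


def iter_items(fetch, items_key="items", max_pages=1000):
    """Yield records from consecutive pages until a page comes back empty.

    fetch is any callable that maps a zero-based page number to a decoded
    JSON object; items_key is an assumed name for the list of records.
    max_pages is a safety cap against an endpoint that never runs dry.
    """
    for page in range(max_pages):
        records = fetch(page).get(items_key, [])
        if not records:
            return
        yield from records


# Offline usage with a stand-in fetcher; a real one would wrap fetch_json.
fake_pages = [{"items": ["a", "b"]}, {"items": ["c"]}, {"items": []}]
all_items = list(iter_items(lambda page: fake_pages[page]))
```

Injecting the fetcher keeps the loop testable without network access, and the real query parameters stay in one place once they have been confirmed against the live endpoint.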

Scraping GCP Cloud Services

The GCP Products page is rendered on the server side and all data is embedded into the web page.

# scrape the GCP Products page to get all services and generate data/gcp.json
pip install -r requirements.txt
python discovery/gcp.py > data/gcp.json
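Since the data is embedded in the HTML, extraction comes down to parsing the page markup. The sketch below shows one way to do that with the standard library's `html.parser`. The `service-card` class name and the sample markup are hypothetical, and the real page structure, as well as the approach taken in discovery/gcp.py, may differ.

```python
from html.parser import HTMLParser


class ServiceLinkParser(HTMLParser):
    """Collect name/url pairs from <a> tags carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.services = []
        self._href = None  # href of the matching <a> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.css_class in classes:
            self._href = attrs.get("href") or ""
            self._text = []

    def handle_data(self, data):
        # Only accumulate text while inside a matching link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            name = " ".join("".join(self._text).split())
            self.services.append({"name": name, "url": self._href})
            self._href = None


# Made-up markup standing in for a server-rendered products page.
SAMPLE_HTML = """
<ul>
  <li><a class="service-card" href="/products/example-compute">Example <b>Compute</b></a></li>
  <li><a class="other" href="/about">About</a></li>
  <li><a class="service-card" href="/products/example-storage">Example Storage</a></li>
</ul>
"""

parser = ServiceLinkParser("service-card")
parser.feed(SAMPLE_HTML)
```

After `feed`, `parser.services` holds one dictionary per matching link. The same approach would apply to any other server-rendered page once the right selector is known.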

Scraping Azure Cloud Services

The Azure Services page is rendered on the server side and all data is embedded into the web page.

# scrape the Azure Services page to get all services and generate data/azure.json
pip install -r requirements.txt
python discovery/azure.py > data/azure.json

Microsoft365 Services

Edit the ms365.json file. Use data from this page.

Scraping Google Workspace Services (GSuite)

The Google Workspace page lists all Google Workspace services.

# scrape the Google Workspace page to get all services and generate data/gsuite.json
pip install -r requirements.txt
python discovery/gsuite.py > data/gsuite.json

CMP Services

Edit the cmp.json file. Use the CMP UI and documentation.

Credits

Edit the credits.json file.

Update/merge all tags

Run the tags.sh script to regenerate the tags.json file, which contains all platform, category and service tags from all services.
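The merge step can be illustrated in Python. This sketch is not a port of tags.sh, whose contents are not shown here. It assumes that a flat, de-duplicated, sorted list of tag strings is an acceptable stand-in for tags.json, whose real layout may differ, and the function names are hypothetical.

```python
import json
from pathlib import Path


def collect_tags(catalogs):
    """Return the sorted, de-duplicated tags used by any service in any catalog."""
    tags = set()
    for services in catalogs:
        for service in services:
            tags.update(service.get("tags", []))
    return sorted(tags)


def merge_tag_files(paths):
    """Load each catalog file (a JSON array of services) and merge its tags."""
    return collect_tags(
        json.loads(Path(p).read_text(encoding="utf-8")) for p in paths)
```

For example, `merge_tag_files(["data/aws.json", "data/gcp.json", "data/azure.json"])` would return the combined tag list for those three catalogs, ready to be written out with `json.dump`.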

Public static location

Upload all generated JSON files to the public cloud_tags Cloud Storage bucket.
