Mike Ralphson edited this page Mar 20, 2023 · 5 revisions

APIs.guru Update Processes, Now & Future

Changelog

  • Original version, April 2021
  • Updated with GitHub Actions etc., February 2023

History

The APIs.guru OpenAPI Directory was started by Ivan Goncharov in 2015. He expanded the directory to around 550 API definitions before becoming somewhat burned out by the process of maintaining the directory with the original set of scripts, and with the OpenAPI scene in general. Ivan now works on GraphQL.js and with the GraphQL Foundation.

Mike Ralphson took over as maintainer in March 2017. At the time of writing, the directory contains 3,963 API definitions.

Adding an API

The APIs.guru API addition process goes something like this:

image1

  1. A contributor raises a ‘drive-by’ issue on GitHub using the form at https://apis.guru/add-api/, or a lead is generated by research.
  2. A script runs which validates that the URL provided links to an API definition, not documentation. If it does not link to a usable definition (OpenAPI, Swagger, Postman, RAML, Google Discovery, WADL, API Blueprint, Mashery IO Docs…), we request an updated URL and mark the issue as “Awaiting response”. If no usable URL is provided within about 14 days, I close the issue. Note that AsyncAPI has its own directory, and GraphQL / gRPC / OpenRPC are not easily converted to OpenAPI 3.
  3. The script checks that the definition at the URL provided validates with oas-kit’s oas-validator (https://github.com/Mermade/oas-kit). If necessary, the source input format is converted first.
  4. If errors are detected and are not repairable by OAS-Kit, we either patch the definition in the site metadata or go back to the contributor to see if the source definition can be fixed. This is a delicate balancing act.
  5. Once the definition validates, the issue is marked with the label "Add API".
  6. A script then runs “npm run add -- --category {cat} [--logo {url}] [--service {serv}] [--host {url}] {url-to-definition}”.
  7. The issue is updated with a thank-you comment and is closed.
  8. Then we run “git add …”, “git push” and the CI job (currently a GitHub Actions workflow) runs to deploy the site, static API JSON and RSS feed files.
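
Steps 2 and 6 of the addition process can be sketched roughly as follows. This is an illustration in Python, not the project's actual script: detect_format and build_add_command are hypothetical names, and the format checks cover only a few of the formats listed above, relying on the top-level keys those formats are known to carry.

```python
import shlex


def detect_format(doc):
    """Guess the definition format of an already-parsed JSON/YAML document.

    Returns None when the document does not look like a usable definition,
    for example when the URL pointed at HTML documentation instead.
    """
    if not isinstance(doc, dict):
        return None
    if str(doc.get("openapi", "")).startswith("3."):
        return "openapi"
    if str(doc.get("swagger", "")) == "2.0":
        return "swagger"
    if doc.get("kind") == "discovery#restDescription":
        return "google"
    info = doc.get("info")
    if isinstance(info, dict) and "schema.getpostman.com" in str(info.get("schema", "")):
        return "postman"
    return None


def build_add_command(url, category, logo=None, service=None, host=None):
    """Assemble the `npm run add` invocation from step 6 as an argv list."""
    argv = ["npm", "run", "add", "--", "--category", category]
    for flag, value in (("--logo", logo), ("--service", service), ("--host", host)):
        if value:  # optional flags are only emitted when a value is given
            argv += [flag, value]
    argv.append(url)
    return argv


# Usage with a made-up definition and URL.
doc = {"openapi": "3.0.3", "info": {"title": "Example", "version": "1.0.1"}, "paths": {}}
if detect_format(doc) == "openapi":
    print(shlex.join(build_add_command("https://example.com/openapi.yaml", "developer_tools")))
```

Returning an argv list rather than a single string avoids quoting problems if the command is later passed to a process runner.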

image3

The update process:

image5

  1. Amazon AWS definitions are automatically converted from the https://github.com/aws/aws-sdk-js repo. A daily GitHub Actions workflow cron job pulls the source repo, runs https://github.com/APIs-guru/aws2openapi and automatically commits to the OpenAPI Directory repo, triggering a CI build.
  2. This has also been set up for Google Discovery documents. The converter is located at https://github.com/APIs-guru/google-discovery-to-swagger and is implemented as a ‘driver’ (see below).
  3. Each ‘provider’ within APIs.guru has what we call a ‘driver’. This driver is responsible for fetching the source API definitions. The simplest driver is called ‘url’ and does exactly what you would expect: it fetches the content of a given URL.
  4. Other drivers are: github (to fetch a tarball of a given repository), apis.json, catalog (for processing an arbitrary JSON or YAML list), external (a marker that some external process manages these APIs - like for AWS), google (Google Discovery format metadata), zip (for handling compressed definitions), blob (for handling ReDoc blob urls) and nop (which does nothing).
  5. We need to develop an HTML scraper driver using cheerio and jQuery selectors for some API locations.
  6. We also need to implement a simple driver for SwaggerHub which fetches the latest published version for a given API URL.
  7. We also need to extend the url driver to be able to do POST requests and not just GETs.
  8. Other drivers which may be required in the future include: gitlab, bitbucket, gitea/gogs.
  9. We should also have a postman driver which can talk to the Postman API.
  10. Drivers for apicurio and apigee API registry/repositories are also possible.
  11. The driver type for each provider is stored in the main metadata file: metadata/registry.yaml.
  12. The scripts currently in use and the registry.yaml are checked into private repos at the moment. This is because there was a tentative plan to resell APIs.guru as a service for users to run their own internal API registries. There are no customers as yet.
  13. All output API definitions are saved in the repo in YAML format, for compactness and ease of diffing. The deployment process is responsible for creating copies in JSON format.
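
One way to picture the driver mechanism from points 3, 4 and 11 is a lookup table from driver name to a fetch function. The Python sketch below is hypothetical throughout, since the real drivers are not public: only the driver names url and nop and the driver key come from this page, and the optional method and body arguments merely illustrate the POST extension wished for in point 7.

```python
import urllib.request

# Maps a driver name (as stored in the registry) to its fetch function.
DRIVERS = {}


def driver(name):
    """Decorator that registers a fetch function under a driver name."""
    def register(func):
        DRIVERS[name] = func
        return func
    return register


@driver("url")
def fetch_url(source, method="GET", body=None):
    """Fetch the content of a single URL; returns a list of (url, bytes)."""
    request = urllib.request.Request(source, data=body, method=method)
    with urllib.request.urlopen(request, timeout=30) as response:
        return [(source, response.read())]


@driver("nop")
def fetch_nothing(source, **_ignored):
    """Do nothing: the provider stays in the registry but is never fetched."""
    return []


def run_driver(provider_entry, source):
    """Dispatch on the `driver` key of a provider's registry entry."""
    name = provider_entry["driver"]
    try:
        fetch = DRIVERS[name]
    except KeyError:
        raise ValueError(f"unknown driver: {name!r}") from None
    return fetch(source)


# Usage: a provider marked `nop` yields no definitions to process.
print(run_driver({"driver": "nop"}, "https://example.com/openapi.yaml"))
```

Under this shape, adding a new driver (gitlab, swaggerhub and so on) would mean registering one more function, with no change to the dispatch code.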

Example of a registry entry:

getpostman.com:
  apis:
    "": # this indicates no service parameter is applied
      1.0.1: # this is the API version, there can be multiple as long as they have different source URLs
        added: 2021-04-13T03:58:34.089Z
        endpoints: 32
        filename: APIs/getpostman.com/1.0.1/openapi.yaml
        fixes: 0 # we keep track of the number of automated patches, largely for vanity reasons
        hash: 758ed46025ba13f56f8ef0e95817616ca08ef9bbeb8153d96423dbb988e74223
        history: [] # this is the API provenance array
        name: openapi.yaml
        openapi: 3.0.3
        run: 2021-04-13T07:38:22.156Z
        source:
          format: openapi
          url: https://gist.githubusercontent.com/MikeRalphson/f5dd7e7e712a4f2caa8f1783f1053dbc/raw/b7d80bdda3497d7d4496c69d2444b170da00a5cf/postman-api.yaml # note we do not have a Postman API driver as yet!
          version: "3.0"
        status: 200 # we track 4xx and 5xx errors on retrieval
        updated: 2021-04-13T07:38:22.156Z # used by RSS feed
        valid: true # we track if the source API has gone invalid
      Patch: # information added to the source API for round-tripping
        info:
          x-apisguru-categories:
            - developer_tools
          x-logo:
            url: https://getpostman.com/web-assets/icons/icon-48x48.png?v=6fa10b9ee2b6e5dcec30e5027a14e7a4 # if the API doesn't provide one, we use the highest-res favicon, we used to use Twitter profiles but they clamped down
  driver: url

The letters FRV in the above screenshot refer to Fetching, Resolving and Validating. If the source definition is not OpenAPI, a step C for Converting is also shown.

The package api-registry provides the following scripts:

image2 image4 image7

These script functions could be exposed by a REST API, making APIs.guru a ‘competitor’ to the Apigee registry and the Apicurio registry. This would probably require changing the metadata store from simple YAML to something like AceBase.

  • The ‘dashboard’ script creates a number of badges for the website, such as:
  • The ‘404’ script shows the provider, service, version and url of failing retrievals.
  • The ‘deploy’ script is run by the GitHub Actions CI workflow.
  • The deploy script also writes a summary of the directory as metrics.json.

image9

image6

image10
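
The ‘404’ report described above can be pictured as a walk over the registry. This is a rough Python sketch, assuming the registry has been loaded into nested dicts shaped like the example entry; failing_retrievals is a hypothetical name, the provider in the sample data is made up, and treating any key without a source as a non-version entry is a guess about the structure.

```python
def failing_retrievals(registry):
    """Yield (provider, service, version, url, status) for 4xx/5xx retrievals."""
    for provider, entry in registry.items():
        for service, versions in entry.get("apis", {}).items():
            for version, api in versions.items():
                # Assumption: only real version entries carry a `source`;
                # this skips sibling keys such as the patch block.
                if not isinstance(api, dict) or "source" not in api:
                    continue
                status = api.get("status", 200)
                if status >= 400:
                    yield provider, service, version, api["source"]["url"], status


# Made-up registry data for illustration only.
registry = {
    "example.com": {
        "driver": "url",
        "apis": {
            "": {
                "1.0.0": {
                    "status": 404,
                    "source": {"format": "openapi", "url": "https://example.com/v1/openapi.yaml"},
                },
                "2.0.0": {
                    "status": 200,
                    "source": {"format": "openapi", "url": "https://example.com/v2/openapi.yaml"},
                },
                "Patch": {"info": {"x-apisguru-categories": ["developer_tools"]}},
            }
        },
    }
}

for row in failing_retrievals(registry):
    print(*row)
```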

Current Issues

  1. I do not currently have a good story for the Azure APIs which are in OpenAPI v2. There are a lot of these: the source repo contains 52,567 files at last count, occupying 415 MB. The directory structure is complex (it is hard to map the multiple subdirectory structure to a single service value) and it is difficult to tell top-level API files apart from component files. As a result, they have not been updated for many months, but also nobody has noticed in all that time! Possible solution: make service an array of values, with 0..n being allowable, instead of a single string where “” is allowed.
  2. There are issues and PRs
  3. There is only one regular external contributor to the directory: Helen Kosova at SmartBear.
  4. There are two interested parties who report bugs: Tim Perry of HTTPToolkit and Hans Juergen Rennau, who is creating a JSON Schema linter.
  5. There are a number of integrations with the directory’s API, but only one Partner Level sponsor of the OpenCollective fund: ApiDeck (until recently, $100 per calendar month), which paid for recent website improvements.
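
The ‘service as an array’ idea from issue 1 could look something like this. It is a small Python sketch under assumptions: definition_path is a hypothetical helper, the path layout is inferred from the filename field in the registry example, and the Azure-style service segments in the usage lines are made up.

```python
from pathlib import PurePosixPath


def definition_path(provider, services, version, name="openapi.yaml"):
    """Build the repo path for a definition with zero or more service values."""
    # Drop empty strings so that the legacy "" service maps to no path segment.
    segments = [s for s in services if s]
    return str(PurePosixPath("APIs", provider, *segments, version, name))


# No service at all, matching the registry example.
print(definition_path("getpostman.com", [], "1.0.1"))
# Several service values, one per source subdirectory level (made-up values).
print(definition_path("azure.com", ["compute", "disks"], "2021-04-01"))
```

With a list, an empty list and a legacy single "" value produce the same path, so existing entries would not need to move.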