
Publish packages for each individual API. #806

Closed
3 of 4 tasks
lukesneeringer opened this issue Sep 6, 2017 · 19 comments · Fixed by #2557
Labels
type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. web

Comments

@lukesneeringer

lukesneeringer commented Sep 6, 2017

This is going to be done in several steps:

  • Re-organize the directory structure to map to individual modules (chore: re-organize directory structure (and run generator) #1167)
  • Setup and publish a googleapis-common npm module that just contains shared bits that will be used by all packages
  • Generate a package.json for all of the individual APIs
  • Script the npm publish process so that each individual package is published along with the larger meta-package
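The third step above would give each API directory its own manifest. A minimal sketch of what such a generated package.json might look like (the name, version, entry point, and dependency range here are illustrative assumptions, not the actual generated output):

```json
{
  "name": "@googleapis/drive",
  "version": "0.1.0",
  "description": "Generated client library for the Google Drive API",
  "main": "v3.js",
  "dependencies": {
    "googleapis-common": "^1.0.0"
  }
}
```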

These steps are of course subject to change as we discover things through the process :)

--- UPDATE ---
We're down to the last step of actually publishing the individual modules. @alexander-fenster, @jkwlui, and @JustinBeckwith recently got together to chat about this, and came to a few conclusions:

State of the world today

  • We decide when to run the generator in a somewhat ad-hoc fashion: when someone asks for a release, we cut it.
  • We run the generator by having a developer clone the repo, run npm run generate, and then submit a PR with something like 1344247584 changes.
  • After checking in that PR, someone has to cut a new release with releasetool. This creates a new tag. It is almost always a semver major (see the fact that we're on v38).
  • The generator is somewhat monolithic. It works by blasting away the src/apis directory, and re-creating every API from scratch every time. This makes it easy to detect new APIs, and easy to detect removed APIs.

Where we need to be

  • We want a bot that runs the generator(s) nightly. We should use synthtool for this.
  • The bot should submit many PRs - one for each API added, removed, or modified.
  • The individual commits could use the conventional commit scope as a way to signal which API/package a change applies to.
  • Some combination of releasetool and semantic-release could be used to cut individual releases. A scoped tag would be used for each package release from the mono-repo.
  • The risk of this approach is that the number of tags could get out of control, fast.
  • Alternatively, we could approach this similar to the way the gRPC generated libraries work. We create a new GitHub repository for each API, and break the output out of this repo and into the individual package repos.
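The scoped-commit/scoped-tag idea above can be sketched with a small helper (hypothetical code, not part of releasetool or semantic-release; the commit format assumed is conventional commits like `fix(drive): ...`):

```typescript
// Extract the conventional-commit scope, which signals which API/package
// a change applies to (returns null for unscoped commits).
export function commitScope(message: string): string | null {
  const match = /^[a-z]+\(([^)]+)\)!?:/.exec(message);
  return match ? match[1] : null;
}

// Build the scoped tag that would mark a per-package release in the mono-repo.
export function scopedTag(scope: string, version: string): string {
  return `${scope}-v${version}`;
}
```

A commit like `fix(drive): handle empty responses` would then map to a tag for the drive package alone, rather than bumping the whole meta-package.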

There's still a lot to figure out, but I suspect @bcoe is gonna love this problem.

@nicoabie

@lukesneeringer #1167 is already merged, so I believe it can be marked as done.
Have tasks been created for all the pending steps so this feature can be implemented?

@JustinBeckwith
Contributor

👋 We are at the stage where npm pack can be run against each individual API, and the result could theoretically be pushed as individual modules. Honestly, the problem at this stage is the release pipeline and process. We could start cutting these today, but it would mean a semver major release every time we cut a build (sorta like it is today). With 232 APIs, we can't realistically look at the changelog for each, and manually cut a build every time there's a change. We need to build automation tools.

We're really blocked on #525. There are a variety of ways to approach it, but it hasn't been pressing just yet.

@alexander-fenster
Member

Just to add to the list: since each API can now be used in a browser, we need to be able to publish each individual API as a webpack bundle, versioned (e.g. drive-v1.2.3.min.js). This is a separate (but related) problem from autoreleases.

@nicoabie

Is webpack the best choice here? I think other bundlers such as Rollup should be considered, so that only what the user actually uses is shipped to the browser rather than the full API.

@bcoe
Contributor

bcoe commented Mar 24, 2019

👋 some hot takes from this conversation so far:

  • while I'm not sure that we'd use lerna's publication functionality, I think it would be worth using some of its functionality, mainly how it manages directory structure.
    • the value here is that it sets up shared libraries as symlinks; rather than one googleapis-common module, we could divide up shared libraries along whatever logical lines make sense and use a resolution tool to determine when new publications of dependents are necessary.
  • I'd advocate that standard-version might be a good tool to consider using, for moving towards a more automated release process:
    • it can parse conventional commit messages, making recommendations about version bumps and generating CHANGELOGs; but the nice thing is:
    • we can start by running the bin manually (it doesn't assume that it's tightly integrated into an automated GitHub workflow).
    • the bin isn't opinionated about platform (e.g., it doesn't assume it will be run in the context of the GitHub API) so it's easy to slot beside existing release tools like synthtool.
  • on the topic of webpack vs. rollup, etc.: I think this is probably a separate conversation topic, although we should make sure that as we refactor the codebase we don't make decisions that are hostile to bundlers, e.g., lazy-loading dependencies or using compiled dependencies.
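The version-bump recommendation bcoe describes can be sketched as follows (a simplified illustration of the idea, not standard-version's actual implementation):

```typescript
type Bump = 'major' | 'minor' | 'patch';

// Recommend a semver bump from conventional-commit messages: any breaking
// change forces a major; otherwise a feat commit upgrades patch to minor.
export function recommendBump(commits: string[]): Bump {
  let bump: Bump = 'patch';
  for (const message of commits) {
    if (/^[a-z]+(\([^)]*\))?!:/.test(message) || message.includes('BREAKING CHANGE')) {
      return 'major';
    }
    if (/^feat(\([^)]*\))?:/.test(message)) {
      bump = 'minor';
    }
  }
  return bump;
}
```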

@alexander-fenster
Member

alexander-fenster commented Mar 24, 2019

@nicoabie @bcoe re: webpack, there is no need to bundle all APIs since we now support webpacking just one API: you need to run webpack from inside the API folder. E.g. see the README from Drive API: https://github.com/googleapis/google-api-nodejs-client/tree/master/src/apis/drive#building-a-browser-bundle (the same applies to all other APIs).

(I'm not arguing webpack vs. Rollup vs. anything else here; webpack is just the one I used and know works. Other bundlers will probably work as well.)

@nicoabie

@alexander-fenster fair point; I think this issue is almost resolved then.
The thing with webpack is that you are shipping the full drive library (in this example) to the browser when the user only uses {drive, auth}. Building the library with Rollup instead of webpack would let the user's bundler (probably webpack) tree-shake the parts of the library that are not being used, lowering the amount of data that is needlessly sent to the browser.

@joonhocho

joonhocho commented Jul 13, 2019

any updates on this?
https://bundlephobia.com/result?p=googleapis@41.0.0
https://packagephobia.now.sh/result?p=googleapis
It says the install size is 45 MB.
Could you also make it side-effect free so that it's tree-shakable?
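Marking a package as side-effect free is a package.json flag that lets bundlers tree-shake unused exports; a sketch of the relevant fields (the entry-point paths here are illustrative assumptions, not the actual googleapis manifest):

```json
{
  "name": "googleapis",
  "sideEffects": false,
  "main": "build/src/index.js",
  "module": "build/esm/index.js"
}
```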

@YasharF

YasharF commented Oct 17, 2019

Any updates would be appreciated. The current version of the npm package is costly: 42.6 MB, 846 files, and 165 folders.

@danvk

danvk commented May 19, 2020

In case it's useful to others finding this thread… my use of googleapis was quite light (just one endpoint in the Google Sheets API) and I got good results by depending directly on googleapis-common and google-auth-library and copying over the methods I was interested in.

Before, using googleapis monolith:

import {promises as fs} from 'fs';
import {google, sheets_v4} from 'googleapis';

const creds = JSON.parse(await fs.readFile(credentialsFile, 'utf8'));
const client = new google.auth.JWT({
  email: creds.client_email,
  key: creds.private_key,
  scopes: ['https://www.googleapis.com/auth/spreadsheets.readonly'],
});

await client.authorize();

const sheets = google.sheets({version: 'v4', auth: client});

const result = await sheets.spreadsheets.values.get({
  spreadsheetId: sheetId,
  range: 'A:ZZ',
});

After, using the underlying libraries:

import {promises as fs} from 'fs';
import {JWT} from 'google-auth-library';
import {AuthPlus, createAPIRequest} from 'googleapis-common';

interface GetParams {
  spreadsheetId: string;
  range: string;
}

export interface ValueRange {
  majorDimension?: string | null;
  range?: string | null;
  values?: any[][] | null;
}

// This is adapted from google-api-nodejs-client:
// https://github.com/googleapis/google-api-nodejs-client/blob/master/src/apis/sheets/v4.ts#L6483
// See https://github.com/googleapis/google-api-nodejs-client/issues/806
function getValues(auth: JWT, params: GetParams) {
  return createAPIRequest<ValueRange>({
    options: {
      url: 'https://sheets.googleapis.com/v4/spreadsheets/{spreadsheetId}/values/{range}',
      method: 'GET',
    },
    params,
    requiredParams: ['spreadsheetId', 'range'],
    pathParams: ['range', 'spreadsheetId'],
    context: {
      _options: {
        auth,
      },
    },
  });
}

const creds = JSON.parse(await fs.readFile(credentialsFile, 'utf8'));
const auth = new AuthPlus();
const client = new auth.JWT({
  email: creds.client_email,
  key: creds.private_key,
  scopes: ['https://www.googleapis.com/auth/spreadsheets.readonly'],
});
await client.authorize();
const result = await getValues(client, {
  spreadsheetId: sheetId,
  range: 'A:ZZ',
});

I ran across this issue in the context of TypeScript getting slow. Removing googleapis let me remove 300+ .d.ts files from my project.

@dsegovia90

Hello, is there any update on this?

Having trouble tree-shaking googleapis. My particular use case is Google Cloud Functions (Firebase functions). The cold start of functions using googleapis is extremely slow, since the function needs to load the whole package at cold start, and I'm only using calendar and auth out of all the APIs.

@JustinBeckwith
Contributor

@nermaljcat just a heads up, I edited your comment to remove a bit of sass. I'll refer you to our code of conduct, in case there are any questions.

@nermaljcat

ok @JustinBeckwith - I've deleted my comments. I reject censorship and prefer not to participate in a censored forum.

@YasharF

YasharF commented Jun 18, 2020

It was delightful to read that your work on the automated release process hit its next step. I recall that was blocking this issue here. Thank you for the great work! #525 (comment)

@JustinBeckwith
Contributor

Just to keep y'all in the loop - one of the big remaining concerns over the split is the potential confusion with the packages for cloud-focused APIs in google-cloud-node. In many cases, there will be two very similar packages.

For example, as laid out today we'd have a @googleapis/datastore and a @google-cloud/datastore for about 40 APIs. This is wiiiiiildly confusing if you don't understand the differences. I'm interested in how folks think we can make this more clear.

As a first step, we landed #2242 which adds a specific callout on the individual READMEs.

The next adventure is googleapis/release-please#471

@proppy
Contributor

proppy commented Jun 24, 2020

I'm interested in how folks think we can make this more clear.

@JustinBeckwith wouldn't the cloud-focused packages be able to depend on the @googleapis/* ones, if those were split into separate artifacts? If yes, that could simply be explained as a low-level interface (just the raw types, slim client) vs. a high-level one (higher-level wrapper, fat client w/ friendly DX).

@JustinBeckwith
Contributor

Sorta not really :/ With a few exceptions, the majority of cloud packages in the @google-cloud scope are based on grpc / proto based interfaces. There is an entirely different generator that creates those packages.

We're starting to come around to the idea of @googleapis/datastore-rest as the naming convention for these, while keeping @google-cloud/datastore for the higher level modules.

@proppy
Contributor

proppy commented Jun 24, 2020

Something to consider might be to start by splitting out the APIs that are not cloud-first?

@bcoe
Contributor

bcoe commented Mar 4, 2021

👋 an update on this long-standing issue. There's work that's been happening this quarter that will facilitate publishing submodules, using the same release automation we use elsewhere -- this project has been bigger than expected, due to some of the constraints:

  • releasing each API individually would be too noisy in terms of the sheer number of PRs (so we want to combine the releases into one large PR).
  • we don't want to store secrets in GitHub Actions, so we've been developing an approach for publishing multiple submodules from our Kokoro CI/CD system.

In terms of timelines, I'm hopeful we'll be able to start testing individual package publishes within ~2 weeks.
