Skip to content

Commit

Permalink
feat: infer uid/gid instead of accepting as options
Browse files Browse the repository at this point in the history
BREAKING CHANGE: the uid gid options are no longer respected or
necessary.  As of this change, cacache will always match the cache
contents to the ownership of the cache directory (or its parent
directory), regardless of what the caller passes in.

Reasoning:

The number one reason to use a uid or gid option was to keep root-owned
files from causing problems in the cache.  In npm's case, this meant
that CLI's ./lib/command.js had to work out the appropriate uid and gid,
then pass it to the libnpmcommand module, which had to in turn pass the
uid and gid to npm-registry-fetch, which then passed it to
make-fetch-happen, which passed it to cacache.  (For package fetching,
pacote would be in that mix as well.)

Added to that, `cacache.rm()` will actually _write_ a file into the
cache index, but has no way to accept an option so that its call to
entry-index.js will write the index with the appropriate uid/gid.
Little ownership bugs were all over the place, and tricky to trace
through.  (Why should make-fetch-happen even care about accepting or
passing uids and gids?  It's an http library.)

This change allows us to keep the cache from having mixed ownership in
any situation.

Of course, this _does_ mean that if you have a root-owned but
user-writable folder (for example, `/tmp`), then the cache will try to
chown everything to root.

The solution is for the user to create a folder, make it user-owned, and
use that, rather than relying on cacache to create the root cache folder.

If we decide to restore the uid/gid opts, and use ownership inferrence
only when uid/gid are unset, then take care to also make rm take an
option object, and pass it through to entry-index.js.
  • Loading branch information
isaacs committed Jul 15, 2019
1 parent 676cb32 commit ac84d14
Show file tree
Hide file tree
Showing 9 changed files with 232 additions and 90 deletions.
51 changes: 34 additions & 17 deletions README.md
@@ -1,12 +1,16 @@
# cacache [![npm version](https://img.shields.io/npm/v/cacache.svg)](https://npm.im/cacache) [![license](https://img.shields.io/npm/l/cacache.svg)](https://npm.im/cacache) [![Travis](https://img.shields.io/travis/zkat/cacache.svg)](https://travis-ci.org/zkat/cacache) [![AppVeyor](https://ci.appveyor.com/api/projects/status/github/zkat/cacache?svg=true)](https://ci.appveyor.com/project/zkat/cacache) [![Coverage Status](https://coveralls.io/repos/github/zkat/cacache/badge.svg?branch=latest)](https://coveralls.io/github/zkat/cacache?branch=latest)
# cacache [![npm version](https://img.shields.io/npm/v/cacache.svg)](https://npm.im/cacache) [![license](https://img.shields.io/npm/l/cacache.svg)](https://npm.im/cacache) [![Travis](https://img.shields.io/travis/npm/cacache.svg)](https://travis-ci.org/npm/cacache) [![AppVeyor](https://ci.appveyor.com/api/projects/status/github/npm/cacache?svg=true)](https://ci.appveyor.com/project/npm/cacache) [![Coverage Status](https://coveralls.io/repos/github/npm/cacache/badge.svg?branch=latest)](https://coveralls.io/github/npm/cacache?branch=latest)

[`cacache`](https://github.com/zkat/cacache) is a Node.js library for managing
[`cacache`](https://github.com/npm/cacache) is a Node.js library for managing
local key and content address caches. It's really fast, really good at
concurrency, and it will never give you corrupted data, even if cache files
get corrupted or manipulated.

It was originally written to be used as [npm](https://npm.im)'s local cache, but
can just as easily be used on its own.
On systems that support user and group settings on files, cacache will
match the `uid` and `gid` values to the folder where the cache lives, even
when running as `root`.

It was written to be used as [npm](https://npm.im)'s local cache, but can
just as easily be used on its own.

_Translations: [español](README.es.md)_

Expand Down Expand Up @@ -414,13 +418,6 @@ may also use any anagram of `'modnar'` to use this feature.
Currently only supports one algorithm at a time (i.e., an array length of
exactly `1`). Has no effect if `opts.integrity` is present.

##### `opts.uid`/`opts.gid`

If provided, cacache will do its best to make sure any new files added to the
cache use this particular `uid`/`gid` combination. This can be used,
for example, to drop permissions when someone uses `sudo`, but cacache makes
no assumptions about your needs here.

##### `opts.memoize`

Default: null
Expand Down Expand Up @@ -498,10 +495,11 @@ Completely resets the in-memory entry cache.
Returns a unique temporary directory inside the cache's `tmp` dir. This
directory will use the same safe user assignment that all the other stuff use.

Once the directory is made, it's the user's responsibility that all files within
are made according to the same `opts.gid`/`opts.uid` settings that would be
passed in. If not, you can ask cacache to do it for you by calling
[`tmp.fix()`](#tmp-fix), which will fix all tmp directory permissions.
Once the directory is made, it's the user's responsibility that all files
within are given the appropriate `gid`/`uid` ownership settings to match
the rest of the cache. If not, you can ask cacache to do it for you by
calling [`tmp.fix()`](#tmp-fix), which will fix all tmp directory
permissions.

If you want automatic cleanup of this directory, use
[`tmp.withTmp()`](#with-tpm)
Expand All @@ -514,6 +512,27 @@ cacache.tmp.mkdir(cache).then(dir => {
})
```

#### <a name="tmp-fix"></a> `> tmp.fix(cache) -> Promise`

Sets the `uid` and `gid` properties on all files and folders within the tmp
folder to match the rest of the cache.

Use this after manually writing files into [`tmp.mkdir`](#tmp-mkdir) or
[`tmp.withTmp`](#with-tmp).

##### Example

```javascript
cacache.tmp.mkdir(cache).then(dir => {
writeFile(path.join(dir, 'file'), someData).then(() => {
// make sure we didn't just put a root-owned file in the cache
cacache.tmp.fix().then(() => {
// all uids and gids match now
})
})
})
```

#### <a name="with-tmp"></a> `> tmp.withTmp(cache, opts, cb) -> Promise`

Creates a temporary directory with [`tmp.mkdir()`](#tmp-mkdir) and calls `cb`
Expand Down Expand Up @@ -591,8 +610,6 @@ of entries removed, etc.

##### Options

* `opts.uid` - uid to assign to cache and its contents
* `opts.gid` - gid to assign to cache and its contents
* `opts.filter` - receives a formatted entry. Return false to remove it.
Note: might be called more than once on the same entry.

Expand Down
6 changes: 3 additions & 3 deletions lib/content/write.js
Expand Up @@ -121,7 +121,7 @@ function pipeToTmp (inputStream, cache, tmpTarget, opts, errCheck) {
function makeTmp (cache, opts) {
const tmpTarget = uniqueFilename(path.join(cache, 'tmp'), opts.tmpPrefix)
return fixOwner.mkdirfix(
path.dirname(tmpTarget), opts.uid, opts.gid
cache, path.dirname(tmpTarget)
).then(() => ({
target: tmpTarget,
moved: false
Expand All @@ -134,14 +134,14 @@ function moveToDestination (tmp, cache, sri, opts, errCheck) {
const destDir = path.dirname(destination)

return fixOwner.mkdirfix(
destDir, opts.uid, opts.gid
cache, destDir
).then(() => {
errCheck && errCheck()
return moveFile(tmp.target, destination)
}).then(() => {
errCheck && errCheck()
tmp.moved = true
return fixOwner.chownr(destination, opts.uid, opts.gid)
return fixOwner.chownr(cache, destination)
})
}

Expand Down
12 changes: 5 additions & 7 deletions lib/entry-index.js
Expand Up @@ -32,9 +32,7 @@ module.exports.NotFoundError = class NotFoundError extends Error {

const IndexOpts = figgyPudding({
metadata: {},
size: {},
uid: {},
gid: {}
size: {}
})

module.exports.insert = insert
Expand All @@ -49,7 +47,7 @@ function insert (cache, key, integrity, opts) {
metadata: opts.metadata
}
return fixOwner.mkdirfix(
path.dirname(bucket), opts.uid, opts.gid
cache, path.dirname(bucket)
).then(() => {
const stringified = JSON.stringify(entry)
// NOTE - Cleverness ahoy!
Expand All @@ -63,7 +61,7 @@ function insert (cache, key, integrity, opts) {
bucket, `\n${hashEntry(stringified)}\t${stringified}`
)
}).then(
() => fixOwner.chownr(bucket, opts.uid, opts.gid)
() => fixOwner.chownr(cache, bucket)
).catch({ code: 'ENOENT' }, () => {
// There's a class of race conditions that happen when things get deleted
// during fixOwner, or between the two mkdirfix/chownr calls.
Expand All @@ -86,13 +84,13 @@ function insertSync (cache, key, integrity, opts) {
size: opts.size,
metadata: opts.metadata
}
fixOwner.mkdirfix.sync(path.dirname(bucket), opts.uid, opts.gid)
fixOwner.mkdirfix.sync(cache, path.dirname(bucket))
const stringified = JSON.stringify(entry)
fs.appendFileSync(
bucket, `\n${hashEntry(stringified)}\t${stringified}`
)
try {
fixOwner.chownr.sync(bucket, opts.uid, opts.gid)
fixOwner.chownr.sync(cache, bucket)
} catch (err) {
if (err.code !== 'ENOENT') {
throw err
Expand Down
106 changes: 69 additions & 37 deletions lib/util/fix-owner.js
Expand Up @@ -5,83 +5,115 @@ const BB = require('bluebird')
const chownr = BB.promisify(require('chownr'))
const mkdirp = BB.promisify(require('mkdirp'))
const inflight = require('promise-inflight')
const inferOwner = require('./infer-owner.js')

// Memoize getuid()/getgid() calls.
// patch process.setuid/setgid to invalidate cached value on change
const self = { uid: null, gid: null }
const getSelf = () => {
if (typeof self.uid !== 'number') {
self.uid = process.getuid()
const setuid = process.setuid
process.setuid = (uid) => {
self.uid = null
process.setuid = setuid
return process.setuid(uid)
}
}
if (typeof self.gid !== 'number') {
self.gid = process.getgid()
const setgid = process.setgid
process.setgid = (gid) => {
self.gid = null
process.setgid = setgid
return process.setgid(gid)
}
}
}

module.exports.chownr = fixOwner
function fixOwner (filepath, uid, gid) {
function fixOwner (cache, filepath) {
if (!process.getuid) {
// This platform doesn't need ownership fixing
return BB.resolve()
}
if (typeof uid !== 'number' && typeof gid !== 'number') {
// There's no permissions override. Nothing to do here.
return BB.resolve()
}
if ((typeof uid === 'number' && process.getuid() === uid) &&
(typeof gid === 'number' && process.getgid() === gid)) {
return inferOwner(cache).then(owner => {
const { uid, gid } = owner
getSelf()

// No need to override if it's already what we used.
return BB.resolve()
}
return inflight(
'fixOwner: fixing ownership on ' + filepath,
() => chownr(
filepath,
typeof uid === 'number' ? uid : process.getuid(),
typeof gid === 'number' ? gid : process.getgid()
).catch({ code: 'ENOENT' }, () => null)
)
if (self.uid === uid && self.gid === gid) {
return
}

return inflight(
'fixOwner: fixing ownership on ' + filepath,
() => chownr(
filepath,
typeof uid === 'number' ? uid : self.uid,
typeof gid === 'number' ? gid : self.gid
).catch({ code: 'ENOENT' }, () => null)
)
})
}

module.exports.chownr.sync = fixOwnerSync
function fixOwnerSync (filepath, uid, gid) {
function fixOwnerSync (cache, filepath) {
if (!process.getuid) {
// This platform doesn't need ownership fixing
return
}
if (typeof uid !== 'number' && typeof gid !== 'number') {
// There's no permissions override. Nothing to do here.
return
}
if ((typeof uid === 'number' && process.getuid() === uid) &&
(typeof gid === 'number' && process.getgid() === gid)) {
const { uid, gid } = inferOwner.sync(cache)
getSelf()
if (self.uid === uid && self.gid === gid) {
// No need to override if it's already what we used.
return
}
try {
chownr.sync(
filepath,
typeof uid === 'number' ? uid : process.getuid(),
typeof gid === 'number' ? gid : process.getgid()
typeof uid === 'number' ? uid : self.uid,
typeof gid === 'number' ? gid : self.gid
)
} catch (err) {
// only catch ENOENT, any other error is a problem.
if (err.code === 'ENOENT') {
return null
}
throw err
}
}

module.exports.mkdirfix = mkdirfix
function mkdirfix (p, uid, gid, cb) {
return mkdirp(p).then(made => {
if (made) {
return fixOwner(made, uid, gid).then(() => made)
}
}).catch({ code: 'EEXIST' }, () => {
// There's a race in mkdirp!
return fixOwner(p, uid, gid).then(() => null)
function mkdirfix (cache, p, cb) {
// we have to infer the owner _before_ making the directory, even though
// we aren't going to use the results, since the cache itself might not
// exist yet. If we mkdirp it, then our current uid/gid will be assumed
// to be correct if it creates the cache folder in the process.
return inferOwner(cache).then(() => {
return mkdirp(p).then(made => {
if (made) {
return fixOwner(cache, made).then(() => made)
}
}).catch({ code: 'EEXIST' }, () => {
// There's a race in mkdirp!
return fixOwner(cache, p).then(() => null)
})
})
}

module.exports.mkdirfix.sync = mkdirfixSync
function mkdirfixSync (p, uid, gid) {
function mkdirfixSync (cache, p) {
try {
inferOwner.sync(cache)
const made = mkdirp.sync(p)
if (made) {
fixOwnerSync(made, uid, gid)
fixOwnerSync(cache, made)
return made
}
} catch (err) {
if (err.code === 'EEXIST') {
fixOwnerSync(p, uid, gid)
fixOwnerSync(cache, p)
return null
} else {
throw err
Expand Down
80 changes: 80 additions & 0 deletions lib/util/infer-owner.js
@@ -0,0 +1,80 @@
'use strict'

// This is only called by lib/util/fix-owner.js
//
// Get the uid/gid from the cache folder itself, not from
// settings being passed in. Too flaky otherwise, because the
// opts baton has to be passed properrly through half a dozen
// different modules.
//
// This module keeps a Map of cache=>{uid,gid}. If not in the map,
// then stat the folder, then the parent, ..., until it finds a folder
// that exists, and use that folder's uid and gid as the owner.
//
// If we don't have getuid/getgid, then this never gets called.

const BB = require('bluebird')
const fs = require('fs')
const lstat = BB.promisify(fs.lstat)
const lstatSync = fs.lstatSync
const { dirname } = require('path')
const inflight = require('promise-inflight')

const cacheToOwner = new Map()

const inferOwner = cache => {
if (cacheToOwner.has(cache)) {
// already inferred it
return BB.resolve(cacheToOwner.get(cache))
}

const statThen = st => {
const { uid, gid } = st
cacheToOwner.set(cache, { uid, gid })
return { uid, gid }
}
// check the parent if the cache itself fails
// likely it does not exist yet.
const parent = dirname(cache)
const parentTrap = parent === cache ? null : er => {
return inferOwner(parent).then((owner) => {
cacheToOwner.set(cache, owner)
return owner
})
}
return lstat(cache).then(statThen, parentTrap)
}

const inferOwnerSync = cache => {
if (cacheToOwner.has(cache)) {
// already inferred it
return cacheToOwner.get(cache)
}

// the parent we'll check if it doesn't exist yet
const parent = dirname(cache)
// avoid obscuring call site by re-throwing
// "catch" the error by returning from a finally,
// only if we're not at the root, and the parent call works.
let threw = true
try {
const st = lstatSync(cache)
threw = false
const { uid, gid } = st
cacheToOwner.set(cache, { uid, gid })
return { uid, gid }
} finally {
if (threw && parent !== cache) {
const owner = inferOwnerSync(parent)
cacheToOwner.set(cache, owner)
return owner // eslint-disable-line no-unsafe-finally
}
}
}

module.exports = cache => inflight(
'inferOwner: detecting ownership of ' + cache,
() => inferOwner(cache)
)

module.exports.sync = inferOwnerSync

0 comments on commit ac84d14

Please sign in to comment.