
How to prevent yarn install from running twice in Makefile?

Tags:

gnu-make

I have this in my Makefile:

node_modules: yarn.lock
    yarn install --production=false
    touch node_modules

yarn.lock: package.json
    yarn install --production=false
    touch yarn.lock

Basically, if the node_modules directory is missing (or someone has tampered with it by adding/removing files), or yarn.lock has been updated, then it should run yarn install to rebuild/integrity check the node_modules dir.

However, if yarn.lock is missing, it can be rebuilt from package.json, or if package.json is updated, then it should install and rebuild the lock file.

The problem is when both node_modules and yarn.lock are missing, then the same commands run twice.

How can I prevent this?


I can nearly get it to work by wrapping the directives in a conditional:

ifneq ("$(wildcard yarn.lock)","")
node_modules: yarn.lock
    @yarn install --production=false
    touch node_modules

yarn.lock: package.json
    touch yarn.lock
else # yarn.lock does not exist
node_modules: yarn.lock
    touch node_modules

yarn.lock: package.json
    @yarn install --production=false
endif

Now, if yarn.lock exists and you touch package.json and then run make node_modules, it will touch yarn.lock, which in turn causes node_modules to rebuild, just like I want.

However, if you touch package.json and then make yarn.lock, technically it should attempt a yarn install but it won't because I removed the command from this directive:

yarn.lock: package.json
    touch yarn.lock

To prevent it from running twice in the former scenario.

mpen asked May 18 '17 01:05


1 Answer

To a first approximation, consider the approach illustrated here:

Makefile (1)

.PHONY: all clean

all: yarn.lock

yarn.lock: node_modules package.json
    $(MAKE) clean
    yarn install --production=false

node_modules:
    mkdir -p $@

clean:
    rm -fr node_modules yarn.lock

This will never run yarn install redundantly, and it's a somewhat more robust solution than the one you're considering. I'll explain.

The one source item in the problem is the package.json. It is the sole logical prerequisite of everything else and is not itself to be built.

The yarn.lock is a build-artefact whose production signifies that yarn install has been done successfully with respect to the snapshot of package.json that existed when it was done. yarn install will subsequently consider the installation up-to-date as long as yarn.lock exists and has contents that "agree" with package.json by criteria that are algorithmically embodied in yarn install.

So, viewed simplistically, the mission of this build is just to make yarn.lock up-to-date with respect to package.json:

yarn.lock: package.json
    yarn install --production=false

But it's actually more complicated. yarn.lock is the build target but it isn't the only build-artefact and it isn't even one of the artefacts of primary value. Those, of course, are whatever artefacts will populate node_modules as a result of running yarn install.

So the primary build-artefacts appear as side effects of this build, while the actual target, yarn.lock, matters to us only as a token that the primary artefacts, whatever they may be, have been made up-to-date with the package.json.

And it is a frail token. Agencies can mess with the contents of node_modules - adding files that shouldn't be there, deleting or modifying ones that should - and yarn install won't do anything to rectify it as long as it considers yarn.lock up-to-date with package.json, by its own criteria.

You adverted to that fragility in explaining the suggested recipe:

node_modules: yarn.lock
    yarn install --production=false
    touch node_modules

if the node_modules directory is missing (or someone has tampered with it by adding/removing files), or yarn.lock has been updated, then it should run yarn install to rebuild/integrity check the node_modules dir.

But that rule is the wrong way round to be triggered by such tampering. The tampering - if you're lucky - will update the modification time of node_modules. But that will make it younger, not older, than yarn.lock, and won't fire the recipe. The recipe is only good for the case in which node_modules does not exist.
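The direction of that timestamp comparison can be checked outside make. The following is a minimal shell sketch, not part of the original Makefile; the scratch directory, the fixed timestamps and the file name interloper are assumptions chosen for illustration. It suggests that adding a file directly inside node_modules leaves the directory newer than yarn.lock, which is exactly the state in which a node_modules: yarn.lock rule counts as up-to-date:

```shell
# Work in a throwaway directory so nothing real is touched.
work=$(mktemp -d)
cd "$work" || exit 1

mkdir node_modules
# Mimic the asker's recipe: yarn.lock first, then `touch node_modules` a minute later.
touch -t 202001010000 yarn.lock
touch -t 202001010001 node_modules

# "Tampering": add a stray file directly inside node_modules.
# This bumps the directory's modification time to the present.
touch node_modules/interloper

# For the rule `node_modules: yarn.lock`, make reruns the recipe when the
# target is missing or the prerequisite (yarn.lock) is newer than the target.
if [ ! -e node_modules ] || [ yarn.lock -nt node_modules ]; then
    verdict="recipe would run"
else
    verdict="recipe would not run"
fi
echo "$verdict"
```

The comparison is made with the shell's -nt test rather than with make itself, so this only approximates make's behaviour.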

That weakness is mitigated by the recipe:

yarn.lock: node_modules package.json
    $(MAKE) clean
    yarn install --production=false 

If yarn.lock doesn't exist, or is out-of-date with respect to either node_modules or package.json, we'll remake all the build-artefacts from scratch.
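The out-of-date test that this rule relies on can be approximated in plain shell. This is a simplified sketch: needs_rebuild and check are hypothetical helpers, not make features, and they ignore details of real make such as remaking missing prerequisites first. The fixed timestamps are also assumptions for illustration. The sketch only shows that yarn.lock is treated as stale when it is missing or older than either prerequisite:

```shell
# needs_rebuild TARGET PREREQ...
# Exit status 0 if TARGET is missing or any PREREQ is newer than it.
needs_rebuild() {
    target=$1
    shift
    [ -e "$target" ] || return 0
    for prereq in "$@"; do
        if [ "$prereq" -nt "$target" ]; then
            return 0
        fi
    done
    return 1
}

# Report the state of yarn.lock relative to its two prerequisites.
check() {
    if needs_rebuild yarn.lock node_modules package.json; then
        echo "stale"
    else
        echo "up-to-date"
    fi
}

work=$(mktemp -d)
cd "$work" || exit 1
mkdir node_modules
touch -t 202001010000 package.json node_modules
touch -t 202001010001 yarn.lock

untouched=$(check)                      # nothing changed since the "install"
touch -t 202001010002 package.json      # package.json edited afterwards
after_edit=$(check)
touch -t 202001010003 yarn.lock         # pretend the install was redone
rm yarn.lock                            # then the lock file is deleted
after_delete=$(check)

echo "$untouched / $after_edit / $after_delete"
```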

That's better, but brings with it a bootstrap problem, when neither of the artefacts yarn.lock or node_modules exists but node_modules - which is to be populated as a by-product of making yarn.lock - is also a prerequisite of yarn.lock.

It's a trivial problem however. The prerequisite for yarn.lock is the mere existence of node_modules and it is satisfiable prior to making yarn.lock, and the contents of node_modules, up-to-date - by just adding the recipe:

node_modules:
    mkdir -p $@

With this, if node_modules ever doesn't exist, it will be created as a prerequisite of yarn.lock, making it newer than yarn.lock, and requiring yarn.lock and the primary build-artefacts to be made.

But...

This solution expresses the dependency relations essentially right and - as a consequence - shows how yarn install never needs to be run redundantly. And it corrects the wrong-way-round bug in your tamper-detection logic.

But it still falls short of strong tamper-detection.

The tamper-detection mechanism we've got is: Something happens in the node_modules directory that makes its modification date later than that of yarn.lock. That will detect some tampering, but not all tampering.

As a filesystem object, a directory is modified - and its modification time updated - if and only if an immediate child object is added, removed, or renamed. So the tamper-detection mechanism is blind to all events inside any subdirectory of node_modules and to any modification of an existing file or subdirectory of node_modules except renaming it. That leaves ample scope for messing up the node_modules.
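That blind spot is easy to reproduce. Here is a minimal shell sketch, using a scratch directory, fixed timestamps and the names vary and interloper purely as assumptions for illustration. Changes made inside a subdirectory should leave the modification time of node_modules itself untouched, so it stays older than yarn.lock:

```shell
work=$(mktemp -d)
cd "$work" || exit 1

mkdir -p node_modules/vary
touch node_modules/vary/README.md
# Backdate the top-level directory, then record a slightly later yarn.lock.
touch -t 202001010000 node_modules
touch -t 202001010001 yarn.lock

# Tamper below the top level only.
touch node_modules/vary/README.md      # modify an existing file
touch node_modules/vary/interloper     # add a file: updates vary/, not node_modules/
rm node_modules/vary/README.md         # remove a file: again only vary/ changes

# The mtime-based check compares the top-level directory with yarn.lock.
if [ node_modules -nt yarn.lock ]; then
    seen="tampering detected"
else
    seen="tampering missed"
fi
echo "$seen"
```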

In that light, you might decide to:-

Stick

Flimsy tamper-detection is better than none. I don't want to use anything more expensive than this.

But you probably wouldn't. More likely alternatives:

Fold

Tamper-detection this flimsy is no better than none, so I'll fall back to:

yarn.lock: package.json
    yarn install --production=false

I'll regard improper tampering as out-of-scope for my build. If it happens, something will break, I'll notice it, make clean and try again.

Up the ante

I want strong tamper-detection.

Strong tamper-detection takes rather heavier lifting - but not much heavier. You need to force a clean yarn install, or not, depending on the outcome of an old-versus-new comparison between complete manifests of the contents of node_modules - manifests informative enough that any material difference will show up.

A manifest detailing the pathname and modification time of every file in node_modules is the best candidate. This manifest will contain the information that make would need to know, and would get from the file system, if the elusive primary artefacts of this build could be spelled out to it, and a change in that information relative to its last recorded state is a reliable trigger to remake everything. Thus:

Makefile (2)

RM := rm -fr
MANIFEST_DIR := .manifest
LAST_MANIFEST := $(MANIFEST_DIR)/node_modules.last
NEW_MANIFEST := $(MANIFEST_DIR)/node_modules.peek
GEN_MANIFEST := find node_modules/ -exec stat -c '%n %y' {} \;

$(shell mkdir -p $(MANIFEST_DIR) node_modules)
$(if $(wildcard $(LAST_MANIFEST)),,$(shell touch $(LAST_MANIFEST)))
$(shell $(GEN_MANIFEST) > $(NEW_MANIFEST))
$(shell cmp -s $(LAST_MANIFEST) $(NEW_MANIFEST) || touch node_modules)

.PHONY: all clean

all: $(LAST_MANIFEST)

yarn.lock: node_modules package.json
    $(RM) yarn.lock node_modules
    yarn install --production=false

$(LAST_MANIFEST): yarn.lock
    $(GEN_MANIFEST) > $@

clean:
    $(RM) yarn.lock node_modules $(MANIFEST_DIR)

This develops Makefile (1) mainly with the unconditionally executed apparatus at the top, which:-

  • Ensures (as before) we start with a node_modules directory, even if empty.
  • Ensures we start with a hidden directory (.manifest) for working in and for persisting the latest manifest of node_modules. (Analogous to the hidden .deps directory classically used for persisting autodependency files in C/C++ makes).
  • Ensures we start with a persisted manifest, even if empty.
  • Generates a new, true, manifest snapshot with: find node_modules/ -exec stat -c '%n %y' {} \;, which writes <filename> <modification_time> for each item under node_modules. This snapshot is true of node_modules as it is, but not necessarily true of node_modules as it should be. (Should it follow symlinks? - find -L ...? No. Because make wouldn't follow symlinks for targets or prerequisites).
  • Compares the new, true, snapshot with the persisted manifest and, if there is any difference, then touches node_modules.

This amounts to a build preamble that will update the modification time of node_modules, or not, by a strong tamper-detecting test. Then the build is much as before except that its target is no longer yarn.lock, but a new persisted manifest, $(LAST_MANIFEST), that is always an immediate post-yarn-install snapshot and accordingly dependent on yarn.lock.
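The manifest comparison in that preamble can be tried on its own in a shell. The sketch below is an illustration under stated assumptions, not the Makefile's exact behaviour: it assumes GNU find and GNU stat (for stat -c), runs in a scratch directory with fixed timestamps, wraps the command in a hypothetical gen_manifest function, and pipes the output through sort, which the Makefile does not do, so that the comparison does not depend on the order in which find lists entries:

```shell
work=$(mktemp -d)
cd "$work" || exit 1

mkdir -p node_modules/vary .manifest
touch -t 202001010000 node_modules/vary/README.md node_modules/vary node_modules

# Same find/stat pair as GEN_MANIFEST, plus `sort` (an addition) for a stable order.
gen_manifest() {
    find node_modules/ -exec stat -c '%n %y' {} \; | sort
}

# Report whether the persisted and the fresh manifest agree.
compare() {
    if cmp -s .manifest/node_modules.last .manifest/node_modules.peek; then
        echo "same"
    else
        echo "changed"
    fi
}

# Persist a manifest, then take a fresh snapshot with nothing altered.
gen_manifest > .manifest/node_modules.last
gen_manifest > .manifest/node_modules.peek
before=$(compare)

# Alter only the timestamp of a file deep inside node_modules.
touch -t 202101010000 node_modules/vary/README.md
gen_manifest > .manifest/node_modules.peek
after=$(compare)

echo "$before -> $after"
```

In the Makefile, a "changed" outcome is what leads to touch node_modules, which then makes yarn.lock out of date.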

A workout for Makefile (2)

For a lab-rat package.json I'll use:

{
  "name": "node-js-sample",
  "version": "0.2.0",
  "description": "A sample Node.js app using Express 4",
  "main": "index.js",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "express": "^4.13.3"
  },
  "engines": {
    "node": "4.0.0"
  },
  "repository": {
    "type": "git",
    "url": "https://github.com/heroku/node-js-sample"
  },
  "keywords": [
    "node",
    "heroku",
    "express"
  ],
  "author": "Mark Pundsack",
  "contributors": [
    "Zeke Sikelianos <[email protected]> (http://zeke.sikelianos.com)"
  ],
  "license": "MIT"
}

Make from scratch

$ make
rm -fr yarn.lock node_modules
yarn install --production=false
yarn install v0.24.5
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
success Saved lockfile.
Done in 1.17s.
find node_modules/ -exec stat -c '%n %y' {} \; > .manifest/node_modules.last

Change nothing and remake

$ make
make: Nothing to be done for 'all'.

Touch node_modules only

$ touch node_modules/
$ make
rm -fr yarn.lock node_modules
yarn install --production=false
yarn install v0.24.5
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
success Saved lockfile.
Done in 1.01s.
find node_modules/ -exec stat -c '%n %y' {} \; > .manifest/node_modules.last

Touch package.json only

$ touch package.json 
$ make
rm -fr yarn.lock node_modules
yarn install --production=false
yarn install v0.24.5
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
success Saved lockfile.
Done in 1.22s.
find node_modules/ -exec stat -c '%n %y' {} \; > .manifest/node_modules.last

Touch node_modules and package.json

$ touch package.json node_modules/
$ make
rm -fr yarn.lock node_modules
yarn install --production=false
yarn install v0.24.5
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
success Saved lockfile.
Done in 1.05s.
find node_modules/ -exec stat -c '%n %y' {} \; > .manifest/node_modules.last

Touch yarn.lock

$ touch yarn.lock 
$ make
find node_modules/ -exec stat -c '%n %y' {} \; > .manifest/node_modules.last

Delete yarn.lock

$ rm yarn.lock 
$ make
rm -fr yarn.lock node_modules
yarn install --production=false
yarn install v0.24.5
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
success Saved lockfile.
Done in 1.17s.
find node_modules/ -exec stat -c '%n %y' {} \; > .manifest/node_modules.last

Change a dependency in package.json

$ sed -i 's/4\.13\.3/4.15.3/' package.json
$ make
rm -fr yarn.lock node_modules
yarn install --production=false
yarn install v0.24.5
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
success Saved lockfile.
Done in 1.03s.
find node_modules/ -exec stat -c '%n %y' {} \; > .manifest/node_modules.last

Undo the change

$ sed -i 's/4\.15\.3/4.13.3/' package.json
$ make
rm -fr yarn.lock node_modules
yarn install --production=false
yarn install v0.24.5
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
success Saved lockfile.
Done in 2.35s.
find node_modules/ -exec stat -c '%n %y' {} \; > .manifest/node_modules.last

Touch an existing file in a subdirectory of node_modules

$ ls node_modules/vary/
HISTORY.md  index.js  LICENSE  package.json  README.md
$ touch node_modules/vary/README.md 
$ make
rm -fr yarn.lock node_modules
yarn install --production=false
yarn install v0.24.5
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
success Saved lockfile.
Done in 1.02s.
find node_modules/ -exec stat -c '%n %y' {} \; > .manifest/node_modules.last

Add a file to a subdirectory of node_modules

$ touch node_modules/vary/interloper
$ make
rm -fr yarn.lock node_modules
yarn install --production=false
yarn install v0.24.5
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
success Saved lockfile.
Done in 1.20s.
find node_modules/ -exec stat -c '%n %y' {} \; > .manifest/node_modules.last

Remove a file from a subdirectory of node_modules

$ rm node_modules/vary/README.md 
$ make
rm -fr yarn.lock node_modules
yarn install --production=false
yarn install v0.24.5
info No lockfile found.
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Building fresh packages...
success Saved lockfile.
Done in 1.16s.
find node_modules/ -exec stat -c '%n %y' {} \; > .manifest/node_modules.last
Mike Kinghan answered Oct 13 '22 07:10