Cache go install-ed binaries #483

Open
SirSova opened this issue Jun 2, 2024 · 19 comments
Labels
feature request New feature or request to improve the current logic

Comments

@SirSova

SirSova commented Jun 2, 2024

Description:
Cache go install-ed binaries (optionally, I suppose) along with the Go module dependencies. In the post step (cache save), store the $GOBIN folder with all binaries installed during the workflow run.

Justification:
In my scenario, I use the tparse tool to prettify test results. I can imagine other cases, such as code generators; basically any tool used in pre-run/post-run scripts. For now, I have turned off the cache option of this action and written my own caching with actions/cache, but keeping it consistent across multiple workflows adds significant complexity.

Are you willing to submit a PR?
Sure, as soon as the feature is approved.

@SirSova added the "feature request" and "needs triage" labels on Jun 2, 2024

@HarithaVattikuti
Contributor

Hello @SirSova
We appreciate your suggestion for a new feature! We'll make sure to address it when we have the opportunity.

@silverwind

silverwind commented Jun 5, 2024

I don't know what GOBIN is, but GOMODCACHE (go env GOMODCACHE, a.k.a. $GOPATH/pkg/mod) is generally cacheable, and caching it would speed up any go run tool@version or go install tool@version invocations, so it would be welcome to include it in the caching.

@SirSova
Author

SirSova commented Jun 8, 2024

GOBIN is the directory where go install puts binaries. When it is unset, it defaults to $(go env GOPATH)/bin.

So if I run go install tool@version -- it won't add a new dependency in my go.mod (meaning it won't be cached), but inside my GH workflows I do this:

go install github.com/mfridman/tparse@vX.Y.Z
go test -json  ./... | tparse -all

It downloads and builds the tparse tool on every run, which I want to avoid.
And since these installations are managed by Go, I believe it's appropriate for the setup-go action to handle this.
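
To make the GOBIN lookup mentioned above concrete, here is a minimal sketch of the documented order in which go install picks its target directory: GOBIN if set, otherwise the bin directory of the first GOPATH entry, otherwise $HOME/go/bin. The function name installDir is made up for this example, and in practice go env GOBIN and go env GOPATH already give you the resolved values.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// installDir mirrors the documented lookup order for where `go install`
// writes binaries: GOBIN if set, else the bin directory of the first
// GOPATH entry, else <home>/go/bin.
// The name and signature are hypothetical, chosen for this example only.
func installDir(gobin, gopath, home string) string {
	if gobin != "" {
		return gobin
	}
	// GOPATH may hold several directories; SplitList returns an empty
	// slice for an empty string, so the fallback below still applies.
	if entries := filepath.SplitList(gopath); len(entries) > 0 {
		return filepath.Join(entries[0], "bin")
	}
	return filepath.Join(home, "go", "bin")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine home directory:", err)
		return
	}
	fmt.Println(installDir(os.Getenv("GOBIN"), os.Getenv("GOPATH"), home))
}
```

This directory is the one that would have to be added to the cache paths for installed tools to survive between runs.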

@silverwind

silverwind commented Jun 8, 2024

I have been experimenting with https://github.com/actions/cache and got good performance results by caching GOCACHE (build cache) and GOMODCACHE (module cache), but I see it as somewhat risky because it relies on Go correctly invalidating its caches, and I don't fully trust that yet.

@StefMa

StefMa commented Jun 8, 2024

Even though I would love to have this feature built into this action, I honestly think it's not the responsibility of this action to cache such things.

This action is designed to install go, not more. What you're doing with go is not really part of this action. Is it? 🤔

@tak11173132

I don't know what GOBIN is, but GOMODCACHE (go env GOMODCACHE, a.k.a. $GOPATH/pkg/mod) is generally cacheable, and caching it would speed up any go run tool@version or go install tool@version invocations, so it would be welcome to include it in the caching.

@silverwind

silverwind commented Jun 11, 2024

Even though I would love to have this feature built into this action, I honestly think it's not the responsibility of this action to cache such things.

This action is designed to install go, not more. What you're doing with go is not really part of this action. Is it? 🤔

I tend to agree that caching should not be in scope of setup-* actions (do one thing), but apparently these caching features have been creeping into them and setup-go is as far as I'm aware the only setup action that enables caching by default.

I think the most important thing is that only safe things should be cached and I don't know how safe it is to cache these go directories. There could always be undiscovered cache invalidation bugs in golang.

@nferch

nferch commented Jun 18, 2024

I too would appreciate this feature. Even if it just cached the dependencies for something that was go install'd, that'd speed up my builds quite a bit.

I'm a bit confused/ignorant as to why this isn't happening already. I have multiple workflows that run at the same time on the same commit. Is it that the first run to complete doesn't contain the cached modules in GOMODCACHE?

@zaibon

zaibon commented Jun 20, 2024

I have been experimenting with https://github.com/actions/cache and got good performance results by caching GOCACHE (build cache) and GOMODCACHE (module cache), but I see it as somewhat risky because it relies on Go correctly invalidating its caches, and I don't fully trust that yet.

@silverwind can you share your solution in the meantime, while this issue is being decided/worked on?

@silverwind

silverwind commented Jun 20, 2024

Here is what I have been experimenting with and it seemed to work. The cache key surely is too aggressive and GOVERSION and go.mod hash can likely be removed.

- uses: actions/setup-go@v5
  with:
    go-version-file: go.mod
    check-latest: true
- id: vars
  run: |
    echo "GOCACHE=$(go env GOCACHE)" >> "$GITHUB_OUTPUT"
    echo "GOMODCACHE=$(go env GOMODCACHE)" >> "$GITHUB_OUTPUT"
    echo "GOVERSION=$(go env GOVERSION)" >> "$GITHUB_OUTPUT"
- uses: actions/cache/restore@v4
  with:
    path: |
      ${{ steps.vars.outputs.GOCACHE }}
      ${{ steps.vars.outputs.GOMODCACHE }}
    key: golint-v1-${{ github.job }}-${{ runner.os }}-${{ runner.arch }}-${{ steps.vars.outputs.GOVERSION }}-${{ hashFiles('go.mod') }}
- run: make lint
- uses: actions/cache/save@v4
  with:
    path: |
      ${{ steps.vars.outputs.GOCACHE }}
      ${{ steps.vars.outputs.GOMODCACHE }}
    key: golint-v1-${{ github.job }}-${{ runner.os }}-${{ runner.arch }}-${{ steps.vars.outputs.GOVERSION }}-${{ hashFiles('go.mod') }}
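
The vars step above shells out to go env once per variable. As a side note, go env -json can return several values in one call. The following is only a sketch of reading them from Go; parseGoEnv is a helper name invented for this example, and it assumes a go binary on PATH that is recent enough to report GOVERSION.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

// parseGoEnv decodes the JSON object printed by `go env -json NAME...`
// into a name -> value map. The helper name is hypothetical.
func parseGoEnv(data []byte) (map[string]string, error) {
	env := map[string]string{}
	if err := json.Unmarshal(data, &env); err != nil {
		return nil, fmt.Errorf("decoding go env output: %w", err)
	}
	return env, nil
}

func main() {
	names := []string{"GOCACHE", "GOMODCACHE", "GOVERSION"}
	args := append([]string{"env", "-json"}, names...)

	out, err := exec.Command("go", args...).Output()
	if err != nil {
		fmt.Fprintln(os.Stderr, "running go env:", err)
		return
	}
	env, err := parseGoEnv(out)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	// Print in the NAME=value form that $GITHUB_OUTPUT expects.
	for _, name := range names {
		fmt.Printf("%s=%s\n", name, env[name])
	}
}
```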

@nferch

nferch commented Jun 20, 2024

@silverwind thanks for sharing!

That seems to work for me, although the actions/cache/restore generates warnings when it tries to overwrite files that the actions/setup-go action restored from its cache.

I was able to achieve similar results by adding my Makefile to the cache-dependency-path value, which has two effects:

  • It keys the cache using the Makefile, which includes versions of the tools that are installed, so the cache is repopulated when the versions change
  • It forces a separate cache key from my main build workflow, which doesn't install or run the tools. This ensures that the cache used by this workflow contains the tools.

@peterbourgon

Running

go install github.com/mfridman/tparse@v0.14.0

will create a binary named tparse in $GOPATH/bin (if $GOPATH is set) or in $HOME/go/bin (if $GOPATH is not set). That binary will be tparse at version v0.14.0, but that version is not represented in the binary filename, or in anything else that can be reasonably captured by a cache key. So you can't really cache $GOPATH/bin (or $HOME/go/bin), at least not effectively.

@zaibon

zaibon commented Jun 24, 2024

will create a binary named tparse in $GOPATH/bin (if $GOPATH is set) or in $HOME/go/bin (if $GOPATH is not set). That binary will be tparse at version v0.14.0, but that version is not represented in the binary filename, or in anything else that can be reasonably captured by a cache key. So you can't really cache $GOPATH/bin (or $HOME/go/bin), at least not effectively.

The trick there is to let the user define a cache key generated from the script that contains go install github.com/mfridman/tparse@v0.14.0. That is where the version lives, so it can be used as the cache key.
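
A rough sketch of that idea, under the assumption that every tool is installed with a plain go install package@version line (no flags between install and the package). The names installRe and toolCacheKey and the key prefix are made up for illustration; in a workflow, hashFiles on the script or Makefile achieves a similar effect more coarsely.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// installRe matches `go install <package>@<version>` and captures the
// package@version argument. It does not handle flags such as -v.
var installRe = regexp.MustCompile(`go install\s+(\S+@\S+)`)

// toolCacheKey builds a cache key from every pinned `go install` found in
// script. The tools are sorted before hashing, so reordering lines keeps
// the key stable while changing any version changes it.
// The function name and key layout are hypothetical.
func toolCacheKey(prefix, script string) string {
	var tools []string
	for _, m := range installRe.FindAllStringSubmatch(script, -1) {
		tools = append(tools, m[1])
	}
	sort.Strings(tools)
	sum := sha256.Sum256([]byte(strings.Join(tools, "\n")))
	return prefix + "-" + hex.EncodeToString(sum[:8])
}

func main() {
	script := "go install github.com/mfridman/tparse@v0.14.0\n" +
		"go test -json ./... | tparse -all\n"
	fmt.Println(toolCacheKey("gobin-linux", script))
}
```

Note that a key like this only helps for pinned versions; with @latest the key never changes even though the resolved version can.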

@SirSova
Author

SirSova commented Jun 24, 2024

I suppose go install verifies the version of the binary using --version, a checksum, or something stored in the Go module cache, but right now, simply caching $GOPATH/bin stops it from downloading and building the binary again.

My working workflow with cached binaries:

      - name: Set up Go 1.22
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'
          cache: false # we use our own cache for go modules, since setup-go cache doesn't save `~/go/bin`

      - name: Check out source code
        uses: actions/checkout@v4

      - name: Cache go modules
        uses: actions/cache@v4
        with:
          # ~/go/bin is for `go install`-ed tools
          path: |
            ~/.cache/go-build
            ~/go/pkg/mod
            ~/go/bin
          key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
          restore-keys: |
            ${{ runner.os }}-go-

      - name: Run tests
        # literal block (|) so each command runs on its own line
        run: |
          go install github.com/mfridman/tparse@v0.14.0
          go test -json ./... | tparse -all

Also worth noting: tparse itself isn't a dependency of my code, so go.mod doesn't contain any information about it. I install it manually just before the tests.

P.S.: I would like to replace the Cache go modules step with an additional setup-go option, such as:

        uses: actions/setup-go@v5
        with:
          go-version: '1.22'
          cache: true
          cache-install: true # <-----

@Zxilly
Contributor

Zxilly commented Jun 27, 2024

go install downloads the code, compiles it, and then puts the binary in a directory. But there is no standard way of describing the version of the program being installed, and each time go install is executed it is a brand new installation, so how should the cache key be designed? Should the cache key be designed to keep track of each go install call? I can almost visualize a big pile of ugly workaround code already.

@SirSova
Author

SirSova commented Jun 28, 2024

The first run of go install does all of that, but the subsequent calls are definitely not brand-new installations.
Just try it out: the second and later calls are almost instantaneous, so it must cache at least the dependencies.
The workaround described above (GH workflow) worked perfectly for me.

P.S.: I also see a significant difference between using "latest" and a specific version.

@silverwind

But there is no standard way of describing the version of the program being installed

Since Go 1.16 you can use go install module@version to specify the version, and since Go 1.17 the same works with go run module@version.

@Zxilly
Contributor

Zxilly commented Jun 28, 2024

Yes, you can specify the version, but after the install there is no way to get it back.
The second install call is faster because the Go compiler caches the intermediate object files, but the final binary is still built and written again.
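
On recovering the version after the install: as far as I know, binaries built in module mode embed the main module's path and version in their build info, which is what go version -m prints. The following is a minimal sketch that reads it with the standard debug/buildinfo package (Go 1.18 or newer); installedVersion is a name made up for this example, and whether this is practical as a cache key input is a separate question.

```go
package main

import (
	"debug/buildinfo"
	"fmt"
	"os"
)

// installedVersion returns the main module path and version recorded in
// the Go binary at path. The function name is hypothetical.
func installedVersion(path string) (module, version string, err error) {
	info, err := buildinfo.ReadFile(path)
	if err != nil {
		return "", "", fmt.Errorf("reading build info from %s: %w", path, err)
	}
	return info.Main.Path, info.Main.Version, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: toolversion /path/to/binary")
		return
	}
	module, version, err := installedVersion(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("%s@%s\n", module, version)
}
```

For a tool installed with go install module@version, the reported version should be the one that was requested, so a restored binary could be compared against the pinned version before deciding to reinstall.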

@oxisto

oxisto commented Dec 27, 2024

Just to give some additional workaround ideas: it is possible to create a separate folder, e.g. tools, with its own go.mod that looks like this (just an example):

module tools

go 1.23.4

require (
	github.com/99designs/gqlgen v0.17.61
	github.com/mfridman/tparse v0.16.0
	github.com/sqlc-dev/sqlc v1.27.0
)

Also add a tools.go file:

//go:build tools
// +build tools

package tools

import (
	_ "github.com/99designs/gqlgen"
	_ "github.com/mfridman/tparse"
	_ "github.com/sqlc-dev/sqlc/cmd/sqlc"
)

This has the advantage that the tool dependencies will not leak into your main go.mod. Then use the additional cache-dependency-path: "**/*.sum" in your setup-go action. This will also cache your build / tools dependencies.

I wonder whether some of this workflow will be made obsolete by the new tool directive in Go 1.24 (https://tip.golang.org/doc/go1.24#tools)

10 participants