Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to verify token: oidc: expected audience "argo-cd-cli" got ["argo-cd"] #13432

Closed
3 tasks done
paoll opened this issue May 3, 2023 · 31 comments
Closed
3 tasks done
Labels
bug Something isn't working

Comments

@paoll
Copy link

paoll commented May 3, 2023

Checklist:

  • I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • I've included steps to reproduce the bug.
  • I've pasted the output of argocd version.

Describe the bug

We did upgrade both our deployment of ArgoCD using in-house chart, and the image to latest 2.7.0 (then 2.7.1.) , and since argocd-server is spamming with logs:

failed to verify token: oidc: expected audience \"argo-cd-cli\" got [\"argo-cd\"]

We did rollbacked both chart and argocd version to previous, and warning is still present. Both logging through UI or CLI is working as well.

We are using Google authentication using Google groups through the dex, without any allowedAudiences configured, so we do expect argo-cd being part of allowed audiences, according to

allowedAudiences := []string{config.ClientID}

To Reproduce

Dex configuration:

dex.config: |
  connectors:
    - type: google
      id: google
      name: Google
      config:
        redirectUri: https://<REDACTED>/api/dex/callback
        clientID: <REDACTED>.apps.googleusercontent.com
        clientSecret: $oidc.google.clientSecret
        serviceAccountFilePath: /tmp/oidc/googleAuth.json
        adminEmail: <REDACTED>

Expected behavior

Without any allowed audience explicitely configured, we're expecting that argo-cd is allowed as one.

Version

argocd: v2.7.1+5e54351.dirty
  BuildDate: 2023-05-02T19:24:14Z
  GitCommit: 5e543518dbdb5384fa61c938ce3e045b4c5be325
  GitTreeState: dirty
  GoVersion: go1.20.3
  Compiler: gc
  Platform: darwin/arm64
argocd-server: v2.7.1+5e54351.dirty
  BuildDate: 2023-05-02T19:24:14Z
  GitCommit: 5e543518dbdb5384fa61c938ce3e045b4c5be325
  GitTreeState: dirty
  GoVersion: go1.20.3
  Compiler: gc
  Platform: darwin/arm64
  Kustomize Version: could not get kustomize version: exec: "kustomize": executable file not found in $PATH
  Helm Version: v3.11.3+g66a969e
  Kubectl Version: v0.24.2
  Jsonnet Version: v0.19.1
@paoll paoll added the bug Something isn't working label May 3, 2023
@Epiclemonaid
Copy link

Hey having this same isue. upgrading from 2.6.7 to 2.7.3 Please advise on a solution this is blocking.

@batkins-carecloud
Copy link

We upgraded to 2.7.3 and had the same issue. Rolling back to the previous version didn't fix it. Completely uninstalling ArgoCD and redeploying 2.7.3 it fixed the issue for us.

@JhoLee
Copy link
Contributor

JhoLee commented Jun 22, 2023

Same issue in 2.7.1

@Moulick
Copy link
Contributor

Moulick commented Jul 25, 2023

Same issue on 2.7.8 While I was seeing the issue, I accidentally has multiple pods of Dex deployed. Reducing that to 1 pod fixed the issue

@legal90
Copy link

legal90 commented Aug 2, 2023

@paoll Could you please confirm if that issue actually breaks the authorization flow in your case? If it does - which error do you see in the browser?

P.s. I'm seeing the same warning on v2.7.10

@paoll
Copy link
Author

paoll commented Aug 13, 2023

@legal90, afaik it does not break the authorization flow, as the the authentication one is working (users can login and use dashboard mainly as read-only). We don't operate actions through the dashboard, and everyone can see everything, so if we had authorization issue, we would'nt notice.

The only dysfunctional behavior we're experiencing is the error message from argocd-server.

@sachk
Copy link

sachk commented Sep 10, 2023

We are also experiencing this issue on v2.8.2+dbdfc71

@donovanrost
Copy link

Same issue on 2.7.8 While I was seeing the issue, I accidentally has multiple pods of Dex deployed. Reducing that to 1 pod fixed the issue

Reducing the Dex replications 3 -> 1 seemingly solved this for me

@crenshaw-dev
Copy link
Member

The error message is likely misleading. The token verification code loops over allowed audiences, runs token verification for each, and logs the last error message encountered.

In this case I believe the first error message (the relevant one) was discarded, and the second error message was irrelevant (because it was attempting to verify the wrong audience).

This change will cause all token verification errors to be logged:

#16625

@parthyp
Copy link

parthyp commented Dec 16, 2023

Having same issue with gitlab <-> dex

  repository.credentials: ''
  dex.config: |
    connectors:
    - type: gitlab
      id: gitlab
      name: Gitlab
      config:
        baseURL: https://SELFHOST URL
        clientID: xxxxxxxxxxxx
        clientSecret: $oidc.dex.clientSecret
        groups: 
          - admin
        issuer: https://URL/api/dex
        redirectURI: https://URL/api/dex/callback
        useLoginAsID: false
        users.anonymous.enabled: "false"``` 

@crenshaw-dev
Copy link
Member

To be clear, I think it's possible that every person on this thread is facing a completely different issue, hidden by an irrelevant error message.

@ghost
Copy link

ghost commented Feb 7, 2024

After upgrading an ArgoCD 2.3.10 instance to 2.6.15, I'm seeing the following in the argocd-server logs:

time="2024-02-07T14:59:17Z" level=warning msg="Failed to verify token: failed to verify token: oidc: expected audience \"argo-cd-cli\" got [\"argo-cd\"]"
time="2024-02-07T14:59:17Z" level=info msg="finished streaming call with code Unauthenticated" error="rpc error: code = Unauthenticated desc = invalid session: failed to verify the token" grpc.code=Unauthenticated grpc.method=Watch grpc.service=application.ApplicationService grpc.start_time="2024-02-07T14:59:17Z" grpc.time_ms=2.521 span.kind=server system=grpc

These are happening once per second. That doesn't match the liveness (30s) or readiness (10s) probe periods, so I'm trying to find out why these are happening.

To be clear, nothing seems broken. Dex is working just fine and SSO (Google) works. I just want to figure out why this is happening in order to improve the logs.

EDIT: Could this be users with old ArgoCD JS cached and invalidated sessions that are trying to access the upgraded ArgoCD?

@crenshaw-dev
Copy link
Member

Could this be users with old ArgoCD JS cached and invalidated sessions that are trying to access the upgraded ArgoCD?

100%

I think we should follow this suggestion: coreos/go-oidc#406 (review)

We should skip the client ID check in the go-oidc library and instead check it ourselves. That'll allow us to ignore irrelevant "failed to verify token" errors and instead log the actual error.

@ForbiddenEra
Copy link

ForbiddenEra commented Feb 8, 2024

Having same issue with gitlab <-> dex

Me as well. SSO login on UI working fine.

$argocd app list
FATA[0012] rpc error: code = Unauthenticated desc = invalid session: failed to verify the token

with logs:

time="2024-02-08T11:01:37Z" level=warning msg="Failed to verify token: failed to verify token: token verification failed for all audiences: error for aud \"argo-cd\": oidc: expected audience \"argo-cd\" got [\"argo-cd-cli\"]"
time="2024-02-08T11:01:37Z" level=info msg="received unary call /version.VersionService/Version" grpc.method=Version grpc.request.content= grpc.service=version.VersionService grpc.start_time="2024-02-08T11:01:37Z" span.kind=server system=grpc
time="2024-02-08T11:01:37Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Version grpc.service=version.VersionService grpc.start_time="2024-02-08T11:01:37Z" grpc.time_ms=2.739 span.kind=server system=grpc
time="2024-02-08T11:01:38Z" level=warning msg="Failed to verify token: failed to verify token: token verification failed for all audiences: error for aud \"argo-cd\": oidc: expected audience \"argo-cd\" got [\"argo-cd-cli\"]"
time="2024-02-08T11:01:38Z" level=info msg="finished unary call with code Unauthenticated" error="rpc error: code = Unauthenticated desc = invalid session: failed to verify the token" grpc.code=Unauthenticated grpc.method=List grpc.service=application.ApplicationService grpc.start_time="2024-02-08T11:01:38Z" grpc.time_ms=1.292 span.kind=server system=grpc

I only have a single dex replica, never had more than one.

Note that ripping the token out of my cookie from the UI and using it with the CLI works, so that's a possible workaround for those in need.

@aleks-andr
Copy link

This exact issue happened to us seemingly out of nowhere. These log entries correlated perfectly with a sudden wave of requests in our network logs with a HTTP 401 result. They all ended up being requests to:

https://<MY_ARGO_HOST>/api/v1/stream/applications?resourceVersion=137215386&fields=result.type%2Cresult.application.metadata.name%2Cresult.application.metadata.namespace%2Cresult.application.metadata.annotations%2Cresult.application.metadata.labels%2Cresult.application.metadata.creationTimestamp%2Cresult.application.metadata.deletionTimestamp%2Cresult.application.spec%2Cresult.application.operation.sync%2Cresult.application.status.sync.status%2Cresult.application.status.sync.revision%2Cresult.application.status.health%2Cresult.application.status.operationState.phase%2Cresult.application.status.operationState.finishedAt%2Cresult.application.status.operationState.operation.sync%2Cresult.application.status.summary%2Cresult.application.status.resources&selector=&appNamespace=

with the Referer field set to: https://<MY_ARGO_HOST>/applications?showFavorites=false&proj=&sync=Unknown%2CSynced&autoSync=&health=&namespace=&cluster=&labels=.
In our case, the source IP of the requests ended up being one of our employees who left the ArgoCD UI open in their browser.

So this definitely seems like a bug in the UI, where it endlessly tries to refresh the Applications listing even though all its getting in response is 401s. Let me know if you want me to provide any more details on this 🙏

@ForbiddenEra
Copy link

Could this be users with old ArgoCD JS cached and invalidated sessions that are trying to access the upgraded ArgoCD?

100%

I think we should follow this suggestion: coreos/go-oidc#406 (review)

We should skip the client ID check in the go-oidc library and instead check it ourselves. That'll allow us to ignore irrelevant "failed to verify token" errors and instead log the actual error.

Just commenting that I don't think that's at all the issue I'm having; whatever might be happening in a browser somewhere shouldn't affect me trying to login with the CLI.

@evenh
Copy link

evenh commented Mar 13, 2024

We're seeing this is one of our environments (the least critical thankfully): Running v2.10.2+fcf5d8c from this YAML. Our argocd-cm ConfigMap contains the following for dex.config:

logger:
  level: debug
  format: json
connectors:
- type: saml
  id: saml
  name: saml
  config:
    entityIssuer: $sso:externalUrl
    ssoURL: $sso:ssoUrl
    caData: $sso:certificate
    redirectURI: $sso:externalUrl
    usernameAttr: email
    emailAttr: email
    groupsAttr: Group

We're observing ArgoCD being slow as molasses and the error message Failed to verify token: failed to verify token: oidc: expected audience \"argo-cd-cli\" got [\"argo-cd\"]. Authentication and normal functionality works as intended albeit it being very slow.

@crenshaw-dev
Copy link
Member

It's possible that every person on this thread is experiencing an entirely different root cause. We won't know until we make changes to avoid obscuring the actual token validation error.

I think we should follow this suggestion: coreos/go-oidc#406 (review)

We should skip the client ID check in the go-oidc library and instead check it ourselves. That'll allow us to ignore irrelevant "failed to verify token" errors and instead log the actual error.

This is where the change should be make.

idToken, err = p.verify(aud, tokenString, false)

@crenshaw-dev
Copy link
Member

Actually, I forgot I wrote code to log all audiences' error messages. That'll be in 2.11.

@ForbiddenEra
Copy link

It's possible that every person on this thread is experiencing an entirely different root cause. We won't know until we make changes to avoid obscuring the actual token validation error.

Entirely possible; hopefully people come back with more details.

Actually, I forgot I wrote code to log all audiences' error messages. That'll be in 2.11.

I'll check it out when I get to 2.11 then and report back; fortunately only I use the CLI and even then very occasionally, even the UI is only for monitoring as everything else is declarative in my case.

Though, there's still a few CLI things I need to do occasionally and it'd be nice to not have to kick things and copy/paste tokens around. :-)

@zswanson
Copy link

zswanson commented Apr 14, 2024

@crenshaw-dev I have upgraded 2.10 to 2.11 RC1 and we're seeing a single oidc connector fail with a similar but reversed error message as everyone else in this issue.

time="2024-04-14T17:34:05Z" level=warning 
msg="Failed to verify token: failed to verify token: token verification failed for all audiences: error for aud \"argo-cd\": oidc: expected audience \"argo-cd\" got [\"argo-cd-cli\"]"

Unfortunately even in the 2.11 RC image this seems to be the only information logged. The connector that's failing is related to the single place where we use the argo cli, in a CI pipeline. Users are having no issue with our integrated SSO in the UI. We fetch the appropriate CLI version from the server during the pipeline and generate a token in each run from a secret.

  # Using Buildkite JWT to generate a token for Dex
  BUILDKITE_JWT=$(buildkite-agent oidc request-token \
    --lifetime 300 \
    --audience argo-cd-cli )
  export ARGOCD_AUTH_TOKEN=$(curl  http://${ARGOCD_SERVER}/api/dex/token \
  --user argo-cd-cli:bk-oidc-secret \
  --data-urlencode connector_id=bk-oidc \
  --data-urlencode grant_type=urn:ietf:params:oauth:grant-type:token-exchange \
  --data-urlencode scope="openid groups federated:id" \
  --data-urlencode requested_token_type=urn:ietf:params:oauth:token-type:access_token \
  --data-urlencode subject_token="$BUILDKITE_JWT" \
  --data-urlencode subject_token_type=urn:ietf:params:oauth:token-type:id_token | jq -r .access_token)

We expect the value of 'argo-cd-cli' so the error message is particularly odd.

      staticClients:
        - name: BK agents OIDC
          id: argo-cd-cli
          secret: bk-oidc-secret
          public: true

You mentioned above that something might be caching a token. I have restarted all our BK agent pods (they run in-cluster) and all of the argocd pods were new (including the dex server) from the 2.11 upgrade. However in the interests of science, I also restarted every argocd pod. Same result regardless; the CI pipeline is blocked and cannot authenticate.

@crenshaw-dev
Copy link
Member

@zswanson your error is because my change to improve the error logs was bad. Put up a fix: #17876

@hc2p
Copy link

hc2p commented May 6, 2024

I tried 2.11 RC3 and i still see errors:

time="2024-05-06T10:25:03Z" level=info msg="Initializing OIDC provider (issuer: https://argocd.my-domain.com/api/dex)"
time="2024-05-06T10:25:03Z" level=info msg="OIDC supported scopes: [openid email groups profile offline_access]"
time="2024-05-06T10:25:03Z" level=warning msg="Failed to verify token: failed to verify token: token verification failed for all audiences: error for aud \"argo-cd\": failed to verify signature: failed to verify id token signature, error for aud \"argo-cd-cli\": oidc: expected audience \"argo-cd-cli\" got [\"argo-cd\"]"
time="2024-05-06T10:25:03Z" level=info msg="finished streaming call with code Unauthenticated" error="rpc error: code = Unauthenticated desc = invalid session: failed to verify the token" grpc.code=Unauthenticated grpc.method=Watch grpc.service=application.ApplicationService grpc.start_time="2024-05-06T10:25:03Z" grpc.time_ms=7.658 span.kind=server system=grpc

Dex pod's dex.yaml looks like the following:

connectors:
- config:
    clientID: xyz.apps.googleusercontent.com
    clientSecret: STR-xyz
    issuer: accounts.google.com
    redirectURI: https://argocd.my-domain.com/api/dex/callback
  id: google
  name: Google
  type: google
grpc:
  addr: 0.0.0.0:5557
issuer: https://argocd.my-domain.com/api/dex
oauth2:
  skipApprovalScreen: true
staticClients:
- id: argo-cd
  name: Argo CD
  redirectURIs:
  - https://argocd.my-domain.com/auth/callback
  secret: xyz
- id: argo-cd-cli
  name: Argo CD CLI
  public: true
  redirectURIs:
  - http://localhost
  - http://localhost:8085/auth/callback
- id: argo-cd-pkce
  name: Argo CD PKCE
  public: true
  redirectURIs:
  - http://localhost:4000/pkce/verify
storage:
  type: memory
telemetry:
  http: 0.0.0.0:5558
web:
  https: 0.0.0.0:5556
  tlsCert: /tmp/tls.crt
  tlsKey: /tmp/tls.key

@crenshaw-dev
Copy link
Member

@hc2p in your case the actual error is this: "failed to verify signature: failed to verify id token signature." I'd start searching for answers specific to that error message, I see a few examples with a quick Google.

You can safely ignore this error, it's just a result of the audience check loop and is expected to fail:

error for aud "argo-cd-cli": oidc: expected audience "argo-cd-cli" got ["argo-cd"]"

@d1egoaz
Copy link

d1egoaz commented May 14, 2024

Hi @crenshaw-dev,

I hope you're doing well!

I was wondering if the fix in this pull request might be backported to the v2.10.x release? I’m currently using dex.config.

If I’m reading the code correctly, it looks like it adds two allowed audiences by default in this section of the code.

if a.DexConfig != "" {
return []string{common.ArgoCDClientAppID, common.ArgoCDCLIClientAppID}
}

Without the fix from the pull request, it seems like the warning log message will keep popping up.

Similar to this issue, I'm seeing the warning message logged thousands of times per minute.

Thanks a lot for your help!

@d1egoaz
Copy link

d1egoaz commented May 14, 2024

In one of our test clusters, I tried this ArgoCDClientAppID == ArgoCDCLIClientAppID, and it seems to have stopped the warning logs. The auth* flow is working as intended.

    dex.config: |
      connectors:
      - type: saml
        id: okta
        name: Okta
        config:
          clientID: argo-cd
          cliClientID: argo-cd
          ssoURL: https://<redacted>
          redirectURI: <redacted>
          ...

@Invincibear
Copy link

You can safely ignore this error, it's just a result of the audience check loop and is expected to fail:

error for aud "argo-cd-cli": oidc: expected audience "argo-cd-cli" got ["argo-cd"]"

@crenshaw-dev If it was once every few minutes sure, but it's multiple times per minute, per pod, and has an impact on logging costs, plus the extra noise in the logs makes troubleshooting more cumbersome. Is there no way to resolve this properly? An "expected" warning shouldn't exist if it's sole purpose is to cry wolf.

@crenshaw-dev
Copy link
Member

@Invincibear the spurious warning is fixed in recent versions.

@Invincibear
Copy link

@Invincibear the spurious warning is fixed in recent versions.

Glad to hear, thanks. I realize now that OLM-deployed Argo CD isn't using the newest version. I'll have to investigate this myself.

argocd: v2.10.12+cb6f5ac
  BuildDate: 2024-06-06T09:27:57Z
  GitCommit: cb6f5ac8b8c08ce26cce08b8a1bd2144034bd79b
  GitTreeState: clean
  GoVersion: go1.21.10
  Compiler: gc
  Platform: linux/amd64
argocd-server: v2.11.5+c4b283c
  BuildDate: 2024-07-15T17:39:54Z
  GitCommit: c4b283ce0c092aeda00c78ae7b3b2d3b28e7feec
  GitTreeState: clean
  GoVersion: go1.21.10
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v5.2.1 2023-10-19T20:13:51Z
  Helm Version: v3.14.4+g81c902a
  Kubectl Version: v0.26.11
  Jsonnet Version: v0.20.0

@crenshaw-dev
Copy link
Member

I'm gonna close this, because the causes are various, and newer versions of Argo CD will produce much more precise/useful error messages.

@ikevinc
Copy link

ikevinc commented Sep 25, 2024

@crenshaw-dev I have upgraded 2.10 to 2.11 RC1 and we're seeing a single oidc connector fail with a similar but reversed error message as everyone else in this issue.

time="2024-04-14T17:34:05Z" level=warning 
msg="Failed to verify token: failed to verify token: token verification failed for all audiences: error for aud \"argo-cd\": oidc: expected audience \"argo-cd\" got [\"argo-cd-cli\"]"

Unfortunately even in the 2.11 RC image this seems to be the only information logged. The connector that's failing is related to the single place where we use the argo cli, in a CI pipeline. Users are having no issue with our integrated SSO in the UI. We fetch the appropriate CLI version from the server during the pipeline and generate a token in each run from a secret.

  # Using Buildkite JWT to generate a token for Dex
  BUILDKITE_JWT=$(buildkite-agent oidc request-token \
    --lifetime 300 \
    --audience argo-cd-cli )
  export ARGOCD_AUTH_TOKEN=$(curl  http://${ARGOCD_SERVER}/api/dex/token \
  --user argo-cd-cli:bk-oidc-secret \
  --data-urlencode connector_id=bk-oidc \
  --data-urlencode grant_type=urn:ietf:params:oauth:grant-type:token-exchange \
  --data-urlencode scope="openid groups federated:id" \
  --data-urlencode requested_token_type=urn:ietf:params:oauth:token-type:access_token \
  --data-urlencode subject_token="$BUILDKITE_JWT" \
  --data-urlencode subject_token_type=urn:ietf:params:oauth:token-type:id_token | jq -r .access_token)

We expect the value of 'argo-cd-cli' so the error message is particularly odd.

      staticClients:
        - name: BK agents OIDC
          id: argo-cd-cli
          secret: bk-oidc-secret
          public: true

You mentioned above that something might be caching a token. I have restarted all our BK agent pods (they run in-cluster) and all of the argocd pods were new (including the dex server) from the 2.11 upgrade. However in the interests of science, I also restarted every argocd pod. Same result regardless; the CI pipeline is blocked and cannot authenticate.

hey there @zswanson - were you able to resolve this issue? i'm trying to do something similar, where i'm using github actions oidc to exchange for a dex token for argocd MACHINE auth...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests