Skip to content

Commit

Permalink
Docs: infrastructure (#350)
Browse files Browse the repository at this point in the history
  • Loading branch information
angela-tran authored Oct 23, 2023
2 parents 05a7f35 + eb82b9b commit a488cc6
Show file tree
Hide file tree
Showing 3 changed files with 141 additions and 122 deletions.
2 changes: 2 additions & 0 deletions docs/deployment/.pages
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
nav:
- infrastructure.md
138 changes: 138 additions & 0 deletions docs/deployment/infrastructure.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,138 @@
# Infrastructure

The infrastructure is configured as code via [Terraform](https://www.terraform.io/), for [various reasons](https://techcommunity.microsoft.com/t5/fasttrack-for-azure/the-benefits-of-infrastructure-as-code/ba-p/2069350).

## Architecture

## Resources outside of Terraform

The following things in Azure are managed outside of Terraform:

- Subcriptions
- Active Directory (users, groups, service principals, etc.)
- Service connections
- Configuration files, stored as blobs

## Environments

| Environment | Azure Resource Group | Terraform Workspace | Git Branch |
| ----------- | --------------------------------- | ------------------- | ---------- |
| Dev | `$(AGENCY_RESOURCE_GROUP_PREFIX)-eligibility-dev` | `dev` | `dev` |
| Test | `$(AGENCY_RESOURCE_GROUP_PREFIX)-eligibility-test` | `test` | `test` |
| Prod | `$(AGENCY_RESOURCE_GROUP_PREFIX)-eligibility-prod` | `default` | `prod` |

All resources in these Resource Groups should be reflected in Terraform in this repository. The exceptions are:

- Secrets, such as values under [Key Vault](https://azure.microsoft.com/en-us/services/key-vault/). [`prevent_destroy`](https://developer.hashicorp.com/terraform/tutorials/state/resource-lifecycle#prevent-resource-deletion) is used on these Resources.
- Things managed outside of [Terraform](#resources-outside-of-terraform)

For browsing the [Azure portal](https://portal.azure.com), you can [switch your `Default subscription filter`](https://docs.microsoft.com/en-us/azure/azure-portal/set-preferences).

## Access restrictions

We restrict which IP addresses that can access the app service by using a Web Application Firewall (WAF) configured on a Front Door. There is an exception for the `/healthcheck` and `/static` paths, which can be accessed by any IP address.

The app service itself gives access only to our Front Door and to Azure availability tests.

## Monitoring

We have [ping tests](https://docs.microsoft.com/en-us/azure/azure-monitor/app/monitor-web-app-availability) set up to notify about availability of each environment. Alerts go to [#benefits-notify](https://cal-itp.slack.com/archives/C022HHSEE3F).

## Logs

Logs can be found a couple of places:

### Azure App Service Logs

[Open the `Logs` for the environment you are interested in.](https://docs.google.com/document/d/11EPDIROBvg7cRtU2V42c6VBxcW_o8HhcyORALNtL_XY/edit#heading=h.6pxjhslhxwvj) The following tables are likely of interest:

- `AppServiceConsoleLogs`: `stdout` and `stderr` coming from the container
- `AppServiceHTTPLogs`: requests coming through App Service
- `AppServicePlatformLogs`: deployment information

For some pre-defined queries, click `Queries`, then `Group by: Query type`, and look under `Query pack queries`.

### [Azure Monitor Logs](https://docs.microsoft.com/en-us/azure/azure-monitor/logs/data-platform-logs)

[Open the `Logs` for the environment you are interested in.](https://docs.google.com/document/d/11EPDIROBvg7cRtU2V42c6VBxcW_o8HhcyORALNtL_XY/edit#heading=h.n0oq4r1jo7zs)

The following [tables](https://docs.microsoft.com/en-us/azure/azure-monitor/app/opencensus-python#telemetry-type-mappings) are likely of interest:

- `requests`
- `traces`

In the latter two, you should see recent log output. Note [there is some latency](https://docs.microsoft.com/en-us/azure/azure-monitor/logs/data-ingestion-time).

See [`Failures`](https://docs.microsoft.com/en-us/azure/azure-monitor/app/asp-net-exceptions#diagnose-failures-using-the-azure-portal) in the sidebar (or `exceptions` under `Logs`) for application errors/exceptions.

### Live tail

After [setting up the Azure CLI](#making-changes), you can use the following command to [stream live logs](https://docs.microsoft.com/en-us/azure/app-service/troubleshoot-diagnostic-logs#in-local-terminal):

```sh
az webapp log tail --resource-group <resource group name> --name <app service name> 2>&1 | grep -v /healthcheck
```

e.g.

```bash
az webapp log tail --resource-group courtesy-cards-eligibility-prod --name mst-courtesy-cards-eligibility-server-prod 2>&1 | grep -v /healthcheck
```

### SCM

Docker logs can be viewed in the Advanced Tools for the instance. The URL pattern is `https://<app service name>.scm.azurewebsites.net/api/logs/docker`

## Making changes

Terraform is [`plan`](https://www.terraform.io/cli/commands/plan)'d when code is pushed to any branch on GitHub, then [`apply`](https://www.terraform.io/cli/commands/apply)'d when merged to `dev`. While other automation for this project is done through GitHub Actions, we use an Azure Pipeline (above) for a couple of reasons:

- Easier authentication with the Azure API using a service connnection
- Log output is hidden, avoiding accidentally leaking secrets

### Local development

1. Get access to the Azure account.
1. Install dependencies:

- [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)
- [Terraform](https://www.terraform.io/downloads) - see exact version in [`pipeline/deploy.yml`](pipeline/deploy.yml)

1. [Authenticate using the Azure CLI](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/guides/azure_cli).

```sh
az login
```

1. Outside the [dev container](https://docs.calitp.org/eligibility-server/getting-started/), navigate to the `terraform/` directory.
1. Create a [`terraform.tfvars` file](https://developer.hashicorp.com/terraform/language/values/variables#variable-definitions-tfvars-files) and specify the [variables](variables.tf).
1. [Initialize Terraform.](https://www.terraform.io/cli/commands/init) You can also use this script later to switch between [environments](#environments).

```sh
./init.sh <env> <agency>
```

1. Make changes to Terraform files.
1. Preview the changes, as necessary.

```sh
terraform plan
```

1. Submit the changes via pull request.

## Azure environment setup

The steps we took to set up MST's environment are documented in [a separate Google Doc](https://docs.google.com/document/d/12uzuKyvyabHAOaeQc6k2jQIG5pQprdEyBpfST_dY2ME/edit#heading=h.1vs880ltbo58).

In general, the steps that must be done manually before the pipeline can be run are:

- Create an Azure DevOps organization and project
- Request a free grant of parallel jobs using the form at [https://aka.ms/azpipelines-parallelism-request](https://aka.ms/azpipelines-parallelism-request)
- Create Resource Group and storage account dedicated to the Terraform state
- Create container in storage account for Terraform state
- Create environment Resource Group for each environment, Region: West US
- We create these manually to avoid having to give the pipeline service connection permissions for creating resource groups
- Create Terraform workspace for each environment
- Trigger a pipeline run to verify `plan` and `apply`
- Known chicken-and-egg problem: Terraform both creates the Key Vault and expects a secret within it, so will always fail on the first deploy. Add the Benefits slack email secret and re-run the pipeline.
123 changes: 1 addition & 122 deletions terraform/README.md
Original file line number Diff line number Diff line change
@@ -1,124 +1,3 @@
# Infrastructure

The infrastructure is configured as code via [Terraform](https://www.terraform.io/), for [various reasons](https://techcommunity.microsoft.com/t5/fasttrack-for-azure/the-benefits-of-infrastructure-as-code/ba-p/2069350).

## Architecture

## Resources outside of Terraform

The following things in Azure are managed outside of Terraform:

- Subcriptions
- Active Directory (users, groups, service principals, etc.)
- Service connections
- Configuration files, stored as blobs

## Environments

| Environment | Azure Resource Group | Terraform Workspace | Git Branch |
| ----------- | --------------------------------- | ------------------- | ---------- |
| Dev | `courtesy-cards-eligibility-dev` | `dev` | `dev` |
| Test | `courtesy-cards-eligibility-test` | `test` | `test` |
| Prod | `courtesy-cards-eligibility-prod` | `default` | `prod` |

All resources in these Resource Groups should be reflected in Terraform in this repository. The exceptions are:

- Secrets, such as values under [Key Vault](https://azure.microsoft.com/en-us/services/key-vault/). [`prevent_destroy`](https://developer.hashicorp.com/terraform/tutorials/state/resource-lifecycle#prevent-resource-deletion) is used on these Resources.
- Things managed outside of [Terraform](#resources-outside-of-terraform)

For browsing the [Azure portal](https://portal.azure.com), you can [switch your `Default subscription filter`](https://docs.microsoft.com/en-us/azure/azure-portal/set-preferences).

## Access restrictions

We restrict which IP addresses that can access the app service by using a Web Application Firewall (WAF) configured on a Front Door. There is an exception for the `/healthcheck` path, which can be accessed by any IP address.

The app service itself gives access only to our Front Door and to Azure availability tests.

## Monitoring

We have [ping tests](https://docs.microsoft.com/en-us/azure/azure-monitor/app/monitor-web-app-availability) set up to notify about availability of each environment. Alerts go to [#benefits-notify](https://cal-itp.slack.com/archives/C022HHSEE3F).

## Logs

Logs can be found a couple of places:

### Azure App Service Logs

[Open the `Logs` for the environment you are interested in.](https://docs.google.com/document/d/11EPDIROBvg7cRtU2V42c6VBxcW_o8HhcyORALNtL_XY/edit#heading=h.6pxjhslhxwvj) The following tables are likely of interest:

- `AppServiceConsoleLogs`: `stdout` and `stderr` coming from the container
- `AppServiceHTTPLogs`: requests coming through App Service
- `AppServicePlatformLogs`: deployment information

For some pre-defined queries, click `Queries`, then `Group by: Query type`, and look under `Query pack queries`.

### [Azure Monitor Logs](https://docs.microsoft.com/en-us/azure/azure-monitor/logs/data-platform-logs)

[Open the `Logs` for the environment you are interested in.](https://docs.google.com/document/d/11EPDIROBvg7cRtU2V42c6VBxcW_o8HhcyORALNtL_XY/edit#heading=h.n0oq4r1jo7zs)

The following [tables](https://docs.microsoft.com/en-us/azure/azure-monitor/app/opencensus-python#telemetry-type-mappings) are likely of interest:

- `requests`
- `traces`

In the latter two, you should see recent log output. Note [there is some latency](https://docs.microsoft.com/en-us/azure/azure-monitor/logs/data-ingestion-time).

See [`Failures`](https://docs.microsoft.com/en-us/azure/azure-monitor/app/asp-net-exceptions#diagnose-failures-using-the-azure-portal) in the sidebar (or `exceptions` under `Logs`) for application errors/exceptions.

### Live tail

After [setting up the Azure CLI](#making-changes), you can use the following command to [stream live logs](https://docs.microsoft.com/en-us/azure/app-service/troubleshoot-diagnostic-logs#in-local-terminal):

```sh
az webapp log tail --resource-group courtesy-cards-eligibility-prod --name mst-courtesy-cards-eligibility-server-prod 2>&1 | grep -v /healthcheck
```

### SCM

[Docker logs](https://mst-courtesy-cards-eligibility-server-dev.scm.azurewebsites.net/api/logs/docker)

## Making changes

[![Build Status](https://dev.azure.com/mstransit/courtesy-cards/_apis/build/status/cal-itp.eligibility-server?branchName=dev)](https://dev.azure.com/mstransit/courtesy-cards/_build/latest?definitionId=1&branchName=dev)

Terraform is [`plan`](https://www.terraform.io/cli/commands/plan)'d when code is pushed to any branch on GitHub, then [`apply`](https://www.terraform.io/cli/commands/apply)'d when merged to `dev`. While other automation for this project is done through GitHub Actions, we use an Azure Pipeline (above) for a couple of reasons:

- Easier authentication with the Azure API using a service connnection
- Log output is hidden, avoiding accidentally leaking secrets

### Local development

1. Get access to the Azure account.
1. Install dependencies:

- [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)
- [Terraform](https://www.terraform.io/downloads) - see exact version in [`pipeline/azure-pipelines.yml`](pipeline/azure-pipelines.yml)

1. [Authenticate using the Azure CLI](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/guides/azure_cli).

```sh
az login
```

1. Outside the [dev container](https://docs.calitp.org/eligibility-server/getting-started/), navigate to the `terraform/` directory.
1. Create a [`terraform.tfvars` file](https://developer.hashicorp.com/terraform/language/values/variables#variable-definitions-tfvars-files) and specify the [variables](variables.tf).
1. [Initialize Terraform.](https://www.terraform.io/cli/commands/init) You can also use this script later to switch between [environments](#environments).

```sh
./init.sh <env>
```

1. Make changes to Terraform files.
1. Preview the changes, as necessary.

```sh
terraform plan
```

1. Submit the changes via pull request.

## Azure environment setup

The steps we took to set up MST's environment are documented in [a separate Google Doc](https://docs.google.com/document/d/12uzuKyvyabHAOaeQc6k2jQIG5pQprdEyBpfST_dY2ME/edit#heading=h.1vs880ltbo58).

This is not a complete step-by-step guide; more a list of things to remember. This may be useful as part of incident response.
[Documentation](https://docs.calitp.org/eligibility-server/deployment/infrastructure/)

0 comments on commit a488cc6

Please sign in to comment.