3 changes: 2 additions & 1 deletion docs/components/sk-ctrl.md
@@ -50,7 +50,8 @@ owned by the SimulationRoot, so that users can still see the results and logs fr

## Configuring Metrics Collection

> [!NOTE]
> In the future we may move metrics collection out of SimKube proper and instead run it as a standard "hook".
> If you do not want to use Prometheus for metrics collection, or wish to configure it differently, you can disable
> metrics collection using `skctl --disable-metrics` and configure your own metrics solution with a preStart hook.

45 changes: 45 additions & 0 deletions docs/infra/amis.md
@@ -0,0 +1,45 @@
<!--
template: docs.html
-->
# SimKube Amazon Machine Images (AMIs)
[Applied Computing Research Labs](https://appliedcomputing.io) provides prebuilt Amazon Machine Images (AMIs) for running simulations in AWS without having to install or configure SimKube manually.

Our AMIs are intended for users who want a repeatable, preconfigured simulation environment for SimKube.

## Quick Start Guides
- [Run SimKube in AWS EC2](./run-sim.md)
- [Run SimKube in CI](./ci-sim.md)

## What the AMIs are for
The SimKube AMIs are designed for:

- running SimKube simulations on EC2
- providing a consistent environment across runs
- reducing setup and dependency management
- running larger, longer simulations than you can run locally with SimKube
- running SimKube in CI pipelines

## Available AMIs
- **SimKube AMI**: Suitable for running on-demand SimKube workloads and long-running simulations.
- **SimKube GitHub Runner AMI**: Designed specifically with CI in mind. Use this AMI as an ephemeral SimKube GitHub Action Runner.

## What's included in the AMIs

| Feature | SimKube AMI | SimKube GitHub Runner AMI |
|:-----------------------------------------------|:-----------:|:-------------------------:|
| Ubuntu 24.04 LTS Operating System | ✅ | ✅ |
| A running Kubernetes cluster & management tools| ✅ | ✅ |
| All SimKube components | ✅ | ✅ |
| All SimKube dependencies | ✅ | ✅ |
| Container runtime + system dependencies | ✅ | ✅ |
| GitHub Actions Runner software | ❌ | ✅ |


Our AMIs are optimized for running simulations, and are not recommended for any other use cases.

> [!NOTE]
> More on SimKube components [here](../components/sk-ctrl.md)!

## Next steps
- [Launch and use SimKube AMIs](./usage.md)
- [Configure GitHub Actions to run simulations on self-hosted runners](github-runners.md)
105 changes: 105 additions & 0 deletions docs/infra/ci-sim.md
@@ -0,0 +1,105 @@
<!--
template: docs.html
-->
# Run SimKube in CI

This quickstart guide explains how to use SimKube in CI using GitHub Actions and AWS EC2.

## Assumptions
- You have collected a trace from the cluster you want to simulate. If you still need to do this, see [the sk-tracer docs](../intro/running.md).
- You have sufficient permissions to manage the AWS resources described. For more on this, see the AWS Permissions section on our [usage](./usage.md) page.


## 0. Create an access key

You will need to generate an access key in AWS for the IAM user you are using to access AWS resources. Keep both the access key ID and the secret access key; you will need them when you configure the secrets.

AWS provides instructions on creating access keys in AWS IAM via the console or CLI [here](https://docs.aws.amazon.com/IAM/latest/UserGuide/access-keys-admin-managed.html#admin-create-access-key).

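If you prefer the CLI, creating the access key might look like the sketch below. `aws iam create-access-key` is the standard AWS CLI command; the function name and the IAM user name `simkube-ci` are hypothetical placeholders, not something this guide prescribes.

```sh
# Create an access key for an existing IAM user and print the two values
# that are later stored as GitHub secrets.
# Assumption: "simkube-ci" is a hypothetical IAM user name; substitute your own.
create_simkube_access_key() {
  iam_user="${1:-simkube-ci}"
  aws iam create-access-key \
    --user-name "$iam_user" \
    --query 'AccessKey.[AccessKeyId,SecretAccessKey]' \
    --output text
}

# Usage (prints the access key ID and the secret access key, tab-separated):
#   create_simkube_access_key my-iam-user
```

The secret access key is only shown at creation time, so store it somewhere safe before closing your terminal.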
## 1. GitHub Permissions

To use SimKube in CI the GitHub account will need:
- permissions to access code and manage custom runners
- a method of accessing those permissions

### Example using a fine-grained PAT:

#### Set up the PAT in GitHub:
- Go to user `Settings`
- Click `Developer settings`
- Under `Personal access tokens`
- Choose `Fine-grained tokens`
- Select the `Resource owner`: if the repo is not owned by you it will send an access request to the owner(s) of the repos you select
- Give the token a descriptive `Token name` and `Description`
- The `Request message` should give some context to the admin
- Choose an `Expiration` that meets your organization's policy requirements
- Select `Only select repositories`
- Choose the repositories you want to run SimKube in
- Click `Add permissions`
- Select Read and Write access for `Actions` and `Administration`; `metadata` will be selected by default
- Click `Generate token and request access`
- In the next step we will add the PAT to our secrets

## 2. Configure secrets
Add the following secrets to the repo you will be testing in:

- `SIMKUBE_RUNNER_PAT` - the fine-grained PAT created in Step 1
- `AWS_ACCESS_KEY_ID` - AWS access key created in Step 0
- `AWS_SECRET_ACCESS_KEY` - AWS secret key created in Step 0

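The secrets can also be added from a terminal. The following is a minimal sketch using the GitHub CLI's `gh secret set` command; it assumes `gh` is installed and authenticated, that the three values are already set in your shell under the same names, and that `your-org/your-repo` is replaced with your repository. The function name is hypothetical.

```sh
# Store the three secrets on a repository using the GitHub CLI.
# Assumption: the values are already set as shell variables with the same names.
set_simkube_secrets() {
  repo="$1"
  for name in SIMKUBE_RUNNER_PAT AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
    # Pass the value on stdin so it does not appear in the process list.
    printf '%s' "$value" | gh secret set "$name" --repo "$repo"
  done
}

# Usage:
#   set_simkube_secrets your-org/your-repo
```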
## 3. Create a GitHub Actions workflow
We will be using a custom action created by ACRL called [simkube-ci-action](https://github.com/acrlabs/simkube-ci-action). It simplifies the setup and teardown of ephemeral runners so you can focus on running simulations in CI.
To use `simkube-ci-action`, add the `launch-runner` and `run-simulation` custom actions to your workflow.

### A basic action workflow file might look like:

```yaml
name: Run simulation
on:
  workflow_dispatch:
  push:
    branches:
      - "main"
jobs:
  launch-runner:
    runs-on: ubuntu-latest
    steps:
      - name: Setup SimKube GitHub Action runner
        uses: acrlabs/simkube-ci-action/actions/launch-runner@main
        with:
          instance-type: m6a.large
          aws-region: us-west-2
          subnet-id: subnet-xxxx
          security-group-ids: sg-xxxx  # space-separated list of security group IDs
          simkube-runner-pat: ${{ secrets.SIMKUBE_RUNNER_PAT }}
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  run-simulation:
    needs: launch-runner
    runs-on: [self-hosted, simkube, ephemeral]
    steps:
      - uses: actions/checkout@v5
      - name: Run simulation
        uses: acrlabs/simkube-ci-action/actions/run-simulation@main
        with:
          simulation-name: your-sim-name
          trace-path: path/to/your/trace
```

Review thread on `security-group-ids`:

- Contributor: is this field a list? or just a single value?
- Contributor Author: specifically security-group-ids? It's a space separated list of strings. Should we explain the options for simkube-ci-action here or link to that repo? I still need to add a full explanation in that repo too.
- Contributor: I think linking to that repo is fine.

## 4. Test your SimKube workflow
Test your workflow by manually dispatching it from the repository's `Actions` tab.

Currently `simkube-ci-action` is pass/fail: the simulation either runs to completion or it fails. We do not currently have a method for injecting evaluation criteria for simulations.

A successful simulation will exit with code 0 and you will see `✓ Simulation completed successfully!` in the Actions logs.

A failed simulation will exit with a non-zero exit code, failing the CI action and printing a detailed failure summary.

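The workflow can also be dispatched and followed from a terminal. This is a sketch using the GitHub CLI (`gh workflow run`, `gh run list`, `gh run watch`); the function name and the workflow file name `simulation.yml` are hypothetical, and the short wait before looking up the run is a pragmatic assumption rather than a guarantee.

```sh
# Dispatch the workflow and block until the run finishes.
# Assumptions: `gh` is authenticated for the repo, and the workflow file is
# named "simulation.yml" (a hypothetical name; use your own).
dispatch_and_watch() {
  workflow="${1:-simulation.yml}"
  gh workflow run "$workflow" --ref main || return 1
  # Give GitHub a moment to register the new run before looking it up.
  sleep "${DISPATCH_WAIT_SECONDS:-5}"
  run_id="$(gh run list --workflow "$workflow" --limit 1 \
    --json databaseId --jq '.[0].databaseId')"
  # --exit-status makes `gh run watch` exit non-zero if the run fails.
  gh run watch "$run_id" --exit-status
}

# Usage:
#   dispatch_and_watch simulation.yml
```

Because the action is pass/fail, the exit status of `dispatch_and_watch` mirrors the simulation outcome.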
## 5. Evaluating your results
Prometheus and Grafana are preinstalled on the AMI. You can view simulation results by connecting to the Grafana pod on your EC2 instance.

See [Evaluate your results](./evaluate.md) for more details.

> [!NOTE]
> `simkube-ci-action` runners are ephemeral-only, so all data from the simulation is lost when the runner is torn down.
> In the future we will expose functionality that will allow data to be sent to external Prometheus endpoints.
54 changes: 54 additions & 0 deletions docs/infra/evaluate.md
@@ -0,0 +1,54 @@
<!--
template: docs.html
-->
# Evaluate your results
Prometheus and Grafana are preinstalled on the AMI. You can view simulation results by connecting to the Grafana pod on your EC2 instance, as described in the steps below.

## 0. Establish an SSH tunnel:

```sh
# Forward local port 3000 to port 3000 on the instance; the AMI's default user is `ubuntu`.
ssh -L 3000:localhost:3000 ubuntu@<ec2-instance-ip>
```

## 1. Set up port forwarding:

```sh
kubectl port-forward -n monitoring svc/grafana 3000
```

## 2. Open the Grafana UI
<http://localhost:3000/>

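The tunnel and the port-forward can be combined into a single command run from your local machine. The sketch below assumes the login user is `ubuntu`, that `kubectl` on the instance is already configured for the simulation cluster, and that Grafana exposes its usual `/api/health` endpoint; the function names are hypothetical.

```sh
# Open the SSH tunnel and start the Grafana port-forward in one step.
open_grafana_tunnel() {
  instance_ip="$1"
  # -L forwards local port 3000 to localhost:3000 on the instance, where the
  # remote `kubectl port-forward` (run as the ssh command) is listening.
  # -t allocates a terminal so Ctrl-C also stops the remote port-forward.
  ssh -t -L 3000:localhost:3000 "ubuntu@$instance_ip" \
    kubectl port-forward -n monitoring svc/grafana 3000
}

# Succeeds if Grafana answers on its health endpoint through the tunnel.
grafana_is_up() {
  curl --silent --fail --max-time 5 "${1:-http://localhost:3000}/api/health" >/dev/null
}

# Usage (in two terminals):
#   open_grafana_tunnel <ec2-instance-ip>
#   grafana_is_up && echo "Grafana is reachable"
```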
## 3. Create a Dashboard

- `Dashboards > New > New Dashboard > Add visualization`
- In the `Data source` field select `prometheus`
- In the `Query` field select `Code`
- Enter your PromQL query

Here are some queries to try:

### See all simulated pods over time
```promql
sum(
  kube_pod_status_phase{
    phase="Running",
    namespace=~"virtual-.*"
  }
)
```

### See all virtual KWOK nodes by instance type
```promql
sum(
  kube_node_status_condition{
    condition="Ready",
    status="true"
  }
  * on (node) group_right kube_node_labels{
    label_type="virtual"
  }
)
by (label_node_kubernetes_io_instance_type)
```
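
The same queries can be run without the Grafana UI by calling the Prometheus HTTP API (`/api/v1/query`). This is a sketch only: it assumes Prometheus is reachable at `http://localhost:9090` (for example through your own port-forward), which this guide does not specify, and the function name is hypothetical.

```sh
# Run an instant PromQL query against the Prometheus HTTP API.
# Assumption: PROM_URL points at a reachable Prometheus; the default
# http://localhost:9090 is a guess, not a documented SimKube address.
prom_query() {
  prom_url="${PROM_URL:-http://localhost:9090}"
  # --get with --data-urlencode sends the query as a URL-encoded parameter.
  curl --silent --fail --get \
    --data-urlencode "query=$1" \
    "$prom_url/api/v1/query"
}

# Usage:
#   prom_query 'sum(kube_pod_status_phase{phase="Running", namespace=~"virtual-.*"})'
```

The response is JSON; the values are under `data.result`.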
37 changes: 37 additions & 0 deletions docs/infra/github-runners.md
@@ -0,0 +1,37 @@
<!--
template: docs.html
-->
# SimKube GitHub Action Runner AMI

[Applied Computing Research Labs](https://appliedcomputing.io) provides support for running simulations on self-hosted GitHub Actions runners that are backed by SimKube AMIs.

These runners are intended for teams that want reliable, repeatable simulation as part of their CI pipelines.

## When to use a SimKube GitHub Actions Runner
The primary use case for the SimKube GitHub Actions Runner AMI is running simulations in CI.

An example configuration for the SimKube GitHub Actions Runner AMI, built on the [simkube-ci-action](https://github.com/acrlabs/simkube-ci-action) GitHub Action maintained by ACRL, is available in the SimKube repo.

## Runner lifecycle
SimKube GitHub runners are self-hosted and managed by your organization.

- runners must be registered with GitHub at the repository or organization level
- authentication and registration follow GitHub's standard self-hosted runner process
- runners are currently ephemeral-only and designed to be launched via GitHub Actions

For more information on configuring self-hosted GitHub runners, please see the [instructions provided by GitHub](https://docs.github.com/en/actions/how-tos/manage-runners/self-hosted-runners/add-runners).

## Using the runners in workflows
Once registered, the runner can be targeted using [GitHub `runs-on` labels](https://docs.github.com/en/actions/how-tos/manage-runners/self-hosted-runners/apply-labels).

Example using our default labels:

```yaml
runs-on: [self-hosted, simkube, ephemeral]
```

## SimKube custom GitHub Actions
Review thread:

- Contributor: This section should probably be expanded: what properties or config values need to be set for the action(s) to work? Which ones are required/optional? etc.
- Contributor Author: should we explain the details of simkube-ci-action in github-runners.md or just link them to the repo and provide a full explanation there? or both?
- Contributor: As above, put them in the simkube-ci-action repo.

ACRL maintains a set of custom GitHub actions for running SimKube in CI.

- for more, visit the [simkube-ci-action](https://github.com/acrlabs/simkube-ci-action) repo
- or see an example in our [Run SimKube in CI](./ci-sim.md) quick start guide
37 changes: 37 additions & 0 deletions docs/infra/overview.md
@@ -0,0 +1,37 @@
<!--
template: docs.html
-->
# SimKube in the Cloud Overview
SimKube can run wherever you run k8s, from local testing environments to automated CI pipelines. For running simulations at scale or integrating SimKube into CI workflows, [Applied Computing Research Labs](https://appliedcomputing.io) provides prebuilt infrastructure components to simplify setup and improve reliability.

This section documents those components and where you can use them.

## When to use these components
You may wish to use the infrastructure components described here if you want:

- to run SimKube simulations in CI (for example, GitHub Actions)
- a repeatable, preconfigured SimKube environment
- to avoid maintaining your own base images or runners
- to run SimKube simulations in AWS

## What's included
- **Amazon Machine Images (AMIs)** - Prebuilt EC2 images with SimKube and its dependencies installed and configured; available for free
- **GitHub Actions Runners** - Self-hosted runners built on top of our AMIs, designed for running SimKube workloads in CI; available for a small fee

> [!NOTE]
> These runners are self-hosted in your AWS account using ACRL's runner AMI.

The next steps cover how to use these components.

## Next steps
- [Learn about SimKube AMI options](./amis.md)
- [Launch and use SimKube AMIs](./usage.md)
- [Configure GitHub Actions to run simulations on self-hosted runners](github-runners.md)

## Quick Start guides
- [Run SimKube in AWS EC2](./run-sim.md)
- [Run SimKube in CI](./ci-sim.md)

## How to get support
- open an issue in the [SimKube GitHub repo](https://github.com/acrlabs/simkube/issues)
- message us in the [SimKube Slack Channel](https://kubernetes.slack.com/archives/C07LTUB823Z)
86 changes: 86 additions & 0 deletions docs/infra/run-sim.md
@@ -0,0 +1,86 @@
<!--
template: docs.html
-->
# Run SimKube in AWS EC2
This guide is intended for users who want to run SimKube in EC2 for one-off simulations or longer-lived simulation environments.

## Assumptions
- You have collected a trace from the cluster you want to simulate. If you still need to do this, see [the sk-tracer docs](../intro/running.md).
- You have sufficient permissions to manage the AWS resources described. For more on this, see the AWS Permissions section on our [usage](./usage.md) page.

## 0. Locate the SimKube AMI

### Via the AWS CLI
```sh
aws ec2 describe-images \
  --owners 174155008850 \
  --filters "Name=name,Values=simkube-x86-64-*" \
  --query "Images[].{
    ImageId: ImageId,
    Name: Name,
    CreationDate: CreationDate
  }" \
  --region us-west-2 \
  --output table
```
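
To pick the most recent image automatically, the listing can be sorted by creation date. The sketch below reuses the owner ID and name filter from the command above and uses the AWS CLI's JMESPath `sort_by` function; the function name `latest_simkube_ami` is hypothetical.

```sh
# Print the ID of the most recently created SimKube AMI in a region.
latest_simkube_ami() {
  region="${1:-us-west-2}"
  # sort_by(...)[-1] selects the image with the latest CreationDate.
  aws ec2 describe-images \
    --owners 174155008850 \
    --filters "Name=name,Values=simkube-x86-64-*" \
    --query 'sort_by(Images, &CreationDate)[-1].ImageId' \
    --region "$region" \
    --output text
}

# Usage:
#   ami_id="$(latest_simkube_ami us-west-2)"
```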

### Via the AWS Console
- Open the EC2 Console
- Navigate to `AMIs`
- Filter by:
    - Owner: Owned by another account
    - Owner ID: 174155008850
- Search by name: `simkube-ami-*`

## 1. Launch an EC2 instance from the AMI
- we recommend using the latest available SimKube AMI
- choose an instance type appropriate for your workload
- attach a key pair for SSH access

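From the CLI, launching the instance might look like the following sketch. It uses the standard `aws ec2 run-instances` command; the function name is hypothetical, the default instance type `m6a.large` is borrowed from the CI example as an assumption rather than a sizing recommendation, and the key pair, subnet, and security group are values you supply.

```sh
# Launch one EC2 instance from a SimKube AMI and print its instance ID.
# Assumptions: INSTANCE_TYPE and AWS_REGION defaults are illustrative only.
launch_simkube_instance() {
  ami_id="$1"
  key_name="$2"
  subnet_id="$3"
  sg_id="$4"
  aws ec2 run-instances \
    --image-id "$ami_id" \
    --instance-type "${INSTANCE_TYPE:-m6a.large}" \
    --key-name "$key_name" \
    --subnet-id "$subnet_id" \
    --security-group-ids "$sg_id" \
    --region "${AWS_REGION:-us-west-2}" \
    --query 'Instances[0].InstanceId' \
    --output text
}

# Usage:
#   launch_simkube_instance ami-xxxx my-key-pair subnet-xxxx sg-xxxx
```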
## 2. Connect to the instance
```sh
ssh ubuntu@<instance-public-ip>
```

> [!NOTE]
> The default username for connecting to your EC2 instance is `ubuntu`, not `ec2-user`.

## 3. Load your trace
> [!NOTE]
> For simplicity and ease of use, we recommend using AWS S3 to store your trace files.
> If your trace files are in S3, you can skip this step; note that SimKube will need additional IAM permissions to access your S3 bucket.

Copy your trace to the instance. The default SimKube trace location is `/var/kind/cluster/trace`:

```sh
scp your_trace_file ubuntu@<instance-ip>:/var/kind/cluster/trace
```

> [!WARNING]
> The trace file path on the EC2 host is not the same as the trace file path specified in the Simulation custom resource.
> This is because there are three layers of indirection when running from a local trace: the EC2 host path is mounted into the kind Docker
> container, which in turn is mounted into the SimKube pod.

## 4. Run your simulation
```sh
skctl run my-simulation --trace-path s3://your-simkube-bucket/path/to/trace
```

> [!NOTE]
> `--trace-path` defaults to `file:///data/trace`, so this flag is optional for local simulations.

More information on running simulations with SimKube can be found [here](https://github.com/acrlabs/simkube/blob/main/docs/intro/running.md).

You can check the status of your simulation by running:
```sh
kubectl get simulation my-simulation
```

> [!NOTE]
> Simulations start in the `Initializing` state and progress to `Running` once they have been scheduled.
> Finally, the simulation will complete with either a `Failed` or `Finished` state.

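To wait for a simulation to finish rather than checking by hand, a small polling loop can help. This is a sketch under one notable assumption: that the state shown by `kubectl get simulation` is exposed at `.status.state` on the Simulation resource, which is not confirmed here. The function names are hypothetical.

```sh
# Return success if the given state is one of the two terminal states.
is_terminal_state() {
  case "$1" in
    Finished|Failed) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll the simulation until it reaches a terminal state.
# Exits 0 for `Finished` and non-zero for `Failed`.
wait_for_simulation() {
  sim_name="$1"
  while :; do
    # Assumption: the state lives at .status.state on the Simulation resource.
    state="$(kubectl get simulation "$sim_name" -o jsonpath='{.status.state}')"
    echo "state: ${state:-unknown}"
    if is_terminal_state "$state"; then
      break
    fi
    sleep "${POLL_SECONDS:-10}"
  done
  [ "$state" = "Finished" ]
}

# Usage:
#   wait_for_simulation my-simulation
```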
## 5. Evaluate your results
Prometheus and Grafana are preinstalled on the AMI. You can view simulation results by connecting to the Grafana pod on your EC2 instance.

See [Evaluate your results](./evaluate.md).
6 changes: 6 additions & 0 deletions docs/infra/troubleshooting.md
@@ -0,0 +1,6 @@
<!--
template: docs.html
-->
# Troubleshooting SimKube AMIs

🚧 COMING SOON!