Regaining Access to a Bottlerocket Node
Introduction
The standard way to access a shell on a Bottlerocket node is to use either the admin container or the control container. In some cases where both the admin and control containers are disabled, it is still possible to regain access to a Bottlerocket node.
Solution Description
In general, the solution is to mount the API client and API socket into a container on the Bottlerocket node and use the API client to re-enable the admin container, control container, or both.
Warning
The solutions on this page provide a minimal overview. It involves mounting critical resources into containers with elevated privileges. Use carefully, only run as long as necessary, and clean up afterwards.
Steps to Regain Access on Kubernetes
Create a pod that mounts the API client and API socket with the correct SELinux labels.
Some notes on the Pod spec below:
- The API socket has a specific SELinux label applied to it (
system_u:object_r:api_socket_t:s0
) that restricts access to theapi_socket_t
type ands0
sensitivity level. To learn more about the SELinux labels in Bottlerocket, see the Security Guidance documentation. - In order to access the API socket, the Pod must have compatible SELinux options applied to it.
- Use the
control_t
type (has access toapi_socket_t
) ands0
sensitivity level to allow the container to access the API socket. - The entrypoint for the container is
sleep infinity
so that the container stays running for you toexec
into.
- The API socket has a specific SELinux label applied to it (
apiVersion: v1
kind: Pod
metadata:
name: regain-access
spec:
containers:
- name: regain-access
image: fedora
command: ["sleep", "infinity"]
volumeMounts:
- mountPath: /usr/bin/apiclient
name: apiclient
readOnly: true
- mountPath: /run/api.sock
name: apiserver-socket
securityContext:
seLinuxOptions:
level: s0
role: system_r
type: control_t
user: system_u
volumes:
- name: apiclient
hostPath:
path: /usr/bin/apiclient
type: File
- name: apiserver-socket
hostPath:
path: /run/api.sock
type: Socket
exec
into the container.
kubectl exec -it regain-access -- bash
- Use
apiclient
to enable the admin container, control container, or both.
Admin container:
apiclient set host-containers.admin.enabled=true
Control container:
apiclient set host-containers.control.enabled=true
- Exit the shell and delete the
regain-access
pod. The Bottlerocket node should be accessible again.
Steps to Regain Access on ECS
You can regain access to a Bottlerocket node using ECS Exec starting in v1.14.0.
This solution requires the use of ECS Exec and associated IAM permissions. The instructions for proper IAM permissions to enable ECS Exec can be found in the AWS blog post Using Amazon ECS Exec to access your containers on AWS Fargate and Amazon EC2.
- First, you will need to setup a few environment variables.
CONTAINER_NAME
andSERVICE_NAME
are not specifically relevant, but they need to be consistent across multiple files and commands.
export CONTAINER_NAME="regain-access"
export CLUSTER=<your cluster name>
export SERVICE_NAME="regain-access"
export TASK_ROLE_ARN=<the ARN for your IAM task role>
- Now you will create a task definition that will mount the Bottlerocket
apiclient
and the related socket path as well as giving it the proper SELinux labels. Pipingcat
to a file to generate the task definition1 JSON will allow for interpolating the variables into the final file.
cat << EOF > task-def.json
{
"family": "regain-access",
"containerDefinitions": [
{
"name": "${CONTAINER_NAME}",
"image": "fedora",
"cpu": 0,
"memoryReservation": 300,
"portMappings": [],
"essential": true,
"command": ["sleep", "infinity"],
"environment": [],
"mountPoints": [
{
"sourceVolume": "apiclient",
"containerPath": "/usr/bin/apiclient",
"readOnly": true
},
{
"sourceVolume": "apisocket",
"containerPath": "/run/api.sock"
}
],
"volumesFrom": [],
"dockerSecurityOptions": [
"label:user:system_u",
"label:role:system_r",
"label:type:control_t",
"label:level:s0"
]
}
],
"taskRoleArn": "${TASK_ROLE_ARN}",
"volumes": [
{
"name": "apiclient",
"host": {
"sourcePath": "/usr/bin/apiclient"
}
},
{
"name": "apisocket",
"host": {
"sourcePath": "/run/api.sock"
}
}
],
"requiresCompatibilities": [
"EC2"
],
"cpu": "1024",
"memory": "1024"
}
EOF
- The following will take the file from the previous step and use the AWS CLI to register the task definition.
Running the AWS CLI command without any options, will produce a large amount of JSON that is hard to sift through. Instead, use the
query
parameter to get only what you need, format the output text and skip the pager. Then take the results and put it into another environment variable. Any errors will be visible as a response in standard output.
export REGAIN_ACCESS_ARN=$(aws ecs register-task-definition \
--cli-input-json file://task-def.json \
--query "taskDefinition.taskDefinitionArn" \
--output text --no-cli-pager)
- As a confidence check,
echo
the environment variable to make sure it matches what you’d expect from an ARN.
echo $REGAIN_ACCESS_ARN
- With the task definition ARN stored in the environment variable, you can create the service. You may want to limit the deployment of this service to specific instance(s).
aws ecs create-service --cluster "${CLUSTER}" \
--task-definition "${REGAIN_ACCESS_ARN}" \
--service-name "${SERVICE_NAME}" \
--desired-count 1 \
--launch-type EC2 \
--enable-execute-command \
--no-cli-pager
- The subsequent step will need the task ARN which is not included in the response from the
create-service
call; the command below will grab all the task ARNs from the service name and cluster you specified earlier (which should only be one), filter the response with thequery
parameter, and use an environment variable to store only what you need. Wait a few seconds after the previous step and run the following command:
export TASK_ARN=$(aws ecs list-tasks --cluster ${CLUSTER} \
--service-name ${SERVICE_NAME} \
--no-cli-pager \
--output text \
--query "taskArns[0]")
- Again, do a confidence check to ensure the environment variable looks as you’d expect.
echo $TASK_ARN
- Finally, you can use
execute-command
to open an interactive shell with the container.
aws ecs execute-command --cluster "${CLUSTER}" \
--task "${TASK_ARN}" \
--container "${CONTAINER_NAME}" \
--interactive \
--command /bin/bash"
- You should see a message like this followed by an interactive shell
Starting session with SessionId: ecs-execute-command-<some hex values>
- In the shell use
apiclient
to enable the admin container, control container, or both.
# Admin container
apiclient set host-containers.admin.enabled=true
# Control container:
apiclient set host-containers.control.enabled=true
- Now you should be able to access your admin or control container using SSM. You no longer need anything created in this how-to, feel free to delete the service and IAM policies.
This task definition uses
fedora
as an image, but almost any base image with a shell will also work. ↩︎