The standard way to access a shell on a Bottlerocket node is to use either the admin container or the control container. In some cases where both the admin and control containers are disabled, it is still possible to regain access to a Bottlerocket node.
In general, the solution is to mount the API client and API socket into a container on the Bottlerocket node and use the API client to re-enable the admin container, control container, or both.
The solutions on this page provide a minimal overview. They involve mounting critical resources into containers with elevated privileges. Use them carefully, run them only as long as necessary, and clean up afterwards.
Create a pod that mounts the API client and API socket with the correct SELinux labels.
Some notes on the Pod spec below:
- The API socket has an SELinux label (system_u:object_r:api_socket_t:s0) that restricts access to the api_socket_t type and s0 sensitivity level. To learn more about the SELinux labels in Bottlerocket, see the Security Guidance documentation.
- The container runs with the control_t type (which has access to api_socket_t) and the s0 sensitivity level so that it can access the API socket.
- The container command is sleep infinity so that the container stays running for you to exec into.

apiVersion: v1
kind: Pod
metadata:
  name: regain-access
spec:
  containers:
    - name: regain-access
      image: fedora
      command: ["sleep", "infinity"]
      volumeMounts:
        - mountPath: /usr/bin/apiclient
          name: apiclient
          readOnly: true
        - mountPath: /run/api.sock
          name: apiserver-socket
      securityContext:
        seLinuxOptions:
          level: s0
          role: system_r
          type: control_t
          user: system_u
  volumes:
    - name: apiclient
      hostPath:
        path: /usr/bin/apiclient
        type: File
    - name: apiserver-socket
      hostPath:
        path: /run/api.sock
        type: Socket
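The Pod has to exist and be running before you can exec into it. The following is a minimal sketch, assuming the spec above was saved as regain-access.yaml (a hypothetical file name, not one used by this page) and that kubectl already targets the right cluster. The helper name apply_and_wait is also only illustrative.

```shell
# Hypothetical helper: create the Pod from a manifest and wait until it is Ready.
# The default file name is an assumption; the spec above does not name a file.
# Usage: apply_and_wait regain-access.yaml
apply_and_wait() {
  manifest="${1:-regain-access.yaml}"
  kubectl apply -f "${manifest}" || return 1
  # Block until the Pod reports Ready, or give up after two minutes.
  kubectl wait --for=condition=Ready pod/regain-access --timeout=120s
}
```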
exec into the container.
kubectl exec -it regain-access -- bash
Use apiclient to enable the admin container, control container, or both.
Admin container:
apiclient set host-containers.admin.enabled=true
Control container:
apiclient set host-containers.control.enabled=true
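To confirm that a setting was applied, you can read it back before leaving the container. This is a sketch under the assumption that your Bottlerocket release supports apiclient get with a settings prefix; the helper name show_host_container is hypothetical.

```shell
# Hypothetical helper: print the current settings of one host container.
# Assumes `apiclient get` accepts a key prefix such as settings.host-containers.admin.
# Usage: show_host_container admin
#        show_host_container control
show_host_container() {
  name="$1"
  apiclient get "settings.host-containers.${name}"
}
```

In the output, look for the enabled field of the host container you just changed.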
Delete the regain-access pod.
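Because the Pod holds the API socket with elevated SELinux labels, it is worth removing as soon as access is restored. A minimal sketch, assuming the Pod lives in your current namespace; the helper name is illustrative.

```shell
# Hypothetical helper: remove the temporary Pod.
# --ignore-not-found makes the helper safe to run more than once.
# Usage: cleanup_regain_access
cleanup_regain_access() {
  kubectl delete pod regain-access --ignore-not-found
}
```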
The Bottlerocket node should be accessible again.

You can regain access to a Bottlerocket node using ECS Exec starting in v1.14.0.
This solution requires the use of ECS Exec and associated IAM permissions. The instructions for proper IAM permissions to enable ECS Exec can be found in the AWS blog post Using Amazon ECS Exec to access your containers on AWS Fargate and Amazon EC2.
Set the following environment variables. The values of CONTAINER_NAME and SERVICE_NAME are arbitrary, but they need to be consistent across multiple files and commands.
export CONTAINER_NAME="regain-access"
export CLUSTER=<your cluster name>
export SERVICE_NAME="regain-access"
export TASK_ROLE_ARN=<the ARN for your IAM task role>
Create a task definition that mounts apiclient and the related socket path and gives the container the proper SELinux labels.
Writing the task definition[1] JSON to a file through a cat here-document lets the shell interpolate the variables into the final file.
cat << EOF > task-def.json
{
  "family": "regain-access",
  "containerDefinitions": [
    {
      "name": "${CONTAINER_NAME}",
      "image": "fedora",
      "cpu": 0,
      "memoryReservation": 300,
      "portMappings": [],
      "essential": true,
      "command": ["sleep", "infinity"],
      "environment": [],
      "mountPoints": [
        {
          "sourceVolume": "apiclient",
          "containerPath": "/usr/bin/apiclient",
          "readOnly": true
        },
        {
          "sourceVolume": "apisocket",
          "containerPath": "/run/api.sock"
        }
      ],
      "volumesFrom": [],
      "dockerSecurityOptions": [
        "label:user:system_u",
        "label:role:system_r",
        "label:type:control_t",
        "label:level:s0"
      ]
    }
  ],
  "taskRoleArn": "${TASK_ROLE_ARN}",
  "volumes": [
    {
      "name": "apiclient",
      "host": {
        "sourcePath": "/usr/bin/apiclient"
      }
    },
    {
      "name": "apisocket",
      "host": {
        "sourcePath": "/run/api.sock"
      }
    }
  ],
  "requiresCompatibilities": [
    "EC2"
  ],
  "cpu": "1024",
  "memory": "1024"
}
EOF
Register the task definition. Use the query parameter to get only what you need, format the output as text, and skip the pager.
Then store the result in another environment variable. Only standard output is captured, so any errors are still printed to the terminal.
export REGAIN_ACCESS_ARN=$(aws ecs register-task-definition \
--cli-input-json file://task-def.json \
--query "taskDefinition.taskDefinitionArn" \
--output text --no-cli-pager)
echo the environment variable to make sure it matches what you’d expect from an ARN.
echo $REGAIN_ACCESS_ARN
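If you would rather not check the ARN by eye, a shell pattern can do a rough sanity check. This is only a sketch: the function name is hypothetical, and the pattern checks the general shape of an ECS task definition ARN (partition, service, region, account, family and revision), not whether the task definition exists.

```shell
# Hypothetical helper: return 0 if the argument looks like an ECS task definition ARN,
# for example arn:aws:ecs:<region>:<account>:task-definition/<family>:<revision>.
# Usage: looks_like_task_def_arn "${REGAIN_ACCESS_ARN}" || echo "unexpected value" >&2
looks_like_task_def_arn() {
  case "$1" in
    arn:aws*:ecs:*:*:task-definition/*:*) return 0 ;;
    *) return 1 ;;
  esac
}
```

An empty value or the literal text None usually means the register-task-definition call failed or the query matched nothing.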
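Create a service from the task definition with ECS Exec enabled (the --enable-execute-command flag).
aws ecs create-service --cluster "${CLUSTER}" \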
--task-definition "${REGAIN_ACCESS_ARN}" \
--service-name "${SERVICE_NAME}" \
--desired-count 1 \
--launch-type EC2 \
--enable-execute-command \
--no-cli-pager
Get the ARN of the task started by the create-service call. The command below lists the task ARNs for the service name and cluster you specified earlier (there should be only one), filters the response with the query parameter, and stores only what you need in an environment variable.
Wait a few seconds after the previous step and run the following command:
export TASK_ARN=$(aws ecs list-tasks --cluster "${CLUSTER}" \
--service-name "${SERVICE_NAME}" \
--no-cli-pager \
--output text \
--query "taskArns[0]")
echo $TASK_ARN
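If the task has not been started yet, the query matches nothing and the AWS CLI prints the literal text None with --output text. Instead of waiting a fixed number of seconds, you could poll until a task ARN appears. The sketch below assumes CLUSTER and SERVICE_NAME are set as above; the function name and the POLL_SECONDS variable are hypothetical.

```shell
# Hypothetical helper: poll list-tasks until the service has a task, then print its ARN.
# With --output text, an empty taskArns list is printed as the literal string "None".
# Usage: export TASK_ARN=$(get_task_arn 10)
get_task_arn() {
  attempts="${1:-10}"
  while [ "${attempts}" -gt 0 ]; do
    arn=$(aws ecs list-tasks --cluster "${CLUSTER}" \
      --service-name "${SERVICE_NAME}" \
      --no-cli-pager --output text --query "taskArns[0]")
    if [ -n "${arn}" ] && [ "${arn}" != "None" ]; then
      echo "${arn}"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "${POLL_SECONDS:-3}"
  done
  return 1
}
```

Once you have the ARN, aws ecs wait tasks-running with the same cluster and task blocks until the task reaches the RUNNING state.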
Use execute-command to open an interactive shell in the container.
aws ecs execute-command --cluster "${CLUSTER}" \
--task "${TASK_ARN}" \
--container "${CONTAINER_NAME}" \
--interactive \
--command "/bin/bash"
Starting session with SessionId: ecs-execute-command-<some hex values>
Use apiclient to enable the admin container, control container, or both.
# Admin container:
apiclient set host-containers.admin.enabled=true
# Control container:
apiclient set host-containers.control.enabled=true
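As with any solution on this page, clean up once access is restored. The sketch below is run from your workstation after exiting the ECS Exec session, and assumes the CLUSTER, SERVICE_NAME, and REGAIN_ACCESS_ARN variables set earlier are still in your shell. The function name is hypothetical.

```shell
# Hypothetical helper: remove the temporary service and its task definition.
# Usage: cleanup_ecs_regain_access
cleanup_ecs_regain_access() {
  # --force deletes the service even though its desired count is still 1.
  aws ecs delete-service --cluster "${CLUSTER}" \
    --service "${SERVICE_NAME}" --force --no-cli-pager || return 1
  # Deregister the task definition revision that was registered earlier.
  aws ecs deregister-task-definition \
    --task-definition "${REGAIN_ACCESS_ARN}" --no-cli-pager
}
```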
[1] This task definition uses fedora as an image, but almost any base image with a shell will also work.