Personal Infrastructure Part 1: Introduction and Basic Ansible Setup
Published on September 13, 2024 | Last updated on September 18, 2024
In this first step, I'm going to build an Ansible skeleton project and test connectivity. I will explain the motivation, my existing setup, how I set up Ansible, and how I make the process a bit smoother and more secure.
Introduction and Motivation
Eventually I will build out a Personal Microservices Infrastructure Project: a collection of files that, given access to a handful of physical or virtual machines, will "build" a complete foundation for a personal microservices project. To avoid moving too slowly, I will not try to make any part of this process perfectly generic. It will work with my chosen hardware, software, and third-party services. I will make some effort so that anyone following along should be able to recreate something similar.
My Existing Setup
Hardware
- Protectli Vault, configured with 64GB RAM, and a Samsung 4TB SSD.
- AMD Threadripper desktop with 128GB RAM, and 8TB of SSD storage.
- Old MSI laptop with 16GB of RAM and 1 TB of SSD storage
- Digital Ocean Intel SSD VM with 4GB of RAM and 100 GB of storage.
- Smaller Protectli Vault running pfSense
Existing Use Cases
Right now I host a few personal web services:
- Plex
- My "spiritual" website: i am that i am, an "almost" static site that uses a bit of Django.
- A personal Sentry instance.
- A personal Jenkins instance
- A Plausible Analytics instance
- A few more
Network Topology
All the physical hardware is connected via a gigabit switch behind a pfSense router. The pfSense router is running a Wireguard server that I connect to with my roaming Mac laptop and iPhone. All these servers are on the 10.0.0.0/24 subnet. All Wireguard clients are on the 10.2.0.0/24 subnet, with a specific tunnel for each one.
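To make the addressing scheme concrete, here is a minimal Python sketch that sorts an address into the LAN or the Wireguard client subnet. The two CIDR ranges come from the description above; the function name `classify`, the labels, and the sample addresses other than 10.0.0.22 are illustrative assumptions.

```python
import ipaddress

# Subnets as described above; the labels are illustrative.
SUBNETS = {
    "lan": ipaddress.ip_network("10.0.0.0/24"),
    "wireguard": ipaddress.ip_network("10.2.0.0/24"),
}


def classify(address: str) -> str:
    """Return the label of the subnet containing address, or 'other'."""
    ip = ipaddress.ip_address(address)
    for label, network in SUBNETS.items():
        if ip in network:
            return label
    return "other"


if __name__ == "__main__":
    # 10.2.0.5 and 192.168.1.10 are made-up sample addresses.
    for addr in ("10.0.0.22", "10.2.0.5", "192.168.1.10"):
        print(addr, "->", classify(addr))
```

A check like this can be handy when writing firewall rules that should treat roaming clients differently from the servers.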
My VM provided by Digital Ocean is running Debian 12. It has a permanent Wireguard tunnel back to the pfSense box. For this website, I will call this box gateway.
I've named all three physical compute nodes after cats:
- aslan: AMD Threadripper system, and main compute node
- green-lion: Large Protectli Vault
- bagheera: Old MSI laptop
Click on the links to see a bit behind each name. I must confess, I'm not fully aware of the history of green-lion within the field of alchemy, so I hope it doesn't mean something terrible!
According to ChatGPT:
In essence, the Green Lion is a metaphor for transformation, representing both the destructive and creative forces in alchemy.
These services are running in containers manually deployed using Docker Compose. Most of them are running on aslan, with some running in "high availability" mode, with containers running on both aslan and green-lion. An nginx reverse proxy runs on gateway, proxying traffic and terminating SSL using Let's Encrypt.
I'd like to leave all of these services running with close to zero downtime, while deploying new services using increasingly more advanced techniques, culminating in a platform built on top of Kubernetes.
To do so, I'm going to first get some automation in place to configure and manage these physical servers. Then I'll move the manual configuration of my existing services into Ansible, and from there I will set up CI using Drone CI.
Installing and testing Ansible
What is Ansible? Why am I using it?
I've used Ansible a few times to deploy web applications and configure servers. I don't love the giant collection of templated YAML, and yet, it provides too much value to ignore.
As described on their homepage:
Ansible is an open source IT automation engine that automates provisioning, configuration management, application deployment, orchestration, and many other IT processes. It is free to use, and the project benefits from the experience and intelligence of its thousands of contributors.
In my words, Ansible is a tool that lets you write YAML files that describe actions to take on a collection of servers, including copying files, installing software, and more. When structured and written well, Ansible "Playbooks" are idempotent and repeatable: running one a second time leaves an already-configured server unchanged.
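Idempotency is easier to see in code than in prose. The following is a minimal Python sketch of the idea, not how Ansible implements it: a hypothetical helper `ensure_line` checks the current state first and only writes when something is missing, returning whether it changed anything (similar in spirit to the "changed" flag in Ansible output).

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file; return True if the file changed."""
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False  # already in the desired state, nothing to do
    # Add a newline first if the existing content does not end with one.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + separator + line + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "example.conf"
        print(ensure_line(target, "setting = on"))  # first run changes the file
        print(ensure_line(target, "setting = on"))  # second run is a no-op
```

The key design choice is that the function describes a desired end state rather than an action, so it is safe to run any number of times.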
I'm going to be using Ansible primarily to manage the physical servers before any additional infrastructure is in place.
Requirements
Before we can use Ansible, we need key-based SSH access with passwordless sudo on all nodes.
Repeat the steps below on every physical node that you wish to manage with Ansible. I'm going to target aslan and green-lion initially, and then maybe move on to gateway and bagheera later.
1. Passwordless sudo ansible_user account
- Create a new user named ansible_user:
sudo adduser ansible_user
- Give ansible_user sudo access without requiring a password:
echo "ansible_user ALL=(ALL) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ansible_user
- Set up SSH key authentication for ansible_user:
sudo mkdir -p /home/ansible_user/.ssh
sudo chmod 700 /home/ansible_user/.ssh
sudo touch /home/ansible_user/.ssh/authorized_keys
sudo chmod 600 /home/ansible_user/.ssh/authorized_keys
- Copy your public SSH key into the authorized_keys file:
sudo sh -c 'echo "YOUR_PUBLIC_SSH_KEY" >> /home/ansible_user/.ssh/authorized_keys'
Replace YOUR_PUBLIC_SSH_KEY with your actual public SSH key.
- Set proper ownership for the .ssh directory and its contents:
sudo chown -R ansible_user:ansible_user /home/ansible_user/.ssh
After completing these steps, you should be able to SSH into the server as ansible_user using your SSH key, and execute sudo commands without a password prompt.
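The key-installation steps above can also be expressed as a small Python sketch. This is only an illustration under stated assumptions: the function name `install_public_key` is hypothetical, the key shown is a placeholder rather than a real key, the demo writes into a temporary directory instead of /home/ansible_user, and the chown step is left out because it needs root.

```python
import os
import stat
import tempfile
from pathlib import Path


def install_public_key(home: Path, public_key: str) -> Path:
    """Create home/.ssh (mode 700) and authorized_keys (mode 600), then add the key once."""
    ssh_dir = home / ".ssh"
    ssh_dir.mkdir(parents=True, exist_ok=True)
    os.chmod(ssh_dir, 0o700)
    authorized_keys = ssh_dir / "authorized_keys"
    authorized_keys.touch()
    os.chmod(authorized_keys, 0o600)
    # Append only if the key is not already listed, so reruns do not duplicate it.
    if public_key not in authorized_keys.read_text().splitlines():
        with authorized_keys.open("a") as handle:
            handle.write(public_key + "\n")
    return authorized_keys


if __name__ == "__main__":
    placeholder_key = "ssh-ed25519 AAAAEXAMPLE demo@example"  # placeholder, not a real key
    with tempfile.TemporaryDirectory() as tmp:
        path = install_public_key(Path(tmp), placeholder_key)
        print(path, oct(stat.S_IMODE(path.stat().st_mode)))
```

Unlike the plain `echo ... >>` command, this version checks for the key before appending, which keeps authorized_keys tidy if the setup is repeated.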
2. Ansible installed on local development machine
To install Ansible on your local development machine, follow these steps:
- Create a virtual environment:
python3 -m venv ansible-venv
- Activate the virtual environment:
source ansible-venv/bin/activate
- Install Ansible within the virtual environment:
pip install ansible
- Verify the installation:
ansible --version
This approach isolates Ansible and its dependencies in a dedicated environment, preventing conflicts with other Python packages on your system.
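Before running pip install ansible, it can help to confirm that the virtual environment is really active. A minimal sketch, assuming an environment created with python3 -m venv as above; the helper name `in_virtualenv` is illustrative.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; activate ansible-venv first.")
```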
Create Ansible inventory and test connectivity
Create the following files and directory structure:
$ tree physical-server-ansible-playbook
├── ansible.cfg
├── inventory
│   └── hosts
├── playbook.yml
└── roles
    └── hello
        └── tasks
            └── main.yml
cat ansible.cfg
[defaults]
inventory = inventory/hosts
remote_user = ansible_user
private_key_file = ~/.ssh/id_ed25519_aslan_ansible
host_key_checking = False
interpreter_python = auto_silent
cat inventory/hosts
[lz]
aslan ansible_host=aslan
green-lion ansible_host=green-lion
[all:vars]
ansible_user=ansible_user
ansible_ssh_private_key_file=~/.ssh/id_ed25519_aslan_ansible
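Of note, make sure the SSH key path is correct in ansible.cfg. Also note that host_key_checking = False is a potential security risk: SSH no longer verifies the identity of the host it connects to, which leaves room for a man-in-the-middle attack. I'm running this on my home LAN so I think I'm good, but just be aware.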
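Since ansible.cfg is a plain INI file, a quick check for the risky setting is easy to script. The sketch below uses Python's standard configparser on a copy of the config shown above; the function name `host_key_checking_disabled` is hypothetical, and treating a missing option as "enabled" is an assumption based on Ansible's documented default.

```python
import configparser

# Copy of the ansible.cfg shown above, embedded so the sketch is self-contained.
SAMPLE_CFG = """\
[defaults]
inventory = inventory/hosts
remote_user = ansible_user
private_key_file = ~/.ssh/id_ed25519_aslan_ansible
host_key_checking = False
interpreter_python = auto_silent
"""


def host_key_checking_disabled(cfg_text: str) -> bool:
    """Return True when [defaults] host_key_checking is explicitly turned off."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Assumption: a missing option means checking is enabled.
    return not parser.getboolean("defaults", "host_key_checking", fallback=True)


if __name__ == "__main__":
    if host_key_checking_disabled(SAMPLE_CFG):
        print("Warning: host_key_checking is disabled in ansible.cfg")
```

A check like this could run in CI later to make sure the relaxed setting never leaks into a config used outside the home LAN.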
I've called this group of servers the "lz" for landing zone. I'll continue the metaphor, as I "land" on a distant planet and begin "terraforming."
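To show how the group and its hosts relate, here is a rough Python sketch that reads the hosts of one group out of an INI-style inventory. It assumes the [lz] header sits at the top of inventory/hosts, and it handles only the simple "name key=value" lines used here. Ansible's real parser supports much more (ranges, child groups, quoting), so the function `hosts_in_group` is purely illustrative.

```python
# Inventory text as used in this post, assuming [lz] heads the host list.
SAMPLE_INVENTORY = """\
[lz]
aslan ansible_host=aslan
green-lion ansible_host=green-lion

[all:vars]
ansible_user=ansible_user
ansible_ssh_private_key_file=~/.ssh/id_ed25519_aslan_ansible
"""


def hosts_in_group(inventory_text: str, group: str) -> dict:
    """Map each host name in `group` to its inline key=value variables."""
    hosts = {}
    current = None
    for raw in inventory_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # entering a new section
            continue
        if current != group:
            continue
        name, *pairs = line.split()
        hosts[name] = dict(p.split("=", 1) for p in pairs if "=" in p)
    return hosts


if __name__ == "__main__":
    print(hosts_in_group(SAMPLE_INVENTORY, "lz"))
```

For anything beyond a quick look, the ansible-inventory command that ships with Ansible is the authoritative way to see how an inventory is interpreted.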
To verify this is working, you can run the Ansible ping command:
ansible all -m ping
green-lion | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ansible_user@10.0.0.22: Permission denied (publickey).",
    "unreachable": true
}
aslan | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3.11"
    },
    "changed": false,
    "ping": "pong"
}
As you can see, my connectivity to green-lion is not correct: "Permission denied (publickey)" means the server rejected the key for ansible_user. I'll go ahead and make the ansible user on green-lion and try again.
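With only two hosts the output is easy to scan by eye, but a per-host summary becomes useful as the inventory grows. The sketch below parses the human-readable output format shown above with a regular expression. It assumes that format stays as it appears here, and the names `RESULT_LINE` and `summarize` are illustrative.

```python
import re

# Output copied from the first ping run above.
SAMPLE_OUTPUT = """\
green-lion | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ansible_user@10.0.0.22: Permission denied (publickey).",
    "unreachable": true
}
aslan | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3.11"
    },
    "changed": false,
    "ping": "pong"
}
"""

# Matches header lines such as "aslan | SUCCESS => {".
RESULT_LINE = re.compile(r"^(?P<host>\S+) \| (?P<status>[A-Z]+)!? => \{", re.MULTILINE)


def summarize(output: str) -> dict:
    """Map each host to the status word on its result header line."""
    return {m["host"]: m["status"] for m in RESULT_LINE.finditer(output)}


if __name__ == "__main__":
    for host, status in summarize(SAMPLE_OUTPUT).items():
        print(f"{host}: {status}")
```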
ansible all -m ping
aslan | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3.11"
    },
    "changed": false,
    "ping": "pong"
}
green-lion | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3.10"
    },
    "changed": false,
    "ping": "pong"
}
Next Steps
As you may be able to guess simply from the direction of this blog, I like automating things. In my next post I'll describe how I securely store secrets for use within Ansible playbooks, and how I create initial random secrets usable for passwords and keys for deployed software.
Read more in Personal Infrastructure Part 2: Setting up Secret Storage for Ansible.
Untested Sketchy Scripts
I made a script for setting up the Ansible user on a remote machine. This assumes that you have SSH access to an account with sudo permissions on the remote server.
This may break for various reasons, but it has worked for me.
#!/usr/bin/env python3
import os
import subprocess
import sys
import getpass
def run_ssh_command(hostname, command, control_path=None, use_sudo=False):
    """Run a command on the remote host over SSH and return its stdout."""
    ssh_command = ['ssh']
    if control_path:
        # Reuse the existing control master connection
        ssh_command.extend(['-S', control_path])
    if use_sudo:
        full_command = f"sudo -S bash -c '{command}'"
    else:
        full_command = command
    ssh_command.extend([hostname, full_command])
    if use_sudo:
        try:
            # First try without password
            result = subprocess.run(ssh_command, capture_output=True, text=True, input='', timeout=5)
            if result.returncode != 0:
                # If that fails, prompt for password
                sudo_password = getpass.getpass(f"Enter sudo password for {hostname}: ")
                result = subprocess.run(ssh_command, capture_output=True, text=True, input=sudo_password + '\n')
        except subprocess.TimeoutExpired:
            print("Sudo command timed out. Assuming no password is required.")
            result = subprocess.run(ssh_command, capture_output=True, text=True)
    else:
        result = subprocess.run(ssh_command, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Error running command: {command}")
        print("Remote host stderr output:")
        print(result.stderr.strip())
    return result.stdout.strip()
def get_ssh_keys():
    """List the public key files in ~/.ssh."""
    ssh_dir = os.path.expanduser("~/.ssh")
    return [f for f in os.listdir(ssh_dir) if f.endswith(".pub")]

def prompt_for_ssh_key(keys):
    """Ask which public key to install and return its file name."""
    print("Available SSH public keys:")
    for i, key in enumerate(keys):
        print(f"{i + 1}: {key}")
    choice = int(input("Select the number of the SSH key to use: ")) - 1
    return keys[choice]
def check_user_exists(hostname, control_path):
    # `id -u` prints a numeric UID only when the account exists
    return run_ssh_command(hostname, 'id -u ansible_user', control_path).isdigit()

def check_sudo_permissions(hostname, control_path):
    return "NOPASSWD: ALL" in run_ssh_command(hostname, 'sudo -l -U ansible_user', control_path, use_sudo=True)

def check_ssh_key_installed(hostname, control_path, ssh_key):
    # -F matches the key as a fixed string; sudo is needed because .ssh is
    # mode 700 and owned by ansible_user, so other accounts cannot read it
    result = run_ssh_command(hostname, f'grep -qF "{ssh_key}" /home/ansible_user/.ssh/authorized_keys && echo "Key found" || echo "Key not found"', control_path, use_sudo=True)
    return "Key found" in result
def enable_ansible_user(hostname, ssh_key, control_path):
    """Create ansible_user if needed, grant passwordless sudo, and install the key."""
    user_exists = check_user_exists(hostname, control_path)
    if user_exists:
        if check_sudo_permissions(hostname, control_path):
            print("User ansible_user already exists with appropriate sudo permissions.")
        else:
            print("User ansible_user exists but does not have appropriate sudo permissions.")
            run_ssh_command(hostname, 'echo "ansible_user ALL=(ALL) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ansible_user', control_path, use_sudo=True)
            print("Added sudo permissions for ansible_user.")
    else:
        run_ssh_command(hostname, 'sudo adduser --disabled-password --gecos "" ansible_user', control_path, use_sudo=True)
        run_ssh_command(hostname, 'echo "ansible_user ALL=(ALL) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ansible_user', control_path, use_sudo=True)
        print("Created ansible_user with sudo permissions.")
    if not check_ssh_key_installed(hostname, control_path, ssh_key):
        commands = [
            'sudo mkdir -p /home/ansible_user/.ssh',
            'sudo chmod 700 /home/ansible_user/.ssh',
            'sudo touch /home/ansible_user/.ssh/authorized_keys',
            'sudo chmod 600 /home/ansible_user/.ssh/authorized_keys',
            f'echo "{ssh_key}" | sudo tee -a /home/ansible_user/.ssh/authorized_keys',
            'sudo chown -R ansible_user:ansible_user /home/ansible_user/.ssh'
        ]
        for command in commands:
            run_ssh_command(hostname, command, control_path, use_sudo=True)
        print("Installed SSH key for ansible_user.")
    else:
        print("SSH key already installed for ansible_user.")
if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: enable_ansible <hostname>")
        sys.exit(1)
    hostname = sys.argv[1]
    keys = get_ssh_keys()
    if not keys:
        print("No SSH public keys found in ~/.ssh")
        sys.exit(1)
    selected_key = prompt_for_ssh_key(keys)
    ssh_key_path = os.path.expanduser(f"~/.ssh/{selected_key}")
    with open(ssh_key_path, 'r') as key_file:
        ssh_key = key_file.read().strip()
    control_path = f"/tmp/ansible-ssh-{hostname}-22-control"
    # Set up the control master connection
    subprocess.run(['ssh', '-M', '-S', control_path, '-fNT', hostname], check=True)
    try:
        enable_ansible_user(hostname, ssh_key, control_path)
        print(f"Ansible user enabled on {hostname} with SSH key {selected_key}")
    finally:
        # Close the control master connection
        subprocess.run(['ssh', '-S', control_path, '-O', 'exit', hostname], check=True)
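One fragile spot in a script like this is the hand-built sudo -S bash -c '...' string, which breaks as soon as the wrapped command contains a single quote. A possible hardening, sketched here as a suggestion and not as part of the original script, is to let Python's shlex.quote do the quoting. The helper name `build_sudo_command` is hypothetical.

```python
import shlex


def build_sudo_command(command: str) -> str:
    """Wrap command so the remote shell passes it to bash -c as one argument."""
    return f"sudo -S bash -c {shlex.quote(command)}"


if __name__ == "__main__":
    # A command containing single quotes, which naive wrapping would mangle.
    tricky = "echo 'ansible_user ALL=(ALL) NOPASSWD:ALL' | tee /etc/sudoers.d/ansible_user"
    wrapped = build_sudo_command(tricky)
    print(wrapped)
    # Round-trip check: POSIX-style splitting recovers the original command intact.
    assert shlex.split(wrapped) == ["sudo", "-S", "bash", "-c", tricky]
```

The round-trip through shlex.split is a cheap way to confirm that the remote shell would see the command as a single argument.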