XZ utils CVE-2024-3094: the backdoor a maintainer planted over three years

CVE-2024-3094 is a backdoor introduced in xz-utils by a maintainer who joined the project in October 2021 and earned commit access in December 2022. It affects versions 5.6.0 (24 Feb 2024) and 5.6.1 (9 Mar 2024) and carries a CVSS score of 10.0. Andres Freund (PostgreSQL developer, Microsoft employee) discloses it publicly on 29 March 2024, after investigating a slow sshd on Debian sid during PostgreSQL benchmarks.

The payload is delivered in a place no one was looking: two binary files in tests/files/ that a hook hidden in m4/build-to-host.m4 extracts and runs only at ./configure time. The resulting library, liblzma.so, hooks RSA_public_decrypt when sshd loads it indirectly through libsystemd, a dependency added by the systemd-notify patch that Debian, Ubuntu, Fedora and derivatives apply to the OpenSSH package.

It isn’t a vulnerability: it’s a deliberate backdoor, with two and a half years of social engineering behind it.

Lab note: the backdoor can no longer be reproduced against a production system, because distros reverted to 5.4.x in under 48 hours. The technical analysis below builds on Andres Freund’s disclosure, the archived tukaani-project repository, Filippo Valsorda’s payload dump and the timeline reconstructed by Russ Cox.

Maintainer timeline — Jia Tan / JiaT75

The most useful part of the case isn’t the payload. It’s the chain of events that places a hostile maintainer with commit access on a library that sshd loads indirectly. Russ Cox publishes a detailed reconstruction at research.swtch.com/xz-timeline.

Date	Event
2021-10-29	Jia Tan’s first patch to `xz-devel`: an innocuous `.editorconfig`.
2022-02-07	First commit merged with `jiat0218@gmail.com` as author.
2022-04 to 06	Sockpuppet pressure campaign: `Jigar Kumar` and `Dennis Ens` complain about the pace of Lasse Collin, the original maintainer.
2022-06-08	Lasse Collin admits on the list to personal and mental health issues; signals Jia Tan will take “a bigger role”.
2022-06-29	Lasse Collin announces Jia Tan as “practically a co-maintainer”.
2022-10-28	Jia Tan added to the `tukaani-project` GitHub org.
2022-12-30	Direct commit access.
2023-03-18	Jia Tan ships their first release (5.4.2).
2024-02-23	Backdoor code merged into binary test files.
2024-02-24	Release 5.6.0 with malicious `build-to-host.m4`.
2024-03-09	Release 5.6.1 with updated payload.
2024-03-25	Hans Jansen (an account with no history before its `ifunc` patches to xz in June 2023) opens a bug in Debian asking to push 5.6.1 to `unstable`.
2024-03-28	Andres Freund reports the find privately to `distros@vs.openwall.org`.
2024-03-29	Public disclosure on oss-security.

Jigar Kumar and Dennis Ens have no history outside the pressure threads aimed at Lasse Collin. Their only documented use is pushing a maintainer in personal crisis to delegate commit access. Russ Cox's timeline treats them as likely sockpuppets of the same operation that controls Jia Tan.

Two and a half years pass between the first patch and the backdoored release. At least three fictitious personas (Jigar Kumar, Dennis Ens, Hans Jansen) work alongside Jia Tan, and the Hans Jansen ifunc patches of 2023 lay groundwork the payload later relies on. This isn’t an opportunistic attacker; it’s a patient operation.

The bug — payload delivered by an autotools hook

xz is shipped as a tarball generated with autotools. The tarball is not the git repo: it includes generated scripts (configure), expanded m4 macros and auxiliary files that aren’t version-controlled. That gap between repo and tarball is where Jia Tan slips the code in.

The `m4/build-to-host.m4` file

build-to-host.m4 originally comes from gnulib. The upstream version handles path conversion for cross-platform builds. The version travelling in the 5.6.0 and 5.6.1 tarballs replaces that macro with a variant that, at the end of ./configure, runs a script built on the fly.

The script searches the source tree for files matching a specific regex pattern:

grep -aErls "#{4}[[:alnum:]]{5}#{4}$"

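Four #, five alphanumeric characters, four # and end of line. Only the first file below matches the pattern; the second holds the bulk of the payload and is read by a later stage of the script. Both live in tests/files/:

tests/files/bad-3-corrupt_lzma2.xz
tests/files/good-large_compressed.lzma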

That’s the elegant part of the attack: to anyone glancing at the repo, tests/files/ contains intentionally corrupt .xz files used to validate the parser. No human reviewer is going to sit down and hexdump binaries labelled bad corrupt test data.
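The marker search can be sketched in a few lines of Python. This is an illustrative equivalent of the grep call, not the script's own code: the function name `find_marker_files` is hypothetical, and the character class is ASCII-only, whereas grep's `[[:alnum:]]` depends on the locale.

```python
import re
from pathlib import Path

# Same shape as the grep pattern: four '#', five alphanumerics, four '#', end of line.
# Bytes mode mirrors `grep -a`, which treats binary files as text.
MARKER = re.compile(rb"#{4}[0-9A-Za-z]{5}#{4}$", re.MULTILINE)


def find_marker_files(root):
    """Return sorted relative paths of files under root that contain the marker."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        if path.is_file() and MARKER.search(path.read_bytes()):
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

Run against an unpacked source tree, a scan like this shows how few files carry such a marker, which is exactly what lets the build script locate its payload without naming the file.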

Payload extraction

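As reconstructed in the Russ Cox and gynvael.coldwind.pl write-ups, the script chains tr byte substitutions, head, xz decompression and an RC4-like stream cipher written in awk over the bytes of the two files. The first file only looks corrupt: a tr substitution restores a valid .xz stream, which decompresses to the next shell stage. That stage carves data out of the second file and, after further substitution, decryption and decompression, yields a precompiled ELF object.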
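The first step, hiding a valid compressed stream behind a byte substitution, is easy to model. The sketch below is a toy under stated assumptions: the swap table (tab with space, '-' with '_') follows Russ Cox's write-up of the script and should be treated as an assumption here, and `disguise` and `recover` are hypothetical names. It covers only the substitution and decompression, not the later RC4-like stage.

```python
import lzma

# Assumed swap table: tab <-> space and '-' <-> '_'. The swap is its own inverse,
# so the same translation both "corrupts" and restores the stream.
SWAP = bytes.maketrans(b"\t -_", b" \t_-")


def disguise(xz_bytes):
    """Apply the swap to a valid xz stream, producing a file that looks corrupt."""
    return xz_bytes.translate(SWAP)


def recover(disguised):
    """Undo the swap and decompress, like `tr ... | xz -d` in a shell pipeline."""
    return lzma.decompress(disguised.translate(SWAP))
```

Because the substitution touches only four byte values, the disguised file still resembles a damaged .xz stream, which fits the naming of a test file meant to exercise the parser's error paths.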

Before extracting, the script verifies specific conditions. Reconstructed from gynvael.coldwind.pl’s analysis:

Architecture x86_64.
Linux with gcc + GNU ld + glibc.
Build via dpkg-buildpackage or rpmbuild.
Build target is not Alpine (musl). Doesn’t trigger with clang.

If the conditions hold, the script injects the extracted ELF into the liblzma build system. The resulting Makefile chains the malicious object with the rest of the library without it appearing anywhere in the visible source tree.
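The gating logic amounts to a conjunction of environment checks. The sketch below models the listed conditions as data; the `BuildEnv` fields and the `would_inject` name are hypothetical, and the real script tests its environment through shell commands rather than anything like this structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildEnv:
    arch: str      # e.g. "x86_64"
    kernel: str    # e.g. "linux"
    libc: str      # e.g. "glibc" or "musl"
    compiler: str  # e.g. "gcc" or "clang"
    linker: str    # e.g. "gnu-ld"
    packager: str  # "dpkg-buildpackage", "rpmbuild", or "" for a manual build


def would_inject(env):
    """True only when every condition from the list above holds."""
    return (
        env.arch == "x86_64"
        and env.kernel == "linux"
        and env.libc == "glibc"
        and env.compiler == "gcc"
        and env.linker == "gnu-ld"
        and env.packager in {"dpkg-buildpackage", "rpmbuild"}
    )
```

The practical consequence is that a developer building by hand, or with clang, or on Alpine, gets a clean library and has nothing to notice.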

On 30 March, Lasse Collin publishes a note confirming that the modified build-to-host.m4 was never in the git repository, only in the release tarballs Jia Tan signed. The binary test files, by contrast, were committed to git on 23 February, where they passed as ordinary test data.

The resulting library — IFUNC abused

A liblzma compiled with the payload behaves like a normal library for all legitimate calls. The difference is in initialisation: it registers an IFUNC resolver that runs before main.

IFUNC is a glibc feature that lets a library decide at runtime which implementation of a function to use based on the CPU. It’s legitimate: glibc uses it so memcpy jumps to the AVX2 implementation if the processor supports it. The backdoor abuses it: the resolver doesn’t return an optimised version of a compression function. It reads the symbol table of the process loading liblzma, looks up RSA_public_decrypt@plt from OpenSSL, and replaces the pointer with its own function.

If the process is sshd and it’s linked against OpenSSL, any public-key authentication goes through the malicious RSA_public_decrypt.
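Python has no IFUNC, but the control flow can be modelled: a resolver is code that runs at load time, and nothing stops it from touching symbols it doesn't own. The sketch below is a conceptual model only. The `Process` class stands in for the dynamic linker's symbol table, the checksum is a placeholder rather than a real CRC, and all names are hypothetical.

```python
class Process:
    """Toy process: maps symbol names to callables (a stand-in for the GOT/PLT)."""

    def __init__(self, symbols):
        self.symbols = dict(symbols)

    def load_library(self, resolvers):
        # Resolvers run at load time, before application code calls any symbol.
        for name, resolver in resolvers.items():
            self.symbols[name] = resolver(self)


def honest_resolver(process):
    # A legitimate resolver only picks an implementation for its own symbol.
    return lambda data: sum(data) & 0xFFFFFFFF  # placeholder checksum, not a CRC


def make_hostile_resolver(hook):
    def resolver(process):
        # Side effect: wrap a symbol that belongs to a different library.
        if "RSA_public_decrypt" in process.symbols:
            original = process.symbols["RSA_public_decrypt"]
            process.symbols["RSA_public_decrypt"] = lambda *args: hook(original, *args)
        return honest_resolver(process)  # the library still works normally

    return resolver
```

In a process without an `RSA_public_decrypt` entry the hostile resolver changes nothing, which mirrors why the compromised library behaves normally in tools like xz itself.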

The bridge — libsystemd loads liblzma, sshd loads libsystemd

Upstream OpenSSH doesn’t depend on libsystemd. It doesn’t depend on liblzma. If you compile sshd from openssh.com’s official tarball, liblzma doesn’t get loaded.

What happens on Debian, Ubuntu, Fedora, openSUSE and derivatives is another story: the distro’s openssh-server package applies a downstream patch to integrate sshd with systemd-notify. The patch adds a runtime dependency on libsystemd. libsystemd is linked against liblzma (to support compressed journal). That means as soon as sshd starts, libsystemd loads, and libsystemd loads liblzma. The backdoor’s IFUNC resolver runs before the process handles any connection.

sshd (distro binary)
  → libsystemd.so (sd_notify for systemd)
    → liblzma.so (backdoor: registers IFUNC resolver)
      → modifies the RSA_public_decrypt PLT in sshd

Upstream OpenSSH never had this chain. It’s a product of the downstream patch + systemd’s modular design. Damien Miller (OpenSSH maintainer) makes clear on his list in April that the chain doesn’t exist without the patch distros add, and that vanilla sshd doesn’t expose this surface.
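The difference between the upstream and the distro-patched binary is a reachability question on the dependency graph. A minimal sketch, with simplified and partly assumed graphs (the `libcrypto` edge and both dictionaries are illustrative, not output from a real system):

```python
from collections import deque


def dependency_path(graph, start, target):
    """Breadth-first search; returns the shortest load chain or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for dep in graph.get(path[-1], ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None


# Simplified graphs for illustration only.
UPSTREAM = {"sshd": ["libcrypto"]}
DISTRO_PATCHED = {"sshd": ["libcrypto", "libsystemd"], "libsystemd": ["liblzma"]}
```

With these inputs, `dependency_path(DISTRO_PATCHED, "sshd", "liblzma")` returns the three-link chain from the diagram, while the upstream graph yields `None`.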

The activation — Ed448 + system()

The hooked RSA_public_decrypt receives the data a client sends during SSH public-key authentication. The legitimate version verifies the RSA signature against the client’s public key. The malicious version:

Checks that a specific field in the payload contains a valid Ed448 signature over the host’s public key.
The Ed448 public key it verifies against is hardcoded in the malicious object. The private key is only held by the attacker.
If the signature validates, it extracts bytes from the payload and passes them directly to system().
Pre-auth RCE as root (sshd runs as root before dropping to the user).

Clean design:

Only the attacker can activate the backdoor, because only they hold the Ed448 private key.
The signature is bound to the host’s public key, so a valid payload against host-a.example doesn’t work against host-b.example.
It leaves little trace in logs: if the signature doesn’t validate, the hook falls through to the legitimate RSA_public_decrypt and the attempt looks like any other failed login; if it does, the command runs before sshd records the attempt as accepted.
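The host-binding property can be illustrated with a toy check. One substitution is made deliberately and should be read as such: Python's standard library has no Ed448, so the sketch uses HMAC-SHA256 instead. That changes the trust model (an HMAC verifier must hold the secret, whereas the real backdoor embeds only a public key), but it shows why a tag computed for one host is useless against another. All names are hypothetical, and nothing here executes a command.

```python
import hashlib
import hmac


def sign_command(secret, host_key, command):
    """Toy signer: the tag covers a digest of the host key plus the command."""
    message = hashlib.sha256(host_key).digest() + command
    return hmac.new(secret, message, hashlib.sha256).digest()


def accept_command(secret, host_key, command, tag):
    """Toy verifier: return the command only if the tag verifies for THIS host."""
    expected = sign_command(secret, host_key, command)
    return command if hmac.compare_digest(expected, tag) else None
```

A tag produced for one host key verifies only when the verifier supplies the same host key, so replaying a captured payload against a different server fails the check.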

Filippo Valsorda and other analysts reconstruct the payload details over the 30-31 March weekend. The Ed448 + system() piece is confirmed by several independent traces against the malicious object extracted from package 5.6.1.

How Andres Freund found it

Freund was benchmarking PostgreSQL on Debian sid. He noticed two things:

SSH logins consumed more CPU than usual.
valgrind was producing new noise about liblzma.

He measured timings: a login went from ~0.3 s to ~0.8 s. He profiled sshd with perf and found CPU time landing in liblzma, on a path that shouldn’t touch authentication. He minimised the case, confirmed the hook, wrote the report.

... since liblzma is a dependency of openssh's sshd on debian and many other
distros, this means that ssh logins (and many other things) are way slower
than they should be ... I assumed the system was very tight on memory, but
then valgrind started complaining ...

— Andres Freund, oss-security, 29 March 2024.

If Freund hadn’t been measuring SSH latency during a benchmark, 5.6.1 would likely have reached Ubuntu LTS and Fedora stable in a matter of weeks.

Detection — the tricks that appear on 29 and 30 March

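While distro maintainers start reverting, several analysts publish detection one-liners. The simplest idea rests on one observation: liblzma has nothing to do with OpenSSL, so a liblzma binary that mentions OpenSSL symbols has been modified. Treat a hit as a strong signal and an empty result as inconclusive, since later analyses report that the payload obfuscates its strings.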

# strings on a legitimate liblzma shouldn't mention OpenSSL or RSA
strings /usr/lib/x86_64-linux-gnu/liblzma.so.5 | grep -i 'rsa\|openssl'
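The same check can be written in Python for use inside a larger audit script. This is a rough equivalent of `strings | grep -i`, not a replacement for the published detection tools; the function names are hypothetical and the printable-character range is a simplification of what strings(1) accepts.

```python
import re


def printable_strings(blob, min_len=4):
    """Yield runs of printable ASCII, roughly like strings(1) with its default length of 4."""
    for match in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, blob):
        yield match.group().decode("ascii")


def suspicious_strings(blob):
    """Return the printable strings that mention RSA or OpenSSL, case-insensitively."""
    return [s for s in printable_strings(blob) if re.search(r"rsa|openssl", s, re.IGNORECASE)]
```

A pattern as short as `rsa` also matches unrelated words such as "universal", so any hit deserves a look at the full string before drawing conclusions.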

Another indicator is size: the malicious liblzma.so.5.6.0 is ~100 KB larger than the 5.4.6 one. Vegard Nossum and others publish hexdump comparisons of the differing block.

Quick version check:

# Debian / Ubuntu
dpkg -l | grep xz-utils
# Fedora / RHEL
rpm -q xz-libs
# Any distro with xz installed
xz --version

Any 5.6.0 or 5.6.1 is vulnerable. 5.4.x and 5.6.2+ are not.
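For fleet-wide checks, the version rule is simple enough to encode. A minimal sketch, assuming the input is either `xz --version` output or a package version string; the function names are hypothetical. It also handles Debian's `+really` rollback convention, on the assumption that a reverted package may carry a version of the form `5.6.1+really5.4.5`, which would otherwise be misread as affected.

```python
import re

BACKDOORED = {(5, 6, 0), (5, 6, 1)}


def parse_xz_version(text):
    """Extract (major, minor, patch) from xz --version output or a package version."""
    # Debian marks a rollback as "<new>+really<old>"; the part after it is what is installed.
    text = text.split("+really")[-1]
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_backdoored(text):
    """True for 5.6.0 and 5.6.1, False for anything else that parses."""
    return parse_xz_version(text) in BACKDOORED
```

The version string is only a first filter: it says what the package claims to be, not what the library on disk contains.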

The “official” detection scripts (Red Hat, GitHub, Binarly) come in the following 24 hours and compare hashes against a known list.

Hashes and public artefacts

Hashes published by Red Hat and CISA for the malicious binaries:

File	SHA-256
`xz-5.6.0.tar.gz` (malicious upstream tarball)	`0f5c81d545d5269d5d8c7f2447e44ac1d2d52a5bb2d6418dbc44de4204aaa600`
`xz-5.6.1.tar.gz` (malicious upstream tarball)	`2398f4a8e53345325f44bdd9f0cc7401bd9025d736c6d43b372f4dea77bf75b8`
`liblzma.so.5.6.0` (Debian sid amd64)	`bf6f4a4f3fb29c5b04c2c8fd6abe2cefa3766fb20bd13c5a1e1c3a3e25e0fc1f`

Confirmed clean versions: 5.4.6-1 (Debian stable), 5.4.5-1ubuntu0.2 (Ubuntu LTS), 5.4.6-3 (Fedora 39).
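Comparing a file against a published hash list takes only the standard library. In this sketch the function names are hypothetical, and `known_bad` is whatever set of SHA-256 values you trust, for example the entries from the table above after checking them against the vendor advisories.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_known_bad(path, known_bad):
    """True if the file's SHA-256 appears in known_bad (compared case-insensitively)."""
    return sha256_of(path) in {value.lower() for value in known_bad}
```

A hash match is conclusive for that exact artefact; a miss says nothing about rebuilt or repackaged binaries, which is why the string and pattern checks exist alongside it.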

YARA rule — static detection

Public rule from Binarly:

rule liblzma_xz_backdoor_3094
{
    meta:
        author = "Binarly + community"
        cve = "CVE-2024-3094"
        description = "Detects liblzma 5.6.0/5.6.1 with RSA_public_decrypt hook"
    strings:
        $sym_openssl  = "RSA_public_decrypt" wide ascii
        $ifunc_hook   = { 48 83 fa 30 0f 84 ?? ?? ?? ?? 48 83 fa 31 }
        $ed448_const  = { f3 0f 1e fa 41 57 41 56 41 55 41 54 53 48 83 ec }
    condition:
        uint32(0) == 0x464c457f and
        $sym_openssl and ($ifunc_hook or $ed448_const)
}

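The RSA_public_decrypt symbol has no reason to appear in a clean liblzma, which is why the rule is reported to have no known false positives. The $ed448_const bytes on their own are a common x86-64 function prologue (endbr64 followed by register pushes), so the rule’s specificity rests on the symbol string and the $ifunc_hook pattern.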
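Where YARA isn't installed, the rule's logic can be approximated with plain byte matching. The sketch below mirrors the three strings and the condition from the rule text above; it is an illustration of how the rule evaluates, not a vetted detector, and `rule_matches` is a hypothetical name.

```python
import re

ELF_MAGIC = b"\x7fELF"  # uint32(0) == 0x464c457f, read little-endian
SYMBOL = b"RSA_public_decrypt"
SYMBOL_WIDE = SYMBOL.decode("ascii").encode("utf-16-le")  # YARA's `wide` modifier
# `??` in a YARA hex string matches any single byte, hence `.` with DOTALL.
IFUNC_HOOK = re.compile(rb"\x48\x83\xfa\x30\x0f\x84.{4}\x48\x83\xfa\x31", re.DOTALL)
ED448_CONST = bytes.fromhex("f3 0f 1e fa 41 57 41 56 41 55 41 54 53 48 83 ec")


def rule_matches(blob):
    """ELF header, plus the symbol string, plus at least one of the two code patterns."""
    has_symbol = SYMBOL in blob or SYMBOL_WIDE in blob
    has_code = bool(IFUNC_HOOK.search(blob)) or ED448_CONST in blob
    return blob.startswith(ELF_MAGIC) and has_symbol and has_code
```

Reading a library with `open(path, "rb").read()` and passing the bytes to `rule_matches` gives the same yes/no answer the condition block describes.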

Dynamic exposure confirmation

The backdoor only comes into play when sshd loads liblzma, which happens via libsystemd on distros that patch sshd for systemd notification:

# Does sshd load libsystemd and, through it, liblzma?
ldd "$(which sshd)" | grep -E 'libsystemd|liblzma'

# A timing check in the spirit of Andres Freund's measurement:
time ssh -i wrongkey user@localhost 2>/dev/null
# Freund's report measured ~0.3 s to rejection on a clean system
# and ~0.8 s with the backdoor (extra work in each new sshd process)
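A single `time` run is noisy, so a repeatable check benefits from taking the median of several runs and comparing it against a baseline. The sketch below is one way to do that; the function names and the factor of 2 are assumptions, not values from Freund's report.

```python
import statistics
import subprocess
import time


def median_runtime(argv, runs=5):
    """Median wall-clock seconds for argv over several runs; the exit status is ignored."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def looks_slow(measured, baseline, factor=2.0):
    """Flag a measurement that exceeds the baseline by the given factor."""
    return measured > baseline * factor
```

For example, `median_runtime(["ssh", "-o", "BatchMode=yes", "-i", "wrongkey", "user@localhost"])` times a failing login without prompting for a password, and the result can be compared with a figure taken on a host known to be clean.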

Reproduction in a closed lab

For static analysis with no risk, snapshot of Debian sid before the revert:

docker run --rm -it debian:sid-20240311-slim bash
# Inside the container. Skip apt-get update/install: the live sid archive now
# serves a fixed xz, and upgrading would replace the library under study.
dpkg -s liblzma5 | grep '^Version'  # Should show 5.6.0 or 5.6.1
strings /lib/x86_64-linux-gnu/liblzma.so.5 | grep -i 'rsa\|openssl'
# If OpenSSL symbols show up, the binary is modified

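For analysis of the m4/build-to-host.m4 that injects the payload during ./configure, use the 5.6.0 or 5.6.1 release tarball: the modified file never existed in the git history of the tukaani-project/xz repo on GitHub. The two binary test files, which were committed, can be recovered from that history.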

Mitigation — revert, not patch

The distro response was uniform: revert to 5.4.x, not patch over 5.6.x. The reasons:

5.6.0 and 5.6.1 already have the payload embedded. Fixing the m4 without replacing the binaries leaves the malicious ELF object in the library.
Jia Tan’s releases after March 2023 (including 5.4.2) could contain precursor pieces of the payload that haven’t been identified yet. The audit of releases signed by Jia Tan is still ongoing months later.
5.6.2, with Lasse Collin back in charge, ships in May 2024 with everything from Jia Tan reviewed and clean.

Operational actions:

Confirm the installed version with the commands above.
If it’s 5.6.0 or 5.6.1, immediate downgrade via the distro repository. Debian and Ubuntu publish reverted packages on 29-30 March.
Rotate sshd host keys on any server that ran 5.6.x. Even without evidence of activation, the threat model has to assume the attacker may have sent a payload to that machine and kept whatever access it gave.
Review sshd logs from 24 February to 29 March looking for connections from anomalous IPs. The backdoor doesn’t mark them as accepted, but the network activity remains.

CISA publishes an advisory on 29 March. Red Hat assigns CVE-2024-3094 and publishes analysis in its Security Center.

Lessons

Three concrete things, no moralising.

Trust in open source is trust in people, not in code. Lasse Collin maintained xz for 14 years. When an actor with time and patience comes along, offers help, improves the code and runs sockpuppets to pressure you into delegating, the attack that follows isn’t technical anymore, it’s organisational. The defence relies on governance models that don’t depend on a single human in crisis.
The tarball and the repo are different artefacts. Distros usually build from a tarball signed by the maintainer, not from git checkout. Code reviewers look at the repo. That asymmetry is where Jia Tan plants build-to-host.m4. Reproducible builds that start from the tarball produce exactly the same malicious binary; reproducible builds that start from the repo don’t.
liblzma didn’t have to end up in sshd. The sshd → libsystemd → liblzma chain is a side effect of the downstream systemd-notify patch. Every link a distro adds between a sensitive binary and indirect dependencies widens the surface. Distroless binaries or images with sshd linked from upstream weren’t vulnerable.
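The tarball-versus-repo gap from the second lesson is something a packager can check mechanically. A minimal sketch, assuming the release tarball and a git checkout of the matching tag have both been unpacked to local directories; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def tree_hashes(root):
    """Map each file's relative path under root to its SHA-256."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_trees(tarball_dir, repo_dir):
    """Return (files only in the tarball, files whose contents differ)."""
    tarball, repo = tree_hashes(tarball_dir), tree_hashes(repo_dir)
    only_in_tarball = sorted(set(tarball) - set(repo))
    differing = sorted(name for name in set(tarball) & set(repo) if tarball[name] != repo[name])
    return only_in_tarball, differing
```

Autotools tarballs legitimately contain generated files that aren't in git, so the first list is never empty; the value lies in reviewing that list rather than assuming it is harmless.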

Andres Freund doesn’t belong to a security team. PostgreSQL is a database project. The difference between xz-utils 5.6.1 reaching Ubuntu LTS and being caught in time is that someone measured a benchmark’s latency and kept questioning what the tooling showed.

References

Andres Freund, oss-security: https://www.openwall.com/lists/oss-security/2024/03/29/4
NVD CVE-2024-3094: https://nvd.nist.gov/vuln/detail/CVE-2024-3094
Russ Cox, Timeline of the xz open source attack: https://research.swtch.com/xz-timeline
Russ Cox, The xz attack shell script: https://research.swtch.com/xz-script
Sam James (thesamesam), xz-utils backdoor situation: https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27
gynvael.coldwind.pl, xz/liblzma: Bash-stage Obfuscation Explained: https://gynvael.coldwind.pl/?lang=en&id=782
Red Hat Security advisory: https://access.redhat.com/security/cve/CVE-2024-3094
Lasse Collin official note: https://tukaani.org/xz-backdoor/
Filippo Valsorda notes on the payload (Bluesky, 30 March 2024): https://bsky.app/profile/filippo.abyssdomain.expert
Wikipedia (aggregated references): https://en.wikipedia.org/wiki/XZ_Utils_backdoor
