Friday, April 21, 2017

DoublePulsar Initial SMB Backdoor Ring 0 Shellcode Analysis

One week ago today, the Shadow Brokers (an unknown hacking entity) leaked the Equation Group's (NSA) FuzzBunch software, an exploitation framework similar to Metasploit. In the framework were several unauthenticated, remote exploits for Windows (such as the exploits codenamed EternalBlue, EternalRomance, and EternalSynergy). Many of the vulnerabilities that are exploited were fixed in MS17-010, perhaps the most critical Windows patch in almost a decade.

Side note: You can use my MS17-010 Metasploit auxiliary module to scan your networks for systems missing this patch (uncredentialed and non-intrusive). If a missing patch is found, it will also check for an existing DoublePulsar infection.


For those unfamiliar, DoublePulsar is the primary payload used in SMB and RDP exploits in FuzzBunch. Analysis was performed using the EternalBlue SMBv1/SMBv2 exploit against Windows Server 2008 R2 SP1 x64.

The shellcode, in tl;dr fashion, essentially performs the following:

  • Step 0: Shellcode sorcery to determine if x86 or x64, and branches as such.
  • Step 1: Locates the IDT from the KPCR, and traverses backwards from the first interrupt handler to find ntoskrnl.exe base address (DOS MZ header).
  • Step 2: Reads ntoskrnl.exe's exports directory, and uses hashes (similar to usermode shellcode) to find ExAllocPool/ExFreePool/ZwQuerySystemInformation functions.
  • Step 3: Invokes ZwQuerySystemInformation() with the enum value SystemQueryModuleInformation, which loads a list of all drivers. It uses this to locate Srv.sys, an SMB driver.
  • Step 4: Switches the SrvTransactionNotImplemented() function pointer located at SrvTransaction2DispatchTable[14] to its own hook function.
  • Step 5: With secondary DoublePulsar payloads (such as inject DLL), the hook function sees if you "knock" correctly and allocates an executable buffer to run your raw shellcode. All other requests are forwarded directly to the original SrvTransactionNotImplemented() function. "Burning" DoublePulsar doesn't completely erase the hook function from memory, just makes it dormant.

After exploitation, you can see the missing symbol in the SrvTransaction2DispatchTable. There are supposed to be 2 handlers here with the SrvTransactionNotImplemented symbol. This is the DoublePulsar backdoor (array index 14):

Honestly, you don't usually wake up in the morning and feel like spending time dissecting ~3600 some odd bytes of Ring-0 shellcode, but I felt productive today. Also I was really curious about this payload and didn't see many details about it outside of Countercept's analysis of the DLL injection code. But I was interested in how the initial SMB backdoor is installed, which is what this post is about.

Zach Harding, Dylan Davis, and I kind of rushed through it in a few hours in our red team lab at RiskSense. There is some interesting setup in the EternalBlue exploit with the IA32_LSTAR syscall MSR (0xc000082) and a region of the Srv.sys containing FEFEs, but I will instead focus on just the raw DoublePulsar methodology... Much like the EXTRABACON shellcode, this one is crafty and does not simply spawn a shell.

Detailed Shellcode Analysis

Inside the Shadow Brokers dump you can find DoublePulsar.exe and EternalBlue.exe. When you use DoublePulsar in FuzzBunch, there is an option to spit its shellcode out to a file. We found out this is a red herring, and that the EternalBlue.exe contained its own payload.

Step 0: Determine CPU Architecture

The main payload is quite large because it contains shellcode for both x86 and x64. The first few bytes use opcode trickery to branch to the correct architecture (see my previous article on assembly architecture detection).

Here is how x86 sees the first few bytes.

You'll notice that inc eax means the je (jump equal/zero) instruction is not taken. What follows is a call and a pop, which is to get the current instruction pointer.

And here is how x64 sees it:

The inc eax byte is instead the REX preamble for a NOP. So the zero flag is still set from the xor eax, eax operation. Since x64 has RIP-relative addressing it doesn't need to get the RIP register.

The x86 payload is essentially the same thing as the x64 so this post only focuses on x64.

Since the NOP was a true NOP on x64, I overwrote the 40 90 with cc cc (int 3) using a hex editor. Interrupt 3 is how debuggers set software breakpoints.

Now when the system is exploited, our attached kernel debugger will automatically break when the shellcode starts executing.

Step 1: Find ntoskrnl.exe Base Address

Once the shellcode figures out it is x64 it begins to search for the base of ntoskrnl.exe. This is done with the following stub:

Fairly straightforward code. In user mode, the GS segment for x64 contains the Thread Information Block (TIB), which holds the Process Environment Block (PEB), a struct which contains all kinds of information about the current running process. In kernel mode, this segment instead contains the Kernel Process Control Region (KPCR), a struct which at offset zero actually contains the current process PEB.

This code grabs offset 0x38 of the KPCR, which is the "IdtBase" and contains a pointer struct of KIDTENTRY64. Those familiar with the x86 family will know this is the Interrupt Descriptor Table.

At offset 4 into the KIDENTRY64 struct you can get a function pointer to the interrupt handler, which is code defined inside of ntoskrnl.exe. From there it searches backwards in memory in 0x1000 increments (page size) for the .exe DOS MZ header (cmp bx, 0x5a4d).

Step 2: Locate Necessary Function Pointers

Once you know where the MZ header of a PE file is, you can peek into defined offsets for the export directory and get the relative virtual address (RVA) of any function you want. Userland shellcode does this all the time, usually to find necessary functions it needs out of ntdll.dll and kernel32.dll. Just like most userland shellcode, this ring 0 shellcode also uses a hashing algorithm instead of hard-coded strings in order to find the necessary functions.

The following functions are found:

ExAllocatePool can be used to create regions of executable memory, and ExFreePool can clean it up when done. These are important so the shellcode can allocate space for its hooks and other functions. ZwQuerySystemInformation is important in the next step.

Step 3: Locate Srv.sys SMB Driver

A feature of ZwQuerySystemInformation is a constant named SystemQueryModuleInformation, with the value 0xb. This gives a list of all loaded drivers in the system.

The shellcode then searched this list for two different hashes, and it landed on Srv.sys, which is one of the main drivers that SMB runs on.

The process here is basically equivalent to getting PEB->Ldr in userland, which lets you iterate loaded DLLs. Instead, it was looking for the SMB driver.

Step 4: Patch the SMB Trans2 Dispatch Table

Now that the DoublePulsar shellcode has the main SMB driver, it iterates over the .sys PE sections until it gets to the .data section.

Inside of the .data section is generally global read/write memory, and stored here is the SrvTransaction2DispatchTable, an array of function pointers that handle different SMB tasks.

The shellcode allocates some memory and copies over the code for its function hook.

Next the shellcode stores the function pointer for the dispatch named SrvTransactionNotImplemented() (so that it can call it from within the hook code). It then overwrites this member inside SrvTransaction2DispatchTable with the hook.

That's it. The backdoor is complete. Now it just returns up its own call stack and does some small cleanup chores.

Step 5: Send "Knock" and Raw Shellcode

Now when DoublePulsar sends its specific "knock" requests (which are seen as invalid SMB calls), the dispatch table calls the hooked fake SrvTransactionNotImplemented() function. Odd behavior is observed: normally the SMB response MultiplexID must match the SMB request MultiplexID, but instead it is incremented by a delta, which serves as a status code.

Operations are hidden in plain sight via steganography, which do not have proper dissectors in Wireshark.

The status codes (via MultiplexID delta) are:

  • 0x10 = success
  • 0x20 = invalid parameters
  • 0x30 = allocation failure

The opcode list is as follows:

  • 0x23 = ping
  • 0xc8 = exec
  • 0x77 = kill

You can tell which opcode was called by using the following algorithm:

t = SMB.Trans2.Timeout
op = (t) + (t >> 8) + (t >> 16) + (t >> 24);

Conversely, you can make the packet using this algorithm, where k is randomly generated:

op = 0x23
k = 0xdeadbeef
t = 0xff & (op - ((k & 0xffff00) >> 16) - (0xffff & (k & 0xff00) >> 8)) | k & 0xffff00

Sending a ping opcode in a Trans2 SESSION_SETUP request will yield a response that holds part of a XOR key that needs to be calculated for exec requests.

The "XOR key" algorithm is:

s = SMB.Signature1
x = 2 * s ^ (((s & 0xff00 | (s << 16)) << 8) | (((s >> 16) | s & 0xff0000) >> 8))

More shellcode can be sent with a Trans2 SESSION_SETUP request and exec opcode. The shellcode is sent in the "data payload" part of the packet 4096 bytes at a time, using the XOR key as a basic stream cipher. The backdoor will allocate an executable region of memory, decrypt and copy over the shellcode, and run it. The Inject DLL payload is simply some DLL loading shellcode prepended to the DLL you actually want to inject.

We can see the hook is installed at SrvTransaction2DispatchTable+0x70 (112/8 = index 14):

And of course the full disassembly listing.


There you have it, a highly sophisticated, multi-architecture SMB backdoor. The world probably did not need a remote Windows kernel payload this advanced being spammed across the Internet. It's an unique payload, because you can infect a system, lay low for a little bit, and come back later when you want to do something more intrusive. It also finds a nice place in the system to hide out and not alert built-in defenses like PatchGuard. It is unclear if newer versions of PatchGuard, such as those in Windows 10, already detect this hook. We can expect them to be added if not.

Usually we only get to see kernel shellcode in local exploits, as it swaps process tokens in order to privilege escalate. However, Microsoft does many networking things in the kernel, such as Srv.sys and HTTP.sys. The techniques demonstrated are in many ways completely analagous to how usermode shellcode operates during remote exploits.

If/when this gets ported over to Metasploit, I would probably not copy this verbatim, and rather skip the backdoor idea. It isn't the most secure thing to do, as it's not a big secret anymore and anyone else can come along and use your backdoor.

Here's what can be done instead:

  1. Obtain ntoskrnl.exe address in the same fashion as DoublePulsar, and read export directory for necessary functions to perform the next operations.
  2. Spawn a hidden process (such as notepad.exe).
  3. Queue an APC with Meterpreter payload.
  4. Resume process, and exit the kernel cleanly.

Every major malware family, from botnets to ransomware to banking spyware, will eventually add the exploits in the FuzzBunch toolkit to their arsenal. This payload is simply a mechanism to load more malware with full system privileges. It does not open new ports, or have any real encryption or other features to prevent others from taking advantage of the same hole, making the attribution game for digital forensic investigators even more difficult. This is a jewel compared to the scraps that were given to Stuxnet. It comes in a more dangerous era than the days of Conficker. Given the persistence of the missing MS08-067 patch, we could be in store for a decade of breaches emanating from MS17-010 exploits. It is the perfect storm for one of the most damaging malware infections in computing history.


  1. Thanks for your analysis, very helpful.

  2. Thank u for sharing. Interesting post!

  3. Great RE work, thank you for your really helpful input.

  4. Great writing, I like how the last paragraph is very relevant right now.

    1. Thank you. I was trying to warn clients and others for weeks, but many of the people I talked to were skeptical or brushed it off. I think yesterday's attacks are only the beginning, but at least it's being talked about in the mainstream now.

  5. Interesting, thanks for the breakdown. It's a starting point, kinda suprised they did not use better obfuscation methods...

  6. Wow such a detailed write up. Though I understand very less I'm still impressed and wish that one day I'll be at your level zerosum0x0. Could you give a noob like me some tips to become better?

    1. Everyone starts at 0. Practical Malware Analysis is a great book with lots of hands on labs. I'm also writing a book, but more details on that later.

  7. Hi all,
    I am one of those caught by WannaDecryptor.

    I just managed to decrypt a few files of my choosing and wanted to share the info with the community so that better minds might be able to find a solution.

    I read elewhere that the f.wnry file contains a list of files that WannaDecryptor will decrypt for free to show their 'good will'. Well, I replaced the list with a list of my choosing, and I got it to decrypt some of the files !

    It stopped after the first few files, not sure it because there is an internal counter or because the file names we're non standard (spaces, french characters), but I at least got back those few files. And considering they are my email archives for past 3 years, I am euphoric.

    Hope this will help someone find a solution.

    Best of luck.

  8. Zerosum, I am trying to find out, what privileges uses EternalBlue to execute DoublePulsar DLL on the target machine. EternalBlue and DoublePulsar code is transferred to the kernel memory of the target machine, and next the code is extracted and dropped to disk in the form of DLL files. So, it seems that everything is processed in the Ring 0. But, the SMB1 vulnerability used by EternalBlue is classified as remote code execution, without mentioning privilege escalation.

    1. The remote code execution of EternalBlue allows a non-user to run an arbitrary payload in kernel mode. The FuzzBunch version installs DoublePulsar (whereas the Metasploit version does not). DoublePulsar can inject DLLs into a process (by default lsass.exe, which runs as SYSTEM). However, I believe these DLLs are fixed up in such a way that they never drop to disk, they are loaded in memory through reflective injection.

  9. How does EternalBlue get the first code part running in ring 0? For me it is not really clear what the exploit is, to get code from the user mode or remote into the ring 0 running?

    1. Your confusion is justified. Microsoft put numerous things in the kernel that you wouldn't expect to be there. Among those are some network services such as SMB, which lives in a driver called Srv.sys. EternalBlue takes advantage of a mathematical miscalculation leading to buffer overflow in Srv.sys, which is running in ring 0.

  10. I see. Would be interesting to see some code snippets about this "buffer overflow". It is just fascinating how this overflow can deterministically hit the right part of a code section. Further it surpises me that no processor safety mechanism throws an exception when code memory is overwritten by data?

    1. There are many exploit mitigations, such as DEP and ASLR, which could have prevented successful exploitation of EternalBlue. The problem is the Windows kernel, even on Windows 10, still contains fixed, static memory addresses of read-write-executable regions.