In my last post, “Really easy speed-testing”, I discussed the fastest method of duplicating an array. I mentioned, however, that copying an array that contains complex types is an entirely different thing and requires a different approach.
By “complex types” I mean things that are reached through pointers, as opposed to primitive types such as numbers and strings, which involve no pointers. When primitive types are passed to a function they’re copied; the copy keeps no link to the original. Complex types include Arrays, Objects and Functions; these are all objects.
In my previous post I mistakenly mentioned “pass-by-reference” which JavaScript doesn’t actually support. Complex types like literal objects and arrays are not “passed-by-reference”; their pointers are passed by value and those pointers point to the respective object; this will hopefully become clearer if you carry on reading.
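A minimal sketch of what “pointers passed by value” means in practice. The function names reassign and mutate are made up for illustration: re-pointing a parameter inside a function has no effect on the caller, while changing the object that the pointer leads to does.

```javascript
// Hypothetical helpers, for illustration only.
function reassign(list) {
    // Only the function's local copy of the pointer is re-pointed.
    list = ['new'];
}

function mutate(list) {
    // The local pointer still leads to the caller's array,
    // so this change is visible outside the function.
    list[0] = 'changed';
}

var original = [1, 2, 3];

reassign(original);
var afterReassign = original.slice(); // snapshot: still [1, 2, 3]

mutate(original);
var afterMutate = original.slice();   // snapshot: ['changed', 2, 3]
```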
Say, for example, I have an array of arrays; something like this:
    var arr1 = [ [1,2,3] , [1,2,3] , [1,2,3] ];
There are a total of four Array objects there, and each of them contains values; some primitive and some not. In this situation arr1 is merely a pointer to an object which exists somewhere in memory. So, one object exists, but it can have multiple pointers; here’s another “pointer”:
    var arr2 = arr1;
arr2 now points to the object; not to arr1 itself, but to the object that arr1 refers to. So we have two variables pointing to the same thing; this is not possible with primitive types. For example:
    var X = 5;
    var Y = X;
Changing the value held by either Y or X will not have any effect on the other; there are no pointers involved. If we go back to our array example, we can try changing a value within the array:
    // Continuing from where we left off (above)
    arr2[0] = 'replaced';
    // arr2 = [ 'replaced' , [1,2,3] , [1,2,3] ];
    // arr1 = [ 'replaced' , [1,2,3] , [1,2,3] ];
    // arr2 === arr1; // This is TRUE
Because both arr1 and arr2 point to the same object, the above results should make perfect sense. When you do something to arr2[0] you’re also doing something to arr1[0].
Copying the array
So, if we want to create a copy of arr1 we can’t just give it another pointer; we have to create a new array and fill it with the contents of the first array. This sounds quite simple, and it is; have a look:
    var theCopy = []; // A new empty array
    for (var i = 0, len = arr1.length; i < len; i++) {
        theCopy[i] = arr1[i];
    }
    // "theCopy" = [ 'replaced' , [1,2,3] , [1,2,3] ]
So, we’ve created a new object called theCopy. The first value within this array is the string ‘replaced’, and this is an independent copy of the original arr1[0]. But within theCopy we also have two other values: two more arrays, both of which are objects that exist somewhere in memory (only accessible by pointers). When we attempted to copy arr1 we didn’t successfully copy those two inner arrays; all we did was create two more pointers: theCopy[1] and theCopy[2] now point to the same objects as arr1[1] and arr1[2]. This is not what we want.
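To see the shared pointers for yourself, here is a small self-contained sketch (the variable names source and shallow are assumptions, not part of the example above): change an inner array through the copy and the original changes too.

```javascript
var source = [ 'replaced', [1,2,3], [1,2,3] ];

// Shallow copy: the same kind of loop as before.
var shallow = [];
for (var i = 0, len = source.length; i < len; i++) {
    shallow[i] = source[i];
}

var sameOuter = (shallow === source);       // false: the outer array is new
var sameInner = (shallow[1] === source[1]); // true: the inner array is shared

// Writing through the copy also changes the original's inner array.
shallow[1][0] = 'oops';
// source[1] is now [ 'oops', 2, 3 ] as well
```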
So, to successfully copy the entire array we’ll have to create new arrays for each of the inner arrays and copy them in the same manner; by looping through all values. This can become a very expensive process if you continue looping through each array or object as it appears. But unfortunately, this is the reality of it; there’s no other way. We can limit repeating ourselves by using recursion, but the continual looping still occurs behind the facade of consecutive calling.
     1  function deepCopy(obj) {
     2      if (Object.prototype.toString.call(obj) === '[object Array]') {
     3          var out = [], i = 0, len = obj.length;
     4          for ( ; i < len; i++ ) {
     5              out[i] = arguments.callee(obj[i]);
     6          }
     7          return out;
     8      }
     9      if (typeof obj === 'object') {
    10          var out = {}, i;
    11          for ( i in obj ) {
    12              out[i] = arguments.callee(obj[i]);
    13          }
    14          return out;
    15      }
    16      return obj;
    17  }
The first IF statement (line 2) simply checks whether the passed obj is an array, and if it is, the block commences: a new array is created (line 3) and is filled with values from the original array. Before each assignment (line 5) the value is first passed to arguments.callee, which refers to the deepCopy function itself (this is what recursion is). If the value passed to deepCopy is a primitive type it won’t pass either of the IF statements and so it will simply be returned (line 16). If, however, it is an object or an array, then the respective IF block executes, thus copying the object. Any sub-arrays or sub-objects (or sub-sub-sub-objects) will be treated in the same manner, which is why it’s called a “deep copy”.
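One caveat worth knowing: arguments.callee is not available in strict mode (accessing it throws a TypeError), so in newer code a variant that recurses by name may be preferable. Below is a sketch of such a variant with a quick check that the copy really is independent; the name deepCopyNamed and the sample data are assumptions for illustration.

```javascript
function deepCopyNamed(obj) {
    var out, i, len;
    if (Object.prototype.toString.call(obj) === '[object Array]') {
        out = [];
        for (i = 0, len = obj.length; i < len; i++) {
            out[i] = deepCopyNamed(obj[i]); // recurse by name instead of arguments.callee
        }
        return out;
    }
    // Like the function above, this does not special-case null.
    if (typeof obj === 'object') {
        out = {};
        for (i in obj) {
            out[i] = deepCopyNamed(obj[i]);
        }
        return out;
    }
    return obj; // primitives (and functions) are returned as-is
}

var nested = [ [1,2,3], { a: [4,5] } ];
var copy = deepCopyNamed(nested);

// Change the copy at two different depths.
copy[0][0] = 'changed';
copy[1].a.push(6);
// nested[0][0] is still 1 and nested[1].a is still [4, 5]
```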
The point?
So, why would you want to copy an array or object; what’s the point!?
Whenever you want to manipulate the data held in a data-set without affecting the original data-set you’ll first want to make a copy of it. This is a common requirement; almost every well-constructed jQuery plugin uses this method to merge default settings with user-defined settings. There are other situations where copying objects is a requirement but they’re not all that common. This post was just to show you the commonly overlooked details associated with copying objects and arrays in JavaScript; it’s useful to know…
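As a rough illustration of the settings-merging case, here is a plain-JavaScript sketch. jQuery plugins typically use $.extend for this; the clone and mergeSettings helpers and the option names below are hypothetical stand-ins. The defaults are copied first so that nothing done to the merged settings can leak back into them.

```javascript
// Hypothetical helper: recursively copies plain objects and arrays.
function clone(value) {
    if (value === null || typeof value !== 'object') { return value; }
    var out = (Object.prototype.toString.call(value) === '[object Array]') ? [] : {};
    for (var key in value) {
        if (Object.prototype.hasOwnProperty.call(value, key)) {
            out[key] = clone(value[key]);
        }
    }
    return out;
}

// Hypothetical helper: copy the defaults, then lay the user's options on top.
// Each option replaces the default with the same name outright.
function mergeSettings(defaults, options) {
    var settings = clone(defaults);
    for (var key in options) {
        if (Object.prototype.hasOwnProperty.call(options, key)) {
            settings[key] = clone(options[key]);
        }
    }
    return settings;
}

var defaults = { speed: 400, colors: ['red', 'blue'] };
var settings = mergeSettings(defaults, { speed: 200 });

settings.colors.push('green');
// defaults.colors is still ['red', 'blue'] and defaults.speed is still 400
```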
Thanks for reading! Please share your thoughts with me on Twitter. Have a great day!
I’ve been coding in ColdFusion for so long, I totally forgot that arrays are passed by reference in Javascript. ColdFusion passes them by value. Thanks for the reminder and the cool post!
On a side note, I’ve noticed that you always use arguments.callee rather than ever referring to the name of the function. I like your style – low coupling. Good stuff.
@Ben, in some respects JavaScript does pass by value; I think there’s a lot of confusion; I thought it was pass-by-reference but apparently calling it that is misleading: http://javadude.com/articles/passbyvalue.htm (Even though the article is about Java it applies to JavaScript). If you try “The Litmus Test” in JavaScript you’ll see it doesn’t pass. The author claims it’s better to say that “Object references are passed by value” instead of “Objects are passed by reference”.
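For anyone who wants to try it, here is a sketch of the kind of swap test the linked article describes, written in JavaScript (the variable names are made up). If objects were truly passed by reference, the function would be able to swap the caller’s variables; it can’t.

```javascript
function swap(a, b) {
    var temp = a;
    a = b;
    b = temp; // only the function's local pointers are exchanged
}

var first = { name: 'first' };
var second = { name: 'second' };

swap(first, second);
// first.name is still 'first'; the caller's variables were not swapped
```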
Thanks for your comment! 🙂
In the article you refer to Functions as complex types, but in the deepCopy they are not referenced. Why? How are functions copied?
Thanks for your posts, always useful.
@Strx, functions are quite tricky. Like regular objects, functions exist in memory and when you assign one to a variable you’re giving it a pointer, and every time you assign a function to a new variable all you’re doing is creating a new pointer to the same function.
Functions generally don’t contain data or anything unique for that matter, so there’s nothing to copy; their functionality cannot be changed in any way once the function has been created. So you can essentially copy a function (although not really “copy”) just by re-assigning it to a different variable.
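A minimal sketch of that re-assignment, using the hypothetical names greet and alias:

```javascript
var greet = function (name) {
    return 'Hello, ' + name;
};

// No new function is created; "alias" is a second pointer to the same one.
var alias = greet;

var samePointer = (alias === greet); // true
var result = alias('world');         // 'Hello, world'
```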
But note that, because functions are objects, unique properties can be added and if that happens then you have to treat it like a regular object when copying.
Unfortunately, deep copying a complex structure can be much more complicated than this, depending on the level of uniqueness and precision you need from the original vs the copy. A few examples: objects storing DOM elements, maintaining prototype relationships to protect instanceof, and using functions as namespaces.
On a more focused note, the final function doesn’t account for null, which is (sadly) typeof ‘object’. Also, you can use for..in for both arrays and objects.
This will handle sparse arrays and cases of arrays treated as objects (e.g. var a = []; a.dontDoThis = 'you should use an Object').
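A sketch of how both suggestions might look together. This is a guess at the shape rather than the commenter’s own code, and deepCopySafe is a hypothetical name: null is returned as-is, and for..in is used for arrays and objects alike so that only keys which actually exist are copied.

```javascript
function deepCopySafe(obj) {
    // null is typeof 'object', so rule it out along with the primitives.
    if (obj === null || typeof obj !== 'object') {
        return obj;
    }
    var isArray = Object.prototype.toString.call(obj) === '[object Array]';
    var out = isArray ? [] : {};
    for (var key in obj) {
        // for..in already skips holes in sparse arrays;
        // this check additionally skips inherited properties.
        if (Object.prototype.hasOwnProperty.call(obj, key)) {
            out[key] = deepCopySafe(obj[key]);
        }
    }
    if (isArray) {
        out.length = obj.length; // preserve trailing holes
    }
    return out;
}

var sparse = [1, , 3];
sparse.dontDoThis = 'you should use an Object';

var sparseCopy = deepCopySafe(sparse);
// sparseCopy.length is 3, index 1 is still a hole, and dontDoThis is copied

var nullCopy = deepCopySafe({ value: null });
// nullCopy.value is null rather than {}
```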