shareVM - Share insights about using VMs

Simplify the use of virtualization in everyday life

DropBox dedup only in the cloud


I had observed in my earlier article that DropBox performs de-duplication in the cloud. That suggested to me that de-duplication is not performed at the client. To test this hypothesis, I performed the following experiment:

I first looked at the size of the DropBox folder on Windows and found it to be 1,723,871,232 bytes.
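A measurement like this can also be scripted. The following is a minimal sketch, not what I actually ran: the function name `folder_size_bytes` is my own, and it sums the apparent size of every regular file, which is not necessarily the same as the space allocated on disk that a tool such as `du` reports.

```python
import os


def folder_size_bytes(root):
    """Sum the apparent sizes (in bytes) of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so a link is not counted as a second copy.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```

Calling `folder_size_bytes` on the DropBox folder before and after the copy would give the two numbers to compare.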

Next, in the DropBox client, I opened the DropBox folder and duplicated the contents of the Public folder by copying the 1.68GB file and pasting a copy next to it. I looked at the size of the folder once again and it had roughly doubled, to 3,446,513,664 bytes.

If DropBox had been performing dedup at the client, it should have detected the duplicate blocks shared by the original file and its copy at the source, and the folder should not have grown in size at all. My conclusion, therefore, is that DropBox dedups only in the cloud and not at the client.
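To make the idea of "detecting duplicate blocks" concrete, here is a minimal sketch of block-level dedup by content hash. It is only an illustration: the 4 MB block size, the choice of SHA-256 and the function names are my assumptions, and nothing here is taken from how DropBox actually works.

```python
import hashlib

# Assumed block size, chosen purely for illustration.
BLOCK_SIZE = 4 * 1024 * 1024


def block_hashes(data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks and return one SHA-256 digest per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def count_blocks(files, block_size=BLOCK_SIZE):
    """Return (total_blocks, unique_blocks) across a list of file contents."""
    total = 0
    unique = set()
    for data in files:
        hashes = block_hashes(data, block_size)
        total += len(hashes)
        unique.update(hashes)
    return total, len(unique)
```

With a file and its exact copy, the total block count doubles while the number of unique blocks stays the same. A store that keeps only unique blocks would therefore not grow when the copy is added.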

Wait, there’s more:

I repeated the same experiment on the Mac after deleting the duplicate file. Here’s what I started out with:

Last login: Thu Sep 17 15:30:58 on ttys000
mace1s:~ paule1s$ du -k DropBox
1152 DropBox/Photos/Sample Album
1516 DropBox/Photos
1682636 DropBox/Public
368 DropBox/sharevm
1684880 DropBox

Notice that the total size of the folder (the last line of the listing above) is 1.68GB.

Next, in the DropBox client, I opened the DropBox folder and duplicated the contents of the Public folder by copying the 1.68GB file and pasting a copy next to it. I looked at the size of the folder once again and saw:

mace1s:~ paule1s$ du -k DropBox
1152 DropBox/Photos/Sample Album
1516 DropBox/Photos
2600140 DropBox/Public
368 DropBox/sharevm
2602384 DropBox

This is very interesting. I had expected the storage requirements to roughly double, to about 3,369,760 KB; however, they grew by only 917,504 KB, a little under 1GB. What happened to the remaining 765MB or so? Did the DropBox client truncate the file? If so, why?
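The figures in the two listings can be checked with a few lines of arithmetic. This sketch simply copies the numbers from the `du -k` output, which reports sizes in 1024-byte blocks; the variable names are mine.

```python
# Sizes in 1 KB (1024-byte) blocks, copied from the two `du -k` listings.
public_before = 1682636
public_after = 2600140
total_before = 1684880
total_after = 2602384

# What the copy actually added to the Public folder.
growth = public_after - public_before

# What the total would have been if the whole file had been duplicated.
expected_total = total_before + public_before

# How much of the file appears to be missing from the copy.
shortfall = public_before - growth

print(f"growth:         {growth} KB ({growth / 1024:.1f} MB)")
print(f"expected total: {expected_total} KB")
print(f"shortfall:      {shortfall} KB ({shortfall / 1024:.1f} MB)")
```

The growth works out to 917,504 KB, which is exactly 896 × 1024 KB. Whether that round number means anything (for example, that the measurement was taken while the copy was still being written) is only a guess on my part.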

Readers, can you shed some light?

Written by paule1s

September 17, 2009 at 5:34 pm

3 Responses


  1. Well, I’m using dropbox on windows (mainly) and mac. But on windows I’m using several Junctions (sysinternals:junction) to link to folders outside of my dropbox.

    What I think would be really cool is the ability to create junctions INSIDE of dropbox, where you can access the same files by using several different folders.

    Sort of like soft links for unix.

    What I worry about with creating junctions inside dropbox is: 1) would it reduce my dropbox storage capacity? 2) do I have to make the junctions on every machine I link to dropbox so that dropbox doesn’t put the folders on that machine twice?

    Will

    September 22, 2009 at 11:21 am

    • I’ve done the same thing in Linux – my dropbox directory consists of a bunch of soft symlinks to other directories. Of course for this to work correctly you need to manually create these symlinks on all attached computers before linking to the dropbox account. The client program follows the symlinks in the way you’d expect it to (so to answer your question, yes you’d need to create the symlinks on every machine to their equivalent location).

      Lerneaen Hydra

      November 3, 2009 at 6:29 pm

  2. HELLo!

    Well, seeing as the DB folder is a normal _local_ folder on your HDD, I don’t quite see how DB would be able to do dedup on it. After all, copying/moving files on your local HDDs is completely controlled by your OS.

    Will: You can use junctions inside the DB folder, but due to the way Windows works DB will not be notified about real-time changes. Meaning you’ll have to restart DB, in order to get it to detect that the files/folders have changed.

    To your unanswered worry:
    1. Of course it would, you’re still syncing the files aren’t you?

    Happy syncin’!

    Hellkeepa

    December 11, 2009 at 2:35 pm

