shareVM- Share insights about using VM's

Simplify the use of virtualization in everyday life

Posts Tagged ‘DropBox’

DropBox dedup only in the cloud

with 3 comments

I had observed in my earlier article that DropBox performs de-duplication in the cloud. This would mean that de-duplication is not performed at the client. In order to test my hypothesis, I performed the following experiment:

I first looked at the size of the DropBox folder on Windows and found it to be 1,723,871,232 bytes.

Next, in the DropBox client, I opened the DropBox folder and duplicated the contents of the Public folder by copying the 1.68GB file and pasting it as its copy. I looked at the size of the folder once again and it had doubled to 3,446,513,664 bytes.

If DropBox had been performing dedup at the client, then it should have detected the duplicate blocks between the original file and its copy at the source, and the folder should not have grown in size at all. As a result, my conclusion is that DropBox dedups only in the cloud, not at the client.
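The before/after measurement can also be scripted instead of being read off the folder properties. Here is a minimal sketch in Python; the helper name `folder_size_bytes` is my own, and it sums logical file sizes, so the result may differ slightly from the "size on disk" figure the OS reports:

```python
import os

def folder_size_bytes(root):
    """Sum the sizes of all regular files under root (hypothetical helper)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # skip symlinks to avoid double counting
                total += os.path.getsize(path)
    return total
```

Calling it on the DropBox folder once before and once after the copy gives the two byte counts to compare.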

Wait, there’s more:

I repeated the same experiment on the Mac after deleting the duplicate file. Here’s what I started out with:

Last login: Thu Sep 17 15:30:58 on ttys000
mace1s:~ paule1s$ du -k DropBox
1152 DropBox/Photos/Sample Album
1516 DropBox/Photos
1682636 DropBox/Public
368 DropBox/sharevm
1684880 DropBox

Notice that the total size of the folder (the last line of the listing above) is 1.68GB.

Next, in the DropBox client, I opened the DropBox folder and duplicated the contents of the Public folder by copying the 1.68GB file and pasting it as its copy. I looked at the size of the folder once again and saw:

mace1s:~ paule1s$ du -k DropBox
1152 DropBox/Photos/Sample Album
1516 DropBox/Photos
2600140 DropBox/Public
368 DropBox/sharevm
2602384 DropBox

This is very interesting. I had expected the storage requirements to double to 3,369,760 KB; however, they grew by only about 0.9GB (917,504 KB). What happened to the remaining 765MB or so? Did the DropBox client truncate the file? If so, why?
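For reference, the figures can be checked directly from the two `du -k` listings. This is a small sketch, assuming the Public folder holds essentially only the duplicated file:

```python
# Totals in KB, taken from the last line of each `du -k` listing above.
before_kb = 1684880
after_kb = 2602384
public_before_kb = 1682636  # DropBox/Public before the copy (assumed to be ~the file size)

growth_kb = after_kb - before_kb            # what was actually added on disk
missing_kb = public_before_kb - growth_kb   # what a complete copy would still need

print(growth_kb, missing_kb)
```

The growth comes to 917,504 KB, which is exactly 896 × 1024 KB. A figure that round might mean the copy had not finished when `du` ran, though that is only a guess.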

Readers, can you shed some light?

Written by paule1s

September 17, 2009 at 5:34 pm

Compressed VM file transfer using DropBox

with 2 comments

I am using DropBox for transferring compressed files, including VMs, between my environment at home (a Mac running Windows XP SP3 in VMware Fusion 2.0.5) and the test machine, a Windows XP SP3 system located in the office lab. Each machine has a DropBox folder linked to the same account.

Neat product!

I love the simplicity and ease of use. A lot of thought has gone into making the product easy to install. The integration with the host OS (Windows and Mac) is seamless and sets a benchmark for how UIs for downloadable products should be designed.

Usage model

I compress each file using the Mac’s native file compression and drop it into my DropBox folder. DropBox seems to follow a two-step file transfer process:

  1. It first uploads the file completely from the source DropBox folder to the DropBox folder in the cloud.
  2. After the upload is complete, the file is then downloaded from the DropBox folder in the cloud to the destination DropBox folders.

Setup

Speed ratings are from here. I have been able to correlate these speeds with the end-to-end transfer times.

Transfer Type | Speed Rating for my ISP | Observed DropBox Transfer Rate
Upload | 120 KB/sec | 70 KB/sec
Download | 360 KB/sec | 210 KB/sec

Near real-time transfer for uncompressed files

DropBox transfers uncompressed files almost instantaneously between the two machines. The files are transferred sequentially and seem to arrive in order. For example, I transferred a 1.72 GB folder containing 400 photographs and the photos started appearing sequentially 10 – 15 seconds apart.

Compressed files

Compressed files are transferred as a unit, although dedup applies to the blocks contained within them. The transfer times are as recorded below:

Original Size | Compressed Size | Upload Time | Download Time | Total Time
4.30 GB | 1.6800 GB | 6h 40m | 2h 12m | 8h 52m
2.15 GB | 0.6714 GB | 2h 27m | 0h 48m | 3h 15m
1.10 GB | 0.2371 GB | 0h 56m | 0h 18m | 1h 14m
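The largest transfer (4.30 GB original, 1.68 GB compressed) is consistent with the observed rates from the Setup section. The estimator below is a rough sketch only: it assumes decimal units (1 GB = 1,000,000 KB), a constant rate, and a download that starts only after the upload completes, and the function names are mine:

```python
def transfer_seconds(size_gb, rate_kb_per_sec):
    """Seconds to move size_gb at a constant rate, using 1 GB = 1,000,000 KB."""
    return size_gb * 1_000_000 / rate_kb_per_sec

def format_hm(seconds):
    """Format a duration as 'Hh MMm', rounded to the nearest minute."""
    minutes = round(seconds / 60)
    return f"{minutes // 60}h {minutes % 60:02d}m"

def end_to_end(size_gb, up_rate=70, down_rate=210):
    """Upload followed by download, per the two-step model described earlier."""
    up = transfer_seconds(size_gb, up_rate)
    down = transfer_seconds(size_gb, down_rate)
    return format_hm(up), format_hm(down), format_hm(up + down)

print(end_to_end(1.68))
```

For 1.68 GB this gives roughly 6h 40m up and 2h 13m down, within a minute of the recorded times. The 0.6714 GB transfer finished somewhat faster than this simple model predicts.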

Dedup works well with compressed files

DropBox examines the file to be transferred and builds an index of blocks to be transferred. Its de-duplication technology is smart enough to figure out when not to transfer blocks that are duplicates, i.e., have already been transferred before. For example, when I tried to transfer two clones, the first one took a long time to transfer (a few hours), but the second transfer was very rapid (under five minutes).
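This behavior is consistent with block-level de-duplication by content hash. What follows is a minimal sketch of the idea in Python, not DropBox's actual implementation: the 4 MB block size, the use of SHA-256, and the names `split_blocks` and `blocks_to_upload` are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # assumed block size, for illustration only

def split_blocks(data, block_size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def blocks_to_upload(data, known_hashes, block_size=BLOCK_SIZE):
    """Return the blocks whose hashes are not yet in the index.

    known_hashes is a set standing in for the server-side block index;
    it is updated as new blocks are "uploaded".
    """
    pending = []
    for block in split_blocks(data, block_size):
        digest = hashlib.sha256(block).hexdigest()
        if digest not in known_hashes:
            known_hashes.add(digest)
            pending.append(block)
    return pending
```

Running a VM image and then its clone through `blocks_to_upload` with the same index would send each distinct block once for the first file and nothing for the second, which would match the hours-versus-minutes difference I saw.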

Since I am using the free account, I deleted a 2GB VM from my DropBox folder in order to begin my next transfer. I was pleasantly surprised to see that the next VM transfer was very rapid. I suspect this was because the VM that was transferred earlier was still residing in DropBox’s cache even though I had deleted it, so that DropBox discovered common/duplicate blocks and did not upload them from my Mac.

Summary

Nifty tool. Love it. Will use it a lot.

A few feature requests

  • Subfolders: I would like to organize the files by date and category.
  • Timers: I would like to time the uploads and downloads easily.
  • Profiling: I would like it to profile my usage and suggest how long an end-to-end transfer will take.
  • Speed up compressed file transfers: improve my effective transfer rate from ~60% to ~80%. I would like to saturate the available bandwidth for uploads and downloads.

Thanks 🙂

Written by paule1s

September 13, 2009 at 5:42 pm

DropBox: Cloud service for storing, syncing, sharing files

I found Dropbox, a nifty service for storing files online, keeping their copies on several of your own computers in sync, or sharing some of them with your friends.

  • You download the Dropbox client (supported on Windows XP and Vista (32 and 64-bit), Mac OS X Tiger and Leopard, as well as Ubuntu 7.10+ and Fedora Core 9+).
  • It comes with 2GB of free storage.
  • You can then drag and drop files that you want to store online or share into the Dropbox.
  • Dropbox maintains a snapshot of files.
  • If any of the files get updated, it sends only the blocks that have changed.
  • It also offers the ability to undelete and restore files from the copies that are stored online.
  • You can create Public folders for sharing; files in Public folders have URLs that you can share with your friends.
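The "sends only the blocks that have changed" point can be pictured as comparing per-block hashes of the old and new versions of a file. This is a rough sketch, assuming fixed-size blocks and SHA-256 (both are my assumptions, since nothing here says how Dropbox detects changes), with `changed_block_indexes` as a hypothetical helper name:

```python
import hashlib

def block_hashes(data, block_size):
    """Hash each fixed-size block of a byte string."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]

def changed_block_indexes(old, new, block_size=4 * 1024 * 1024):
    """Indexes of blocks in `new` that differ from, or do not exist in, `old`."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [i for i, h in enumerate(new_h)
            if i >= len(old_h) or h != old_h[i]]
```

With fixed-size blocks, inserting a byte near the start of a file shifts every later block, so tools such as rsync use rolling checksums instead; this sketch ignores that case.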

While the company seems to be consumer-focused, the service is also usable for dull and boring corporate stuff, like instantaneous automatic backups of files that change, and it enables disaster recovery as well.

Someone has used Dropbox for syncing and sharing VMs. This is an interesting use case; however, readers should pay heed to the transfer times as image sizes grow.

Written by paule1s

April 8, 2009 at 12:38 pm