Hardware bottleneck when copying files?

Rocco123

Limp Gawd
Joined
Jan 24, 2006
Messages
481
If anyone has a good technical explanation of why this occurs, that would be great. I sort of already know the answer, but not the dirty-technical one.

Say I'm copying files from one disk to another, 30GB worth. What is the technical explanation of why it takes longer to copy a 30GB chunk made up of many small files than a 30GB chunk made up of one or two files? Is this a Windows limitation, or a disk limitation?

In this case, I have a 30GB folder containing 1.9 million files. Lots and lots of stupid small text files. We are noticing that this sometimes takes days to copy from one server to another. Is it just me, or is 1.9 million files just a staggering amount?
 
Two reasons, probably. The first is the file system; it feels free to allocate files wherever it's got room. So little files, even in the same directory, get scattered across the disk, and that means extra seeks to retrieve each one. Second is the network protocol; if you're sending one big file you just send data, but with little files you have to send metadata for every single one. What's the name of that file? What are its permissions? Each file also costs its own open/create/close round trip on the destination.
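You can see the per-file overhead yourself with a scaled-down test: the same amount of data copied as a thousand small files versus one file (all paths here are throwaway examples):

```shell
# Scaled-down demo of per-file overhead: ~1 MB as 1000 small files
# vs. ~1 MB as a single file (paths are throwaway examples).
mkdir -p /tmp/demo/many /tmp/demo/one
for i in $(seq 1 1000); do head -c 1024 /dev/zero > /tmp/demo/many/f$i.txt; done
head -c 1048576 /dev/zero > /tmp/demo/one/big.bin   # same ~1 MB, one file

time cp -r /tmp/demo/many /tmp/demo/copy_many   # 1000 opens/creates/closes
time cp -r /tmp/demo/one  /tmp/demo/copy_one    # one open, one sequential stream
```

Scale the counts up toward 1.9 million and the gap gets dramatic, because the per-file cost dwarfs the data cost.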

The best way around this is to stop using small files :p Failing that, try using tar, cpio, or whatever to pack a bunch of files into one file, then send that, then unpack it on the remote end. If you're on Unix, you can probably do this with pipes and not use 2N space on both ends; on Windows you're not so lucky.
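A minimal sketch of the pack-and-stream idea (directory names, user, and hostname are all placeholders): tar writes the archive to stdout, so it never touches disk on either end.

```shell
# Local disk-to-disk: one tar packs, the other unpacks, piped together.
tar cf - -C /mnt/source bigfolder | tar xf - -C /mnt/dest

# Same trick over the network via ssh -- no intermediate archive anywhere:
tar cf - -C /mnt/source bigfolder | ssh user@server 'tar xf - -C /srv/dest'
```

The win is that the destination sees one sequential stream instead of 1.9 million individual file negotiations; the per-file metadata is just data inside the archive.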
 
Rocco123 said:
I have a 30GB folder that is comprised of 1.9 million files.

That's a lot of files... I just checked my desktop, notebook, htpc, and server and between the 4 I've only got about 800K files (over about 1.9TB) and my guess is that most of those are temp files from the browser cache (since almost 500K are from my desktop).
 