I needed to copy files generated by doxygen from one directory into another for a large open-source C++ project. Sadly there were too many files in the directory, so bash started complaining 🙁 Both cp and rsync died with the error “Argument list too long”. Initially I figured I could regenerate everything from scratch in the new location, but it was quicker and easier to use a for loop to rsync the files over 🙂
Some info:
- all files start with alphabetic characters
- there are no spaces in the names
- all files are in a single directory
I realized bash expansion would work here.
Using for loop
    for x in {a..b}
    do
        echo $x*
    done

Notice I only stepped between a and b because I didn’t want to sit there for an hour while it listed all the files. This worked well: it listed the matching files, and I was sure it would suit my purposes. Now the real deal!
    for x in {a..z}
    do
        echo $x
        rsync -az /backups/doxygen/$x* /home/user/current/directory/
    done

Sometimes you might still get the error even for a single letter. For example, I still had too many files starting with D and Q, so I just changed where I globbed:
    for x in {a..z}
    do
        echo $x
        rsync -az /backups/doxygen/d$x* /home/user/current/directory/
    done

This lets me iterate a through z again, but only over files whose names start with the letter d. Now what if you happen to have files starting with numbers? Simply switch the letters for numbers.
    for x in {0..9}
    do
        echo $x
        rsync -az /backups/doxygen/$x* /home/user/current/directory/
    done

You can use any other command you need in place of rsync, like mv, cp, mkdir, or any custom command.
    for x in {a..z}
    do
        echo $x
        mv /backups/doxygen/$x* /home/user/current/directory/
    done

Globbing
Now if you don’t want to use for loops, you can glob them in a one-liner like so:
    ls /backups/doxygen/[x-z]*

And the actual command using cp and globbing:
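    cp -r /backups/doxygen/[a-z]* /home/user/current/directory/

Keep in mind that the bracket range has to be narrow enough to help. If every file starts with a lowercase letter, [a-z]* matches exactly what * does and hits the same limit, so split it into smaller ranges such as [a-h], [i-p], and [q-z].

And again going a level deeper: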
    cp -r /backups/doxygen/d[a-z]* /home/user/current/directory/

Voila! “Argument list too long” is now vanquished! Do any of you have a better way of dealing with this? Let me know!
More info about globbing.
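The per-prefix loops above can be rolled into one small helper. This is only a sketch under a few assumptions: copy_by_prefix is a made-up name, the directory is flat, the demo runs on throwaway mktemp directories rather than the real paths, and it needs bash (brace expansion and arrays).

```bash
#!/usr/bin/env bash
# Sketch: copy a flat directory in per-prefix batches so that no single
# cp invocation receives the whole file list.
# copy_by_prefix is a hypothetical helper name, not a standard command.
copy_by_prefix() {
    local src=$1 dst=$2 x files
    for x in {a..z} {0..9}; do
        # Expand the glob into an array; with nullglob it is empty on no match.
        files=("$src"/"$x"*)
        # Skip prefixes that match no files.
        (( ${#files[@]} )) || continue
        cp -r -- "${files[@]}" "$dst"/
    done
}

shopt -s nullglob

# Demo on temporary directories (stand-ins for the real source and destination).
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src"/alpha.html "$src"/beta.html "$src"/delta.html "$src"/9lives.html
copy_by_prefix "$src" "$dst"
ls "$dst"
```

If a single prefix still holds too many files, the same idea applies one level down, as with the d$x* loop earlier.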
The Actually Correct Way: find + xargs
Here’s the thing — the glob and loop tricks work, and they got me out of a jam, but there’s a more general solution that doesn’t require you to know anything about your filenames. find paired with xargs sidesteps the whole kernel argument limit entirely because it never builds a giant argument list in the first place.
    find /backups/doxygen/ -maxdepth 1 -type f | xargs -I{} cp {} /home/user/current/directory/

The -I{} tells xargs to substitute {} with each filename, one at a time, so cp runs once per file. Slow, but it never comes anywhere near the limit. If you want it faster and your destination command can handle batches, drop the -I{} and let xargs batch them up itself:
    find /backups/doxygen/ -maxdepth 1 -type f | xargs cp -t /home/user/current/directory/

The -t flag on cp (a GNU coreutils option) flips the argument order so the destination comes first, which is what lets xargs append a batch of source files at the end. That saves a ton of time on large directories.
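To make the batching visible, here is a tiny sketch that prints the commands xargs would build instead of running them. The file names and /dest/ are placeholders, and -n 2 is there only to force small batches; without it xargs packs as many arguments into each invocation as the system limit allows.

```shell
# Feed five names to xargs and cap each invocation at two arguments.
# "echo" stands in front of cp, so the commands are printed, not executed.
batches=$(printf '%s\n' a.html b.html c.html d.html e.html \
    | xargs -n 2 echo cp -t /dest/)
printf '%s\n' "$batches"
# prints:
#   cp -t /dest/ a.html b.html
#   cp -t /dest/ c.html d.html
#   cp -t /dest/ e.html
```

Each printed line is one separate cp invocation, which is why the destination has to come before the variable-length list of sources.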
For rsync specifically, xargs is a bit awkward because rsync expects the destination as its final argument and has no equivalent of cp’s -t. In that case, --files-from is your friend:
    find /backups/doxygen/ -maxdepth 1 -type f -printf '%f\n' > /tmp/filelist.txt
    rsync -az --files-from=/tmp/filelist.txt /backups/doxygen/ /home/user/current/directory/

Why This Happens (The Two-Second Explanation)
The error comes from a kernel limit called ARG_MAX — the maximum size of the argument buffer passed to execve(). On most Linux systems it’s around 2MB. When bash expands * in a directory with 50,000 files, it tries to shove all those filenames into a single command invocation. Two megabytes fills up fast.
You can check your limit:
    getconf ARG_MAX

Typical output is 2097152 (2MB). Some systems let you bump it temporarily, but that’s a band-aid: fix the command, not the limit.
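One way to see that the limit belongs to execve() rather than to globbing itself: a for loop is shell syntax, so the expanded list stays inside the shell and each cp receives a single filename. A minimal sketch, using throwaway mktemp directories as stand-ins for the real paths:

```shell
# Temporary source and destination directories for the demo.
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src"/a.html "$src"/b.html "$src"/c.html

# The glob is expanded inside the shell and handed to cp one name at a time,
# so no single execve() call ever sees the full list.
for f in "$src"/*; do
    [ -e "$f" ] || continue   # guard against the unmatched-glob case
    cp -- "$f" "$dst"/
done
ls "$dst"
```

This starts one cp process per file, so it is the slow option, but it needs no knowledge of how the filenames begin.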
Gotchas That Will Bite You
A few things that catch people off guard with the loop approach:
Failures on no-match. If a glob like $x* matches nothing (say, there are no files starting with q), bash passes the pattern through unexpanded, so the command receives the literal string /backups/doxygen/q*. rsync and cp will then fail with a “No such file or directory” error. Either set shopt -s nullglob before your loop, so a non-matching glob expands to nothing and you can skip that letter, or check manually.
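    shopt -s nullglob
    for x in {a..z}; do
        files=(/backups/doxygen/$x*)
        # With nullglob an unmatched pattern expands to nothing; skip that letter
        (( ${#files[@]} )) || continue
        rsync -az "${files[@]}" /home/user/current/directory/
    done

Files with spaces. The setup above had no spaces in the filenames, which is good, because the plain find | xargs pipelines fall apart the moment a filename has a space in it: xargs splits its input on whitespace. (The glob loops are fine here, since bash does not word-split the results of a glob.) find -print0 with xargs -0 handles that: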
    find /backups/doxygen/ -maxdepth 1 -type f -print0 | xargs -0 cp -t /home/user/current/directory/

The -print0 / -0 pair uses null bytes as separators instead of whitespace, so filenames with spaces, newlines, or other weird characters get passed through safely. Good habit to build even when you “know” there are no spaces, because you’ll copy that command six months later into a context where there are.
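Here is a quick way to convince yourself that the null-separated pipeline copes with awkward names. It is a sketch on temporary directories with made-up file names, and it assumes GNU find, xargs, and cp (for -print0, -0, and -t).

```shell
# Temporary source and destination directories for the demo.
src=$(mktemp -d)
dst=$(mktemp -d)
# One name with a space in it, one without.
touch "$src/class list.html" "$src/plain.html"

# NUL-delimited names survive the trip through the pipe intact.
find "$src" -maxdepth 1 -type f -print0 | xargs -0 cp -t "$dst"
ls "$dst"
```

Dropping -print0 and -0 from the same pipeline makes cp look for two files named “class” and “list.html”, which is the failure the null separators avoid.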