
linux shell argument list too long rsync or cp

By SumGuy

I needed to copy files generated by doxygen from one directory into another for a large open-source C++ project. Sadly, there were too many files in the directory, so bash started complaining 🙁 Both cp and rsync died with the error "Argument list too long". Initially I figured I could regenerate everything from scratch in the new location, but it was quicker and easier to use a for loop to rsync the files over 🙂

Some info: I realized bash brace expansion would work here.

Using a for loop

for x in {a..b}
do
    # echo is a bash builtin, so this glob is not subject to the limit
    echo "$x"*
done

Notice I only stepped from a to b because I didn’t want to sit there for an hour while it listed all the files. This worked well: it listed the matching files, and I was sure it would suit my purposes. Now the real deal!

for x in {a..z}
do
    echo "$x"
    rsync -az /backups/doxygen/"$x"* /home/user/current/directory/
done
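Before copying anything, it can help to see how many files each prefix matches, which shows which letters may still be too big for one command. The following is only a sketch: it runs in a scratch directory created with mktemp, and the sample filenames are made up for illustration.

```shell
#!/usr/bin/env bash
# Demo in a scratch directory so nothing real is touched (sample names are hypothetical).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
touch "$work"/{a1,a2,a3,b1,b2}.html

shopt -s nullglob                # a prefix with no files expands to nothing
for x in {a..c}; do
    matches=("$work"/"$x"*)      # the glob expands inside bash; no external command runs
    echo "$x: ${#matches[@]} files"
done
```

Because the array assignment happens inside bash, the count works even for a prefix that would be too large to pass to rsync or cp.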

Sometimes you might still get the error even for a single letter. For example, I still had too many files starting with d and q, so I just changed where I globbed:

for x in {a..z}
do
    echo "$x"
    rsync -az /backups/doxygen/d"$x"* /home/user/current/directory/
done

This iterates a through z again, but over the second character of files that start with d. What if you have files starting with numbers? Simply switch the letters for numbers:

for x in {0..9}
do
    echo "$x"
    rsync -az /backups/doxygen/"$x"* /home/user/current/directory/
done

You can use any other command you need in place of rsync, such as mv, cp, or a custom command:

for x in {a..z}
do
    echo "$x"
    mv /backups/doxygen/"$x"* /home/user/current/directory/
done
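The separate loops for letters and digits can be folded into a single pass that also skips prefixes with no files. This is a rough sketch rather than the exact commands I ran: the scratch directories and sample filenames are assumptions for the demo, and cp stands in for whatever command you need.

```shell
#!/usr/bin/env bash
# Scratch source and destination; the sample filenames are made up for the demo.
src=$(mktemp -d); dst=$(mktemp -d)
trap 'rm -rf "$src" "$dst"' EXIT
touch "$src"/{alpha,beta,Zeta,7up,_hidden}.txt

shopt -s nullglob                         # non-matching prefixes expand to nothing
for x in {a..z} {A..Z} {0..9} _; do
    files=("$src"/"$x"*)
    (( ${#files[@]} )) || continue        # nothing starts with this character
    cp -- "${files[@]}" "$dst"/
done
ls "$dst"
```

A single prefix can still be too large for one cp call, in which case the second-character trick shown earlier applies to that prefix.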

Globbing

If you don’t want to use for loops, you can glob in a one-liner like so:

ls /backups/doxygen/[x-z]*

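And the actual command using cp and globbing. Keep in mind that a single [a-z]* matches every file starting with a lowercase letter at once, so it only avoids the error if that subset is small enough; narrow the range if it isn’t: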

cp -r /backups/doxygen/[a-z]* /home/user/current/directory/

And again, going a level deeper:

cp -r /backups/doxygen/d[a-z]* /home/user/current/directory/
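If even a narrowed glob is too big for cp, one way to keep the one-liner style is to let a bash builtin expand the glob and pipe the names onward. The sketch below assumes bash, GNU xargs, and GNU cp (for -t), and runs in scratch directories with made-up filenames.

```shell
#!/usr/bin/env bash
# Scratch directories with sample files (names are hypothetical).
src=$(mktemp -d); dst=$(mktemp -d)
trap 'rm -rf "$src" "$dst"' EXIT
touch "$src"/doc_{1..300}.html

# printf runs inside bash, so the expanded glob is never handed to execve();
# xargs -0 then splits the names into batches that fit under the limit.
printf '%s\0' "$src"/[a-z]* | xargs -0 cp -t "$dst"

copied=$(find "$dst" -type f | wc -l)
echo "copied $copied files"
```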

Voilà! "Argument list too long" is now vanquished! Do any of you have a better way of dealing with this? Let me know!

More info about globbing.

The Actually Correct Way: find + xargs

Here’s the thing: the glob and loop tricks work, and they got me out of a jam, but there’s a more general solution that doesn’t require you to know anything about your filenames. find paired with xargs sidesteps the kernel argument limit because the shell never expands a giant glob. find streams the filenames through a pipe, and xargs hands them to the command in batches that fit.

find /backups/doxygen/ -maxdepth 1 -type f | xargs -I{} cp {} /home/user/current/directory/

The -I{} tells xargs to substitute {} with each filename, one at a time. Slow, because it starts one cp per file, but simple. If you want it faster and your destination command can handle batches, drop the -I{} and let xargs batch them up itself:

find /backups/doxygen/ -maxdepth 1 -type f | xargs cp -t /home/user/current/directory/

The -t flag on GNU cp takes the destination directory as an option, so it comes first. That’s what lets xargs append a batch of source files at the end. Saves a ton of time on large directories.
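find can also do the batching on its own through -exec with a trailing +, which avoids the pipe and handles odd filenames without extra flags. A minimal sketch, assuming GNU cp for -t and using scratch directories with invented filenames:

```shell
#!/usr/bin/env bash
# Scratch directories with sample files (names are hypothetical).
src=$(mktemp -d); dst=$(mktemp -d)
trap 'rm -rf "$src" "$dst"' EXIT
touch "$src"/page_{1..200}.html

# "{} +" makes find append as many filenames as fit to each cp invocation
find "$src" -maxdepth 1 -type f -exec cp -t "$dst" {} +

copied=$(find "$dst" -type f | wc -l)
echo "copied $copied files"
```

The {} must come directly before the +, which is why the -t form of cp is needed here as well.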

For rsync specifically, xargs is a bit awkward: rsync expects the destination as the last argument and has no equivalent of cp’s -t, so filenames appended by xargs land in the wrong place. In that case, --files-from is your friend:

find /backups/doxygen/ -maxdepth 1 -type f -printf '%f\n' > /tmp/filelist.txt
rsync -az --files-from=/tmp/filelist.txt /backups/doxygen/ /home/user/current/directory/
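The temporary file can be skipped by piping the list straight into rsync, and null separators keep unusual filenames intact. This is a sketch under a few assumptions: GNU find (for -printf), an rsync that supports --from0, and scratch directories with made-up filenames.

```shell
#!/usr/bin/env bash
# Scratch directories with sample files (names are hypothetical).
src=$(mktemp -d); dst=$(mktemp -d)
trap 'rm -rf "$src" "$dst"' EXIT
touch "$src"/index.html "$src"/"class list.html"

# %f prints each name relative to $src; \0 ends each name with a null byte.
# --files-from=- reads the list from stdin, --from0 says it is null-delimited.
find "$src" -maxdepth 1 -type f -printf '%f\0' |
    rsync -az --from0 --files-from=- "$src"/ "$dst"/

ls "$dst"
```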

Why This Happens (The Two-Second Explanation)

The error comes from a kernel limit called ARG_MAX: the maximum combined size of the arguments and environment variables passed to execve(). On most Linux systems it’s around 2MB. When bash expands * in a directory with 50,000 files, it tries to shove all those filenames into a single command invocation. Two megabytes fills up fast.

You can check your limit:

getconf ARG_MAX

Typical output is 2097152 (2MB). On Linux the limit is tied to the stack size limit, so raising ulimit -s can bump it temporarily, but that’s a band-aid: fix the command, not the limit.
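To get a feel for how close a directory is to the limit, the size of the expanded glob can be compared against ARG_MAX. Treat this as a rough estimate only: the kernel also counts the environment and per-argument overhead, and the scratch directory and filenames here are invented for the demo.

```shell
#!/usr/bin/env bash
# Scratch directory with sample files (names are hypothetical).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
touch "$work"/file_{1..500}.html

limit=$(getconf ARG_MAX)
# printf is a bash builtin, so this expansion never goes through execve()
glob_bytes=$(printf '%s\0' "$work"/* | wc -c)

echo "ARG_MAX:       $limit bytes"
echo "expanded glob: $glob_bytes bytes"
if (( glob_bytes < limit )); then
    echo "the glob should fit in one command"
else
    echo "the glob is too long for one command"
fi
```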

Gotchas That Will Bite You

A few things that catch people off guard with the loop approach:

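Failures on no-match. If a glob like $x* matches nothing (say, there are no files starting with q), bash passes the literal string q* to the command. rsync and cp will error out on the nonexistent path, and a command like mkdir would happily create a directory named q*. Either shopt -s nullglob before your loop so non-matching globs expand to nothing, or check manually. With nullglob the command still runs with only the destination argument, so skipping empty matches explicitly is the safest option.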

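shopt -s nullglob
for x in {a..z}; do
    files=(/backups/doxygen/"$x"*)
    # skip letters with no files so rsync never runs without a source
    (( ${#files[@]} )) && rsync -az "${files[@]}" /home/user/current/directory/
done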
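The difference nullglob makes is easy to see by capturing a non-matching glob in an array with the option off and then on. A small sketch in a scratch directory (the filename is made up, and it assumes failglob is not enabled):

```shell
#!/usr/bin/env bash
# Scratch directory with one sample file (name is hypothetical).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
touch "$work"/apple.html

shopt -u nullglob
without=("$work"/q*)      # no match: the pattern itself is kept as one word
shopt -s nullglob
with=("$work"/q*)         # no match: expands to nothing

echo "without nullglob: ${#without[@]} word(s)"
echo "with nullglob:    ${#with[@]} word(s)"
```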

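Files with spaces. The original post explicitly says there were no spaces. The glob loops would actually cope, since each filename from a glob expansion is passed as one argument, but a plain find | xargs pipeline falls apart the moment a filename has a space in it, because xargs splits its input on whitespace. find -print0 with xargs -0 handles that: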

find /backups/doxygen/ -maxdepth 1 -type f -print0 | xargs -0 cp -t /home/user/current/directory/

The -print0 / -0 pair uses null bytes as separators instead of whitespace, so filenames with spaces, newlines, or other weird characters get passed through safely. Good habit to build even when you “know” there are no spaces, because you’ll copy that command six months later into a context where there are.
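A quick way to convince yourself is to try the null-separated pipeline on a few deliberately awkward names. This sketch assumes GNU find, xargs, and cp, and uses scratch directories with invented filenames:

```shell
#!/usr/bin/env bash
# Scratch directories; the awkward filenames are made up for the demo.
src=$(mktemp -d); dst=$(mktemp -d)
trap 'rm -rf "$src" "$dst"' EXIT
touch "$src"/plain.html "$src"/"two words.html" "$src"/$'line\nbreak.html'

find "$src" -maxdepth 1 -type f -print0 | xargs -0 cp -t "$dst"

# Print one dot per copied file so names containing newlines are counted once
copied=$(find "$dst" -type f -printf '.' | wc -c)
echo "copied $copied of 3 files"
```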
