rsync does not exclude directories in cron.daily

Posted by bodhi 
rsync does not exclude directories in cron.daily
July 13, 2013 01:38AM
Hi,

Anybody running rsync in a cron job to back up rootfs? Once in a while I run rsync in a script to back up my Debian development plug rootfs to another plug. It always works fine, excluding the typical directories that we should exclude, such as /proc, /dev,... Now I'm starting to get organized :) and want to execute it inside cron.daily. And I found out from a dry run that it does not exclude these directories! Since I am running everything as root, it is really strange that it would behave differently in a cron job.

Has anybody experienced the same thing on kernel 3.8.11?

Thanks,

-bodhi
===========================
Forum Wiki
bodhi's corner (buy bodhi a beer)



Edited 2 time(s). Last edit at 07/13/2013 01:48AM by bodhi.
Re: rsync does not exclude directories in cron.daily
July 13, 2013 10:29PM
I use rsync extensively. On my Internet VPSes, I do things slightly differently -- I use the '-x' option to prevent rsync from venturing off the selected file systems. It creates the mount points on the backup, but goes no further. I do use '--exclude' also, and have never had any problems with it. For instance, '--exclude=/var/run' and '--exclude="/home/*/www"' both seem to work as expected, although on my systems neither is on a file system boundary. All this is executed as root, BTW, but out of root's crontab, not the cron.daily area.

On my Pogos, I back up my /home and /mail partitions using rsync, also as root. I mostly don't use --exclude there (well, I do use one, and will have to check more thoroughly whether it is indeed honored), but otherwise the command appears to work fine.

When you get it figured out, let us know what you find, bodhi.
Re: rsync does not exclude directories in cron.daily
July 14, 2013 12:32PM
Actually, this thread prompted me to implement something I've been meaning to do for some time. I have two Pogos up and running, predominantly in active/standby mode. Once a day the active /home directory is rsynced to the standby box. However, there are a few files which need to be box-specific. So, I just created a /home/local directory for that purpose, and use symlinks to access it as necessary. I removed the references to all these special files from my backup script and just coded "--exclude /home/local/" in the rsync argument list in their stead. I've manually run the backup several times and it works as expected. I'll take a look tomorrow morning to make sure there isn't a different outcome when running the script from cron, but I doubt there will be.

YMMV.
Re: rsync does not exclude directories in cron.daily
July 14, 2013 03:28PM
I did not use cron to back up rootfs. I guess the problem might be the environment, e.g., a chroot for the cron job?

I used rsync a while ago; now I use dump/restore to back up my rootfs. It works great, but I have not done it in cron since I do not want such a serious matter happening behind my back :)

-syong
Re: rsync does not exclude directories in cron.daily
July 14, 2013 04:00PM
@restamp,
Thanks for the valuable info! I put the new entry in root's crontab and the dry run works :) ... but only after I rebooted both the source and destination boxes. The initial root cron attempt showed the same behavior (not excluding directories).

That caused me to think a little bit about what I did during testing. I did abort a cron.daily dry run when I saw a massive number of files being compared. Could that be the cause of all my subsequent unsuccessful tries, i.e., it left everything messed up? In any case I'm happy with running it from root's crontab. But I will try the cron.daily dry run again to confirm whether it has a real problem.

@syong,
Interesting, do you dump from a running rootfs? The package description says it's for the ext2 file system; does it support ext3?

-bodhi
Re: rsync does not exclude directories in cron.daily
July 14, 2013 09:53PM
@bodhi,

Yes, I dump from a running rootfs. Dump/restore is the traditional way to back up in UNIX. It should work for ext2/3/4. I use an ext3 rootfs myself.

Here is my script to backup/transfer a rootfs.

http://forum.doozan.com/read.php?2,12968

-syong
Re: rsync does not exclude directories in cron.daily
July 16, 2013 12:37AM
restamp,

I've scheduled two rsync dry-run cron jobs to run consecutively today and got the results. The root cron job ran correctly, excluding all the directories. The cron.daily job did not; it showed the same behavior as before and did not exclude any directory! So I guess syong has the right idea above: it could be something about permissions. Could something have changed in the security policy that we are not aware of?

my apt sources:
deb http://ftp.us.debian.org/debian wheezy main
deb http://security.debian.org/ wheezy/updates main contrib non-free

-bodhi
Re: rsync does not exclude directories in cron.daily
July 16, 2013 02:14PM
Hmm. Interesting.

Do you use full paths in your "--exclude"s? Do you use shell variables in your backup script? Note that root's crontab sets certain variables such as $HOME and invokes scripts from root's home directory whereas cron.daily does not and operates from '/'.

Also note that the shell invoked may be different in one versus the other.

It's probably something like that. If you haven't figured it out, you might want to post your shellscript file and I'll see if I can see anything.
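
One way to compare the two launch paths is to have the script record the context it actually runs in. This is a minimal sketch, not part of anyone's posted script; `log_context` is a hypothetical helper name, and which variables cron sets depends on the cron implementation and its configuration.

```shell
#!/bin/sh
# Hypothetical helper: print the context this script is running in, so the
# output from root's crontab and from cron.daily can be compared.
log_context() {
    printf 'cwd=%s\n' "$(pwd)"
    printf 'HOME=%s\n' "${HOME:-unset}"
    printf 'SHELL=%s\n' "${SHELL:-unset}"
    printf 'PATH=%s\n' "${PATH:-unset}"
}

log_context
```

Calling `log_context >> /tmp/some.log` at the top of the backup script (the log path is arbitrary) and diffing the output of the two runs should show whether the working directory differs.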
Re: rsync does not exclude directories in cron.daily
July 18, 2013 12:12AM
Thanks restamp! I use relative paths for --exclude (I did try full paths, too). I have not had time to figure this one out. Perhaps you can spot something. Here are the rsync version and the script:
rsync --version
rsync version 3.0.9 protocol version 30

#!/bin/sh

echo `date`   "Started rsync daily backup"  >> /tmp/rsync.backup.daily.log

Host=`hostname`
if  [ $Host = "tldDebian" ]; then

   echo `date`   "Started rsync backup for tldDebian rootfs" >> /tmp/rsync.backup.daily.log

   /usr/bin/rsync -aAXv --dry-run --stats \
	--exclude dev/* --exclude proc/* --exclude sys/*   --exclude tmp/* \
	--exclude run/* --exclude mnt/*  --exclude media/* --exclude lost+found --exclude var/cache/apt/* --exclude swapfile1 \
	--password-file=/root/rsync.backup.daily.pswd  \
	--log-file=/tmp/rsync.backup.daily.log \
	/* rsync://mbpbackup@HomeMedia2.local/tldDebian

   echo `date`   "Finished rsync daily backup for tldDebian rootfs"  >> /tmp/rsync.backup.daily.log
fi

echo `date`   "Finished rsync daily backup"  >> /tmp/rsync.backup.daily.log

exit 0
Re: rsync does not exclude directories in cron.daily
July 18, 2013 08:26PM
OK. If I am reading this correctly, you have not quoted any of the arguments to your --excludes, and they all contain '*' shell expansion characters, so the shell will attempt to expand them before it invokes rsync.

When running out of cron.daily, cron initializes the cwd to '/', so something like 'proc/*' will match the files under the /proc file system and be expanded by the shell. Let's say there are two files in proc: 'one' and 'two'. (Yes, I know there are really many more.) The --exclude argument 'proc/*' will be expanded by the shell and the arguments rsync will see are ... --exclude proc/one proc/two ... Rsync will interpret this as you wanting to exclude 'proc/one' but will treat 'proc/two' as a specific include (i.e., as a stand-alone source argument). In essence, you are asking rsync to explicitly include everything in proc except the first argument of the expansion.

OTOH, if you invoke this script from root's crontab, cron will set the working directory to root's home directory (most likely /root, unless you've changed things). But now, when the shell tries to expand 'proc/*' it cannot do so because it finds no proc directory under /root, so it leaves the argument intact. Rsync now sees ... --exclude proc/* ... and does what you want: It compares each file in the hierarchy it is updating against the string 'proc/*' (where '*' is now a special match-all character to rsync), and thus will exclude everything under the proc directory -- or for that matter, any directory that ends in 'proc' in the hierarchy.

The solution: Put quotes around each of your --exclude arguments and see if that doesn't resolve the problem.
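
The expansion described above can be reproduced without rsync. The sketch below uses a made-up scratch layout (a `root` directory containing `proc/one` and `proc/two`, and an empty `home` directory) and a hypothetical `show_args` helper that prints the arguments a command would receive.

```shell
#!/bin/sh
# Show how an unquoted 'proc/*' reaches a command, depending on the cwd.
# The directory layout is invented for the demonstration.

show_args() {
    # Print each received argument on its own line, wrapped in <>.
    printf '<%s>\n' "$@"
}

demo=$(mktemp -d)
mkdir -p "$demo/root/proc" "$demo/home"
touch "$demo/root/proc/one" "$demo/root/proc/two"

# Case 1: the cwd contains proc/, as when cron.daily runs from '/'.
from_root=$(cd "$demo/root" && show_args --exclude proc/*)

# Case 2: the cwd has no proc/, as when root's crontab runs from /root.
from_home=$(cd "$demo/home" && show_args --exclude proc/*)

# Case 3: quoting protects the pattern in any directory.
quoted=$(cd "$demo/root" && show_args --exclude 'proc/*')

printf 'from root:\n%s\n\nfrom home:\n%s\n\nquoted:\n%s\n' \
    "$from_root" "$from_home" "$quoted"
rm -rf "$demo"
```

In the first case the command receives three arguments (`--exclude`, `proc/one`, `proc/two`); in the other two it receives the literal pattern `proc/*`.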
Re: rsync does not exclude directories in cron.daily
July 19, 2013 12:28AM
Restamp,

Indeed, I did not have any quotes surrounding the exclude patterns! Your explanation makes sense. I'd bet that is the answer.

However, I always thought that rsync exclude patterns, if specified without a full path, are relative to the source. In this case that is /*, and the manual seems to indicate that:

ANCHORING INCLUDE/EXCLUDE PATTERNS
       As mentioned earlier, global include/exclude patterns are anchored at the "root of the transfer" (as opposed to per-directory patterns, which are anchored at
       the merge-file’s directory). If you think of the transfer as a subtree of names that are being sent from sender to receiver, the transfer-root is where the
       tree starts to be duplicated in the destination directory. This root governs where patterns that start with a / match.

Does the "root of the transfer" mean the source, i.e., the "/*" in my command above?

Thanks,

-bodhi
Re: rsync does not exclude directories in cron.daily
July 19, 2013 08:43PM
I have always presumed -- a dangerous thing to do -- that the exclude pattern wasn't anchored unless it began with a leading slash, but otherwise, it applied if the pattern was a subset of the filename. In other words, 'a/b' would match the file 'aaa/bbb'. But now I believe this is wrong and it would only match the case where there was a 'b' entity within an 'a' directory. Of course, '/a/b' would anchor to the start (and only the start) of a hierarchy.

I think the term "root of the transfer" refers to the start of the hierarchy (or hierarchies) being rsynced. For instance, in

rsync a b c d

the roots would be 'a', 'b', and 'c' respectively.

Bear in mind that I don't consider myself an rsync expert by any stretch of the imagination, so if anyone else can shed a more definitive light on all this, I'd love to hear from you.
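
A dry run on a scratch tree is one way to check the anchoring question empirically. The sketch below is an experiment, not an authoritative statement of rsync's rules: the tree layout and the `would_send` helper are invented here, and it assumes an rsync that supports `-n` and `--out-format`. As the man page passage quoted earlier reads, a source given without a trailing slash contributes its own name to the transfer, so an anchored pattern has to include that name.

```shell
#!/bin/sh
# Dry-run experiment on a throwaway tree; nothing is copied (-n).
command -v rsync >/dev/null 2>&1 || { echo "rsync not found, skipping"; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/src/a/b" "$work/src/x/a/b" "$work/dst"
touch "$work/src/a/b/f" "$work/src/x/a/b/f"

# List the names rsync would transfer for source $1 with exclude pattern $2.
would_send() {
    (cd "$work" && rsync -a -n --out-format='%n' --exclude "$2" "$1" dst/)
}

anchored=$(would_send src/ '/a/b')    # leading slash: only the top-level a/b
unanchored=$(would_send src/ 'a/b')   # no leading slash: a/b at any depth
no_slash=$(would_send src '/a/b')     # names now start with src/, no match

printf 'anchored:\n%s\n\nunanchored:\n%s\n\nsource without slash:\n%s\n' \
    "$anchored" "$unanchored" "$no_slash"
rm -rf "$work"
```

If the reading above is right, the first listing still contains `x/a/b/f`, the second contains neither copy of `f`, and the third contains `src/a/b/f` because `/a/b` no longer matches anything.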
Re: rsync does not exclude directories in cron.daily
July 20, 2013 01:05AM
restamp, I think so too. I'm changing the --exclude to use the full path, and will see how it goes.
Thanks,

-bodhi
Re: rsync does not exclude directories in cron.daily
July 20, 2013 04:28PM
restamp,

It works fine in cron.daily with the full path for --exclude. Thanks for your help :)

-bodhi
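
The final script was not posted, so the following is only a sketch of what the corrected call might look like, assuming the exclude patterns were both quoted and given as full paths. The options, host name, password file, and destination are copied from the script earlier in the thread; `RSYNC`, `LOG`, and `run_backup` are names introduced here for illustration.

```shell
#!/bin/sh
# Sketch of the backup call with quoted, anchored exclude patterns.
# RSYNC, LOG and run_backup are illustrative names, not from the thread.

RSYNC=${RSYNC:-/usr/bin/rsync}
LOG=${LOG:-/tmp/rsync.backup.daily.log}

run_backup() {
    echo "$(date) Started rsync backup for tldDebian rootfs" >> "$LOG"

    # Single quotes keep the shell from expanding '*' whatever the cwd is;
    # the leading '/' anchors each pattern at the root of the transfer.
    # The unquoted /* source is still meant to be expanded by the shell.
    "$RSYNC" -aAXv --dry-run --stats \
        --exclude '/dev/*' --exclude '/proc/*' --exclude '/sys/*' \
        --exclude '/tmp/*' --exclude '/run/*' --exclude '/mnt/*' \
        --exclude '/media/*' --exclude '/lost+found' \
        --exclude '/var/cache/apt/*' --exclude '/swapfile1' \
        --password-file=/root/rsync.backup.daily.pswd \
        --log-file="$LOG" \
        /* rsync://mbpbackup@HomeMedia2.local/tldDebian

    echo "$(date) Finished rsync backup for tldDebian rootfs" >> "$LOG"
}

# Same host guard as the original script.
if [ "$(hostname)" = "tldDebian" ]; then
    run_backup
fi
```

Note that a full path alone would not be enough: an unquoted `/proc/*` would be expanded by the shell from any working directory, so the quotes matter at least as much as the leading slash.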