RFC: Replace Pogoplug install with better recovery system

Posted by Jeff 
September 11, 2010 02:06PM
The original Pogoplug install is useful as a base system for running installers and simple scripts, and it's great for new users to fall back on until they get their Debian system booting reliably. However, once you've got Debian booting and running the way you want it, it would be helpful to have a more robust recovery system to fall back on.

Since I have no intention of ever using the Pogoplug services, I'd like to replace the Pogoplug stuff on mtd1 and mtd2 with something a little more recovery-friendly.

If we merge mtd1 and mtd2 into a single partition, we can create a 36M recovery partition. Formatting this as a compressed UBIFS, we get about 50M of space. That's a lot of space for a simple recovery system.

I'd like to create a system with the following:
SSH server and client
UBIFS support
fsck utilities
fw_setenv/fw_printenv utilities for U-Boot

So my questions to everyone are:

Would you use this?
What would you like to see included?

Is there a good 'base' that we should start from? I'm leaning towards an emdebian 'baked' system, but if there's something better, I'd be more than happy to use it.

-- Jeff
ecc
Re: RFC: Replace Pogoplug install with better recovery system
September 11, 2010 03:56PM
What would be the advantage over just using a USB flash drive as a recovery disk? I like having all 255M of NAND available for "normal" use as the ubifs root.
Re: RFC: Replace Pogoplug install with better recovery system
September 11, 2010 05:12PM
ecc, I think you are one of the few people without a lingering Pogoplug partition. This is really intended for the rest of us who still keep the old Pogoplug partitions intact to use as a recovery partition.

To answer your question in more detail, I believe there are two advantages to maintaining a recovery partition:

1) Convenience. It's already there. You don't have to plan ahead. You don't need to keep a backup USB drive, and you don't need to fuss with opening up the device and connecting a serial cable.

2) Automation. If the system can fall back into a recovery install when things go bad, that recovery install can be configured to do some simple repairs and then boot back into the full install.

For your purposes, it sounds like the extra space is worth the extra hassle if things go wrong.

On the other hand, I'm planning to deploy Dockstars in remote locations where I won't have easy physical access. For me, the advantage of not needing physical access to the device in the event that something goes wrong is worth giving up a little disk space.

-- Jeff
ecc
Re: RFC: Replace Pogoplug install with better recovery system
September 11, 2010 05:58PM
Sounds reasonable. And whatever you come up with for an onboard solution could probably be used on a USB drive too.

In addition to what you listed above, I'd suggest dropbear and busybox, plus a version of your install script (in case you simply want to re-install to the larger NAND partition rather than repair it). It could create an empty ubifs, mount it, and debootstrap into it dynamically rather than first preparing a ubi image.
Re: RFC: Replace Pogoplug install with better recovery system
September 11, 2010 06:23PM
I probably would use such a recovery system, tho I'm not sure I'm quite ready to cut all ties to the pogoplug service.

I'd like to see less and nano. Apologies to vi/emacs folks, but I've never been able to learn them. If there's room, fdisk, wget, and maybe usbutils.
Re: RFC: Replace Pogoplug install with better recovery system
September 11, 2010 06:49PM
Argh, ate my post somehow.

I'd add a vote for nano as well. Netcat sounds like a good option.

Is there a benefit to having two IPs for the box: a fixed, known-good IP and a DHCP address? The known address would be the fallback in case you can't get DHCP for whatever reason.

A webmin style interface would likely help the most users (least experienced). I'm not sure how much is involved in that. Dream big. :)



Edited 1 time(s). Last edit at 09/11/2010 06:49PM by gzader.
I would go with ecc's suggestion, since I may use openWRT on mtd3 and the boot order is USB -> mtd3 (openwrt) -> mtd1 (Pogoplug). If openwrt boots but I cannot access its shell, there is no way to tell uboot to boot mtd1 (Pogoplug) without a serial console.

With ecc's suggestion, it boots from USB if a USB drive is plugged in. Unplug the USB drive and it will boot mtd1.
I second the dual IP address suggestion. I think I saw a poor man's version of it in the original OS, which sets a fixed address in the APIPA range before it starts the DHCP client. The APIPA range is a good idea, because then a cross cable is all you need to connect to the system. If the admin computer is configured for DHCP, it will automatically get an address in the same subnet, either through DHCP or, if no DHCP is available, by also following the APIPA protocol. No need to change the IP address configuration or have other infrastructure present. The APIPA protocol should be observed (which means the address will be random: good if there's more than one system on the same network). It might be helpful to continuously broadcast small beacon UDP packets (one per second) on a defined port so that the admin system can easily find the target address(es).
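
To make the beacon idea concrete, here is a minimal sketch in Python. Nothing in it comes from the rescue system itself: the port number, the "RESCUE" payload format, and the function names are all assumptions made up for illustration, and a real implementation would probably also bind to a specific interface.

```python
import socket
import time

BEACON_PORT = 48555      # hypothetical port; the thread does not name one
BEACON_INTERVAL = 1.0    # seconds; "one per second" as suggested above


def build_beacon(hostname: str, address: str) -> bytes:
    """Return a small, human-readable payload identifying the rescue system."""
    return f"RESCUE {hostname} {address}\n".encode("ascii")


def parse_beacon(payload: bytes):
    """Return (hostname, address) from a beacon, or None if it is not one."""
    parts = payload.decode("ascii", errors="replace").split()
    if len(parts) != 3 or parts[0] != "RESCUE":
        return None
    return parts[1], parts[2]


def send_beacons(hostname: str, address: str, count: int) -> None:
    """Broadcast `count` beacons, one every BEACON_INTERVAL seconds."""
    payload = build_beacon(hostname, address)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcasting must be enabled explicitly on a UDP socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        for _ in range(count):
            sock.sendto(payload, ("255.255.255.255", BEACON_PORT))
            time.sleep(BEACON_INTERVAL)
```

The admin machine would listen on the same UDP port and run parse_beacon on whatever arrives to learn the address of each rescue system on the segment.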

Another suggestion: If possible, make booting the rescue system dependent on the presence of a simple external trigger, like the presence of a file with a particular name ("rescueboot"?) in the root dir of a FAT formatted USB stick. Then the rescue system could come before other systems in the boot order and fall through to the production system if the trigger isn't present. This would solve the problem that an internally installed system might boot but become inaccessible, which would not allow the boot loader to fall through to the rescue system. This could also prevent "accidental" booting of the rescue system, which might be undesirable in some environments. (Other possible triggers: absence of a USB device class, disconnected network cable, text file on a web server. Can software tame the reset button? Check multiple triggers, boot the rescue system if one is active.)
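
The "check multiple triggers, boot the rescue system if one is active" logic could look roughly like the sketch below. It is only a model of the decision: in practice the check would have to live in the boot loader or an early init script, and the function names and the idea of passing triggers as callables are assumptions for illustration. Only the "rescueboot" file name comes from the suggestion above.

```python
from pathlib import Path

TRIGGER_NAME = "rescueboot"   # file name floated in the post above


def trigger_present(usb_root: str, name: str = TRIGGER_NAME) -> bool:
    """True if a regular file called `name` sits in the root of the stick."""
    return (Path(usb_root) / name).is_file()


def choose_boot_target(triggers) -> str:
    """Boot the rescue system if any trigger is active, else production.

    `triggers` is an iterable of zero-argument callables returning bool,
    so other checks (network cable, reset button) can be added later.
    """
    return "rescue" if any(check() for check in triggers) else "production"
```

For example, choose_boot_target([lambda: trigger_present("/mnt/usb")]) returns "rescue" only when the file exists on the stick mounted at the (assumed) path /mnt/usb.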

An idea for remote recovery: Systems which are behind NAT could use a reverse shell (where the target system initiates the connection to the admin system) to avoid the need for port forwardings. Connection attempts should be repeated (with exponential backoff) until a connection has been established either through the reverse shell or in some other way.
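
A rough sketch of the retry-with-exponential-backoff part, in Python. It only covers opening the outbound TCP connection; the reverse shell itself is not shown. The default delays, the 10 second timeout, and the function names are assumptions, and the admin host should be given as an IP address so the loop does not depend on DNS.

```python
import socket
import time


def backoff_delays(base: float = 1.0, factor: float = 2.0, cap: float = 300.0):
    """Yield wait times that grow by `factor` until they reach `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def connect_with_backoff(host: str, port: int, max_attempts: int,
                         sleep=time.sleep, connect=socket.create_connection):
    """Try to reach the admin host, waiting longer after each failure.

    Returns the connected socket, or None once max_attempts is exhausted.
    `sleep` and `connect` are injectable so the logic can be tested offline.
    """
    delays = backoff_delays()
    for _ in range(max_attempts):
        try:
            return connect((host, port), timeout=10)
        except OSError:
            sleep(next(delays))
    return None
```

Capping the delay keeps a long outage from pushing the next attempt hours into the future, which matters when the box cannot be reached any other way.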

Care should be taken that the system does not rely on DNS, NTP (key expiration) or other external services (except for things like remote triggers if names are specified instead of IP addresses).
Re: RFC: Replace Pogoplug install with better recovery system
September 14, 2010 07:27PM
To save googling: the APIPA range is 169.254.0.0 - 169.254.255.255.

The only issue with the "rescueboot" filename option is that it becomes a point for something malicious to happen. Granted, the chances of this happening are very low, but if hacker X wanted to cause mayhem, just dropping that file onto your stick could cause your system to get rebuilt.

Now, if you did something like that and started netcat with the assumption that the Dockstar was, say, 169.254.10.10 and the user was at 169.254.10.20, you could get a recovery moving forward with less chance of breaking a system.
Re: RFC: Replace Pogoplug install with better recovery system
September 15, 2010 12:15AM
I've made some good progress on this and hope to have something available soon.

The default config will have DHCP + a static IP of 169.254.9.17.

Just to clarify a point that seems to be causing some confusion: this is not going to ship as an auto-repair system. It's just a system with better utilities than the Pogoplug install. Of course, you can use it as a base for making your own auto-repair script if you're so inclined.

In the meantime, here's a list of the binaries on the rescue system. If you notice something important missing, please let me know.

rescue:/# ls /bin
addgroup       df             iplink         mountpoint     sh
adduser        dmesg          iproute        mt             sleep
ash            dnsdomainname  iprule         mv             stty
busybox        dumpkmap       iptunnel       netstat        su
cat            echo           kill           nice           sync
catv           egrep          linux32        pidof          tar
chattr         false          linux64        ping           touch
chgrp          fdflush        ln             pipe_progress  true
chmod          fgrep          login          printenv       umount
chown          getopt         ls             ps             uname
cp             grep           lsattr         pwd            usleep
cpio           gunzip         mkdir          rm             vi
date           gzip           mknod          rmdir          watch
dd             hostname       mktemp         run-parts      zcat
delgroup       ip             more           sed
deluser        ipaddr         mount          setarch

rescue:/# ls /sbin
devmem             lvcreate           poweroff           udhcpc
dmsetup            lvdisplay          pvchange           usbmount
fdisk              lvextend           pvcreate           vconfig
freeramdisk        lvm                pvdisplay          vgcfgbackup
fsck               lvmchange          pvmove             vgcfgrestore
fw_printenv        lvmdiskscan        pvremove           vgchange
fw_setenv          lvmsadc            pvresize           vgck
getty              lvmsar             pvs                vgconvert
halt               lvreduce           pvscan             vgcreate
hdparm             lvremove           reboot             vgdisplay
hwclock            lvrename           rmmod              vgexport
ifconfig           lvresize           route              vgextend
ifdown             lvs                runlevel           vgimport
ifup               lvscan             setconsole         vgmerge
init               makedevs           start-stop-daemon  vgmknodes
insmod             mdev               sulogin            vgreduce
klogd              mkfs.ntfs          swapoff            vgremove
ldconfig           mkswap             swapon             vgrename
loadkmap           modprobe           switch_root        vgs
losetup            mount.ntfs-3g      sysctl             vgscan
lsmod              nameif             syslogd            vgsplit
lvchange           pivot_root         tune2fs            watchdog

rescue:/# ls /usr/bin
[                    killall5             scp
[[                   last                 screen
ar                   ldd                  seq
arping               length               setkeycodes
awk                  less                 setserial
basename             links                setsid
bunzip2              lockfile-create      sftp
bzcat                lockfile-remove      sha1sum
chattr               lockfile-touch       sha256sum
chrt                 logger               sha512sum
chvt                 logname              slogin
cksum                lsattr               sort
clear                lspci                ssh
cmp                  lsusb                ssh-add
crontab              lzcat                ssh-agent
curl                 lzma                 ssh-keygen
curl-config          md5sum               ssh-keyscan
cut                  mesg                 sshfs
dbclient             mkfifo               strings
dc                   nano                 sudo
deallocvt            nc                   tail
diff                 ncftp                tee
dirname              ncftpbatch           telnet
dos2unix             ncftpget             test
dropbearconvert      ncftpls              tftp
dropbearkey          ncftpput             time
du                   ncftpspooler         top
eject                nohup                tr
env                  nslookup             traceroute
ether-wake           ntfs-3g              tty
expr                 ntfs-3g.probe        uniq
fdformat             ntfscat              unix2dos
file                 ntfscluster          unlzma
find                 ntfscmp              unxz
fold                 ntfsfix              unzip
free                 ntfsinfo             update-alternatives
fuser                ntfsls               uptime
fusermount           ntpdate              uudecode
gio-querymodules     od                   uuencode
head                 openssl              uuidgen
hexdump              openvt               vlock
hostid               passwd               wc
iconv                patch                wget
id                   printf               which
install              readlink             who
ipcrm                realpath             whoami
ipcs                 renice               xargs
iperf                reset                xz
ipkg-cl              resize               xzcat
killall              rsync                yes

rescue:/# ls /usr/sbin/
badblocks       flash_unlock    mkfs.ext4dev    smartctl
blkid           flashcp         mkfs.jffs2      smartd
chroot          fsck            mklost+found    sshd
crond           fsck.ext2       mkntfs          tune2fs
dnsd            fsck.ext3       mtd_debug       ubiattach
dropbear        fsck.ext4       mtdinfo         ubicrc32
dumpe2fs        fsck.ext4dev    nanddump        ubidetach
e2freefrag      ifplugd         nandtest        ubiformat
e2fsck          ifplugstatus    nandwrite       ubimkvol
e2label         inetd           ntfsclone       ubinfo
e2undo          loadfont        ntfscp          ubinize
ethtool         logsave         ntfslabel       ubirename
filefrag        lsusb           ntfsresize      ubirmvol
findfs          mdadm           ntfsundelete    ubirsvol
flash_erase     mke2fs          ntpd            ubiupdatevol
flash_eraseall  mkfs.ext2       rdate           uuidd
flash_info      mkfs.ext3       readprofile     visudo
flash_lock      mkfs.ext4       setlogcons
Re: RFC: Replace Pogoplug install with better recovery system
September 15, 2010 12:46AM
It seems some users have difficulties with vi. But I guess mcedit from mc (GNU Midnight Commander, which is a very good file manager) would be too heavy or too specific?
Re: RFC: Replace Pogoplug install with better recovery system
September 15, 2010 05:16AM
Perhaps a script that would get the latest uboot, make it executable and run it, as well as one for debian?
smbclient could be useful
The relevant RFC for serverless IP address autoconfiguration is RFC3927. If anyone wants to implement proper use of this address space in the rescue system: http://tools.ietf.org/html/rfc3927#page-10

Static addresses are problematic if more than one rescue system is active on a network segment, especially if the static address should be active in addition to a DHCP-assigned address.
There appears to be an RFC 3927 implementation in busybox. The command is called zcip:
http://www.busybox.net/downloads/BusyBox.html#zcip
Re: RFC: Replace Pogoplug install with better recovery system
September 15, 2010 03:14PM
I think a static IP and DHCP would be the best option. It is very rare that two dockstars are in the rescue system at the same time.

I prefer nano for editing but I can also live with vi, but it will be more difficult to handle.

A basic script to install an additional Linux to a USB device would be nice. Or your debian-install-script and uboot-update-script as written above would be an awesome feature!

PS: Will the rescue system delete the original Pogoplug?
zcip can start with a preferred address. In that case it will only use a random IP address if the default one indeed causes a conflict. That way you end up with a "static" IP when possible, but it also works if there already is another system with that IP.
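
The "preferred address first, random address on conflict" behaviour can be modelled with a short Python sketch. This is not how zcip is implemented; the in_use callable is a hypothetical stand-in for the ARP probing a real RFC 3927 implementation performs, and the function names are made up for illustration. The usable range (169.254.1.0 through 169.254.254.255) is the one RFC 3927 specifies.

```python
import ipaddress
import random

# RFC 3927 reserves the first and last 256 addresses of 169.254.0.0/16.
FIRST = int(ipaddress.IPv4Address("169.254.1.0"))
LAST = int(ipaddress.IPv4Address("169.254.254.255"))


def is_usable_link_local(addr: str) -> bool:
    """True if addr lies in the range RFC 3927 allows hosts to pick from."""
    return FIRST <= int(ipaddress.IPv4Address(addr)) <= LAST


def pick_address(preferred: str, in_use, rng=random, max_tries: int = 10) -> str:
    """Return `preferred` unless `in_use(addr)` reports a conflict.

    `in_use` is a hypothetical stand-in for an ARP probe; here it is
    just a callable taking an address string and returning bool.
    """
    if is_usable_link_local(preferred) and not in_use(preferred):
        return preferred
    for _ in range(max_tries):
        candidate = str(ipaddress.IPv4Address(rng.randint(FIRST, LAST)))
        if not in_use(candidate):
            return candidate
    raise RuntimeError("no free link-local address found")
```

With 169.254.9.17 as the preferred address, a lone rescue system keeps its well-known IP, while a second one on the same segment ends up on a random address instead of colliding.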
Re: RFC: Replace Pogoplug install with better recovery system
November 05, 2010 03:02PM
oops - totally wrong thread, sorry!



Edited 1 time(s). Last edit at 11/05/2010 03:11PM by Riffer.