ARM Server Long Term Reliability - What's Your Experience?

Posted by restamp 
March 27, 2015 01:04AM
One of the concerns I had when I converted from my power-hungry-but-bulletproof Sparc servers to these small Arm boxes was how reliable the hardware would be over time. Here's my personal experience with these devices to date.

SheevaPlug - The first Arm device I acquired was a SheevaPlug. I put it in service in the fall of 2009. The power supply lasted about a year, but this was a well-known point of failure with these Plugs. When mine eventually bit the dust, I replaced it with a wall-wart and the Sheeva has been running reliably ever since. (Its cheap SD card did develop bit errors after a couple of years of continual use and had to be replaced, but I won't hold that failure against the Plug.)

Dockstars - I acquired two of them in the fall of 2010. One had a problem out of the box: It booted and ran OK, but there was a (latent) bit error in the uBoot section of the NAND. I went ahead and reflashed the uBoot anyway and ran it for several months, then swapped it for the other Dockstar. When I next tried to put it into service, it was dead. I suspect the NAND developed some other bit errors which were not so latent. The other Dockstar ran a couple years and was retired. I suspect it would boot if I tried it today, but I haven't run it for the better part of the past three years.

PogoPlug E02 - I've acquired several of these and have had good luck with them. Two have been running for about 2.5 years now. One Plug, or its disk, seems a bit sensitive to RF (I'm a ham radio op): occasionally while I'm hamming, its disk will take errors and get remounted read-only, requiring a reboot. Apart from that, I've had zero problems with these units. (And since I added some ferrites to the cords, that problem has largely disappeared.) In any event, I have a couple of spares on the shelf just in case one dies.

In short, I've been amazed at the apparent long-term reliability of these little devices.

So, what's your experience?
TEN
Re: ARM Server Long Term Reliability Experience
March 27, 2015 05:35PM
Very similar: Dockstar even had a power-hungry DVB-T stick and 4-in-1 hub on it but worked well 24/7 for half a decade. Sometimes it would need a few cycles to launch from USB after power outages, but I understand that was a function of Jeff's uBoot.
Never had one of the specially-cased hard disks particularly made for its cradle on it, but rather a mini-to-regular USB adapter with a stick.
The mceusb IR transceiver seemed to be picking up a lot more EM interference on the Dockstar though, but on the other hand I hadn't ever had to replace the USB media on it.
While according to Tolkien one oughtn't to leave a live dragon out of calculations, something very similar holds true for a not entirely dissimilar "hairdryer with a burning desire" on the same circuit, which is what reduced that particular bright white (Dock)star to a "brown dwarf" (astronomically speaking).
Barring that unfortunate incident, it would probably still be up and running (with uptime occasionally showing more than a year).

On Pogoplug E02s, I've now read the forum rumor that media is more likely to get fried on the front port, which is where I had usually put my USB sticks and lost them after mere months, so I'm now trying to find out whether using the rear ones instead makes a difference.
A pity its stand seems unnecessarily flimsy, though it does help to make heat even less of an issue than on the Dockstar.
The very same mceusb as above runs much more reliably on the E02, though I can't rule out that improvements to the driver deserve the credit.



Edited 1 time(s). Last edit at 03/27/2015 05:35PM by TEN.
Re: ARM Server Long Term Reliability Experience
March 27, 2015 07:15PM
I have a farm of about 20 plugs :) Some are located remotely, and 2 are off-line. They came online at various times over the last 4 years. None has ever failed (the majority have an Ext3 USB rootfs), except for a single instance with a 2TB WD Elements drive, which is known to be flaky and slow to wake up. That problem was solved with the u-boot env usb_retry_ready patch.

-bodhi
===========================
Forum Wiki
bodhi's corner
Re: ARM Server Long Term Reliability - What's Your Experience?
March 27, 2015 07:26PM
Just think, a farm of 20 plugs draws 100 watts, has a CPU value of 24,000, and has over 5 gigs of RAM!
I knew this was going to be a strange day...
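
The totals above work out if each plug is taken to draw about 5 W, to have a "CPU value" of 1200 (for example a 1.2 GHz clock expressed in MHz), and to carry 256 MB of RAM. Those per-plug figures are assumptions chosen to match the quoted totals; they are not stated anywhere in this thread. A quick sketch of the arithmetic:

```python
# Back-of-the-envelope totals for a farm of plugs.
# All per-plug figures are ASSUMPTIONS picked to match the quoted totals.
PLUGS = 20
WATTS_PER_PLUG = 5          # assumed typical draw per plug
CPU_VALUE_PER_PLUG = 1200   # assumed, e.g. a 1.2 GHz clock in MHz
RAM_MB_PER_PLUG = 256       # assumed RAM per plug

def farm_totals(plugs=PLUGS):
    """Return the combined power, CPU value, and RAM for the farm."""
    return {
        "watts": plugs * WATTS_PER_PLUG,
        "cpu_value": plugs * CPU_VALUE_PER_PLUG,
        "ram_gb": plugs * RAM_MB_PER_PLUG / 1024,
    }

print(farm_totals())
```

Under these assumptions the RAM comes to exactly 5 GB; any unit with more than 256 MB would push the farm over that.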
Re: ARM Server Long Term Reliability Experience
March 27, 2015 07:27PM
TEN Wrote:
-------------------------------------------------------
> Very similar: Dockstar even had a power-hungry
> DVB-T stick and 4-in-1 hub on it but worked well
> 24/7 for half a decade. Sometimes it would need a
> few cycles to launch from USB after power outages,
> but I understand that was a function of Jeff's
> uBoot.

Rather, I think Jeff's setup gives a corrupted rootfs a chance to be fixed and boot again.

> Never had one of the specially-cased hard disks
> particularly made for its cradle on it, but rather
> a mini-to-regular USB adapter with a stick.
> The mceusb IR transceiver seemed to be picking up
> a lot more EM interference on the Dockstar though,
> but on the other hand I hadn't ever had to replace
> the USB media on it.
> While according to Tolkien one oughtn't to leave a
> live dragon out of calculations, something very
> similar holds true for a not entirely dissimilar
> "hairdryer with a burning desire" on the same
> circuit, which is what reduced that particular
> bright white (Dock)star to a "brown dwarf"
> (astronomically speaking).
> Barring that unfortunate incident, it would
> probably still be up and running (with uptime
> occasionally showing more than a year).
>
> On Pogoplug E02s, I've now read the forum rumor
> that media was more likely to get fried on the
> front port, which is where I had usually put my
> USB sticks and lost them after mere months, so I'm
> now trying to find out if using the rear ones
> instead does make a difference.

I don't think that rumor is true :) I think it might have been due to the order in which the ports are enumerated. The one closest to the Ethernet port will get assigned sda1 if all attached USB drives are thumb drives.

> A pity its stand seems unnecessarily flimsy,
> though it does help to make heat even less of an
> issue than on the Dockstar.
> The very same mceusb as above runs much more
> reliably on the E02, though I can't rule out that
> improvements to the driver deserve the credit.

-bodhi
TEN
Re: ARM Server Long Term Reliability Experience
March 28, 2015 03:16AM
bodhi Wrote:
> TEN Wrote:
> > Very similar: Dockstar [s]ometimes ... would need
> > a few cycles to launch from USB after power outages,
> > but I understand that was a function of Jeff's uBoot.
>
> I think rather Jeff's setup allows a corrupted
> rootfs have a chance to be fixed and boot again.

Well, in these cases it did fall back into stock, which I appreciate because it shows how alive the machine still is.
How's Jeff running fsck on boot? I thought some versions of his uBoot were just alternating the boot order between retries.

> > On Pogoplug E02's, I've now read the forums rumor
> > that media was more likely to get fried on the
> > front port, which is where I had usually put my
> > USB sticks and lost them after mere months, so
> > I'm now trying to find out if using the rear ones
> > instead does make a difference.
>
> I don't think that rumor is true :) I think it
> might have been due to the order of enumeration
> for the port. The one closest to the Ethernet port
> will get assigned sda1, if all USB drives attached
> are thumb drives.

Enumeration shouldn't make a difference to the wear of flash memory, though. Still, even though I never deploy a stick without prior testing (H2testw, as its Linux "port" has been in pilot for 5 years: https://fixfakeflash.wordpress.com/2010/08/20/linux-h2testw-alternative-program-called-f3-by-michel%C2%A0machado/), I know most sold these days are of limited longevity anyway, and a rumor's a rumor's a rumor.

> > The very same mceusb as above runs much more
> > reliably on the E02, while I can't rule out
> > improvements to the driver are to credit.

Which is part of bodhi's 3.16 of course :)
Re: ARM Server Long Term Reliability Experience
March 28, 2015 03:39AM
TEN,

> Well, in these cases it did fall back into stock -
> which is appreciated for this reason as it tells
> how much alive the machine still is.
> How's Jeff running fsck on boot? I thought some
> versions of his uBoot were just alternating the
> boot order between retries.

U-boot itself does not really do anything, but Jeff's env for bootcmd has a reset at the end. That's the convention I used, too. So if the disk has a problem spinning up, the reset cycle will allow it to try again. That's what I meant; perhaps that's what occurred when it took a few cycles to boot up?
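
The retry-by-reset behaviour described here can be modelled in a few lines. This is only a simulation in Python, not u-boot code; `spin_up_cycles` and `max_cycles` are hypothetical names invented for the illustration, and a real bootcmd ending in `reset` would simply keep cycling with no upper limit.

```python
def boot_with_reset(spin_up_cycles, max_cycles=10):
    """Simulate a bootcmd that ends in `reset`.

    spin_up_cycles: hypothetical number of boot cycles the drive needs
    before it responds (1 means it is ready on the first attempt).
    Returns the cycle on which boot succeeded, or None if it never did
    within max_cycles (a cap that exists only in this simulation).
    """
    for cycle in range(1, max_cycles + 1):
        drive_ready = cycle >= spin_up_cycles
        if drive_ready:
            return cycle  # rootfs found, kernel boots
        # Boot attempt failed; the trailing `reset` restarts u-boot.
    return None

print(boot_with_reset(3))
```

A drive that needs three cycles to spin up would make the box appear to "take a few cycles to boot", which matches the symptom reported earlier in the thread.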

> Enumeration shouldn't make a difference to the
> wear of flash memory though, but in spite of never
> deploying one without prior testing (H2testw, as
> it's Linux "port" has been in pilot for 5 years:
> https://fixfakeflash.wordpress.com/2010/08/20/linu
> x-h2testw-alternative-program-called-f3-by-michel%
> C2%A0machado/) I know most sold these days are of
> limited longevity anyway, and a rumor's a rumor's
> a rumor.

I did not mean it has anything to do with the wear and tear! And I always use the front port for USB and I have not noticed anything wrong.

So my hypothesis is: because of the enumeration order, someone who has a problem booting with a multi-drive configuration and then switches the rootfs to the rear port will see it boot up correctly every time. That could create the wrong impression that the front port tends to mess up USB thumb drives.

-bodhi
TEN
Re: ARM Server Long Term Reliability Experience
March 28, 2015 06:38AM
bodhi wrote:
> TEN wrote:
> > Well, in these cases it did fall back into stock
> > How's Jeff running fsck on boot? I thought some
> > versions of his uBoot were just alternating the
> > boot order between retries.
>
> U-boot itself does not really do anything, but
> Jeff's env for bootcmd has a reset at the end.
> That's the convention I used, too. So if the disk
> has a problem spinning up, the reset cycle will
> allow it to try again. That's what I meant,
> perhaps that's what occurred when it took a few
> cycles to boot up?

Not in these cases, where there was no disk to spin up: it would drop back into stock (which taught me that the stock system in NAND may be worth preserving, as Arch came to realize). That was fine by me, and it was easily rebooted back into the "real" OS. But probably even a USB stick wouldn't always initialize fast enough at cold power-up.

> > Enumeration shouldn't make a difference to the
> > wear of flash memory though [...]
> > I know most [USB sticks] sold these days are of
> > limited longevity anyway, and a rumor's a rumor
>
> I did not mean it has anything to do with the wear
> and tear! And I always use the front port for USB
> and I did not notice any thing wrong.

It's also the one mostly seen in blog pictures about hacking these Pogoplugs (the key-shaped sticks being the most visually intriguing ;)).

> So my hypothesis is: the enumeration does make it
> so that if someone has a problem booting with
> multi-drive configuration, and then switch the
> rootfs to the rear port, it will boot up
> correctly every time. That could cause a wrong
> impression that the front port tends to mess up
> USB thumb drives.

Well, there may be something electrically different about this particular port, as I've also had this earlier experience on the second V2 Pogo by now: http://forum.doozan.com/read.php?2,12096,16411#msg-16411 "irsend via mceusb ... work[s] on the top rear connector (but not the front one)" - with no software-detectable/logged errors, but IR LED staying dark (even to an appropriate sensor, i.e. PowerMid transmitter or digital camera) only there, this being the very command (ever so slightly) increasing power draw.

Speaking of which, though unlikely in spite of negotiable USB current, is there by any chance a way to completely cut & re-enable power to one particular port in software (that would open up some nice control opportunities) ?



Edited 2 time(s). Last edit at 03/28/2015 08:54AM by TEN.
Re: ARM Server Long Term Reliability Experience
March 28, 2015 02:29PM
TEN,

> Speaking of which, though unlikely in spite of
> negotiable USB current, is there by any chance a
> way to completely cut & re-enable power to one
> particular port in software (that would open up
> some nice control opportunities) ?

We could do it with GPIO. I recall there is a shell utility somewhere that allows you to control the pins. U-boot has a built-in command for GPIO, and I've included some of the code in the u-boot build, but I'm not sure it is usable yet.

-bodhi
Re: ARM Server Long Term Reliability Experience
March 28, 2015 02:55PM
bodhi Wrote:
-------------------------------------------------------
> TEN,
>
> > Speaking of which, though unlikely in spite of
> > negotiable USB current, is there by any chance a
> > way to completely cut & re-enable power to one
> > particular port in software (that would open up
> > some nice control opportunities) ?
>
> We could do it with GPIO. I recall there is a
> shell utility somewhere that allows you to control
> the pins. In u-boot there is a built-in command
> for GPIO, and I've included some of the code in
> the u-boot build, but not sure it is usable yet.

I briefly looked at the DTS, and it seems we don't have a way to do it for an individual USB port, only for USB power in general. But I need to look at this more closely to know for sure.

-bodhi
TEN
Re: RF or IR transceivers on USB / GPIO
April 10, 2015 07:47AM
bodhi Wrote:
> > > though unlikely in spite of negotiable USB current, is there by any chance a way to
> > > completely cut & re-enable power to one particular port in software
> > > (that would open up some nice control opportunities) ?
> >
> > We could do it with GPIO. I recall there is a
> > shell utility somewhere that allows you to
> > control the pins. In u-boot there is a built-in command
> > for GPIO, and I've included some of the code in
> > the u-boot build, but not sure it is usable yet.
>
> Briefly looked at the DTS, seems like we don't
> have a way to do it for individual USB port. Only
> USB power in general. But I need to look at this
> closely to know for sure.

Speaking of GPIO pins, do you see a way to use them (or the internal serial port if you happen to have one wired up) for http://lirc.org/receivers.html & http://lirc.org/transmitters.html and/or ISM RF transceivers (the latter requiring neither line of sight nor baseband modulation in software of course, while PowerMids etc. in appropriate locations can easily interface them with IR as well) ?
Some contraptions of similar flexibility ("transceivers of all trades") have been contrived for another ARM at http://www.harctoolbox.org/lirc_rpi.html.


mceusb (also particularly finicky on the front port, while it works on the rear ones) seems to be limited to a sample accuracy of 50 microseconds (cf. the rounding to that in "mode2 -d /dev/lirc0", as well as http://sourceforge.net/p/lirc/mailman/message/24253837/), while the default tolerance (aeps) in /etc/lirc/lircd.conf is only twice as much (quite prohibitive WRT raw_codes). Its transmit (blasting) fidelity is also rather problematic in comparison to the old serial port dongles (I haven't had a chance to try e.g. http://www.huitsing.nl/irftdi/ yet).
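
To illustrate why a 50 microsecond sample grid sits uncomfortably close to a tolerance of only twice that, here is a rough sketch. It assumes the receiver simply rounds each duration to the nearest 50 us and that matching uses an absolute tolerance alone; LIRC's real matching also involves a relative tolerance (eps), which is ignored here. The function names and the 530/626 us sample values are made up for the example.

```python
GRID_US = 50    # assumed receiver sample resolution, per the post
AEPS_US = 100   # absolute tolerance, twice the grid as described

def quantize(duration_us, grid=GRID_US):
    """Round a duration to the nearest multiple of the sample grid."""
    return (duration_us + grid // 2) // grid * grid

def matches(measured_us, expected_us, aeps=AEPS_US):
    """Absolute-tolerance check only (relative eps is ignored here)."""
    return abs(measured_us - expected_us) <= aeps

expected = 530   # hypothetical nominal pulse length from a config
raw = 626        # hypothetical true duration, 96 us off nominal

print(matches(raw, expected))            # within tolerance before rounding
print(matches(quantize(raw), expected))  # rounding adds up to 25 us of error
```

In this model the rounding alone can consume a quarter of the tolerance, so a signal that would match at full resolution can fall outside it once quantized.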

restamp Wrote:
> I'm a ham radio op

Sounds like you might be well equipped and interested to help optimize such Home Automation transceivers too. :)
(Cf. http://forum.doozan.com/read.php?2,2782,2998#msg-2998 already...)
I've got to admit I still even lack a (sampling) scope...



Edited 4 time(s). Last edit at 04/10/2015 08:29AM by TEN.
Re: RF or IR transceivers on USB / GPIO
April 11, 2015 09:26PM
TEN,

> Speaking of GPIO pins, do you see a way to use
> them (or the internal serial port if you happen to
> have one wired up) for
> http://lirc.org/receivers.html &
> http://lirc.org/transmitters.html and/or ISM RF
> transceivers (the latter requiring neither line of
> sight nor baseband modulation in software of
> course, while PowerMids etc. in appropriate
> locations can easily interface them with IR as
> well) ?

I am not sure. Usually when you plug in a device, it could have some GPIO pins mapped to it. I would have to know the device well and/or have the device here to install.

-bodhi
TEN
Re: RF or IR transceivers on USB / GPIO
April 12, 2015 03:34AM
bodhi Wrote:
> > Speaking of GPIO pins, do you see a way to use
> > them (or the internal serial port if you happen to
> > have one wired up) for
> > http://lirc.org/receivers.html &
> > http://lirc.org/transmitters.html and/or ISM RF
> > transceivers (the latter requiring neither line of
> > sight nor baseband modulation in software of
> > course, while PowerMids etc. in appropriate
> > locations can easily interface them with IR as
> > well) ?
>
> not sure. Usually when you plug in any device
> it could have some GPIO pins mapping to it. I
> would have to know device well and/or and having
> the device to install.

Cf. section "The hardware:" at http://aron.ws/projects/lirc_rpi/ as well as his info on assigning an IRQ for the pin.

Circuitry is as simple as it gets (only IR blasting will need to be driven by transistor, RF should interface directly both ways), but I've got fewer boards & tools to come up with an optimized schematic (and timings for irsend fidelity) for the Plug(s).

http://www.harctoolbox.org/lirc_rpi.html is quite an advanced driver to build on, suggesting only the further addition of placeholders for flexible measurement bits (as opposed to keypresses that only flip a counter/toggle bit at the most) to read weather sensors e.g. into /proc from the same hardware.

While my earlier hardware is not nearly as integrated, besides the convenience of scenario control across manufacturers (e.g. max-WAF ;-) wireless single-button VDR home cinema), I can attest to massive energy savings throughout the building from automated lighting, shutter&shade control.



Edited 3 time(s). Last edit at 04/12/2015 03:51AM by TEN.
Re: RF or IR transceivers on USB / GPIO
April 12, 2015 05:19AM
> Cf. section "The hardware:" at
> http://aron.ws/projects/lirc_rpi/ as well as his
> info on assigning an IRQ for the pin.

OK, then it is well documented, so it is most definitely possible through /sys/class/gpio/. As I recall, you just need to "export" a pin number, set its direction to out, and then "echo" a value to that pin (the same way you would set an LED). Our kernel is modern, so you can do this through the gpio device class. The hard part is finding out the correct pin number and experimenting with it on your test box to fool-proof the script.
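
A minimal sketch of that export / direction / value sequence, written in Python rather than shell so the steps can be wrapped in one function. It assumes the legacy sysfs GPIO interface under /sys/class/gpio is present on the running kernel. The `base` parameter is an addition for this example only, so the logic can be tried against a scratch directory, and the pin number in the usage note is purely hypothetical.

```python
import os

def set_gpio(pin, value, base="/sys/class/gpio"):
    """Drive a GPIO pin high (truthy) or low (falsy) via sysfs."""
    pin_dir = os.path.join(base, "gpio%d" % pin)
    if not os.path.isdir(pin_dir):
        # Ask the kernel to expose the pin as gpio<N>/
        with open(os.path.join(base, "export"), "w") as f:
            f.write(str(pin))
    # Configure the pin as an output
    with open(os.path.join(pin_dir, "direction"), "w") as f:
        f.write("out")
    # Write the requested level
    with open(os.path.join(pin_dir, "value"), "w") as f:
        f.write("1" if value else "0")
```

On a real box this would be run as root, for example `set_gpio(29, 1)`, with 29 replaced by whatever pin turns out to be the right one for the board. Testing on a spare unit first is prudent, since driving the wrong pin could have unwanted effects.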

-bodhi
TEN
Re: USB power switching & GPIO
April 17, 2015 01:38PM
bodhi Wrote:
> > > Speaking of which, though unlikely in spite of
> > > negotiable USB current, is there by any chance
> > > a way to completely cut & re-enable power to
> > > one particular port in software (that would open up
> > > some nice control opportunities) ?
> > I recall there is a shell utility somewhere that allows you
> > to control the pins. In u-boot there is a built-in command
> > for GPIO, and I've included some of the code in
> > the u-boot build, but not sure it is usable yet.
>
> Briefly looked at the DTS, seems like we don't
> have a way to do it for individual USB port. Only
> USB power in general. But I need to look at this
> closely to know for sure.

bsdfox at https://forum.openwrt.org/viewtopic.php?pid=250166#p250166 seems to discuss USB power and GPIOs, but probably on a v4, though using E02 files as a starting point.
Re: ARM Server Long Term Reliability - What's Your Experience?
April 25, 2015 05:28AM
Just in terms of reliability, I've had my E02 and my GoFlex Net in use since about 2012 - the GFN almost continuously.

Unlike the main goal of this forum, I stuck with the stock system and used optware to add the tools I needed - I have probably stressed the hardware less than other people but they're still both in perfect order. The only issue I have is that the GFN seems to completely freeze up every so often - but that's related to the pogoplug service going nuts with only 128mb RAM and not a hardware issue.

I've had the E02 at my place, and GFN at my parents (they have fibre, I'm on measly ADSL) - remote access works a treat and I utilise the my.pogoplug service quite heavily (one of the reasons I stayed stock). It's not a massive load, but they still keep going!
Re: ARM Server Long Term Reliability - What's Your Experience?
July 12, 2015 09:13PM
I got two dockstars in Oct 2010; both have been operating flawlessly since then.

One acts as a home server, providing: Exim, Samba, mediatomb, ftp, http, and tftp. The rootfs is on a flash drive running Wheezy.

The other is a print server connected via WiFi. It plays chess 24/7 at http://www.freechess.org. It runs squeeze from NAND and includes an updated G++ (4.8) in order to keep the engine at the latest dev version from git. (4.8 produces much better optimized code for the chess engine than the squeeze default)

I also have a sheevaplug that I got in Jan 2010. Its power supply developed problems in Jun 2011, so I replaced it. The sheeva ran all my server tasks until then, when I switched to a dockstar while the new P/S was on order from Globalscale. It currently runs Jessie from NAND, but I have little use for it right now so it collects dust. Although it has bigger NAND and RAM, it only has one USB port, and so the DockStar works better as my home server. The sheeva also has quite a lot of bad NAND blocks, and has had them since very early on. But it works fine anyway.