(Solved) GoFlex home stalling for 30 mins when there is no nc endpoint

Posted by tbp 
May 20, 2020 04:17PM
Hi guys, bit of a weird problem.
Thank you for the assistance recently in getting my GoFlex Home back online.
Now that I have it nicely set up with Debian and a 4TB HDD, I'm getting a long boot delay.

If I have a machine set up and listening at the NC IP, it boots straight through uboot (as I can see from the NC output) and is online within a minute or so.

However, if the NC endpoint isn't there, the device sits with the blinking green status light for around 30 minutes before spinning up the drive and initializing Linux.

This is an assumption, as I do not have serial cabling and therefore cannot actually see what is causing the long delay.
Note: during the delay I do not hear the IDE drive spin up, and I do not see the red LED for two seconds followed by steady green that I normally see towards the end of uboot initialisation.

I suspected it may be something getting stuck in a loop regarding NC uboot vars.
I therefore delicately removed the NC related variables, but this has had no effect and strangely I still get NC output for uboot.

The current printenv being used is listed below.
Any ideas? Thanks in advance for any advice.

Current uboot env vars
GoFlexHome> printenv
printenv
arcNumber=3338
bootcmd=run bootcmd_uenv; run scan_disk; run set_bootargs; run bootcmd_exec
bootcmd_exec=run load_uimage; if run load_initrd; then if run load_dtb; then bootm $load_uimage_addr $load_initrd_addr $load_dtb_addr; else bootm $load_uimage_addr $load_initrd_addr; fi; else if run load_dtb; then bootm $load_uimage_addr - $load_dtb_addr; else bootm $load_uimage_addr; fi; fi
bootcmd_uenv=run uenv_load; if test $uenv_loaded -eq 1; then run uenv_import; fi; sleep 3
bootdelay=10
bootdev=usb
device=0:1
devices=usb ide mmc
disks=0 1 2 3
dtb_file=/boot/dts/kirkwood-goflexhome.dtb
ethact=egiga0
ethaddr=[Deliberately Omitted]
if_netconsole=ping $serverip
ipaddr=192.168.0.90
led_error=orange blinking
led_exit=green off
led_init=green blinking
load_dtb=echo loading DTB $dtb_file ...; load $bootdev $device $load_dtb_addr $dtb_file
load_dtb_addr=0x1c00000
load_initrd=echo loading uInitrd ...; load $bootdev $device $load_initrd_addr /boot/uInitrd
load_initrd_addr=0x1100000
load_uimage=echo loading uImage ...; load $bootdev $device $load_uimage_addr /boot/uImage
load_uimage_addr=0x800000
mainlineLinux=yes
mtdids=nand0=orion_nand
mtdparts=mtdparts=orion_nand:1M(u-boot),-(rootfs)
nc_ready=1
ncip=192.168.0.191
ncipk=192.168.0.26
netconsole=off
partition=nand0,2
scan_disk=echo running scan_disk ...; scan_done=0; setenv scan_usb "usb start"; setenv scan_ide "ide reset"; setenv scan_mmc "mmc rescan"; for dev in $devices; do if test $scan_done -eq 0; then echo Scan device $dev; run scan_$dev; for disknum in $disks; do if test $scan_done -eq 0; then echo device $dev $disknum:1; if load $dev $disknum:1 $load_uimage_addr /boot/uImage 1; then scan_done=1; echo Found bootable drive on $dev $disknum; setenv device $disknum:1; setenv bootdev $dev; fi; fi; done; fi; done
serverip=192.168.0.191
set_bootargs=setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 $mtdparts $custom_params
start_netconsole=setenv ncip $serverip; setenv bootdelay 10; setenv stdin nc; setenv stdout nc; setenv stderr nc; version;
stderr=nc
stdin=nc
stdout=nc
uenv_addr=0x810000
uenv_import=echo importing envs ...; env import -t $uenv_addr $filesize
uenv_init_devices=echo Initializing devices...; setenv init_usb "usb start"; setenv init_ide "ide reset"; setenv init_mmc "mmc rescan"; for devtype in $devices; do run init_$devtype; done
uenv_load=run uenv_init_devices; setenv uenv_loaded 0; for devtype in $devices; do for disknum in $disks; do if test $uenv_loaded -eq 0; then setenv device_type $devtype; setenv disk_number $disknum; run uenv_read; fi; done; done;
uenv_read=echo Loading envs from $device_type $disk_number...; if load $device_type $disk_number:1 $uenv_addr /boot/uEnv.txt; then setenv uenv_loaded 1; echo ... envs loaded; fi
usb_ready_retry=15

UBoot output when NC is listening
0
Initializing devices...
starting USB...
USB0: USB EHCI 1.00
scanning bus 0 for devices... 1 USB Device(s) found
scanning usb for storage devices... 0 Storage Device(s) found

Reset IDE: Bus 0: ........OK Bus 1: not available
Device 0: Model: ST4000DM004-2CV104 Firm: 0001 Ser#: ZFN32R0G
Type: Hard Disk
Supports 48-bit addressing
Capacity: 3815447.8 MB = 3726.0 GB (7814037168 x 512)
Unknown command 'mmc' - try 'help'
Loading envs from usb 0...
** Bad device usb 0 **
Loading envs from usb 1...
** Bad device usb 1 **
Loading envs from usb 2...
** Bad device usb 2 **
Loading envs from usb 3...
** Bad device usb 3 **
Loading envs from ide 0...
** File not found /boot/uEnv.txt **
Loading envs from ide 1...
** Bad device ide 1 **
Loading envs from ide 2...
** Bad device ide 2 **
Loading envs from ide 3...
** Bad device ide 3 **
Loading envs from mmc 0...
** Bad device mmc 0 **
Loading envs from mmc 1...
** Bad device mmc 1 **
Loading envs from mmc 2...
** Bad device mmc 2 **
Loading envs from mmc 3...
** Bad device mmc 3 **
running scan_disk ...
Scan device usb
device usb 0:1
** Bad device usb 0 **
device usb 1:1
** Bad device usb 1 **
device usb 2:1
** Bad device usb 2 **
device usb 3:1
** Bad device usb 3 **
Scan device ide

Reset IDE: Bus 0: OK Bus 1: not available
Device 0: Model: ST4000DM004-2CV104 Firm: 0001 Ser#: ZFN32R0G
Type: Hard Disk
Supports 48-bit addressing
Capacity: 3815447.8 MB = 3726.0 GB (7814037168 x 512)
device ide 0:1
1 bytes read in 56 ms (0 Bytes/s)
Found bootable drive on ide 0
loading uImage ...
4973929 bytes read in 1120 ms (4.2 MiB/s)
loading uInitrd ...
9788920 bytes read in 2232 ms (4.2 MiB/s)
loading DTB /boot/dts/kirkwood-goflexhome.dtb ...
10249 bytes read in 95 ms (104.5 KiB/s)
## Booting kernel from Legacy Image at 00800000 ...
Image Name: Linux-5.2.9-kirkwood-tld-1
Created: 2020-05-11 22:09:27 UTC
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 4973865 Bytes = 4.7 MiB
Load Address: 00008000
Entry Point: 00008000
Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 01100000 ...
Image Name: initramfs-5.2.9-kirkwood-tld-1
Created: 2020-05-11 22:19:56 UTC
Image Type: ARM Linux RAMDisk Image (gzip compressed)
Data Size: 9788856 Bytes = 9.3 MiB
Load Address: 00000000
Entry Point: 00000000
Verifying Checksum ... OK
## Flattened Device Tree blob at 01c00000
Booting using the fdt blob at 0x1c00000

Starting kernel ...



Edited 2 time(s). Last edit at 06/02/2020 01:41PM by tbp.
Re: GoFlex home stalling for 30 mins when there is no nc endpoint
May 20, 2020 05:17PM
tbp,

It is a known problem. I will explain it, but in the meantime, set up netconsole following Step 10:

Quote
https://forum.doozan.com/read.php?3,12381

10. Set up netconsole. It's important to set up netconsole if you don't already have serial console connected. If you have serial console, don't set up netconsole at this moment, because it will interfere with serial console.

If you've flashed the default environments in step 8 then activate netconsole with the following envs:

Adjust 192.168.0.xxx and 192.168.0.yyy below to the real values in your own local network. 192.168.0.xxx is this plug's IP address, and 192.168.0.yyy is the IP address of the netconsole server that will monitor the output from this plug.

fw_setenv preboot_nc 'setenv nc_ready 0; for pingstat in 1 2 3 4 5; do; sleep 1; if run if_netconsole; then setenv nc_ready 1; fi; done; if test $nc_ready -eq 1; then run start_netconsole; fi'
fw_setenv preboot 'run preboot_nc'
fw_setenv ipaddr    '192.168.0.xxx'
fw_setenv serverip '192.168.0.yyy'

-bodhi
===========================
Forum Wiki
bodhi's corner



Edited 1 time(s). Last edit at 05/20/2020 05:18PM by bodhi.
Re: GoFlex home stalling for 30 mins when there is no nc endpoint
May 20, 2020 06:30PM
OK, here is why I have scripted the netconsole envs to have a loop:

Quote

fw_setenv preboot_nc 'setenv nc_ready 0; for pingstat in 1 2 3 4 5; do; sleep 1; if run if_netconsole; then setenv nc_ready 1; fi; done; if test $nc_ready -eq 1; then run start_netconsole; fi'

U-boot code for network is rather simplistic. The "ping" command should be used with caution, because it relies on the router to respond to a ping. If the netconsole server has been active in the recent past, the router still has the netconsole server's IP information, and it is not cleared soon enough when that box goes offline.

The result is a false positive from the ping. The u-boot network code will then take a long time (at least 2 minutes) retrying the connection when u-boot executes further envs to find the netconsole server.

So if we do a few pings in succession, chances are that one of the subsequent pings will fail. Netconsole is then recognized as "not ready", and u-boot stops any further setup for netconsole.
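
One way to reason about the loop is to model it outside U-Boot. The following is only a sketch in POSIX shell, not U-Boot hush: the function name nc_ready_after_probes is hypothetical, and each argument stands in for the result of one "run if_netconsole" (0 = reply seen, 1 = no reply). It follows the preboot_nc env as posted, where nc_ready starts at 0 and is set to 1 whenever a ping gets a reply.

```shell
#!/bin/sh
# Model of the preboot_nc loop in POSIX shell (not U-Boot hush).
# Each argument stands for one ping outcome: 0 = reply seen, 1 = no reply.
# nc_ready_after_probes is a hypothetical name used only for this sketch.
nc_ready_after_probes() {
    nc_ready=0
    for result in "$@"; do
        # The real env runs "sleep 1" and then "run if_netconsole" here.
        if [ "$result" -eq 0 ]; then
            nc_ready=1
        fi
    done
    echo "$nc_ready"
}

nc_ready_after_probes 1 1 1 1 1   # no replies at all: prints 0
nc_ready_after_probes 0 1 1 1 1   # one reply, then failures: prints 1
```

In this model a single reply is enough to leave nc_ready at 1, because nothing in the loop resets it after a later failure. Whether the real env behaves the same way on a given network is best confirmed over serial console.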

-bodhi
tbp
Re: GoFlex home stalling for 30 mins when there is no nc endpoint
May 22, 2020 02:18PM
Thank you Bodhi for the response, suggestion and explanation.

I re-applied the commands you suggested, saved them and rebooted.
I disconnected the network cable at the nc server, and later retried with no network cable connected at the GoFlex. Both times the GoFlex's green LED kept blinking for over 10 minutes (I gave up at that point).

Am I missing something? Is there any permanent method for me to remove pinging/searching for the nc server?
(I now have a rescue USB disk should I get into trouble in the future so have no need for the NC failsafe).

Current uboot vars if required.

U-Boot 2017.07-tld-1 (Sep 05 2017 - 00:21:31 -0700)
Seagate GoFlex Home
gcc (Debian 6.3.0-18) 6.3.0 20170516
GNU ld (GNU Binutils for Debian) 2.28
Hit any key to stop autoboot:  9 
 0 
GoFlexHome> printenv
printenv

arcNumber=3338
bootcmd=run bootcmd_uenv; run scan_disk; run set_bootargs; run bootcmd_exec
bootcmd_exec=run load_uimage; if run load_initrd; then if run load_dtb; then bootm $load_uimage_addr $load_initrd_addr $load_dtb_addr; else bootm $load_uimage_addr $load_initrd_addr; fi; else if run load_dtb; then bootm $load_uimage_addr - $load_dtb_addr; else bootm $load_uimage_addr; fi; fi
bootcmd_uenv=run uenv_load; if test $uenv_loaded -eq 1; then run uenv_import; fi; sleep 3
bootdelay=10
bootdev=usb
device=0:1
devices=usb ide mmc
disks=0 1 2 3
dtb_file=/boot/dts/kirkwood-goflexhome.dtb
ethact=egiga0
ethaddr=[Deliberately omitted]
if_netconsole=ping $serverip
ipaddr=192.168.0.90
led_error=orange blinking
led_exit=green off
led_init=green blinking
load_dtb=echo loading DTB $dtb_file ...; load $bootdev $device $load_dtb_addr $dtb_file
load_dtb_addr=0x1c00000
load_initrd=echo loading uInitrd ...; load $bootdev $device $load_initrd_addr /boot/uInitrd
load_initrd_addr=0x1100000
load_uimage=echo loading uImage ...; load $bootdev $device $load_uimage_addr /boot/uImage
load_uimage_addr=0x800000
mainlineLinux=yes
mtdids=nand0=orion_nand
mtdparts=mtdparts=orion_nand:1M(u-boot),-(rootfs)
nc_ready=1
ncip=192.168.0.191
ncipk=192.168.0.26
netconsole=off
partition=nand0,2
preboot=run preboot_nc
preboot_nc=setenv nc_ready 0; for pingstat in 1 2 3 4 5; do; sleep 1; if run if_netconsole; then setenv nc_ready 1; fi; done; if test $nc_ready -eq 1; then run start_netconsole; fi
scan_disk=echo running scan_disk ...; scan_done=0; setenv scan_usb "usb start";  setenv scan_ide "ide reset";  setenv scan_mmc "mmc rescan"; for dev in $devices; do if test $scan_done -eq 0; then echo Scan device $dev; run scan_$dev; for disknum in $disks; do if test $scan_done -eq 0; then echo device $dev $disknum:1; if load $dev $disknum:1 $load_uimage_addr /boot/uImage 1; then scan_done=1; echo Found bootable drive on $dev $disknum; setenv device $disknum:1; setenv bootdev $dev; fi; fi; done; fi; done
serverip=192.168.0.191
set_bootargs=setenv bootargs console=ttyS0,115200 root=LABEL=rootfs rootdelay=10 $mtdparts $custom_params
start_netconsole=setenv ncip $serverip; setenv bootdelay 10; setenv stdin nc; setenv stdout nc; setenv stderr nc; version;
stderr=nc
stdin=nc
stdout=nc
uenv_addr=0x810000
uenv_import=echo importing envs ...; env import -t $uenv_addr $filesize
uenv_init_devices=echo Initializing devices...; setenv init_usb "usb start";  setenv init_ide "ide reset";  setenv init_mmc "mmc rescan"; for devtype in $devices; do run init_$devtype; done
uenv_load=run uenv_init_devices; setenv uenv_loaded 0; for devtype in $devices; do for disknum in $disks; do if test $uenv_loaded -eq 0; then setenv device_type $devtype; setenv disk_number $disknum; run uenv_read; fi; done; done;
uenv_read=echo Loading envs from $device_type $disk_number...; if load $device_type  $disk_number:1 $uenv_addr /boot/uEnv.txt; then setenv uenv_loaded 1; echo ... envs loaded; fi
usb_ready_retry=15


Regards

===============
Moderator edit: please use code tags (Formatted code button) to post log.



Edited 1 time(s). Last edit at 05/22/2020 05:07PM by bodhi.
Re: GoFlex home stalling for 30 mins when there is no nc endpoint
May 22, 2020 05:13PM
tbp,

Connect all the network cables. Boot the GF Home. Log into Debian and clear the netconsole activation command:

fw_setenv preboot_nc_save 'setenv nc_ready 0; for pingstat in 1 2 3 4 5; do; sleep 1; if run if_netconsole; then setenv nc_ready 1; fi; done; if test $nc_ready -eq 1; then run start_netconsole; fi'
fw_setenv preboot_nc 'echo Netconsole is not running'

-bodhi



Edited 1 time(s). Last edit at 05/22/2020 05:14PM by bodhi.
tbp
Re: GoFlex home stalling for 30 mins when there is no nc endpoint
May 27, 2020 03:59PM
Thanks for the feedback bodhi.

I applied both of those commands from debian and rebooted.
However, it still appears to be sitting there waiting for an nc response.
I have checked both of the vars have been added/saved in uboot.

How long should it be waiting there for now? Should uboot now be going straight through without an nc server ping check?

Regards
Re: GoFlex home stalling for 30 mins when there is no nc endpoint
May 27, 2020 04:33PM
Ah, I see. Will post a few more envs. Your box is stuck in netconsole right now.

-bodhi
Re: GoFlex home stalling for 30 mins when there is no nc endpoint
May 27, 2020 05:20PM
tbp,

Here is the reason why netconsole is not robust.

U-boot netconsole is not a command like other capabilities are, so we set up the envs to activate it. It has been this way from day one (when Jeff did it for u-boot 2012), and it works fine 99% of the time. In some cases, because of weird network router behavior, you'll see problems.

The simple reason is that not many u-boot maintainers pay attention to this area. What we are using now has a few patches I wrote to make it more resilient. But in retrospect, I should have just created a new u-boot command to handle everything internally. I will do that if and when I release another version (I have built u-boot 2019 versions, but they are so bloated with new mainline code we don't use, and have become so big, that I don't want to release them).

==========

Try the below setenvs to clean up all netconsole hooks so you can run without it.

Connect all the network cables. Boot the GF Home. Log into Debian and clear the netconsole activation command:

fw_setenv preboot_nc_save 'setenv nc_ready 0; for pingstat in 1 2 3 4 5; do; sleep 1; if run if_netconsole; then setenv nc_ready 1; fi; done; if test $nc_ready -eq 1; then run start_netconsole; fi'
fw_setenv preboot_nc 'echo Netconsole is not running'
fw_setenv stderr serial
fw_setenv stdin serial
fw_setenv stdout serial
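
To confirm afterwards that no console variable still points at netconsole, a saved environment dump can be filtered. This is a minimal sketch: find_nc_hooks is a hypothetical helper name, and it assumes the dump is in the name=value form shown in the printenv listings in this thread.

```shell
#!/bin/sh
# Print any console variables that are still set to "nc".
# Reads an environment dump (name=value lines) from stdin.
# find_nc_hooks is a hypothetical helper name used only for this sketch.
find_nc_hooks() {
    grep -E '^(stdin|stdout|stderr)=nc$'
}

# Example use from Debian on the box (assumes fw_printenv is configured):
#   fw_printenv | find_nc_hooks
```

The function prints the offending lines and returns success if any are found. It prints nothing and returns non-zero once stdin, stdout and stderr no longer point at nc.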

-bodhi



Edited 1 time(s). Last edit at 05/28/2020 03:30AM by bodhi.
tbp
Re: GoFlex home stalling for 30 mins when there is no nc endpoint
June 02, 2020 01:40PM
That sorted it, thanks Bodhi.
uboot load time is now around 5-10 seconds.