
Dead Zyxel NSA221 after deleting partition - is it possible to boot it from a USB rescue stick?

OrlandoScarletOrlandoScarlet Posts: 9  Junior Member
edited July 5 in Questions
Trying to recover from a moment of madness...

Had a working NSA221 that I was able to access via the UI.

Inserted two new drives that I'd previously used in a Linux Mint system to check for bad blocks. Was expecting to create a RAID 1 via the Web UI on the NAS but somehow the UI was only offering to create a JBOD disk from Disk 1 (Drive 2 was greyed out).

I opened the Telnet backdoor and accessed the Zyxel via PuTTY, then examined fdisk output. Comparing /dev/sda and /dev/sdb there seemed an unexpected partition on sdb that I used fdisk to delete.

Since rebooting the NAS I am no longer able to access the Web UI, so I presume there must have been some small flash drive used for boot whose partition I have mistakenly deleted.

Is there any way to boot from a USB stick and perhaps attempt to recreate the deleted partition or re-install the Zyxel firmware from it? Or is there any alternative NAS software I can install or run from a USB stick to make use of the unit?

I seem to be able to find the archive of Zyxel firmware for the NSA221 and other resources, such as the archive of zyxel.nas-central.org, and am fairly Unix literate, but am not really finding steps on creating a rescue disk for the NSA221.

Is anyone able to give me some pointers so I can recover?

Cheers,
Orlando Scarlet


#NAS_Jul_2019

Answers

  • MijzelfMijzelf Posts: 784  Heroic Warrior Member
    The NAS can't boot from a USB stick. But the firmware can run a script from a USB stick before it accesses the disks. This makes it possible to start a telnet daemon.
    Such a stick can be found here, one of the universal_usb_key_func zipfiles.

  • OrlandoScarletOrlandoScarlet Posts: 9  Junior Member
    Hi Mijzelf,

    Thanks so much for the pointer.

    I haven't had much luck so far, so I'm wondering if I'm following the README correctly.
    Here's what I've done:
    • Downloaded universal_usb_key_func-2013-03-21.zip and expanded it to a "usb_key_func-2013-03-21" directory on my hard drive
    • Formatted a 32 GB thumb drive to FAT32
    • Copied all files from "usb_key_func-2013-03-21" to the root directory of the thumb drive
    • Copied usb_key_func.sh.network_telnet_stop within the USB drive to become usb_key_func.sh.2
    • Modified the file to change the IP address of the ifconfig call from 192.168.0.33 to 192.168.31.150 to match my network range (and to set the IP that the NAS was using before).
    • Made the changes in TextPad ensuring that the UNIX file mode was preserved (and using od to confirm):
      $ od -a usb_key_func.sh.2
      0000000   #   !   /   b   i   n   /   s   h  nl  nl   /   s   b   i   n
      0000020   /   i   f   c   o   n   f   i   g  sp   e   g   i   g   a   0
      0000040   :   1  sp   1   9   2   .   1   6   8   .   3   1   .   1   5
      0000060   0  sp   n   e   t   m   a   s   k  sp   2   5   5   .   2   5
      0000100   5   .   2   5   5   .   0  sp   u   p  nl  nl   t   e   l   n
      0000120   e   t   d  sp   -   l  sp   /   b   i   n   /   s   h  nl   /
      0000140   b   i   n   /   s   h  nl   e   x   i   t  sp   1  nl
      0000156
      
    • Inserted into the USB slot at the back of the NSA221 and powered on the unit
    • The activity LED flashes constantly
    • The USB LED lights up
    • The LED on the USB flash drive flashes six times in total (three times before and three times after an interval of a couple of seconds).
    • The Activity LED on the RJ45 connector at the back of the unit flashes (the corresponding column of lights for the switch on my desk also flashes)
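The `od -a` dump in the list above shows only `nl` line terminators and no `cr`, which is what the NAS shell needs. As a quicker check after editing on Windows, the same test can be scripted. This is only a minimal sketch: `has_unix_line_endings` is a hypothetical helper name, not part of the usb_key_func package.

```sh
#!/bin/sh
# has_unix_line_endings FILE
# Hypothetical helper: succeeds only if FILE contains no carriage returns,
# i.e. it was saved with Unix (LF) rather than DOS (CRLF) line endings.
has_unix_line_endings() {
    cr=$(printf '\r')          # a literal carriage return character
    ! grep -q "$cr" "$1"       # grep succeeds if a CR is present; invert that
}
```

For example, `has_unix_line_endings usb_key_func.sh.2 && echo "line endings OK"` prints the message only when the file is free of carriage returns.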

    However, even allowing several minutes, I can't access the NAS from a PuTTY Telnet session or ping the 192.168.31.150 IP address. I also can't see the device in the list of units connected to the network.

    I think I've followed the instructions properly, so am thinking my earlier action of deleting the wrong partition seems to have disrupted the boot process before the point where it looks for the script on the USB.

    I'm not sure what the boot process does but looking at "Bootlog_NSA-221" there seems a lot of activity before it gets to the first reference to usb_key_func.sh.2.

    I thought I'd enabled logging for PuTTY sessions so I could go back and see exactly what I did. I know I only deleted one partition via fdisk, but as time passes I'm not 100% sure whether it was /dev/sdb1 (as I originally thought) or /dev/sda1 -- is there something critical to the boot process for one but not other of those disks?

    I'm wondering if there's a way to get serial output from the NSA221 console to see more about what's going on?

    Any other thoughts would be very welcome!

    Cheers,
    Orlando
  • MijzelfMijzelf Posts: 784  Heroic Warrior Member
    edited July 6
    I suggest you try the 2015 zipfile. The main difference is that I added the NAS5xx and NSA326, but I also remember there was a checksum error for some box. I just can't remember which one, and the forum where it was reported is down.
    There is a known timing issue with the usb_key_func sticks, but in that case it wouldn't have accessed the stick at all, I think. A 221 indeed checks the stick twice, as you can read here, so your observation is right.

    <quote>I'm wondering if there's a way to get serial output from the NSA221 console to see more about what's going on?</quote>

    AFAIK yes. I never saw the mainboard of a 221 or a picture of it, but all ZyXEL devices I looked at had the same serial port (including 3 different modem/routers). For the 325 it's documented here. If you can find those 3space1 pins on your 221, you can assume it's a 3.3V TTL serial port.

    <quote> is there something critical to the boot process for one but not other of those disks?</quote>

    Not that I'm aware of. The box boots from flash, and accesses the disks equally. But you can simply exchange the disks, the sequence doesn't matter. The disks are recognized by the GUID of the internal RAID array, not by their physical position.

  • OrlandoScarletOrlandoScarlet Posts: 9  Junior Member
    Hi Mijzelf,

    Thank you for your continued support, very much appreciated.

    I tried with the 2015 zipfile, following the same approach I outlined earlier, but unfortunately got the exact same results I reported in my last update. I explored the links you provided and did some additional experimentation but sadly to no avail.

    I think it will prove interesting (and hopefully enlightening!) to go down the path of connecting a serial cable to see the console output and I see the following connectors on the board that I believe align with the 3space1 connection you mentioned for the 325:

    The dedicated FTDI cables to a SIL seem quite expensive so I'm hoping that the following is suitable: JANSANE PL2303TA USB to TTL Serial Cable. The cable is for a Raspberry Pi, which I believe has 3.3V TTL pins on its header. The details for the cable say "this usb debug cable can be configured for either v5 or v3.3 power output. Built-in PL2303 chipset has an on-board DC-DC converter."

    Per comments on using for 3.3V TTL: "The wiring is designed for 3.3V TTL Serial connection at RXD and TXD. The Wiring colors as follows: GREEN = TXD, WHITE = RXD, BLACK = signal Ground. The RED is VUSB(+5V) which IS NOT needed for Serial Connection."

    Based on the above, I intend to connect the TX, RX and GND leads from the cable to the respective pins but leave the 3.3V pin disconnected.

    Let me know if you think I'm getting ahead of myself and need to go with a more dedicated 3.3V cable.

    Cheers,
    Orlando







  • MijzelfMijzelf Posts: 784  Heroic Warrior Member
    That cable should be fine. And it's never a good idea to connect the Vcc, unless you have to power the box through the cable.
    BTW, if you have an RPi, you can also use its serial port.

  • OrlandoScarletOrlandoScarlet Posts: 9  Junior Member
    Hi Mijzelf,

    Thanks again - I've ordered the cable and will let you know what the console output tells us once it arrives.

    Cheers,
    Orlando

  • OrlandoScarletOrlandoScarlet Posts: 9  Junior Member
    Hi Mijzelf,

    For some reason, the first set of cables didn't work for me, so I ended up getting the following: CP2102 USB 2.0 to TTL UART 6Pin Serial Converter with Cables

    I struggled with that for a while, even resorting to getting my multi-meter out to confirm the expected pin-outs on the NAS before resolving it the way I had it wired originally (the connectors seemed very loose and I'm unsure the cables crimped a firm enough connection).

    Anyway, I have this working now and thought I would post the connections for the benefit of others:

    [Photos: wiring at CP2102; wiring at NAS]

    The following is how the CP2102 shows up in Device Manager (after the correct drivers were installed):



    The following are the settings in PuTTY (COM port will vary, depending on which USB port you connect to):



    Now that I have access to the console, I'll do a little research and capture a couple of logs to share.

    Cheers,
    Orlando
  • OrlandoScarletOrlandoScarlet Posts: 9  Junior Member
    Hi Mijzelf,

    I've done a quick initial review of the logs, which I have attached, and am a little puzzled by what I see.

    I've tried booting the NAS two ways:
    1. Without the original drives mounted (log file: zyxel_console_log_no-disks.txt)
    2. With the original drives (unmodified) re-inserted (log file: zyxel_console_log4_disks.txt)

    I've done a quick review against: Some_information_from_slash_proc_(NSA-221)

    In comparison to those log messages, here's where things seem to come off the rails when booting with no disks:
    ...
    sd 2:0:0:0: [sda] Attached SCSI removable disk
    sd 2:0:0:0: Attached scsi generic sg0 type 0
    
    umount: can't umount /zyxel/mnt/NAND: Invalid argument						<====
    bsname}: no internal disk available
     Flag_HD_Exists = 1
    WARNING: No valid partition on HDD or no HDD plugged!
    WARNING: No valid partition on HDD or no HDD plugged
    Booting from ramdisk
    gzip: /zyxel/mnt/NAND/sysdisk.img.gz: No such file or directory
    mount: mounting /dev/loop0 on /ram_bin failed: Invalid argument
    *** ERROR: Can not mount system image, file is invalid
    killall: udhcpc: no process killed
    mount: mounting /ram_bin/usr on /usr failed: No such file or directory
    mount: mounting /ram_bin/sbin on /sbin failed: No such file or directory
    mount: mounting /ram_bin/bin on /bin failed: No such file or directory
    mount: mounting /ram_bin/lib on /lib failed: No such file or directory
    tar: can't open '/ram_bin/tmp.tar.gz': No such file or directory
    cp: can't stat '/ram_bin/var/*': No such file or directory
    cp: can't stat '/ram_bin/home/*': No such file or directory
    cp: can't stat '/ram_bin/mnt/*': No such file or directory
    cp: can't stat '/ram_bin/etc/*': No such file or directory
    cp: can't stat '/bin/makedev.sh': No such file or directory
    /etc/init.d/rcS.221: line 307: ./makedev.sh: not found
    /etc/init.d/rcS.221: line 309: /etc/init.d/rcS2: not found
    
    Please press Enter to activate this console. sd 3:0:0:0: [sdb] 60555264 512-byte hardware sectors (31004 MB)
    sd 3:0:0:0: [sdb] Write Protect is off
    sd 3:0:0:0: [sdb] Assuming drive cache: write through
    sd 3:0:0:0: [sdb] 60555264 512-byte hardware sectors (31004 MB)
    sd 3:0:0:0: [sdb] Write Protect is off
    sd 3:0:0:0: [sdb] Assuming drive cache: write through
     sdb: sdb1
    sd 3:0:0:0: [sdb] Attached SCSI removable disk
    sd 3:0:0:0: Attached scsi generic sg1 type 0
    ------------------
    --- HANGS HERE ---
    ------------------
    
    My expectation is that this should still bring up the UI to inspect in administration mode (correct me if I'm getting ahead of myself...)

    When booting with the original drives installed I see the NAS trying to boot from disk first:
    ...
    OS type: Linux
    Block size=1024 (log=0)
    Fragment size=1024 (log=0)
    124928 inodes, 498688 blocks
    0 blocks (0%) reserved for the super user
    First data block=1
    Maximum filesystem blocks=524288
    61 block groups
    8192 blocks per group, 8192 fragments per group
    2048 inodes per group
    Superblock backups stored on blocks:
            8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409
    /dev/sda1 /zyxel/mnt/sysdisk ext2 ro 0 0
     Flag_HD_Exists = 0
    Boot from disk
    System disk image does NOT exist on HDD! Extract new firmware from NAND flash ...
    bsname}: skip changing partition name because parted command not available yet
    Filesystem label=
    OS type: Linux
    Block size=1024 (log=0)
    Fragment size=1024 (log=0)
    124928 inodes, 498688 blocks
    0 blocks (0%) reserved for the super user
    First data block=1
    Maximum filesystem blocks=524288
    61 block groups
    8192 blocks per group, 8192 fragments per group
    2048 inodes per group
    Superblock backups stored on blocks:
            8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409
    /dev/sda1 /zyxel/mnt/sysdisk ext2 rw 0 0
    gzip: /zyxel/mnt/NAND/sysdisk.img.gz: No such file or directory
    Checksum of sysdisk.img : d41d8cd98f00b204e9800998ecf8427e
    Checksum from INFO  : a54c703439224f1ab395b24004edc395
    Checksum of sysdisk.img does NOT match!
    WARNING: No valid partition on HDD or no HDD plugged
    Booting from ramdisk
    ...
    

    Within the above, I see a warning that the checksum for sysdisk.img is not as expected, which I also confirm from the following:
    / # cat /zyxel/mnt/info/image_checksum
    a54c703439224f1ab395b24004edc395 sysdisk.img
    
    / # md5sum /zyxel/mnt/sysdisk/sysdisk.img
    d41d8cd98f00b204e9800998ecf8427e  /zyxel/mnt/sysdisk/sysdisk.img
    

    It fails to boot from disk due to "WARNING: No valid partition on HDD or no HDD plugged", which does not match my recollection of the state of the disks (I had taken one disk out to insert into a desktop, mount there to Linux and take a backup copy, for safety, to a further disk).

    Q: Is the difference in chksum on sysdisk.img against /zyxel/mnt/info/image_checksum enough to prevent it from continuing the boot against the disk?
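One detail that may help with the question above: d41d8cd98f00b204e9800998ecf8427e is the MD5 of zero bytes of input, so the file being hashed was empty (or missing), not merely a different version. The following is a minimal sketch of the comparison, not the firmware's own logic. `md5_of` and `check_image` are hypothetical helper names, and it assumes `md5sum` and `awk` are available at the prompt.

```sh
#!/bin/sh
# MD5 digest of empty input.
EMPTY_MD5=d41d8cd98f00b204e9800998ecf8427e

# md5_of FILE: print only the hex digest of FILE.
md5_of() {
    md5sum "$1" | awk '{print $1}'
}

# check_image IMAGE INFO_FILE (hypothetical helper)
# Prints "empty" if IMAGE hashes like a zero-byte file, "match" if its digest
# equals the first field of INFO_FILE, and "mismatch" otherwise.
check_image() {
    actual=$(md5_of "$1")
    expected=$(awk '{print $1; exit}' "$2")
    if [ "$actual" = "$EMPTY_MD5" ]; then
        echo "empty"
    elif [ "$actual" = "$expected" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}
```

On the NAS this would be called as `check_image /zyxel/mnt/sysdisk/sysdisk.img /zyxel/mnt/info/image_checksum`, using the two paths shown above.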

    After the above it tries to boot to RAMDISK and fails with the same result and messages as when there are no disks present

    I will start by inserting the disks into a Linux desktop to inspect status to see if/how either differs from my recollection. It seems that if the original disk contents still exist then the NAS should be able to boot without going to RAMDISK (which might get the system back usable enough for me to undo whatever madness I created previously with fdisk).

    As I can now reach a prompt I will inspect things a little better, so will later post a further update of findings when I've had a chance to explore further.

    I'm currently trying to locate the script containing the chksum test on sysdisk.img to better understand the logic there, to see if that's why it no longer boots from the original disks.

    One other thing I tried quickly was getting 'fdisk -l' output, to see whether I can redefine the partition I believe I deleted and whether that helps. In that direction I've hit an immediate problem, as I'm getting the error "fdisk: can't open '/dev/null': No such file or directory":

    / # fdisk -l
    fdisk: can't open '/dev/null': No such file or directory
    
    / # ls -l /dev
    brw-r--r--    1 0        0           7,   0 Apr  8 01:29 loop0
    

    Any quick pointers on any of my above ramblings or any better strategy on recovering things would be very welcome!

    Cheers,
    Orlando
  • MijzelfMijzelf Posts: 784  Heroic Warrior Member
    Right. I think I know what is going on. A ZyXEL NAS (with the exception of the 220) has a part of the firmware (mainly the web interface) compressed in flash. When you install a disk, a small partition (512MB?) is created, a filesystem is created on it, and that compressed part is extracted to that filesystem as sysdisk.img. That file is actually an ext2 filesystem, which is loop-mounted somewhere, and using some bind mounts it's added to the root filesystem.
    When no disk is available, the compressed file is extracted to a ramdisk, to be able to use the webinterface.

    On boot the firmware checks if the checksum of sysdisk.img is equal to the known checksum of the compressed flash file, if not, a fresh one is extracted. I attached the script which does this (/etc/init.d/rcS.221)

    For some reason your compressed flash file (sysdisk.img.gz) is gone, or corrupted. That was no problem, until you deleted sysdisk.img on disk, which made the box unresponsive, as there is no copy of the webinterface anymore.

    On a 221, the flash which contains that file is an internal USB disk (in contrast with all other ZyXEL NASes, where it's some raw NAND flash). I don't know if that is a recognizable disk, or if it's soldered onto the PCB. I've never seen a 221.

    To get the box running, you'll have to put a valid sysdisk.img.gz on that usb disk. Maybe you can simply put it on an external usb thumb disk, the bootscript seems to loop through all available usb disks. I've extracted that file from fw 4.41, and put it here.



  • OrlandoScarletOrlandoScarlet Posts: 9  Junior Member
    Hi Mijzelf,

    Thanks for the continued help.

    I reviewed the script you provided which has really helped me relate to the log messages I was seeing during boot.

    I see the following block:
    ### Check USB key
    USB_CHECK_TIMEOUT=10
    check_time=0
    echo -n "INITRD: Trying to mount NAND flash as Root FS"
    while sg_map -x -i | grep "${NAND_DISK}" > /dev/null 2>&1
    	[ $? -ne 0 ] && [ $check_time -lt $USB_CHECK_TIMEOUT ]
    do
    	echo -n "."
    	check_time=$(($check_time+1))
    	sleep 1
    done
    
    which I believe relates to the following log fragment:
    ...scsi 2:0:0:0: Direct-Access     ZyXEL    USB DISK 2.0     PMAP PQ: 0 ANSI: 0 CCS
    Each dot is one pass through the one-second retry loop, so the three dots mean that "sg_map -x -i" listed the NAND disk after about three seconds, well before the 10-second retry limit.
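The retry loop quoted above follows a common "poll until ready or give up" pattern. A generic version can be sketched as below. This is an illustration under assumptions, not the firmware's code: `wait_for` and its parameters are hypothetical names.

```sh
#!/bin/sh
# wait_for MAX_TRIES INTERVAL CMD [ARGS...]   (hypothetical helper)
# Runs CMD repeatedly until it succeeds. Prints one dot per failed attempt,
# sleeps INTERVAL seconds between attempts, and gives up after MAX_TRIES dots.
# Returns 0 if CMD eventually succeeded, 1 if the retry limit was reached.
wait_for() {
    max_tries=$1
    interval=$2
    shift 2
    tries=0
    until "$@" > /dev/null 2>&1; do
        if [ "$tries" -ge "$max_tries" ]; then
            return 1
        fi
        printf '.'
        tries=$((tries + 1))
        sleep "$interval"
    done
    return 0
}
```

With `NAND_DISK` exported, the script's check would look roughly like `wait_for 10 1 sh -c 'sg_map -x -i | grep "$NAND_DISK"'`. The number of dots printed is the number of failed polls before the disk showed up.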

    Then we enter the following code block:
    ### check upgrade key
    any_usb=`sg_map -x -i|grep -v " 0 0 0 0"|grep -v " 1 0 0 0"|grep -v "${NAND_DISK}"|awk '{print $7}'`
    echo "${any_usb}"
    if [ -n "${any_usb}" ]; then
    	/bin/mkdir /mnt/parnerkey
    	for usb in ${any_usb}
    	do
    		echo "mount upgrade key"
    		mount "${usb}"1 /mnt/parnerkey
    		ls -la /mnt/parnerkey | grep "NSA221_fw"
    		FW=$?
    		ls -al /mnt/parnerkey | grep "NSA221_pwr_func_check"
    		PWR=$?
    		if [ $FW == 0 ] || [ $PWR == 0 ] ; then
    			/sbin/check_key /mnt/parnerkey/NSA221_check_file
    			if [ $? == 0 ] ; then
    				echo "========  Start USB Upgrade Key  ========"
    				/mnt/parnerkey/usb_key_func.sh
    				test $? -eq 0 && exit 0
    			fi
    			umount /mnt/parnerkey
    			exit 1
    		else
    			umount /mnt/parnerkey
    		fi
    	done
    	rmdir /mnt/parnerkey
    fi
    
    which maps to the remaining output following the three dots:
    scsi 2:0:0:0: Direct-Access     ZyXEL    USB DISK 2.0     PMAP PQ: 0 ANSI: 0 CCS
    scsi 3:0:0:0: Direct-Access              USB DISK 2.0     PMAP PQ: 0 ANSI: 6
    sd 2:0:0:0: [sda] 247808 512-byte hardware sectors (127 MB)
    sd 2:0:0:0: [sda] Write Protect is off
    sd 2:0:0:0: [sda] Assuming drive cache: write through
    sd 2:0:0:0: [sda] 247808 512-byte hardware sectors (127 MB)
    sd 2:0:0:0: [sda] Write Protect is off
    sd 2:0:0:0: [sda] Assuming drive cache: write through
     sda:
    sd 2:0:0:0: [sda] Attached SCSI removable disk
    sd 2:0:0:0: Attached scsi generic sg0 type 0
    
    umount: can't umount /zyxel/mnt/NAND: Invalid argument
    bsname}: no internal disk available
    

    The lines containing "scsi X:0:0:0: ..." seem to be the output assigned to the "any_usb" variable.

    I believe:
    • "scsi 2:0:0:0" is the INTERNAL USB you identified (I assume the "Zyxel" within the line is the label assigned to the disk?)
    • "scsi 3:0:0:0" is the USB I inserted (which contains files from 2015 zipfile)
    I'm not 100% sure of the last bullet since the label on my thumb drive is "USB DISK" which should show up in the line (I'll change the label to something more distinct so it's easier to tell if I'm right).

    However, it also strikes me the format of the output in "any_usb" is different to that I expected -- I thought (by picking off a single column via the "awk { print $7 }", it would just be a single device value, like "/dev/sda"??

    I can escape the hang at the end of the failed boot to get to the busybox prompt, so I'll run the "sg_map" command to see what the output should look like.

    The other thing that bothers me is that the log output never shows the phrase "mount upgrade key", which should be seen once for each iteration of the loop in the above code block.

    Then the last but one line logged (the umount failure) seems to come from the code block following the one discussed above:
                    ...
    		if [ -f ${NAND_PATH}/sysdisk.img.gz ]; then
    			echo "Find compressed sys image NAND"
    			break
    		else
    			umount ${NAND_PATH}    <====
    		fi
    
    The worrying thing is that neither of the log lines in that block ("There is new sys image" or "Find compressed sys image NAND") is seen, suggesting execution doesn't enter that block, though if that's the case it shouldn't get to the umount call either.

    That might suggest that none of the USB drives are mounted which is consistent with what I have been seeing so far from the busybox prompt (which would be worrying as it would make accessing a fresh copy of "sysdisk.img" from a USB stick impossible).

    Again, now that I have the script for reference about device paths and exact syntax on mount commands, I'll do some additional exploration to see what more I can figure out and let you know.

    One final thought...

    If all else fails, I was wondering if the procedure documented here could be used to replace everything that is missing: ftp://ftp.zyxel.it/guide/nas/nsa220_recovery_firmware.pdf via tftp?

    The challenge there is that unless I can mount a USB on the NAS to make 400AFM4CO.bin available, I don't have an environment where I can run "bin2ram" or "fw_unpack" to get the ~12 DATA_ files I'd need to stage on the tftp server.

    Anyway, one step at a time -- your help has given me a good direction to follow to see if I can find a good way to locally restore the sysdisk.img and I will let you know how things go.

    Again, huge thanks for your invaluable assistance.

    Cheers,
    Orlando


  • MijzelfMijzelf Posts: 784  Heroic Warrior Member
    <quote>I'm not 100% sure of the last bullet since the label on my thumb drive is "USB DISK" which should show up in the line</quote>

    I don't think it's the label here. The label is written on the disk, and sg_map works on a lower level. It will be some vendor string.

    <quote>However, it also strikes me the format of the output in "any_usb" is different to that I expected -- I thought (by picking off a single column via the "awk { print $7 }", it would just be a single device value, like "/dev/sda"??</quote>

    Yes. Later it tries to mount each 1st partition:

    for usb in ${any_usb}
        do
            mount "${usb}"1 /mnt/parnerkey
    Where do you read any_usb contains something different?
    <quote>If all else fails, I was wondering if the procedure documented here could be used to replace everything that is missing: ftp://ftp.zyxel.it/guide/nas/nsa220_recovery_firmware.pdf via tftp?</quote>
    For some reason I can't reach that document from here. But the nsa220 is completely different from the nsa221. It doesn't have uboot, for instance, so I wouldn't be surprised if it doesn't work. But if needed I can extract a firmware for you.
  • OrlandoScarletOrlandoScarlet Posts: 9  Junior Member
    Hi Mijzelf,

    Thanks for keeping me honest - sadly, I was looking at the wrong block of code (it was late for me -- that's my excuse and I'm sticking to it :p).

    When I tested at the command line I realized the block of code I had been looking at was running "grep -v" to exclude all the lines containing "USB DISK":
    / # export NAND_DISK="USB DISK"
    / # echo $NAND_DISK
    USB DISK
    
    / # sg_map -x -i|grep -v " 0 0 0 0"|grep -v " 1 0 0 0"|grep -v "${NAND_DISK}"|awk '{print $7}'
    
    / #  sg_map -x -i
    /dev/sg0  0 0 0 0  0  /dev/sda  HGST      HGST HDN724040AL  MJAO
    /dev/sg2  2 0 0 0  2  /dev/sdb  ZyXEL     USB DISK 2.0      PMAP
    /dev/sg3  3 0 0 0  3  /dev/sdc            USB DISK 2.0      PMAP
    
    / # sg_map -x -i|grep -v " 0 0 0 0"|grep -v " 1 0 0 0"|grep -v "${NAND_DISK}"
    / #
    
    What I was seeing in the logs started to make more sense when reviewed against the correct block of code :open_mouth:
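The effect of the two filters can be reproduced away from the NAS by feeding them the `sg_map -x -i` output shown above. This is a sketch under assumptions: the function names are hypothetical, the real value of `NAND_DISK` in the firmware script is not shown in this thread, and `/proc/scsi/scsi_zyxel` is assumed to have the device name in column 7, as the `awk '{print $7}'` in the script suggests.

```sh
#!/bin/sh
# Sample `sg_map -x -i` output, copied from the transcript above.
sample_sg_map() {
    cat <<'EOF'
/dev/sg0  0 0 0 0  0  /dev/sda  HGST      HGST HDN724040AL  MJAO
/dev/sg2  2 0 0 0  2  /dev/sdb  ZyXEL     USB DISK 2.0      PMAP
/dev/sg3  3 0 0 0  3  /dev/sdc            USB DISK 2.0      PMAP
EOF
}

# upgrade_key_candidates NAND_DISK  (hypothetical name)
# Same pipeline as the "check upgrade key" block: drop the two internal disk
# slots, drop anything matching NAND_DISK, print the device column.
upgrade_key_candidates() {
    grep -v " 0 0 0 0" | grep -v " 1 0 0 0" | grep -v "$1" | awk '{print $7}'
}

# nand_candidates NAND_DISK  (hypothetical name)
# Same filter as the "Find NAND Flash" block: keep only lines matching NAND_DISK.
nand_candidates() {
    grep "$1" | awk '{print $7}'
}

# With NAND_DISK="USB DISK", both USB sticks are excluded from the first
# filter (empty brackets) and both are kept by the second.
echo "upgrade-key candidates: [$(sample_sg_map | upgrade_key_candidates "USB DISK")]"
echo "NAND candidates: $(sample_sg_map | nand_candidates "USB DISK" | tr '\n' ' ')"
```

Note that with this value a thumb drive whose vendor string contains "USB DISK" can never be picked up as an upgrade key, whereas a stick reporting a different string would pass the first filter.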

    When I manually step through the right block, which you had already steered me to, I think this illustrates where things come off the rails. This is the code:
    ### Find NAND Flash
    /bin/mkdir -m 777 -p ${NAND_PATH}
    
    any_usb=`cat /proc/scsi/scsi_zyxel | grep "${NAND_DISK}"  | awk '{print $7}'`
    if [ -n "${any_usb}" ]; then
    	for usb in ${any_usb}
    	do
    		ls -l /"$usb"1 ${NAND_PATH} > /dev/null 2>&1
    		chmod -R 777  ${NAND_PATH}
                    ...
    
    and these are my results from the command line:
    / # export NAND_PATH="/zyxel/mnt/NAND"
    / # echo $NAND_PATH
    /zyxel/mnt/NAND
    
    / # cat /proc/scsi/scsi_zyxel | grep "${NAND_DISK}"  | awk '{print $7}'
    /dev/sdb
    /dev/sdc
    
    / # mkdir -p /zyxel/mnt/NAND
    / # ls -l ${NAND_PATH}
    / #
    
    / # /bin/mount -t ext3 /dev/sdb1 ${NAND_PATH}
    mount: mounting /dev/sdb1 on /zyxel/mnt/NAND failed: No such file or directory
    / # ls -l /dev
    -rw-r--r--    1 0        0                0 Jan  1 00:04 null
    
    Although "sg_map" successfully lists the USB disks present, the script is failing to mount them (as it redirects the output to /dev/null, we weren't getting the clue we needed).

    Does this strike you as the "timing issue"?
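Another possible reading of the mount failure above, offered as an assumption and not something confirmed in this thread: the `ls -l /dev` output lists no block device nodes at all (only a regular file named null), and the earlier no-disk boot log shows makedev.sh could not be found. If /dev/sdb1 simply does not exist as a node, mount would report "No such file or directory" regardless of timing. The sketch below computes the minor number the Linux sd driver conventionally assigns (major 8, 16 minors per disk), so a node could be created by hand. `sd_minor` is a hypothetical helper, and whether this firmware's busybox includes `mknod` is not known.

```sh
#!/bin/sh
# sd_minor NAME   (hypothetical helper)
# Prints the conventional minor number for an sd device name such as "sdb"
# or "/dev/sdb1". Valid for sda..sdp and partitions 1..15 only.
sd_minor() {
    name=${1#/dev/}                                   # strip optional /dev/ prefix
    letter=$(printf '%s' "$name" | cut -c3)           # disk letter: a, b, c, ...
    part=$(printf '%s' "$name" | cut -c4-)            # partition number, may be empty
    index=$(( $(printf '%d' "'$letter") - 97 ))       # a -> 0, b -> 1, ...
    echo $(( index * 16 + ${part:-0} ))
}

# Possible use at the NAS prompt (assumes mknod is available there):
#   [ -b /dev/sdb1 ] || mknod /dev/sdb1 b 8 "$(sd_minor sdb1)"
```

After creating the node, the same `mount` command from the transcript could be retried to see whether the error changes.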

    Regarding the alternate approach, it is strange I can still get to ftp://ftp.zyxel.it/guide/nas, but it stalls when attempting to retrieve "nsa220_recovery_firmware.pdf" so I've attached it.

    However, on a more positive note, it does seem the NSA-221 offers "U-Boot", as I can get to a prompt when I press a key early enough in the boot cycle to stop the autoboot:
    System halted.
    
    
    1.00 U-Boot 1.1.2 (Apr 11 2010 - 20:23:49)
    
    U-Boot code: 48D00000 -> 48D172F4  BSS: -> 48D1AFA4
    RAM Configuration:
            Bank #0: 48000000 256 MB
    SRAM Configuration:
            128KB at 0x58000000
    Flash:  4 MB
    In:    serial
    Out:   serial
    Err:   serial
    Hit any key to stop autoboot:  0
    $
    $ help
    ?       - alias for 'help'
    base    - print or set address offset
    bdinfo  - print Board Info structure
    bootm   - boot application image from memory
    bootp   - boot image via network using BootP/TFTP protocol
    cmp     - memory compare
    cp      - memory copy
    crc32   - checksum calculation
    echo    - echo args to console
    erase   - erase FLASH memory
    exit    - exit script
    flinfo  - print FLASH memory information
    go      - start application at address 'addr'
    help    - print online help
    iminfo  - print header information for application image
    loop    - infinite loop on address range
    md      - memory display
    mm      - memory modify (auto-incrementing)
    mtest   - simple RAM test
    mw      - memory write (fill)
    nm      - memory modify (constant address)
    ping    - send ICMP ECHO_REQUEST to network host
    printenv- print environment variables
    protect - enable or disable FLASH write protection
    rarpboot- boot image via network using RARP/TFTP protocol
    reset   - Perform RESET of the CPU
    run     - run commands in an environment variable
    saveenv - save environment variables to persistent storage
    setenv  - set environment variables
    test    - minimal test like /bin/sh
    tftpboot- boot image via network using TFTP protocol
    version - print monitor version
    $
    

    I think this does open the door to a potential reflash via TFTP and, if you are kind enough to help me get the firmware files, I would probably like to do this in the long term so the boot from flash will work in the future.

    However, my first stop will be removing the physical drive to my linux box to re-instate the sysdisk.img.gz you were kind enough to share. If the md5sum matches what's on the INFOPATH (mounted from "/dev/mtdblock4" to "/zyxel/mnt/info/"), then I could be back in business booting from the hard disk.

    Thanks again - will keep you posted.

    Cheers,
    Orlando

  • OrlandoScarletOrlandoScarlet Posts: 9  Junior Member
    Hi Mijzelf,

    I inserted one of the original HDDs from the NAS into a Linux box so I could mount it and copy the sysdisk.img you provided (md5sum: 12427e0a1c921f9bd7819a3736228628). The sysdisk.img was the only file on the HDD.

    I was concerned this might not work since the 12427e0a1c921f9bd7819a3736228628 md5sum does not match the "a54c703439224f1ab395b24004edc395" value in /zyxel/mnt/info/image_checksum:
    / # for file in /zyxel/mnt/info/*; do printf '\n--- %s ---\n' ${file}; cat ${file}; done
    
    --- /zyxel/mnt/info/core_checksum ---
    9382
    
    --- /zyxel/mnt/info/fwversion ---
    4.41(AFM.0)
    
    --- /zyxel/mnt/info/image_checksum ---
    a54c703439224f1ab395b24004edc395 sysdisk.img      <=====
    
    --- /zyxel/mnt/info/initrd_checksum ---
    104C
    
    --- /zyxel/mnt/info/modelid ---
    DC01
    
    --- /zyxel/mnt/info/revision ---
    39717
    
    --- /zyxel/mnt/info/romfile_checksum ---
    E863
    
    --- /zyxel/mnt/info/zld_checksum ---
    F304
    

    However, although it sees the HDD, it seems unable to successfully mount it at /zyxel/mnt/sysdisk. Here's what I see in the logs:
    INITRD: Trying to mount NAND flash as Root FS.egiga0: PHY is Realtek RTL8211BGR
    egiga0: link down
    ...scsi 2:0:0:0: Direct-Access     ZyXEL    USB DISK 2.0     PMAP PQ: 0 ANSI: 0 CCS
    sd 2:0:0:0: [sdb] 247808 512-byte hardware sectors (127 MB)
    sd 2:0:0:0: [sdb] Write Protect is off
    sd 2:0:0:0: [sdb] Assuming drive cache: write through
    sd 2:0:0:0: [sdb] 247808 512-byte hardware sectors (127 MB)
    sd 2:0:0:0: [sdb] Write Protect is off
    sd 2:0:0:0: [sdb] Assuming drive cache: write through
     sdb:
    sd 2:0:0:0: [sdb] Attached SCSI removable disk
    sd 2:0:0:0: Attached scsi generic sg1 type 0
    
    umount: can't umount /zyxel/mnt/NAND: Invalid argument
    /bin/storage_gen_mntfw.sh: line 110: e2fsck: not found	  <====
    bsname}: skip changing partition name because parted command not available yet
    Filesystem label=
    OS type: Linux
    Block size=1024 (log=0)
    Fragment size=1024 (log=0)
    124928 inodes, 498688 blocks
    0 blocks (0%) reserved for the super user
    First data block=1
    Maximum filesystem blocks=524288
    61 block groups
    8192 blocks per group, 8192 fragments per group
    2048 inodes per group
    Superblock backups stored on blocks:
            8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409
    /dev/sda1 /zyxel/mnt/sysdisk ext2 ro 0 0
     Flag_HD_Exists = 0
    Boot from disk
    System disk image does NOT exist on HDD! Extract new firmware from NAND flash ...
    bsname}: skip changing partition name because parted command not available yet
    Filesystem label=
    OS type: Linux
    Block size=1024 (log=0)
    Fragment size=1024 (log=0)
    124928 inodes, 498688 blocks
    0 blocks (0%) reserved for the super user
    First data block=1
    Maximum filesystem blocks=524288
    61 block groups
    8192 blocks per group, 8192 fragments per group
    2048 inodes per group
    Superblock backups stored on blocks:
            8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409
    /dev/sda1 /zyxel/mnt/sysdisk ext2 rw 0 0
    gzip: /zyxel/mnt/NAND/sysdisk.img.gz: No such file or directory
    Checksum of sysdisk.img : d41d8cd98f00b204e9800998ecf8427e <= md5sum of 0 byte file
    Checksum from INFO  : a54c703439224f1ab395b24004edc395
    Checksum of sysdisk.img does NOT match!
    WARNING: No valid partition on HDD or no HDD plugged
    Booting from ramdisk
    

    It's curious why it can no longer find "e2fsck" and logs the following error:
    /bin/storage_gen_mntfw.sh: line 110: e2fsck: not found
    The above is probably the result of the partition deletion that started these troubles, but I'm not sure it should be a roadblock to booting from the HDD.

    I'm uploading the full boot log so you can review it more broadly, but it seems, from the following log fragment much earlier in the boot process, that it sees the HDD:
    scsi 0:0:0:0: Direct-Access     HGST     HGST HDN724040AL MJAO PQ: 0 ANSI: 5
    sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
    sd 0:0:0:0: [sda] 7814037168 512-byte hardware sectors (4000787 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
    sd 0:0:0:0: [sda] 7814037168 512-byte hardware sectors (4000787 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
     sda: sda1 sda2
    sd 0:0:0:0: [sda] Attached SCSI disk
    sd 0:0:0:0: Attached scsi generic sg0 type 0
    
    but somehow the boot process is electing not to mount the HDD at /zyxel/mnt/sysdisk to see the restored sysdisk.img file I copied there.

    Consequently an empty file somehow gets created at /zyxel/mnt/sysdisk/sysdisk.img, which gives rise to the following message:
    Checksum of sysdisk.img : d41d8cd98f00b204e9800998ecf8427e
    That is the MD5 checksum of a zero-byte file, such as one created by "touch":
    $ touch zeroByteFile
    
    $ ls -l zeroByteFile
    -rw-rw-r--+ 1 ORLANDO  None 0 Jul 19 23:36 zeroByteFile
    
    $ ls zeroByteFile
    zeroByteFile
    
    $ md5sum zeroByteFile
    d41d8cd98f00b204e9800998ecf8427e *zeroByteFile
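
The same comparison can be wrapped in a small check for any candidate image file. This is only a sketch: `classify_image` is a made-up helper name, not something from the ZyXEL firmware, and it assumes `md5sum` and `cut` are available in the shell where it runs.

```shell
# MD5 of zero bytes of input; this is the value the boot log printed for sysdisk.img.
EMPTY_MD5="d41d8cd98f00b204e9800998ecf8427e"

# Print "empty" if the given file hashes to the empty-input MD5, otherwise "has-data".
# classify_image is a hypothetical helper name, not part of the firmware.
classify_image() {
    sum=$(md5sum < "$1" | cut -d' ' -f1)
    if [ "$sum" = "$EMPTY_MD5" ]; then
        echo "empty"
    else
        echo "has-data"
    fi
}
```

Calling `classify_image /zyxel/mnt/sysdisk/sysdisk.img` would then tell a zero-byte placeholder apart from a real image, without saying anything about whether a non-empty image is the right one.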
    

    I tried running "/bin/storage_gen_mntfw.sh" with the appropriate arguments as soon as I could reach a prompt:
    / # export DISKPATH="/zyxel/mnt/sysdisk"
    / # echo $DISKPATH
    /zyxel/mnt/sysdisk
    / # sh -x /bin/storage_gen_mntfw.sh ${DISKPATH}; echo $?
    + bsname=storage_gen_mntfw.sh
    + mkdir -p -m 777 /zyxel/mnt/sysdisk
    + [ ! -d /zyxel/mnt/sysdisk ]
    + devsdx=
    + mkfsed1=
    + mkfsed2=
    + cat /etc/modelname
    cat: can't open '/etc/modelname': No such file or directory
    + MODEL=
    + [ 1 -ne 1 ]
    + clean1=
    + sg_map -x -i
    + grep  0 0 0 0
    + awk {print $7}
    + cut -c 6-8
    + swap1=sda
    + [ sda !=  ]
    + get_part sda
    + ls /sys/block/sda/
    + grep sda
    + wc -l
    + twopart=2
    + cat /sys/block/sda/size
    + hddsize=7814037168
    + let hddsize=7814037168/2048
    + [ 3815447 -ge 2097152 ]
    + [ 2 -eq 2 ]
    + fwpart=no_parted_in_initrd
    + swappart=no_parted_in_initrd
    + PARTED=1
    + [ no_parted_in_initrd !=  ]
    + [ 2 -eq 2 ]
    + e2fsck -n /dev/sda1
    /bin/storage_gen_mntfw.sh: line 110: e2fsck: not found
    + clean1=127
    + devsdx=/dev/sda
    + [ 127 -eq 0 ]
    + sg_map -x -i
    + grep  1 0 0 0
    + awk {print $7}
    + cut -c 6-8
    + swap2=
    + [  !=  ]
    + [ sda ==  ]
    + cat /proc/mounts
    + grep /zyxel/mnt/sysdisk
    + whymounted=/dev/sda1 /zyxel/mnt/sysdisk ext2 rw 0 0
    none /zyxel/mnt/sysdisk ramfs rw 0 0
    + [ /dev/sda1 /zyxel/mnt/sysdisk ext2 rw 0 0
    none /zyxel/mnt/sysdisk ramfs rw 0 0 !=  ]
    + echo storage_gen_mntfw.sh: why /zyxel/mnt/sysdisk is already mounted
    storage_gen_mntfw.sh: why /zyxel/mnt/sysdisk is already mounted
    + exit 1
    1
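
The `clean1=127` in the trace is worth a note: 127 is the exit status a POSIX shell reports when a command cannot be found, so it says nothing about the state of /dev/sda1. The sketch below reproduces that with a deliberately nonexistent command (`no_such_fsck_tool` is a made-up stand-in for the missing e2fsck). What the real script does after `[ 127 -eq 0 ]` fails is not visible in the trace, so the three-way verdict here is only an illustration.

```shell
# Run a command that does not exist and capture its exit status, the same way
# the trace shows clean1 being set right after the failed e2fsck call.
# "no_such_fsck_tool" is a made-up name standing in for the missing e2fsck.
clean1=0
no_such_fsck_tool -n /dev/null 2>/dev/null || clean1=$?

# POSIX shells report 127 when the command cannot be found at all.
if [ "$clean1" -eq 0 ]; then
    verdict="clean"
elif [ "$clean1" -eq 127 ]; then
    verdict="checker-missing"
else
    verdict="checker-reported-problems"
fi
echo "status=$clean1 verdict=$verdict"
```

A script that only tests for zero, as the trace suggests, cannot tell a missing checker apart from a checker that found errors.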
    

    The problem is that the boot process goes on to try the RAM disk before I can stop it, which results in an extra entry in /proc/mounts ("/zyxel/mnt/sysdisk ramfs rw 0 0" in addition to "/zyxel/mnt/sysdisk ext2 rw 0 0"), so the above ends with an error not encountered during the boot itself.
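
To see at a glance what is stacked on a mount point, the mount table can be filtered by its second column. This is a minimal sketch, assuming `awk` is available; `mounts_on` is a hypothetical helper name, and the sample data is just the two entries from the trace.

```shell
# Print "device fstype" for every entry mounted on the given mount point.
# Reads /proc/mounts by default; a second argument can name a saved copy instead.
# mounts_on is a hypothetical helper name.
mounts_on() {
    awk -v mp="$1" '$2 == mp { print $1, $3 }' "${2:-/proc/mounts}"
}

# Sample built from the two entries shown in the trace.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/sda1 /zyxel/mnt/sysdisk ext2 rw 0 0
none /zyxel/mnt/sysdisk ramfs rw 0 0
EOF

# All filesystems stacked on the mount point, in the order they are listed.
mounts_on /zyxel/mnt/sysdisk "$sample"
# The last entry listed is normally the most recent mount, which hides the earlier one.
mounts_on /zyxel/mnt/sysdisk "$sample" | tail -n 1
rm -f "$sample"
```

With the sample data, the last entry is the ramfs mount, which fits the ramdisk fallback having been mounted over the ext2 partition.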

    Finally, and I'm not sure if this plays into it, when comparing the above against a reference boot log for the NSA-221 with a disk containing "sysdisk.img" present, my HDD partition currently seems to be ext2 rather than ext3.

    These disks were the original disks (they were not in the NAS when I did whatever bonehead partition deletion) so the fact that they are different now (ext2 filesystem and previously a zero byte "sysdisk.img") seems to have transpired during the recovery efforts.

    Any thoughts on the above or switching to the TFTP recovery approach would be welcome but I'm becoming increasingly resigned to not being able to recover this :'(

    Cheers,
    Orlando
  • MijzelfMijzelf Posts: 784  Heroic Warrior Member
    I have read that nsa220 recovery document. As we expect from ZyXEL, it's actually not about the 220, but about the 221. It's very interesting. Have you read it? Actually only the kernel is tftp'ed. The rest of the data is handled from a booted system.
    <quote>
    10. After bootup, Press “Ctrl + C” after you see this message “INITRD: Trying to mount
    NAND flash as Root FS.”
    11. In BusyBox shell, enter the below commands to format and mount NAND flash
    -> mke2fs -j /dev/sda1
    -> mkdir -p /zyxel/mnt/NAND
    -> mount /dev/sda1 /zyxel/mnt/NAND
    <snip>
    -> cp /e-data/1234/DATA_1004 /zyxel/mnt/NAND/sysdisk.img.gz
    </quote>
    How nice is that?

    Does this strike you as the "timing issue"?
    No. I once saw a bootlog of a 'timing issue'. You could see the firmware enumerating the disks (it was a 310, on which all disks are probed for usb_key_func.sh, including the internal disks), and only after that did the kernel log for a detected SCSI device pop up. So sg_map wouldn't have seen it at all.

    It's curious why it can no longer find "e2fsck" and logs the following error:
    I think storage_gen_mntfw.sh was copied from the 220/210, and nobody bothered to remove the disk check.

    Your bootlog shows the firmware puts a filesystem somewhere. So I think somehow your partition was not recognized as valid, and got a new filesystem. Then of course the sysdisk.img was gone.

    These disks were the original disks
    Really? Does the 221 support >2TB disks?
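
The size behind that question can be worked out from the sector count the kernel logged for sda. The sketch below is plain shell arithmetic. The 2^32-sector ceiling is the well-known limit of an MBR partition table with 512-byte sectors, but whether this disk or the 221 firmware uses MBR is an assumption, since the thread does not say.

```shell
# Sector count and sector size as reported for sda in the kernel log.
sectors=7814037168
sector_bytes=512

# Capacity in bytes, and in decimal gigabytes (integer division, rounded down).
bytes=$((sectors * sector_bytes))
gb=$((bytes / 1000000000))

# An MBR partition table stores sector numbers in 32 bits, so with 512-byte
# sectors it addresses at most 2^32 sectors (2 TiB).
# Assumption: the thread does not confirm which partition table is in use.
mbr_max_sectors=4294967296
if [ $((sectors > mbr_max_sectors)) -eq 1 ]; then
    echo "${gb} GB: beyond what a 512-byte-sector MBR table can address"
else
    echo "${gb} GB: within MBR range"
fi
```

The byte count comes out at 4000787030016, which agrees with the "4000787 MB" figure in the kernel log.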


    BTW, there is another way out. The SoC, an oxnas, is designed to be unbrickable. It first tries to boot from hard disk before it falls back to flash. Here you can find instructions on how to create a disk for an Iomega Home Media, which happens to have the same SoC. It should work on a 221 too (and on a 210), although the LEDs and stuff won't work. The fan will work, I think, as the oxnas kernel has a special module for it. The needed files are here.
