Discussion:
sparc64 -CURRENT in LDOM: ERROR: Last Trap: Fast Data Access Protection
Ax0n
2017-05-25 00:05:05 UTC
I have a Sun Fire T2000 that I've chopped up into LDOMs. The primary domain
and six of the LDOMs are running 6.1-STABLE just fine. I pulled down the
May 22 snapshot, and it installs (with a strange error, see bottom of
post), but the LDOM crashes upon boot. I just tried again with the May 24th
snapshot, and I'm getting the same error. This seems to dump me into
OpenBoot, not ddb. I can provide a shell on the primary domain, and serial
console (over ssh) access to a developer if needed. I am not subscribed to
bugs@, so please copy me off-list.


Sun Fire T200, No Keyboard
Copyright 2010 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.30.4.a, 3072 MB memory available, Serial #83548683.
Ethernet address 0:14:4f:fa:da:b, Host ID: 84fada0b.


Boot device: disk File and args:
OpenBSD IEEE 1275 Bootblock 1.4
..>> OpenBSD BOOT 1.9
Trying bsd...
Booting /virtual-***@100/channel-***@200/***@0:a/bsd
***@0x1000000+***@0x17fe420+***@0x1800000+***@0x1830100
symbols @ 0xfedcc3c0 155+553992+364844 start=0x1000000
[ using 919960 bytes of bsd ELF symbol table ]
console is /virtual-***@100/***@1
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
Copyright (c) 1995-2017 OpenBSD. All rights reserved.
https://www.OpenBSD.org

OpenBSD 6.1-current (GENERIC.MP) #123: Tue May 23 21:16:40 MDT 2017
***@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/GENERIC.MP
real mem = 3221225472 (3072MB)
avail mem = 3145572352 (2999MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root: Sun Fire T200
cpu0 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1200 MHz
cpu1 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1200 MHz
cpu2 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1200 MHz
cpu3 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1200 MHz
vbus0 at mainbus0
"flashprom" at vbus0 not configured
cbus0 at vbus0
vdsk0 at cbus0 chan 0x2: ivec 0x4, 0x5
scsibus1 at vdsk0: 2 targets
sd0 at scsibus1 targ 0 lun 0: <SUN, Virtual Disk, 1.1> SCSI3 0/direct fixed
sd0: 4000MB, 512 bytes/sector, 8192000 sectors
vnet0 at cbus0 chan 0x3: ivec 0x6, 0x7, address 00:14:4f:f9:ce:e5
vcons0 at vbus0: ivec 0x111, console
vrtc0 at vbus0
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
bootpath: /virtual-***@100,0/channel-***@200,0/***@0,0
root on sd0a (65ce554ceba2f574.a) swap on sd0b dump on sd0b

ERROR: Last Trap: Fast Data Access Protection

{1} ok


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

* one odd thing happens during install:
fatal in dhclient: yielding responsibility for vnet0

I worked around it by suspending the installer with ^Z, running dhclient
manually from the shell, and then foregrounding the installer again. The
full install log up to the error is included below.


OpenBSD 6.1-current (RAMDISK) #116: Mon May 22 13:59:22 MDT 2017
***@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/RAMDISK
real mem = 3221225472 (3072MB)
avail mem = 3150954496 (3004MB)
mainbus0 at root: Sun Fire T200
cpu0 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1200 MHz
"SUNW,UltraSPARC-T1" at mainbus0 not configured
"SUNW,UltraSPARC-T1" at mainbus0 not configured
"SUNW,UltraSPARC-T1" at mainbus0 not configured
vbus0 at mainbus0
"flashprom" at vbus0 not configured
cbus0 at vbus0
vdsk0 at cbus0 chan 0x2: ivec 0x4, 0x5
scsibus0 at vdsk0: 2 targets
sd0 at scsibus0 targ 0 lun 0: <SUN, Virtual Disk, 1.1> SCSI3 0/direct fixed
sd0: 4000MB, 512 bytes/sector, 8192000 sectors
vnet0 at cbus0 chan 0x3: ivec 0x6, 0x7, address 00:14:4f:f9:ce:e5
vcons0 at vbus0: ivec 0x111, console
vrtc0 at vbus0
softraid0 at root
scsibus1 at softraid0: 256 targets
bootpath: /virtual-***@100,0/channel-***@200,0/***@0,0
root on rd0a swap on rd0b dump on rd0b
erase ^?, werase ^W, kill ^U, intr ^C, status ^T

Welcome to the OpenBSD/sparc64 6.1 installation program.
(I)nstall, (U)pgrade, (A)utoinstall or (S)hell? i
At any prompt except password prompts you can escape to a shell by
typing '!'. Default answers are shown in []'s and are selected by
pressing RETURN. You can exit this program at any time by pressing
Control-C, but this can leave your system in an inconsistent state.

System hostname? (short form, e.g. 'foo') puffyfivesnap

Available network interfaces are: vnet0 vlan0.
Which network interface do you wish to configure? (or 'done') [vnet0]
IPv4 address for vnet0? (or 'dhcp' or 'none') [dhcp]
DHCPDISCOVER on vnet0 - interval 1
fatal in dhclient: yielding responsibility for vnet0
IPv6 address for vnet0? (or 'autoconf' or 'none') [none] ^Z
[1] + Suspended /install
# dhclient vnet0
DHCPDISCOVER on vnet0 - interval 1
DHCPOFFER from 192.168.1.254 (f8:18:97:8f:a5:09)
DHCPREQUEST on vnet0 to 255.255.255.255
DHCPACK from 192.168.1.254 (f8:18:97:8f:a5:09)
bound to 192.168.1.70 -- renewal in 43200 seconds.
# fg
/install

Available network interfaces are: vnet0 vlan0.
Which network interface do you wish to configure? (or 'done') [done]
Using DNS domainname attlocal.net
Using DNS nameservers at 192.168.1.254


TIA
--ax0n
http://www.h-i-r.net/
Ted Unangst
2017-05-26 23:31:18 UTC
Post by Ax0n
I have a Sun Fire T2000 that I've chopped up into LDOMs. The primary domain
and six of the LDOMs are running 6.1-STABLE just fine. I pulled down the
May 22 snapshot, and it installs (with a strange error, see bottom of
post), but the LDOM crashes upon boot. I just tried again with the May 24th
snapshot, and I'm getting the same error. This seems to dump me into
OpenBoot, not ddb. [...]
There's a hardware/software limit that currently restricts the kernel to 8MB.
Larger than that and bad things happen. Hopefully someone will soon find a way
to reduce the size of the kernel.
Ax0n
2017-05-27 00:40:01 UTC
Is this limit specifically for LDOM guests? I have a Sun Blade 1500 I could
compile a custom -CURRENT kernel with, if that might help. Though I'm not
sure I want to do that with every snapshot I try.

*musing* I wonder if that's why NetBSD 7.1 is also crashing on boot.
Ted Unangst
2017-05-27 05:06:02 UTC
Post by Ax0n
Is this limit specifically for LDOM guests? I have a Sun Blade 1500 I could
compile a custom -CURRENT kernel with, if that might help. Though I'm not
sure I want to do that with every snapshot I try.
Not specifically, but the limit can vary by hardware. If you want to run a
snapshot now, a custom kernel with a few devices removed will help. We'll have
to make a similar long term fix anyway.
Ax0n
2017-05-27 05:55:08 UTC
FWIW, the kernels running in my -stable guests are considerably larger than
8MB, and not much smaller than the -CURRENT kernels.

---------- a running LDOM guest -------------
-bash-4.4$ doas cu -l ttyV0
Connected to /dev/ttyV0 (speed 9600)

OpenBSD/sparc64 (puffyone.ldom.openbsd.local) (console)
login: axon
Password:
Last login: Fri May 26 00:46:47 on console
OpenBSD 6.1 (GENERIC.MP) #58: Sat Apr 1 17:10:24 MDT 2017

Welcome to OpenBSD: The proactively secure Unix-like operating system.
[...]
You have new mail.
$ uname -a
OpenBSD puffyone.ldom.openbsd.local 6.1 GENERIC.MP#58 sparc64
$ ls -la /bsd*
-rw-r--r-- 1 root wheel 9487408 Dec 31 1999 /bsd
-rw-r--r-- 1 root wheel 2739432 Dec 31 1999 /bsd.rd
-rw-r--r-- 1 root wheel 9440853 Dec 31 1999 /bsd.sp

-------- the -CURRENT image (bsd.rd's been copied to bsd for testing) ---------
-bash-4.4$ doas vnconfig /dev/vnd0c /home/axon/vm/vdisk5
-bash-4.4$ doas mount /dev/vnd0a /mnt
-bash-4.4$ ls -al /mnt/bsd*
-rw-r--r-- 1 root wheel 2749459 May 26 22:02 /mnt/bsd
-rw-r--r-- 1 root wheel 9531028 May 26 22:02 /mnt/bsd.bak
-rw-r--r-- 1 root wheel 2749459 May 24 18:28 /mnt/bsd.rd
-rw-r--r-- 1 root wheel 9480748 May 24 18:28 /mnt/bsd.sp
Ted Unangst
2017-05-27 06:41:55 UTC
Post by Ax0n
FWIW, the kernels running in my -stable guests are considerably larger than
8MB, and not much smaller than the -CURRENT kernels.
So it's actually the size of the code in the kernel, not the file size.

From your boot message:

Booting /virtual-***@100/channel-***@200/***@0:a/bsd
***@0x1000000+***@0x17fe420+***@0x1800000+***@0x1830100

8381472 + 7136 (padding) = 8388608 bytes, which is exactly 8MB.
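
The same numbers can be recovered from the load addresses that survive in the boot line. This is only a sketch: it assumes the first loaded region runs from 0x1000000 up to 0x17fe420 and that the gap up to 0x1800000 is the padding mentioned above. The variable names are made up for illustration.

```python
# Addresses copied from the boot message; the names are illustrative only.
FIRST_REGION_START = 0x1000000  # first load address
FIRST_REGION_END = 0x17FE420    # second load address (assumed end of the first region)
NEXT_REGION_START = 0x1800000   # third load address

LIMIT = 8 * 1024 * 1024         # the 8MB limit discussed in this thread

text_size = FIRST_REGION_END - FIRST_REGION_START  # bytes in the first region
padding = NEXT_REGION_START - FIRST_REGION_END     # gap up to the next region

print(text_size, padding, text_size + padding)     # 8381472 7136 8388608
print("bytes left under the limit:", LIMIT - text_size)
```

With these addresses the first region plus its padding fills the 8MB window completely, which fits the explanation that this kernel sits right at the limit.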
Ax0n
2017-06-02 04:48:37 UTC
Gotcha. File size != Kernel Size.
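
One way to see the difference without booting is to look at the loadable segments recorded in the kernel image rather than at `ls` output. The sketch below assumes the kernel is an ELF64 file (the boot log does mention an ELF symbol table) and reads the standard ELF64 program headers. `load_segments` is a hypothetical helper written for this example, not an existing tool.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def load_segments(data: bytes):
    """Return (vaddr, filesz, memsz) for each PT_LOAD header in an ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2:
        raise ValueError("not an ELF64 image")
    # e_ident[5]: 1 = little-endian, 2 = big-endian (sparc64 is big-endian)
    endian = ">" if data[5] == 2 else "<"
    # ELF64 header fields that follow the 16-byte e_ident
    hdr = struct.unpack_from(endian + "HHIQQQIHHHHHH", data, 16)
    e_phoff, e_phentsize, e_phnum = hdr[4], hdr[8], hdr[9]

    segments = []
    for i in range(e_phnum):
        # ELF64 program header: type, flags, offset, vaddr, paddr, filesz, memsz, align
        ph = struct.unpack_from(endian + "IIQQQQQQ", data, e_phoff + i * e_phentsize)
        if ph[0] == PT_LOAD:
            segments.append((ph[3], ph[5], ph[6]))
    return segments
```

Something like `load_segments(open("/bsd", "rb").read())` would then list the address and in-memory size of each loaded segment, which is the figure that matters for the 8MB limit, not the size of the file on disk.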

BTW, I just tried the latest snapshot after reading "Changes Of Note 623"
and as expected, it works. Thanks again!