Category Archives: Linux

Update: AVDF installation fails on HP server with Smart Array

A couple of days ago I’ve wrote about some problems when installing Oracle Audit Vault and Database Firewall 12.1.2 on HP server with Smart Array Disk Controller. The problem is still not resolved, but in the meantime Oracle has open a Bug and added some Metalink Notes related to this issue.

  • AVDF 12.1.1 Installation Fails On HP server with Smart Array Disk Controller [1587742.1]
  • Unable To Install AVDF Server With HP Smart Array [1680134.1]
  • AVDF installation ISO [1680961.1]
  • Bug 18940816 AVDF SERVER FAILS TO INSTALL ON HP DL380 GEN8 WITH CCISS!C0D0 ERROR

The contents of MOS note 1680134.1 and 1680961.1 are certainly known to the regular readers of OraDBA. The workaround and procedure are the same as I’ve posted a couple of days ago. Oracle created MOS notes based on my blog post AVDF installation fails on HP server with Smart Array Disk Controller and AVDF installation ISO. In this case, my posts are somehow useful. 🙂 The Bug mentioned above is unfortunately not publicly available. I’ll provide more information as soon as it is available.

AVDF installation fails on HP server with Smart Array Disk Controller

I’ve successfully set up a couple of AVDF installation on different VM Server as well on HP Blade or Rack servers. On the VM server I never had any problems. For the installation of AVDF 12.1.1.x on HP servers BL465c Gen8 or DL380p Gen8, there were always warnings during partitioning of the disks. So far it was never an issue to just continue the installation. With AVDF 12.1.2 this has changed. On some HP servers with smart array disk controller the installation fails because of problems with the drivers respectively device names.

Earlier installation of AVDF like 12.1.1.3.0 simply complained about not enough space.

AVDF_12.1.1.3.0_setup02

OK, 0GB is a bit less for setting up an AVDF Server :-), nevertheless ignoring the error still worked. AVDF 12.1.1.2 as well AVDF 12.1.1.3 could be successfully setup using the cciss Driver for HP Smart Array. As of AVDF 12.1.2 the error is not that friendly any more.

AVDF_12.1.2.0.0_setup02

The title of the error “Error Parsing Kickstart Config” indicates that there is an issue at an early stage of the system setup. It is worth having a deeper look into the kickstart configuration file. The kickstart file can be found in the initrd.img image on the AVDF installation ISO. See AVDF installation ISO for how to extract the kickstart file.

In the kickstart file we can see at line 62, that a pre-script is executed to create the partition commands. This pre-script is a python script which does create a temporary file (/tmp/partition-include) with the partition commands based on the available disks. The partition command itself is then included at line 36.

35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
########## Partition the disk ##############
%include /tmp/partition-include

# Create logical volume group - this is where all volumes will reside
volgroup vg_root pv.01
# Now create the volumes, a.k.a logical partitions. The data partions (/var/lib/oracle) is grown
# up to the specified size. The rest of the FS is left unallocated.
#
# You must make changes to ruby_lib/dbfw/dbfw_fstab.rb if you change the FS specification.
#
logvol swap --fstype swap --vgname=vg_root --size=4096 --name=lv_swap
logvol / --fstype ext3 --fsoptions="noatime" --vgname=vg_root --size=7000 --name=lv_root
logvol /images --fstype ext3 --fsoptions="noexec,nodev,nosuid,noatime" --vgname=vg_root --size=15000 --name=lv_images
logvol /usr/local/dbfw --fstype ext3 --fsoptions="noatime" --vgname=vg_root --size=1000 --name=lv_local_dbfw
logvol /usr/local/dbfw/tmp --fstype ext3 --fsoptions="noexec,nodev,nosuid,noatime" --vgname=vg_root --size=9000 --name=lv_local_dbfw_tmp
logvol /home --fstype ext3 --fsoptions="noexec,nodev,nosuid,noatime" --vgname=vg_root --size=1000 --name=lv_home
logvol /tmp --fstype ext3 --fsoptions="nodev,nosuid,noatime" --vgname=vg_root --size=2000 --name=lv_tmp
logvol /var/log --fstype ext3 --fsoptions="noexec,nodev,nosuid,noatime" --vgname=vg_root --size=6000 --name=lv_var_log
logvol /var/tmp --fstype ext3 --fsoptions="noexec,nodev,nosuid,noatime" --vgname=vg_root --size=6000 --name=lv_var_tmp
logvol /var/www --fstype ext3 --fsoptions="nodev,nosuid,noatime" --vgname=vg_root --size=1000 --name=lv_var_www
logvol /var/www/tmp --fstype ext3 --fsoptions="nodev,nosuid,noatime" --vgname=vg_root --size=1000 --name=lv_var_www_tmp
logvol /var/lib/oracle --fstype ext3 --fsoptions="noatime" --vgname=vg_root --size=20000 --name=lv_oracle
logvol /var/dbfw --fstype ext3 --fsoptions="noatime" --vgname=vg_root --size=10000 --name=lv_var_dbfw

# Tasks performed before installation
%pre

python /kickstart/partitions.py 2> /tmp/partitions_error
if [ $? -ne 0 ]; then
DISKERROR=$(/bin/cat /tmp/partitions_error)
fi

Having a look into the file /tmp/partition-include reveals the wrong partition command which leads to the error mentioned earlier. As you can see below the disks are specified with –ondisk=cciss!c0d0 rather than –ondisk=cciss/c0d0. The python script which builds the partition commands, has issues with the device names. Actually, for an HP smart array disk, the corresponding driver should be loaded so that the devices are visible as sd*. The root cause could be the missing driver or an error in the python script. I’ve opened a service request with oracle Support for further analysis.
AVDF_12.1.2.0.0_setup04

Workarounds

For the moment I just see the following two workarounds.

  • First install and configure AVDF 12.1.1.3 and perform an upgrade to AVDF 12.1.2.
  • Install AVDF 12.1.2 with an alternative kickstart file respectively partition commands

The first workaround is straightforward. It just takes a bit more time. For the second workaround you may create a new AVDF ISO image, but this is way to complex. It is much simpler to manually specify the boot options and provide an alternative kickstart file on an internal web server. The kickstart file is the same as for the regular AVDF 12.1.2 installation, it just has a fixed partition section. For that I have taken the partitioning commands from the file /tmp/partition-include and removed the –ondisk parameter. I’ll provide my kickstart file as an example for download. But do not use it directly the partitioning section must be adapted to your environment.
Action plan for the workaround:

  1. Create an alternative kickstart file with correct partition commands for your environment
  2. Put the kickstart file on a Webserver which is accessible by the AVDF Server
  3. Boot from AVDF 12.1.2 ISO image with custom boot parameter

My custom boot option did look like the following command. The IP address is the address of my web server.

vmlinuz noipv6 initrd=initrd.img ramdisk_size=8192 ks=http://192.128.1.40/avdf.cfg

Conclusion

This problem is quite annoying, especially if you have already done the installation on another physical or virtual servers several times. The workaround is basically simple. With a bit enhanced Linux knowledge and a web server, one has quickly created an installation with an alternative kickstart file. Nevertheless I highly recommend to open a service request with Oracle when you have similar issues with your hardware during the setup of a productive AVDF 12.1.2 environment.

References

Further information on this topic.

AVDF missing boot partition

While working on the problem with missing RAM on the AVDF test system (see ) I realized, that the linux boot partition is not available by default.

[root@melete2 log]# ls -al /boot
total 16
drwxr-xr-x  2 root root 4096 Jan 11  2013 .
drwxr-xr-x 24 root root 4096 Jul 11 20:19 ..

[root@melete2 log]# df -kh /boot
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_root-lv_root
                      6.6G  2.2G  4.1G  35% /

Initially I was a bit confused since it contains stuff like grub configuration, inited.img, kernel etc. All stuff that are needed for system boot. Ok, I have not thought about that for the bootloader, the file system does not have to be mounted. From the security point of view it’s even better to not have it mounted. If not mounted nobody can accidentally change something. 😉 Oracle has defined noauto for the boot partition. Therefore the device is not mounted automatically during system boot.

[root@melete2 log]# cat /etc/fstab|grep boot
LABEL=/boot                    /boot                    ext3   noatime,noauto,nodev,nosuid                  1 2

If you need to change the grub configuration just mount the boot partition manually.

[root@melete2 log]# mount /boot

[root@melete2 log]# vi /boot/grub/grub.conf

[root@melete2 ~]# umount /boot

AVDF Linux kernel could not recognize whole RAM

After initial setup of an Audit Vault and Database Firewall engineering system, I’ve started to add several audit vault agents and secure targets. In the beginning it went quite smoothly. But after a certain number of secured targets, there were continuously ORA-04031 errors. Most of the errors were related to large pool and PX Msg buffers issues. The analysis of the trace files has shown interesting stuff. 😉 But more on that in a later blog post. The real problem is the available memory.

Symptoms

The Audit Vault and Database Firewall engineering system is running on a HP ProLiant BL465c Gen 8. It comes with 32GB Memory. Should actually be sufficient for a system engineering. It turned out that the 32GB are not recognized by operating system. As you can see below the system has just 3GB memory in total.

[root@melete2 ~]# free
                     total    used   free shared buffers  cached
Mem:               3048108 2385888 662220      0   10720 1525036
-/+ buffers/cache:  850132 2197976
Swap:              4194296  453564 3740732

Reviewing dmesg shows that we lose 29 GB of memory.

Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-300.39.5.el5uek (mockbuild@ca-build56.us.oracle.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)) #1 SMP Wed Mar 13 11:26:53 PDT 2013
Command line: ro root=/dev/vg_root/lv_root console=tty9 udevtimeout=10
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bddde000 (usable)
 BIOS-e820: 00000000bddde000 - 00000000bde0e000 (ACPI data)
 BIOS-e820: 00000000bde0e000 - 00000000d0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 000000083efff000 (usable)
DMI 2.7 present.
last_pfn = 0x83efff max_arch_pfn = 0x400000000
MTRR default type: uncachable
MTRR fixed ranges enabled:
  00000-9FFFF write-back
  A0000-BFFFF uncachable
  C0000-FFFFF write-back
MTRR variable ranges enabled:
  0 base 000000000000 mask FFFF80000000 write-back
  1 base 000080000000 mask FFFFC0000000 write-back
  2 disabled
  3 disabled
  4 disabled
  5 disabled
  6 disabled
  7 disabled
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
e820 update range: 00000000c0000000 - 000000083efff000 (usable) ==> (reserved)
WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 29679MB of RAM.
------------[ cut here ]------------

Cause

According to an Oracle Metalink Note 1448147.1 this problem is related to a BIOS issue.

Solutions and Workaround

The solution described in Oracle Metalink Note 1448147.1 is to upgrade the BIOS or disable MTRR in kernel. Since BIOS upgrade is not an option for this environment I’ll try to workaround by disable MTRR.

Disable MTRR

Changing the grub.conf is basically quite easy if you find the boot files. When I first try it, I’d realized that there is no grub configuration available. It seems that Oracle decided to not mount /boot at startup. So it is mandatory to first mount the boot partition. Afterward you just can add disable_mtrr_trim as additional kernel option.

[root@melete2 ~]# mount /boot

[root@melete2 ~]# df -kh /boot
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             145M   26M  112M  19% /boot

[root@melete2 ~]# vi /boot/grub/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/vg_root/lv_root
#          initrd /initrd-version.img
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Audit Vault Server 12.1.1.0.0
        root (hd0,0)
        kernel /vmlinuz-2.6.32-300.39.5.el5uek ro root=/dev/vg_root/lv_root console=tty9
udevtimeout=10 disable_mtrr_trim
        initrd /initrd-2.6.32-300.39.5.el5uek.img
title Audit Vault Server 12.1.1.0.0
        root (hd0,0)
        kernel /vmlinuz-2.6.32-300.38.1.el5uek ro root=/dev/vg_root/lv_root console=tty9
udevtimeout=10 disable_mtrr_trim
        initrd /initrd-2.6.32-300.38.1.el5uek.img

[root@melete2 ~]# reboot

Broadcast message from root (pts/0) (Thu Jul 11 20:17:56 2013):

The system is going down for reboot NOW!
[root@melete2 ~]# Connection to melete2 closed by remote host.
Connection to melete2 closed.

After reboot we now have 32GB memory available.

[root@melete2 ~]# free
                      total     used     free shared buffers  cached
Mem:               33024372  3930724 29093648      0   17868 2640744
-/+ buffers/cache:  1272112 31752260
Swap:              14680056        0 14680056

Unfortunately, the configuration of the AVDF appliance is not automatically updated to use the extra memory. We have to do some manual changes.

Update Kernel Parameters

The kernel setting have to be changed to allow a bigger SGA. See Metalink Note 1529433.1 for more detailed information on how calculate and set the kernel parameters. For the engineering system we will define a SGA with 20GB therefor we set the shmmax and shmall as follows:

[root@melete2 ~]# vi /etc/sysctl.conf

kernel.shmmax=23622320128
kernel.shmall=5368709120
...
[root@melete2 ~]# sysctl -p

Increase SWAP

With 32GB memory, it is also advisable to enlarge the swap space. I’ve discussed this already in the blog post Resize swap space on linux. Since the AVDF appliance does use logical volumes it’s even a bit easier.

[root@melete2 ~]# swapoff -v /dev/vg_root/lv_swap

[root@melete2 ~]# lvresize /dev/vg_root/lv_swap -L +8G

[root@melete2 ~]# mkswap /dev/vg_root/lv_swap

[root@melete2 ~]# swapon -v /dev/vg_root/lv_swap

Increase SGA

Finally we can increase the SGA.

SQL> ALTER system SET sga_max_size=20G scope=spfile;
System altered.

SQL> ALTER system SET sga_target=20G scope=spfile;
System altered.

SQL> startup force

Conclusion

Although AVDF is an appliance, it is mandatory to examine the system after installation. Eg. are there errors in the log files in /var/log, memory, storage etc. available. The solution described here makes it possible to use all the memory. Nevertheless, the appliance has been adjusted to an extent where is necessary to consider whether the support is still archive. If you run into a similar issue on your production AVDF setup I would recommend opening an Oracle SR. Looking forward to the next AVDF patchset. I hope this system stays patchable.

References

Some links related to this post.

  • Linux kernel could not recognize whole RAM [1448147.1]
  • Upon startup of Linux database get ORA-27102: out of memory Linux-X86_64 Error: 28: No space left on device[301830.1]
  • Requirements for Installing Oracle Database 12.1 on RHEL5 or OL5 64-bit (x86-64) [1529433.1]
  • Requirements for Installing Oracle 11gR2 RDBMS on RHEL (and OEL) 5 on AMD64/EM64T [880989.1]
  • Master Note of Linux OS Requirements for Database Server [851598.1]