vCenter: VPXD Crash (vCenter keeps going down)

April 17, 2012 by aubreykloppers

Recently I was trying to resolve an issue on a small vCenter 5.0.0 deployment.  The primary symptom was that the vmware-vpxd service kept crashing and had to be restarted frequently.

The first thing I did was examine the log files, while the service was still in a failed state.

# cat /var/log/vmware/vpx/vpxd.log

I won’t bore you with the entire log, but here are the more helpful lines from it.  Perhaps they will help someone find this article if they are having a similar issue and not gaining any traction.

Unable to get exclusive access to vCenter repository.
Error deleting from VPX_SESSIONL
Alert:false@ /build/mts/release/bora-455964/bora/vpx/vpxd/util/vpxdVdb.cpp:408
Registry Item DB 5 value is ''
Failed to intialize VMware VirtualCenter. Shutting down...
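
If the log is long, it can be easier to search it for these lines than to read it with cat. The following is only a sketch: the function name scan_vpxd_log is made up for illustration, and the patterns are simply the messages quoted above (including the "intialize" spelling exactly as it appears in the log).

```shell
#!/bin/sh
# Hypothetical helper: print only the failure signatures quoted above,
# prefixed with their line numbers, instead of dumping the whole log.
scan_vpxd_log() {
    # Default to the log path used earlier in this article.
    log_file="${1:-/var/log/vmware/vpx/vpxd.log}"
    # "intialize" is spelled the way the log message spells it.
    grep -n -E 'Unable to get exclusive access|Error deleting from VPX_SESSION|Failed to intialize' "$log_file"
}
```

Calling scan_vpxd_log with no argument searches the default path; pass another file name to search a rotated or copied log.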

This environment is based on SLES 11 SP1 with an embedded DB2 database.  The same steps might apply to other configurations, but I have not tested them, so use at your own risk.

Here is the VMware KB [http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1021581], which “describes” the fix, but it is not very detailed.

Here is the step-by-step.

Please keep in mind that this article contains two fixes for two different problems that log similar errors.  Fix one should only apply if your service crashes and is unable to restart.  Fix two is what actually solved the problem that we were experiencing.

FIX ONE – Which did not work for me.

Gather the Database Connection Information

# cat /etc/vmware-vpx/embedded_db.cfg
EMB_DB_INSTALL_DIR='/opt/db2/current'
EMB_DB_HOME='/opt/db2/home/'
EMB_DB_TYPE='db2'
EMB_DB_SERVER='127.0.0.1'
EMB_DB_PORT='50000'
EMB_DB_INSTANCE='VCDB'
EMB_DB_USER='vc'
EMB_DB_PASSWORD='YOURPASSWORDHERE'
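
Since the file above is made up of KEY='value' lines, a shell can read it directly instead of you copying the values by hand. This is a sketch under that assumption only; load_embedded_db_cfg is a made-up name, and sourcing a file runs whatever is in it, so only do this with a config file you trust.

```shell
#!/bin/sh
# Hypothetical helper: load the embedded DB settings into shell variables.
load_embedded_db_cfg() {
    # Default to the config path shown above.
    cfg_file="${1:-/etc/vmware-vpx/embedded_db.cfg}"
    if [ ! -r "$cfg_file" ]; then
        echo "cannot read $cfg_file" >&2
        return 1
    fi
    # The file consists of KEY='value' lines, so the shell can source it.
    . "$cfg_file"
}
```

After a successful call, values such as $EMB_DB_INSTANCE and $EMB_DB_USER are available in the current shell for the steps that follow.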

Stop the Service

# /etc/init.d/vmware-vpxd stop
Stopping VMware vSphere Profile-Driven Storage Service...
Stopped VMware vSphere Profile-Driven Storage Service.
Stopping VMware Inventory Service...
Stopped VMware Inventory Service.
Stopping tomcat: success
Stopping vmware-vpxd: success
Shutting down ldap-server..done

Connect to DB2 Database

# /opt/db2/v9.7.2/bin/db2
(c) Copyright IBM Corporation 1993,2007
Command Line Processor for DB2 Client 9.7.2

db2 => connect to vcdb user vc using YOURPASSWORDHERE

Database Connection Information

Database server        = DB2/LINUXX8664 9.7.2
SQL authorization ID   = VC
Local database alias   = VCDB

List Tables to Test Database Connection

db2 => list tables
...
VPX_SESSIONLOCK                 VC              T     2011-11-16-17.28.59.418535
...
287 record(s) selected.

Delete All Rows from VPX_SESSIONLOCK Table

db2 => delete from vpx_sessionlock
DB20000I  The SQL command completed successfully.

Quit and Restart Service

db2 => quit
# /etc/init.d/vmware-vpxd start
Waiting for embedded DB2 database to startup: .success
Cleaning session lock table: success
Verifying EULA acceptance: success
Starting ldap-server..done
Starting vmware-vpxd: success
Waiting for vpxd to initialize: ....success
Starting tomcat: success
Executing startup scripts...
Starting VMware Inventory Service...Waiting for VMware Inventory Service........................
VMware Inventory Service started.

Starting VMware vSphere Profile-Driven Storage Service...Waiting for VMware vSphere Profile-Driven Storage Service......
VMware vSphere Profile-Driven Storage Service started.

FIX TWO – Which did work for me.

Apparently the problem here is that DB2 is configured with a transaction log that is too small, which causes the service to crash.  According to VMware’s KB [http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2006812] you should expect errors indicating that the transaction log is full, but we did not see these at all.  So we went out on a limb, simply because the transaction log was a logical suspect.  VMware recommends the log sizes that we configure in these steps, so the change was relatively low risk.  Also keep in mind that we have only 3 ESXi hosts and maybe 30 guests, so this can happen even in a small environment.

Stop the Service

# service vmware-vpxd stop

Change to db2inst1 and Connect to Database

# su db2inst1
db2inst1@vcenter00:~> db2 connect to vcdb

Retrieve Current Database Configuration

db2inst1@vcenter00:~> db2 get database config for vcdb | grep LOG
Catalog cache size (4KB)              (CATALOGCACHE_SZ) = 300
Log buffer size (4KB)                        (LOGBUFSZ) = 256
Log file size (4KB)                         (LOGFILSIZ) = 1024
Number of primary log files                (LOGPRIMARY) = 13
Number of secondary log files               (LOGSECOND) = 4
Changed path to log files                  (NEWLOGPATH) =
Path to log files                                       = /storage/db/db2/home/db2inst1/db2inst1/NODE0000/SQL00001/SQLOGDIR/
Overflow log path                     (OVERFLOWLOGPATH) =
Mirror log path                         (MIRRORLOGPATH) =
Block log on disk full                (BLK_LOG_DSK_FUL) = NO
Block non logged operations            (BLOCKNONLOGGED) = NO
Percent max primary log space by transaction  (MAX_LOG) = 0
Num. of active log files for 1 active UOW(NUM_LOG_SPAN) = 0
Log retain for recovery enabled             (LOGRETAIN) = OFF
First log archive method                 (LOGARCHMETH1) = OFF
Options for logarchmeth1                  (LOGARCHOPT1) =
Second log archive method                (LOGARCHMETH2) = OFF
Options for logarchmeth2                  (LOGARCHOPT2) =
Log pages during index build            (LOGINDEXBUILD) = OFF
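
Only three of the values above get changed in the next step, so it can help to pull just those out. A minimal sketch, assuming the output keeps the "(NAME) = value" layout shown above; show_log_params is a made-up name, not a DB2 command.

```shell
#!/bin/sh
# Hypothetical filter: reads "db2 get database config" output on stdin and
# prints only the three log parameters, one NAME=value pair per line.
show_log_params() {
    sed -n -E 's/.*\((LOGFILSIZ|LOGPRIMARY|LOGSECOND)\) = ([0-9]+).*/\1=\2/p'
}
```

It would be used as: db2 get database config for vcdb | show_log_params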

Update Database Configuration

db2inst1@vcenter00:~> db2 update db CFG FOR VCDB USING logprimary 16 logsecond 112 logfilsiz 8192
DB20000I  The UPDATE DATABASE CONFIGURATION command completed successfully.
SQL1363W  One or more of the parameters submitted for immediate modification
were not changed dynamically. For these configuration parameters, all
applications must disconnect from this database before the changes become
effective.

Validate Database Configuration

db2inst1@vcenter00:~> db2 get database config for vcdb | grep LOG
Catalog cache size (4KB)              (CATALOGCACHE_SZ) = 300
Log buffer size (4KB)                        (LOGBUFSZ) = 256
Log file size (4KB)                         (LOGFILSIZ) = 8192
Number of primary log files                (LOGPRIMARY) = 16
Number of secondary log files               (LOGSECOND) = 112
Changed path to log files                  (NEWLOGPATH) =
Path to log files                                       = /storage/db/db2/home/db2inst1/db2inst1/NODE0000/SQL00001/SQLOGDIR/
Overflow log path                     (OVERFLOWLOGPATH) =
Mirror log path                         (MIRRORLOGPATH) =
Block log on disk full                (BLK_LOG_DSK_FUL) = NO
Block non logged operations            (BLOCKNONLOGGED) = NO
Percent max primary log space by transaction  (MAX_LOG) = 0
Num. of active log files for 1 active UOW(NUM_LOG_SPAN) = 0
Log retain for recovery enabled             (LOGRETAIN) = OFF
First log archive method                 (LOGARCHMETH1) = OFF
Options for logarchmeth1                  (LOGARCHOPT1) =
Second log archive method                (LOGARCHMETH2) = OFF
Options for logarchmeth2                  (LOGARCHOPT2) =
Log pages during index build            (LOGINDEXBUILD) = OFF
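
To get a feel for what the change means on disk: LOGFILSIZ is counted in 4KB pages (see the "(4KB)" in the output), so the maximum log space is roughly (LOGPRIMARY + LOGSECOND) x LOGFILSIZ x 4KB. The sketch below does that arithmetic for the old and new values. It is an approximation that ignores the small per-file overhead, and log_space_mb is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: approximate maximum transaction log space in MB.
# Arguments: LOGPRIMARY LOGSECOND LOGFILSIZ (LOGFILSIZ is in 4KB pages).
log_space_mb() {
    logprimary=$1
    logsecond=$2
    logfilsiz=$3
    # pages * 4 gives KB; dividing by 1024 gives MB.
    echo $(( (logprimary + logsecond) * logfilsiz * 4 / 1024 ))
}

log_space_mb 13 4 1024     # old settings, prints 68
log_space_mb 16 112 8192   # new settings, prints 4096
```

So the log can grow from about 68 MB to about 4 GB.  Primary log files are allocated up front (16 x 8192 x 4KB is about 512 MB here) and secondary files only as needed, so it is worth checking that the filesystem holding the log path shown above has room for this.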

Exit and Restart Service

db2inst1@vcenter00:~> exit
# service vmware-vpxd start

So there you go.  Hopefully one of these methods will help you resolve this issue in your environment.
