Logging – Part Five: Analyse and Test Logging Configuration Information

To analyse log files and resolve issues, we need to view logs such as vmkernel.log and vmksummary.log, using a program such as tail to display the end of the file. In this example we will examine vmkernel.log, which records messages related to storage and contains SCSI sense codes. SCSI sense codes are an industry standard maintained by Technical Committee T10, and the ESXi host conforms to this standard. More information on SCSI sense codes can be found at http://www.t10.org/lists/1spc-lst.htm.
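
For readers who want to do the same thing from a script instead of tail, the following is a minimal Python sketch of reading the last few lines of a log. The function name last_lines and the path /var/log/vmkernel.log in the usage comment are assumptions for illustration; adjust the path to wherever your host keeps its logs.

```python
from collections import deque


def last_lines(stream, count=10):
    """Return the final `count` lines of an iterable of lines, similar to tail."""
    # A bounded deque drops older lines as newer ones arrive, so memory use
    # depends on `count` rather than on the size of the log file.
    return [line.rstrip("\n") for line in deque(stream, maxlen=count)]


# Example use (the path is an assumption, adjust it for your host):
# with open("/var/log/vmkernel.log", errors="replace") as log:
#     for line in last_lines(log, 20):
#         print(line)
```

Passing an open file object works because Python iterates over a file line by line.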

The SCSI status code is sent during the status phase, which occurs prior to the Command Complete message and indicates success or failure. Any time a SCSI command is sent to a target, the initiator expects a completion status. When the status is Check Condition, the device also supplies sense data (a sense key plus additional codes) describing the problem. The status codes and sense keys are listed below:

Status Code    Description
00h            Good
02h            Check Condition
04h            Condition Met
08h            Busy
18h            Reservation Conflict
28h            Task Set Full
30h            ACA Active
40h            Task Aborted
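
The status table above can be captured as a small lookup so that a raw status byte from a log entry can be turned into its description. This is only a sketch: the names SCSI_STATUS and describe_status are made up for illustration, and the values are copied from the table.

```python
# Status codes copied from the table above (hex value -> description).
SCSI_STATUS = {
    0x00: "Good",
    0x02: "Check Condition",
    0x04: "Condition Met",
    0x08: "Busy",
    0x18: "Reservation Conflict",
    0x28: "Task Set Full",
    0x30: "ACA Active",
    0x40: "Task Aborted",
}


def describe_status(code):
    """Return the description for a SCSI status byte, or a fallback string."""
    return SCSI_STATUS.get(code, f"Unknown status (0x{code:02x})")
```

For example, describe_status(0x2) gives "Check Condition", which is the device status seen in the log entry examined later in this post.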

SCSI Sense Key    Description
0h                No Sense
1h                Recovered Error
2h                Not Ready
3h                Medium Error
4h                Hardware Error
5h                Illegal Request
6h                Unit Attention
7h                Data Protect
8h                Blank Check
9h                Vendor Specific
Ah                Copy Aborted
Bh                Aborted Command
Dh                Volume Overflow
Eh                Miscompare
Fh                Completed
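
The sense keys can be modelled in a similar way. The sketch below uses a Python IntEnum so that each key has a symbolic name; the class name SenseKey and the helper sense_key_label are hypothetical, and only the keys listed in the table are included (Ch is deliberately left out because the table does not list it).

```python
from enum import IntEnum


class SenseKey(IntEnum):
    """SCSI sense keys as listed in the table above."""
    NO_SENSE = 0x0
    RECOVERED_ERROR = 0x1
    NOT_READY = 0x2
    MEDIUM_ERROR = 0x3
    HARDWARE_ERROR = 0x4
    ILLEGAL_REQUEST = 0x5
    UNIT_ATTENTION = 0x6
    DATA_PROTECT = 0x7
    BLANK_CHECK = 0x8
    VENDOR_SPECIFIC = 0x9
    COPY_ABORTED = 0xA
    ABORTED_COMMAND = 0xB
    VOLUME_OVERFLOW = 0xD
    MISCOMPARE = 0xE
    COMPLETED = 0xF


def sense_key_label(value):
    """Return a readable label for a sense key, or a fallback for unlisted values."""
    try:
        return SenseKey(value).name.replace("_", " ").title()
    except ValueError:
        return f"Unlisted sense key (0x{value:x})"
```

An enum is used here instead of a plain dictionary so that code comparing keys can write SenseKey.ILLEGAL_REQUEST rather than a bare 0x5.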

Taking an entry from the vmkernel.log file, we can analyse the error message being reported and isolate the SCSI event:

2015-02-14T08:29:47.778Z cpu1:32784)ScsiDeviceIO: 2337: Cmd(0x412e80896940) 0x1a, CmdSN 0x3884 from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0

From the Status Code received we can determine that the device is reporting a Check Condition.

  • Host Status – H:0x0 = Good
  • Device Status – D:0x2 = Check Condition
  • Plugin Status – P:0x0 = Good
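
Pulling these fields out by eye becomes tedious when there are many entries. The following is a rough sketch of extracting them with a regular expression; the pattern is written against the single log entry shown above and assumes other entries use the same "H:... D:... P:... Valid sense data: ..." layout, which may not hold for every ESXi release. The names ENTRY and parse_scsi_failure are made up for this example.

```python
import re

# Matches the status and sense portion of the entry shown above.
HEX = r"0x[0-9a-fA-F]+"
ENTRY = re.compile(
    rf"H:(?P<host>{HEX}) D:(?P<device>{HEX}) P:(?P<plugin>{HEX}) "
    rf"Valid sense data: (?P<key>{HEX}) (?P<asc>{HEX}) (?P<ascq>{HEX})"
)


def parse_scsi_failure(line):
    """Return the status and sense fields of a log line as integers, or None."""
    match = ENTRY.search(line)
    if match is None:
        return None
    # int(text, 16) accepts the "0x" prefix used in the log.
    return {name: int(text, 16) for name, text in match.groupdict().items()}


line = (
    '2015-02-14T08:29:47.778Z cpu1:32784)ScsiDeviceIO: 2337: '
    'Cmd(0x412e80896940) 0x1a, CmdSN 0x3884 from world 0 to dev '
    '"mpx.vmhba32:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 '
    'Valid sense data: 0x5 0x20 0x0.'
)
fields = parse_scsi_failure(line)
print(fields)
```

The returned dictionary holds the host, device and plugin status along with the sense key, additional sense code and qualifier, ready to be looked up in the tables earlier in this post.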

From the SCSI sense key (0x5) we can determine that the error is being reported due to an Illegal Request. Finally, we need to determine the cause of the error by analysing the additional sense code (ASC, 0x20) and the additional sense code qualifier (ASCQ, 0x0). The list of additional SCSI sense data is far too detailed to document in this blog, so the following can be used as a reference: http://www.t10.org/lists/asc-alph.txt.

From the additional sense code and ASC qualifier, we can now determine that the error reported by the device is 'INVALID COMMAND OPERATION CODE', meaning the device does not support the command it was sent (0x1a in this entry). This may be due to the device not supporting VPD pages.
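
The ASC and ASCQ lookup can be sketched as a dictionary keyed on the pair of values. Only the first entry comes from this post; the other two are added from the T10 list purely to show the shape of the table, and a real lookup would need many more entries from that list. The names ASC_ASCQ and describe_additional_sense are hypothetical.

```python
# (ASC, ASCQ) -> description. A tiny excerpt only; the T10 list has many more.
ASC_ASCQ = {
    (0x20, 0x00): "INVALID COMMAND OPERATION CODE",  # the case in this post
    (0x24, 0x00): "INVALID FIELD IN CDB",            # illustrative extra entry
    (0x29, 0x00): "POWER ON, RESET, OR BUS DEVICE RESET OCCURRED",  # illustrative
}


def describe_additional_sense(asc, ascq):
    """Return the description for an ASC/ASCQ pair, or point at the T10 list."""
    text = ASC_ASCQ.get((asc, ascq))
    if text is None:
        return f"ASC 0x{asc:02x} / ASCQ 0x{ascq:02x}: not in this excerpt, see the T10 list"
    return text
```

Keying on the pair rather than on the ASC alone matters because the same additional sense code can have different meanings depending on its qualifier.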

