Sunday, December 31, 2006

Updates for mdb, dtrace and diskio pages

I just sent a pile of updates to Princeton. They will probably be added to the site on the 2nd.

The index page has been re-written to include pointers to some of the major sources used for the material on the site.

An Intermittent Problems page has been added to describe how to troubleshoot faults that appear only sporadically.

The mdb, dtrace and Disk I/O pages have been rewritten and expanded. The adb and mdb pages were merged to reflect the fact that Solaris 7 is effectively a dead OS at this point. The Disk I/O page was updated to reflect current Solaris 10 information from Solaris Internals (McDougall and Mauro) and Solaris Performance and Tools (McDougall, Mauro and Gregg), and especially to include pointers to the really cool tools on the DTrace Toolkit page.

The kstat page has been updated to provide some additional information, and the netstat page has been changed to reflect the death of netstat -k.

The next major effort for the site will be an expansion of the zones page. (The current page is really not much more than a placeholder to avoid dead links on the other pages that refer to zones.)

I am also working on a root cause analysis page. This page is requiring a lot of reading of business publications; the business community seems to be well ahead of us in thinking about this issue.

--Scott

Tuesday, December 26, 2006

mdb and kmdb pages

I've submitted the following new pages to Princeton for inclusion: Intermittent Problems, mdb and kmdb. I also made significant improvements to the dtrace page.

Thursday, December 21, 2006

dtrace, methodology, SMF

I've submitted pages on general methodology, dtrace and SMF. Depending on Princeton's work schedule, they may not be up until after the holidays.

Monday, December 18, 2006

SysAdmin article

An expanded and rewritten version of the Resource Management page has been tentatively accepted by SysAdmin for its April 2007 issue.

Updated pages

I've submitted some updated pages for Resource Management, ZFS and Scheduling.

I also added the beginning of a page on Zones.

--Scott

Thursday, December 07, 2006

ZFS and Resource Pools

Additional pages for ZFS and Resource Pools have been submitted. The scheduler page has been expanded.

Sources for the ZFS page include the following:

Solaris ZFS Administration Guide


Brune, Corey, ZFS Administration, SysAdmin Magazine Jan 2007

Tuesday, December 05, 2006

Sunday, December 03, 2006

Ishikawa and Interrelationship Diagrams

I've been working on a page including information on some formal troubleshooting methods. En route, I have been looking at Cause-and-Effect (Ishikawa fishbone) diagrams and Interrelationship Diagrams.

Here are some of the noteworthy web pages I've been looking at:

Concordia: Cause and Effect Diagram and Concordia: Interrelationship Diagram provide a nice introduction to the two types of diagrams.

HCI Cause and Effect Diagram provides a slightly longer article, including some historical information about Ishikawa diagrams.

balancedscorecard.org Cause and Effect Diagram provides a much more in-depth view of Ishikawa diagrams.


questlearningskills.org Interrelationship Diagrams provides a how-to-level article about Interrelationship Diagrams.


ASQ Interrelationship Diagrams provides a slightly longer article about Interrelationship Diagrams.


Root Cause Analysis: A Framework for Tool Selection provides a nice comparison of Ishikawa and Interrelationship diagrams, as well as Current Reality diagrams.

Thursday, November 30, 2006

Resource Management resources

I've been looking at documentation on Resource Management over the last few days. Here are some of the articles that I have found. Unfortunately, much of the information I found is based on the Solaris 9 and even Solaris 8 implementations of Resource Manager, so it is only somewhat useful when looking at Solaris 10.

If you are aware of additional resources, please feel free to add them to the comments on this post.

Here are the best items I've found:

System Administration Guide: Solaris Containers-Resource Management and Solaris Zones from the Solaris 10 documentation. This is quite well-written, though organized differently than I would have done it.

The Sun BluePrints Guide to Solaris Containers by Foxwell, Lageman, Hoogeveen, Rozenfeld, Setty and Victor. The Resource Management section is also quite well-written, and I found the organization to be more helpful than the manual in the Solaris 10 docs.

Solaris Resource Management by Galvin in SysAdmin. This is a high-level introduction. Though it is specific to Solaris 9, it is still the best quick introduction to the subject I've come across.

Capping a Solaris processes memory by matty is a blog page describing the ability of Solaris 10 to use rcapd to manage memory. This is a brief but thorough discussion of this topic.

Wednesday, November 29, 2006

Introduction

This blog is designed to be a companion to my Solaris Troubleshooting web site, hosted by Princeton University.

I used to have an email link to solicit feedback on the web site. I received some outstanding feedback, but I also received an outstanding amount of spam.

I am in the process of updating the site to include more Solaris 10 specific information, especially with regard to Resource Management and dtrace. I've posted a first cut at a Resource Management page.

Thanks to everyone who contributed to the old Solaris 8 site, and a special thanks to Princeton University for continuing to host the site long after I no longer worked on their Unix team.

--Scott Cromar