ATPM 12.10
October 2006

How To

by Sylvester Roque, sroque@atpm.com

Crash Logs: What Are They and What Do They Mean?

Most Mac users have noticed a wealth of benefits since making the shift from OS 9 to OS X. Arguably, the most important of these is the overall increased stability of the OS. I hate to admit it, but I have had more experiences with crashes on my dual 2 GHz G5 than I would like. I can almost hear some of my Windows-using friends laughing maniacally even as I type this.

The first few weeks were fine. Then I began experiencing kernel panics that turned out to be memory-related. Once I resolved that problem, months went by with no issues at all. Things performed as flawlessly as we have come to expect from Macs. Then I began experiencing kernel panics on boot up. After a bit of frustration, I discovered that my Mac would boot in safe mode and I could then reboot the system normally without any crashing. Before I could resolve the issue, a software update must have fixed the problem because it has gone away and not recurred. While I was experiencing that problem I got into the habit of leaving my Mac on and simply putting it to sleep when it wasn’t in use.

Most recently, I have experienced a crash that seems to be application-specific. My wife has been playing Second Life and sometimes uses my Mac to run characters. Most of the time things are fine, but once in a while the game crashes. The crashes are usually confined to that game, but sometimes the entire system grinds to a halt, forcing me to power down and reboot. Even with all these problems, I am not a troubleshooting genius, but there may be some things you can learn from my experiences.

Know Your System at Its Best

Right now, while the system is stable, take notice of what’s installed. I don’t mean you have to spend a great deal of time jotting down everything that’s installed on your Mac, but it does help to have some idea what’s on your system. It can be particularly difficult to remember this information if you are responsible for maintaining multiple Macs. In the past, I have suggested using the System Profiler report as the basis of a good troubleshooting log. As new things are added to the system, jot them down. You won’t need this information often, but if you do you’ll be glad to have it handy.

Since things are working properly, this would be a great time to clone your system to a second hard drive. I addressed this issue in a previous article about cloning. Since that time, new tools have become available. No matter which application you use to clone the system, be sure to use the most current version for your operating system. Also, remember to make regular backups of your data. These are perhaps the two most important troubleshooting steps you will ever perform. With these steps completed, you can get up and running again in no time by booting from the cloned system.

If you have a well-behaved system at the moment, create a new user account that will only be used in your troubleshooting efforts. Do not add hacks, add-ons, or other “enhancements” to this account. When a problem occurs in your normal account, log in to the troubleshooting account and attempt to recreate the problem. If it doesn’t occur in this account, the problem may well be file corruption or other problems in your main user account.

When a problem occurs and your system is not performing flawlessly, do not panic. Although OS X is quite complex, solving its problems can sometimes be remarkably simple. In addition to causing a great deal of stress, panic tends to inhibit your best troubleshooting tools—clear, logical thought and careful observation.

Detecting the pattern underlying a single application crash might not be too difficult for an experienced computer user, but things are often not that simple. Multi-tasking makes it possible to have several applications open simultaneously. Things are also complicated by the inherent stability of OS X, which allows many Macs to be left on constantly and therefore unattended for hours at a time. Given this set of circumstances, how is a Mac user supposed to determine the probable cause of a crash? Enter Console and the crash log.

Crash Logs—What Are They and Where Are They?

Crash logs are yet another indication of the Unix heritage underlying OS X. Sometimes it seems that Unix logs almost everything, good or bad, that happens on a system. You might not have been watching when your system crashed, but chances are there is a text file somewhere that has logged enough information for someone to reconstruct exactly what was happening at the time of the crash. Think of it as flight data recording for your computer. These logs can give developers much more detailed insight about a crash than most users could hope to provide. Do you know what block of memory your Mac was accessing the last time it crashed? Neither do I, but the crash logs know. Now that we know what a crash log is, where is it?

Most crash logs are stored in an individual user’s home directory. Starting from your home folder (the one named after your user name), follow the path to Library/Logs/CrashReporter. The crash logs will be inside that folder. How many there are will depend on how often your Mac crashes and how often you clear out these files. Until we began having difficulty with Second Life, I had not logged a crash of any sort in months. According to Apple, there are some special circumstances in which crash logs are written to:

/Library/Logs/CrashReporter/<ProgramName>.crash.log

Crash logs are written here if any of the following circumstances are true: ownership of the crashed process cannot be determined, the crashed process was owned by the root user at the time of the crash, or the user’s home directory is not writable.
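If you would rather list the logs from a script than browse for them, the two locations described above can be checked with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and it assumes the files follow the <ProgramName>.crash.log naming shown above.

```python
from pathlib import Path


def crash_log_dirs(home):
    """Return the two folders named above: the per-user one, then the system-wide one."""
    return [
        Path(home) / "Library" / "Logs" / "CrashReporter",
        Path("/Library/Logs/CrashReporter"),
    ]


def find_crash_logs(directories):
    """List *.crash.log files in each folder, newest first; missing folders are skipped."""
    logs = []
    for directory in directories:
        if directory.is_dir():
            logs.extend(directory.glob("*.crash.log"))
    return sorted(logs, key=lambda path: path.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for log in find_crash_logs(crash_log_dirs(Path.home())):
        print(log)
```

Run as a script, it prints one path per line with the most recently modified log first, and prints nothing if neither folder exists.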

You can access crash logs using Console, which is in the /Applications/Utilities folder on your hard drive. Once you have launched the program, you should see a list of logs on the left side of the screen. Clicking a program’s triangle will show a list of logs for that program. Clicking one of the log files will display the contents of that log in the right pane of the window. If you do not see the list of logs on the left side of the screen, click the Logs icon and the list should appear.

What Do They Mean?

Crash logs may be among the most daunting and least user-friendly aspects of OS X. That’s a bit more understandable when you consider that these files were intended to be used by developers as a means of improving their software. You and I might not understand these things very well, but developers do understand and make use of them. Even if they don’t give end users the kind of information needed to fix a problem, we can glean a modicum of information, so let’s take a brief look at the contents. If you subscribe to the MacFixIt site you can find a somewhat more detailed explanation here. If you are not a MacFixIt subscriber, or would simply like a more detailed overview, consult this technical article.

The first few lines of a crash log contain the date and time of the crash as well as OS version information. This includes the version of the operating system as well as the build number. Build numbers are a bit more specific than OS version numbers. If two users purchased different models of Macs with the same OS version, the build numbers might be different due to differences in the hardware. That section of the report will look something like this:

Date/Time:   2006-08-26 21:58:27.846 -0500
OS Version:   10.4.7 (Build 8J135)
Report Version: 4
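For readers who like to poke at these files with a script, the header shown above can be split into its parts. The following is a rough sketch rather than an official parser: the function names are hypothetical, and it assumes every header line has the "Key: value" shape of the sample.

```python
import re
from datetime import datetime

# The three header lines quoted above, used as sample input.
SAMPLE_HEADER = """\
Date/Time:   2006-08-26 21:58:27.846 -0500
OS Version:   10.4.7 (Build 8J135)
Report Version: 4
"""


def parse_header(text):
    """Collect 'Key: value' lines into a dict, splitting on the first colon only."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def split_os_version(value):
    """Separate '10.4.7 (Build 8J135)' into the OS version and the build number."""
    match = re.fullmatch(r"(\S+) \(Build (\w+)\)", value)
    return (match.group(1), match.group(2)) if match else (value, None)


header = parse_header(SAMPLE_HEADER)
# %f reads the fractional seconds and %z the UTC offset.
when = datetime.strptime(header["Date/Time"], "%Y-%m-%d %H:%M:%S.%f %z")
version, build = split_os_version(header["OS Version"])
print(when.isoformat(), version, build)
```

Splitting on the first colon only matters here because the time of day contains colons of its own.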

The next segment of the crash report identifies the process that crashed, the parent process, and the version number. This information may be useful if you are not sure what application led to the crash. It can be misleading at times, since the process that crashed may, in fact, have been called by another process. It is not uncommon, for example, for developers to call upon processes written by Apple as part of the OS. Here is an example of that segment of the report. In this case, my ATI graphics card seems to be one component of the problem.

Command: ATI Monitor
Path: /Applications/Utilities/ATI Utilities/ATI Displays.app/
      Contents/Resources/ATI Monitor.app/Contents/MacOS/ATI Monitor
Parent: WindowServer [225]
Version: ??? (???)
PID:  244
Thread: 0
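The process segment can be read the same way, with one wrinkle: in the sample above the Path value runs onto a second, indented line. The sketch below assumes that an indented line is simply a wrapped continuation of the previous field, which may be an artifact of how the log is laid out on this page rather than of the log itself. The function names are hypothetical.

```python
import re

# The process segment quoted above, used as sample input.
SAMPLE_SEGMENT = """\
Command: ATI Monitor
Path: /Applications/Utilities/ATI Utilities/ATI Displays.app/
      Contents/Resources/ATI Monitor.app/Contents/MacOS/ATI Monitor
Parent: WindowServer [225]
Version: ??? (???)
PID:  244
Thread: 0
"""


def parse_process_segment(text):
    """Read 'Key: value' lines; an indented line is glued onto the previous value."""
    fields = {}
    last_key = None
    for line in text.splitlines():
        if line[:1].isspace() and last_key is not None:
            fields[last_key] += line.strip()  # assumed wrapped continuation
            continue
        key, sep, value = line.partition(":")
        if sep:
            last_key = key.strip()
            fields[last_key] = value.strip()
    return fields


def split_parent(value):
    """Turn 'WindowServer [225]' into ('WindowServer', 225)."""
    match = re.fullmatch(r"(.+) \[(\d+)\]", value)
    return (match.group(1), int(match.group(2))) if match else (value, None)


fields = parse_process_segment(SAMPLE_SEGMENT)
parent_name, parent_pid = split_parent(fields["Parent"])
print(fields["Command"], "was started by", parent_name, "with process ID", parent_pid)
```

Knowing the parent's name and process ID is what lets you follow the chain when the process that crashed was called by another one.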

The next piece of information is the type of crash that occurred. These types are usually referred to as exceptions. I doubt this information is of much use to end users troubleshooting a crash. There is even some question about just how useful it is for developers. Apple has identified the four most common types of exceptions (crashes), each of which is summarized briefly below:

KERN_INVALID_ADDRESS

The thread in question is making an attempt to use unmapped memory. This error can be caused either by data or by an instruction.

KERN_PROTECTION_FAILURE

This is always a data-related issue. The questionable process is attempting to write data to an area of memory that has been reserved as read-only.

BAD_INSTRUCTION

There is something wrong with the instruction that a thread is attempting to execute.

ARITHMETIC/EXC_I386_DIV

This error is specific to Intel-based Macs and occurs when the thread in question attempts to divide an integer by zero.

In my case, the error in question turned out to be KERN_INVALID_ADDRESS (0x0001) at 0xbf7fffe0. The game Second Life was running at the time, and checking its log was what pointed me to the ATI crash log. The Second Life log indicated a very low frames-per-second rate immediately before the crash. Since Second Life can be both memory- and graphics-intensive, my initial suspicion was that the game was pushing the memory and graphics limitations of the computer. ATPM publisher Michael Tsai, who has much more application development experience than I do, tells me this error usually means there has been some corruption of an application’s memory. If that’s the case, the culprit is likely an application bug or operating system bug.
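The four summaries above, together with the error string quoted in the previous paragraph, lend themselves to a small lookup. This is only a sketch under stated assumptions: the pattern is modeled on that single quoted string, the exception names are spelled as they appear in this article, and the function name is hypothetical.

```python
import re

# Plain-language notes for the four exception types summarized above.
EXCEPTION_NOTES = {
    "KERN_INVALID_ADDRESS": "the thread tried to use unmapped memory",
    "KERN_PROTECTION_FAILURE": "the thread tried to write to read-only memory",
    "BAD_INSTRUCTION": "the thread tried to execute a bad instruction",
    "ARITHMETIC/EXC_I386_DIV": "integer divide by zero on an Intel-based Mac",
}

# Shape assumed from the one example quoted above: NAME (0xCODE) at 0xADDRESS
PATTERN = re.compile(
    r"(?P<name>[A-Z][A-Z0-9_/]*) \((?P<code>0x[0-9a-fA-F]+)\) at (?P<address>0x[0-9a-fA-F]+)"
)


def describe_exception(text):
    """Return the exception name, code, address, and a short note, or None if absent."""
    match = PATTERN.search(text)
    if match is None:
        return None
    name = match.group("name")
    return {
        "name": name,
        "code": int(match.group("code"), 16),
        "address": int(match.group("address"), 16),
        "note": EXCEPTION_NOTES.get(name, "not one of the four types listed above"),
    }


info = describe_exception("KERN_INVALID_ADDRESS (0x0001) at 0xbf7fffe0")
print(info["name"], "means", info["note"])
```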

The last portion of the crash log is often referred to as a backtrace. It identifies which thread crashed and the steps occurring immediately before the crash. Items are listed in reverse chronological order. The first column indicates the order, with item 0 being the most recent. The second column indicates the library containing the code for that line. The third column is a program counter address, and the fourth column lists the name of the function that was running at the time of the crash. One line of the report will look something like this:

Thread 0 Crashed:
0 com.apple.CoreFoundation 0x907ba1c0 _CFRuntimeCreateInstance + 36

This segment of the report can run for many lines. Although these lines are, for the most part, unintelligible to the average user, careful examination may provide clues to what the application was doing at the time of the crash. If you are lucky, this segment will contain names that are somewhat descriptive, hinting at the exact tasks the application was performing.
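The four columns described above can be pulled apart with a regular expression. The sketch below is modeled only on the single sample line shown earlier, so treat it as a starting point. It assumes that library names contain no spaces and that the crashed thread's frames end at the first line that does not look like a frame. The trailing "+ 36" is the offset, in bytes, into the named function. The names FRAME, CRASHED, and parse_backtrace are hypothetical.

```python
import re

# One frame: index, library, program counter address, function, optional "+ offset".
FRAME = re.compile(
    r"^(?P<index>\d+)\s+(?P<library>\S+)\s+(?P<address>0x[0-9a-fA-F]+)\s+"
    r"(?P<function>.+?)(?:\s\+\s(?P<offset>\d+))?$"
)
CRASHED = re.compile(r"^Thread (?P<thread>\d+) Crashed:")

# The two backtrace lines quoted above, used as sample input.
SAMPLE_BACKTRACE = """\
Thread 0 Crashed:
0 com.apple.CoreFoundation 0x907ba1c0 _CFRuntimeCreateInstance + 36
"""


def parse_backtrace(text):
    """Return the crashed thread number and its frames, most recent frame first."""
    crashed_thread = None
    frames = []
    in_crashed = False
    for raw in text.splitlines():
        line = raw.strip()
        header = CRASHED.match(line)
        if header:
            crashed_thread = int(header.group("thread"))
            in_crashed = True
            continue
        if in_crashed:
            frame = FRAME.match(line)
            if frame is None:
                break  # a blank line or a new section ends the crashed thread
            offset = frame.group("offset")
            frames.append({
                "index": int(frame.group("index")),
                "library": frame.group("library"),
                "address": int(frame.group("address"), 16),
                "function": frame.group("function"),
                "offset": int(offset) if offset else None,
            })
    return crashed_thread, frames


thread, frames = parse_backtrace(SAMPLE_BACKTRACE)
for frame in frames:
    print(frame["index"], frame["library"], frame["function"])
```

Printing only the library and function columns is often enough to spot the descriptive names mentioned above.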

What Do You Do Now?

Now it’s time to put your observation and detection skills to work. No matter how simple or complex the problem you are trying to solve, troubleshooting is essentially a matter of answering four basic questions. What type of problem are you having? When does the problem occur? What seem to be the contributing factors? How do I solve the problem?

The first question to answer is whether this appears to be a kernel panic, which affects the entire system, or an application crash, which usually affects only one program. Kernel panics are often the result of hardware issues or problems with kernel extensions. Although hardware is often an issue in these types of crashes, do not assume any hardware has failed. In my own experience, kernel panics are sometimes caused by faulty hardware, as they were with my memory chips, but they can also be due to things such as memory and graphics cards not being properly seated in their respective slots. Have you opened the case and installed any new components recently? If so, carefully check these connections using appropriate safety procedures.

Application-specific crashes usually affect a specific program, leaving the rest of the system intact. For these types of problems you’ll want to know what applications were running at the time. If you were at the computer at the time of the crash, what were you doing? Recreate those steps to see if the crash continues to occur. (You are actually trying to crash the program. More accurately, you are trying to reproduce the circumstances that led up to the crash.)

Solve the Problem

If you have gotten this far, you may have an idea of potential problem areas to examine. Here are some general tips to follow; after that, I will point you in the direction of some more specific information.

Simplify the System

When a problem occurs, try to reduce the number of issues that must be investigated. If you suspect the problem may be hardware-related, start with the simplest things first. Check all power and data cables to make sure they are properly attached. If that doesn’t solve the problem, disconnect as much extraneous hardware as possible and reconnect things one at a time until you have everything reattached.

If you are trying to simplify a software issue, try logging in to the troubleshooting account you created earlier. If the same problem does not occur in that account, you can now start looking at files within your user account as the possible culprit. If the problem is occurring in both accounts, restart your system with the Shift key held down. This forces the system to load only those kernel extensions absolutely necessary for the system to operate. If the problem goes away, then the issue may well be caused by something common to both accounts.
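The order of those software checks can be written down as a tiny decision function, which may be handy if you keep troubleshooting notes in a script. It is purely illustrative: the function name and wording are made up, and the final branch (the problem persists even with the Shift key held down) is not covered by the paragraph above, so the advice there is an assumption.

```python
def next_step(fails_in_test_account, fails_in_safe_boot=None):
    """Suggest the next troubleshooting step from what has been observed so far.

    Pass None for fails_in_safe_boot if the Shift-key restart has not been tried yet.
    """
    if not fails_in_test_account:
        return "Look at files in your main user account."
    if fails_in_safe_boot is None:
        return "Restart with the Shift key held down and try again."
    if not fails_in_safe_boot:
        return "Suspect something common to both accounts, such as a kernel extension."
    # Assumption: the text above does not cover this case.
    return "Keep investigating; the hardware checks may be worth repeating."


print(next_step(fails_in_test_account=True))
```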

There are several other keyboard shortcuts that can be invaluable in troubleshooting application or system crashes. This list not only contains useful troubleshooting keyboard shortcuts, but also other shortcuts commonly used in daily operation. Print this list, keep it handy, and before you know it you will be using the keyboard for activities you thought required the mouse.

Learn From Your Fellow Mac Users

I have mentioned before that I have found several Mac-related sites invaluable for solving problems and getting new ideas. If you haven’t already done so, check out Mac Owners Support Group, MacMentor, or OSXFAQ. These sites contain a wealth of information, and joining them is free. While you are at the OSXFAQ site, head to the forums and grab this general troubleshooting guide for OS X. Chain this guide somewhere near your Mac for future reference. It’s a much more concise reference than most things I’ve seen elsewhere. I also use MacFixIt to keep up with late-breaking troubleshooting news. The late-breaking updates are free, but for advanced searching and extended troubleshooting guides you’ll want to spend the $25 per year to become a subscriber.

Final Thoughts

By now you have probably at least glanced at the information referenced in this article. Here are three tips you may not find written anywhere else. The first one is to start with the simplest possible explanation for the problem and work from there. I spent 20 minutes one day trying to decide why my G5 refused to power up at all. Since this was in the middle of the kernel panic phase, I was ready for a major hardware failure. It turns out that the power cord had pulled out of the machine just enough to break contact and prevent power up. On visual inspection everything looked fine. I found the problem when out of sheer desperation I started retracing my steps.

Once you have checked the obvious, my second tip is to check the simplest things first. During the time I was having memory-related problems, I opened the case several times to make sure the questionable chips were installed properly. On one of these occasions, I did not hear the usual system chime as things powered up. That chime occurs after your Mac has passed the Power On Self Test (POST). If your Mac fails the POST, there is likely a hardware issue that needs to be resolved. Generally it means that some internal piece of hardware is not connected properly or has failed. I immediately assumed the worst. It turns out I had reconnected my external speakers, which disables the internal speaker. Since my external speakers weren’t connected to an electrical outlet at the time, there was no sound. Boy, was I relieved. That’s a much cheaper fix than I was expecting.

I picked up the last tip in the pre–OS X days. It came from a program that listed OS 9 error codes, their meanings, and some possible solutions. If an application crashes when you perform a certain step in a program, try a different means of triggering the same step to see if the program still crashes. Suppose your favorite program quits when you use Command-C to copy information to the clipboard. Try initiating the copy operation from the Edit menu using the mouse. If the program still crashes, that’s one more piece of information about the problem. If the program doesn’t crash, you have a viable workaround until a fix is released for the problem.

That’s it for now. We’ll see what happens next month.
