To timidly go where many have gone before: Unkillable processes

Wednesday, December 20, 2006

Unkillable processes

One of the blogs I read religiously is Ben Rockwood's. He has some interesting anecdotes (that's http://www.cuddletech.com/blog/pivot/entry.php?id=780, in case you get the spam warning instead of the blog) about using OpenSolaris in production at Joyent, including one about an unkillable process.

I mailed the link to a couple of former colleagues, mostly because I thought they might be interested in the NFS-over-ZFS anecdote (given that they work at an ISP). Apparently I jinxed them -- just after getting in to work the next morning, they discovered an unkillable process on one of their Solaris 10 boxes. And it was also a process running in a zone, so it was impossible to reboot the zone to clear it up.

Sorry, guys.

(BTW, this appeared to be a deadlock situation. The process had two threads, one stuck in cv_wait() via exitlwps() and the other stuck in cv_wait() via tcp_close(). Given that I don't work there anymore, I couldn't really go crash-dump diving, but I'd bet that there were no other threads on the system that were going to call cv_signal() or cv_broadcast() on that particular CV.)
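To make the "nobody left to signal" idea concrete, here is a minimal user-space sketch in Python. It is only an analogy, not the Solaris kernel code: `threading.Condition` stands in for a kernel CV, and the names `wait_for_signal` and `run_demo` are hypothetical. The timeout is there purely so the demo terminates; the kernel's cv_wait() takes no timeout, so a waiter with no signaler simply sleeps forever.

```python
import threading


def wait_for_signal(cv, state, timeout):
    """Block until state['ready'] is set; return True if woken, False on timeout."""
    with cv:
        # wait_for() re-checks the predicate on every wakeup and gives up
        # after `timeout` seconds. The timeout is an assumption of this
        # demo only, so that the stuck case can be observed and reported.
        return cv.wait_for(lambda: state["ready"], timeout=timeout)


def run_demo(with_signaler, timeout=0.2):
    """Run one waiter thread, optionally with a thread that signals it."""
    cv = threading.Condition()
    state = {"ready": False}
    result = {}

    def waiter():
        result["woken"] = wait_for_signal(cv, state, timeout)

    t = threading.Thread(target=waiter)
    t.start()

    if with_signaler:
        # Rough analogue of some other thread calling cv_broadcast().
        with cv:
            state["ready"] = True
            cv.notify_all()

    t.join()
    return result["woken"]


if __name__ == "__main__":
    print("with a signaler:   ", run_demo(with_signaler=True))
    print("without a signaler:", run_demo(with_signaler=False))
```

With a signaler the waiter returns normally. Without one it only comes back because of the artificial timeout, which is the escape hatch the stuck threads did not have.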

# posted by Chad Mynhier @ 5:06 PM

Comments:

Goat entrails to undo the jinx are coming via FedEx.

# posted by Anonymous : 7:15 AM

Interesting anecdote by Ben Rockwood there.
osgeek

# posted by Anonymous : 10:20 AM
