Friday, December 15, 2006
ZFS, DTrace, and fault domains
An interesting similarity between ZFS and DTrace occurred to me last night. One of the things ZFS gives you is the ability to sit outside the fault domains between an application and its data on disk and catch any corruption that's introduced anywhere in those fault domains. You can't rely on a RAID controller for that, because it can't catch corruption introduced between it and the server.
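To make the end-to-end idea concrete, here is a minimal Python sketch. It is not how ZFS is implemented; the `ChecksummedStore` class and its `corrupt` method are hypothetical names invented for illustration. The point it tries to show is only that a checksum kept apart from the data, and verified on every read, catches a change made anywhere along the path in between.

```python
import hashlib


class ChecksummedStore:
    """Toy block store (hypothetical): checksums are kept apart from the data."""

    def __init__(self):
        self._blocks = {}     # block id -> bytes; stands in for the disk and I/O path
        self._checksums = {}  # block id -> SHA-256 hex digest, stored separately

    def write(self, block_id, data):
        # Record the checksum before the data enters the (simulated) I/O path.
        self._checksums[block_id] = hashlib.sha256(data).hexdigest()
        self._blocks[block_id] = data

    def corrupt(self, block_id):
        # Simulate silent corruption somewhere below us: flip one bit of the
        # first byte. Assumes the block is non-empty.
        data = bytearray(self._blocks[block_id])
        data[0] ^= 0x01
        self._blocks[block_id] = bytes(data)

    def read(self, block_id):
        # Verify on the way back up, outside the fault domains the data crossed.
        data = self._blocks[block_id]
        if hashlib.sha256(data).hexdigest() != self._checksums[block_id]:
            raise IOError(f"checksum mismatch on block {block_id}")
        return data
```

Writing a block and reading it back succeeds; calling `corrupt` on it and reading again raises `IOError`, no matter which layer the bit flip is meant to represent.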
DTrace is similar in that it lets you look at what's going on in different parts of the fault domain involved in running an application -- the application itself, the libraries it uses, the system calls it makes, and the function flow inside the kernel that implements those system calls. Traditional instrumentation tools generally let you look at only one part of that domain -- truss lets you watch the system call boundary, instrumented libraries or applications let you watch just their own part of the fault domain, and so on.
Or maybe I'm just stretching things a bit in making this comparison.