06.02.2010 13:36:55

Reading through strace files - finding file accesses

This is more or less a follow-up to my last blog entry. I am still trying to find out which application is using my hard drive. I experimented some more with strace and learned something about vim search patterns. :)

Suppose you trace a process with strace (e.g. vim --help, which prints the standard vim command-line help) using the following command:

strace -f -s 4095 vim --help 2>$HOME/tracefile.txt

This creates a very large file containing all system calls the process made during its execution. To find the file accesses in this trace file, you might open it in vim and naively search for, say, the string ] open( to see which files were opened. Vim's search highlighting would show you all open calls, but you would have to read through the file yourself to find the corresponding close call. That works, but it gets very strenuous if there are a lot of open calls.

Now, with the following search pattern in vim (using search highlighting), the whole block in the trace file is found, from the opening open call to the matching close call, nicely highlighted for a quick overview. (Enter this after pressing / in normal mode.)

] open(.* = \(\d*\)\_.\{-}] close(\1)

This pattern uses several features I had never really used before (which is funny, as I tend to use regular expressions a lot). An example of a block this pattern finds is:

] open("/usr/share/tcltk/tcl8.4/encoding/iso8859-1.enc", O_RDONLY|O_LARGEFILE) = 5
[pid 14780] fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
[pid 14780] ioctl(5, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfb1bfa8) = -1 ENOTTY (Inappropriate ioctl for device)
[pid 14780] read(5, "# Encoding file: iso8859-1, single-byte...", 4096) = 1094
[pid 14780] read(5, "", 4096)           = 0
[pid 14780] close(5)

EDIT: As a reader remarked, my first example in this blog entry, which used ls -l, was not very good. strace -f -s 4095 ls -l 2>$HOME/tracefile.txt does not work with the given pattern, as no pid information is output (it seems strace only outputs pid information if the process is multithreaded). Without the pid information, the pattern should look like this: ^open(.* = \(\d*\)\_.\{-}] close(\1). In this case only open and close calls on the main thread are found. If you omit the leading ^ in the pattern, the search should still work, although it might get confused by the word open appearing inside the string arguments that strace prints. In summary, this blog entry was written way too fast and without proper testing. Sorry for that. Hopefully the information in this entry is still of some use.


As you can see, the open call returns a file descriptor, which is also used to close the file again. Therefore we use \(\d*\) to capture the descriptor at the end of the open line and backreference it at the end of the pattern with \1. (Using \2, \3 etc. you could also backreference more than one \(\) group.)

Normally you only search for patterns that can be found on a single line. Here we have to match across line endings. This is done by using \_. which is the same as . but also matches line endings.

If you use the multiplier * to match more than one character, the longest string matching the atom will be found. For example, with the string cabcabcabcab, searching for c.*b will match the full string, as it starts with c and ends with b. If you only want to get cab, you have to do a non-greedy search, which is done by using the multiplier \{-}. So c.\{-}b will find only cab.
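
The same difference can be tried outside vim. Here is a small sketch in Python (my choice for illustration only, nothing in the vim workflow depends on it), where the multiplier *? plays the role of vim's \{-}:

```python
import re

text = "cabcabcabcab"

# Greedy: ".*" grabs as much as possible, so the match runs to the last "b".
greedy = re.search(r"c.*b", text).group(0)

# Non-greedy: ".*?" (vim: ".\{-}") stops at the first "b" that completes a match.
lazy = re.search(r"c.*?b", text).group(0)

print(greedy)  # cabcabcabcab
print(lazy)    # cab
```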


Be aware that this will not work well if the open and close calls are interleaved, e.g. when a second file is opened before the first one is closed. But it seems to work most of the time.
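
If interleaved calls turn out to be a problem, one possible way around it is to leave the single search pattern behind and walk through the trace line by line, remembering which descriptors are currently open. The following Python sketch shows the idea. The function name pair_open_close, the two regular expressions and the /tmp paths are made up for this illustration, and it assumes the line format shown above with an optional [pid N] prefix. It also keys on the descriptor number alone, which is a simplification if the trace contains several processes that do not share descriptors.

```python
import re

# Optional "[pid N] " prefix, then an open() call that returned a descriptor.
# Failed opens ("= -1 ENOENT ...") do not match because of the trailing (\d+)$.
OPEN_RE = re.compile(r'^(?:\[pid\s+\d+\] )?open\("([^"]*)".*\) = (\d+)$')
CLOSE_RE = re.compile(r'^(?:\[pid\s+\d+\] )?close\((\d+)\)')


def pair_open_close(lines):
    """Return (path, fd, open_line_no, close_line_no) for each completed pair."""
    open_fds = {}  # fd (as string) -> (path, line number of the open call)
    pairs = []
    for number, line in enumerate(lines, start=1):
        m = OPEN_RE.match(line)
        if m:
            open_fds[m.group(2)] = (m.group(1), number)
            continue
        m = CLOSE_RE.match(line)
        if m and m.group(1) in open_fds:
            path, opened_at = open_fds.pop(m.group(1))
            pairs.append((path, int(m.group(1)), opened_at, number))
    return pairs


# Hypothetical trace excerpt: the second open happens before the first close.
trace = [
    '[pid 14780] open("/tmp/a", O_RDONLY) = 5',
    '[pid 14780] open("/tmp/b", O_RDONLY) = 6',
    '[pid 14780] read(5, "", 4096) = 0',
    '[pid 14780] close(5) = 0',
    '[pid 14780] close(6) = 0',
]
for pair in pair_open_close(trace):
    print(pair)
```

For this excerpt both files are paired with their own close call, even though the second open sits inside the block of the first one.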

If you want to learn more about regular expressions in vim, just enter :help regular-expression or :help pattern inside a vim session.

Regular expressions can and should also be used in perl, javascript, sed, php, etc. They are very powerful constructs. Unfortunately, every system seems to have its own dialect of regular expressions. But if you know the basic structure of regular expressions, you learn to cope with the differences really fast.
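
To give an impression of how small such dialect differences can be, here is my attempt at carrying the vim pattern from this entry over to Python. Treat it as a sketch: the name BLOCK_RE is made up, the trace text is a shortened copy of the example block above, and other tools will need slightly different escaping again.

```python
import re

# Vim:    ] open(.* = \(\d*\)\_.\{-}] close(\1)
# Python: literal parentheses must be escaped (] is escaped too, for clarity),
#         groups are written with plain ( ), and [\s\S]*? stands in for
#         \_.\{-} (any character including line endings, non-greedy).
BLOCK_RE = re.compile(r"\] open\(.* = (\d*)[\s\S]*?\] close\(\1\)")

trace = (
    '[pid 14780] open("/usr/share/tcltk/tcl8.4/encoding/iso8859-1.enc", '
    'O_RDONLY|O_LARGEFILE) = 5\n'
    '[pid 14780] fcntl64(5, F_SETFD, FD_CLOEXEC) = 0\n'
    '[pid 14780] read(5, "", 4096)           = 0\n'
    '[pid 14780] close(5)\n'
)

match = BLOCK_RE.search(trace)
print(match.group(1))  # the descriptor captured by the group
print(match.group(0))  # the whole block from "] open(" to "] close(5)"
```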

Posted by Jerri | Category: linux, console