Thursday, April 16, 2009

Using sed

A crash course in sed!

Oftentimes, you'll end up with a file that isn't quite in the format you need. sed is a powerful tool that lets you do a global find and replace from the comfort of your command line.

Take this example:

I've done a grep and redirected the output to a file. I'd like to strip every line in that file down from this:

/var/lsurf/log/gx-mmsc/mmsc-server.log.0:[GXMMS:1-12-2006] [14:46:19:351] [GXMMS:FINEST] LDAPSPIdResolver.resolveSPId() LDAP call done time:77
/var/lsurf/log/gx-mmsc/mmsc-server.log.0:[GXMMS:1-12-2006] [14:46:19:515] [GXMMS:FINEST] LDAPSPIdResolver.resolveSPId() LDAP call done time:80
/var/lsurf/log/gx-mmsc/mmsc-server.log.0:[GXMMS:1-12-2006] [14:46:19:868] [GXMMS:FINEST] LDAPSPIdResolver.resolveSPId() LDAP call done time:85
/var/lsurf/log/gx-mmsc/mmsc-server.log.0:[GXMMS:1-12-2006] [14:46:20:46] [GXMMS:FINEST] LDAPSPIdResolver.resolveSPId() LDAP call done time:88
/var/lsurf/log/gx-mmsc/mmsc-server.log.0:[GXMMS:1-12-2006] [14:46:20:748] [GXMMS:FINEST] LDAPSPIdResolver.resolveSPId() LDAP call done time:78
...

To this:

77
80
85
88
78
...

Bear in mind that the file I'd created was well over 1000 lines, so hand-editing was not an option. A quick call to sed, and I had the file I was hoping for:

sed "s#.*call done time:\(.*\)#\1#g" calltimes.txt > justnumbers.txt

The same command works for whatever text you capture between the \( \) markers.
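To try the command without the original log file, here is a minimal sketch that feeds it two sample lines copied from the grep output above. The variable names sample and numbers are only for illustration.

```shell
# Two sample lines in the same shape as the grep output above.
sample='/var/lsurf/log/gx-mmsc/mmsc-server.log.0:[GXMMS:1-12-2006] [14:46:19:351] [GXMMS:FINEST] LDAPSPIdResolver.resolveSPId() LDAP call done time:77
/var/lsurf/log/gx-mmsc/mmsc-server.log.0:[GXMMS:1-12-2006] [14:46:19:515] [GXMMS:FINEST] LDAPSPIdResolver.resolveSPId() LDAP call done time:80'

# Keep only what follows "call done time:" on each line.
numbers=$(printf '%s\n' "$sample" | sed 's#.*call done time:\(.*\)#\1#')

printf '%s\n' "$numbers"
```

Single quotes around the sed expression keep the shell from touching the backslashes, which is a little safer than double quotes when the pattern gets complicated.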

Search and Replace: sed "s#...#...#g"

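The three # characters separate the parts of the command. The s stands for substitute, that is, search and replace. The first "..." is the regex that matches the text you want to replace, and the second "..." is the replacement string. The g means "replace globally": every match on a line, not just the first one. The separator doesn't have to be #; / is the most common choice, but # saves you from escaping slashes when the text contains paths.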
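A small sketch of how the separator and the g flag behave, using a made-up input string (a/b/c) and illustrative variable names:

```shell
line='a/b/c'

# Same substitution with two separators: with "/" the slash in the
# pattern has to be escaped, with "#" it does not.
with_slash=$(printf '%s\n' "$line" | sed 's/\//-/g')
with_hash=$(printf '%s\n' "$line" | sed 's#/#-#g')

# Without g, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's#/#-#')

printf '%s\n' "$with_slash" "$with_hash" "$first_only"
```

The first two commands give the same result; the third leaves the second slash in place.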

In the replacement string, & means the entire matched text, and \1, \2 ... \9 refer to the submatches surrounded by \( \) in the match regex, numbered from left to right.
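Here is a quick sketch of both kinds of reference. The input strings and variable names are invented for the example.

```shell
# \1 and \2 refer to the first and second \( \) groups; here they are swapped.
swapped=$(printf '%s\n' 'hello world' | sed 's#\(.*\) \(.*\)#\2 \1#')

# & stands for the whole match; here every run of digits is wrapped in brackets.
wrapped=$(printf '%s\n' 'time:77 and time:80' | sed 's#[0-9][0-9]*#[&]#g')

printf '%s\n' "$swapped" "$wrapped"
```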

For example,

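sed "s#\(.*\) blah\(.*\)#\1\2#g"

run on "glug blah bug" would result in "glug bug". Also notice that it is whitespace sensitive: the second group captures the space in front of "bug", so putting a space between \1 and \2 in the replacement would leave two spaces in the output.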

If you want to append something to the end of a matched string, simply use the & in your replacement string. For example:

sed "s#.*#& man#"

run on "hey" would result in "hey man".
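The same trick works at either end of the match. A minimal sketch, with illustrative variable names:

```shell
# Append: the whole line (&) followed by extra text.
appended=$(printf '%s\n' 'hey' | sed 's#.*#& man#')

# Prepend: extra text followed by the whole line (&).
prepended=$(printf '%s\n' 'hey' | sed 's#.*#well, &#')

printf '%s\n' "$appended" "$prepended"
```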

If you have a file with bad line breaks that show up as text (such as ^M), you can get rid of them by searching for \r. GNU sed understands this escape; other versions of sed may not:

sed "s#\(.*\)\r#\1#g" fileWithBadBreaks.in > file.out
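If your sed doesn't recognize \r, one possible workaround is to let printf produce the carriage return and splice it into the expression. This is only a sketch: the variable name cr and the two-line sample input are assumptions made for the example.

```shell
# Hold a literal carriage return in a variable.
cr=$(printf '\r')

# Two lines with DOS-style CRLF endings; strip the CR at the end of each.
cleaned=$(printf 'one\r\ntwo\r\n' | sed "s#${cr}\$##")

printf '%s\n' "$cleaned"
```

Anchoring with $ means only a carriage return at the very end of a line is removed.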

Alternatively, you can just chop off the last character of each line by matching any number of characters, then one final character. Keep in mind that this removes the last character whatever it happens to be:

sed "s#\(.*\).#\1#g"
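An equivalent way to write this, sketched below, is to anchor a single "any character" at the end of the line with $ and replace it with nothing. The sample input and the variable name trimmed are made up for illustration.

```shell
# Drop the last character of every line.
trimmed=$(printf '%s\n' 'abc;' 'def;' | sed 's#.$##')

printf '%s\n' "$trimmed"
```

Empty lines are left alone, since there is no character for the pattern to match.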

Here's a new one! This takes any message tracing line and removes the subject:

sed "s#\(.*subject=\"\)[^\"]*\(.*\)#\1\2#g" lessthan.sart > what.txt
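To see what that does, here is a sketch run against a single invented line. The real message tracing format isn't shown in this post, so the sample below is purely hypothetical; the only thing it shares with the real data is a subject="..." field.

```shell
# Hypothetical trace line; only the subject="..." part matters here.
trace='msg id=42 subject="lunch plans" size=10'

# Keep everything up to and including subject=", skip the text up to
# the closing quote, then keep the rest of the line.
no_subject=$(printf '%s\n' "$trace" | sed 's#\(.*subject="\)[^"]*\(.*\)#\1\2#')

printf '%s\n' "$no_subject"
```

The subject field stays in place with an empty value, so anything that parses the line afterwards still finds the same fields.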

That's it for the crash course. This should handle most of your cases without a problem. If not, I'll defer to the man pages.
