Get sed savvy - part 2
Now that you know a bit about the Stream EDitor from the last sed tutorial, we are going to expand our knowledge of substitution and line printing with an interesting scenario.
Suppose we want to let someone else know what kinds of functions are in a given JavaScript file. Think of it as a simple sort of Javadoc for CSS or JavaScript. We will look at all of the files modified in the last day, extract the comments from them, and put them somewhere (on a wiki, perhaps?). Done well, this kind of automation can improve team communication and productivity.
Tutorial
Download and install Cygwin if you’re on Windows to follow along.
# Single-line comments - grep is better but we can use sed
sed -n '/\/\//p' blah.js > /tmp/comments.out
# Multi-line comments
sed -n '/\/\*/,/\*\//p' blah.js >> /tmp/comments.out
Now, the sed commands above are tricky, so here is how you can understand them: The -n option tells sed not to print anything unless you explicitly tell it what to print. The comma [,] between the two patterns turns them into a range address: sed selects every line from the one that matches the first pattern through the next one that matches the second pattern, in this case everything from /* through */. The p command then prints each selected line in full.
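To see the range address at work, here is a small sketch against a throwaway file. The file name sample.js and its contents are made up for this illustration and are not part of the original tutorial.

```shell
# Hypothetical sample file, created only for this demonstration
workdir=$(mktemp -d)
printf '%s\n' \
  '/* Adds two numbers' \
  '   and returns the sum */' \
  'function add(a, b) {' \
  '  return a + b;' \
  '}' > "$workdir/sample.js"

# -n suppresses automatic printing; the range address selects every line
# from the one matching /* through the one matching */, and p prints them
range_out=$(sed -n '/\/\*/,/\*\//p' "$workdir/sample.js")
printf '%s\n' "$range_out"
```

Only the two comment lines should come out; the function body is skipped. One thing to watch: sed starts looking for the closing pattern on the line after the opening match, so a comment that opens and closes on the same line keeps the range open until a later */ appears.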
We can combine these two commands into a single sed script file so that one pass over the file collects both kinds of comments.
# sed script file (saved as sedscr)
/\/\//p
/\/\*/,/\*\//p

# Use the sed script to print all comments
sed -n -f sedscr blah.js > /tmp/comments.out
Now we have a nice little summary of our JavaScript files that we can post to a wiki or diff against another version to see what was added. Note that the sed print command prints the whole line, so if a comment sits at the end of a line of code you will get the code at the beginning of that line too. Not a perfect solution, but something quick and easy!
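If the leading code on those lines gets in the way, one possible workaround is to swap the plain print for a substitution that keeps only the text from // onward. This is a rough sketch, not part of the original tutorial: the file trailing.js is invented for the example, and the pattern will also catch a // that appears inside a string or URL.

```shell
# Hypothetical sample file, created only for this demonstration
workdir=$(mktemp -d)
printf '%s\n' \
  'var total = 0; // running total' \
  'total += 1;' \
  '// reset at the end' > "$workdir/trailing.js"

# The greedy .* swallows everything before the last // on the line,
# \( ... \) captures the comment, and \1 puts back only that capture.
# The p flag prints a line only when a substitution was made.
trailing_out=$(sed -n 's/.*\(\/\/.*\)/\1/p' "$workdir/trailing.js")
printf '%s\n' "$trailing_out"
```

The line without a comment is dropped, and the line with trailing code is reduced to its comment.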
Other Examples
# Print lines longer than 80 characters
sed -n '/^.\{81\}/p' myfile
# Delete blank lines
sed '/^$/d' myfile
# Substitution optimized for speed
sed '/Yahoo/ s//Not Microhoo/g' myfile
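The last example relies on a sed detail worth spelling out: an empty pattern in s//.../ reuses the most recently used regular expression, here the /Yahoo/ address, so the substitution is only attempted on lines the address already matched. The sketch below uses a made-up file, names.txt, to show that the short form gives the same result as writing the pattern out. Whether it is measurably faster depends on your sed and your input; no timing is attempted here.

```shell
# Hypothetical sample file, created only for this demonstration
workdir=$(mktemp -d)
printf '%s\n' 'Yahoo news' 'Other news' 'Yahoo and Yahoo' > "$workdir/names.txt"

# The address /Yahoo/ picks the lines; the empty pattern in s//.../
# reuses that same regular expression
short_out=$(sed '/Yahoo/ s//Not Microhoo/g' "$workdir/names.txt")

# The same substitution with the pattern spelled out and no address
long_out=$(sed 's/Yahoo/Not Microhoo/g' "$workdir/names.txt")

printf '%s\n' "$short_out"
```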
Conclusion
You should now be getting pretty proficient with sed. Use it along with find and grep and you will find yourself feeling much more comfortable on the command-line.
I encourage you to experiment a bit and use this even in circumstances where you know it’s not necessary, just to get the hang of it. In the long run you’ll end up increasing your productivity by using these most powerful tools.
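As one way to start experimenting, the scenario from the top of this article (collecting comments from files modified in the last day) could be sketched by pairing find with the two sed expressions from the tutorial. The directory layout and file names below are invented for the illustration, and this is only one possible arrangement.

```shell
# Hypothetical source tree, created only for this demonstration
workdir=$(mktemp -d)
mkdir "$workdir/src"
printf '%s\n' '// fresh helper' 'var a = 1;' > "$workdir/src/new.js"
printf '%s\n' '// stale helper' 'var b = 2;' > "$workdir/src/old.js"

# Backdate old.js so it falls outside the one-day window
touch -t 200001010000 "$workdir/src/old.js"

# -mtime -1 selects files modified less than 24 hours ago;
# -exec ... \; runs sed once per file, so a range address cannot
# spill over from one file into the next
find "$workdir/src" -name '*.js' -mtime -1 \
  -exec sed -n -e '/\/\//p' -e '/\/\*/,/\*\//p' {} \; > "$workdir/comments.out"

cat "$workdir/comments.out"
```

Only the comment from the recently modified file should end up in comments.out.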

A 24 year-old programmer for
I would recommend keeping the Sed1line file handy. It has got a lot of very useful tips.
@Binny:
Yeah it’s linked to in the last article because it definitely is a solid reference.
Awesome again. I’m continuing to print these out!
Finally, the mysterious comma is explained. Sed takes the two matches like addresses, but wow, that’s not documented well in the man pages.
Gracias.
Details:
I was looking at someone else’s sed code that had a comma between two patterns. When I played with it, I was thinking the comma was a command, and couldn’t figure it out until I read this page.
When I would take the comma out, or replace it with another character (e.g. a period), sed would complain of an unknown command.
$./script
sed: -e expression #1, char 132: unknown command: `.'
$
Now I understand. Taking away the comma changes the context of the pattern (no longer an address).
@David Regal:
Glad I could clear it up. The comma thing is probably the best feature of sed besides substitution.
Can we have more examples?
Good tutorial :-)
I am a Windows user, but I like sed very much; it is one of the most useful tools that I know (I am using the Gnuwin32 sed).
I have used it to recover “deleted” messages in Thunderbird (they are deleted only when folders are compacted; before compacting they are just hidden):
http://www.gialloporpora.netsons.org/usare-sed-per-recuperare-le-email-invisibili-in-thunderbird/268/
or to create a bookmarklet. I prefer to work with JavaScript code on localhost, with a standard bookmarklet that injects it into my pages, but when I have finished I transform the file.js into a bookmarklet with this sed line (I found the code somewhere on the web, but I don’t remember the URL):
sed -e :a -e "/$/N; s/\n//; ta" > file.js
This is Windows syntax.
Ciao
Sandro
@Reginald:
Fair request :) Here is one of my references linked in the comments and 1st article: http://www.student.northpark.edu/pemente/sed/sed1line.txt
I know there are a million things worth mentioning about sed, but an often overlooked one is that the regular expression separators can be any character as long as they are consistent. The following examples all perform the same inline global search for “/tmp” and replace it with “/home” in the test.txt file:
sed -i 's/\/tmp/\/home/g' test.txt # ugly and requires escaping
sed -i 's|/tmp|/home|g' test.txt # better but still cluttered
sed -i 's_/tmp_/home_g' test.txt # open and easily readable
Changing the regular expression separator can be helpful for maintaining readability, especially when you are matching something with a “/”.
-Chris
@Chris:
You beat me to mentioning it :). A very cool tidbit about sed, there.