grep is a beautiful tool

Global Regular Expression Print is a staple of every command-line user’s toolbox. As with find, it derives a lot of power from being combined with other tools and can increase your productivity significantly.

Following is a simple tutorial that will help you realize the power of this simple and most useful command. If you are on Windows and haven’t already, download and install Cygwin. If you are also new to regular expressions (regex), here is a great regular expressions reference to get you started.

Tutorial

Suppose we want to search for duplicate functions in all of our JavaScript files. Let’s start basic and work up to it. This technique can be used to search for a TON of duplicate items like:

  • Duplicate HTML IDs
  • Counts of how often a CSS class is used
  • Duplicate Java classes
  • many, many more…

# Search JS files in this directory for "function"
grep function *.js

The above command prints every line containing "function" in the JavaScript files in the current directory (NOT subdirectories). The output is much more helpful when every line is labeled with its file name and line number, and when the pattern is narrowed to actual function definitions:

# Print filenames, line #s, and lines that define a named function
grep -EHn "^\s*(function \w+|\w+ = function)" *.js

Depending on how you format your JavaScript files, something like this will omit comments, anonymous functions, and words like "functionality", giving you better results.
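
The commands so far only look at files in the current directory. As a rough sketch, assuming a grep that supports -r and --include (GNU and BSD grep both do), the search can be extended to subdirectories. The sample files below are made up purely for illustration:

```shell
# Build a throwaway directory tree with hypothetical sample files
demo=$(mktemp -d)
mkdir -p "$demo/lib"
printf 'function top() {}\n' > "$demo/app.js"
printf 'function nested() {}\n' > "$demo/lib/util.js"
printf 'function ignored\n' > "$demo/lib/notes.txt"

# -r descends into subdirectories; --include limits the search to *.js files
matches=$(cd "$demo" && grep -rn --include="*.js" "function" . | sort)
echo "$matches"

rm -rf "$demo"
```

The notes.txt file contains the word too, but it is skipped because it does not match the --include pattern.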

# Print a list of: function <function-name> and sort it
grep -Eho "^\s*function \w+" *.js | sort

-o prints only the part of each line that matches the regular expression, -E enables extended regex, and -h suppresses printing of the file name. I am then piping to sort, which orders the output so that it is a sorted list of function <function-name>. If you don’t have a lot of files/functions to go through, you can just scan the list and note the duplicate function names you see. Let’s go a step further for those that DO have a big list:

# Print only duplicate function names
grep -hEo "^\s*function \w+" *.js | sort | uniq -d

There we go! That will list only the duplicated functions. I know that we can expand this with awk or other stuff and get the file names and line numbers of the duplicates, but I don’t want to explain the details of awk ;). I actually had it in this article and then removed it, so leave a comment or contact me if you want the code for that.
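
For the curious, here is one possible way to get file names and line numbers for the duplicates. This is only a sketch and not the code that was removed from the article. It assumes GNU grep (for \s and \w), file names without colons, and a couple of made-up sample files:

```shell
# Hypothetical sample files; init is defined in both
demo=$(mktemp -d)
printf 'function init() {}\nfunction render() {}\n' > "$demo/a.js"
printf 'function init() {}\nfunction save() {}\n' > "$demo/b.js"

# grep -Hno prints "file:line:match"; awk groups the locations by function name
dupes=$(cd "$demo" && grep -EHno "^\s*function \w+" *.js |
  awk -F: '{
      name = $3
      sub(/^[ \t]*function /, "", name)      # keep only the function name
      count[name]++
      where[name] = where[name] " " $1 ":" $2
  }
  END {
      for (n in count)
          if (count[n] > 1)                  # report names seen more than once
              print n ":" where[n]
  }' | sort)

echo "$dupes"   # prints: init: a.js:1 b.js:1

rm -rf "$demo"
```

awk does the job that uniq -d did in the shorter pipeline, but it keeps the location of every match so the duplicates can be tracked down.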

Other Examples

# Count the lines containing "function" in each JS file
grep -c function *.js

# Print lines that DO NOT have "function"
grep -v function *.js

# List processes that match "pidgin" (non-Windows)
ps -ef | grep pidgin
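
Note that grep -c reports one count per file, and it counts matching lines, not individual matches. If a single grand total is wanted, one simple approach (sketched here with made-up sample files) is to concatenate the files first so grep sees one stream:

```shell
# Hypothetical sample files with three matching lines in total
demo=$(mktemp -d)
printf 'function a() {}\nfunction b() {}\n' > "$demo/one.js"
printf 'function c() {}\nvar x = 1;\n' > "$demo/two.js"

# Per-file counts of matching lines, as printed by grep -c
per_file=$(cd "$demo" && grep -c "function" *.js)

# One grand total: cat merges the files, so grep counts across all of them
total=$(cd "$demo" && cat *.js | grep -c "function")

echo "$per_file"
echo "$total"

rm -rf "$demo"
```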

Conclusion

grep is one of the most used command-line tools, often piped to for filtering output. Understanding it is essential to increasing productivity on the command-line. There is so much more to grep than what I’ve shown here, and it would be cool to see your best uses in the comments!

Find is a beautiful tool

I have blogged before that knowledge of command-line tools is essential to take the next step in programming productivity. I think it would be useful to provide simple tutorials for these powerful tools, starting with find. I hope you agree, and would appreciate your feedback via the contact page or in the comments.

Tutorial

If you’re on Windows, I would recommend installing Cygwin to bring the power of a real shell to your OS. Let us start with a simple example and build upon it:

find . -name "*.css"

This will list all CSS files (and directories ending with ".css") under the current directory (represented by "."). We only want to match files so we’ll go ahead and change it to this:

find . -type f -name "*.css"

Now we will only match CSS files. Nothing special? Fine, I see how it is. Let’s find all CSS files that do something with your HTML ID #content next:

find . -name "*.css" -exec grep -l "#content" {} \;

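Here we combine find with grep (covered in detail later) using the -exec option, which runs the given command once for each file found. The {} is replaced with the path of the file, and \; marks the end of the command.
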
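
A variation worth knowing: ending -exec with + instead of \; hands many file names to a single run of the command, which is usually faster on large trees. A small sketch, using made-up sample files in a temporary directory:

```shell
# Hypothetical stylesheets; only layout.css mentions #content
demo=$(mktemp -d)
mkdir -p "$demo/css"
printf '#content { margin: 0; }\n' > "$demo/css/layout.css"
printf '#footer { margin: 0; }\n' > "$demo/css/footer.css"

# "{} +" batches the file names into as few grep invocations as possible
hits=$(cd "$demo" && find . -type f -name "*.css" -exec grep -l "#content" {} +)
echo "$hits"   # prints: ./css/layout.css

rm -rf "$demo"
```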
We’re starting to get productive now, so let’s keep going. Suppose now we want to change every reference to the color #FF0000 (red) to #00FF00 (green). Normally you would have to have your editor search and replace them, if it even has that capability. Even then it’s slow. This statement is fast:

find . -type f -name "*.css" -exec sed -i -r 's/#(FF0000|F00)\b/#0F0/g' {} \;
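
Since sed -i rewrites files in place, it can be worth previewing the affected lines first. The following is a cautious sketch that assumes GNU grep and GNU sed (for \b, -r, and -i without a suffix) and uses a made-up site.css in a temporary directory:

```shell
# Hypothetical stylesheet with both spellings of red, two on one line
demo=$(mktemp -d)
printf 'a { color: #FF0000; }\nb { color: #F00; border-color: #FF0000; }\n' > "$demo/site.css"

# Dry run: list the lines that would change, without touching any file
preview=$(cd "$demo" && find . -type f -name "*.css" -exec grep -EHn '#(FF0000|F00)\b' {} +)
echo "$preview"

# Real run: the g flag replaces every match on a line, not only the first
(cd "$demo" && find . -type f -name "*.css" -exec sed -i -r 's/#(FF0000|F00)\b/#0F0/g' {} +)
result=$(cat "$demo/site.css")
echo "$result"

rm -rf "$demo"
```

Without the g flag, the second #FF0000 on the b line would be left untouched. The pattern is also case-sensitive, so lowercase values such as #ff0000 are not matched.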

Gasp! Wait a minute, I just searched for both ways to specify red and replaced them with green in my CSS!! How long would that have taken otherwise? Do you see now how you can code faster by automating it and combining powerful tools? Let’s look at some other cool search options find has to offer:

Other Examples

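# find files whose status changed in the last 24 hours (use -mtime for content changes)
find . -ctime -1 -type f

# find files larger than 1 MiB in /tmp
find /tmp -size +1M -type f

# find files under ~/src newer than main.css (main.css is resolved relative to the current directory)
find ~/src -newer main.css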
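
One detail of -size is easy to trip over. With GNU find (assumed here; the M suffix is not part of POSIX), sizes are rounded up to whole units, so a bare 1M matches files that occupy exactly one 1 MiB unit, tiny files included, while +1M means more than one unit. A small sketch with made-up files:

```shell
# Hypothetical files: one of 5 bytes, one of exactly 2 MiB
demo=$(mktemp -d)
printf 'tiny\n' > "$demo/small.txt"
dd if=/dev/zero of="$demo/big.bin" bs=1024 count=2048 2>/dev/null

# Without a sign, 1M matches sizes that round up to exactly one MiB unit
exact=$(find "$demo" -type f -size 1M -exec basename {} \;)

# With +, only files that need more than one MiB unit match
larger=$(find "$demo" -type f -size +1M -exec basename {} \;)

echo "exact: $exact"
echo "larger: $larger"

rm -rf "$demo"
```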

Conclusion

By itself, find is only as good as, say, Google Desktop. The real power, as with other shell tools, is the ability to combine it with other tools seamlessly. Effective use of tools like find very often makes the difference between an average programmer and one that is 10x more effective (actual multiples up for debate).

These are just some of the basic features of find. Take advice from Chris Coyier and use your new power responsibly. Find is a beautiful tool.

This month in bookmarks: June 2008

June was a fairly lively month. Firefox 3 is now out! I see most of my viewers are now using FF3! Well done :). In all seriousness, I found some pretty cool stuff in the June bookmarks.

Firefox

Web Design

Productivity

Javascript

Tools

You can always keep up to date on the new technologies I’m following by adding me to your del.icio.us network or simply checking my del.icio.us bookmarks once in a while. If I like what you bookmark, I’ll add you back :)