[Novalug] bash & grep question - best for optimizing?
Nick Danger
nick at hackermonkey.com
Mon Nov 13 09:08:40 EST 2006
I have a large volume of files.(*) I would like to run a grep through
them and then act on the files that match. Easy enough to do. The
question I have is, which way is best?
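For concreteness, the "grep, then act on the matches" step can be sketched roughly as below. The directory, the pattern, and the use of "ls -d" as a stand-in for the real action are all placeholders, not the actual spool setup.

```shell
#!/bin/bash
# Hypothetical directory and pattern, for illustration only.
DIR=$(mktemp -d)
printf 'match me\n' > "$DIR/file with spaces"
printf 'nothing here\n' > "$DIR/other"

# -r recurses, -l lists only the names of matching files, and --null
# ends each name with a NUL byte; xargs -0 splits on NUL, so file names
# with spaces or newlines survive the pipe intact.
# "ls -d" stands in for the real action (for example rm or mv).
acted_on=$(grep -rl --null -e 'match me' "$DIR" | xargs -0 ls -d)
echo "$acted_on"
```

Batching names through xargs means the action command is started once per large batch of files rather than once per file, which tends to matter with a few hundred thousand files.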
1. This is a two-layer-deep hashed directory structure, and I have 4
patterns I want to match. I can either do a "grep -rl" at the top level,
or cd into each hash directory (down 2 layers) and do a plain "grep" in
that directory.
2. Should I do one grep for each pattern, or a single grep with multiple
patterns?
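To make the options in 1 and 2 concrete, here is a small sketch run against a throwaway directory tree. The layout, file names, and the two patterns are made up for illustration; the real spool has 4 patterns and a different hash scheme.

```shell
#!/bin/bash
# Hypothetical two-level hashed layout and patterns, for illustration only.
SPOOL=$(mktemp -d)
mkdir -p "$SPOOL/a/1" "$SPOOL/b/2"
printf 'Subject: cheap pills\n'      > "$SPOOL/a/1/msg1"
printf 'Subject: hello\n'            > "$SPOOL/b/2/msg2"
printf 'From: spammer@example.com\n' > "$SPOOL/b/2/msg3"

# Option A: one recursive grep from the top, all patterns in a single pass.
# Each -e adds a pattern; a file is listed if any pattern matches.
one_pass=$(grep -rl -e 'cheap pills' -e 'spammer@example.com' "$SPOOL" | sort)

# Option B: one recursive grep per pattern. Every file is read once per
# pattern, and sort -u removes files that matched more than one pattern.
per_pattern=$(
  for pat in 'cheap pills' 'spammer@example.com'; do
    grep -rl -e "$pat" "$SPOOL"
  done | sort -u
)

# Option C: visit each second-level directory and grep its files
# non-recursively (the shell glob does the directory walk).
per_dir=$(
  for d in "$SPOOL"/*/*/; do
    grep -l -e 'cheap pills' -e 'spammer@example.com' "$d"* 2>/dev/null
  done | sort
)

printf 'A:\n%s\nB:\n%s\nC:\n%s\n' "$one_pass" "$per_pattern" "$per_dir"
rm -rf "$SPOOL"
```

All three list the same files; they differ in how many grep processes are started and how many times each file is read. If the patterns are literal strings rather than regular expressions (an assumption about the real patterns), adding -F to any of the grep calls is worth timing as well.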
There are anywhere from 200,000 to 250,000 files in there, so it's not
exactly a speedy process, and I'd like to eke out any few minutes I can
from my shell script :-)
Thanks
-Nick
(*) This is a mail spooler, or as we call it, "where mail goes to die."
Generally, if mail doesn't get spooled off in a few hours, it sits there
until it expires. I know we have issues with accepting too much email on
the spooler itself, but I'm not fixing postfix right now (someone else is
working on that); I'm just trying to remove mail that matches specific
patterns.