Newsgroups: comp.unix.questions
Path: utzoo!utdoe!peter
From: peter@doe.utoronto.ca (Peter Mielke)
Subject: Re: Pattern matching with awk
In-Reply-To: tssi!nolan's message of 4 Mar 91 
Message-ID: <1991Mar7.013904.23421@doe.utoronto.ca>
Reply-To: peter@doe.utoronto.ca (Peter Mielke)
Organization: Dictionary of Old English Project, University of Toronto
Date: Thu, 7 Mar 1991 01:39:04 GMT


In <1994@tssi.UUCP>, tssi!nolan writes:
> lin@CS.WMICH.EDU (Lite Lin) writes:
> >  This is a simple question, but I don't see it in "Frequently Asked
> >Questions", so...
> >  I'm trying to identify all the email addresses in email messages, i.e.,
> >patterns with the format user@node.  Now I can use grep/sed/awk to find
> >those lines containing user@node, but I can't figure out from the manual
> >how or whether I can have access to the matching pattern (it can be
> >anywhere in the line, and it doesn't have to be surrounded by spaces,
> >i.e., it's not necessarily a separate "field" in awk).
> 
> [stuff about awk or gawk]
> 
> Then that gives a pattern something like this
> 
> [a-zA-Z0-9.\-_%!]+@[a-zA-Z0-9.\-_]+
> 
> I've escaped the dash, I suppose it might be necessary to escape other
> characters as well.  Have I left anything out that might occur in strange
> but otherwise valid mail addresses?

Or you could use sed to transform the address when it matches, e.g.:

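sed -e 's/\([a-zA-Z0-9._%!-]\{1,\}\)@\([a-zA-Z0-9._-]\{1,\}\)/machine: \2 userid: \1/'

Note that the dash goes last inside the brackets. sed takes a backslash
within a bracket expression literally, so \-_ would be read as a range
rather than as an escaped dash. The \{1,\} requires at least one character
on each side of the @, so a bare @ is left alone.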
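
If you want the matched text itself rather than a rewritten line, awk's
match() function may be enough, assuming your awk is nawk, gawk or another
POSIX awk (the old awk has no match()). Here is a minimal sketch; the
function name extract_addrs is just something I made up for illustration,
and the character classes are the ones quoted above, not a full definition
of a valid mail address.

```shell
# Hypothetical helper: print every user@node match from stdin, one per line.
# Relies only on POSIX awk's match(), which sets RSTART and RLENGTH.
extract_addrs() {
    awk '{
        line = $0
        # The dash sits last in each bracket expression so it is literal.
        while (match(line, /[a-zA-Z0-9._%!-]+@[a-zA-Z0-9._-]+/)) {
            # Print the matched text, then rescan what follows it.
            print substr(line, RSTART, RLENGTH)
            line = substr(line, RSTART + RLENGTH)
        }
    }'
}
```

You would run it as "extract_addrs < message". Since the dot is in the node
class, an address that ends a sentence will keep its trailing period, so
some cleanup afterwards may still be needed.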

-- 
Peter Mielke                                    peter@doe.utoronto.ca
Dictionary of Old English Project               utgpu!utzoo!utdoe!peter
University of Toronto