Help? Trying to prepend search results for a text file. [linux]

This comment was posted to reddit (/r/commandline) on Jun 20, 2015 at 11:45 am and was deleted within 10 hours and 34 minutes.

I chose awk. Alternation via \| in basic regular expressions is an extension that only works in a few variants of grep and sed (GNU, for example).

awk -v "regex=bar|baz" '
    $0 ~ regex {           # if there is a match
        a = $0             # store current line in a
        printf "Match! "   # print the marker
        # as long as something still matches
        while (match(a, regex)) {
            # print that something
            printf "(%s)", substr(a, RSTART, RLENGTH)
            # remove the processed part from a
            a = substr(a, RSTART + RLENGTH)
        }
        printf " "         # separator
    }
    # finally print the line, regardless of whether there
    # was a match or not
    {
        print
    }
' your_file
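
The loop relies on match() setting RSTART (the 1-based start position) and RLENGTH (the length) of the leftmost match. Here is a minimal sketch to look at those values in isolation. The function name show_match is made up for illustration and is not part of the script above:

```shell
# show_match is a hypothetical helper for illustration only.
# For every input line that matches, it prints RSTART, RLENGTH
# and the matched text; lines without a match print nothing.
show_match() {
    awk -v "regex=bar|baz" '
        match($0, regex) { print RSTART, RLENGTH, substr($0, RSTART, RLENGTH) }
    '
}

printf 'foo baz foo bar\n' | show_match
# prints: 5 3 baz
```

Only the leftmost match is reported per call, which is why the script cuts the processed part off and calls match() again.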

An example run, with the script as a one-liner:

$ cat input
foo foo foo foo foo foo bar foo foo foo foo
foo foo foo foo foo foo foo foo foo foo foo
foo foo foo foo foo foo foo bar foo foo foo
foo foo foo baz foo foo bar foo foo foo foo

$ awk -v "regex=bar|baz" '$0 ~ regex { a = $0; printf "Match! "; while (match(a, regex)) { printf "(%s)", substr(a, RSTART, RLENGTH); a = substr(a, RSTART + RLENGTH) } printf " " } { print }' input
Match! (bar) foo foo foo foo foo foo bar foo foo foo foo
foo foo foo foo foo foo foo foo foo foo foo
Match! (bar) foo foo foo foo foo foo foo bar foo foo foo
Match! (baz)(bar) foo foo foo baz foo foo bar foo foo foo foo

The matches in the last line are listed in the order in which they appear in the input. Your example output orders them differently, so let me know how that should really be handled.
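
If the order you want is the order in which the patterns are listed rather than the order of appearance (that is a guess on my part, I don't know what rule your example follows), a sketch could look like this. The function name and the space-separated "patterns" variable are my own inventions, and it assumes no pattern can match the empty string:

```shell
# Hypothetical variant: report all "bar" matches first, then all "baz"
# matches, regardless of where they appear in the line.
prepend_matches_by_pattern() {
    awk -v "patterns=bar baz" '
        BEGIN { n = split(patterns, pat, " ") }
        {
            prefix = ""
            for (i = 1; i <= n; i++) {
                a = $0                      # restart from the full line
                while (match(a, pat[i])) {  # collect every match of pat[i]
                    prefix = prefix "(" substr(a, RSTART, RLENGTH) ")"
                    a = substr(a, RSTART + RLENGTH)
                }
            }
            if (prefix != "") printf "Match! %s ", prefix
            print                           # always print the line itself
        }
    ' "$@"
}

printf 'foo baz foo bar\n' | prepend_matches_by_pattern
# prints: Match! (bar)(baz) foo baz foo bar
```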

And a solution in sed, if only to show that the awk variant is easier:

$ cat input
foo foo foo foo foo foo bar foo foo foo foo
foo foo foo foo foo foo foo foo foo foo foo
foo foo foo foo foo foo foo bar foo foo foo
foo foo foo baz foo foo bar foo foo foo foo

$ sed -e '/bar/ba' -e '/baz/!b' -e :a -e 'h;s/^/|/;:b' -e 's/|\(.*\)\(bar\)/(\2)|\1/;s/|\(.*\)\(baz\)/(\2)|\1/;tb' -e 's/^/Match! /;G;s/|.*\n/ /' input
Match! (bar) foo foo foo foo foo foo bar foo foo foo foo
foo foo foo foo foo foo foo foo foo foo foo
Match! (bar) foo foo foo foo foo foo foo bar foo foo foo
Match! (bar)(baz) foo foo foo baz foo foo bar foo foo foo foo

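It can be shortened a bit for sed variants that support extended regular expressions (-r or -E for GNU sed, -E for OS X and *BSD sed). The marker changes from | to # because | means alternation in extended regular expressions: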

sed -E -e '/bar|baz/!b' -e 'h;s/^/#/;:a' -e 's/#(.*)(bar|baz)/(\2)#\1/;ta' -e 's/^/Match! /;G;s/#.*\n/ /'
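
One thing to be aware of: the two sed versions do not always list the matches in the same order. The first one always moves every "bar" before every "baz". The extended version moves whichever match the greedy .* finds, which is the last one in the line, so the matches come out in reverse order of appearance. A small sketch to compare them; the function names sed_bre and sed_ere are made up here:

```shell
# Hypothetical wrappers around the two sed scripts from this comment.
sed_bre() {
    sed -e '/bar/ba' -e '/baz/!b' -e :a -e 'h;s/^/|/;:b' \
        -e 's/|\(.*\)\(bar\)/(\2)|\1/;s/|\(.*\)\(baz\)/(\2)|\1/;tb' \
        -e 's/^/Match! /;G;s/|.*\n/ /'
}

sed_ere() {
    sed -E -e '/bar|baz/!b' -e 'h;s/^/#/;:a' \
        -e 's/#(.*)(bar|baz)/(\2)#\1/;ta' \
        -e 's/^/Match! /;G;s/#.*\n/ /'
}

printf 'bar foo baz\n' | sed_bre   # prints: Match! (bar)(baz) bar foo baz
printf 'bar foo baz\n' | sed_ere   # prints: Match! (baz)(bar) bar foo baz
```

For the sample input above both give (bar)(baz) on the last line, because there "baz" comes before "bar".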

The first sed script (the one using basic regular expressions), with commentary:

/bar/ba       # "bar" found: jump to label a
/baz/!b       # no "baz" either, end cycle, print line
:a            # label a  (only here if "bar" or "baz" found)
h             # save the line in the hold buffer
# The | character is used as a marker. Occurrences of "|" in
# the input are fine; they don't interfere.
s/^/|/        # prepend a |
:b            # label b for a loop
    # find "bar" and move it before |
    s/|\(.*\)\(bar\)/(\2)|\1/
    # same with baz
    s/|\(.*\)\(baz\)/(\2)|\1/
tb            # jump back to b if there was a match
s/^/Match! /  # prepend "Match! "
G             # append the remembered line
s/|.*\n/ /    # remove the data we worked with between | and
              # the linefeed inserted by G, so there's only
              # the prefix and the real line left, separated
              # by one space character
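
To check the claim that a literal "|" in the input does no harm, here is a quick sketch that feeds such a line through the same script. The wrapper name prepend_bre is made up for this test:

```shell
# Hypothetical wrapper around the basic-regex sed script above.
prepend_bre() {
    sed -e '/bar/ba' -e '/baz/!b' -e :a -e 'h;s/^/|/;:b' \
        -e 's/|\(.*\)\(bar\)/(\2)|\1/;s/|\(.*\)\(baz\)/(\2)|\1/;tb' \
        -e 's/^/Match! /;G;s/|.*\n/ /'
}

# The input line already contains a "|" of its own.
printf 'a|b bar c\n' | prepend_bre
# prints: Match! (bar) a|b bar c
```

It works because the marker is always the leftmost | in the pattern space, and the final substitution throws away everything from that marker up to the linefeed before the untouched copy of the line.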
