Sed

What is it

sed (stream editor) is a Unix utility that parses text files and implements a programming language which can apply textual transformations to such files. It reads input files line by line (sequentially), applying the operation which has been specified via the command line (or a sed script), and then outputs the line. It was developed from 1973 to 1974 as a Unix utility by Lee E. McMahon of Bell Labs,^[1] and is available today for most operating systems.^[2]

Further reading: Sed - An Introduction and Tutorial, by Bruce Barnett.

Usage

The following example shows a typical use of sed, where the -e option indicates that the sed expression follows:

sed -e 's/oldstuff/newstuff/g' inputFileName > outputFileName

In many versions, the -e is not required to precede the expression. The s stands for substitute. The g stands for global, which means that all matching occurrences in the line are replaced. The regular expression (i.e. pattern) to be searched for is placed after the first delimiting symbol (slash here), and the replacement follows the second symbol. Slash is the conventional delimiter, but any other character can be used to make the syntax more readable, as long as it does not occur in the pattern or replacement (see below).
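
As a small illustration of choosing another delimiter, the sketch below rewrites a path using | as the separator, so the slashes in the path need no escaping. The sample path is made up for the example.

```shell
# Hypothetical input path; '|' replaces '/' as the delimiter of the s command.
result=$(printf '%s\n' '/usr/local/bin/tool' | sed 's|/usr/local|/opt|')
printf '%s\n' "$result"
```

With / as the delimiter, the same command would have to be written as s/\/usr\/local/\/opt/.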

Under Unix, sed is often used as a filter in a pipeline:

generate_data | sed -e 's/x/y/g'

That is, generate the data, and then make the small change of replacing x with y.
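
To see what the g flag changes in such a pipeline, the following sketch runs the same substitution with and without it. Here printf stands in for generate_data, and the input text is invented for the example.

```shell
# printf stands in for generate_data; the input text is made up.
first_only=$(printf '%s\n' 'x=1; x=2' | sed -e 's/x/y/')
all_matches=$(printf '%s\n' 'x=1; x=2' | sed -e 's/x/y/g')
printf '%s\n%s\n' "$first_only" "$all_matches"
```

Without g only the first x on the line is replaced; with g both are.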

Several substitutions or other commands can be put together in a file called, for example, subst.sed and then be applied using the -f option to read the commands from the file:

sed -f subst.sed inputFileName > outputFileName
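
A minimal sketch of what such a script file might look like, assuming a hypothetical subst.sed with two substitutions and one delete command. The file contents and the sample input are invented for illustration.

```shell
# Work in a temporary directory so nothing in the current directory is touched.
tmpdir=$(mktemp -d)

# A hypothetical subst.sed: one sed command per line.
cat > "$tmpdir/subst.sed" <<'EOF'
s/colour/color/g
s/centre/center/g
/^#/d
EOF

printf '%s\n' '# draft' 'colour of the centre' > "$tmpdir/input.txt"

# -f reads the commands from the file and applies all of them to every line.
result=$(sed -f "$tmpdir/subst.sed" "$tmpdir/input.txt")
printf '%s\n' "$result"
rm -r "$tmpdir"
```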

Besides substitution, other forms of simple processing are possible. For example, the following uses the d command to delete lines that are either blank or only contain spaces:

sed -e '/^ *$/d' inputFileName

This example used some of the following regular expression metacharacters:

The caret (^) matches the beginning of the line.
The dollar sign ($) matches the end of the line.
The asterisk (*) matches zero or more occurrences of the previous character.
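
The delete command above can be tried on a few made-up lines, as in this sketch:

```shell
# Made-up input: two text lines separated by an empty line and a spaces-only line.
result=$(printf '%s\n' 'first' '' '   ' 'second' | sed -e '/^ *$/d')
printf '%s\n' "$result"
```

Only the lines that consist of zero or more spaces between ^ and $ are removed; the two text lines pass through unchanged.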

Complex sed constructs are possible, allowing it to serve as a simple, but highly specialised, programming language. Flow of control, for example, can be managed by the use of a label (a colon followed by a string) and the branch instruction b. An instruction b followed by a valid label name moves processing to the block following that label. If no label is given, the branch jumps to the end of the script.
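
A common use of a label and a branch is a loop that collects the whole input before editing it. The sketch below, with invented input, joins all lines with commas; it is one possible illustration of b, not the only way to write such a loop.

```shell
# :a      defines the label "a"
# N       appends the next input line to the pattern space
# $!ba    on every line except the last, branches back to label "a"
# s/\n/,/g then replaces the collected newlines with commas
result=$(printf '%s\n' 'one' 'two' 'three' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g')
printf '%s\n' "$result"
```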

Samples

Regex print

sed -rn '\|euro| s|.*>([0-9.]+) &euro.*|\1|p'

sed -rn '\|KLb| s|.*BM25b=([0-9.]+)_QEAdap_RocProx4w=([0-9.]+)F=0.6_([0-9.]+)_([0-9]+)_KLb([0-9.]+)_([0-9.]+).gz.eval:Average Precision: ([0-9.]+)|\4\t\3\t\2|p'


-r  :turn on extended regex.
-n  :don't print every line.

\|euro|  :match only lines containing "euro".  The address
  :pattern traditionally uses /string/, but you can
  :change it to a different character by preceding
  :it with a backslash.

s|x|y|  :the standard sed substitution pattern.  Again, it's
  :traditionally s/x/y/, but any basic ASCII character
  :can be used.

.*>  :a string of any kind of character, ending with ">".

(..)  :designates the part of the match to be captured.

[0-9.]+  :a string of digits and/or periods of any length
  :(but at least one).

 &euro.* :followed by [space]&euro, and the rest of the line.

\1  :insert the captured part into the output string.

p  :print the results.

A running example:

out="QEvenBM25b=0.3_QEAdap_RocProx4w=20F=0.6_20_5_KLb0.3_117.gz.eval:Average Precision: 0.3281"
map=$(echo "$out" | cut -d : -f 3)
echo "$out" | sed -rn '\|KLb| s|.*BM25b=([0-9.]+)_QEAdap_RocProx4w=([0-9]+)F=0.6_20_5_KLb([0-9.]+)_117.gz.eval:Average Precision: ([0-9.]+)|\4\t\3\t\2|p'
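
A similar check for the euro pattern, on made-up lines of HTML. The input is hypothetical, and the sketch uses -E, which GNU sed accepts as a synonym for -r.

```shell
# Hypothetical HTML-like input; only the second line mentions "euro".
result=$(printf '%s\n' '<td>Total</td>' '<td>12.50 &euro;</td>' |
  sed -En '\|euro| s|.*>([0-9.]+) &euro.*|\1|p')
printf '%s\n' "$result"
```

The first line is skipped by the address, and from the second only the captured number is printed.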

Print specified lines

sed -n '1,50p' topics.RF08
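
The same address form on a short, made-up input, as a quick sketch of how the line range behaves:

```shell
# Made-up five-line input; -n suppresses the default output and p prints lines 2 to 4.
result=$(printf '%s\n' 'line1' 'line2' 'line3' 'line4' 'line5' | sed -n '2,4p')
printf '%s\n' "$result"
```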

Replace word / string syntax

1. sed -i 's/old-word/new-word/g' *.txt
2. find . -type f | xargs sed -i 's/zheng/ben/g'

3. find . -type f | while IFS= read -r f; do sed -i 's|home/ben/tr.ben|media/disk/ben|g' "$f"; done

GNU sed can edit files in place with the -i option; if an extension is supplied, it keeps a backup of the original file under that extension.

Append a line after a specified line

sed -i '/TERRIER_HOME/a\whatever' optimiseDFR-cpost-adv-cent.sh
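
The following sketch combines the append command with a backup extension, on a temporary file that stands in for the real script. The file name run.sh and its contents are assumptions made for the example.

```shell
tmpdir=$(mktemp -d)
script="$tmpdir/run.sh"   # hypothetical stand-in for the real script

printf '%s\n' 'TERRIER_HOME=/opt/terrier' 'echo start' > "$script"

# -i.bak edits run.sh in place and keeps the original as run.sh.bak.
# The text to append is placed on the line after the a command.
sed -i.bak '/TERRIER_HOME/a\
whatever' "$script"

edited=$(cat "$script")
backup=$(cat "$script.bak")
printf '%s\n' "$edited"
rm -r "$tmpdir"
```

Writing the appended text on its own line is the portable form of the command; GNU sed also accepts the one-line form shown above.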

To delete lines containing a word

sed '/yourword/d' yourfile

To delete a word

sed 's/yourword//g' yourfile

To delete two words from a file simultaneously

sed -e 's/firstword//g' -e 's/secondword//g' yourfile
or
sed 's/firstword//g;s/secondword//g' yourfile
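
A short sketch of the delete forms above on invented input. It shows that the -e form and the semicolon form give the same result, and that d removes whole lines while s with an empty replacement removes only the matched text.

```shell
# Made-up input text.
input='keep alpha and beta here'

# Two s commands, given either as separate -e options or joined with ';'.
with_e=$(printf '%s\n' "$input" | sed -e 's/alpha //g' -e 's/beta //g')
with_semicolon=$(printf '%s\n' "$input" | sed 's/alpha //g;s/beta //g')

# d drops every line that matches the address.
without_line=$(printf '%s\n' 'good' 'bad word' 'fine' | sed '/bad/d')

printf '%s\n%s\n%s\n' "$with_e" "$with_semicolon" "$without_line"
```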

In the next example, sed, which normally works on one line at a time, removes the newline between two lines when the second line starts with one space. Consider the following text:

This is my cat
 my cat's name is betty
This is my dog
 my dog's name is frank

The sed script below will turn it into:

This is my cat my cat's name is betty
This is my dog my dog's name is frank

Here's the script:

sed 'N;s/\n / /;P;D;'

(N) add the next line to the work buffer
(s) substitute
(/\n /) match: \n (newline character in Unix) and one space
(/ /) replace with: one space
(P) print the top line of the work buffer
(D) delete the top line from the work buffer and run the script again
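
The script can be checked against the sample text like this; printf is used here only to feed the four lines to sed.

```shell
# Feed the four sample lines to the script; each pair of lines is joined into one.
joined=$(printf '%s\n' \
  "This is my cat" \
  " my cat's name is betty" \
  "This is my dog" \
  " my dog's name is frank" |
  sed 'N;s/\n / /;P;D;')
printf '%s\n' "$joined"
```

After the substitution the work buffer holds no newline, so P prints the whole joined line and D starts a fresh cycle with the next input line.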

More useful and complex is transposing an XML table into a CSV:

sed -rn '{s/  *//g;//{s/.*//g;H};///g;H};/<\/row>/{x;s/^\r*\n//;s/\r*\n\r*/","/g;s/^([^\r\n]*)/"\1"/;p};/

First clear out all the extraneous whitespace:

s/  *//g

If the current line is a blank line, add a blank line to the hold space:

//{s/.*//g;H}

If there is an actual value in the field, strip the XML tags and add the value to the hold space:

//{s/<\/*field>//g;H}

If it is the end of a row (</row>), then get the hold space, replace the newlines with '","', add quotes to the beginning and end and then print the line:

/<\/row>/{x;s/^\r*\n//;s/\r*\n\r*/","/g;s/^([^\r\n]*)/"\1"/;p}

If it is the beginning of a row, clear the hold space by adding a blank line:

/
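
The hold-space technique described in these steps can be sketched in a simplified, self-contained form. This is an illustration under stated assumptions, not the script above: it assumes input with one tag per line, rows delimited by <row> and </row>, values written as <field>value</field>, and empty fields written as <field/>. The sample data is invented, and values containing double quotes are not escaped.

```shell
# Hypothetical input: one tag per line, values wrapped in <field>...</field>.
xml='<table>
<row>
  <field>cat</field>
  <field>betty</field>
</row>
<row>
  <field>dog</field>
  <field/>
</row>
</table>'

csv=$(printf '%s\n' "$xml" | sed -n '
# Start of a row: empty the hold space.
/<row>/{
  s/.*//
  h
}
# Empty field: append an empty line to the hold space.
/<field *\/>/{
  s/.*//
  H
}
# Field with a value: strip the tags and append the value.
/<field>/{
  s/^[[:space:]]*<field>//
  s/<\/field>.*//
  H
}
# End of a row: fetch the hold space, join its lines with "," and print.
/<\/row>/{
  x
  s/^\n//
  s/\n/","/g
  s/^/"/
  s/$/"/
  p
}
')
printf '%s\n' "$csv"
```

Each row is accumulated in the hold space, one value per line, and only turned into a CSV record when the closing tag of the row is seen.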
Exotic examples

Despite the inherent limitations, sed scripts exist for games such as sokoban, arkanoid,^[5] and an implementation of tetris.^[6]