| 1 | http://www.student.northpark.edu/pemente/sed/sed1line.txt |
| 2 | ------------------------------------------------------------------------- |
| 3 | HANDY ONE-LINERS FOR SED (Unix stream editor) Apr. 26, 2004 |
| 4 | compiled by Eric Pement - pemente[at]northpark[dot]edu version 5.4 |
| 5 | Latest version of this file is usually at: |
| 6 | http://sed.sourceforge.net/sed1line.txt |
| 7 | http://www.student.northpark.edu/pemente/sed/sed1line.txt |
| 8 | This file is also available in Portuguese at: |
| 9 | http://www.lrv.ufsc.br/wmaker/sed_ptBR.html |
| 10 | |
| 11 | FILE SPACING: |
| 12 | |
| 13 | # double space a file |
| 14 | sed G |
| 15 | |
| 16 | # double space a file which already has blank lines in it. Output file |
| 17 | # should contain no more than one blank line between lines of text. |
| 18 | sed '/^$/d;G' |
| 19 | |
| 20 | # triple space a file |
| 21 | sed 'G;G' |
| 22 | |
| 23 | # undo double-spacing (assumes even-numbered lines are always blank) |
| 24 | sed 'n;d' |
| 25 | |
| 26 | # insert a blank line above every line which matches "regex" |
| 27 | sed '/regex/{x;p;x;}' |
| 28 | |
| 29 | # insert a blank line below every line which matches "regex" |
| 30 | sed '/regex/G' |
| 31 | |
| 32 | # insert a blank line above and below every line which matches "regex" |
| 33 | sed '/regex/{x;p;x;G;}' |
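As a rough illustration of the spacing commands above, the following sketch double-spaces three lines and then strips the added blank lines again. The function names double_space and undo_double_space and the sample text are made up for this example.

```shell
# Hypothetical helper names; each wraps one of the one-liners above.
double_space()      { sed G; }      # G appends a newline plus the (empty) hold space
undo_double_space() { sed 'n;d'; }  # print a line, fetch the next one, delete it

# Round trip on three sample lines: the output should equal the input.
printf 'one\ntwo\nthree\n' | double_space | undo_double_space
```

The round trip only works because every even-numbered line produced by double_space is blank, which is the assumption stated for 'n;d'.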
| 34 | |
| 35 | NUMBERING: |
| 36 | |
| 37 | # number each line of a file (simple left alignment). Using a tab (see |
| 38 | # note on '\t' at end of file) instead of space will preserve margins. |
| 39 | sed = filename | sed 'N;s/\n/\t/' |
| 40 | |
| 41 | # number each line of a file (number on left, right-aligned) |
| 42 | sed = filename | sed 'N; s/^/     /; s/ *\(.\{6,\}\)\n/\1  /' |
| 43 | |
| 44 | # number each line of file, but only print numbers if line is not blank |
| 45 | sed '/./=' filename | sed '/./N; s/\n/ /' |
| 46 | |
| 47 | # count lines (emulates "wc -l") |
| 48 | sed -n '$=' |
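The numbering scripts above all rely on the same two-step idea, sketched here under some assumptions: the helper name number_lines and the sample words are invented, and a space is used as the separator so that the example does not depend on '\t' support.

```shell
# Hypothetical helper: "sed =" prints each line number on its own line,
# then the second sed joins every number/text pair with N.
number_lines() { sed = | sed 'N;s/\n/ /'; }

printf 'alpha\nbeta\ngamma\n' | number_lines

# '$=' prints only the number of the last line, i.e. the line count.
line_count=$(printf 'alpha\nbeta\ngamma\n' | sed -n '$=')
echo "lines: $line_count"
```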
| 49 | |
| 50 | TEXT CONVERSION AND SUBSTITUTION: |
| 51 | |
| 52 | # IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format |
| 53 | sed 's/.$//' # assumes that all lines end with CR/LF |
| 54 | sed 's/^M$//' # in bash/tcsh, press Ctrl-V then Ctrl-M |
| 55 | sed 's/\x0D$//' # gsed 3.02.80, but top script is easier |
| 56 | |
| 57 | # IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format |
| 58 | sed "s/$/`echo -e \\\r`/" # command line under ksh |
| 59 | sed 's/$'"/`echo \\\r`/" # command line under bash |
| 60 | sed "s/$/`echo \\\r`/" # command line under zsh |
| 61 | sed 's/$/\r/' # gsed 3.02.80 |
| 62 | |
| 63 | # IN DOS ENVIRONMENT: convert Unix newlines (LF) to DOS format |
| 64 | sed "s/$//" # method 1 |
| 65 | sed -n p # method 2 |
| 66 | |
| 67 | # IN DOS ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format |
| 68 | # Can only be done with UnxUtils sed, version 4.0.7 or higher. |
| 69 | # Cannot be done with other DOS versions of sed. Use "tr" instead. |
| 70 | sed "s/\r//" infile >outfile # UnxUtils sed v4.0.7 or higher |
| 71 | tr -d \r <infile >outfile # GNU tr version 1.22 or higher |
| 72 | |
| 73 | # delete leading whitespace (spaces, tabs) from front of each line |
| 74 | # aligns all text flush left |
| 75 | sed 's/^[ \t]*//' # see note on '\t' at end of file |
| 76 | |
| 77 | # delete trailing whitespace (spaces, tabs) from end of each line |
| 78 | sed 's/[ \t]*$//' # see note on '\t' at end of file |
| 79 | |
| 80 | # delete BOTH leading and trailing whitespace from each line |
| 81 | sed 's/^[ \t]*//;s/[ \t]*$//' |
| 82 | |
| 83 | # insert 5 blank spaces at beginning of each line (make page offset) |
| 84 | sed 's/^/     /' |
| 85 | |
| 86 | # align all text flush right on a 79-column width |
| 87 | sed -e :a -e 's/^.\{1,78\}$/ &/;ta' # set at 78 plus 1 space |
| 88 | |
| 89 | # center all text in the middle of 79-column width. In method 1, |
| 90 | # spaces at the beginning of the line are significant, and trailing |
| 91 | # spaces are appended at the end of the line. In method 2, spaces at |
| 92 | # the beginning of the line are discarded in centering the line, and |
| 93 | # no trailing spaces appear at the end of lines. |
| 94 | sed -e :a -e 's/^.\{1,77\}$/ & /;ta' # method 1 |
| 95 | sed -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/\( *\)\1/\1/' # method 2 |
| 96 | |
| 97 | # substitute (find and replace) "foo" with "bar" on each line |
| 98 | sed 's/foo/bar/' # replaces only 1st instance in a line |
| 99 | sed 's/foo/bar/4' # replaces only 4th instance in a line |
| 100 | sed 's/foo/bar/g' # replaces ALL instances in a line |
| 101 | sed 's/\(.*\)foo\(.*foo\)/\1bar\2/' # replace the next-to-last case |
| 102 | sed 's/\(.*\)foo/\1bar/' # replace only the last case |
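To make the difference between these substitution forms concrete, here is a small sketch run against one sample line containing five occurrences of "foo". The variable names are chosen only for this illustration.

```shell
line='foo foo foo foo foo'     # sample text, five occurrences

first=$(printf '%s\n' "$line" | sed 's/foo/bar/')
fourth=$(printf '%s\n' "$line" | sed 's/foo/bar/4')
all=$(printf '%s\n' "$line" | sed 's/foo/bar/g')
# The greedy \(.*\) swallows as much as possible, so the "foo" that
# follows it is the last one on the line.
last=$(printf '%s\n' "$line" | sed 's/\(.*\)foo/\1bar/')

printf '%s\n' "$first" "$fourth" "$all" "$last"
```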
| 103 | |
| 104 | # substitute "foo" with "bar" ONLY for lines which contain "baz" |
| 105 | sed '/baz/s/foo/bar/g' |
| 106 | |
| 107 | # substitute "foo" with "bar" EXCEPT for lines which contain "baz" |
| 108 | sed '/baz/!s/foo/bar/g' |
| 109 | |
| 110 | # change "scarlet" or "ruby" or "puce" to "red" |
| 111 | sed 's/scarlet/red/g;s/ruby/red/g;s/puce/red/g' # most seds |
| 112 | gsed 's/scarlet\|ruby\|puce/red/g' # GNU sed only |
| 113 | |
| 114 | # reverse order of lines (emulates "tac") |
| 115 | # bug/feature in HHsed v1.5 causes blank lines to be deleted |
| 116 | sed '1!G;h;$!d' # method 1 |
| 117 | sed -n '1!G;h;$p' # method 2 |
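The line-reversal scripts work by accumulating the file in the hold space. The sketch below wraps method 2 in a helper (the name reverse_lines is invented here) and annotates each command.

```shell
# Hypothetical helper wrapping method 2 above.
#   1!G  on every line but the first, append the hold space (the earlier
#        lines, already reversed) after the current line
#   h    copy the result back into the hold space
#   $p   on the last line, print the accumulated text
reverse_lines() { sed -n '1!G;h;$p'; }

printf 'first\nsecond\nthird\n' | reverse_lines
```

Because the whole input ends up in the hold space, this approach is best suited to files that fit comfortably in memory.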
| 118 | |
| 119 | # reverse each character on the line (emulates "rev") |
| 120 | sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//' |
| 121 | |
| 122 | # join pairs of lines side-by-side (like "paste") |
| 123 | sed '$!N;s/\n/ /' |
| 124 | |
| 125 | # if a line ends with a backslash, append the next line to it |
| 126 | sed -e :a -e '/\\$/N; s/\\\n//; ta' |
| 127 | |
| 128 | # if a line begins with an equal sign, append it to the previous line |
| 129 | # and replace the "=" with a single space |
| 130 | sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D' |
| 131 | |
| 132 | # add commas to numeric strings, changing "1234567" to "1,234,567" |
| 133 | gsed ':a;s/\B[0-9]\{3\}\>/,&/;ta' # GNU sed |
| 134 | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta' # other seds |
| 135 | |
| 136 | # add commas to numbers with decimal points and minus signs (GNU sed) |
| 137 | gsed ':a;s/\(^\|[^0-9.]\)\([0-9]\+\)\([0-9]\{3\}\)/\1\2,\3/g;ta' |
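The comma scripts are a good example of a substitute-and-loop pattern. As a minimal sketch, the portable ("other seds") version is wrapped below in a helper whose name, add_commas, is an assumption of this example.

```shell
# Hypothetical helper wrapping the portable ("other seds") script.
# Each pass inserts one comma in front of a run of three digits that is
# preceded by a digit; "ta" jumps back to :a while a substitution was made.
add_commas() { sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'; }

printf '1234567\n999\n12345678\n' | add_commas
```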
| 138 | |
| 139 | # add a blank line every 5 lines (after lines 5, 10, 15, 20, etc.) |
| 140 | gsed '0~5G' # GNU sed only |
| 141 | sed 'n;n;n;n;G;' # other seds |
| 142 | |
| 143 | SELECTIVE PRINTING OF CERTAIN LINES: |
| 144 | |
| 145 | # print first 10 lines of file (emulates behavior of "head") |
| 146 | sed 10q |
| 147 | |
| 148 | # print first line of file (emulates "head -1") |
| 149 | sed q |
| 150 | |
| 151 | # print the last 10 lines of a file (emulates "tail") |
| 152 | sed -e :a -e '$q;N;11,$D;ba' |
| 153 | |
| 154 | # print the last 2 lines of a file (emulates "tail -2") |
| 155 | sed '$!N;$!D' |
| 156 | |
| 157 | # print the last line of a file (emulates "tail -1") |
| 158 | sed '$!d' # method 1 |
| 159 | sed -n '$p' # method 2 |
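The head and tail emulations above can be tried on a tiny input. This sketch uses the numbers 1 to 5 as made-up data and invented variable names.

```shell
# Sample input: the numbers 1..5, one per line (made-up data).
nums=$(printf '1\n2\n3\n4\n5')

# "3q" quits after line 3 has been printed: like "head -3".
head3=$(printf '%s\n' "$nums" | sed 3q)

# '$!N' keeps a two-line window; '$!D' drops its first line until the
# last input line has been read: like "tail -2".
tail2=$(printf '%s\n' "$nums" | sed '$!N;$!D')

echo "$head3"
echo "$tail2"
```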
| 160 | |
| 161 | # print only lines which match regular expression (emulates "grep") |
| 162 | sed -n '/regexp/p' # method 1 |
| 163 | sed '/regexp/!d' # method 2 |
| 164 | |
| 165 | # print only lines which do NOT match regexp (emulates "grep -v") |
| 166 | sed -n '/regexp/!p' # method 1, corresponds to above |
| 167 | sed '/regexp/d' # method 2, simpler syntax |
| 168 | |
| 169 | # print the line immediately before a regexp, but not the line |
| 170 | # containing the regexp |
| 171 | sed -n '/regexp/{g;1!p;};h' |
| 172 | |
| 173 | # print the line immediately after a regexp, but not the line |
| 174 | # containing the regexp |
| 175 | sed -n '/regexp/{n;p;}' |
| 176 | |
| 177 | # print 1 line of context before and after regexp, with line number |
| 178 | # indicating where the regexp occurred (similar to "grep -A1 -B1") |
| 179 | sed -n -e '/regexp/{=;x;1!p;g;$!N;p;D;}' -e h |
| 180 | |
| 181 | # grep for AAA and BBB and CCC (in any order) |
| 182 | sed '/AAA/!d; /BBB/!d; /CCC/!d' |
| 183 | |
| 184 | # grep for AAA and BBB and CCC (in that order) |
| 185 | sed '/AAA.*BBB.*CCC/!d' |
| 186 | |
| 187 | # grep for AAA or BBB or CCC (emulates "egrep") |
| 188 | sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d # most seds |
| 189 | gsed '/AAA\|BBB\|CCC/!d' # GNU sed only |
| 190 | |
| 191 | # print paragraph if it contains AAA (blank lines separate paragraphs) |
| 192 | # HHsed v1.5 must insert a 'G;' after 'x;' in the next 3 scripts below |
| 193 | sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;' |
| 194 | |
| 195 | # print paragraph if it contains AAA and BBB and CCC (in any order) |
| 196 | sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;/BBB/!d;/CCC/!d' |
| 197 | |
| 198 | # print paragraph if it contains AAA or BBB or CCC |
| 199 | sed -e '/./{H;$!d;}' -e 'x;/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d |
| 200 | gsed '/./{H;$!d;};x;/AAA\|BBB\|CCC/b;d' # GNU sed only |
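The paragraph scripts above share one mechanism: gather a paragraph in the hold space, then test it as a whole. The sketch below annotates the first script; the helper name paragraphs_with_AAA and the sample text are assumptions of this example.

```shell
# Hypothetical helper wrapping the first paragraph script above.
#   /./{H;$!d;}  collect non-blank lines in the hold space
#   x            at a blank line (or at end of input) swap the collected
#                paragraph into the pattern space
#   /AAA/!d      discard it unless it contains AAA
paragraphs_with_AAA() { sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;'; }

# Two paragraphs of sample text; only the first mentions AAA.
printf 'p1 line\nAAA here\n\np2 line\nnothing\n' | paragraphs_with_AAA
```

Each printed paragraph starts with a blank line, because H puts a newline in front of every line it appends.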
| 201 | |
| 202 | # print only lines of 65 characters or longer |
| 203 | sed -n '/^.\{65\}/p' |
| 204 | |
| 205 | # print only lines of less than 65 characters |
| 206 | sed -n '/^.\{65\}/!p' # method 1, corresponds to above |
| 207 | sed '/^.\{65\}/d' # method 2, simpler syntax |
| 208 | |
| 209 | # print section of file from regular expression to end of file |
| 210 | sed -n '/regexp/,$p' |
| 211 | |
| 212 | # print section of file based on line numbers (lines 8-12, inclusive) |
| 213 | sed -n '8,12p' # method 1 |
| 214 | sed '8,12!d' # method 2 |
| 215 | |
| 216 | # print line number 52 |
| 217 | sed -n '52p' # method 1 |
| 218 | sed '52!d' # method 2 |
| 219 | sed '52q;d' # method 3, efficient on large files |
| 220 | |
| 221 | # beginning at line 3, print every 7th line |
| 222 | gsed -n '3~7p' # GNU sed only |
| 223 | sed -n '3,${p;n;n;n;n;n;n;}' # other seds |
| 224 | |
| 225 | # print section of file between two regular expressions (inclusive) |
| 226 | sed -n '/Iowa/,/Montana/p' # case sensitive |
| 227 | |
| 228 | SELECTIVE DELETION OF CERTAIN LINES: |
| 229 | |
| 230 | # print all of file EXCEPT section between 2 regular expressions |
| 231 | sed '/Iowa/,/Montana/d' |
| 232 | |
| 233 | # delete duplicate, consecutive lines from a file (emulates "uniq"). |
| 234 | # First line in a set of duplicate lines is kept, rest are deleted. |
| 235 | sed '$!N; /^\(.*\)\n\1$/!P; D' |
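The "uniq" emulation slides a two-line window over the input. Here is a small annotated sketch; the helper name uniq_lines and the sample letters are invented for illustration.

```shell
# Hypothetical helper wrapping the script above.
#   $!N               append the next line, giving a two-line window
#   /^\(.*\)\n\1$/!P  print the first line unless both lines are equal
#   D                 drop the first line and restart with the remainder
uniq_lines() { sed '$!N; /^\(.*\)\n\1$/!P; D'; }

printf 'a\na\nb\nc\nc\n' | uniq_lines
```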
| 236 | |
| 237 | # delete duplicate, nonconsecutive lines from a file. Beware not to |
| 238 | # overflow the buffer size of the hold space, or else use GNU sed. |
| 239 | sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P' |
| 240 | |
| 241 | # delete all lines except duplicate lines (emulates "uniq -d"). |
| 242 | sed '$!N; s/^\(.*\)\n\1$/\1/; t; D' |
| 243 | |
| 244 | # delete the first 10 lines of a file |
| 245 | sed '1,10d' |
| 246 | |
| 247 | # delete the last line of a file |
| 248 | sed '$d' |
| 249 | |
| 250 | # delete the last 2 lines of a file |
| 251 | sed 'N;$!P;$!D;$d' |
| 252 | |
| 253 | # delete the last 10 lines of a file |
| 254 | sed -e :a -e '$d;N;2,10ba' -e 'P;D' # method 1 |
| 255 | sed -n -e :a -e '1,10!{P;N;D;};N;ba' # method 2 |
| 256 | |
| 257 | # delete every 8th line |
| 258 | gsed '0~8d' # GNU sed only |
| 259 | sed 'n;n;n;n;n;n;n;d;' # other seds |
| 260 | |
| 261 | # delete ALL blank lines from a file (same as "grep '.' ") |
| 262 | sed '/^$/d' # method 1 |
| 263 | sed '/./!d' # method 2 |
| 264 | |
| 265 | # delete all CONSECUTIVE blank lines from file except the first; also |
| 266 | # deletes all blank lines from top and end of file (emulates "cat -s") |
| 267 | sed '/./,/^$/!d' # method 1, allows 0 blanks at top, 1 at EOF |
| 268 | sed '/^$/N;/\n$/D' # method 2, allows 1 blank at top, 0 at EOF |
| 269 | |
| 270 | # delete all CONSECUTIVE blank lines from file except the first 2: |
| 271 | sed '/^$/N;/\n$/N;//D' |
| 272 | |
| 273 | # delete all leading blank lines at top of file |
| 274 | sed '/./,$!d' |
| 275 | |
| 276 | # delete all trailing blank lines at end of file |
| 277 | sed -e :a -e '/^\n*$/{$d;N;ba' -e '}' # works on all seds |
| 278 | sed -e :a -e '/^\n*$/N;/\n$/ba' # ditto, except for gsed 3.02* |
| 279 | |
| 280 | # delete the last line of each paragraph |
| 281 | sed -n '/^$/{p;h;};/./{x;/./p;}' |
| 282 | |
| 283 | SPECIAL APPLICATIONS: |
| 284 | |
| 285 | # remove nroff overstrikes (char, backspace) from man pages. The 'echo' |
| 286 | # command may need an -e switch if you use Unix System V or bash shell. |
| 287 | sed "s/.`echo \\\b`//g" # double quotes required for Unix environment |
| 288 | sed 's/.^H//g' # in bash/tcsh, press Ctrl-V and then Ctrl-H |
| 289 | sed 's/.\x08//g' # hex expression for sed v1.5 |
| 290 | |
| 291 | # get Usenet/e-mail message header |
| 292 | sed '/^$/q' # deletes everything after first blank line |
| 293 | |
| 294 | # get Usenet/e-mail message body |
| 295 | sed '1,/^$/d' # deletes everything up to first blank line |
| 296 | |
| 297 | # get Subject header, but remove initial "Subject: " portion |
| 298 | sed '/^Subject: */!d; s///;q' |
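The header, body, and Subject scripts can be tried together on a small made-up message. The message text, the example.com addresses, and the variable names below are all assumptions of this sketch.

```shell
# A made-up message; addresses use the reserved example.com domain.
msg=$(printf 'From: alice@example.com\nSubject: Lunch plans\nTo: bob@example.com\n\nNoon at the usual place?')

header=$(printf '%s\n' "$msg" | sed '/^$/q')
body=$(printf '%s\n' "$msg" | sed '1,/^$/d')
# The empty regex in "s///" reuses the last regex, /^Subject: */.
subject=$(printf '%s\n' "$msg" | sed '/^Subject: */!d; s///;q')

echo "subject: $subject"
echo "body:    $body"
```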
| 299 | |
| 300 | # get return address header |
| 301 | sed '/^Reply-To:/q; /^From:/h; /./d;g;q' |
| 302 | |
| 303 | # parse out the address proper. Pulls out the e-mail address by itself |
| 304 | # from the 1-line return address header (see preceding script) |
| 305 | sed 's/ *(.*)//; s/>.*//; s/.*[:<] *//' |
| 306 | |
| 307 | # add a leading angle bracket and space to each line (quote a message) |
| 308 | sed 's/^/> /' |
| 309 | |
| 310 | # delete leading angle bracket & space from each line (unquote a message) |
| 311 | sed 's/^> //' |
| 312 | |
| 313 | # remove most HTML tags (accommodates multiple-line tags) |
| 314 | sed -e :a -e 's/<[^>]*>//g;/</N;//ba' |
| 315 | |
| 316 | # extract multi-part uuencoded binaries, removing extraneous header |
| 317 | # info, so that only the uuencoded portion remains. Files passed to |
| 318 | # sed must be passed in the proper order. Version 1 can be entered |
| 319 | # from the command line; version 2 can be made into an executable |
| 320 | # Unix shell script. (Modified from a script by Rahul Dhesi.) |
| 321 | sed '/^end/,/^begin/d' file1 file2 ... fileX | uudecode # vers. 1 |
| 322 | sed '/^end/,/^begin/d' "$@" | uudecode # vers. 2 |
| 323 | |
| 324 | # zip up each .TXT file individually, deleting the source file and |
| 325 | # setting the name of each .ZIP file to the basename of the .TXT file |
| 326 | # (under DOS: the "dir /b" switch returns bare filenames in all caps). |
| 327 | echo @echo off >zipup.bat |
| 328 | dir /b *.txt | sed "s/^\(.*\)\.TXT/pkzip -mo \1 \1.TXT/" >>zipup.bat |
| 329 | |
| 330 | TYPICAL USE: Sed takes one or more editing commands and applies all of |
| 331 | them, in sequence, to each line of input. After all the commands have |
| 332 | been applied to the first input line, that line is output and a second |
| 333 | input line is taken for processing, and the cycle repeats. The |
| 334 | preceding examples assume that input comes from the standard input |
| 335 | device (i.e., the console; normally this will be piped input). One or |
| 336 | more filenames can be appended to the command line if the input does |
| 337 | not come from stdin. Output is sent to stdout (the screen). Thus: |
| 338 | |
| 339 | cat filename | sed '10q' # uses piped input |
| 340 | sed '10q' filename # same effect, avoids a useless "cat" |
| 341 | sed '10q' filename > newfile # redirects output to disk |
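The three invocation styles, and the rule that commands are applied in sequence to each line, can be checked with a scratch file. This is only a sketch: it assumes mktemp is available, and the variable names are invented.

```shell
# Scratch file with three sample lines (mktemp is assumed to be available).
tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"

from_pipe=$(cat "$tmp" | sed '2q')     # input arrives on stdin
from_file=$(sed '2q' "$tmp")           # input named on the command line
sed '2q' "$tmp" > "$tmp.out"           # output redirected to a file
from_redirect=$(cat "$tmp.out")

# Commands run in order on each line: a -> b, then b -> c.
chained=$(printf 'a\n' | sed -e 's/a/b/' -e 's/b/c/')

echo "$from_pipe"
echo "$chained"
rm -f "$tmp" "$tmp.out"
```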
| 342 | |
| 343 | For additional syntax instructions, including the way to apply editing |
| 344 | commands from a disk file instead of the command line, consult "sed & |
| 345 | awk, 2nd Edition," by Dale Dougherty and Arnold Robbins (O'Reilly, |
| 346 | 1997; http://www.ora.com), "UNIX Text Processing," by Dale Dougherty |
| 347 | and Tim O'Reilly (Hayden Books, 1987) or the tutorials by Mike Arst |
| 348 | distributed in U-SEDIT2.ZIP (many sites). To fully exploit the power |
| 349 | of sed, one must understand "regular expressions." For this, see |
| 350 | "Mastering Regular Expressions" by Jeffrey Friedl (O'Reilly, 1997). |
| 351 | The manual ("man") pages on Unix systems may be helpful (try "man |
| 352 | sed", "man regexp", or the subsection on regular expressions in "man |
| 353 | ed"), but man pages are notoriously difficult. They are not written to |
| 354 | teach sed use or regexps to first-time users, but as a reference text |
| 355 | for those already acquainted with these tools. |
| 356 | |
| 357 | QUOTING SYNTAX: The preceding examples use single quotes ('...') |
| 358 | instead of double quotes ("...") to enclose editing commands, since |
| 359 | sed is typically used on a Unix platform. Single quotes prevent the |
| 360 | Unix shell from interpreting the dollar sign ($) and backquotes |
| 361 | (`...`), which are expanded by the shell if they are enclosed in |
| 362 | double quotes. Users of the "csh" shell and derivatives will also need |
| 363 | to quote the exclamation mark (!) with the backslash (i.e., \!) to |
| 364 | properly run the examples listed above, even within single quotes. |
| 365 | Versions of sed written for DOS invariably require double quotes |
| 366 | ("...") instead of single quotes to enclose editing commands. |
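The effect of the two quoting styles in a Bourne-style shell can be seen in a short sketch. The variable names, including "word", are made up for this example.

```shell
# Single quotes: the shell passes $ through to sed untouched.
last_single=$(printf 'a\nb\nc\n' | sed -n '$p')

# Double quotes: $ must be escaped, or the shell tries to expand "$p".
last_double=$(printf 'a\nb\nc\n' | sed -n "\$p")

# Double quotes are useful when expansion is wanted, e.g. to put a
# shell variable (here the made-up name "word") into the script.
word=b
matched=$(printf 'a\nb\nc\n' | sed -n "/$word/p")

echo "$last_single $last_double $matched"
```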
| 367 | |
| 368 | USE OF '\t' IN SED SCRIPTS: For clarity in documentation, we have used |
| 369 | the expression '\t' to indicate a tab character (0x09) in the scripts. |
| 370 | However, most versions of sed do not recognize the '\t' abbreviation, |
| 371 | so when typing these scripts from the command line, you should press |
| 372 | the TAB key instead. '\t' is supported as a regular expression |
| 373 | metacharacter in awk, perl, HHsed, sedmod, and GNU sed v3.02.80. |
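In a script, where pressing the TAB key is awkward, one possible workaround is to let the shell supply the tab. The sketch below assumes a POSIX shell with printf; the names tab and trim are invented here.

```shell
# Put a real tab character into a shell variable (hypothetical name).
tab=$(printf '\t')

# Double quotes let the shell expand $tab; "\$" keeps the end-of-line
# anchor literal for sed.
trim() { sed "s/^[ $tab]*//;s/[ $tab]*\$//"; }

printf '  \thello world \t \n' | trim
```

Since sed receives an actual tab byte, this form does not depend on whether the sed in use understands '\t'.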
| 374 | |
| 375 | VERSIONS OF SED: Versions of sed do differ, and some slight syntax |
| 376 | variation is to be expected. In particular, most do not support the |
| 377 | use of labels (:name) or branch instructions (b,t) within editing |
| 378 | commands, except at the end of those commands. We have used the syntax |
| 379 | which will be portable to most users of sed, even though the popular |
| 380 | GNU versions of sed allow a more succinct syntax. When the reader sees |
| 381 | a fairly long command such as this: |
| 382 | |
| 383 | sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d |
| 384 | |
| 385 | it is heartening to know that GNU sed will let you reduce it to: |
| 386 | |
| 387 | sed '/AAA/b;/BBB/b;/CCC/b;d' # or even |
| 388 | sed '/AAA\|BBB\|CCC/b;d' |
| 389 | |
| 390 | In addition, remember that while many versions of sed accept a command |
| 391 | like "/one/ s/RE1/RE2/", some do NOT allow "/one/! s/RE1/RE2/", which |
| 392 | contains space before the 's'. Omit the space when typing the command. |
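The portable and compact forms can be compared side by side. In this sketch the helper name "either" is invented, and a successful "sed --version" is used only as a rough heuristic for a sed that accepts the compact form; that check is an assumption, not something this document prescribes.

```shell
# Portable form: one -e per command, so each "b" ends its own chunk.
either() { sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d; }

portable=$(printf 'AAA 1\nxxx\nCCC 2\n' | either)
echo "$portable"

# Compact form, tried only when "sed --version" succeeds (heuristic).
if sed --version >/dev/null 2>&1; then
    compact=$(printf 'AAA 1\nxxx\nCCC 2\n' | sed '/AAA/b;/BBB/b;/CCC/b;d')
    echo "$compact"
fi
```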
| 393 | |
| 394 | OPTIMIZING FOR SPEED: If execution speed needs to be increased (due to |
| 395 | large input files or slow processors or hard disks), substitution will |
| 396 | be executed more quickly if the "find" expression is specified before |
| 397 | giving the "s/.../.../" instruction. Thus: |
| 398 | |
| 399 | sed 's/foo/bar/g' filename # standard replace command |
| 400 | sed '/foo/ s/foo/bar/g' filename # executes more quickly |
| 401 | sed '/foo/ s//bar/g' filename # shorthand sed syntax |
| 402 | |
| 403 | On line selection or deletion in which you only need to output lines |
| 404 | from the first part of the file, a "quit" command (q) in the script |
| 405 | will drastically reduce processing time for large files. Thus: |
| 406 | |
| 407 | sed -n '45,50p' filename # print line nos. 45-50 of a file |
| 408 | sed -n '51q;45,50p' filename # same, but executes much faster |
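Both optimizations change only the amount of work done, not the result, which a quick check can confirm. The sketch assumes mktemp and awk are available and uses invented variable names; it does not measure timing.

```shell
# Scratch file holding the numbers 1..100 (mktemp is assumed available).
tmp=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100; i++) print i }' > "$tmp"

slow=$(sed -n '45,50p' "$tmp")        # reads all 100 lines
fast=$(sed -n '51q;45,50p' "$tmp")    # stops reading at line 51

# Address first, then the empty regex to reuse it in the substitution.
plain=$(printf 'foo x foo\nother\n' | sed 's/foo/bar/g')
guarded=$(printf 'foo x foo\nother\n' | sed '/foo/ s//bar/g')

[ "$slow" = "$fast" ] && [ "$plain" = "$guarded" ] && echo "same output"
rm -f "$tmp"
```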
| 409 | |
| 410 | If you have any additional scripts to contribute or if you find errors |
| 411 | in this document, please send e-mail to the compiler. Indicate the |
| 412 | version of sed you used, the operating system it was compiled for, and |
| 413 | the nature of the problem. Various scripts in this file were written |
| 414 | or contributed by: |
| 415 | |
| 416 | Al Aab <af137@freenet.toronto.on.ca> # "seders" list moderator |
| 417 | Edgar Allen <era@sky.net> # various |
| 418 | Yiorgos Adamopoulos <adamo@softlab.ece.ntua.gr> |
| 419 | Dale Dougherty <dale@songline.com> # author of "sed & awk" |
| 420 | Carlos Duarte <cdua@algos.inesc.pt> # author of "do it with sed" |
| 421 | Eric Pement <pemente@northpark.edu> # author of this document |
| 422 | Ken Pizzini <ken@halcyon.com> # author of GNU sed v3.02 |
| 423 | S.G. Ravenhall <stew.ravenhall@totalise.co.uk> # great de-html script |
| 424 | Greg Ubben <gsu@romulus.ncsc.mil> # many contributions & much help |
| 425 | ------------------------------------------------------------------------- |