Sed is a stream editor, where a stream can be
- a single file specified:
sed 's/a/b/' input.txt
- a list of files specified:
sed 's/a/b/' 1.txt 2.txt 3.txt
- standard input (i.e. whatever you type after running sed and pressing Enter; it works like a prompt and keeps going, executing each time you press Enter, until you send end-of-file with Ctrl-D)
Streams are processed line by line: sed chunks its input by looking for line terminators.
It runs one or more commands on each line in order to edit the stream.
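As a rough sketch of the kinds of stream listed above, the following feeds sed from standard input (through a pipe) and from a list of files. The file names and contents are made up for illustration. Note that, without in-place editing, a list of files is read as one continuous stream, so line 3 below is the first line of the second file.

```shell
# Hypothetical scratch files, created only for this demonstration.
tmp=$(mktemp -d)
printf 'cat\nhat\n' > "$tmp/1.txt"
printf 'mat\n' > "$tmp/2.txt"

# Standard input: the stream comes from a pipe instead of the keyboard.
from_stdin=$(printf 'cat\nbat\n' | sed 's/a/b/')

# A list of files: they are read one after another as a single stream,
# so address 3 refers to the first line of 2.txt.
from_files=$(sed '3s/a/b/' "$tmp/1.txt" "$tmp/2.txt")

printf '%s\n' "$from_stdin" "$from_files"
rm -rf "$tmp"
```

Only the first "a" on each line is replaced, because the s function was given no g flag.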
Ref) Online GNU manual (different from what you get running man sed, and much easier to understand!)
Editing files in place
Use the -i or -I option, giving the extension of the backup files to be created.
The difference between them only matters when you give sed multiple files, and affects how you specify line numbers. (-I comes from BSD/macOS sed; GNU sed only provides -i.)
- -i: each file is treated independently, so line numbers start again at 1 in every file and $ means the last line of each file.
- -I: all files are considered as one concatenated lump. So line 1 of the second file will be {number of lines in file 1} + 1.
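A minimal sketch of in-place editing with a backup extension, using made-up scratch files. It attaches the extension directly to -i (no space), a form that both GNU sed and BSD sed should accept. With -i each file keeps its own line numbering, so the address 1 hits the first line of both files.

```shell
# Hypothetical scratch files, created only for this demonstration.
tmp=$(mktemp -d)
printf 'a\na\n' > "$tmp/1.txt"
printf 'a\na\n' > "$tmp/2.txt"

# Edit both files in place, keeping the originals as 1.txt.bak and 2.txt.bak.
sed -i.bak '1s/a/b/' "$tmp/1.txt" "$tmp/2.txt"

edited_1=$(cat "$tmp/1.txt")      # line 1 of 1.txt changed
edited_2=$(cat "$tmp/2.txt")      # line 1 of 2.txt changed too
backup_1=$(cat "$tmp/1.txt.bak")  # untouched copy of the original

printf '%s\n' "$edited_1" "$edited_2" "$backup_1"
rm -rf "$tmp"
```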
Command Structure
Commands are made of address(es), a function, and arguments.
- addresses are basically line numbers
- addresses are optional; you can have 0, 1 or 2.
- with 0, the function is applied to the whole stream
- with 1, the function is applied to that address (line number) only
- with 2, the function is applied to the inclusive range between the addresses.
- you can use $ to mean end of the stream (final line)
Example:
testfile:
a
a
a
sed '2s/a/x/' testfile
- a single address is specified - line 2
- the function is "s/a/x/", which is the "s" function for replacing patterns.
result:
a
x
a
sed '1,2s/a/x/' testfile
Two addresses are given, so the command works on the range of lines 1 to 2.
result:
x
x
a
sed '$s/a/x/' testfile
One address is given, and it's the final line of the stream, so the result is
a
a
x
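The two remaining cases from the address rules, no address at all and a range that ends at $, can be sketched the same way. The scratch file below simply recreates the three-line testfile. The script is kept in single quotes so the shell does not try to expand $ as a variable.

```shell
# Recreate the three-line testfile in a scratch directory.
tmp=$(mktemp -d)
printf 'a\na\na\n' > "$tmp/testfile"

# 0 addresses: the function is applied to every line of the stream.
no_address=$(sed 's/a/x/' "$tmp/testfile")

# 2 addresses, the second being $: from line 2 to the final line, inclusive.
range_to_end=$(sed '2,$s/a/x/' "$tmp/testfile")

printf '%s\n' "$no_address" "$range_to_end"
rm -rf "$tmp"
```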
Important Concept For Understanding The Manual: 「pattern space」 and 「hold space」
Sed has 2 buffers, called the pattern space and hold space.
(You won't get far reading man sed if you don't know what they are.)
The pattern space holds one line at a time while processing a stream.
When the script has finished running for a line, the pattern space is written to sed's output (unless you pass -n).
The hold space keeps its content over multiple lines, kind of like the "accumulator" arg to a reduce function.
By default nothing goes in the hold space, and most simple use cases don't touch it.
You can do things like
- append the pattern space (the currently processed line) to the hold space
- vice versa
- swap contents of pattern and hold space
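A small sketch of those buffer operations, assuming the standard sed commands h (copy pattern space to hold space), G (append hold space to pattern space) and x (swap them). The -n option turns off automatic printing so that only the explicit p prints.

```shell
input=$(printf 'first\nsecond\n')

# Line 1 is copied into the hold space with h. On line 2, G appends the
# hold space (still containing line 1) to the pattern space after a newline.
appended=$(printf '%s\n' "$input" | sed -n '1h; 2{G; p;}')

# x swaps the two buffers, so while processing line 2 the pattern space
# becomes the saved line 1.
swapped=$(printf '%s\n' "$input" | sed -n '1h; 2{x; p;}')

printf '%s\n' "$appended" "$swapped"
```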
Examples Using the Hold Space
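Accumulate all lines into one line
- H: append the pattern space to the hold space (with a newline in front of it)
- x: swap the pattern and hold spaces; putting it inside ${ ... } makes this only happen on the final line of the stream
- s: pattern replacement command, used here to drop the leading newline and then replace the remaining newlines with spaces
- -n and p: turn off the automatic printing of every line, and print only the accumulated result
sed -n 'H; ${x; s/^\n//; s/\n/ /g; p;}' input_file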
To be honest you can do this more easily in most text editors (eg VS Code) using "find and replace". Where sed may have an advantage is if you altered the above command to accumulate just instances of a particular pattern instead of the whole line.
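One way that variation might look: put a regex address in front of H so that only matching lines are accumulated. The input file and the pattern /^a/ (lines starting with "a") are made up for illustration.

```shell
# Hypothetical input, created only for this demonstration.
tmp=$(mktemp -d)
printf 'apple\nbanana\navocado\ncherry\n' > "$tmp/input_file"

# /^a/H       : append only lines starting with "a" to the hold space
# ${ ... }    : on the final line, swap the hold space in, drop the leading
#               newline, turn the remaining newlines into spaces, and print
joined=$(sed -n '/^a/H; ${x; s/^\n//; s/\n/ /g; p;}' "$tmp/input_file")

printf '%s\n' "$joined"   # apple avocado
rm -rf "$tmp"
```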
Reversing Order of Lines
sed -n '1!G;h;$p'
https://stackoverflow.com/questions/12833714/the-concept-of-hold-space-and-pattern-space-in-sed (see popular answer explaining this)
I don't think you can do this in VS Code (although there's probably some plugin for it...)
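To see why the reversing command works, here is a small run on three lines, with the state of the two buffers traced in the comments.

```shell
# 1!G : on every line except the first, append the hold space to the pattern space
# h   : copy the pattern space into the hold space
# $p  : on the final line, print the pattern space
#
# line "1": G skipped          -> pattern "1",        hold "1"
# line "2": G gives "2\n1"     -> pattern "2\n1",     hold "2\n1"
# line "3": G gives "3\n2\n1"  -> printed because it is the last line
reversed=$(printf '1\n2\n3\n' | sed -n '1!G;h;$p')

printf '%s\n' "$reversed"
```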
Functions Other Than s
- d: delete the line
  - Delete first and last lines:
    sed '1d; $d' input_file
- q: quit the program
  - Don't alter the input and quit at the 5th line (i.e. only get the first 5 lines):
    sed '5q' input_file
- y: like s but with no regex; each character in the first list is replaced by the character at the same position in the second list
  - Replace all "a"s with "b"s:
    sed 'y/a/b/' input_file
- r <filename>: print filename's contents after each line (useful for debugging line by line?)
  - If you have a file, poo.txt, with just the word "poo" in it, put a line with "poo" after every line:
    sed 'r poo.txt' input_file
- a\ <text>: similar to r, but specify the text directly instead
Example of a\:
testfile:
a
b
c
sed 'a\
LINE: ' testfile
result (the text is added as a new line after each line, not as a prefix):
a
LINE: 
b
LINE: 
c
LINE: 
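The functions in this section can be tried side by side with a sketch like the following. The scratch files and their contents are made up, and r and a\ are given the address 2 so the effect on a single line is easy to see.

```shell
# Hypothetical scratch files, created only for this demonstration.
tmp=$(mktemp -d)
printf 'a\nb\nc\nd\n' > "$tmp/input_file"
printf 'poo\n' > "$tmp/poo.txt"

# d: delete the first and last lines.
trimmed=$(sed '1d; $d' "$tmp/input_file")

# q: quit after line 2, so only the first two lines are printed.
first_two=$(sed '2q' "$tmp/input_file")

# y: map characters by position (a -> x, b -> y).
mapped=$(sed 'y/ab/xy/' "$tmp/input_file")

# r: insert the contents of poo.txt after line 2.
with_file=$(sed "2r $tmp/poo.txt" "$tmp/input_file")

# a\: insert literal text as a new line after line 2.
with_text=$(sed '2a\
LINE' "$tmp/input_file")

printf '%s\n' "$trimmed" "$first_two" "$mapped" "$with_file" "$with_text"
rm -rf "$tmp"
```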
Manual Jargon: What is a "script"?
"script" refers to all of the commands collectively
So if you have sed -e 'p' -e 's/a/b/' -e 's/x/y/', that is a 3 command script:
## command 1: print pattern space (in current untouched condition)
## command 2: replace a with b
## command 3: replace x with y
sed \
-e 'p' \
-e 's/a/b/' \
-e 's/x/y/' \
input_file
Also, you can have multiple commands within the same single quotes, and delimit them using semicolons.
I.e. the same as above:
sed 'p; s/a/b/; s/x/y/' input_file
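A quick check that the two spellings of the script behave the same, using a made-up one-line input fed through a pipe. Because p runs before the substitutions, the untouched line is printed first, followed by the edited line that sed prints automatically at the end of the script.

```shell
input='a x'

# Three commands given as separate -e options.
with_e=$(printf '%s\n' "$input" | sed -e 'p' -e 's/a/b/' -e 's/x/y/')

# The same three commands in one quoted script, separated by semicolons.
with_semicolons=$(printf '%s\n' "$input" | sed 'p; s/a/b/; s/x/y/')

printf '%s\n' "$with_e" "$with_semicolons"
```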
How Does Branching Work in Scripts
You can branch at certain points in a script by using the 'b' command.
It takes an optional label argument; without a label, it jumps to the end of the script, skipping the remaining commands for the current line.
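A minimal sketch of both forms of b, on a made-up three-line input. The label name "skip" is arbitrary, and each command is passed with its own -e, which should keep the label syntax portable between GNU and BSD sed.

```shell
input=$(printf 'a\na\na\n')

# No label: on line 2, b jumps to the end of the script, so the
# substitution never runs for that line.
no_label=$(printf '%s\n' "$input" | sed -e '2b' -e 's/a/x/')

# With a label: on line 2, b jumps to :skip, bypassing s/a/x/ but still
# running the command after the label (append "!" to the line).
with_label=$(printf '%s\n' "$input" | sed -e '2b skip' -e 's/a/x/' -e ':skip' -e 's/$/!/')

printf '%s\n' "$no_label" "$with_label"
```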
And that's all I have time to write!