- Find lines in file1.txt but not in file2.txt
- Find lines containing CJK characters from a file
- Delete (really) empty lines
- Change }; to } in multiple files
- After programmatically munging a text file, detect unexpectedly changed lines that don't match a certain pattern
- Finding files that do not contain a given string
- Extract parts related to the -a option from man page of git
Find lines in file1.txt but not in file2.txt
grep -Fxvf file2.txt file1.txt
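A small worked run may make the flags easier to see. This is a sketch on made-up sample files in a temporary directory; the file contents and the variable name only_in_file1 are illustrative assumptions.

```shell
# Build two small sample files in a temporary directory (contents are illustrative).
tmp=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmp/file1.txt"
printf 'banana\ndate\n' > "$tmp/file2.txt"

# -F: treat patterns as fixed strings, -x: match whole lines only,
# -v: invert the match, -f: read the patterns from file2.txt.
only_in_file1=$(grep -Fxvf "$tmp/file2.txt" "$tmp/file1.txt")
printf '%s\n' "$only_in_file1"
```

Here only apple and cherry are printed, because banana is the one line that also appears in file2.txt.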
Find lines containing CJK characters from a file
grep --color=auto -P '.*([^\x00-\x7f]{6}).*' file.txt
Delete (really) empty lines
grep . oldfile > newfile
or
grep -v '^$' oldfile > newfile
or, to delete really empty lines and lines with only whitespace characters:
grep -vE '^[[:space:]]*$' oldfile > newfile
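To see the difference between "empty" and "whitespace-only", here is a minimal sketch on a made-up four-line sample file. It assumes a POSIX grep; the variable names no_empty and no_blank are illustrative.

```shell
tmp=$(mktemp -d)
# Sample input: a text line, an empty line, a whitespace-only line, a text line.
printf 'one\n\n \t\ntwo\n' > "$tmp/oldfile"

# Drop only truly empty lines; the whitespace-only line survives.
no_empty=$(grep -v '^$' "$tmp/oldfile")

# Drop empty lines and lines made up solely of whitespace.
no_blank=$(grep -vE '^[[:space:]]*$' "$tmp/oldfile")

printf '%s\n' "$no_blank"
```

The first variant keeps three lines (including the one holding a space and a tab), while the second keeps only the two text lines.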
Change }; to } in multiple files
grep -rl "};" *.m | xargs sed -i .bak -e 's/};/}/'
Remarks:
- grep:
  - -r recursively find
  - -l list files matching text
- xargs: construct argument list and execute utility
- sed:
  - -i .bak change in place but keep a backup with extension .bak (this spelling is for BSD/macOS sed; GNU sed needs the suffix attached, as -i.bak)
  - -e execute command, in this case, a regex substitution
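The pipeline can be tried safely on throwaway files first. The sketch below uses the attached form -i.bak, which both GNU sed and BSD/macOS sed accept. The sample files a.m and b.m are invented for illustration, and plain grep -l on a glob stands in for the recursive search.

```shell
tmp=$(mktemp -d)
printf 'x = {1, 2};\n' > "$tmp/a.m"   # contains "};" and should be edited
printf 'y = 3;\n' > "$tmp/b.m"        # no "};", should be left alone

# grep -l lists only the files that contain "};"; xargs hands them to sed.
# sed -i.bak edits each file in place and keeps the original as <name>.bak.
grep -l '};' "$tmp"/*.m | xargs sed -i.bak -e 's/};/}/'

cat "$tmp/a.m"        # edited copy
cat "$tmp/a.m.bak"    # untouched backup
```

Because grep -l filters the file list first, sed never touches b.m and no b.m.bak is created.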
After programmatically munging a text file, detect unexpectedly changed lines that don't match a certain pattern
If every expected change contains string1 or string2 and the unexpected ones contain neither, use diff to find the changesets:
$ diff file1.txt file2.txt > diff12.txt
and find the ones that have neither string1 nor string2:
$ grep -vE -e 'string1|string2|^---$|^[[:digit:]]+(,[[:digit:]]+)*c[[:digit:]]+(,[[:digit:]]+)*$' diff12.txt
If you only want to count how many such lines there are, use -vEc instead of -vE.
Remarks:
- -c: count but don't echo the matching lines
- -v: boolean negate the test
- -E: use POSIX extended regular expression
- ^---$|^[[:digit:]]+(,[[:digit:]]+)*c[[:digit:]]+(,[[:digit:]]+)*$ are used to match the lines "---" and "5918,5925c5918,5925" that diff generates for diff file1.txt file2.txt
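As a worked illustration, suppose the munging step was meant to replace string1 with string2 but also altered another line by accident. The sketch below builds two such files (their contents are made up) and filters the diff. It assumes the default "normal" diff output format.

```shell
tmp=$(mktemp -d)
printf 'color = string1\nsize = 10\nname = x\n' > "$tmp/file1.txt"
printf 'color = string2\nsize = 11\nname = x\n' > "$tmp/file2.txt"

# diff exits with status 1 when the files differ, so ignore its exit status.
diff "$tmp/file1.txt" "$tmp/file2.txt" > "$tmp/diff12.txt" || true

# Drop expected changes (string1/string2), "---" separators and hunk headers.
pattern='string1|string2|^---$|^[[:digit:]]+(,[[:digit:]]+)*c[[:digit:]]+(,[[:digit:]]+)*$'
unexpected=$(grep -vE -e "$pattern" "$tmp/diff12.txt")
count=$(grep -vEc -e "$pattern" "$tmp/diff12.txt")

printf '%s\n' "$unexpected"
printf 'unexpected lines: %s\n' "$count"
```

Only the two "size" lines (old and new side) survive the filter, which is the accidental change.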
Finding files that do not contain a given string
Find all files:
$ find . -type f -name "*.txt" > /tmp/all.txt
Find files that contain the string foobar:
$ ack -l foobar > /tmp/positive.txt
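Find the complement using grep:
$ grep -v -x -F -f /tmp/positive.txt /tmp/all.txt
Both lists must spell each path the same way (for example, both with or both without a leading ./), otherwise no line in /tmp/all.txt is filtered out.
Remarks:
- -v inverts the matching, selecting non-matching lines;
- -x matches only entire lines;
- -F treats each pattern as a fixed string, so the dots in file names are not regex wildcards;
- -f <file name> takes patterns from the file.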
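A one-step alternative, swapped in here in place of the three-step recipe, is grep's -L (--files-without-match) option, which prints the names of files that have no matching line. The sketch below runs it through find on invented sample files; it is not the ack-based method described above.

```shell
tmp=$(mktemp -d)
printf 'has foobar here\n' > "$tmp/a.txt"
printf 'nothing relevant\n' > "$tmp/b.txt"
printf 'foobar too\n' > "$tmp/c.log"   # not a .txt file, so find skips it

# find selects the candidate files; grep -L prints those without "foobar".
missing=$(cd "$tmp" && find . -type f -name '*.txt' -exec grep -L foobar {} +)
printf '%s\n' "$missing"
```

This avoids the temporary lists, and with them the need to keep path spellings consistent between two tools.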
Extract parts related to the -a option from man page of git
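git help commit | grep -B 1 -A 10 -e "-a" | more
Remarks:
- -e "-a" passes -a as the pattern, so grep does not read it as an option
- -B 1 prints one line of context before each match
- -A 10 prints ten lines of context after each match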
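The context options can be tried without git installed. The sketch below uses a short, made-up help text in place of a real man page and a smaller -A value so the effect is easy to see.

```shell
tmp=$(mktemp -d)
# Hypothetical help text standing in for a man page.
printf '%s\n' 'OPTIONS' \
  '  -a, --all' \
  '      Stage all modified files.' \
  '  -m <msg>' \
  '      Use the given message.' > "$tmp/help.txt"

# -e marks "-a" as the pattern so grep does not parse it as an option.
# -B 1 and -A 1 print one line of context before and after each match.
context=$(grep -B 1 -A 1 -e '-a' "$tmp/help.txt")
printf '%s\n' "$context"
```

The output is the matching option line together with the heading above it and the description below it.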