Find lines in (multiple) files that contain the given line in an input file

Question

If I have a number of files like the following:

file1:

123
456
789
012

file2:

line1 922
line2 392
line3 456
line5 291
line6 201
...

file3:

line1 111
line2 123
line3 19
line5 542
line6 456
...

What's the best way to get all of the lines in file1 that are contained in a line of both file2 and file3?

In this example, it would be just:

456

are lines just these numbers or is there anything more in them? — FelixJN, Aug 19, 2015 at 8:56
@Fiximan it's in this format, but with longer numbers and different text instead of line1, etc. — galois, Aug 19, 2015 at 21:43

gogoud · Accepted Answer · 2015-08-19 06:17:09Z


grep -of file1 file2|xargs -I {} grep -o "{}" file3

The first grep uses each line of file1 as a pattern (-f file1) and searches file2, printing only the matched text (-o), if any. xargs then runs a second grep against file3 for each of those results, again printing only the matched text.
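To try the pipeline on the sample data from the question, something like the sketch below should do. The function name common_matches and the temporary directory are made up for illustration. The -F flag (treat each pattern as a literal string rather than a regex) and the sort -u steps are additions that are not in the command above, and the second stage reads its patterns with -f - instead of going through xargs.

```shell
# Hypothetical helper: print strings from $1 that occur in lines of both $2 and $3.
# -o prints only the matched text; -F (an addition) disables regex interpretation.
common_matches() {
    grep -oFf "$1" "$2" | sort -u | grep -oFf - "$3" | sort -u
}

# Sample data from the question, one entry per line.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
printf '%s\n' 123 456 789 012 > "$work/file1"
printf '%s\n' 'line1 922' 'line2 392' 'line3 456' 'line5 291' 'line6 201' > "$work/file2"
printf '%s\n' 'line1 111' 'line2 123' 'line3 19' 'line5 542' 'line6 456' > "$work/file3"

common_matches "$work/file1" "$work/file2" "$work/file3"
```

On the sample files this prints 456. If the first grep finds nothing, the second one receives an empty pattern list and prints nothing.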


Follow-up: is there a way to make this ignore substrings in the first part of a file2 line that happen to match the file1 line being searched for? For example, if file2 had 1000 entries and we wanted to find all occurrences of 400, it would show up at least once in file2, in the first column (line400). Would there be a way to exclude that? Maybe with a regex, to make sure the text being matched comes after the tab/space?
– galois
Aug 19, 2015 at 9:23
Yes, by pre-processing the contents of file1 with sed to put a space in front of each line: sed 's/^/ /' file1|grep -of - file2|xargs -I {} grep -o "{}" file3 or, slightly more elegantly: grep -o "$(sed 's/^/ /' file1|grep -of - file2)" file3. This will now match ' 400' but not just '400' (as in 'line400').
– gogoud
Aug 19, 2015 at 13:31
That's a nice idea - but it unfortunately returns nothing in my terminal
– galois
Aug 21, 2015 at 4:08
It works perfectly for me with sed (GNU sed) 4.2.2 and grep (GNU grep) 2.20. Check for typos, especially that the sed expression is 's/^/ /'. The way stackexchange shows the code (breaking the line at the space) may have misled you.
– gogoud
Aug 22, 2015 at 5:29
I think the problem was that in file2 the columns are separated by a tab instead of a space. sed 's/^/\t/' seems to return output that is at least sort of correct. Thanks for the idea.
– galois
Aug 22, 2015 at 20:31
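For the tab-versus-space problem discussed in these comments, a field-based comparison avoids the separator question altogether. The sketch below swaps grep for awk (which is not what the answer uses) and assumes that the value of interest is always the second whitespace-separated column of file2 and file3 and that no input file is empty. The name common_fields is hypothetical.

```shell
# Hypothetical helper: print each line of $1 that equals the second
# whitespace-separated field of some line in $2 and of some line in $3.
# Default awk field splitting treats tabs and spaces alike.
common_fields() {
    awk '
        FNR == 1 { nfile++ }                                # a new input file starts
        nfile == 1 { want[$0] = 1; next }                   # first file: remember every line
        nfile == 2 { if ($2 in want) in2[$2] = 1; next }    # second file: keep wanted values
        nfile == 3 && ($2 in in2) && !seen[$2]++ { print $2 }   # third file: print each once
    ' "$1" "$2" "$3"
}

# Tab-separated demo data: 123 only occurs inside "line123" in the second file.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
printf '%s\n' 123 456 > "$demo/file1"
printf 'line123\t999\nline3\t456\n' > "$demo/file2"
printf 'line6\t456\nline2\t123\n' > "$demo/file3"

common_fields "$demo/file1" "$demo/file2" "$demo/file3"
```

Here only 456 is printed, because the comparison is on the whole second field and so the 123 inside line123 does not count as a match.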


chaos · Accepted Answer · 2015-08-19 06:15:48Z

You could use join twice in a row:

join -1 1 -2 2 -o 1.1 <(join -1 1 -2 2 <(sort file1) <(sort -k2 file2)) <(sort -k2 file3)

For the sample files, this prints only:

456

First look at the inner join. It joins file1 and file2 using field 1 of file1 and field 2 of file2.

The result is then joined again with file3. Note that the inputs must be sorted on their join fields (hence sort -k2).
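If your shell has no process substitution, the same two joins can be spelled out with intermediate files, which also makes the inner result easy to inspect. This is only a sketch: the function name common_join and the temporary file names are made up, and forcing LC_ALL=C plus the -k 2b,2 key form are assumptions added so that sort and join agree on ordering.

```shell
# Hypothetical helper: the two-step join with explicit intermediate files.
common_join() {
    tmpd=$(mktemp -d) || return 1
    # Sort every input on its join field, using the same collation as join.
    LC_ALL=C sort -k 1b,1 "$1" > "$tmpd/f1"
    LC_ALL=C sort -k 2b,2 "$2" > "$tmpd/f2"
    LC_ALL=C sort -k 2b,2 "$3" > "$tmpd/f3"
    # Inner join: field 1 of file1 against field 2 of file2.
    # Output lines start with the join field, so they stay sorted on field 1.
    LC_ALL=C join -1 1 -2 2 "$tmpd/f1" "$tmpd/f2" > "$tmpd/step1"
    # Outer join: field 1 of that result against field 2 of file3;
    # -o 1.1 prints only the join field itself.
    LC_ALL=C join -1 1 -2 2 -o 1.1 "$tmpd/step1" "$tmpd/f3"
    rm -rf "$tmpd"
}

# Sample data from the question.
data=$(mktemp -d)
trap 'rm -rf "$data"' EXIT
printf '%s\n' 123 456 789 012 > "$data/file1"
printf '%s\n' 'line1 922' 'line2 392' 'line3 456' 'line5 291' 'line6 201' > "$data/file2"
printf '%s\n' 'line1 111' 'line2 123' 'line3 19' 'line5 542' 'line6 456' > "$data/file3"

common_join "$data/file1" "$data/file2" "$data/file3"
```

With the sample data the intermediate step1 file holds the single line "456 line3", and the final output is 456.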

mikeserv · Accepted Answer · 2015-10-16 07:48:44Z

also(){ sed 'h;                 #save a copy of the line before edits
    s/[]$\./*^[]/\\&/g;         #literally quote any metachars
    s|.*|/&/c\\|p;              #print first half of sed command
    g;                          #get original copy out of hold space
    s/\\/&&/g;' |               #double-up backslashes
    sed -nf - -- "$@"           #read stdin script -file
}

That function takes a pattern file as stdin and one or more search files as arguments. It writes to its output any line from its pattern file which can be matched in its search files. It is careful to reproduce the original line exactly each time, and because it does, you can chain it: the output of one call is a valid pattern file for the next.

also <file1 file2 | also file3
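For comparison, here is a rough awk sketch with the same calling convention (patterns on stdin, search files as arguments). It is not the sed technique above: it tests for literal substrings with index(), prints each matching pattern line once in pattern-file order, and assumes stdin contains at least one line. The name also_literal is hypothetical.

```shell
# Hypothetical awk-based variant: patterns on stdin, search files as arguments.
also_literal() {
    awk '
        NR == FNR { pat[++n] = $0; next }    # first input (stdin): collect pattern lines
        {
            for (i = 1; i <= n; i++)
                if (index($0, pat[i])) hit[i] = 1    # literal substring test, no regex
        }
        END { for (i = 1; i <= n; i++) if (i in hit) print pat[i] }
    ' - "$@"
}

# Sample data from the question.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
printf '%s\n' 123 456 789 012 > "$dir/file1"
printf '%s\n' 'line1 922' 'line2 392' 'line3 456' 'line5 291' 'line6 201' > "$dir/file2"
printf '%s\n' 'line1 111' 'line2 123' 'line3 19' 'line5 542' 'line6 456' > "$dir/file3"

also_literal "$dir/file2" < "$dir/file1" | also_literal "$dir/file3"
```

Chained this way on the sample files it prints 456. Like the grep answers, this is a substring match, so 400 would still match line400.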

Shravan Yadav · Accepted Answer · 2015-08-19 06:22:26Z

grep should be enough for this:

 grep -o "`grep -of file1 file2`" file3

The inner grep, grep -of file1 file2, prints the text in file2 that matches a pattern from file1. That output is then used as the pattern list for searching file3.

this doesn't work as written because it needs the -o grep option thus: grep -o "$(grep -of file1 file2)" file3 — gogoud, Aug 19, 2015 at 6:19
