Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

linux - Find rows with the same value in a column in two files

I have two files (one has millions of rows):

File1.txt, ~4k rows

some_key1 some_text1
some_key2 some_text2
...
some_keyn some_textn

File2.txt, ~20 M rows

some_key11 some_key11 some_text1
some_key22 some_key22 some_text2
...
some_keynn some_keynn some_textn

When there is an exact match between column 2 in File1.txt and column 3 in File2.txt, I want to print out the particular rows from both files.

EDIT

I tried this (I forgot to include it earlier), but it doesn't work:

awk 'NR{a[$2]}==FNR{b[$3]}'$1 in a{print $1}' file1.txt file2.txt

1 Answer


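You need to fix your awk program: the single quotes are unbalanced, and the NR==FNR test has to come before the action block rather than being split around it.

To print all records in file2 where field 3 (file2) matches field 2 of some record in file1: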

awk 'NR==FNR{A[$2];next}$3 in A' file1.txt file2.txt
some_key11 some_key11 some_text1
some_key22 some_key22 some_text2
...
some_keynn some_keynn some_textn
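
For anyone unsure how the one-liner above works, here is the same program spread over several lines with comments. It is only a sketch: the two small files it creates are made-up sample data (not the asker's real files), and the variable names dir and matches are just for illustration.

```shell
# Hypothetical sample data standing in for file1.txt and file2.txt
dir=$(mktemp -d)
printf '%s\n' 'k1 apple' 'k2 banana' > "$dir/file1.txt"
printf '%s\n' 'a1 b1 banana' 'a2 b2 durian' > "$dir/file2.txt"

matches=$(awk '
    # NR counts lines across all inputs, FNR restarts at 1 for each file,
    # so NR == FNR is true only while the first file is being read.
    NR == FNR { A[$2]; next }   # referencing A[$2] creates the key
    # Only file2 lines reach this point; a pattern with no action prints the line.
    $3 in A
' "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$matches"
rm -rf "$dir"
```

Only the keys of the small file are held in memory, so the 20 M row file is streamed once. One caveat: if file1 is empty, NR==FNR is also true while file2 is read, so nothing is printed.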

To print just field 1 of file2 where field 3 (file2) matches field 2 of some record in file1:

awk 'NR==FNR{A[$2];next}$3 in A{ print $1 }' file1.txt file2.txt
some_key11
some_key22
...
some_keynn
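
The question asks for the matching rows from both files, while the commands above print only file2. One possible variation, sketched under the assumption that the values in column 2 of file1 are unique, is to store the whole file1 row in the array and print it next to the file2 row. The sample data and the names tmp, row and result are hypothetical.

```shell
# Hypothetical sample data, created in a temp directory
tmp=$(mktemp -d)
printf '%s\n' 'k1 apple' 'k2 banana' 'k3 cherry' > "$tmp/file1.txt"
printf '%s\n' 'a1 b1 banana' 'a2 b2 durian' 'a3 b3 apple' > "$tmp/file2.txt"

result=$(awk '
    # First file: remember each whole row, keyed by its column 2.
    NR == FNR { row[$2] = $0; next }
    # Second file: when column 3 is a stored key, print the file1 row, then the file2 row.
    $3 in row { print row[$3]; print $0 }
' "$tmp/file1.txt" "$tmp/file2.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

If a value appears more than once in column 2 of file1, only the last such row is kept by this sketch; the output follows the order of file2.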


...