How to join two files based on one column in AWK
Tag : awk , By : user134570
Date : March 29 2020, 07:55 AM
I have two files, both with millions of records (not the same number of records in each). Try this, which loads column 2 of f2 into an array keyed by column 1, then prints each f1 record whose key was seen, with the stored value appended:
awk -F"," 'BEGIN{OFS=","} {if (NR==FNR) {a[$1]=$2; next} if ($1 in a) {print $1, $2, $3, a[$1]}}' f2 f1
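A minimal sketch of the same NR==FNR lookup on tiny sample data, in case it helps to see it end to end. The file contents and the `dir` and `joined` variable names are made up for illustration; only the f1/f2 roles come from the answer above.

```shell
# Hypothetical sample data: f1 has three columns, f2 has two.
dir=$(mktemp -d)
printf '%s\n' '1,alice,paris' '2,bob,rome' '3,carol,oslo' > "$dir/f1"
printf '%s\n' '1,red' '3,blue' '4,green' > "$dir/f2"

joined=$(awk -F, '
BEGIN { OFS = "," }
# First file on the command line (f2): remember column 2 under the key in column 1.
NR == FNR { a[$1] = $2; next }
# Second file (f1): print only records whose key was seen in f2.
$1 in a { print $1, $2, $3, a[$1] }
' "$dir/f2" "$dir/f1")

printf '%s\n' "$joined"
```

Only the smaller lookup side (f2 here) is held in memory; f1 is streamed, which matters when both files have millions of records.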
|
How to join two files based on partial string match of a column in awk
Date : March 29 2020, 07:55 AM
This assumes that at most one of the comma-separated values in any given $2 of fileA appears in fileB:
$ cat tst.awk
BEGIN { FS=OFS="\t" }
NR==FNR { fileB[$4] = $1 OFS $2; next }
{
    tail = ""
    split($2,fileA,/,/)
    for (i in fileA) {
        if (fileA[i] in fileB) {
            tail = OFS fileB[fileA[i]]
        }
    }
    print $0 tail
}
$ awk -f tst.awk fileB fileA
chr1 123,aa aa b c d xxxx abcd
chr1 234,dd a b c d
chr1 af,345,e aa b c d yyyy defg
chr1 456 a b c d
If $4 of fileB can itself hold a comma-separated list, split it as well:
BEGIN { FS=OFS="\t" }
NR==FNR {
    split($4,b,/,/)
    for (i in b) {
        fileB[b[i]] = $1 OFS $2
    }
    next
}
{
    tail = ""
    split($2,a,/,/)
    for (i in a) {
        if (a[i] in fileB) {
            tail = OFS fileB[a[i]]
        }
    }
    print $0 tail
}
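For reference, here is a small self-contained run of the list-splitting approach. The tab-separated input rows are hypothetical (the original input files are not shown), and the `dir` and `matched` names are only for this sketch.

```shell
# Hypothetical tab-separated inputs: $2 of fileA and $4 of fileB hold comma lists.
dir=$(mktemp -d)
printf 'chr1\t123,aa\ta\tb\n'  > "$dir/fileA"
printf 'chr1\t234,dd\ta\tb\n' >> "$dir/fileA"
printf 'xxxx\tabcd\tq\t123,zz\n' > "$dir/fileB"

matched=$(awk '
BEGIN { FS = OFS = "\t" }
# fileB: index every element of the comma list in $4.
NR == FNR {
    n = split($4, b, /,/)
    for (i = 1; i <= n; i++) fileB[b[i]] = $1 OFS $2
    next
}
# fileA: append the fileB columns if any element of $2 was indexed.
{
    tail = ""
    n = split($2, a, /,/)
    for (i = 1; i <= n; i++)
        if (a[i] in fileB) tail = OFS fileB[a[i]]
    print $0 tail
}' "$dir/fileB" "$dir/fileA")

printf '%s\n' "$matched"
```

The first fileA row shares the element 123 with fileB and gets the two extra columns; the second row has no shared element and is printed unchanged.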
|
Combining 2 csv files based on same column using join
Tag : bash , By : cameron
Date : March 29 2020, 07:55 AM
Do the files have Windows line endings (\r\n)? If so, try converting them with dos2unix file_1.csv and dos2unix file_2.csv before running join.
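A rough sketch of why the carriage return matters, assuming the join key is the last column of file_1.csv and the first column of file_2.csv (the column layout and file contents are invented here, since the original command is not shown). It uses tr -d '\r' as a stand-in where dos2unix is not installed.

```shell
dir=$(mktemp -d)
# Hypothetical CRLF files: the key is column 2 of file_1.csv, column 1 of file_2.csv.
printf 'a,k1\r\nb,k2\r\n' > "$dir/file_1.csv"
printf 'k1,x\r\nk2,y\r\n' > "$dir/file_2.csv"

# With CRLF endings the key in file_1.csv is really "k1\r", so nothing matches.
before=$(LC_ALL=C join -t, -1 2 -2 1 "$dir/file_1.csv" "$dir/file_2.csv")

# Strip the carriage returns, then join again.
tr -d '\r' < "$dir/file_1.csv" > "$dir/file_1.unix.csv"
tr -d '\r' < "$dir/file_2.csv" > "$dir/file_2.unix.csv"
after=$(LC_ALL=C join -t, -1 2 -2 1 "$dir/file_1.unix.csv" "$dir/file_2.unix.csv")

printf 'before: [%s]\n' "$before"
printf 'after:\n%s\n' "$after"
```

Note that join also expects both inputs to be sorted on the join field; the sample rows above are already in order.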
|
How to join two files based on one column in AWK (using wildcards)
Tag : awk , By : mobi phil
Date : March 29 2020, 07:55 AM
I have 2 files, and I need to compare column 2 from file 2 with column 3 from file 1. A hacky solution:
$ awk -F'","' 'NR==FNR {n=split($NF,x,"-"); for(i=2;i<n;i++) a[x[i]]=$1 FS $2; next}
$2 in a {print a[$2] "\"," $0}' file1 file2
"testserver1","testserver1.domain.net","windows","10.10.10.10","datacenter1"
"testserver2","testserver2.domain.net","linux","2.2.2.2","datacenter2"
|
Merge multiple files and split output to multiple files based in each column (post 2)
Tag : awk , By : Tom Berthon
Date : March 29 2020, 07:55 AM
With GNU awk for ENDFILE and automatic handling of multiple open files, and assuming that your posted sample output, which shows file3 and file4 each having more fields than file1 and file2, is a mistake:
$ cat tst.awk
BEGIN { FS=OFS=","; numHdrFlds=3 }
FNR <= numHdrFlds {
    gsub(/[^0-9]/,"")
    hdr = (FNR==1 ? "" : hdr OFS) $0
    next
}
{
    for (i=1; i<=NF; i++) {
        data[i] = (FNR==(numHdrFlds+1) ? "" : data[i] OFS) ($i)+0
    }
}
ENDFILE {
    for ( fileNr=1; fileNr<=NF; fileNr++ ) {
        print hdr, data[fileNr] > ("outputFile" fileNr)
    }
}
$ awk -f tst.awk file1 file2
$ for i in outputFile*; do echo "$i"; cat "$i"; echo "---"; done
outputFile1
6174,15,3,1.6,1.7,1.8
6176,17,5,1.6,1.5,1.3
---
outputFile2
6174,15,3,19.5,23.2,26.5
6176,17,5,18.6,23.5,26.8
---
outputFile3
6174,15,3,0,28.3,27
6176,17,5,0,19.7,19.2
---
outputFile4
6174,15,3,0,27,25.4
6176,17,5,0,19.2,18.5
---
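If GNU awk is not available, the same idea can be approximated without ENDFILE by writing the previous file's columns when the next file starts (FNR==1 && NR>1) and once more in END. This is a sketch of that substitute technique, not the answer's script; the two-column input files, the write_columns function, and the nf variable are assumptions made for the example.

```shell
dir=$(mktemp -d)
# Hypothetical inputs: three header lines, then two data columns.
printf '%s\n' 'id,6174' 'age,15' 'level,3' '1.6,19.5' '1.7,23.2' '1.8,26.5' > "$dir/file1"
printf '%s\n' 'id,6176' 'age,17' 'level,5' '1.6,18.6' '1.5,23.5' '1.3,26.8' > "$dir/file2"

(cd "$dir" && awk '
BEGIN { FS = OFS = ","; numHdrFlds = 3 }
# Write one line per data column, each to its own output file.
function write_columns(   c) {
    for (c = 1; c <= nf; c++)
        print hdr, data[c] > ("outputFile" c)
}
# A new input file has started: write out what the previous one collected.
FNR == 1 && NR > 1 { write_columns() }
FNR <= numHdrFlds {
    gsub(/[^0-9]/, "")
    hdr = (FNR == 1 ? "" : hdr OFS) $0
    next
}
{
    nf = NF   # remember the column count for write_columns()
    for (i = 1; i <= NF; i++)
        data[i] = (FNR == numHdrFlds + 1 ? "" : data[i] OFS) ($i) + 0
}
END { write_columns() }
' file1 file2)

out1=$(cat "$dir/outputFile1")
out2=$(cat "$dir/outputFile2")
printf '%s\n---\n%s\n' "$out1" "$out2"
```

Unlike gawk, some awk implementations limit the number of simultaneously open output files, so inputs with many columns may need an explicit close() after each print, together with ">>" instead of ">" so that later writes append rather than truncate.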
|