
How to join multiple txt files based on a column?


Tag : linux , By : CSCI GOIN KILL ME
Date : November 24 2020, 05:44 AM



How to join two files based on one column in AWK


Tag : awk , By : user134570
Date : March 29 2020, 07:55 AM
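I have two files, both with millions of records (not the same number of records in each). Try this, which loads f2 into an array keyed on column 1 and then prints each f1 line whose first column appears in that array, with the matching f2 value appended: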
awk -F"," 'BEGIN{OFS=","} {if (NR==FNR) {a[$1]=$2; next} if ($1 in a) {print $1, $2, $3, a[$1]}}' f2 f1
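
To see what the one-liner does, here is a minimal sketch on two tiny comma-separated files. The file contents and the temporary directory are made up for illustration; only the names f1 and f2 and the awk program itself come from the answer above.

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)

# f1: key plus two data columns.
printf '%s\n' 'k1,a,b' 'k2,c,d' 'k3,e,f' > "$tmp/f1"
# f2: key plus the value to append (k2 is missing on purpose).
printf '%s\n' 'k1,X' 'k3,Z' 'k9,Q' > "$tmp/f2"

# While reading the first file (NR==FNR) store f2's column 2 under its key;
# while reading f1, print only the rows whose key was seen in f2.
joined=$(awk -F"," 'BEGIN{OFS=","} {if (NR==FNR) {a[$1]=$2; next} if ($1 in a) {print $1, $2, $3, a[$1]}}' "$tmp/f2" "$tmp/f1")

printf '%s\n' "$joined"
# prints:
# k1,a,b,X
# k3,e,f,Z
```

Rows of f1 with no partner in f2 (k2 here) are dropped, so this behaves like an inner join. Because f2 is held in memory, it probably makes sense to pass the smaller file first when both are large.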

How to join two files based on partial string match of a column in awk


Tag : awk , By : oiyto
Date : March 29 2020, 07:55 AM
This assumes that at most one of the comma-separated values in any given $2 of fileA appears in fileB:
$ cat tst.awk
BEGIN { FS=OFS="\t" }
NR==FNR { fileB[$4] = $1 OFS $2; next }
{
    tail = ""
    split($2,fileA,/,/)
    for (i in fileA) {
        if (fileA[i] in fileB) {
            tail = OFS fileB[fileA[i]]
        }
    }
    print $0 tail
}

$ awk -f tst.awk fileB fileA
chr1    123,aa  aa      b       c       d       xxxx    abcd
chr1    234,dd  a       b       c       d
chr1    af,345,e        aa      b       c       d       yyyy    defg
chr1    456     a       b       c       d

If $4 of fileB can also hold a comma-separated list, split it too when building the lookup:
BEGIN { FS=OFS="\t" }
NR==FNR {
    split($4,b,/,/)
    for (i in b) {
        fileB[b[i]] = $1 OFS $2
    }
    next
}
{
    tail = ""
    split($2,a,/,/)
    for (i in a) {
        if (a[i] in fileB) {
            tail = OFS fileB[a[i]]
        }
    }
    print $0 tail
}
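
The last script above, the one that also splits $4 of fileB, can be tried out as follows. This is only a sketch: the contents of fileA and fileB below are invented (the original inputs are not shown), and the scratch directory is an assumption of the example.

```shell
# Hypothetical tab-separated inputs in a scratch directory.
tmp=$(mktemp -d)

# fileB: $1 and $2 are appended on a match; $4 holds one or more comma-separated keys.
printf 'xxxx\tabcd\tq\t123,zz\n' >  "$tmp/fileB"
printf 'yyyy\tdefg\tq\t345\n'    >> "$tmp/fileB"

# fileA: $2 holds the comma-separated values to look up.
printf 'chr1\t123,aa\taa\n'   >  "$tmp/fileA"
printf 'chr1\t234,dd\ta\n'    >> "$tmp/fileA"
printf 'chr1\taf,345,e\taa\n' >> "$tmp/fileA"

# Same program as the script above, saved to a file.
cat > "$tmp/tst.awk" <<'EOF'
BEGIN { FS=OFS="\t" }
NR==FNR {
    split($4,b,/,/)               # every key listed in fileB's $4
    for (i in b) {
        fileB[b[i]] = $1 OFS $2   # maps to that row's first two fields
    }
    next
}
{
    tail = ""
    split($2,a,/,/)               # every value listed in fileA's $2
    for (i in a) {
        if (a[i] in fileB) {
            tail = OFS fileB[a[i]]
        }
    }
    print $0 tail
}
EOF

out=$(awk -f "$tmp/tst.awk" "$tmp/fileB" "$tmp/fileA")
printf '%s\n' "$out"
```

With this data the first and third lines of fileA gain two extra columns (xxxx abcd and yyyy defg) and the second is printed unchanged, since neither 234 nor dd occurs in fileB.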

Combining 2 csv files based on same column using join


Tag : bash , By : cameron
Date : March 29 2020, 07:55 AM
Do the files have Windows line endings (\r\n)?
If so, try converting them first with dos2unix file_1.csv and dos2unix file_2.csv.
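
A trailing \r becomes part of the last field on each line, which can garble the joined output or stop keys from matching. The sketch below shows the idea with made-up two-column files; it uses tr -d '\r' in place of dos2unix (in case dos2unix is not installed), and the *.unix.csv names are assumptions of the example.

```shell
# Hypothetical CSVs with CRLF line endings, already sorted on column 1.
tmp=$(mktemp -d)
printf 'id1,apple\r\nid2,pear\r\n' > "$tmp/file_1.csv"
printf 'id1,red\r\nid2,green\r\n'  > "$tmp/file_2.csv"

# Strip the carriage returns into cleaned copies (roughly what dos2unix does in place).
tr -d '\r' < "$tmp/file_1.csv" > "$tmp/file_1.unix.csv"
tr -d '\r' < "$tmp/file_2.csv" > "$tmp/file_2.unix.csv"

# Join on the first comma-separated field; join expects both inputs sorted on it.
joined=$(join -t, "$tmp/file_1.unix.csv" "$tmp/file_2.unix.csv")

printf '%s\n' "$joined"
# prints:
# id1,apple,red
# id2,pear,green
```

If the real files are not sorted on the join column, they would need to go through sort first.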

How to join two files based on one column in AWK (using wildcards)


Tag : awk , By : mobi phil
Date : March 29 2020, 07:55 AM
I have 2 files, and I need to compare column 2 from file 2 with column 3 from file 1. A hacky approach:
$ awk -F'","' 'NR==FNR {n=split($NF,x,"-"); for(i=2;i<n;i++) a[x[i]]=$1 FS $2; next} 
               $2 in a {print a[$2] "\"," $0}' file1 file2

"testserver1","testserver1.domain.net","windows","10.10.10.10","datacenter1"
"testserver2","testserver2.domain.net","linux","2.2.2.2","datacenter2"

Merge multiple files and split output to multiple files based in each column (post 2)


Tag : awk , By : Tom Berthon
Date : March 29 2020, 07:55 AM
With GNU awk for ENDFILE and automatic handling of multiple open files, and assuming that your posted sample output, which shows file3 and file4 each having more fields than file1 and file2, is a mistake:
$ cat tst.awk
BEGIN { FS=OFS=","; numHdrFlds=3 }
FNR <= numHdrFlds {
    gsub(/[^0-9]/,"")
    hdr = (FNR==1 ? "" : hdr OFS) $0
    next
}
{
    for (i=1; i<=NF; i++) {
        data[i] = (FNR==(numHdrFlds+1) ? "" : data[i] OFS) ($i)+0
    }
}
ENDFILE {
    for ( fileNr=1; fileNr<=NF; fileNr++ ) {
        print hdr, data[fileNr] > ("outputFile" fileNr)
    }
}
$ awk -f tst.awk file1 file2

$ for i in outputFile*; do echo "$i"; cat "$i"; echo "---"; done
outputFile1
6174,15,3,1.6,1.7,1.8
6176,17,5,1.6,1.5,1.3
---
outputFile2
6174,15,3,19.5,23.2,26.5
6176,17,5,18.6,23.5,26.8
---
outputFile3
6174,15,3,0,28.3,27
6176,17,5,0,19.7,19.2
---
outputFile4
6174,15,3,0,27,25.4
6176,17,5,0,19.2,18.5
---