Is there an R function for comparing rows in data.frame?
Tag : r , By : gbodunski
Date : January 12 2021, 09:11 PM

I summarized a dataset and want to compare its rows under certain conditions. What functions can I use?
Since we have the same number of rows for "Africa" and "Europe" (and assuming both groups are in the same year order), we can compare them element by element:
unique(data$year[data$total_pop[data$continent == "Africa"] > 
       data$total_pop[data$continent == "Europe"]])
#[1] 1987 1992 1997 2002 2007
Or, more readably, subset each continent first and then compare:
Africa_data <- data[data$continent == "Africa",]
Europe_data <- data[data$continent == "Europe",]
Africa_data$year[Africa_data$total_pop > Europe_data$total_pop]
#[1] 1987 1992 1997 2002 2007
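Both snippets above depend on the Africa and Europe rows being in the same order. A minimal sketch that aligns the two groups by year with merge() instead is given below; the data frame and its values are made up for illustration, since the original data object is not shown.

```r
# Hypothetical data; the original `data` object is not shown in the source.
data <- data.frame(
  continent = rep(c("Africa", "Europe"), each = 3),
  year      = rep(c(1982, 1987, 1992), times = 2),
  total_pop = c(500, 600, 700, 550, 580, 600)
)

africa <- data[data$continent == "Africa", c("year", "total_pop")]
europe <- data[data$continent == "Europe", c("year", "total_pop")]

# Join on year so each Africa row is compared with the matching Europe row,
# regardless of the row order in `data`.
both <- merge(africa, europe, by = "year", suffixes = c("_africa", "_europe"))

# Years in which Africa's total population exceeds Europe's.
years_africa_larger <- both$year[both$total_pop_africa > both$total_pop_europe]
years_africa_larger
```

Because merge() matches on the year column, this version also tolerates a year that is missing for one continent: that year is simply left out of the comparison.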

Comparing two columns in a data frame across many rows

Tag : r , By : adbanginwar
Date : March 29 2020, 07:55 AM
I have a data frame in which I'd like to compare a data point, Genotype, with two references, S288C and SK1. This comparison will be done across many rows (100+) of the data frame.
A nested ifelse should do it (take a look at help(ifelse) for usage); dat below holds the first few lines of the data:
> dat
     Genotype S288C SK1
[1,] "G"      "A"   "G"
[2,] "G"      "A"   "G"
[3,] "C"      "T"   "C"
[4,] "G"      "A"   "G"
[5,] "G"      "G"   "T"
[6,] "G"      "A"   "A"
> ifelse(dat$Genotype==dat$S288C,1,ifelse(dat$Genotype==dat$SK1,0,NA))
[1]  0  0  0  0  1 NA
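The object printed above looks like a character matrix (row labels such as [1,]), and `$` does not work on a matrix. A self-contained sketch that rebuilds the same six rows as a data frame, so the nested ifelse can be run as written, might look like this; the column name `match` is an assumption added for illustration.

```r
# Rebuild the six example rows as a data frame so that `$` indexing works.
dat <- data.frame(
  Genotype = c("G", "G", "C", "G", "G", "G"),
  S288C    = c("A", "A", "T", "A", "G", "A"),
  SK1      = c("G", "G", "C", "G", "T", "A"),
  stringsAsFactors = FALSE
)

# 1 = Genotype matches S288C, 0 = matches SK1, NA = matches neither.
# `match` is a hypothetical column name, not part of the original data.
dat$match <- ifelse(dat$Genotype == dat$S288C, 1,
                    ifelse(dat$Genotype == dat$SK1, 0, NA))
dat$match
```

If the data really is a matrix, as.data.frame(dat) or indexing by column name, e.g. dat[, "Genotype"], are two ways to get the same comparison working.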

Comparing two data.frames and deleting rows based on NA values in one data.frame

Tag : r , By : Ed.
Date : March 29 2020, 07:55 AM
I have two data frames. One is considered a reference and has every value; the other may or may not be missing values. I want to compare both data frames, then delete the values from the reference data frame that are NA in the other. However, each row of the data frame that can have missing values needs to be treated as a single comparison, so you are developing a unique reference for every single row.
Try:
> ref<-data.frame(var1=c('a','q','z'),var2=c('b','w','x'),var3=c('c','e','n'))
> new<-data.frame(var1=c('p','u',NA,'l'),var2=c('o','y','e','k'),var3=c('i','t','w',NA))
> apply(new,1,function(x) ref[,which(!is.na(x))] )
  var1 var2 var3
1    a    b    c
2    q    w    e
3    z    x    n

  var1 var2 var3
1    a    b    c
2    q    w    e
3    z    x    n

  var2 var3
1    b    c
2    w    e
3    x    n

  var1 var2
1    a    b
2    q    w
3    z    x
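# Variant: also drop the column paired with each NA column
# (odd column i pairs with i + 1, even column i with i - 1).
is.odd <- function(x) x %% 2 == 1
apply(new, 1, function(x) {
    toremove  <- which(is.na(x))
    toremove1 <- sapply(toremove, function(i) ifelse(is.odd(i), i + 1, i - 1))
    ref[, !(1:ncol(ref) %in% c(toremove, toremove1)), drop = FALSE]
})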

Comparing data frame rows containing NAs

Tag : r , By : damomurf
Date : March 29 2020, 07:55 AM
One option is to add an | condition so that rows with NA in 'x' are kept as well:
subset(my.df, x != y | is.na(x))
# keep rows where either 'x' or 'y' is NA
subset(my.df, x != y | is.na(x) | is.na(y))
# same, but drop rows where both 'x' and 'y' are NA
subset(my.df, (x != y | is.na(x) | is.na(y)) & !(is.na(x) & is.na(y)))
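To see how the three filters differ, here is a small sketch on a made-up data frame (my.df is not shown in the source, so the values below are assumptions). Note that subset() silently drops rows where the condition evaluates to NA, which is why the extra is.na() terms matter.

```r
# Hypothetical data; `my.df` is not shown in the source.
my.df <- data.frame(x = c(1, 2, NA, 4, NA),
                    y = c(1, 3, 5, NA, NA))

# Rows that differ, plus rows where x is NA.
r1 <- subset(my.df, x != y | is.na(x))

# Rows that differ, plus rows where x or y is NA.
r2 <- subset(my.df, x != y | is.na(x) | is.na(y))

# As r2, but rows where both x and y are NA are excluded.
r3 <- subset(my.df, (x != y | is.na(x) | is.na(y)) & !(is.na(x) & is.na(y)))

rownames(r1)  # "2" "3" "5"
rownames(r2)  # "2" "3" "4" "5"
rownames(r3)  # "2" "3" "4"
```

Row 4 (x = 4, y = NA) is missing from r1 because x != y is NA there and is.na(x) is FALSE, so the whole condition is NA.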

Comparing one value across multiple rows in one data frame with values across multiple rows in a second data frame

Tag : r , By : xguru
Date : March 29 2020, 07:55 AM
Here's an answer with dplyr:

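library(dplyr)
library(tibble)

# Positions to look up
df1 <- tribble(
     ~CHR, ~POS,
     1,  2000,
     1,  3000,
     2,  1500,
     3,  3000
)

# Ranges per chromosome (column names taken from the pipeline below)
df2 <- tribble(
     ~CHR, ~POS_START, ~POS_END,
     1, 1500, 2500,
     1, 3200, 4000,
     2, 1200, 1600,
     2, 2000, 2200,
     3, 5000, 5500,
     4, 1000, 1200
)

# Pair every position with every range on the same chromosome, then flag
# positions that fall inside at least one range.
# dplyr >= 1.1.0 may warn about a many-to-many join here.
df1 %>%
     left_join(df2, by = 'CHR') %>%
     mutate(IN_RANGE = POS >= POS_START & POS <= POS_END) %>%
     group_by(CHR, POS) %>%
     summarize(IN_RANGE = sum(IN_RANGE) > 0)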
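For comparison, the same range check can be sketched in base R without a join. This is an alternative added for illustration, not the approach from the answer; it assumes the second table has the columns CHR, POS_START and POS_END used in the dplyr pipeline.

```r
# Same example values as above, built as plain data frames.
df1 <- data.frame(CHR = c(1, 1, 2, 3),
                  POS = c(2000, 3000, 1500, 3000))
df2 <- data.frame(CHR       = c(1, 1, 2, 2, 3, 4),
                  POS_START = c(1500, 3200, 1200, 2000, 5000, 1000),
                  POS_END   = c(2500, 4000, 1600, 2200, 5500, 1200))

# For each (CHR, POS) pair, check whether any range on the same chromosome
# contains the position.
df1$IN_RANGE <- mapply(
  function(chr, pos) {
    any(df2$CHR == chr & df2$POS_START <= pos & pos <= df2$POS_END)
  },
  df1$CHR, df1$POS
)
df1$IN_RANGE
```

This scans df2 once per row of df1, which is fine for small tables; for large inputs a join-based approach such as the dplyr one is likely to scale better.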

comparing each row with all other rows in data.frame

Tag : r , By : Tetting
Date : March 29 2020, 07:55 AM
Here is an option using base R that makes use of table and crossprod. Set the lower triangular values of the matrix returned by crossprod to NA, convert it to 'long' format by converting to a data.frame, and then subset the rows that are non-NA in the 'Freq' column:
# data
df <- structure(list(ID = c("ID1", "ID1", "ID1", "ID2", "ID2", "ID2",
"ID3", "ID3", "ID3"), category = c("length", "type", "color",
 "length", "type", "color", "length", "type", "color"),
 value = c("100",
 "L", "Blue", "100", "M", "Blue", "150", "M", "Blue")),
 class = "data.frame", row.names = c(NA, -9L))

out <- with(df, crossprod(table(paste(category, value), ID)))
out[lower.tri(out, diag = TRUE)] <- NA
subset(as.data.frame.table(out), !is.na(Freq))
#    ID ID.1 Freq
#4 ID1  ID2    2
#7 ID1  ID3    1
#8 ID2  ID3    2
