Is there an R function for comparing rows in data.frame?


Tag : r , By : gbodunski
Date : January 12 2021, 09:11 PM

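I summarized a dataset and want to compare rows against each other under a condition. What functions can I use?

Since "Africa" and "Europe" have the same number of rows, we can compare their total_pop values position by position. This relies on both continents listing the same years in the same order; the logical vector is recycled along data$year, which is why unique() is needed:
unique(data$year[data$total_pop[data$continent == "Africa"] > 
       data$total_pop[data$continent == "Europe"]])
#[1] 1987 1992 1997 2002 2007

Splitting the data by continent first makes the same comparison easier to read:
Africa_data <- data[data$continent == "Africa",]
Europe_data <- data[data$continent == "Europe",]
Africa_data$year[Africa_data$total_pop > Europe_data$total_pop]
#[1] 1987 1992 1997 2002 2007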
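
If the row order cannot be trusted, one option is to align the two continents on year before comparing. The following is only a sketch: the data frame and its numbers are invented for illustration, and the column names continent, year and total_pop are taken from the answer above.

```r
# Hypothetical summarized data; the numbers are invented for illustration only.
data <- data.frame(
  continent = rep(c("Africa", "Europe"), each = 3),
  year      = rep(c(1982, 1987, 1992), times = 2),
  total_pop = c(500, 600, 700, 550, 560, 570)
)

# Put the two continents side by side, matched on year rather than row position.
wide <- merge(
  data[data$continent == "Africa", c("year", "total_pop")],
  data[data$continent == "Europe", c("year", "total_pop")],
  by = "year", suffixes = c("_africa", "_europe")
)

# Years in which Africa's total exceeds Europe's.
africa_larger <- wide$year[wide$total_pop_africa > wide$total_pop_europe]
print(africa_larger)
```

Because merge() matches on the year column, a year missing from one continent simply drops out instead of shifting every later comparison by one row.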


Comparing two columns in a data frame across many rows


Tag : r , By : adbanginwar
Date : March 29 2020, 07:55 AM
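I have a data frame in which I'd like to compare a data point, Genotype, with two references, S288C and SK1. The comparison will be done across many rows (100+) of the data frame. The first few lines of the data frame are shown in the output below.

A nested ifelse should do it (see help(ifelse) for usage). The result is 1 where Genotype matches S288C, 0 where it matches SK1, and NA where it matches neither. Note that the printout below has matrix-style row labels ([1,], [2,], ...); if dat is a matrix, convert it first with dat <- as.data.frame(dat, stringsAsFactors = FALSE), because $ does not work on a matrix:
ifelse(dat$Genotype==dat$S288C,1,ifelse(dat$Genotype==dat$SK1,0,NA))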
> dat
     Genotype S288C SK1
[1,] "G"      "A"   "G"
[2,] "G"      "A"   "G"
[3,] "C"      "T"   "C"
[4,] "G"      "A"   "G"
[5,] "G"      "G"   "T"
[6,] "G"      "A"   "A"
> ifelse(dat$Genotype==dat$S288C,1,ifelse(dat$Genotype==dat$SK1,0,NA))
[1]  0  0  0  0  1 NA
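
A minimal, self-contained sketch of the same comparison. It rebuilds the six printed rows as a data frame, which is an assumption about how dat is stored, and the column name match is introduced here purely for illustration.

```r
# Rebuild the six rows shown above as a data frame so that `$` works.
dat <- data.frame(
  Genotype = c("G", "G", "C", "G", "G", "G"),
  S288C    = c("A", "A", "T", "A", "G", "A"),
  SK1      = c("G", "G", "C", "G", "T", "A"),
  stringsAsFactors = FALSE
)

# 1 if Genotype matches S288C, 0 if it matches SK1, NA if it matches neither.
dat$match <- ifelse(dat$Genotype == dat$S288C, 1,
                    ifelse(dat$Genotype == dat$SK1, 0, NA))
print(dat$match)
```

Storing the result as a new column keeps the code next to the rows it describes, which makes the NA cases (no match with either reference) easy to inspect.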

Comparing two data.frames and deleting rows based on NA values in one data.frame


Tag : r , By : Ed.
Date : March 29 2020, 07:55 AM
I have two data frames. One is considered a reference and has every value; the other may or may not be missing values. I want to compare both data frames, then delete from the reference data frame the values whose counterparts are NA in the other. However, each row of the data frame that can have missing values needs to be treated as a single comparison, so a unique reference is developed for every single row. For example the reference dataframe(1):

Try:
> ref<-data.frame(var1=c('a','q','z'),var2=c('b','w','x'),var3=c('c','e','n'))
> new<-data.frame(var1=c('p','u',NA,'l'),var2=c('o','y','e','k'),var3=c('i','t','w',NA))
> apply(new,1,function(x) ref[,which(!is.na(x))] )
[[1]]
  var1 var2 var3
1    a    b    c
2    q    w    e
3    z    x    n

[[2]]
  var1 var2 var3
1    a    b    c
2    q    w    e
3    z    x    n

[[3]]
  var2 var3
1    b    c
2    w    e
3    x    n

[[4]]
  var1 var2
1    a    b
2    q    w
3    z    x
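If dropping a column should also drop its paired neighbour (column 1 with 2, 3 with 4, and so on), extend the selection:
is.odd <- function(x) x %% 2 == 1
apply(new, 1, function(x) {
    # indices of the columns that are NA in this row
    toremove <- which(is.na(x))
    # partner of each NA column: the next column for odd indices, the previous one for even
    toremove1 <- sapply(toremove, function(i) ifelse(is.odd(i), i + 1, i - 1))
    # keep every reference column that is neither NA nor the partner of an NA column
    ref[, !(1:ncol(ref) %in% c(toremove, toremove1)), drop = FALSE]
})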

Comparing data frame rows containing NAs


Tag : r , By : damomurf
Date : March 29 2020, 07:55 AM
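A plain x != y evaluates to NA when either value is missing, and subset() drops those rows. One option is to add an | condition so that rows with NA in 'x' are kept as well:
subset(my.df, x != y | is.na(x))
To also keep rows with NA in 'y':
subset(my.df, x != y | is.na(x)|is.na(y))
To keep rows where the values differ or exactly one of them is NA, but drop rows where both are NA:
subset(my.df, (x != y | is.na(x)|is.na(y)) & !(is.na(x) & is.na(y)))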
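
To see the difference between these conditions, here is a small sketch on invented data. The values in my.df are hypothetical; only the column names x and y come from the answer.

```r
# Hypothetical data; x and y each contain an NA, and row 4 has NA in both.
my.df <- data.frame(x = c(1, 2, NA, NA, 5),
                    y = c(1, 3, 4, NA, NA))

# A plain comparison silently drops every row where x or y is NA.
plain <- subset(my.df, x != y)

# Keep differing rows plus rows with exactly one NA, but not rows with two NAs.
one_na <- subset(my.df, (x != y | is.na(x) | is.na(y)) & !(is.na(x) & is.na(y)))

print(rownames(plain))
print(rownames(one_na))
```

Here plain keeps only row 2, while one_na keeps rows 2, 3 and 5; row 4, where both values are NA, is excluded in both cases.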

Comparing one value across multiple rows in one data frame with values across multiple rows in a second data frame


Tag : r , By : xguru
Date : March 29 2020, 07:55 AM
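Here's an answer with dplyr. It joins df1 to df2 by CHR, flags whether each POS falls inside a [POS_START, POS_END] interval, and then reports, for each CHR/POS pair, whether any interval matched: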
library(dplyr)

df1 <- tribble(
     ~CHR, ~POS,
     1,  2000,                  
     1,  3000,
     2,  1500,
     3,  3000
)

df2 <- tribble(
     ~CHR, ~POS_START, ~POS_END,
     1, 1500, 2500,                  
     1, 3200, 4000,
     2, 1200, 1600,
     2, 2000, 2200,
     3, 5000, 5500,
     4, 1000, 1200
)

df1 %>% 
     left_join(df2, by = 'CHR') %>% 
     mutate(IN_RANGE = POS >= POS_START & POS <= POS_END) %>% 
     group_by(CHR, POS) %>% 
     summarize(IN_RANGE = sum(IN_RANGE) > 0)
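
For comparison, the same range check can be sketched in base R without a join. This is an alternative to the dplyr answer, not a restatement of it; it reuses the example values above and checks each row of df1 against the intervals on the same CHR.

```r
# Same example values as above, built as base R data frames.
df1 <- data.frame(CHR = c(1, 1, 2, 3), POS = c(2000, 3000, 1500, 3000))
df2 <- data.frame(CHR = c(1, 1, 2, 2, 3, 4),
                  POS_START = c(1500, 3200, 1200, 2000, 5000, 1000),
                  POS_END   = c(2500, 4000, 1600, 2200, 5500, 1200))

# For each row of df1, check whether POS lies in any df2 interval on the same CHR.
df1$IN_RANGE <- mapply(function(chr, pos) {
  hits <- df2[df2$CHR == chr, ]
  any(pos >= hits$POS_START & pos <= hits$POS_END)
}, df1$CHR, df1$POS)

print(df1)
```

Using any() means a CHR with no intervals in df2 yields FALSE rather than NA, since any() of an empty logical vector is FALSE.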

Comparing each row with all other rows in data.frame


Tag : r , By : Tetting
Date : March 29 2020, 07:55 AM
Here is an option using base R that makes use of table and crossprod. Set the lower-triangular values (and the diagonal) of the matrix returned by crossprod to NA, convert it to 'long' format with as.data.frame.table, and then keep the rows whose 'Freq' column is not NA:
out <- with(df, crossprod(table(paste(category, value), ID)))
out[lower.tri(out, diag = TRUE)] <- NA
subset(as.data.frame.table(out), !is.na(Freq))
#    ID ID.1 Freq
#4 ID1  ID2    2
#7 ID1  ID3    1
#8 ID2  ID3    2
Data used in this example (define df before running the lines above):
df <- structure(list(ID = c("ID1", "ID1", "ID1", "ID2", "ID2", "ID2", 
"ID3", "ID3", "ID3"), category = c("length", "type", "color", 
 "length", "type", "color", "length", "type", "color"), 
 value = c("100", 
 "L", "Blue", "100", "M", "Blue", "150", "M", "Blue")), 
 class = "data.frame", row.names = c(NA, -9L))
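
As a cross-check on the crossprod result, the shared category/value pairs can also be counted directly for each pair of IDs. This is a sketch using combn and intersect rather than the answer's method; the names pairs_by_id, id_pairs, shared and result are introduced here for illustration.

```r
# Same data as in the answer, rebuilt compactly.
df <- data.frame(
  ID       = rep(c("ID1", "ID2", "ID3"), each = 3),
  category = rep(c("length", "type", "color"), times = 3),
  value    = c("100", "L", "Blue", "100", "M", "Blue", "150", "M", "Blue"),
  stringsAsFactors = FALSE
)

# Collect each ID's category/value pairs as strings.
pairs_by_id <- split(paste(df$category, df$value), df$ID)

# For every pair of IDs, count the category/value pairs they share.
id_pairs <- combn(names(pairs_by_id), 2)
shared <- apply(id_pairs, 2, function(p) {
  length(intersect(pairs_by_id[[p[1]]], pairs_by_id[[p[2]]]))
})

result <- data.frame(ID = id_pairs[1, ], ID.1 = id_pairs[2, ], Freq = shared)
print(result)
```

The counts agree with the crossprod output as long as no ID repeats the same category/value pair; intersect() counts distinct matches, whereas crossprod multiplies occurrence counts.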