Dplyr write a function with column names as inputs
Tag : r , By : user185939
Date : March 29 2020, 07:55 AM
I'm writing a function that I'm going to use on multiple columns in dplyr, but I'm having trouble passing column names as inputs to the function.
Is this what you expected?
library(dplyr)
df <- tibble(group=rep(c("A", "B"), each=3), var1=sample(1:100, 6), var2=sample(1:100, 6))
example<-function(colname){
df %>%
group_by(group)%>%
summarize(output=mean(sqrt(colname)))%>%
select(output)
}
example( quote(var1) )
#-----
Source: local data frame [2 x 1]
output
1 7.185935
2 8.090866
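The quote(var1) call above depends on how the dplyr release of the time evaluated its arguments, and it may not work unchanged in recent versions. Below is a minimal sketch of the same idea using the embrace operator {{ }}, assuming dplyr 1.0 or later. The function name mean_sqrt_by_group and the fixed sample values are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Illustrative data with fixed values (the original used sample())
df <- tibble(
  group = rep(c("A", "B"), each = 3),
  var1  = c(1, 4, 9, 16, 25, 36),
  var2  = c(4, 4, 4, 9, 9, 9)
)

# Hypothetical helper: `colname` is passed unquoted and forwarded with {{ }}
mean_sqrt_by_group <- function(data, colname) {
  data %>%
    group_by(group) %>%
    summarize(output = mean(sqrt({{ colname }}))) %>%
    select(output)
}

res1 <- mean_sqrt_by_group(df, var1)
res2 <- mean_sqrt_by_group(df, var2)
```

Taking the data frame as an argument instead of reading a global df makes the helper reusable across tables; the column is then named at the call site without quoting.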
|
Why can't I apply a function to create a new column with mutate() using dplyr?
Date : March 29 2020, 07:55 AM
I have a data.frame, let's call it "df".
As pointed out, + and sum() differ in behaviour. Consider:
> sum(1:10,1:10)
[1] 110
> `+`(1:10,1:10)
[1] 2 4 6 8 10 12 14 16 18 20
library(dplyr)
df <- tibble(w = letters[1:3], x=1:3, y = x^2, z = y - x) # tibble() replaces the deprecated data_frame()
# Source: local data frame [3 x 4]
#
# w x y z
# 1 a 1 1 0
# 2 b 2 4 2
# 3 c 3 9 6
df %>% rowwise() %>% mutate(result = sum(x, y, z))
# Source: local data frame [3 x 5]
# Groups: <by row>
#
# w x y z result
# 1 a 1 1 0 2
# 2 b 2 4 2 8
# 3 c 3 9 6 18
df %>% mutate(result = x + y + z)
# Source: local data frame [3 x 5]
#
# w x y z result
# 1 a 1 1 0 2
# 2 b 2 4 2 8
# 3 c 3 9 6 18
df %>% mutate(result = sum(x, y, z)) # sums over all of x, y and z and recycles the result!
# Source: local data frame [3 x 5]
#
# w x y z result
# 1 a 1 1 0 28
# 2 b 2 4 2 28
# 3 c 3 9 6 28
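As a possible addition to the comparison above, the sketch below shows two other ways to get a per-row total without spelling out x + y + z. It assumes dplyr 1.0 or later, where across() and c_across() are available; the variable names with_rowsums and with_c_across are illustrative only.

```r
library(dplyr)

df <- tibble(w = letters[1:3], x = 1:3, y = x^2, z = y - x)

# rowSums() adds across columns within each row, so no rowwise() is needed
with_rowsums <- df %>%
  mutate(result = unname(rowSums(across(x:z))))

# c_across() collects the selected columns of the current row into one vector
with_c_across <- df %>%
  rowwise() %>%
  mutate(result = sum(c_across(x:z))) %>%
  ungroup()
```

The rowSums() form stays vectorized, which tends to matter on large tables, while the rowwise() form works with any function that reduces a vector to a single value.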
|
Can you make dplyr::mutate and dplyr::lag default = its own input value?
Tag : r , By : Mare Astra
Date : March 29 2020, 07:55 AM
This is similar to this dplyr lag post, and this dplyr mutate lag post, but neither of those asks about defaulting to the input value. I am using dplyr to mutate a new field that's a lagged offset of another field (which I've converted to POSIXct). The goal is, for a given ip, to get some summary statistics on the delta between all the times it shows up on my list. I also have about 12 million rows.
In the OP's code ...
...
d) group_by(ip) %>%
e) mutate(shifted = dplyr::lag(fulldate, default=fulldate)) %>%
...
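The answer text is cut off here, so the following is not a reconstruction of it. It is only a sketch of one common way to make the first lagged value in each group fall back to the row's own value: pass a length-one default such as first(fulldate). The toy table logs and the column delta_secs are hypothetical stand-ins for the OP's data.

```r
library(dplyr)

# Hypothetical toy data standing in for the OP's 12-million-row table
logs <- tibble(
  ip = c("a", "a", "a", "b", "b"),
  fulldate = as.POSIXct("2020-03-29 00:00:00", tz = "UTC") + c(0, 60, 180, 30, 90)
)

deltas <- logs %>%
  group_by(ip) %>%
  arrange(fulldate, .by_group = TRUE) %>%
  mutate(
    # The first row of each ip has no predecessor, so it lags to itself
    shifted = dplyr::lag(fulldate, default = first(fulldate)),
    delta_secs = as.numeric(difftime(fulldate, shifted, units = "secs"))
  ) %>%
  ungroup()
```

With this default the first row of every ip gets a delta of zero, which may or may not be wanted in the summary statistics; leaving the default as NA and dropping those rows is the other obvious choice.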
|
Mutate with a list column function in dplyr
Date : March 29 2020, 07:55 AM
I am trying to calculate the Jaccard similarity between a source vector and comparison vectors in a tibble.
You could simply add rowwise:
df_comp_jaccard <- df_comp %>%
rowwise() %>%
dplyr::mutate(jaccard_sim = length(intersect(names_vec, source_vec))/
length(union(names_vec, source_vec)))
# A tibble: 3 x 3
names_ names_vec jaccard_sim
<chr> <list> <dbl>
1 b d f <chr [3]> 0.2
2 u k g <chr [3]> 0.0
3 m o c <chr [3]> 0.2
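The snippet above relies on df_comp and source_vec, which are not shown. The sketch below fills them in with hypothetical values chosen only so that the similarities come out as in the printed table, and uses purrr::map_dbl() as an alternative to rowwise(). The helper name jaccard is also made up for illustration.

```r
library(dplyr)
library(purrr)

# Hypothetical inputs; the original df_comp and source_vec are not shown
source_vec <- c("a", "b", "c")
df_comp <- tibble(
  names_    = c("b d f", "u k g", "m o c"),
  names_vec = strsplit(names_, " ")
)

# Hypothetical helper: Jaccard index of two character vectors
jaccard <- function(x, y) length(intersect(x, y)) / length(union(x, y))

# map_dbl() applies the helper to each list element and returns a double vector
df_comp_jaccard <- df_comp %>%
  mutate(jaccard_sim = map_dbl(names_vec, function(v) jaccard(v, source_vec)))
```

Mapping over the list column keeps the tibble ungrouped, so later verbs do not need an ungroup() call.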
# paste() is vectorized, so both calls below give the same result
tibble(a=1:5,b=5:1) %>% mutate(X = paste(a,b,sep="_"))
tibble(a=1:5,b=5:1) %>% rowwise %>% mutate(X = paste(a,b,sep="_"))
# # A tibble: 5 x 3
# a b X
# <int> <int> <chr>
# 1 1 5 1_5
# 2 2 4 2_4
# 3 3 3 3_3
# 4 4 2 4_2
# 5 5 1 5_1
tibble(a=1:5,b=5:1) %>% mutate(max(a,b))
# # A tibble: 5 x 3
# a b `max(a, b)`
# <int> <int> <int>
# 1 1 5 5
# 2 2 4 5
# 3 3 3 5
# 4 4 2 5
# 5 5 1 5
tibble(a=1:5,b=5:1) %>% rowwise %>% mutate(max(a,b))
# # A tibble: 5 x 3
# a b `max(a, b)`
# <int> <int> <int>
# 1 1 5 5
# 2 2 4 4
# 3 3 3 3
# 4 4 2 4
# 5 5 1 5
tibble(a=1:5,b=5:1) %>% mutate(pmax(a,b))
# # A tibble: 5 x 3
# a b `pmax(a, b)`
# <int> <int> <int>
# 1 1 5 5
# 2 2 4 4
# 3 3 3 3
# 4 4 2 4
# 5 5 1 5
|
When I don't know column names in data.frame, when I use dplyr mutate function
Tag : r , By : Juan Pablo
Date : March 29 2020, 07:55 AM
I'd like to know how I can use the dplyr mutate function when I don't know the column names.
With apply:
library(dplyr)
library(purrr)
df %>%
mutate(minimum = apply(df[,2:4], 1, min))
df %>%
mutate(minimum = pmap_dbl(.[2:4], min)) # pmap_dbl() returns a double vector; plain pmap() would give a list column
df %>%
purrrlyr::by_row(~min(.[2:4]), .collate = "rows", .to = "minimum")
# tibble [3 x 5]
w x y z minimum
<dbl> <dbl> <dbl> <dbl> <dbl>
1 2 1 1 3 1
2 3 2 5 2 2
3 4 7 4 6 4
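The df used above is not defined in the answer. The sketch below rebuilds it from the printed output and shows two further ways to take a row-wise minimum over columns picked by position. It assumes dplyr 1.0 or later for c_across(); the result names by_position and by_pmin are illustrative.

```r
library(dplyr)

# Illustrative data rebuilt from the printed output above
df <- tibble(w = c(2, 3, 4), x = c(1, 2, 7), y = c(1, 5, 4), z = c(3, 2, 6))

# Row-wise minimum over columns 2 to 4, selected by position, not by name
by_position <- df %>%
  rowwise() %>%
  mutate(minimum = min(c_across(2:4))) %>%
  ungroup()

# Vectorized alternative: pmin() over the same columns, passed as a list
by_pmin <- df %>%
  mutate(minimum = do.call(pmin, unname(as.list(df[2:4]))))
```

The pmin() version avoids per-row evaluation, so it is likely the better fit when the table is large.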
|