
Parameterizing group_by %>% summarise


Tag : r , By : orlandoferrer
Date : November 27 2020, 04:01 AM

Given a data.frame df with a grouping column Config, you can capture the column passed to the function with enquo() and unquote it with !!:
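library(tidyverse)

mysumm <- function(variable) {
  # capture the bare column name passed by the caller
  var <- enquo(variable)
  df %>%
    group_by(Config) %>%
    # build the output name from the captured expression, then unquote it inside median()
    summarise(!!paste0(rlang::as_label(var), ".median") := median(!!var))
}

# pass the column unquoted; with the string 'SN1', median() receives the text itself
# and the SN1.median column comes back as <chr> holding "SN1"
mysumm(SN1)
# Returns a tibble with one row per Config (C1, C2) and a numeric SN1.median column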
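
If the column name reaches the function as a string (as in a call like mysumm('SN1')), enquo() captures the string itself rather than the column. One option in that case is the .data pronoun. The following is a minimal sketch under assumptions: the values in df and the helper name median_by_config are made up for illustration, and only the column names Config and SN1 come from the post.

```r
library(dplyr)

# Hypothetical data; the original post does not show the real df
df <- data.frame(
  Config = c("C1", "C1", "C1", "C2", "C2", "C2"),
  SN1    = c(1, 2, 3, 10, 20, 60)
)

# Hypothetical helper: `variable` is a column name given as a string
median_by_config <- function(data, variable) {
  data %>%
    group_by(Config) %>%
    # .data[[variable]] looks the column up by its name
    summarise(!!paste0(variable, ".median") := median(.data[[variable]]))
}

res <- median_by_config(df, "SN1")
print(res)
```

Passing the data frame as an argument, instead of relying on a global df, also makes the helper easier to reuse.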


R - group_by n_distinct for summarise


Tag : r , By : Nougat
Date : March 29 2020, 07:55 AM
There are many ways to count distinct ids per group; here is one:
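# .keep_all = TRUE keeps sex after de-duplicating on id;
# in current dplyr, distinct(id) alone returns only the id column
dta %>% distinct(id, .keep_all = TRUE) %>%
        group_by(sex) %>%
        summarise(n())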
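# Sample data, followed by a benchmark of several equivalent approaches
dta <- data.frame(id = rep(1:500, 30),
                  sex = rep(c("M", "F"), 750),
                  child = rep(c(1, 0, 0, 1), 375))
library(microbenchmark)

# .keep_all = TRUE is needed in current dplyr so that sex survives distinct(id);
# the timings reported below are the original author's
microbenchmark(
    distinctcount = dta %>% distinct(id, .keep_all = TRUE) %>% count(sex),
    uniquecount = dta %>% unique %>% count(sex),
    distinctsummarise = dta %>% distinct(id, .keep_all = TRUE) %>% group_by(sex) %>% summarise(n()),
    uniquesummarise = dta %>% unique %>% group_by(sex) %>% summarise(n()),
    distincttally = dta %>% distinct(id, .keep_all = TRUE) %>% group_by(sex) %>% tally
)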
Unit: milliseconds
              expr       min        lq      mean    median        uq       max neval
     distinctcount  1.576307  1.602803  1.664385  1.630643  1.670195  2.233710   100
       uniquecount 32.391659 32.885479 33.194082 33.072485 33.244516 35.734735   100
 distinctsummarise  1.724914  1.760817  1.815123  1.792114  1.830513  2.178798   100
   uniquesummarise 32.757609 33.080933 33.490001 33.253155 33.463010 39.937194   100
     distincttally  1.618547  1.656947  1.715741  1.685554  1.731058  2.383084   100
# Fastest variant in the benchmark above (distinctcount)
dta %>% distinct(id, .keep_all = TRUE) %>% count(sex)
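
Since the question title mentions n_distinct, the same count can also be sketched without de-duplicating first, by calling n_distinct() inside summarise(). This uses the sample data from the answer; the column name n_ids is an illustrative choice, not from the post.

```r
library(dplyr)

# Sample data from the answer above
dta <- data.frame(id = rep(1:500, 30),
                  sex = rep(c("M", "F"), 750),
                  child = rep(c(1, 0, 0, 1), 375))

# Count the distinct ids within each sex
res <- dta %>%
  group_by(sex) %>%
  summarise(n_ids = n_distinct(id))

print(res)
```

This variant was not part of the benchmark above, so no claim is made here about its speed.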

Using group_by and summarise from dplyr for all rows not containing the variable to group_by


Tag : r , By : micaleel
Date : March 29 2020, 07:55 AM
You can use . to refer to the whole data.frame inside the pipe, which lets you compare each group with the rest of the data:
df1 %>% group_by(id) %>% 
    summarise(n = n(), 
              n_other = nrow(.) - n, 
              mean_cost = mean(cost), 
              mean_other = (sum(.$cost) - sum(cost)) / n_other)

## # A tibble: 2 × 5
##       id     n n_other mean_cost mean_other
##   <fctr> <int>   <int>     <dbl>      <dbl>
## 1      A     2       3        55        108
## 2      B     3       2       108         55
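
The . placeholder is specific to the magrittr pipe. A variant that avoids it is to compute the overall totals before grouping. The sketch below assumes made-up data: df1 is hypothetical and was chosen only so that the group sizes and means match the output shown above, and total_n and total_cost are illustrative names.

```r
library(dplyr)

# Hypothetical data chosen to reproduce the group sizes and means shown above
df1 <- data.frame(id   = c("A", "A", "B", "B", "B"),
                  cost = c(50, 60, 100, 108, 116))

# Totals over the whole data frame, computed once up front
total_n    <- nrow(df1)
total_cost <- sum(df1$cost)

res <- df1 %>%
  group_by(id) %>%
  summarise(n = n(),
            n_other = total_n - n,          # rows outside the group
            mean_cost = mean(cost),
            mean_other = (total_cost - sum(cost)) / n_other)

print(res)
```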

r: Summarise for rowSums after group_by


Tag : r , By : Joe
Date : March 29 2020, 07:55 AM
The question: group a data frame by one variable and compute the mean of several variables within each group. You can try:
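library(tidyverse)

# target_vars is not defined in the original snippet;
# these three columns reproduce the sums shown below
target_vars <- c("Ozone", "Solar.R", "Temp")

airquality %>% 
  select(Month, all_of(target_vars)) %>% 
  # reshape to long format: one row per Month/variable/value
  gather(key, value, -Month) %>% 
  group_by(Month) %>%
  summarise(n = length(unique(key)),
            Sum = sum(value, na.rm = TRUE)) %>% 
  mutate(Average = Sum / n)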
# A tibble: 5 x 4
  Month     n   Sum  Average
  <int> <int> <int>    <dbl>
1     5     3  7541 2513.667
2     6     3  8343 2781.000
3     7     3 10849 3616.333
4     8     3  8974 2991.333
5     9     3  8242 2747.333
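
If what is wanted is a separate mean for each variable within each month, rather than the pooled sum divided by the number of variables, across() (dplyr 1.0.0 or later) is one way to sketch it. The column set in target_vars is an assumption here, since the original snippet does not show its definition.

```r
library(dplyr)

# Assumed column set; the original snippet does not show target_vars
target_vars <- c("Ozone", "Solar.R", "Temp")

# One mean per selected column and month, ignoring missing values
res <- airquality %>%
  group_by(Month) %>%
  summarise(across(all_of(target_vars), ~ mean(.x, na.rm = TRUE)))

print(res)
```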

How to summarise all columns using group_by and summarise?


Tag : r , By : Jaya
Date : March 29 2020, 07:55 AM
It's hard to answer your question without a better example (for instance, you can dput() your data to give us a sample). But here is a solution to your last issue: "For the first problem, I expect to get a table with the sum of repeated rows for all columns. Moreover, if it was possible, I would expect to get a better code for the sum of different activities on Saturday."
# create toy data of 3 different IDs, 3 different types, and repeated days
df <- data.frame(id = sample(c(1:3), 100, TRUE),
                 type = sample(letters[1:3], 100, TRUE),
                 day = sample(c(1:7), 100, TRUE),
                 matrix(runif(300), nrow = 100),
                 stringsAsFactors = FALSE)

# gather data, summarize each activity column by ID, type and day
# and select Saturday==6
df %>% gather(k,v,-id,-type,-day) %>% 
  group_by(id,type,day,k) %>% 
  summarise(sum=sum(v)) %>% 
  filter(day==6) %>% 
  spread(k,sum)

# A tibble: 8 x 6
# Groups:   id, type, day [8]
     id type    day    X1    X2    X3
  <int> <chr> <int> <dbl> <dbl> <dbl>
1     1 a         6 1.85  3.26  2.09 
2     1 b         6 0.604 0.583 0.586
3     1 c         6 0.163 0.663 0.624
4     2 a         6 0.185 0.952 0.349
5     2 b         6 1.16  0.832 0.974
6     2 c         6 0.906 1.62  0.853
7     3 b         6 0.671 1.39  0.887
8     3 c         6 0.449 0.150 0.647
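
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same toy pipeline might be sketched as follows; the seed is an arbitrary choice added for reproducibility, so the numbers will differ from the output above.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # arbitrary seed, not from the original answer

# Same toy data layout as above: 3 IDs, 3 types, repeated days, 3 value columns
df <- data.frame(id = sample(1:3, 100, TRUE),
                 type = sample(letters[1:3], 100, TRUE),
                 day = sample(1:7, 100, TRUE),
                 matrix(runif(300), nrow = 100),
                 stringsAsFactors = FALSE)

res <- df %>%
  # long format: one row per id/type/day/column
  pivot_longer(c(X1, X2, X3), names_to = "k", values_to = "v") %>%
  group_by(id, type, day, k) %>%
  summarise(sum = sum(v), .groups = "drop") %>%
  filter(day == 6) %>%
  # back to wide format: one column per original value column
  pivot_wider(names_from = k, values_from = sum)

print(res)
```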
# With the asker's own data: sum every remaining column within each LbNr, Type, Weekday group
df %>% group_by(LbNr,Type,Weekday) %>% summarise_all(sum)

# A tibble: 20 x 14
# Groups:   LbNr, Type [5]
    LbNr Type  Weekday   Time    lie     sit   stand    move    walk     run  stairs   cycle
   <dbl> <fct>   <dbl>  <dbl>  <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1 22002 A1. ~       1  6.33  0.386  4.52e+0 0.726   0.499   0.189   0.00111 0.0075  0.00556
 2 22002 A1. ~       2  7.9   0.766  4.74e+0 1.28    0.611   0.489   0.00194 0.0111  0      
 3 22002 A1. ~       3  7.33  0.262  3.63e+0 2.04    0.941   0.449   0.00083 0.0114  0      
 4 22002 A1. ~       4 11.7   0.761  5.91e+0 2.54    1.19    1.25    0.00416 0.0394  0.00778
 5 22002 A1. ~       5  6.57  0.140  4.51e+0 1.12    0.51    0.254   0.00139 0.0183  0.01   
 6 22002 A1. ~       6  0.433 0.0169 3.02e-1 0.0589  0.0378  0.0175  0       0       0      
 7 22002 A2. ~       1  7.5   0.0792 5.90e+0 0.546   0.326   0.611   0.00111 0.0392  0      
 8 22002 A2. ~       2  9.83  0.0597 6.64e+0 1.64    0.595   0.842   0.00167 0.0575  0      
 9 22002 A2. ~       3  9.83  0.653  5.79e+0 1.82    0.525   1.01    0.00083 0.0333  0      
10 22002 A2. ~       4  5     0.383  2.80e+0 0.886   0.392   0.514   0.0025  0.0247  0      
11 22002 A2. ~       5 11.0   0.0103 6.77e+0 1.83    1.05    1.29    0.00472 0.0672  0      
12 22002 A4. ~       2  6.27  4.86   1.41e+0 0       0       0       0       0       0      
13 22002 A4. ~       3  6.83  5.69   1.15e+0 0       0       0       0       0       0      
14 22002 A4. ~       4  7.3   7.28   4.72e-3 0.00667 0.00667 0       0       0.00194 0      
15 22002 A4. ~       5  6.42  5.49   9.30e-1 0       0       0       0       0       0      
16 22002 C0. ~       6 15.7   0.245  9.78e+0 2.34    2.45    0.800   0.00194 0.0581  0      
17 22002 C0. ~       7 15.6   0.122  1.20e+1 1.80    0.940   0.656   0.0869  0.0164  0      
18 22002 C4. ~       1  6.33  5.75   5.84e-1 0       0       0       0       0       0      
19 22002 C4. ~       6  7.9   6.96   9.22e-1 0.00667 0.00806 0.00306 0       0       0      
20 22002 C4. ~       7  8.35  7.36   9.33e-1 0.0364  0.0208  0.00472 0       0       0      
# ... with 2 more variables: WalkSlow <dbl>, WalkFast <dbl>
# keep only Saturday (Weekday == 6)
df %>% group_by(LbNr,Type,Weekday) %>% summarise_all(sum) %>% 
  filter(Weekday==6)

# A tibble: 3 x 14
# Groups:   LbNr, Type [3]
   LbNr Type  Weekday   Time    lie   sit   stand    move    walk     run stairs cycle WalkSlow
  <dbl> <fct>   <dbl>  <dbl>  <dbl> <dbl>   <dbl>   <dbl>   <dbl>   <dbl>  <dbl> <dbl>    <dbl>
1 22002 A1. ~       6  0.433 0.0169 0.302 0.0589  0.0378  0.0175  0       0          0  0.00417
2 22002 C0. ~       6 15.7   0.245  9.78  2.34    2.45    0.800   0.00194 0.0581     0  0.14   
3 22002 C4. ~       6  7.9   6.96   0.922 0.00667 0.00806 0.00306 0       0          0  0      
# ... with 1 more variable: WalkFast <dbl>

# summarise across different activities, for each column, on Saturday only
df %>% group_by(LbNr,Type,Weekday) %>% summarise_all(.,sum) %>% 
  filter(Weekday==6) %>% group_by(LbNr) %>% select(-Type,-Weekday) %>% 
  summarise_all(.,sum)

# A tibble: 1 x 12
   LbNr  Time   lie   sit stand  move  walk     run stairs cycle WalkSlow WalkFast
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>  <dbl> <dbl>    <dbl>    <dbl>
1 22002    24  7.22  11.0  2.41  2.49 0.820 0.00194 0.0581     0    0.144    0.670

How to use group_by with summarise and summarise_all?


Tag : r , By : amy
Date : January 02 2021, 06:48 AM
Here's an approach that breaks it into two problems and combines the results:
library(dplyr)
left_join(
  # Here we want to treat column y specially
  df %>%
    group_by(x) %>%
    summarize(sum_y = sum(y)),
  # Here we exclude y and use a different summation for all the remaining columns
  df %>%
    group_by(x) %>%
    select(-y) %>%
    summarise_all(first),
  # join the two summaries on the grouping column
  by = "x"
  ) 

# A tibble: 5 x 3
      x sum_y     z
  <int> <int> <int>
1     1    20     1
2     2    16     3
3     3    17     2
4     4    18     2
5     5     7     3
# Sample data used in the example above
df <- read.table(
  header = TRUE, 
  stringsAsFactors = FALSE,
  text="x  y z
        1  1 1
        3  2 2
        2  3 3
        3  4 4
        2  5 1
        4  6 2
        5  7 3
        2  8 4
        1  9 1
        1 10 2
        3 11 3
        4 12 4")
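
With dplyr 1.0.0 or later, the two summaries could probably be combined in a single summarise() call using across(), without the join. This is a sketch rather than the answer's method. The across() call is placed first so that it only sees the original columns; the resulting column order (x, z, sum_y) therefore differs from the joined output above.

```r
library(dplyr)

# Sample data from the answer above
df <- read.table(
  header = TRUE,
  stringsAsFactors = FALSE,
  text = "x  y z
          1  1 1
          3  2 2
          2  3 3
          3  4 4
          2  5 1
          4  6 2
          5  7 3
          2  8 4
          1  9 1
          1 10 2
          3 11 3
          4 12 4")

res <- df %>%
  group_by(x) %>%
  summarise(across(-y, first),   # first value of every column except y
            sum_y = sum(y))      # y gets its own summary

print(res)
```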