running t.test() on multiple columns to output tibble


Tag : r , By : Pieter Taelman
Date : January 12 2021, 08:33 AM

We could take advantage of purrr::map_df(), which is loaded by library(tidyverse), to run one t-test per column and stack the tidied results into a single tibble:
library(broom)
library(tidyverse) # purrr is in here
data(mtcars)

#reproducible data to simulate your case
mtcars2 <- filter(mtcars, cyl %in% c(4, 6)) 
mtcars2$cyl <- as.factor(mtcars2$cyl)

# capture the columns you want to t.test
cols_not_cyl <- names(mtcars2)[-2]

# turn those column names into formulas
formulas <- paste(cols_not_cyl, "~ cyl") %>%
    map(as.formula) %>% # needs to be class formula
    set_names(cols_not_cyl) # useful for map_df()

# do the tests, then stack them all together
map_df(formulas, ~ tidy(t.test(formula = ., data = mtcars2)),
       .id = "column_id")
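For comparison, the same idea can be sketched in base R alone. This is only an illustrative variant under the same mtcars setup, not the answer's method: the names `results` and the chosen output columns are assumptions, and `reformulate()` builds each `column ~ cyl` formula.

```r
# Base-R sketch: one Welch t-test per column, stacked into a data frame
data(mtcars)
mtcars2 <- mtcars[mtcars$cyl %in% c(4, 6), ]
mtcars2$cyl <- factor(mtcars2$cyl)

# every column except the grouping variable
cols_not_cyl <- setdiff(names(mtcars2), "cyl")

results <- do.call(rbind, lapply(cols_not_cyl, function(col) {
  # reformulate("cyl", response = "mpg") yields the formula mpg ~ cyl
  tt <- t.test(reformulate("cyl", response = col), data = mtcars2)
  data.frame(
    column_id = col,
    estimate1 = unname(tt$estimate[1]),  # mean in the cyl == 4 group
    estimate2 = unname(tt$estimate[2]),  # mean in the cyl == 6 group
    statistic = unname(tt$statistic),
    p.value   = tt$p.value
  )
}))
results
```

The purrr/broom version above is shorter and returns more columns; this sketch mainly shows that nothing beyond `t.test()` itself is strictly required.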


How to pipe an output tibble into further calculations without saving the tibble as a separate object in R?


Tag : r , By : ralph okochu
Date : March 29 2020, 07:55 AM
In your case, you can keep manipulating the tibble you have generated with further dplyr functions in the same pipe.
Note that mutate_at() and summarize_at() let you transform a set of columns, which you can select by column position.
sr_df %>%
  group_by(ResolutionViolated) %>%
  tally() %>% 
  arrange(desc(n)) %>% 
  mutate(total = sum(n)) %>% 
  mutate_at(.vars = c(1, 2),   # .vars is the current name; .cols is a deprecated alias
            .funs = function(column) round(column / .$total * 100, digits = 2))
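A minimal, self-contained sketch of the same "keep piping" idea, assuming the goal is a percentage per group. The data below is a hypothetical stand-in for sr_df, and `pct` and `result` are illustrative names; `count()` is used here in place of `group_by()` plus `tally()`.

```r
library(dplyr)

# Hypothetical stand-in for sr_df (the real data is not shown in the question)
sr_df <- tibble(ResolutionViolated = c("Yes", "No", "No", "No"))

result <- sr_df %>%
  count(ResolutionViolated, sort = TRUE) %>%          # one row per group, largest n first
  mutate(pct = round(n / sum(n) * 100, digits = 2))   # share of all rows, in percent

result
```

Because the result of `count()` is passed straight into `mutate()`, no intermediate object has to be saved.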

What are the columns in the output of running a jmeter test from the command line?


Tag : jmeter , By : 小和尚
Date : March 29 2020, 07:55 AM
The first column reports the number of samples collected at regular intervals.
Here is the excerpt from the BlazeMeter article.

How to extract a vector in a tibble column to multiple columns in the same tibble?


Tag : r , By : Sharad
Date : March 29 2020, 07:55 AM
We need to turn the names of the 'cut' variable into a new column, unnest the list elements, and then use spread() to reshape to 'wide' format:
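# by_slice() comes from the purrrlyr package; fun() is assumed to be defined already (not shown here)
mtcars %>%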
   group_by(cyl)  %>%   
   by_slice(~fun(.x$hp,.x$gear)) %>%
   rename(cut=.out) %>%
   mutate(Names = map(cut, ~factor(names(.x), levels = names(.x)))) %>%
   unnest %>%
   spread(Names, cut)
# A tibble: 3 x 7
#    cyl `[50,100)` `[100,150)` `[150,200)` `[200,250)` `[250,300)` `[300,350)`
#*  <dbl>      <dbl>       <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
#1     4         36           9          NA          NA          NA          NA
#2     6         NA          22           5          NA          NA          NA
#3     8         NA          NA          21          15           5           5
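With current tidyr, a similar reshaping can be sketched without by_slice(). The helper `fun()` below is a hypothetical reconstruction (it sums gear within horsepower bins), since the original definition is not shown here; `wide` is an illustrative name.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: sum `gear` within horsepower bins, returned as a named vector
fun <- function(hp, gear) {
  bins <- cut(hp, breaks = seq(50, 350, by = 50), right = FALSE)
  res <- tapply(gear, bins, sum)          # empty bins become NA
  setNames(as.vector(res), names(res))    # drop the array dims, keep the bin labels
}

wide <- mtcars %>%
  group_by(cyl) %>%
  summarise(cut = list(fun(hp, gear))) %>%  # one named vector per cyl, in a list column
  unnest_wider(cut)                         # each vector name becomes its own column

wide
```

The design choice is to let the vector names drive the column names, so no separate `Names` column or `spread()` step is needed.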

How to add multiple columns to a tibble?


Tag : r , By : Boris
Date : March 29 2020, 07:55 AM
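What you are seeing in tibble::add_column(columnsToAdd = NA) comes from how the tidyverse evaluates named arguments: the argument name is taken literally as the new column name, so you get a single column called columnsToAdd rather than one column per element of the vector. If you check the definition:
> args(add_column)
function (.data, ..., .before = NULL, .after = NULL) 
The new columns are passed as name-value pairs through `...`. One way around this is to build the extra columns as a matrix named after columnsToAdd and bind it on:
extra <- matrix(NA_real_, nrow=nrow(someTibble), ncol=length(columnsToAdd), dimnames=list(NULL, columnsToAdd))
dplyr::bind_cols(someTibble, as.data.frame(extra))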
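Another option, sketched here under the assumption that every new column should start out as NA_real_, is to build a named list and splice it into add_column() with `!!!`. The inputs `someTibble` and `columnsToAdd` below are hypothetical examples, and `new_cols` and `result` are illustrative names.

```r
library(tibble)

someTibble   <- tibble(id = 1:3)   # hypothetical input
columnsToAdd <- c("a", "b")        # hypothetical new column names

# One NA_real_ entry per new column, named after the columns to add
new_cols <- setNames(rep(list(NA_real_), length(columnsToAdd)), columnsToAdd)

# `!!!` splices the list into `...` as separate name-value pairs
result <- add_column(someTibble, !!!new_cols)

result
```

Each length-one value is recycled to the number of rows, so the result has the original columns followed by `a` and `b`, both filled with NA.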

Summarise tibble to multiple rows of output


Tag : r , By : gcomstock
Date : December 30 2020, 04:08 PM
Option 1 -
A somewhat verbose but transparent way of doing this is with joins. It is not that verbose, though, once you count the code in test_function as well.
test_tibble %>% 
  group_by(country, campaign) %>% 
  summarize(campaign_ltv = sum(revenue)/sum(users)) %>% 
  inner_join(
    test_tibble %>% 
      group_by(country) %>% 
      summarise(total_ltv = sum(revenue)/sum(users)),
    by = "country"
  ) %>% 
  mutate(ltv = (total_ltv + campaign_ltv)/2) %>% 
  ungroup()

# A tibble: 3 x 5
  country campaign campaign_ltv total_ltv   ltv
    <dbl>    <dbl>        <dbl>     <dbl> <dbl>
1       1        1        0.167     0.259 0.213
2       1        2        0.333     0.259 0.296
3       2        3        0.444     0.444 0.444
Option 2 -
Keep test_function, store its result once per country in a list column, then unnest:
test_tibble %>%
  group_by (country) %>%
  mutate(
    ltv = list(test_function(activation_date, campaign, revenue, users))
  ) %>%
  select(country, ltv) %>% 
  filter(row_number() == 1) %>% 
  unnest() %>% 
  ungroup()

# A tibble: 3 x 3
  country campaign   ltv
    <dbl>    <dbl> <dbl>
1       1        1 0.213
2       1        2 0.296
3       2        3 0.444
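Option 3 -
Use tidyr::complete() so that every country-campaign pair carries all of that country's revenue and users rows:
df %>% 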
  group_by(country) %>% 
  tidyr::complete(nesting(country, campaign), nesting(revenue, users)) %>% 
  group_by(campaign, add = TRUE)
  # now you have all revenue and users for each country-campaign
  # for total_ltv: use revenue and users as is
  # for campaign_ltv: use revenue and users where activation_date is not NA

# A tibble: 15 x 5
# Groups:   country, campaign [3]
   country campaign revenue users activation_date
     <int>    <int> <chr>   <chr>           <int>
 1       1        1 R_1     U_1                 1
 2       1        1 R_2     U_2                 2
 3       1        1 R_3     U_3                 3
 4       1        1 R_4     U_4                NA
 5       1        1 R_5     U_5                NA
 6       1        1 R_6     U_6                NA
 7       1        2 R_1     U_1                NA
 8       1        2 R_2     U_2                NA
 9       1        2 R_3     U_3                NA
10       1        2 R_4     U_4                 1
11       1        2 R_5     U_5                 2
12       1        2 R_6     U_6                 3
13       2        3 R_7     U_7                 1
14       2        3 R_8     U_8                 2
15       2        3 R_9     U_9                 3
Applied to test_tibble, the full calculation becomes:
test_tibble %>% 
  group_by(country) %>% 
  tidyr::complete(nesting(country, campaign), nesting(revenue, users)) %>% 
  group_by(campaign, add = TRUE) %>% 
  summarise(
    ltv = sum(revenue)/sum(users)/2 + 
      sum(revenue[!is.na(activation_date)])/sum(users[!is.na(activation_date)])/2
  ) %>% 
  ungroup()

# A tibble: 3 x 3
  country campaign   ltv
    <dbl>    <dbl> <dbl>
1       1        1 0.213
2       1        2 0.296
3       2        3 0.444
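The join in Option 1 can also be replaced by a grouped mutate(). The sketch below uses a made-up test_tibble, because the original data is not shown, so its numbers differ from the output above; it assumes dplyr 1.0 or later for the `.groups` argument, and `result` is an illustrative name.

```r
library(dplyr)

# Hypothetical data; the real test_tibble is not shown in the question
test_tibble <- tibble(
  country  = c(1, 1, 1, 1, 2, 2),
  campaign = c(1, 1, 2, 2, 3, 3),
  revenue  = c(1, 2, 3, 4, 5, 6),
  users    = c(10, 10, 10, 10, 10, 10)
)

result <- test_tibble %>%
  group_by(country) %>%
  mutate(total_ltv = sum(revenue) / sum(users)) %>%   # country-level LTV on every row
  group_by(country, campaign, total_ltv) %>%          # carry total_ltv through the summary
  summarise(campaign_ltv = sum(revenue) / sum(users), .groups = "drop") %>%
  mutate(ltv = (total_ltv + campaign_ltv) / 2)        # average of the two LTVs

result
```

Grouping by total_ltv as well is only a way to keep that column after summarise(); it is constant within each country, so it does not create extra groups.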