aggregating elements to create groups of minimal size


Tag : r , By : SachinJadhav
Date : November 24 2020, 04:01 AM

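The following approach should work.
We initialize a cumulative sum j at 0, a group counter k at 1, and a group vector w filled with NA whose length equals length(v). Each element is assigned to the current group k; once the cumulative sum reaches 50, the next group starts. If the last group ends below 50, it is merged into the previous group.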
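v <- c(3, 23, 224, 124, 49, 17, 3, 8, 12)
j <- 0
k <- 1
w <- rep(NA, length(v))
for (i in seq_along(v)) {
  w[i] <- k          # assign element i to the current group
  j <- j + v[i]
  # last element reached and the current group is still below 50:
  # merge it into the previous group
  if (i == length(v) && j < 50) {
    w[w == k] <- k - 1
  }
  # the group reached the minimum sum: start a new group
  if (j >= 50) {
    k <- k + 1
    j <- 0
  }
}
> w
[1] 1 1 1 2 3 3 3 3 3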

df <- cbind.data.frame(v, w)
    v w
1   3 1
2  23 1
3 224 1
4 124 2
5  49 3
6  17 3
7   3 3
8   8 3
9  12 3
aggregate(v ~ w, df, sum)
  w   v
1 1 250
2 2 124
3 3  89
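
For readers who want to try the same idea outside R, here is a minimal Python sketch of the greedy grouping described above. The function names group_min_sum and group_sums are made up for this illustration, and the guard that leaves a single undersized group as group 1 is an assumption (the R loop would label it 0 instead).

```python
def group_min_sum(values, min_sum=50):
    """Assign consecutive values to groups whose sums reach at least min_sum."""
    groups = []
    group_id = 1
    running = 0
    for value in values:
        groups.append(group_id)
        running += value
        if running >= min_sum:
            # The current group is complete: start the next one.
            group_id += 1
            running = 0
    # If the last element still belongs to an unfinished group,
    # merge that group into the previous one (assumption: only when one exists).
    if groups and groups[-1] == group_id and group_id > 1:
        groups = [g - 1 if g == group_id else g for g in groups]
    return groups


def group_sums(values, groups):
    """Sum the values per group label, like aggregate(v ~ w, df, sum)."""
    sums = {}
    for value, g in zip(values, groups):
        sums[g] = sums.get(g, 0) + value
    return sums


v = [3, 23, 224, 124, 49, 17, 3, 8, 12]
w = group_min_sum(v)
print(w)                 # [1, 1, 1, 2, 3, 3, 3, 3, 3]
print(group_sums(v, w))  # {1: 250, 2: 124, 3: 89}
```

The grouping is greedy and order-dependent: it closes a group as soon as the threshold is reached, so it does not search for the split that minimizes group sizes overall.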


How to create a function that will split continuous variables only to groups equal size groups


Tag : r , By : Bart van Bragt
Date : March 29 2020, 07:55 AM
I would like to run a function over my data frame that finds only the continuous variables and adds new categorical variables by dividing each continuous variable into 2 equal-size groups. I have code that splits a variable into groups and adds it as a new categorical variable, but when I tried to use it inside a function it doesn't work. What could be the problem? Also, how can I avoid running over non-continuous variables? The question included a toy data frame, referred to as df in the code.

Here are some possible problems in your function:
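# Original attempt: the loop only defines a function and never calls it,
# and df$i looks for a column literally named "i"
for (i in names(df)) function (x) { as.factor( as.numeric( cut(df$i,2)))  }

# Option 1: loop over the columns and collect the results in a list
lst <- vector('list', ncol(df))
for(i in seq_along(df)) {
        lst[[i]] <- as.factor(as.numeric(cut(df[,i], 2)))
 }
df[paste0(names(df), 'new')] <- lst

# Option 2 (alternative to option 1): the same with lapply
df[paste0(names(df), 'new')] <- lapply(df, function(x)
                   factor(cut(x, 2, labels=FALSE)))

# Example data: a binary column and a character column added to the OP's dataset
set.seed(24)
df1 <- data.frame(col1=sample(0:1, 10, replace=TRUE),
           col2=rnorm(10), col3=letters[1:10])
#df - OP's dataset
df2 <- cbind(df1, df)

# Skip non-continuous columns: keep numeric columns that are not only 0/1
 indx <- vapply(df2, function(x) !all(x %in% 0:1) & is.numeric(x), logical(1L))
   lst <- vector('list', ncol(df2[indx]))
   for(i in seq_along(df2[indx])) {
       lst[[i]] <- as.factor(as.numeric(cut(df2[indx][,i], 2)))
    }
  df2[paste0(names(df2)[indx], 'new')] <- lst

# The same selection with lapply
 df2[paste0(names(df2)[indx], 'new')] <- lapply(df2[indx],
                  function(x) factor(cut(x, 2, labels=FALSE)))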
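
The column selection and the two-way split can also be sketched in plain Python. Everything named here (is_continuous, cut_in_two, split_at_median, the sample table) is hypothetical and only illustrates the idea. Note that cut(x, 2) splits the value range into two equal-width intervals, which is not necessarily two equally sized groups; the median split is included as one possible way to get groups of roughly equal size.

```python
from statistics import median


def is_continuous(column):
    """Heuristic: every value is a number and the column is not just 0/1 flags."""
    numeric = all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in column
    )
    return numeric and not all(x in (0, 1) for x in column)


def cut_in_two(column):
    """Equal-width split of the value range into bins 1 and 2."""
    midpoint = (min(column) + max(column)) / 2
    return [1 if x <= midpoint else 2 for x in column]


def split_at_median(column):
    """Split at the median so both bins hold roughly the same number of values."""
    med = median(column)
    return [1 if x <= med else 2 for x in column]


# Hypothetical table stored as a dict of columns.
table = {
    "flag": [0, 1, 1, 0, 1],
    "score": [1.5, 9.0, 4.2, 7.7, 3.1],
    "label": ["a", "b", "c", "d", "e"],
}

# Add a "<name>new" column for every continuous column only.
new_columns = {
    name + "new": cut_in_two(col)
    for name, col in table.items()
    if is_continuous(col)
}
table.update(new_columns)
print(table["scorenew"])  # [1, 2, 1, 2, 1]
```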

Matlab: zero groups of non-zero elements in a matrix based on group size


Tag : matlab , By : user183676
Date : March 29 2020, 07:55 AM
Essentially, I have binary 3D image masks in which the 1s form groups of various shapes and sizes spread throughout the mask. Working in MATLAB, I've got tools that convert a mask into a matrix, and I want to go through the matrix and zero blobs of 1s (i.e. adjacent sets of non-zero elements surrounded by 0s) if the total size of the group is less than a given number of elements (say 30). Is there a pre-existing function that will do this, or will I need to get involved with kernels and the like?

Fortunately, MATLAB has a function for that: bwareaopen
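% Remove connected components with fewer than 30 elements (default connectivity)
maskWithOnlyBigObjects = bwareaopen(mask, 30);
% Same, but treat only face-adjacent voxels as connected (6-connectivity)
maskWithOnlyBigObjects = bwareaopen(mask, 30, conndef(3, 'minimal'));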
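
To make the operation concrete, here is a rough pure-Python sketch of what removing small blobs from a 3D mask involves. It is not how bwareaopen is implemented; the function name remove_small_blobs, the nested-list mask layout, and the fixed choice of 6-connectivity are assumptions made for this illustration.

```python
from collections import deque

# Face-adjacent neighbours only (6-connectivity).
NEIGHBOURS_6 = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def remove_small_blobs(mask, min_size):
    """Return a copy of a nested-list 3D 0/1 mask with blobs smaller than min_size zeroed."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    out = [[row[:] for row in plane] for plane in mask]
    seen = set()
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                # Breadth-first search collects one connected blob.
                blob = [(z, y, x)]
                seen.add((z, y, x))
                queue = deque(blob)
                while queue:
                    cz, cy, cx = queue.popleft()
                    for dz, dy, dx in NEIGHBOURS_6:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and mask[n[0]][n[1]][n[2]] and n not in seen:
                            seen.add(n)
                            blob.append(n)
                            queue.append(n)
                # Zero every voxel of a blob that is too small.
                if len(blob) < min_size:
                    for bz, by, bx in blob:
                        out[bz][by][bx] = 0
    return out


# Tiny 2x3x3 example: one blob of 3 voxels and one isolated voxel.
mask = [
    [[1, 1, 0], [0, 0, 0], [0, 0, 1]],
    [[1, 0, 0], [0, 0, 0], [0, 0, 0]],
]
cleaned = remove_small_blobs(mask, 2)
print(cleaned[0][2][2])  # 0, the isolated voxel is removed
print(cleaned[0][0][0])  # 1, the 3-voxel blob is kept
```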

Hive - Create map columns type by aggregating values across groups


Tag : sql , By : TheDave1022
Date : March 29 2020, 07:55 AM
This can be accomplished using a series of self-joins to find other rooms in the same category before combining the results into 2 maps.
Code
CREATE TABLE `table` AS
SELECT 1 AS customer, 'A' AS category, 'aa' AS room, 'd1' AS `date` UNION ALL
SELECT 1 AS customer, 'A' AS category, 'bb' AS room, 'd2' AS `date` UNION ALL
SELECT 1 AS customer, 'B' AS category, 'cc' AS room, 'd3' AS `date` UNION ALL
SELECT 1 AS customer, 'C' AS category, 'aa' AS room, 'd1' AS `date` UNION ALL
SELECT 1 AS customer, 'C' AS category, 'bb' AS room, 'd2' AS `date` UNION ALL
SELECT 2 AS customer, 'A' AS category, 'aa' AS room, 'd3' AS `date` UNION ALL
SELECT 2 AS customer, 'A' AS category, 'bb' AS room, 'd4' AS `date` UNION ALL
SELECT 2 AS customer, 'C' AS category, 'bb' AS room, 'd4' AS `date` UNION ALL
SELECT 2 AS customer, 'C' AS category, 'ee' AS room, 'd5' AS `date` UNION ALL
SELECT 3 AS customer, 'D' AS category, 'ee' AS room, 'd6' AS `date`
;


SELECT
    customer_rooms.customer,
    collect(customer_rooms.room, customer_rooms.date) AS map_customer_room_date,
    collect(
        COALESCE(customer_category_rooms.room, category_rooms.room),
        COALESCE(customer_category_rooms.date, category_rooms.date)) AS map_category_room_date
FROM `table` AS customer_rooms
JOIN `table` AS category_rooms ON customer_rooms.category = category_rooms.category
LEFT OUTER JOIN `table` AS customer_category_rooms ON customer_rooms.customer = customer_category_rooms.customer
AND category_rooms.category = customer_category_rooms.category
AND category_rooms.room = customer_category_rooms.room
WHERE (
    customer_rooms.customer = customer_category_rooms.customer AND
    customer_rooms.category = customer_category_rooms.category AND
    customer_rooms.room = customer_category_rooms.room AND
    customer_rooms.date = customer_category_rooms.date
)
OR (
    customer_category_rooms.customer IS NULL AND
    customer_category_rooms.category IS NULL AND
    customer_category_rooms.room IS NULL AND
    customer_category_rooms.date IS NULL
)
GROUP BY
    customer_rooms.customer
;
Output
1   {"aa":"d1","bb":"d2","cc":"d3"} {"aa":"d1","bb":"d2","cc":"d3","ee":"d5"}
2   {"aa":"d3","bb":"d4","ee":"d5"} {"aa":"d3","bb":"d4","ee":"d5"}
3   {"ee":"d6"} {"ee":"d6"}
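
The logic of the two maps may be easier to follow outside SQL. The following Python sketch rebuilds them from the same sample rows; it is an illustration, not a translation of the Hive query. The function name build_maps is made up, and letting the customer's own date win whenever a room appears more than once is an assumption chosen because it matches the output listed for the query.

```python
# Same sample rows as the CREATE TABLE statement: (customer, category, room, date).
rows = [
    (1, "A", "aa", "d1"), (1, "A", "bb", "d2"), (1, "B", "cc", "d3"),
    (1, "C", "aa", "d1"), (1, "C", "bb", "d2"),
    (2, "A", "aa", "d3"), (2, "A", "bb", "d4"),
    (2, "C", "bb", "d4"), (2, "C", "ee", "d5"),
    (3, "D", "ee", "d6"),
]


def build_maps(rows):
    """Return {customer: (customer_room_date_map, category_room_date_map)}."""
    result = {}
    for customer in sorted({r[0] for r in rows}):
        own = [r for r in rows if r[0] == customer]
        categories = {r[1] for r in own}
        # Map 1: the customer's own rooms and dates.
        customer_map = {room: date for _, _, room, date in own}
        # Map 2: start with rooms other customers have in the same categories ...
        category_map = {
            room: date
            for other, category, room, date in rows
            if other != customer and category in categories
        }
        # ... then let the customer's own dates win on any room collision.
        category_map.update(customer_map)
        result[customer] = (customer_map, category_map)
    return result


for customer, (own_map, category_map) in build_maps(rows).items():
    print(customer, own_map, category_map)
```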

Delete from PC table the computers having minimal hdd size or minimal ram size


Tag : sql , By : Blight
Date : March 29 2020, 07:55 AM
Your EXISTS will just delete everything from the table whenever the EXISTS condition is true.
You need to delete only the records you're after, which points to a window function.
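BEGIN TRAN;
DELETE p FROM PC p
INNER JOIN
(
    -- rank the rows within each model by hd, then ram, largest first
    SELECT Code,
    ROW_NUMBER() OVER (PARTITION BY model ORDER BY hd DESC, ram DESC) [RNum]
    FROM PC
) m ON m.Code = p.Code AND m.RNum = 1;
-- check the result, then run one of the following:
--COMMIT TRAN;
--ROLLBACK TRAN;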
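
As a rough illustration of what ranking per partition and deleting rank 1 does, here is a small Python sketch that mirrors the statement above. The function name delete_first_ranked and the sample rows are invented for the example. Ties are broken by keeping the first row seen, which is an assumption (ROW_NUMBER itself gives no guarantee on ties).

```python
def delete_first_ranked(rows):
    """Drop, for each model, the row ranked first by hd and then ram, both descending."""
    best = {}
    for row in rows:
        current = best.get(row["model"])
        if current is None or (row["hd"], row["ram"]) > (current["hd"], current["ram"]):
            best[row["model"]] = row
    doomed = {row["code"] for row in best.values()}
    return [row for row in rows if row["code"] not in doomed]


# Hypothetical contents of the PC table.
pc = [
    {"code": 1, "model": "1232", "hd": 5, "ram": 64},
    {"code": 2, "model": "1232", "hd": 10, "ram": 32},
    {"code": 3, "model": "1121", "hd": 14, "ram": 128},
    {"code": 4, "model": "1121", "hd": 14, "ram": 64},
    {"code": 5, "model": "1233", "hd": 20, "ram": 128},
]

remaining = delete_first_ranked(pc)
print([row["code"] for row in remaining])  # [1, 4]
```

Reversing the comparison (smallest first) would target the minimal rows per model instead; which ordering is wanted depends on the exact requirement.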

Python: Summarizing & Aggregating Groups and Sub-groups in DataFrame


Tag : python , By : Derek
Date : March 29 2020, 07:55 AM
Use DataFrame.melt with GroupBy.agg, passing tuples of (new column name, aggregate function):
df1 = (df.melt('interval', var_name='source')
         .groupby(['interval','source'])['value']
         .agg([('cnt','count'), ('average','mean')])
         .reset_index())
print (df1.head())
  interval source  cnt  average
0        0      a    1      5.0
1        0      b    1      0.0
2        0      c    1      0.0
3        0      d    1      0.0
4        0      f    1      0.0
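
To show what the melt and groupby steps do to the data, here is a standard-library sketch of the same reshaping and aggregation. It does not use pandas; the helper names melt and summarize and the small wide table are assumptions made for illustration, and the sample has no missing values (pandas count skips them).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical wide data: one row per observation, one column per source.
wide = [
    {"interval": 0, "a": 5, "b": 0},
    {"interval": 0, "a": 3, "b": 2},
    {"interval": 1, "a": 4, "b": 6},
]


def melt(records, id_var):
    """Turn each non-id column into its own (id, source, value) row."""
    long_rows = []
    for rec in records:
        for name, value in rec.items():
            if name != id_var:
                long_rows.append({id_var: rec[id_var], "source": name, "value": value})
    return long_rows


def summarize(long_rows, id_var):
    """Count and average the values for every (id, source) pair."""
    buckets = defaultdict(list)
    for row in long_rows:
        buckets[(row[id_var], row["source"])].append(row["value"])
    return [
        {id_var: key[0], "source": key[1], "cnt": len(vals), "average": mean(vals)}
        for key, vals in sorted(buckets.items())
    ]


summary = summarize(melt(wide, "interval"), "interval")
for row in summary:
    print(row)
```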