You're doing it wrong. Calling [.data.table in a loop, which is what your lapply does, is going to be slow: that function has a lot of overhead, and the overhead is not worth it for the tiny operation you perform on each call. The correct way is a non-equi join:
table[data.table(x), on = .(min.x < x, max.x > x), rowname, by = .EACHI]
#          min.x    max.x rowname
#    1: 1.084668 1.084668       1
#    2: 1.293461 1.293461    7734
#    3: 1.293461 1.293461     739
#    4: 1.293461 1.293461       2
#    5: 1.293461 1.293461    3757
#   ---
#30216: 1.324366 1.324366    9999
#30217: 1.324366 1.324366    9635
#30218: 1.869469 1.869469    8740
#30219: 1.869469 1.869469    3302
#30220: 1.869469 1.869469   10000
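To make the join easier to follow, here is a minimal sketch on toy data. The tables `intervals` and `points` and their values are hypothetical stand-ins for the `table` and `x` of the original question, which are not shown here. Each row of `points` is matched to every interval that strictly contains it.

```r
library(data.table)

# Hypothetical intervals; rowname identifies each interval.
intervals <- data.table(rowname = 1:3,
                        min.x = c(0, 2, 4),
                        max.x = c(3, 5, 6))

# Hypothetical query points.
points <- data.table(x = c(1, 2.5, 4.5))

# Non-equi join: for each x, return the rownames of all intervals
# with min.x < x and max.x > x. by = .EACHI groups by each row of points.
res <- intervals[points, on = .(min.x < x, max.x > x), .(rowname), by = .EACHI]
print(res)
```

As in the output above, the `min.x` and `max.x` columns of the result hold the value of `x` that was looked up, not the interval bounds. The whole lookup is a single call to `[.data.table`, so the per-call overhead is paid once.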

Memory-efficient subsetting of large data.table
Tag : r , By : user119605
Date : March 29 2020, 07:55 AM
I have a SQLite database with a size of 11 GB and 16 GB of RAM (shared with the OS and so on). I want to subset it with data.table. The only two I have in mind at the moment:
sql = "SELECT *, period >= stableStateStart AS tmpcol FROM inventory"
inventory = setDT(dbGetQuery(conn, sql), key="tmpcol")
inventory[.(TRUE)]
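The idea above (compute the filter condition inside SQLite, then key on it) can be tried on a small in-memory database. This is only a sketch under assumptions: it needs the DBI and RSQLite packages, and the toy `inventory` table and its values are made up. SQLite has no boolean type and returns the comparison as 0/1, so the sketch converts `tmpcol` to logical before keying. That conversion is my addition, not part of the original snippet.

```r
library(DBI)
library(data.table)

# Hypothetical stand-in for the 11 GB database: a tiny in-memory table.
conn <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(conn, "inventory",
             data.frame(period = 1:6, stableStateStart = 4L))

# Let SQLite evaluate the condition; it comes back as a 0/1 column.
sql <- "SELECT *, period >= stableStateStart AS tmpcol FROM inventory"
inventory <- setDT(dbGetQuery(conn, sql))

# Convert to logical, key on it, then subset by key lookup.
inventory[, tmpcol := as.logical(tmpcol)]
setkey(inventory, tmpcol)
stable <- inventory[.(TRUE)]

dbDisconnect(conn)
print(stable)
```

Note that this still loads the full table into R before subsetting, so on its own it does not reduce peak memory.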

C++: Efficient way to check if elements in a vector are greater than elements in another having same indices?
Tag : cpp , By : user181945
Date : March 29 2020, 07:55 AM
This is called the maxima of a point set. For two and three dimensions, it can be solved in O(n log n) time. For more than three dimensions, it can be solved in O(n (log n)^(d − 3) log log n) time. For random points, a linear expected-time algorithm is available.
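For the two-dimensional case, the O(n log n) bound can be reached with a sort followed by a single sweep. The sketch below is written in R rather than C++ to match the other snippets on this page. The function name `maxima_2d` is hypothetical, and it assumes distinct points, where a point is maximal if no other point is at least as large in both coordinates.

```r
# Return the indices of the maximal points among (x[i], y[i]).
maxima_2d <- function(x, y) {
  # Visit points from largest x to smallest; break ties by larger y first.
  ord <- order(-x, -y)
  ys <- y[ord]
  # Largest y among the points visited before each point.
  best_before <- cummax(c(-Inf, head(ys, -1)))
  # A point is maximal only if its y beats every point with x at least as large.
  sort(ord[ys > best_before])
}

x <- c(1, 2, 3, 2)
y <- c(4, 3, 1, 2)
maxima_2d(x, y)
```

Here the point (2, 2) is dropped because (2, 3) dominates it, while the other three points are all maximal. The sort dominates the cost and the sweep itself is linear.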

Different results when subsetting data.table columns with numeric indices in different ways
Tag : r , By : user187301
Date : March 29 2020, 07:55 AM
By looking at the source code we can simulate data.table's behaviour for different inputs:
if (!missing(j)) {
  jsub = replace_dot_alias(substitute(j))
  root = if (is.call(jsub)) as.character(jsub[[1L]])[1L] else ""
  if (root == ":" ||
      (root %chin% c("-","!") && is.call(jsub[[2L]]) && jsub[[2L]][[1L]]=="(" && is.call(jsub[[2L]][[2L]]) && jsub[[2L]][[2L]][[1L]]==":") ||
      ( (!length(av<-all.vars(jsub)) || all(substring(av,1L,2L)=="..")) &&
        root %chin% c("","c","paste","paste0","-","!") &&
        missing(by) )) { # test 763. TODO: likely that !missing(by) iff with==TRUE (so, with can be removed)
    # When no variable names (i.e. symbols) occur in j, scope doesn't matter because there are no symbols to find.
    # If variable names do occur, but they are all prefixed with .., then that means look up in calling scope.
    # Automatically set with=FALSE in this case so that DT[,1], DT[,2:3], DT[,"someCol"] and DT[,c("colB","colD")]
    # work as expected. As before, a vector will never be returned, but a single column data.table
    # for type consistency with >1 cases. To return a single vector use DT[["someCol"]] or DT[[3]].
    # The root==":" is to allow DT[,colC:colH] even though that contains two variable names.
    # root == "-" or "!" is for tests 1504.11 and 1504.13 (a : with a ! or - modifier root)
    # We don't want to evaluate j at all in making this decision because i) evaluating could itself
    # increment some variable and not intended to be evaluated a 2nd time later on and ii) we don't
    # want decisions like this to depend on the data or vector lengths since that can introduce
    # inconsistency reminiscent of drop=TRUE in [.data.frame that we seek to avoid.
    with=FALSE
  }
# The same test as a standalone helper (the missing(by) check is dropped).
# Requires library(data.table) for %chin%.
is_satisfied <- function(...) {
  jsub <- substitute(...)
  root = if (is.call(jsub)) as.character(jsub[[1L]])[1L] else ""
  if (root == ":" ||
      (root %chin% c("-","!") &&
       is.call(jsub[[2L]]) &&
       jsub[[2L]][[1L]]=="(" &&
       is.call(jsub[[2L]][[2L]]) &&
       jsub[[2L]][[2L]][[1L]]==":") ||
      ( (!length(av<-all.vars(jsub)) || all(substring(av,1L,2L)=="..")) &&
        root %chin% c("","c","paste","paste0","-","!"))) TRUE else FALSE
}
is_satisfied("x")
# [1] TRUE
is_satisfied(c("x", "y"))
# [1] TRUE
is_satisfied(..x)
# [1] TRUE
is_satisfied(1:2)
# [1] TRUE
is_satisfied(c(1:2))
# [1] TRUE
is_satisfied((1:2))
# [1] FALSE
is_satisfied(y)
# [1] FALSE
is_satisfied(list(x, y))
# [1] FALSE
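The practical effect of that condition can be seen on a small table. The following is a sketch on a hypothetical `DT`; it only covers cases whose behaviour follows directly from the source comments quoted above: numeric indices, character names, and `..`-prefixed variables select columns, while a bare column symbol is evaluated inside the table.

```r
library(data.table)

# Hypothetical three-column table.
DT <- data.table(a = 1:3, b = 4:6, c = 7:9)

# No symbols in j, so with=FALSE is set automatically: columns are selected.
by_index <- DT[, 1:2]
by_name  <- DT[, c("a", "b")]

# All symbols are prefixed with "..", so cols is looked up in the calling scope.
cols <- c("a", "b")
by_dotdot <- DT[, ..cols]

# A bare symbol is evaluated within DT and returns the column as a vector.
as_vector <- DT[, a]

print(by_index)
print(as_vector)
```

The first three calls return the same two-column data.table, and the last returns an integer vector. Wrapping the index in parentheses, as in `(1:2)`, makes the root of the expression `(`, which is the case `is_satisfied` reports as FALSE.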

Subsetting a data frame depending if value in column of reference is greater or lower than 0
Date : March 29 2020, 07:55 AM
This can also be done succinctly using rowSums() and sign():
mismatch = 1
df[rowSums(sign(df)) >= (ncol(df) - mismatch * 2), ]
     col1 col2 col3 col_Reference
[1,]    1    1    1             5
[2,]    2    2    2             6
[3,]    4    4    4             8
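To see why the threshold is `ncol(df) - mismatch * 2`: each negative entry lowers the row sum of signs by 2 (from +1 to -1), so a row with at most `mismatch` negative values has a sign sum of at least `ncol(df) - 2 * mismatch`. The sketch below uses made-up data and assumes the reference column is positive and no value is exactly zero.

```r
# Hypothetical data: rows 2 and 4 each contain two negative values.
df <- data.frame(col1 = c(1, -2, 4, -1),
                 col2 = c(1,  2, 4, -3),
                 col3 = c(-1, -2, 4, 2),
                 col_Reference = c(5, 6, 8, 7))

mismatch <- 1  # number of sign disagreements tolerated per row

# sign() maps each value to -1, 0 or 1; rowSums() adds the signs per row.
sign_sum <- rowSums(sign(df))

# Keep rows with at most `mismatch` negative entries.
kept <- df[sign_sum >= (ncol(df) - mismatch * 2), ]
print(kept)
```

With these values, rows 1 and 3 are kept (sign sums 2 and 4) and rows 2 and 4 are dropped (sign sums 0).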

Generate the output array A[] when the number of items greater than a[i] for indices greater than i is given
Tag : arrays , By : Simon Hogg
Date : March 29 2020, 07:55 AM

