R: Too slow to iterate loops over a million vectors


Tag : r , By : ganok_tor
Date : January 11 2021, 05:14 PM

You should probably try to vectorize your operations (NB: for loops can often be avoided in R). In addition, you could check out the data.table package to further improve efficiency:
library(data.table)

set.seed(1)

## create data.table
eco <- as.data.table(matrix(sample(1:100, 13 * 2598893, replace = TRUE), ncol = 13))

## update column
system.time(
    set(eco, j = 13L, value = 1 * (eco[[4]] <= 15))
)
#>    user  system elapsed 
#>   0.018   0.016   0.033

eco
#>          V1 V2  V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13
#>       1: 68 74  55 62 82 51 42 18 16  12  50  73   0
#>       2: 39 97  53 61 21 25 79 71 85  19  54  30   0
#>       3:  1 89  62 42  5 90 33 77 31   1  59  26   0
#>       4: 34 22  27  4 36 74 65 45 46  67  74  34   1
#>       5: 87 57  88  4 42 26  9 13 64  32  16  15   1
#>      ---                                            
#> 2598889: 91 59  78 28 98 98 13 87 88  46  66  85   0
#> 2598890: 82 60  87 60 49 25 10  9 97  78  61  91   0
#> 2598891: 19  2 100 75 66 88 12 46 94  32  69  56   0
#> 2598892: 18 47  22 87 23 79 56 99 13  29  15  46   0
#> 2598893: 47 30   8  8  9 80 49 78 20  43  86  11   1


KMeans clustering for more than 5 million vectors


Tag : algorithm , By : BinaryBoy
Date : March 29 2020, 07:55 AM
For anyone who wants clustering for large-scale datasets, the only way I found of doing so is to use Mahout. It requires a Linux platform, so I had to install Ubuntu in VirtualBox and then run Mahout there. Setting up Mahout is a lengthy procedure, but the two links that I used are as follows.
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster)

Will adding an index on a table of 2 million records be twice as slow as the same table with 1 million records?


Tag : mysql , By : ponchopilate
Date : March 29 2020, 07:55 AM
(Disclaimer: I have minimal experience with MySQL.)
It should be somewhere in-between.

Fastest way to sort and concatenate million or billion STL vectors


Tag : cpp , By : General Mills
Date : March 29 2020, 07:55 AM
How about this:
  • Split the vectors into as many piles as there are cores.
  • Calculate the size needed for each pile.
  • Allocate space in a single vector for all the data.
  • Split this vector into one part per core.
  • Feed each part and its pile to a thread for merging.
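#include <array>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

typedef unsigned long long ULLINT;  // assumed definition; not given in the original
typedef std::vector<std::vector<ULLINT>> ManyVectors;

// split_vector, calculate_sizes, sum_of_sizes and merge_vectors are helpers
// that are not shown here.
std::vector<ULLINT> merge(const ManyVectors& vector_of_vectors) {
  const int cores = 16;
  std::array<ManyVectors, cores> piles = split_vector(vector_of_vectors, cores);
  std::array<size_t, cores> sizes = calculate_sizes(piles, cores);
  std::vector<ULLINT> result;
  // resize rather than reserve, so the threads write into existing elements
  result.resize(sum_of_sizes(sizes));
  std::vector<std::thread> workers;
  size_t used = 0;
  int core = 0;
  for (ManyVectors& pile : piles) {
    // each thread writes only to its own slice of result
    workers.emplace_back(merge_vectors, std::ref(pile), result.begin() + used);
    used += sizes[core];
    core += 1;
  }
  // a std::thread must be joined (or detached) before it is destroyed
  for (std::thread& worker : workers) worker.join();
  return result;
}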
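
To make the outline above concrete, here is a minimal self-contained sketch of one way the missing helpers could be filled in. It rests on several assumptions that are not in the original answer: ULLINT is taken to be unsigned long long, piles are contiguous ranges of the input vectors, "merging" a pile means copying its vectors into its slice of the result and sorting that slice, and a final single-threaded std::inplace_merge pass combines the sorted slices. The names merge_pile and sort_and_concatenate are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

using ULLINT = unsigned long long;  // assumption: the original does not define ULLINT
using ManyVectors = std::vector<std::vector<ULLINT>>;

// Hypothetical helper: copy input[first..last) into the result slice that
// starts at `out`, then sort that slice.
void merge_pile(const ManyVectors& input, std::size_t first, std::size_t last,
                std::vector<ULLINT>::iterator out) {
  std::vector<ULLINT>::iterator end = out;
  for (std::size_t i = first; i < last; ++i) {
    end = std::copy(input[i].begin(), input[i].end(), end);
  }
  std::sort(out, end);
}

// Hypothetical driver: one thread per pile, then merge the sorted slices.
std::vector<ULLINT> sort_and_concatenate(const ManyVectors& input,
                                         std::size_t piles) {
  if (piles == 0) piles = 1;

  // Pile p covers the input vectors [bounds[p], bounds[p + 1]).
  std::vector<std::size_t> bounds(piles + 1);
  for (std::size_t p = 0; p <= piles; ++p) {
    bounds[p] = input.size() * p / piles;
  }

  // offsets[p] is the position in the result where pile p starts.
  std::vector<std::size_t> offsets(piles + 1, 0);
  for (std::size_t p = 0; p < piles; ++p) {
    std::size_t n = 0;
    for (std::size_t i = bounds[p]; i < bounds[p + 1]; ++i) n += input[i].size();
    offsets[p + 1] = offsets[p] + n;
  }

  // Size the result up front so every thread writes into existing elements.
  std::vector<ULLINT> result(offsets[piles]);

  std::vector<std::thread> workers;
  for (std::size_t p = 0; p < piles; ++p) {
    workers.emplace_back(merge_pile, std::cref(input), bounds[p], bounds[p + 1],
                         result.begin() + offsets[p]);
  }
  for (std::thread& worker : workers) worker.join();

  // After step p, the range [0, offsets[p + 1]) is fully sorted.
  for (std::size_t p = 1; p < piles; ++p) {
    std::inplace_merge(result.begin(), result.begin() + offsets[p],
                       result.begin() + offsets[p + 1]);
  }
  return result;
}
```

A call such as sort_and_concatenate(vectors, std::thread::hardware_concurrency()) would use one pile per hardware thread. The slices never overlap, so the worker threads need no locking; the final merge pass is sequential here and could become the bottleneck for very large inputs.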

Efficient comparison of 1 million vectors containing (float, integer) tuples


Tag : database , By : Ruchi
Date : October 26 2020, 11:52 AM

most frequent vector that appears out of 1 million random vectors generated


Tag : matlab , By : Dominique Vocat
Date : March 29 2020, 07:55 AM