Optimizing Java heap usage by String using StringBuffer, StringBuilder, String.intern()
Date : March 29 2020, 07:55 AM
What I do is keep one or more String pools. I do this to a) avoid creating new Strings when I already have one in the pool and b) reduce the retained memory size, sometimes by a factor of 3-5. You can write a simple string interner yourself, but I suggest you first consider how the data is read in to determine the optimal solution. This matters because you can easily make matters worse if your solution isn't efficient. As EJP points out, processing a line at a time is more efficient, as is parsing each line as you read it; i.e. an int or double takes up far less space than the same value held as a String (unless you have a very high rate of duplication).
public class StringInterner {
    // @NotNull/@Nullable are assumed to come from an annotations library
    // (for example org.jetbrains.annotations); import whichever your project uses.
    @NotNull
    private final String[] interner;
    private final int mask;

    public StringInterner(int capacity) {
        // Round up to a power of two so "& mask" can stand in for "%".
        int n = nextPower2(capacity, 128);
        interner = new String[n];
        mask = n - 1;
    }

    // No @Override here: the class as shown implements no interface.
    @NotNull
    public String intern(@NotNull CharSequence cs) {
        long hash = 0;
        for (int i = 0; i < cs.length(); i++)
            hash = 57 * hash + cs.charAt(i);
        int h = hash(hash) & mask;
        String s = interner[h];
        if (isEqual(s, cs))
            return s;
        // Miss or collision: overwrite the slot, so the pool never grows.
        String s2 = cs.toString();
        return interner[h] = s2;
    }

    static boolean isEqual(@Nullable CharSequence s, @NotNull CharSequence cs) {
        if (s == null) return false;
        if (s.length() != cs.length()) return false;
        for (int i = 0; i < cs.length(); i++)
            if (s.charAt(i) != cs.charAt(i))
                return false;
        return true;
    }

    static int nextPower2(int n, int min) {
        if (n < min) return min;
        if ((n & (n - 1)) == 0) return n;
        int i = min;
        while (i < n) {
            i *= 2;
            // Overflow guard: cap at the largest positive power of two.
            if (i <= 0) return 1 << 30;
        }
        return i;
    }

    // Mix the high bits of the 64-bit hash into the low bits used for indexing.
    static int hash(long n) {
        n ^= (n >> 43) ^ (n >> 21);
        n ^= (n >> 15) ^ (n >> 7);
        return (int) n;
    }
}
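To make the pooling idea above easier to try out, here is a minimal Python sketch of the same fixed-size, overwrite-on-collision design. It is an illustration only, not a port of the class above: it uses Python's built-in `hash` rather than the custom hash function, and the `_slots` and `_mask` names are made up for this example.

```python
from typing import List, Optional


class StringInterner:
    """Fixed-size, lossy string pool: a colliding entry is simply overwritten."""

    def __init__(self, capacity: int = 128) -> None:
        # Round the capacity up to a power of two (minimum 128) so that
        # "hash & mask" always yields a valid slot index.
        n = 128
        while n < capacity:
            n *= 2
        self._slots: List[Optional[str]] = [None] * n
        self._mask = n - 1

    def intern(self, text: str) -> str:
        index = hash(text) & self._mask
        cached = self._slots[index]
        if cached == text:
            return cached          # hit: hand back the pooled object
        self._slots[index] = text  # miss or collision: overwrite the slot
        return text


if __name__ == "__main__":
    pool = StringInterner(1024)
    first = pool.intern("".join(["EUR", "USD"]))
    second = pool.intern("".join(["EUR", "USD"]))
    print(first is second)  # prints True: both names refer to the pooled object
```

Because the table never grows, memory use is bounded, at the cost of occasionally re-creating a string that was evicted by a collision.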
|
Python, optimizing a list comprehension for string concatenation
Date : March 29 2020, 07:55 AM
I'm using cProfile to run benchmarks on a script that processes strings via a list comprehension, and I'd like to optimize the line that builds each string by concatenation.
This should bring at least a minimal improvement:
# First precalculate the static part of the string
template = 'prefix%s_' % day + '%s'
# Then, use the %s string interpolation instead of joining strings with '+'
# -->> Note: this proved to be wrong later...
signals = [template % s for s in signals]
# Alternatively, use map to compare efficiency (Python 2 returns a list here;
# Python 3 returns a lazy iterator, so wrap it in list() to actually build the strings)
signals = list(map(template.__mod__, signals))
>>> import timeit
>>> day = 45
>>> signals = ['aaaa', 'bbbb', 'cccccc', 'dddddddddddd']
>>> timeit.timeit("[('prefix' + str(day) + '_' + s) for s in signals]", 'from __main__ import day, signals')
1.35095184709592
>>> template = 'prefix%s_' % day + '%s'
>>> timeit.timeit("[template % s for s in signals]", 'from __main__ import template, signals')
0.7075940089748229
>>> timeit.timeit("map(template.__mod__, signals)", 'from __main__ import template, signals')
0.9939713030159822
>>> template = 'prefix%s_' % day
>>> timeit.timeit("[template + s for s in signals]", 'from __main__ import template, signals')
0.39771016975851126
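The interactive session above can be turned into a small repeatable script. The following is a sketch only: the function names `concat_each`, `interpolate` and `prefix_once` are invented for this example, and each function rebuilds its template or prefix on every call, so the absolute numbers will differ from the session above.

```python
import timeit

day = 45
signals = ["aaaa", "bbbb", "cccccc", "dddddddddddd"]


def concat_each(day, signals):
    # Baseline: rebuild the whole string, including str(day), for every element.
    return ["prefix" + str(day) + "_" + s for s in signals]


def interpolate(day, signals):
    # Precompute the static part, then use %-interpolation per element.
    template = "prefix%s_" % day + "%s"
    return [template % s for s in signals]


def prefix_once(day, signals):
    # Precompute the prefix, then do a single concatenation per element.
    prefix = "prefix%s_" % day
    return [prefix + s for s in signals]


if __name__ == "__main__":
    expected = concat_each(day, signals)
    for func in (concat_each, interpolate, prefix_once):
        # Check that every variant builds the same strings before timing it.
        assert func(day, signals) == expected
        seconds = timeit.timeit(lambda: func(day, signals), number=100_000)
        print(f"{func.__name__}: {seconds:.3f}s")
```

Checking the outputs against each other first guards against "optimizing" into a variant that produces different strings.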
|
Optimizing Duplicate node search(Closed list,Open List) on N-Puzzle using A-star
Date : March 29 2020, 07:55 AM
Instead of searching through all the nodes sequentially, you may want to keep a map or set of nodes that uses the grid for comparison:
#include <cassert>
#include <set>
struct GridLess {
    // Strict weak ordering: lexicographic comparison of the two grids.
    bool operator()(const Node *a, const Node *b) const
    {
        assert(a);
        assert(b);
        for (int i = 0; i < N; i++)
        {
            for (int j = 0; j < N; j++)
            {
                if (a->Grid[i][j] != b->Grid[i][j])
                {
                    return a->Grid[i][j] < b->Grid[i][j];
                }
            }
        }
        return false; // identical grids: neither is less than the other
    }
};

std::set<Node*, GridLess> closed_list;

if (closed_list.count(temp_Node) == 0) {
    // No node in closed_list has the same grid as temp_Node
}
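The same duplicate check can be sketched in Python. Note that this swaps in a different technique: a hash-based set keyed on an immutable copy of the grid, rather than the ordered, comparator-based `std::set` shown above. The `freeze` and `visit` names are hypothetical helpers for this illustration.

```python
from typing import List, Set, Tuple

Grid = Tuple[Tuple[int, ...], ...]

# Closed list: every grid that has already been expanded.
closed: Set[Grid] = set()


def freeze(grid: List[List[int]]) -> Grid:
    """Convert a mutable grid into a hashable tuple of tuples."""
    return tuple(tuple(row) for row in grid)


def visit(grid: List[List[int]]) -> bool:
    """Record the grid; return True only the first time it is seen."""
    key = freeze(grid)
    if key in closed:
        return False
    closed.add(key)
    return True


if __name__ == "__main__":
    start = [[1, 2, 3], [4, 5, 6], [7, 8, 0]]
    print(visit(start))  # True: first time this grid is seen
    print(visit(start))  # False: already in the closed list
```

Either way, the membership test no longer scans every node: the ordered set costs a logarithmic number of grid comparisons, and the hash set costs one hash plus an equality check on average.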
|
Optimizing my insertion to SQL table while parsing through a long list of string
Tag : chash , By : RinKaMan
Date : March 29 2020, 07:55 AM
I think what you're looking for is SqlBulkCopy. I had to use it a while ago to load a large amount of data into SQL Server for analysis; here's what my function looked like:
// Needs System.Data, System.Data.SqlClient and System.Linq; MessageBox is assumed to be the Windows Forms one.
private void PopulateTable(DataTable data, string tableName)
{
    string connectionString = @"Server=PC40808\SQLEXPRESS;Database=Scratchpad;Trusted_Connection=True;";
    SqlConnection conn = new SqlConnection(connectionString);
    conn.Open();
    // Run the bulk copy inside a transaction so a failure leaves the table unchanged.
    SqlTransaction transaction = conn.BeginTransaction();
    try
    {
        SqlBulkCopy copy = new SqlBulkCopy(conn, SqlBulkCopyOptions.KeepIdentity, transaction);
        copy.DestinationTableName = tableName;
        copy.WriteToServer(data);
        transaction.Commit();
    }
    catch (Exception ex)
    {
        transaction.Rollback();
        MessageBox.Show(ex.ToString());
    }
    finally
    {
        conn.Close();
    }
}

// Builds an empty DataTable whose columns match the destination table.
private DataTable GetBlankDTForRetrievals()
{
    DataTable retVal = new DataTable("retrievals");
    retVal.Columns.Add("ObjectID", typeof(string));
    retVal.Columns.Add("DateTimeStamp", typeof(string));
    retVal.Columns.Add("Username", typeof(string));
    retVal.Columns.Add("DocClass", typeof(string));
    retVal.Columns.Add("Func", typeof(string));
    return retVal;
}

// Parses one fixed-width file into rows, ready to hand to PopulateTable.
private DataTable GetDataTableForSingleRetrievalFile(string fileLoc)
{
    DataTable retVal = GetBlankDTForRetrievals();
    string[] lines = System.IO.File.ReadAllLines(fileLoc);
    // Keep only data lines: long enough for the last field and starting with a space.
    foreach (string line in lines.Where(l => l.Length > 94).Where((l) => l.StartsWith(" ")))
    {
        DataRow rowToAdd = retVal.NewRow();
        rowToAdd["ObjectID"] = line.Substring(48, 15);
        rowToAdd["DateTimeStamp"] = line.Substring(1, 17);
        rowToAdd["Username"] = line.Substring(32, 15);
        rowToAdd["DocClass"] = line.Substring(69, 20);
        rowToAdd["Func"] = line.Substring(90, 4);
        retVal.Rows.Add(rowToAdd);
    }
    return retVal;
}
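The general pattern above (parse every line into rows first, then send all rows in a single batched, transactional insert) can be sketched in Python. This is an analogy rather than an equivalent of SqlBulkCopy: it uses the standard-library `sqlite3` module and `executemany`, and the `create_table`, `parse_line` and `load` helpers are hypothetical names for this example. The field offsets are copied from the `Substring` calls above.

```python
import sqlite3
from typing import Iterable, Tuple

# (start, length) per column, in table order: ObjectID, DateTimeStamp,
# Username, DocClass, Func.
FIELDS = [(48, 15), (1, 17), (32, 15), (69, 20), (90, 4)]


def create_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE retrievals "
        "(ObjectID TEXT, DateTimeStamp TEXT, Username TEXT, DocClass TEXT, Func TEXT)"
    )


def parse_line(line: str) -> Tuple[str, ...]:
    """Slice one fixed-width line into its five fields."""
    return tuple(line[start:start + length] for start, length in FIELDS)


def load(conn: sqlite3.Connection, lines: Iterable[str]) -> int:
    """Insert all qualifying lines in one transaction; return the row count."""
    rows = [parse_line(l) for l in lines if len(l) > 94 and l.startswith(" ")]
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany("INSERT INTO retrievals VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)
```

The point carried over from the answer is the shape of the work: one round of parsing, then one bulk write, instead of one INSERT statement per line.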
|
Optimizing big string arrays for checking if a string exists queries
Tag : php , By : Topher Cyll
Date : March 29 2020, 07:55 AM
I would suggest sorting the array once and then using binary search to check whether a value exists. Sorting costs O(N log N), and each lookup afterwards costs O(log N), where N is the number of elements in the array.
<?php
// Binary search over an array that is already sorted in strcmp() order.
function checkIfValueExists($arr, $search_value){
    $low = 0;
    $high = count($arr) - 1;
    while($low <= $high){
        $mid = $low + intval(($high - $low) / 2);
        $compare_result = strcmp($arr[$mid], $search_value);
        if($compare_result === 0) return true;
        else if($compare_result < 0) $low = $mid + 1;
        else $high = $mid - 1;
    }
    return false;
}

// Test script: build 50,000 random lowercase strings plus a list of values to look up.
$arr = array();
$str = "abcdefghijklmnopqrstuvwxyz";
$values_to_check = array();
for($i = 1; $i <= 50000; ++$i){
    $str_length = rand(1, 50);
    $new_str = "";
    while($str_length-- > 0){
        $new_str .= $str[rand(0, 25)];
    }
    $arr[] = $new_str;
    if(rand(0, 1) === 1){
        // Half of the lookups get an extra letter appended, so they may not be in the array.
        $values_to_check[] = rand(0, 1) === 1 ? $new_str . $str[rand(0, 25)] : $new_str;
    }
}
// Sort as strings so the order matches what strcmp() expects.
sort($arr, SORT_STRING);
// test the functionality
foreach($values_to_check as $each_value){
    var_dump(checkIfValueExists($arr, $each_value));
    echo "<br/>";
}
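For comparison, the sort-then-binary-search approach can be sketched in Python with the standard-library `bisect` module. The `contains_sorted` name is a hypothetical helper for this example; it assumes the input is already sorted in the same order used for comparison.

```python
from bisect import bisect_left
from typing import Sequence


def contains_sorted(sorted_values: Sequence[str], target: str) -> bool:
    """Binary search: O(log N) comparisons on an already sorted sequence."""
    index = bisect_left(sorted_values, target)
    return index < len(sorted_values) and sorted_values[index] == target


if __name__ == "__main__":
    words = sorted(["pear", "apple", "fig", "kiwi"])  # sort once, O(N log N)
    print(contains_sorted(words, "fig"))    # True
    print(contains_sorted(words, "grape"))  # False
```

`bisect_left` returns the position where the target would be inserted, so the equality check on that position is what turns it into a membership test.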
|