Apart from what @Kasra suggested, the other slow part of your code is the deletion of the item at index 0, which is an O(N) operation for lists. A better data structure for this is collections.deque, which allows fast insertion and deletion at either end.
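A minimal sketch of the difference (the list size here is just illustrative):

from collections import deque

items = list(range(100_000))
d = deque(items)

# list.pop(0) shifts every remaining element one slot left: O(N) per call.
items.pop(0)

# deque.popleft() removes from the left end without shifting: O(1) per call.
d.popleft()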
Based on the times you're listing (10-20 seconds each), it seems nearly certain that you're compiling with optimization disabled. This renders your results basically meaningless. Doing a quick test on my (roughly 7-year-old) machine with optimization enabled, I get a time of essentially 0 for vector and about 1.2-1.5 seconds for list (1.2 with VC++, 1.5 with g++).
Can iterating over an unsorted data structure (like an array or tree) with multiple threads make the iteration faster?
The answer is yes, it can make it faster, but not necessarily. In your case, where you're iterating over pretty small arrays, the overhead of launching a new thread will likely be much higher than the benefit gained. If your array were much bigger, that overhead would shrink as a proportion of the overall runtime and eventually become worth paying. Note that you will only get a speedup if your system has more than one physical core available. Additionally, note that while the code that reads the array is perfectly thread-safe in your case, writing to std::cout is not (you will get very strange-looking output if you try it). Instead, each thread should do something like return an integer indicating the number of instances it found, as sketched below.
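The question's context is C++ threads; as a rough Python analogue of the "each worker returns a count" idea (using processes rather than threads, since CPython threads don't run CPU-bound loops in parallel; all names below are illustrative):

from concurrent.futures import ProcessPoolExecutor

def count_in_chunk(args):
    chunk, target = args
    # Each worker returns an integer instead of printing, so no
    # synchronization on shared output is needed.
    return sum(1 for x in chunk if x == target)

def parallel_count(data, target, workers=4):
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ProcessPoolExecutor(max_workers=workers) as ex:
        return sum(ex.map(count_in_chunk, [(c, target) for c in chunks]))

if __name__ == '__main__':  # guard required for process pools on some platforms
    print(parallel_count(list(range(1000)) * 100, target=7))  # prints 100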
How can my Postgres query perform faster? Can I use Python to provide faster iteration?
Try the following, which eliminates your count(*) and instead uses exists; exists can stop as soon as it finds one matching row, whereas count(*) has to visit every match.
with dupe as (
    select id,
           json_document->'Firstname'->0->'Content' as first_name,
           json_document->'Lastname'->0->'Content' as last_name,
           identifiers->'RecordID' as record_id
    from (
        select *,
               jsonb_array_elements(json_document->'Identifiers') as identifiers
        from staging ) sub
    order by last_name )
select * from dupe da
where exists (
    select 1
    from dupe db
    where db.record_id = da.record_id
      and db.id != da.id );
I think the issue is the following: in your code you're calculating the median 50K times even though it is always the same. Since computing the median requires sorting your 50K values, this ends up being pretty expensive. Below you find a numpy-based snippet.
import numpy as np

data = np.loadtxt('text.txt', dtype=str)        # read the file as strings
grades = data[1:, 2].astype(float)              # skip the header, take the grade column
norm_grades = grades / np.median(grades) * 500  # median computed once, applied vectorized
Make this faster. (Min, Max in same iteration using a condition)
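A minimal sketch of the single-pass approach the title describes (the function name is illustrative):

def min_max(values):
    it = iter(values)
    lo = hi = next(it)  # assumes a non-empty input
    for v in it:
        if v < lo:
            lo = v
        elif v > hi:
            hi = v
    return lo, hi

Using elif works because once lo <= hi holds, a value below the current minimum can never also be a new maximum, so each element needs at most two comparisons.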