SQL query on inner join extremely slow
Tag : c# , By : user179190
Date : March 29 2020, 07:55 AM
I have a database in SQL Server. There are 2 tables in it, let's call them MASTER and SLAVE. There is a one-to-many relationship between them, so one MASTER record can connect to many SLAVE records. This is your query:
SELECT m.date_time, s.*
FROM MASTER m INNER JOIN
     SLAVE s
     ON m.gprs_id = s.recordid
WHERE m.date_time >= @fromdate AND m.date_time <= @todate;
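For a query like this, the usual remedy is indexing: one index covering the date filter and join key on MASTER, and one on the join key of SLAVE. A minimal sketch, assuming the table and column names shown above (the index names are made up):
-- Assumed names; adjust to your own naming convention.
CREATE INDEX IX_MASTER_date_time_gprs_id ON MASTER (date_time, gprs_id);
CREATE INDEX IX_SLAVE_recordid ON SLAVE (recordid);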
extremely slow unloading table from bigquery to Google cloud storage
Tag : python , By : user181945
Date : March 29 2020, 07:55 AM
The way you've formulated your request, it is writing a single 300 MB CSV file from a single worker. That is going to be fairly slow (5 minutes is still longer than I'd expect, but within a reasonable realm). If you use a glob pattern (e.g. gs://xxxxxxx/test*.csv) in your destination URI, the export should be much faster, since it can be done in parallel.
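If you want the wildcard from SQL itself, BigQuery's EXPORT DATA statement accepts a URI with a single * and shards the output across files (note this is a query export rather than a table extract job); the bucket, path, and table names below are placeholders:
EXPORT DATA OPTIONS (
  uri = 'gs://your-bucket/export/part-*.csv',  -- the wildcard lets BigQuery write many files in parallel
  format = 'CSV',
  overwrite = true
) AS
SELECT * FROM `your_project.your_dataset.your_table`;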
MySQL Multi JOIN extremely slow
Tag : mysql , By : Sharad
Date : March 29 2020, 07:55 AM
Not all of these can be 'fixed', but they jump out at me as performance red flags:
- Don't mix DISTINCT and GROUP BY; they do roughly the same thing.
- Do use InnoDB; the link you quote was resoundingly refuted -- the author admitted it.
- Don't use LEFT JOIN if JOIN gives you what you want. LEFT implies that the 'right' table may have missing rows.
- LEFT JOIN ( SELECT ... ) usually cannot be optimized, but JOIN might be. ( SELECT ... ) JOIN ( SELECT ... ) is especially inefficient.
- "Explode-implode": JOINing inflates the number of rows, then GROUP BY deflates it again. This is a common cause of performance issues. (Maybe I can be more specific as I go along.)
- COUNT(x) checks x for not being NULL. Usually what you really want is COUNT(*).
Suggested indexes (see the DDL sketch below):
p: INDEX(deleted, name, id)
l: INDEX(practice_fk)
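As DDL, those suggestions would look roughly like this; p and l are the aliases used in the original query, so substitute the real table names:
-- p and l are placeholders for the tables aliased p and l in the original query.
ALTER TABLE p ADD INDEX (deleted, name, id);
ALTER TABLE l ADD INDEX (practice_fk);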
Why is writing to Bigquery using Dataflow EXTREMELY slow?
Date : March 29 2020, 07:55 AM
Turns out BigQuery under Dataflow is NOT slow. The problem was that 'status.getPlace().getCountryCode()' was returning NULL, so it was throwing a NullPointerException that I couldn't see anywhere in the log! Clearly, Dataflow logging needs to improve. It's running really well now: as soon as a message comes in on the topic, it gets written to BigQuery almost instantaneously.
BigQuery - Group By with multiple fields extremely slow
Tag : sql , By : JulianCT
Date : March 29 2020, 07:55 AM
Try this one (some fix may be required depending on your column datatypes):
SELECT
cs.CriterionId,
cs.AdGroupId,
cs.CampaignId,
cs.Date,
SUM(cs.Impressions) AS Sum_Impressions,
SUM(cs.Clicks) AS Sum_Clicks,
SUM(cs.Interactions) AS Sum_Interactions,
(SUM(cs.Cost) / 1000000) AS Sum_Cost,
SUM(cs.Conversions) AS Sum_Conversions,
cs.AdNetworkType1,
cs.AdNetworkType2,
cs.AveragePosition,
cs.Device,
cs.InteractionTypes
FROM
`adwords.Keyword_{customer_id}` c
INNER JOIN
`adwords.KeywordBasicStats_{customer_id}` cs
ON
c.ExternalCustomerId = cs.ExternalCustomerId
WHERE
c._DATA_DATE = c._LATEST_DATE
AND c.ExternalCustomerId = {customer_id}
GROUP BY
1, 2, 3, 4, 10, 11, 12, 13, 14
UNION ALL
SELECT
cs.CriterionId,
cs.AdGroupId,
cs.CampaignId,
cs.Date,
0.0 AS Sum_Impressions,
0.0 AS Sum_Clicks,
0.0 AS Sum_Interactions,
0.0 AS Sum_Cost,
0.0 AS Sum_Conversions,
cs.AdNetworkType1,
cs.AdNetworkType2,
cs.AveragePosition,
cs.Device,
cs.InteractionTypes
FROM
`adwords.Keyword_{customer_id}` c
LEFT JOIN
`adwords.KeywordBasicStats_{customer_id}` cs
ON
c.ExternalCustomerId = cs.ExternalCustomerId
WHERE
  cs.ExternalCustomerId IS NULL
  AND c._DATA_DATE = c._LATEST_DATE
AND c.ExternalCustomerId = {customer_id}
GROUP BY
1, 2, 3, 4, 10, 11, 12, 13, 14
ORDER BY
1, 2, 3, 4, 10, 11, 12, 13, 14