MySQL – Optimizing an embedded SELECT query


Ok, here's a query that I am running right now on a table that has 45,000 records and is 65MB in size… and is just about to get bigger and bigger (so I gotta think of the future performance as well here):

SELECT count(payment_id) as signup_count, sum(amount) as signup_amount
FROM payments p
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND tm_completed IS NOT NULL
AND member_id NOT IN (SELECT p2.member_id FROM payments p2 WHERE p2.completed=1 AND p2.tm_completed < '2009-05-01' AND p2.tm_completed IS NOT NULL GROUP BY p2.member_id)

And as you might or might not imagine – it chokes the mysql server to a standstill…

What it does is simply count the new users who signed up in that period: the payment is "completed", tm_completed is not empty (it is only populated for completed payments), and (the embedded SELECT) the member has never had a "completed" payment before, meaning he's a new member. The system does rebills and whatnot, so this is the only way to differentiate between an existing member who just got rebilled and a new member who got billed for the first time.

Now, is there any possible way to optimize this query so it uses fewer resources and stops bringing my MySQL server to its knees…?

Am I missing any info to clarify this any further? Let me know…


Here are the indexes already on that table:

Keyname       Type     Cardinality  Field
PRIMARY       PRIMARY  46757        payment_id
member_id     INDEX    23378        member_id
payer_id      INDEX    11689        payer_id
coupon_id     INDEX    1            coupon_id
tm_added      INDEX    46757        tm_added, product_id
tm_completed  INDEX    46757        tm_completed, product_id

Best Solution

Those kinds of IN subqueries are a bit slow in MySQL. I would rephrase it like this:

SELECT COUNT(1) AS signup_count, SUM(amount) AS signup_amount
FROM   payments p
WHERE  tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND    completed > 0
AND    NOT EXISTS (
           SELECT member_id
           FROM   payments
           WHERE  member_id = p.member_id
           AND    completed = 1
           AND    tm_completed < '2009-05-01');

The check 'tm_completed IS NOT NULL' is not necessary as that is implied by your BETWEEN condition.
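
A quick way to see why: a comparison against NULL evaluates to NULL rather than TRUE, and WHERE only keeps rows for which the condition is TRUE. The two statements below are just a sanity check you can run without any table; the column aliases are made up for illustration.

```sql
-- NULL tested against a range yields NULL, not 1 (TRUE),
-- so a row whose tm_completed is NULL can never pass the BETWEEN filter.
SELECT NULL BETWEEN '2009-05-01' AND '2009-05-30' AS null_in_range;

-- The same holds for the '<' comparison used in the subquery.
SELECT NULL < '2009-05-01' AS null_before_may;
```

Both statements return NULL, which is why the explicit IS NOT NULL checks can be dropped from the outer query and from the subquery.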

Also make sure you have an index on:

(tm_completed, completed)
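
As a rough sketch, the index could be added as shown below. The index names are hypothetical. The second index is not part of the suggestion above; it is an extra assumption that a composite index starting with member_id might help the correlated subquery, so treat it as something to test rather than a guaranteed win. Running EXPLAIN before and after shows which indexes the optimizer actually picks.

```sql
-- Index suggested above, for the outer date-range filter.
-- The name idx_tm_completed_completed is made up for this example.
ALTER TABLE payments
  ADD INDEX idx_tm_completed_completed (tm_completed, completed);

-- Optional (an assumption, not from the answer): covers the columns the
-- NOT EXISTS subquery filters on, so the lookup may be resolved from the index.
ALTER TABLE payments
  ADD INDEX idx_member_completed_tm (member_id, completed, tm_completed);

-- Inspect the execution plan of the rewritten query.
EXPLAIN
SELECT COUNT(1) AS signup_count, SUM(amount) AS signup_amount
FROM   payments p
WHERE  tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND    completed > 0
AND    NOT EXISTS (
           SELECT member_id
           FROM   payments
           WHERE  member_id = p.member_id
           AND    completed = 1
           AND    tm_completed < '2009-05-01');
```

In the EXPLAIN output, the key column shows which index was chosen for each table reference, and the rows column gives an estimate of how many rows MySQL expects to examine.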