Sql – What am I doing wrong when using RAND() in MS SQL Server 2005

randomsqlsql server

I'm trying to select a random 10% sampling from a small table. I thought I'd just use the RAND() function and select those rows where the random number is less than 0.10:

SELECT * FROM SomeTable
WHERE SomeColumn='SomeCondition' AND
      RAND() < 0.10

But I soon discovered that RAND() always returns the same number! Reminds me of this xkcd cartoon.

OK, no problem, the RAND function takes a seed value. I will be running this query periodically, and I want it to give different results if I run it on a different day, so I seed it with a combination of the date and a unique row ID:

SELECT * FROM SomeTable
WHERE SomeColumn='SomeCondition' AND
      RAND(CAST(GETDATE) AS INTEGER) + RowID) < 0.10

I still don't get any results! When I show the random numbers returned by RAND, I discover that they're all within a narrow range. It appears that getting a random number from RAND requires you to use a random seed. If I had a random seed in the first place, I wouldn't need a random number!

I've seen the previous discussions related to this problem:

SQL Server Random Sort
How to request a random row in SQL?

They don't help me. TABLESAMPLE works at the page level, which is great for a big table but not for a small one, and it looks like it applies prior to the WHERE clause. TOP with NEWID doesn't work because I don't know ahead of time how many rows I want.

Anybody have a solution, or at least a hint?

Edit: Thanks to AlexCuse for a solution which works for my particular case. Now to the larger question, how to make RAND behave?

Best Answer

This type of approach (shown by ΤΖΩΤΖΙΟΥ) will not guarantee a 10% sampling. It will only give you all rows where Rand() is evaluated to < .10 which will not be consistent.

Something like

select top 10 percent * from MyTable order by NEWID()

will do the trick.

edit: there is not really a good way to make RAND behave. This is what I've used in the past (kludge alert - it kills you not being able to use Rand() in a UDF)

CREATE VIEW RandView AS 

SELECT RAND() AS Val

GO

CREATE FUNCTION RandomFloat()
RETURNS FLOAT
AS
BEGIN

RETURN (SELECT Val FROM RandView)

END

Then you just have select blah, dbo.RandomFloat() from table in your query.