What is randomness?
According to www.thefreedictionary.com; to be random something must be "lacking any definite plan or prearranged order". Or in this case, the results must be in no particular order, and just as importantly, the result must NOT be reproducable.
Generating a random sample
Generating a random sample of records in SQL Server is easy. SQL Server provides the NewID() function which produces a GUID, or Globally Unique Identifier. The idea of a GUID is that it is both random and unique; In theory, no matter how many times you generate a GUID you will never get the same result. This of course makes NewID() useful for generating keys that are unique, even between different sites or systems, but also for placing data in a random order. The below code will generate a list of 5 random databases from the current server:
Select Top 5 name
from master.sys.databases
order by NewID()
Generating a Random Number
I would like to examine 2 methods, and two mechanisms for applying them, and will aim to generate numbers between 1 and 1000:
Firstly there is the Rand() function. This function is an inbuilt function in SQL Server, and produces a random decimal value between 0 and 1. The following script will be used to generate a random number in the required range:
Select (Rand() * 1000) + 1
15 / 10 = 1 with a remainder of 5
so
15 % 10 = 5
ABS(CAST(NEWID() AS varbinary) % 1000) + 1
Mechanism 1
The first mechanism that I would like to use is a simple loop:
Create table #RandomTest
(
Method1 int,
Method2 int
)
Declare @c as int
Select @c = 1
while @c <= 5
begin
insert #RandomTest
Select (Rand() * 1000) + 1,
ABS(CAST(NEWID() AS varbinary) % 1000) + 1
Select @c = @c + 1
End
Select *
from #RandomTest
Method1 437 929 881 390 54 | Method2 231 396 483 122 94 |
As you can see; both appear to return random numbers within the required range. However, due to the performance overhead it is not often desirable to use a loop.
Mechanism 2
Mechanism 2 is to use a simple select agains a table
Select top 5
CAST((Rand() * 1000) + 1 as int) Method1,
ABS(CAST(NEWID() AS varbinary) % 1000) + 1 Method2
from master.sys.columns
Method1 40 40 40 40 40 | Method2 15 805 72 58 777 |
Whilst Rand() produced a usable result set in a loop, it has the limitation that it generates only one value per result set rather than a value per row.
Conclusions
From the above demonstration it is obvious that, in the majority of implementations, method 2 would be the preferred option.