Have you ever been in a situation that required you to generate a set of multiple “artificial rows”? Perhaps you needed it for a report. Typically, you would need to do this in a PL/SQL loop of some kind. How else does one generate a >1 set of rows dynamically, from nothing? With a single query of course!
Our Scenario:
For a report of some kind, or perhaps for use in a much larger query, you need to create a rowset that consists of the dates of the 365 days previous to today, each along with a random number between 1 and 20 (pointless I agree, but useful as an example).
There are a couple ways to tackle this.
Method 1 - Using DUAL and a CONNECT BY
This technique actually came to us late in the drafting of this topic by way of one of our readers, Rose. But it is so simple, elegant and ingenious that it immediately became our chosen solution for most situations. Our hat tip to Rose.
Many people I’m sure are already familiar with Oracle’s CONNECT BY clause, which allows one to construct queries that select rows from a table in a hierarchical ordering that may be present in the data, by specifying the conditions that identify a parent-child relationship between rows. But for those that aren’t, a brief example is in order, though greater detail will be omitted.
Consider the following Employee table:
| emp_id |
emp_name |
mgr_id |
| 1 |
M. Vranicar |
null |
| 2 |
J. Simmons |
1 |
| 3 |
J. Weicher |
2 |
| 4 |
R. Rolek |
2 |
The CONNECT BY clause (and its associated optional keywords) allow us to cleverly query the table, ordering the rows hierarchically, using a relationship we know is defined through columns in the table itself, in this case, an employee to his manager:
> SELECT LEVEL – oracle keyword available in CONNECT BY queries
LPAD(’ ‘, LEVEL*2) || emp_name name
FROM employee
START WITH mgr_id is null
CONNECT BY PRIOR emp_id = mgr_id

Without going into greater detail, Oracle retrieves the records using the CONNECT BY PRIOR condition to in essence “filter” the records to determine the next appropriate “child” of the record last retrieved, when determining the retrieval order.
Returning to our example scenario, we can use this functionality to generate the 365 row result set that we are looking for by omitting the PRIOR keyword, and using only a condition based on the LEVEL. This is doable as the CONNECT BY PRIOR clause really just acts as a filter, determining which record is the next child (or in this case, simply when to stop selecting more “child” rows!):
> SELECT sysdate - LEVEL thedate,
mod(abs(dbms_random.random), 20) + 1 thevalue
FROM dual
CONNECT BY LEVEL <= 365;
THEDATE THEVALUE
--------- ----------
23-APR-07 20
22-APR-07 19
21-APR-07 10
20-APR-07 5
19-APR-07 13
18-APR-07 9
17-APR-07 12
16-APR-07 14
15-APR-07 19
14-APR-07 14
...
28-APR-06 9
27-APR-06 7
26-APR-06 4
25-APR-06 10
24-APR-06 8
365 rows selected.
What is happening?
This technique works because the CONNECT BY clause causes Oracle to execute the query by taking the following steps:
1) Select the first date and random number using the single record from dual.
2) Next, “connect back” to select the next appropriate “child” record, again from dual, using the CONNECT BY condition as a “filter”. Dual only has one record, and the only “filter” is that LEVEL be <= 365. The result is the single record being selected from dual again, though this time LEVEL is automatically incremented by 1.
3) Step 2 will be continually repeated, as the CONNECT BY condition/filter will continue to find the single eligible record in dual with every "connect back", until one is retrieved where LEVEL has been incremented beyond 365!
Kudos again to Rose for this one.
Method 2 - Using DUAL and a WITH Clause
Another technique is available, which also uses DUAL, but relies on a WITH clause instead of CONNECT BY. It should be considered more of a “brute force” method. The goal is to build an “empty” rowset of size 2n that is equal or greater than our target rowset size.
We begin by selecting anything from DUAL twice, and using a UNION ALL to preserve the duplication:
> SELECT 1 FROM DUAL UNION ALL SELECT 1 FROM DUAL;
1
----------
1
1
Next, let’s use this query in a WITH clause for efficiency, and select from it multiple times without a join condition- a Cartesian Product!
> WITH tworows AS
( SELECT 1 FROM dual UNION ALL SELECT 1 FROM dual)
SELECT *
FROM tworows t1, /* produces 2 rows */
tworows t2, /* produces 4 rows */
tworows t3, /* produces 8 rows */
tworows t4, /* produces 16 rows */
tworows t5, /* produces 32 rows */
tworows t6, /* produces 64 rows */
tworows t7, /* produces 128 rows */
tworows t8, /* produces 256 rows */
tworows t9 /* produces 512 rows */
What we see produced is a result set of 512 rows and 9 columns:
1 1 1 1 1 1
---------- ---------- ---------- ---------- ---------- ---------- ...
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
...
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
512 rows selected.
Next, we can extend our query by actually selecting the data we are interested in by utilizing ROWNUM: the past 365 days and a random number:
> WITH tworows AS
( SELECT 1 FROM dual union all SELECT 1 FROM dual)
SELECT sysdate - ROWNUM thedate,
mod(abs(dbms_random.random), 20) + 1 thevalue
FROM tworows t1, /* produces 2 rows */
tworows t2, /* produces 4 rows */
tworows t3, /* produces 8 rows */
tworows t4, /* produces 16 rows */
tworows t5, /* produces 32 rows */
tworows t6, /* produces 64 rows */
tworows t7, /* produces 128 rows */
tworows t8, /* produces 256 rows */
tworows t9 /* produces 512 rows */
And again, we see that this produces a rowset of 512 rows, this time with only values we are interested in:
THEDATE THEVALUE
--------- ----------
12-APR-07 2
11-APR-07 18
10-APR-07 12
09-APR-07 18
08-APR-07 19
07-APR-07 2
...
22-NOV-05 9
21-NOV-05 17
20-NOV-05 5
19-NOV-05 1
18-NOV-05 19
17-NOV-05 8
512 rows selected.
Finally, we can limit the number of rows down to our target size with a filter on “rownum”:
> WITH tworows AS
( SELECT 1 FROM dual union all SELECT 1 FROM dual)
SELECT sysdate - rownum thedate,
mod(abs(dbms_random.random), 20) + 1 thevalue
FROM tworows t1, /* produces 2 rows */
tworows t2, /* produces 4 rows */
tworows t3, /* produces 8 rows */
tworows t4, /* produces 16 rows */
tworows t5, /* produces 32 rows */
tworows t6, /* produces 64 rows */
tworows t7, /* produces 128 rows */
tworows t8, /* produces 256 rows */
tworows t9 /* produces 512 rows */
WHERE rownum <= 365
And we now have the result set we were looking for:
THEDATE THEVALUE
--------- ----------
12-APR-07 16
11-APR-07 20
10-APR-07 12
09-APR-07 12
08-APR-07 11
07-APR-07 17
06-APR-07 5
...
17-APR-06 13
16-APR-06 7
15-APR-06 7
14-APR-06 2
13-APR-06 18
365 rows selected.
So Which is Better?
You guessed it: it depends. In our testing, we’ve found that while Method 2 is a bit less efficient in terms of database activities (gets, sorts, etc.), the timing differences are negligible at worst when dealing with smaller result set sizes. However, this does not remain the case should you be generating significantly larger numbers of records. With large rowsets, Method 2 wins out in the end.
For purposes of illustration, consider a rowset of ~1.7 million records (224).
If we run our query using Method 1, we see that the execution time is a little over 10 seconds (~10.3):
> SELECT count(*) from (
SELECT sysdate - rownum thedate,
mod(abs(dbms_random.random), 20) + 1 thevalue
FROM dual
CONNECT BY LEVEL <= 16777216
)
COUNT(*)
----------
16777216
Elapsed: 00:00:10.29
Running this query using Method 2 however, takes about 7 seconds (~6.9), a savings of about 3 seconds:
> WITH tworows AS
( SELECT 1 FROM dual union all SELECT 1 FROM dual)
select count(*) from (
SELECT sysdate - rownum thedate,
mod(abs(dbms_random.random), 20) + 1 thevalue
FROM tworows t1, /* produces 2 rows */
tworows t2, /* produces 4 rows */
tworows t3, /* produces 8 rows */
tworows t4, /* produces 16 rows */
tworows t5, /* produces 32 rows */
tworows t6, /* produces 64 rows */
tworows t7, /* produces 128 rows */
tworows t8, /* produces 256 rows */
tworows t9, /* produces 512 rows */
tworows t10, /* produces 1024 rows */
tworows t11, /* produces 2048 rows */
tworows t12, /* produces 4096 rows */
tworows t13, /* produces 8192 rows */
tworows t14, /* ... */
tworows t15,
tworows t16,
tworows t17,
tworows t18,
tworows t19,
tworows t20,
tworows t21,
tworows t22,
tworows t23,
tworows t24 /* produces 16777216 rows */
)
COUNT(*)
----------
16777216
Elapsed: 00:00:06.90
The real savings for large results found with Method 2 however, comes from the ability to “tune” it. Specifically, there is nothing stopping us from increasing the the number of records initially produced in the WITH clause subquery from 2, to say 4. Due to Oracle’s efficiency in MERGE JOIN CARTESIANS, the dominant database operation resulting from Method 2, the savings can be significant:
> WITH fourrows AS
( SELECT 1 FROM dual
union all
SELECT 1 FROM dual
union all
SELECT 1 FROM dual
union all
SELECT 1 FROM dual
)
select count(*) from (
SELECT sysdate - rownum thedate,
mod(abs(dbms_random.random), 20) + 1 thevalue
FROM fourrows t1, /* produces 4 rows */
fourrows t2, /* produces 16 rows */
fourrows t3, /* produces 64 rows */
fourrows t4, /* produces 256 rows */
fourrows t5, /* produces 1024 rows */
fourrows t6, /* produces 4096 rows */
fourrows t7, /* produces 16384 rows */
fourrows t8, /* produces 65536 rows */
fourrows t9, /* produces 262144 rows */
fourrows t10, /* produces 1048576 rows */
fourrows t11, /* produces 4194304 rows */
fourrows t12 /* produces 16777216 rows */
)
COUNT(*)
----------
16777216
Elapsed: 00:00:03.54
Conclusion
For all but the largest record sets, Method 1 is ideally suited for dynamically generating artificial rowsets using a query.