PIOCON Technologies Seeks an Oracle DBA in Chicago

Published in Jobs by gpike Tuesday April 24, 2007

The publishers of SingleQuery.com are looking for a highly-qualified Oracle DBA for a position to start immediately. PIOCON Technologies, a Business Intelligence consulting firm located in the Chicago suburbs, seeks an independent or permanent-position DBA this week. At PIOCON, we are constantly seeking the consummate professional, people with years of experience and the qualifications to prove it. Take a close look at our postings here at SingleQuery.com to get a taste of the caliber of employee we hire at PIOCON.

If you are interested in pursuing a career or seeking a short-term DBA engagement, we urge you to contact PIOCON Technologies by clicking here or sending email to jobs@piocon.com.

Generating Multiple Artificial Rows in Oracle 10g Using a Query

Published in Database Tips, Quick Tips, SQL by jweicher Tuesday April 24, 2007

Have you ever been in a situation that required you to generate a set of multiple “artificial rows”? Perhaps you needed it for a report. Typically, you would need to do this in a PL/SQL loop of some kind. How else does one generate a >1 set of rows dynamically, from nothing? With a single query of course!

Our Scenario:

For a report of some kind, or perhaps for use in a much larger query, you need to create a rowset that consists of the dates of the 365 days previous to today, each along with a random number between 1 and 20 (pointless I agree, but useful as an example).

There are a couple ways to tackle this.

 

Method 1 - Using DUAL and a CONNECT BY
This technique actually came to us late in the drafting of this topic by way of one of our readers, Rose. But it is so simple, elegant and ingenious that it immediately became our chosen solution for most situations. Our hat tip to Rose.

Many people I’m sure are already familiar with Oracle’s CONNECT BY clause, which allows one to construct queries that select rows from a table in a hierarchical ordering that may be present in the data, by specifying the conditions that identify a parent-child relationship between rows. But for those that aren’t, a brief example is in order, though greater detail will be omitted.

Consider the following Employee table:

emp_id emp_name mgr_id
1 M. Vranicar null
2 J. Simmons 1
3 J. Weicher 2
4 R. Rolek 2

The CONNECT BY clause (and its associated optional keywords) allow us to cleverly query the table, ordering the rows hierarchically, using a relationship we know is defined through columns in the table itself, in this case, an employee to his manager:


> SELECT  LEVEL  – oracle keyword available in CONNECT BY queries
          LPAD(’ ‘, LEVEL*2) || emp_name name
  FROM employee
  START WITH mgr_id is null
  CONNECT BY PRIOR emp_id = mgr_id

Connect By Example

Without going into greater detail, Oracle retrieves the records using the CONNECT BY PRIOR condition to in essence “filter” the records to determine the next appropriate “child” of the record last retrieved, when determining the retrieval order.

Returning to our example scenario, we can use this functionality to generate the 365 row result set that we are looking for by omitting the PRIOR keyword, and using only a condition based on the LEVEL. This is doable as the CONNECT BY PRIOR clause really just acts as a filter, determining which record is the next child (or in this case, simply when to stop selecting more “child” rows!):


> SELECT  sysdate - LEVEL thedate,
          mod(abs(dbms_random.random), 20) + 1 thevalue
  FROM dual
  CONNECT BY LEVEL <= 365;

THEDATE     THEVALUE
--------- ----------
23-APR-07         20
22-APR-07         19
21-APR-07         10
20-APR-07          5
19-APR-07         13
18-APR-07          9
17-APR-07         12
16-APR-07         14
15-APR-07         19
14-APR-07         14
...
28-APR-06          9
27-APR-06          7
26-APR-06          4
25-APR-06         10
24-APR-06          8

365 rows selected.


What is happening?
This technique works because the CONNECT BY clause causes Oracle to execute the query by taking the following steps:

1) Select the first date and random number using the single record from dual.

2) Next, “connect back” to select the next appropriate “child” record, again from dual, using the CONNECT BY condition as a “filter”. Dual only has one record, and the only “filter” is that LEVEL be <= 365. The result is the single record being selected from dual again, though this time LEVEL is automatically incremented by 1.

3) Step 2 will be continually repeated, as the CONNECT BY condition/filter will continue to find the single eligible record in dual with every "connect back", until one is retrieved where LEVEL has been incremented beyond 365!

Kudos again to Rose for this one.

 

Method 2 - Using DUAL and a WITH Clause
Another technique is available, which also uses DUAL, but relies on a WITH clause instead of CONNECT BY. It should be considered more of a “brute force” method. The goal is to build an “empty” rowset of size 2n that is equal or greater than our target rowset size.

We begin by selecting anything from DUAL twice, and using a UNION ALL to preserve the duplication:


> SELECT 1 FROM DUAL UNION ALL SELECT 1 FROM DUAL;

         1
----------
         1
         1

Next, let’s use this query in a WITH clause for efficiency, and select from it multiple times without a join condition- a Cartesian Product!

> WITH tworows AS
( SELECT 1 FROM dual UNION ALL SELECT 1 FROM dual)
SELECT	*
FROM	tworows	t1, /* produces 2 rows */
	tworows t2, /* produces 4 rows */
	tworows t3, /* produces 8 rows */
	tworows t4, /* produces 16 rows */
	tworows t5, /* produces 32 rows */
	tworows t6, /* produces 64 rows */
	tworows t7, /* produces 128 rows */
	tworows t8, /* produces 256 rows */
	tworows t9  /* produces 512 rows */

What we see produced is a result set of 512 rows and 9 columns:


         1          1          1          1          1          1
---------- ---------- ---------- ---------- ---------- ---------- ...
         1          1          1          1          1          1
         1          1          1          1          1          1
         1          1          1          1          1          1
         1          1          1          1          1          1
         1          1          1          1          1          1
         1          1          1          1          1          1
         1          1          1          1          1          1
...
         1          1          1          1          1          1
         1          1          1          1          1          1
         1          1          1          1          1          1

512 rows selected.

Next, we can extend our query by actually selecting the data we are interested in by utilizing ROWNUM: the past 365 days and a random number:


> WITH tworows AS
( SELECT 1 FROM dual union all SELECT 1 FROM dual)
SELECT	sysdate - ROWNUM thedate,
	mod(abs(dbms_random.random), 20) + 1 thevalue
FROM	tworows	t1, /* produces 2 rows */
	tworows t2, /* produces 4 rows */
	tworows t3, /* produces 8 rows */
	tworows t4, /* produces 16 rows */
	tworows t5, /* produces 32 rows */
	tworows t6, /* produces 64 rows */
	tworows t7, /* produces 128 rows */
	tworows t8, /* produces 256 rows */
	tworows t9  /* produces 512 rows */

And again, we see that this produces a rowset of 512 rows, this time with only values we are interested in:


THEDATE     THEVALUE
--------- ----------
12-APR-07          2
11-APR-07         18
10-APR-07         12
09-APR-07         18
08-APR-07         19
07-APR-07          2
...
22-NOV-05          9
21-NOV-05         17
20-NOV-05          5
19-NOV-05          1
18-NOV-05         19
17-NOV-05          8

512 rows selected.

Finally, we can limit the number of rows down to our target size with a filter on “rownum”:


> WITH tworows AS
( SELECT 1 FROM dual union all SELECT 1 FROM dual)
SELECT	sysdate - rownum thedate,
	mod(abs(dbms_random.random), 20) + 1 thevalue
FROM	tworows	t1, /* produces 2 rows */
	tworows t2, /* produces 4 rows */
	tworows t3, /* produces 8 rows */
	tworows t4, /* produces 16 rows */
	tworows t5, /* produces 32 rows */
	tworows t6, /* produces 64 rows */
	tworows t7, /* produces 128 rows */
	tworows t8, /* produces 256 rows */
	tworows t9  /* produces 512 rows */
WHERE	rownum <= 365

And we now have the result set we were looking for:


THEDATE     THEVALUE
--------- ----------
12-APR-07         16
11-APR-07         20
10-APR-07         12
09-APR-07         12
08-APR-07         11
07-APR-07         17
06-APR-07          5
...
17-APR-06         13
16-APR-06          7
15-APR-06          7
14-APR-06          2
13-APR-06         18

365 rows selected.

 

So Which is Better?
You guessed it: it depends. In our testing, we’ve found that while Method 2 is a bit less efficient in terms of database activities (gets, sorts, etc.), the timing differences are negligible at worst when dealing with smaller result set sizes. However, this does not remain the case should you be generating significantly larger numbers of records. With large rowsets, Method 2 wins out in the end.

For purposes of illustration, consider a rowset of ~1.7 million records (224).

If we run our query using Method 1, we see that the execution time is a little over 10 seconds (~10.3):


> SELECT count(*) from (
    SELECT  sysdate - rownum thedate,
	    mod(abs(dbms_random.random), 20) + 1 thevalue
    FROM dual
    CONNECT BY LEVEL <= 16777216
  )

  COUNT(*)
----------
  16777216

Elapsed: 00:00:10.29

Running this query using Method 2 however, takes about 7 seconds (~6.9), a savings of about 3 seconds:


> WITH tworows AS
( SELECT 1 FROM dual union all SELECT 1 FROM dual)
select count(*) from (
SELECT	sysdate - rownum thedate,
	mod(abs(dbms_random.random), 20) + 1 thevalue
FROM	tworows	t1,  /* produces 2 rows */
	tworows t2,  /* produces 4 rows */
	tworows t3,  /* produces 8 rows */
	tworows t4,  /* produces 16 rows */
	tworows t5,  /* produces 32 rows */
	tworows t6,  /* produces 64 rows */
	tworows t7,  /* produces 128 rows */
	tworows t8,  /* produces 256 rows */
	tworows t9,  /* produces 512 rows */
	tworows t10,  /* produces 1024 rows */
	tworows t11,  /* produces 2048 rows */
	tworows t12,  /* produces 4096 rows */
	tworows t13,  /* produces 8192 rows */
	tworows t14,  /* ... */
	tworows t15,
	tworows t16,
	tworows t17,
	tworows t18,
	tworows t19,
	tworows t20,
	tworows t21,
	tworows t22,
	tworows t23,
	tworows t24   /* produces 16777216 rows */
)

  COUNT(*)
----------
  16777216

Elapsed: 00:00:06.90

The real savings for large results found with Method 2 however, comes from the ability to “tune” it. Specifically, there is nothing stopping us from increasing the the number of records initially produced in the WITH clause subquery from 2, to say 4. Due to Oracle’s efficiency in MERGE JOIN CARTESIANS, the dominant database operation resulting from Method 2, the savings can be significant:


> WITH fourrows AS
( SELECT 1 FROM dual
  union all
  SELECT 1 FROM dual
  union all
  SELECT 1 FROM dual
  union all
  SELECT 1 FROM dual
)
select count(*) from (
SELECT	sysdate - rownum thedate,
	mod(abs(dbms_random.random), 20) + 1 thevalue
FROM	fourrows t1,  /* produces 4 rows */
	fourrows t2,  /* produces 16 rows */
	fourrows t3,  /* produces 64 rows */
	fourrows t4,  /* produces 256 rows */
	fourrows t5,  /* produces 1024 rows */
	fourrows t6,  /* produces 4096 rows */
	fourrows t7,  /* produces 16384 rows */
	fourrows t8,  /* produces 65536 rows */
	fourrows t9,  /* produces 262144 rows */
	fourrows t10, /* produces 1048576 rows */
	fourrows t11, /* produces 4194304 rows */
	fourrows t12  /* produces 16777216 rows */
)

  COUNT(*)
----------
  16777216

Elapsed: 00:00:03.54

 

Conclusion
For all but the largest record sets, Method 1 is ideally suited for dynamically generating artificial rowsets using a query.

 
Direct E-mail John Weicher
PIOCON Technologies Logo

The Art of Writing a Complex Query - The Data

Published in Uncategorized by gpike Thursday April 19, 2007

The following creates the single table necessary for the exercise:

CREATE TABLE sales (time_of_day DATE);

Populate the table for a single day:

truncate table sales;
insert into sales values (trunc(to_date('24-mar-07'))+8.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+9.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+9.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+9.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+10.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+10.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+10.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+10.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+10.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+10.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+10.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+10.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+10.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+10.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+11.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+11.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+11.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+11.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+11.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+11.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+11.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+13.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+14.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+14.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+14.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+14.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+15.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+16.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+16.5/24,2);
insert into sales values (trunc(to_date('24-mar-07'))+17.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+17.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+17.5/24,1);
insert into sales values (trunc(to_date('24-mar-07'))+17.5/24,1);
commit;

The Art of Writing a Complex Oracle 10g SQL Query: Part 1

Published in Complex SQL, SQL by gpike Thursday April 19, 2007

“One is all you need!” Its the SingleQuery.com slogan and my personal philosophy: In the Oracle database, just about anything can (and should) be done with a Single Query. Sure, there are clear cases where a Single Query is not the best option; for example, when you need to perform detailed row-level auditing or error handling logic. And placing complex logic into queries may yield an almost unreadable jumble of DECODES, CASE statements, PARTITION BYs, sub-queries and in-line views. Extensive programmer comments are a must for the complex Single Query; even this author cannot remember what his last Single Query is doing without some reminders.

When speed is of the essence (when isn’t it!), a Single Query is the only way to go. Show me just about any PL/SQL cursor FOR LOOP (or a Java looping structure or whatever else you can dream up) and I’ll write you a Single Query that will dramatically outperform the loop every time. Why? Because the Oracle database is a highly-efficient, multi-threaded computational engine just itching for a tough query to solve.

Now that I’ve hopefully convinced you to attempt a Single Query, how do we go about crafting this monster. Oracle SQL gives us lots of tools to choose from; we need only decide what tools best meet our needs:

  • Result set selectivity is accomplished with good old fashioned inner joins and WHERE clauses. Parallel processing (see our post on the PARALLEL hint) provides the power to handle large data sets.

  • Dealing with incomplete or sparse data requires the use of outer joins and, optionally, Cartesian products.

  • Conditional logic is handled with DECODEs and CASE statements.

  • Complex, set-based calculations are performed with analytic functions combined with GROUP BY or PARTITION BY clauses.

  • Inter-row data sharing (row 10 knowing about a column in row 11) is handled by windowing functions.

  • Spreadsheet-style calculations are enabled with the MODEL clause.

  • Step-by-step logic, complex operations and query readability requires liberal use of in-line views.

This last point is critical to writing the Single Query. An in-line view is really just a query of its own with its result set exposed as a virtual table to a higher-level query. In the strictest sense, I suppose everything cannot be done in a single query, because complex queries will include the SELECT keyword many times. Argue semantics with me if you must; I say if it produces a single result set, its a Single Query.

In Part 2, I will construct a query that begins with a single-column table and produces a complex result set. This practical example is designed to illustrate a repeatable process of constructing a complex Single Query step by step.

Continue to part 2 –>


The Pickle
Greg Pike - gpike@piocon.com

The Art of Writing a Complex Oracle 10g SQL Query: Part 2

Published in Complex SQL, SQL by gpike Thursday April 19, 2007

<-- Back to Part 1

Learning By Example: The Store Transactions Query

Let’s move on to an example of building a complex query from scratch. In this example, a store captures the date and time for each sales transaction. Our goal is to determine which three-hour interval during the typical day has the highest average transactions (the most customers) so the store manager can determine when to increase the sales staff.

Our query will return the total transactions that occurred in each hour of the day, the three-hour moving average (and the rank compared to the other three-hour time periods) as well as the difference from the average transactions/hour for the entire day. For those that might want to try this query at home, the raw data and INSERT statements can be found by clicking here.

The simple source table contains only one column: time_of_day (DATE type). Every row in the table represents the date and time that a single sales transaction occurred (51 rows). From the SALES table, we will produce the following complex result set with a SingleQuery:

 Time of    Txns  3 Hour 3  Hour   Daily    Diff. from
   Day             Avg.  Ranking  Average   Daily Avg.
--------------------------------------------------------
 8:00 A.M.    1    2.0      8       5.1       -61%
 9:00 A.M.    3    4.7      6       5.1       - 8%
10:00 A.M.   10    6.7      4       5.1       +31%
11:00 A.M.    7    5.7      5       5.1       +12%
12:00 A.M.    0    8.7      1       5.1       +71% <-- No rows in the table!
 1:00 P.M.   19    8.3      2       5.1       +63%
 2:00 P.M.    4    8.0      3       5.1       +57%
 3:00 P.M.    1    2.3      7       5.1       -55%
 4:00 P.M.    2    2.3      7       5.1       -55%
 5:00 P.M.    4    2.0      8       5.1       -61%

I know what you might be thinking, “This would be a breeze with a CURSOR FOR LOOP and a couple of PL/SQL tables or VARRAYS. I could step through all of the rows in the table, add them to the appropriate array buckets, do a few division operations and out comes the result!” Well, that method may work, but its not the preferred solution for SingleQuery.com.

Creating the SingleQuery:

To construct any complex query, always start with the basics and iterate; add additional parts to the query and continually assure yourself that the query is functioning properly as it is built. Attempting to create a complex query with all logic included from the beginning will quickly prove a frustrating and perhaps futile experience. For this query, the first, obvious step is to determine the total number of transactions by hour since these are the first two columns. We will leave this portion of the query in blue throughout the rest of the example:

SELECT   TRUNC(time_of_day, 'HH24') hour_of_day,
         count(*) total_txns
FROM     sales
GROUP BY TRUNC(time_of_day, 'HH24');

Our desired result set includes 10 rows (see above), but when this query is executed it produces only 9 rows since no transactions occurred during the noon hour. But we need that zero-transaction row for our report! Even with zero transactions, the noon hour had the highest three-hour average for the day. For our store manager, the noon hour is right in the middle of the busy time. Next step: Get back the missing row.

Adding an In-Line Views, a Cartesian Product and an Outer Join

Since any hour might have zero transactions, we need to construct a “model day” (a set of rows that contains all of the possible store operating hours) and Outer Join this to the real data to manufacture the “missing” hours. This cannot be accomplished at the individual transaction level (pre-GROUP BY) since the 51 individual transaction rows have not been aggregated into hours. Enter the In-Line view and a Cartesian Product. To construct a model day, create a query that will always return the distinct store operating hours (assume the store is open from 8:00 AM to 6:00 PM everyday). We will leave this portion of the query in red throughout the rest of the example:

SELECT  days.day+(hours.hour/24) model_day_hours
FROM     (
          SELECT 7+rownum hour
          FROM   all_objects
          WHERE  rownum <11
         ) hours,
         (
          SELECT DISTINCT trunc(time_of_day) day
          FROM sales
         ) days;

The HOURS In-Line view gets the numbers 8 through 17 (the operating hours in HH24 format) and this is Cartesian joined (no join condition) to the DAYS In-Line view which returns a distinct set of the each day where even a single transaction occurred.

——
Note #1: Although our sample data includes only one day, the query should work for any number of days.
Note #2: I used the ALL_OBJECTS view here since we can be assured that this data dictionary view will always contain at least 10 rows. There are other ways to manufacture data sets of any size and you can learn about those here at SingleQuery.com.
——

To manufacture the missing zero-transaction row into the result set, outer join the “model day” query to the real data set using another set of In-Line Views (aliased as MDL and TXN). We will leave the new portions of the query in green throughout the rest of the example:

SELECT   mdl.model_day_hours,
         nvl(txn.total_txns,0) total_txns
FROM     
         (
          SELECT   TRUNC(time_of_day, ‘HH24′) hour_of_day,
                   count(*) total_txns
          FROM     sales
          GROUP BY TRUNC(time_of_day, ‘HH24′)
         ) txn,
         (
          SELECT  days.day+(hours.hour/24) model_day_hours
          FROM     (
                    SELECT 7+rownum hour
                    FROM   all_objects
                    WHERE  rownum <11
                   ) hours,
                   (
                    SELECT DISTINCT trunc(time_of_day) day
                    FROM sales
                   ) days
         ) mdl
WHERE    txn.hour_of_day(+) = mdl.model_day_hours;

This is starting to look a little complex! We have now introduced 4 in-line views with a Cartesian Product and an outer join. Of course, if you had an HOURS_OF_OPERATION table, you could have substituted it in place of the entire MDL In-Line view. But that’s not fun since we don’t learn anything about constructing the complex SingleQuery.

Adding the Daily and 3-hour Averages:

Although the result set from the above query contains the correct hourly rows, we have not much furthered our goal to calculate average transactions for the day as well as the three-hour averages. These calculations are preformed by using Oracle’s advanced analytic functions:

Daily Avg:     AVG(nvl(txn.total_txns,0))
               OVER (PARTITION BY day) daily_avg

3-hour Avg:    AVG(nvl(txn.total_txns,0))
               OVER (PARTITION BY day
                     ORDER BY mdl.model_day_hours
                     ROWS BETWEEN 1 PRECEDING
                     AND 1 FOLLOWING) hr_avg

The Daily Average is a relatively straightforward calculation that simply produces the average transactions per hour for the day. Each and every row of the result set will include this daily value. Notice the nvl(txn.total_txns,0) which is required to turn outer-joined rows from NULLs into usable zeros. The 3-hour average is calculated by “virtually ordering” the table by the hour of day and then applying a moving 3 hour window. The ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING clause automatically handles the ends of the days when the average is calculated using only two rows. These types of calculations only scratch the surface of what is available with advanced analytics; a full discussion of the nuances of the PARTITION_BY clause is beyond the scope of this article (but coming soon). These two additional columns, plus the time of day are added to the above query in green:

SELECT   mdl.model_day_hours,
         to_char(mdl.model_day_hours,'HH24') model_time,   <-- new column
         nvl(txn.total_txns,0) total_txns,
         AVG(nvl(txn.total_txns,0))                      <-- new column
           OVER (PARTITION BY day
               ORDER BY mdl.model_day_hours
               ROWS BETWEEN 1 PRECEDING
               AND 1 FOLLOWING) hr_avg,
         AVG(nvl(txn.total_txns,0))               <-- new column
           OVER (PARTITION BY day) daily_avg
FROM     
         (
          SELECT   TRUNC(time_of_day, ‘HH24′) hour_of_day,
                   count(*) total_txns
          FROM     sales
          GROUP BY TRUNC(time_of_day, ‘HH24′)
         ) txn,
         (
          SELECT  days.day+(hours.hour/24) model_day_hours
          FROM     (
                    SELECT 7+rownum hour
                    FROM   all_objects
                    WHERE  rownum <11
                   ) hours,
                   (
                    SELECT DISTINCT trunc(time_of_day) day
                    FROM sales
                   ) days
         ) mdl
WHERE    txn.hour_of_day(+) = mdl.model_day_hours;

Adding in the Rank and Returning the Final Result Set:

The last missing columns are the RANK() of the three-hour averages and the percentage difference of the current three-hour average to the daily average. Due to limitations of the RANK() analytic function, it is required to first calculate the three-hour average and then determine the ranks; calculating a RANK() on a PARTITION_BY will yield an Oracle error. To eliminate this problem, we employ yet one more in-line view and finally order the final result set:

SELECT   model_day_hours,
         model_time,
         total_txns,
         hr_avg,
         RANK() OVER ORDER BY hr_avg DESC) daily_rank,
         daily_avg,
         (hr_avg - daily_avg)/daily_avg*100 percent_diff
FROM
  (
   SELECT mdl.model_day_hours,
          to_char(mdl.model_day_hours,'HH24') model_time,
          nvl(txn.total_txns,0) total_txns,
          AVG(nvl(txn.total_txns,0))
            OVER (PARTITION BY day
                  ORDER BY mdl.model_day_hours
                  ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
                 ) hr_avg,
          AVG(nvl(txn.total_txns,0))
            OVER (PARTITION BY day) daily_avg
   FROM
     (
      SELECT   TRUNC(time_of_day, 'HH24') hour_of_day,
               count(*) total_txns
      FROM     sales
      GROUP BY TRUNC(time_of_day, 'HH24')
     ) txn,
     (
      SELECT days.day,
             days.day+(hours.hour/24) model_day_hours
      FROM
        (
         SELECT 7+rownum hour
         FROM   all_objects
         WHERE  rownum <11
        ) hours,
        (
         SELECT DISTINCT trunc(time_of_day) day
         FROM sales
        ) days
     ) mdl
   WHERE    txn.hour_of_day(+) = mdl.model_day_hours
  )
ORDER BY model_day_hours;

This truly is a complex SingleQuery but taken in its respective parts, not that difficult to understand or create. As with anything, creating a complex SingleQuery takes patience and practice. Might I also add that this query has not been optimized in any way. In the shorter part three, we will take this query and modify it yet again to run efficiently.

Continue To Part 3–>

The Pickle

The Art of Writing a Complex Oracle 10g SQL Query: Part 3

Published in Complex SQL, SQL by gpike Thursday April 19, 2007

<-- Back to Part 2

Finishing the SingleQuery: Adding Efficiency Alternatives

The query from Part II is a powerful tool to extract our desired result set, but it may not be the best way to perform the query in real-life circumstances. A SingleQuery must be both accurate and efficient to be a viable solution. The query (repeated below), presents several opportunities for improvement and this is the final step in the exercise.

Adding a WITH clause to eliminate multiple passes over the source table:

In the following query, please note the highlighted areas that both access the SALES table. In the first section, the query is grouping all of the individual transactions into days and includes totals. The second section simply creates the list of distinct days from SALES table to facilitate creation of the “model day” explained in Part II. An explain plan of this query shows that the SALES table will need two full table scans to complete, however the information needs for the second section is simply a subset of the information pulled for the first section:

SELECT   model_day_hours,
         model_time,
         total_txns,
         hr_avg,
         RANK() OVER (ORDER BY hr_avg DESC) daily_rank,
         daily_avg,
         (hr_avg - daily_avg)/daily_avg*100 percent_diff
FROM
  (
   SELECT mdl.model_day_hours,
          to_char(mdl.model_day_hours,'HH24') model_time,
          nvl(txn.total_txns,0) total_txns,
          AVG(nvl(txn.total_txns,0))
            OVER (PARTITION BY day
                  ORDER BY mdl.model_day_hours
                  ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
                 ) hr_avg,
          AVG(nvl(txn.total_txns,0))
            OVER (PARTITION BY day) daily_avg
   FROM
     (
      SELECT   TRUNC(time_of_day, ‘HH24′) hour_of_day,
               count(*) total_txns
      FROM     sales
      GROUP BY TRUNC(time_of_day, ‘HH24′)
     ) txn,
     (
      SELECT days.day,
             days.day+(hours.hour/24) model_day_hours
      FROM
        (
         SELECT 7+rownum hour
         FROM   all_objects
         WHERE  rownum <11
        ) hours,
        (
         SELECT DISTINCT trunc(time_of_day) day
         FROM sales
        ) days
     ) mdl
   WHERE    txn.hour_of_day(+) = mdl.model_day_hours
  )
ORDER BY model_day_hours;

As an alternative, the WITH clause (often refered to as Subquery Factoring) can be employed to select from SALES one time and use this aliased query throughout the rest of the statement. In the following example, the WITH clause and references to the new alias are indicated in red:


WITH txn AS
    (SELECT    /*+ parallel (s 2) */ TRUNC(time_of_day, ‘HH24′) hour_of_day,
               count(*) total_txns
      FROM     sales s
      GROUP BY TRUNC(time_of_day, ‘HH24′)
      )
SELECT   model_day_hours,
         model_time,
         total_txns,
         hr_avg,
         rank() OVER (ORDER BY hr_avg DESC) daily_rank,
         daily_avg,
         (hr_avg - daily_avg)/daily_avg*100 percent_diff
FROM
  (
   SELECT mdl.model_day_hours,
          to_char(mdl.model_day_hours,’HH24′) model_time,
          nvl(txn.total_txns,0) total_txns,
          AVG(nvl(txn.total_txns,0))
            OVER (PARTITION BY day
                  ORDER BY mdl.model_day_hours
                  ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
                 ) hr_avg,
          AVG(nvl(txn.total_txns,0))
            OVER (PARTITION BY day) daily_avg
   FROM txn,
     (
      SELECT days.day,
             days.day+(hours.hour/24) model_day_hours
      FROM
        (
         SELECT 7+rownum hour
         FROM   all_objects
         WHERE  rownum <11
        ) hours,
        (
         SELECT DISTINCT trunc(hour_of_day) day
         FROM txn
        ) days
     ) mdl
   WHERE    txn.hour_of_day(+) = mdl.model_day_hours
  )
ORDER BY 1;

Unfortunately, as efficiencies are added to queries, it can make them more difficult to decipher and it becomes critical to maintain proper comments for future developers (and for the author!).

Other Query Changes and Alternative Tuning Techniques:

There are several possible changes that an be made to improve the response time of this SingleQuery. In tests with 10 million rows, response times of less than 5 seconds were obtained using a combination of the following techniques:

1) Replace the calculation of the model day with a table of operating hours.
2) Replace the selection of distinct days with a materialzied view.
3) Add a /*+ parallel */ hint to improve the full table scan of the SALES table.
4) Cache the SALES table into memory.

The Pickle

HTML Form Tricks and Flexible Parameter Passing in Oracle Portal

Published in Oracle Portal by gpike Sunday April 1, 2007

Overview: Flexible Parameter Passing and Eliminating Page Re-Rendering

If you have ever created custom HTML forms and deployed these into Oracle Portal as portlets with PL/SQL or JSPs, you know that submitting the forms requires the use of Page Parameters to facilitate passing each individual variable back to the portlet for processing. As an additional irritant, the entire Portal Page is refreshed upon form submission. Fortunately, there is a better way to submit complex HTML forms to portlets that bypasses the need for dozens of Page Parameters and allows only the portlet that displays the HTML form to be re-rendered.

To explain this concept, assume that a Portal Page contains a Dynamic Page portlet that executes an Oracle PL/SQL procedure or a URL portlet that runs a JSP. For PL/SQL, a preparatory step is required, so Java programmers can skip down to the <IFRAME> Alternate Example section below.

Step 1: Adding PL/SQL Flexible Parameter Passing

Assume that our form gathers personal information for a specified user (with the user identified by the parameter “pUserID”) and submits the data back to the portlet to save into the database. The PL/SQL procedure at the very least needs to accept the “pUserID” as an input parameter:

PROCEDURE getUserData(pUserID IN NUMBER) IS....

A Portal Page parameter is also needed to pass the “pUserID” from the URL into the portlet. But we also want this portlet (a PL/SQL Server Page) to process the submitted HTML form. That will require the addition of each element from the HTML form as a parameter and perhaps a “pAction” parameter to tell the procedure if its displaying the form for the first time or processing the submitted results. Our procedure now looks like:

PROCEDURE getUserData(
                      pUserID       IN NUMBER,
                      pAction       IN NUMBER,
                      pAddressLine1 IN VARCHAR2,
                      pAddressLine2 IN VARCHAR2,
                      pCity         IN VARCHAR2,
                      pState        IN VARCHAR2
                      ...and many more parameters!...
                     );

This set of parameters could go on for quite a while and each one requires a corresponding Portal Page parameter to forward the values to the portlet. No thanks! An alternate approach first starts with replacing individual parameters in the procedure with PL/SQL Flexible Parameter Passing. With Flexible Parameter Passing, name/value array pairs automatically store any number of submitted form elements. With Flexible Parameter Passing, our procedure looks much more simple and becomes much more powerful:

PROCEDURE getUserData(
      name_array  in PORTAL.wwsec_api.array DEFAULT PORTAL.wwsec_api.empty,
      value_array in PORTAL.wwsec_api.array DEFAULT PORTAL.wwsec_api.empty
);

A special note: You must call your flexible parameter arrays “name_array” and “value_array”. Please also see the Oracle documentation for more information.

If the submitted HTML form contains 100 elements, the name_array and value_array are populated with 100 rows each. The name_array contains the text from name field in each HTML form element while the value_array contains the value from the form element. The “pUserID” and “pAction” variables can be passed along as well by using hidden form elements. While this all looks great, how can a PL/SQL procedure with arrays as input parameters be used in a Portal Dynamic Page? It can’t, of course!

Step 2: Create a Wrapper Procedure

The goal of the wrapper procedure is to take only those parameters that are absolutely necessary to render the form the first time and forward these to the getUserData() procedure. In this case, to render a blank HTML form we only need to pass the “pUserID” and “pAction” parameters. The wrapper procedure acts as an intermediary between the Portal and the getUserData() procedure. Since the wrapper accepts regular NUMBER data types, the wrapper can be used when defining the Dynamic Page:

PROCEDURE render_portlet (
                          pAction In NUMBER,
                          pUserID IN NUMBER
                         ) IS

    name_array   PORTAL.wwsec_api.array default PORTAL.wwsec_api.empty;
    value_array  PORTAL.wwsec_api.array default PORTAL.wwsec_api.empty;

BEGIN
    -- Set up the name and value arrays
    name_array(1)  := 'pAction';
    value_array(1) := pAction;
    name_array(2)  := 'pUserID';
    value_array(2) := pUserID;

    -- Call the procedure to render the form
    getUserData(name_array, value_array);

END;

Step 3: Submitting the Form

The last step to make everything work properly and eliminate the need to re-render the entire Portal page when the form is submitted involves simply changing the submission URL:

/portal/pls/portal/!your_schema.render_form

Please especially note the existence of the strange looking “!” character in this URL. This informs the mod_plsql engine that Flexible Parameter Passing should be used. It causes the server to package up the contents of the URL into name/value arrays prior to calling the getUserData() procedure. Without the “!”, the procedure would be called with individually-named parameters causing an execution time error.

Since portlets are themselves rendered in <IFRAMES>s, when this HTML form is submitted, only that portlet will be re-rendered. Mission accomplished!

Java (and PL/SQL) Alternate Approach: Wrapping procedures in an <IFRAME>

For Java and PL/SQL programmers, the elimination of excess Page Parameters and stopping the submitted HTML form from re-rendering the entire Portal page can be accomplished by wrapping the procedure in an <IFRAME> (Java programmers may wish to review Step 2 above to understand the wrapper concept). As in the the previous example, the wrapper procedure takes in only those parameters necessary to create the initial HTML form. But rather than calling the procedure directly, the wrapper returns an <IFRAME> to the browser that in turn calls the getUserData() procedure:

PROCEDURE render_portlet_iframe (
                                 pAction In NUMBER,
                                 pUserID IN NUMBER
                                ) IS

    iframeUrl VARCHAR2(100) :=
            '/portal/pls/portal/!your_schema.render_form?’;

BEGIN

    iframeUrl := iframeUrl ||’pAction=’||pAction||’&pUserID=’||pUserID;
    htp.p(’<iframe src=”‘||iframeUrl||’”></iframe>’);

END;

Although not shown here, a wrapping JSP would perform the same steps accept only those parameters necessary to render the initial HTML form and send the browser an <IFRAME> that calls a separate JSP to render the actual form. Submitting the form to the form-rendering JSP will only cause the contents of the <IFRAME> to refresh. Mission accomplished yet again!

Some of these concepts may need additional detail so please don’t hesitate to comment on this article or send a question to SingleQuery from this page’s header. Happy coding.

The Pickle

78 queries. 0.725 seconds.
Powered by Wordpress
theme by cmoanz