Sunday, July 10, 2011

To Sum It Up...

ROLLUP, CUBE, GROUPING SETS and GROUPING can be used to subtotal and total aggregates based on the values in the GROUP BY clause.

ROLLUP generates subtotals and totals (for example, salary by departments, and the total salary for the report).  If there is more than one field listed in the GROUP BY ROLLUP clause, the subtotals move from right to left.

This shows the basic 'GROUP BY' which will give subtotals for the dept, but no grand total.
SELECT DeptNo
     , SUM(Sal)
  FROM Emp
 WHERE DeptNo <> 10
 GROUP BY DeptNo;

DEPTNO   SUM(SAL)
------ ----------
    30       9400
    20      10875
If 'GROUP BY' is changed to 'GROUP BY ROLLUP' it adds a grand total to the subtotals.
SELECT DeptNo
     , SUM(Sal)
  FROM Emp
 WHERE DeptNo <> 10
 GROUP BY ROLLUP (DeptNo);

DEPTNO   SUM(SAL)
------ ----------
    20      10875
    30       9400
            20275
The GROUPING function can be used to note which lines are subtotals or totals.

This first example is used to show the settings of GROUPING.
SELECT DeptNo
     , SUM(Sal)
     , GROUPING(DeptNo)
     , DECODE(
              GROUPING(DeptNo) -- Decode the GROUPING value for DeptNo
             ,1                -- If it is 1,
             ,'Total'          -- Then print 'Total'
             ,DeptNo           -- otherwise, print the actual DeptNo
             ) as Dept_Tot     -- call this column 'Dept_Tot'
  FROM Emp
 WHERE DeptNo <> 10
 GROUP BY ROLLUP (DeptNo);

DEPTNO   SUM(SAL) GROUPING(DEPTNO) DEPT_TOT
------ ---------- ---------------- -------------
    20      10875                0 20
    30       9400                0 30
            20275                1 Total
As you can see, when the GROUPING value for the DeptNo is '1,' this is a total. Otherwise, this is the actual DeptNo number.

Obviously, we wouldn't want our report to list DeptNo twice, so I would move this around to the front. And of course, the GROUPING value doesn't need to be displayed at all.
Column Dept_Tot Format A13

SELECT DECODE(GROUPING(DeptNo),1,'Total',DeptNo) as Dept_Tot
     , SUM(Sal)
  FROM Emp
 WHERE DeptNo <> 10
 GROUP BY ROLLUP (DeptNo);

DEPT_TOT        SUM(SAL)
------------- ----------
20                 10875
30                  9400
Total              20275
Here's a query where we want subtotals based on two values. Since ROLLUP goes right to left, the first subtotal will actually be the GROUP BY for the Job. After that, comes the DeptNo subtotal, and then the Grand Total.  This example, doesn't decode the GROUPING values.
SELECT DeptNo
     , Job
     , SUM(Sal)
  FROM Emp
 WHERE DeptNo <> 10
 GROUP BY ROLLUP (DeptNo
                 ,Job
                 );

DEPTNO JOB         SUM(SAL)
------ --------- ----------
    20 CLERK           1900
    20 ANALYST         6000
    20 MANAGER         2975
    20                10875
    30 CLERK            950
    30 MANAGER         2850
    30 SALESMAN        5600
    30                 9400
                      20275
Now, lets add the headings into this report using DECODE.

Since the GROUP BY has already subtotaled the job, When the GROUPING value of Job is  set to '1', it will actually be for the higher group, which is the department.

When the GROUPING value of DeptNo is '1,' it is for the higher group, which is the entire report.
Column Job_Tot Format A16

SELECT DECODE(GROUPING(DeptNo),1,'* Grand Total',DeptNo) as Dept_Tot
     , DECODE(GROUPING(Job   ),1,'* Dept Total' ,Job   ) as Job_Tot
     , SUM(Sal)
  FROM Emp
 WHERE DeptNo <> 10
 GROUP BY ROLLUP (DeptNo
                 ,Job
                 );

DEPT_TOT      JOB_TOT            SUM(SAL)
------------- ---------------- ----------
20            CLERK                  1900
20            ANALYST                6000
20            MANAGER                2975
20            * Dept Total          10875
30            CLERK                   950
30            MANAGER                2850
30            SALESMAN               5600
30            * Dept Total           9400
* Grand Total * Dept Total          20275
This is close to what I want, but I don't want it to say 'Dept Total' on the Grand total line.  I'm going to switch from DECODE to CASE because I think it's going to be easier to understand.

Since I've put the DeptNo in the Total line, I'm going to drop that column.
SELECT Case When GROUPING(Job)    = 1
             And Grouping(DeptNo) = 0
            Then '* Total: Dept '||DeptNo
            --
            When GROUPING(Job)    = 1
             And GROUPING(DeptNo) = 1
            Then '** Grand Total'
            Else Job
        End    as Job_Tot
     , SUM(Sal)
  FROM Emp
 WHERE DeptNo <> 10
 GROUP BY ROLLUP (DeptNo
                 ,Job
                 );

JOB_TOT            SUM(SAL)
---------------- ----------
CLERK                  1900
ANALYST                6000
MANAGER                2975
* Total: Dept 20      10875
CLERK                   950
MANAGER                2850
SALESMAN               5600
* Total: Dept 30       9400
** Grand Total        20275

CUBE: Cross-Tabulation


The Cube statement does a cross-tabulation, which means every possible combinations of rows is aggregated.  I'm going to narrow down the output based on job to make it simpler to view.  Also, I'm going to display the GROUPING values for the two columns.
SELECT DeptNo
     , Grouping(DeptNo)
     , Job
     , Grouping(Job)
     , SUM(Sal)
  FROM Emp
 WHERE DeptNo <> 10
   AND Job In ('MANAGER','CLERK')
 GROUP BY CUBE (DeptNo
               ,Job
               );
As shown, there are four combinations of the GROUPING values.
DEPTNO GROUPING(DEPTNO) JOB       GROUPING(JOB)   SUM(SAL)
------ ---------------- --------- ------------- ----------
                      1                       1       8675 <-- grand
                      1 CLERK                 0       2850 <-- all clerks
                      1 MANAGER               0       5825 <-- all managers
    20                0                       1       4875 <-- dept 20
    20                0 CLERK                 0       1900 <-- clerks in dept 20
    20                0 MANAGER               0       2975 <-- managers in dept 20
    30                0                       1       3800 <-- dept 30
    30                0 CLERK                 0        950 <-- clerks in dept 30
    30                0 MANAGER               0       2850 <-- managers in dept 30
Based on that, we can write a total statement about each condition showing the actual values. It's padded with dots to the right.
Column Tot_Hdr Format A50

SELECT Rpad(
            Case When Grouping(DeptNo) = 1
                  And Grouping(Job)    = 1
                 Then '*** Grand Total'
                 --
                 When Grouping(DeptNo) = 1
                  And Grouping(Job)    = 0
                 Then '    ** Total for All '||Job ||'s'
                  --
                 When Grouping(DeptNo) = 0
                  And Grouping(Job)    = 1
                 Then '    ** Total for dept '|| DeptNo
                  --
                 When Grouping(DeptNo) = 0
                  And Grouping(Job)    = 0
                 Then ' * Total for '|| Job || 's in Dept ' || DeptNo
            End
           ,50  -- the end of Rpad
           ,'. '
           )               As Tot
       --------------------------------------------------------
     , To_Char(
               Sum(Sal)
              ,'$9,999.99'
              )            As Tot_Sal
  FROM Emp
 WHERE DeptNo <> 10               -- narrow down this example
   AND Job In ('MANAGER','CLERK') -- narrow down this example
 GROUP BY Cube (DeptNo
               ,Job
               );

TOT                                                          TOT_SAL
------------------------------------------------------------ ----------
*** Grand Total. . . . . . . . . . . . . . . . . .            $8,675.00
    ** Total for All CLERKs. . . . . . . . . . . .            $2,850.00
    ** Total for All MANAGERs. . . . . . . . . . .            $5,825.00
    ** Total for dept 20. . . . . . . . . . . . .             $4,875.00
       * Total for CLERKs in Dept 20. . . . . . .             $1,900.00
       * Total for MANAGERs in Dept 20. . . . . .             $2,975.00
    ** Total for dept 30. . . . . . . . . . . . .             $3,800.00
       * Total for CLERKs in Dept 30. . . . . . .               $950.00
       * Total for MANAGERs in Dept 30. . . . . .             $2,850.00
Here's the same type of thing with three fields:
SELECT DeptNo
     , Grouping(DeptNo)
     , Job
     , Grouping(Job)
     , To_Char(Hiredate,'YYYY') As HireYr 
     , Grouping(To_Char(Hiredate,'YYYY')) As Grouping_Yr
     , SUM(Sal)
  FROM Emp
 WHERE DeptNo <> 10
   AND Job In ('MANAGER','CLERK')
   AND To_Char(Hiredate,'YYYY') In ('1981','1987') -- narrow down this example
 GROUP BY Cube (DeptNo
               ,Job
               ,To_Char(Hiredate,'YYYY')
               );

DEPTNO GROUPING(DEPTNO) JOB       GROUPING(JOB) HIRE GROUPING_Yr SUM(SAL)
------ ---------------- --------- ------------- ---- ----------- --------
                      1                       1               1     8675
                      1                       1 1981          0     6775
                      1                       1 1987          0     1100
                      1 CLERK                 0               1     2850
                      1 CLERK                 0 1981          0      950
                      1 CLERK                 0 1987          0     1100
                      1 MANAGER               0               1     5825
                      1 MANAGER               0 1981          0     5825
    20                0                       1               1     4875
    20                0                       1 1981          0     2975
    20                0                       1 1987          0     1100
    20                0 CLERK                 0               1     1900
    20                0 CLERK                 0 1987          0     1100
    20                0 MANAGER               0               1     2975
    20                0 MANAGER               0 1981          0     2975
    30                0                       1               1     3800
    30                0                       1 1981          0     3800
    30                0 CLERK                 0               1      950
    30                0 CLERK                 0 1981          0      950
    30                0 MANAGER               0               1     2850
    30                0 MANAGER               0 1981          0     2850

Column Tot format A60

SELECT RPad(
            Case When Grouping(Deptno)                   = 1
                  And Grouping(Job)                      = 1
                  And Grouping(To_Char(Hiredate,'YYYY')) = 1
                 Then '****Grand Total'
                 ----------------------
                 -- Those with two '1's
                 ----------------------
                 When Grouping(Deptno)                   = 1
                  And Grouping(Job)                      = 1
                  And Grouping(To_Char(Hiredate,'YYYY')) = 0
                 Then '    ***Total of people hired in '||To_Char(Hiredate,'YYYY')
                 --
                 When Grouping(Deptno)                   = 1
                  And Grouping(Job)                      = 0
                  And Grouping(To_Char(Hiredate,'YYYY')) = 1
                 Then '    ***Total for '||InitCap(Job)||'s'
                 --
                 When Grouping(Deptno)                   = 0
                  And Grouping(Job)                      = 1
                  And Grouping(To_Char(Hiredate,'YYYY')) = 1
                 Then '    ***Total for Department '||DeptNo
                 ---------------------
                 -- Those with one '1'
                 ---------------------
                 When Grouping(Deptno)                   = 1
                  And Grouping(Job)                      = 0
                  And Grouping(To_Char(Hiredate,'YYYY')) = 0
                 Then ' **Total for '||InitCap(Job)||'s hired in '
                                  ||To_Char(Hiredate,'YYYY')
                 --
                 When Grouping(Deptno)                   = 0
                  And Grouping(Job)                      = 1
                  And Grouping(To_Char(Hiredate,'YYYY')) = 0
                 Then ' **Total for Department '||DeptNo||' hired in '
                                      ||To_Char(Hiredate,'YYYY')
                 --
                 When Grouping(Deptno)                   = 0
                  And Grouping(Job)                      = 0
                  And Grouping(To_Char(Hiredate,'YYYY')) = 1
                 Then ' **Total for Department '||DeptNo||'''s '||InitCap(Job)||'s'
                 ---------------------
                 -- Those with no '1'
                 ---------------------
                 When Grouping(Deptno)                   = 0
                  And Grouping(Job)                      = 0
                  And Grouping(To_Char(Hiredate,'YYYY')) = 0
                 Then '   *Total for Department '||DeptNo||'''s '||InitCap(Job)
                                    ||'s hired in '||To_Char(Hiredate,'YYYY')
             End
           ,60     -- the end of the RPad
           ,'. '
           )               As Tot
      ---------------------------------------------------------------------------------
     , To_Char(
               Sum(Sal)
              ,'$9,999.99'
              )           As Tot_Sal
  FROM Emp
 WHERE DeptNo <> 10                                -- narrow down this example
   AND Job In ('MANAGER','CLERK')                  -- narrow down this example
   AND To_Char(Hiredate,'YYYY') In ('1981','1987') -- narrow down this example
 GROUP BY Cube (DeptNo
           ,Job
           ,To_Char(Hiredate,'YYYY')
           );

TOT                                                          TOT_SAL
------------------------------------------------------------ ----------
****Grand Total. . . . . . . . . . . . . . . . . . . . . . .  $7,875.00
    ***Total of people hired in 1981. . . . . . . . . . . .   $6,775.00
    ***Total of people hired in 1987. . . . . . . . . . . .   $1,100.00
    ***Total for Clerks. . . . . . . . . . . . . . . . . . .  $2,050.00
       **Total for Clerks hired in 1981. . . . . . . . . . .    $950.00
       **Total for Clerks hired in 1987. . . . . . . . . . .  $1,100.00
    ***Total for Managers. . . . . . . . . . . . . . . . . .  $5,825.00
       **Total for Managers hired in 1981. . . . . . . . . .  $5,825.00
    ***Total for Department 20. . . . . . . . . . . . . . .   $4,075.00
       **Total for Department 20 hired in 1981. . . . . . .   $2,975.00
       **Total for Department 20 hired in 1987. . . . . . .   $1,100.00
       **Total for Department 20's Clerks. . . . . . . . . .  $1,100.00
         *Total for Department 20's Clerks hired in 1987. .   $1,100.00
       **Total for Department 20's Managers. . . . . . . . .  $2,975.00
         *Total for Department 20's Managers hired in 1981.   $2,975.00
    ***Total for Department 30. . . . . . . . . . . . . . .   $3,800.00
       **Total for Department 30 hired in 1981. . . . . . .   $3,800.00
       **Total for Department 30's Clerks. . . . . . . . . .    $950.00
         *Total for Department 30's Clerks hired in 1981. .     $950.00
       **Total for Department 30's Managers. . . . . . . . .  $2,850.00
         *Total for Department 30's Managers hired in 1981.   $2,850.00

GROUPING SETS

GROUPING SETS allows multiple 'GROUP BY' clauses in the same result set. It does not display all the totals shown in the rollup and cube.
This could be a large savings in processing if it works to grab both sets of data at the same time instead of making two passes through the database.
SELECT DeptNo
     , Job
     , To_Char(Hiredate,'YYYY') As HireYr
     , SUM(Sal)
  FROM Emp
 WHERE DeptNo <> 10                                -- narrow down this example
   AND Job In ('MANAGER','CLERK')                  -- narrow down this example
   AND To_Char(Hiredate,'YYYY') In ('1981','1987') -- narrow down this example
 GROUP BY GROUPING SETS ((DeptNo, Job)
                        ,(Job   , To_Char(Hiredate,'YYYY'))
                        );

DEPTNO JOB       HIRE   SUM(SAL)
------ --------- ---- ----------
    20 MANAGER              2975
    20 CLERK                1100
    30 CLERK                 950
    30 MANAGER              2850
       CLERK     1981        950
       MANAGER   1981       5825
       CLERK     1987       1100
This is the same as the following two queries (in output -- not processing time).
SELECT DeptNo
     , Job
     , SUM(Sal)
  FROM Emp
 WHERE DeptNo <> 10                     
   AND Job In ('MANAGER','CLERK')       
 GROUP BY DeptNo, Job;

    DEPTNO JOB         SUM(SAL)
---------- --------- ----------
        20 CLERK           1900
        20 MANAGER         2975
        30 CLERK            950
        30 MANAGER         2850

SELECT  Job
     , To_Char(Hiredate,'YYYY') As HireYr
     , SUM(Sal)
  FROM Emp
 WHERE Job In ('MANAGER','CLERK')       
   AND To_Char(Hiredate,'YYYY') In ('1981','1987') 
 GROUP BY Job, To_Char(Hiredate,'YYYY');

JOB       HIRE   SUM(SAL)
--------- ---- ----------
CLERK     1981        950
MANAGER   1981       8275
CLERK     1987       1100

No comments:

Post a Comment