Oracle has some neat ways of handling hierarchical data.
A single Oracle SQL can find the entire tree of data above or below a node in a hierarchy using the CONNECT BY clause.
What if the requirement is to flatten hierarchical data?
For example, given a table containing the employee reporting hierarchy of an organization (image alongside), get a single SQL to return the 4 types of employee roles as columns:
- PROJECT_MANAGER (Level 1)
- DBA (Level 2)
- TEAM_LEAD (Level 2)
- TEAM_MEMBER (Level 3)
This is trickier than transposing rows as columns in a non-hierarchical table, and needs a little extra to get the result. But it is nowhere near as tough as it appears. No multiple self-joins, no recursive CTE.
This post will show you how to flatten hierarchical data using a single SQL in Oracle.
To build the SQL, let’s populate the hierarchical table EMPL with some data, as below.
Simple SELECT on the data:
SQL> -- Data in EMPL table SQL> select * from empl; EMP_NAME EMP_ROLE MANAGER_EMP_NA ---------------- --------------- -------------- Peter Matthews PROJECT_MANAGER Tom Ledford TEAM_LEAD Peter Matthews Timothy Leigh TEAM_LEAD Peter Matthews David Botham DBA Peter Matthews Trevor Malcolm TEAM_MEMBER Tom Ledford Toby Mayer TEAM_MEMBER Tom Ledford Trevor McDermott TEAM_MEMBER Timothy Leigh Thor Martin TEAM_MEMBER Timothy Leigh 9 rows selected.
The expected result:
PROJECT_MANAGER DBA TEAM_LEAD TEAM_MEMBER --------------- --------------- --------------- --------------- Peter Matthews Peter Matthews Tom Ledford Peter Matthews Tom Ledford Trevor Malcolm Peter Matthews Tom Ledford Toby Mayer Peter Matthews Timothy Leigh Peter Matthews Timothy Leigh Trevor McDermott Peter Matthews Timothy Leigh Thor Martin Peter Matthews David Botham
We get to the expected result using a combination of SYS_CONNECT_BY_PATH and regular expressions.
The Magic SQL
SQL> select emplevel1 project_manager 2 , decode(rolelevel2, 'DBA', emplevel2) dba 3 , decode(rolelevel2, 'TEAM_LEAD', emplevel2) team_lead 4 , emplevel3 team_member 5 from 6 ( 7 select 8 regexp_substr(role_path, '[^/]+', 1, 1) rolelevel1 9 , regexp_substr(emp_path, '[^/]+', 1, 1) emplevel1 10 , regexp_substr(role_path, '[^/]+', 1, 2) rolelevel2 11 , regexp_substr(emp_path, '[^/]+', 1, 2) emplevel2 12 , regexp_substr(role_path, '[^/]+', 1, 3) rolelevel3 13 , regexp_substr(emp_path, '[^/]+', 1, 3) emplevel3 14 from 15 ( 16 select 17 sys_connect_by_path (emp_role, '/') role_path 18 , sys_connect_by_path (emp_name, '/') emp_path 19 from empl 20 start with manager_emp_name is null 21 connect by prior 22 emp_name = manager_emp_name 23 ) 24 ); PROJECT_MANAGER DBA TEAM_LEAD TEAM_MEMBER --------------- --------------- --------------- --------------- Peter Matthews Peter Matthews Tom Ledford Peter Matthews Tom Ledford Trevor Malcolm Peter Matthews Tom Ledford Toby Mayer Peter Matthews Timothy Leigh Peter Matthews Timothy Leigh Trevor McDermott Peter Matthews Timothy Leigh Thor Martin Peter Matthews David Botham 8 rows selected.
Making Sense of That SQL
1. The first step is to concatenate hierarchical data in a single “/”-delimited string. This gives us:
SQL>-- Concatenate hierarchical data SQL>-- in a single /-delimited string SQL>select 2 sys_connect_by_path (emp_role, '/') role_path 3 , sys_connect_by_path (emp_name, '/') emp_path 4 from empl 5 start with manager_emp_name is null 6 connect by prior 7 emp_name = manager_emp_name; ROLE_PATH EMP_PATH -------------------------------------- ----------------------------------------- /PROJECT_MANAGER /Peter Matthews /PROJECT_MANAGER/TEAM_LEAD /Peter Matthews/Tom Ledford /PROJECT_MANAGER/TEAM_LEAD/TEAM_MEMBER /Peter Matthews/Tom Ledford/Trevor Malcolm /PROJECT_MANAGER/TEAM_LEAD/TEAM_MEMBER /Peter Matthews/Tom Ledford/Toby Mayer /PROJECT_MANAGER/TEAM_LEAD /Peter Matthews/Timothy Leigh /PROJECT_MANAGER/TEAM_LEAD/TEAM_MEMBER /Peter Matthews/Timothy Leigh/Trevor McDermott /PROJECT_MANAGER/TEAM_LEAD/TEAM_MEMBER /Peter Matthews/Timothy Leigh/Thor Martin /PROJECT_MANAGER/DBA /Peter Matthews/David Botham 9 rows selected.
So employee names are concatenated in order of reporting hierarchy, as emp_path = emp1/emp2/…/empN, where 1 is the topmost node in the hierarchy. Similarly, the reporting path is role_path = role1/role2/…/roleN
2. Next, separate out the elements from the string using regular expressions. The Nth match of “/” in the strings role_path and emp_path finds the values for text in rolelevelN and emplevelN.
SQL> -- Regexp to separate out discrete roles and SQL> -- employees, by level SQL> select 2 regexp_substr(role_path, '[^/]+', 1, 1) rolelevel1 3 , regexp_substr(emp_path, '[^/]+', 1, 1) emplevel1 4 , regexp_substr(role_path, '[^/]+', 1, 2) rolelevel2 5 , regexp_substr(emp_path, '[^/]+', 1, 2) emplevel2 6 , regexp_substr(role_path, '[^/]+', 1, 3) rolelevel3 7 , regexp_substr(emp_path, '[^/]+', 1, 3) emplevel3 8 from 9 ( 10 -- Concatenate hierarchical data 11 -- in a single /-delimited string 12 select 13 sys_connect_by_path (emp_role, '/') role_path 14 , sys_connect_by_path (emp_name, '/') emp_path 15 from empl 16 start with manager_emp_name is null 17 connect by prior 18 emp_name = manager_emp_name 19 ); ROLELEVEL1 EMPLEVEL1 ROLELEVEL2 EMPLEVEL2 ROLELEVEL3 EMPLEVEL3 --------------- --------------- ---------- ------------- ----------- ---------------- PROJECT_MANAGER Peter Matthews PROJECT_MANAGER Peter Matthews TEAM_LEAD Tom Ledford PROJECT_MANAGER Peter Matthews TEAM_LEAD Tom Ledford TEAM_MEMBER Trevor Malcolm PROJECT_MANAGER Peter Matthews TEAM_LEAD Tom Ledford TEAM_MEMBER Toby Mayer PROJECT_MANAGER Peter Matthews TEAM_LEAD Timothy Leigh PROJECT_MANAGER Peter Matthews TEAM_LEAD Timothy Leigh TEAM_MEMBER Trevor McDermott PROJECT_MANAGER Peter Matthews TEAM_LEAD Timothy Leigh TEAM_MEMBER Thor Martin PROJECT_MANAGER Peter Matthews DBA David Botham 8 rows selected.
In this data set, we have a decision to make at Level 2, since the DBA as well as the TEAM_LEAD report to the PROJECT_MANAGER. Level 1 can always be taken as PROJECT_MANAGER, Level 3 as TEAM_MEMBER.
As with PIVOT queries on non-hierarchical data, here too we need to be sure that the column headers of flattened data are fixed.
3. Finally, make an inline view of the SQL above, and DECODE on rolelevelN to choose the right column headers.
SQL> -- Transposed columns by rolelevelN SQL> select emplevel1 project_manager 2 , decode(rolelevel2, 'DBA', emplevel2) dba 3 , decode(rolelevel2, 'TEAM_LEAD', emplevel2) team_lead 4 , emplevel3 team_member 5 from 6 ( 7 -- Regexp to separate out discrete roles and 8 -- employees, by level 9 select 10 regexp_substr(role_path, '[^/]+', 1, 1) rolelevel1 11 , regexp_substr(emp_path, '[^/]+', 1, 1) emplevel1 12 , regexp_substr(role_path, '[^/]+', 1, 2) rolelevel2 13 , regexp_substr(emp_path, '[^/]+', 1, 2) emplevel2 14 , regexp_substr(role_path, '[^/]+', 1, 3) rolelevel3 15 , regexp_substr(emp_path, '[^/]+', 1, 3) emplevel3 16 from 17 ( 18 -- Concatenate hierarchical data 19 -- in a single /-delimited string 20 select 21 sys_connect_by_path (emp_role, '/') role_path 22 , sys_connect_by_path (emp_name, '/') emp_path 23 from empl 24 start with manager_emp_name is null 25 connect by prior 26 emp_name = manager_emp_name 27 ) 28 ); PROJECT_MANAGER DBA TEAM_LEAD TEAM_MEMBER --------------- --------------- --------------- --------------- Peter Matthews Peter Matthews Tom Ledford Peter Matthews Tom Ledford Trevor Malcolm Peter Matthews Tom Ledford Toby Mayer Peter Matthews Timothy Leigh Peter Matthews Timothy Leigh Trevor McDermott Peter Matthews Timothy Leigh Thor Martin Peter Matthews David Botham 8 rows selected.