How to convert Oracle aggregate functions, which contain PARTITION BY and ORDER BY clauses
In Oracle, you can write a SELECT statement with aggregate functions (like MAX, MIN or AVG) over a partitioned query result set and specify how data are ordered within a partition. Such queries have the following format:
SELECT column_list,
Aggr_fun(column_name) OVER (PARTITION BY column_name ORDER BY column_name)
AS alias_name
FROM table_name;
SQL Server allows selecting data using aggregate functions with PARTITION BY clause too, but there are differences in the syntax which should be taken into account during the conversion.
The point is that SQL Server does not support ORDER BY clause in this context. Besides, ORDER BY uses a sliding window concept to determine the range of rows involved in calculations for the current row. This concept has no counterpart in SQL Server. In Oracle, a range of rows defined by the window can be ordered according to options specified in the ORDER BY clause. Thus, NULL values can be treated differently and you can specify if they should appear first or last in the ordering sequence. This means that rows can be sorted differently within a partition in Oracle and SQL Server. The difference in default sorting of NULL values lead not only to different order of rows but also to wrong aggregation results.
The next workaround is dealing with sorting rows within a partition. For now, we ignore the differences in the handling of NULLs. This approach is based on comparing column values.
1) Emulation of the ascending ordering
The first and the simplest case is when ORDER BY expression consists of one column. You can emulate the ascending ordering of the field by the following scheme.
SELECT ….,
(SELECT Aggr_fun(column) FROM TABLE P1
WHERE
P1.COL1 <= P0.COL1
and
P1.COL_val1 = P0.COL_val1
and
P1.COL_val2 = P0.COL_val2 …
)
FROM TABLE P0
When ORDER BY expression consists of multiple columns we should specify a Boolean condition which describes the sorting algorithm. This condition should be used instead of the “P1.COL1 <= P0.COL1” expression in the WHERE clause from the previous case. Thus, if an ORDER BY expression consists of two columns (col1, col2) the condition will be:
WHERE
1=CASE
WHEN p1.COL1 > p0.COL1 THEN 0
WHEN p1.COL1 < p0.COL1 THEN 1
WHEN p1.COL1 = p0.COL1 THEN
CASE
WHEN p1.COL2 > p0.COL2 THEN 0
WHEN p1.COL2 <= p0.COL2 THEN 1
END
END
The examples to clarify this approach will be given below. The CASE expression will become more and more complex when the number of columns grows. The way to build this construction is to insert
WHEN p1.col_n = p0.col_n THEN
CASE
WHEN p1.col_n+1 > p0.col_n+1 THEN 0
WHEN p1.col_n+1 <= p0.col_n+1 THEN 1
END
into the most internal CASE operator when the next column appears in ORDER BY.
2) Emulation of the descending ordering
In the case when ORDER BY expression consists of one column, it is enough to change the inequality operator to reverse the order in SQL Server. That is, the expression “P1.COL1 <= P0.COL1” should be changed to “P1.COL1 >= P0.COL1”.
When ORDER BY expression consists of multiple columns, it is enough to change the inequality signs to the opposite in the corresponding level of the CASE operator.
Now, I will give some examples of such conversions. First, let us consider the case of using one column in ORDER BY clause. Note that ordering is ascending by default in Oracle.
Oracle:
SELECT ACC_PERIOD, EMP_LOAN_CODE,EMP_CODE,
sum(CR) over (partition by ACC_PERIOD order by EMP_LOAN_CODE)
as total
from EMP_LOANS_ALLSTATMENT;
SQL Server:
SELECT ACC_PERIOD, EMP_LOAN_CODE,
(SELECT SUM(CR) FROM EMP_LOANS_ALLSTATMENT p1 WHERE
p1.EMP_LOAN_CODE <= p0.EMP_LOAN_CODE
and p1.ACC_PERIOD = p0.ACC_PERIOD) AS total
FROM EMP_LOANS_ALLSTATMENT p0
When we add an additional column in ORDER BY clause, for example, the previous OVER clause in Oracle query will be “SUM(CR) over (partition by ACC_PERIOD order by EMP_LOAN_CODE, EMP_CODE)”, then the query should be converted as follows:
SELECT ACC_PERIOD,EMP_LOAN_CODE, EMP_CODE,
(
SELECT SUM(CR) FROM EMP_LOANS_ALLSTATMENT p1
WHERE
1=CASE
WHEN p1.EMP_LOAN_CODE>p0.EMP_LOAN_CODE THEN 0
WHEN p1.EMP_LOAN_CODE
WHEN p0.EMP_LOAN_CODE=p1.EMP_LOAN_CODE THEN
CASE
WHEN p1.EMP_CODE>p0.EMP_CODE THEN 0
WHEN p1.EMP_CODE<=p0.EMP_CODE THEN 1
END
END
and p1.ACC_PERIOD = p0.ACC_PERIOD
) AS total
FROM EMP_LOANS_ALLSTATMENT p0
Now, let us consider the situation when a column is sorted in descending order. If we want to sort EMP_LOAN_CODE column in descending order, the previous WHERE clause should be rewritten in SQL Server in the following way:
WHERE
1=CASE
WHEN p1.EMP_LOAN_CODE
WHEN p1.EMP_LOAN_CODE>p0.EMP_LOAN_CODE THEN 1
WHEN p0.EMP_LOAN_CODE=p1.EMP_LOAN_CODE THEN
CASE
WHEN p1.EMP_CODE
WHEN p1.EMP_CODE>=p0.EMP_CODE THEN 1
END
END
Written by: Galina Shevchenko
Tagged as aggregate, Conversion, Migration, Oracle, ORDER BY, OVER, PARTITION BY, SQL Server + Categorized as Newsletters, Newsletter #7, February 27, 2009, Uncategorized
Leave a Reply
You must be logged in to post a comment.