Oracle 23ai(23.7)에 SQL Time Bucketing 기능이 추가되었습니다. 이 기능을 사용하면 시계열 데이터를 특정 간격(버킷)으로 집계 분석할 수 있습니다.
TIME_BUCKET 함수는 입력 날짜 값를 특정 원점에 맞춰 지정된 간격으로 버킷화하여 버킷의 시작 부분을 반환합니다. 아래는 TIME_BUCKET 함수의 구문입니다. 첫 번째 인수는 입력 날짜 값, 두 번째 인수는 간격, 세 번째 인수는 원점, 네 번째 인수는 버킷의 시작 또는 끝을 반환할지 여부(기본값은 START), 다섯 번째 인수는 월말일에 대해 버킷을 결정하는 방식(기본값은 ON OVERFLOW ROUND)을 입력합니다.
TIME_BUCKET (datetime, stride, origin, [ START | END ] [ ON OVERFLOW { ROUND | ERROR } | LAST DAY OF MONTH ])
두 번째 인수는 INTERVAL YEAR TO MONTH, INTERVAL DAY TO SECOND, ISO 8601 시간 인터벌 문자열 형식으로 입력할 수 있습니다. 아래는 ISO 8601 시간 인터벌 문자열 형식의 구문입니다.
[-]P[yearsY][monthsM][daysD][T[hoursH][minutesM][seconds[.fac_secs]S]]
다섯 번째 인수는 아래와 같이 동작합니다.
- ON OVERFLOW ROUND: 해당 월의 원점과 같은 날에 반올림 (해당 날짜가 없는 월의 경우 해당 월의 마지막 날로 반올림)
- ON OVERFLOW ERROR: 해당 월의 원점과 같은 날로 잘림 (해당 날짜가 없는 월의 경우 오류 발생)
- LAST DAY OF MONTH: 원점이 해당 월의 마지막 날이고 범위에 월 또는 년만 포함된 경우 해당 월의 마지막 날로 잘림
테스트를 위해 아래와 같이 테이블을 생성하겠습니다.
-- 1
DROP TABLE t1 PURGE;
DROP TABLE t2 PURGE;
CREATE TABLE t1 (c1) AS
SELECT DATE '2050-01-01' + NUMTODSINTERVAL (ROWNUM - 1, 'SECOND')
FROM XMLTABLE ('1 to 7200');
CREATE TABLE t2 (c1) AS
SELECT DATE '2050-01-01' + NUMTODSINTERVAL (ROWNUM - 1, 'DAY')
FROM XMLTABLE ('1 to 1000')
WHERE DATE '2050-01-01' + NUMTODSINTERVAL (ROWNUM - 1, 'DAY') < DATE '2052-01-01';
아래 예제는 원점이 00:00:00이고 간격이 10분인 버킷으로 데이터를 집계한 결과를 보여줍니다. 네 번째 인수가 END면 다음 버킷의 시작 값을 반환합니다. 즉, 버킷에는 시작 이상, 끝 미만인 데이터가 포함됩니다.
-- 2
SELECT TIME_BUCKET (c1, 'PT10M', TO_DATE ('0001-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS')) AS start_dt
, TIME_BUCKET (c1, 'PT10M', TO_DATE ('0001-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS'), END) AS end_dt
, MIN (c1) AS min_dt, MAX (c1) AS max_dt, COUNT (*) AS cnt
FROM t1
GROUP BY TIME_BUCKET (c1, 'PT10M', TO_DATE ('0001-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS'))
, TIME_BUCKET (c1, 'PT10M', TO_DATE ('0001-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS'), END);
START_DT END_DT MIN_DT MAX_DT CNT
------------------- ------------------- ------------------- ------------------- ---
2050-01-01 00:00:00 2050-01-01 00:10:00 2050-01-01 00:00:00 2050-01-01 00:09:59 600
2050-01-01 00:10:00 2050-01-01 00:20:00 2050-01-01 00:10:00 2050-01-01 00:19:59 600
2050-01-01 00:20:00 2050-01-01 00:30:00 2050-01-01 00:20:00 2050-01-01 00:29:59 600
2050-01-01 00:30:00 2050-01-01 00:40:00 2050-01-01 00:30:00 2050-01-01 00:39:59 600
2050-01-01 00:40:00 2050-01-01 00:50:00 2050-01-01 00:40:00 2050-01-01 00:49:59 600
2050-01-01 00:50:00 2050-01-01 01:00:00 2050-01-01 00:50:00 2050-01-01 00:59:59 600
2050-01-01 01:00:00 2050-01-01 01:10:00 2050-01-01 01:00:00 2050-01-01 01:09:59 600
2050-01-01 01:10:00 2050-01-01 01:20:00 2050-01-01 01:10:00 2050-01-01 01:19:59 600
2050-01-01 01:20:00 2050-01-01 01:30:00 2050-01-01 01:20:00 2050-01-01 01:29:59 600
2050-01-01 01:30:00 2050-01-01 01:40:00 2050-01-01 01:30:00 2050-01-01 01:39:59 600
2050-01-01 01:40:00 2050-01-01 01:50:00 2050-01-01 01:40:00 2050-01-01 01:49:59 600
2050-01-01 01:50:00 2050-01-01 02:00:00 2050-01-01 01:50:00 2050-01-01 01:59:59 600
12 rows selected.
23.7 이전 버전은 아래의 표현식으로 2번 예제의 집계를 수행할 수 있습니다.
-- 3
SELECT TRUNC (c1, 'HH24') + NUMTODSINTERVAL ( FLOOR (TO_CHAR (c1, 'MI.SS') / 10) * 10, 'MINUTE') AS start_dt
, TRUNC (c1, 'HH24') + NUMTODSINTERVAL ((FLOOR (TO_CHAR (c1, 'MI.SS') / 10) + 1) * 10, 'MINUTE') AS end_dt
, MIN (c1) AS min_dt, MAX (c1) AS max_dt, COUNT (*) AS cnt
FROM t1
GROUP BY TRUNC (c1, 'HH24') + NUMTODSINTERVAL ( FLOOR (TO_CHAR (c1, 'MI.SS') / 10) * 10, 'MINUTE')
, TRUNC (c1, 'HH24') + NUMTODSINTERVAL ((FLOOR (TO_CHAR (c1, 'MI.SS') / 10) + 1) * 10, 'MINUTE');
START_DT END_DT MIN_DT MAX_DT CNT
------------------- ------------------- ------------------- ------------------- ---
2050-01-01 00:00:00 2050-01-01 00:10:00 2050-01-01 00:00:00 2050-01-01 00:09:59 600
2050-01-01 00:10:00 2050-01-01 00:20:00 2050-01-01 00:10:00 2050-01-01 00:19:59 600
2050-01-01 00:20:00 2050-01-01 00:30:00 2050-01-01 00:20:00 2050-01-01 00:29:59 600
2050-01-01 00:30:00 2050-01-01 00:40:00 2050-01-01 00:30:00 2050-01-01 00:39:59 600
2050-01-01 00:40:00 2050-01-01 00:50:00 2050-01-01 00:40:00 2050-01-01 00:49:59 600
2050-01-01 00:50:00 2050-01-01 01:00:00 2050-01-01 00:50:00 2050-01-01 00:59:59 600
2050-01-01 01:00:00 2050-01-01 01:10:00 2050-01-01 01:00:00 2050-01-01 01:09:59 600
2050-01-01 01:10:00 2050-01-01 01:20:00 2050-01-01 01:10:00 2050-01-01 01:19:59 600
2050-01-01 01:20:00 2050-01-01 01:30:00 2050-01-01 01:20:00 2050-01-01 01:29:59 600
2050-01-01 01:30:00 2050-01-01 01:40:00 2050-01-01 01:30:00 2050-01-01 01:39:59 600
2050-01-01 01:40:00 2050-01-01 01:50:00 2050-01-01 01:40:00 2050-01-01 01:49:59 600
2050-01-01 01:50:00 2050-01-01 02:00:00 2050-01-01 01:50:00 2050-01-01 01:59:59 600
12 rows selected.
아래 예제는 원점이 0001-01-01이고 간격이 2달인 버킷으로 데이터를 집계한 결과를 보여줍니다.
-- 4
SELECT TIME_BUCKET (c1, 'P2M', DATE '0001-01-01') AS dt
, TIME_BUCKET (c1, 'P2M', DATE '0001-01-01', END) AS dt
, MIN (c1) AS min_dt, MAX (c1) AS max_dt, COUNT (*) AS cnt
FROM t2
GROUP BY TIME_BUCKET (c1, 'P2M', DATE '0001-01-01')
, TIME_BUCKET (c1, 'P2M', DATE '0001-01-01', END);
DT DT MIN_DT MAX_DT CNT
------------------- ------------------- ------------------- ------------------- ---
2050-01-01 00:00:00 2050-03-01 00:00:00 2050-01-01 00:00:00 2050-02-28 00:00:00 59
2050-03-01 00:00:00 2050-05-01 00:00:00 2050-03-01 00:00:00 2050-04-30 00:00:00 61
2050-05-01 00:00:00 2050-07-01 00:00:00 2050-05-01 00:00:00 2050-06-30 00:00:00 61
2050-07-01 00:00:00 2050-09-01 00:00:00 2050-07-01 00:00:00 2050-08-31 00:00:00 62
2050-09-01 00:00:00 2050-11-01 00:00:00 2050-09-01 00:00:00 2050-10-31 00:00:00 61
2050-11-01 00:00:00 2051-01-01 00:00:00 2050-11-01 00:00:00 2050-12-31 00:00:00 61
2051-01-01 00:00:00 2051-03-01 00:00:00 2051-01-01 00:00:00 2051-02-28 00:00:00 59
2051-03-01 00:00:00 2051-05-01 00:00:00 2051-03-01 00:00:00 2051-04-30 00:00:00 61
2051-05-01 00:00:00 2051-07-01 00:00:00 2051-05-01 00:00:00 2051-06-30 00:00:00 61
2051-07-01 00:00:00 2051-09-01 00:00:00 2051-07-01 00:00:00 2051-08-31 00:00:00 62
2051-09-01 00:00:00 2051-11-01 00:00:00 2051-09-01 00:00:00 2051-10-31 00:00:00 61
2051-11-01 00:00:00 2052-01-01 00:00:00 2051-11-01 00:00:00 2051-12-31 00:00:00 61
12 rows selected.
23.7 이전 버전은 아래의 표현식으로 4번 예제의 집계를 수행할 수 있습니다.
-- 5
SELECT TRUNC (c1, 'YYYY') + NUMTOYMINTERVAL ((CEIL (TO_CHAR (c1, 'MM') / 2) - 1) * 2, 'MONTH') AS start_dt
, TRUNC (c1, 'YYYY') + NUMTOYMINTERVAL ( CEIL (TO_CHAR (c1, 'MM') / 2) * 2, 'MONTH') AS end_dt
, MIN (c1) AS min_dt, MAX (c1) AS max_dt, COUNT (*) AS cnt
FROM t2
GROUP BY TRUNC (c1, 'YYYY') + NUMTOYMINTERVAL ((CEIL (TO_CHAR (c1, 'MM') / 2) - 1) * 2, 'MONTH')
, TRUNC (c1, 'YYYY') + NUMTOYMINTERVAL ( CEIL (TO_CHAR (c1, 'MM') / 2) * 2, 'MONTH');
START_DT END_DT MIN_DT MAX_DT CNT
------------------- ------------------- ------------------- ------------------- ---
2050-01-01 00:00:00 2050-03-01 00:00:00 2050-01-01 00:00:00 2050-02-28 00:00:00 59
2050-03-01 00:00:00 2050-05-01 00:00:00 2050-03-01 00:00:00 2050-04-30 00:00:00 61
2050-05-01 00:00:00 2050-07-01 00:00:00 2050-05-01 00:00:00 2050-06-30 00:00:00 61
2050-07-01 00:00:00 2050-09-01 00:00:00 2050-07-01 00:00:00 2050-08-31 00:00:00 62
2050-09-01 00:00:00 2050-11-01 00:00:00 2050-09-01 00:00:00 2050-10-31 00:00:00 61
2050-11-01 00:00:00 2051-01-01 00:00:00 2050-11-01 00:00:00 2050-12-31 00:00:00 61
2051-01-01 00:00:00 2051-03-01 00:00:00 2051-01-01 00:00:00 2051-02-28 00:00:00 59
2051-03-01 00:00:00 2051-05-01 00:00:00 2051-03-01 00:00:00 2051-04-30 00:00:00 61
2051-05-01 00:00:00 2051-07-01 00:00:00 2051-05-01 00:00:00 2051-06-30 00:00:00 61
2051-07-01 00:00:00 2051-09-01 00:00:00 2051-07-01 00:00:00 2051-08-31 00:00:00 62
2051-09-01 00:00:00 2051-11-01 00:00:00 2051-09-01 00:00:00 2051-10-31 00:00:00 61
2051-11-01 00:00:00 2052-01-01 00:00:00 2051-11-01 00:00:00 2051-12-31 00:00:00 61
12 rows selected.
아래 예제는 다섯 번째 인수의 동작을 보여줍니다.
-- 6-1
SELECT TIME_BUCKET (c1, 'P1M', DATE '0001-11-30') AS start_dt
, MIN (c1) AS min_dt, MAX (c1) AS max_dt, COUNT (*) AS cnt
FROM t2
WHERE c1 < DATE '2051-01-01'
GROUP BY TIME_BUCKET (c1, 'P1M', DATE '0001-11-30');
START_DT MIN_DT MAX_DT CNT
------------------- ------------------- ------------------- ---
2049-12-30 00:00:00 2050-01-01 00:00:00 2050-01-29 00:00:00 29
2050-01-30 00:00:00 2050-01-30 00:00:00 2050-02-27 00:00:00 29
2050-02-28 00:00:00 2050-02-28 00:00:00 2050-03-29 00:00:00 30
2050-03-30 00:00:00 2050-03-30 00:00:00 2050-04-29 00:00:00 31
2050-04-30 00:00:00 2050-04-30 00:00:00 2050-05-29 00:00:00 30
2050-05-30 00:00:00 2050-05-30 00:00:00 2050-06-29 00:00:00 31
2050-06-30 00:00:00 2050-06-30 00:00:00 2050-07-29 00:00:00 30
2050-07-30 00:00:00 2050-07-30 00:00:00 2050-08-29 00:00:00 31
2050-08-30 00:00:00 2050-08-30 00:00:00 2050-09-29 00:00:00 31
2050-09-30 00:00:00 2050-09-30 00:00:00 2050-10-29 00:00:00 30
2050-10-30 00:00:00 2050-10-30 00:00:00 2050-11-29 00:00:00 31
2050-11-30 00:00:00 2050-11-30 00:00:00 2050-12-29 00:00:00 30
2050-12-30 00:00:00 2050-12-30 00:00:00 2050-12-31 00:00:00 2
13 rows selected.
-- 6-2
SELECT TIME_BUCKET (c1, 'P1M', DATE '0001-11-30' ON OVERFLOW ERROR) AS start_dt
, MIN (c1) AS min_dt, MAX (c1) AS max_dt, COUNT (*) AS cnt
FROM t2
WHERE c1 < DATE '2051-01-01'
GROUP BY TIME_BUCKET (c1, 'P1M', DATE '0001-11-30' ON OVERFLOW ERROR);
ORA-01839: date not valid for month specified
-- 6-3
SELECT TIME_BUCKET (c1, 'P1M', DATE '0001-11-30' LAST DAY OF MONTH) AS start_dt
, MIN (c1) AS min_dt, MAX (c1) AS max_dt, COUNT (*) AS cnt
FROM t2
WHERE c1 < DATE '2051-01-01'
GROUP BY TIME_BUCKET (c1, 'P1M', DATE '0001-11-30' LAST DAY OF MONTH);
START_DT MIN_DT MAX_DT CNT
------------------- ------------------- ------------------- ---
2049-12-31 00:00:00 2050-01-01 00:00:00 2050-01-30 00:00:00 30
2050-01-31 00:00:00 2050-01-31 00:00:00 2050-02-27 00:00:00 28
2050-02-28 00:00:00 2050-02-28 00:00:00 2050-03-30 00:00:00 31
2050-03-31 00:00:00 2050-03-31 00:00:00 2050-04-29 00:00:00 30
2050-04-30 00:00:00 2050-04-30 00:00:00 2050-05-30 00:00:00 31
2050-05-31 00:00:00 2050-05-31 00:00:00 2050-06-29 00:00:00 30
2050-06-30 00:00:00 2050-06-30 00:00:00 2050-07-30 00:00:00 31
2050-07-31 00:00:00 2050-07-31 00:00:00 2050-08-30 00:00:00 31
2050-08-31 00:00:00 2050-08-31 00:00:00 2050-09-29 00:00:00 30
2050-09-30 00:00:00 2050-09-30 00:00:00 2050-10-30 00:00:00 31
2050-10-31 00:00:00 2050-10-31 00:00:00 2050-11-29 00:00:00 30
2050-11-30 00:00:00 2050-11-30 00:00:00 2050-12-30 00:00:00 31
2050-12-31 00:00:00 2050-12-31 00:00:00 2050-12-31 00:00:00 1
13 rows selected.
관련 링크