Created
April 5, 2018 20:40
-
-
Save MikeMKH/b8647995afd3ac849b2e4005fc9a8cd6 to your computer and use it in GitHub Desktop.
SQL example of using row_number to find the changes on an column in time series data
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
select | |
key_col | |
,startdate | |
,enddate | |
,change_col | |
from ( | |
select distinct | |
key_col | |
,startdate = min(date_col) over (partition by group_key, key_col) | |
,enddate = max(date_col) over (partition by group_key, key_col) | |
,firstdate = min(date_col) over (partition by key_col) | |
,latestdate = max(date_col) over (partition by key_col) | |
,group_key | |
,change_col | |
from ( | |
select | |
key_col | |
,date_col | |
,group_key = | |
row_number() over (partition by key_col order by date_col) - | |
row_number() over (partition by key_col, change_col order by date_col) | |
,change_col | |
from tbl | |
) as z | |
where date_col > '2018-01-01' | |
) as x | |
where not (x.startdate = x.firstdate and x.enddate = x.latestdate) -- filter out key_cols that do have not changed | |
order by key_col |
markdreyer
commented
Apr 5, 2018
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment