To monitor my power usage and to know when my power prices change, I download the prices from my power provider's website every 15 minutes and store them in a database.
I have been doing this for more than 4 months now and wanted to produce a report showing all of the price movements per product over time.
The Database
I have two SQL Server tables described below that contain the data:
    CREATE TABLE [dbo].[PowerUpdate](
        [id] [int] IDENTITY(1,1) NOT NULL,
        [DateTime] [datetime] NOT NULL,
        [Enabled] [bit] NOT NULL
    )

    CREATE TABLE [dbo].[PowerItem](
        [id] [int] IDENTITY(1,1) NOT NULL,
        [PowerUpdateID] [int] NOT NULL,
        [Name] [nvarchar](128) NOT NULL,
        [Price] [decimal](9, 4) NOT NULL,
        [Type] [nvarchar](128) NOT NULL,
        [Description] [ntext] NULL
    )
I also have one view that just joins the tables back together for reporting purposes.
    CREATE VIEW [dbo].[vPowerData]
    AS
    SELECT PowerItem_1.id, PowerItem_1.Name, PowerItem_1.Price, PowerUpdate_1.DateTime
    FROM PowerItem AS PowerItem_1
    INNER JOIN PowerUpdate AS PowerUpdate_1 ON PowerItem_1.PowerUpdateID = PowerUpdate_1.id
Some sample data
Here are some examples of the data that I have in the database.
[Screenshots: results of "select top 10 * from powerupdate" and "select top 10 * from poweritem"]
PowerUpdate contains records with an ID and a Date and Time which identifies when the update occurred.
PowerItem contains the actual power products that were available for the particular update date & time and their current price.
What I want in my report
What I would like to see, for every product, is the date and time when each price was first seen, the last date and time that the price was seen, and the actual prices.
To make it more complex, a product’s price can return to a value it had at an earlier date and time, and each such period needs to be reported separately. For the most recent price of a product, the next price and next price datetime fields should display null.
Here is an example of the required output
The final solution
I had trouble getting my head around this report requirement. I spent some time trying to come up with a solution but failed. I knew that it could be done using a set-based solution, which is what I wanted, but I could not work it out myself.
I posted my question to the gurus at www.sqlservercentral.com. The original post is here: http://www.sqlservercentral.com/Forums/Topic930464-338-1.aspx
I received some good replies from people on the forum and really appreciate the work they all put into helping me!
I liked the solution provided by a member called “Mark-101232”.
Below is his solution to the problem, which did exactly what I needed and was very fast compared to some of the other solutions given.
    WITH CTE1 AS (
        SELECT Name, Price, DateTime,
               ROW_NUMBER() OVER (PARTITION BY Name ORDER BY DateTime) AS rn1,
               ROW_NUMBER() OVER (PARTITION BY Name, Price ORDER BY DateTime) AS rn2
        FROM dbo.vPowerData
    ),
    CTE2 AS (
        SELECT Name, Price AS [Min Price], MIN(DateTime) AS [Min DateTime], MAX(rn1) AS maxRN
        FROM CTE1
        GROUP BY Name, Price, rn2 - rn1
    )
    SELECT a.Name, a.[Min Price], a.[Min DateTime],
           b.Price AS [NextPrice], b.DateTime AS [Next Price DateTime]
    FROM CTE2 a
    LEFT OUTER JOIN CTE1 b
        ON b.Name = a.Name AND b.rn1 = a.maxRN + 1 AND b.DateTime > a.[Min DateTime]
    ORDER BY a.Name, a.[Min DateTime];
Although I now have a solution, I really need to understand the logic behind it, so I decided to write this blog post to strip down this TSQL code and work out what is actually going on. In the process, this may help someone else as well.
The first step is to split the query into its parts.
Part 1
    SELECT Name, Price, DateTime,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY DateTime) AS rn1,
           ROW_NUMBER() OVER (PARTITION BY Name, Price ORDER BY DateTime) AS rn2
    FROM dbo.vPowerData
What this TSQL is doing is returning the name, price and date time, along with two row numbers.
For more details on row_number() see http://msdn.microsoft.com/en-us/library/ms186734.aspx
The first row_number() function returns a row number ordered by the date time, restarting at 1 for each name.
The second row_number() function returns a row number ordered by the date time, restarting at 1 for each combination of name and price.
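To see the two row numbers side by side, here is a minimal sketch that runs the Part 1 query against a handful of inline rows instead of dbo.vPowerData. The five readings and the SampleData name are made up for illustration (they are not the real data from my database), only one product is included, and the VALUES row constructor assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample readings for one product; not the real data.
WITH SampleData (Name, Price, [DateTime]) AS (
    SELECT v.Name, CAST(v.Price AS decimal(9, 4)), CAST(v.ReadAt AS datetime)
    FROM (VALUES
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T00:00:00'),
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T00:15:00'),
        (N'$49.95 Value Pack', 0.1698, '2010-01-01T00:30:00'),
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T00:45:00'),
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T01:00:00')
    ) AS v (Name, Price, ReadAt)
)
SELECT Name, Price, [DateTime],
       -- counts every reading of the product in time order
       ROW_NUMBER() OVER (PARTITION BY Name ORDER BY [DateTime]) AS rn1,
       -- counts only the readings that share the same price
       ROW_NUMBER() OVER (PARTITION BY Name, Price ORDER BY [DateTime]) AS rn2
FROM SampleData
ORDER BY Name, [DateTime];

-- Result for the sample rows:
-- Price   DateTime  rn1  rn2
-- 0.1693  00:00     1    1
-- 0.1693  00:15     2    2
-- 0.1698  00:30     3    1
-- 0.1693  00:45     4    3
-- 0.1693  01:00     5    4
```

Notice that rn2 keeps counting at 3 when the price returns to 0.1693, because it only ever counts rows with that price, while rn1 counts every row for the product.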
The easiest way to look at this is to look at a subset of data and show you what gets returned.
Assume our PowerItem table only contains the following records
Note: There are 3 different products listed, but only the “$49.95 Value Pack” has had its price change, as follows:
- 0.1693
- 0.1698
- 0.1691
The TSQL code in Part 1 will return the following result set
Part 2
    SELECT Name, Price AS [Min Price], MIN(DateTime) AS [Min DateTime], MAX(rn1) AS maxRN
    FROM CTE1
    GROUP BY Name, Price, rn2 - rn1
Note: In this case treat CTE1 as the final result set that is visible in Part 1 above
The TSQL code in Part 2 returns the following result set
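The part of this step that took me longest to see is why it groups by rn2-rn1. While a price stays the same from one reading to the next, both row numbers go up by 1, so their difference does not change. As soon as a different price appears, rn1 moves ahead of the rn2 for the old price, so if the old price comes back later its difference is a new value. Each unbroken run of the same price therefore gets its own group. The sketch below shows this on made-up rows; SampleData and the extra grp column are names I have added for illustration, and the VALUES row constructor assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample readings for one product; not the real data.
WITH SampleData (Name, Price, [DateTime]) AS (
    SELECT v.Name, CAST(v.Price AS decimal(9, 4)), CAST(v.ReadAt AS datetime)
    FROM (VALUES
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T00:00:00'),
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T00:15:00'),
        (N'$49.95 Value Pack', 0.1698, '2010-01-01T00:30:00'),
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T00:45:00'),
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T01:00:00')
    ) AS v (Name, Price, ReadAt)
),
CTE1 AS (
    SELECT Name, Price, [DateTime],
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY [DateTime]) AS rn1,
           ROW_NUMBER() OVER (PARTITION BY Name, Price ORDER BY [DateTime]) AS rn2
    FROM SampleData
)
SELECT Name,
       Price AS [Min Price],
       MIN([DateTime]) AS [Min DateTime],
       MAX(rn1) AS maxRN,
       rn2 - rn1 AS grp   -- extra column, added only to make the grouping visible
FROM CTE1
GROUP BY Name, Price, rn2 - rn1
ORDER BY [Min DateTime];

-- Result for the sample rows:
-- Min Price  Min DateTime  maxRN  grp
-- 0.1693     00:00         2       0
-- 0.1698     00:30         3      -2
-- 0.1693     00:45         5      -1
```

The two separate runs at 0.1693 come out as two rows because their grp values differ (0 and -1), even though the name and price are identical.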
Part 3
    SELECT a.Name, a.[Min Price], a.[Min DateTime],
           b.Price AS [NextPrice], b.DateTime AS [Next Price DateTime]
    FROM CTE2 a
    LEFT OUTER JOIN CTE1 b
        ON b.Name = a.Name AND b.rn1 = a.maxRN + 1 AND b.DateTime > a.[Min DateTime]
    ORDER BY a.Name, a.[Min DateTime];
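Part 3 is really just a simple Left Outer Join sorted by name and the Minimum Date Time. Each row in CTE2 is matched to the CTE1 row that comes immediately after its last reading (rn1 = maxRN+1), which is the first reading of the next price. The reason for the left outer join is that we still want the final price and date for each product, which has no next reading to join to and so shows null in the next price fields.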
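Putting the three parts together, here is a minimal sketch of the whole query running against made-up rows rather than dbo.vPowerData. The readings and the SampleData name are hypothetical, and the VALUES row constructor assumes SQL Server 2008 or later; everything else is the query as given above.

```sql
-- Hypothetical sample readings for one product; not the real data.
WITH SampleData (Name, Price, [DateTime]) AS (
    SELECT v.Name, CAST(v.Price AS decimal(9, 4)), CAST(v.ReadAt AS datetime)
    FROM (VALUES
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T00:00:00'),
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T00:15:00'),
        (N'$49.95 Value Pack', 0.1698, '2010-01-01T00:30:00'),
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T00:45:00'),
        (N'$49.95 Value Pack', 0.1693, '2010-01-01T01:00:00')
    ) AS v (Name, Price, ReadAt)
),
CTE1 AS (
    SELECT Name, Price, [DateTime],
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY [DateTime]) AS rn1,
           ROW_NUMBER() OVER (PARTITION BY Name, Price ORDER BY [DateTime]) AS rn2
    FROM SampleData
),
CTE2 AS (
    SELECT Name, Price AS [Min Price], MIN([DateTime]) AS [Min DateTime], MAX(rn1) AS maxRN
    FROM CTE1
    GROUP BY Name, Price, rn2 - rn1
)
SELECT a.Name, a.[Min Price], a.[Min DateTime],
       b.Price AS [NextPrice], b.[DateTime] AS [Next Price DateTime]
FROM CTE2 a
LEFT OUTER JOIN CTE1 b
    ON b.Name = a.Name
   AND b.rn1 = a.maxRN + 1              -- the reading right after this run ends
   AND b.[DateTime] > a.[Min DateTime]
ORDER BY a.Name, a.[Min DateTime];

-- Result for the sample rows:
-- Min Price  Min DateTime  NextPrice  Next Price DateTime
-- 0.1693     00:00         0.1698     00:30
-- 0.1698     00:30         0.1693     00:45
-- 0.1693     00:45         NULL       NULL
```

The last run ends at rn1 = 5 and there is no sixth reading, so the left outer join keeps the row and fills the next price fields with NULL. An inner join would have dropped it.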
Hint: I often try to think of a CTE (Common Table Expression) as a physical table, which makes it easier to picture the join (well, it works for me, sometimes).
[Screenshots: CTE1 (aliased as b) and CTE2 (aliased as a)]
So in this case we have the following going on (for the product name ‘$49.95 Value Pack’)
- Row 1
- Row 1 does not join to CTE2 since b.rn1 (in this case 4) = a.maxRN+1 (in this case 5) does not exist
- Row 2
- Row 2 does not join to CTE2 since b.rn1 (in this case 1) = a.maxRN+1 (in this case 2) does not exist
- Row 3
- Row 3 does join CTE2 since b.rn1 (in this case 2) = a.maxRN+1 (in this case 3) does exist
- Row 4
- Row 4 does join CTE2 since b.rn1 (in this case 3) = a.maxRN+1 (in this case 4) does exist
Note: other filters also apply here, such as b.name = a.name and b.datetime > a.[min datetime]
So for the first 4 rows this produces the following output (when the order by clause is added)
I hope I have explained what is going on, because I am still trying to decipher and understand this myself.
In any case, it has made it easier for me to comprehend, so I am happy with that, and if it helps anyone else then that is good.
Additional resources to create database and populate with data
- Script to create database and tables
- Script to add 4 months worth of data (1 MB download, expands to 150MB)
Notes:
- Additional indexes could help in the solution as well.
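As one possible starting point for the indexing note above, here is an untested sketch. It assumes the two tables exactly as created earlier in this post, with no existing keys or indexes (none appear in the CREATE TABLE statements), and the constraint and index names are made up. Whether it actually speeds up the report would need to be checked by comparing execution plans before and after.

```sql
-- Hypothetical names; assumes no primary key or index exists yet on these tables.

-- Lets the view's join find each update row by id.
ALTER TABLE dbo.PowerUpdate
    ADD CONSTRAINT PK_PowerUpdate PRIMARY KEY CLUSTERED (id);

-- Supports the join column and carries the two PowerItem columns
-- the report reads, so the wider table rows need not be touched.
CREATE NONCLUSTERED INDEX IX_PowerItem_PowerUpdateID
    ON dbo.PowerItem (PowerUpdateID)
    INCLUDE (Name, Price);
```

The window functions sort by Name and DateTime, and DateTime lives in PowerUpdate, so these indexes cannot remove that sort; at best they make the join and the scan cheaper.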
Additional comments:
- Many thanks to those who answered or submitted a potential solution to my query on SQL Server Central. It was very much appreciated. I am in no way trying to take away from what you have done; instead, I hope I am helping others in the SQL Server community by understanding the particular solution I chose to accept, and by explaining it so we can all learn and be better TSQL developers.