Windowing Functions Answers

  1. Find the first year that each park was played in:

SELECT *
FROM (
SELECT park, YearID,
ROW_NUMBER() OVER (PARTITION BY park ORDER BY yearID) AS RowNum
FROM Teams
WHERE park IS NOT NULL
) Z
WHERE RowNum = 1
ORDER BY yearID DESC, park

So, let’s try to go thru the reasons why I used the functions I did and why it works. I used the row_number() function here because I wanted to find the first year for the park. That’s the key phrase in this requirement – first. Because I need to first the first year something happened, I can easily use the ORDER BY qualifier to make sure that everything is ordered by the year – keeping that part in the subquery. In the main part of the query, I simply look for those that have the RowNum = 1 and we’re done!

2. Show the rank with ties (both ways) of the games started in 1983 for each league for all players

SELECT YearID, lgID, playerID, GS,
RANK() OVER (PARTITION BY YearID, lgID ORDER BY GS DESC) AS TheRank,
DENSE_RANK() OVER (PARTITION BY YearID, lgID ORDER BY GS DESC) AS DenseRank
FROM dbo.Appearances
WHERE yearID = 1983
ORDER BY lgID, GS

What I was trying to say for the both ways part is to do both the rank and the dense rank functions. This shows all of the players that made any appearence in any number of games during 1983 for either league and ranks them with or without skips.

3. Place the players into 5 distinct buckets that were up for election into the Hall of Fame in 2001

SELECT yearid, playerID, votes, needed,
NTILE(5) OVER (PARTITION BY yearID ORDER BY votes DESC)
FROM dbo.HallOfFame
WHERE yearid = 2001
ORDER BY votes DESC

So, the fact that I ask for a specific number of buckets forces us to use the NTILE() function. The rest of this should be fairly obvious as we ORDER BY the votes to put all of the players into the specified number of buckets.

4. I would like to see the rolling totals and reverse rolling totals of the number of games Dale Murphy played for each team during his storied career.

SELECT yearID, teamID, G,
SUM(G) OVER (PARTITION BY b.playerID, teamID ORDER BY YearID
ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW) AS RunningTotal,
SUM(G) OVER (PARTITION BY b.playerID, teamID ORDER BY YearID
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS ReverseTotal
FROM dbo.People p
JOIN dbo.Batting B ON B.playerID = p.playerID
WHERE p.nameFirst = 'Dale'
AND p.nameLast = 'Murphy'
ORDER BY yearID, teamID

This one should be a huge help from the queries shown during the presentation. Hopefully, everyone remembers the ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING portion.

5. Let’s see the number of home runs hit by Chipper Jones in each of his historic seasons, as well as the number of home runs he hit the previous season as well as what he would hit the next season.

SELECT yearID, teamID, HR,
LAG(HR) OVER (ORDER BY yearID),
LEAD(HR) OVER (ORDER BY yearID)
FROM dbo.People p
JOIN dbo.Batting B ON B.playerID = p.playerID
WHERE p.nameFirst = 'Chipper'
AND p.nameLast = 'Jones'
ORDER BY yearID, teamID

Hopefully, everyone remembers that previous is a signal for LAG and next is a signal for LEAD. If not, go back to my presentation notes.

And with that, we have come to the end of our Windowing Function Exercises. If you have any questions, leave me a comment! And thanks again!