Yes, that’s right my friends! We have reached day 5!
There was a lot going on with this one, so let’s go ahead and get started. First thing we want to do, as always, is to go ahead and load all of our data into SQL Server. With this one, I cheated a little and loaded it into multiple tables to start with – since there are 7 different groups of data that we have to play with.
Could I have loaded it into one table and pull from there into other tables? Absolutely! Do I really want to? Heck, no!
Now that I have all of the data that they gave me loaded into tables, I can start putting all of the data into tables that I can actually work with.
We’re told that the mapping tables are all in the format of:
[Destination Range Start] [Source Range Start] [Range Length]
And we’re told that any number that isn’t listed is just the same number.
With that information, let’s start creating some real data tables we can mess with.
For example, I have the following data as part of my input for the Seed To Soil.
3957953582 2540115966 24844899
First, we have to separate out the data we’ve been given into the format shown above. My first thought is to start using the LEFT, RIGHT, and SUBSTRING functions. But there’s an easier way!
SELECT TheList,
PARSENAME(REPLACE(TheList, ' ', '.'), 3) AS Destination,
PARSENAME(REPLACE(TheList, ' ', '.'), 2) AS Source,
PARSENAME(REPLACE(TheList, ' ', '.'), 1) AS RangeLength
FROM SeedToSoil
Yes, the PARSENAME function is normally used for seperating out things like IP Addresses or database objects, but it can be used for any items that are seperated by a “.”. Now, it won’t work if you have 5 or more items in your list and the positions go from right to left, but as seen above, it’s a heck of a lot easier to do this then to attempt to figure out the correct string wizardry to grab the second piece of data…
With the above all split out and looking pretty, we can now start creating real tables!
The first thing I wanted to do was to do something like the following:
DECLARE cCur CURSOR READ_ONLY FOR
select
parsename(REPLACE(TheList, ' ', '.'), 3) AS Destination,
parsename(REPLACE(TheList, ' ', '.'), 2) AS Source,
parsename(REPLACE(TheList, ' ', '.'), 1) AS RangeLength
FROM seedtosoil
DECLARE @Destination bigint,
@Source bigint,
@RangeLength bigint,
@Counter bigint
OPEN cCur
FETCH NEXT FROM cCur INTO @Destination, @Source, @RangeLength
WHILE (@@fetch_status <> -1)
BEGIN
IF (@@fetch_status <> -2)
BEGIN
SET @Counter = 0
WHILE @Counter < @RangeLength
BEGIN
INSERT INTO #SeedToSoil
SELECT @Destination + @Counter,
@Source + @Counter
END
END
FETCH NEXT FROM cCur INTO @Destination, @Source, @RangeLength
END
CLOSE cCur
DEALLOCATE cCur
GO
Now, this is a somewhat common practice since we’re adding in a fixed amount of numbers and SQL Server can loop through these fairly quickly and we’re all done!
Except, these numbers are rather large. Notice in the above example for my input, you would be looping through the inner while loop 24,844,899 times. Yes, that’s almost 25 Million. And there are several more rows of data like this. There has to be a better way…
And thankfully, there is. That way, my friends, is with what is known as a Tally table. Yes, the concepts for it started in SQL Server sometime back – noted here. You will need a fairly large set of numbers in the table – otherwise you won’t get to do the exercise as it has large numbers.
From there, you can lose the cursor and the while loops!
with cte as (select
TheList,
parsename(REPLACE(TheList, ' ', '.'), 3) AS Destination,
parsename(REPLACE(TheList, ' ', '.'), 2) AS Source,
parsename(REPLACE(TheList, ' ', '.'), 1) AS RangeLength
FROM seedtosoil)
INSERT INTO SeedToSoilGoodness
SELECT cte.Source, t.n
FROM Tally t
JOIN cte on t.n > cte.Destination
AND t.n < cte.Destination + cte.RangeLength
As long as your Tally table is properly indexed, then this will be many times faster than the cursor method shown above. Like thousands of times faster…
From here, we still need to add all of the values that don’t currently exist in the – in what I call here – SeedToSoilGoodness table.
INSERT INTO SeedToSoilGoodness
SELECT n, n -- Since we want the same number on both sides
FROM Tally
WHERE NOT EXISTS (SELECT 1
FROM SeedToSoilRange
WHERE n = Destination
OR n = Source)
Now, once you have all of this in place, remember you still have several tables to do basically the same thing as above to: SoilToFertilizer, FertilizerToWater, WaterToLight, LightToTemperature, TemperatureToHumidity, and HumidityToLocation.
Once you have all of that in place, you can finally create the query that will get the data you were requested in the first place!
SELECT TL.Value, MIN(HL.Destination)
FROM Seeds S
CROSS APPLY STRING_SPLIT(TheList, ' ') TL
JOIN SeedToSoilGoodness SS ON SS.Source = TL.Value
JOIN SoilToFertilizerGoodness SF ON SF.Source = SS.Destination
JOIN FertilizerToWaterGoodness FW ON FW.Source = SF.Destination
JOIN WaterToLightGoodness WL ON WL.Source = FW.Destination
JOIN LightToTemperatureGoodness LT ON LT.Source = WL.Destination
JOIN TemperatureToHumidityGoodness TH ON TH.Source = LT.Destination
JOIN HumidityToLocationGoodness HL ON HL.Source = LT.Destination
GROUP BY TL.Value
Yes, it’s everyone’s friends CROSS APPLY and STRING_SPLIT because the seeds table is truly made up on one row in the table. Then we have to use all of our reference tables to find what our lowest location number.
Lots of work here. Hopefully, you have found a great “new” use for your tally tables. And how much time it can save because we are set-based instead of relying on cursors and loops in SQL Server. Until tomorrow when we’re back with you for Part 2 of Day 5 of #AdventOfCode!