Walking Through the Advent of Code Day 3 - Sherpa of Data

Walking Through the Advent of Code Day 3

Today, we’re starting on Day 3 of the #AdventOfCode series. There’s a lot to cover here, so let’s get started, shall we?

As always, we must load the data into our database. In my case, I’m loading it into a table called AOCDay3 – memorable and easy to type all in one.

The first thing I would like to do here is to separate all of the part numbers and the “special characters” into 2 different tables. First, we’ll put all of the numbers into one.

DECLARE @pattern nvarchar(MAX) = N'%[0-9]%';
WITH Src AS
(    SELECT Row,
        SUBSTRING(Schematic, PATINDEX(@pattern, Schematic), 3) Val,
        STUFF(Schematic, 3, PATINDEX(@pattern, Schematic)+3, '') Txt,
	Schematic
    FROM aocDay3
    WHERE PATINDEX(@pattern, Schematic)>0
    UNION ALL
    SELECT Row,
        SUBSTRING(Txt, PATINDEX(@pattern, Txt), 3),
	STUFF(Txt, 3, PATINDEX(@pattern, Txt)+3, ''),
	Schematic
    FROM Src
    WHERE PATINDEX(@pattern, Txt)>0
  )
select Row, REPLACE(Val, '.', '') AS Val, PATINDEX('%' + Val + '%', Schematic) AS Position
INTO #Day3Numbers
FROM Src

I want to point out here the @Pattern variable. In this case, we have it looking for ‘%[0-9]%’. Yay for REGEX! This allows us to look for anything that is a number.

Then we do something very similar to what we did in Walking Through the Advent of Code Day 2 – find the digits, and the remaining text, and we’re home free!

Notice, though, that the position we grab is found at the very end. We just find the string in the line and show where it is. Except, that doesn’t always work.

Let’s show an example from my input string:

Notice that there is a 929 at position 7 and a 29 at position 64. The way I have it coded, it would search for the “29” and find it at position 8. That’s not quite what we want…

DECLARE @pattern nvarchar(MAX) = N'%[0-9]%';

WITH Src AS
(    SELECT Row,
        SUBSTRING(Schematic, PATINDEX(@pattern, Schematic), 3) Val,
        STUFF(Schematic, 1, PATINDEX(@pattern, Schematic)+2, '') Txt,
	PATINDEX(@pattern, Schematic) pos
    FROM aocDay3
    WHERE PATINDEX(@pattern, Schematic)>0
    UNION ALL
    SELECT Row,
        SUBSTRING(Txt, PATINDEX(@pattern, Txt), 3),
	STUFF(Txt, 1, PATINDEX(@pattern, Txt)+2, ''),
	pos + 1 + PATINDEX(@pattern, Txt) AS Pos
    FROM Src
    WHERE PATINDEX(@pattern, Txt)>0  )
select Row, REPLACE(Val, '.', '') AS Val, Pos AS Position
INTO #Day3Numbers
FROM Src

Notice that now we’ve put the position of the part number in the recursive cte. Yes, we are always seeing the part number as 3 characters — is that always correct?

Absolutely not, sadly. But we can fix that later – because, like Heman, we have the power!

DECLARE @pattern nvarchar(MAX) = N'%[^0-9]%';
WITH Src AS
(
    SELECT Row,
        SUBSTRING(Schematic, PATINDEX(@pattern, Schematic), 1) Val,
        STUFF(Schematic, 1, PATINDEX(@pattern, Schematic)+1, '') Txt,
	PATINDEX(@pattern, Schematic) pos
    FROM aocDay3
    WHERE PATINDEX(@pattern, Schematic)>0
    UNION ALL
    SELECT Row,
        SUBSTRING(Txt, PATINDEX(@pattern, Txt), 1),
	STUFF(Txt, 1, PATINDEX(@pattern, Txt)+1, ''),
	pos + 1 + PATINDEX(@pattern, Txt) AS Pos
    FROM Src
    WHERE PATINDEX(@pattern, Txt)>0
)
select Row, REPLACE(Val, '.', '') AS Val, Pos AS Position
INTO #Day3SpecChars
FROM Src

Here’s the code I was able to use for the Special Characters table. Notice that the pattern is slightly different – “%[^0-9]%”. The “^” character tells SQL Server that we want anything that is not a number in our fun table. When you use the “^” character, be sure to place it inside of the [] – otherwise, SQL Server will look for it!

From here, we go back to our original Numbers table and do a little bit of cleanup.

SELECT * 
FROM #Day3Numbers 
WHERE Val like '%[^0-9]%'

This code, as you can probably guess, lets us peruse the Numbers temp table for any ugliness, i.e. non-number values, in our Val column. I’ll leave it up to you to cycle through the data and clean that up with a simple UPDATE statement, especially since yours could be different than mine – since they change the input script for almost everyone.

I then added a column to the Numbers table so that we could tell if we had found a “Special Character” around it. In my case, I called it NearSpecChar and set the datatype as a bit.

Then, I cycled through all of the rows of input with different variations of the following code:

UPDATE N
SET NearSpecChar = 1
FROM #Day3Numbers N
JOIN #Day3SpecChars SC ON SC.Row = N.Row
WHERE SC.Position = N.Position - 1;

For this attempt at saying whether or not there is a special character in front of the part number, I JOINed my two temp tables on the row – since that has to be equal – and then I checked, in my WHERE clause, if the position of the Special Character was 1 character before the number.

As a slight thought-provoking and hair-pulling exercise, I leave for you – my lovely readers – the rest of the updates to be completed. The hardest one to me was the case where the special character was on the diagonal from the number – yes, that is a real case you have to work through. As always, please work through multiple rows to make sure you have covered all scenarios – the example in the original exercise is very helpful.

And so with that, I feel like I have been typing a lot. Hopefully, you’ve learned a little about Regex and recursive ctes. Y’all send over your comments to me – via LinkedIn or the comments section or even carrier pigeon – and let me know what y’all think. Hopefully, you’re enjoying yourself and are learning something either about SQL or how I work out coding techniques.

Until next time my friends!

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.