Scrutinizing the Data in All of our tables

Today, I was working with SQL Server to get row counts from several tables so I thought I’d be smart and work with some functions in SQL Server to make it smarter / easier.

Now, if I am truly only getting “straight” row counts from these tables, I would be able to create a query like the below that would provide the answers with no problem:

SELECT
      QUOTENAME(SCHEMA_NAME(sOBJ.schema_id)) + '.' + QUOTENAME(sOBJ.name) AS [TableName]
      , SUM(sPTN.Rows) AS [RowCount]
FROM  sys.objects AS sOBJ
INNER JOIN sys.partitions AS sPTN
 ON sOBJ.object_id = sPTN.object_id
WHERE sOBJ.type = 'U'
      AND sOBJ.is_ms_shipped = 0x0
      AND index_id < 2 -- 0:Heap, 1:Clustered
GROUP BY 
      sOBJ.schema_id
      , sOBJ.name
ORDER BY [TableName]
  

This gives exactly what we want – if we want to know how many rows total are in each table. And in this case, we’re viewing all tables in the database.

We need to make one small change if we want just the row counts for all the tables in one schema.

SELECT
      QUOTENAME(SCHEMA_NAME(sOBJ.schema_id)) + '.' + QUOTENAME(sOBJ.name) AS [TableName]
      , SUM(sPTN.Rows) AS [RowCount]
FROM 
      sys.objects AS sOBJ
      INNER JOIN sys.partitions AS sPTN
            ON sOBJ.object_id = sPTN.object_id
WHERE
      sOBJ.type = 'U'
      AND sOBJ.is_ms_shipped = 0x0
      AND index_id < 2 -- 0:Heap, 1:Clustered
      and SCHEMA_NAME(sOBJ.schema_id) = 'Application'
GROUP BY 
      sOBJ.schema_id
      , sOBJ.name
ORDER BY [TableName]

As long as I can spell the schema name, then we’re livin’ the high life!

What do we do if we only need to get the row counts of the tables in a certain schema, but only where a certain field is populated?

Definitely can’t use the query we’ve been using. But, we can use a function that many people have used before, but not quite like this. In this case my friends, I want to use the sp_MSForEachTable function.

Now, many people have used this to get row counts or “general loopiness” fun that loops through all of the tables, but there is so much more to it!

I’ve used it before like this:

Now some people have a problem with the results here:

  1. It’s hard to copy the results to everyone’s favorite tool – Excel – to send off to whoever with the questions you have.
  2. I truly have to loop through all of the tables in the database. I can’t just look at one schema.

Let’s fix the first issue, then we’ll move on to the other.

As you can see, in the precommand parameter, we’re setting up a global temp table ##Stats for all of our statistics needs – or at least what we need to accomplish this. We’ve changed our command1 parameter – which is the parameter you’re working with if you don’t specify a parameter – to push our data into the new global table. Then in our postcommand parameter, we get the data from our global table and drop it.

Yes, this still gets data from all the tables. Let’s fix that!

Yes, that’s right. The obscenely oddly named parameter @WhereAnd allows us to specify the schema name (Notice that it’s shockingly like how we specified the schema above!)

After all that, let’s add a way to get just a subset of data from each table. For example, I want to know how many rows from each table are valid after today.

Sigh… One of the tables in our chosen schema does not have the ValidTo field. Let’s fix that as well.

Notice the fix for this was placed into the @WhereAnd parameter. The table SystemParameters is the only table that didn’t have the column ValidTo so we asked SQL Server nicely not to attempt to get data from that table.

And with that, my friends, we have concluded yet another blog post. Hopefully, you learned something that you can use in the future that will make your life easier. For each time you have to get data from all (or most) of your tables:

1 comment

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.