Sometimes, you’ll have a lot of comma-separated data in your databases. Maybe you’ll have lots of data shoved into one field separated only by a vertical bar. As we all should be asking, “How do I get that data out of that one field and separated into several rows?”
Well, I’m glad you asked, my friends!
I have a zip code file that has several different values within it. Things like state, timezone, latitude, and longitude are included with it. For example, I created a table with some of the data from the file:
CREATE TABLE CommaValues (
Id INTEGER NOT NULL IDENTITY(1, 1),
ZipCodeData VARCHAR(MAX) NULL)
INSERT INTO CommaValues (ZipCodeData)
SELECT '00501,UNIQUE,0,Holtsville,,"I R S Service Center",NY,"Suffolk County",America/New_York,631,NA,US,40.81,-73.04,562' UNION
SELECT '00544,UNIQUE,0,Holtsville,,"Irs Service Center",NY,"Suffolk County",America/New_York,631,NA,US,40.81,-73.04,0' UNION
SELECT '00601,STANDARD,0,Adjuntas,,"Colinas Del Gigante, Jard De Adjuntas, Urb San Joaquin",PR,"Adjuntas Municipio",America/Puerto_Rico,"787,939",NA,US,18.16,-66.72,0' UNION
SELECT '00602,STANDARD,0,Aguada,,"Alts De Aguada, Bo Guaniquilla, Comunidad Las Flores, Ext Los Robles, Urb Isabel La Catolica",PR,"Aguada Municipio",America/Puerto_Rico,"787,939",NA,US,18.38,-67.18,0' UNION
SELECT '00603,STANDARD,0,Aguadilla,Ramey,"Bda Caban, Bda Esteves, Bo Borinquen, Bo Ceiba Baja, Ext El Prado, Ext Marbella, Repto Jimenez, Repto Juan Aguiar, Repto Lopez, Repto Tres Palmas, Sect Las Villas, Urb Borinquen, Urb El Prado, Urb Esteves, Urb Garcia, Urb Las Americas, Urb Las Casitas Country Club, Urb Maleza Gdns, Urb Marbella, Urb Ramey, Urb Rubianes, Urb San Carlos, Urb Santa Marta, Urb Victoria, Villa Alegria, Villa Linda, Villa Lydia, Villa Universitaria, Villas De Almeria, Vista Alegre, Vista Verde",PR,"Aguadilla Municipio",America/Puerto_Rico,787,NA,US,18.43,-67.15,0' UNION
SELECT '00604,"PO BOX",0,Aguadilla,Ramey,,PR,,America/Puerto_Rico,,NA,US,18.43,-67.15,0' UNION
SELECT '00605,"PO BOX",0,Aguadilla,,,PR,,America/Puerto_Rico,,NA,US,18.43,-67.15,0' UNION
SELECT '00606,STANDARD,0,Maricao,,"Urb San Juan Bautista",PR,"Maricao Municipio",America/Puerto_Rico,"787,939",NA,US,18.18,-66.98,0' UNION
SELECT '00610,STANDARD,0,Anasco,,"Brisas De Anasco, Est De Valle Verde, Jard De Anasco, Paseo Del Valle, Repto Daguey, Urb San Antonio",PR,"Anasco Municipio",America/Puerto_Rico,787,NA,US,18.28,-67.14,0' UNION
SELECT '00611,"PO BOX",0,Angeles,,,PR,,America/Puerto_Rico,,NA,US,18.28,-66.79,0' UNION
SELECT '00612,STANDARD,0,Arecibo,,"Alt De Juncos, Alt De San Felipe, Bda Duhamel, Bo El Pasaje, Bo Islote, Bo Islote Ii, Bo Jarealitos, Bo Obrero, Bo Santana, Comunidad Buenos Aires, Est De Arecibo, Est De Balseiro, Ext Marisol, Ext Tanama, Ext Villa Los Santos I, Ext Villa Los Santos Ii, Hacienda Toledo, Jard De Arecibo, Jard De San Rafael, Parc Mattey, Parc Navas, Parc Perez, Parc Rodriguez Olmo, Parq De Jardines, Paseos Reales, Repto Marquez, Repto San Jose, Repto San Juan, Sect Abra, Sect El Cano, Sect Las Animas, Sect Muelle, Urb Arecibo Gdns, Urb College Park Iv, Urb Costas Del Atlantico, Urb El Paraiso, Urb Factor, Urb Garcia, Urb Garden View, Urb La Mucura, Urb Las Brisas, Urb Los Llanos, Urb Los Pinos, Urb Los Pinos Ii, Urb Marisol, Urb Martell, Urb Ocean Vw, Urb Radioville, Urb Regional, Urb San Daniel, Urb San Felipe, Urb San Lorenzo, Urb Tanama, Urb University Gdns, Urb Victor Rojas 1, Urb Victor Rojas 2, Urb Villamar, Urb Zeno Gandia, Valle Escondido, Villa Los Santos, Villa Lucia, Villa Serena, Villa Toledo, Villas Del Capitan, Villas Del Sol, Vista Azul, Vista Del Atlantico",PR,"Arecibo Municipio",America/Puerto_Rico,"787,939",NA,US,18.45,-66.73,0' UNION
SELECT '00613,"PO BOX",0,Arecibo,,,PR,,America/Puerto_Rico,,NA,US,18.45,-66.73,0' UNION
SELECT '00614,"PO BOX",0,Arecibo,,,PR,,America/Puerto_Rico,,NA,US,18.45,-66.73,0'
To change the comma seperated data from the single field into multiple rows, we would run the following:
SELECT id, value
FROM dbo.CommaValues cv
CROSS APPLY STRING_SPLIT(cv.ZipCodeData, ',')
The function that makes all of the magic happen is STRING_SPLIT. We tell it what field that we want to split and how to split it. In this case, we want the field ZipCodeData from dbo.CommaValues – shown in the first parameter. We want to separate all of the data by commas so we pass that in with the second parameter.
If there are additional items to split within each of those groups, say for example, if you had data like “Alonsy | Doctor, Martha, Rose | Tennant” then you would do the following
SELECT id, z.Value, y.value
FROM dbo.CommaValues cv
CROSS APPLY STRING_SPLIT(cv.ZipCodeData, '|') Z
CROSS APPLY STRING_SPLIT(z.Value, ',') y
Notice that the only difference is that we’ve run the STRING_SPLIT function on the data again after we ran it once. Yes, the order that the “|” and the “,” are split does make a difference, so be careful as you work with your data.
And with that, my friends, we have come to the end of our time together for this time. Next time, I’ll review why we used the CROSS APPLY with the STRING_SPLIT function and the other options that are available.
Until next time my friends!