Dynamic SQL
Dynamic SQL
- The EXEC command
- The sp_executesql stored procedure
Using Pivot with Dynamic SQL
SQL Injections
(INCLUDE SYNTAX WITH EVERYTHING)
(INCLUDE NOTES FROM SQL COOKBOOK, Udemy)
(Include notes from here: http://www.sommarskog.se/dynamic_sql.html#good_practices and http://www.benkotips.com/pages/DynamicSQL.aspx)
Dynamic SQL
Dynamic SQL is a batch of T-SQL code that is generated and executed on-the-fly at run-time. It is the capability of constructing a batch of T-SQL code as a character string and then executing that batch. The code that is generated on-the-fly is created based on some condition or parameters in the batch. When the “conditions or parameters” are different the T-SQL code produces different T-SQL to be executed. There’s two ways of executing dynamic SQL in SQL Server: the EXEC (short for EXECUTE) command, and the sp_executesql stored procedure. Because T-SQL does not allow us to use variables or parameters to specific table or column names, dynamic T-SQL can be used instead. The following are some use cases for dynamic SQL:
- Automating administrative tasks: For example, we can query metadata and construct and execute a BACKUP DATABASE statement for each database in the instance.
- Improving performance of certain tasks: For example, we can construct parameterized ad-hoc queries that can use previously cached execution plans.
- Constructing elements of the code based on querying the actual data: For example, we can construct a PIVOT query dynamically when we don’t know ahead of time which elements should appear in the IN clause of the PIVOT operator.
As a general rule we should use dynamic SQL only if we cannot use static SQL to accomplish our goals, or if using static SQL is too cumbersome.
In general, using dynamic SQL is going to be something that is going to hinder performance. In the background, dynamic SQL is going to operate as a separate batch. The batch that the dynamic SQL is being called from is separate from the calling batch. What this means ultimately is that the dynamic batch is going to be parsed, resolved, and optimized as a separate unit. So if we understand that aspect of dynamic SQL there are some situations where we can use dynamic SQL to actually improve performance (not a typical use case). It is recommended to replace dynamic SQL with native SQL when the dynamic SQL is not required or necessary, to significantly improve the performance of the application.
The EXEC command
The EXEC command accepts a character string in parentheses as input and executes the batch of code within the character string. EXEC supports both regular and Unicode character strings as input. (Note that EXEC can also be used to execute a stored procedure.)
(Include syntax)
It is important to enclose @statement in parenthesis. If we don’t then the EXECUTE statement takes @statement, and instead of running the dynamic SQL, it thinks the variable value is the name of a stored procedure. We’ll get the following error:
As an example, the following code stores a character string with a PRINT statement in the variable @sql and then uses the EXEC command to execute the batch of code within the variable:
(Include code, explanation, and results from page 401)
(Notice the use of two single quotes to represent one single quote in a string within a string.)
The EXEC statement is a one-shot thing. Any variable created inside the EXEC last only for that one invocation. EXEC cannot reference variables from the calling batch, and the database context reverts back to the context of the calling statement if changed inside the EXEC.
Since EXEC is completely dynamic, there is no way to check security or whether the objects it references even exist until execution of the SQL string. If EXEC is within a stored procedure, the permissions are checked against the user who invokes the stored procedure, not against the owner of the procedure.
There is also the issue of compilation. The query plan from an EXEC is not reused as the plans of regular stored procedures are. Rather, statements inside the EXEC are not compiled until the EXEC command is executed.
As another example, suppose we have an application where the user interface allows the user to select the table they want to read from a drop down list. Therefore, each time someone uses the interface they could pick a different table from which they want to return data. For this example, let’s assume this user interface displays table information from the AdventureWorks2017 database and the user picks the Sales.SalesOrderDetail table. The code below shows a method of using dynamic SQL code to return the TOP 10 records from the SalesOrderDetail table:
-- Declare variable to hold dynamic SQL code DECLARE @CMD NVARCHAR(1000); -- Declare name of table to read DECLARE @Table NVARCHAR(125); SET @Table = 'AdventureWorks2017.Sales.SalesOrderDetail'; -- Build dynamic TSQL Statement SET @CMD = 'SELECT TOP 10 * FROM '+@Table; --Execute dynamic TSQL Statement EXECUTE (@CMD);
The code above first declares a variable name @CMD to hold the dynamic SELECT statement that is going to be built and the @Table variable to hold the table name. Then we set the @Table variable to Sales.SalesOrderDetail. To build our actual dynamic SQL statement use a SET statement to set the variable @CMD to the concatenated string value that contains a SELECT statement and the @TABLE variable value. We then execute our dynamic SQL statement contained in the @CMD variable using the EXECUTE statement. To further test the dynamic T-SQL the code above, we can try using a different AdventureWork2017 table in the code by changing the “SET @Table=” statement to use the Sales.Sales.OrderHeadertable.
There are times when we need to write some more complicated dynamic SQL. As a DBA one of those situations where we might need to do this is when we want to generate code to perform some kind of database maintenance. When we need to build dynamic SQL for database maintenance purposes we usually read a system view and then generate a script that is displayed and/or executed. Suppose we are a DBA that has taken over maintaining a database and we want to delete several test tables that got created in a database. The tables all have names that start with the prefix “Test”. To demonstrate how we might read the sys.tablesview and generate of the appropriate DELETE statements, let’s look at the code below.
-- Section 1: Create database and Sample Tables USE master; GO CREATE DATABASE DYNA; GO USE DYNA; GO CREATE TABLE MyData1(Id INT, DataDesc VARCHAR(100)); CREATE TABLE MyData2(Id INT, DataDesc VARCHAR(100)); CREATE TABLE TestData1(Id INT, DataDesc VARCHAR(100)); CREATE TABLE TestData2(Id INT, DataDesc VARCHAR(100)); GO -- Section 2: Dynamic TSQL code to generate script to delete Test tables USE DYNA; GO DECLARE @TableName VARCHAR(100); DECLARE @CMD VARCHAR(1000); SELECT TOP 1 @TableName = name FROM sys.tables WHERE name LIKE 'Test%' ORDER BY name; WHILE @@ROWCOUNT > 0 BEGIN SELECT @CMD = 'DROP TABLE '+@TableName+';'; PRINT @CMD; EXECUTE (@CMD); SELECT TOP 1 @TableName = name FROM sys.tables WHERE name LIKE 'Test%' AND name > @TableName ORDER BY name; END; -- Section 3: Cleanup USE master; GO DROP DATABASE DYNA;
The code above contains three different sections. The first section creates a database called DYNA, and then creates 4 different tables, two of which start with “Test”. These two tables that start with “Test” are the tables we want to delete with dynamic TSQL code. The second section of code is our dynamic TSQL code. The last section of code cleans up by deleting the test database we created. The dynamic SQL code in Section 2 first prints out the delete statements that it run, then deletes the test tables we created in Section 1. We do this by processing through a WHILE loop while looking for different tables that start with the character string “Test”. For each table we find that starts with “Test” we construct a DELETE command that is stored in the variable @CMD. We then display the DELETE statement by using a PRINT statement, immediately followed by executing the statement by using the EXECUTE statement. The last section, section 3 cleans up by dropping the DNYA database. This is a very simple example of how to examine rows of metadata and generate dynamic TSQL. As a DBA there are many times where it will come in handy to understand how to write TSQL code that generates TSQL code.
The sp_executesql stored procedure
While EXEC was a handy tool, SQL Server 7.0 introduced the sp_executesql stored procedure as an enhanced alternative to the EXEC command for executing dynamic SQL code. In the past, if a plan is found in cache that would satisfy the query but differs only in the parameters of the query, then instead of creating a new plan, the optimizer would likely use the existing plan. To take advantage of this feature, Microsoft created the extended store procedure sp_executesql. It’s more flexible since it has an interface, and it’s also more secure since it supports input and output parameters. Unlike the EXEC command, sp_executesql supports only Unicode character strings as the input batch of code.
We can use sp_executesql to execute T-SQL stored within a variable. The use of input and output parameters in our dynamic SQL code allows us to write more secure and efficient code. In terms of security, parameters appearing in the code cannot be considered part of the code, but rather only operands in expressions. So, by using parameters, we eliminate exposure to SQL injections.
Note that sp_executesql is a system stored procedure. The sp_execute stored procedure has two input parameters and an assignment section. We specify the Unicode character string holding the batch of code we want to run as the first parameter (which is called @stmt in the following example), and a Unicode character string holding the declarations of input and output parameters in the second input parameter (which is called @params in the following example).
The first parameter (@stmt) contains the basic statement we want to have executed. It must be a Unicode string (NCHAR, NVARCHAR, or NTEXT); a regular string character will not do. We cannot use a complex Unicode expression, such as a concatenation, but we can build the Unicode string and put it into a variable. If we specify a parameter such as @params, then it must appear in the @stmt parameter. The @stmt parameter can even be a batch of statements; we’re not limited to just one.
The second parameter, @params, contains the list of parameters to be used by the statement. This is a list of all of the parameters much like in a DECLARE statement but without the DECLARE keyword. It too must be a Unicode string. The final pieces of the sp_execute statement are designated in the syntax diagram by @param1.
If a parameter is defined in the @params parameter, then that parameter must be present in the call to sp_executesql. We can include our parameters in our code in one of two ways. We can assign values to our own parameter names as stated in our declaration inside @params, or we can pass them in the same order as they were declared and omit the parameter names. Changing only the parameter values makes the cached execution plan “reusable.”
We then specify the assignments of input and output parameters separated by commas. We set the parameter values by specifying the parameters and desired value. The parameters are listed in order defined within the @parameterDefinition string.
- @parm1 is the first parameter defined within the @parameterDefinition Value is the value we wish to set it to.
- @parm2, is the second parameters, if defines, as declared in @parameterDefinition.
- and so on…
In this example we construct a batch of code that queries against the Sales.Orders table, and uses an input parameter called @orderid in the query’s filter:
(Include code, explanation, and results from page 402)
In the example above, we assign the value of 10248 as the input parameter, but even if we run the code again with a different value, the code string remains the same. This increases the chances of reusing a previously cached plan.
(Include code, explanation, and result from Advanced SQL book)
Once we run this stored procedure, we can execute it again, only changing the value of @EmployeeID. SQL Server will see that this execution differs only by the parameter in cache and may reuse the existing plan. The optimizer’s plan-matching algorithms require fully qualified object names to take advantage of plan reuse. In other words, Northwind.dbo.Employees, not just Employees.
One useful application of sp_executesql is in cases where we are inserting into a horizontally partitioned view (horizontally partitioned views divide up the rows of a table among multiple tables). We have to determine which table the data should be inserted into, based upon the partitioning criteria.
For example, we could partition an order table based on the year in which a customer’s order was placed. Our naming convention would then include the year in the table name, like follows:
(Include code from Advanced SQL)
We build one table for each of the years in which we are maintaining data. Next, we have to build the @stmt string, as follows:
(Include code from Advanced SQL)
This statement assumes that we are using stored procedure parameters to supply the values for @OrderID, @OrderDate, @CustomerID, and @Total. The INSERT statement dynamically generates the table name. We can build the @params string as show below:
(Include code from Advanced SQL)
Behind the scenes, as we execute the code, we will place into cache one copy of the generated INSERT statement for each order year, meaning that is one for each table. Subsequent executions of sp_executesql stored procedure for a particular year will use the cached copy.
This approach is elegant from the standpoint that we can keep our INSET code down to a minimum and eliminate conditional processing. Also, if we now add another table for subsequent years, we do not have to modify our code.
As another example, suppose we want to write a stored procedure that returns either the average LineTotal or sum of LineTotal by ProductID for products shipped in 2011. We want this written as a stored procedure. The stored procedure should accept one parameter @ReturnAverage. If true, then we’ll return the average, otherwise the sum. Of course, we could write this as two separate queries as shown in the following stored proc but that wouldn’t be much fun, as it would be too much typing and prone to errors!
CREATE PROC uspCalcuateSalesSummaryStatic @returnAverage BIT AS IF(@returnAverage = 1) BEGIN SELECT SOD.ProductID, AVG(SOD.LineTotal) AS ResultAvg FROM Sales.SalesOrderDetail SOD INNER JOIN Sales.SalesOrderHEader SOH ON SOH.SalesOrderID = SOD.SalesOrderID WHERE YEAR(SOH.ShipDate) = 2011 GROUP BY SOD.ProductID; END; ELSE BEGIN SELECT SOD.ProductID, SUM(SOD.LineTotal) AS ResultSum FROM Sales.SalesOrderDetail SOD INNER JOIN Sales.SalesOrderHEader SOH ON SOH.SalesOrderID = SOD.SalesOrderID WHERE YEAR(SOH.ShipDate) = 2011 GROUP BY SOD.ProductID; END;
The bad part here is there is a lot of duplicate code, and not much unique code. With all this redundancy, we’ve can modify this query as dynamic SQL, as follows:
CREATE PROCEDURE uspCalcuateSalesSummaryDynamic @returnAverage bit AS DECLARE @statement NVARCHAR(4000), @function NVARCHAR(10) IF (@returnAverage = 1) SET @function = 'Avg' ELSE SET @function = 'Sum' SET @statement = 'SELECT SOD.ProductID,' + @function + + '(SOD.LineTotal) as Result' + @function + ' FROM Sales.SalesOrderDetail SOD INNER JOIN Sales.SalesOrderHEader SOH ON SOH.SalesOrderID = SOD.SalesOrderID WHERE YEAR(SOH.ShipDate) = 2011 GROUP BY SOD.ProductID' EXECUTE sp_executesql @statement
Here, instead of having two complete versions of the SQL code, one for AVG, the other for SUM, we build the requested version on-the-fly. The code is built and saved into the variable @statement. This variable is built based on the parameter value @returnAverage. If set to 1, then @function represents the Average; otherwise, Summation. We can see where the SQL is then built to create statement.
Let’s take our previous example and extend it. Rather than hardcoding the shipDate into the query as we did, let’s bring that in as a parameter. This makes the query more flexible and works with years other than 2011. To make this change, we’ll add a parameter to our stored procedure, as well as the dynamic query. We’ll then use the sp_executesql command to call the dynamic query using these parameters. The updated stored procedure with changes is shown below:
CREATE PROCEDURE uspCalcuateSalesSummaryDynamic2 @returnAverage bit, @shipDate int AS DECLARE @statement NVARCHAR(4000), @parameterDefinition NVARCHAR(4000), @function NVARCHAR(10) IF (@returnAverage = 1) SET @function = 'Avg' ELSE SET @function = 'Sum' SET @parameterDefinition = '@shipDate int' SET @statement = 'SELECT SOD.ProductID,' + @function + + '(SOD.LineTotal) as Result' + @function + ' FROM Sales.SalesOrderDetail SOD INNER JOIN Sales.SalesOrderHEader SOH ON SOH.SalesOrderID = SOD.SalesOrderID WHERE YEAR(SOH.ShipDate) = @shipDate GROUP BY SOD.ProductID' EXECUTE sp_executesql @statement, @parameterDefinition, @shipDate
To run this, simply call the uspCalculateSalesSummaryDynamic2 proc from a query window using the following command:
EXECUTE uspCalcuateSalesSummaryDynamic2 1,2013
As a summary, here are several reasons why Microsoft recommends using sp_executesql to run dynamic SQL:
- With EXECUTE all parameters much be converted from their native type to Unicode. This hamper the optimizer’s ability to match the dynamically built SQL with a pre-existing plan. SP_EXECUTESQL supports parameter substitution so we can build out the parameters in our string but then we have additional parameters that we pass in for parameter substitution.
- By using sp_executesql, the optimizer recognizes the parameters within the dynamic SQL, making it easier for the optimizer to match plans. It’s also going to be better for reusability of execution plans because sp_executesql can now accept parameters whereas the EXEC command cannot. So there is a very high possibility that it will be able to reuse those execution plans whereas EXEC cannot.
- It is easier to read parameterized queries with sp_executesql than it is to read a bunch of concatenated text which incorporates them in the case of EXEC.
- With EXEC we are limited, everything is going to have to be a string value, either VARCHAR or NVARCHAR data type. Because we’re working with the EXEC SQL command, everything is going to have to be in a string format. With the introduction of NVARCHAR(MAX) we no longer have any real use cases for EXEC command, so we always use sp_executesql unless we found some very compelling reason why we must use the EXEC statement. With the introduction of NVARCHAR(MAX) we no longer have any real use cases for EXEC command, so we always use SP_EXECUTESQL unless we found some very compelling reason why we must use the EXEC statement. With SP_EXECUTESQL we can substitute those parameter values and it’s going to allow us to pass in the original value and data type of that parameter. We have an interface we’re working with where we declare the input and output variables.
- We sp_executesql we don’t have to rewrite code every time since it’s already written for us. We just have to change the value of the variable that we’re passing in. So it’s going to be able to reuse those execution plans whereas with the EXEC statement anytime we want to change the value we have to re-write and re-execute the entire string every single time. So EXEC will not reuse execution plans whereas SP_EXECUTESQL will. With SP_EXECUTESQL, because we can pass in these variables in their original data type forms, they’re going to be more secure.
- Parameterized queries more secure and less susceptible to SQL injection attacks.
Using PIVOT with Dynamic SQL
The PIVOT operator can be used to pivot data. In a static query, we have to know ahead of time which values to specify in the IN clause of the PIVOT operator. For example, the following is a static query with the PIVOT operator:
(Include code, explanation, and results from page 402)
The code queries against the Sales.Orders table and pivots the data so that it returns shipper IDs in the rows, order years in the columns, and the total freight in the intersection of each shipper and order year.
Since we have to know ahead of time which values to specify in the IN clause of the PIVOT operator in static queries (order years in this case), we need to revise the code every year. We can instead query the distinct order years from the data, construct a batch of dynamic SQL code based on the years we queried, and execute the dynamic SQL batch, like follows:
(Include code, explanation, and results from page 403)
(Note that there are more efficient wats to concatenate strings than using a cursor (like in the query above), such as Common Language Runtime (CLR) aggregates and the FOR XML PATH option.)
SQL Injections
A drawback of dynamic SQL is that it opens up the possibility for a SQL Injection attack. SQL Injection is a hacking technique where malicious users try to exploit the use of a free form data entry field. These malicious users try to insert additional SQL code into a data entry field beyond how the data entry field was originally intended to be used. By inserting SQL code, they can fool the system into returning data that they were originally not supposed to get, or worse yet, run additional SQL commands against our database. Depending on the permissions that our application runs under, a SQL injection attack could insert data into our database tables, drop a table, or worse yet setup a new login having sysadmin rights.
To demonstrate how dynamic SQL can be subject to a SQL injection attack if not managed properly, let’s first create a database and a table using the code below. We will use this database and table to demonstrate how dynamic SQL can be vulnerable to a SQL injection attack.
USE master; GO CREATE DATABASE DYNA; GO USE DYNA; GO CREATE TABLE Product (ID INT, ProductName VARCHAR(100), Price MONEY); INSERT INTO Product VALUES (1, 'Red Wagon', 12.99), (2, 'Red Barn', 23.18), (2, 'Farm Animals', 7.59), (2, 'Toy Solders', 17.76); SELECT * FROM Product;
The code above creates a database name DYNA, and then creates and populates a table name Product with 4 rows of data.
Suppose our application has a data selection screen where an end user can enter a text string that is contained in ProductName and then the application will return all the table records that contain the entered text string. The application does this by passing the text string that a user enters to a stored procedure name GetProducts, and then the data returned from the stored procedure is displayed to the user. The code for the stored procedure GetProducts is shown below:
CREATE PROC GetProducts (@EnteredText varchar (100)) AS DECLARE @CMD varchar(1000); SET @CMD = 'SELECT ProductName, Price ' + 'FROM Product ' + 'WHERE ProductName LIKE ''%' + @EnteredText + '%'''; PRINT @CMD EXEC (@CMD);
We can see this stored procedure accepts a single parameter @EnteredText. This parameter is then used to dynamically create a SQL statement that is stored in the variable @CMD. That variable is then executed. The stored procedure can then be executed, as follows:
EXEC GetProducts 'Red';
Because the code in our stored procedure GetProducts takes a parameter and generates the VARCHAR variable @CMD, it leaves the stored procedure open for a SQL injection attack. For example, we can execute the GetProducts stored procedure with the code below:
EXEC GetProducts 'Red%'' and ID = 1 --';
We passed a number of other characters in additional to the string “Red” to our stored procedure GetProducts. These additional characters we passed allows us to restrict our query to only return the Products that have “Red” in the ProductName column and have an ID value of 1. By allowing my stored procedure to use the unedited text in the @EnteredText parameter we are able to inject additional characters into that parameter to cause the code to perform other actions not originally intended to be used in the GetProducts stored procedure.
Most SQL injection attacks are trying to get additional data out of your system, or just want to corrupt our database. To explore this a little more let’s look at the code below:
EXEC GetProducts 'Red'' ;SELECT * FROM Product;--';
The code generates two result sets. If we compare the results of the normal execution of GetProduct stored procedure found in first result set, with second result set, we can see the code in above generated some additional output columns that our stored procedures weren’t originally designed to display, but was displayed due to a SQL injection attack.
The code did allow us to exploit the @EnteredText parameter of the GetProduct stored procedure to return data for all columns of the Client table. To accomplish this we added the “’ ;SELECT * FROM Product;–“ string to our parameter. For our last example we will perform a destructive SQL injection attack, as follows:
EXEC GetProducts 'Red'' ;DROP TABLE Product;--';
In the code above we added a DELETE statement to the @EMAIL parameter. In this example we deleted the Client table.
No one wants to have their code compromised by a SQL injection attack. In order to combat against SQL Injection attacks we should consider the following points when developing our T-SQL application code:
- The best way to avoid a SQL Injection attack is to not use dynamic SQL
- Edit user entered parameters for special characters like semi-colons and comments
- Make our parameters only as long as needed to support the user entered data
- If we must use dynamic SQL then we should use parameterized T-SQL that uses sp_execute sql to execute our dynamic T-SQL, instead of EXEC.
- Tighten up security to only allow minimal rights needed to execute your dynamic SQL.
The code below provides an example of how we can modify our GetUserName stored procedure to use parameterized TSQL:
ALTER PROC GetProducts (@EnteredText varchar (100)) AS DECLARE @CMD nvarchar(1000); DECLARE @WildCardParm varchar(102); SET @CMD = 'SELECT ProductName, Price ' + 'FROM Product ' + 'WHERE ProductName LIKE @EnteredParm'; SET @WildCardParm = '%' + @EnteredText + '%'; EXEC sp_executesql @CMD,N'@EnteredParm varchar(100)',@EnteredParm=@WildCardParm;
In the code above we altered the GetProducts stored procedure to use sp_executesql to execute our dynamic SQL. In this modified stored procedure, we made the following changes:
- Changed the string @CMD to no longer include the value of the @EnteredText variable in the command string. Instead I introduced the user entered text in a variable named @EnteredParm.
- Added a SET statement to set the variable @WildCardParm to place the wildcard character (%) at the beginning and end of the @EnteredText parameter.
- Changed how the string @CMD was executed. Instead of using the EXEC statement to execute the string, we used the procedure sp_executesql.
By making these changes the user entered text will now be executed as a parameter driven query. By doing this, a user can no longer try to inject additional TSQL code into the GetProduct store procedure. To verify this, we run the four different queries shown below. But since we already deleted our Product table, we first need to recreate it with data.
CREATE TABLE Product (ID INT, ProductName VARCHAR(100), Price MONEY); INSERT INTO Product VALUES (1, 'Red Wagon', 12.99), (2, 'Red Barn', 23.18), (2, 'Farm Animals', 7.59), (2, 'Toy Solders', 17.76);
--Query1 EXEC GetProducts 'Red'; --Query2 EXEC GetProducts 'Red%'' and ID = 1 --'; --Query3 EXEC GetProducts 'Red'' ;SELECT * FROM Product;--'; --Query4 EXEC GetProducts 'Red'' ;DROP TABLE Product;--';
When we execute these different queries (Query1, Query2, Query3, Query4) we will find that only Query1 returns data. The reason the others do not return data is the dynamic SQL generated now is looking for ProductName values that contain the additional user enter injection values, which of course don’t match any of the Product column values in the Product table.