Blog - Data Pathfinder

Performance issue insert select

11 May 2022

A while ago I noticed a performance issue with a view in Azure SQL Server. The returned result set of this view is inserted into a table; just like this insert into <table> (<fields>) select <fields> from <view>. In this view we are joining tables and views. And those views are also joining other tables and views. So in short we have a view on a view construction. I know SQL Server is not always happy with this but from a data logistics point of view it is a simple and nice solution.

I still believe a view on a view construction is not bad and should not have an impact on performance. If it has then we have a code issue and/or a problem in the technical data model e.g. a missing index. But sometimes it is what it is and we can not change easily the techinical data model. And often the data and business rules are complex and so the code/SQL. Sometimes we need to persist a view for performance reasons. In the end it is finding the right balance between performance, Azure costs, easy development and we should be able to deliver the data right on time as agreed with our stakeholders.

Okay back to the performance issue. It turns out that inserting all returned records into an empty table takes about 11 minutes for 50k records. A clustered columnstore index is the only applied index on that table before inserting records and this one is not really needed as we take the number of records into account. But using clustered columnstore indexes is part of the homebuild framework.

My first thought was that this performance issue should be an easy one to solve with adding the option force order. In my experience with a view on a view construction this option might improve the execution of a view. But in this case it doesn't help, I cancel the query after 45 minutes. I forgot it was still running.... So, I decided to test several "insert - select" options (and in combination if possible) to investigate in this case the best option(s):

clustered columnstore index (CCIX)
the table hint tablock
the query hint force order
table compression (TC)

Below the average run time in minutes of the various scenario's I tested

Scenario	Degree of Parallelism	Run Time (avg in min)
CCIX without tablock and force order	1	11:32
CCIX with tablock and without force order	1	4:43
CCIX with tablock and force order	1	5:05
CCIX without tablock and with force order	1	after 20 min query is cancelled
TC without tablock and force order	1	11:32
TC with tablock and without force order	16	0:32
TC with tablock and force order	1	0:34
TC without tablock and with force order	1	after 20 min query is cancelled

In this case I have chosen to populate the table with table compression and tablock.

Just some background of the test setup:

Buffers and cache are cleared before running the query
Each scenario is executed 5 times
Before running the query the table was dropped en re-created with or without CCIX and TC
Used Azure SQL Server General Purpose Serverless Gen5 16 vcores

Below the query to see the Degree of Parallism (field dop in the query) and other properties as well:

select
	req.session_id
,	req.start_time
,	req.status
,	req.command
,	DatabaseName = db_name(req.database_id)
,	req.blocking_session_id
,	req.wait_type
,	req.last_wait_type
,	ssn.login_name
,	ssn.program_name
,	ObjectName = object_name(qtext.objectid, qtext.dbid)
,	qtext.text
,	qplan.query_plan
,	req.dop
,	req.reads
,	req.writes
,	req.logical_reads
,	req.total_elapsed_time
,	req.parallel_worker_count

from sys.dm_exec_requests as req
	cross apply sys.dm_exec_sql_text(req.sql_handle) as qtext
	cross apply sys.dm_exec_query_plan(req.plan_handle) as qplan
	inner join sys.dm_exec_sessions as ssn on req.session_id = ssn.session_id

where req.session_id <> @@spid;

Happy coding!

Just Some Things to Read

Performance issue insert select

11 May 2022

Get in touch