Just Some Things to Read
Performance issue insert select
11 May 2022
A while ago I noticed a performance issue with a view in Azure SQL Server.
The returned result set of this view is inserted into a table; just like this insert into <table> (<fields>) select <fields> from <view>.
In this view we are joining tables and views. And those views are also joining other tables and views.
So in short we have a view on a view construction. I know SQL Server is not always happy with this but from a data logistics point of view it is a simple and nice solution.
I still believe a view on a view construction is not bad and should not have an impact on performance. If it has then we have a code issue and/or a problem in the technical data model e.g. a missing index.
But sometimes it is what it is and we can not change easily the techinical data model. And often the data and business rules are complex and so the code/SQL.
Sometimes we need to persist a view for performance reasons.
In the end it is finding the right balance between performance, Azure costs, easy development and we should be able to deliver the data right on time as agreed with our stakeholders.
Okay back to the performance issue. It turns out that inserting all returned records into an empty table takes about 11 minutes for 50k records.
A clustered columnstore index is the only applied index on that table before inserting records and this one is not really needed as we take the number of records into account.
But using clustered columnstore indexes is part of the homebuild framework.
My first thought was that this performance issue should be an easy one to solve with adding the option force order.
In my experience with a view on a view construction this option might improve the execution of a view.
But in this case it doesn't help, I cancel the query after 45 minutes. I forgot it was still running....
So, I decided to test several "insert - select" options (and in combination if possible) to investigate in this case the best option(s):
- clustered columnstore index (CCIX)
- the table hint tablock
- the query hint force order
- table compression (TC)
Below the average run time in minutes of the various scenario's I tested
| Scenario | Degree of Parallelism | Run Time (avg in min) |
|---|---|---|
| CCIX without tablock and force order | 1 | 11:32 |
| CCIX with tablock and without force order | 1 | 4:43 |
| CCIX with tablock and force order | 1 | 5:05 |
| CCIX without tablock and with force order | 1 | after 20 min query is cancelled |
| TC without tablock and force order | 1 | 11:32 |
| TC with tablock and without force order | 16 | 0:32 |
| TC with tablock and force order | 1 | 0:34 |
| TC without tablock and with force order | 1 | after 20 min query is cancelled |
Just some background of the test setup:
- Buffers and cache are cleared before running the query
- Each scenario is executed 5 times
- Before running the query the table was dropped en re-created with or without CCIX and TC
- Used Azure SQL Server General Purpose Serverless Gen5 16 vcores
select
req.session_id
, req.start_time
, req.status
, req.command
, DatabaseName = db_name(req.database_id)
, req.blocking_session_id
, req.wait_type
, req.last_wait_type
, ssn.login_name
, ssn.program_name
, ObjectName = object_name(qtext.objectid, qtext.dbid)
, qtext.text
, qplan.query_plan
, req.dop
, req.reads
, req.writes
, req.logical_reads
, req.total_elapsed_time
, req.parallel_worker_count
from sys.dm_exec_requests as req
cross apply sys.dm_exec_sql_text(req.sql_handle) as qtext
cross apply sys.dm_exec_query_plan(req.plan_handle) as qplan
inner join sys.dm_exec_sessions as ssn on req.session_id = ssn.session_id
where req.session_id <> @@spid;
Happy coding!