What you must know about performance optimization

What you must know about performance optimization

       Recently, some customers reported that the system imported EXECL for data processing timed out. My first reaction at the time was that it was impossible. I obviously did performance optimization. How could it time out? How many pieces of data are needed to happen? ! So I asked a customer for EXECL, and found that there were more than 7,500 pieces of data. After backing up the customer database, I performed code debugging to find out the poor performance. They are all things that are often talked about, but they are easy to ignore. There are only two points here, whether to use String or StringBuilder, whether to verify the correctness of the data one by one in the loop, use SQL to fetch the numbers, or Take it out once and verify it in the code! The following will use the actual data combined with the chart to give an accurate answer.

Read the table of contents

back to the top

String and StringBuilder performance difference comparison

   The difference between String and StringBuilder will not be mentioned here. In learning and working, we often hear that the use of StringBuilder objects for splicing strings is very fast, but maybe you just know this knowledge. Have you paid attention to this in actual development work? I also learned this knowledge after the customer feedback and followed the actual results by myself, and I will keep this in mind in the subsequent development! The following actual data may explain some problems.

      Call this function separately, the number of cycles is 1, 5, 15, 200, 500, 1500, 2500, 5500, 8500, 20000. The data after that can be downloaded for the final DEMO experiment. String is already too slow at this time. In order to ensure the accuracy of the data, the data of each magnitude here are taken ten times, and then the average value is calculated.

///<summary>
       ///Compare the speed of string and StringBuilder splicing strings
       ///For each magnitude test, take the average of ten times
       ///</summary>
       ///<param name="Total">Number of cycles</param>
        public static void StringSpeedComparer(int Total){
            List<string> list = new List<string>();
            for (int i = 0; i <Total; i++)
            {
                list.Add(Guid.NewGuid().ToString());
            }

            int iTest = 10;
           //Total execution time ms
            double TotalMilliseconds = 0;


           //String stitching
            string strGUID = String.Empty;
            while (iTest> 0)
            {
                DateTime dtBegin = DateTime.Now;
                foreach (string temp in list)
                {
                    strGUID = strGUID + temp + ";";
                }
                DateTime dtEnd = DateTime.Now;
                TotalMilliseconds += (dtEnd-dtBegin).TotalMilliseconds;
                iTest--;
            }
            Console.WriteLine("String splicing {0} strings takes {1}ms", Total, TotalMilliseconds/10);

           //StringBuilder stitching
            StringBuilder sb = new StringBuilder();
            iTest = 10;
            TotalMilliseconds = 0;
            while (iTest> 0)
            {
                DateTime dtBegin = DateTime.Now;
                foreach (string temp in list)
                {
                    sb.AppendFormat("{0};", temp);
                }
                DateTime dtEnd = DateTime.Now;
                TotalMilliseconds += (dtEnd-dtBegin).TotalMilliseconds;
                iTest--;
            }
            Console.WriteLine("It takes {1}ms for StringBuilder to stitch {0} strings together", Total, TotalMilliseconds/10);
        }

The execution result is as follows:

 Draw into a graph:

   It can be seen directly from the above figure that String splicing is geometrically increasing, while StringBuilder is linear, and the increasing trend is slow. I believe everyone knows which splicing to use in the case of a large number of cycles! In the amount of 7500, it can save a full 4s of time, is the performance improved a lot?

back to the top

Is it cyclical or one-time?

  Background: There are 7500 rows of student information data in EXECL. These data must be imported into the student table (p_Student), but the student number (StudentNo) must be unique, and prompt information must be given when it is not unique. This requires reading the student information in EXECL in the background code and then verifying whether the student code exists in the database. Of course, the student ID filled in EXECL must also be verified as unique. Let's simulate this process and compare the performance in two ways. ,

  First create a student information table, insert 7500 pieces of data, the following is the SQL script, the student ID is inserted here is newid, the actual situation will not be like this, here is only guaranteed to be unique, but it is disordered, as realistic as possible situation.

/*---------------------------Data Dictionary Generation Tool (V2.1)------------ --------------------*/
GO
IF NOT EXISTS(SELECT 1 FROM sysobjects WHERE id=OBJECT_ID('[p_Student]'))
BEGIN
/*============================================== ==============*/
/* Table: p_Student */
/*============================================== ==============*/
CREATE TABLE [dbo].[p_Student](
    [StudentGUID] uniqueidentifier,
    [Name] varchar(40),
    [Major] varchar(100),
    [Sex] varchar(8),
    [StudentNo] varchar(100),
    PRIMARY KEY(StudentGUID)
)
    

declare @CurrentUser sysname
select @CurrentUser = user_name()
execute sp_addextendedproperty'MS_Description','student information table','user', @CurrentUser,'table','p_Student'
execute sp_addextendedproperty'MS_Description','Student Information GUID' ,'user', @CurrentUser,'table','p_Student','column','StudentGUID'
execute sp_addextendedproperty'MS_Description','Name' ,'user', @CurrentUser,'table','p_Student','column','Name'
execute sp_addextendedproperty'MS_Description','Professional' ,'user', @CurrentUser,'table','p_Student','column','Major'
execute sp_addextendedproperty'MS_Description','Gender' ,'user', @CurrentUser,'table','p_Student','column','Sex'
execute sp_addextendedproperty'MS_Description','Student Number' ,'user', @CurrentUser,'table','p_Student','column','StudentNo'

END
GO
--Insert 7500 pieces of simulated data
DECLARE @Count AS INT
SELECT @Count=COUNT(1) FROM p_Student
IF @Count=0
BEGIN
    DECLARE @i AS INT
    SET @i=7500
    WHILE @i>0
    BEGIN
        INSERT INTO dbo.p_Student
                (StudentGUID,
                  Name,
                  Major,
                  Sex,
                  StudentNo
                )
        VALUES (NEWID(), - StudentGUID-uniqueidentifier
                  @i, - Name-varchar(40)
                  'Software Engineering', - Major-varchar(100)
                  'Male', - Sex-varchar(8)
                  NEWID() - StudentNo-varchar(100)
                )
        SET @i=@i-1
    END
END
GO

      After the basic information is ready, enter the background code

///<summary>
       ///Statistical cycle check and one-time check performance difference
       ///</summary>
        public static void Check(int Total)
        {
           //Simulate student ID
            List<string> listStudetNo = new List<string>();
            for (int i = 0; i <Total; i++)
            {
                listStudetNo.Add(Guid.NewGuid().ToString());
            }
            using (SqlConnection con = new SqlConnection(SqlCon))
            {
                con.Open();
                string strSQL = "SELECT COUNT(1) FROM dbo.p_Student WHERE StudentNo='{0}'";
                SqlCommand cmd = con.CreateCommand();

               //Cycle check
                double TotalMilliseconds = 0;
                for (int i = 0; i <10; i++)
                {
                    foreach (string studentNo in listStudetNo)
                    {
                        DateTime dtBegin = DateTime.Now;
                        cmd.CommandText = String.Format(strSQL, studentNo);
                        int count = (int)cmd.ExecuteScalar();
                        if (count> 0)
                        {
                            Console.WriteLine("The {0} number is duplicated, please re-enter!", studentNo);
                            return;
                        }
                        DateTime dtEnd = DateTime.Now;
                        TotalMilliseconds += (dtEnd-dtBegin).TotalMilliseconds;
                    }
                }
                Console.WriteLine("It takes {1}ms to verify {0} student numbers in a loop", Total, TotalMilliseconds/10);

               //One-time verification
                TotalMilliseconds = 0;
                strSQL = "SELECT TOP 1 StudentNo FROM dbo.p_Student WHERE StudentNo IN ('{0}')";
                for (int i = 0; i <10; i++)
                {
                    DateTime dtBegin = DateTime.Now;
                    StringBuilder sb = new StringBuilder();
                    foreach (string studentNo in listStudetNo)
                    {
                        sb.AppendFormat("{0};", studentNo);
                    }
                    cmd.CommandText = String.Format(strSQL,sb.ToString().Substring(0, sb.ToString().Length-1).Replace(";","','"));
                    string no = (string)cmd.ExecuteScalar();
                    if (!string.IsNullOrEmpty(no))
                    {
                        Console.WriteLine("The {0} number is duplicated, please re-enter!", no);
                        return;
                    }
                    DateTime dtEnd = DateTime.Now;
                    TotalMilliseconds += (dtEnd-dtBegin).TotalMilliseconds;
                }
                Console.WriteLine("It takes {1}ms to verify {0} student numbers at one time", Total, TotalMilliseconds/10);
            }
        }

    It can be seen directly from the figure above that both the cyclic check and the one-time check increase linearly, and the one-time check speed is almost twice as fast as the one-time check.

back to the top

Sample download and summary

Sample sql , sample code DEMO

         In fact, performance optimization is more than just this point. It needs to be summarized in daily work. This performance optimization also surprises me. There is a SQL that takes about 20s to execute before optimization. Indexes are added to the table, and the speed changes. It became 0s, and finally the performance problem was solved satisfactorily.

        Performance optimization ideas:

        1: Please use StringBuilder for a large number of string splicing

        2: Never check SQL in a large number of loops, consider whether it can be replaced by a one-time query, or query the data once and make logical judgments in the code

        3: SQL execution speed is slow, you can use the execution plan to see if the table lacks indexes.

      Well, this article is about to end here, if you think it is good for you, remember to like it!

   Related Reading: Attaching a database method without a log file, a method to delete a database log file, a data dictionary generation tool series

Reference: https://cloud.tencent.com/developer/article/1014588 What you must know about performance optimization-Cloud + Community-Tencent Cloud