As hospital information system (HIS) modules have multiplied, and especially since electronic medical records came into use over the past two years, clinical diagnosis and treatment information has been written to the database in large quantities. The data volume has grown sharply, the business database has become very large, and business processing has slowed markedly. To address this problem, this paper explores how to optimize the performance of large databases and proposes several strategies for optimizing database access, in particular the analysis and design of SQL statements, so that statements execute faster, less data crosses the network, and the system works at full efficiency.
After years of informatization work, the hospital has achieved remarkable results. Its information systems were at first built mainly around charging and accounting and have gradually shifted toward clinical care and patient service. As HIS modules have multiplied, and especially since electronic medical records came into use over the past two years, clinical diagnosis and treatment information has been written to the database in large quantities, the data volume has grown sharply, and business processing has slowed markedly. In addition, large queries and statistical reports are frequently run against the same business database, causing frequent blocking or deadlocks in business processing that seriously affect daily work. Optimizing database performance, that is, raising database throughput and reducing user waiting time, is therefore of great significance.
Traditional database performance optimization mainly considers the operating system, the design of the client application, the network, and other hardware. This approach only adjusts the environment around the database; it can relieve the problem temporarily but cannot solve it fundamentally. In practice, the more common situation is that the hospital information system (including its database) has already been designed, and performance problems appear during operation as the data grows. The optimization proposed in this paper therefore builds on hardware upgrades, the physical design of the database, and relational normalization that are already in place, and focuses on the effective analysis and design of SQL statements so that they execute faster, transmit less data over the network, and make full use of the system.
1 Reasonable use of indexes
The most effective way to improve database query speed is to optimize indexes. An index is a data structure built on a table that makes it more efficient to look up one or more records in that table. The purpose of an index is to avoid full table scans, reduce the number of disk I/O operations, and speed up queries. Indexes on large tables matter a great deal for query speed, but not every table should be indexed. Indexes usually improve the performance of SELECT, UPDATE, and DELETE statements (when few rows are accessed) but reduce the performance of INSERT statements (because each insert must write to both the table and its indexes). In addition, too many indexes create maintenance overhead that lowers rather than raises system performance, so indexes should be used in moderation. The principles of index usage are as follows:
(1) Create indexes on columns that are frequently joined but not declared as foreign keys; leave columns that are rarely joined to the optimizer.
(2) Create indexes on columns that are frequently sorted or grouped (that is, used in GROUP BY or ORDER BY operations), and do not create too many indexes on tables with frequent deletes or inserts.
(3) Create indexes on columns that have many distinct values and are often used in conditional expressions. Do not create indexes on columns with few distinct values. For example, the "gender" column of an employee table has only two distinct values, "male" and "female", so there is no need to index it. An index there would not improve query efficiency and would seriously slow down updates.
(4) If several columns are sorted together, create a compound index on those columns. Keep indexes as narrow as possible so that each page holds more index rows and fewer pages have to be read.
(5) In queries, the columns worth indexing are those that appear often in conditional expressions and have many distinct values; columns with few distinct values should not be indexed.
(6) After a large batch of data has been loaded into or updated in a table, drop and rebuild its indexes to improve query speed.
In short, indexes must be created with care: the need for each index should be analyzed and justified before it is built. Too many indexes, too few indexes, and incorrect indexes all work against database performance.
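The effect of an index on a selective column can be checked by comparing query plans before and after the index is created. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the production DBMS; the table `charge_detail`, its columns, and the index name are hypothetical, and plan output on other database systems will look different.

```python
import sqlite3

def plan(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
# Hypothetical charge-detail table; all names are illustrative only.
conn.execute(
    "CREATE TABLE charge_detail ("
    "id INTEGER PRIMARY KEY, patient_id INTEGER, gender TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO charge_detail (patient_id, gender, amount) VALUES (?, ?, ?)",
    [(i % 1000, "male" if i % 2 else "female", i * 0.5) for i in range(10000)],
)

query = "SELECT * FROM charge_detail WHERE patient_id = 42"

# No index on patient_id yet: the planner has to scan the whole table.
plan_before = plan(conn, query)

# patient_id has many distinct values and is used in the WHERE clause,
# so it is a reasonable index candidate under principle (3).
conn.execute("CREATE INDEX idx_charge_patient ON charge_detail (patient_id)")
plan_after = plan(conn, query)

print(plan_before)
print(plan_after)
```

The first plan reports a scan of the table and the second a search through `idx_charge_patient`. The same comparison can be made on other systems with their own plan-display tools.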
2 SQL statement optimization
SQL is a very flexible language. The same function can often be expressed by several different statements whose execution efficiency differs widely. In any database application, therefore, reasonable optimization of SQL statements can greatly improve the performance of the whole database system. The execution of every SQL statement passes through three phases: parsing, execution, and fetching data.
Figure 1 SQL statement execution process
Performance differences between SQL statements are especially evident in large or complex database environments, such as the large tables in a HIS. Experience shows that inefficient SQL statements result mainly from inappropriate index design, insufficient join conditions, WHERE clauses that cannot be optimized, and other unsuitable statement operations; once these were corrected, running speed improved significantly. Each of these aspects is explained below:
2.1 LIKE operator
The LIKE operator supports wildcard queries. Wildcard combinations can express almost arbitrary queries, but used carelessly they cause performance problems: LIKE 'a%' can use an index, whereas LIKE '%a' cannot. With LIKE '%a%', the query time is proportional to the total length of the field values, so use VARCHAR rather than CHAR for such columns.
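The difference between a prefix pattern and a leading-wildcard pattern can be seen in the query plan. This is a minimal sketch in SQLite via Python's sqlite3 module, not the paper's DBMS; the `patient` table and index name are hypothetical. One SQLite-specific assumption is made: because LIKE is case-insensitive there by default, the index is declared with NOCASE collation so that the planner is allowed to use it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; 'ward' exists only so that SELECT * is not
# answerable from the index alone.
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT, ward TEXT)")
# SQLite's default LIKE is case-insensitive, so the index needs NOCASE
# collation before the planner will use it for LIKE.
conn.execute("CREATE INDEX idx_patient_name ON patient (name COLLATE NOCASE)")

def plan(sql):
    """Join the 'detail' column of EXPLAIN QUERY PLAN into one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Prefix pattern: can be turned into a range search on the index.
prefix_plan = plan("SELECT * FROM patient WHERE name LIKE 'a%'")
# Leading wildcard: no usable range, so the table is scanned.
suffix_plan = plan("SELECT * FROM patient WHERE name LIKE '%a'")

print(prefix_plan)
print(suffix_plan)
```

The prefix query is planned as an index search and the leading-wildcard query as a table scan, which matches the behavior described above.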
2.2 Limiting the returned rows
In SELECT statements, use the WHERE clause to limit the number of rows returned and avoid table scans. Returning unnecessary data wastes the server's I/O resources, adds load to the network, and reduces performance. If the table is large, it is locked during a table scan and other connections cannot access it, which has serious consequences. The TOP clause can be used to limit the returned results. When returning multiple rows, avoid cursors where possible because they consume a lot of resources; use a datastore instead.
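As a small illustration of limiting both rows and columns, the sketch below compares an unrestricted query with one that filters in WHERE and caps the result size. It uses SQLite through Python's sqlite3 module, where `LIMIT` plays the role that `TOP` plays in SQL Server; the `visit` table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical outpatient-visit table with 1000 rows.
conn.execute("CREATE TABLE visit (id INTEGER PRIMARY KEY, dept TEXT, fee REAL)")
conn.executemany(
    "INSERT INTO visit (dept, fee) VALUES (?, ?)",
    [("cardiology" if i % 4 == 0 else "general", float(i)) for i in range(1000)],
)

# Unrestricted: every row and every column travels to the client.
all_rows = conn.execute("SELECT * FROM visit").fetchall()

# Restricted: only the needed columns, filtered by WHERE, capped at 10 rows.
# LIMIT is SQLite syntax; SQL Server would use SELECT TOP 10 instead.
top_rows = conn.execute(
    "SELECT id, fee FROM visit WHERE dept = ? ORDER BY fee DESC LIMIT 10",
    ("cardiology",),
).fetchall()

print(len(all_rows), len(top_rows))
print(top_rows[0])
```

The restricted query returns 10 two-column rows instead of 1000 three-column rows, which is the reduction in I/O and network traffic the section argues for.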
2.3 UNION operator
UNION filters out duplicate records after combining the tables: the combined result set is sorted, duplicates are deleted, and only then is the result returned. In most applications no duplicate records are produced; the most common case is a UNION of a process table and a history table. In such cases UNION ALL is recommended instead of UNION, because UNION ALL simply merges the two results and returns them.
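The difference in semantics can be shown with a small example. This sketch uses SQLite through Python's sqlite3 module; the tables `orders_current` and `orders_history` are hypothetical stand-ins for a process table and a history table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical process table and history table with no overlapping rows.
conn.execute("CREATE TABLE orders_current (order_id INTEGER, item TEXT)")
conn.execute("CREATE TABLE orders_history (order_id INTEGER, item TEXT)")
conn.executemany("INSERT INTO orders_current VALUES (?, ?)",
                 [(3, "x-ray"), (4, "blood test")])
conn.executemany("INSERT INTO orders_history VALUES (?, ?)",
                 [(1, "x-ray"), (2, "ecg")])

# Whole rows are disjoint, so both operators return the same four rows;
# UNION just does extra duplicate-elimination work to get there.
union_rows = conn.execute(
    "SELECT order_id, item FROM orders_current "
    "UNION SELECT order_id, item FROM orders_history").fetchall()
union_all_rows = conn.execute(
    "SELECT order_id, item FROM orders_current "
    "UNION ALL SELECT order_id, item FROM orders_history").fetchall()

# When duplicates can occur (here 'x-ray' appears in both tables),
# the two operators differ, so UNION ALL is only a safe swap when
# duplicates are impossible or acceptable.
union_items = conn.execute(
    "SELECT item FROM orders_current UNION SELECT item FROM orders_history").fetchall()
union_all_items = conn.execute(
    "SELECT item FROM orders_current UNION ALL SELECT item FROM orders_history").fetchall()

print(len(union_rows), len(union_all_rows))    # same row count
print(len(union_items), len(union_all_items))  # UNION removed one duplicate
```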
2.4 Between and IN
BETWEEN is faster than IN in some cases, because BETWEEN can locate a range more quickly through the index. For example (assuming YPXH is an integer column, the two statements return the same rows):
Select * from YF_KCMX where YPXH in (12,13)
Select * from YF_KCMX where YPXH between 12 and 13
Unneeded rows can generally be eliminated before the GROUP BY and HAVING clauses are applied, so avoid using those clauses to do the work of filtering rows. The order of execution should be as follows: the WHERE clause of the SELECT picks out all qualifying rows, GROUP BY groups the rows for aggregation, and HAVING eliminates unwanted groups. In this way GROUP BY and HAVING carry little overhead and the query is fast. Grouping and applying HAVING to a large number of rows is very resource intensive. If the GROUP BY involves no calculation and is used only for grouping, DISTINCT is faster.
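The following sketch shows the same row filter written in HAVING and in WHERE, and the GROUP BY versus DISTINCT case. It uses SQLite through Python's sqlite3 module with a hypothetical `fee` table. The results are identical; the point is only where the filtering happens, and some optimizers may move such a condition automatically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fee table.
conn.execute("CREATE TABLE fee (dept TEXT, amount REAL)")
conn.executemany("INSERT INTO fee VALUES (?, ?)", [
    ("surgery", 100.0), ("surgery", 250.0),
    ("pediatrics", 80.0), ("pediatrics", 40.0),
    ("radiology", 300.0),
])

# Row filter written in HAVING: as written, all departments are grouped
# and summed before the unwanted groups are thrown away.
having_rows = conn.execute(
    "SELECT dept, SUM(amount) FROM fee GROUP BY dept HAVING dept = 'surgery'"
).fetchall()

# Same filter in WHERE: only the surgery rows reach GROUP BY.
where_rows = conn.execute(
    "SELECT dept, SUM(amount) FROM fee WHERE dept = 'surgery' GROUP BY dept"
).fetchall()

# HAVING remains the right place for conditions on aggregates.
big_depts = conn.execute(
    "SELECT dept, SUM(amount) FROM fee GROUP BY dept "
    "HAVING SUM(amount) > 200 ORDER BY dept"
).fetchall()

# Grouping without any aggregate is just de-duplication, so DISTINCT
# expresses it directly.
group_only = conn.execute("SELECT dept FROM fee GROUP BY dept ORDER BY dept").fetchall()
distinct_only = conn.execute("SELECT DISTINCT dept FROM fee ORDER BY dept").fetchall()

print(having_rows, where_rows)
print(big_depts)
```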
2.5 Attention to detail
As a rule, avoid the following operators: "<>", "!=", "!>", "!<", "NOT", "NOT EXISTS", "NOT IN", "NOT LIKE", and "LIKE '%500'", because they cannot use an index and all result in table scans. NOT IN scans the table multiple times; replace it with EXISTS, NOT EXISTS, IN, or LEFT OUTER JOIN, preferably a left join. EXISTS is faster than IN, and NOT operations are the slowest. If a column contains NULL values, its index formerly did not work; operators such as "<>", "!=", and "!>" still cannot be optimized and cannot use an index.
Do not apply functions such as CONVERT or SUBSTRING to column names in the WHERE clause. If a function is unavoidable, create a computed column and index that column instead. Often the condition can simply be rewritten. For example:
WHERE SUBSTRING(firstname,1,1) = 'm'
can be changed to:
WHERE firstname like 'm%'
which allows an index scan. MIN() and MAX(), however, can use an appropriate index.
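The cost of wrapping an indexed column in a function can be checked in a query plan. This is a minimal sketch in SQLite through Python's sqlite3 module, with a hypothetical `staff` table. Two substitutions are assumed: SQLite's `substr` stands in for `SUBSTRING`, and the prefix condition is written as the explicit range `firstname >= 'm' AND firstname < 'n'`, which for lowercase data selects the same rows as `LIKE 'm%'` and works with SQLite's default index collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical staff table with an ordinary index on firstname.
conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, firstname TEXT, dept TEXT)")
conn.execute("CREATE INDEX idx_staff_firstname ON staff (firstname)")
conn.executemany(
    "INSERT INTO staff (firstname, dept) VALUES (?, ?)",
    [("mary", "lab"), ("mike", "icu"), ("anna", "lab"),
     ("nora", "icu"), ("max", "er")],
)

def plan(sql):
    """Join the 'detail' column of EXPLAIN QUERY PLAN into one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Function applied to the column: the index on firstname cannot be used.
func_sql = "SELECT * FROM staff WHERE substr(firstname, 1, 1) = 'm'"
# Bare column compared against a range: the index can be searched.
range_sql = "SELECT * FROM staff WHERE firstname >= 'm' AND firstname < 'n'"

func_plan = plan(func_sql)
range_plan = plan(range_sql)

# Both forms select the same people.
func_names = sorted(r[1] for r in conn.execute(func_sql))
range_names = sorted(r[1] for r in conn.execute(range_sql))

print(func_plan)
print(range_plan)
print(func_names, range_names)
```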
Implicit type conversions should be avoided as well. Consider:
Select * from ZY_FYMX where FYDJ > 3000
If FYDJ is of type Float, the optimizer rewrites the constant as Convert(float, 3000). Because 3000 is an integer, write it as a floating-point literal when programming rather than leaving the conversion to the DBMS. The same applies to conversions between character and integer data. The statement should be changed to:
Select * from ZY_FYMX where FYDJ > 3000.00
2.6 Avoid correlated subqueries
When a column appears both in the main query and in the WHERE clause of a subquery, the subquery will probably have to be re-executed every time the column value in the main query changes. The more levels of nesting a query has, the lower its efficiency, so subqueries should be avoided where possible. If a subquery is unavoidable, filter out as many rows as possible inside it.
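One common way to remove a correlated subquery is to aggregate once in a derived table and join to it. The sketch below, in SQLite through Python's sqlite3 module with hypothetical `patient` and `charge` tables, shows that the two forms return the same result; whether the join form is actually faster depends on the DBMS and its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical patient and charge tables.
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE charge (patient_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO patient VALUES (?, ?)",
                 [(1, "Li"), (2, "Wang"), (3, "Zhao")])
conn.executemany("INSERT INTO charge VALUES (?, ?)",
                 [(1, 100.0), (1, 50.0), (2, 20.0)])

# Correlated form: the inner query refers to p.id, so conceptually it is
# evaluated again for each row of the outer query.
correlated_sql = """
SELECT p.name,
       (SELECT SUM(c.amount) FROM charge c WHERE c.patient_id = p.id) AS total
FROM patient p
ORDER BY p.id
"""

# Join form: the totals are computed once per patient in a derived table,
# then matched to patients. LEFT JOIN keeps patients with no charges.
joined_sql = """
SELECT p.name, t.total
FROM patient p
LEFT JOIN (SELECT patient_id, SUM(amount) AS total
           FROM charge
           GROUP BY patient_id) t
       ON t.patient_id = p.id
ORDER BY p.id
"""

correlated_rows = conn.execute(correlated_sql).fetchall()
joined_rows = conn.execute(joined_sql).fetchall()

print(correlated_rows)
print(joined_rows)
```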