## Preface

At 8 p.m. last Friday, our production database suddenly slowed down: order queries jumped from 100ms to 5,000ms. An emergency investigation traced the problem to a handful of queries running without indexes.

That incident drove home how much database optimization matters. This article summarizes what we learned.
## 1. Identifying Slow Queries

### 1.1 Enable the slow query log

```sql
-- MySQL configuration
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;  -- log queries taking longer than 1 second
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- Check the slow query log settings
SHOW VARIABLES LIKE 'slow_query%';
```

### 1.2 Analyze the slow query log

```bash
# Analyze the slow query log with pt-query-digest
pt-query-digest /var/log/mysql/slow.log > slow_query_report.txt

# Review the report
cat slow_query_report.txt
# Sample output:
# Count: 50  time=5.23s  Lock=0s  Rows_sent=100  Rows_examined=1000000
# SELECT * FROM orders WHERE created_at > ?
```

### 1.3 Inspect query plans with EXPLAIN

```sql
-- Show the execution plan (order_id is the primary key here)
EXPLAIN SELECT * FROM orders WHERE order_id = 123;

-- Sample output:
-- id | select_type | table  | type  | possible_keys | key     | rows | Extra
-- 1  | SIMPLE      | orders | const | PRIMARY       | PRIMARY | 1    |

-- Key columns:
-- type:  worst to best: ALL (full scan) < index < range < ref < eq_ref < const
-- rows:  estimated rows examined; fewer is better
-- Extra: Using index is good; Using filesort or Using temporary usually
--        signals something worth optimizing
```

## 2. Index Optimization

### 2.1 Single-column indexes

```sql
-- Symptom: a slow query
EXPLAIN SELECT * FROM orders WHERE user_id = 123;
-- rows: 500000 (full table scan)

-- Fix: add an index
CREATE INDEX idx_user_id ON orders(user_id);

-- Verify
EXPLAIN SELECT * FROM orders WHERE user_id = 123;
-- rows: 100 (far fewer rows examined)
```

### 2.2 Composite indexes

```sql
-- Query: WHERE user_id = 123 AND status = 'pending'
EXPLAIN SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';
-- rows: 50000 (still examining many rows)

-- Add a composite index
CREATE INDEX idx_user_status ON orders(user_id, status);

-- Verify
EXPLAIN SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';
-- rows: 10 (much better)
```

### 2.3 Covering indexes

```sql
-- The query needs only three columns, so a covering index can answer
-- it entirely from the index, with no lookup back to the table
CREATE INDEX idx_user_cover ON orders(user_id, status, created_at);

EXPLAIN SELECT user_id, status, created_at FROM orders WHERE user_id = 123;
-- Extra: Using index (served entirely from the index, no table access)
```

### 2.4 Index best practices

```sql
-- ❌ Bad: applying a function to the indexed column
SELECT * FROM orders WHERE YEAR(created_at) = 2024;
-- The index on created_at cannot be used

-- ✅ Good: use a range instead
SELECT * FROM orders
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
-- The index can be used

-- ❌ Bad: the indexed column appears in an expression
SELECT * FROM orders WHERE amount * 2 > 1000;

-- ✅ Good: move the arithmetic to the constant side
SELECT * FROM orders WHERE amount > 500;

-- ❌ Bad: a leading wildcard defeats the index
SELECT * FROM users WHERE name LIKE '%张%';

-- ✅ Good: a prefix match can use the index
SELECT * FROM users WHERE name LIKE '张%';
```

## 3. Query Optimization

### 3.1 Avoid SELECT *

```sql
-- ❌ Bad: returns every column
SELECT * FROM orders;  -- 30 columns, heavy network transfer

-- ✅ Good: return only the columns you need
SELECT order_id, user_id, total_amount FROM orders;
```

### 3.2 JOIN optimization

```sql
-- ❌ Inefficient: joining four tables over a large date range
SELECT * FROM orders o
JOIN users u ON o.user_id = u.user_id
JOIN products p ON o.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE o.created_at > '2024-01-01';
-- May touch millions of rows across four tables

-- ✅ Better: shrink the working set first, then join
-- (ORDER BY added so LIMIT picks a deterministic set of rows)
SELECT o.*, u.name FROM (
    SELECT order_id, user_id FROM orders
    WHERE created_at > '2024-01-01'
    ORDER BY created_at
    LIMIT 1000
) o
JOIN users u ON o.user_id = u.user_id;
```

### 3.3 Subquery optimization

```sql
-- ❌ Can be inefficient, especially on older MySQL versions, where
-- the subquery may be re-evaluated for each row of orders
SELECT * FROM orders o
WHERE user_id IN (
    SELECT user_id FROM users WHERE vip = 1
);

-- ✅ Better: rewrite as a JOIN
SELECT o.* FROM orders o
INNER JOIN users u ON o.user_id = u.user_id
WHERE u.vip = 1;
```

### 3.4 Aggregate queries

```sql
-- ❌ Inefficient: aggregates the entire table, then filters
SELECT user_id, COUNT(*) AS order_count FROM orders
GROUP BY user_id
HAVING order_count > 10;

-- ✅ Better: filter first (here, to the last 30 days), then aggregate
SELECT user_id, COUNT(*) AS order_count FROM orders
WHERE created_at > DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY user_id
HAVING COUNT(*) > 10;
```

## 4. Pagination

### 4.1 The deep pagination problem

```sql
-- ❌ Inefficient: deep OFFSET
SELECT * FROM orders ORDER BY created_at DESC
LIMIT 999990, 10;  -- scans and discards 999,990 rows

-- ✅ Option 1: deferred join; page through the index first,
-- then fetch the full rows for just those ten ids
SELECT o.* FROM orders o
JOIN (
    SELECT order_id FROM orders
    ORDER BY created_at DESC
    LIMIT 999990, 10
) t ON o.order_id = t.order_id;

-- ✅ Option 2: keyset (seek) pagination, the most robust choice.
-- Pass the last order_id of the previous page instead of an offset:
SELECT * FROM orders
WHERE order_id < ?  -- last order_id seen on the previous page
ORDER BY order_id DESC
LIMIT 10;
```

### 4.2 Cursor pagination

```python
# Cursor-based pagination: fetch one extra row to detect
# whether another page exists
def fetch_orders(cursor=None, limit=10):
    if cursor is None:
        query = "SELECT * FROM orders ORDER BY order_id DESC LIMIT ?"
        params = [limit + 1]
    else:
        query = ("SELECT * FROM orders WHERE order_id < ? "
                 "ORDER BY order_id DESC LIMIT ?")
        params = [cursor, limit + 1]

    results = db.execute(query, params).fetchall()
    has_more = len(results) > limit
    orders = results[:limit]
    next_cursor = orders[-1]['order_id'] if orders else None

    return {
        'orders': orders,
        'has_more': has_more,
        'next_cursor': next_cursor,
    }
```

## 5. Caching

### 5.1 Query caching

```python
import json

import redis

redis_client = redis.Redis(host='localhost', port=6379, db=0)

def get_user_orders(user_id, cache_ttl=3600):
    cache_key = f"user_orders:{user_id}"

    # Try the cache first
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)

    # Cache miss: hit the database
    orders = db.query(
        "SELECT * FROM orders WHERE user_id = ? "
        "ORDER BY created_at DESC LIMIT 100",
        [user_id]
    )

    # Populate the cache with a TTL
    redis_client.setex(cache_key, cache_ttl, json.dumps(orders))
    return orders
```

### 5.2 Cache invalidation

```python
# When a new order is written, invalidate the related cache entries
def create_order(user_id, order_data):
    order = db.insert('orders', order_data)

    # Invalidate the user's order cache
    redis_client.delete(f"user_orders:{user_id}")
    # Invalidate the user's stats cache
    redis_client.delete(f"user_stats:{user_id}")

    return order
```

## 6. Sharding

### 6.1 Horizontal sharding

```python
# Route orders to one of 10 tables by user_id modulo
def get_order_table(user_id):
    table_index = user_id % 10
    return f"orders_{table_index}"

def insert_order(user_id, order_data):
    table = get_order_table(user_id)
    db.insert(table, order_data)

def query_orders(user_id):
    table = get_order_table(user_id)
    return db.query(f"SELECT * FROM {table} WHERE user_id = ?", [user_id])
```

### 6.2 Shard table DDL

```sql
-- Create 10 order tables
CREATE TABLE orders_0 (
    order_id BIGINT PRIMARY KEY,
    user_id BIGINT NOT NULL,
    total_amount DECIMAL(10, 2),
    created_at TIMESTAMP,
    INDEX idx_user_id (user_id)
) ENGINE=InnoDB;
-- Repeat for orders_1 through orders_9
```

## 7. Batch Operations

### 7.1 Batch inserts

```sql
-- ❌ Inefficient: one row per statement
INSERT INTO products (name, price) VALUES ('Product 1', 99.99);
INSERT INTO products (name, price) VALUES ('Product 2', 199.99);
INSERT INTO products (name, price) VALUES ('Product 3', 299.99);
-- Three network round trips

-- ✅ Better: a single multi-row insert
INSERT INTO products (name, price) VALUES
('Product 1', 99.99),
('Product 2', 199.99),
('Product 3', 299.99);
-- One network round trip
```

### 7.2 Batch updates

```python
# Update order statuses in batches so no single statement
# holds locks for too long
def update_orders_status(order_ids, status):
    batch_size = 1000
    for i in range(0, len(order_ids), batch_size):
        batch = order_ids[i:i + batch_size]
        placeholders = ','.join('?' * len(batch))
        db.execute(
            f"UPDATE orders SET status = ? WHERE order_id IN ({placeholders})",
            [status] + batch
        )
```

## 8. Locks and Transactions

### 8.1 Avoid long transactions

```python
# ❌ Bad: the transaction wraps a slow external call
def process_order(order_id):
    db.begin_transaction()
    order = db.query("SELECT * FROM orders WHERE order_id = ?", [order_id])
    # External API call (possibly slow) while the transaction is open
    payment_result = call_payment_api(order.total_amount)
    db.update("orders", {"status": "paid"}, {"order_id": order_id})
    db.commit()

# ✅ Better: read and call the external API outside the transaction
def process_order(order_id):
    order = db.query("SELECT * FROM orders WHERE order_id = ?", [order_id])
    payment_result = call_payment_api(order.total_amount)

    # Only the minimal database write runs inside the transaction
    db.begin_transaction()
    db.update("orders", {"status": "paid"}, {"order_id": order_id})
    db.commit()
```

### 8.2 Transaction isolation levels

```sql
-- Check the current isolation level
SHOW VARIABLES LIKE 'transaction_isolation';

-- Isolation levels, from weakest to strongest:
-- READ UNCOMMITTED: fastest, weakest guarantees
-- READ COMMITTED:   prevents dirty reads
-- REPEATABLE READ:  MySQL's default; also prevents non-repeatable reads
-- SERIALIZABLE:     strongest guarantees, slowest

-- Note: SET GLOBAL affects new connections only;
-- use SET SESSION for the current connection
SET GLOBAL transaction_isolation = 'READ-COMMITTED';
```

## 9. Monitoring and Alerting

### 9.1 Key metrics

```python
# Instrument database access with Prometheus client metrics
import time

from prometheus_client import Counter, Gauge, Histogram

# Query counter
query_count = Counter(
    'db_queries_total',
    'Total database queries',
    ['query_type']
)

# Query latency
query_duration = Histogram(
    'db_query_duration_seconds',
    'Database query duration',
    buckets=[0.01, 0.05, 0.1, 0.5, 1.0, 5.0]
)

# Connection pool usage
connection_pool_size = Gauge(
    'db_connection_pool_size',
    'Database connection pool size'
)

# Time a query and record the observation
start = time.time()
result = db.query("SELECT * FROM orders WHERE user_id = ?", [123])
query_duration.observe(time.time() - start)
```

### 9.2 Alert rules

```yaml
# Prometheus alert rules
groups:
  - name: database
    rules:
      # More than 10% of queries took longer than 1 second
      - alert: SlowQuery
        expr: >
          (rate(db_query_duration_seconds_count[5m])
           - rate(db_query_duration_seconds_bucket{le="1.0"}[5m]))
          / rate(db_query_duration_seconds_count[5m]) > 0.1
        for: 5m
        annotations:
          summary: "Slow queries detected"

      # Connection pool more than 80% used (assuming a pool of 10)
      - alert: HighConnectionPoolUsage
        expr: db_connection_pool_size / 10 > 0.8
        for: 5m
        annotations:
          summary: "Database connection pool usage is too high"
```

## 10. Sharing Knowledge Across a Global Team

Our team is spread across several time zones, and we hold regular knowledge-sharing sessions on database optimization. Since discussions alternate between English and Chinese, we use Transync AI (同言翻译) for real-time interpretation, so every engineer can accurately follow the details of a complex performance discussion. It has noticeably improved collaboration across the global team.
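One gap worth flagging in the modulo-sharding scheme from section 6: it works well for queries keyed by `user_id`, but any query without a `user_id` must fan out across every shard and merge the results. A minimal, self-contained sketch of that fan-out, using in-memory SQLite tables so it runs anywhere (the schema and helper names are illustrative, not our production code):

```python
# Modulo sharding with a cross-shard fan-out query, sketched on sqlite3.
import sqlite3

SHARD_COUNT = 10
conn = sqlite3.connect(":memory:")

# Create orders_0 .. orders_9, mirroring the DDL in section 6.2
for i in range(SHARD_COUNT):
    conn.execute(
        f"CREATE TABLE orders_{i} ("
        " order_id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " total_amount REAL)"
    )

def get_order_table(user_id: int) -> str:
    """Route a user to a shard by modulo, as in section 6.1."""
    return f"orders_{user_id % SHARD_COUNT}"

def insert_order(order_id: int, user_id: int, amount: float) -> None:
    conn.execute(
        f"INSERT INTO {get_order_table(user_id)} VALUES (?, ?, ?)",
        (order_id, user_id, amount),
    )

def orders_over(amount: float) -> list:
    """A query NOT keyed by user_id must visit every shard and merge."""
    rows = []
    for i in range(SHARD_COUNT):
        rows.extend(conn.execute(
            f"SELECT order_id, user_id, total_amount FROM orders_{i}"
            " WHERE total_amount > ?", (amount,)
        ).fetchall())
    return sorted(rows)  # merge step: re-sort the combined results

insert_order(1, 3, 50.0)
insert_order(2, 13, 500.0)   # lands on the same shard as user 3
insert_order(3, 7, 999.0)
print(orders_over(100.0))    # → [(2, 13, 500.0), (3, 7, 999.0)]
```

Because fan-out costs grow with shard count, it pays to choose the shard key so that your hottest queries stay single-shard.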
## 11. Before and After

| Before | Optimization | After | Improvement |
| --- | --- | --- | --- |
| 2,000ms | Added indexes | 150ms | 93% |
| 150ms | Added caching | 10ms | 93% |
| 1,000ms (deep pagination) | ID-offset (keyset) pagination | 50ms | 95% |
| Full table scan | Query cache | 1ms | 99% |
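The numbers above come from our production workload, so your results will differ; before trusting any optimization, measure it yourself. A minimal before/after timing harness, demonstrated here on an in-memory SQLite table (the schema is illustrative; swap in your own connection and query):

```python
# Time the same query before and after adding an index.
import sqlite3
import time

def time_query(conn, sql, params=(), runs=5):
    """Return the fastest wall-clock time over several runs, in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()
        best = min(best, time.perf_counter() - start)
    return best

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, i % 1000) for i in range(100_000)],
)

# Before: full table scan
before = time_query(conn, "SELECT * FROM orders WHERE user_id = ?", (123,))

# After: indexed lookup
conn.execute("CREATE INDEX idx_user_id ON orders(user_id)")
after = time_query(conn, "SELECT * FROM orders WHERE user_id = ?", (123,))

print(f"before: {before * 1000:.2f}ms  after: {after * 1000:.2f}ms")
```

Taking the best of several runs reduces noise from caches and scheduling; for server databases, also make sure the query cache is off while benchmarking.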
## 12. Optimization Checklist

- Analyze the slow query log and identify the most expensive queries
- Add indexes on columns used in WHERE clauses
- Create covering indexes to avoid lookups back to the table
- Avoid SELECT *; fetch only the columns you need
- Optimize JOINs and subqueries
- Add query caching
- Fix deep pagination
- Batch your inserts and updates
- Keep transactions as small as possible
- Monitor key metrics
- Set up performance alerts

## 13. Recommended Tools

- Percona Monitoring and Management (PMM): database monitoring
- MySQL Workbench: visual administration
- DBeaver: general-purpose database tool
- Sequel Pro: MySQL client

## 14. Closing Thoughts

Database optimization is an ongoing process, not a one-time task: it requires continuous monitoring, analysis, and improvement.
The core loop: find the slow query → analyze the cause → design a fix → apply it → verify the result → keep monitoring.
I hope this article helps your system run faster. Share your own optimization experience in the comments!
Reproduced from Haipo Download (海坡下载). Original title: "Database Query Optimization: A Complete Guide from Slow Queries to Fast Ones".