Linux Commands for Web Server Log Statistics

马哥Linux运维

2024-07-05 17:42

A collection of general-purpose command combinations that I have gathered during Linux operations work for analyzing Apache/Nginx server logs.

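Apache log statistics

# List the 20 IPs with the most requests in the current day's log (sort first, because uniq -c only merges adjacent lines)
[root@lyshark.cnblogs.com httpd]# cut -d- -f 1 access_log | sort | uniq -c | sort -rn | head -n 20
# Count how many distinct IPs visited during the day
[root@lyshark.cnblogs.com httpd]# awk '{print $1}' access_log | sort | uniq | wc -l
# Count the total number of requests for one page
[root@lyshark.cnblogs.com httpd]# grep -c "index\.php" access_log
# Count the requests made by each IP
[root@lyshark.cnblogs.com httpd]# awk '{++S[$1]} END {for (a in S) print a, S[a]}' access_log
# Sort the per-IP request counts in ascending order
[root@lyshark.cnblogs.com httpd]# awk '{++S[$1]} END {for (a in S) print S[a], a}' access_log | sort -n
# Show the pages requested by one IP (dots escaped and a trailing space added so that 192.168.1.20 does not match)
[root@lyshark.cnblogs.com httpd]# grep '^192\.168\.1\.2 ' access_log | awk '{print $1, $7}'
# Count distinct IPs for the day, excluding search engines (rough filter: keeps only User-Agents that start with Mozilla, which some crawlers also send)
[root@lyshark.cnblogs.com httpd]# awk '{print $12, $1}' access_log | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l
# Count distinct IPs during the 03:00 hour on 21/Nov/2019 (match only up to the hour; a full timestamp such as 21/Nov/2019:03:40:26 selects a single second)
[root@lyshark.cnblogs.com httpd]# awk '{print $4, $1}' access_log | grep '21/Nov/2019:03:' | awk '{print $2}' | sort | uniq | wc -l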
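The field numbers used in these commands ($1, $4, $7, $9, $10, $12) only make sense if the log is in the combined format and is split on whitespace. The following is a minimal sketch of what each field holds. The sample line is made up and `show_fields` is a hypothetical helper, so check the output against a line from your own log before relying on the field positions.

```shell
# Hypothetical sample entry in the combined log format (not taken from a real log).
sample='192.168.1.2 - - [21/Nov/2019:03:40:26 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"'

# Print the whitespace-separated fields that the one-liners rely on.
show_fields() {
    printf '%s\n' "$1" | awk '{
        print "ip=" $1
        print "time=" substr($4, 2)   # drop the leading "["
        print "url=" $7
        print "status=" $9
        print "bytes=" $10
        print "agent_start=" $12      # first word of the User-Agent, with its opening quote
    }'
}

show_fields "$sample"
```

If a custom log format adds or removes fields, the numbers shift accordingly, which is why the User-Agent filter has to match a leading double quote in $12.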

Nginx log statistics

# List every IP that appears in the log
[root@lyshark.cnblogs.com httpd]# awk '{print $1}' access_log | sort -n | uniq
# Show the 100 most frequent IPs
[root@lyshark.cnblogs.com httpd]# awk '{print $1}' access_log | sort -n | uniq -c | sort -rn | head -n 100
# Show the IPs with more than 100 requests
[root@lyshark.cnblogs.com httpd]# awk '{print $1}' access_log | sort -n | uniq -c | awk '{if ($1 > 100) print $0}' | sort -rn
# Show the URLs requested by one IP, sorted by frequency
[root@lyshark.cnblogs.com httpd]# grep '^192\.168\.1\.2 ' access_log | awk '{print $7}' | sort | uniq -c | sort -rn | head -n 100
# Page statistics: the 100 most requested pages
[root@lyshark.cnblogs.com httpd]# awk '{print $7}' access_log | sort | uniq -c | sort -rn | head -n 100
# Page statistics: the 100 most requested pages, excluding .php and .py URLs (the filter is applied to the URL field only, with the dots escaped)
[root@lyshark.cnblogs.com httpd]# awk '$7 !~ /\.(php|py)/ {print $7}' access_log | sort | uniq -c | sort -rn | head -n 100
# Page statistics: pages requested more than 100 times
[root@lyshark.cnblogs.com httpd]# cut -d ' ' -f 7 access_log | sort | uniq -c | awk '{if ($1 > 100) print $0}'
# Page statistics: the most requested pages among the last 1000 entries
[root@lyshark.cnblogs.com httpd]# tail -n 1000 access_log | awk '{print $7}' | sort | uniq -c | sort -nr
# Requests per second: the 100 busiest seconds (HH:MM:SS)
[root@lyshark.cnblogs.com httpd]# awk '{print $4}' access_log | cut -c14-21 | sort | uniq -c | sort -nr | head -n 100
# Requests per minute: the 100 busiest minutes (HH:MM)
[root@lyshark.cnblogs.com httpd]# awk '{print $4}' access_log | cut -c14-18 | sort | uniq -c | sort -nr | head -n 100
# Requests per hour: the busiest hours (HH)
[root@lyshark.cnblogs.com httpd]# awk '{print $4}' access_log | cut -c14-15 | sort | uniq -c | sort -nr | head -n 100
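The `cut -c14-...` ranges keep only the time of day, so in a log that spans several days the same clock minute on different days is counted together. A minimal sketch of a per-minute count that keeps the date, assuming the timestamp in field 4 has the form `[dd/Mon/yyyy:HH:MM:SS`; the sample lines and the `per_minute` helper are hypothetical.

```shell
# Hypothetical sample: two requests in one minute on 21 Nov, one at the same clock time on 22 Nov.
sample_log='10.0.0.1 - - [21/Nov/2019:03:40:26 +0800] "GET /a HTTP/1.1" 200 10 "-" "Mozilla/5.0"
10.0.0.2 - - [21/Nov/2019:03:40:51 +0800] "GET /b HTTP/1.1" 200 20 "-" "Mozilla/5.0"
10.0.0.1 - - [22/Nov/2019:03:40:02 +0800] "GET /a HTTP/1.1" 200 10 "-" "Mozilla/5.0"'

# Read log lines on stdin and print "count dd/Mon/yyyy:HH:MM", busiest minute first.
# Characters 2-18 of field 4 cover the date, hour and minute, so each day is counted separately.
per_minute() {
    awk '{print $4}' | cut -c2-18 | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

printf '%s\n' "$sample_log" | per_minute
```

On a real file the same helper can be fed with `per_minute < access_log`. Changing the range to `-c2-15` would give per-hour counts with the date retained.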

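Web service status statistics

# Crawlers: list the IPs used by Googlebot and Baiduspider
[root@lyshark.cnblogs.com httpd]# grep -E 'Googlebot|Baiduspider' access_log | awk '{print $1}' | sort | uniq
# Browsers: count the User-Agent strings that match none of the common browsers (assumes the combined log format, where the User-Agent is the sixth quote-delimited field)
[root@lyshark.cnblogs.com httpd]# awk -F'"' '{print $6}' access_log | grep -v -E 'MSIE|Firefox|Chrome|Opera|Safari|Gecko|Maxthon' | sort | uniq -c | sort -rn | head -n 100
# Subnets: request counts per /24 network
[root@lyshark.cnblogs.com httpd]# awk '{print $1}' access_log | awk -F'.' '{print $1"."$2"."$3".0"}' | sort | uniq -c | sort -rn | head -n 200
# Domains: request counts per host (assumes a log format whose second field is the host name; in the default combined format this field is "-")
[root@lyshark.cnblogs.com httpd]# awk '{print $2}' access_log | sort | uniq -c | sort -rn | more
# HTTP status codes
[root@lyshark.cnblogs.com httpd]# awk '{print $9}' access_log | sort | uniq -c | sort -rn | more
# URL request counts
[root@lyshark.cnblogs.com httpd]# awk '{print $7}' access_log | sort | uniq -c | sort -rn | more
# Request counts for URLs that carry query parameters (a bare ? at the start of an extended regex is not portable, so a bracket expression is used)
[root@lyshark.cnblogs.com httpd]# awk '{print $7}' access_log | grep '[?&]' | sort | uniq -c | sort -rn | more
# Traffic per file: bytes sent for each URL, all responses
[root@lyshark.cnblogs.com httpd]# awk '{sum[$7]+=$10} END {for (i in sum) print sum[i], i}' access_log | sort -rn | more
# Traffic per file: bytes sent for each URL, 200 responses only
[root@lyshark.cnblogs.com httpd]# grep ' 200 ' access_log | awk '{sum[$7]+=$10} END {for (i in sum) print sum[i], i}' | sort -rn | more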
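Selecting 200 responses with grep matches the text 200 wherever it occurs in the line, including the byte count of a non-200 response. A minimal sketch that tests the status field itself instead, assuming the combined log format (status in $9, bytes in $10); the sample lines and the `bytes_by_url` helper are hypothetical.

```shell
# Hypothetical sample: the 404 entry has a body of exactly 200 bytes,
# so a text search for 200 would wrongly include it.
sample_log='10.0.0.1 - - [21/Nov/2019:03:40:26 +0800] "GET /a.zip HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
10.0.0.2 - - [21/Nov/2019:03:41:00 +0800] "GET /a.zip HTTP/1.1" 200 500 "-" "Mozilla/5.0"
10.0.0.3 - - [21/Nov/2019:03:42:00 +0800] "GET /b.html HTTP/1.1" 404 200 "-" "Mozilla/5.0"
10.0.0.3 - - [21/Nov/2019:03:43:00 +0800] "GET /c.css HTTP/1.1" 304 - "-" "Mozilla/5.0"'

# Read log lines on stdin and print "bytes url" for responses whose status field is 200,
# largest total first.
bytes_by_url() {
    awk '$9 == 200 { sum[$7] += $10 } END { for (u in sum) print sum[u], u }' | sort -rn
}

printf '%s\n' "$sample_log" | bytes_by_url
```

With this sample only `/a.zip` is reported, with the two 200 responses summed; the 404 and 304 entries are skipped.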

Count statistics

# Count the requests for one page
[root@lyshark.cnblogs.com httpd]# grep -c "/index\.php" log_file
# Count the requests made by each IP
[root@lyshark.cnblogs.com httpd]# awk '{++S[$1]} END {for (a in S) print a, S[a]}' log_file
# Sort the per-IP request counts in ascending order
[root@lyshark.cnblogs.com httpd]# awk '{++S[$1]} END {for (a in S) print S[a], a}' log_file | sort -n
# Show the pages requested by one IP
[root@lyshark.cnblogs.com httpd]# grep '^111\.111\.111\.111 ' log_file | awk '{print $1, $7}'
# Count distinct IPs for the day, excluding search engines (keeps only User-Agents that start with Mozilla)
[root@lyshark.cnblogs.com httpd]# awk '{print $12, $1}' log_file | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l
# Count distinct IPs during the 14:00 hour on 21 June 2018
[root@lyshark.cnblogs.com httpd]# awk '{print $4, $1}' log_file | grep '21/Jun/2018:14' | awk '{print $2}' | sort | uniq | wc -l
# Crawlers
[root@lyshark.cnblogs.com httpd]# grep -E 'Googlebot|Baiduspider' /www/logs/access.2019-02-23.log | awk '{print $1}' | sort | uniq
# Browsers: count the User-Agent strings that match none of the common browsers (assumes the combined log format)
[root@lyshark.cnblogs.com httpd]# awk -F'"' '{print $6}' /www/logs/access.2019-02-23.log | grep -v -E 'MSIE|Firefox|Chrome|Opera|Safari|Gecko|Maxthon' | sort | uniq -c | sort -rn | head -n 100
# IP statistics: the 10 busiest IPs on one day
[root@lyshark.cnblogs.com httpd]# grep '23/May/2019' /www/logs/access.2019-02-23.log | awk '{print $1}' | sort | uniq -c | sort -rn | head -n 10
   2206 219.136.134.13
   1497 182.34.15.248
   1431 211.140.143.100
   1431 119.145.149.106
   1427 61.183.15.179
   1427 218.6.8.189
   1422 124.232.150.171
   1421 106.187.47.224
   1420 61.160.220.252
   1418 114.80.201.18
# Subnets: request counts per /24 network
[root@lyshark.cnblogs.com httpd]# awk '{print $1}' /www/logs/access.2019-02-23.log | awk -F'.' '{print $1"."$2"."$3".0"}' | sort | uniq -c | sort -rn | head -n 200
# Domains (assumes a log format whose second field is the host name)
[root@lyshark.cnblogs.com httpd]# awk '{print $2}' /www/logs/access.2019-02-23.log | sort | uniq -c | sort -rn | more
# HTTP status codes
[root@lyshark.cnblogs.com httpd]# awk '{print $9}' /www/logs/access.2019-02-23.log | sort | uniq -c | sort -rn | more
5056585 304
1125579 200
   7602 400
      5 301
# URL statistics
[root@lyshark.cnblogs.com httpd]# awk '{print $7}' /www/logs/access.2019-02-23.log | sort | uniq -c | sort -rn | more
# Traffic per file: bytes sent for each URL, all responses
[root@lyshark.cnblogs.com httpd]# awk '{sum[$7]+=$10} END {for (i in sum) print sum[i], i}' /www/logs/access.2019-02-23.log | sort -rn | more
# Traffic per file: bytes sent for each URL, 200 responses only
[root@lyshark.cnblogs.com httpd]# grep ' 200 ' /www/logs/access.2019-02-23.log | awk '{sum[$7]+=$10} END {for (i in sum) print sum[i], i}' | sort -rn | more
# Request counts for URLs that carry query parameters
[root@lyshark.cnblogs.com httpd]# awk '{print $7}' /www/logs/access.2019-02-23.log | grep '[?&]' | sort | uniq -c | sort -rn | more
# Find the slowest scripts (assumes the request time is logged as the last field, after the quoted User-Agent; the second awk must read the pipe, so it takes no file argument)
[root@lyshark.cnblogs.com httpd]# grep -v '0$' /www/logs/access.2019-02-23.log | awk -F '" ' '{print $4" "$1}' | awk '{print $1" "$8}' | sort -n -k 1 -r | uniq > /tmp/slow_url.txt
# IP and URL extraction: follow the log and print the IP and URL of each request for one page (--line-buffered is a GNU grep option that keeps the output live inside a pipe)
[root@lyshark.cnblogs.com httpd]# tail -f /www/logs/access.2019-02-23.log | grep --line-buffered '/test.html' | awk '{print $1" "$7}'
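Most of the commands in this article follow one pattern: extract a field, then run `sort | uniq -c | sort -rn | head`. A minimal sketch that wraps the pattern in a helper, assuming whitespace-separated fields in the combined log format; `top_n` and the sample lines are hypothetical and not part of the original article.

```shell
# Hypothetical sample log lines.
sample_log='10.0.0.1 - - [23/Feb/2019:10:00:01 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.1 - - [23/Feb/2019:10:00:02 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.1 - - [23/Feb/2019:10:00:03 +0800] "GET /about.html HTTP/1.1" 404 128 "-" "Mozilla/5.0"
10.0.0.2 - - [23/Feb/2019:10:00:04 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"'

# top_n FIELD [N]: read log lines on stdin, count the values of whitespace field FIELD,
# and print the N most frequent ones as "count value" (N defaults to 10).
top_n() {
    field=$1
    limit=${2:-10}
    awk -v f="$field" '{print $f}' | sort | uniq -c | sort -rn | head -n "$limit" | awk '{print $1, $2}'
}

printf '%s\n' "$sample_log" | top_n 1      # busiest IPs
printf '%s\n' "$sample_log" | top_n 7 1    # single most requested URL
printf '%s\n' "$sample_log" | top_n 9      # status code distribution
```

Against a real file the calls become, for example, `top_n 1 100 < access_log`. The order of values with equal counts is left to `sort`, so treat ties as unordered.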

Source: https://www.cnblogs.com/LyShark/p/12500145.html

(Copyright belongs to the original author; this copy will be removed on request.)

