How should I analyze 10 GB of nginx logs per day?

yu34po 2013-12-06 11:28:17
As the title says: I need to work out the number of requests per IP and what each IP requested.
Ericz 2013-12-18
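10 GB isn't that large. A script can handle it easily; it may be a bit slow, but it is easier to maintain later on. If you really need speed, you may have to use C. What exactly are your requirements? The principle is much the same either way: scan the file once and tally by IP.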
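The single-pass idea described here could be sketched in Python roughly as follows. This is a minimal sketch rather than code from the thread: the function name `count_requests_per_ip` is hypothetical, and it assumes the client IP is the first space-delimited field of every line, as in the sample log posted in this thread.

```python
import collections


def count_requests_per_ip(lines):
    """Count requests per client IP in a single pass.

    Assumption: the IP is the first space-delimited field of each line.
    `lines` can be any iterable of strings, e.g. an open file object.
    """
    counts = collections.Counter()
    for line in lines:
        ip = line.split(' ', 1)[0].strip()
        if ip:  # skip blank lines
            counts[ip] += 1
    return counts
```

For example, passing an open log file and calling `.most_common(10)` on the result would list the ten busiest IPs. Only the counter is held in memory, not the log itself.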
Vegertar 2013-12-14
Correction to line 29 of the script in my previous reply: _hash_file_map[out_file_name] = of should be _hash_file_map[iphash] = of
Vegertar 2013-12-14
For a 10 GB log, all you need to do is hash the lines into small files and analyze those.

Python 2.7.5 (default, Aug 25 2013, 00:04:04) 
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import simple
>>> simple.preprocess('/tmp/log')
>>> simple.query('61.4.184.92')
INFO:root:61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101091101&type=observe&date=201312042349&appid=7c1429&key=BUyFU0GyXhzGNDNVpMQaortggDQ= HTTP/1.1" 200 76 "-" "Dalvik/1.6.0 (Linux; U; Android 4.1.1; MI 2S MIUI/JLB23.0)" -

INFO:root:61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101181502&type=observe&date=201212042358&appid=7c1429&key=d6MVaHYcjds8O69Fd48hw4JNQxc= HTTP/1.1" 200 76 "-" "Dalvik/1.6.0 (Linux; U; Android 4.2.2; Philips T3500 Build/JDQ39)" -

INFO:root:61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101120201&type=forecast3h&date=201312042349&appid=f63d32&key=93MlJYwsegk7wcPTH77nL%2Fe9uRg%3D HTTP/1.1" 200 4470 "-" "SAMSUNG-Android" -

INFO:root:61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101050101&type=forecast3h&date=201312042349&appid=f63d32&key=PXAV3jDeBp%2Bz5SdaAmtpeKQl1xk%3D HTTP/1.1" 200 4614 "-" "SAMSUNG-Android" -

INFO:root:61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101280601&type=observe&date=201312042349&appid=f63d32&key=oL0wNOM41qqbS2LZePZRMBqwELs%3D HTTP/1.1" 200 298 "-" "SAMSUNG-Android" -

INFO:root:61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101181001&type=forecast&date=201312042349&appid=f63d32&key=rn59yR4fNs7%2FgmSydlOkzZMRXH4%3D HTTP/1.1" 200 1165 "-" "SAMSUNG-Android" -

INFO:root:The requested ip(61.4.184.92) has 6 record(s)
>>> simple.query('192.168.1.100')
ERROR:root:There is no such ip in history: 192.168.1.100


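#!/usr/bin/env python2.7
#
# simple.py
#
# Split a large access log into hash buckets keyed by client IP, so that a
# later per-IP query only has to scan one small bucket file.

import logging
import socket
import struct

logging.basicConfig(level=logging.DEBUG)

_hash_bucket_count = 1021                 # number of bucket files
_hash_file_pattern = '/tmp/log.hash.%d'   # bucket file path, by bucket id
_hash_file_map = {}                       # bucket id -> open file object

def _ip2int(ip):
    # Dotted-quad IPv4 string -> unsigned 32-bit integer (network byte order).
    return struct.unpack('!I', socket.inet_aton(ip))[0]

def _iphash(ip):
    return _ip2int(ip) % _hash_bucket_count

def _bucket_file(iphash, mode):
    # Open each bucket at most once and cache it under its bucket id
    # (iphash), as the author's follow-up correction specifies.
    of = _hash_file_map.get(iphash)
    if of is None:
        of = _hash_file_map[iphash] = open(_hash_file_pattern % iphash, mode)
    return of

def preprocess(log):
    # NOTE: up to 1021 bucket files stay open at once, so the process's
    # open-file limit (ulimit -n) may need to be raised.
    with open(log) as f:
        for line in f:
            ip = line[:line.find(' ')]
            of = _bucket_file(_iphash(ip), 'a+')
            of.seek(0, 2)   # always append at the end of the bucket
            of.write(line)

def query(ip):
    try:
        # A missing bucket file raises IOError: no IP with this hash was seen.
        of = _bucket_file(_iphash(ip), 'r')
        of.seek(0)
        count = 0
        for line in of:
            # Several IPs share a bucket, so compare the full address.
            if line[:line.find(' ')] == ip:
                count += 1
                logging.info(line)

        logging.info('The requested ip(%s) has %d record(s)' % (ip, count))
    except IOError:
        logging.error('There is no such ip in history: ' + ip)
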

masterz 2013-12-14
Building an index in a database is quite slow.
nullw 2013-12-13
I would still load it into a database and build an index; doing the analysis in C++ or something similar would be fast.
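The database-plus-index approach could look something like the sketch below. It uses SQLite from the Python standard library purely for illustration; the thread does not say which database was meant. The table name `access`, the index name `idx_access_ip`, and the function names are all hypothetical, and the IP is assumed to be the first space-delimited field of each line.

```python
import sqlite3


def load_log(conn, lines):
    """Insert one (ip, raw line) row per log line, then index the ip column."""
    conn.execute('CREATE TABLE IF NOT EXISTS access (ip TEXT, line TEXT)')
    rows = ((line.split(' ', 1)[0], line.rstrip('\n'))
            for line in lines if line.strip())
    with conn:  # commit the bulk insert and the index as one transaction
        conn.executemany('INSERT INTO access (ip, line) VALUES (?, ?)', rows)
        # Building the index after the bulk insert is usually cheaper than
        # maintaining it row by row during the load.
        conn.execute('CREATE INDEX IF NOT EXISTS idx_access_ip ON access (ip)')


def count_per_ip(conn):
    """Return (ip, request count) pairs, busiest IP first."""
    return conn.execute(
        'SELECT ip, COUNT(*) FROM access '
        'GROUP BY ip ORDER BY COUNT(*) DESC, ip').fetchall()


def lines_for_ip(conn, ip):
    """Return the raw log lines recorded for one IP, in insertion order."""
    cursor = conn.execute(
        'SELECT line FROM access WHERE ip = ? ORDER BY rowid', (ip,))
    return [row[0] for row in cursor]
```

With a connection from `sqlite3.connect(...)`, `load_log` would run once per day and the two query functions would then answer both parts of the question. Whether the load time is acceptable at 10 GB would have to be measured.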
masterz 2013-12-13
Write a C++ program to analyze it. A single pass over the file is enough.
yu34po 2013-12-10
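Quoting reply #6 by qq120848369:
Put it into Hadoop first, then write a mapper and a reducer.
Does this really need Hadoop? Wouldn't it be a hassle to set up? I don't need real-time analysis; running it once a day is enough.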
jkjium 2013-12-10
Wouldn't it be easier to handle the counts and the content separately?
Counts: awk '{print $1}' log.txt | sort | uniq -c
Content: grep for each IP based on the results above.
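The second step, pulling out what a given IP requested, could be sketched in Python as below. This is only an illustration: `requests_for_ip` is a hypothetical name, and the regular expression assumes log lines shaped like the sample posted in this thread, where the first quoted field is the request line.

```python
import collections
import re

# Matches the quoted request field, e.g. "GET /data/?areaid=... HTTP/1.1".
# Group 1 is the request target (path plus query string).
_REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) [^"]*"')


def requests_for_ip(lines, ip):
    """Return a Counter mapping request target -> hit count for one IP."""
    targets = collections.Counter()
    prefix = ip + ' '  # trailing space so 1.2.3.4 does not match 1.2.3.40
    for line in lines:
        if not line.startswith(prefix):
            continue
        match = _REQUEST_RE.search(line)
        if match:
            targets[match.group(1)] += 1
    return targets
```

This rescans the whole log for each IP, just as a grep per IP would, so it suits a handful of interesting IPs better than all of them.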
ljc007 2013-12-06
Post a chunk of the log so I can test my code.
qq120848369 2013-12-06
Put it into Hadoop first, then write a mapper and a reducer.
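For the per-IP count, a mapper and reducer in the style of Hadoop Streaming (scripts that read lines and emit tab-separated key/value records) might look like the sketch below. It is an assumption about how such a job could be written, not the poster's code. The functions are written as plain generators so they can be tried locally without a cluster, and the IP is assumed to be the first space-delimited field of each line.

```python
import itertools


def mapper(lines):
    """Map step: emit one 'ip<TAB>1' record per log line."""
    for line in lines:
        ip = line.split(' ', 1)[0].strip()
        if ip:
            yield '%s\t1' % ip


def reducer(sorted_records):
    """Reduce step: sum the counts of consecutive records sharing a key.

    Like a streaming reducer, this relies on its input being sorted by
    key, which the framework's shuffle phase normally provides.
    """
    pairs = (record.rstrip('\n').split('\t', 1) for record in sorted_records)
    for ip, group in itertools.groupby(pairs, key=lambda pair: pair[0]):
        yield '%s\t%d' % (ip, sum(int(count) for _, count in group))
```

Locally, `reducer(sorted(mapper(log_lines)))` imitates the map, shuffle, and reduce phases on a small sample, which is a cheap way to check the logic before deciding whether a cluster is worth setting up.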
yu34po 2013-12-06
Quoting reply #3 by ljc007:
[ljc007]$ awk '{a[$1]++}END{for(i in a)print i,a[i]}' urfile
61.4.184.93 10
61.4.184.90 6
61.4.184.91 7
61.4.184.92 7
Could you test how long this code takes to run?
I ran it on 1.4 GB of data for 10 minutes and got no result.
WO浣熊OW 2013-12-06
Quoting reply #3 by ljc007:
[ljc007]$ awk '{a[$1]++}END{for(i in a)print i,a[i]}' urfile
61.4.184.93 10
61.4.184.90 6
61.4.184.91 7
61.4.184.92 7
Could you test how long this code takes to run?
You know too much!
ljc007 2013-12-06
[ljc007]$ awk '{a[$1]++}END{for(i in a)print i,a[i]}' urfile
61.4.184.93 10
61.4.184.90 6
61.4.184.91 7
61.4.184.92 7
Could you test how long this code takes to run?
yu34po 2013-12-06
61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101091101&type=observe&date=201312042349&appid=7c1429&key=BUyFU0GyXhzGNDNVpMQaortggDQ= HTTP/1.1" 200 76 "-" "Dalvik/1.6.0 (Linux; U; Android 4.1.1; MI 2S MIUI/JLB23.0)" -
61.4.184.91 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101280601&type=observe&date=201301012336&appid=7c1429&key=oMqeris3J3IZ3CHkbOKd06X5NYg= HTTP/1.1" 200 77 "-" "Dalvik/1.4.0 (Linux; U; Android 4.0; US900G Build/GRK39F)" -
61.4.184.93 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101090101&type=observe&date=201312042349&appid=f63d32&key=bheUQV3tGn2xWQ9irl%2B37J2Vkjs%3D HTTP/1.1" 200 303 "-" "SAMSUNG-Android" -
61.4.184.93 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101100805&type=observe&date=201312042348&appid=7c1429&key=qggG1p4SesvT0DU3dZDPLTaVwCs= HTTP/1.1" 200 76 "-" "Dalvik/1.6.0 (Linux; U; Android 4.0.4; GT-S7562 Build/IMM76I)" -
61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101181502&type=observe&date=201212042358&appid=7c1429&key=d6MVaHYcjds8O69Fd48hw4JNQxc= HTTP/1.1" 200 76 "-" "Dalvik/1.6.0 (Linux; U; Android 4.2.2; Philips T3500 Build/JDQ39)" -
61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101120201&type=forecast3h&date=201312042349&appid=f63d32&key=93MlJYwsegk7wcPTH77nL%2Fe9uRg%3D HTTP/1.1" 200 4470 "-" "SAMSUNG-Android" -
61.4.184.91 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101070801&type=forecast&date=201312042349&appid=f63d32&key=UG569rKKIvtoXUhUH2KKAff7WtU%3D HTTP/1.1" 200 1165 "-" "SAMSUNG-Android" -
61.4.184.90 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101110101&type=forecast&date=201312042349&appid=f63d32&key=Lu7X6OrSpspE26sp4ReHcKeT2Uo%3D HTTP/1.1" 200 1158 "-" "SAMSUNG-Android" -
61.4.184.90 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101210101&type=observe&date=200001040934&appid=7c1429&key=N6Rsf8HWIQquODZD6UV1nqDxAq8= HTTP/1.1" 200 77 "-" "Dalvik/1.6.0 (Linux; U; Android 4.1.1; N70DC-S Build/JRO03H)" -
61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101050101&type=forecast3h&date=201312042349&appid=f63d32&key=PXAV3jDeBp%2Bz5SdaAmtpeKQl1xk%3D HTTP/1.1" 200 4614 "-" "SAMSUNG-Android" -
61.4.184.90 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101150901&type=forecast&date=201312042349&appid=f63d32&key=1A%2B%2B%2FzK3Y81MsJtk%2FQz1FWewpV8%3D HTTP/1.1" 200 1178 "-" "SAMSUNG-Android" -
61.4.184.93 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101110807&type=forecast3h&date=201312042349&appid=f63d32&key=yLwpn77%2BMQUxG88U%2Bw9DnGzHoXU%3D HTTP/1.1" 200 4487 "-" "SAMSUNG-Android" -
61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101280601&type=observe&date=201312042349&appid=f63d32&key=oL0wNOM41qqbS2LZePZRMBqwELs%3D HTTP/1.1" 200 298 "-" "SAMSUNG-Android" -
61.4.184.91 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101110101&type=observe&date=201312042349&appid=f63d32&key=KIGS0Mrd%2BtxGAw2QxffBSPLudgM%3D HTTP/1.1" 200 297 "-" "SAMSUNG-Android" -
61.4.184.90 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101050901&type=observe&date=201311032350&appid=7c1429&key=qsp5O1PiXAiL7dbOgzr6czkbl1Q= HTTP/1.1" 200 77 "-" "Dalvik/1.6.0 (Linux; U; Android 4.1.1; MI 2A MIUI/JLB20.0)" 10.172.19.85
61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101181001&type=forecast&date=201312042349&appid=f63d32&key=rn59yR4fNs7%2FgmSydlOkzZMRXH4%3D HTTP/1.1" 200 1165 "-" "SAMSUNG-Android" -
61.4.184.93 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101250101&type=temp&date=201312042349&appid=f63d32&key=lruvnBBc6YicdB2I4rVbWzxq9ks%3D HTTP/1.1" 200 26 "-" "SAMSUNG-Android" -
61.4.184.93 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101210101&type=observe&date=201312042349&appid=f63d32&key=Q52d%2FOp%2F6sKinR9A0PqnMrH7ZQ8%3D HTTP/1.1" 200 298 "-" "SAMSUNG-Android" 10.128.165.209
61.4.184.93 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101191201&type=forecast&date=201312042349&appid=f63d32&key=D%2FvpXPLqv8w9d4B3Ai0T6eTNGec%3D HTTP/1.1" 200 1164 "-" "SAMSUNG-Android" -
61.4.184.91 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101020100&type=forecast&date=201312042349&appid=f63d32&key=LiZ1%2FjwdXPFbcw5LQ4%2Bloycoynk%3D HTTP/1.1" 200 1166 "-" "SAMSUNG-Android" -
61.4.184.91 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101240601&type=observe&date=201212042349&appid=7c1429&key=lfXLCmrdscNVlGn88KLFv6M/fSw= HTTP/1.1" 200 76 "-" "Dalvik/1.6.0 (Linux; U; Android 4.0.4; HYUNDAI T20 Build/IMM76D)" -
61.4.184.93 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101260201&type=observe&date=201312042349&appid=f63d32&key=ggmOFVKXOY0loNHLXT4g%2BRyeHUY%3D HTTP/1.1" 200 298 "-" "SAMSUNG-Android" -
61.4.184.93 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101131001&type=observe&date=201301091725&appid=7c1429&key=sepcW9rh6bW6G6k3dYkAMdRi2Rc= HTTP/1.1" 200 77 "-" "Dalvik/1.6.0 (Linux; U; Android 4.0.4; ZTE U817 Build/IMM76D)" -
61.4.184.93 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101280301&type=observe&date=201312042349&appid=5b9529&key=5RpBSBt3UoRvEFDOhEtNTxY63Ag%3D HTTP/1.1" 200 48 "-" "Jakarta Commons-HttpClient/3.1-rc1" -
61.4.184.93 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101210901&type=observe&date=201312042349&appid=f63d32&key=0SN0sdlHUCOUvrk3zjZ5KqueMPA%3D HTTP/1.1" 200 298 "-" "SAMSUNG-Android" -
61.4.184.91 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101270507&type=observe&date=201312042348&appid=7c1429&key=7rb44bA5Ou581EHpr+yu43v1sMw= HTTP/1.1" 200 77 "-" "Dalvik/1.4.0 (Linux; U; Android 2.3.4; SHW-M110S Build/GINGERBREAD)" -
61.4.184.90 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101280301&type=forecast3d&date=201312042349&appid=5b9529&key=iKV%2BWpVEX5QXMIJh8%2FBytd3iVfQ%3D HTTP/1.1" 200 562 "-" "Jakarta Commons-HttpClient/3.1-rc1" -
61.4.184.92 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101040100&type=forecast3h&date=201312042350&appid=f63d32&key=tMADH88lt5KmNbeGaJNbRQNTWx8%3D HTTP/1.1" 200 4619 "-" "SAMSUNG-Android" -
61.4.184.90 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101110101&type=all&date=201312041137&appid=f63d32&key=hVTLiUIZERsEtnmw5i4HD7Er5K4%3D HTTP/1.0" 200 5899 "-" "-" -
61.4.184.91 - - [05/Dec/2013:00:10:05 +0800] "GET /data/?areaid=101090301&type=all&date=201312041146&appid=f63d32&key=bO30zDwAE5bLc%2BcEVt80q%2BfkuNw%3D HTTP/1.0" 200 6507 "-" "-" -
Quoting reply #1 by ljc007:
Post a chunk of the log so I can test my code.
