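xunsearch server dies every night and works again after a restart; logs attached
2012-02-21, 09:23
The xunsearch server dies every night and works again after I restart xunsearch.
When I restart xunsearch, stopping the service is extremely slow. Please help me take a look; thank you very much.
I am attaching the logs so everyone can have a look.
I have already confirmed that this is not a Linux system problem.

The failure happens around the middle of the night, and I restarted the xunsearch service at about 8 in the morning.

[url=http://www.jpgo5.com/log.rar]Log download[/url]
I could not attach files in a reply to the original thread, so I am starting a new one.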
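2012-02-21, 13:08
RE: xunsearch server dies every night and works again after a restart; logs attached
When you stop the service, is it searchd or indexd that is very slow?

Are these logs from the time of the hang? In indexd.log I see 20 concurrent updates. Why so many? Could that be optimized further? That said, the log shows nothing abnormal.

The search process shows nothing unusual either.

Next time it fails, please use ps and top to find the process using the most resources, then run strace -p <pid> to see which system calls that process is making.

We have not yet run into the situation you describe ourselves.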
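
The ps / top / strace procedure suggested above could be scripted roughly as follows. This is only a sketch: the helper name busiest_pid is made up for illustration, and it assumes a Linux procps-style ps that accepts -eo pid,pcpu,comm.

```shell
# Print the PID of the busiest process whose command name equals $1.
# Expects "PID %CPU COMMAND" lines on stdin, as printed by `ps -eo pid,pcpu,comm`.
# (busiest_pid is a hypothetical helper name, not part of xunsearch.)
busiest_pid() {
    awk -v name="$1" '
        $3 == name && ($2 + 0 > max || pid == "") { max = $2 + 0; pid = $1 }
        END { if (pid != "") print pid }
    '
}

# Intended use on the affected server (run as root, not executed here):
#   pid=$(ps -eo pid,pcpu,comm | busiest_pid xs-searchd)
#   [ -n "$pid" ] && strace -p "$pid"
```

The filter reads ps output from stdin instead of calling ps itself, so it can also be run over a ps listing saved while the server was hung.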
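2012-02-21, 15:44
RE: xunsearch server dies every night and works again after a restart; logs attached
What is a suitable value for the concurrency?
I run a scheduled task that checks the database for new data every 10 minutes. If there is new data, I submit it to the index with
index_put($index), 20 records per submission. If I set the number lower and, say, 30 records are updated within ten minutes, the index cannot keep up. Is there a better solution?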
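2012-02-22, 09:30 (last edited: 2012-02-22 09:42 by jatwxf)
RE: xunsearch server dies every night and works again after a restart; logs attached
I turned off index submission yesterday, so nothing is being indexed any more, but xunsearch had still died when I got up the next day. When I restarted it, stopping xs-searchd was very slow: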
[root@edu ~]# /usr/local/xunsearch/bin/xs-ctl.sh -b inet restart
INFO: stopping server[xs-indexd] (BIND:8383) .... [OK]
INFO: re-starting server[xs-indexd] ... (BIND:8383)
INFO: stopping server[xs-searchd] (BIND:8384) .......................................... [OK]
INFO: re-starting server[xs-searchd] ... (BIND:8384)


[url=http://www.jpgo5.com/indexd.rar]Log download[/url]
2012-02-22, 11:33
RE: xunsearch server dies every night and works again after a restart; logs attached
Try strace -p <PID of xs-searchd> and see what it shows.
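2012-02-22, 13:47
RE: xunsearch server dies every night and works again after a restart; logs attached
I will check with this method the next time it dies. For now, though, top shows three xs-searchd processes.
Is that normal?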
2012-02-22, 14:58
RE: xunsearch server dies every night and works again after a restart; logs attached
Three processes is normal.
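2012-02-22, 22:51
RE: xunsearch server dies every night and works again after a restart; logs attached
It went down again. Here are the strace results for the three processes: strace on one of them just hangs with no further output, and another reports Resource temporarily unavailable.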


14179 ? 00:00:00 xs-searchd
14186 ? 00:00:01 xs-searchd
14193 ? 00:00:00 xs-searchd
14212 pts/0 00:00:00 ps
31168 ? 00:00:47 xs-indexd
31171 ? 00:00:00 xs-searchd
[root@edu ~]# strace -p 14179
Process 14179 attached - interrupt to quit
clock_gettime(CLOCK_MONOTONIC, {223647, 727467159}) = 0
epoll_wait(1, {}, 32, 2184) = 0
clock_gettime(CLOCK_MONOTONIC, {223649, 926081228}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
time(NULL) = 1329922500
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223655, 160187730}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223655, 829184099}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223658, 48058012}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, {sa_family=AF_INET, sin_port=htons(43270), sin_addr=inet_addr("110.76.46.90")}, [16]) = 197
setsockopt(197, SOL_TCP, TCP_NODELAY, [1], 4) = 0
fcntl64(197, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
epoll_ctl(1, EPOLL_CTL_ADD, 197, {EPOLLIN, {u32=197, u64=197}}) = 0
time(NULL) = 1329922508
fcntl64(3, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
write(3, "2012-02-22 22:55:08 worker2[1417"..., 88) = 88
fcntl64(3, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0}) = 0
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=197, u64=197}}}, 32, 5000) = 1
clock_gettime(CLOCK_MONOTONIC, {223658, 67575022}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 197, {EPOLLIN, {u32=197, u64=197}}) = 0
recv(197, "\1\0\0\0\5\0\0\0005ucom", 1024, 0) = 13
send(197, "\200\0\311\0\0\0\0\0", 8, 0) = 8
epoll_ctl(1, EPOLL_CTL_ADD, 197, {EPOLLIN, {u32=197, u64=197}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=197, u64=197}}}, 32, 5000) = 1
clock_gettime(CLOCK_MONOTONIC, {223658, 109047343}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 197, {EPOLLIN, {u32=197, u64=197}}) = 0
recv(197, "\340\0\0\0\0\0\0\0\301\36\377\0\0\0\0\0\302\0\4\0\0\0\0\0\302\0\10\0\0\0\0\0"..., 1024, 0) = 92
time(NULL) = 1329922508
fcntl64(3, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
write(3, "2012-02-22 22:55:08 worker2[1417"..., 95) = 95
fcntl64(3, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0}) = 0
futex(0x8063c48, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x8063c44, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 19940) = 1
clock_gettime(CLOCK_MONOTONIC, {223660, 369822960}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223663, 899869714}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223666, 151361467}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223679, 719415602}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223681, 208007966}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223685, 790094004}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223688, 304220879}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223688, 470347622}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223688, 523424213}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223692, 788326842}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, {sa_family=AF_INET, sin_port=htons(43314), sin_addr=inet_addr("110.76.46.90")}, [16]) = 207
setsockopt(207, SOL_TCP, TCP_NODELAY, [1], 4) = 0
fcntl64(207, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
epoll_ctl(1, EPOLL_CTL_ADD, 207, {EPOLLIN, {u32=207, u64=207}}) = 0
time(NULL) = 1329922543
fcntl64(3, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
write(3, "2012-02-22 22:55:43 worker2[1417"..., 88) = 88
fcntl64(3, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0}) = 0
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=207, u64=207}}}, 32, 5000) = 1
clock_gettime(CLOCK_MONOTONIC, {223692, 789968112}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 207, {EPOLLIN, {u32=207, u64=207}}) = 0
recv(207, "\1\0\0\0\5\0\0\0005ucom", 1024, 0) = 13
send(207, "\200\0\311\0\0\0\0\0", 8, 0) = 8
epoll_ctl(1, EPOLL_CTL_ADD, 207, {EPOLLIN, {u32=207, u64=207}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=207, u64=207}}}, 32, 5000) = 1
clock_gettime(CLOCK_MONOTONIC, {223692, 825766805}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 207, {EPOLLIN, {u32=207, u64=207}}) = 0
recv(207, "\340\0\0\0\0\0\0\0\301\36\377\0\0\0\0\0\302\0\4\0\0\0\0\0\302\0\10\0\0\0\0\0"..., 1024, 0) = 95
time(NULL) = 1329922543
fcntl64(3, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
write(3, "2012-02-22 22:55:43 worker2[1417"..., 95) = 95
fcntl64(3, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0}) = 0
futex(0x8063c48, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x8063c44, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 19963) = 1
clock_gettime(CLOCK_MONOTONIC, {223699, 150267754}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1,

[root@edu ~]# strace -p 14186



Process 14186 attached - interrupt to quit
clock_gettime(CLOCK_MONOTONIC, {223725, 964574149}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 15542) = 1
clock_gettime(CLOCK_MONOTONIC, {223726, 854266500}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223728, 756611555}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223730, 636334176}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223732, 804635168}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, {{EPOLLIN, {u32=0, u64=0}}}, 32, 20000) = 1
clock_gettime(CLOCK_MONOTONIC, {223734, 554057117}) = 0
epoll_ctl(1, EPOLL_CTL_DEL, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
accept(0, 0xbfe5c418, [16]) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(1, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
epoll_wait(1, <unfinished ...>


[root@edu ~]# strace -p 14193    (when I trace this process, nothing further is printed)
Process 14193 attached - interrupt to quit
futex(0x8063c48, FUTEX_WAIT_PRIVATE, 18, NULL
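
A trace that stops inside a futex wait, like the one for 14193, can also be detected automatically in a saved capture. The following is a rough sketch, not a definitive check: ends_in_futex_wait is a hypothetical helper name, and it assumes the capture was written with strace -o <file> -p <pid> and then interrupted.

```shell
# Succeed (exit status 0) if the last non-empty line of a strace capture on
# stdin is a futex FUTEX_WAIT call that never returned, i.e. has no ") = N".
# (ends_in_futex_wait is a hypothetical helper name.)
ends_in_futex_wait() {
    awk '
        NF { last = $0 }
        END { exit !(last ~ /futex\(.*FUTEX_WAIT/ && last !~ /\) *= /) }
    '
}

# Intended use (not executed here):
#   strace -o trace.txt -p 14193    # interrupt with Ctrl-C after a while
#   ends_in_futex_wait < trace.txt && echo "blocked in futex wait"
```

A worker that is merely idle ends its trace in epoll_wait instead, as 14179 and 14186 do, so the check fails for those captures.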
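2012-02-23, 13:29
RE: xunsearch server dies every night and works again after a restart; logs attached
futex(0x8063c48, FUTEX_WAIT_PRIVATE, 18, NULL

The trace stops here because a deadlock has occurred. This problem does show up occasionally. Was the number of concurrent requests fairly high at the time? I suggest trying the following change and observing for a day or two.
Edit line 79 of bin/xs-ctl.sh and add the part shown in red, which forces xs-searchd to start only one worker process:

bin/xs-searchd -l tmp/searchd.log[color=red] -n 1[/color] -b $bsearch -k $cmd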
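
For anyone who prefers not to edit the file by hand, the same change could be sketched with sed. This assumes the unmodified line reads exactly as above without the red part; force_single_worker is a made-up helper name, and the result goes to stdout so it can be reviewed before replacing the real script.

```shell
# Print the script at path $1 with " -n 1" inserted after "-l tmp/searchd.log"
# on xs-searchd start lines that do not already contain it.
# (force_single_worker is a hypothetical helper; the original line layout is assumed.)
force_single_worker() {
    sed '/xs-searchd -l tmp\/searchd\.log/{/ -n 1 /!s/tmp\/searchd\.log/& -n 1/;}' "$1"
}

# Intended use (not executed here):
#   cp bin/xs-ctl.sh bin/xs-ctl.sh.bak
#   force_single_worker bin/xs-ctl.sh.bak > bin/xs-ctl.sh
```

Lines that already carry -n 1 are left alone, so running it twice does not add the flag a second time.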