Chinaunix

HELP! Lots of stack traces in the log

#1 | Posted 2013-07-19 08:51
The log contains a lot of stack traces. Could someone help me figure out what is going on? See below.
04:00000:00585:2013/07/19 02:36:57.22 kernel  Current process (0x58cd0083) infected with signal 11 (SIGSEGV)
04:00000:00585:2013/07/19 02:36:57.22 kernel  Address 0x00000001000105bc (mem_pageallocate+0x7c), siginfo (code, address) = (50, 0x00cd593800cd5939)
04:00000:00585:2013/07/19 02:36:57.22 kernel  ************************************
04:00000:00585:2013/07/19 02:36:57.22 kernel  curdb = 0 tempdb = 0 pstat = 0x10000
04:00000:00585:2013/07/19 02:36:57.22 kernel  lasterror = 0 preverror = 0 transtate = 1
04:00000:00585:2013/07/19 02:36:57.22 kernel  curcmd = 0 program =
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000100281668 pcstkwalk+0x84()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000100281ee4 ucstkgentrace+0x238()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000100280680 ucbacktrace+0xe4()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x00000001003bd4c4 terminate_process__fdpr_3+0x938()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000100b67bd4 kisignal+0x1bc()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x00000001000105bc mem_pageallocate+0x7c()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000100020270 memcreate+0x6c()
04:00000:00585:2013/07/19 02:36:57.22 kernel  [Handler pc: 0x0000000100270450 hdl_backout installed by the following function:-]
04:00000:00585:2013/07/19 02:36:57.22 kernel  [Handler pc: 0x00000001004b9150 ut_handle installed by the following function:-]
04:00000:00585:2013/07/19 02:36:57.22 kernel  [Handler pc: 0x00000001004b9150 ut_handle installed by the following function:-]
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x000000010015ea90 conn_hdlr__fdpr_2+0x21c()
04:00000:00585:2013/07/19 02:36:57.22 kernel  end of stack trace, spid 585, kpid 1489830019, suid 0
04:00000:00585:2013/07/19 02:36:57.22 server  The SQL Server is terminating this process.
04:00000:00585:2013/07/19 02:36:57.22 kernel  Current process (0x58cd0083) infected with signal 4 (SIGILL)
04:00000:00585:2013/07/19 02:36:57.22 kernel  Address 0x0000000000000000 (), siginfo (code, address) = (30, 0x0000000000000000)
04:00000:00585:2013/07/19 02:36:57.22 kernel  ************************************
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000100281668 pcstkwalk+0x84()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000100281ee4 ucstkgentrace+0x238()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000100280680 ucbacktrace+0xe4()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x00000001003bcca4 terminate_process__fdpr_3+0x118()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000100b67bd4 kisignal+0x1bc()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000000000000 ()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x00000001003bd54c terminate_process__fdpr_3+0x9c0()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000100b67bd4 kisignal+0x1bc()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x00000001000105bc mem_pageallocate+0x7c()
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x0000000100020270 memcreate+0x6c()
04:00000:00585:2013/07/19 02:36:57.22 kernel  [Handler pc: 0x0000000100270450 hdl_backout installed by the following function:-]
04:00000:00585:2013/07/19 02:36:57.22 kernel  [Handler pc: 0x00000001004b9150 ut_handle installed by the following function:-]
04:00000:00585:2013/07/19 02:36:57.22 kernel  [Handler pc: 0x00000001004b9150 ut_handle installed by the following function:-]
04:00000:00585:2013/07/19 02:36:57.22 kernel  pc: 0x000000010015ea90 conn_hdlr__fdpr_2+0x21c()
04:00000:00585:2013/07/19 02:36:57.22 kernel  end of stack trace, spid 585, kpid 1489830019, suid 0
04:00000:01153:2013/07/19 02:36:59.99 kernel  Current process (0x58ce009a) infected with signal 11 (SIGSEGV)
04:00000:01153:2013/07/19 02:36:59.99 kernel  Address 0x00000001000105bc (mem_pageallocate+0x7c), siginfo (code, address) = (50, 0x00cd593800cd5939)
04:00000:01153:2013/07/19 02:36:59.99 kernel  ************************************
04:00000:01153:2013/07/19 02:36:59.99 kernel  curdb = 0 tempdb = 0 pstat = 0x10000
04:00000:01153:2013/07/19 02:36:59.99 kernel  lasterror = 0 preverror = 0 transtate = 1
04:00000:01153:2013/07/19 02:36:59.99 kernel  curcmd = 0 program =
04:00000:01153:2013/07/19 02:36:59.99 kernel  pc: 0x0000000100281668 pcstkwalk+0x84()
04:00000:01153:2013/07/19 02:36:59.99 kernel  pc: 0x0000000100281ee4 ucstkgentrace+0x238()
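
With many traces in one log, it can help to tally which signal hit which spid and where it faulted. The following is a minimal sketch, not a vendor tool. It assumes the log prefix is `engine:family:spid:timestamp` (the names `engine` and `family` are my assumption; the third field does match the `spid 585` printed on the "end of stack trace" line). The helper name `summarize` is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
# 04:00000:00585:2013/07/19 02:36:57.22 kernel  Current process (0x58cd0083) infected with signal 11 (SIGSEGV)
# The hex value in parentheses is the kpid (0x58cd0083 == 1489830019).
INFECTED = re.compile(
    r"^(?P<engine>\d+):(?P<family>\d+):(?P<spid>\d+):"
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{2}) kernel\s+"
    r"Current process \((?P<kpid>0x[0-9a-f]+)\) infected with signal "
    r"(?P<signo>\d+) \((?P<signame>\w+)\)"
)
# Matches the faulting-address line that follows an "infected" line.
ADDRESS = re.compile(r"kernel\s+Address (0x[0-9a-f]+) \((?P<where>[^)]*)\)")


def summarize(lines):
    """Count (spid, signal name, faulting function+offset) triples."""
    counts = Counter()
    pending = None  # last "infected" match still waiting for its Address line
    for line in lines:
        m = INFECTED.match(line)
        if m:
            pending = (int(m["spid"]), m["signame"])
            continue
        a = ADDRESS.search(line)
        if a and pending:
            # An empty "()" means the log did not resolve a symbol.
            counts[pending + (a["where"] or "<unknown>",)] += 1
            pending = None
    return counts


# Lines copied from the log in this post.
sample = """\
04:00000:00585:2013/07/19 02:36:57.22 kernel  Current process (0x58cd0083) infected with signal 11 (SIGSEGV)
04:00000:00585:2013/07/19 02:36:57.22 kernel  Address 0x00000001000105bc (mem_pageallocate+0x7c), siginfo (code, address) = (50, 0x00cd593800cd5939)
04:00000:00585:2013/07/19 02:36:57.22 kernel  Current process (0x58cd0083) infected with signal 4 (SIGILL)
04:00000:00585:2013/07/19 02:36:57.22 kernel  Address 0x0000000000000000 (), siginfo (code, address) = (30, 0x0000000000000000)
04:00000:01153:2013/07/19 02:36:59.99 kernel  Current process (0x58ce009a) infected with signal 11 (SIGSEGV)
04:00000:01153:2013/07/19 02:36:59.99 kernel  Address 0x00000001000105bc (mem_pageallocate+0x7c), siginfo (code, address) = (50, 0x00cd593800cd5939)
""".splitlines()

for key, n in summarize(sample).items():
    print(key, n)
```

On the excerpt above, both spids (585 and 1153) take the SIGSEGV at the same place, `mem_pageallocate+0x7c`, and the SIGILL on spid 585 is raised while the first signal is already being handled.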

#2 | Posted 2013-07-19 12:39
What platform and version?
How much memory is configured?
If you have a vendor support contract, ask their engineers to resolve it.

#3 | Posted 2013-07-22 09:13
AIX 5.3, ASE 12.5, max memory 28G, data cache 15G, procedure cache 3.5G, stack size 251904 bytes.
Is this a memory configuration problem?
The support contract has expired...

#4 | Posted 2013-07-22 12:51
The vendor no longer supports ASE 12.5 either.
Please post the output of ipcs -a.

#5 | Posted 2013-07-23 09:12
$ ipcs -a
IPC status from /dev/mem as of Tue Jul 23 09:11:44 BEIST 2013
T        ID     KEY        MODE       OWNER    GROUP  CREATOR   CGROUP CBYTES  QNUM QBYTES LSPID LRPID   STIME    RTIME    CTIME
Message Queues:
T        ID     KEY        MODE       OWNER    GROUP  CREATOR   CGROUP NATTCH     SEGSZ  CPID  LPID   ATIME    DTIME    CTIME
Shared Memory:
m  10485760 0x690022b9 --rw-------   sybase   sybase   sybase   sybase      8 35701915648 213982 237802  1:42:53  1:42:53  1:42:19
m   1048577 0x78000011 --rw-rw-rw-     root   system     root   system      1 268435456 230050 295072  9:57:15  9:01:59  9:57:15
m   1048578 0x78000010 --rw-rw-rw-     root   system     root   system      1  16777216 230050 295072  9:57:15  9:01:59  9:57:15
m   5242885 0x690022a1 --rw-------   sybase   sybase   sybase   sybase      6 20387991552 258366 217548  3:42:15  3:42:15  3:41:38
T        ID     KEY        MODE       OWNER    GROUP  CREATOR   CGROUP NSEMS   OTIME    CTIME
Semaphores:
s         1 0x62031656 --ra-r--r--     root   system     root   system     1  9:56:36  9:56:36
s         6 0x0103149c --ra-------     root   system     root   system     1  9:57:32  9:57:32
s  20971531 0x0101c6e8 --ra-ra-ra-     root   system     root   system     1 15:55:37 15:55:26
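
To compare the shared memory segments above with the configured memory, the SEGSZ column can be summed per owner. This is a rough sketch that assumes the column order shown in this post (OWNER is the fifth field, SEGSZ the tenth); the function name `shm_bytes_by_owner` is hypothetical.

```python
GIB = 1024 ** 3


def shm_bytes_by_owner(ipcs_text):
    """Sum SEGSZ per OWNER for the 'm' (shared memory) rows of `ipcs -a`."""
    totals = {}
    for line in ipcs_text.splitlines():
        fields = line.split()
        # Shared-memory rows start with 'm'. With the layout in this post,
        # OWNER is fields[4] and SEGSZ is fields[9] (0-based).
        if len(fields) >= 10 and fields[0] == "m":
            owner, segsz = fields[4], int(fields[9])
            totals[owner] = totals.get(owner, 0) + segsz
    return totals


# Shared-memory rows copied from the ipcs output in this post.
ipcs_sample = """\
m  10485760 0x690022b9 --rw-------   sybase   sybase   sybase   sybase      8 35701915648 213982 237802  1:42:53  1:42:53  1:42:19
m   1048577 0x78000011 --rw-rw-rw-     root   system     root   system      1 268435456 230050 295072  9:57:15  9:01:59  9:57:15
m   1048578 0x78000010 --rw-rw-rw-     root   system     root   system      1  16777216 230050 295072  9:57:15  9:01:59  9:57:15
m   5242885 0x690022a1 --rw-------   sybase   sybase   sybase   sybase      6 20387991552 258366 217548  3:42:15  3:42:15  3:41:38
"""

for owner, total in shm_bytes_by_owner(ipcs_sample).items():
    print(f"{owner}: {total} bytes = {total / GIB:.2f} GiB")
```

The two sybase-owned segments are 33.25 GiB and about 18.99 GiB. They have different keys, so they are listed separately here rather than attributed to a single server.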

#6 | Posted 2013-07-23 09:13
An explanation I found on the Sybase website:

Current process infected
Message text
current process (0x%x) infected with %d
This error may be caused by a hardware problem.

Explanation
Adaptive Server reports this error when it detects a UNIX signal that specifies an error. The values (“%d”) that display in this error message vary by platform and Adaptive Server Enterprise versions; the most common values are 10 and 11.



Current process infected with 10
A value of 10 [SIGBUS] means that the operating system detected an address alignment error or a miscellaneous hardware error (for example, bus timeout).

A timeout can occur when the CPU issues a request across the bus for the contents of a memory location, and that request is not answered within that CPU’s timeout period (usually a few nanoseconds).



Current process infected with 11
A value of 11 [SIGSEGV] means that the operating system detected a segment violation error.

Sometimes this error occurs in conjunction with a stack overflow or data corruption. For more information on stack overflow, refer to the error write-up “Stack guardword corrupted”.

The error message appears in the Adaptive Server error log followed by a stack trace. The “SQL causing error” or the <lasterror> that displays in the Adaptive Server error log may be the underlying cause for this error. But the message can also be just the last data Adaptive Server had in its cache space.

To identify the <lasterror> (except in the cases where the <lasterror> is 0), get the number that Adaptive Server displays in the <lasterror> field from the Adaptive Server error log and consult this manual for more information on the error number.

In the following example, the value for <lasterror> is 614.

00: 94/02/14 11:32:26.02 kernel: current process (0x1fb001d)
infected with 11
00: 94/02/14 11:32:26.07 kernel: Address 0x808a6ef
(closetable+0x2f7), siginfo (code, address) = (2, 0x30)
00: 94/02/14 11:32:26.07 kernel: ************************************
00: 94/02/14 11:32:26.07 kernel: “SQL causing error” : CREATE TRIGGER
00: 94/02/14 11:32:26.07 kernel: curdb = 22 pstat = 0x10018
“lasterror = 614”
Action
1. Try to eliminate the <lasterror>, which may be one of the causes for this error (except when <lasterror> is 0).

2. Rerun the command referenced by the SQL causing error to see if the problem reoccurs.

If the process is infected with 11 and you can reproduce the problem, correct it as follows:

• If the SQL causing error is a compiled object such as a stored procedure, trigger, or view, drop and recreate the object.

• If the SQL causing error is ad hoc rather than a compiled object, moving the data may fix the problem. Use one of these options:

  ◦ Select the table data into a new table, drop the old table, and rename the new table to the old table name;

  or

  ◦ Bulk copy the affected table out, drop and re-create the table, and bulk copy back in. This is the most efficient solution for a large table.

If moving the data corrects the problem, the data may have been corrupt. Be aware that moving corrupted data can lead to a data loss.

Check your hardware error log as this error can be caused by hardware failure as well.
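
The <lasterror> lookup described in the write-up can be sketched as a small filter over error-log lines. This is only an illustration; the function name `last_errors` is hypothetical, and it relies on nothing more than the `lasterror = N` text visible in the logs quoted in this thread.

```python
import re

# "lasterror = N" as it appears in the stack-trace header lines.
LASTERROR = re.compile(r"lasterror = (\d+)")


def last_errors(lines):
    """Return the non-zero <lasterror> values found in error-log lines."""
    return [
        int(m.group(1))
        for line in lines
        for m in LASTERROR.finditer(line)
        if int(m.group(1)) != 0  # 0 means there is no prior error to look up
    ]


log_lines = [
    "“lasterror = 614”",  # from the documentation example
    "04:00000:00585:2013/07/19 02:36:57.22 kernel  lasterror = 0 preverror = 0 transtate = 1",
]
print(last_errors(log_lines))  # [614]
```

In the traces from the first post, lasterror is 0 and curcmd is 0, so this lookup yields no error number to follow up on.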

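#7 | Posted 2013-07-26 10:04
Reply to #2 andkylee
Over the past few days the stack trace has been reported only twice. I ran the procedure that raised the error myself and saw nothing abnormal.
During that period the only activity was a statistics update (a scheduled job), and checktable reported no problems either.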

#8 | Posted 2013-07-26 10:41
Try increasing stack size.

#9 | Posted 2013-07-27 17:50
Reply to #8 zhaopingzi

    Configuration option is not unique.

Parameter Name                 Default     Memory Used Config Value Run Value   Unit                 Type
------------------------------ ----------- ----------- ------------ ----------- -------------------- ----------
esp execution stacksize              65536           0        65536       65536 bytes                static
stack guard size                      4096      #20352        16384       16384 bytes                static
stack size                           88472     #312912       251904      251904 bytes                static
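It is already quite large.
One more question: how are stack size and stack guard size related? I could not follow the English explanation on the official site...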
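
One relationship can be read straight off the table above. The sketch below assumes the Memory Used column is in kilobytes (an assumption on my part, not stated in the post); under that assumption both rows divide evenly into the same number of stacks, which suggests one guard area is allocated per stack. The helper name `implied_stack_count` is hypothetical.

```python
KIB = 1024


def implied_stack_count(memory_used_kib, run_value_bytes):
    """Number of stacks implied by a Memory Used figure and a per-stack size.

    Returns (count, remainder_bytes); a remainder of 0 means an exact fit.
    """
    total_bytes = memory_used_kib * KIB
    return divmod(total_bytes, run_value_bytes)


print(implied_stack_count(312912, 251904))  # stack size row       -> (1272, 0)
print(implied_stack_count(20352, 16384))    # stack guard size row -> (1272, 0)

# Memory per stack if each stack is followed by one guard area.
print(251904 + 16384)  # 268288 bytes
```

So with these settings the server appears to reserve 1272 stacks, each 251904 bytes plus a 16384-byte guard area.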

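#10 | Posted 2013-12-12 14:44
The relationship between stack size and stack guard size
-----
You have seen the pressure gauges in movies, right? Put simply, stack guard size is the red zone on the dial. Does that make sense?

As for your problem: my suggestion is to increase procedure cache size.
Also, 12.5 has reached end of life, so consider upgrading to a newer version.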