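For monitoring purposes, I recently wrote a Tuxedo client that periodically calls a monitoring service. Along the way I ran into several problems. Some are solved and some are still open, which is rather frustrating.
Problem 1: the strtok function
I first wrote the program on Solaris 5.9. One piece of its logic reads a file and parses it line by line, where each line has the form KEY=VALUE. The following program simulates that logic: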
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char service[10];
    char sendbuf[100];
    char *token;
    char line[] = "LOCAL=01JAN10";

    printf("begin\n");
    char *delimter = "=";
    token = strtok(line, "=");
    strtok(line, delimter);
    strcpy(service, line);
    strcpy(service, strtok(line, delimter));
    printf("service is %s\n", service);
    strcpy(sendbuf, strtok(NULL, delimter));   /* the statement that crashes on Linux */
    printf("sendbuf is %s\n", sendbuf);
    return 0;
}
It ran normally there. On a Linux 6.3 platform, however, it recompiled without errors but dumped core at run time, as shown below.
Output:
begin
service is LOCAL
Segmentation fault (core dumped)
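P.S. Background: how to set the path for core files
The first crash did not produce a core file. If you need one, configure it as follows:
cd /proc/sys/kernel
echo "/opt/app/core" > core_pattern
You can add format specifiers to the pattern, such as the process PID:
echo "/opt/app/core-%e-%p-%t" > core_pattern
This produces file names of the form core-<command name>-<pid>-<timestamp>. The available specifiers are: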
%p - insert pid into filename
%u - insert current uid into filename
%g - insert current gid into filename
%s - insert signal that caused the coredump into the filename
%t - insert UNIX time that the coredump occurred into filename
%h - insert hostname where the coredump happened into filename
%e - insert coredumping executable name into filename
In addition, put ulimit -c unlimited in your profile so that the size of core files is not limited.
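If editing the profile is not convenient, a C client can raise its own core-file limit at startup. The following is only a minimal sketch of that idea, not something the original program did: enable_core_dumps is a hypothetical helper name, and it raises the soft limit only as far as the hard limit, because an unprivileged process cannot go beyond that.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit of the current
 * process to its hard limit, roughly what "ulimit -c unlimited" does when
 * the hard limit is already unlimited.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    /* An unprivileged process may raise the soft limit up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling enable_core_dumps() early in main would be the natural place. If the hard limit itself is 0, no core file will be written regardless.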
With a core file available, we can examine it with gdb:
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /DATA/app/tuxapp/monitor/code/one...(no debugging symbols found)...done.
[New Thread 8709]
Missing separate debuginfo for
Try: yum --disablerepo='*' --enablerepo='*-debug*' install /usr/lib/debug/.build-id/8f/cb014d96e04978ad2256ce074e192ed72b7559
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `./one'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000031de934152 in __strcpy_ssse3 () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.166.el6_7.7.x86_64
Locating the fault:
(gdb) bt
#0 0x00000031de934152 in __strcpy_ssse3 () from /lib64/libc.so.6
#1 0x0000000000400693 in main ()
Combined with the application log, this pinpoints strcpy(sendbuf, strtok(NULL, delimter)); as the statement that fails.
Consider the following two versions.
Version 1
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char service[10];
    char sendbuf[100];
    char line[] = "LOCAL=01JAN10";

    memset(service, 0, sizeof(service));
    memset(sendbuf, 0, sizeof(sendbuf));
    strcpy(service, strtok(line, "="));
    strcpy(sendbuf, strtok(NULL, "="));
    printf("[%s][%s]\n", service, sendbuf);
    return 0;
}
Version 2
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char service[10];
    char sendbuf[100];
    char line[] = "LOCAL=01JAN10";

    memset(service, 0, sizeof(service));
    memset(sendbuf, 0, sizeof(sendbuf));
    char *delimter = "=";
    strcpy(service, strtok(line, delimter));
    strcpy(sendbuf, strtok(NULL, delimter));
    printf("[%s][%s]\n", service, sendbuf);
    return 0;
}
Both run correctly:
[LOCAL][01JAN10]
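This is quite puzzling. My temporary workaround is to stop using strtok and parse by character position instead. As for why strtok behaves differently, I would welcome any pointers.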
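One possible reading, offered as a suggestion rather than a confirmed diagnosis: the crashing program calls strtok(line, ...) three times before strtok(NULL, ...), while both working versions call it once. The first call overwrites '=' with '\0', so every later call that starts again from line sees only "LOCAL". The token then runs to the end of the string, and the C standard says the next strtok(NULL, ...) returns a null pointer. Passing that null pointer to strcpy is undefined behavior, so platforms are not obliged to fail in the same way. Why the Solaris run did not crash is not established here. The sketch below replays the call sequence and also shows a position-based parser along the lines of the workaround. The names split_key_value and repeated_strtok_yields_null are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split "KEY=VALUE" at the first '=' without strtok.
 * Copies both parts into caller-supplied buffers, always NUL-terminated.
 * Returns 0 on success, -1 if there is no '=' or a part does not fit. */
int split_key_value(const char *line, char *key, size_t key_size,
                    char *value, size_t value_size)
{
    const char *eq = strchr(line, '=');
    size_t key_len, value_len;

    if (eq == NULL)
        return -1;

    key_len = (size_t)(eq - line);
    value_len = strlen(eq + 1);
    if (key_len >= key_size || value_len >= value_size)
        return -1;

    memcpy(key, line, key_len);
    key[key_len] = '\0';
    memcpy(value, eq + 1, value_len + 1);   /* +1 copies the terminating NUL */
    return 0;
}

/* Replays the strtok call sequence of the crashing program.
 * Returns 1 if the final strtok(NULL, ...) yields NULL, 0 otherwise. */
int repeated_strtok_yields_null(void)
{
    char line[] = "LOCAL=01JAN10";

    strtok(line, "=");   /* overwrites '=' with '\0'; line now reads "LOCAL" */
    strtok(line, "=");   /* restarts on "LOCAL"; the token ends at the NUL */
    strtok(line, "=");   /* same again */
    return strtok(NULL, "=") == NULL;
}
```

Whichever parser is used, checking the result for NULL (or for -1 here) before copying avoids the crash when a line is malformed.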
Problem 2: the Fadd32 function
Because the client uses Tuxedo's FML32 buffers, the original code was:
(void)Fadd32(fbfr,UI_ROW, (char *)&ut.row, (FLDLEN32)0);
This works on Solaris, but on Linux it reports:
Tperrno = 12, TPESYSTEM - internal system error
The ULOG records:
142731.vm-vmw43314-app!?proc.9112.3589154560.0: LIBTUX_CAT:6031: ERROR: Unable to pre-process buffer before tranmission. Error code(12/4183)
142731.vm-vmw43314-app!?proc.9112.3589154560.0: LIBWSC_CAT:1045: ERROR: Presend on message failed
142731.vm-vmw43314-app!?proc.9112.3589154560.0: LIBWSC_CAT:1011: ERROR: tpcall() message send failure
The documentation says the following about these three errors.
1011
ERROR: tpcall() message send failure
Description: An attempt to send a request to the Workstation Handler process failed during a tpcall. This could be a result of the network going down, the Workstation Handler process not running, or the site of the Workstation Handler going down.
Action: Shut the client down and attempt to reconnect. If this fails, contact your BEA TUXEDO system Technical Support.
1045
ERROR: Presend on message failed
Description: An attempt to presend a message failed. This is done in a buffer-type switch function.
Action: Shut the client down and contact your BEA TUXEDO system Technical Support.
6031
ERROR: Unable to pre-process buffer before tranmission. Error code(val/val)
Description: While handling a message before transmission (presend), the system was unable to process the message.
Action: Contact BEA Customer Support.
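None of these descriptions points clearly at the cause.
I asked some experts, who pointed out that the FML32 usage was wrong: the UI_ROW field is a long, whereas ut.row is declared as an int. They suggested writing it as follows: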
long tmp = ut.row;
if (Fadd32(fbfr, UI_ROW, (char *)&tmp, (FLDLEN32)0) < 0)
    printf("Fadd32 failed: %d(%s)\n", Ferror32, Fstrerror32(Ferror32));
The compiler still emits a warning:
note: expected 'char *' but argument is of type 'char (*)[7]'
warning: passing argument 3 of 'Fadd32' from incompatible pointer type
but the program now runs correctly, which was no small feat.
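The reason the temporary long helps can be illustrated without Tuxedo. On 64-bit Linux an int is 4 bytes and a long is 8, so a routine that is handed a char * and reads a whole long from it needs 8 valid bytes at that address. Whether the Solaris build was a 32-bit one, where int and long have the same size, is an assumption that the original investigation does not confirm. In the sketch below, read_long_field is a hypothetical stand-in for such a routine (it is not the Tuxedo API), and add_row is an equally hypothetical wrapper that widens the value first.

```c
#include <string.h>

/* Hypothetical stand-in for a library routine that receives a char * and
 * reads a whole C long from that address, as a long-typed field would. */
long read_long_field(const char *value)
{
    long out;

    memcpy(&out, value, sizeof out);   /* reads sizeof(long) bytes */
    return out;
}

/* Widen the int to a long first, so the address handed on really points
 * at sizeof(long) valid bytes holding the intended value. */
long add_row(int row)
{
    long tmp = row;

    return read_long_field((const char *)&tmp);
}
```

Passing the address of the int directly would make the routine read bytes beyond the int wherever long is wider, which is why the value that reaches the buffer can differ from the one intended.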
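Thanks to all the friends who helped and offered suggestions: 孟哥, 张老, 海哥, 类总 and 秀哥. And of course, if anyone understands the questions left open above, please do let me know!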
If you found this article helpful, feel free to follow and share: bisal的个人杂货铺