Custom netlink protocol on Linux 2.6.25 causes a kernel panic

satanaelzhou 2011-08-21 10:45:31
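I added a custom netlink protocol of my own on top of the Linux 2.6.25 kernel. Kernel space and user space can communicate, but once the exchange has completed the system reports the following errors: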
ipencrypt init netlink
Receive netlink message begin
Receive netlink message payload miss here!
Receive netlink message, skb address:c70411e0, nlh address:c7bcf400
send netlink message begin
send netlink message 1
send netlink message skb address:c70410a0, nlh address:c7bcf200,msg adddress:c7bcf210
send netlink message 2
failure send netlink
Receive netlink message end
Unable to handle kernel paging request for data at address 0x3c31323e
Faulting instruction address: 0xc006bc34
Oops: Kernel access of bad area, sig: 11 [#1]
Freescale MPC8272 ADS
Modules linked in: ipencrypt_netlink
NIP: c006bc34 LR: c006bc08 CTR: c022ff00
REGS: c7bcbba0 TRAP: 0300 Not tainted (2.6.25)
MSR: 00001032 <ME,IR,DR> CR: 28002284 XER: 20000000
DAR: 3c31323e, DSISR: 20000000
TASK = c78477c0[778] 'klogd' THREAD: c7bca000
GPR00: 00000000 c7bcbc50 c78477c0 c03194c8 000004d0 c01c5998 ffffffff c7bcbe48
GPR08: c7bcbddc c0319000 00000000 00000480 48000288 100ac86c 100b0000 1008afb0
GPR16: 100beee8 1007ecb4 c7bcbce8 c0333388 00000000 bf80a728 0000005f c7bcbce8
GPR24: c799b1d8 c799b238 000004d0 c01c5998 000004d0 00009032 3c31323e c03194c8
Call Trace:
[c7bcbc50] [c0018194] (unreliable)
[c7bcbc70] [c01ca414]
[c7bcbc90] [c01c5998]
[c7bcbce0] [c02300f0]
[c7bcbd50] [c01c3970]
[c7bcbe30] [c01c3ce8]
[c7bcbf00] [c01c4674]
[c7bcbf40] [c000fcf8]
--- Exception: c01Instruction dump:
4bfffe2d 2b830010 7c7f1b78 409d003c 7fa000a6 57a0045e 7c000124 83df0074
2f9e0000 419e0078 801f0080 5400103a <7d3e002e> 913f0074 7fa00124 73808000
---[ end trace 8337b812452f0803 ]---
KERNEL: assertion (!atomic_read(&sk->sk_wmem_alloc)) failed at net/unix/af_unix.c (350)
KERNEL: assertion (sk_unhashed(sk)) failed at net/unix/af_unix.c (351)
KERNEL: assertion (!sk->sk_socket) failed at net/unix/af_unix.c (352)
Attempt to release alive unix socket: c799b180
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc01ccc4c
Oops: Kernel access of bad area, sig: 11 [#2]
Freescale MPC8272 ADS
Modules linked in: ipencrypt_netlink
NIP: c01ccc4c LR: c01ccda8 CTR: c0230f78
REGS: c7bbfbb0 TRAP: 0300 Tainted: G D (2.6.25)
MSR: 00001032 <ME,IR,DR> CR: 20022488 XER: 00000000
DAR: 00000000, DSISR: 22000000
TASK = c7b727c0[776] 'syslogd' THREAD: c7bbe000
GPR00: 00000000 c7bbfc60 c7b727c0 c799b000 00000000 c7bbfcc8 c7bbfce8 00000000
GPR08: c7bbfdbc 00000000 00009032 c70411e0 20000482 100ac86c 100b0000 1008afb0
GPR16: bf881778 bf8812f8 bf881378 00000000 c7bbfce8 c799b0b8 c00336dc c7bbfcc8
GPR24: 00000000 c7bbfc68 c7bbfc74 7fffffff c7404b00 c799b064 c799b000 c70411e0
Call Trace:
[c7bbfc60] [00000004] (unreliable)
[c7bbfcc0] [c01ccda8]
[c7bbfce0] [c0230fec]
[c7bbfd30] [c01c3514]
[c7bbfe20] [c01c3800]
[c7bbff00] [c01c4664]
[c7bbff40] [c000fcf8]
--- Exception: c01Instruction dump:
41beff4c 801f0068 54007ffe 90170000 40b2ff18 813d0008 3929ffff 913d0008
817f0000 813f0004 931f0000 931f0004 <91690000> 912b0004 7d400124 419eff18
---[ end trace 8337b812452f0803 ]---
------------[ cut here ]------------
Badness at c0022ca0 [verbose debug info unavailable]
NIP: c0022ca0 LR: c01c55f0 CTR: c01c4708
REGS: c7bbf990 TRAP: 0700 Tainted: G D (2.6.25)
MSR: 00021032 <ME,IR,DR> CR: 20022422 XER: 20000000
TASK = c7b727c0[776] 'syslogd' THREAD: c7bbe000
GPR00: 00000001 c7bbfa40 c7b727c0 c799b000 00000000 00000000 00000000 00000000
GPR08: 00001032 c03243c0 00000020 c7bbe000 80024428 100ac86c 100b0000 1008afb0
GPR16: bf881778 bf8812f8 bf881378 00000000 c7bbfce8 c799b0b8 c00336dc c7bbfcc8
GPR24: 00000000 c7bbfc68 ffffffff 00000000 c7404b00 c7404b0c c7ba1900 c799b000
Call Trace:
[c7bbfa40] [c031da0c] (unreliable)
[c7bbfa50] [c01c55f0]
[c7bbfa90] [c01c29d0]
[c7bbfac0] [c01c472c]
[c7bbfad0] [c006fff8]
[c7bbfaf0] [c006ca78]
[c7bbfb10] [c001f1a4]
[c7bbfb30] [c00207a8]
[c7bbfb70] [c000dd38]
[c7bbfb90] [c0012198]
[c7bbfba0] [c0010198]
--- Exception: 300[c7bbfc60] [00000004] (unreliable)
[c7bbfcc0] [c01ccda8]
[c7bbfce0] [c0230fec]
[c7bbfd30] [c01c3514]
[c7bbfe20] [c01c3800]
[c7bbff00] [c01c4664]
[c7bbff40] [c000fcf8]
--- Exception: c01Instruction dump:
80010014 83e1000c 38210010 7c0803a6 4e800020 4bfe3535 4bffffdc 3d20c032
392943c0 80090138 7c000034 5400d97e <0f000000> 2f800000 419eff94 38000001
Unable to handle kernel paging request for data at address 0x00000004
Faulting instruction address: 0xc01c7fcc
Oops: Kernel access of bad area, sig: 11 [#3]
Freescale MPC8272 ADS
Modules linked in: ipencrypt_netlink
NIP: c01c7fcc LR: c0230a20 CTR: c0230b4c
REGS: c7bbf9c0 TRAP: 0300 Tainted: G D (2.6.25)
MSR: 00001032 <ME,IR,DR> CR: 20028428 XER: 20000000
DAR: 00000004, DSISR: 22000000
TASK = c7b727c0[776] 'syslogd' THREAD: c7bbe000
GPR00: 00000000 c7bbfa70 c7b727c0 c799b064 00000001 00000000 00000000 00000000
GPR08: 00001032 ffffffe3 c70411e0 00000000 20024428 100ac86c 100b0000 1008afb0
GPR16: bf881778 bf8812f8 bf881378 00000000 c7bbfce8 c799b0b8 c00336dc c7bbfcc8
GPR24: 00000000 c7417d80 00000007 00000003 c7ba1080 00000000 c799b064 c799b000
Call Trace:
[c7bbfa70] [c02309c4] (unreliable)
[c7bbfaa0] [c01c3eb8]
[c7bbfac0] [c01c4734]
[c7bbfad0] [c006fff8]
[c7bbfaf0] [c006ca78]
[c7bbfb10] [c001f1a4]
[c7bbfb30] [c00207a8]
[c7bbfb70] [c000dd38]
[c7bbfb90] [c0012198]
[c7bbfba0] [c0010198]
--- Exception: 300[c7bbfc60] [00000004] (unreliable)
[c7bbfcc0] [c01ccda8]
[c7bbfce0] [c0230fec]
[c7bbfd30] [c01c3514]
[c7bbfe20] [c01c3800]
[c7bbff00] [c01c4664]
[c7bbff40] [c000fcf8]
--- Exception: c01Instruction dump:
7d0000a6 5500045e 7c000124 81430000 7f8a1800 419e0034 81230008 38000000
816a0000 3929ffff 91230008 91630000 <906b0004> 900a0000 900a0004 7d000124
---[ end trace 8337b812452f0803 ]---
Fixing recursive fault but reboot is needed!
Unable to handle kernel paging request for data at address 0x4011332c
Faulting instruction address: 0xc007b7a8
Oops: Kernel access of bad area, sig: 11 [#4]
Freescale MPC8272 ADS
Modules linked in: ipencrypt_netlink
NIP: c007b7a8 LR: c0076044 CTR: c001a5e4
REGS: c7199d10 TRAP: 0300 Tainted: G D (2.6.25)
MSR: 00009032 <EE,ME,IR,DR> CR: 44000424 XER: 00000000
DAR: 4011332c, DSISR: 20000000
TASK = c7009ba0[902] 'ps' THREAD: c7198000
GPR00: c0076044 c7199dc0 c7009ba0 4011332c 0000001d 00020001 00000002 41c23977
GPR08: 00000004 c7bcf3f4 c7198034 c7bcf400 24000424 100ac86c 100b0000 1008afb0
GPR16: c0304bb4 c743452c c025c5a0 c7bcf428 c0320000 c70a2780 c7199e38 fffffff2
GPR24: 00000400 c7198000 00000400 00000400 0000ffff 00000000 00020001 4011332c
Call Trace:
[c7199dc0] [c0019834] (unreliable)
[c7199de0] [c0076044]
[c7199e30] [c006ea94]
[c7199ef0] [c006f4fc]
[c7199f10] [c006fb74]
[c7199f40] [c000fcf8]
--- Exception: c01Instruction dump:
8017fffc 90010018 4bffffcc 9421ffe0 7d800026 7c0802a6 bfc10018 7c7f1b79
7cbe2b78 90010024 91810014 41820054 <801f0000> 2e040017 2f804601 41be0014
---[ end trace 8337b812452f0803 ]---
Unable to handle kernel paging request for data at address 0x03de006c
Faulting instruction address: 0xc024d844
Oops: Kernel access of bad area, sig: 11 [#5]
Freescale MPC8272 ADS
Modules linked in: ipencrypt_netlink
NIP: c024d844 LR: c0075e0c CTR: 00000000
REGS: c7bcbcf0 TRAP: 0300 Tainted: G D (2.6.25)
MSR: 00009032 <EE,ME,IR,DR> CR: 24002484 XER: 00000000
DAR: 03de006c, DSISR: 20000000
TASK = c70097c0[903] 'grep' THREAD: c7bca000
GPR00: c7bcbdb4 c7bcbda0 c70097c0 03de006c c7bcbda8 c7b811a0 c001af64 00000000
GPR08: c7bcbdb4 00000000 00000000 c7b81340 000021c6 100ac86c c70a2d80 00000000
GPR16: bfffffff c7bcbe38 00000000 00000000 00000400 00000000 00000000 00000000
GPR24: 0fedb340 00000000 c7bcf400 c7bcbf20 c7bcbf20 c70a2d80 c7bcbda8 c7bcf400
Call Trace:
[c7bcbda0] [c0075dec] (unreliable)
[c7bcbdd0] [c00764a8]
[c7bcbe30] [c006ebcc]
[c7bcbef0] [c006f668]
[c7bcbf10] [c006fae4]
[c7bcbf40] [c000fcf8]
--- Exception: c01Instruction dump:
2f890000 4dbd0020 48000174 7c001828 3000ffff 7c00192d 40a2fff4 2f800000
419c000c 38600000 4e800020 4800026c <7c001828> 3000ffff 7c00192d 40a2fff4
---[ end trace 8337b812452f0803 ]---
Unable to handle kernel paging request for data at address 0x0023ae75
Faulting instruction address: 0xc007b424
Oops: Kernel access of bad area, sig: 11 [#6]
Freescale MPC8272 ADS
Modules linked in: ipencrypt_netlink
NIP: c007b424 LR: c00754cc CTR: c0075ab0
REGS: c7bcbaf0 TRAP: 0300 Tainted: G D (2.6.25)
MSR: 00001032 <ME,IR,DR> CR: 24002424 XER: 20000000
DAR: 0023ae75, DSISR: 20000000
TASK = c70097c0[903] 'grep' THREAD: c7bca000
GPR00: 00001032 c7bcbba0 c70097c0 ffffffff 0023ae69 00000000 c7bcf428 00000000
GPR08: 00009032 c7426500 00000020 00000000 24004422 100ac86c c70a2d80 00000000
GPR16: bfffffff c7bcbe38 00000000 00000000 00000400 00000000 00000000 00000000
GPR24: 0fedb340 ffffffff c743452c 00000000 c70a2d80 ffffffff c7bcf428 c70a2d80
Call Trace:
[c7bcbba0] [c0058d40] (unreliable)
[c7bcbbc0] [c00754cc]
[c7bcbbf0] [c0075ad0]
[c7bcbc10] [c006fff8]
[c7bcbc30] [c006ca78]
[c7bcbc50] [c001f1a4]
[c7bcbc70] [c00207a8]
[c7bcbcb0] [c000dd38]
[c7bcbcd0] [c0012198]
[c7bcbce0] [c0010198]
--- Exception: 300[c7bcbda0] [c0075dec] (unreliable)
[c7bcbdd0] [c00764a8]
[c7bcbe30] [c006ebcc]
[c7bcbef0] [c006f668]
[c7bcbf10] [c006fae4]
[c7bcbf40] [c000fcf8]
--- Exception: c01Instruction dump:
7c9f2378 90010024 7cde3378 91810010 409200e0 39600000 7c0000a6 5400045e
7c000124 809e0000 2f840000 419e0034 <8004000c> 7fc9f378 7f80f800 40be0014
---[ end trace 8337b812452f0803 ]---
Fixing recursive fault but reboot is needed!


satanaelzhou 2011-08-25
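The panic described above has been resolved: it was caused by user space using the same buffer for both sending and receiving.
The problem below is still unresolved. Any ideas would be appreciated.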
ipencrypt init netlink
Receive netlink message begin
Receive netlink message payload miss here!
Receive netlink message, skb address:c70e11e0, nlh address:c7193000
send netlink message begin
send netlink message 1
send netlink message skb address:c70e1500, nlh address:c717b000,msg adddress:c717b010
send netlink message 2
failure send netlink
Receive netlink message end
syslogd: UNIX socket error: Operation not supported
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc01cc240
Oops: Kernel access of bad area, sig: 11 [#1]
Freescale MPC8272 ADS
Modules linked in: ipencrypt_netlink
NIP: c01cc240 LR: c0234d14 CTR: c001c278
REGS: c7bbfd80 TRAP: 0300 Not tainted (2.6.25)
MSR: 00001032 <ME,IR,DR> CR: 82008424 XER: 00000000
DAR: 00000000, DSISR: 20000000
TASK = c7b7c3f0[777] 'syslogd' THREAD: c7bbe000
GPR00: 00000000 c7bbfe30 c7b7c3f0 c79a0064 00000001 daf956f6 00000001 dc447636
GPR08: 00009032 c79a0180 00000000 c79a0170 004c4b40 100ac86c 100b0000 1008afb0
GPR16: bf9e4828 bf9e43a8 bf9e4428 00000008 00000004 00000003 100ada4e 00000001
GPR24: 00000000 c7417f80 00000007 00000003 c7b83d80 00000000 c79a0064 c79a0000
NIP [c01cc240] skb_dequeue+0x20/0x58
LR [c0234d14] unix_release_sock+0xe4/0x210
Call Trace:
[c7bbfe30] [c0234cb8] unix_release_sock+0x88/0x210 (unreliable)
[c7bbfe60] [c01c813c] sock_release+0x30/0x9c
[c7bbfe80] [c01c89b8] sock_close+0x2c/0x68
[c7bbfe90] [c007397c] __fput+0xb4/0x19c
[c7bbfeb0] [c00703fc] filp_close+0x68/0xb0
[c7bbfed0] [c0020e58] put_files_struct+0xf8/0x100
[c7bbfef0] [c002245c] do_exit+0x144/0x63c
[c7bbff30] [c002298c] do_group_exit+0x38/0x98
[c7bbff40] [c000fed0] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe2c7d8
LR = 0xfedd884
Instruction dump:
816b0000 2f8b0000 409effe4 4e800020 7d0000a6 5500045e 7c000124 81430000
7f8a1800 419e0034 81230008 38000000 <816a0000> 3929ffff 91230008 91630000
---[ end trace 20baa49ce2b01ab1 ]---
Fixing recursive fault but reboot is needed!
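The reply above attributes the first panic to user space sharing one buffer between send and receive. The thread does not show the user-space program, so the following is only a minimal sketch of what separate buffers could look like. The names `nl_buf`, `tx_buf`, `rx_buf`, `build_request` and `MAX_PAYLOAD` are hypothetical; the `NLMSG_*` macros and `struct nlmsghdr` come from `<linux/netlink.h>`. The sender's own pid goes into `nlmsg_pid`, because the posted module uses that field as the unicast destination for its reply.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#define MAX_PAYLOAD 128  /* hypothetical upper bound for this sketch */

/* The union keeps the byte storage correctly aligned for the header. */
union nl_buf {
    struct nlmsghdr hdr;
    char bytes[NLMSG_SPACE(MAX_PAYLOAD)];
};

/* Two distinct buffers: one only for sending, one only for receiving. */
union nl_buf tx_buf;
union nl_buf rx_buf;  /* would be handed to recv(); never reused for sending */

/* Fill buf with one netlink message carrying a NUL-terminated string. */
struct nlmsghdr *build_request(union nl_buf *buf, const char *text,
                               unsigned int seq)
{
    size_t len;

    assert(buf != NULL && text != NULL);
    len = strlen(text) + 1;
    assert(len <= MAX_PAYLOAD);

    memset(buf, 0, sizeof(*buf));
    buf->hdr.nlmsg_len = NLMSG_LENGTH(len);       /* header + payload */
    buf->hdr.nlmsg_pid = (unsigned int)getpid();  /* sender's id */
    buf->hdr.nlmsg_seq = seq;
    memcpy(NLMSG_DATA(&buf->hdr), text, len);
    return &buf->hdr;
}
```

A caller would pass `build_request(&tx_buf, "hello", 1)` to its send call and read any reply into `rx_buf.bytes`, so that an incoming message can never overwrite one that is still being sent.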
satanaelzhou 2011-08-25
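Sigh, the Linux section here is still not very lively.
I have solved the problem myself: all of the errors were caused by the goto statement in the receive callback.
Thanks to everyone for taking part!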
satanaelzhou 2011-08-24
Right, the CPU is not an x86 architecture. I am making progress and will post the solution here once it is solved.
lvyinghong 2011-08-23
nlh = NLMSG_PUT(skb, daemon_pid, msg_ctx ? msg_ctx->counter : 0,
msg_type, payload_len);

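The daemon_pid argument here should be the sender's pid, not the receiver's. I think the kernel fills in 0 when it is the sender.

You need to resolve those addresses to function names before the trace becomes readable; as it stands there is no way to tell where it came from. Your CPU also looks unusual.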
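To illustrate what resolving addresses to function names involves, here is a small sketch of a symbol lookup. The table is not a real System.map: its three entries are derived from the symbolized oops posted later in this thread (for example `skb_dequeue+0x20/0x58` at c01cc240 implies a start address of c01cc220 and a size of 0x58). The names `ksym`, `symtab` and `resolve` are hypothetical. On a real system the same information comes from building the kernel with CONFIG_KALLSYMS or from looking the address up in the System.map of the running kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ksym {
    uint32_t start;    /* address of the function's first instruction */
    uint32_t size;     /* length of the function in bytes */
    const char *name;
};

/* Start addresses and sizes back-computed from the symbolized trace. */
const struct ksym symtab[] = {
    { 0xc01c810cu, 0x9c,  "sock_release" },
    { 0xc01cc220u, 0x58,  "skb_dequeue" },
    { 0xc0234c30u, 0x210, "unix_release_sock" },
};

/*
 * Find the symbol whose range contains addr. On success the offset of
 * addr inside that function is stored in *offset. Returns NULL when no
 * entry in the table covers the address.
 */
const struct ksym *resolve(uint32_t addr, uint32_t *offset)
{
    size_t i;

    assert(offset != NULL);
    for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
        if (addr >= symtab[i].start &&
            addr - symtab[i].start < symtab[i].size) {
            *offset = addr - symtab[i].start;
            return &symtab[i];
        }
    }
    return NULL;
}
```

Calling `resolve(0xc01cc240, &off)` finds `skb_dequeue` with an offset of 0x20, matching the `NIP` line of the symbolized oops. Addresses from the first, unsymbolized trace such as c006bc34 are not covered by this three-entry table and return NULL.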
satanaelzhou 2011-08-22
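Can you tell from the printed log where the out-of-bounds access happened? I have posted my code as well. I have gone over it many times and could not find anything wrong with the memory operations.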
帅得不敢出门 2011-08-22
Some memory access has gone out of bounds somewhere.
satanaelzhou 2011-08-21
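Yes, I can now confirm that it is caused by kernel code, because I added a custom netlink handler; the code is in my earlier reply. Do you mean that the crash is caused not by the netlink code I added but by other driver-level code? My current analysis is that something goes wrong with the sk_buff references. Could you help me narrow down the main direction to look in? Thank you.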
satanaelzhou 2011-08-21
It means a kernel panic: the kernel has crashed. What I posted above is the stack information printed after the crash.
[deleted account] 2011-08-21
What is a kernel panic?
satanaelzhou 2011-08-21
My kernel-space code is as follows:

#include <linux/module.h>
#include <linux/skbuff.h>
#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/types.h>
#include <net/sock.h>

#include <linux/netlink.h>
#include <net/net_namespace.h>

//#include <net/ipencrypt_netlink.h>


MODULE_LICENSE("GPL");
MODULE_AUTHOR("123");
MODULE_DESCRIPTION("ip netlink");

struct sock* ipencrypt_nl_sk = NULL;


struct ipencrypt_msg {
    unsigned int index;
    char data[64];
};


struct ipencrypt_msg_ctx {
    unsigned int index;
    unsigned int counter;
};

static int ipencrypt_send_netlink(char *data, int data_len,
                                  struct ipencrypt_msg_ctx *msg_ctx, u16 msg_type,
                                  u16 msg_flags, pid_t daemon_pid)
{
    struct sk_buff *skb;
    struct nlmsghdr *nlh;
    struct ipencrypt_msg *msg = NULL;
    size_t payload_len;
    int rc;

    printk("send netlink message begin\n");

    /* Payload is sized as the message struct plus data_len extra bytes. */
    payload_len = ((data && data_len) ? (sizeof(*msg) + data_len) : 0);
    skb = alloc_skb(NLMSG_SPACE(payload_len), GFP_KERNEL);
    if (!skb) {
        rc = -ENOMEM;
        printk(KERN_ERR "Failed to allocate socket buffer\n");
        goto out;
    }
    /* NLMSG_PUT jumps to nlmsg_failure if the skb has no room left. */
    nlh = NLMSG_PUT(skb, daemon_pid, msg_ctx ? msg_ctx->counter : 0,
                    msg_type, payload_len);
    nlh->nlmsg_flags = msg_flags;
    if (msg_ctx && payload_len) {
        msg = (struct ipencrypt_msg *)NLMSG_DATA(nlh);
        msg->index = msg_ctx->index;
        //msg->data_len = data_len;
        /* The copy lands in msg->data, which holds at most 64 bytes. */
        memcpy(msg->data, data, data_len);
    }
    printk("send netlink message 1\n");
    printk("send netlink message skb address:%p, nlh address:%p,msg adddress:%p\n", skb, nlh, msg);
    /* netlink_unicast() takes ownership of skb, also when it fails. */
    rc = netlink_unicast(ipencrypt_nl_sk, skb, daemon_pid, 0);
    if (rc < 0) {
        printk("Failed to send eCryptfs netlink message; rc = [%d]\n", rc);
        goto out;
    }
    rc = 0;
    printk("send netlink message 2\n");
    goto out;
nlmsg_failure:
    printk("send netlink message nlmsg_failure\n");
    rc = -EMSGSIZE;
    kfree_skb(skb);
out:
    return rc;
}

static void ipencrypt_receive_nl_message(struct sk_buff *skb)
{
    struct nlmsghdr *nlh;
    char data[16] = "i got it";
    struct ipencrypt_msg_ctx msg_ctx;
    u32 pid;

    printk("Receive netlink message begin\n");

    nlh = nlmsg_hdr(skb);
    if (!NLMSG_OK(nlh, skb->len)) {
        printk(KERN_ERR "Received corrupt netlink message\n");
        goto free;
    }

    /* The reply is sent to whoever put its id into nlmsg_pid. */
    pid = nlh->nlmsg_pid;

    memset(&msg_ctx, 0, sizeof(struct ipencrypt_msg_ctx));
    msg_ctx.index = 0;
    msg_ctx.counter = nlh->nlmsg_seq;

    printk("Receive netlink message payload %s\n", (char *)NLMSG_DATA(nlh));
    printk("Receive netlink message, skb address:%p, nlh address:%p\n", skb, nlh);

    /*
     * ipencrypt_send_netlink() returns 0 on success, so this branch runs
     * when the send succeeded. That is why the log shows
     * "failure send netlink" right after "send netlink message 2".
     */
    if (!ipencrypt_send_netlink(data, sizeof(data), &msg_ctx, 0, 0, pid)) {
        printk("failure send netlink\n");
    }
    printk("Receive netlink message end\n");
    /*switch (nlh->nlmsg_type) {
    case ECRYPTFS_NLMSG_RESPONSE:
        if (ecryptfs_process_nl_response(skb)) {
            ecryptfs_printk(KERN_WARNING, "Failed to "
                    "deliver netlink response to "
                    "requesting operation\n");
        }
        break;
    case ECRYPTFS_NLMSG_HELO:
        if (ecryptfs_process_nl_helo(skb)) {
            ecryptfs_printk(KERN_WARNING, "Failed to "
                    "fulfill HELO request\n");
        }
        break;
    case ECRYPTFS_NLMSG_QUIT:
        if (ecryptfs_process_nl_quit(skb)) {
            ecryptfs_printk(KERN_WARNING, "Failed to "
                    "fulfill QUIT request\n");
        }
        break;
    default:
        ecryptfs_printk(KERN_WARNING, "Dropping netlink "
                "message of unrecognized type [%d]\n",
                nlh->nlmsg_type);
        break;
    }*/
free:
    kfree_skb(skb);
}


static int __init ipencrypt_init_netlink(void)
{
    int rc;

    ipencrypt_nl_sk = netlink_kernel_create(&init_net,
                                            NETLINK_IPENCRYPT,
                                            0,
                                            ipencrypt_receive_nl_message,
                                            NULL,
                                            THIS_MODULE);
    if (!ipencrypt_nl_sk) {
        rc = -EIO;
        printk(KERN_ERR "Failed to create netlink socket\n");
        goto out;
    }
    ipencrypt_nl_sk->sk_sndtimeo = HZ;
    rc = 0;

    printk("ipencrypt init netlink\n");
out:
    /* Report the failure to the module loader; the original returned 0 here. */
    return rc;
}

static void __exit ipencrypt_release_netlink(void)
{
    printk("ipencrypt release netlink\n");
    netlink_kernel_release(ipencrypt_nl_sk);
    ipencrypt_nl_sk = NULL;
}

module_init(ipencrypt_init_netlink);
module_exit(ipencrypt_release_netlink);

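As long as the callback passed to netlink_kernel_create is NULL, there is no panic, so it is the handling in the receive callback that triggers it! But I do not know what the cause is. Could the experts here please take a look?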
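As a sanity check on the buffer sizes in the posted send function, the arithmetic can be reproduced in user space with the same `NLMSG_*` macros from `<linux/netlink.h>`. This is only a sketch: the helper names are hypothetical and the struct is copied from the post. It shows that the 0x10 gap between the `nlh` and `msg` addresses in the log equals the netlink header length, and that the payload is sized as the whole struct plus `data_len`, even though the data is copied into the 64-byte `data` field inside the struct.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Same layout as the struct in the posted module. */
struct ipencrypt_msg {
    unsigned int index;
    char data[64];
};

/* Payload length exactly as the posted send function computes it. */
size_t posted_payload_len(size_t data_len)
{
    return data_len ? sizeof(struct ipencrypt_msg) + data_len : 0;
}

/* Bytes requested from alloc_skb() for a given payload length. */
size_t skb_space_for(size_t payload_len)
{
    return (size_t)NLMSG_SPACE(payload_len);
}

/* Compare the addresses from the first log with the header length. */
void check_against_log(void)
{
    /* nlh address:c7bcf200, msg adddress:c7bcf210 */
    assert(0xc7bcf210u - 0xc7bcf200u == (unsigned int)NLMSG_HDRLEN);
}
```

With the 16-byte reply used in the post, `posted_payload_len(16)` is 84 on a platform with a 4-byte `unsigned int`, and `skb_space_for(84)` is 100. The allocation is therefore larger than the copy needs, so the sizes do not point to an overrun for this reply. A `data_len` above 64 would write past `msg->data`, though.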
jackyjkchen 2011-08-21
On a protected-mode system, a crash can only be caused by kernel-mode code. Could it be a driver problem?
