Linux内核之数据类型与结构的学习

常用数据类型一个字（WORD）指CPU一次可以处理的数据大小，通用register的大小也是一个WORD，内存总线的宽度也是一个WORD，内存指针的大小也是一个WORD（C语言中的long类型的位宽就是WORD的大小）。C语言并没有规定每一种数据类型的位宽，但是有一些共识：char总是一个字节（8位）short是16位linux上的int总是32位long可能是32或者64位linux内核中具有位宽规定的数据类型有：若用户空间要用指定位宽的数据类型，只要在前面加上“__”即可，如__u32。表示内存地址一定使用unsigned long的类型。还有一些特殊的数据类型，比如pid_t、size_t等以_t结尾的数据类型，它们也是通过typeof定义的，但打印它们的时候，需要将它们转化为最大的数据类型（通常使用unsigned long的打印格式，"%lu"）。数据类型的符号声明为：signed char i = -1;unsigned char = 255;字节对齐指的是，一个变量存放的地址需要是该变量对应数据类型大小的倍数。比如一个32位宽的变量，若它存放的地址是4（因为是4字节）的倍数，则它是字节对齐的。在某些平台上，若访问字节不对齐的变量，可能会导致异常。对于结构体的数据对齐来说：struct animal_struct { char dog; /* 1 byte */ unsigned long cat; /* 4 bytes */unsigned short pig; /* 2 bytes */ char fox; /* 1 byte */};//编译器会将其优化为：struct animal_struct { char dog; /* 1 byte */ u8 __pad0[3]; /* 3 bytes */ unsigned long cat; /* 4 bytes */ unsigned short pig; /* 2 bytes */ char fox; /* 1 byte */ u8 __pad1; /* 1 byte */};//这扩大了数据原本的占用空间，其实通过调整顺序来避免数据对齐的问题：struct animal_struct { unsigned long cat; /* 4 bytes */ unsigned short pig; /* 2 bytes */ char dog; /* 1 byte */ char fox; /* 1 byte */};//如果需要防止编译器向结构体中插入padding，可以使用__attribute__struct { u16 id; u64 lun; u16 reserved1; u32 reserved2;} __attribute__((packed)) scsi;//否则的话，在id之后会有2个字节（32位平台）或6个字节（64位平台）的padding。字节序指的是在一个WORD中BYTES的顺序！如果高位字节（most significant byte）在内存低地址，就是大端字节序（big-endian）；如果是低位字节（least significant byte）在内存低地址，就是小端字节序（little-endian）。比如，对于数字1027来说，两种不同的字节序为：//判断当前平台的字节序：int x = 1;if (*(char *)&x == 1) /* little endian */else/* big endian *///系统中也有__BIG_ENDIAN和__LITTLE_ENDIAN宏，可以用宏条件判断，但更好的是用字节序的转换方法：u23 __cpu_to_be32(u32); /* convert cpu’s byte order to big-endian */ u32 __cpu_to_le32(u32); /* convert cpu’s byte order to little-endian */ u32 __be32_to_cpu(u32); /* convert big-endian to cpu’s byte order */ u32 __le32_to_cpus(u32); /* convert little-endian to cpu’s byte order *///返回值是指针的函数通常用NULL表示错误，但有的时候需要让指针也返回错误信息：void *ERR_PTR(long error);long PTR_ERR(const void *ptr);long IS_ERR(const void *ptr);双向链表 //普通双向链表的结构体：struct fox { unsigned long tail_length; /* length in centimeters of tail */ unsigned long weight; /* weight in kilograms */ bool is_fantastic; /* is this fox fantastic? */ struct fox *next; /* next fox in linked list */ struct fox *prev; /* previous fox in linked list */};//而内核的实现方式是在结构体中加入链表节点struct list_head { struct list_head *next struct list_head *prev;};struct fox { unsigned long tail_length; /* length in centimeters of tail */ unsigned long weight; /* weight in kilograms */ bool is_fantastic; /* is this fox fantastic? */ struct list_head list; /* list of all fox structures */};是很奇怪，list_head尽管有“head”这个词，但还是表示一个普通node。一个node也叫一个entry。程序中一般有一个独立的链表头，后面挂上链表节点，比如网络程序中常见的ptype_all、ptype_base、__get_cpu_var(softnet_data).poll_list等都是链表头。链表有关的操作函数为：INIT_LIST_HEAD() — 初始化节点LIST_HEAD() — 定义链表头list_add(struct list_head *new, struct list_head *head)list_del(struct list_head *entry) 等等来看一下list_add的实现，虽然参数中的head不一定要代入链表头，但是当head就是链表头的情况下，list_add就称为头插法，list_add_tail就称为尾插法：static inline void list_add(struct list_head *new, struct list_head *head){ __list_add(new, head, head->next);}static inline void list_add_tail(struct list_head *new, struct list_head *head){ __list_add(new, head->prev, head);}static inline void __list_add(struct list_head *new,struct list_head *prev,struct list_head *next){ next->prev = new; new->next = next; new->prev = prev; prev->next = new;}遍历一个链表的复杂度为O(n)，首先看一下如何从链表节点获取包含此链表节点的结构体变量：#define list_entry(ptr, type, member) container_of(ptr, type, member)container_of(ptr, type, member)等价于：const typeof(((type *)0)->member ) *__mptr = (ptr);(type *)((char *)__mptr - ((size_t) &((type *)0)->member)); 它借用了0地址获得member在结构体变量中的偏移地址#define list_for_each(pos, head) \ for (pos = (head)->next; prefetch(pos->next), pos != (head); pos = pos->next)#define list_for_each_entry(pos, head, member) \ for (pos = list_entry((head)->next, typeof(*pos), member); \ prefetch(pos->member.next), &pos->member != (head); \ pos = list_entry(pos->member.next, typeof(*pos), member))遍历的方法是：struct list_head *p; //链表头struct fox *f;list_for_each(p, &fox_list) {/* f points to the structure in which the list is embedded */f = list_entry(p, struct fox, list);}高级版是：list_for_each_entry(f, &fox_list, list) {/* on each iteration, ‘f’ points to the next fox structure ... */}一般遍历一边删除用：list_for_each_entry_safe(pos, next, head, member)• Hash链表在list中每个节点都是一样的，不管是链表头还是链表节点，但是 hlist不是这样的。在hlist中，头结点使用struct hlist_head表示，而对于链表节点使用struct hlist_node表示。而且，hlist不是双向循环链表。来看一下Hash链表头和链表节点：struct hlist_head {struct hlist_node *first;};struct hlist_node {struct hlist_node *next, **pprev;};这样的好处是，Hash链表头就比较小，因为Hash链表为减少冲突一般有很多个链表头，链表头小可以节省内存资源。链表节点有了一个pprev的好处是，插入删除等操作都容易实现，否则光有next的话，只能将节点一个个遍历才能进行链表操作。//链表节点删除 static inline void __hlist_del(struct hlist_node *n) { struct hlist_node *next = n->next; struct hlist_node **pprev = n->pprev; *pprev = next; if (next) next->pprev = pprev; } static inline void hlist_del(struct hlist_node *n) { __hlist_del(n); n->next = LIST_POISON1; n->pprev = LIST_POISON2; }//链表的首个节点（头插法） static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h) { struct hlist_node *first = h->first; n->next = first; if (first) first->pprev = &n->next; h->first = n; n->pprev = &h->first; }//普通节点插入 static inline void hlist_add_before(struct hlist_node *n, struct hlist_node *next) { n->pprev = next->pprev; n->next = next; next->pprev = &n->next; *(n->pprev) = n; } static inline void hlist_add_after(struct hlist_node *n, struct hlist_node *next) { next->next = n->next; n->next = next; next->pprev = &n->next; if(next->next) next->next->pprev = &next->next; }• 队列队列就像producer和consumer，producer生产数据（如消息、网络帧），consumer处理数据。• 映射value = Lookup (key)，内核也实现了这种数据结构，称为idr。• 二叉树二叉树的排序规则为：左child总是比parent小，右child总是比parent大。平衡二叉树的特点在于，叶子（无child的节点）之间的深度变化在1以内。内核中实现了一种平衡二叉树，称为red-black树，特点在于还具有颜色属性。遍历二叉树的算法复杂度为O(log(n))，当涉及到大数据的查询时，应该考虑采用二叉树。

(以上内容不代表本站观点。)
---------------------------------