| xj | b04a402 | 2021-11-25 15:01:52 +0800 | 1 | ===================================================================== | 
|  | 2 | Everything you never wanted to know about kobjects, ksets, and ktypes | 
|  | 3 | ===================================================================== | 
|  | 4 |  | 
|  | 5 | :Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 
|  | 6 | :Last updated: December 19, 2007 | 
|  | 7 |  | 
|  | 8 | Based on an original article by Jon Corbet for lwn.net written October 1, | 
|  | 9 | 2003 and located at http://lwn.net/Articles/51437/ | 
|  | 10 |  | 
|  | 11 | Part of the difficulty in understanding the driver model - and the kobject | 
|  | 12 | abstraction upon which it is built - is that there is no obvious starting | 
|  | 13 | place. Dealing with kobjects requires understanding a few different types, | 
|  | 14 | all of which make reference to each other. In an attempt to make things | 
|  | 15 | easier, we'll take a multi-pass approach, starting with vague terms and | 
|  | 16 | adding detail as we go. To that end, here are some quick definitions of | 
|  | 17 | some terms we will be working with. | 
|  | 18 |  | 
|  | 19 | - A kobject is an object of type struct kobject.  Kobjects have a name | 
|  | 20 | and a reference count.  A kobject also has a parent pointer (allowing | 
|  | 21 | objects to be arranged into hierarchies), a specific type, and, | 
|  | 22 | usually, a representation in the sysfs virtual filesystem. | 
|  | 23 |  | 
|  | 24 | Kobjects are generally not interesting on their own; instead, they are | 
|  | 25 | usually embedded within some other structure which contains the stuff | 
|  | 26 | the code is really interested in. | 
|  | 27 |  | 
|  | 28 | No structure should EVER have more than one kobject embedded within it. | 
|  | 29 | If it does, the reference counting for the object is sure to be messed | 
|  | 30 | up and incorrect, and your code will be buggy.  So do not do this. | 
|  | 31 |  | 
|  | 32 | - A ktype is the type of object that embeds a kobject.  Every structure | 
|  | 33 | that embeds a kobject needs a corresponding ktype.  The ktype controls | 
|  | 34 | what happens to the kobject when it is created and destroyed. | 
|  | 35 |  | 
|  | 36 | - A kset is a group of kobjects.  These kobjects can be of the same ktype | 
|  | 37 | or belong to different ktypes.  The kset is the basic container type for | 
|  | 38 | collections of kobjects. Ksets contain their own kobjects, but you can | 
|  | 39 | safely ignore that implementation detail as the kset core code handles | 
|  | 40 | this kobject automatically. | 
|  | 41 |  | 
|  | 42 | When you see a sysfs directory full of other directories, generally each | 
|  | 43 | of those directories corresponds to a kobject in the same kset. | 
|  | 44 |  | 
|  | 45 | We'll look at how to create and manipulate all of these types. A bottom-up | 
|  | 46 | approach will be taken, so we'll go back to kobjects. | 
|  | 47 |  | 
|  | 48 |  | 
|  | 49 | Embedding kobjects | 
|  | 50 | ================== | 
|  | 51 |  | 
|  | 52 | It is rare for kernel code to create a standalone kobject, with one major | 
|  | 53 | exception explained below.  Instead, kobjects are used to control access to | 
|  | 54 | a larger, domain-specific object.  To this end, kobjects will be found | 
|  | 55 | embedded in other structures.  If you are used to thinking of things in | 
|  | 56 | object-oriented terms, kobjects can be seen as a top-level, abstract class | 
|  | 57 | from which other classes are derived.  A kobject implements a set of | 
|  | 58 | capabilities which are not particularly useful by themselves, but which are | 
|  | 59 | nice to have in other objects.  The C language does not allow for the | 
|  | 60 | direct expression of inheritance, so other techniques - such as structure | 
|  | 61 | embedding - must be used. | 
|  | 62 |  | 
|  | 63 | (As an aside, for those familiar with the kernel linked list implementation, | 
|  | 64 | this is analogous to how "list_head" structs are rarely useful on | 
|  | 65 | their own, but are invariably found embedded in the larger objects of | 
|  | 66 | interest.) | 
|  | 67 |  | 
|  | 68 | So, for example, the UIO code in drivers/uio/uio.c has a structure that | 
|  | 69 | defines the memory region associated with a uio device:: | 
|  | 70 |  | 
|  | 71 | struct uio_map { | 
|  | 72 |     struct kobject kobj; | 
|  | 73 |     struct uio_mem *mem; | 
|  | 74 | }; | 
|  | 75 |  | 
|  | 76 | If you have a struct uio_map structure, finding its embedded kobject is | 
|  | 77 | just a matter of using the kobj member.  Code that works with kobjects will | 
|  | 78 | often have the opposite problem, however: given a struct kobject pointer, | 
|  | 79 | what is the pointer to the containing structure?  You must avoid tricks | 
|  | 80 | (such as assuming that the kobject is at the beginning of the structure) | 
|  | 81 | and, instead, use the container_of() macro, found in <linux/kernel.h>:: | 
|  | 82 |  | 
|  | 83 | container_of(pointer, type, member) | 
|  | 84 |  | 
|  | 85 | where: | 
|  | 86 |  | 
|  | 87 | * "pointer" is the pointer to the embedded kobject, | 
|  | 88 | * "type" is the type of the containing structure, and | 
|  | 89 | * "member" is the name of the structure field to which "pointer" points. | 
|  | 90 |  | 
|  | 91 | The return value from container_of() is a pointer to the corresponding | 
|  | 92 | container type. So, for example, a pointer "kp" to a struct kobject | 
|  | 93 | embedded *within* a struct uio_map could be converted to a pointer to the | 
|  | 94 | *containing* uio_map structure with:: | 
|  | 95 |  | 
|  | 96 | struct uio_map *u_map = container_of(kp, struct uio_map, kobj); | 
|  | 97 |  | 
|  | 98 | For convenience, programmers often define a simple macro for "back-casting" | 
|  | 99 | kobject pointers to the containing type.  Exactly this happens in the | 
|  | 100 | earlier drivers/uio/uio.c, as you can see here:: | 
|  | 101 |  | 
|  | 102 | struct uio_map { | 
|  | 103 |     struct kobject kobj; | 
|  | 104 |     struct uio_mem *mem; | 
|  | 105 | }; | 
|  | 106 |  | 
|  | 107 | #define to_map(map) container_of(map, struct uio_map, kobj) | 
|  | 108 |  | 
|  | 109 | where the macro argument "map" is a pointer to the struct kobject in | 
|  | 110 | question.  That macro is subsequently invoked with:: | 
|  | 111 |  | 
|  | 112 | struct uio_map *map = to_map(kobj); | 
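The round trip from an embedded kobject back to its container can be tried outside the kernel. The following is a minimal userspace sketch, not kernel code: it defines its own simplified container_of() on top of offsetof(), the struct kobject shown is an empty placeholder rather than the kernel's definition, and struct widget, to_widget() and widget_id_from_kobj() are hypothetical names invented for the illustration.

```c
#include <stddef.h>	/* offsetof */

/* Placeholder for the kernel's struct kobject; only its address matters here. */
struct kobject {
	int placeholder;
};

/* Hypothetical containing structure; the kobject is deliberately not first. */
struct widget {
	int id;
	struct kobject kobj;
};

/* Simplified userspace container_of(), without the kernel's type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* "Back-casting" helper in the style of to_map(). */
#define to_widget(k) container_of(k, struct widget, kobj)

/* Recover the widget from a pointer to its embedded kobject. */
static int widget_id_from_kobj(struct kobject *kobj)
{
	return to_widget(kobj)->id;
}
```

Because the kobject is not the first member of struct widget, a plain cast would yield the wrong address; subtracting the member offset is what makes the conversion independent of the structure layout.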
|  | 113 |  | 
|  | 114 |  | 
|  | 115 | Initialization of kobjects | 
|  | 116 | ========================== | 
|  | 117 |  | 
|  | 118 | Code which creates a kobject must, of course, initialize that object. Some | 
|  | 119 | of the internal fields are set up with a (mandatory) call to kobject_init():: | 
|  | 120 |  | 
|  | 121 | void kobject_init(struct kobject *kobj, struct kobj_type *ktype); | 
|  | 122 |  | 
|  | 123 | The ktype is required for a kobject to be created properly, as every kobject | 
|  | 124 | must have an associated kobj_type.  After calling kobject_init(), to | 
|  | 125 | register the kobject with sysfs, the function kobject_add() must be called:: | 
|  | 126 |  | 
|  | 127 | int kobject_add(struct kobject *kobj, struct kobject *parent, | 
|  | 128 | const char *fmt, ...); | 
|  | 129 |  | 
|  | 130 | This sets up the parent of the kobject and the name for the kobject | 
|  | 131 | properly.  If the kobject is to be associated with a specific kset, | 
|  | 132 | kobj->kset must be assigned before calling kobject_add().  If a kset is | 
|  | 133 | associated with a kobject, then the parent for the kobject can be set to | 
|  | 134 | NULL in the call to kobject_add() and then the kobject's parent will be the | 
|  | 135 | kset itself. | 
|  | 136 |  | 
|  | 137 | As the name of the kobject is set when it is added to the kernel, the name | 
|  | 138 | of the kobject should never be manipulated directly.  If you must change | 
|  | 139 | the name of the kobject, call kobject_rename():: | 
|  | 140 |  | 
|  | 141 | int kobject_rename(struct kobject *kobj, const char *new_name); | 
|  | 142 |  | 
|  | 143 | kobject_rename() does not perform any locking or have a solid notion of | 
|  | 144 | what names are valid, so the caller must provide their own sanity checking | 
|  | 145 | and serialization. | 
|  | 146 |  | 
|  | 147 | There is a function called kobject_set_name() but that is legacy cruft and | 
|  | 148 | is being removed.  If your code needs to call this function, it is | 
|  | 149 | incorrect and needs to be fixed. | 
|  | 150 |  | 
|  | 151 | To properly access the name of the kobject, use the function | 
|  | 152 | kobject_name():: | 
|  | 153 |  | 
|  | 154 | const char *kobject_name(const struct kobject *kobj); | 
|  | 155 |  | 
|  | 156 | There is a helper function to both initialize and add the kobject to the | 
|  | 157 | kernel at the same time, called, surprisingly enough, kobject_init_and_add():: | 
|  | 158 |  | 
|  | 159 | int kobject_init_and_add(struct kobject *kobj, struct kobj_type *ktype, | 
|  | 160 | struct kobject *parent, const char *fmt, ...); | 
|  | 161 |  | 
|  | 162 | The arguments are the same as the individual kobject_init() and | 
|  | 163 | kobject_add() functions described above. | 
|  | 164 |  | 
|  | 165 |  | 
|  | 166 | Uevents | 
|  | 167 | ======= | 
|  | 168 |  | 
|  | 169 | After a kobject has been registered with the kobject core, you need to | 
|  | 170 | announce to the world that it has been created.  This can be done with a | 
|  | 171 | call to kobject_uevent():: | 
|  | 172 |  | 
|  | 173 | int kobject_uevent(struct kobject *kobj, enum kobject_action action); | 
|  | 174 |  | 
|  | 175 | Use the KOBJ_ADD action when the kobject is first added to the kernel. | 
|  | 176 | This should be done only after any attributes or children of the kobject | 
|  | 177 | have been initialized properly, as userspace will instantly start to look | 
|  | 178 | for them when this call happens. | 
|  | 179 |  | 
|  | 180 | When the kobject is removed from the kernel (details on how to do that are | 
|  | 181 | below), the uevent for KOBJ_REMOVE will be automatically created by the | 
|  | 182 | kobject core, so the caller does not have to worry about doing that by | 
|  | 183 | hand. | 
|  | 184 |  | 
|  | 185 |  | 
|  | 186 | Reference counts | 
|  | 187 | ================ | 
|  | 188 |  | 
|  | 189 | One of the key functions of a kobject is to serve as a reference counter | 
|  | 190 | for the object in which it is embedded. As long as references to the object | 
|  | 191 | exist, the object (and the code which supports it) must continue to exist. | 
|  | 192 | The low-level functions for manipulating a kobject's reference counts are:: | 
|  | 193 |  | 
|  | 194 | struct kobject *kobject_get(struct kobject *kobj); | 
|  | 195 | void kobject_put(struct kobject *kobj); | 
|  | 196 |  | 
|  | 197 | A successful call to kobject_get() will increment the kobject's reference | 
|  | 198 | counter and return the pointer to the kobject. | 
|  | 199 |  | 
|  | 200 | When a reference is released, the call to kobject_put() will decrement the | 
|  | 201 | reference count and, possibly, free the object. Note that kobject_init() | 
|  | 202 | sets the reference count to one, so the code which sets up the kobject will | 
|  | 203 | need to do a kobject_put() eventually to release that reference. | 
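The counting rules above can be modelled in a few lines of userspace C. This is only a sketch of the semantics, under the assumption that a plain int is good enough for illustration; the kernel's kobject uses a struct kref with atomic operations, and struct counted and the counted_*() helpers are hypothetical names, not kernel APIs.

```c
#include <stdbool.h>

/* Userspace stand-in for a reference-counted object. */
struct counted {
	int refcount;
	bool released;
};

/* Mirrors kobject_init(): the creator starts out holding one reference. */
static void counted_init(struct counted *c)
{
	c->refcount = 1;
	c->released = false;
}

/* Mirrors kobject_get(): take a reference and hand the pointer back. */
static struct counted *counted_get(struct counted *c)
{
	if (c)
		c->refcount++;
	return c;
}

/* Mirrors kobject_put(): drop a reference; the last put ends the lifetime. */
static void counted_put(struct counted *c)
{
	if (c && --c->refcount == 0)
		c->released = true;	/* a real object would be cleaned up here */
}
```

The point to notice is that the creator's initial reference is just another reference: the object goes away only when every get, including the implicit one from initialization, has been balanced by a put.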
|  | 204 |  | 
|  | 205 | Because kobjects are dynamic, they must not be declared statically or on | 
|  | 206 | the stack, but instead, always allocated dynamically.  Future versions of | 
|  | 207 | the kernel will contain a run-time check for kobjects that are created | 
|  | 208 | statically and will warn the developer of this improper usage. | 
|  | 209 |  | 
|  | 210 | If all that you want to use a kobject for is to provide a reference counter | 
|  | 211 | for your structure, please use the struct kref instead; a kobject would be | 
|  | 212 | overkill.  For more information on how to use struct kref, please see the | 
|  | 213 | file Documentation/kref.txt in the Linux kernel source tree. | 
|  | 214 |  | 
|  | 215 |  | 
|  | 216 | Creating "simple" kobjects | 
|  | 217 | ========================== | 
|  | 218 |  | 
|  | 219 | Sometimes all that a developer wants is a way to create a simple directory | 
|  | 220 | in the sysfs hierarchy, and not have to mess with the whole complication of | 
|  | 221 | ksets, show and store functions, and other details.  This is the one | 
|  | 222 | exception where a single kobject should be created.  To create such an | 
|  | 223 | entry, use the function:: | 
|  | 224 |  | 
|  | 225 | struct kobject *kobject_create_and_add(const char *name, struct kobject *parent); | 
|  | 226 |  | 
|  | 227 | This function will create a kobject and place it in sysfs in the location | 
|  | 228 | underneath the specified parent kobject.  To create simple attributes | 
|  | 229 | associated with this kobject, use:: | 
|  | 230 |  | 
|  | 231 | int sysfs_create_file(struct kobject *kobj, const struct attribute *attr); | 
|  | 232 |  | 
|  | 233 | or:: | 
|  | 234 |  | 
|  | 235 | int sysfs_create_group(struct kobject *kobj, const struct attribute_group *grp); | 
|  | 236 |  | 
|  | 237 | Both types of attributes used here, with a kobject that has been created | 
|  | 238 | with kobject_create_and_add(), can be of type kobj_attribute, so no | 
|  | 239 | special custom attribute needs to be created. | 
|  | 240 |  | 
|  | 241 | See the example module, samples/kobject/kobject-example.c for an | 
|  | 242 | implementation of a simple kobject and attributes. | 
|  | 243 |  | 
|  | 244 |  | 
|  | 245 |  | 
|  | 246 | ktypes and release methods | 
|  | 247 | ========================== | 
|  | 248 |  | 
|  | 249 | One important thing still missing from the discussion is what happens to a | 
|  | 250 | kobject when its reference count reaches zero. The code which created the | 
|  | 251 | kobject generally does not know when that will happen; if it did, there | 
|  | 252 | would be little point in using a kobject in the first place. Even | 
|  | 253 | predictable object lifecycles become more complicated when sysfs is brought | 
|  | 254 | in as other portions of the kernel can get a reference on any kobject that | 
|  | 255 | is registered in the system. | 
|  | 256 |  | 
|  | 257 | The end result is that a structure protected by a kobject cannot be freed | 
|  | 258 | before its reference count goes to zero. The reference count is not under | 
|  | 259 | the direct control of the code which created the kobject. So that code must | 
|  | 260 | be notified asynchronously whenever the last reference to one of its | 
|  | 261 | kobjects goes away. | 
|  | 262 |  | 
|  | 263 | Once you have registered your kobject via kobject_add(), you must never use | 
|  | 264 | kfree() to free it directly. The only safe way is to use kobject_put(). It | 
|  | 265 | is good practice to always use kobject_put() after kobject_init() to avoid | 
|  | 266 | errors creeping in. | 
|  | 267 |  | 
|  | 268 | This notification is done through a kobject's release() method. Usually | 
|  | 269 | such a method has a form like:: | 
|  | 270 |  | 
|  | 271 | void my_object_release(struct kobject *kobj) | 
|  | 272 | { | 
|  | 273 |     struct my_object *mine = container_of(kobj, struct my_object, kobj); | 
|  | 274 |  | 
|  | 275 |     /* Perform any additional cleanup on this object, then... */ | 
|  | 276 |     kfree(mine); | 
|  | 277 | } | 
|  | 278 |  | 
|  | 279 | One important point cannot be overstated: every kobject must have a | 
|  | 280 | release() method, and the kobject must persist (in a consistent state) | 
|  | 281 | until that method is called. If these constraints are not met, the code is | 
|  | 282 | flawed.  Note that the kernel will warn you if you forget to provide a | 
|  | 283 | release() method.  Do not try to get rid of this warning by providing an | 
|  | 284 | "empty" release function; you will be mocked mercilessly by the kobject | 
|  | 285 | maintainer if you attempt this. | 
|  | 286 |  | 
|  | 287 | Note, the name of the kobject is available in the release function, but it | 
|  | 288 | must NOT be changed within this callback.  Otherwise there will be a memory | 
|  | 289 | leak in the kobject core, which makes people unhappy. | 
|  | 290 |  | 
|  | 291 | Interestingly, the release() method is not stored in the kobject itself; | 
|  | 292 | instead, it is associated with the ktype. So let us introduce struct | 
|  | 293 | kobj_type:: | 
|  | 294 |  | 
|  | 295 | struct kobj_type { | 
|  | 296 |     void (*release)(struct kobject *kobj); | 
|  | 297 |     const struct sysfs_ops *sysfs_ops; | 
|  | 298 |     struct attribute **default_attrs; | 
|  | 299 |     const struct kobj_ns_type_operations *(*child_ns_type)(struct kobject *kobj); | 
|  | 300 |     const void *(*namespace)(struct kobject *kobj); | 
|  | 301 | }; | 
|  | 302 |  | 
|  | 303 | This structure is used to describe a particular type of kobject (or, more | 
|  | 304 | correctly, of containing object). Every kobject needs to have an associated | 
|  | 305 | kobj_type structure; a pointer to that structure must be specified when you | 
|  | 306 | call kobject_init() or kobject_init_and_add(). | 
|  | 307 |  | 
|  | 308 | The release field in struct kobj_type is, of course, a pointer to the | 
|  | 309 | release() method for this type of kobject. The sysfs_ops and | 
|  | 310 | default_attrs fields control how objects of this type are represented in | 
|  | 311 | sysfs; they are beyond the scope of this document. | 
|  | 312 |  | 
|  | 313 | The default_attrs pointer is a list of default attributes that will be | 
|  | 314 | automatically created for any kobject that is registered with this ktype. | 
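To make the indirection through the ktype concrete, here is a userspace model of the release path. It is a sketch under stated assumptions, not the kernel implementation: struct model_ktype and struct model_kobject are cut-down stand-ins for struct kobj_type and struct kobject, free() stands in for kfree(), and struct my_object, my_object_create(), model_put() and the my_object_live counter are hypothetical names used only to observe what happens.

```c
#include <stddef.h>	/* offsetof */
#include <stdlib.h>	/* calloc, free */

struct model_kobject;

/* Stand-in for struct kobj_type: one release hook shared by all objects of a type. */
struct model_ktype {
	void (*release)(struct model_kobject *kobj);
};

struct model_kobject {
	int refcount;
	const struct model_ktype *ktype;
};

/* Hypothetical containing object that embeds the kobject stand-in. */
struct my_object {
	int value;
	struct model_kobject kobj;
};

static int my_object_live;	/* number of live objects, for illustration only */

/* Release hook: find the container and free it as a whole. */
static void my_object_release(struct model_kobject *kobj)
{
	struct my_object *mine = (struct my_object *)
		((char *)kobj - offsetof(struct my_object, kobj));

	my_object_live--;
	free(mine);
}

static const struct model_ktype my_ktype = {
	.release = my_object_release,
};

/* Allocate dynamically and start with a single reference. */
static struct my_object *my_object_create(int value)
{
	struct my_object *mine = calloc(1, sizeof(*mine));

	if (!mine)
		return NULL;
	mine->value = value;
	mine->kobj.refcount = 1;
	mine->kobj.ktype = &my_ktype;
	my_object_live++;
	return mine;
}

/* Dropping the last reference calls the release hook found through the ktype. */
static void model_put(struct model_kobject *kobj)
{
	if (--kobj->refcount == 0)
		kobj->ktype->release(kobj);
}
```

Keeping the hook in the shared type structure means every object of that type is torn down the same way, and the code dropping the last reference needs no knowledge of the containing structure.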
|  | 315 |  | 
|  | 316 |  | 
|  | 317 | ksets | 
|  | 318 | ===== | 
|  | 319 |  | 
|  | 320 | A kset is merely a collection of kobjects that want to be associated with | 
|  | 321 | each other.  There is no restriction that they be of the same ktype, but be | 
|  | 322 | very careful if they are not. | 
|  | 323 |  | 
|  | 324 | A kset serves these functions: | 
|  | 325 |  | 
|  | 326 | - It serves as a bag containing a group of objects. A kset can be used by | 
|  | 327 | the kernel to track "all block devices" or "all PCI device drivers." | 
|  | 328 |  | 
|  | 329 | - A kset is also a subdirectory in sysfs, where the kobjects associated | 
|  | 330 | with the kset can show up.  Every kset contains a kobject which can be | 
|  | 331 | set up to be the parent of other kobjects; the top-level directories of | 
|  | 332 | the sysfs hierarchy are constructed in this way. | 
|  | 333 |  | 
|  | 334 | - Ksets can support the "hotplugging" of kobjects and influence how | 
|  | 335 | uevents are reported to user space. | 
|  | 336 |  | 
|  | 337 | In object-oriented terms, "kset" is the top-level container class; ksets | 
|  | 338 | contain their own kobject, but that kobject is managed by the kset code and | 
|  | 339 | should not be manipulated by any other user. | 
|  | 340 |  | 
|  | 341 | A kset keeps its children in a standard kernel linked list.  Kobjects point | 
|  | 342 | back to their containing kset via their kset field. In almost all cases, | 
|  | 343 | the kobjects belonging to a kset have that kset (or, strictly, its embedded | 
|  | 344 | kobject) as their parent. | 
|  | 345 |  | 
|  | 346 | As a kset contains a kobject within it, it should always be dynamically | 
|  | 347 | created and never declared statically or on the stack.  To create a new | 
|  | 348 | kset use:: | 
|  | 349 |  | 
|  | 350 | struct kset *kset_create_and_add(const char *name, | 
|  | 351 | const struct kset_uevent_ops *u, | 
|  | 352 | struct kobject *parent); | 
|  | 353 |  | 
|  | 354 | When you are finished with the kset, call:: | 
|  | 355 |  | 
|  | 356 | void kset_unregister(struct kset *kset); | 
|  | 357 |  | 
|  | 358 | to destroy it.  This removes the kset from sysfs and decrements its reference | 
|  | 359 | count.  When the reference count goes to zero, the kset will be released. | 
|  | 360 | Because other references to the kset may still exist, the release may happen | 
|  | 361 | after kset_unregister() returns. | 
|  | 362 |  | 
|  | 363 | An example of using a kset can be seen in the | 
|  | 364 | samples/kobject/kset-example.c file in the kernel tree. | 
|  | 365 |  | 
|  | 366 | If a kset wishes to control the uevent operations of the kobjects | 
|  | 367 | associated with it, it can use the struct kset_uevent_ops to handle it:: | 
|  | 368 |  | 
|  | 369 | struct kset_uevent_ops { | 
|  | 370 |     int (*filter)(struct kset *kset, struct kobject *kobj); | 
|  | 371 |     const char *(*name)(struct kset *kset, struct kobject *kobj); | 
|  | 372 |     int (*uevent)(struct kset *kset, struct kobject *kobj, | 
|  | 373 |                   struct kobj_uevent_env *env); | 
|  | 374 | }; | 
|  | 375 |  | 
|  | 376 |  | 
|  | 377 | The filter function allows a kset to prevent a uevent from being emitted to | 
|  | 378 | userspace for a specific kobject.  If the function returns 0, the uevent | 
|  | 379 | will not be emitted. | 
|  | 380 |  | 
|  | 381 | The name function will be called to override the default name of the kset | 
|  | 382 | that the uevent sends to userspace.  By default, the name will be the same | 
|  | 383 | as the kset itself, but this function, if present, can override that name. | 
|  | 384 |  | 
|  | 385 | The uevent function will be called when the uevent is about to be sent to | 
|  | 386 | userspace to allow more environment variables to be added to the uevent. | 
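The filter rule is easy to get backwards, since a return value of 0 means "drop". The following userspace sketch models only that decision. It assumes a simplified callback that takes just the object, whereas the structure above also passes the kset, and struct ev_object, struct ev_ops, ev_should_emit() and hide_hidden() are hypothetical names.

```c
#include <stddef.h>	/* NULL */

/* Stand-in object with a flag the example filter looks at. */
struct ev_object {
	int hidden;
};

/* Cut-down stand-in for struct kset_uevent_ops with only a filter hook. */
struct ev_ops {
	int (*filter)(struct ev_object *obj);
};

/* Returns 1 if an event should be emitted for obj, 0 if it is filtered out. */
static int ev_should_emit(const struct ev_ops *ops, struct ev_object *obj)
{
	if (ops && ops->filter && !ops->filter(obj))
		return 0;	/* the filter returned 0: suppress the event */
	return 1;		/* no filter, or the filter allowed it */
}

/* Example filter: suppress events for objects marked as hidden. */
static int hide_hidden(struct ev_object *obj)
{
	return !obj->hidden;
}
```

Note that a missing filter is treated the same as a filter that always returns non-zero: events are emitted unless something actively vetoes them.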
|  | 387 |  | 
|  | 388 | One might ask how, exactly, a kobject is added to a kset, given that no | 
|  | 389 | functions which do that have been presented.  The answer is | 
|  | 390 | that this task is handled by kobject_add().  When a kobject is passed to | 
|  | 391 | kobject_add(), its kset member should point to the kset to which the | 
|  | 392 | kobject will belong.  kobject_add() will handle the rest. | 
|  | 393 |  | 
|  | 394 | If the kobject belonging to a kset has no parent kobject set, it will be | 
|  | 395 | added to the kset's directory.  Not all members of a kset necessarily | 
|  | 396 | live in the kset directory.  If an explicit parent kobject is assigned | 
|  | 397 | before the kobject is added, the kobject is registered with the kset, but | 
|  | 398 | added below the parent kobject. | 
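The interplay between kset membership and the parent pointer can be summarized with a small userspace model. This is a sketch of the decision only and leaves out sysfs, naming and reference counting entirely; struct sketch_kobject, struct sketch_kset and sketch_add() are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stddef.h>	/* NULL */

struct sketch_kset;

struct sketch_kobject {
	const char *name;
	struct sketch_kobject *parent;
	struct sketch_kset *kset;
};

/* Like a real kset, this stand-in embeds a kobject of its own. */
struct sketch_kset {
	struct sketch_kobject kobj;
	int members;
};

/*
 * Models the parent/kset decision made at add time: join the kset if one
 * is assigned, and fall back to the kset's own kobject as parent when no
 * explicit parent was given.
 */
static void sketch_add(struct sketch_kobject *kobj, struct sketch_kobject *parent)
{
	if (kobj->kset) {
		kobj->kset->members++;
		if (!parent)
			parent = &kobj->kset->kobj;
	}
	kobj->parent = parent;
}
```

In this model, an object added with a NULL parent ends up under the kset, while an object added with an explicit parent still counts as a member of the kset but lives elsewhere in the hierarchy.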
|  | 399 |  | 
|  | 400 |  | 
|  | 401 | Kobject removal | 
|  | 402 | =============== | 
|  | 403 |  | 
|  | 404 | After a kobject has been registered with the kobject core successfully, it | 
|  | 405 | must be cleaned up when the code is finished with it.  To do that, call | 
|  | 406 | kobject_put().  By doing this, the kobject core will automatically clean up | 
|  | 407 | all of the memory allocated by this kobject.  If a KOBJ_ADD uevent has been | 
|  | 408 | sent for the object, a corresponding KOBJ_REMOVE uevent will be sent, and | 
|  | 409 | any other sysfs housekeeping will be handled for the caller properly. | 
|  | 410 |  | 
|  | 411 | If you need to do a two-stage delete of the kobject (say you are not | 
|  | 412 | allowed to sleep when you need to destroy the object), then call | 
|  | 413 | kobject_del() which will unregister the kobject from sysfs.  This makes the | 
|  | 414 | kobject "invisible", but it is not cleaned up, and the reference count of | 
|  | 415 | the object is still the same.  At a later time call kobject_put() to finish | 
|  | 416 | the cleanup of the memory associated with the kobject. | 
|  | 417 |  | 
|  | 418 | kobject_del() can be used to drop the reference to the parent object, if | 
|  | 419 | circular references are constructed.  It is valid in some cases for a | 
|  | 420 | parent object to reference a child.  Circular references *must* be broken | 
|  | 421 | with an explicit call to kobject_del(), so that the release functions will be | 
|  | 422 | called, and the objects in the former circle release each other. | 
|  | 423 |  | 
|  | 424 |  | 
|  | 425 | Example code to copy from | 
|  | 426 | ========================= | 
|  | 427 |  | 
|  | 428 | For a more complete example of using ksets and kobjects properly, see the | 
|  | 429 | example programs samples/kobject/{kobject-example.c,kset-example.c}, | 
|  | 430 | which will be built as loadable modules if you select CONFIG_SAMPLE_KOBJECT. |