The seq_file interface

	Copyright 2003 Jonathan Corbet <corbet@lwn.net>
	This file is originally from the LWN.net Driver Porting series at
	http://lwn.net/Articles/driver-porting/
 | 6 |  | 
 | 7 |  | 
There are numerous ways for a device driver (or other kernel component) to
provide information to the user or system administrator.  One useful
technique is the creation of virtual files, in debugfs, /proc or elsewhere.
Virtual files can provide human-readable output that is easy to get at
without any special utility programs; they can also make life easier for
script writers. It is not surprising that the use of virtual files has
grown over the years.

Creating those files correctly has always been a bit of a challenge,
however. It is not that hard to make a virtual file which returns a
string. But life gets trickier if the output is long - anything greater
than an application is likely to read in a single operation.  Handling
multiple reads (and seeks) requires careful attention to the reader's
position within the virtual file - that position is, likely as not, in the
middle of a line of output. The kernel has traditionally had a number of
implementations that got this wrong.

The 2.6 kernel contains a set of functions (implemented by Alexander Viro)
which are designed to make it easy for virtual file creators to get it
right.

The seq_file interface is available via <linux/seq_file.h>. There are
three aspects to seq_file:

     * An iterator interface which lets a virtual file implementation
       step through the objects it is presenting.

     * Some utility functions for formatting objects for output without
       needing to worry about things like output buffers.

     * A set of canned file_operations which implement most operations on
       the virtual file.

We'll look at the seq_file interface via an extremely simple example: a
loadable module which creates a file called /proc/sequence. The file, when
read, simply produces a set of increasing integer values, one per line. The
sequence will continue until the user loses patience and finds something
better to do. The file is seekable, in that one can do something like the
following:

    dd if=/proc/sequence of=out1 count=1
    dd if=/proc/sequence skip=1 of=out2 count=1

Concatenating the output files out1 and out2 then gives the right
result. Yes, it is a thoroughly useless module, but the point is to show
how the mechanism works without getting lost in other details.  (Those
wanting to see the full source for this module can find it at
http://lwn.net/Articles/22359/).
 | 56 |  | 
Deprecated create_proc_entry

Note that the above article uses create_proc_entry(), which was removed in
kernel 3.10. Current versions require the following update:

-	entry = create_proc_entry("sequence", 0, NULL);
-	if (entry)
-		entry->proc_fops = &ct_file_ops;
+	entry = proc_create("sequence", 0, NULL, &ct_file_ops);
 | 66 |  | 
The iterator interface

Modules implementing a virtual file with seq_file must implement an
iterator object that allows stepping through the data of interest
during a "session" (roughly one read() system call).  If the iterator
is able to move to a specific position - like the file the module
implements, though with freedom to map the position number to a sequence
location in whatever way is convenient - the iterator need only exist
transiently during a session.  If the iterator cannot easily find a
numerical position but works well with a first/next interface, the
iterator can be stored in the private data area and continue from one
session to the next.

A seq_file implementation that is formatting firewall rules from a
table, for example, could provide a simple iterator that interprets
position N as the Nth rule in the chain.  A seq_file implementation
that presents the content of a potentially volatile linked list
might record a pointer into that list, provided that can be done
without risk of the current location being removed.

Positioning can thus be done in whatever way makes the most sense for
the generator of the data, which need not be aware of how a position
translates to an offset in the virtual file. The one obvious exception
is that a position of zero should indicate the beginning of the file.

The /proc/sequence iterator just uses the count of the next number it
will output as its position.

Four functions must be implemented to make the iterator work. The
first, called start(), starts a session and takes a position as an
argument, returning an iterator which will start reading at that
position.  The pos passed to start() will always be either zero, or
the most recent pos used in the previous session.

For our simple sequence example, the start() function looks like:

	static void *ct_seq_start(struct seq_file *s, loff_t *pos)
	{
		loff_t *spos = kmalloc(sizeof(loff_t), GFP_KERNEL);
		if (!spos)
			return NULL;
		*spos = *pos;
		return spos;
	}

The entire data structure for this iterator is a single loff_t value
holding the current position. There is no upper bound for the sequence
iterator, but that will not be the case for most other seq_file
implementations; in most cases the start() function should check for a
"past end of file" condition and return NULL if need be.
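To make the "past end of file" check concrete, here is a minimal user-space
sketch (not kernel code) of a bounded iterator that treats position N as
the Nth entry of a small table, in the spirit of the firewall-rule example
above. The table and all model_* names are hypothetical, and plain
long long stands in for loff_t.

```c
#include <stddef.h>

/* Hypothetical rule table standing in for real kernel data. */
struct model_rule {
	int id;
};

static struct model_rule model_rules[] = { { 10 }, { 20 }, { 30 } };
#define MODEL_NR_RULES (sizeof(model_rules) / sizeof(model_rules[0]))

/* start(): map position N to the Nth rule, or NULL past the end. */
static void *model_rule_start(long long pos)
{
	if (pos < 0 || (unsigned long long)pos >= MODEL_NR_RULES)
		return NULL;
	return &model_rules[pos];
}

/* next(): advance the position, then look it up the same way. */
static void *model_rule_next(void *v, long long *pos)
{
	(void)v;	/* the position alone identifies the next rule */
	++*pos;
	return model_rule_start(*pos);
}
```

Because the position fully determines the iterator here, nothing needs to
be allocated in start() or freed in stop(); a real implementation would
still have to guard the table against concurrent modification.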
 | 118 |  | 
For more complicated applications, the private field of the seq_file
structure can be used to hold state from session to session.  There is
also a special value which can be returned by the start() function
called SEQ_START_TOKEN; it can be used if you wish to instruct your
show() function (described below) to print a header at the top of the
output. SEQ_START_TOKEN should only be used if the offset is zero,
however.
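The header idea can be sketched in user space as follows. This is only a
model: MODEL_START_TOKEN is a stand-in for SEQ_START_TOKEN (its value here
is an assumption), the name table is made up, and show() writes into a
caller-supplied buffer with snprintf() rather than into a seq_file.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for SEQ_START_TOKEN; the value is an assumption of this model. */
#define MODEL_START_TOKEN ((void *)1)

/* Hypothetical data to list below the header. */
static const char *model_names[] = { "eth0", "lo" };
#define MODEL_NR_NAMES 2

/* Position 0 is the header; position N (N >= 1) is entry N - 1. */
static void *model_hdr_start(long long pos)
{
	if (pos < 0)
		return NULL;
	if (pos == 0)
		return MODEL_START_TOKEN;
	if (pos - 1 >= MODEL_NR_NAMES)
		return NULL;
	return &model_names[pos - 1];
}

static void *model_hdr_next(void *v, long long *pos)
{
	(void)v;
	++*pos;
	return model_hdr_start(*pos);
}

/* show(): print the header for the token, otherwise one entry. */
static int model_hdr_show(char *buf, size_t len, void *v)
{
	if (v == MODEL_START_TOKEN)
		return snprintf(buf, len, "Name\n");
	return snprintf(buf, len, "%s\n", *(const char **)v);
}
```

Note that the header occupies position zero, so every data entry is
shifted by one; that is why the token must never be returned for a
non-zero offset.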
 | 126 |  | 
The next function to implement is called, amazingly, next(); its job is to
move the iterator forward to the next position in the sequence.  The
example module can simply increment the position by one; more useful
modules will do what is needed to step through some data structure. The
next() function returns a new iterator, or NULL if the sequence is
complete. Here's the example version:

	static void *ct_seq_next(struct seq_file *s, void *v, loff_t *pos)
	{
		loff_t *spos = v;
		*pos = ++*spos;
		return spos;
	}

The stop() function closes a session; its job, of course, is to clean
up. If dynamic memory is allocated for the iterator, stop() is the
place to free it; if a lock was taken by start(), stop() must release
that lock.  The value that *pos was set to by the last next() call
before stop() is remembered, and used for the first start() call of
the next session unless lseek() has been called on the file; in that
case the next start() will be asked to start at position zero.

	static void ct_seq_stop(struct seq_file *s, void *v)
	{
		kfree(v);
	}
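How a remembered position carries over from one session to the next can be
illustrated with a user-space model of the example iterator. This is a
sketch under stated assumptions: malloc()/free() replace kmalloc()/kfree(),
long long replaces loff_t, and model_ct_session() is a hypothetical,
greatly simplified stand-in for the driving loop inside the seq_file code.

```c
#include <stdio.h>
#include <stdlib.h>

static int model_live_iters;	/* iterators allocated but not yet freed */

static void *model_ct_start(long long *pos)
{
	long long *spos = malloc(sizeof(*spos));

	if (!spos)
		return NULL;
	model_live_iters++;
	*spos = *pos;
	return spos;
}

static void *model_ct_next(void *v, long long *pos)
{
	long long *spos = v;

	*pos = ++*spos;
	return spos;
}

static void model_ct_stop(void *v)
{
	if (v)
		model_live_iters--;
	free(v);
}

/*
 * One "session": emit up to nr items starting at *pos.  *pos is left at
 * the value stored by the last next() call, so the following session
 * resumes exactly there.
 */
static int model_ct_session(long long *pos, int nr, char *buf, size_t len)
{
	void *v = model_ct_start(pos);
	size_t used = 0;

	while (v && nr-- > 0 && used < len) {
		used += (size_t)snprintf(buf + used, len - used, "%lld\n",
					 *(long long *)v);
		v = model_ct_next(v, pos);
	}
	model_ct_stop(v);
	return (int)used;
}
```

Calling model_ct_session() twice with the same pos variable yields
consecutive runs of numbers, and every iterator allocated by start() is
released by the matching stop().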
 | 153 |  | 
Finally, the show() function should format the object currently pointed to
by the iterator for output.  The example module's show() function is:

	static int ct_seq_show(struct seq_file *s, void *v)
	{
		loff_t *spos = v;
		seq_printf(s, "%lld\n", (long long)*spos);
		return 0;
	}

If all is well, the show() function should return zero.  A negative error
code in the usual manner indicates that something went wrong; it will be
passed back to user space.  This function can also return SEQ_SKIP, which
causes the current item to be skipped; if the show() function has already
generated output before returning SEQ_SKIP, that output will be dropped.
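The effect of the three kinds of return value can be modelled in user
space. In this sketch, MODEL_SKIP stands in for SEQ_SKIP (its value is an
assumption), struct model_buf is a hypothetical fixed-size output buffer,
and model_emit_range() plays the role of the seq_file code calling show():
it remembers the buffer fill level before each call and rolls back to it
when the item is skipped.

```c
#include <stddef.h>
#include <stdio.h>

#define MODEL_SKIP 1	/* stand-in for SEQ_SKIP; the value is assumed */

struct model_buf {
	char data[64];
	size_t count;	/* bytes of valid output in data */
};

/* show(): always writes the item, then asks for odd items to be skipped. */
static int model_even_show(struct model_buf *m, int item)
{
	int len;

	if (m->count >= sizeof(m->data))
		return -1;	/* stand-in for a negative error code */
	len = snprintf(m->data + m->count, sizeof(m->data) - m->count,
		       "%d\n", item);
	if (len < 0 || (size_t)len >= sizeof(m->data) - m->count)
		return -1;
	m->count += (size_t)len;
	return (item % 2) ? MODEL_SKIP : 0;
}

/* Driver: stop on error, drop the output of skipped items. */
static int model_emit_range(struct model_buf *m, int first, int last)
{
	int item;

	for (item = first; item <= last; item++) {
		size_t offs = m->count;
		int err = model_even_show(m, item);

		if (err < 0)
			return err;
		if (err > 0) {
			m->count = offs;	/* discard partial output */
			m->data[offs] = '\0';
		}
	}
	return 0;
}
```

Running the driver over 0..4 leaves only the even numbers in the buffer,
even though show() wrote every item before deciding to skip it.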
 | 169 |  | 
We will look at seq_printf() in a moment. But first, the definition of the
seq_file iterator is finished by creating a seq_operations structure with
the four functions we have just defined:

	static const struct seq_operations ct_seq_ops = {
		.start = ct_seq_start,
		.next  = ct_seq_next,
		.stop  = ct_seq_stop,
		.show  = ct_seq_show
	};

This structure will be needed to tie our iterator to the /proc file in
a little bit.

It's worth noting that the iterator value returned by start() and
manipulated by the other functions is considered to be completely opaque by
the seq_file code. It can thus be anything that is useful in stepping
through the data to be output. Counters can be useful, but it could also be
a direct pointer into an array or linked list. Anything goes, as long as
the programmer is aware that things can happen between calls to the
iterator function. However, the seq_file code (by design) will not sleep
between the calls to start() and stop(), so holding a lock during that time
is a reasonable thing to do. The seq_file code will also avoid taking any
other locks while the iterator is active.
 | 194 |  | 
 | 195 |  | 
Formatted output

The seq_file code manages positioning within the output created by the
iterator and getting it into the user's buffer. But, for that to work, that
output must be passed to the seq_file code. Some utility functions have
been defined which make this task easy.

Most code will simply use seq_printf(), which works pretty much like
printk(), but which requires the seq_file pointer as an argument.

For straight character output, the following functions may be used:

	seq_putc(struct seq_file *m, char c);
	seq_puts(struct seq_file *m, const char *s);
	seq_escape(struct seq_file *m, const char *s, const char *esc);

The first two output a single character and a string, just like one would
expect. seq_escape() is like seq_puts(), except that any character in s
which is in the string esc will be represented in octal form in the output.
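As an illustration of the octal escaping, here is a user-space sketch. It
is not the kernel's seq_escape(): model_escape() is a hypothetical helper
that writes to a plain character buffer, and it assumes each escaped
character is emitted as a backslash followed by three octal digits.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy s into out, replacing every character that appears in esc with a
 * backslash and three octal digits.  Returns the length written, or -1
 * if out (of size len) is too small.
 */
static int model_escape(char *out, size_t len, const char *s,
			const char *esc)
{
	size_t n = 0;

	for (; *s; s++) {
		unsigned char c = (unsigned char)*s;

		if (strchr(esc, c)) {
			if (n + 4 >= len)	/* 4 chars plus the NUL */
				return -1;
			out[n++] = '\\';
			out[n++] = (char)('0' + ((c >> 6) & 3));
			out[n++] = (char)('0' + ((c >> 3) & 7));
			out[n++] = (char)('0' + (c & 7));
		} else {
			if (n + 1 >= len)
				return -1;
			out[n++] = (char)c;
		}
	}
	if (n >= len)
		return -1;
	out[n] = '\0';
	return (int)n;
}
```

With esc set to " \n", the input "a b\n" becomes a\040b\012, which keeps
each record on one line and free of embedded spaces for script writers.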
 | 215 |  | 
There is also a pair of functions for printing filenames:

	int seq_path(struct seq_file *m, const struct path *path,
		     const char *esc);
	int seq_path_root(struct seq_file *m, const struct path *path,
			  const struct path *root, const char *esc);

Here, path indicates the file of interest, and esc is a set of characters
which should be escaped in the output.  A call to seq_path() will output
the path relative to the current process's filesystem root.  If a different
root is desired, it can be used with seq_path_root().  If it turns out that
path cannot be reached from root, seq_path_root() returns SEQ_SKIP.

A function producing complicated output may want to check
	bool seq_has_overflowed(struct seq_file *m);
and avoid further seq_<output> calls if true is returned.

A true return from seq_has_overflowed() means that the seq_file buffer will
be discarded; the seq_file read code will then allocate a larger buffer and
call show() again to retry printing.
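The overflow-and-retry behaviour can be modelled in user space as below.
Everything named model_* is hypothetical: struct model_seq mimics only the
buf, size and count fields, overflow is recorded by setting count equal to
size, and model_read_record() doubles the buffer on each retry. The
doubling policy is an assumption of this sketch.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

struct model_seq {
	char *buf;
	size_t size;
	size_t count;
};

static int model_has_overflowed(const struct model_seq *m)
{
	return m->count == m->size;
}

static void model_printf(struct model_seq *m, const char *fmt, ...)
{
	va_list ap;
	int len;

	if (m->count >= m->size)
		return;
	va_start(ap, fmt);
	len = vsnprintf(m->buf + m->count, m->size - m->count, fmt, ap);
	va_end(ap);
	if (len < 0 || (size_t)len >= m->size - m->count)
		m->count = m->size;	/* mark the buffer as overflowed */
	else
		m->count += (size_t)len;
}

/* A record with several fields; stops early once the buffer is full. */
static int model_show_record(struct model_seq *m)
{
	int i;

	for (i = 0; i < 8; i++) {
		if (model_has_overflowed(m))
			break;
		model_printf(m, "field%d=%d\n", i, i * i);
	}
	return 0;
}

/* Retry with a doubled buffer until the record fits; returns the tries. */
static int model_read_record(struct model_seq *m)
{
	int tries = 1;

	for (;;) {
		m->count = 0;
		model_show_record(m);
		if (!model_has_overflowed(m))
			return tries;
		free(m->buf);
		m->size *= 2;
		m->buf = malloc(m->size);
		if (!m->buf)
			return -1;
		tries++;
	}
}
```

The early exit in model_show_record() is the point of checking for
overflow: once the buffer is known to be too small, any further formatting
work would be thrown away anyway.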
 | 236 |  | 
 | 237 |  | 
Making it all work

So far, we have a nice set of functions which can produce output within the
seq_file system, but we have not yet turned them into a file that a user
can see. Creating a file within the kernel requires, of course, the
creation of a set of file_operations which implement the operations on that
file. The seq_file interface provides a set of canned operations which do
most of the work. The virtual file author still must implement the open()
method, however, to hook everything up. The open function is often a single
line, as in the example module:

	static int ct_open(struct inode *inode, struct file *file)
	{
		return seq_open(file, &ct_seq_ops);
	}

Here, the call to seq_open() takes the seq_operations structure we created
before, and gets set up to iterate through the virtual file.

On a successful open, seq_open() stores the struct seq_file pointer in
file->private_data. If you have an application where the same iterator can
be used for more than one file, you can store an arbitrary pointer in the
private field of the seq_file structure; that value can then be retrieved
by the iterator functions.

There is also a wrapper function to seq_open() called seq_open_private(). It
allocates a zero-filled block of memory with kmalloc() and stores a pointer
to it in the private field of the seq_file structure, returning 0 on
success. The block size is specified in a third parameter to the function,
e.g.:

	static int ct_open(struct inode *inode, struct file *file)
	{
		return seq_open_private(file, &ct_seq_ops,
					sizeof(struct mystruct));
	}

There is also a variant function, __seq_open_private(), which is functionally
identical except that, if successful, it returns the pointer to the allocated
memory block, allowing further initialisation, e.g.:

	static int ct_open(struct inode *inode, struct file *file)
	{
		struct mystruct *p =
			__seq_open_private(file, &ct_seq_ops, sizeof(*p));

		if (!p)
			return -ENOMEM;

		p->foo = bar; /* initialize my stuff */
			...
		p->baz = true;

		return 0;
	}

A corresponding close function, seq_release_private(), is available which
frees the memory allocated in the corresponding open.
 | 295 |  | 
The other operations of interest - read(), llseek(), and release() - are
all implemented by the seq_file code itself. So a virtual file's
file_operations structure will look like:

	static const struct file_operations ct_file_ops = {
		.owner   = THIS_MODULE,
		.open    = ct_open,
		.read    = seq_read,
		.llseek  = seq_lseek,
		.release = seq_release
	};

There is also a seq_release_private() which passes the contents of the
seq_file private field to kfree() before releasing the structure.

The final step is the creation of the /proc file itself. In the example
code, that is done in the initialization code in the usual way:

	static int ct_init(void)
	{
		struct proc_dir_entry *entry;

		entry = proc_create("sequence", 0, NULL, &ct_file_ops);
		if (!entry)
			return -ENOMEM;
		return 0;
	}

	module_init(ct_init);

And that is pretty much it.
 | 325 |  | 
 | 326 |  | 
seq_list

If your file will be iterating through a linked list, you may find these
routines useful:

	struct list_head *seq_list_start(struct list_head *head,
					 loff_t pos);
	struct list_head *seq_list_start_head(struct list_head *head,
					      loff_t pos);
	struct list_head *seq_list_next(void *v, struct list_head *head,
					loff_t *ppos);

These helpers will interpret pos as a position within the list and iterate
accordingly.  Your start() and next() functions need only invoke the
seq_list_* helpers with a pointer to the appropriate list_head structure.
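What "interpret pos as a position within the list" means can be sketched
with a user-space model. The struct model_list type below is a hypothetical
miniature of struct list_head, long long replaces loff_t, and the two
helpers mimic seq_list_start() and seq_list_next() only in outline.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on struct list_head. */
struct model_list {
	struct model_list *next, *prev;
};

struct model_item {
	int value;
	struct model_list link;
};

/* Recover the item that embeds a given link (a tiny container_of). */
#define model_entry(ptr) \
	((struct model_item *)((char *)(ptr) - offsetof(struct model_item, link)))

static void model_list_init(struct model_list *head)
{
	head->next = head->prev = head;
}

static void model_list_add_tail(struct model_list *node,
				struct model_list *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Return the link at position pos, or NULL if the list is shorter. */
static struct model_list *model_list_start(struct model_list *head,
					   long long pos)
{
	struct model_list *lh;

	for (lh = head->next; lh != head; lh = lh->next)
		if (pos-- == 0)
			return lh;
	return NULL;
}

/* Step to the following link and bump *ppos; NULL at the end. */
static struct model_list *model_list_next(void *v, struct model_list *head,
					  long long *ppos)
{
	struct model_list *lh = ((struct model_list *)v)->next;

	++*ppos;
	return lh == head ? NULL : lh;
}
```

Finding the start position walks the list from the head, so each new
session costs time proportional to pos; stepping with next() is constant
time.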
 | 342 |  | 
 | 343 |  | 
The extra-simple version

For extremely simple virtual files, there is an even easier interface.  A
module can define only the show() function, which should create all the
output that the virtual file will contain. The file's open() method then
calls:

	int single_open(struct file *file,
			int (*show)(struct seq_file *m, void *p),
			void *data);

When output time comes, the show() function will be called once. The data
value given to single_open() can be found in the private field of the
seq_file structure. When using single_open(), the programmer should use
single_release() instead of seq_release() in the file_operations structure
to avoid a memory leak.