------------------------------------------------------------------------------
                       T H E  /proc   F I L E S Y S T E M
------------------------------------------------------------------------------
/proc/sys         Terrehon Bowden <terrehon@pacbell.net>        October 7 1999
                  Bodo Bauer <bb@ricochet.net>

2.4.x update      Jorge Nerin <comandante@zaralinux.com>      November 14 2000
move /proc/sys    Shen Feng <shen@cn.fujitsu.com>                 April 1 2009
------------------------------------------------------------------------------
Version 1.3                                              Kernel version 2.2.12
                                              Kernel version 2.4.0-test11-pre4
------------------------------------------------------------------------------
fixes/update part 1.1  Stefani Seibold <stefani@seibold.net>       June 9 2009
|  | 14 |  | 
Table of Contents
-----------------

  0     Preface
  0.1   Introduction/Credits
  0.2   Legal Stuff

  1     Collecting System Information
  1.1   Process-Specific Subdirectories
  1.2   Kernel data
  1.3   IDE devices in /proc/ide
  1.4   Networking info in /proc/net
  1.5   SCSI info
  1.6   Parallel port info in /proc/parport
  1.7   TTY info in /proc/tty
  1.8   Miscellaneous kernel statistics in /proc/stat
  1.9   Ext4 file system parameters

  2     Modifying System Parameters

  3     Per-Process Parameters
  3.1   /proc/<pid>/oom_adj & /proc/<pid>/oom_score_adj - Adjust the oom-killer
        score
  3.2   /proc/<pid>/oom_score - Display current oom-killer score
  3.3   /proc/<pid>/io - Display the IO accounting fields
  3.4   /proc/<pid>/coredump_filter - Core dump filtering settings
  3.5   /proc/<pid>/mountinfo - Information about mounts
  3.6   /proc/<pid>/comm  & /proc/<pid>/task/<tid>/comm
  3.7   /proc/<pid>/task/<tid>/children - Information about task children
  3.8   /proc/<pid>/fdinfo/<fd> - Information about opened file
  3.9   /proc/<pid>/map_files - Information about memory mapped files
  3.10  /proc/<pid>/timerslack_ns - Task timerslack value
  3.11  /proc/<pid>/patch_state - Livepatch patch operation state

  4     Configuring procfs
  4.1   Mount options
|  | 52 | ------------------------------------------------------------------------------ | 
|  | 53 | Preface | 
|  | 54 | ------------------------------------------------------------------------------ | 
|  | 55 |  | 
|  | 56 | 0.1 Introduction/Credits | 
|  | 57 | ------------------------ | 
|  | 58 |  | 
This documentation is part of a soon (or so we hope) to be released book on
the SuSE Linux distribution. As there is no complete documentation for the
/proc file system and we've used many freely available sources to write these
chapters, it seems only fair to give the work back to the Linux community.
This work is based on the 2.2.* kernel version and the upcoming 2.4.*. We're
afraid it's still far from complete, but we hope it will be useful. As far as
we know, it is the first 'all-in-one' document about the /proc file system. It
is focused on the Intel x86 hardware, so if you are looking for PPC, ARM,
SPARC, AXP, etc., features, you probably won't find what you are looking for.
It also only covers IPv4 networking, not IPv6 or other protocols - sorry. But
additions and patches are welcome and will be added to this document if you
mail them to Bodo.
|  | 71 |  | 
|  | 72 | We'd like  to  thank Alan Cox, Rik van Riel, and Alexey Kuznetsov and a lot of | 
|  | 73 | other people for help compiling this documentation. We'd also like to extend a | 
|  | 74 | special thank  you to Andi Kleen for documentation, which we relied on heavily | 
|  | 75 | to create  this  document,  as well as the additional information he provided. | 
|  | 76 | Thanks to  everybody  else  who contributed source or docs to the Linux kernel | 
|  | 77 | and helped create a great piece of software... :) | 
|  | 78 |  | 
|  | 79 | If you  have  any comments, corrections or additions, please don't hesitate to | 
|  | 80 | contact Bodo  Bauer  at  bb@ricochet.net.  We'll  be happy to add them to this | 
|  | 81 | document. | 
|  | 82 |  | 
The latest version of this document is available online at
http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html

If the above link does not work for you, you could try the kernel mailing
list at linux-kernel@vger.kernel.org and/or try to reach me at
comandante@zaralinux.com.
|  | 89 |  | 
|  | 90 | 0.2 Legal Stuff | 
|  | 91 | --------------- | 
|  | 92 |  | 
|  | 93 | We don't  guarantee  the  correctness  of this document, and if you come to us | 
|  | 94 | complaining about  how  you  screwed  up  your  system  because  of  incorrect | 
|  | 95 | documentation, we won't feel responsible... | 
|  | 96 |  | 
|  | 97 | ------------------------------------------------------------------------------ | 
|  | 98 | CHAPTER 1: COLLECTING SYSTEM INFORMATION | 
|  | 99 | ------------------------------------------------------------------------------ | 
|  | 100 |  | 
|  | 101 | ------------------------------------------------------------------------------ | 
|  | 102 | In This Chapter | 
|  | 103 | ------------------------------------------------------------------------------ | 
|  | 104 | * Investigating  the  properties  of  the  pseudo  file  system  /proc and its | 
|  | 105 | ability to provide information on the running Linux system | 
|  | 106 | * Examining /proc's structure | 
|  | 107 | * Uncovering  various  information  about the kernel and the processes running | 
|  | 108 | on the system | 
|  | 109 | ------------------------------------------------------------------------------ | 
|  | 110 |  | 
|  | 111 |  | 
|  | 112 | The proc  file  system acts as an interface to internal data structures in the | 
|  | 113 | kernel. It  can  be  used to obtain information about the system and to change | 
|  | 114 | certain kernel parameters at runtime (sysctl). | 
|  | 115 |  | 
|  | 116 | First, we'll  take  a  look  at the read-only parts of /proc. In Chapter 2, we | 
|  | 117 | show you how you can use /proc/sys to change settings. | 
|  | 118 |  | 
|  | 119 | 1.1 Process-Specific Subdirectories | 
|  | 120 | ----------------------------------- | 
|  | 121 |  | 
|  | 122 | The directory  /proc  contains  (among other things) one subdirectory for each | 
|  | 123 | process running on the system, which is named after the process ID (PID). | 
|  | 124 |  | 
|  | 125 | The link  self  points  to  the  process reading the file system. Each process | 
|  | 126 | subdirectory has the entries listed in Table 1-1. | 
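
As a minimal sketch of the layout described above, the following Python
helper lists the process subdirectories by looking for entries whose names
consist only of digits. The function name list_pids and the proc_root
parameter are illustrative choices, not part of any kernel interface.

```python
import os


def list_pids(proc_root="/proc"):
    """Return the sorted PIDs that have a subdirectory under proc_root."""
    pids = []
    for name in os.listdir(proc_root):
        # Process subdirectories are named after the PID, so the name is
        # made up of ASCII digits only; everything else is skipped.
        if name.isascii() and name.isdigit():
            if os.path.isdir(os.path.join(proc_root, name)):
                pids.append(int(name))
    return sorted(pids)
```

On a running Linux system, list_pids() would be expected to include
os.getpid(), and os.readlink("/proc/self") should name the directory of the
process doing the lookup.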
|  | 127 |  | 
|  | 128 |  | 
Table 1-1: Process specific entries in /proc
..............................................................................
 File          Content
 clear_refs    Clears page referenced bits shown in smaps output
 cmdline       Command line arguments
 cpu           Current and last cpu in which it was executed (2.4)(smp)
 cwd           Link to the current working directory
 environ       Values of environment variables
 exe           Link to the executable of this process
 fd            Directory, which contains all file descriptors
 maps          Memory maps to executables and library files (2.4)
 mem           Memory held by this process
 root          Link to the root directory of this process
 stat          Process status
 statm         Process memory status information
 status        Process status in human readable form
 wchan         Present with CONFIG_KALLSYMS=y: it shows the kernel function
               symbol the task is blocked in - or "0" if not blocked.
 pagemap       Page table
 stack         Report full stack trace, enable via CONFIG_STACKTRACE
 smaps         An extension based on maps, showing the memory consumption of
               each mapping and flags associated with it
 numa_maps     An extension based on maps, showing the memory locality and
               binding policy as well as mem usage (in pages) of each mapping.
..............................................................................
|  | 154 |  | 
|  | 155 | For example, to get the status information of a process, all you have to do is | 
|  | 156 | read the file /proc/PID/status: | 
|  | 157 |  | 
  >cat /proc/self/status
  Name:   cat
  State:  R (running)
  Tgid:   5452
  Pid:    5452
  PPid:   743
  TracerPid:      0                                             (2.4)
  Uid:    501     501     501     501
  Gid:    100     100     100     100
  FDSize: 256
  Groups: 100 14 16
  VmPeak:     5004 kB
  VmSize:     5004 kB
  VmLck:         0 kB
  VmHWM:       476 kB
  VmRSS:       476 kB
  RssAnon:             352 kB
  RssFile:             120 kB
  RssShmem:              4 kB
  VmData:      156 kB
  VmStk:        88 kB
  VmExe:        68 kB
  VmLib:      1412 kB
  VmPTE:        20 kB
  VmSwap:        0 kB
  HugetlbPages:          0 kB
  CoreDumping:    0
  Threads:        1
  SigQ:   0/28578
  SigPnd: 0000000000000000
  ShdPnd: 0000000000000000
  SigBlk: 0000000000000000
  SigIgn: 0000000000000000
  SigCgt: 0000000000000000
  CapInh: 00000000fffffeff
  CapPrm: 0000000000000000
  CapEff: 0000000000000000
  CapBnd: ffffffffffffffff
  NoNewPrivs:     0
  Seccomp:        0
  voluntary_ctxt_switches:        0
  nonvoluntary_ctxt_switches:     1
|  | 200 |  | 
This shows you nearly the same information you would get if you viewed it with
the ps command. In fact, ps uses the proc file system to obtain its
information. But you get a more detailed view of the process by reading the
file /proc/PID/status. Its fields are described in Table 1-2.
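
Because every line of the status file has the form "Field: value", it is
easy to turn into a dictionary. The following is a minimal sketch, not a
definitive parser; the helper names parse_status and kb_field are made up for
this example.

```python
def parse_status(text):
    """Map each 'Field: value' line of a status file to its value string."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore anything that is not a 'Field: value' line
            fields[key.strip()] = value.strip()
    return fields


def kb_field(fields, name):
    """Return a 'NNN kB' style field as an integer number of kB."""
    return int(fields[name].split()[0])
```

A caller could pass the result of open("/proc/self/status").read() to
parse_status() and then look up, for example, kb_field(fields, "VmRSS").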
|  | 205 |  | 
The statm file contains more detailed information about the process
memory usage. Its seven fields are explained in Table 1-3. The stat file
contains detailed information about the process itself. Its fields are
explained in Table 1-4.
|  | 210 |  | 
(for SMP CONFIG users)
To make accounting scalable, RSS-related information is handled in an
asynchronous manner, so the values may not be very precise. To get a precise
snapshot of a given moment, read the /proc/<pid>/smaps file, which scans the
page table. It's slow but very precise.
|  | 216 |  | 
Table 1-2: Contents of the status files (as of 4.8)
..............................................................................
 Field                       Content
 Name                        filename of the executable
 Umask                       file mode creation mask
 State                       state (R is running, S is sleeping, D is sleeping
                             in an uninterruptible wait, Z is zombie,
                             T is traced or stopped)
 Tgid                        thread group ID
 Ngid                        NUMA group ID (0 if none)
 Pid                         process id
 PPid                        process id of the parent process
 TracerPid                   PID of process tracing this process (0 if not)
 Uid                         Real, effective, saved set, and file system UIDs
 Gid                         Real, effective, saved set, and file system GIDs
 FDSize                      number of file descriptor slots currently allocated
 Groups                      supplementary group list
 NStgid                      descendant namespace thread group ID hierarchy
 NSpid                       descendant namespace process ID hierarchy
 NSpgid                      descendant namespace process group ID hierarchy
 NSsid                       descendant namespace session ID hierarchy
 VmPeak                      peak virtual memory size
 VmSize                      total program size
 VmLck                       locked memory size
 VmPin                       pinned memory size
 VmHWM                       peak resident set size ("high water mark")
 VmRSS                       size of memory portions. It contains the three
                             following parts (VmRSS = RssAnon + RssFile + RssShmem)
 RssAnon                     size of resident anonymous memory
 RssFile                     size of resident file mappings
 RssShmem                    size of resident shmem memory (includes SysV shm,
                             mapping of tmpfs and shared anonymous mappings)
 VmData                      size of private data segments
 VmStk                       size of stack segments
 VmExe                       size of text segment
 VmLib                       size of shared library code
 VmPTE                       size of page table entries
 VmSwap                      amount of swap used by anonymous private data
                             (shmem swap usage is not included)
 HugetlbPages                size of hugetlb memory portions
 CoreDumping                 process's memory is currently being dumped
                             (killing the process may lead to a corrupted core)
 Threads                     number of threads
 SigQ                        number of signals queued/max. number for queue
 SigPnd                      bitmap of pending signals for the thread
 ShdPnd                      bitmap of shared pending signals for the process
 SigBlk                      bitmap of blocked signals
 SigIgn                      bitmap of ignored signals
 SigCgt                      bitmap of caught signals
 CapInh                      bitmap of inheritable capabilities
 CapPrm                      bitmap of permitted capabilities
 CapEff                      bitmap of effective capabilities
 CapBnd                      bitmap of capabilities bounding set
 NoNewPrivs                  no_new_privs, like prctl(PR_GET_NO_NEW_PRIVS, ...)
 Seccomp                     seccomp mode, like prctl(PR_GET_SECCOMP, ...)
 Cpus_allowed                mask of CPUs on which this process may run
 Cpus_allowed_list           Same as previous, but in "list format"
 Mems_allowed                mask of memory nodes allowed to this process
 Mems_allowed_list           Same as previous, but in "list format"
 voluntary_ctxt_switches     number of voluntary context switches
 nonvoluntary_ctxt_switches  number of non voluntary context switches
..............................................................................
|  | 279 |  | 
Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
..............................................................................
 Field    Content
 size     total program size (pages)         (same as VmSize in status)
 resident size of memory portions (pages)    (same as VmRSS in status)
 shared   number of pages that are shared    (i.e. backed by a file, same
                                             as RssFile+RssShmem in status)
 trs      number of pages that are 'code'    (not including libs; broken,
                                             includes data segment)
 lrs      number of pages of library         (always 0 on 2.6)
 drs      number of pages of data/stack      (including libs; broken,
                                             includes library text)
 dt       number of dirty pages              (always 0 on 2.6)
..............................................................................
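
Since statm reports page counts while status reports kB, comparing the two
requires a conversion by the page size. The sketch below names the seven
fields as in Table 1-3 and converts them. The 4096-byte default page size is
an assumption that holds on many x86 systems (os.sysconf("SC_PAGE_SIZE")
reports the real value), and the helper name parse_statm is made up for this
example.

```python
STATM_FIELDS = ("size", "resident", "shared", "trs", "lrs", "drs", "dt")


def parse_statm(text, page_size=4096):
    """Return (pages, kb): two dicts keyed by the Table 1-3 field names."""
    values = [int(v) for v in text.split()]
    if len(values) != len(STATM_FIELDS):
        raise ValueError("expected %d fields, got %d"
                         % (len(STATM_FIELDS), len(values)))
    pages = dict(zip(STATM_FIELDS, values))
    # Convert page counts to kB so they can be compared with status.
    kb = {name: count * page_size // 1024 for name, count in pages.items()}
    return pages, kb
```

With the made-up input "1251 119 31 17 0 39 0" and 4 kB pages, the size,
resident and shared fields convert to 5004 kB, 476 kB and 124 kB, which would
line up with VmSize, VmRSS and RssFile+RssShmem in the status example earlier
in this section.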
|  | 294 |  | 
|  | 295 |  | 
|  | 296 | Table 1-4: Contents of the stat files (as of 2.6.30-rc7) | 
|  | 297 | .............................................................................. | 
|  | 298 | Field          Content | 
|  | 299 | pid           process id | 
|  | 300 | tcomm         filename of the executable | 
|  | 301 | state         state (R is running, S is sleeping, D is sleeping in an | 
|  | 302 | uninterruptible wait, Z is zombie, T is traced or stopped) | 
|  | 303 | ppid          process id of the parent process | 
|  | 304 | pgrp          pgrp of the process | 
|  | 305 | sid           session id | 
|  | 306 | tty_nr        tty the process uses | 
|  | 307 | tty_pgrp      pgrp of the tty | 
|  | 308 | flags         task flags | 
|  | 309 | min_flt       number of minor faults | 
|  | 310 | cmin_flt      number of minor faults with child's | 
|  | 311 | maj_flt       number of major faults | 
|  | 312 | cmaj_flt      number of major faults with child's | 
|  | 313 | utime         user mode jiffies | 
|  | 314 | stime         kernel mode jiffies | 
|  | 315 | cutime        user mode jiffies with child's | 
|  | 316 | cstime        kernel mode jiffies with child's | 
|  | 317 | priority      priority level | 
|  | 318 | nice          nice level | 
|  | 319 | num_threads   number of threads | 
|  | 320 | it_real_value	(obsolete, always 0) | 
|  | 321 | start_time    time the process started after system boot | 
|  | 322 | vsize         virtual memory size | 
|  | 323 | rss           resident set memory size | 
|  | 324 | rsslim        current limit in bytes on the rss | 
|  | 325 | start_code    address above which program text can run | 
|  | 326 | end_code      address below which program text can run | 
|  | 327 | start_stack   address of the start of the main process stack | 
|  | 328 | esp           current value of ESP | 
|  | 329 | eip           current value of EIP | 
|  | 330 | pending       bitmap of pending signals | 
|  | 331 | blocked       bitmap of blocked signals | 
|  | 332 | sigign        bitmap of ignored signals | 
|  | 333 | sigcatch      bitmap of caught signals | 
|  | 334 | 0		(place holder, used to be the wchan address, use /proc/PID/wchan instead) | 
|  | 335 | 0             (place holder) | 
|  | 336 | 0             (place holder) | 
|  | 337 | exit_signal   signal to send to parent thread on exit | 
|  | 338 | task_cpu      which CPU the task is scheduled on | 
|  | 339 | rt_priority   realtime priority | 
|  | 340 | policy        scheduling policy (man sched_setscheduler) | 
|  | 341 | blkio_ticks   time spent waiting for block IO | 
|  | 342 | gtime         guest time of the task in jiffies | 
|  | 343 | cgtime        guest time of the task children in jiffies | 
|  | 344 | start_data    address above which program data+bss is placed | 
|  | 345 | end_data      address below which program data+bss is placed | 
|  | 346 | start_brk     address above which program heap can be expanded with brk() | 
|  | 347 | arg_start     address above which program command line is placed | 
|  | 348 | arg_end       address below which program command line is placed | 
|  | 349 | env_start     address above which program environment is placed | 
|  | 350 | env_end       address below which program environment is placed | 
|  | 351 | exit_code     the thread's exit_code in the form reported by the waitpid system call | 
|  | 352 | .............................................................................. | 
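
The stat file is a single line of space-separated values in the order of
Table 1-4. The sketch below decodes the leading fields up to rsslim. It
relies on one detail that the table does not show: the kernel prints the
tcomm field wrapped in parentheses, and the name itself may contain spaces,
so the line is split at the last ")" instead of on whitespace alone. The
helper name parse_stat is made up for this example.

```python
# Names of the fields that follow "pid (tcomm)", in Table 1-4 order.
STAT_FIELDS_AFTER_COMM = (
    "state", "ppid", "pgrp", "sid", "tty_nr", "tty_pgrp", "flags",
    "min_flt", "cmin_flt", "maj_flt", "cmaj_flt", "utime", "stime",
    "cutime", "cstime", "priority", "nice", "num_threads",
    "it_real_value", "start_time", "vsize", "rss", "rsslim",
)


def parse_stat(text):
    """Decode the leading fields of a /proc/PID/stat line into a dict."""
    open_idx = text.index("(")
    close_idx = text.rindex(")")  # last ')' ends the executable name
    result = {
        "pid": int(text[:open_idx]),
        "tcomm": text[open_idx + 1:close_idx],
    }
    rest = text[close_idx + 1:].split()
    for name, value in zip(STAT_FIELDS_AFTER_COMM, rest):
        # state is a single letter; every other field here is numeric.
        result[name] = value if name == "state" else int(value)
    return result
```

The remaining fields of Table 1-4 could be appended to the tuple in the same
way; they are left out here only to keep the example short.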
|  | 353 |  | 
The /proc/PID/maps file contains the currently mapped memory regions and
their access permissions.
|  | 356 |  | 
|  | 357 | The format is: | 
|  | 358 |  | 
|  | 359 | address           perms offset  dev   inode      pathname | 
|  | 360 |  | 
|  | 361 | 08048000-08049000 r-xp 00000000 03:00 8312       /opt/test | 
|  | 362 | 08049000-0804a000 rw-p 00001000 03:00 8312       /opt/test | 
|  | 363 | 0804a000-0806b000 rw-p 00000000 00:00 0          [heap] | 
|  | 364 | a7cb1000-a7cb2000 ---p 00000000 00:00 0 | 
|  | 365 | a7cb2000-a7eb2000 rw-p 00000000 00:00 0 | 
|  | 366 | a7eb2000-a7eb3000 ---p 00000000 00:00 0 | 
|  | 367 | a7eb3000-a7ed5000 rw-p 00000000 00:00 0 | 
|  | 368 | a7ed5000-a8008000 r-xp 00000000 03:00 4222       /lib/libc.so.6 | 
|  | 369 | a8008000-a800a000 r--p 00133000 03:00 4222       /lib/libc.so.6 | 
|  | 370 | a800a000-a800b000 rw-p 00135000 03:00 4222       /lib/libc.so.6 | 
|  | 371 | a800b000-a800e000 rw-p 00000000 00:00 0 | 
|  | 372 | a800e000-a8022000 r-xp 00000000 03:00 14462      /lib/libpthread.so.0 | 
|  | 373 | a8022000-a8023000 r--p 00013000 03:00 14462      /lib/libpthread.so.0 | 
|  | 374 | a8023000-a8024000 rw-p 00014000 03:00 14462      /lib/libpthread.so.0 | 
|  | 375 | a8024000-a8027000 rw-p 00000000 00:00 0 | 
|  | 376 | a8027000-a8043000 r-xp 00000000 03:00 8317       /lib/ld-linux.so.2 | 
|  | 377 | a8043000-a8044000 r--p 0001b000 03:00 8317       /lib/ld-linux.so.2 | 
|  | 378 | a8044000-a8045000 rw-p 0001c000 03:00 8317       /lib/ld-linux.so.2 | 
|  | 379 | aff35000-aff4a000 rw-p 00000000 00:00 0          [stack] | 
|  | 380 | ffffe000-fffff000 r-xp 00000000 00:00 0          [vdso] | 
|  | 381 |  | 
where "address" is the address range that the mapping occupies in the
process's address space, and "perms" is a set of permissions:
|  | 384 |  | 
|  | 385 | r = read | 
|  | 386 | w = write | 
|  | 387 | x = execute | 
|  | 388 | s = shared | 
|  | 389 | p = private (copy on write) | 
|  | 390 |  | 
"offset" is the offset into the mapping, "dev" is the device (major:minor), and
"inode" is the inode on that device. 0 indicates that no inode is associated
with the memory region, as would be the case with BSS (uninitialized data).
The "pathname" shows the name of the file associated with this mapping. If the
mapping is not associated with a file, the pathname is one of:

 [heap]                   = the heap of the program
 [stack]                  = the stack of the main process
 [vdso]                   = the "virtual dynamic shared object",
                            the kernel system call handler
 [anon:<name>]            = an anonymous mapping that has been
                            named by userspace

or, if empty, the mapping is anonymous.
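
The column layout described above can be decoded one line at a time. The
following is a minimal sketch, assuming the six-column format shown in the
example; the Mapping tuple and the parse_maps_line name are illustrative
only.

```python
from collections import namedtuple

Mapping = namedtuple("Mapping", "start end perms offset dev inode pathname")


def parse_maps_line(line):
    """Split one /proc/PID/maps line into its six columns."""
    # The pathname is optional and may contain spaces, so split at most
    # five times and keep whatever remains as the pathname.
    parts = line.split(None, 5)
    addr, perms, offset, dev, inode = parts[:5]
    pathname = parts[5].rstrip() if len(parts) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return Mapping(start, end, perms, int(offset, 16), dev, int(inode),
                   pathname)
```

For the first line of the example output, the difference between end and
start is 0x1000, that is, a single 4 kB page of /opt/test mapped "r-xp".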
|  | 405 |  | 
The /proc/PID/smaps file is an extension based on maps, showing the memory
consumption for each of the process's mappings. For each mapping there
is a series of lines such as the following:
|  | 409 |  | 
|  | 410 | 08048000-080bc000 r-xp 00000000 03:02 13130      /bin/bash | 
|  | 411 | Size:               1084 kB | 
|  | 412 | Rss:                 892 kB | 
|  | 413 | Pss:                 374 kB | 
|  | 414 | Shared_Clean:        892 kB | 
|  | 415 | Shared_Dirty:          0 kB | 
|  | 416 | Private_Clean:         0 kB | 
|  | 417 | Private_Dirty:         0 kB | 
|  | 418 | Referenced:          892 kB | 
|  | 419 | Anonymous:             0 kB | 
|  | 420 | LazyFree:              0 kB | 
|  | 421 | AnonHugePages:         0 kB | 
|  | 422 | ShmemPmdMapped:        0 kB | 
|  | 423 | Shared_Hugetlb:        0 kB | 
|  | 424 | Private_Hugetlb:       0 kB | 
|  | 425 | Swap:                  0 kB | 
|  | 426 | SwapPss:               0 kB | 
|  | 427 | KernelPageSize:        4 kB | 
|  | 428 | MMUPageSize:           4 kB | 
|  | 429 | Locked:                0 kB | 
|  | 430 | THPeligible:           0 | 
|  | 431 | VmFlags: rd ex mr mw me dw | 
|  | 432 | Name:           name from userspace | 
|  | 433 |  | 
The first of these lines shows the same information as is displayed for the
mapping in /proc/PID/maps. The remaining lines show the size of the mapping
(Size), the amount of the mapping that is currently resident in RAM (Rss), the
process's proportional share of this mapping (Pss), and the number of clean and
dirty shared and private pages in the mapping.
|  | 439 |  | 
|  | 440 | The "proportional set size" (PSS) of a process is the count of pages it has | 
|  | 441 | in memory, where each page is divided by the number of processes sharing it. | 
|  | 442 | So if a process has 1000 pages all to itself, and 1000 shared with one other | 
|  | 443 | process, its PSS will be 1500. | 
|  | 444 | Note that even a page which is part of a MAP_SHARED mapping, but has only | 
|  | 445 | a single pte mapped, i.e.  is currently used by only one process, is accounted | 
|  | 446 | as private and not as shared. | 
|  | 447 | "Referenced" indicates the amount of memory currently marked as referenced or | 
|  | 448 | accessed. | 
"Anonymous" shows the amount of memory that does not belong to any file.  Even
a mapping associated with a file may contain anonymous pages: when the mapping
is MAP_PRIVATE and a page is modified, the file page is replaced by a private
anonymous copy.
"LazyFree" shows the amount of memory which is marked by madvise(MADV_FREE).
The memory isn't freed immediately with madvise(). It's freed under memory
pressure if the memory is clean. Please note that the printed value might
be lower than the real value due to optimizations used in the current
implementation. If this is not desirable please file a bug report.
"AnonHugePages" shows the amount of memory backed by transparent hugepages.
"ShmemPmdMapped" shows the amount of shared (shmem/tmpfs) memory backed by
huge pages.
"Shared_Hugetlb" and "Private_Hugetlb" show the amounts of memory backed by
hugetlbfs pages, which are *not* counted in the "RSS" or "PSS" fields for
historical reasons. They are not included in the {Shared,Private}_{Clean,Dirty}
fields either.
|  | 463 | "Swap" shows how much would-be-anonymous memory is also used, but out on swap. | 
|  | 464 | For shmem mappings, "Swap" includes also the size of the mapped (and not | 
|  | 465 | replaced by copy-on-write) part of the underlying shmem object out on swap. | 
|  | 466 | "SwapPss" shows proportional swap share of this mapping. Unlike "Swap", this | 
|  | 467 | does not take into account swapped out page of underlying shmem objects. | 
|  | 468 | "Locked" indicates whether the mapping is locked in memory or not. | 
|  | 469 | "THPeligible" indicates whether the mapping is eligible for THP pages - 1 if | 
|  | 470 | true, 0 otherwise. | 
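
The PSS arithmetic and the per-mapping fields described above can be
illustrated with a short sketch. pss_pages() reproduces the definition (each
resident page counts as 1 divided by the number of processes sharing it),
and sum_smaps_field() adds up one kB field over all mappings of an smaps
file. Both helper names are made up for this example.

```python
def pss_pages(sharers_per_page):
    """Proportional set size in pages.

    sharers_per_page holds, for every resident page, the number of
    processes that currently map it.
    """
    return sum(1.0 / n for n in sharers_per_page)


def sum_smaps_field(text, field):
    """Add up one 'Field:   NNN kB' line over all mappings, in kB."""
    prefix = field + ":"
    total = 0
    for line in text.splitlines():
        if line.startswith(prefix):
            total += int(line.split()[1])
    return total


# The example from the text: 1000 private pages plus 1000 pages shared
# with one other process give a PSS of 1500 pages.
example_pss = pss_pages([1] * 1000 + [2] * 1000)
```

Summing "Pss" over every mapping with sum_smaps_field() gives the
proportional memory footprint of the whole process in kB.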
|  | 471 |  | 
The "VmFlags" field deserves a separate description. This member represents
the kernel flags associated with the particular virtual memory area, encoded
as two-letter codes. The codes are the following:
    rd  - readable
    wr  - writeable
    ex  - executable
    sh  - shared
    mr  - may read
    mw  - may write
    me  - may execute
    ms  - may share
    gd  - stack segment grows down
    pf  - pure PFN range
    dw  - disabled write to the mapped file
    lo  - pages are locked in memory
    io  - memory mapped I/O area
    sr  - sequential read advice provided
    rr  - random read advice provided
    dc  - do not copy area on fork
    de  - do not expand area on remapping
    ac  - area is accountable
    nr  - swap space is not reserved for the area
    ht  - area uses huge tlb pages
    ar  - architecture specific flag
    dd  - do not include area into core dump
    sd  - soft-dirty flag
    mm  - mixed map area
    hg  - huge page advice flag
    nh  - no-huge page advice flag
    mg  - mergeable advice flag
|  | 502 |  | 
Note that there is no guarantee that every flag and associated mnemonic will
be present in all further kernel releases. Things change: flags may vanish
and new ones may be added. The interpretation of their meaning might change
in the future as well. So each consumer of these flags has to check the
specific kernel version for the exact semantics.
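
A consumer of the VmFlags line could decode it with a lookup table built
from the list above. Because flags may come and go between kernel releases,
this sketch deliberately reports unknown codes as None instead of failing.
The names VM_FLAGS and decode_vmflags are made up for this example.

```python
# Two-letter codes and their meanings, copied from the list above.
VM_FLAGS = {
    "rd": "readable", "wr": "writeable", "ex": "executable", "sh": "shared",
    "mr": "may read", "mw": "may write", "me": "may execute",
    "ms": "may share", "gd": "stack segment grows down",
    "pf": "pure PFN range", "dw": "disabled write to the mapped file",
    "lo": "pages are locked in memory", "io": "memory mapped I/O area",
    "sr": "sequential read advice provided",
    "rr": "random read advice provided",
    "dc": "do not copy area on fork",
    "de": "do not expand area on remapping", "ac": "area is accountable",
    "nr": "swap space is not reserved for the area",
    "ht": "area uses huge tlb pages", "ar": "architecture specific flag",
    "dd": "do not include area into core dump", "sd": "soft-dirty flag",
    "mm": "mixed map area", "hg": "huge page advice flag",
    "nh": "no-huge page advice flag", "mg": "mergeable advice flag",
}


def decode_vmflags(line):
    """Return (code, meaning) pairs; meaning is None for unknown codes."""
    if line.startswith("VmFlags:"):
        line = line[len("VmFlags:"):]
    return [(code, VM_FLAGS.get(code)) for code in line.split()]
```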
|  | 508 |  | 
|  | 509 | The "Name" field will only be present on a mapping that has been named by | 
|  | 510 | userspace, and will show the name passed in by userspace. | 
|  | 511 |  | 
|  | 512 | This file is only present if the CONFIG_MMU kernel configuration option is | 
|  | 513 | enabled. | 
|  | 514 |  | 
|  | 515 | Note: reading /proc/PID/maps or /proc/PID/smaps is inherently racy (consistent | 
|  | 516 | output can be achieved only in the single read call). | 
|  | 517 | This typically manifests when doing partial reads of these files while the | 
|  | 518 | memory map is being modified.  Despite the races, we do provide the following | 
|  | 519 | guarantees: | 
|  | 520 |  | 
|  | 521 | 1) The mapped addresses never go backwards, which implies no two | 
|  | 522 | regions will ever overlap. | 
|  | 523 | 2) If there is something at a given vaddr during the entirety of the | 
|  | 524 | life of the smaps/maps walk, there will be some output for it. | 
|  | 525 |  | 
|  | 526 |  | 
|  | 527 | The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG | 
|  | 528 | bits on both physical and virtual pages associated with a process, and the | 
|  | 529 | soft-dirty bit on pte (see Documentation/admin-guide/mm/soft-dirty.rst | 
|  | 530 | for details). | 
|  | 531 | To clear the bits for all the pages associated with the process | 
|  | 532 | > echo 1 > /proc/PID/clear_refs | 
|  | 533 |  | 
|  | 534 | To clear the bits for the anonymous pages associated with the process | 
|  | 535 | > echo 2 > /proc/PID/clear_refs | 
|  | 536 |  | 
|  | 537 | To clear the bits for the file mapped pages associated with the process | 
|  | 538 | > echo 3 > /proc/PID/clear_refs | 
|  | 539 |  | 
|  | 540 | To clear the soft-dirty bit | 
|  | 541 | > echo 4 > /proc/PID/clear_refs | 
|  | 542 |  | 
|  | 543 | To reset the peak resident set size ("high water mark") to the process's | 
|  | 544 | current value: | 
|  | 545 | > echo 5 > /proc/PID/clear_refs | 
|  | 546 |  | 
|  | 547 | Any other value written to /proc/PID/clear_refs will have no effect. | 
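
The echo commands above can be wrapped in a small helper. Since values other
than 1 to 5 have no effect, this sketch rejects them up front so that a typo
does not pass silently. The names clear_refs and CLEAR_REFS_ACTIONS and the
proc_root parameter are made up for this example, and writing to the real
file requires permission to modify the target process.

```python
import os

# Values accepted by /proc/PID/clear_refs, as listed above.
CLEAR_REFS_ACTIONS = {
    1: "all pages",
    2: "anonymous pages",
    3: "file mapped pages",
    4: "soft-dirty bit",
    5: "peak resident set size",
}


def clear_refs(pid, action, proc_root="/proc"):
    """Write action to PID's clear_refs file and return the path used."""
    if action not in CLEAR_REFS_ACTIONS:
        raise ValueError("clear_refs ignores value %r" % (action,))
    path = os.path.join(proc_root, str(pid), "clear_refs")
    with open(path, "w") as f:
        f.write("%d\n" % action)  # same bytes that 'echo N' would write
    return path
```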
|  | 548 |  | 
The /proc/PID/pagemap file gives the PFN, which can be used to find the
pageflags using /proc/kpageflags and the number of times a page is mapped using
/proc/kpagecount. For a detailed explanation, see
Documentation/admin-guide/mm/pagemap.rst.
|  | 553 |  | 
|  | 554 | The /proc/pid/numa_maps is an extension based on maps, showing the memory | 
|  | 555 | locality and binding policy, as well as the memory usage (in pages) of | 
|  | 556 | each mapping. The output follows a general format in which the mapping details | 
|  | 557 | are summarized, separated by blank spaces, with one mapping per line: | 
|  | 558 |  | 
|  | 559 | address   policy    mapping details | 
|  | 560 |  | 
|  | 561 | 00400000 default file=/usr/local/bin/app mapped=1 active=0 N3=1 kernelpagesize_kB=4 | 
|  | 562 | 00600000 default file=/usr/local/bin/app anon=1 dirty=1 N3=1 kernelpagesize_kB=4 | 
|  | 563 | 3206000000 default file=/lib64/ld-2.12.so mapped=26 mapmax=6 N0=24 N3=2 kernelpagesize_kB=4 | 
|  | 564 | 320621f000 default file=/lib64/ld-2.12.so anon=1 dirty=1 N3=1 kernelpagesize_kB=4 | 
|  | 565 | 3206220000 default file=/lib64/ld-2.12.so anon=1 dirty=1 N3=1 kernelpagesize_kB=4 | 
|  | 566 | 3206221000 default anon=1 dirty=1 N3=1 kernelpagesize_kB=4 | 
|  | 567 | 3206800000 default file=/lib64/libc-2.12.so mapped=59 mapmax=21 active=55 N0=41 N3=18 kernelpagesize_kB=4 | 
|  | 568 | 320698b000 default file=/lib64/libc-2.12.so | 
|  | 569 | 3206b8a000 default file=/lib64/libc-2.12.so anon=2 dirty=2 N3=2 kernelpagesize_kB=4 | 
|  | 570 | 3206b8e000 default file=/lib64/libc-2.12.so anon=1 dirty=1 N3=1 kernelpagesize_kB=4 | 
|  | 571 | 3206b8f000 default anon=3 dirty=3 active=1 N3=3 kernelpagesize_kB=4 | 
|  | 572 | 7f4dc10a2000 default anon=3 dirty=3 N3=3 kernelpagesize_kB=4 | 
|  | 573 | 7f4dc10b4000 default anon=2 dirty=2 active=1 N3=2 kernelpagesize_kB=4 | 
|  | 574 | 7f4dc1200000 default file=/anon_hugepage\040(deleted) huge anon=1 dirty=1 N3=1 kernelpagesize_kB=2048 | 
|  | 575 | 7fff335f0000 default stack anon=3 dirty=3 N3=3 kernelpagesize_kB=4 | 
|  | 576 | 7fff3369d000 default mapped=1 mapmax=35 active=0 N3=1 kernelpagesize_kB=4 | 
|  | 577 |  | 
|  | 578 | Where: | 
|  | 579 | "address" is the starting address for the mapping; | 
|  | 580 | "policy" reports the NUMA memory policy set for the mapping (see Documentation/admin-guide/mm/numa_memory_policy.rst); | 
|  | 581 | "mapping details" summarizes mapping data such as mapping type, page usage counters, | 
|  | 582 | node locality page counters (N0 == node0, N1 == node1, ...) and the kernel page | 
|  | 583 | size, in KB, that is backing the mapping up. | 
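
A line in this format can be split mechanically. The following is a minimal
Python sketch, not part of any kernel tooling; the function name
parse_numa_maps_line is an illustrative assumption. It treats tokens that
contain "=" as key/value pairs and every other token (such as "huge" or
"stack") as a bare flag.

```python
def parse_numa_maps_line(line):
    """Split one numa_maps line into address, policy, details and flags."""
    fields = line.split()
    address = int(fields[0], 16)        # starting address is printed in hex
    policy = fields[1]
    details, flags = {}, []
    for token in fields[2:]:
        key, sep, value = token.partition("=")
        if not sep:
            flags.append(token)         # bare flag such as "huge" or "stack"
        elif value.isdigit():
            details[key] = int(value)   # page counters, N<node>, page size
        else:
            details[key] = value        # e.g. file=<path>
    return address, policy, details, flags


def per_node_pages(details):
    """Keep only the node locality counters (N0, N1, ...)."""
    return {k: v for k, v in details.items()
            if k.startswith("N") and k[1:].isdigit()}


# Sample line taken from the listing above.
sample = ("3206000000 default file=/lib64/ld-2.12.so mapped=26 mapmax=6 "
          "N0=24 N3=2 kernelpagesize_kB=4")
address, policy, details, flags = parse_numa_maps_line(sample)
print(hex(address), policy, per_node_pages(details))
```

Summing the per-node counters over all lines would give a rough picture of
where the pages of a process reside.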
|  | 584 |  | 
|  | 585 | 1.2 Kernel data | 
|  | 586 | --------------- | 
|  | 587 |  | 
|  | 588 | Similar to the process entries, the kernel data files give information about | 
|  | 589 | the running kernel. The files used to obtain this information are contained in | 
|  | 590 | /proc and are listed in Table 1-5. Not all of these will be present on your | 
|  | 591 | system: which files exist and which are missing depends on the kernel | 
|  | 592 | configuration and the loaded modules. | 
|  | 593 |  | 
|  | 594 | Table 1-5: Kernel info in /proc | 
|  | 595 | .............................................................................. | 
|  | 596 | File        Content | 
|  | 597 | apm         Advanced power management info | 
|  | 598 | buddyinfo   Kernel memory allocator information (see text)	(2.5) | 
|  | 599 | bus         Directory containing bus specific information | 
|  | 600 | cmdline     Kernel command line | 
|  | 601 | cpuinfo     Info about the CPU | 
|  | 602 | devices     Available devices (block and character) | 
|  | 603 | dma         Used DMA channels | 
|  | 604 | filesystems Supported filesystems | 
|  | 605 | driver	     Various drivers grouped here, currently rtc (2.4) | 
|  | 606 | execdomains Execdomains, related to security			(2.4) | 
|  | 607 | fb	     Frame Buffer devices				(2.4) | 
|  | 608 | fs	     File system parameters, currently nfs/exports	(2.4) | 
|  | 609 | ide         Directory containing info about the IDE subsystem | 
|  | 610 | interrupts  Interrupt usage | 
|  | 611 | iomem	     Memory map						(2.4) | 
|  | 612 | ioports     I/O port usage | 
|  | 613 | irq	     Masks for irq to cpu affinity			(2.4)(smp?) | 
|  | 614 | isapnp	     ISA PnP (Plug&Play) Info				(2.4) | 
|  | 615 | kcore       Kernel core image (can be ELF or A.OUT(deprecated in 2.4)) | 
|  | 616 | kmsg        Kernel messages | 
|  | 617 | ksyms       Kernel symbol table | 
|  | 618 | loadavg     Load average of last 1, 5 & 15 minutes | 
|  | 619 | locks       Kernel locks | 
|  | 620 | meminfo     Memory info | 
|  | 621 | misc        Miscellaneous | 
|  | 622 | modules     List of loaded modules | 
|  | 623 | mounts      Mounted filesystems | 
|  | 624 | net         Networking info (see text) | 
|  | 625 | pagetypeinfo Additional page allocator information (see text)  (2.5) | 
|  | 626 | partitions  Table of partitions known to the system | 
|  | 627 | pci	     Deprecated info of PCI bus (new way -> /proc/bus/pci/, | 
|  | 628 | decoupled by lspci)					(2.4) | 
|  | 629 | rtc         Real time clock | 
|  | 630 | scsi        SCSI info (see text) | 
|  | 631 | slabinfo    Slab pool info | 
|  | 632 | softirqs    softirq usage | 
|  | 633 | stat        Overall statistics | 
|  | 634 | swaps       Swap space utilization | 
|  | 635 | sys         See chapter 2 | 
|  | 636 | sysvipc     Info of SysVIPC Resources (msg, sem, shm)		(2.4) | 
|  | 637 | tty	     Info of tty drivers | 
|  | 638 | uptime      Wall clock since boot, combined idle time of all cpus | 
|  | 639 | version     Kernel version | 
|  | 640 | video	     bttv info of video resources			(2.4) | 
|  | 641 | vmallocinfo Show vmalloced areas | 
|  | 642 | .............................................................................. | 
|  | 643 |  | 
|  | 644 | You can,  for  example,  check  which interrupts are currently in use and what | 
|  | 645 | they are used for by looking in the file /proc/interrupts: | 
|  | 646 |  | 
|  | 647 | > cat /proc/interrupts | 
|  | 648 | CPU0 | 
|  | 649 | 0:    8728810          XT-PIC  timer | 
|  | 650 | 1:        895          XT-PIC  keyboard | 
|  | 651 | 2:          0          XT-PIC  cascade | 
|  | 652 | 3:     531695          XT-PIC  aha152x | 
|  | 653 | 4:    2014133          XT-PIC  serial | 
|  | 654 | 5:      44401          XT-PIC  pcnet_cs | 
|  | 655 | 8:          2          XT-PIC  rtc | 
|  | 656 | 11:          8          XT-PIC  i82365 | 
|  | 657 | 12:     182918          XT-PIC  PS/2 Mouse | 
|  | 658 | 13:          1          XT-PIC  fpu | 
|  | 659 | 14:    1232265          XT-PIC  ide0 | 
|  | 660 | 15:          7          XT-PIC  ide1 | 
|  | 661 | NMI:          0 | 
|  | 662 |  | 
|  | 663 | In 2.4.* a couple of lines were added to this file, LOC & ERR (this time the | 
|  | 664 | output is from an SMP machine): | 
|  | 665 |  | 
|  | 666 | > cat /proc/interrupts | 
|  | 667 |  | 
|  | 668 | CPU0       CPU1 | 
|  | 669 | 0:    1243498    1214548    IO-APIC-edge  timer | 
|  | 670 | 1:       8949       8958    IO-APIC-edge  keyboard | 
|  | 671 | 2:          0          0          XT-PIC  cascade | 
|  | 672 | 5:      11286      10161    IO-APIC-edge  soundblaster | 
|  | 673 | 8:          1          0    IO-APIC-edge  rtc | 
|  | 674 | 9:      27422      27407    IO-APIC-edge  3c503 | 
|  | 675 | 12:     113645     113873    IO-APIC-edge  PS/2 Mouse | 
|  | 676 | 13:          0          0          XT-PIC  fpu | 
|  | 677 | 14:      22491      24012    IO-APIC-edge  ide0 | 
|  | 678 | 15:       2183       2415    IO-APIC-edge  ide1 | 
|  | 679 | 17:      30564      30414   IO-APIC-level  eth0 | 
|  | 680 | 18:        177        164   IO-APIC-level  bttv | 
|  | 681 | NMI:    2457961    2457959 | 
|  | 682 | LOC:    2457882    2457881 | 
|  | 683 | ERR:       2155 | 
|  | 684 |  | 
|  | 685 | NMI is incremented in this case because every timer interrupt generates an NMI | 
|  | 686 | (Non Maskable Interrupt) which is used by the NMI Watchdog to detect lockups. | 
|  | 687 |  | 
|  | 688 | LOC is the local interrupt counter of the internal APIC of every CPU. | 
|  | 689 |  | 
|  | 690 | ERR is incremented in the case of errors in the IO-APIC bus (the bus that | 
|  | 691 | connects the CPUs in an SMP system). This means that an error has been detected; | 
|  | 692 | the IO-APIC automatically retries the transmission, so it should not be a big | 
|  | 693 | problem, but you should read the SMP-FAQ. | 
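
The layout described above (a header row naming the CPUs, then one row per
interrupt with a counter per CPU and an optional description) can be read
with a few lines of code. This is a minimal Python sketch under that
assumption; parse_interrupts is a hypothetical helper name, and the sample
rows are copied from the SMP output above.

```python
def parse_interrupts(text):
    """Return {label: (per_cpu_counts, description)} from /proc/interrupts text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())       # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        tokens = rest.split()
        counts = []
        # Leading numeric tokens are the counters, at most one per CPU.
        # Rows such as ERR carry a single counter and no description.
        while tokens and len(counts) < ncpus and tokens[0].isdigit():
            counts.append(int(tokens.pop(0)))
        table[label.strip()] = (counts, " ".join(tokens))
    return table


sample = """\
           CPU0       CPU1
  0:    1243498    1214548    IO-APIC-edge  timer
 17:      30564      30414   IO-APIC-level  eth0
NMI:    2457961    2457959
LOC:    2457882    2457881
ERR:       2155
"""
table = parse_interrupts(sample)
for label, (counts, description) in table.items():
    print(label, sum(counts), description)
```

On a live system the same function could be fed the contents of
/proc/interrupts; newer kernels print more rows and columns than shown here.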
|  | 694 |  | 
|  | 695 | In 2.6.2* /proc/interrupts was expanded again.  This time the goal was for | 
|  | 696 | /proc/interrupts to display every IRQ vector in use by the system, not | 
|  | 697 | just those considered 'most important'.  The new vectors are: | 
|  | 698 |  | 
|  | 699 | THR -- interrupt raised when a machine check threshold counter | 
|  | 700 | (typically counting ECC corrected errors of memory or cache) exceeds | 
|  | 701 | a configurable threshold.  Only available on some systems. | 
|  | 702 |  | 
|  | 703 | TRM -- a thermal event interrupt occurs when a temperature threshold | 
|  | 704 | has been exceeded for the CPU.  This interrupt may also be generated | 
|  | 705 | when the temperature drops back to normal. | 
|  | 706 |  | 
|  | 707 | SPU -- a spurious interrupt is some interrupt that was raised then lowered | 
|  | 708 | by some IO device before it could be fully processed by the APIC.  Hence | 
|  | 709 | the APIC sees the interrupt but does not know what device it came from. | 
|  | 710 | For this case the APIC will generate the interrupt with an IRQ vector | 
|  | 711 | of 0xff. This might also be generated by chipset bugs. | 
|  | 712 |  | 
|  | 713 | RES, CAL, TLB -- rescheduling, call and TLB flush interrupts are | 
|  | 714 | sent from one CPU to another per the needs of the OS.  Typically, | 
|  | 715 | their statistics are used by kernel developers and interested users to | 
|  | 716 | determine the occurrence of interrupts of the given type. | 
|  | 717 |  | 
|  | 718 | The above IRQ vectors are displayed only when relevant.  For example, | 
|  | 719 | the threshold vector does not exist on x86_64 platforms.  Others are | 
|  | 720 | suppressed when the system is a uniprocessor.  As of this writing, only | 
|  | 721 | i386 and x86_64 platforms support the new IRQ vector displays. | 
|  | 722 |  | 
|  | 723 | Of some interest is the introduction of the /proc/irq directory in 2.4. | 
|  | 724 | It can be used to set IRQ to CPU affinity. This means that you can "hook" an | 
|  | 725 | IRQ to only one CPU, or exclude a CPU from handling IRQs. The irq subdir | 
|  | 726 | contains one subdir for each IRQ, and two files: default_smp_affinity and | 
|  | 727 | prof_cpu_mask. | 
|  | 728 |  | 
|  | 729 | For example | 
|  | 730 | > ls /proc/irq/ | 
|  | 731 | 0  10  12  14  16  18  2  4  6  8  prof_cpu_mask | 
|  | 732 | 1  11  13  15  17  19  3  5  7  9  default_smp_affinity | 
|  | 733 | > ls /proc/irq/0/ | 
|  | 734 | smp_affinity | 
|  | 735 |  | 
|  | 736 | smp_affinity is a bitmask in which you can specify which CPUs can handle the | 
|  | 737 | IRQ. You can set it by doing: | 
|  | 738 |  | 
|  | 739 | > echo 1 > /proc/irq/10/smp_affinity | 
|  | 740 |  | 
|  | 741 | This means that only the first CPU will handle the IRQ, but you can also echo | 
|  | 742 | 5 which means that only the first and third CPU can handle the IRQ. | 
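
The mask arithmetic behind these values (bit n selects CPU n, so 1 is the
first CPU and 5, binary 101, is the first and third) can be sketched in
Python. The helper names cpus_to_mask and mask_to_cpus are illustrative
assumptions, not kernel interfaces.

```python
def cpus_to_mask(cpus):
    """Return the hexadecimal smp_affinity string for an iterable of CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu                # bit n stands for CPU n
    return format(mask, "x")


def mask_to_cpus(mask_text):
    """Return the sorted CPU numbers selected by a hexadecimal mask string."""
    # Assumption: wide masks may be printed as comma-separated groups of
    # hex digits, so commas are dropped before the conversion.
    mask = int(mask_text.strip().replace(",", ""), 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


print(cpus_to_mask([0]))          # first CPU only
print(cpus_to_mask([0, 2]))       # first and third CPU
print(mask_to_cpus("ffffffff"))   # CPUs 0 to 31
```

The string returned by cpus_to_mask is what would be echoed into
/proc/irq/<n>/smp_affinity.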
|  | 743 |  | 
|  | 744 | The contents of each smp_affinity file are the same by default: | 
|  | 745 |  | 
|  | 746 | > cat /proc/irq/0/smp_affinity | 
|  | 747 | ffffffff | 
|  | 748 |  | 
|  | 749 | There is an alternate interface, smp_affinity_list which allows specifying | 
|  | 750 | a cpu range instead of a bitmask: | 
|  | 751 |  | 
|  | 752 | > cat /proc/irq/0/smp_affinity_list | 
|  | 753 | 1024-1031 | 
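
A range such as the one above expands to individual CPU numbers as in the
following Python sketch. It assumes that the file may also hold several
comma-separated ranges or single CPUs (for example "0-2,5"), which the
example above does not show; parse_cpu_list and format_cpu_list are
hypothetical helper names.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as '1024-1031' or '0-2,5' into CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        first, sep, last = part.partition("-")
        if sep:
            cpus.extend(range(int(first), int(last) + 1))   # inclusive range
        else:
            cpus.append(int(first))
    return cpus


def format_cpu_list(cpus):
    """Collapse CPU numbers into range notation, e.g. [0, 1, 2, 5] -> '0-2,5'."""
    ranges = []
    start = prev = None
    for cpu in sorted(set(cpus)):
        if prev is not None and cpu == prev + 1:
            prev = cpu                  # extend the current run
            continue
        if start is not None:
            ranges.append((start, prev))
        start = prev = cpu              # begin a new run
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


print(parse_cpu_list("1024-1031"))
print(format_cpu_list([0, 1, 2, 5]))
```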
|  | 754 |  | 
|  | 755 | The default_smp_affinity mask applies to all non-active IRQs, which are the | 
|  | 756 | IRQs which have not yet been allocated/activated, and hence which lack a | 
|  | 757 | /proc/irq/[0-9]* directory. | 
|  | 758 |  | 
|  | 759 | The node file on an SMP system shows the node to which the device using the IRQ | 
|  | 760 | reports itself as being attached. This hardware locality information does not | 
|  | 761 | include information about any possible driver locality preference. | 
|  | 762 |  | 
|  | 763 | prof_cpu_mask specifies which CPUs are to be profiled by the system wide | 
|  | 764 | profiler. Default value is ffffffff (all cpus if there are only 32 of them). | 
|  | 765 |  | 
|  | 766 | The way IRQs are routed is handled by the IO-APIC, and it's Round Robin | 
|  | 767 | between all the CPUs which are allowed to handle it. As usual the kernel has | 
|  | 768 | more info than you and does a better job than you, so the defaults are the | 
|  | 769 | best choice for almost everyone.  [Note this applies only to those IO-APIC's | 
|  | 770 | that support "Round Robin" interrupt distribution.] | 
|  | 771 |  | 
|  | 772 | There are  three  more  important subdirectories in /proc: net, scsi, and sys. | 
|  | 773 | The general  rule  is  that  the  contents,  or  even  the  existence of these | 
|  | 774 | directories, depend  on your kernel configuration. If SCSI is not enabled, the | 
|  | 775 | directory scsi  may  not  exist. The same is true with the net, which is there | 
|  | 776 | only when networking support is present in the running kernel. | 
|  | 777 |  | 
|  | 778 | The slabinfo  file  gives  information  about  memory usage at the slab level. | 
|  | 779 | Linux uses  slab  pools for memory management above page level in version 2.2. | 
|  | 780 | Commonly used  objects  have  their  own  slab  pool (such as network buffers, | 
|  | 781 | directory cache, and so on). | 
|  | 782 |  | 
|  | 783 | .............................................................................. | 
|  | 784 |  | 
|  | 785 | > cat /proc/buddyinfo | 
|  | 786 |  | 
|  | 787 | Node 0, zone      DMA      0      4      5      4      4      3 ... | 
|  | 788 | Node 0, zone   Normal      1      0      0      1    101      8 ... | 
|  | 789 | Node 0, zone  HighMem      2      0      0      1      1      0 ... | 
|  | 790 |  | 
|  | 791 | External fragmentation is a problem under some workloads, and buddyinfo is a | 
|  | 792 | useful tool for helping diagnose these problems.  Buddyinfo will give you a | 
|  | 793 | clue as to how big an area you can safely allocate, or why a previous | 
|  | 794 | allocation failed. | 
|  | 795 |  | 
|  | 796 | Each column represents the number of pages of a certain order which are | 
|  | 797 | available.  In this case, there are 0 chunks of 2^0*PAGE_SIZE available in | 
|  | 798 | ZONE_DMA, 4 chunks of 2^1*PAGE_SIZE in ZONE_DMA, 101 chunks of 2^4*PAGE_SIZE | 
|  | 799 | available in ZONE_NORMAL, etc... | 
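
The arithmetic above (a chunk of order n spans 2^n pages) can be applied to a
whole line to estimate the free memory of a zone. This is a minimal Python
sketch; the helper names are illustrative and the 4 kB page size is an
assumption, since PAGE_SIZE depends on the architecture.

```python
PAGE_SIZE_KB = 4   # assumption: 4 kB pages; the real value is architecture dependent


def parse_buddyinfo_line(line):
    """Return (node, zone, counts), where counts[order] is the free chunk count."""
    fields = line.replace(",", " ").split()
    # fields: ["Node", "<n>", "zone", "<name>", count0, count1, ...]
    node, zone = int(fields[1]), fields[3]
    counts = [int(tok) for tok in fields[4:] if tok.isdigit()]
    return node, zone, counts


def free_pages(counts):
    """Total free pages: each chunk of order n contributes 2**n pages."""
    return sum(count << order for order, count in enumerate(counts))


def largest_free_order(counts):
    """Highest order with at least one free chunk, or None if nothing is free."""
    orders = [order for order, count in enumerate(counts) if count]
    return max(orders) if orders else None


# Only the six columns visible in the sample above are used; the elided
# columns ("...") are unknown and therefore not counted.
node, zone, counts = parse_buddyinfo_line(
    "Node 0, zone      DMA      0      4      5      4      4      3")
pages = free_pages(counts)
print(zone, pages, "pages,", pages * PAGE_SIZE_KB, "kB, largest order",
      largest_free_order(counts))
```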
|  | 800 |  | 
|  | 801 | More information relevant to external fragmentation can be found in | 
|  | 802 | pagetypeinfo. | 
|  | 803 |  | 
|  | 804 | > cat /proc/pagetypeinfo | 
|  | 805 | Page block order: 9 | 
|  | 806 | Pages per block:  512 | 
|  | 807 |  | 
|  | 808 | Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10 | 
|  | 809 | Node    0, zone      DMA, type    Unmovable      0      0      0      1      1      1      1      1      1      1      0 | 
|  | 810 | Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0 | 
|  | 811 | Node    0, zone      DMA, type      Movable      1      1      2      1      2      1      1      0      1      0      2 | 
|  | 812 | Node    0, zone      DMA, type      Reserve      0      0      0      0      0      0      0      0      0      1      0 | 
|  | 813 | Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0 | 
|  | 814 | Node    0, zone    DMA32, type    Unmovable    103     54     77      1      1      1     11      8      7      1      9 | 
|  | 815 | Node    0, zone    DMA32, type  Reclaimable      0      0      2      1      0      0      0      0      1      0      0 | 
|  | 816 | Node    0, zone    DMA32, type      Movable    169    152    113     91     77     54     39     13      6      1    452 | 
|  | 817 | Node    0, zone    DMA32, type      Reserve      1      2      2      2      2      0      1      1      1      1      0 | 
|  | 818 | Node    0, zone    DMA32, type      Isolate      0      0      0      0      0      0      0      0      0      0      0 | 
|  | 819 |  | 
|  | 820 | Number of blocks type     Unmovable  Reclaimable      Movable      Reserve      Isolate | 
|  | 821 | Node 0, zone      DMA            2            0            5            1            0 | 
|  | 822 | Node 0, zone    DMA32           41            6          967            2            0 | 
|  | 823 |  | 
|  | 824 | Fragmentation avoidance in the kernel works by grouping pages of the same | 
|  | 825 | migrate type into contiguous regions of memory called page blocks. | 
|  | 826 | A page block is typically the size of the default hugepage size e.g. 2MB on | 
|  | 827 | X86-64. By keeping pages grouped based on their ability to move, the kernel | 
|  | 828 | can reclaim pages within a page block to satisfy a high-order allocation. | 
|  | 829 |  | 
|  | 830 | The pagetypeinfo begins with information on the size of a page block. It | 
|  | 831 | then gives the same type of information as buddyinfo except broken down | 
|  | 832 | by migrate-type and finishes with details on how many page blocks of each | 
|  | 833 | type exist. | 
|  | 834 |  | 
|  | 835 | If min_free_kbytes has been tuned correctly (recommendations made by hugeadm | 
|  | 836 | from libhugetlbfs https://github.com/libhugetlbfs/libhugetlbfs/), one can | 
|  | 837 | make an estimate of the likely number of huge pages that can be allocated | 
|  | 838 | at a given point in time. All the "Movable" blocks should be allocatable | 
|  | 839 | unless memory has been mlock()'d. Some of the Reclaimable blocks should | 
|  | 840 | also be allocatable although a lot of filesystem metadata may have to be | 
|  | 841 | reclaimed to achieve this. | 
|  | 842 |  | 
|  | 843 | .............................................................................. | 
|  | 844 |  | 
|  | 845 | meminfo: | 
|  | 846 |  | 
|  | 847 | Provides information about distribution and utilization of memory.  This | 
|  | 848 | varies by architecture and compile options.  The following is from a | 
|  | 849 | 16GB PIII, which has highmem enabled.  You may not have all of these fields. | 
|  | 850 |  | 
|  | 851 | > cat /proc/meminfo | 
|  | 852 |  | 
|  | 853 | MemTotal:     16344972 kB | 
|  | 854 | MemFree:      13634064 kB | 
|  | 855 | MemAvailable: 14836172 kB | 
|  | 856 | Buffers:          3656 kB | 
|  | 857 | Cached:        1195708 kB | 
|  | 858 | SwapCached:          0 kB | 
|  | 859 | Active:         891636 kB | 
|  | 860 | Inactive:      1077224 kB | 
|  | 861 | HighTotal:    15597528 kB | 
|  | 862 | HighFree:     13629632 kB | 
|  | 863 | LowTotal:       747444 kB | 
|  | 864 | LowFree:          4432 kB | 
|  | 865 | SwapTotal:           0 kB | 
|  | 866 | SwapFree:            0 kB | 
|  | 867 | Dirty:             968 kB | 
|  | 868 | Writeback:           0 kB | 
|  | 869 | AnonPages:      861800 kB | 
|  | 870 | Mapped:         280372 kB | 
|  | 871 | Shmem:             644 kB | 
|  | 872 | KReclaimable:   168048 kB | 
|  | 873 | Slab:           284364 kB | 
|  | 874 | SReclaimable:   159856 kB | 
|  | 875 | SUnreclaim:     124508 kB | 
|  | 876 | PageTables:      24448 kB | 
|  | 877 | NFS_Unstable:        0 kB | 
|  | 878 | Bounce:              0 kB | 
|  | 879 | WritebackTmp:        0 kB | 
|  | 880 | CommitLimit:   7669796 kB | 
|  | 881 | Committed_AS:   100056 kB | 
|  | 882 | VmallocTotal:   112216 kB | 
|  | 883 | VmallocUsed:       428 kB | 
|  | 884 | VmallocChunk:   111088 kB | 
|  | 885 | Percpu:          62080 kB | 
|  | 886 | HardwareCorrupted:   0 kB | 
|  | 887 | AnonHugePages:   49152 kB | 
|  | 888 | ShmemHugePages:      0 kB | 
|  | 889 | ShmemPmdMapped:      0 kB | 
|  | 890 |  | 
|  | 891 |  | 
|  | 892 | MemTotal: Total usable ram (i.e. physical ram minus a few reserved | 
|  | 893 | bits and the kernel binary code) | 
|  | 894 | MemFree: The sum of LowFree+HighFree | 
|  | 895 | MemAvailable: An estimate of how much memory is available for starting new | 
|  | 896 | applications, without swapping. Calculated from MemFree, | 
|  | 897 | SReclaimable, the size of the file LRU lists, and the low | 
|  | 898 | watermarks in each zone. | 
|  | 899 | The estimate takes into account that the system needs some | 
|  | 900 | page cache to function well, and that not all reclaimable | 
|  | 901 | slab will be reclaimable, due to items being in use. The | 
|  | 902 | impact of those factors will vary from system to system. | 
|  | 903 | Buffers: Relatively temporary storage for raw disk blocks; | 
|  | 904 | shouldn't get tremendously large (20MB or so) | 
|  | 905 | Cached: in-memory cache for files read from the disk (the | 
|  | 906 | pagecache).  Doesn't include SwapCached | 
|  | 907 | SwapCached: Memory that once was swapped out, is swapped back in but | 
|  | 908 | still also is in the swapfile (if memory is needed it | 
|  | 909 | doesn't need to be swapped out AGAIN because it is already | 
|  | 910 | in the swapfile. This saves I/O) | 
|  | 911 | Active: Memory that has been used more recently and usually not | 
|  | 912 | reclaimed unless absolutely necessary. | 
|  | 913 | Inactive: Memory which has been less recently used.  It is more | 
|  | 914 | eligible to be reclaimed for other purposes | 
|  | 915 | HighTotal: | 
|  | 916 | HighFree: Highmem is all memory above ~860MB of physical memory. | 
|  | 917 | Highmem areas are for use by userspace programs, or | 
|  | 918 | for the pagecache.  The kernel must use tricks to access | 
|  | 919 | this memory, making it slower to access than lowmem. | 
|  | 920 | LowTotal: | 
|  | 921 | LowFree: Lowmem is memory which can be used for everything that | 
|  | 922 | highmem can be used for, but it is also available for the | 
|  | 923 | kernel's use for its own data structures.  Among many | 
|  | 924 | other things, it is where everything from the Slab is | 
|  | 925 | allocated.  Bad things happen when you're out of lowmem. | 
|  | 926 | SwapTotal: total amount of swap space available | 
|  | 927 | SwapFree: amount of swap space that is currently unused; the remainder holds | 
|  | 928 | memory which has been evicted from RAM, and is temporarily on the disk | 
|  | 929 | Dirty: Memory which is waiting to get written back to the disk | 
|  | 930 | Writeback: Memory which is actively being written back to the disk | 
|  | 931 | AnonPages: Non-file backed pages mapped into userspace page tables | 
|  | 932 | HardwareCorrupted: The amount of RAM/memory, in KB, that the kernel identifies | 
|  | 933 | as corrupted. | 
|  | 934 | AnonHugePages: Non-file backed huge pages mapped into userspace page tables | 
|  | 935 | Mapped: files which have been mmaped, such as libraries | 
|  | 936 | Shmem: Total memory used by shared memory (shmem) and tmpfs | 
|  | 937 | ShmemHugePages: Memory used by shared memory (shmem) and tmpfs allocated | 
|  | 938 | with huge pages | 
|  | 939 | ShmemPmdMapped: Shared memory mapped into userspace with huge pages | 
|  | 940 | KReclaimable: Kernel allocations that the kernel will attempt to reclaim | 
|  | 941 | under memory pressure. Includes SReclaimable (below), and other | 
|  | 942 | direct allocations with a shrinker. | 
|  | 943 | Slab: in-kernel data structures cache | 
|  | 944 | SReclaimable: Part of Slab, that might be reclaimed, such as caches | 
|  | 945 | SUnreclaim: Part of Slab, that cannot be reclaimed on memory pressure | 
|  | 946 | PageTables: amount of memory dedicated to the lowest level of page | 
|  | 947 | tables. | 
|  | 948 | NFS_Unstable: NFS pages sent to the server, but not yet committed to stable | 
|  | 949 | storage | 
|  | 950 | Bounce: Memory used for block device "bounce buffers" | 
|  | 951 | WritebackTmp: Memory used by FUSE for temporary writeback buffers | 
|  | 952 | CommitLimit: Based on the overcommit ratio ('vm.overcommit_ratio'), | 
|  | 953 | this is the total amount of  memory currently available to | 
|  | 954 | be allocated on the system. This limit is only adhered to | 
|  | 955 | if strict overcommit accounting is enabled (mode 2 in | 
|  | 956 | 'vm.overcommit_memory'). | 
|  | 957 | The CommitLimit is calculated with the following formula: | 
|  | 958 | CommitLimit = ([total RAM pages] - [total huge TLB pages]) * | 
|  | 959 | overcommit_ratio / 100 + [total swap pages] | 
|  | 960 | For example, on a system with 1G of physical RAM and 7G | 
|  | 961 | of swap with a `vm.overcommit_ratio` of 30 it would | 
|  | 962 | yield a CommitLimit of 7.3G. | 
|  | 963 | For more details, see the memory overcommit documentation | 
|  | 964 | in vm/overcommit-accounting. | 
|  | 965 | Committed_AS: The amount of memory presently allocated on the system. | 
|  | 966 | The committed memory is a sum of all of the memory which | 
|  | 967 | has been allocated by processes, even if it has not been | 
|  | 968 | "used" by them as of yet. A process which malloc()'s 1G | 
|  | 969 | of memory, but only touches 300M of it will show up as | 
|  | 970 | using 1G. This 1G is memory which has been "committed" to | 
|  | 971 | by the VM and can be used at any time by the allocating | 
|  | 972 | application. With strict overcommit enabled on the system | 
|  | 973 | (mode 2 in 'vm.overcommit_memory'), allocations which would | 
|  | 974 | exceed the CommitLimit (detailed above) will not be permitted. | 
|  | 975 | This is useful if one needs to guarantee that processes will | 
|  | 976 | not fail due to lack of memory once that memory has been | 
|  | 977 | successfully allocated. | 
|  | 978 | VmallocTotal: total size of vmalloc memory area | 
|  | 979 | VmallocUsed: amount of vmalloc area which is used | 
|  | 980 | VmallocChunk: largest contiguous block of vmalloc area which is free | 
|  | 981 | Percpu: Memory allocated to the percpu allocator used to back percpu | 
|  | 982 | allocations. This stat excludes the cost of metadata. | 
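
Some of the relationships stated above can be checked against the sample
output. The following Python sketch parses meminfo-style text, confirms that
MemFree equals LowFree+HighFree in the sample, and reproduces the CommitLimit
example. The function names are illustrative assumptions, and the formula is
applied to kB rather than pages, which only affects rounding.

```python
def parse_meminfo(text):
    """Return {field: value} with the values as printed (kB) in /proc/meminfo."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[name.strip()] = int(fields[0])
    return info


def commit_limit_kb(ram_kb, hugetlb_kb, swap_kb, overcommit_ratio):
    """CommitLimit = (RAM - huge TLB pages) * overcommit_ratio / 100 + swap."""
    return (ram_kb - hugetlb_kb) * overcommit_ratio // 100 + swap_kb


# Values copied from the sample output above.
sample = """\
MemFree:      13634064 kB
HighFree:     13629632 kB
LowFree:          4432 kB
Slab:           284364 kB
SReclaimable:   159856 kB
SUnreclaim:     124508 kB
"""
info = parse_meminfo(sample)
print(info["LowFree"] + info["HighFree"] == info["MemFree"])
print(info["SReclaimable"] + info["SUnreclaim"] == info["Slab"])

# The example from the CommitLimit description: 1G RAM, 7G swap, ratio 30.
GB = 1024 * 1024                        # kB per GB
limit = commit_limit_kb(1 * GB, 0, 7 * GB, 30)
print(round(limit / GB, 1), "G")
```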
|  | 983 |  | 
|  | 984 | .............................................................................. | 
|  | 985 |  | 
|  | 986 | vmallocinfo: | 
|  | 987 |  | 
|  | 988 | Provides information about vmalloced/vmapped areas. One line per area, | 
|  | 989 | containing the virtual address range of the area, size in bytes, | 
|  | 990 | caller information of the creator, and optional information depending | 
|  | 991 | on the kind of area: | 
|  | 992 |  | 
|  | 993 | pages=nr    number of pages | 
|  | 994 | phys=addr   if a physical address was specified | 
|  | 995 | ioremap     I/O mapping (ioremap() and friends) | 
|  | 996 | vmalloc     vmalloc() area | 
|  | 997 | vmap        vmap()ed pages | 
|  | 998 | user        VM_USERMAP area | 
|  | 999 | vpages      buffer for pages pointers was vmalloced (huge area) | 
|  | 1000 | N<node>=nr  (Only on NUMA kernels) | 
|  | 1001 | Number of pages allocated on memory node <node> | 
|  | 1002 |  | 
|  | 1003 | > cat /proc/vmallocinfo | 
|  | 1004 | 0xffffc20000000000-0xffffc20000201000 2101248 alloc_large_system_hash+0x204 ... | 
|  | 1005 | /0x2c0 pages=512 vmalloc N0=128 N1=128 N2=128 N3=128 | 
|  | 1006 | 0xffffc20000201000-0xffffc20000302000 1052672 alloc_large_system_hash+0x204 ... | 
|  | 1007 | /0x2c0 pages=256 vmalloc N0=64 N1=64 N2=64 N3=64 | 
|  | 1008 | 0xffffc20000302000-0xffffc20000304000    8192 acpi_tb_verify_table+0x21/0x4f... | 
|  | 1009 | phys=7fee8000 ioremap | 
|  | 1010 | 0xffffc20000304000-0xffffc20000307000   12288 acpi_tb_verify_table+0x21/0x4f... | 
|  | 1011 | phys=7fee7000 ioremap | 
|  | 1012 | 0xffffc2000031d000-0xffffc2000031f000    8192 init_vdso_vars+0x112/0x210 | 
|  | 1013 | 0xffffc2000031f000-0xffffc2000032b000   49152 cramfs_uncompress_init+0x2e ... | 
|  | 1014 | /0x80 pages=11 vmalloc N0=3 N1=3 N2=2 N3=3 | 
|  | 1015 | 0xffffc2000033a000-0xffffc2000033d000   12288 sys_swapon+0x640/0xac0      ... | 
|  | 1016 | pages=2 vmalloc N1=2 | 
|  | 1017 | 0xffffc20000347000-0xffffc2000034c000   20480 xt_alloc_table_info+0xfe ... | 
|  | 1018 | /0x130 [x_tables] pages=4 vmalloc N0=4 | 
|  | 1019 | 0xffffffffa0000000-0xffffffffa000f000   61440 sys_init_module+0xc27/0x1d00 ... | 
|  | 1020 | pages=14 vmalloc N2=14 | 
|  | 1021 | 0xffffffffa000f000-0xffffffffa0014000   20480 sys_init_module+0xc27/0x1d00 ... | 
|  | 1022 | pages=4 vmalloc N1=4 | 
|  | 1023 | 0xffffffffa0014000-0xffffffffa0017000   12288 sys_init_module+0xc27/0x1d00 ... | 
|  | 1024 | pages=2 vmalloc N1=2 | 
|  | 1025 | 0xffffffffa0017000-0xffffffffa0022000   45056 sys_init_module+0xc27/0x1d00 ... | 
|  | 1026 | pages=10 vmalloc N0=10 | 
|  | 1027 |  | 
|  | 1028 | .............................................................................. | 
|  | 1029 |  | 
|  | 1030 | softirqs: | 
|  | 1031 |  | 
|  | 1032 | Provides counts of softirq handlers serviced since boot time, for each cpu. | 
|  | 1033 |  | 
|  | 1034 | > cat /proc/softirqs | 
|  | 1035 | CPU0       CPU1       CPU2       CPU3 | 
|  | 1036 | HI:          0          0          0          0 | 
|  | 1037 | TIMER:      27166      27120      27097      27034 | 
|  | 1038 | NET_TX:          0          0          0         17 | 
|  | 1039 | NET_RX:         42          0          0         39 | 
|  | 1040 | BLOCK:          0          0        107       1121 | 
|  | 1041 | TASKLET:          0          0          0        290 | 
|  | 1042 | SCHED:      27035      26983      26971      26746 | 
|  | 1043 | HRTIMER:          0          0          0          0 | 
|  | 1044 | RCU:       1678       1769       2178       2250 | 
|  | 1045 |  | 
|  | 1046 |  | 
|  | 1047 | 1.3 IDE devices in /proc/ide | 
|  | 1048 | ---------------------------- | 
|  | 1049 |  | 
|  | 1050 | The subdirectory /proc/ide contains information about all IDE devices of which | 
|  | 1051 | the kernel  is  aware.  There is one subdirectory for each IDE controller, the | 
|  | 1052 | file drivers  and a link for each IDE device, pointing to the device directory | 
|  | 1053 | in the controller specific subtree. | 
|  | 1054 |  | 
|  | 1055 | The file  drivers  contains general information about the drivers used for the | 
|  | 1056 | IDE devices: | 
|  | 1057 |  | 
|  | 1058 | > cat /proc/ide/drivers | 
|  | 1059 | ide-cdrom version 4.53 | 
|  | 1060 | ide-disk version 1.08 | 
|  | 1061 |  | 
|  | 1062 | More detailed  information  can  be  found  in  the  controller  specific | 
|  | 1063 | subdirectories. These  are  named  ide0,  ide1  and  so  on.  Each  of  these | 
|  | 1064 | directories contains the files shown in table 1-6. | 
|  | 1065 |  | 
|  | 1066 |  | 
|  | 1067 | Table 1-6: IDE controller info in  /proc/ide/ide? | 
|  | 1068 | .............................................................................. | 
|  | 1069 | File    Content | 
|  | 1070 | channel IDE channel (0 or 1) | 
|  | 1071 | config  Configuration (only for PCI/IDE bridge) | 
|  | 1072 | mate    Mate name | 
|  | 1073 | model   Type/Chipset of IDE controller | 
|  | 1074 | .............................................................................. | 
|  | 1075 |  | 
|  | 1076 | Each device connected to a controller has a separate subdirectory in the | 
|  | 1077 | controller's directory. The files listed in table 1-7 are contained in these | 
|  | 1078 | directories. | 
|  | 1079 |  | 
|  | 1080 |  | 
|  | 1081 | Table 1-7: IDE device information | 
|  | 1082 | .............................................................................. | 
|  | 1083 | File             Content | 
|  | 1084 | cache            The cache | 
|  | 1085 | capacity         Capacity of the medium (in 512Byte blocks) | 
|  | 1086 | driver           driver and version | 
|  | 1087 | geometry         physical and logical geometry | 
|  | 1088 | identify         device identify block | 
|  | 1089 | media            media type | 
|  | 1090 | model            device identifier | 
|  | 1091 | settings         device setup | 
|  | 1092 | smart_thresholds IDE disk management thresholds | 
|  | 1093 | smart_values     IDE disk management values | 
|  | 1094 | .............................................................................. | 
|  | 1095 |  | 
|  | 1096 | The most  interesting  file is settings. This file contains a nice overview of | 
|  | 1097 | the drive parameters: | 
|  | 1098 |  | 
|  | 1099 | # cat /proc/ide/ide0/hda/settings | 
|  | 1100 | name                    value           min             max             mode | 
|  | 1101 | ----                    -----           ---             ---             ---- | 
|  | 1102 | bios_cyl                526             0               65535           rw | 
|  | 1103 | bios_head               255             0               255             rw | 
|  | 1104 | bios_sect               63              0               63              rw | 
|  | 1105 | breada_readahead        4               0               127             rw | 
|  | 1106 | bswap                   0               0               1               r | 
|  | 1107 | file_readahead          72              0               2097151         rw | 
|  | 1108 | io_32bit                0               0               3               rw | 
|  | 1109 | keepsettings            0               0               1               rw | 
|  | 1110 | max_kb_per_request      122             1               127             rw | 
|  | 1111 | multcount               0               0               8               rw | 
|  | 1112 | nice1                   1               0               1               rw | 
|  | 1113 | nowerr                  0               0               1               rw | 
|  | 1114 | pio_mode                write-only      0               255             w | 
|  | 1115 | slow                    0               0               1               rw | 
|  | 1116 | unmaskirq               0               0               1               rw | 
|  | 1117 | using_dma               0               0               1               rw | 
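
The settings file is whitespace-separated text, so it is easy to read from a
script. The following is a minimal sketch in Python and not part of the
kernel; the function name parse_ide_settings is hypothetical, and the sample
text is an excerpt of the output shown above.

```python
def parse_ide_settings(text):
    """Parse the text of an IDE 'settings' file into a dict keyed by name.

    Each entry is a dict with the keys 'value', 'min', 'max' and 'mode'.
    Numeric columns become int; anything else (for example the 'write-only'
    placeholder shown for pio_mode) is kept as a string.
    """
    def to_int(field):
        try:
            return int(field)
        except ValueError:
            return field

    settings = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and the dashed separator row.
        if len(fields) != 5 or fields[0] in ("name", "----"):
            continue
        name, value, low, high, mode = fields
        settings[name] = {
            "value": to_int(value),
            "min": to_int(low),
            "max": to_int(high),
            "mode": mode,
        }
    return settings


SAMPLE_SETTINGS = """\
name                    value           min             max             mode
----                    -----           ---             ---             ----
bios_cyl                526             0               65535           rw
pio_mode                write-only      0               255             w
using_dma               0               0               1               rw
"""

ide_settings = parse_ide_settings(SAMPLE_SETTINGS)
print(ide_settings["using_dma"]["value"])
```

On a system that still exposes /proc/ide, the same function could be fed the
contents of a file such as /proc/ide/ide0/hda/settings.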
|  | 1118 |  | 
|  | 1119 |  | 
|  | 1120 | 1.4 Networking info in /proc/net | 
|  | 1121 | -------------------------------- | 
|  | 1122 |  | 
|  | 1123 | The subdirectory  /proc/net  follows  the  usual  pattern. Table 1-8 shows the | 
|  | 1124 | additional values  you  get  for  IP  version 6 if you configure the kernel to | 
|  | 1125 | support this. Table 1-9 lists the files and their meaning. | 
|  | 1126 |  | 
|  | 1127 |  | 
|  | 1128 | Table 1-8: IPv6 info in /proc/net | 
|  | 1129 | .............................................................................. | 
|  | 1130 | File       Content | 
|  | 1131 | udp6       UDP sockets (IPv6) | 
|  | 1132 | tcp6       TCP sockets (IPv6) | 
|  | 1133 | raw6       Raw device statistics (IPv6) | 
|  | 1134 | igmp6      IP multicast addresses, which this host joined (IPv6) | 
|  | 1135 | if_inet6   List of IPv6 interface addresses | 
|  | 1136 | ipv6_route Kernel routing table for IPv6 | 
|  | 1137 | rt6_stats  Global IPv6 routing tables statistics | 
|  | 1138 | sockstat6  Socket statistics (IPv6) | 
|  | 1139 | snmp6      Snmp data (IPv6) | 
|  | 1140 | .............................................................................. | 
|  | 1141 |  | 
|  | 1142 |  | 
|  | 1143 | Table 1-9: Network info in /proc/net | 
|  | 1144 | .............................................................................. | 
|  | 1145 | File          Content | 
|  | 1146 | arp           Kernel  ARP table | 
|  | 1147 | dev           network devices with statistics | 
|  | 1148 | dev_mcast     the Layer2 multicast groups a device is listening to |
|  | 1149 |               (interface index, label, number of references, number of bound |
|  | 1150 |               addresses). |
|  | 1151 | dev_stat      network device status | 
|  | 1152 | ip_fwchains   Firewall chain linkage | 
|  | 1153 | ip_fwnames    Firewall chain names | 
|  | 1154 | ip_masq       Directory containing the masquerading tables | 
|  | 1155 | ip_masquerade Major masquerading table | 
|  | 1156 | netstat       Network statistics | 
|  | 1157 | raw           raw device statistics | 
|  | 1158 | route         Kernel routing table | 
|  | 1159 | rpc           Directory containing rpc info | 
|  | 1160 | rt_cache      Routing cache | 
|  | 1161 | snmp          SNMP data | 
|  | 1162 | sockstat      Socket statistics | 
|  | 1163 | tcp           TCP  sockets | 
|  | 1164 | udp           UDP sockets | 
|  | 1165 | unix          UNIX domain sockets | 
|  | 1166 | wireless      Wireless interface data (Wavelan etc) | 
|  | 1167 | igmp          IP multicast addresses, which this host joined | 
|  | 1168 | psched        Global packet scheduler parameters. | 
|  | 1169 | netlink       List of PF_NETLINK sockets | 
|  | 1170 | ip_mr_vifs    List of multicast virtual interfaces | 
|  | 1171 | ip_mr_cache   List of multicast routing cache | 
|  | 1172 | .............................................................................. | 
|  | 1173 |  | 
|  | 1174 | You can  use  this  information  to see which network devices are available in | 
|  | 1175 | your system and how much traffic was routed over those devices: | 
|  | 1176 |  | 
|  | 1177 | > cat /proc/net/dev | 
|  | 1178 | Inter-|Receive                                                   |[... | 
|  | 1179 | face |bytes    packets errs drop fifo frame compressed multicast|[... | 
|  | 1180 | lo:  908188   5596     0    0    0     0          0         0 [... | 
|  | 1181 | ppp0:15475140  20721   410    0    0   410          0         0 [... | 
|  | 1182 | eth0:  614530   7085     0    0    0     0          0         1 [... | 
|  | 1183 |  | 
|  | 1184 | ...] Transmit | 
|  | 1185 | ...] bytes    packets errs drop fifo colls carrier compressed | 
|  | 1186 | ...]  908188     5596    0    0    0     0       0          0 | 
|  | 1187 | ...] 1375103    17405    0    0    0     0       0          0 | 
|  | 1188 | ...] 1703981     5535    0    0    0     3       0          0 | 
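
As a rough illustration of how these counters can be consumed, the sketch
below splits each interface line into its eight receive and eight transmit
columns, using the column names from the header shown above. It is written in
Python; parse_net_dev is a hypothetical helper name, and the sample text joins
the two halves of the example output back into single lines.

```python
RX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "frame",
             "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "colls",
             "carrier", "compressed")


def parse_net_dev(text):
    """Return {interface: {'rx': {...}, 'tx': {...}}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines():
        # The two header lines contain '|' but no ':'; skip them.
        if ":" not in line:
            continue
        name, _, rest = line.partition(":")
        counters = [int(value) for value in rest.split()]
        if len(counters) != len(RX_FIELDS) + len(TX_FIELDS):
            continue  # unexpected layout, ignore rather than guess
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, counters[:8])),
            "tx": dict(zip(TX_FIELDS, counters[8:])),
        }
    return stats


SAMPLE_NET_DEV = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:  908188    5596    0    0    0     0          0         0   908188     5596    0    0    0     0       0          0
  ppp0:15475140   20721  410    0    0   410          0         0  1375103    17405    0    0    0     0       0          0
  eth0:  614530    7085    0    0    0     0          0         1  1703981     5535    0    0    0     3       0          0
"""

net_stats = parse_net_dev(SAMPLE_NET_DEV)
print(net_stats["eth0"]["tx"]["colls"])
```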
|  | 1189 |  | 
|  | 1190 | In addition, each Channel Bond interface has its own directory.  For | 
|  | 1191 | example, the bond0 device will have a directory called /proc/net/bond0/. | 
|  | 1192 | It will contain information that is specific to that bond, such as the | 
|  | 1193 | current slaves of the bond, the link status of the slaves, and how | 
|  | 1194 | many times each slave's link has failed. |
|  | 1195 |  | 
|  | 1196 | 1.5 SCSI info | 
|  | 1197 | ------------- | 
|  | 1198 |  | 
|  | 1199 | If you  have  a  SCSI  host adapter in your system, you'll find a subdirectory | 
|  | 1200 | named after  the driver for this adapter in /proc/scsi. You'll also see a list | 
|  | 1201 | of all recognized SCSI devices in /proc/scsi/scsi: |
|  | 1202 |  |
|  | 1203 | > cat /proc/scsi/scsi |
|  | 1204 | Attached devices: | 
|  | 1205 | Host: scsi0 Channel: 00 Id: 00 Lun: 00 | 
|  | 1206 | Vendor: IBM      Model: DGHS09U          Rev: 03E0 | 
|  | 1207 | Type:   Direct-Access                    ANSI SCSI revision: 03 | 
|  | 1208 | Host: scsi0 Channel: 00 Id: 06 Lun: 00 | 
|  | 1209 | Vendor: PIONEER  Model: CD-ROM DR-U06S   Rev: 1.04 | 
|  | 1210 | Type:   CD-ROM                           ANSI SCSI revision: 02 | 
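
Each attached device occupies three lines (Host, Vendor, Type) in this
listing. The sketch below shows one way to group them into records. It is a
Python illustration only; parse_scsi_devices and the regular expressions are
assumptions based on the layout of the example output, not on a documented
format guarantee.

```python
import re

HOST_RE = re.compile(
    r"Host:\s*(\S+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)")
VENDOR_RE = re.compile(r"Vendor:\s*(.*?)\s+Model:\s*(.*?)\s+Rev:\s*(.*)")
TYPE_RE = re.compile(r"Type:\s*(.*?)\s+ANSI\s+SCSI revision:\s*(\S+)")


def parse_scsi_devices(text):
    """Return one dict per device listed in /proc/scsi/scsi text."""
    devices = []
    for raw in text.splitlines():
        line = raw.strip()
        match = HOST_RE.match(line)
        if match:
            # A Host: line starts a new device record.
            devices.append({
                "host": match.group(1),
                "channel": int(match.group(2)),
                "id": int(match.group(3)),
                "lun": int(match.group(4)),
            })
            continue
        if not devices:
            continue  # e.g. the "Attached devices:" banner
        match = VENDOR_RE.match(line)
        if match:
            devices[-1].update(vendor=match.group(1),
                               model=match.group(2),
                               rev=match.group(3).strip())
            continue
        match = TYPE_RE.match(line)
        if match:
            devices[-1].update(type=match.group(1),
                               ansi_revision=match.group(2))
    return devices


SAMPLE_SCSI = """\
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM      Model: DGHS09U          Rev: 03E0
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: PIONEER  Model: CD-ROM DR-U06S   Rev: 1.04
  Type:   CD-ROM                           ANSI SCSI revision: 02
"""

scsi_devices = parse_scsi_devices(SAMPLE_SCSI)
print([device["model"] for device in scsi_devices])
```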
|  | 1211 |  | 
|  | 1212 |  | 
|  | 1213 | The directory  named  after  the driver has one file for each adapter found in | 
|  | 1214 | the system.  These  files  contain information about the controller, including | 
|  | 1215 | the IRQ  in use  and  the  I/O address range. The amount of information shown is |
|  | 1216 | dependent on  the adapter you use. The example shows the output for an Adaptec | 
|  | 1217 | AHA-2940 SCSI adapter: | 
|  | 1218 |  | 
|  | 1219 | > cat /proc/scsi/aic7xxx/0 | 
|  | 1220 |  | 
|  | 1221 | Adaptec AIC7xxx driver version: 5.1.19/3.2.4 | 
|  | 1222 | Compile Options: | 
|  | 1223 | TCQ Enabled By Default : Disabled | 
|  | 1224 | AIC7XXX_PROC_STATS     : Disabled | 
|  | 1225 | AIC7XXX_RESET_DELAY    : 5 | 
|  | 1226 | Adapter Configuration: | 
|  | 1227 | SCSI Adapter: Adaptec AHA-294X Ultra SCSI host adapter | 
|  | 1228 | Ultra Wide Controller | 
|  | 1229 | PCI MMAPed I/O Base: 0xeb001000 | 
|  | 1230 | Adapter SEEPROM Config: SEEPROM found and used. | 
|  | 1231 | Adaptec SCSI BIOS: Enabled | 
|  | 1232 | IRQ: 10 | 
|  | 1233 | SCBs: Active 0, Max Active 2, | 
|  | 1234 | Allocated 15, HW 16, Page 255 | 
|  | 1235 | Interrupts: 160328 | 
|  | 1236 | BIOS Control Word: 0x18b6 | 
|  | 1237 | Adapter Control Word: 0x005b | 
|  | 1238 | Extended Translation: Enabled | 
|  | 1239 | Disconnect Enable Flags: 0xffff | 
|  | 1240 | Ultra Enable Flags: 0x0001 | 
|  | 1241 | Tag Queue Enable Flags: 0x0000 | 
|  | 1242 | Ordered Queue Tag Flags: 0x0000 | 
|  | 1243 | Default Tag Queue Depth: 8 | 
|  | 1244 | Tagged Queue By Device array for aic7xxx host instance 0: | 
|  | 1245 | {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255} | 
|  | 1246 | Actual queue depth per device for aic7xxx host instance 0: | 
|  | 1247 | {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1} | 
|  | 1248 | Statistics: | 
|  | 1249 | (scsi0:0:0:0) | 
|  | 1250 | Device using Wide/Sync transfers at 40.0 MByte/sec, offset 8 | 
|  | 1251 | Transinfo settings: current(12/8/1/0), goal(12/8/1/0), user(12/15/1/0) | 
|  | 1252 | Total transfers 160151 (74577 reads and 85574 writes) | 
|  | 1253 | (scsi0:0:6:0) | 
|  | 1254 | Device using Narrow/Sync transfers at 5.0 MByte/sec, offset 15 | 
|  | 1255 | Transinfo settings: current(50/15/0/0), goal(50/15/0/0), user(50/15/0/0) | 
|  | 1256 | Total transfers 0 (0 reads and 0 writes) | 
|  | 1257 |  | 
|  | 1258 |  | 
|  | 1259 | 1.6 Parallel port info in /proc/parport | 
|  | 1260 | --------------------------------------- | 
|  | 1261 |  | 
|  | 1262 | The directory  /proc/parport  contains information about the parallel ports of | 
|  | 1263 | your system.  It  has  one  subdirectory  for  each port, named after the port | 
|  | 1264 | number (0,1,2,...). | 
|  | 1265 |  | 
|  | 1266 | These directories contain the four files shown in Table 1-10. | 
|  | 1267 |  | 
|  | 1268 |  | 
|  | 1269 | Table 1-10: Files in /proc/parport | 
|  | 1270 | .............................................................................. | 
|  | 1271 | File      Content | 
|  | 1272 | autoprobe Any IEEE-1284 device ID information that has been acquired. | 
|  | 1273 | devices   list of the device drivers using that port. A + will appear by the |
|  | 1274 |           name of the device currently using the port (it might not appear |
|  | 1275 |           against any). |
|  | 1276 | hardware  Parallel port's base address, IRQ line and DMA channel. |
|  | 1277 | irq       IRQ that parport is using for that port. This is in a separate |
|  | 1278 |           file to allow you to alter it by writing a new value in (IRQ |
|  | 1279 |           number or none). |
|  | 1280 | .............................................................................. | 
|  | 1281 |  | 
|  | 1282 | 1.7 TTY info in /proc/tty | 
|  | 1283 | ------------------------- | 
|  | 1284 |  | 
|  | 1285 | Information about  the  available  and actually used ttys can be found in the |
|  | 1286 | directory /proc/tty. You'll  find  entries  for drivers and line disciplines in |
|  | 1287 | this directory, as shown in Table 1-11. | 
|  | 1288 |  | 
|  | 1289 |  | 
|  | 1290 | Table 1-11: Files in /proc/tty | 
|  | 1291 | .............................................................................. | 
|  | 1292 | File          Content | 
|  | 1293 | drivers       list of drivers and their usage | 
|  | 1294 | ldiscs        registered line disciplines | 
|  | 1295 | driver/serial usage statistic and status of single tty lines | 
|  | 1296 | .............................................................................. | 
|  | 1297 |  | 
|  | 1298 | To see  which  ttys  are  currently in use, you can simply look into the file |
|  | 1299 | /proc/tty/drivers: | 
|  | 1300 |  | 
|  | 1301 | > cat /proc/tty/drivers | 
|  | 1302 | pty_slave            /dev/pts      136   0-255 pty:slave | 
|  | 1303 | pty_master           /dev/ptm      128   0-255 pty:master | 
|  | 1304 | pty_slave            /dev/ttyp       3   0-255 pty:slave | 
|  | 1305 | pty_master           /dev/pty        2   0-255 pty:master | 
|  | 1306 | serial               /dev/cua        5   64-67 serial:callout | 
|  | 1307 | serial               /dev/ttyS       4   64-67 serial | 
|  | 1308 | /dev/tty0            /dev/tty0       4       0 system:vtmaster | 
|  | 1309 | /dev/ptmx            /dev/ptmx       5       2 system | 
|  | 1310 | /dev/console         /dev/console    5       1 system:console | 
|  | 1311 | /dev/tty             /dev/tty        5       0 system:/dev/tty | 
|  | 1312 | unknown              /dev/tty        4    1-63 console | 
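
The listing has five whitespace-separated columns. Reading them as driver
name, device node prefix, major number, minor number (or minor range) and
driver type is an interpretation of the example above rather than something
this text spells out. Under that assumption, a small Python sketch
(parse_tty_drivers is a hypothetical name) could look like this:

```python
def parse_tty_drivers(text):
    """Return a list of dicts, one per line of /proc/tty/drivers text."""
    drivers = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or unexpected lines
        name, device, major, minors, kind = fields
        # The minor column is either a single number or a 'first-last' range.
        first, _, last = minors.partition("-")
        drivers.append({
            "name": name,
            "device": device,
            "major": int(major),
            "minor_first": int(first),
            "minor_last": int(last or first),
            "type": kind,
        })
    return drivers


SAMPLE_TTY_DRIVERS = """\
pty_slave            /dev/pts      136   0-255 pty:slave
serial               /dev/ttyS       4   64-67 serial
/dev/console         /dev/console    5       1 system:console
"""

tty_drivers = parse_tty_drivers(SAMPLE_TTY_DRIVERS)
print(tty_drivers[1]["minor_first"], tty_drivers[1]["minor_last"])
```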
|  | 1313 |  | 
|  | 1314 |  | 
|  | 1315 | 1.8 Miscellaneous kernel statistics in /proc/stat | 
|  | 1316 | ------------------------------------------------- | 
|  | 1317 |  | 
|  | 1318 | Various pieces   of  information about  kernel activity  are  available in the | 
|  | 1319 | /proc/stat file.  All  of  the numbers reported  in  this file are  aggregates | 
|  | 1320 | since the system first booted.  For a quick look, simply cat the file: | 
|  | 1321 |  | 
|  | 1322 | > cat /proc/stat | 
|  | 1323 | cpu  2255 34 2290 22625563 6290 127 456 0 0 0 | 
|  | 1324 | cpu0 1132 34 1441 11311718 3675 127 438 0 0 0 | 
|  | 1325 | cpu1 1123 0 849 11313845 2614 0 18 0 0 0 | 
|  | 1326 | intr 114930548 113199788 3 0 5 263 0 4 [... lots more numbers ...] | 
|  | 1327 | ctxt 1990473 | 
|  | 1328 | btime 1062191376 | 
|  | 1329 | processes 2915 | 
|  | 1330 | procs_running 1 | 
|  | 1331 | procs_blocked 0 | 
|  | 1332 | softirq 183433 0 21755 12 39 1137 231 21459 2263 | 
|  | 1333 |  | 
|  | 1334 | The very first  "cpu" line aggregates the  numbers in all  of the other "cpuN" | 
|  | 1335 | lines.  These numbers identify the amount of time the CPU has spent performing | 
|  | 1336 | different kinds of work.  Time units are in USER_HZ (typically hundredths of a | 
|  | 1337 | second).  The meanings of the columns are as follows, from left to right: | 
|  | 1338 |  | 
|  | 1339 | - user: normal processes executing in user mode | 
|  | 1340 | - nice: niced processes executing in user mode | 
|  | 1341 | - system: processes executing in kernel mode | 
|  | 1342 | - idle: twiddling thumbs | 
|  | 1343 | - iowait: In a word, iowait stands for waiting for I/O to complete. But there |
|  | 1344 |   are several problems: |
|  | 1345 |   1. A CPU does not wait for I/O to complete; iowait is the time that a task |
|  | 1346 |      is waiting for I/O to complete. When a CPU goes idle because of |
|  | 1347 |      outstanding task I/O, another task will be scheduled on this CPU. |
|  | 1348 |   2. On a multi-core CPU, the task waiting for I/O to complete is not running |
|  | 1349 |      on any CPU, so the iowait of each CPU is difficult to calculate. |
|  | 1350 |   3. The value of the iowait field in /proc/stat will decrease in certain |
|  | 1351 |      conditions. |
|  | 1352 |   So the iowait value read from /proc/stat is not reliable. |
|  | 1353 | - irq: servicing interrupts | 
|  | 1354 | - softirq: servicing softirqs | 
|  | 1355 | - steal: involuntary wait | 
|  | 1356 | - guest: running a normal guest | 
|  | 1357 | - guest_nice: running a niced guest | 
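
To make the column order concrete, the sketch below labels the numbers on
each "cpu" line with the names listed above and converts a tick count to
seconds. It is an illustrative Python snippet; parse_cpu_times and
ticks_to_seconds are hypothetical names, and the default of 100 ticks per
second is only the typical USER_HZ value mentioned above.

```python
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait",
              "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_times(stat_text):
    """Return {'cpu': {...}, 'cpu0': {...}, ...} from the text of /proc/stat."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("cpu"):
            # zip() stops at the shorter sequence, so a line with fewer
            # columns than CPU_FIELDS still parses.
            times[fields[0]] = dict(zip(CPU_FIELDS, map(int, fields[1:])))
    return times


def ticks_to_seconds(ticks, user_hz=100):
    """Convert a USER_HZ tick count to seconds (user_hz is an assumption)."""
    return ticks / user_hz


SAMPLE_STAT = """\
cpu  2255 34 2290 22625563 6290 127 456 0 0 0
cpu0 1132 34 1441 11311718 3675 127 438 0 0 0
cpu1 1123 0 849 11313845 2614 0 18 0 0 0
ctxt 1990473
"""

cpu_times = parse_cpu_times(SAMPLE_STAT)
print(ticks_to_seconds(cpu_times["cpu"]["idle"]))
```

On a live system the actual tick rate can be obtained in Python with
os.sysconf("SC_CLK_TCK") instead of relying on the default.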
|  | 1358 |  | 
|  | 1359 | The "intr" line gives counts of interrupts  serviced since boot time, for each | 
|  | 1360 | of the  possible system interrupts.   The first  column  is the  total of  all | 
|  | 1361 | interrupts serviced  including  unnumbered  architecture specific  interrupts; | 
|  | 1362 | each  subsequent column is the  total for that particular numbered interrupt. | 
|  | 1363 | Unnumbered interrupts are not shown, only summed into the total. | 
|  | 1364 |  | 
|  | 1365 | The "ctxt" line gives the total number of context switches across all CPUs. | 
|  | 1366 |  | 
|  | 1367 | The "btime" line gives  the time at which the  system booted, in seconds since | 
|  | 1368 | the Unix epoch. | 
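
Because btime is a plain epoch timestamp, it converts directly to a calendar
date. A minimal Python sketch follows, with boot_time as a hypothetical
helper name and the sample value taken from the example output above:

```python
from datetime import datetime, timezone


def boot_time(stat_text):
    """Return the boot time as an aware UTC datetime, or None if absent."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "btime":
            return datetime.fromtimestamp(int(fields[1]), tz=timezone.utc)
    return None


SAMPLE_BTIME = "ctxt 1990473\nbtime 1062191376\nprocesses 2915\n"

booted = boot_time(SAMPLE_BTIME)
print(booted.isoformat())
```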
|  | 1369 |  | 
|  | 1370 | The "processes" line gives the number  of processes and threads created, which | 
|  | 1371 | includes (but  is not limited  to) those  created by  calls to the  fork() and | 
|  | 1372 | clone() system calls. | 
|  | 1373 |  | 
|  | 1374 | The "procs_running" line gives the total number of threads that are | 
|  | 1375 | running or ready to run (i.e., the total number of runnable threads). | 
|  | 1376 |  | 
|  | 1377 | The   "procs_blocked" line gives  the  number of  processes currently blocked, | 
|  | 1378 | waiting for I/O to complete. | 
|  | 1379 |  | 
|  | 1380 | The "softirq" line gives counts of softirqs serviced since boot time, for each | 
|  | 1381 | of the possible system softirqs. The first column is the total of all | 
|  | 1382 | softirqs serviced; each subsequent column is the total for that particular | 
|  | 1383 | softirq. | 
|  | 1384 |  | 
|  | 1385 |  | 
|  | 1386 | 1.9 Ext4 file system parameters | 
|  | 1387 | ------------------------------- | 
|  | 1388 |  | 
|  | 1389 | Information about mounted ext4 file systems can be found in | 
|  | 1390 | /proc/fs/ext4.  Each mounted filesystem will have a directory in | 
|  | 1391 | /proc/fs/ext4 based on its device name (e.g., /proc/fs/ext4/hdc or |
|  | 1392 | /proc/fs/ext4/dm-0).   The files in each per-device directory are shown | 
|  | 1393 | in Table 1-12, below. | 
|  | 1394 |  | 
|  | 1395 | Table 1-12: Files in /proc/fs/ext4/<devname> | 
|  | 1396 | .............................................................................. | 
|  | 1397 | File            Content | 
|  | 1398 | mb_groups       details of multiblock allocator buddy cache of free blocks | 
|  | 1399 | .............................................................................. | 
|  | 1400 |  | 
|  | 1401 | 1.10 /proc/consoles |
|  | 1402 | ------------------- |
|  | 1403 | Shows registered system console lines. | 
|  | 1404 |  | 
|  | 1405 | To see which character device lines are currently used for the system console | 
|  | 1406 | /dev/console, you may simply look into the file /proc/consoles: | 
|  | 1407 |  | 
|  | 1408 | > cat /proc/consoles | 
|  | 1409 | tty0                 -WU (ECp)       4:7 | 
|  | 1410 | ttyS0                -W- (Ep)        4:64 | 
|  | 1411 |  | 
|  | 1412 | The columns are: | 
|  | 1413 |  | 
|  | 1414 | device               name of the device |
|  | 1415 | operations           R = can do read operations |
|  | 1416 |                      W = can do write operations |
|  | 1417 |                      U = can do unblank |
|  | 1418 | flags                E = it is enabled |
|  | 1419 |                      C = it is preferred console |
|  | 1420 |                      B = it is primary boot console |
|  | 1421 |                      p = it is used for printk buffer |
|  | 1422 |                      b = it is not a TTY but a Braille device |
|  | 1423 |                      a = it is safe to use when cpu is offline |
|  | 1424 | major:minor          major and minor number of the device separated by a colon |
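
The column description above can be turned into a small decoder. The Python
sketch below is an illustration only: parse_consoles, the regular expression
and the descriptive flag strings are assumptions modelled on the example
lines, and lines that do not match the expected shape are skipped.

```python
import re

CONSOLE_RE = re.compile(
    r"^(\S+)\s+([R-][W-][U-])\s+\(([A-Za-z ]*)\)\s+(\d+):(\d+)\s*$")

# Flag letters as listed in the column description.
FLAG_NAMES = {
    "E": "enabled",
    "C": "preferred console",
    "B": "primary boot console",
    "p": "used for printk buffer",
    "b": "Braille device",
    "a": "safe when cpu is offline",
}


def parse_consoles(text):
    """Return one dict per console line of /proc/consoles text."""
    consoles = []
    for line in text.splitlines():
        match = CONSOLE_RE.match(line.strip())
        if not match:
            continue
        device, ops, flags, major, minor = match.groups()
        consoles.append({
            "device": device,
            "can_read": "R" in ops,
            "can_write": "W" in ops,
            "can_unblank": "U" in ops,
            "flags": {FLAG_NAMES[c] for c in flags if c in FLAG_NAMES},
            "major": int(major),
            "minor": int(minor),
        })
    return consoles


SAMPLE_CONSOLES = """\
tty0                 -WU (ECp)       4:7
ttyS0                -W- (Ep)        4:64
"""

consoles = parse_consoles(SAMPLE_CONSOLES)
print(consoles[0]["device"], sorted(consoles[0]["flags"]))
```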
|  | 1425 |  | 
|  | 1426 | ------------------------------------------------------------------------------ | 
|  | 1427 | Summary | 
|  | 1428 | ------------------------------------------------------------------------------ | 
|  | 1429 | The /proc file system serves information about the running system. It not only | 
|  | 1430 | allows access to process data but also allows you to request the kernel status | 
|  | 1431 | by reading files in the hierarchy. | 
|  | 1432 |  | 
|  | 1433 | The directory  structure  of /proc reflects the types of information and makes | 
|  | 1434 | it easy, if not obvious, to see where to look for specific data. |
|  | 1435 | ------------------------------------------------------------------------------ | 
|  | 1436 |  | 
|  | 1437 | ------------------------------------------------------------------------------ | 
|  | 1438 | CHAPTER 2: MODIFYING SYSTEM PARAMETERS | 
|  | 1439 | ------------------------------------------------------------------------------ | 
|  | 1440 |  | 
|  | 1441 | ------------------------------------------------------------------------------ | 
|  | 1442 | In This Chapter | 
|  | 1443 | ------------------------------------------------------------------------------ | 
|  | 1444 | * Modifying kernel parameters by writing into files found in /proc/sys | 
|  | 1445 | * Exploring the files which modify certain parameters | 
|  | 1446 | * Review of the /proc/sys file tree | 
|  | 1447 | ------------------------------------------------------------------------------ | 
|  | 1448 |  | 
|  | 1449 |  | 
|  | 1450 | A very  interesting part of /proc is the directory /proc/sys. This is not only | 
|  | 1451 | a source  of  information,  it also allows you to change parameters within the | 
|  | 1452 | kernel. Be  very  careful  when attempting this. You can optimize your system, | 
|  | 1453 | but you  can  also  cause  it  to  crash.  Never  alter kernel parameters on a | 
|  | 1454 | production system.  Set  up  a  development machine and test to make sure that | 
|  | 1455 | everything works  the  way  you want it to. You may have no alternative but to | 
|  | 1456 | reboot the machine once an error has been made. | 
|  | 1457 |  | 
|  | 1458 | To change  a  value,  simply  echo  the new value into the file. An example is | 
|  | 1459 | given below  in the section on the file system data. You need to be root to do | 
|  | 1460 | this. You  can  create  your  own  boot script to perform this every time your | 
|  | 1461 | system boots. | 
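
The same read and write steps can be scripted. The sketch below is a Python
illustration under stated assumptions: the helper names sysctl_path,
read_param and write_param are hypothetical, the dotted-name convention is
borrowed from the sysctl utility, and the demonstration runs against a
scratch directory so that nothing on the live system is changed. Pointing
root at /proc/sys would require root privileges and the same care as any
other write to these files.

```python
import os
import tempfile


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted name such as 'kernel.hostname' to a path below root."""
    return os.path.join(root, *name.split("."))


def read_param(name, root="/proc/sys"):
    """Return the current value of a parameter as a stripped string."""
    with open(sysctl_path(name, root)) as handle:
        return handle.read().strip()


def write_param(name, value, root="/proc/sys"):
    """Write a new value, the scripted equivalent of 'echo value > file'."""
    with open(sysctl_path(name, root), "w") as handle:
        handle.write(f"{value}\n")


# Demonstration against a scratch tree instead of the real /proc/sys.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "vm"))
write_param("vm.swappiness", 10, root=demo_root)
print(read_param("vm.swappiness", root=demo_root))
```

Splitting on dots is a simplification: it does not handle entries whose own
names contain a dot.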
|  | 1462 |  | 
|  | 1463 | The files  in /proc/sys can be used to fine tune and monitor miscellaneous and | 
|  | 1464 | general things  in  the operation of the Linux kernel. Since some of the files | 
|  | 1465 | can inadvertently  disrupt  your  system,  it  is  advisable  to  read  both | 
|  | 1466 | documentation and  source  before actually making adjustments. In any case, be | 
|  | 1467 | very careful  when  writing  to  any  of these files. The entries in /proc may | 
|  | 1468 | change slightly between the 2.1.* and the 2.2 kernel, so if there is any doubt | 
|  | 1469 | review the kernel documentation in the directory /usr/src/linux/Documentation. | 
|  | 1470 | This chapter  is  heavily  based  on the documentation included in the pre 2.2 | 
|  | 1471 | kernels, and became part of it in version 2.2.1 of the Linux kernel. | 
|  | 1472 |  | 
|  | 1473 | Please see the Documentation/sysctl/ directory for descriptions of these |
|  | 1474 | entries. |
|  | 1475 |  | 
|  | 1476 | ------------------------------------------------------------------------------ | 
|  | 1477 | Summary | 
|  | 1478 | ------------------------------------------------------------------------------ | 
|  | 1479 | Certain aspects  of  kernel  behavior  can be modified at runtime, without the | 
|  | 1480 | need to  recompile  the kernel, or even to reboot the system. The files in the | 
|  | 1481 | /proc/sys tree  can  not only be read, but also modified. You can use the echo | 
|  | 1482 | command to write values into these files, thereby changing the default settings |
|  | 1483 | of the kernel. | 
|  | 1484 | ------------------------------------------------------------------------------ | 
|  | 1485 |  | 
|  | 1486 | ------------------------------------------------------------------------------ | 
|  | 1487 | CHAPTER 3: PER-PROCESS PARAMETERS | 
|  | 1488 | ------------------------------------------------------------------------------ | 
|  | 1489 |  | 
|  | 1490 | 3.1 /proc/<pid>/oom_adj & /proc/<pid>/oom_score_adj - Adjust the oom-killer score |
|  | 1491 | --------------------------------------------------------------------------------- |
|  | 1492 |  | 
|  | 1493 | These files can be used to adjust the badness heuristic used to select which |
|  | 1494 | process gets killed in out-of-memory conditions. |
|  | 1495 |  | 
|  | 1496 | The badness heuristic assigns a value to each candidate task ranging from 0 | 
|  | 1497 | (never kill) to 1000 (always kill) to determine which process is targeted.  The |
|  | 1498 | score is roughly the proportion of the task's allowed memory that it is using, |
|  | 1499 | scaled to that range and based on an estimate of its current memory and swap use. |
|  | 1500 | For example, if a task is using all allowed memory, its badness score will be | 
|  | 1501 | 1000.  If it is using half of its allowed memory, its score will be 500. | 
|  | 1502 |  | 
|  | 1503 | There is an additional factor included in the badness score: the current memory | 
|  | 1504 | and swap usage is discounted by 3% for root processes. | 
|  | 1505 |  | 
|  | 1506 | The amount of "allowed" memory depends on the context in which the oom killer | 
|  | 1507 | was called.  If it is due to the memory assigned to the allocating task's cpuset | 
|  | 1508 | being exhausted, the allowed memory represents the set of mems assigned to that | 
|  | 1509 | cpuset.  If it is due to a mempolicy's node(s) being exhausted, the allowed | 
|  | 1510 | memory represents the set of mempolicy nodes.  If it is due to a memory | 
|  | 1511 | limit (or swap limit) being reached, the allowed memory is that configured | 
|  | 1512 | limit.  Finally, if it is due to the entire system being out of memory, the | 
|  | 1513 | allowed memory represents all allocatable resources. | 
|  | 1514 |  | 
|  | 1515 | The value of /proc/<pid>/oom_score_adj is added to the badness score before it | 
|  | 1516 | is used to determine which task to kill.  Acceptable values range from -1000 | 
|  | 1517 | (OOM_SCORE_ADJ_MIN) to +1000 (OOM_SCORE_ADJ_MAX).  This allows userspace to | 
|  | 1518 | polarize the preference for oom killing either by always preferring a certain | 
|  | 1519 | task or completely disabling it.  The lowest possible value, -1000, is | 
|  | 1520 | equivalent to disabling oom killing entirely for that task since it will always | 
|  | 1521 | report a badness score of 0. | 
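
The arithmetic described so far can be written down as a toy model. The
Python sketch below is not the kernel's implementation; it only mirrors the
statements above (a score proportional to the share of allowed memory in
use, oom_score_adj added on top, and -1000 forcing a score of 0). The
function name badness, the integer rounding and the clamping to the 0..1000
range are assumptions, and the 3% discount for root processes is left out.

```python
OOM_SCORE_ADJ_MIN = -1000
OOM_SCORE_ADJ_MAX = 1000


def badness(used, allowed, oom_score_adj=0):
    """Toy model of the badness score described in the text.

    used and allowed are amounts of memory in the same unit.
    """
    if not OOM_SCORE_ADJ_MIN <= oom_score_adj <= OOM_SCORE_ADJ_MAX:
        raise ValueError("oom_score_adj must be within -1000..1000")
    if oom_score_adj == OOM_SCORE_ADJ_MIN:
        return 0  # oom killing is disabled for this task
    base = 1000 * used // allowed  # share of allowed memory, scaled to 1000
    # Clamping to 0..1000 is an assumption of this sketch.
    return max(0, min(1000, base + oom_score_adj))


print(badness(512, 1024))        # half of the allowed memory in use
print(badness(512, 1024, 250))   # same usage, adjusted upwards
```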
|  | 1522 |  | 
|  | 1523 | Consequently, it is very simple for userspace to define the amount of memory to | 
|  | 1524 | consider for each task.  Setting a /proc/<pid>/oom_score_adj value of +500, for | 
|  | 1525 | example, is roughly equivalent to allowing the remainder of tasks sharing the | 
|  | 1526 | same system, cpuset, mempolicy, or memory controller resources to use at least | 
|  | 1527 | 50% more memory.  A value of -500, on the other hand, would be roughly | 
|  | 1528 | equivalent to discounting 50% of the task's allowed memory from being considered | 
|  | 1529 | as scoring against the task. | 
|  | 1530 |  | 
|  | 1531 | For backwards compatibility with previous kernels, /proc/<pid>/oom_adj may also | 
|  | 1532 | be used to tune the badness score.  Its acceptable values range from -16 | 
|  | 1533 | (OOM_ADJUST_MIN) to +15 (OOM_ADJUST_MAX) and a special value of -17 | 
|  | 1534 | (OOM_DISABLE) to disable oom killing entirely for that task.  Its value is | 
|  | 1535 | scaled linearly with /proc/<pid>/oom_score_adj. | 
|  | 1536 |  | 
|  | 1537 | The value of /proc/<pid>/oom_score_adj may be reduced no lower than the last | 
|  | 1538 | value set by a CAP_SYS_RESOURCE process. To reduce the value any lower | 
|  | 1539 | requires CAP_SYS_RESOURCE. | 
|  | 1540 |  | 
|  | 1541 | Caveat: when a parent task is selected, the oom killer will sacrifice any first | 
|  | 1542 | generation children with separate address spaces instead, if possible.  This | 
|  | 1543 | keeps servers and important system daemons from being killed and loses the |
|  | 1544 | minimal amount of work. | 
|  | 1545 |  | 
|  | 1546 |  | 
|  | 1547 | 3.2 /proc/<pid>/oom_score - Display current oom-killer score | 
|  | 1548 | ------------------------------------------------------------- | 
|  | 1549 |  | 
|  | 1550 | This file can be used to check the current score used by the oom-killer for |
|  | 1551 | any given <pid>. Use it together with /proc/<pid>/oom_score_adj to tune which | 
|  | 1552 | process should be killed in an out-of-memory situation. | 
|  | 1553 |  | 
|  | 1554 |  | 
|  | 1555 | 3.3 /proc/<pid>/io - Display the IO accounting fields |
|  | 1556 | ----------------------------------------------------- |
|  | 1557 |  | 
|  | 1558 | This file contains IO statistics for each running process. |
|  | 1559 |  | 
|  | 1560 | Example | 
|  | 1561 | ------- | 
|  | 1562 |  | 
|  | 1563 | test:/tmp # dd if=/dev/zero of=/tmp/test.dat & | 
|  | 1564 | [1] 3828 | 
|  | 1565 |  | 
|  | 1566 | test:/tmp # cat /proc/3828/io | 
|  | 1567 | rchar: 323934931 | 
|  | 1568 | wchar: 323929600 | 
|  | 1569 | syscr: 632687 | 
|  | 1570 | syscw: 632675 | 
|  | 1571 | read_bytes: 0 | 
|  | 1572 | write_bytes: 323932160 | 
|  | 1573 | cancelled_write_bytes: 0 | 
|  | 1574 |  | 
|  | 1575 |  | 
|  | 1576 | Description | 
|  | 1577 | ----------- | 
|  | 1578 |  | 
|  | 1579 | rchar | 
|  | 1580 | ----- | 
|  | 1581 |  | 
|  | 1582 | I/O counter: chars read | 
|  | 1583 | The number of bytes which this task has caused to be read from storage. This | 
|  | 1584 | is simply the sum of bytes which this process passed to read() and pread(). | 
|  | 1585 | It includes things like tty IO and it is unaffected by whether or not actual | 
|  | 1586 | physical disk IO was required (the read might have been satisfied from | 
|  | 1587 | pagecache). |
|  | 1588 |  | 
|  | 1589 |  | 
|  | 1590 | wchar | 
|  | 1591 | ----- | 
|  | 1592 |  | 
|  | 1593 | I/O counter: chars written | 
|  | 1594 | The number of bytes which this task has caused, or shall cause, to be written |
|  | 1595 | to disk. Similar caveats apply here as with rchar. | 
|  | 1596 |  | 
|  | 1597 |  | 
|  | 1598 | syscr | 
|  | 1599 | ----- | 
|  | 1600 |  | 
|  | 1601 | I/O counter: read syscalls | 
|  | 1602 | Attempt to count the number of read I/O operations, i.e. syscalls like read() | 
|  | 1603 | and pread(). | 
|  | 1604 |  | 
|  | 1605 |  | 
|  | 1606 | syscw | 
|  | 1607 | ----- | 
|  | 1608 |  | 
|  | 1609 | I/O counter: write syscalls | 
|  | 1610 | Attempt to count the number of write I/O operations, i.e. syscalls like | 
|  | 1611 | write() and pwrite(). | 
|  | 1612 |  | 
|  | 1613 |  | 
|  | 1614 | read_bytes | 
|  | 1615 | ---------- | 
|  | 1616 |  | 
|  | 1617 | I/O counter: bytes read | 
|  | 1618 | Attempt to count the number of bytes which this process really did cause to | 
|  | 1619 | be fetched from the storage layer. Done at the submit_bio() level, so it is | 
|  | 1620 | accurate for block-backed filesystems. The status of NFS and CIFS has not yet |
|  | 1621 | been documented here. |
|  | 1622 |  | 
|  | 1623 |  | 
|  | 1624 | write_bytes | 
|  | 1625 | ----------- | 
|  | 1626 |  | 
|  | 1627 | I/O counter: bytes written | 
|  | 1628 | Attempt to count the number of bytes which this process caused to be sent to | 
|  | 1629 | the storage layer. This is done at page-dirtying time. | 
|  | 1630 |  | 
|  | 1631 |  | 
|  | 1632 | cancelled_write_bytes | 
|  | 1633 | --------------------- | 
|  | 1634 |  | 
|  | 1635 | The big inaccuracy here is truncate. If a process writes 1MB to a file and | 
|  | 1636 | then deletes the file, it will in fact perform no writeout. But it will have | 
|  | 1637 | been accounted as having caused 1MB of write. | 
|  | 1638 | In other words: The number of bytes which this process caused to not happen, | 
|  | 1639 | by truncating pagecache. A task can cause "negative" IO too. If this task | 
|  | 1640 | truncates some dirty pagecache, some IO which another task has been accounted | 
|  | 1641 | for (in its write_bytes) will not be happening. We _could_ just subtract that | 
|  | 1642 | from the truncating task's write_bytes, but there is information loss in doing | 
|  | 1643 | that. | 
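
The seven fields described above are "name: value" pairs, so they can be
collected into a dictionary. The Python sketch below is an illustration; the
names parse_proc_io and net_write_bytes are hypothetical, and subtracting
cancelled_write_bytes from write_bytes is only a rough estimate of the
writeout attributable to a task, given the caveats just mentioned. The sample
is the example output shown earlier in this section.

```python
def parse_proc_io(text):
    """Return {field_name: int} from the text of /proc/<pid>/io."""
    counters = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator:
            counters[key.strip()] = int(value)
    return counters


def net_write_bytes(counters):
    """Rough estimate: bytes accounted as written minus cancelled writes."""
    return counters["write_bytes"] - counters["cancelled_write_bytes"]


SAMPLE_IO = """\
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0
"""

io_counters = parse_proc_io(SAMPLE_IO)
print(net_write_bytes(io_counters))
```

For the current process, the input text could be read from /proc/self/io.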
|  | 1644 |  | 
|  | 1645 |  | 
|  | 1646 | Note | 
|  | 1647 | ---- | 
|  | 1648 |  | 
|  | 1649 | In its current implementation, this is a bit racy on 32-bit machines: if | 
|  | 1650 | process A reads process B's /proc/pid/io while process B is updating one of | 
|  | 1651 | those 64-bit counters, process A could see an intermediate result. | 
|  | 1652 |  | 
|  | 1653 |  | 
|  | 1654 | More information about this can be found within the taskstats documentation in | 
|  | 1655 | Documentation/accounting. | 
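
As an illustration of how write_bytes and cancelled_write_bytes relate, the
following is a minimal Python sketch, not part of the kernel interface. The
helper names and the sample values are made up for this example; only the
field names come from the text above.

```python
def parse_proc_io(text):
    """Parse 'name: value' lines, as found in /proc/<pid>/io, into a dict."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            counters[name.strip()] = int(value)
    return counters


def net_write_bytes(counters):
    """Estimate of bytes still destined for storage: written minus cancelled."""
    return counters["write_bytes"] - counters["cancelled_write_bytes"]


# Made-up sample: a task dirtied 1 MB and then truncated that pagecache itself.
SAMPLE_IO = """\
syscr: 12
syscw: 4
read_bytes: 0
write_bytes: 1048576
cancelled_write_bytes: 1048576
"""

counters = parse_proc_io(SAMPLE_IO)
print(net_write_bytes(counters))
```

Keeping the two counters separate, rather than subtracting in the kernel,
lets a reader decide whether the cancelled amount matters for the question
being asked.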
|  | 1656 |  | 
|  | 1657 | 3.4 /proc/<pid>/coredump_filter - Core dump filtering settings | 
|  | 1658 | --------------------------------------------------------------- | 
|  | 1659 | When a process is dumped, all anonymous memory is written to a core file as | 
|  | 1660 | long as the size of the core file isn't limited. But sometimes we don't want | 
|  | 1661 | to dump some memory segments, for example, huge shared memory or DAX. | 
|  | 1662 | Conversely, sometimes we want to save file-backed memory segments into a core | 
|  | 1663 | file, not only the individual files. | 
|  | 1664 |  | 
|  | 1665 | /proc/<pid>/coredump_filter allows you to customize which memory segments | 
|  | 1666 | will be dumped when the <pid> process is dumped. coredump_filter is a bitmask | 
|  | 1667 | of memory types. If a bit of the bitmask is set, memory segments of the | 
|  | 1668 | corresponding memory type are dumped, otherwise they are not dumped. | 
|  | 1669 |  | 
|  | 1670 | The following 9 memory types are supported: | 
|  | 1671 | - (bit 0) anonymous private memory | 
|  | 1672 | - (bit 1) anonymous shared memory | 
|  | 1673 | - (bit 2) file-backed private memory | 
|  | 1674 | - (bit 3) file-backed shared memory | 
|  | 1675 | - (bit 4) ELF header pages in file-backed private memory areas (it is | 
|  | 1676 | effective only if bit 2 is cleared) | 
|  | 1677 | - (bit 5) hugetlb private memory | 
|  | 1678 | - (bit 6) hugetlb shared memory | 
|  | 1679 | - (bit 7) DAX private memory | 
|  | 1680 | - (bit 8) DAX shared memory | 
|  | 1681 |  | 
|  | 1682 | Note that MMIO pages such as frame buffer are never dumped and vDSO pages | 
|  | 1683 | are always dumped regardless of the bitmask status. | 
|  | 1684 |  | 
|  | 1685 | Note that bits 0-4 don't affect hugetlb or DAX memory. hugetlb memory is | 
|  | 1686 | only affected by bits 5-6, and DAX is only affected by bits 7-8. | 
|  | 1687 |  | 
|  | 1688 | The default value of coredump_filter is 0x33; this means all anonymous memory | 
|  | 1689 | segments, ELF header pages and hugetlb private memory are dumped. | 
|  | 1690 |  | 
|  | 1691 | If you don't want to dump all shared memory segments attached to pid 1234, | 
|  | 1692 | write 0x31 to the process's proc file. | 
|  | 1693 |  | 
|  | 1694 | $ echo 0x31 > /proc/1234/coredump_filter | 
|  | 1695 |  | 
|  | 1696 | When a new process is created, the process inherits the bitmask status from its | 
|  | 1697 | parent. It is useful to set up coredump_filter before the program runs. | 
|  | 1698 | For example: | 
|  | 1699 |  | 
|  | 1700 | $ echo 0x7 > /proc/self/coredump_filter | 
|  | 1701 | $ ./some_program | 
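
The bit assignments listed above can be decoded with a small helper. This is
a minimal Python sketch; the function names are hypothetical, and it assumes
the file content is a hexadecimal number (with or without a 0x prefix).

```python
# Bit position -> memory type, in the order listed in this section.
COREDUMP_FILTER_BITS = [
    "anonymous private memory",
    "anonymous shared memory",
    "file-backed private memory",
    "file-backed shared memory",
    "ELF header pages in file-backed private memory areas",
    "hugetlb private memory",
    "hugetlb shared memory",
    "DAX private memory",
    "DAX shared memory",
]


def parse_coredump_filter(text):
    """Convert the text of a coredump_filter file to an integer bitmask.

    Assumption: the value is hexadecimal; int(..., 16) accepts both
    '33' and '0x33'.
    """
    return int(text.strip(), 16)


def decode_coredump_filter(mask):
    """Return the names of the memory types whose bit is set in mask."""
    return [name for bit, name in enumerate(COREDUMP_FILTER_BITS)
            if mask & (1 << bit)]


for name in decode_coredump_filter(parse_coredump_filter("0x33")):
    print(name)
```

Decoding 0x33 this way lists the same four memory types that the text gives
for the default value.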
|  | 1702 |  | 
|  | 1703 | 3.5	/proc/<pid>/mountinfo - Information about mounts | 
|  | 1704 | -------------------------------------------------------- | 
|  | 1705 |  | 
|  | 1706 | This file contains lines of the form: | 
|  | 1707 |  | 
|  | 1708 | 36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue | 
|  | 1709 | (1)(2)(3)   (4)   (5)      (6)      (7)   (8) (9)   (10)         (11) | 
|  | 1710 |  | 
|  | 1711 | (1) mount ID:  unique identifier of the mount (may be reused after umount) | 
|  | 1712 | (2) parent ID:  ID of parent (or of self for the top of the mount tree) | 
|  | 1713 | (3) major:minor:  value of st_dev for files on filesystem | 
|  | 1714 | (4) root:  root of the mount within the filesystem | 
|  | 1715 | (5) mount point:  mount point relative to the process's root | 
|  | 1716 | (6) mount options:  per mount options | 
|  | 1717 | (7) optional fields:  zero or more fields of the form "tag[:value]" | 
|  | 1718 | (8) separator:  marks the end of the optional fields | 
|  | 1719 | (9) filesystem type:  name of filesystem of the form "type[.subtype]" | 
|  | 1720 | (10) mount source:  filesystem specific information or "none" | 
|  | 1721 | (11) super options:  per super block options | 
|  | 1722 |  | 
|  | 1723 | Parsers should ignore all unrecognised optional fields.  Currently the | 
|  | 1724 | possible optional fields are: | 
|  | 1725 |  | 
|  | 1726 | shared:X  mount is shared in peer group X | 
|  | 1727 | master:X  mount is slave to peer group X | 
|  | 1728 | propagate_from:X  mount is slave and receives propagation from peer group X (*) | 
|  | 1729 | unbindable  mount is unbindable | 
|  | 1730 |  | 
|  | 1731 | (*) X is the closest dominant peer group under the process's root.  If | 
|  | 1732 | X is the immediate master of the mount, or if there's no dominant peer | 
|  | 1733 | group under the same root, then only the "master:X" field is present | 
|  | 1734 | and not the "propagate_from:X" field. | 
|  | 1735 |  | 
|  | 1736 | For more information on mount propagation see: | 
|  | 1737 |  | 
|  | 1738 | Documentation/filesystems/sharedsubtree.txt | 
|  | 1739 |  | 
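
The field layout described above can be parsed as follows. This is a minimal
Python sketch with a hypothetical function name and dictionary keys. It
relies on the separator field to find the end of the optional fields, keeps
unrecognised optional fields without interpreting them, and does not unescape
any special characters in paths.

```python
def parse_mountinfo_line(line):
    """Split one mountinfo line into the eleven fields described above."""
    fields = line.split()
    # (8) separator: the first "-" after the six fixed leading fields.
    sep = fields.index("-", 6)
    optional = {}
    for tag in fields[6:sep]:
        # (7) optional fields have the form "tag[:value]".
        name, _, value = tag.partition(":")
        optional[name] = value or None
    fs_type, source, super_options = fields[sep + 1:sep + 4]
    return {
        "mount_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "major_minor": fields[2],
        "root": fields[3],
        "mount_point": fields[4],
        "mount_options": fields[5].split(","),
        "optional": optional,
        "fs_type": fs_type,
        "source": source,
        "super_options": super_options.split(","),
    }


SAMPLE = ("36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - "
          "ext3 /dev/root rw,errors=continue")
mount = parse_mountinfo_line(SAMPLE)
print(mount["mount_point"], mount["fs_type"], mount["optional"])
```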
|  | 1740 |  | 
|  | 1741 | 3.6	/proc/<pid>/comm  & /proc/<pid>/task/<tid>/comm | 
|  | 1742 | -------------------------------------------------------- | 
|  | 1743 | These files provide a method to access a task's comm value. They also allow | 
|  | 1744 | a task to set its own comm value or that of one of its thread siblings. The comm | 
|  | 1745 | value is limited in size compared to the cmdline value, so writing anything longer | 
|  | 1746 | than the kernel's TASK_COMM_LEN (currently 16 chars) will result in a truncated | 
|  | 1747 | comm value. | 
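
The truncation can be modelled in a few lines. This is only a sketch of the
expected result, not of the kernel code: the helper name is hypothetical, and
it assumes that TASK_COMM_LEN counts a terminating NUL byte, so that at most
15 bytes of the name remain visible.

```python
TASK_COMM_LEN = 16  # from the text; assumed to include the terminating NUL


def truncated_comm(name):
    """Model the comm value expected after writing `name` (assumption)."""
    raw = name.encode()[:TASK_COMM_LEN - 1]
    # Drop a multi-byte character that was cut in half, if any.
    return raw.decode(errors="ignore")


print(truncated_comm("a-very-long-thread-name"))
```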
|  | 1748 |  | 
|  | 1749 |  | 
|  | 1750 | 3.7	/proc/<pid>/task/<tid>/children - Information about task children | 
|  | 1751 | ------------------------------------------------------------------------- | 
|  | 1752 | This file provides a fast way to retrieve first level children pids | 
|  | 1753 | of a task pointed to by the <pid>/<tid> pair. The format is a space separated | 
|  | 1754 | stream of pids. | 
|  | 1755 |  | 
|  | 1756 | Note the "first level" here -- if a child has its own children they will | 
|  | 1757 | not be listed here; one needs to read /proc/<children-pid>/task/<tid>/children | 
|  | 1758 | to obtain the descendants. | 
|  | 1759 |  | 
|  | 1760 | Since this interface is intended to be fast and cheap it doesn't | 
|  | 1761 | guarantee to provide precise results and some children might be | 
|  | 1762 | skipped, especially if they've exited right after we printed their | 
|  | 1763 | pids, so one needs to either stop or freeze the processes being inspected | 
|  | 1764 | if precise results are needed. | 
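
Collecting all descendants therefore means repeating the read for every
child. The following minimal Python sketch shows one way to do that; the
function names are hypothetical, and the result is only a snapshot that is
subject to the imprecision described above.

```python
import os


def read_children(pid):
    """Return the first level children of every thread of process `pid`."""
    children = []
    task_dir = f"/proc/{pid}/task"
    try:
        tids = os.listdir(task_dir)
    except OSError:
        return children  # the process may already have exited
    for tid in tids:
        try:
            with open(f"{task_dir}/{tid}/children") as f:
                children.extend(int(p) for p in f.read().split())
        except OSError:
            pass  # thread exited, or the file is not available
    return children


def descendants(pid, reader=read_children):
    """Walk the tree below `pid`, one 'children' read per visited task."""
    found = []
    stack = [pid]
    while stack:
        for child in reader(stack.pop()):
            found.append(child)
            stack.append(child)
    return found


print(descendants(os.getpid()))
```

The `reader` parameter exists only so that the walk can be exercised against
a fake process tree without touching /proc.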
|  | 1765 |  | 
|  | 1766 |  | 
|  | 1767 | 3.8	/proc/<pid>/fdinfo/<fd> - Information about opened file | 
|  | 1768 | --------------------------------------------------------------- | 
|  | 1769 | This file provides information associated with an opened file. Regular | 
|  | 1770 | files have at least three fields -- 'pos', 'flags' and 'mnt_id'. The 'pos' | 
|  | 1771 | represents the current offset of the opened file in decimal form [see lseek(2) | 
|  | 1772 | for details], 'flags' denotes the octal O_xxx mask the file has been | 
|  | 1773 | created with [see open(2) for details] and 'mnt_id' represents the mount ID of | 
|  | 1774 | the file system containing the opened file [see 3.5 /proc/<pid>/mountinfo | 
|  | 1775 | for details]. | 
|  | 1776 |  | 
|  | 1777 | A typical output is | 
|  | 1778 |  | 
|  | 1779 | pos:	0 | 
|  | 1780 | flags:	0100002 | 
|  | 1781 | mnt_id:	19 | 
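
The three common fields can be read as follows. This is a minimal Python
sketch with hypothetical helper names. It decodes only the access mode from
'flags'; the numeric values used for O_RDONLY, O_WRONLY and O_RDWR are the
usual Linux ones and are stated here as an assumption.

```python
# Assumed Linux values of the access mode bits, see open(2).
ACCESS_MODES = {0: "O_RDONLY", 1: "O_WRONLY", 2: "O_RDWR"}


def parse_fdinfo(text):
    """Extract 'pos', 'flags' and 'mnt_id' from the text of an fdinfo file."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key in info:
            continue
        if key in ("pos", "mnt_id"):
            info[key] = int(value)      # decimal
        elif key == "flags":
            info[key] = int(value, 8)   # octal
    return info


def access_mode(flags):
    """Name of the access mode encoded in the two lowest bits of flags."""
    return ACCESS_MODES.get(flags & 0o3, "unknown")


SAMPLE = "pos:\t0\nflags:\t0100002\nmnt_id:\t19\n"
info = parse_fdinfo(SAMPLE)
print(info, access_mode(info["flags"]))
```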
|  | 1782 |  | 
|  | 1783 | All locks associated with a file descriptor are shown in its fdinfo too. | 
|  | 1784 |  | 
|  | 1785 | lock:       1: FLOCK  ADVISORY  WRITE 359 00:13:11691 0 EOF | 
|  | 1786 |  | 
|  | 1787 | Files such as eventfd, fsnotify, signalfd and epoll provide, in addition to the | 
|  | 1788 | regular pos/flags pair, information particular to the objects they represent. | 
|  | 1789 |  | 
|  | 1790 | Eventfd files | 
|  | 1791 | ~~~~~~~~~~~~~ | 
|  | 1792 | pos:	0 | 
|  | 1793 | flags:	04002 | 
|  | 1794 | mnt_id:	9 | 
|  | 1795 | eventfd-count:	5a | 
|  | 1796 |  | 
|  | 1797 | where 'eventfd-count' is the hex value of the counter. | 
|  | 1798 |  | 
|  | 1799 | Signalfd files | 
|  | 1800 | ~~~~~~~~~~~~~~ | 
|  | 1801 | pos:	0 | 
|  | 1802 | flags:	04002 | 
|  | 1803 | mnt_id:	9 | 
|  | 1804 | sigmask:	0000000000000200 | 
|  | 1805 |  | 
|  | 1806 | where 'sigmask' is the hex value of the signal mask associated | 
|  | 1807 | with the file. | 
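
Both hex fields are easy to convert. In this minimal Python sketch the
function names are hypothetical, and the mapping from mask bits to signal
numbers assumes the usual convention that bit n-1 stands for signal n.

```python
def eventfd_count(hex_value):
    """Convert the 'eventfd-count' field to an integer."""
    return int(hex_value, 16)


def signals_in_mask(hex_mask):
    """List the signal numbers whose bit is set in a 'sigmask' field.

    Assumption: bit n-1 of the mask corresponds to signal number n.
    """
    mask = int(hex_mask, 16)
    return [bit + 1 for bit in range(mask.bit_length())
            if (mask >> bit) & 1]


print(eventfd_count("5a"))
print(signals_in_mask("0000000000000200"))
```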
|  | 1808 |  | 
|  | 1809 | Epoll files | 
|  | 1810 | ~~~~~~~~~~~ | 
|  | 1811 | pos:	0 | 
|  | 1812 | flags:	02 | 
|  | 1813 | mnt_id:	9 | 
|  | 1814 | tfd:        5 events:       1d data: ffffffffffffffff pos:0 ino:61af sdev:7 | 
|  | 1815 |  | 
|  | 1816 | where 'tfd' is a target file descriptor number in decimal form, | 
|  | 1817 | 'events' is the mask of events being watched and 'data' is the data | 
|  | 1818 | associated with a target [see epoll(7) for more details]. | 
|  | 1819 |  | 
|  | 1820 | 'pos' is the current offset of the target file in decimal form | 
|  | 1821 | [see lseek(2)]; 'ino' and 'sdev' are the inode and device numbers | 
|  | 1822 | where the target file resides, both in hex format. | 
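
A 'tfd' line mixes decimal and hex values, which a parser has to keep apart.
The following minimal Python sketch uses a hypothetical function name and
treats 'tfd' and 'pos' as decimal and every other field as hex, following the
description above.

```python
import re

# Fields documented as decimal; all others are treated as hex.
DECIMAL_KEYS = {"tfd", "pos"}


def parse_epoll_target(line):
    """Parse one 'tfd: ...' line of an epoll fdinfo file into integers."""
    fields = {}
    for key, value in re.findall(r"(\S+?):\s*(\S+)", line):
        fields[key] = int(value, 10 if key in DECIMAL_KEYS else 16)
    return fields


SAMPLE = ("tfd:        5 events:       1d data: ffffffffffffffff "
          "pos:0 ino:61af sdev:7")
target = parse_epoll_target(SAMPLE)
print(target)
```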
|  | 1823 |  | 
|  | 1824 | Fsnotify files | 
|  | 1825 | ~~~~~~~~~~~~~~ | 
|  | 1826 | For inotify files the format is the following | 
|  | 1827 |  | 
|  | 1828 | pos:	0 | 
|  | 1829 | flags:	02000000 | 
|  | 1830 | inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d | 
|  | 1831 |  | 
|  | 1832 | where 'wd' is a watch descriptor in decimal form, i.e. a target file | 
|  | 1833 | descriptor number, 'ino' and 'sdev' are the inode and device where the | 
|  | 1834 | target file resides and 'mask' is the mask of events, all in hex | 
|  | 1835 | form [see inotify(7) for more details]. | 
|  | 1836 |  | 
|  | 1837 | If the kernel was built with exportfs support, the path to the target | 
|  | 1838 | file is encoded as a file handle.  The file handle is provided by three | 
|  | 1839 | fields 'fhandle-bytes', 'fhandle-type' and 'f_handle', all in hex | 
|  | 1840 | format. | 
|  | 1841 |  | 
|  | 1842 | If the kernel is built without exportfs support the file handle won't be | 
|  | 1843 | printed out. | 
|  | 1844 |  | 
|  | 1845 | If there is no inotify mark attached yet the 'inotify' line will be omitted. | 
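
Because the file handle fields are optional, a parser should not require
them. The following is a minimal Python sketch under that assumption; the
function name and the returned dictionary layout are hypothetical.

```python
import re


def parse_inotify_mark(line):
    """Parse one 'inotify ...' line; the file handle may be absent."""
    fields = dict(re.findall(r"(\S+?):(\S+)", line))
    mark = {
        "wd": int(fields["wd"]),                          # decimal
        "ino": int(fields["ino"], 16),
        "sdev": int(fields["sdev"], 16),
        "mask": int(fields["mask"], 16),
        "ignored_mask": int(fields["ignored_mask"], 16),
        "file_handle": None,
    }
    if "f_handle" in fields:
        # Only present if the kernel was built with exportfs support.
        mark["file_handle"] = {
            "bytes": int(fields["fhandle-bytes"], 16),
            "type": int(fields["fhandle-type"], 16),
            "handle": bytes.fromhex(fields["f_handle"]),
        }
    return mark


SAMPLE = ("inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 "
          "fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d")
mark = parse_inotify_mark(SAMPLE)
print(mark["wd"], hex(mark["ino"]), mark["file_handle"]["handle"].hex())
```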
|  | 1846 |  | 
|  | 1847 | For fanotify files the format is | 
|  | 1848 |  | 
|  | 1849 | pos:	0 | 
|  | 1850 | flags:	02 | 
|  | 1851 | mnt_id:	9 | 
|  | 1852 | fanotify flags:10 event-flags:0 | 
|  | 1853 | fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003 | 
|  | 1854 | fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4 | 
|  | 1855 |  | 
|  | 1856 | where fanotify 'flags' and 'event-flags' are the values used in the fanotify_init | 
|  | 1857 | call, 'mnt_id' is the mount point identifier, 'mflags' is the value of | 
|  | 1858 | the flags associated with the mark, which are tracked separately from the events | 
|  | 1859 | mask. 'ino' and 'sdev' are the target inode and device, 'mask' is the events | 
|  | 1860 | mask and 'ignored_mask' is the mask of events which are to be ignored. | 
|  | 1861 | All values are in hex format. Together, 'mflags', 'mask' and 'ignored_mask' | 
|  | 1862 | provide information about the flags and mask used in the fanotify_mark | 
|  | 1863 | call [see fsnotify manpage for details]. | 
|  | 1864 |  | 
|  | 1865 | While the first three lines are mandatory and always printed, the rest is | 
|  | 1866 | optional and may be omitted if no marks have been created yet. | 
|  | 1867 |  | 
|  | 1868 | Timerfd files | 
|  | 1869 | ~~~~~~~~~~~~~ | 
|  | 1870 |  | 
|  | 1871 | pos:	0 | 
|  | 1872 | flags:	02 | 
|  | 1873 | mnt_id:	9 | 
|  | 1874 | clockid: 0 | 
|  | 1875 | ticks: 0 | 
|  | 1876 | settime flags: 01 | 
|  | 1877 | it_value: (0, 49406829) | 
|  | 1878 | it_interval: (1, 0) | 
|  | 1879 |  | 
|  | 1880 | where 'clockid' is the clock type and 'ticks' is the number of timer expirations | 
|  | 1881 | that have occurred [see timerfd_create(2) for details]. 'settime flags' are | 
|  | 1882 | the flags, in octal form, that were used to set up the timer [see timerfd_settime(2) | 
|  | 1883 | for details]. 'it_value' is the remaining time until the timer expiration. | 
|  | 1884 | 'it_interval' is the interval for the timer. Note the timer might be set up | 
|  | 1885 | with the TIMER_ABSTIME option, which will be shown in 'settime flags', but 'it_value' | 
|  | 1886 | still exhibits the timer's remaining time. | 
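
The timer fields can be converted to plain numbers as sketched below. This is
a minimal Python example with hypothetical names. It assumes that the pairs
in 'it_value' and 'it_interval' are (seconds, nanoseconds) and that the
absolute-time flag has the value 01; both are assumptions, not statements
from the text above.

```python
import re

ABSTIME_FLAG = 0o1  # assumption: value of the absolute-time settime flag


def _to_ns(pair):
    """Convert a '(sec, nsec)' pair (assumed layout) to nanoseconds."""
    sec, nsec = (int(n) for n in re.findall(r"\d+", pair))
    return sec * 1_000_000_000 + nsec


def parse_timerfd(text):
    """Extract the timer specific fields from a timerfd fdinfo text."""
    raw = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            raw[key.strip()] = value.strip()
    return {
        "clockid": int(raw["clockid"]),
        "ticks": int(raw["ticks"]),
        "absolute": bool(int(raw["settime flags"], 8) & ABSTIME_FLAG),
        "remaining_ns": _to_ns(raw["it_value"]),
        "interval_ns": _to_ns(raw["it_interval"]),
    }


SAMPLE = (
    "pos:\t0\nflags:\t02\nmnt_id:\t9\n"
    "clockid: 0\nticks: 0\nsettime flags: 01\n"
    "it_value: (0, 49406829)\nit_interval: (1, 0)\n"
)
timer = parse_timerfd(SAMPLE)
print(timer)
```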
|  | 1887 |  | 
|  | 1888 | 3.9	/proc/<pid>/map_files - Information about memory mapped files | 
|  | 1889 | --------------------------------------------------------------------- | 
|  | 1890 | This directory contains symbolic links which represent memory mapped files | 
|  | 1891 | the process is maintaining.  Example output: | 
|  | 1892 |  | 
|  | 1893 | | lr-------- 1 root root 64 Jan 27 11:24 333c600000-333c620000 -> /usr/lib64/ld-2.18.so | 
|  | 1894 | | lr-------- 1 root root 64 Jan 27 11:24 333c81f000-333c820000 -> /usr/lib64/ld-2.18.so | 
|  | 1895 | | lr-------- 1 root root 64 Jan 27 11:24 333c820000-333c821000 -> /usr/lib64/ld-2.18.so | 
|  | 1896 | | ... | 
|  | 1897 | | lr-------- 1 root root 64 Jan 27 11:24 35d0421000-35d0422000 -> /usr/lib64/libselinux.so.1 | 
|  | 1898 | | lr-------- 1 root root 64 Jan 27 11:24 400000-41a000 -> /usr/bin/ls | 
|  | 1899 |  | 
|  | 1900 | The name of a link represents the virtual memory bounds of a mapping, i.e. | 
|  | 1901 | vm_area_struct::vm_start-vm_area_struct::vm_end. | 
|  | 1902 |  | 
|  | 1903 | The main purpose of map_files is to retrieve a set of memory mapped | 
|  | 1904 | files in a fast way instead of parsing /proc/<pid>/maps or | 
|  | 1905 | /proc/<pid>/smaps, both of which contain many more records.  At the same | 
|  | 1906 | time one can open(2) mappings from the listings of two processes and | 
|  | 1907 | compare their inode numbers to figure out which anonymous memory areas | 
|  | 1908 | are actually shared. | 
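
Since a link name encodes the bounds of the mapping, the bounds can be
recovered from the name alone. The following minimal Python sketch uses
hypothetical function names and assumes both bounds are hexadecimal, as in
the example listing above.

```python
def parse_map_files_name(name):
    """Split a map_files link name into (vm_start, vm_end) integers."""
    start, end = (int(part, 16) for part in name.split("-"))
    return start, end


def mapping_size(name):
    """Size in bytes of the mapping named by a map_files link."""
    start, end = parse_map_files_name(name)
    return end - start


for link in ("333c600000-333c620000", "400000-41a000"):
    print(link, mapping_size(link))
```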
|  | 1909 |  | 
|  | 1910 | 3.10	/proc/<pid>/timerslack_ns - Task timerslack value | 
|  | 1911 | --------------------------------------------------------- | 
|  | 1912 | This file provides the task's timerslack value in nanoseconds. | 
|  | 1913 | This value specifies the amount of time that normal timers may be deferred | 
|  | 1914 | in order to coalesce timers and avoid unnecessary wakeups. | 
|  | 1915 |  | 
|  | 1916 | This allows a task's interactivity vs power consumption trade off to be | 
|  | 1917 | adjusted. | 
|  | 1918 |  | 
|  | 1919 | Writing 0 to the file will set the task's timerslack to the default value. | 
|  | 1920 |  | 
|  | 1921 | Valid values range from 0 to ULLONG_MAX. | 
|  | 1922 |  | 
|  | 1923 | An application setting the value must have PTRACE_MODE_ATTACH_FSCREDS level | 
|  | 1924 | permissions on the task specified to change its timerslack_ns value. | 
|  | 1925 |  | 
|  | 1926 | 3.11	/proc/<pid>/patch_state - Livepatch patch operation state | 
|  | 1927 | ----------------------------------------------------------------- | 
|  | 1928 | When CONFIG_LIVEPATCH is enabled, this file displays the value of the | 
|  | 1929 | patch state for the task. | 
|  | 1930 |  | 
|  | 1931 | A value of '-1' indicates that no patch is in transition. | 
|  | 1932 |  | 
|  | 1933 | A value of '0' indicates that a patch is in transition and the task is | 
|  | 1934 | unpatched.  If the patch is being enabled, then the task hasn't been | 
|  | 1935 | patched yet.  If the patch is being disabled, then the task has already | 
|  | 1936 | been unpatched. | 
|  | 1937 |  | 
|  | 1938 | A value of '1' indicates that a patch is in transition and the task is | 
|  | 1939 | patched.  If the patch is being enabled, then the task has already been | 
|  | 1940 | patched.  If the patch is being disabled, then the task hasn't been | 
|  | 1941 | unpatched yet. | 
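
The three values and the two transition directions can be summarised in a
small lookup. This minimal Python sketch uses a hypothetical function name,
and its `enabling` parameter is something the caller has to know from
elsewhere; it is not read from the patch_state file.

```python
def describe_patch_state(value, enabling):
    """Explain a patch_state value for a patch being enabled or disabled."""
    if value == -1:
        return "no patch is in transition"
    if value == 0:
        if enabling:
            return "in transition, task unpatched: not patched yet"
        return "in transition, task unpatched: already unpatched"
    if value == 1:
        if enabling:
            return "in transition, task patched: already patched"
        return "in transition, task patched: not unpatched yet"
    raise ValueError(f"unexpected patch_state value: {value}")


for state in (-1, 0, 1):
    print(state, describe_patch_state(state, enabling=True))
```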
|  | 1942 |  | 
|  | 1943 |  | 
|  | 1944 | ------------------------------------------------------------------------------ | 
|  | 1945 | Configuring procfs | 
|  | 1946 | ------------------------------------------------------------------------------ | 
|  | 1947 |  | 
|  | 1948 | 4.1	Mount options | 
|  | 1949 | --------------------- | 
|  | 1950 |  | 
|  | 1951 | The following mount options are supported: | 
|  | 1952 |  | 
|  | 1953 | hidepid=	Set /proc/<pid>/ access mode. | 
|  | 1954 | gid=		Set the group authorized to learn processes information. | 
|  | 1955 |  | 
|  | 1956 | hidepid=0 means classic mode - everybody may access all /proc/<pid>/ directories | 
|  | 1957 | (default). | 
|  | 1958 |  | 
|  | 1959 | hidepid=1 means users may not access any /proc/<pid>/ directories but their | 
|  | 1960 | own.  Sensitive files like cmdline, sched*, status are now protected against | 
|  | 1961 | other users.  This makes it impossible to learn whether any user runs a | 
|  | 1962 | specific program (given the program doesn't reveal itself by its behaviour). | 
|  | 1963 | As an additional bonus, as /proc/<pid>/cmdline is inaccessible to other users, | 
|  | 1964 | poorly written programs passing sensitive information via program arguments are | 
|  | 1965 | now protected against local eavesdroppers. | 
|  | 1966 |  | 
|  | 1967 | hidepid=2 means hidepid=1 plus all /proc/<pid>/ will be fully invisible to other | 
|  | 1968 | users.  This doesn't hide whether a process with a specific | 
|  | 1969 | pid value exists (it can be learned by other means, e.g. by "kill -0 $PID"), | 
|  | 1970 | but it hides a process's uid and gid, which may otherwise be learned by stat()'ing | 
|  | 1971 | /proc/<pid>/.  It greatly complicates an intruder's task of gathering | 
|  | 1972 | information about running processes, whether some daemon runs with elevated | 
|  | 1973 | privileges, whether another user runs some sensitive program, whether other users | 
|  | 1974 | run any program at all, etc. | 
|  | 1975 |  | 
|  | 1976 | gid= defines a group authorized to learn process information otherwise | 
|  | 1977 | prohibited by hidepid=.  If you use some daemon like identd which needs to learn | 
|  | 1978 | information about processes, just add identd to this group. | 