| b.liu | e958203 | 2025-04-17 19:18:16 +0800 | 1 | The QNX6 Filesystem | 
|  | 2 | =================== | 
|  | 3 |  | 
|  | 4 | The qnx6fs is used by newer QNX operating system versions (e.g. Neutrino). | 
|  | 5 | It was introduced in QNX 6.4.0 and has been the default since 6.4.1. | 
|  | 6 |  | 
|  | 7 | Option | 
|  | 8 | ====== | 
|  | 9 |  | 
|  | 10 | mmi_fs		Mount filesystem as used for example by Audi MMI 3G system | 
|  | 11 |  | 
|  | 12 | Specification | 
|  | 13 | ============= | 
|  | 14 |  | 
|  | 15 | qnx6fs shares many properties with traditional Unix filesystems. It has the | 
|  | 16 | concepts of blocks, inodes and directories. | 
|  | 17 | On QNX it is possible to create little endian and big endian qnx6 filesystems. | 
|  | 18 | This makes it possible to create and use a filesystem whose endianness differs | 
|  | 19 | from that of the host, for a target platform of a different endianness (QNX | 
|  | 20 | is used on quite a range of embedded systems). | 
|  | 21 | The Linux driver handles both byte orders (LE and BE) transparently. | 
|  | 22 |  | 
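
As a minimal illustration of what handling both byte orders involves, the
sketch below decodes the same four on-disk bytes as a 32-bit value in either
order. The name read_u32 and the sample bytes are made up for this example;
this is not how the C driver is structured.

```python
def read_u32(raw: bytes, big_endian: bool) -> int:
    """Decode a 4-byte on-disk field in the filesystem's byte order."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(raw, "big" if big_endian else "little")


# The same four bytes yield different values depending on the byte order,
# so every multi-byte field has to be read in the order the fs was created with.
sample = bytes([0x00, 0x00, 0x10, 0x00])
le_value = read_u32(sample, big_endian=False)  # 0x00100000
be_value = read_u32(sample, big_endian=True)   # 0x00001000
```
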
|  | 23 | Blocks | 
|  | 24 | ------ | 
|  | 25 |  | 
|  | 26 | The space in the device or file is split up into blocks. Blocks have a fixed | 
|  | 27 | size of 512, 1024, 2048 or 4096 bytes, which is decided when the filesystem | 
|  | 28 | is created. | 
|  | 29 | Block pointers are 32 bits wide, so the maximum space that can be addressed | 
|  | 30 | is 2^32 * 4096 bytes, or 16 TiB. | 
|  | 31 |  | 
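
The arithmetic above can be checked for every allowed blocksize with a few
lines of Python. The names BLOCK_SIZES, POINTER_BITS and
max_addressable_bytes are chosen for this sketch only.

```python
BLOCK_SIZES = (512, 1024, 2048, 4096)  # allowed blocksizes in bytes
POINTER_BITS = 32                      # width of a block pointer


def max_addressable_bytes(block_size: int) -> int:
    """Upper bound on addressable space for a given blocksize."""
    if block_size not in BLOCK_SIZES:
        raise ValueError(f"unsupported blocksize: {block_size}")
    return (1 << POINTER_BITS) * block_size


for size in BLOCK_SIZES:
    # Prints 2, 4, 8 and 16 TiB for 512, 1024, 2048 and 4096 byte blocks.
    print(size, max_addressable_bytes(size) // 2**40, "TiB")
```
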
|  | 32 | The superblocks | 
|  | 33 | --------------- | 
|  | 34 |  | 
|  | 35 | The superblock contains all global information about the filesystem. | 
|  | 36 | Each qnx6fs has two superblocks, each with a 64-bit serial number. | 
|  | 37 | The superblock with the higher serial number is the "active" one. | 
|  | 38 | In write mode, with each new snapshot (after each synchronous write), the | 
|  | 39 | serial of the new master superblock is increased (old superblock serial + 1). | 
|  | 40 |  | 
|  | 41 | So basically the snapshot functionality is realized by an atomic final | 
|  | 42 | update of the serial number. Before updating that serial, all modifications | 
|  | 43 | are done by copying all modified blocks during that specific write request | 
|  | 44 | (or period) and building up a new (stable) filesystem structure under the | 
|  | 45 | inactive superblock. | 
|  | 46 |  | 
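
The snapshot scheme described above can be modelled in a few lines. This is
a toy model only: TwoSlotStore, its dictionaries and the tie-breaking rule
for equal serials are assumptions made for illustration, and no real on-disk
layout is involved.

```python
class TwoSlotStore:
    """Toy model: two superblock slots, each with a serial and a tree root."""

    def __init__(self):
        self.slots = [{"serial": 1, "root": {}}, {"serial": 0, "root": {}}]

    def active_index(self) -> int:
        # The slot with the higher serial is active (ties go to slot 0 here).
        return 0 if self.slots[0]["serial"] >= self.slots[1]["serial"] else 1

    def commit(self, changes: dict) -> None:
        active = self.active_index()
        inactive = 1 - active
        # Build the new structure under the inactive superblock by copying
        # and then modifying; the active structure is never touched.
        new_root = dict(self.slots[active]["root"])
        new_root.update(changes)
        self.slots[inactive]["root"] = new_root
        # Final step: bumping the serial makes the new snapshot active.
        self.slots[inactive]["serial"] = self.slots[active]["serial"] + 1

    def read(self, key):
        return self.slots[self.active_index()]["root"].get(key)


store = TwoSlotStore()
store.commit({"a": 1})
store.commit({"b": 2})
```

If a commit is interrupted before the serial is bumped, the previously
active slot still describes a complete, stable structure.
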
|  | 47 | Each superblock holds a set of root inodes for the different filesystem | 
|  | 48 | parts (inode, bitmap and long filenames). | 
|  | 49 | Each of these root nodes holds information like total size of the stored | 
|  | 50 | data and the addressing levels in that specific tree. | 
|  | 51 | If the level value is 0, up to 16 direct blocks can be addressed by each | 
|  | 52 | node. | 
|  | 53 | Level 1 adds an additional indirect addressing level where each indirect | 
|  | 54 | addressing block holds up to blocksize / 4 pointers (of 4 bytes each) to data | 
|  | 55 | blocks. Level 2 adds another indirect addressing block level (so, with a | 
|  | 56 | blocksize of 1024, up to 16 * 256 * 256 = 1048576 blocks can be addressed). | 
|  | 57 |  | 
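
The capacity of such a tree follows directly from the level value and the
blocksize. The sketch below reproduces the figure quoted above; the function
name max_tree_blocks is an assumption for this example.

```python
DIRECT_POINTERS = 16  # pointers held by a root node or inode
POINTER_SIZE = 4      # bytes per block pointer


def max_tree_blocks(block_size: int, levels: int) -> int:
    """Number of data blocks addressable by a tree with the given level."""
    if levels < 0:
        raise ValueError("levels must be non-negative")
    pointers_per_block = block_size // POINTER_SIZE
    # Each indirect level multiplies the fan-out by pointers_per_block.
    return DIRECT_POINTERS * pointers_per_block ** levels


print(max_tree_blocks(1024, 0))  # 16
print(max_tree_blocks(1024, 1))  # 4096
print(max_tree_blocks(1024, 2))  # 1048576
```
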
|  | 58 | Unused block pointers are always set to ~0 - regardless of root node, | 
|  | 59 | indirect addressing blocks or inodes. | 
|  | 60 | Data leaves are always on the lowest level. So no data is stored on upper | 
|  | 61 | tree levels. | 
|  | 62 |  | 
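
A reader of a pointer array therefore has to skip entries equal to ~0, that
is 0xFFFFFFFF for a 32-bit pointer. A minimal sketch, with the helper name
used_pointers assumed for this example:

```python
import struct

UNUSED = 0xFFFFFFFF  # ~0 as an unsigned 32-bit value


def used_pointers(raw: bytes, big_endian: bool = False) -> list:
    """Decode an array of 32-bit block pointers and drop unused entries."""
    count = len(raw) // 4
    fmt = (">" if big_endian else "<") + f"{count}I"
    pointers = struct.unpack(fmt, raw[: count * 4])
    return [p for p in pointers if p != UNUSED]


# A root node with two used pointers and fourteen unused ones.
raw = struct.pack("<16I", 5, 9, *([UNUSED] * 14))
print(used_pointers(raw))  # [5, 9]
```
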
|  | 63 | The first superblock is located at 0x2000 (0x2000 is the bootblock size). | 
|  | 64 | On the Audi MMI 3G, the first superblock starts directly at byte 0. | 
|  | 65 | The second superblock position can either be calculated from the superblock | 
|  | 66 | information (total number of filesystem blocks) or by taking the highest | 
|  | 67 | device address, zeroing the last 3 bytes and then subtracting 0x1000 from | 
|  | 68 | that address. | 
|  | 69 |  | 
|  | 70 | 0x1000 is the size reserved for each superblock - regardless of the | 
|  | 71 | blocksize of the filesystem. | 
|  | 72 |  | 
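
The first method (using the total number of filesystem blocks) might look
like the sketch below. It assumes that the second superblock directly follows
the last filesystem block and that block 0 starts 0x1000 after the first
superblock; the function and parameter names are hypothetical.

```python
BOOTBLOCK_SIZE = 0x2000   # offset of the first superblock on a normal qnx6fs
SUPERBLOCK_AREA = 0x1000  # space reserved for each superblock


def second_superblock_offset(num_blocks: int, block_size: int,
                             first_sb_offset: int = BOOTBLOCK_SIZE) -> int:
    """Byte offset of the second superblock (assumed layout, see above)."""
    # first superblock area, then num_blocks data blocks, then superblock #2
    return first_sb_offset + SUPERBLOCK_AREA + num_blocks * block_size


print(hex(second_superblock_offset(1000, 4096)))  # 0x3eb000
```
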
|  | 73 | Inodes | 
|  | 74 | ------ | 
|  | 75 |  | 
|  | 76 | Each object in the filesystem is represented by an inode (index node). | 
|  | 77 | The inode structure contains pointers to the filesystem blocks which contain | 
|  | 78 | the data held in the object and all of the metadata about an object except | 
|  | 79 | its longname (filenames longer than 27 characters). | 
|  | 80 | The metadata about an object includes the permissions, owner, group, flags, | 
|  | 81 | size, number of blocks used, access time, change time and modification time. | 
|  | 82 |  | 
|  | 83 | The object mode field is in POSIX format (which makes things easier). | 
|  | 84 |  | 
|  | 85 | There are also pointers to the first 16 blocks, if the object data can be | 
|  | 86 | addressed with 16 direct blocks. | 
|  | 87 | For more than 16 blocks, indirect addressing in the form of another tree is | 
|  | 88 | used (the scheme is the same as the one used for the superblock root nodes). | 
|  | 89 |  | 
|  | 90 | The file size is stored as a 64-bit value. Inode counting starts with 1 (while | 
|  | 91 | long filename inodes start with 0). | 
|  | 92 |  | 
|  | 93 | Directories | 
|  | 94 | ----------- | 
|  | 95 |  | 
|  | 96 | A directory is a filesystem object and has an inode just like a file. | 
|  | 97 | It is a specially formatted file containing records which associate each | 
|  | 98 | name with an inode number. | 
|  | 99 | The '.' inode number points to the directory inode. | 
|  | 100 | The '..' inode number points to the parent directory inode. | 
|  | 101 | Each filename record additionally has a filename length field. | 
|  | 102 |  | 
|  | 103 | Long filenames or subdirectory names are a special case. | 
|  | 104 | Their directory record has the filename length field set to 0xff and | 
|  | 105 | additionally stores the long filename inode number. | 
|  | 106 | With that longfilename inode number, the longfilename tree can be walked | 
|  | 107 | starting with the superblock longfilename root node pointers. | 
|  | 108 |  | 
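
To make the two record kinds concrete, the sketch below decodes a directory
record. The exact byte layout is not given in this text; the one used here
(32-byte record, 4-byte inode number, 1-byte length, then either up to 27
name bytes or 3 padding bytes followed by the 4-byte long filename inode
number) is an assumption, as is the name parse_dir_record.

```python
import struct

RECORD_SIZE = 32          # assumed size of one directory record
SHORT_NAME_MAX = 27       # longest name stored inline
LONG_NAME_MARKER = 0xFF   # length value that marks a long filename record


def parse_dir_record(raw: bytes, big_endian: bool = False) -> dict:
    """Decode one directory record under the assumed layout."""
    if len(raw) != RECORD_SIZE:
        raise ValueError("directory record must be 32 bytes")
    order = ">" if big_endian else "<"
    inode, length = struct.unpack_from(order + "IB", raw, 0)
    if length == LONG_NAME_MARKER:
        # Assumed: 3 padding bytes, then the long filename inode number.
        (long_inode,) = struct.unpack_from(order + "I", raw, 8)
        return {"inode": inode, "long_inode": long_inode}
    if length > SHORT_NAME_MAX:
        raise ValueError(f"invalid name length: {length}")
    return {"inode": inode, "name": raw[5:5 + length]}


short = struct.pack("<IB27s", 12, 5, b"hello")
print(parse_dir_record(short))  # {'inode': 12, 'name': b'hello'}
```
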
|  | 109 | Special files | 
|  | 110 | ------------- | 
|  | 111 |  | 
|  | 112 | Symbolic links are also filesystem objects with inodes. They have a specific | 
|  | 113 | bit in the inode mode field identifying them as symbolic links. | 
|  | 114 | The directory entry file inode pointer points to the target file inode. | 
|  | 115 |  | 
|  | 116 | Hard links have an inode and a directory entry, but with a specific mode bit | 
|  | 117 | set, no block pointers and the directory file record pointing to the target | 
|  | 118 | file inode. | 
|  | 119 |  | 
|  | 120 | Character and block special devices do not exist in QNX as those files | 
|  | 121 | are handled by the QNX kernel/drivers and created in /dev independent of the | 
|  | 122 | underlying filesystem. | 
|  | 123 |  | 
|  | 124 | Long filenames | 
|  | 125 | -------------- | 
|  | 126 |  | 
|  | 127 | Long filenames are stored in a separate addressing tree. The starting point | 
|  | 128 | is the longfilename root node in the active superblock. | 
|  | 129 | Each data block (tree leaf) holds one long filename. That filename is | 
|  | 130 | limited to 510 bytes. The first two bytes are used as a length field | 
|  | 131 | for the actual filename. | 
|  | 132 | Because that structure has to fit into every allowed blocksize (the smallest | 
|  | 133 | being 512 bytes), the actual filename stored is limited to 510 bytes. | 
|  | 134 |  | 
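
Reading a long filename from its data block then amounts to decoding the
two-byte length and slicing the name. A minimal sketch, assuming the length
field uses the filesystem's byte order; parse_long_filename is a name chosen
for this example.

```python
LONG_NAME_MAX = 510  # 512-byte minimum blocksize minus the 2-byte length


def parse_long_filename(block: bytes, big_endian: bool = False) -> bytes:
    """Return the filename stored in a long filename data block."""
    if len(block) < 2:
        raise ValueError("block too short for a length field")
    length = int.from_bytes(block[:2], "big" if big_endian else "little")
    if length > LONG_NAME_MAX or length + 2 > len(block):
        raise ValueError(f"invalid long filename length: {length}")
    return block[2:2 + length]


name = b"a" * 300
block = len(name).to_bytes(2, "little") + name + bytes(512 - 2 - len(name))
assert parse_long_filename(block) == name
```
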
|  | 135 | Bitmap | 
|  | 136 | ------ | 
|  | 137 |  | 
|  | 138 | The qnx6fs filesystem allocation bitmap is stored in a tree under the bitmap | 
|  | 139 | root node in the superblock, and each bit in the bitmap represents one | 
|  | 140 | filesystem block. | 
|  | 141 | The first block is block 0, which starts 0x1000 after the superblock start. | 
|  | 142 | So for a normal qnx6fs 0x3000 (bootblock + superblock) is the physical | 
|  | 143 | address at which block 0 is located. | 
|  | 144 |  | 
|  | 145 | Bits at the end of the last bitmap block are set to 1 if the device is | 
|  | 146 | smaller than the addressing space in the bitmap. | 
|  | 147 |  | 
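
How many bitmap blocks are needed, and how many trailing bits end up set to
1, can be estimated as follows. The sketch assumes the bitmap always occupies
whole filesystem blocks; bitmap_layout is a hypothetical helper.

```python
def bitmap_layout(num_blocks: int, block_size: int) -> tuple:
    """Return (bitmap blocks needed, padding bits set to 1 at the end)."""
    bits_per_block = block_size * 8            # one bit per filesystem block
    bitmap_blocks = -(-num_blocks // bits_per_block)  # ceiling division
    padding_bits = bitmap_blocks * bits_per_block - num_blocks
    return bitmap_blocks, padding_bits


# 10000 filesystem blocks with a blocksize of 512 bytes:
print(bitmap_layout(10000, 512))  # (3, 2288)
```
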
|  | 148 | Bitmap system area | 
|  | 149 | ------------------ | 
|  | 150 |  | 
|  | 151 | The bitmap itself is divided into three parts. | 
|  | 152 | First comes the system area, which is split into two halves. | 
|  | 153 | Then comes userspace. | 
|  | 154 |  | 
|  | 155 | The requirement for a static, fixed preallocated system area comes from how | 
|  | 156 | qnx6fs deals with writes. | 
|  | 157 | Each superblock has its own half of the system area. So superblock #1 | 
|  | 158 | always uses blocks from the lower half while superblock #2 just writes to | 
|  | 159 | blocks represented by the upper half bitmap system area bits. | 
|  | 160 |  | 
|  | 161 | Bitmap blocks, Inode blocks and indirect addressing blocks for those two | 
|  | 162 | tree structures are treated as system blocks. | 
|  | 163 |  | 
|  | 164 | The rationale behind that is that a write request can work on a new snapshot | 
|  | 165 | (system area of the inactive, i.e. lower serial numbered, superblock) while | 
|  | 166 | at the same time there is still a complete stable filesystem structure in the | 
|  | 167 | other half of the system area. | 
|  | 168 |  | 
|  | 169 | When writing is finished (a sync write is completed, the maximum sync leap | 
|  | 170 | time is reached or a filesystem sync is requested), the serial of the previously | 
|  | 171 | inactive superblock is atomically increased and the fs switches over to that | 
|  | 172 | superblock, which is then declared stable. | 
|  | 173 |  | 
|  | 174 | For all data outside the system area, blocks are just copied while writing. |