Denis Vlasenko | ec0c920 | 2006-11-26 17:07:38 +0000 | 1 | 'pax headers' are a POSIX.1-2001 addition designed to fix |
| 2 | tar format limitations: the older tar format has fixed-width fields |
| 3 | for everything (filename, uid, file size, etc.), which can overflow. |
| 4 | |
| 5 | pax Header Block |
| 6 | |
| 7 | The pax header block shall be identical to the ustar header block |
| 8 | described in ustar Interchange Format, except that two additional |
| 9 | typeflag values are defined: |
| 10 | |
| 11 | x |
| 12 | Represents extended header records for the following file in |
| 13 | the archive (which shall have its own ustar header block). |
| 14 | |
| 15 | g |
| 16 | Represents global extended header records for the following |
| 17 | files in the archive. Each value shall affect all subsequent files |
| 18 | that do not override that value in their own extended header |
| 19 | record and until another global extended header record is reached |
| 20 | that provides another value for the same field. The typeflag g |
| 21 | global headers should not be used with interchange media that |
| 22 | could suffer partial data loss in transporting the archive. |
| 23 | |
| 24 | For both of these types, the size field shall be the size of the |
| 25 | extended header records in octets. The other fields in the header |
| 26 | block are not meaningful to this version of the pax utility. |
| 27 | However, if this archive is read by a pax utility conforming to |
| 28 | the ISO POSIX-2:1993 standard, the header block fields are used to |
| 29 | create a regular file that contains the extended header records as |
| 30 | data. Therefore, header block field values should be selected to |
| 31 | provide reasonable file access to this regular file. |
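
As a rough illustration of the two typeflag values, the sketch below classifies a 512-octet header block and reads its size field. The function name pax_header_kind is hypothetical; the offsets (size at 124, typeflag at 156) come from the standard ustar layout, which is not reproduced in this note.

```python
def pax_header_kind(block: bytes):
    """Return (kind, size) for a 512-octet ustar/pax header block.

    kind is "extended" for typeflag x, "global" for typeflag g,
    and None for every other typeflag.
    """
    # ustar layout: size is 12 octets at offset 124, typeflag is 1 octet at 156.
    size_field = block[124:136].rstrip(b"\0 ")
    size = int(size_field or b"0", 8)  # the size field is octal text
    kind = {b"x": "extended", b"g": "global"}.get(block[156:157])
    return kind, size
```

For an x or g block, the returned size is the number of octets of extended header records that follow the header block.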
| 32 | |
| 33 | A further difference from the ustar header block is that data |
| 34 | blocks for files of typeflag 1 (the digit one) (hard link) may be |
| 35 | included, which means that the size field may be greater than |
| 36 | zero. |
| 37 | |
| 38 | pax Extended Header |
| 39 | |
| 40 | An extended header shall consist of one or more records, each |
| 41 | constructed as follows: |
| 42 | |
| 43 | "%d %s=%s\n", <length>, <keyword>, <value> |
| 44 | |
| 45 | The <length> field shall be the decimal length of the extended |
| 46 | header record in octets, including the length string itself and the |
| 47 | trailing <newline>. |
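
Because <length> counts its own digits, a writer cannot simply prepend the length of the rest of the record. A minimal sketch of one way to compute it follows; make_pax_record is a hypothetical helper name, and values are assumed to be UTF-8 text.

```python
def make_pax_record(keyword: str, value: str) -> bytes:
    """Build one "%d %s=%s\\n" record whose length counts itself."""
    body = b" " + keyword.encode("utf-8") + b"=" + value.encode("utf-8") + b"\n"
    length = len(body)
    while True:
        # Total size if the length were printed with the current guess.
        total = len(str(length)) + len(body)
        if total == length:
            break
        length = total  # the digit count grew, so try again
    return str(length).encode("ascii") + body
```

The loop matters at digit boundaries: a body of 9 octets yields a record of 11 octets, not 10, because the length string itself needs two digits.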
| 48 | |
| 49 | [skip] |
| 50 | |
| 51 | atime |
| 52 | The file access time for the following file(s), equivalent to |
| 53 | the value of the st_atime member of the stat structure for a file, |
| 54 | as described by the stat() function. The access time shall be |
| 55 | restored if the process has the appropriate privilege required to |
| 56 | do so. The format of the <value> shall be as described in pax |
| 57 | Extended Header File Times. |
| 58 | |
| 59 | charset |
| 60 | The name of the character set used to encode the data in the |
| 61 | following file(s). |
| 62 | |
| 63 | The encoding is included in an extended header for information |
| 64 | only; when pax is used as described in IEEE Std 1003.1-2001, it |
| 65 | shall not translate the file data into any other encoding. The |
| 66 | BINARY entry indicates unencoded binary data. |
| 67 | |
| 68 | When used in write or copy mode, it is implementation-defined |
| 69 | whether pax includes a charset extended header record for a file. |
| 70 | |
| 71 | comment |
| 72 | A series of characters used as a comment. All characters in |
| 73 | the <value> field shall be ignored by pax. |
| 74 | |
| 75 | gid |
| 76 | The group ID of the group that owns the file, expressed as a |
| 77 | decimal number using digits from the ISO/IEC 646:1991 standard. |
| 78 | This record shall override the gid field in the following header |
| 79 | block(s). When used in write or copy mode, pax shall include a gid |
| 80 | extended header record for each file whose group ID is greater |
| 81 | than 2097151 (octal 7777777). |
| 82 | |
| 83 | gname |
| 84 | The group of the file(s), formatted as a group name in the |
| 85 | group database. This record shall override the gid and gname |
| 86 | fields in the following header block(s), and any gid extended |
| 87 | header record. When used in read, copy, or list mode, pax shall |
| 88 | translate the name from the UTF-8 encoding in the header record to |
| 89 | the character set appropriate for the group database on the |
| 90 | receiving system. If any of the UTF-8 characters cannot be |
| 91 | translated, and if the -o invalid=UTF-8 option is not specified, |
| 92 | the results are implementation-defined. When used in write or copy |
| 93 | mode, pax shall include a gname extended header record for each |
| 94 | file whose group name cannot be represented entirely with the |
| 95 | letters and digits of the portable character set. |
| 96 | |
| 97 | linkpath |
| 98 | The pathname of a link being created to another file, of any |
| 99 | type, previously archived. This record shall override the linkname |
| 100 | field in the following ustar header block(s). The following ustar |
| 101 | header block shall determine the type of link created. If typeflag |
| 102 | of the following header block is 1, it shall be a hard link. If |
| 103 | typeflag is 2, it shall be a symbolic link and the linkpath value |
| 104 | shall be the contents of the symbolic link. The pax utility shall |
| 105 | translate the name of the link (contents of the symbolic link) |
| 106 | from the UTF-8 encoding to the character set appropriate for the |
| 107 | local file system. When used in write or copy mode, pax shall |
| 108 | include a linkpath extended header record for each link whose |
| 109 | pathname cannot be represented entirely with the members of the |
| 110 | portable character set other than NUL. |
| 111 | |
| 112 | mtime |
| 113 | The file modification time of the following file(s), |
| 114 | equivalent to the value of the st_mtime member of the stat |
| 115 | structure for a file, as described in the stat() function. This |
| 116 | record shall override the mtime field in the following header |
| 117 | block(s). The modification time shall be restored if the process |
| 118 | has the appropriate privilege required to do so. The format of the |
| 119 | <value> shall be as described in pax Extended Header File Times. |
| 120 | |
| 121 | path |
| 122 | The pathname of the following file(s). This record shall |
| 123 | override the name and prefix fields in the following header |
| 124 | block(s). The pax utility shall translate the pathname of the file |
| 125 | from the UTF-8 encoding to the character set appropriate for the |
| 126 | local file system. |
| 127 | |
| 128 | When used in write or copy mode, pax shall include a path |
| 129 | extended header record for each file whose pathname cannot be |
| 130 | represented entirely with the members of the portable character |
| 131 | set other than NUL. |
| 132 | |
| 133 | realtime.any |
| 134 | The keywords prefixed by "realtime." are reserved for future |
| 135 | standardization. |
| 136 | |
| 137 | security.any |
| 138 | The keywords prefixed by "security." are reserved for future |
| 139 | standardization. |
| 140 | |
| 141 | size |
| 142 | The size of the file in octets, expressed as a decimal number |
| 143 | using digits from the ISO/IEC 646:1991 standard. This record shall |
| 144 | override the size field in the following header block(s). When |
| 145 | used in write or copy mode, pax shall include a size extended |
| 146 | header record for each file with a size value greater than |
| 147 | 8589934591 (octal 77777777777). |
| 148 | |
| 149 | uid |
| 150 | The user ID of the file owner, expressed as a decimal number |
| 151 | using digits from the ISO/IEC 646:1991 standard. This record shall |
| 152 | override the uid field in the following header block(s). When used |
| 153 | in write or copy mode, pax shall include a uid extended header |
| 154 | record for each file whose owner ID is greater than 2097151 (octal |
| 155 | 7777777). |
| 156 | |
| 157 | uname |
| 158 | The owner of the following file(s), formatted as a user name |
| 159 | in the user database. This record shall override the uid and uname |
| 160 | fields in the following header block(s), and any uid extended |
| 161 | header record. When used in read, copy, or list mode, pax shall |
| 162 | translate the name from the UTF-8 encoding in the header record to |
| 163 | the character set appropriate for the user database on the |
| 164 | receiving system. If any of the UTF-8 characters cannot be |
| 165 | translated, and if the -o invalid=UTF-8 option is not specified, |
| 166 | the results are implementation-defined. When used in write or copy |
| 167 | mode, pax shall include a uname extended header record for each |
| 168 | file whose user name cannot be represented entirely with the |
| 169 | letters and digits of the portable character set. |
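
The numeric thresholds quoted for gid, uid and size can be checked mechanically. The sketch below is only an illustration of those three rules; the function and constant names are hypothetical, and the name-based rules (gname, uname, path, linkpath) are deliberately left out.

```python
USTAR_MAX_ID = 0o7777777        # 2097151, largest uid/gid quoted above
USTAR_MAX_SIZE = 0o77777777777  # 8589934591, largest size quoted above


def required_numeric_records(uid: int, gid: int, size: int) -> dict[str, str]:
    """Return the uid/gid/size records a writer must emit for one file."""
    records = {}
    if uid > USTAR_MAX_ID:
        records["uid"] = str(uid)    # decimal, per the uid keyword
    if gid > USTAR_MAX_ID:
        records["gid"] = str(gid)    # decimal, per the gid keyword
    if size > USTAR_MAX_SIZE:
        records["size"] = str(size)  # decimal, per the size keyword
    return records
```

A file with uid 1000, gid 2097152 and size 8589934592 would need gid and size records but no uid record.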
| 170 | |
| 171 | If the <value> field is zero length, it shall delete any header |
| 172 | block field, previously entered extended header value, or global |
| 173 | extended header value of the same name. |
| 174 | |
| 175 | If a keyword in an extended header record (or in a -o |
| 176 | option-argument) overrides or deletes a corresponding field in the |
| 177 | ustar header block, pax shall ignore the contents of that header |
| 178 | block field. |
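
One possible reading of the override and delete rules, sketched with plain dictionaries: header block fields come first, global records are applied on top, then the per-file extended records, and a zero-length value removes whatever was there. The names apply_records and effective_fields are hypothetical, and a real reader would also have to keep the global state across files.

```python
def apply_records(fields: dict[str, str], records: list[tuple[str, str]]) -> dict[str, str]:
    """Apply (keyword, value) records in order on top of existing fields."""
    result = dict(fields)
    for keyword, value in records:
        if value == "":
            result.pop(keyword, None)  # zero-length value deletes the entry
        else:
            result[keyword] = value    # otherwise the record overrides it
    return result


def effective_fields(header_fields, global_records, extended_records):
    """Header block fields, then global records, then per-file records."""
    return apply_records(apply_records(header_fields, global_records), extended_records)
```

With header fields path=short, uid=1000, gid=100, a global record uid=5000000, and per-file records path=a/very/long/name and an empty gid, the result keeps the long path and the global uid and has no gid at all.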
| 179 | |
| 180 | Unlike the ustar header block fields, NULs shall not delimit |
| 181 | <value>s; all characters within the <value> field shall be |
| 182 | considered data for the field. None of the length limitations of |
| 183 | the ustar header block fields in ustar Header Block shall apply to |
| 184 | the extended header records. |
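
Since NULs and even newlines may appear inside a <value>, a reader has to split records by their <length> prefix rather than by scanning for a delimiter. A minimal sketch, assuming the caller passes exactly the octets counted by the header's size field (no block padding); parse_pax_records is a hypothetical name.

```python
def parse_pax_records(data: bytes) -> list[tuple[str, bytes]]:
    """Split extended header data into (keyword, raw value) pairs."""
    records = []
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)      # end of the decimal <length>
        length = int(data[pos:space])
        record = data[pos:pos + length]
        if len(record) != length or not record.endswith(b"\n"):
            raise ValueError("malformed pax record")
        # Everything between the space and the final newline is keyword=value.
        keyword, sep, value = record[space - pos + 1:-1].partition(b"=")
        if not sep:
            raise ValueError("pax record without '='")
        records.append((keyword.decode("utf-8"), value))
        pos += length                      # jump by length, never by delimiter
    return records
```

Values are returned as raw bytes so that embedded NUL or newline octets survive untouched; decoding them from UTF-8 is left to the caller.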
| 185 | |
| 186 | pax Extended Header File Times |
| 187 | |
| 188 | Time records shall be formatted as a decimal representation of the |
| 189 | time in seconds since the Epoch. If a period ( '.' ) decimal point |
| 190 | character is present, the digits to the right of the point shall |
| 191 | represent the units of a subsecond timing granularity. In read or |
| 192 | copy mode, the pax utility shall truncate the time of a file to |
| 193 | the greatest value that is not greater than the input header |
| 194 | file time. In write or copy mode, the pax utility shall output a |
| 195 | time exactly if it can be represented exactly as a decimal number, |
| 196 | and otherwise shall generate only enough digits so that the same |
| 197 | time shall be recovered if the file is extracted on a system whose |
| 198 | underlying implementation supports the same time granularity. |
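
The truncation rule amounts to rounding down to the reader's time granularity. The sketch below assumes a reader that stores nanoseconds, which is an assumption and not something the text requires; parse_pax_time is a hypothetical name.

```python
import math
from fractions import Fraction


def parse_pax_time(value: str) -> tuple[int, int]:
    """Convert a pax time string into (seconds, nanoseconds).

    Digits beyond nanosecond precision are dropped by rounding down,
    so the result is never greater than the time in the header.
    """
    # Fraction parses decimal strings exactly, so no float rounding occurs.
    total_ns = math.floor(Fraction(value) * 1_000_000_000)
    return divmod(total_ns, 1_000_000_000)
```

Rounding down rather than toward zero keeps the "not greater than" guarantee for times before the Epoch as well: "-0.5" becomes second -1 plus 500000000 nanoseconds.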
| 199 | |
| 200 | Example from a Linux kernel archive tarball: |
| 201 | |
| 202 | 00000000 70 61 78 5f 67 6c 6f 62 61 6c 5f 68 65 61 64 65 |pax_global_heade| |
| 203 | 00000010 72 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |r...............| |
| 204 | 00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 205 | 00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 206 | 00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 207 | 00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 208 | 00000060 00 00 00 00 30 30 30 30 36 36 36 00 30 30 30 30 |....0000666.0000| |
| 209 | 00000070 30 30 30 00 30 30 30 30 30 30 30 00 30 30 30 30 |000.0000000.0000| |
| 210 | 00000080 30 30 30 30 30 36 34 00 30 30 30 30 30 30 30 30 |0000064.00000000| |
| 211 | 00000090 30 30 30 00 30 30 31 34 30 35 33 00 67 00 00 00 |000.0014053.g...| |
| 212 | 000000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 213 | 000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 214 | 000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 215 | 000000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 216 | 000000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 217 | 000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 218 | 00000100 00 75 73 74 61 72 00 30 30 67 69 74 00 00 00 00 |.ustar.00git....| |
| 219 | 00000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 220 | 00000120 00 00 00 00 00 00 00 00 00 67 69 74 00 00 00 00 |.........git....| |
| 221 | 00000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 222 | 00000140 00 00 00 00 00 00 00 00 00 30 30 30 30 30 30 30 |.........0000000| |
| 223 | 00000150 00 30 30 30 30 30 30 30 00 00 00 00 00 00 00 00 |.0000000........| |
| 224 | 00000160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 225 | 00000170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 226 | 00000180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 227 | 00000190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 228 | 000001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 229 | 000001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 230 | 000001c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 231 | 000001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 232 | 000001e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 233 | 000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 234 | 00000200 35 32 20 63 6f 6d 6d 65 6e 74 3d 62 31 30 35 30 |52 comment=b1050| |
| 235 | 00000210 32 62 32 32 61 31 32 30 39 64 36 62 34 37 36 33 |2b22a1209d6b4763| |
| 236 | 00000220 39 64 38 38 62 38 31 32 62 32 31 66 62 35 39 34 |9d88b812b21fb594| |
| 237 | 00000230 39 65 34 0a 00 00 00 00 00 00 00 00 00 00 00 00 |9e4.............| |
| 238 | 00000240 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| |
| 239 | ... |
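
The dump can be cross-checked by rebuilding the header block from the fields visible in it. The sketch below does that and compares the stored checksum and size with computed values. The field widths and the checksum rule (sum of all 512 octets with the checksum field counted as eight spaces) come from the ustar format, not from the text above, and the helper name field is hypothetical.

```python
def field(text: str, width: int) -> bytes:
    """NUL-pad an ASCII string to a fixed field width."""
    return text.encode("ascii").ljust(width, b"\0")


# Header block rebuilt from the values shown in the dump.
block = (
    field("pax_global_header", 100)  # name
    + field("0000666", 8)            # mode
    + field("0000000", 8)            # uid
    + field("0000000", 8)            # gid
    + field("00000000064", 12)       # size (octal)
    + field("00000000000", 12)       # mtime
    + field("0014053", 8)            # checksum (octal)
    + b"g"                           # typeflag: global extended header
    + field("", 100)                 # linkname
    + field("ustar", 6)              # magic
    + b"00"                          # version
    + field("git", 32)               # uname
    + field("git", 32)               # gname
    + field("0000000", 8)            # devmajor
    + field("0000000", 8)            # devminor
    + field("", 155)                 # prefix
    + field("", 12)                  # padding up to 512 octets
)

record = b"52 comment=b10502b22a1209d6b47639d88b812b21fb5949e4\n"

stored_checksum = int(block[148:156].rstrip(b"\0 "), 8)
# The checksum field itself is summed as if it held eight spaces.
computed_checksum = sum(block[:148]) + 8 * ord(" ") + sum(block[156:])
size = int(block[124:136].rstrip(b"\0 "), 8)
```

Here the size field, octal 64, equals 52, which matches both the <length> prefix and the actual length of the single comment record at offset 0x200.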