LowLevel
This deals with low level disk operations and associated types.
You can retrieve information about the disk geometry using the geometry subcommand:
a2kit geometry -d mydisk.imd
This will produce a JSON string with two keys, package and tracks. The package value is a string with the nominal disk diameter, such as "5.25". The tracks value is a list of track objects, each containing cylinder, head, flux_code, nibble_code, and chs_map. The cylinder and head generally correspond to a geometric ordering, such as outer to inner. The chs_map is the soft-sector address and size of each sector on the track, or the "geometric address" if the former is not applicable.
For some image types, like IMD, the geometry subcommand should always succeed. For others, like WOZ, the disk geometry cannot always be solved, e.g., there could be a proprietary soft-sectoring scheme. In such cases a2kit will return an error.
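As a rough illustration of consuming the geometry output, the following Python sketch parses a hand-written sample and summarizes each track. The sample values are made up, the function name is hypothetical, and the assumption that chs_map is a list with one entry per sector is not confirmed by the description above.

```python
import json

# Hand-written sample in the shape described above. All values are
# illustrative placeholders, and the layout of each chs_map entry is an
# assumption (one list entry per sector).
SAMPLE_GEOMETRY = """
{
  "package": "5.25",
  "tracks": [
    {"cylinder": 0, "head": 0, "flux_code": "placeholder",
     "nibble_code": "placeholder",
     "chs_map": [[0, 0, 1, 128], [0, 0, 2, 128]]},
    {"cylinder": 1, "head": 0, "flux_code": "placeholder",
     "nibble_code": "placeholder",
     "chs_map": [[1, 0, 1, 128], [1, 0, 2, 128]]}
  ]
}
"""

def summarize_geometry(text):
    """Return the package string and a (cylinder, head, sector_count) list."""
    geometry = json.loads(text)
    summary = []
    for track in geometry["tracks"]:
        # Counting entries assumes chs_map holds one item per sector.
        summary.append((track["cylinder"], track["head"], len(track["chs_map"])))
    return geometry["package"], summary

package, tracks = summarize_geometry(SAMPLE_GEOMETRY)
print(package, tracks)
```

In a script, the text could come from capturing the standard output of the a2kit geometry command shown above, for example with Python's subprocess.run.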
Information about the file system on a disk image is retrieved using the stat subcommand:
a2kit stat -d mydisk.imd
This produces a JSON string with standard keys fs_name, label, users, block_size, block_beg, block_end, and free_blocks. There is also a raw key whose value is either null or a further object dependent on the file system. For CP/M this will be the disk parameter block (DPB). For FAT file systems this will be the BIOS parameter block (BPB). The stat subcommand can return an error when the file system on the disk is proprietary or unsupported.
File images are a way of abstracting any file that could exist on the file systems that a2kit handles, including sparse files. When you want to specify that an item is a file image you use the any type. As an example, suppose we have a binary file named thechip containing the 4-byte sequence 6, 5, 0, 2. We can get the any representation using
a2kit get -f thechip -t any -d mydos33.dsk --indent 4
Assuming console output, this would display
{
"fimg_version": "2.1.0",
"file_system": "a2 dos",
"chunk_len": 256,
"eof": "",
"fs_type": "04",
"aux": "",
"access": "",
"accessed": "",
"created": "",
"modified": "",
"version": "",
"min_version": "",
"full_path": "thechip",
"chunks": {
"0": "00030400060500020000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
}
}
For DOS, most of the metadata is empty. In this case there is only one "chunk," but generally there could be many. The same file retrieved from ProDOS would look different:
{
"fimg_version": "2.1.0",
"file_system": "prodos",
"chunk_len": 512,
"eof": "040000",
"fs_type": "06",
"aux": "0003",
"access": "E3",
"accessed": "",
"created": "842D1C0A",
"modified": "842D1C0A",
"version": "24",
"min_version": "00",
"full_path": "thechip",
"chunks": {
"0": "0605000200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
}
}
A few things to note:
- The file image is pure JSON, so it can be processed with any JSON parser.
- Most value fields are hex strings. The interpretation of metadata depends on the file system, but wherever possible, the bytes are in direct correspondence with what is stored on disk.
- For DOS 3.3, the starting address and length of the data are in the first two words of chunk 0. This is a characteristic of the file system, not the file image representation.
- For ProDOS, the starting address is in the aux value, and the length is in the eof value.
- The chunk keys do not have to be in any kind of sequence, or even be numbers. The file system implementation must know how to interpret its own keys. Generally sequential files will have keys in an unbroken sequence, while sparse files will have "missing" keys.
- The full_path must start at the root directory. Leading slashes (FAT), user numbers (CP/M), and volume names (ProDOS) are optional.
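The notes on starting address and length can be sketched in Python as follows. This is a minimal sketch, not a2kit code: the function names are hypothetical, the chunk strings are truncated to their first few bytes for brevity, the little-endian byte order is inferred from the examples above ("0003" read as $0300), and the data is assumed to fit entirely in chunk 0.

```python
def le_hex_to_int(hex_string):
    """Interpret a hex string as a little-endian unsigned integer."""
    return int.from_bytes(bytes.fromhex(hex_string), "little")

def binary_file_info(fimg):
    """Return (start_address, length, data) for a small binary file image.

    Assumes the whole file fits in chunk 0 and that multi-byte values
    are stored little-endian, as the examples above suggest.
    """
    file_system = fimg["file_system"]
    chunk0 = bytes.fromhex(fimg["chunks"]["0"])
    if file_system == "a2 dos":
        # DOS 3.3: the first two words of chunk 0 hold address and length.
        start = int.from_bytes(chunk0[0:2], "little")
        length = int.from_bytes(chunk0[2:4], "little")
        data = chunk0[4:4 + length]
    elif file_system == "prodos":
        # ProDOS: address is in aux, length is in eof.
        start = le_hex_to_int(fimg["aux"])
        length = le_hex_to_int(fimg["eof"])
        data = chunk0[:length]
    else:
        raise ValueError("file system not handled in this sketch: " + file_system)
    return start, length, data

# Chunks truncated to their leading bytes for readability.
dos_image = {"file_system": "a2 dos", "chunks": {"0": "0003040006050002"}}
prodos_image = {"file_system": "prodos", "aux": "0003", "eof": "040000",
                "chunks": {"0": "06050002"}}

print(binary_file_info(dos_image))
print(binary_file_info(prodos_image))
```

Both calls yield the same address, length, and data, which shows that the two file images describe the same file even though the metadata is laid out differently.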
You can pass file images through the a2kit pipeline the same as any other object, but when writing, the file_system key must match the file system found on the disk image. You can use an unpack node to work around this requirement:
# the following is an error because Apple is on the left and MS-DOS is on the right
a2kit get -f apple_fimg.json | a2kit put -d msdos.imd -t any -f myfile
# inserting the unpack node and committing to a type allows the copy to proceed
a2kit get -f apple_fimg.json | a2kit unpack -t txt | a2kit put -d msdos.imd -t txt -f myfile.txt
The "chunks" that appear in a file image are nearly identical to the file system's allocation blocks. The only difference is that a chunk number is purely abstract, whereas a block number corresponds to a definite location on disk. The way the chunk numbers get mapped to physical sectors will depend on the particular disk the file image is restored to.
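To make the chunk abstraction concrete, here is a small Python sketch that flattens a file image into a byte string. It is only an illustration under stated assumptions: the function name is hypothetical, chunk keys are assumed to be decimal integers (which the file image format does not guarantee), missing chunks are assumed to read as zeros, and a non-empty eof is assumed to be a little-endian length.

```python
def flatten_chunks(fimg):
    """Concatenate chunks in numeric key order, zero-filling any gaps."""
    chunk_len = fimg["chunk_len"]
    # Assumption: keys are decimal integers; real keys may be anything
    # the file system implementation understands.
    chunks = {int(key): bytes.fromhex(val) for key, val in fimg["chunks"].items()}
    if not chunks:
        return b""
    out = bytearray()
    for number in range(max(chunks) + 1):
        chunk = chunks.get(number, b"")
        # Pad short or missing chunks with zeros up to chunk_len.
        out += chunk.ljust(chunk_len, b"\x00")
    eof_hex = fimg.get("eof", "")
    if eof_hex:
        # Assumption: eof is a little-endian byte count.
        eof = int.from_bytes(bytes.fromhex(eof_hex), "little")
        del out[eof:]
    return bytes(out)

# Toy sparse file image with a tiny chunk length and a missing chunk 1.
sparse = {"chunk_len": 4, "eof": "0A0000",
          "chunks": {"0": "01020304", "2": "0506"}}
print(flatten_chunks(sparse).hex())
```

The flattened bytes say nothing about where the data would land on a disk; that mapping is decided only when the file image is restored to a particular disk image.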
Track operations allow you to look at or copy the track data that is stored with WOZ images. To get the bytes exactly as they are stored in the WOZ track buffer, use
a2kit get -t raw_track -f 17,0 -d mydos33.woz
Here the -f argument is <cylinder>,<head>. You can also align the nibbles:
a2kit get -t track -f 17,0 -d mydos33.woz
The output for track is easier to interpret, and it also displays the following mnemonics alongside:
Mnemonic | Pattern | Meaning | Comment |
---|---|---|---|
> ... | repeated $FF | sync bytes | false matches are possible |
(A: | $D5 AA 96 | address prolog | 3.5 inch, 5.25 inch 16 sector |
(A: | $D5 AA B5 | address prolog | 5.25 inch 13 sector |
(D: | $D5 AA AD | data prolog | |
::) | $DE AA EB | either epilog | 5.25 inch |
:) | $DE AA | either epilog | 3.5 inch |
0-F | address field | decoded 4&4 address nibbles | 5.25 inch |
0-F or ^ | address field | decoded 6&2 address nibbles | 3.5 inch, ^ means >15 |
R | $D5 or $AA | reserved bytes | in case they appear outside prologs/epilogs |
? | invalid nibble | possible bad track | OK in sync gaps |
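To illustrate what the address prolog and 4&4 rows refer to, the following Python sketch locates 5.25 inch 16-sector address fields in a nibble stream and decodes them. It is not a2kit's implementation: the function names are hypothetical, the input is assumed to be already byte-aligned nibbles held in a Python bytes object, and the field layout (volume, track, sector, checksum, each as a 4&4 pair) is the conventional DOS 3.3 one.

```python
ADDRESS_PROLOG_16 = bytes([0xD5, 0xAA, 0x96])

def encode_44(value):
    """Encode one byte as a 4&4 pair (odd bits first, then even bits)."""
    return bytes([(value >> 1) | 0xAA, value | 0xAA])

def decode_44(odd_bits, even_bits):
    """Recover the original byte from a 4&4 pair."""
    return ((odd_bits << 1) | 1) & even_bits & 0xFF

def find_address_fields(nibbles):
    """Decode every address field that follows a 16-sector prolog."""
    fields = []
    pos = nibbles.find(ADDRESS_PROLOG_16)
    while pos >= 0:
        body = nibbles[pos + 3:pos + 11]
        if len(body) == 8:
            volume, track, sector, checksum = (
                decode_44(body[i], body[i + 1]) for i in range(0, 8, 2)
            )
            fields.append({
                "volume": volume,
                "track": track,
                "sector": sector,
                # The checksum is the XOR of the other three values.
                "checksum_ok": checksum == volume ^ track ^ sector,
            })
        pos = nibbles.find(ADDRESS_PROLOG_16, pos + 1)
    return fields

# Synthetic stream: sync bytes, prolog, address field, epilog.
stream = (bytes([0xFF] * 5) + ADDRESS_PROLOG_16
          + encode_44(254) + encode_44(17) + encode_44(0)
          + encode_44(254 ^ 17 ^ 0) + bytes([0xDE, 0xAA, 0xEB]))
print(find_address_fields(stream))
```

Because 4&4 bytes always have the bits of $AA set, the reserved value $D5 cannot occur inside an encoded address field, which is why a simple search for the prolog is adequate in this sketch.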
If you want to copy a track, you can only use raw_track, and the source buffer must be the same size as the destination buffer. The encoding of the source is carried over without modification to the destination.
Track operations are only for image types that store the detailed track data. If the image type does not have such data an error will be returned.
Sector operations allow you to read or write directly to physical sectors, without any need to identify a file system. The -f argument is <cylinder>,<head>,<sector>. For example,
a2kit get -t sec -f 17,0,1 -d mydos33.woz
would read cylinder 17, side 0, sector 1, and display it to stdout. For images with the full track data, like WOZ images, there is no problem defining exactly what a physical sector is, i.e., the track is searched for the given address. For image types that rely on an assumed ordering, such as DSK, it cannot always be guaranteed that the retrieved sector corresponds to the sector address on the original (e.g. if the DSK is ProDOS ordered this will not work). In such cases you can use block operations to identify the file system and retrieve its native allocation unit.
You can also write a sector by pipelining some data into put:
# dangerous operation
a2kit get -f some_local_file | a2kit put -t sec -f 17,0,1 -d mydos33.woz
It is important to understand that this is a blind write: there are no checks against breaking whatever file system is on the disk.
You can use block operations to get or put the file system's native allocation unit. The block is always identified by a single unsigned integer. For this reason, if the file system is sector-oriented (e.g. DOS 3.x), one needs the following formula to relate track and sector numbers to the block number:
// only needed for DOS 3.x
block_number = track_number * sectors_per_track + logical_sector_number;
In order to get a block, simply specify the block "file type" and use the block number as the "path":
a2kit get -t block -f 272 -d mydos33.dsk
This will display the VTOC if the disk is a 5.25 inch DOS 3.3 floppy.
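The block number 272 used here follows from the formula above, since the VTOC sits at track 17, sector 0 on a 16-sector disk. A small Python sketch of the conversion in both directions, with hypothetical helper names:

```python
def dos3x_block(track, logical_sector, sectors_per_track=16):
    """Convert a DOS 3.x track and logical sector to a block number."""
    if not 0 <= logical_sector < sectors_per_track:
        raise ValueError("logical sector out of range")
    return track * sectors_per_track + logical_sector

def dos3x_track_sector(block, sectors_per_track=16):
    """Convert a block number back to (track, logical_sector)."""
    return divmod(block, sectors_per_track)

print(dos3x_block(17, 0))        # VTOC on a 16-sector disk
print(dos3x_track_sector(272))
print(dos3x_block(17, 0, 13))    # same track and sector on a 13-sector disk
```

The sectors_per_track default of 16 matches DOS 3.3; pass 13 for the older 13-sector format.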
You can also write a block by pipelining some data into put:
# dangerous operation
a2kit get -f some_local_file | a2kit put -t block -f 272 -d mydos33.dsk
It is important to understand that this is a blind write: there are no checks against breaking whatever file system is on the disk.
You can access any number of sectors or blocks all at once using non-contiguous range specifications. Apart from being a CLI convenience, this can make an enormous difference in script performance. A non-contiguous region is specified by putting the ,, separator between contiguous regions. Contiguous regions are formed using the .. separator. As an example, suppose we want the boot tracks and directory track of a DOS disk. We can use
# range specifier has to be quoted in some shells
a2kit get -d mydos33.dsk -t sec -f 0..4,0,0..16,,17,0,0..16
Remember, the single comma separates cyl,head,sec, while the double-comma separates contiguous regions. The range beg..end iterates over beg,beg+1,...,end-1. For blocks the notation is similar. The following would grab blocks 0,3,4,5,6,7:
# range specifier has to be quoted in some shells
a2kit get -d prodos.dsk -t block -f 0,,3..8
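The range notation can be modeled with a few lines of Python. This is a sketch of the rules as stated above (double comma between regions, single comma between components, end-exclusive ranges), not a2kit's parser. The function names are hypothetical, and the iteration order within a region (last component varying fastest) is an assumption.

```python
from itertools import product

def expand_component(text):
    """Expand 'beg..end' to range(beg, end), or 'n' to a single value."""
    if ".." in text:
        beg, end = text.split("..")
        return range(int(beg), int(end))
    return range(int(text), int(text) + 1)

def expand_ranges(spec):
    """Expand a range specification into a list of address tuples."""
    addresses = []
    for region in spec.split(",,"):
        components = [expand_component(part) for part in region.split(",")]
        # Assumption: the last component (e.g. sector) varies fastest.
        addresses.extend(product(*components))
    return addresses

print(expand_ranges("0,,3..8"))
sectors = expand_ranges("0..4,0,0..16,,17,0,0..16")
print(len(sectors), sectors[0], sectors[-1])
```

Expanding a specification this way before running a command is one way to check how many sectors or blocks a given -f argument will touch.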
If a track has two or more sectors with matching addresses, one must use a sequence to retrieve the data unambiguously. The important point is that a2kit get and a2kit put will always perform a multi-sector operation in angle-order. Suppose we have track 0 laid out in angular sequence with sectors 1,2,3,2,5. To unambiguously get, say, the instance of "sector 2" that follows sector 3, it is necessary to get two sectors' worth of data:
# range specifier has to be quoted in some shells
a2kit get -d mydisk.imd -t sec -f 0,0,3,,0,0,2 # unambiguous
This must be done in a single invocation of the process, i.e., the following will not work:
a2kit get -d mydisk.imd -t sec -f 0,0,3
a2kit get -d mydisk.imd -t sec -f 0,0,2 # ambiguous
When doing this sort of processing, it may be helpful to retrieve the disk geometry first (see above).
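The following toy model in Python may help show why the sequence matters. It is not how a2kit searches a track: the function name is hypothetical, the track is reduced to a list of sector IDs in angular order, and each new invocation is assumed to start searching from the beginning of that list.

```python
def locate_in_angle_order(track_layout, wanted, start=0):
    """Return the angular positions matched by a sequence of sector IDs.

    Each search resumes just past the previous match and wraps around,
    which models a multi-sector operation performed in angle-order.
    """
    positions = []
    pos = start
    count = len(track_layout)
    for sector in wanted:
        for step in range(count):
            index = (pos + step) % count
            if track_layout[index] == sector:
                positions.append(index)
                pos = index + 1
                break
        else:
            raise LookupError("sector %d not found" % sector)
    return positions

layout = [1, 2, 3, 2, 5]
print(locate_in_angle_order(layout, [3, 2]))  # one invocation: 3, then the 2 after it
print(locate_in_angle_order(layout, [2]))     # separate invocation: the first 2
```

In this model the single two-sector request reaches the second "sector 2", while a standalone request for sector 2 lands on the first one.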
- Don't confuse physical sectors with logical sectors. Logical sectors are never referenced except through blocks. See above for how Apple DOS blocks are defined.
- Don't confuse geometric order with physical order. Cylinder/head inputs are geometrically ordered. Physical addresses are defined by each track's CHS map.
- Sector access is not allowed for logical volumes, such as PO images.
- CP/M reserved tracks are only accessible by sector. The user area is accessible by either sector or block.
- For FAT disks, the reserved, FAT, and root directory regions are only accessible by sector. The data region is accessible by either sector or block.
- Do not use sector access on an Apple DSK unless it is known to be DOS ordered.
- If you pipe the result of a multi-sector or multi-block get, the receiving node (probably put) must use the same range specification. N.b. a range mismatch that happens to match in size will fail silently.