FileTypes

High Level File Types

The following are the high level file types defined by a2kit. Specifying one type or the other can affect how the data is packed in the allocation blocks and/or how the data is encoded. Applicability to specific file systems is noted at the end of each section.

Raw Files

The raw type treats a file's allocation blocks as a byte stream, with no interpretation at all. If the file is sparse, a2kit will concatenate the allocated blocks (sparse files can be handled fully using the any type). You can ask for the byte stream to be truncated at the EOF using the --trunc flag. The exact effect of --trunc depends on how metadata is stored in the file system's directory.

When a raw file is saved to a file system that maintains its own type codes (e.g. ProDOS), the file will be saved as text, regardless of contents. This can be changed after the fact using the retype subcommand.

The raw type works with any file system.

Binary Files

The a2kit type code for a binary file is bin.

Some file systems define a load address. When working with binary files, a2kit passes only the data between pipeline nodes, i.e., the load address is discarded. As a result, for some file systems, the load address has to be specified anew for each node in the pipeline. To pass the full information about a file through the pipeline, use the any type; see the low level page.

The bin type works with any file system.

Raw vs. Binary

While bin and raw both work with binary data, the packing strategy is different. Refer to the following table for the various distinctions.

Property	raw	bin
`get` truncates at EOF	with `--trunc` maybe	always
`get` strips header	never	always
`put` sets EOF	always	always
`put` adds header	never	always
`put` maps to FS type	text	binary

Further details depend on the file system.
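
As a rough illustration of the header and EOF rows in the table, the following Python sketch models a bin-style put and get next to a raw-style get. It assumes a DOS 3.3 style layout, where the stored data begins with a two-byte little-endian load address followed by a two-byte length; other file systems keep this information elsewhere (for example in the directory). The function names and the padding behavior are hypothetical and are not part of a2kit.

```python
import struct
from typing import Tuple

SECTOR_SIZE = 256  # assumed allocation unit (DOS 3.3 sector size in bytes)


def pack_bin(data: bytes, load_addr: int) -> bytes:
    """Model of a bin-style put: add a header, then pad to whole sectors."""
    # Assumed header: load address and data length, both 16-bit little-endian.
    stored = struct.pack("<HH", load_addr, len(data)) + data
    padding = -len(stored) % SECTOR_SIZE
    return stored + bytes(padding)


def unpack_bin(stored: bytes) -> Tuple[int, bytes]:
    """Model of a bin-style get: strip the header and truncate at the EOF."""
    load_addr, length = struct.unpack_from("<HH", stored, 0)
    return load_addr, stored[4:4 + length]


def unpack_raw(stored: bytes) -> bytes:
    """Model of a raw-style get without truncation: the blocks as they are."""
    return stored
```

In this model, unpack_bin recovers exactly the bytes that were packed, while unpack_raw returns the header and the padding along with them.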

BASIC Language Files

BASIC is an interpreted language, but the representation on an Apple II disk is usually tokenized, i.e., each keyword or other symbol is represented by a single byte. The byte values are chosen so that they do not overlap with the text encoding in use. When processing BASIC files you have to know whether you are dealing with a "source" file (this will almost always be found on the local file system, not on a disk image) or a tokenized file. The a2kit type codes are as follows:

BASIC	format	type code
Applesoft	source	atxt
Applesoft	tokens	atok
Integer	source	itxt
Integer	tokens	itok

It is important to realize a2kit will not automatically tokenize the source or detokenize the tokens. You have to insert a pipeline node if you want to do this. See Languages for examples.
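
To make the idea of tokenization concrete, here is a toy Python sketch of how a keyword can collapse to a single byte that cannot be confused with text. It is not a2kit's tokenizer. The three-entry keyword table is illustrative only (check the values against an Applesoft reference before relying on them), and the sketch ignores line numbers, line links, and the removal of spaces that a real tokenizer performs.

```python
# Illustrative keyword table (assumption): every token has the high bit set,
# so it cannot collide with 7-bit ASCII text.
TOKENS = {"HOME": 0x97, "GOTO": 0xAB, "PRINT": 0xBA}


def tokenize_statement(stmt: str) -> bytes:
    """Replace keywords outside string literals with single-byte tokens."""
    out = bytearray()
    i = 0
    in_string = False
    while i < len(stmt):
        ch = stmt[i]
        if ch == '"':
            in_string = not in_string
        if not in_string:
            # Look for a keyword starting at the current position.
            kw = next((k for k in TOKENS if stmt.startswith(k, i)), None)
            if kw is not None:
                out.append(TOKENS[kw])
                i += len(kw)
                continue
        # Everything else is kept as 7-bit ASCII text.
        out.append(ord(ch) & 0x7F)
        i += 1
    return bytes(out)
```

For instance, in PRINT "HOME" only the PRINT keyword becomes a token; the HOME inside the quotes stays as text.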

Applesoft and Integer BASIC are specific to DOS and ProDOS.

Assembly Language Files

Some assembly language source files, notably Merlin, are "tokenized" in the sense that they are not simple ASCII. The a2kit type codes are as follows:

Assembler	format	type code
Merlin	local source	mtxt
Merlin	disk image source	mtok
Merlin	assembled executable	bin

Just as with BASIC program files, a2kit will not automatically "tokenize" local source files or "detokenize" source files taken from a disk image. You have to insert a pipeline node if you want to do this. See Languages for examples.

Merlin assembly language is specific to DOS and ProDOS.

Sequential Text

Text files on disk images may be encoded differently from those on the local file system. For example, DOS 3.3 text files are negative ASCII and use carriage returns as line separators. When using get and put the encoding is automatically converted if the file type is txt. If you want to preserve the original encoding, use the raw type. As an example,

a2kit get -f mytext -t txt -d img.dsk

will display readable text. On the other hand,

a2kit get -f mytext -t raw -d img.dsk --trunc

will display a hex dump. When forming a pipeline use txt unless you have a specific reason to use raw.
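
For readers curious what the txt conversion involves, the following Python sketch shows the DOS 3.3 case described above: negative ASCII means the high bit of every byte is set, and lines end with a carriage return. This is a minimal sketch of the encoding only, with hypothetical function names; a2kit's own conversion may handle additional details such as trailing padding.

```python
def negative_ascii_to_text(raw: bytes) -> str:
    """Clear the high bit of each byte and turn carriage returns into newlines."""
    chars = bytes(b & 0x7F for b in raw)
    return chars.decode("ascii").replace("\r", "\n")


def text_to_negative_ascii(text: str) -> bytes:
    """Turn newlines into carriage returns and set the high bit of each byte."""
    encoded = text.replace("\n", "\r").encode("ascii")
    return bytes(b | 0x80 for b in encoded)
```

The two functions are inverses for plain ASCII text that uses newline line endings.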

Sequential text works with any file system.

Random Access Text

Random access text files are manipulated through a JSON representation, which is assigned the abstract file type rec (records). For example, to get a random access file from a ProDOS image use

a2kit get -f myrecords -t rec -d img.dsk

For DOS 3.3 the record length has to be given:

a2kit get -f myrecords -t rec -d img.dsk -l 127

This will display a JSON string representing the file. It might look something like this:

{
    "a2kit_type": "rec",
    "record_length": 127,
    "records": {
        "5": ["field1","field2"],
        "2": ["field"]
    }
}

Notice that the record numbers do not have to be sequential or in any particular order. This structure can be passed along the pipeline; for example, you can copy the records from a DOS 3.3 disk to a ProDOS disk:

a2kit get -f myrecords -t rec -d img.dsk -l 127 | a2kit put -f myrecords -t rec -d img.po

If you want to put a local file as random access text, it must be in the JSON representation. Note that this differs from the other file types described above, where the source can be a simple binary or text file.
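
One way to prepare such a local file is to generate the JSON from a script. The Python sketch below builds and reads back a document with the same three keys as the example shown earlier in this section. It assumes those keys are the complete schema, which the text here does not confirm, and the helper names are hypothetical.

```python
import json


def make_rec_json(record_length: int, records: dict) -> str:
    """Build a rec-style JSON string from {record_number: [fields, ...]}."""
    doc = {
        "a2kit_type": "rec",
        "record_length": record_length,
        # JSON object keys must be strings, so record numbers are converted.
        "records": {str(num): fields for num, fields in sorted(records.items())},
    }
    return json.dumps(doc, indent=4)


def read_rec_json(text: str) -> dict:
    """Parse a rec-style JSON string back into {record_number: [fields, ...]}."""
    doc = json.loads(text)
    if doc.get("a2kit_type") != "rec":
        raise ValueError("not a rec representation")
    return {int(num): fields for num, fields in doc["records"].items()}
```

The string returned by make_rec_json can be written to a local file and used as the input when putting with the rec type.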

Random access text works with DOS and ProDOS. a2kit does not yet support CP/M random access text files. You can, however, use the any type to manipulate an arbitrary sparse file, including on CP/M.

Other Types

You can always use the retype subcommand to change one of the above types into any other type, e.g., you can change a binary file into a ProDOS system file. Of course, the contents of the file must be consistent with the requirements of the new type. If the block-wise storage pattern needs to be controlled, you can use the any representation; see the low level page.
