Tools for working with or creating MBR-formatted disk images #14

Open
4 of 5 tasks
reynir opened this issue Mar 7, 2023 · 29 comments · May be fixed by #28
@reynir
Member

reynir commented Mar 7, 2023

It would be nice to develop one or more tools to:

  • Inspect an MBR header's content, for example printing out which partitions it contains, their size, type, etc.
  • Retrieve the data contents of a partition, for example to stdout or a file
    • Write data contents from a file or stdin to a partition
  • Resize partitions in a disk image
  • Create an MBR-formatted disk image with contents from files passed as command line arguments. Padding and empty sections may optionally be passed.
reynir added the Outreachy, good-first-issues, and medium (a medium difficulty issue) labels on Mar 7, 2023
@PizieDust
Contributor

PizieDust commented Mar 7, 2023

Hello @reynir
It will be my pleasure to try this issue. Please can I be assigned here?

@reynir
Member Author

reynir commented Mar 9, 2023

@PizieDust what would you like to work on? I suggest working on one of the first two, as they are not as difficult as the latter two. It may be helpful to look at the Wikipedia article. We implement the "modern standard MBR". https://en.wikipedia.org/wiki/Master_boot_record

To create a new binary executable you can use dune init exec --libs mbr mbr_inspect bin/. This will create a directory bin/ with files mbr_inspect.ml and dune. You write the code in mbr_inspect.ml.

@PizieDust
Contributor

Hi @reynir , I apologize for the delay on this issue. This is my first time working with functional programming.
For this task, would it be better to create a new .ml file in the lib directory, or is it okay to modify the code in mbr.ml to achieve the task?

@PizieDust
Contributor

Okay thank you. This is very helpful.

@PizieDust
Contributor

@reynir
So, for the first task, my understanding is that we need a function or a module that can receive an MBR header as input and read its contents, which are:

  bootstrap_code : string;
  original_physical_drive : int;
  seconds : int;
  minutes : int;
  hours : int;
  disk_signature : int32;
  partitions : Partition.t list;

or, for the modern standard MBR, which contains the fields:

  bootstrap_code1 : uint8_t; [@len 218]
  _zeroes_1 : uint8_t; [@len 2]
  original_physical_drive : uint8_t;
  seconds : uint8_t;
  minutes : uint8_t;
  hours : uint8_t;
  bootstrap_code2 : uint8_t; [@len 216]
  disk_signature : uint32_t;
  _zeroes_2 : uint8_t; [@len 2]
  partitions : uint8_t; [@len 64]
  signature1 : uint8_t; (* 0x55 *)
  signature2 : uint8_t; (* 0xaa *)

and for each of the partitions (max 4), we also iterate over them and read the fields which are:

    active : bool;
    first_absolute_sector_chs : Geometry.t;
    ty : int;
    last_absolute_sector_chs : Geometry.t;
    first_absolute_sector_lba : int32;
    sectors : int32;

In the file lib/mbr.ml, it is my understanding that the unmarshal function converts the MBR header from a cstruct to an OCaml record.

So my question is: in the implementation of mbr_inspect.ml, can we use this unmarshal function to first parse the MBR header into an OCaml record and then access the fields of the record and print them?

We also have an unmarshal function in the Partition module.

I'm trying to figure out the OCaml syntax to achieve this. I hope I am heading in the right direction.

@PizieDust
Contributor

PizieDust commented Mar 9, 2023

For the second task, here is a basic implementation I have (pardon the wrong syntax)

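let print_partition partition =
     (* active is a bool and ty is an int, so they need %b and a numeric format *)
     Printf.printf "Active: %b\n" partition.Mbr.Partition.active;
     Printf.printf "Type: 0x%02x\n" partition.Mbr.Partition.ty;
...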

@reynir
Member Author

reynir commented Mar 10, 2023

Yes, you're on the right track. Use the unmarshal to parse the MBR header from a cstruct. A Cstruct.t is a kind of buffer. You can use Cstruct.of_string to convert a string to a Cstruct.t if you're reading strings.

The difference between the type t and the type mbr is that the latter is very close to the disk layout and includes attributes used for code generation using the "ppx" mechanism. The generated code helps parse the raw modern standard MBR format. You don't have to worry much about that for now. The former is an OCaml record, as you note. It is a higher level representation. The zeroes are left out, and the bootstrap code is concatenated. In other words, we hide away some unimportant details of the underlying representation.
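
To make the raw layout concrete, here is a minimal sketch that decodes the partition table by hand using only the OCaml standard library. It is not the library's code: the entry record and the names parse_entry and parse_mbr are hypothetical, the CHS fields are skipped, and the offsets follow from the field lengths listed earlier in this thread (218 + 2 + 4 + 216 + 4 + 2 = 446 bytes before the 64-byte partition table).

```ocaml
(* Hypothetical simplified entry; this is NOT the library's Mbr.Partition.t. *)
type entry = {
  active : bool;     (* status byte has bit 7 (0x80) set *)
  ty : int;          (* partition type byte *)
  first_lba : int32; (* first absolute sector, little-endian *)
  sectors : int32;   (* length in sectors, little-endian *)
}

let sector_size = 512
let table_offset = 446 (* the four 16-byte entries start here *)

(* Decode one 16-byte entry starting at byte offset [off]. *)
let parse_entry buf off =
  { active = (Bytes.get_uint8 buf off) land 0x80 <> 0;
    ty = Bytes.get_uint8 buf (off + 4);
    first_lba = Bytes.get_int32_le buf (off + 8);
    sectors = Bytes.get_int32_le buf (off + 12) }

(* Check the size and the 0x55 0xaa signature, then decode the used slots. *)
let parse_mbr buf =
  if Bytes.length buf < sector_size then Error "MBR sector too short"
  else if Bytes.get_uint8 buf 510 <> 0x55 || Bytes.get_uint8 buf 511 <> 0xaa
  then Error "missing 0x55 0xaa signature"
  else
    List.init 4 (fun i -> parse_entry buf (table_offset + (16 * i)))
    (* a zero type byte is treated as an unused slot in this sketch *)
    |> List.filter (fun e -> e.ty <> 0)
    |> Result.ok

let print_entry e =
  Printf.printf "active=%b type=0x%02x start=%ld sectors=%ld\n"
    e.active e.ty e.first_lba e.sectors
```

In the real tool, the library's unmarshal function does this work and also handles the fields this sketch ignores.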

@PizieDust
Contributor

PizieDust commented Mar 10, 2023

@reynir I'm trying to create a partition with the code (which will be saved in a file) and then use this partition to test the code for this task, but I keep having errors with creating the partition. Here is the code I am using:

    let disk_length_bytes = Int32.(mul (mul 16l 1024l) 1024l) in
    let disk_length_sectors = Int32.(div disk_length_bytes 512l) in
    let start_sector = 2048l in
    let length_sectors = Int32.sub disk_length_sectors start_sector in
    let partition1 = Mbr.Partition.make ~active:true ~ty:6 start_sector length_sectors in
    let mbr = Mbr.make [ partition1 ] in
    match mbr with
    | Ok -> print_endline "MBR created"
    | Error msg -> Printf.printf "MBR failed %s\n" msg

Running dune build gives the error:

11| let mbr = Mbr.make [ partition1 ] in (* partitions is underlined *)
Error: This expression has type (Mbr.Partition.t, string) result but an expression was expected of type Mbr.Partition.t

What could I be doing wrong?

@reynir
Member Author

reynir commented Mar 10, 2023

Edit: I forgot to explain the error. The error is due to Mbr.Partition.make returning a "result" type. You need to match in the same way as you do for the result of Mbr.make.
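
A minimal sketch of that nested matching follows. So that it runs without the library, it uses a hypothetical Stub module whose two functions only mimic the result-returning shape of Mbr.Partition.make and Mbr.make described in this thread; the validation rules inside Stub are invented for illustration.

```ocaml
(* Hypothetical stand-ins: they only imitate the "returns a result" shape of
   Mbr.Partition.make and Mbr.make. The checks inside are made up. *)
module Stub = struct
  type partition = { active : bool; ty : int; start : int32; sectors : int32 }

  let make_partition ?(active = false) ?(ty = 6) start sectors =
    if Int32.compare sectors 0l <= 0 then Error "partition must not be empty"
    else Ok { active; ty; start; sectors }

  let make_mbr partitions =
    if List.length partitions > 4 then Error "too many partitions"
    else Ok partitions
end

let describe () =
  let disk_length_bytes = Int32.(mul (mul 16l 1024l) 1024l) in
  let disk_length_sectors = Int32.div disk_length_bytes 512l in
  let start_sector = 2048l in
  let length_sectors = Int32.sub disk_length_sectors start_sector in
  (* First unwrap the partition result ... *)
  match Stub.make_partition ~active:true ~ty:6 start_sector length_sectors with
  | Error msg -> Printf.sprintf "Partition failed: %s" msg
  | Ok partition1 ->
    (* ... and only then pass the plain partition value on. *)
    match Stub.make_mbr [ partition1 ] with
    | Ok _ -> "MBR created"
    | Error msg -> Printf.sprintf "MBR failed: %s" msg

let () = print_endline (describe ())
```

Note that the Ok pattern needs an argument (here `Ok _`), since the constructor carries the created value.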

I created a test image (in a .tar.gz archive due to GitHub limitations): test.img.tar.gz

I used the following commands:

$ fallocate -l 2K test.img # Allocate a 2 KiB file, which is 4 sectors of 512 bytes
$ parted --align none test.img mklabel msdos # Write MBR
$ parted --align none test.img mkpart primary 1s 1s # Add a one-sector partition
$ parted --align none test.img mkpart primary 2s 3s # Add a two-sector partition (remaining space)

The argument --align none tells parted to not try to align things in a way required by old operating systems (and/or disks).

@PizieDust
Contributor

Oh I see. Thank you. I just downloaded the archive. It'll come in really handy. thanks much

@PizieDust
Contributor

@reynir Concerning task 3 (resizing partitions in a disk image), I have some questions. Say we have a disk with 3 partitions:

  • partition A
  • partition B
  • partition C

Hypothetically, how will our function work? Say we have to resize partition B by increasing its size. This could cause an overlap with partition C. Do we shift partition C to handle the overlap?

my idea of the function was something like this:

let resize_partition mbr partition_number new_size = 
.....

where our function takes in the MBR, the number of the partition to be resized, and the new size. Then we iterate over the existing partitions in the MBR and find the exact partition we are looking for. Then we change the sectors to match the new size and return the new partition.

From here, I think we'll update the partitions in the MBR and write the changes to the file (overwrite??).
I don't know if I'm thinking about it correctly.

I created a new file resize_partition.ml in the /bin folder.

@reynir
Member Author

reynir commented Mar 14, 2023

Indeed, the partitions may overlap and care should be taken in that case. The default should be to error if partitions would overlap. Shifting partitions would require moving data around on the disk (image). If this fails, for example sudden shutdown or the user aborting the command, you may end up with a bad state in the partition. I don't expect you to implement this.
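
The "error on overlap" default could be sketched roughly as below. This is an illustration under assumptions, not the library's interface: the part record and the resize function are hypothetical, and int64 is used instead of the library's int32 only to keep the end-of-range arithmetic free of overflow.

```ocaml
(* Hypothetical simplified partition: first sector and length in sectors. *)
type part = { first : int64; sectors : int64 }

(* Two half-open sector ranges [first, first + sectors) overlap when each
   one starts before the other one ends. *)
let overlaps a b =
  a.first < Int64.add b.first b.sectors
  && b.first < Int64.add a.first a.sectors

(* Resize the partition at index [n]; fail instead of shifting neighbours. *)
let resize parts n new_sectors =
  if n < 0 then Error "no such partition"
  else
    match List.nth_opt parts n with
    | None -> Error "no such partition"
    | Some _ when new_sectors <= 0L -> Error "new size must be positive"
    | Some p ->
      let resized = { p with sectors = new_sectors } in
      let others = List.filteri (fun i _ -> i <> n) parts in
      if List.exists (overlaps resized) others
      then Error "resized partition would overlap another partition"
      else Ok (List.mapi (fun i q -> if i = n then resized else q) parts)
```

A real tool would also need to check that the resized partition still fits inside the disk image.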

You would need to overwrite the first 512 bytes of the file with a new header. Take care not to truncate or replace the file.

The interface of the library likely has to be extended a bit. The types Mbr.t and Mbr.Partition.t are marked private, meaning that users of the library can inspect the type but not create new values directly. Instead, the "smart" constructors Mbr.make and Mbr.Partition.make have to be used. The "smart" thing about smart constructors is that they are functions that can ensure invariants are kept. It may be handy to implement a function Mbr.with_partitions that takes an Mbr.t and a Mbr.Partition.t list and returns a new Mbr.t. I also have some doubts about the current interface :)

https://v2.ocaml.org/releases/5.0/htmlman/privatetypes.html

@PizieDust
Contributor

Thank you, this is very helpful.
One final question, say it is implemented and the function returns a new Mbr.t, do we just print the new Mbr structure to console or is it to be written to like a file? I'm not quite sure. Or can I just work on this and output to console and then after reviewing you indicate what can be done next?

@reynir
Member Author

reynir commented Mar 14, 2023

The MBR structure needs to be marshaled and the marshaled structure is what needs to be written.
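
As a rough picture of what marshaling means here, the following standard-library-only sketch lays partition entries out in a 512-byte sector. The entry record and marshal_header are hypothetical names; the sketch leaves the CHS fields, bootstrap code, and disk signature as zeroes, which the library's own marshaling would handle properly.

```ocaml
(* Hypothetical simplified entry; CHS fields are not modelled. *)
type entry = { active : bool; ty : int; first_lba : int32; sectors : int32 }

(* Build a 512-byte sector holding up to four entries and the signature. *)
let marshal_header entries =
  if List.length entries > 4 then Error "at most four partitions"
  else begin
    let buf = Bytes.make 512 '\000' in
    List.iteri
      (fun i e ->
        let off = 446 + (16 * i) in (* entry i of the partition table *)
        Bytes.set_uint8 buf off (if e.active then 0x80 else 0x00);
        Bytes.set_uint8 buf (off + 4) e.ty;
        Bytes.set_int32_le buf (off + 8) e.first_lba;
        Bytes.set_int32_le buf (off + 12) e.sectors)
      entries;
    (* boot signature in the last two bytes *)
    Bytes.set_uint8 buf 510 0x55;
    Bytes.set_uint8 buf 511 0xaa;
    Ok buf
  end
```

The resulting bytes are what would be written over the first sector of the image.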

@reynir
Member Author

reynir commented Mar 17, 2023

I updated the issue with tasks done and I clarified the second task -- it's about reading the data from a partition

@PizieDust
Contributor

Okay, great. For the second task, reading the data content of a partition: does this mean reading what is actually stored in the partition?

@reynir
Member Author

reynir commented Mar 17, 2023

Yes. The idea is a partition may contain e.g. a tar archive. Because of the MBR header (and potentially other partitions before) the standard tar tools are not able to read the tar archive. Instead, it may be useful to "extract" the partition data in order to use other tools. A related task could be to "blit" or write data from a file into a partition in a disk image.
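
The extraction step could look roughly like this sketch, which assumes 512-byte sectors and that the partition's first sector and length are already known (for example from the parsed header). The function name extract_partition and its labelled arguments are hypothetical.

```ocaml
let sector_size = 512

(* Copy [sectors] sectors, starting at sector [first_lba] of the file
   [image], to the output channel [oc]. Raises End_of_file if the image is
   shorter than the partition claims. *)
let extract_partition ~image ~first_lba ~sectors oc =
  let ic = open_in_bin image in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () ->
      (* skip the MBR header and any earlier partitions *)
      seek_in ic (first_lba * sector_size);
      let buf = Bytes.create sector_size in
      for _i = 1 to sectors do
        really_input ic buf 0 sector_size;
        output_bytes oc buf
      done)
```

Passing stdout as the channel gives the "to stdout" variant, so the output can be piped into tools such as tar.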

@PizieDust
Contributor

Hello @reynir
For the subtask on task 2: writing to a partition.
I have a general idea of this, but I am having trouble finding which functions can write to a partition. I understand this will require some low-level I/O operations. I looked into some modules and found the OCaml Unix module. I am unsure if this is a good fit, given that this code is for use with MirageOS, which may not support Unix-type functionality. Let me know your thoughts on this.

@reynir
Member Author

reynir commented Mar 21, 2023

You can use the Unix module for the command line tools. They are for running in a Unix-like environment. While the I/O operations would need to be changed for Mirage, the source of the tool can serve as an example of how to use the library (while being useful!).

@PizieDust
Contributor

Oh thank you. Let me get on it. I have some bare code written, I'll just add the Unix module and experiment with it.

@reynir
Member Author

reynir commented Mar 21, 2023

I think you should be able to use seek_out for writing to the partition FWIW
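
A minimal sketch of that approach, using only standard-library channels, is shown below. The function name write_partition and its arguments are hypothetical, and 512-byte sectors are assumed. The file is opened without Open_trunc or Open_creat so the existing image is neither truncated nor replaced.

```ocaml
let sector_size = 512

(* Write [data] at the start of the partition that begins at sector
   [first_lba] and spans [sectors] sectors of the existing file [image]. *)
let write_partition ~image ~first_lba ~sectors data =
  if String.length data > sectors * sector_size then
    Error "data does not fit in the partition"
  else begin
    (* no Open_trunc / Open_creat: the image keeps its size and contents *)
    let oc = open_out_gen [ Open_wronly; Open_binary ] 0 image in
    Fun.protect
      ~finally:(fun () -> close_out_noerr oc)
      (fun () ->
        seek_out oc (first_lba * sector_size);
        output_string oc data;
        flush oc);
    Ok ()
  end
```

With first_lba set to 0 and a 512-byte buffer as data, the same pattern overwrites only the MBR header.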

@0xrotense

I would like to work on one of the last two, can you assign me, please?

@PizieDust
Contributor

That's cool, but I already have some code in development. Maybe you could help review when I open PRs.

@reynir
Member Author

reynir commented Mar 30, 2023

@0x0god I would advise that you focus on reynir/mirage-block-partition#6. The last two tasks in this issue are not easy first tasks.

@0xrotense

Thanks for the advice. Please check my comment on my recent PR.

@PizieDust
Contributor

Hello @reynir
When you are available I have some questions about the final task of this issue:

Create an MBR-formatted disk image with contents from files passed as command line arguments. Padding and empty sections may optionally be passed.

So far we have been working with MBR headers. The task is about creating an MBR formatted disk. Is this to achieve the same thing as when we use fallocate and parted?

By empty sections do we mean un-allocated space? I don't quite understand what padding here means.

Also what's the expected implementation for this task like? A script which maybe executes some unix commands to create the disk, usage of the other scripts such as write_partition.exe to write file contents to the disk?

@reynir
Member Author

reynir commented Apr 20, 2023

Yes, this is about achieving the same as we did with fallocate and parted, and a little more. The idea was to pass zero or more files with partition data, some options, and a destination, and it would write a disk image to the destination with an MBR header and partitions containing the data. I think what I meant by padding was partitions larger than the input data, so containing either zeroes or uninitialized data at the end. Empty sections would be space in between partitions.

Given that you wrote several tools that can be used to achieve some of the sub tasks it maybe makes more sense to just write a tool for writing a fresh MBR header. Then a small script could be written to achieve the above.

@PizieDust
Contributor

Thank you. This is a great explanation.

@PizieDust
Contributor

Hi @reynir
I've been tinkering around with this. Here are my initial comments.
When we are creating a disk with files that will be saved in partitions, this means we are limited to 4 files, right, since the MBR structure can only support a maximum of 4 partitions?
Following from this, can one also assume that we can auto-calculate the size of the disk image by taking the sum of the sizes of the individual files?
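
One way to explore the size question is the sketch below. It is not an answer from the maintainer, and the names sectors_of_bytes and layout are hypothetical. It assumes 512-byte sectors, partitions packed back to back with no padding or empty sections, and sector 0 reserved for the MBR header. Under those assumptions the image is larger than the plain sum of the file sizes, because of the header sector and because each file is rounded up to whole sectors.

```ocaml
let sector_size = 512

(* Sectors needed to hold [bytes] bytes, rounded up, and at least one so
   that an empty file still gets a non-empty partition in this sketch. *)
let sectors_of_bytes bytes =
  max 1 ((bytes + sector_size - 1) / sector_size)

(* Given file sizes in bytes, return a (first_sector, sectors) pair per
   partition plus the total image size in sectors. *)
let layout sizes =
  if List.length sizes > 4 then Error "an MBR holds at most four partitions"
  else
    let next, parts =
      List.fold_left
        (fun (next, acc) size ->
          let n = sectors_of_bytes size in
          (next + n, (next, n) :: acc))
        (1, []) (* sector 0 is the MBR header, so data starts at sector 1 *)
        sizes
    in
    Ok (List.rev parts, next)
```

For example, files of 512 and 600 bytes would occupy sector 1 and sectors 2 to 3, giving a 4-sector image.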

@PizieDust PizieDust linked a pull request May 31, 2023 that will close this issue