Equivalent to .long/.word directives in GAS? #207

Closed
ethindp opened this issue Oct 18, 2021 · 10 comments

Comments


ethindp commented Oct 18, 2021

What is the equivalent to directives like .long/.word/.byte/etc. in GAS? E.g.: this code:

    .align 16
_L8010_GDT_table:
    .long 0, 0
    .long 0x0000FFFF, 0x00CF9A00    ; flat code
    .long 0x0000FFFF, 0x008F9200    ; flat data
    .long 0x00000068, 0x00CF8900    ; tss
_L8030_GDT_value:
    .word _L8030_GDT_value - _L8010_GDT_table - 1
    .long 0x8010
    .long 0, 0

I know that there are methods like db and dq_i, but does that apply to a particular label, and how do you "align" them when creating them? I suppose I could just directly write the raw values to memory but that feels kinda like a hack.

Member

wtfsck commented Oct 18, 2021

The gas formatter uses these:

| bits | masm/nasm | gas   |
|------|-----------|-------|
| 8    | db        | .byte |
| 16   | dw        | .word |
| 32   | dd        | .int  |
| 64   | dq        | .quad |

I don't know if .long is 32 or 64 bits.

.dq() and .dd() don't currently support labels. Adding support for it should be easy.

Alignment isn't supported at the moment.

Author

ethindp commented Oct 18, 2021

@wtfsck Thanks! The instructions I'm specifically concerned with are these:

_L8040:
    xorw    %ax, %ax
    movw    %ax, %ds
    lgdtl   0x8030

According to section 7.62 of the GAS manual, .long is the same as .int. How hard would it be to implement alignment? According to the directives documentation section, AS by default pads data with zeros or NOPs. It might be a good idea to implement something like that for x86 specifically, though we might also want to consider just going for the semantics of .balign or .p2align.

Member

wtfsck commented Oct 19, 2021

> @wtfsck Thanks! The instructions I'm specifically concerned with are these:
>
> _L8040:
>     xorw    %ax, %ax
>     movw    %ax, %ds
>     lgdtl   0x8030

Those instructions don't have any dq(label) or align directives so should work today.

> How hard would it be to implement alignment?

Shouldn't be a problem.

> According to the directives documentation section, AS by default pads data with zeros or NOPs. It might be a good idea to implement something like that for x86 specifically, though we might also want to consider just going for the semantics of .balign or .p2align.

Both can be supported; .p2align n seems to be equivalent to .balign (1 << n).
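
The relationship between .balign and .p2align can be sketched in plain Rust, independent of iced-x86. This is only an illustration of the arithmetic: the helper names (balign_padding, p2align_padding, pad_to_alignment) are hypothetical and not part of any library, and the fill byte is left to the caller because gas pads with zeros or NOPs depending on context.

```rust
/// Number of padding bytes needed so that `offset` becomes a multiple of `align`.
/// Assumption: `align` is a nonzero power of two, as with `.balign` on x86.
fn balign_padding(offset: usize, align: usize) -> usize {
    assert!(align.is_power_of_two(), "alignment must be a power of two");
    (align - (offset & (align - 1))) & (align - 1)
}

/// `.p2align n` expressed through the byte-alignment helper: align to 2^n bytes.
fn p2align_padding(offset: usize, n: u32) -> usize {
    balign_padding(offset, 1usize << n)
}

/// Append `fill` bytes to `buf` until its length is a multiple of `align`.
/// 0x00 suits data; 0x90 (single-byte NOP) is one option for code.
fn pad_to_alignment(buf: &mut Vec<u8>, align: usize, fill: u8) {
    let pad = balign_padding(buf.len(), align);
    let new_len = buf.len() + pad;
    buf.resize(new_len, fill);
}
```

With a 7-byte prefix, aligning to 16 adds 9 bytes, and p2align_padding(7, 4) gives the same result as balign_padding(7, 16).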

Author

ethindp commented Oct 21, 2021

@wtfsck How are label-level db/dw/dd/dq data arguments unsupported? Looking at the code I get the impression that declaring a label then using .dd()/.dw()/.dq()/... would work fine:

	#[allow(clippy::missing_inline_in_public_items)]
	pub fn db(&mut self, data: &[u8]) -> Result<(), IcedError> {
		self.decl_data_verify_no_prefixes()?;
		for bytes in data.chunks(CodeAssembler::MAX_DB_COUNT) {
			self.add_instr(Instruction::with_declare_byte(bytes)?)?;
		}

		Ok(())
	}

And the algorithm for decl_data_verify_no_prefixes simply checks if the data declaration has any prefixes, but declaring a label doesn't set any prefixes. Similarly, add_instr() doesn't clear the label until after the instruction(s) have been added, so theoretically this should work (excluding alignment), no?

let a = CodeAssembler::new(16)?;
let mut ap_trampoline = a.create_label()?;
a.cli()?.cld()?.jmp(0x8040)?; // Not quite the same as ljmp in GAS...
a.set_label(ap_trampoline)?;
let mut gdt_table = a.create_label()?;
a.dd(&[0, 0, 0x0000FFFF, 0x00CF9A00, 0x0000FFFF, 0x008F9200, 0x00000068, 0x00CF8900])?;
a.set_label(gdt_table)?;

The only thing that won't work is this part if I'm right:

_L8030_GDT_value:
    .word _L8030_GDT_value - _L8010_GDT_table - 1
    .long 0x8010
    .long 0, 0

And I don't know how the processor would take things if I used dd instead of dw for everything in that part, nor do I know how to simulate that calculation (I suspect it might be a memory calculation). Thoughts?

Member

wtfsck commented Oct 22, 2021

> @wtfsck How are label-level db/dw/dd/dq data arguments unsupported? Looking at the code I get the impression that declaring a label then using .dd()/.dw()/.dq()/... would work fine:

A label before a db/dw/dd/dq is supported. What's not supported at the moment, but should be easy to add, is passing a label as the data itself, e.g. dd(label) or dq(label).

> And the algorithm for decl_data_verify_no_prefixes simply checks if the data declaration has any prefixes, but declaring a label doesn't set any prefixes. Similarly, add_instr() doesn't clear the label until after the instruction(s) have been added, so theoretically this should work (excluding alignment), no?
>
> let a = CodeAssembler::new(16)?;
> let mut ap_trampoline = a.create_label()?;
> a.cli()?.cld()?.jmp(0x8040)?; // Not quite the same as ljmp in GAS...
> a.set_label(ap_trampoline)?;
> let mut gdt_table = a.create_label()?;
> a.dd(&[0, 0, 0x0000FFFF, 0x00CF9A00, 0x0000FFFF, 0x008F9200, 0x00000068, 0x00CF8900])?;
> a.set_label(gdt_table)?;

ljmp sel:offs in gas should be jmp_far(sel, offs).

Also, if the data above is the GDT, I think you want to set the gdt_table label before the dd() function call.

> The only thing that won't work is this part if I'm right:
>
> _L8030_GDT_value:
>     .word _L8030_GDT_value - _L8010_GDT_table - 1
>     .long 0x8010
>     .long 0, 0
>
> And I don't know how the processor would take things if I used dd instead of dw for everything in that part, nor do I know how to simulate that calculation (I suspect it might be a memory calculation). Thoughts?

It depends on the values of the labels. I assume _L8030_GDT_value is the end of the GDT and _L8010_GDT_table is the beginning of it. If you already know the size of it, just replace that expression with the size - 1, eg. 0x20 - 1.
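
As a rough illustration of the "size - 1" suggestion, the sketch below encodes the table from this thread as little-endian bytes and builds the 6-byte operand that lgdt reads: a 16-bit limit followed by a 32-bit base. The helper names (encode_dd, gdt_pointer) are hypothetical and this is plain Rust, not the iced-x86 API. It also shows why the limit has to stay a 16-bit value: emitting it as a 32-bit dd would push the base address two bytes further along.

```rust
/// The table from this thread: four 8-byte descriptors written as eight 32-bit words.
const GDT: [u32; 8] = [
    0, 0,                   // null descriptor
    0x0000FFFF, 0x00CF9A00, // flat code
    0x0000FFFF, 0x008F9200, // flat data
    0x00000068, 0x00CF8900, // tss
];

/// Encode 32-bit values as little-endian bytes, as `.long` / `dd` would.
fn encode_dd(values: &[u32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * 4);
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Build the lgdt operand: 16-bit limit (table size in bytes minus one),
/// then the 32-bit linear base address of the table.
fn gdt_pointer(table_len: usize, base: u32) -> [u8; 6] {
    assert!(table_len >= 1 && table_len - 1 <= u16::MAX as usize, "bad GDT size");
    let limit = (table_len - 1) as u16;
    let mut out = [0u8; 6];
    out[..2].copy_from_slice(&limit.to_le_bytes());
    out[2..].copy_from_slice(&base.to_le_bytes());
    out
}
```

For this table the encoded length is 0x20 bytes, so the limit is 0x1F, matching 0x8030 - 0x8010 - 1 when no padding sits between the two labels.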

Author

ethindp commented Oct 22, 2021

> A label before a db/dw/dd/dq is supported. What's not supported at the moment, but should be easy, is to do eg. dd(label) or dq(label).

Can you elaborate a bit on this? I'm still confused on what you mean.

> ljmp sel:offs in gas should be jmp_far(sel, offs).
> Also, if the data above is the GDT, I think you want to set the gdt_table label before the dd() function call.

Thanks! I didn't know about that function and was wondering how to do that!

> It depends on the values of the labels. I assume _L8030_GDT_value is the end of the GDT and _L8010_GDT_table is the beginning of it. If you already know the size of it, just replace that expression with the size - 1, eg. 0x20 - 1.

The nice thing about this assembly is that the labels make it pretty obvious how large something is. Could I not just do 0x8030-0x8010-1? Or would there be padding and such?

The only thing to tackle now is the alignment problem (I thought of adding in alignment X NOPs, but that doesn't appear to be how things work). So far, my code looks something like this, at least in my head:

Original code:

; this code will be relocated to 0x8000, sets up environment for calling a C function
    .code16
ap_trampoline:
    cli
    cld
    ljmp    $0, $0x8040
    .align 16
_L8010_GDT_table:
    .long 0, 0
    .long 0x0000FFFF, 0x00CF9A00    ; flat code
    .long 0x0000FFFF, 0x008F9200    ; flat data
    .long 0x00000068, 0x00CF8900    ; tss
_L8030_GDT_value:
    .word _L8030_GDT_value - _L8010_GDT_table - 1
    .long 0x8010
    .long 0, 0
    .align 64
_L8040:
    xorw    %ax, %ax
    movw    %ax, %ds
    lgdtl   0x8030
    movl    %cr0, %eax
    orl     $1, %eax
    movl    %eax, %cr0
    ljmp    $8, $0x8060
    .align 32
    .code32
_L8060:
    movw    $16, %ax
    movw    %ax, %ds
    movw    %ax, %ss
    ; get our Local APIC ID
    mov     $1, %eax
    cpuid
    shrl    $24, %ebx
    movl    %ebx, %edi
    ; set up 32k stack, one for each core. It is important that all core must have its own stack
    shll    $15, %ebx
    movl    stack_top, %esp
    subl    %ebx, %esp
    pushl   %edi
    ; spinlock, wait for the BSP to finish
1:  pause
    cmpb    $0, bspdone
    jz      1b
    lock    incb aprunning
    ; jump into C code (should never return)
    ljmp    $8, $ap_startup

Rust code:

let asm = CodeAssembler::new(16)?;
let mut ap_trampoline = asm.create_label()?;
asm
    .set_label(ap_trampoline)?
    .cli()?
    .cld()?
    .jmp_far(0, 0x8040)?;
let mut gdt_table = asm.create_label()?;
asm
    .set_label(gdt_table)?
    .dd([
        0,
        0,
        0x0000FFFF,
        0x00CF9A00,
        0x0000FFFF,
        0x008F9200,
        0x00000068,
        0x00CF8900,
])?;
let mut gdt_value = asm.create_label()?;
asm
    .set_label(gdt_value)?
    .dd([0x8030-0x8010-1, 0x8010, 0, 0])?;
let mut l8040 = asm.create_label()?;
asm
    .set_label(l8040)?
    .xor(ax, ax)?
    .mov(ax, ds)?
    .lgdt(0x8030)?
    .mov(cr0, eax)?
    .or(1, eax)?
    .mov(eax, cr0)?
    .jmp_far(8, 0x8060)?;
let code16bytes = asm.assemble()?;
let asm = CodeAssembler::new(32)?;
asm
    .mov(16, ax)?
    .mov(ax, ds)?
    .mov(ax, ss)?
    .mov(1, eax)?
    .cpuid()?
    .shr(24, ebx)?
    .mov(ebx, edi)?
    .shl(15, ebx)?
    .mov(stack_top_addr, esp)?
    .sub(ebx, esp)?
    .push(edi)?;
let label = asm.anonymous_label()?;
asm
    .set_label(label)?
    .pause()?
    .cmp(0, bsp_done_addr)?
    .jz(label)?
    .lock()?
    .inc(ap_running_addr)?
    .jmp_far(8, ap_startup_rust_func)?;
let code32bytes = asm.assemble()?;

That's at least how I think I might do it (though my code might not be entirely correct).

@ethindp ethindp closed this as completed Oct 22, 2021
@ethindp ethindp reopened this Oct 22, 2021
Author

ethindp commented Oct 22, 2021

I updated my comment because I accidentally submitted my comment before I was done lol

Member

wtfsck commented Oct 23, 2021

> Can you elaborate a bit on this? I'm still confused on what you mean.

Sometimes you want to store the address of a label in some table. Think of a switch statement that checks whether the value is in range, then looks up the target code and jumps to it. In masm you'd use several "dd offset label" lines, but it's currently not possible to do the same thing here, e.g. dd(label). It should be easy to implement if needed.
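
To make the idea concrete, here is a small stand-alone sketch of what a dd(label) feature would have to produce: one little-endian 32-bit address per label. Everything in it (LabelTable, dd_labels, the offsets used) is hypothetical and written against plain Rust, not iced-x86; it only models the bookkeeping, assuming labels resolve to byte offsets from a known base address.

```rust
/// Hypothetical label bookkeeping: a label id is an index into `offsets`,
/// and its value is a byte offset once the label has been placed.
#[derive(Default)]
struct LabelTable {
    offsets: Vec<Option<u32>>,
}

impl LabelTable {
    /// Create a new, not yet placed label and return its id.
    fn create_label(&mut self) -> usize {
        self.offsets.push(None);
        self.offsets.len() - 1
    }

    /// Record the byte offset at which the label was placed.
    fn set_label(&mut self, id: usize, offset: u32) {
        self.offsets[id] = Some(offset);
    }

    /// Emit `base + offset` for each label as a little-endian 32-bit value,
    /// like a run of masm `dd offset label` lines.
    /// Returns None if any label is unknown or has not been placed.
    fn dd_labels(&self, base: u32, ids: &[usize]) -> Option<Vec<u8>> {
        let mut out = Vec::with_capacity(ids.len() * 4);
        for &id in ids {
            let offset = (*self.offsets.get(id)?)?;
            out.extend_from_slice(&base.wrapping_add(offset).to_le_bytes());
        }
        Some(out)
    }
}
```

A real assembler would only know the offsets after encoding, so the table entries would have to be patched in a later pass; the sketch skips that step.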

> That's at least how I think I might do it (though my code might not be entirely correct).

I quickly checked the translated code, and it seems most instructions use the wrong operand order: you worked from the gas assembly, which reverses the order of operands relative to Intel syntax (with only a few exceptions such as enter x,y).

> let code16bytes = asm.assemble()?;
>
> let code32bytes = asm.assemble()?;

The 32-bit code needs to be assembled for the address it will run at; the gas code above assumes 0x8060.
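
One way to picture the layout constraint: the 16-bit and 32-bit blobs are assembled separately, then copied into one image so that each lands at the address its code assumes (0x8000 and 0x8060 in the gas listing). The sketch below is plain Rust with a hypothetical helper, place_at, and placeholder bytes; it does not call iced-x86.

```rust
/// Copy `blob` into `image` so that it starts at address `addr`, where
/// `image[0]` corresponds to address `base`. Any gap is filled with `fill`.
/// Fails if `addr` is below `base` or would overlap bytes already placed.
fn place_at(
    image: &mut Vec<u8>,
    base: u64,
    addr: u64,
    blob: &[u8],
    fill: u8,
) -> Result<(), String> {
    let start = addr.checked_sub(base).ok_or("address below base")? as usize;
    if start < image.len() {
        return Err(format!(
            "address {:#x} overlaps data already placed up to {:#x}",
            addr,
            base + image.len() as u64
        ));
    }
    image.resize(start, fill); // pad the gap up to the target address
    image.extend_from_slice(blob);
    Ok(())
}
```

If the 16-bit part ever grew past offset 0x60, the overlap check would report it instead of silently shifting the 32-bit entry point.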

Author

ethindp commented Oct 23, 2021

@wtfsck Thanks again! Perhaps operand ordering should be documented for newcomers.

@ethindp ethindp closed this as completed Oct 23, 2021
Member

wtfsck commented Oct 24, 2021

> @wtfsck Thanks again! Perhaps operand ordering should be documented for newcomers.

All instructions have doc comments showing the order of the operands (and required CPUID/CPU).
