Printing unicode block elements using syscall #28
Comments
I didn't intentionally make any changes to unicode support since I built off of MARS. I don't remember seeing anything explicitly supporting unicode so there is a good chance unicode just isn't supported.
Have you tried MARS to see if the behavior is present there too? If it isn't, it's probably pretty easy for me to fix unicode support.
This is a fine place to ask this.
I looked into this and the character is not fully put into memory when the program loads, so it would be impossible to print the correct character later on. This is behavior inherited from MARS. I am unsure if it really makes sense to support unicode. I'll check to see what other simulators do.
I haven't had a chance to look further into this since yesterday, but I will continue looking today. The MIPS version of my library uses SPIM and is fully functional with that simulator, so maybe that could be a place to start looking at how they handle unicode? I never really looked at what was being put into memory at program start, which I should have; I was focused on what was being printed out (silly of me). When I get to work today I will see exactly what you mean by the character not being fully put into memory. I had also tried loading the hexadecimal encoding of the character as bytes followed by a terminating null byte. The MIPS/SPIM version was also able to print using this. Maybe that is a way around RARS not fully loading the character? When I tried this in RARS, it printed a French u character followed by two generic boxes (as I mentioned earlier). Thanks for all your help looking into this.
I took a look at the relevant code: the directives that load strings assume every character is 1 byte, so they load only the bottom byte of each character into memory. The write syscall looks correct if the default encoding for Strings is UTF-8. The printString syscall also assumes that every character is 1 byte.
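To make the difference concrete, here is a minimal Java sketch of the two behaviours for the full block character (U+2588). It is not the RARS code itself: the class and method names (DirectiveBytes, lowByteOnly, utf8, hex) are made up for illustration, and lowByteOnly only mimics the "bottom byte of each character" behaviour described above.

```java
import java.nio.charset.StandardCharsets;

final class DirectiveBytes {
    // Mimics the described directive behaviour (an assumption, not RARS code):
    // keep only the low 8 bits of each UTF-16 char.
    static byte[] lowByteOnly(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) (s.charAt(i) & 0xFF);
        }
        return out;
    }

    // What would need to be in memory for a UTF-8 aware print to work.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String block = "\u2588"; // the full block element
        System.out.println(hex(lowByteOnly(block))); // prints "88"
        System.out.println(hex(utf8(block)));        // prints "E2 96 88"
    }
}
```

With only the single byte 0x88 in memory, the other two bytes of the UTF-8 sequence are gone, which matches the observation that the character cannot be recovered at print time.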
Yeah, manually moving the correct bytes into place would get around the problems with the directives. The way strings are stored is handled in: rars/rars/assembler/Assembler.java Lines 1017 to 1094 in 522ebbd
Strings are written by the write syscall in: Lines 261 to 306 in 522ebbd
Strings are written by the printString syscall in: rars/rars/riscv/syscalls/NullString.java Lines 57 to 73 in 522ebbd
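As a rough illustration of the "manually moving the correct bytes into place" workaround, the Java sketch below generates a .byte line containing the UTF-8 encoding of a string plus a terminating zero. The helper name toByteDirective is hypothetical, and this only addresses how the bytes get into memory; whether the simulator then prints them correctly depends on the syscall code referenced above.

```java
import java.nio.charset.StandardCharsets;

final class ByteDirective {
    // Hypothetical helper: turns a string into a .byte directive holding its
    // UTF-8 bytes followed by a null terminator.
    static String toByteDirective(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder(".byte ");
        for (byte b : utf8) {
            sb.append(String.format("0x%02X, ", b & 0xFF));
        }
        sb.append("0"); // null terminator expected by string-printing syscalls
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints ".byte 0xE2, 0x96, 0x88, 0"
        System.out.println(toByteDirective("\u2588"));
    }
}
```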
Ripes and Venus also don't support unicode. Ripes puts UTF-8 encoded bytes into memory, but has trouble outputting them correctly. Venus detects unicode in .string directives and raises an error, and doesn't output correctly if the bytes are loaded manually. The minimum I will accept to close this issue is an error for trying to load unicode in a directive. A pull request adding UTF-8 support would be greatly appreciated.
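The minimum fix (an error when a directive contains unicode) could look roughly like the Java sketch below. The names AsciiCheck, firstNonAscii and requireAscii are hypothetical, and the IllegalArgumentException is only a stand-in: the real assembler would report the problem through its own error mechanism, which is not shown here.

```java
final class AsciiCheck {
    // Returns the index of the first char above 0x7F, or -1 if all are ASCII.
    static int firstNonAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                return i;
            }
        }
        return -1;
    }

    // Hypothetical check to run on a string literal before storing it.
    static void requireAscii(String literal) {
        int idx = firstNonAscii(literal);
        if (idx >= 0) {
            throw new IllegalArgumentException(String.format(
                "non-ASCII character U+%04X at index %d in string directive",
                (int) literal.charAt(idx), idx));
        }
    }

    public static void main(String[] args) {
        requireAscii("hello"); // passes silently
        try {
            requireAscii("a\u2588");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```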
I am porting a graphics library from MIPS to RISC-V and I need to print the unicode full block element using syscalls. Does RARS not support printing unicode characters? I load the character into memory using .asciz "█" and perform a printString ecall; in debugging I get the generic-looking outline of a box that appears for any non-printable ASCII character. Do you have any ideas on how to fix this? I apologize if this is the wrong place to ask this.