Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unicode support for string directives #1

Merged
merged 2 commits into from
May 30, 2019
Merged

Unicode support for string directives #1

merged 2 commits into from
May 30, 2019

Conversation

TaylorZowtuk
Copy link
Owner

Noticed that the original pull request which added unicode support for the printstring syscall appears to have broken the open syscall. When stepping through the example File I/O code, which is provided in the syscalls tab of the f1 help in rars, the open file syscall returns -1 in a0 indicating that the file was not opened successfully. Because this problem appears to have started with the modifications in the original pull request they remain in this update. @rmnattas and I narrowed the issue down to the line 447 in the following:

public static int openFile(String filename, int flags) {
// Internally, a "file descriptor" is an index into a table
// of the filename, flag, and the File???putStream associated with
// that file descriptor.
int retValue = -1;
FileInputStream inputStream;
FileOutputStream outputStream;
int fdToUse;
// Check internal plausibility of opening this file
fdToUse = FileIOData.nowOpening(filename, flags);
retValue = fdToUse; // return value is the fd
if (fdToUse < 0) {
return -1;
} // fileErrorString would have been set
if (flags == O_RDONLY) // Open for reading only
{
try {
// Set up input stream from disk file
inputStream = new FileInputStream(filename);
FileIOData.setStreamInUse(fdToUse, inputStream); // Save stream for later use
} catch (FileNotFoundException e) {
fileErrorString = "File " + filename + " not found, open for input.";
retValue = -1;
}
} else if ((flags & O_WRONLY) != 0) // Open for writing only
{
// Set up output stream to disk file
try {
outputStream = new FileOutputStream(filename, ((flags & O_APPEND) != 0));
FileIOData.setStreamInUse(fdToUse, outputStream); // Save stream for later use
} catch (FileNotFoundException e) {
fileErrorString = "File " + filename + " not found, open for output.";
retValue = -1;
}
}
return retValue; // return the "file descriptor"
}

From there we noticed that if we replace filename with "testout.txt" the file is created/opened successfully and will then write to the file successfully. We see that openfile is called in:

public void simulate(ProgramStatement statement) throws ExitingException {
int retValue = SystemIO.openFile(NullString.get(statement),
RegisterFile.getValue("a1"));
RegisterFile.updateRegister("a0", retValue); // set returned fd value in register
}

which uses the NullString.get() method that was modified in the original pull request. We don't think its an issue with the utf-8 encoding because the filename bytes are loaded correctly in memory and FileOutputStream() is able to create a file with a utf-8 encoded name in a java simulator. As of right now we are still looking into this issue but wanted to update with these new changes. Any advice into what this problem may be would be appreciated.

@TaylorZowtuk
Copy link
Owner Author

The following program demonstrates the unicode being loaded correctly in directives, the unicode escape being implemented, and printing correctly:

.data
test:
.asciz
#example that raises an error and doesnt assemble
#"\u258 "

#print (nothing)
#""

#print: Test Test ©© The quick brown fox jumps █	█ over the lazy dog ★ Java ☕
"Test \u0054\u0065\u0073\u0074 ©\u00A9 The quick brown fox jumps █\t\u2588 over the lazy dog \u2605 Java \u2615"

.text
main:
	li	t0, -1
	li	t0, 0xFFF
	andi t0, t0, -0x800
	li a7, 4
	la a0, test
	ecall
	
	li a7, 10
	ecall

@TheThirdOne
Copy link

I took a look at why open broke.

You are leaving a null byte at the end of every string.

byte[] utf8Bytes = new byte[size];
for (int i=0; i < (size - 1); i++){ //size - 1 so we dont include the null terminator in the utf8Bytes array

Just to inform you guys on how to find bugs like this, my process was:

  1. Find a test case where it appears (this was just writing/finding a program using open)
  2. Use a debugger at the entry to code you know runs (SyscallOpen.simulate)
  3. Step through until something unexpected happens
  4. That lead me to SystemIO:447 just as you found.
  5. I printed out the stack trace below
  6. I looked at FileOutputStream.java:206 which showed File.isInvalid() must be true
  7. File.isInvalid() is true mainly if this.path.indexOf(0) < 0
  8. So I checked out the filename you passed and it had a null on the end.
  9. I looked at your NullString change and found out why.
java.io.FileNotFoundException: Invalid file path
	at java.io.FileOutputStream.<init>(FileOutputStream.java:206)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
	at rars.util.SystemIO.openFile(SystemIO.java:447)
	at rars.riscv.syscalls.SyscallOpen.simulate(SyscallOpen.java:46)
	at rars.riscv.InstructionSet.findAndSimulateSyscall(InstructionSet.java:269)
	at rars.riscv.instructions.ECALL.simulate(ECALL.java:44)
	at rars.simulator.Simulator$SimThread.run(Simulator.java:481)
	at java.lang.Thread.run(Thread.java:748)

So, a debugger and a source reference for the Java API were useful tools. But you found the right line, so my tip would be to print stack traces if a line in question is throwing.

@TheThirdOne
Copy link

Also your code for handling \u escapes needs a little better error handling.

Assembling the following throws an exception on the command line and fails to log an error to the console. Your substring call just doesn't check to make sure there are enough characters.

.data
.string "\u"
Exception in thread "AWT-EventQueue-0" java.lang.StringIndexOutOfBoundsException: String index out of range: 7
	at java.lang.String.substring(String.java:1963)
	at rars.assembler.Assembler.storeStrings(Assembler.java:1069)
	at rars.assembler.Assembler.executeDirective(Assembler.java:656)
	at rars.assembler.Assembler.parseLine(Assembler.java:372)
	at rars.assembler.Assembler.assemble(Assembler.java:153)
	at rars.RISCVprogram.assemble(RISCVprogram.java:325)
	at rars.venus.run.RunAssembleAction.actionPerformed(RunAssembleAction.java:113)
	at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2022)
        ...

@TaylorZowtuk
Copy link
Owner Author

TaylorZowtuk commented May 31, 2019

@TheThirdOne
Ah what a silly mistake! We caught that it was including the null byte in strings and we thought we had fixed it with the size - 1 change in this merge because it fixed some printing behavior. I see the problem was that I was still initializing the size of the byte array with the arrayList size which led to an extra index that would not be set in the for loop of line 75 and thus it is initialized to zero by default. The fix I will implement is to initialize the size variable to be size - 1 in line 73 and remove the size - 1 condition from the for loop in line 75.

int size = utf8BytesList.size();
byte[] utf8Bytes = new byte[size];
for (int i=0; i < (size - 1); i++){ //size - 1 so we dont include the null terminator in the utf8Bytes array
utf8Bytes[i] = utf8BytesList.get(i);
}

Just to inform you guys on how to find bugs like this, my process was:

This was incredibly helpful thank you for this. Would you mind recommending me a debugger which you find works well with RARS?

Also your code for handling \u escapes needs a little better error handling.

Sorry about this. This was definitely a sloppy implementation. Like I mentioned previously I'm quite unfamiliar with java and thus the error handling but I will work on this.

Thank you for your detailed feedback and help it is very much appreciated.

@TheThirdOne
Copy link

This was incredibly helpful thank you for this. Would you mind recommending me a debugger which you find works well with RARS?

I'm glad it was helpful. I have been using Intellij as an IDE when working on RARS, so I have used its debugging features, but any other IDE (Eclipse, Netbeans, ...) should have similar features. If you don't want to use an IDE for it, jdb seems like a good solution but it is command line only. JSwat might work but is discontinued and apparently does not work for Java 8 which is required for running RARS.

@rmnattas
Copy link

rmnattas commented Jun 3, 2019

@TheThirdOne We got the IntelliJ IDE but we are having trouble in the Build/Run/Debug configurations. What should it be? I used a JAR application configuration giving it the build-jar.sh for build and the jar file to run but debugging and breakpoints doesn't work with that.

@TheThirdOne
Copy link

This is my configuration:

example-configuration

In order for this to work, you need the entire repo to be a "source" module.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants