step22 - fails for Gatemate flash programming or flash reading #2

Open
fm4dd opened this issue Aug 27, 2023 · 14 comments

@fm4dd
Owner

fm4dd commented Aug 27, 2023

In step22, the application expects the app image data to be uploaded into the flash memory at a 1M offset. This step currently fails due to issues with SPI flash access. The issue is that no output is visible on the serial line. This can have several possible root causes:

(A) Selected wrong flash read mode for spi_flash.v. 4 modes are defined:

  1. SPI_FLASH_READ
  2. SPI_FLASH_FAST_READ
  3. SPI_FLASH_FAST_READ_DUAL_OUTPUT
  4. SPI_FLASH_FAST_READ_DUAL_IO

So far, I unsuccessfully tried 1. and 4.

(B) Using the wrong openFPGALoader flash offset value

openFPGALoader only understands an offset provided in bytes. I am using openFPGALoader -b gatemate_evb_spi -o 1048576 data/scene1.dat, assuming that the 1M offset means 1024 × 1024 = 1048576 bytes (0x100000).

(C) It is the first time that I am using the +uCIO flag for the Gatemate place and route tool 'p_r', trying to directly access the flash from the FPGA.

(D) A pin swap is another common cause (e.g. MOSI->MISO), hopefully it's that simple.
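The offset arithmetic from (B) can be checked with a short Python sketch. It assumes that "1M" means one binary megabyte (1024 × 1024 bytes); the helper name `flash_offset_bytes` is made up for illustration, while the board name, the `-o` flag and the data file path are the ones used in this issue.

```python
# Sketch: convert an offset given in MiB to the byte count passed to -o.
MIB = 1024 * 1024  # binary megabyte, the assumption made in (B)


def flash_offset_bytes(mebibytes: int) -> int:
    """Return the byte offset for a whole number of MiB (hypothetical helper)."""
    return mebibytes * MIB


offset = flash_offset_bytes(1)
print(offset, hex(offset))  # 1048576 0x100000

# Command line as used in this issue, assembled from the computed offset.
command = ["openFPGALoader", "-b", "gatemate_evb_spi",
           "-o", str(offset), "data/scene1.dat"]
print(" ".join(command))
```

If the design instead expected a decimal megabyte, the offset would be 1000000 (0xF4240), so the value should match whatever address the SoC decodes for the flash data.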

The make output example below shows the successful flash execution, but the program output does not show :-(

fm@nuc7fpga:~/fpga/projects/git/gatemate-riscv/step22$ make prog
Programming scene data at 1M offset:
/home/fm/cc-toolchain-linux/bin/openFPGALoader/openFPGALoader -b gatemate_evb_spi -o 1048576 data/scene1.dat
Jtag frequency : requested 6.00MHz   -> real 6.00MHz  
Detail: 
Jedec ID          : c2
memory type       : 28
memory capacity   : 17
EDID + CFD length : c2
EDID              : 1728
CFD               : 
00
Detail: 
Jedec ID          : c2
memory type       : 28
memory capacity   : 17
EDID + CFD length : c2
EDID              : 1728
CFD               : 
flash chip unknown: use basic protection detection
Erasing: [==================================================] 100.00%
Done
Writing: [==================================================] 100.00%
Done
Wait for CFG_DONE DONE
Programming E1 SPI Config:
/home/fm/cc-toolchain-linux/bin/openFPGALoader/openFPGALoader -b gatemate_evb_spi SOC_00.cfg
Jtag frequency : requested 6.00MHz   -> real 6.00MHz  
Detail: 
Jedec ID          : c2
memory type       : 28
memory capacity   : 17
EDID + CFD length : c2
EDID              : 1728
CFD               : 
00
Detail: 
Jedec ID          : c2
memory type       : 28
memory capacity   : 17
EDID + CFD length : c2
EDID              : 1728
CFD               : 
flash chip unknown: use basic protection detection
Erasing: [==================================================] 100.00%
Done
Writing: [==================================================] 100.00%
Done
Wait for CFG_DONE DONE
@fm4dd changed the title from "step22 - fails for Gatemate flash programming or access issues" to "step22 - fails for Gatemate flash programming or flash reading" on Aug 27, 2023
@g3grau

g3grau commented Oct 3, 2023

Hi,
thanks a lot for reworking BrunoLevy's tutorial on a CCA1 EVB! I had some headache using the wrong libgcc in step 21, but finally arrived at this step :-)
Using openFPGALoader -b gatemate_evb_jtag --dump-flash --offset 0 --file-size 8388608 test.bit and by comparison using hexedit I would say that programming of the Flash with both bitstream and data file works fine.
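The manual hexedit comparison could also be scripted. The following is only a sketch: `region_matches` and `check_files` are hypothetical helper names, and the demo uses synthetic bytes instead of a real flash dump.

```python
from pathlib import Path


def region_matches(dump: bytes, payload: bytes, offset: int) -> bool:
    """True if payload appears in dump starting at the given byte offset."""
    return dump[offset:offset + len(payload)] == payload


def check_files(dump_path: str, payload_path: str, offset: int) -> bool:
    """Compare a dumped flash image against a data file at a byte offset."""
    dump = Path(dump_path).read_bytes()
    payload = Path(payload_path).read_bytes()
    return region_matches(dump, payload, offset)


# Self-contained demo: synthetic data standing in for a real dump.
demo_payload = bytes([0x11, 0x22, 0x33, 0x44])
demo_dump = bytes([0xFF]) * 16 + demo_payload + bytes([0xFF]) * 4
print(region_matches(demo_dump, demo_payload, 16))  # True
print(region_matches(demo_dump, demo_payload, 0))   # False
```

With the files from this thread, the call would be `check_files("test.bit", "data/scene1.dat", 0x100000)`, assuming the dump starts at flash offset 0 as in the command above.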

Debugging it with gtkwave revealed that MISO is Z all the time. It took a while to see it, but in my case the CCF file declared both MOSI and MISO as output?! Changing MISO to Pin_in certainly helps, but still doesn't solve it.
In simulation MISO remained Z ... also after adding a Flash device model and a RESET pulse to the testbench. If I hardwire MISO to 1 or 0, the SPI reads the corresponding values back. SCK, MOSI, CS# look ok, the command sent (0x03) also looks valid to me. SCK starts from 0 for mode 0. What's missing to get the MISO signal?
The FPGA initialization probably uses QSPI mode, but we should be free to change the IO mode after that?
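For reference when reading MOSI on a scope or in gtkwave, here is a small Python sketch of the byte sequence for a plain READ. It assumes the usual framing for this command (opcode 0x03 followed by a 24-bit address, most significant byte first), which applies to flash parts with 3-byte addressing; `read_command` is a hypothetical helper name.

```python
def read_command(address: int) -> bytes:
    """Build the 4-byte READ frame: opcode 0x03, then a 24-bit address, MSB first."""
    if not 0 <= address < (1 << 24):
        raise ValueError("address must fit in 24 bits")
    return bytes([0x03]) + address.to_bytes(3, "big")


frame = read_command(0x100000)  # the 1M offset used for the scene data
print(frame.hex(" "))  # 03 10 00 00
```

After these 32 clock cycles the flash is expected to drive MISO with data for as long as CS# stays low, so MISO at Z after the frame points at the model or the pin direction rather than at the command.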

@fm4dd
Owner Author

fm4dd commented Oct 4, 2023

Hi g3grau,

Thank you very much for sharing your experience and attempts to solve the flash reading. A second person trying is very helpful! Sadly I have not made any progress myself yet. I need to give it another try on one of the coming weekends. Solving step22 is key to progress further.

Next I want to try connecting a logic analyzer to the J3 pin header. Per the schematic, J3 connects directly to the flash IC's SPI lines. Recording the SPI protocol and decoding the state would be great.

There is a discussion post for the ARTY board on BrunoLevy/learn-fpga#108 that also struggles with the Flash in the same step.

@g3grau

g3grau commented Oct 4, 2023

I just connected a scope and got surprised.
During programming everything looks as expected. When the design is running, SCK keeps toggling at 10MHz which somehow fits because CS# is permanently low (as opposed to the simulation) and MOSI keeps repeating the same pattern (which doesn't match any expected pattern from the design) with a period of 45us (should be my test loop in main()).
MISO is always low (simulation is always Z). Since CS# doesn't react, the memory can't do anything.

For comparison I just loaded the design from step21 .. there CS#, MOSI and MISO are 1 as expected.
So we do have (some) control over these pins, but the FPGA doesn't seem to do what the design/simulation tells :-S

More spooky ... I added some reg in the main design for non-blocking assignment of the SPI signals to them, and then assign that to the 3 free LED (I think I do not yet understand this bidirectional IO). It doesn't change anything in the simulation, but suddenly the scope shows the expected signals on CS# and MISO! The program still doesn't run 100%, my printf statements are nuts, but maybe I made something wrong with memory size and stack pointer (I tried to increase RAM 4x). Apart from the direction fixing in gatemate-e1.ccf for SPIFLASH_MISO, this should be my relevant change.
I have no clue why the LED stuff affects the function of the SPI...

```verilog
module SOC (
    input  CLK,               // system clock
    input  RESET,             // reset button
    output [7:0] LEDS,        // system LEDs
    input  RXD,               // UART receive
    output TXD,               // UART transmit
    output SPIFLASH_CLK,      // SPI flash clock
    output SPIFLASH_CS_N,     // SPI flash chip select (active low)
    // inout SPIFLASH_MOSI,   // SPI flash IO pins
    // inout SPIFLASH_MISO    // SPI flash IO pins
    output SPIFLASH_MOSI,     // SPI flash IO pins
    input  SPIFLASH_MISO      // SPI flash IO pins
);

MappedSPIFlash SPIFlash(
    .clk(clk),
    .word_address(mem_wordaddr[19:0]),
    .rdata(SPIFlash_rdata),
    .rstrb(isSPIFlash & mem_rstrb),
    .rbusy(SPIFlash_rbusy),
    .CLK(SPIFLASH_CLK),
    .CS_N(SPIFLASH_CS_N),
    .MOSI(SPIFLASH_MOSI),
    .MISO(SPIFLASH_MISO)
);
// ...
reg dbg[2:0];
always @(posedge clk) begin
    dbg[2] <= SPIFLASH_CS_N;
    dbg[1] <= SPIFLASH_MOSI;
    dbg[0] <= SPIFLASH_MISO;
end
assign LEDS[7] = dbg[2];
assign LEDS[6] = dbg[1];
assign LEDS[5] = dbg[0];
```

@g3grau

g3grau commented Oct 4, 2023

After fixing an error in my memory sizing I could read the first 4 bytes .. but the memory address does not increase.
Multiple reads return the (endian-scrambled) sequence 07 7F 80 05. The hexdump at 0x100000 reads 7f07 0580.
Getting closer :) But I wonder why the simulated Flash doesn't work (and MISO is still Z all the time). Must be something fundamental with bidirectional signals?
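The two byte orders above can be reconciled with a short Python sketch. It assumes the hexdump was taken with the default `hexdump` format, which prints 16-bit units in host byte order, on a little-endian machine; under that assumption the sequence read back is already in flash byte order.

```python
import struct

# Byte sequence returned by the repeated reads, in the order reported above.
flash_bytes = bytes([0x07, 0x7F, 0x80, 0x05])

# Default hexdump style: two-byte units, little-endian host.
units = struct.unpack("<2H", flash_bytes)
print(" ".join(f"{u:04x}" for u in units))  # 7f07 0580

# The same four bytes interpreted as one little-endian 32-bit word.
(word,) = struct.unpack("<I", flash_bytes)
print(f"{word:08x}")  # 05807f07
```

A byte-wise dump (for example `hexdump -C`) avoids the swapped appearance when comparing against data read over SPI.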

@g3grau

g3grau commented Oct 6, 2023

Well..?! I really don't know why it works now and how exactly it fails in the other cases (not only the Flash access, also the code execution seems broken in that case), but now I get the full demo at maybe 3Fps :-)
I put my changes on a gist.
In addition I use the -nomx8 flag for Yosys as this was used in the EVB defaults.

@fm4dd
Owner Author

fm4dd commented Oct 9, 2023

👏 Congratulations! This is awesome. Thank you very much for the hard work, and documenting progress inside the comments! I am still processing changes from the files in your gist, so I can update the step22 files. My first quick attempt failed, so I better double-check. I may need to ask a question if I get stuck...

What is the pin assignment you had in your .ccf file? I fixed mine to:

Pin_out "SPIFLASH_CLK"   Loc = "IO_WA_B8";
Pin_out "SPIFLASH_CS_N"  Loc = "IO_WA_A8";
Pin_out "SPIFLASH_MOSI"  Loc = "IO_WA_B7";
Pin_in  "SPIFLASH_MISO"  Loc = "IO_WA_A7";

Meanwhile I was getting a handle on basic SPI protocol analysis for the E1 board's flash operation. I took down my notes for it in a gist at Gatemate E1 Board - SPI Flash Programming and Boot Operation. In a nutshell, openFPGALoader programs the flash at 6.25MHz by using repeated PP "page program" commands to write the bitstream data in 256-byte chunks. For the E1 FPGA boot from flash, a single FASTREAD operation is done at 33MHz after checking the flash access with a couple of RDID commands.
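The 256-byte page programming described above can be illustrated with a Python sketch that only models how a data buffer is split into per-page writes. It is not openFPGALoader's actual code; `page_chunks` is a hypothetical helper, and the start address is assumed to be page-aligned.

```python
PAGE_SIZE = 256  # bytes written per page program command


def page_chunks(data: bytes, start: int = 0):
    """Yield (address, chunk) pairs, one per page program command."""
    for i in range(0, len(data), PAGE_SIZE):
        yield start + i, data[i:i + PAGE_SIZE]


# 600 bytes written at the 1M offset need three page program commands.
chunks = list(page_chunks(bytes(600), start=0x100000))
print(len(chunks))                    # 3
print([hex(a) for a, _ in chunks])    # ['0x100000', '0x100100', '0x100200']
print(len(chunks[-1][1]))             # 88
```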

Getting over step22 is great!

@g3grau

g3grau commented Oct 9, 2023

It would be awesome if I understood the issue :-D It's a workaround but not a fix yet.
Cool debugging with the logic analyzer! Maybe that can help to understand what happens if you just comment out the assignment to the 3 LEDs (I just have a scope at hand). In my case this still runs the demo, but apparently loads data from a different address, starting it somewhere in the middle.

The design can run at 30MHz (I switched the power mode to speed), but more seems to be not possible due to the SPI clock limit (fast should allow 80MHz after configuring it with an initially slower clock) and a timing issue in the design (PR reports 28MHz max). I didn't check if my RAM increase may be the bottleneck.

Yes, the ccf looks good. We have to remember that dual/quad would require inout...

@fm4dd
Owner Author

fm4dd commented Oct 14, 2023

Awesome! Yes, it works for me too! Your changes made all the difference. This is great. Big thanks for solving it! Your debug code that routes the SPI signals to the upper LEDs was neat. I could see if the SPI starts, and if there is MISO traffic coming back from the flash.

The major fix was to set the right signal directions.

module SOC (
    input  CLK,               // system clock 
    input  RESET,             // reset button
    output [7:0] LEDS,        // system LEDs
    input  RXD,               // UART receive
    output TXD,               // UART transmit
    output SPIFLASH_CLK,      // SPI flash clock
    output SPIFLASH_CS_N,     // SPI flash chip select (active low)
    output SPIFLASH_MOSI,     // SPI flash MOSI output
    input  SPIFLASH_MISO      // SPI flash MISO input
);

I just updated the repo with 3968362, and now the final two steps 23/24 are waiting for another day.

@g3grau

g3grau commented Oct 15, 2023

Great! Just to be sure: your demo starts with a yellow "Oxygene" string? I had variants which seem to start somewhere else.

I tried your variant (with reg in the interface), and that also worked for me. It is still kind of unstable, e.g. it only works if I comment out my debug block. Changing the clock back from 30 to 10MHz also breaks it again. As an analog designer I know this kind of side effect, but here I didn't expect it :-D The SOC_pr.log tells me that it should run up to 21 MHz. The opposite is true... no output at CPU_FREQ 10, but it works fine at 30 :-S

I'll double check again with your repo update, maybe I just changed too much ;) I just noticed that you started step23 and ran into an issue "program doesn't run" .. which may also be this kind of timing issue I have with step22? Did you try changing the clock?

@Elektronikus

Hello, I'm struggling also.
From another project there was a problem in the gatemate cc-toolchain-linux which requires defining IO pins as in or out. I think this problem will be solved very soon by a toolchain update.

@Elektronikus

@g3grau: what was the solution to the wrong libgcc? I'm trying to use the riscv64-unknown-linux-gnu-gcc and I'm struggling with the compiler flags.

@g3grau

g3grau commented Sep 30, 2024

Hi,
I didn't remember anything about the libgcc, but the makefile kept a memory!
That's my src-sieve/Makefile with some comments on different attempts :) Hope that helps to find the right one

### Local Makefile for building the RISC-V native application.  ###
### Requires riscv-toolchain, linker script and firmware_words  ###
### hex conversion program. Creates the 'firmware.hex' output   ###
### for FPGA upload together with the bitstream.                ###
### ----------------------------------------------------------- ###

TOOLCHAINDIR = /usr/cadtools/riscv-gnu-toolchain
RVLINKSCRIPT=/home/ggrau/projects/FPGAsrc/GateMate/CologneChip/gatemate-riscv/ldscripts-shared/bram.ld
FW_WORDS_DIR=$(TOOLCHAINDIR)/firmware_words
# RV32I_LIBGCC=$(TOOLCHAINDIR)/lib/gcc/riscv64-unknown-elf/8.3.0/rv32i/ilp32/libgcc.a
# RV32I_LIBGCC=$(TOOLCHAINDIR)/build-gcc-linux-stage2/gcc/lib32/ilp32/libgcc.a        !! DON'T USE LINUX !!
# RV32I_LIBGCC=$(TOOLCHAINDIR)/build-gcc-newlib-stage2/gcc/libgcc.a         # undefined reference to __mulsi3 ?!
RV32I_LIBGCC=$(TOOLCHAINDIR)/build/riscv64_multilib/lib64/gcc/riscv64-unknown-elf/11.1.0/rv32i/ilp32/libgcc.a

ASFLAGS= -march=rv32i -mabi=ilp32 -mno-relax
LDFLAGS= -m elf32lriscv -nostdlib --no-relax -T $(RVLINKSCRIPT)
CFLAGS= -march=rv32i -mabi=ilp32 -fno-pic -fno-stack-protector -w -Wl,--no-relax

all: firmware.hex

# Step-3: Convert the RISCV elf binary into an readmem() formatted hex file
# -------------------------------------------------------------------------
RAMSIZE=6144

firmware.hex: sieve.bram.elf
        $(FW_WORDS_DIR)/firmware_words $< -ram $(RAMSIZE) -max_addr $(RAMSIZE) -out $@

# Step-2: Link the object files into a RISCV elf binary
# -----------------------------------------------------
sieve.bram.elf: %.o
        $(TOOLCHAINDIR)/bin/riscv64-unknown-elf-ld start.o wait.o putchar.o print.o sieve.o $(RV32I_LIBGCC) $(LDFLAGS) -o $@

# Step-1: build object files (.o) from assembler source files (.S) 
# ----------------------------------------------------------------
%o:
        $(TOOLCHAINDIR)/bin/riscv64-unknown-elf-as $(ASFLAGS) start.S -o start.o
        $(TOOLCHAINDIR)/bin/riscv64-unknown-elf-as $(ASFLAGS) wait.S -o wait.o
        $(TOOLCHAINDIR)/bin/riscv64-unknown-elf-as $(ASFLAGS) putchar.S -o putchar.o
        $(TOOLCHAINDIR)/bin/riscv64-unknown-elf-gcc $(CFLAGS) -c print.c -o print.o
        $(TOOLCHAINDIR)/bin/riscv64-unknown-elf-gcc $(CFLAGS) -c sieve.c -o sieve.o

clean:
        rm -f *.o *.hex *.elf

.SECONDARY:
.PHONY: all clean

@g3grau

g3grau commented Sep 30, 2024

> Hello, I'm struggling also. From another project there was a problem in the gatemate cc-toolchain-linux which requires defining IO pins as in or out. I think this problem will be solved very soon by a toolchain update.

That's an interesting note. I tried to get a data logger working with a lot of headache this spring, but at some point I gave up on the HyperRAM complexity and my lack of deeper knowledge of how to use DDR interfaces, multiphase clocks or timing constraints in this environment. I got it working (with inout) using a Verilog interface from Kevin M. Hubbard (which is much slower than the theoretically possible throughput, but at least it works). I think a "full speed" reference implementation of this HyperRAM interface would provide great insight into how to use these tools on an advanced level.

@Elektronikus

Hello g3grau,
the new toolchain is out and claims to fix the problem. I will start checking next week. I didn't have time to work on the libgcc topic.

With best regards

3 participants