Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

please describe how to get started #1

Open
maartenbrock opened this issue Feb 1, 2019 · 17 comments
Open

please describe how to get started #1

maartenbrock opened this issue Feb 1, 2019 · 17 comments

Comments

@maartenbrock
Copy link

Hello,

This looks like a very interesting project.

But, since I have never used MYHDL could you please describe how to get started with for example the Vivado project? I managed to install MYHDL in Python, but what then? Am I supposed to run one (or all?) of the python scripts before I can open the project in Vivado? It seems so, but it does a lot more. And after that I still don't have design_1.bd which the project references.

Also, I don't understand Verilog well, but am used using VHDL. Would that be possible?

Then you state HDL-deflate can be configured. But how should that be done? At conversion time or at implementation time?

Kind regards,
Maarten

@tomtor
Copy link
Owner

tomtor commented Feb 2, 2019

Hi Maarten,

The first step is to execute:

python3 deflate.py

This will generate deflate.v which you should import in e.g. Vivado.

You can also just run make which will do the same and also execute the testbench with Icarus.
Note that you should have MyHDL installed by executing pip install --user myhdl (which you did).

If you use Windows I advise installing WSL and Ubuntu:
https://docs.microsoft.com/en-us/windows/wsl/install-win10

If you want to change parameters just edit deflate.py and rerun python3 deflate.py and Vivado will pick up the changed deflate.v.

You can mix VHDL and Verilog in Vivado projects, that's no issue.
MyHDL can also generate VHDL, just change the last line in deflate.py and add hdl='VHDL'.

I have not tested the generated vhd file but added a deflate.v and deflate.vhd to the github repo, so one can get started without using any Python if the default configuration is OK.

design_1.bd is probably the Vivado board description, which should be generated if you select a board in Vivado. You need a specific .xdc file for your setup anyway, if you do not have an Arty-100, so creating a fresh Vivado project is also a good first step in most cases.

@maartenbrock
Copy link
Author

Hello Tom,

Thanks.

I hadn't installed myhdl with --user, but that shouldn't be a problem, I guess. Instead I ran pip install as Administrator. This also shows I'm on Windows, but I have access to native Ubuntu machines and VirtualBox Ubuntu VM's as well. Indeed running make would be easier on Linux.

Thanks for describing where to do the configuration. Once I opened the file I had seen that as well, but it would be nice if this was described in the readme. I would not have guessed how to generate the VHDL though.

Your HDL-Deflate.xpr already has a board selected. But it normally does not auto-generate a design AFAIK. You must have clicked 'Create Block Design' in the project at some point. Are you saying there is nothing of interest in your design_1.bd? If so, then where is deflate.v inserted? I only see test_deflate_bench.v being used. Even test_deflate_bench.v doesn't seem to reference deflate.v. I'm wondering if I should just forget about the HDL-Deflate.xpr Vivado project and the test_deflate_bench.v test.

Now for the deflate module. I see it has some inputs and outputs, but it's not quite clear to me what they mean. The clk and reset are obvious. I assume i_data is where the bytes come in, but when is it valid and how fast can I feed it? And o_byte is where the compressed data comes out, but again when is it valid? There are two progress indicators, but what's the difference or their meaning even? What is the meaning of o_done? And I have no clue what to feed into i_waddr and i_raddr either.

I somehow had expected some streaming input and output, like e.g. an AXI4-Stream.

Would you care to explain some more? Or is there a standard I should have read that uses similar definitions?

Kind regards,
Maarten

@tomtor
Copy link
Owner

tomtor commented Feb 2, 2019

@vanmierlo Yes, I think it's best to just forget about the HDL-Deflate.xpr Vivado project in the repository.

In a fresh Vivado project you can just add the deflate.v and test_deflate_bench.v as source files and an .xdc file if you want to see the LEDs blinking. That should synthesize in Vivado.

The best documentation is really in test_deflate.py. At the end is a section

@block
def test_deflate_bench(i_clk, o_led, led0_g, led1_b, led2_r):

which is the source for test_deflate_bench.v.

Near the top of test_deflate.py is a Python test bench which tests the streaming interfaces.

The deflate module CAN stream and could be wrapped in a (VIvado) AXI-stream interface, but that is left as an exercise. It shouldn't be very hard, but I didn't want to invest time in a Xilinx or Vivado specific solution.

I'll explain the ports:

i_mode can be set WRITE (see the top of deflate.py or end of this comment for the numeric values) and it will load i_data at address i_waddr into the (de)compress input buffer. This buffer can be circular and you can use streaming.

i_mode can be set to STARTD or STARTC for a single cycle to start the (de)compression. After that you should set i_mode to either IDLE (just wait for completion signalled by o_done) or optionally to READ (EDIT: contents of output buffer at address i_raddr will always be placed in o_byte regardless of i_mode setting) or WRITE (write data to (streaming) input buffer).

When the stream is completely (de)compressed o_done will be set to 1.

o_iprogress indicates the current position of the deflate input reader. You need that if you stream the input in a circular input buffer like the tests do. You should only write at addresses < o_iprogress to prevent overwriting unread data.

o_oprogress points always 1 byte past the last available output byte in the buffer. So you should not read from addresses >= o_oprogress. At the end of the (de)compression o_oprogress contains the size of the final result.

i_mode values:

IDLE = 0, WRITE = 1, READ = 2, STARTC= 3, STARTD = 4

@maartenbrock
Copy link
Author

Thanks again,

So we cannot write and read at the same time, right? That's a pity.

Does o_iprogress start at 1? Or should I only write at addresses <= o_iprogress?
The same for o_oprogress, can I really read at address o_oprogress?
And will these progress indicators have only 9 bits, like i_raddr and i_waddr? Or am I supposed to use a modulo?

I wouldn't necessarily classify AXI-stream as a Xilinx specific solution, though I really don't know if anyone else natively supports it. It's defined by ARM and Xilinx has decided to adopt it. And I'm sure there are other streaming protocols out there like e.g. a Wishbone point-to-point interconnection. But I know of none as efficient as AXI-Stream.

I'll start by examining test_deflate.py and see what I can make of it.

@maartenbrock
Copy link
Author

I have another question: make requires python3.6 and I happen to have 3.5 installed. Are you (or myhdl) using anything 3.6 specific? It didn't seem so after I modified it. So why not just use PYHTON=python3?

@tomtor
Copy link
Owner

tomtor commented Feb 2, 2019

@vanmierlo The Python testbench uses Python3.6 specific deflate options when testing with DYNAMIC=False. When you use DYNAMIC=True (default) then even Python2 will be fine.

Regarding your other questions: o_iprogress starts at 0. Edit: So you can overwrite the first input byte (with i_waddr = 0) when o_iprogress >= 1 while decompressing. When compressing o_iprogress >= 1 + CWINDOW.

The last valid output byte always has address o_oprogress - 1 and when o_done is True o_oprogress is set to the size of the result in bytes. These progress indicators have size LMAX, which defaults to 24 (16 megabytes).

EDIT: You can READ/WRITE at the same time. o_byte will always receive the contents of the output buffer at index i_raddr, independent of the i_mode. So setting i_mode to READ for reading is optional. Setting i_mode to WRITE allows concurrent read/write.

So there really should be no issue embedding the current implementation in the standard Vivado AXI-Stream wrapper. If you write this wrapper (or a Wishbone wrapper) then I would be happy to merge it in this repository. I might also write one in the future myself, but I have currently no concrete usecase for such a wrapper.

Finally, in general Vivado AXI implementations are not without issues: https://zipcpu.com/blog/2019/01/12/demoaxilite.html

Suddenly o_iprogress is related to reading? I thought it was about writing. And if o_iprogress starts at 0 it is not possible to write only at addresses < o_iprogress.
Were you maybe trying to say: You should not write at address o_iprogress - 1 ?

And likewise: You should not read at address o_oprogress ?

@maartenbrock
Copy link
Author

Would you consider to rename reset to resetn as it appears to be active-low?

@tomtor
Copy link
Owner

tomtor commented Feb 2, 2019

Would you consider to rename reset to resetn as it appears to be active-low?

Yes, I should or I could change the implementation to active-high which is best practice in FPGA land? Not sure, what is your preference?

@maartenbrock
Copy link
Author

My preference is active-high, but active-low with renaming to resetn is okay as well. The current situation is confusing.

@maartenbrock
Copy link
Author

I'm getting lost. You wrote:

o_iprogress indicates the current position of the deflate input reader. You need that if you stream the input in a circular input buffer like the tests do. You should only write at addresses < o_iprogress to prevent overwriting unread data.

Then I asked:

Does o_iprogress start at 1? Or should I only write at addresses <= o_iprogress?

And now you answer:

Regarding your other questions: o_iprogress starts at 0. So you can read the first output byte (with i_raddr = 0) when o_iprogress >= 1.

Suddenly o_iprogress is related to reading? I thought it was about writing. And if o_iprogress starts at 0 it is not possible to write only at addresses < o_iprogress.
Were you maybe trying to say: You should not write at address o_iprogress - 1 ?

And likewise: You should not read at address o_oprogress ?

@maartenbrock
Copy link
Author

The Vivado AXI article is about AXI-lite, not AXI-Stream. AXI-Stream is a much simpler protocol. Further, I've never used that core, but always wrote my own AXI-lite slave core which never gave any issues.

If setting i_mode to READ is optional, does that mean that you don't need to READ to release occupied bytes in the read buffer? So, if you don't read in time the buffer will just overflow, right? Not necessarily a problem, just asking for confirmation. It does mean that back-pressure on the reading side should be handled on the writing side.

@maartenbrock
Copy link
Author

maartenbrock commented Feb 4, 2019

The generated VHDL deflate.vhd has a lot of bugs:

  • NEXT is a reserved VHDL keyword, and so is DISTANCE
  • static is a redefinition of STATIC in VHDL
  • lots of direct assignments to and reading from signals from within functions/procedures
  • type errors
  • there is no bool() function nor stdl

I don't know if it is the source or the myhdl converter, but this is totally useless, sorry.

@tomtor
Copy link
Owner

tomtor commented Feb 4, 2019

I'm getting lost. You wrote:

Sorry, I corrected the text in the original reply. It should be OK now.

Regarding the VHDL, that's a MyHDL issue, nothing I can do about that.
I only use/test the verilog output which is fine. I could change the identifiers to prevent NEXT/DISTANCE/STATIC (and use e.g. HD_NEXT/HD_DISTANCE/...), but that makes no sense, because there remain still other VHDL issues?

If it is ONLY a matter of different identifiers then please submit a PR with identifiers which are fine. If it is broken beyond repair, you have no choice but to use the verilog.

Edit: I dropped the .vhd from the repo and changed reset to be active high.

@tomtor
Copy link
Owner

tomtor commented Feb 4, 2019

If setting i_mode to READ is optional, does that mean that you don't need to READ to release occupied bytes in the read buffer? So, if you don't read in time the buffer will just overflow, right? Not necessarily a problem, just asking for confirmation. It does mean that back-pressure on the reading side should be handled on the writing side.

Edit: i_raddr signals the (de)compressor which output data has been read. It will pause and not overwrite.

@maartenbrock
Copy link
Author

It looks like the generated vhdl is beyond repair. Just for the fun of it, open deflate.vhd in Vivado and see all the red markers.

I have now declared a component in vhdl and included the verilog file in Vivado.

component deflate is
port (
	clk		: in std_logic;
	reset		: in std_logic;

	i_mode		: in std_logic_vector ( 2 downto 0 );
	i_waddr		: in std_logic_vector ( 8 downto 0 );
	i_data		: in std_logic_vector ( 7 downto 0 );
	o_iprogress	: out std_logic_vector ( 23 downto 0 );
	o_oprogress	: out std_logic_vector ( 23 downto 0 );
	i_raddr		: in std_logic_vector ( 8 downto 0 );
	o_byte		: out std_logic_vector ( 7 downto 0 );
	o_done		: out std_logic		
);
end component;

I had to make some changes to the generated deflate.v to get things working:

  • changed o_byte from reg to wire
  • assign o_byte directly from oram[(i_raddr & 511)] instead of through a register
  • initialize o_iprogress and o_oprogress to zero during reset

And then I still have to delay o_oprogress by two clocks to get things right.

It's all these out-of-sync progress indicators, addresses and data that makes this difficult to handle. This is all automatically fixed when using an AXI-Stream. And you could probably drop the output blockram as well.

@maartenbrock
Copy link
Author

maartenbrock commented Feb 8, 2019

I think bramread() can be removed as it seems it is not used.

And the enumeration of states IDLE, WRITE, etc. should be initialized with range(5) instead of 6 now.

Also, can you explain why IBSIZE needs to be 16 * CWINDOW when FAST is used? Or could it still be set to 2 * CWINDOW?

Btw. I think FAST is not about the achievable clock speed. But is it about low latency or about high throughput (measured in clock ticks)?

@tomtor
Copy link
Owner

tomtor commented Feb 10, 2019

I fixed the range two days ago:

da8691e

FAST is explained in the README.

The only reason for 16 * CWINDOW is to get a buffer (512 bytes) which is sufficiently big for a dynamic tree test in the testbench part that does not use streaming. You can set it to 2 * CWINDOW and it will only TEST non dynamic compression. It has no impact on normal usage.

Edit: CWINDOW is 32 when FAST = True and 256 when False so we need to calculate IBSIZE as is done.

Edit 2: What do you mean regarding bramread(): it sets orbyte!?
It is not called in deflate.py but it IS needed and functional. It generates the verilog always block.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants