Rework codec #822

dalex78 · 2022-04-28T22:09:30Z

Linked with issue #818

The proposed modifications on the codec package are retro-compatible. The tb_codec.vhd has not been modified to underline this. The tb of the com package tb_com_codec.vhd has not been touched either.

The tests compile and pass for data_type and com package with Modelsim 2020.1. I will try with GHDL soon.

Here is a list of changes (non-exhaustive):

Add comments into the code which explains the purpose of this package and how each type is encoded.
Removed hard coded value in the code
Replace the type 'string' used to encode by an alias named 'code_t' to ease readability and maintainbility.

Inverse the function and alias names to ease readability, maintainbility and ease ctrl+F:

function encode(constant data : std_ulogic_vector) return string;
function decode(constant code : string) return std_ulogic_vector;
alias encode_std_ulogic_vector is encode[std_ulogic_vector return string];
alias decode_std_ulogic_vector is decode[string return std_ulogic_vector];

becomes

function encode_std_ulogic_vector(data : std_ulogic_vector) return string;
function decode_std_ulogic_vector(code : string) return std_ulogic_vector;
alias encode is encode_std_ulogic_vector[std_ulogic_vector return string];
alias decode is decode_std_ulogic_vector[string return std_ulogic_vector];

The code length for integer automatically adapts to the width of the integer (1993/2002/2008: 32bits, 2019: 64bits). (Not tested with VHDL-2019).
Added clearly which function should be used by a VUnit user vs a VUnit advanced user vs a VUnit developer.
Functions encode_array_header and get_range are replaced (these names are not consistent with the other functions):
- function encode_range(range_left : integer; range_right : integer; is_ascending : boolean) return code_vec_t;
- function decode_range(code : code_vec_t) return range_t;
- encode_array_header and get_range are deprecated but preserved for backward compatibility.
- encode_array_header and get_range take the encoded string of range_left, range_right and is_ascending while encode_range and decode_range directly take integers and a boolean.
- Note that encode_array_header can encode two ranges but it was never used for that purpose (anywhere in the VUnit repo). The encode_range can only encode a single range.
Replace the ieee.numeric_std.unsigned by ieee.numeric_std.unresolved_unsigned
Replace the ieee.numeric_std.signed by ieee.numeric_std.unresolved_signed
Simplify the local log2 function used to encode the real type by using the ieee.math_real package.
The same idea could be applied to string, but I did not see the use case so it was not implemented.
For each encoded type which is a scalar, an enumerated type or a record type, there is a function named function code_length_**type_name** return natural which indicate how many characters are needed to encode a value of the selected type.
For each encoded type which is an array type, there is a function named function code_length_**type_name**(length : natural) return natural; which indicate how many character are needed to encode a value of the selected type.
The type std_ulogic_array is similar to the std_ulogic_vector but with an integer range. However, it was not the case for bit_vector. There is now a type bit_array for that purpose.
The function to_byte_array and from_byte_array are replaced (these names are not consistent with the other functions and the introduction of bit_array required a change of bit_vector to bit_array):
- function encode_raw_bit_array(data : bit_array) return code_t;
- function decode_raw_bit_array(code : code_t) return bit_array;
- to_byte_array and from_byte_array are deprecated but preserved for backward compatibility.
Create a new package common_pkg which contains constants/fonction used by the codec package but not specific to it like:
```
-- Retrieve the integer width of the simulator
function get_simulator_integer_width return positive;
```
Added a page in the online documentation which shows the declaration package of the codec.vhd and codec-2008p.vhd files.
Updated the python code to generate the encode/decode functions for user-specified type.

…per. Retro-compatibility is preserved. There is now an encode and decode procedure (for advanced user) for each predefined types as well as a simple function (for the user). You can now know the size of the futur encoded string by using the code_length_ functions.

…ation on how each type is encoded.

…s which are scalar, enumerated or composite type (records only). Add the support deprecated functions (for backward compatibility)

…s which are composite type (arrays only). End of modification of codec_builder.vhd

… which are scalar, enumerated or composite types (records only).

… which are composite type (arrays only).

…125 tests) and com (132 tests) with Modelsim 2020.1. All tests pass. Note that no tb nor the codec_2008 has been modified

…im 2020.1. All tests pass. Note that no tb has been modified.

…ch show the coded and codec_2008 package declaration. A small introduction of the codec is now present in the user_guide of the data_type.

dalex78 · 2022-04-29T10:18:14Z

@std-max pointed that unresolved_unsigned does not exists in VHDL-93. I lauched test only with VHDL 2008. I will run some more tests in VHDL 93 and correct this (and eventually other mistakes of the same nature)

…DL when using VHDL-93 may work without errors or warnings.

…ying the old ones. All test with GHDL passes (in 93 and 2008) for all tests in vunit/vhdl (GHDL 3.0.0-dev (v2.0.0-101-g791ff0c1) [Dunoon edition] Compiled with GNAT Version: 9.3.0. GCC back-end code generator)

dalex78 · 2022-04-30T18:09:31Z

I have lauched all tests inside the vunit/vhdl sub-directories with Modelsim 2020.1 and GHDL 3.0.0-dev (v2.0.0-101-g791ff0c1). The tests have been perfomed with VHDL-93 (when possible) and VHDL-2008.
All tests pass in all configurations (GHDL vs Modelsim and 93 vs 2008).

Note:
I made slight modifications because GHDL raises an error when casting a null range of type bit_array(-1 downto 0) to bit_vector.
Error: 'bound check failure'
Example:

  procedure decode_bit_vector(constant code : in code_t; variable index : inout code_index_t; variable result : out bit_vector) is
    variable ret_val : bit_array(result'range);
  begin
    decode_bit_array(code, index, ret_val);
    result := bit_vector(ret_val); -- ERROR 'bound check failure' with GDHL but not with modelsim
  end procedure;

This code has been transformed into:

  procedure decode_bit_vector(constant code : in code_t; variable index : inout code_index_t; variable result : out bit_vector) is
    variable ret_val : bit_array(result'range);
  begin
    if ret_val'length /= 0 then -- GHDL work-around
      decode_bit_array(code, index, ret_val);
      result := bit_vector(ret_val);
    else
      result := "";
    end if;
  end procedure;

LRM: VHDL-2008: '5.2.1 General'

A constraint is compatible with a subtype if each bound of the range belongs to the subtype or if the range constraint defines a null range. Otherwise, the range constraint is not compatible with the subtype.

So I guess this is a GHDL mistake. I will link the corresponding issue from the GHDL repo when created.

dalex78 · 2022-05-07T19:22:10Z

@LarsAsplund ?
@umarcor ?

LarsAsplund · 2022-07-17T18:38:54Z

vunit/vhdl/data_types/src/common_pkg.vhd

+-------------------------------------------------------------------------------
+-- Package declaration
+-------------------------------------------------------------------------------
+package common_pkg is


I think this should be renamed to data_types_common_pkg since we can have common packages at various levels that will be compiled into the common vunit_lib

Sure, I'll change that

vunit/vhdl/data_types/src/common_pkg.vhd

LarsAsplund · 2022-07-17T18:43:20Z

vunit/vhdl/data_types/src/common_pkg.vhd

+  end function;
+
+
+  function get_simulator_integer_width return positive is -- TBC Is this function written well enough ?


Have you tested this with a simulator using 64-bit integers? I tested with Riviera-PRO 2022.04 in VHDL-2019 mode but it is still using 32-bit integers.

From a standard point of view integers are implementation defined. Not being 32 bits doesn't necessary mean it is 64 bits. However, this function is only used to allocate sufficient number of bits. Maybe rename the function and the related constant to show that?

Not tested with a 64-bits as I do not have one I believe. questasim/22.1.0 may be 64bits, I need to check that.

Would min_integer_width be ok ? (I am not convinced)

The description of such a function would be that it returns the number of bits to allocate for an integer so maybe get_num_of_bits_to_allocate_for_integer. I don't mind long names if that's needed to describe the purpose.

LarsAsplund · 2022-07-17T20:02:11Z

vunit/vhdl/data_types/src/common_pkg.vhd

+  end function;
+
+
+  function get_simulator_real_width return positive is -- TBC Is this function written well enough ?


Same comment for this and following functions

Would min_real_width be ok ? (I am not convinced)

get_num_of_bits_to_allocate_for_real

LarsAsplund · 2022-07-17T20:28:19Z

vunit/vhdl/data_types/src/codec.vhd

@@ -6,197 +6,231 @@
 --
 -- Copyright (c) 2014-2022, Lars Asplund lars.anders.asplund@gmail.com

+library std;


Superfluous library clause. Access to library std is implicit. General remark for all files.

I'll remove them.

LarsAsplund · 2022-07-17T20:28:34Z

vunit/vhdl/data_types/src/codec.vhd


+library work;


Superfluous library clause. Access to library work is implicit. General remark for all files.

I'll remove them.

LarsAsplund · 2022-07-17T20:48:14Z

vunit/vhdl/data_types/src/codec.vhd

-    constant code : string)
-    return character;
+
+  function encode_boolean(data : boolean) return code_t;


I guess this is fixed in a later commit but these type-specific functions have no bodies. The bodies are for the ocerloaded encode/decode functions

I have tried to segment the commits so that it is easier for you to review. Yes the body will come later.

Not sure what you mean by:

The bodies are for the ocerloaded encode/decode functions

What I mean is that the function declaration is for encode_boolean and then there is an alias for encode. That means that the function definition has to be for encode_boolean and not encode as it is now.

Yes, I changed that. When developers work on the codec library (or user wants to now more about it), all they can actually see is a list of encode/decode functions/procedures. It is much more convenient to have the function definition for encode_boolean has it is easier t search for than encode.

Yes, the function definition has to be for encode_boolean and not encode ~~as it is now~~ as it is for the commit you must be currently reviewing but not the last one. Sorry if that's not convenient to you, I'll try to have better commits next time.

LarsAsplund · 2022-07-17T21:21:10Z

vunit/vhdl/data_types/src/codec_builder.vhd

+  -- of that feature. We first encode/decode the real part, then the imaginary part.
+
+  -- Predefined enumerated types
+  procedure encode_boolean(constant data : in boolean; variable index : inout code_index_t; variable code : inout code_t);


Similar to previous code. Function bodies are for the overloaded functions

Not sure what you mean

See previous

LarsAsplund · 2022-07-17T21:48:36Z

vunit/vhdl/data_types/src/codec_builder.vhd

+  end function;
+
+  -----------------------------------------------------------------------------
+  -- String


LarsAsplund · 2022-07-17T21:58:37Z

vunit/vhdl/data_types/src/codec_builder.vhd

+  -----------------------------------------------------------------------------
+  -- A string can directly be stored into the encoded string
+  -- We do not encode the range
+  function code_length_raw_string(length : natural) return natural is


The code length functions taking vector length as input assumes that the code length for every element in the vector is the same. At the same type the code length function for the element type takes the element as input which opens up for non-constant codes. This becomes a bit inconsistent

I did not understood.

The code length functions taking vector length as input assumes that the code length for every element in the vector is the same.

The code length of any elements which is not an array or a record (a composite type) is constant. The code length of a composite type is variable depending on :

the length of the array and the base element of the array and whether or not we encode the range of the vector,

the elements constituting the record.

At the same type the code length function for the element type takes the element as input which opens up for non-constant codes.

No, a code length of a non-composite type returns a constant regardless of the element given:

-- Non-composite type signal some_bool : boolean; constant LENGTH1 : string := code_length_boolean; constant LENGTH2 : string := code_length_boolean(some_bool); constant LENGTH3 : string := code_length(some_bool); -- Note that LENGTH1 = LENGTH2 = LENGTH3

These functions take the boolean in parameter but this is optional. It was done this way to have the same way of writing between a composite and a non-composite type.

-- Composite type signal some_vec : std_ulogic(7 downto 0); constant LENGTH1 : string := code_length_std_ulogic(some_vec'length); constant LENGTH2 : string := code_length_std_ulogic(some_vec); -- Same writing as above constant LENGTH3 : string := code_length(some_vec); -- Same writing as above -- Note that LENGTH1 = LENGTH2 = LENGTH3

I missed the public constant and only saw:

function code_length_std_ulogic(data : std_ulogic := std_ulogic'left) return natural

This opens up for the possibility of having a code length that depends on the data. This is not the case today but I'm thinking that it might be the case in the future. I've seen examples where many encoded data structures may lead to stack overflow. A possible solution to this would be to create a more compact encoding scheme. For example, rather than characters as building blocks for encoding we might find it more efficient to use bits and have variable length encoding, for example 0 and 1 are more common than L and H so they would be given a shorter code word. With a variable length encoding the code length for a std_ulogic_vector can't be calculated from the length of the array, only from the data.

All this is a bit speculative of course but the fewer dependencies to the constant length assumption we have the easier it would be for such a future update.

Sure, I'll remove the function which only take the length into account.
I guess that I'll also remove the default values for functions like code_length_boolean or code_length_std_ulogic

LarsAsplund · 2022-07-17T22:04:12Z

vunit/vhdl/data_types/src/codec_builder.vhd

+  -- * a sign (encoded as boolean)
+  -- * a mantisse (encoded as 2 integers)
+  -- * an exponent (encoded as 1 integer)
+  constant static_code_length_real : positive := code_length_boolean + 3 * code_length_integer;


Shouldn't this be a function of the simulator_real_width constant? This is a function of simulator_integer_width. In practice this is probably no issue if 64-bit integers and 64-bit reals come in pair.

That's a good remark. I'll look into that.

LarsAsplund

After 8b8c2db I can run all tests in ModelSim but GHDL fails compilation of codec_builder without error message. Riviera-PRO and Active-HDL compile (after changing the non-positive code_t range and adding the in mode) but they fail simulation with an internal error which I can't decrypt. I will have to do some bisecting of the changes to find the root cause. I guess you can't get hold of Riviera-PRO and Active-HDL but I recommend testing with free GHDL

LarsAsplund · 2022-07-19T18:54:23Z

vunit/vhdl/data_types/src/common_pkg.vhd

+  end function;
+
+
+  function get_simulator_integer_width return positive is -- TBC Is this function written well enough ?


The description of such a function would be that it returns the number of bits to allocate for an integer so maybe get_num_of_bits_to_allocate_for_integer. I don't mind long names if that's needed to describe the purpose.

LarsAsplund · 2022-07-19T19:00:09Z

vunit/vhdl/data_types/src/common_pkg.vhd

+  end function;
+
+
+  function get_simulator_real_width return positive is -- TBC Is this function written well enough ?


get_num_of_bits_to_allocate_for_real

LarsAsplund · 2022-07-19T19:02:43Z

vunit/vhdl/data_types/src/codec.vhd

-    constant code : string)
-    return character;
+
+  function encode_boolean(data : boolean) return code_t;


What I mean is that the function declaration is for encode_boolean and then there is an alias for encode. That means that the function definition has to be for encode_boolean and not encode as it is now.

LarsAsplund · 2022-07-19T19:02:52Z

vunit/vhdl/data_types/src/codec_builder.vhd

+  -- of that feature. We first encode/decode the real part, then the imaginary part.
+
+  -- Predefined enumerated types
+  procedure encode_boolean(constant data : in boolean; variable index : inout code_index_t; variable code : inout code_t);


See previous

LarsAsplund · 2022-07-19T19:21:38Z

vunit/vhdl/data_types/src/codec_builder.vhd

+  -----------------------------------------------------------------------------
+  -- A string can directly be stored into the encoded string
+  -- We do not encode the range
+  function code_length_raw_string(length : natural) return natural is


I missed the public constant and only saw:

function code_length_std_ulogic(data : std_ulogic := std_ulogic'left) return natural

This opens up for the possibility of having a code length that depends on the data. This is not the case today but I'm thinking that it might be the case in the future. I've seen examples where many encoded data structures may lead to stack overflow. A possible solution to this would be to create a more compact encoding scheme. For example, rather than characters as building blocks for encoding we might find it more efficient to use bits and have variable length encoding, for example 0 and 1 are more common than L and H so they would be given a shorter code word. With a variable length encoding the code length for a std_ulogic_vector can't be calculated from the length of the array, only from the data.

All this is a bit speculative of course but the fewer dependencies to the constant length assumption we have the easier it would be for such a future update.

LarsAsplund · 2022-07-19T20:12:28Z

vunit/vhdl/data_types/src/codec_builder.vhd

+
+  -- Deprecated. Maintained for backward compatibility.
+  function to_byte_array(value : bit_vector) return code_t is
+    variable ret_val : code_t(code_length_bit_array(bit_array(value))-1 downto 0);


code_t has a positive range so downto 0 is not legal. Riviera-PRO complains.

LarsAsplund · 2022-07-19T20:16:57Z

vunit/vhdl/data_types/src/codec_builder.vhd

-    constant range_right2  : string := "";
-    constant is_ascending2 : string := "T")
-    return string is
+    constant range_left1   : in code_t;


The function declaration has the mode in explicitly defined on all constants. Riviera-PRO doesn't like that there is a difference between declaration and definition.

As far as I can see, there are no difference between the function declaration and definition of encode_array_header.

In one place it is:

constant range_left1 : code_t;

and in the other:

constant range_left1 : in code_t;

These are the same from a VHDL point of view since in is the default for a constant. However, Riviera-PRO will give a compile error if they are not literally the same.

LarsAsplund · 2022-07-19T20:19:25Z

vunit/vhdl/data_types/src/codec.vhd


+  function decode_raw_string(code : code_t; length : positive) return string is
+    variable ret_val : string(length-1 downto 0);


string must have a positive range

Nice catch, I'll also add tests for this one.

dalex78 · 2022-07-20T06:24:13Z

After 8b8c2db I can run all tests in ModelSim but GHDL fails compilation of codec_builder without error message. Riviera-PRO and Active-HDL compile (after changing the non-positive code_t range and adding the in mode) but they fail simulation with an internal error which I can't decrypt. I will have to do some bisecting of the changes to find the root cause. I guess you can't get hold of Riviera-PRO and Active-HDL but I recommend testing with free GHDL

The commits 4e7fd60 and 7515a1f update the sources so that it compiles with GHDL (GHDL 3.0.0-dev (v2.0.0-101-g791ff0c1)). There is a comment about that earlier: #822 (comment).

I tested with Modelsim the commit 8b8c2db but no other tool.

And I tested with Modelsim and GHDL starting commit e2de513 (see comment #822 (comment)).

LarsAsplund · 2022-07-22T14:01:12Z

vunit/vhdl/data_types/src/codec-2008p.vhd

+  --===========================================================================
+
+  function encode_boolean_vector(data : boolean_vector) return code_t;
+  function decode_boolean_vector(code : string) return boolean_vector;


The declarations of the decode functions take string but the definition takes code_t. Riviera-PRO doesn't allow this although they are the same type.

LarsAsplund · 2022-07-22T17:08:43Z

vunit/vhdl/data_types/src/common_pkg.vhd

+
+  function get_simulator_real_width return positive is -- TBC Is this function written well enough ?
+  begin
+    if real'high = 1.0E308 then


Max for single-precision floats is 3.4028235e38 and for float 1.7976931348623158e308. This will always be false and return 64. In general I find that the high/low values have a tendency to be unexpected. I recall working around such bugs in VUnit in the past. I would recommend that we do an inequality with some threshold that is chosen with some margin such that we allocate sufficient number of bits. Or we hardcode these values to always be 64.

dalex78 added 15 commits April 28, 2022 21:52

Add new package for data_type with useful and common constants/functions

e3590cd

Add the body of the functions code_length_type functions and document…

31b17af

…ation on how each type is encoded.

Add the body of the encode/decode procedures for each predefined type…

929d3e2

…s which are scalar, enumerated or composite type (records only). Add the support deprecated functions (for backward compatibility)

Add the body of the encode/decode procedures for each predefined type…

735e399

…s which are composite type (arrays only). End of modification of codec_builder.vhd

Add the body of the encode/decode functions for each predefined types…

3f90937

… which are scalar, enumerated or composite types (records only).

Add the body of the encode/decode functions for each predefined types…

b963424

… which are composite type (arrays only).

Add the body of the decrecated functions. Test launch for data_type (…

8b8c2db

…125 tests) and com (132 tests) with Modelsim 2020.1. All tests pass. Note that no tb nor the codec_2008 has been modified

Update the codec_2008 package declaration

11d1139

Test launch for data_type (125 tests) and com (132 tests) with Models…

82e65e3

…im 2020.1. All tests pass. Note that no tb has been modified.

Update the generation of user created types (enumerated types only)

6776690

Update the generation of user created types (records types only)

7399e73

Update the generation of user created types (array types only)

4b1dc95

Update the generation of user created types.

6046a42

Add documentation for https://vunit.github.i. There is a new page whi…

e2de513

…ch show the coded and codec_2008 package declaration. A small introduction of the codec is now present in the user_guide of the data_type.

dalex78 added 2 commits April 30, 2022 11:54

Corrected sources for compilation on Modelsim and elaboration with GH…

4e7fd60

…DL when using VHDL-93 may work without errors or warnings.

Correct errors when compiling with GHDL. Add some tests without modif…

7515a1f

…ying the old ones. All test with GHDL passes (in 93 and 2008) for all tests in vunit/vhdl (GHDL 3.0.0-dev (v2.0.0-101-g791ff0c1) [Dunoon edition] Compiled with GNAT Version: 9.3.0. GCC back-end code generator)

LarsAsplund reviewed Jul 17, 2022

View reviewed changes

LarsAsplund reviewed Jul 20, 2022

View reviewed changes

LarsAsplund reviewed Jul 22, 2022

View reviewed changes

		end function;


		function get_simulator_integer_width return positive is -- TBC Is this function written well enough ?

		end function;


		function get_simulator_real_width return positive is -- TBC Is this function written well enough ?


		function decode_raw_string(code : code_t; length : positive) return string is
		variable ret_val : string(length-1 downto 0);

Rework codec #822

Are you sure you want to change the base?

Rework codec #822

Conversation

dalex78 commented Apr 28, 2022

dalex78 commented Apr 29, 2022

dalex78 commented Apr 30, 2022

dalex78 commented May 7, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dalex78 Jul 18, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dalex78 Jul 18, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dalex78 Jul 18, 2022 • edited

Choose a reason for hiding this comment

LarsAsplund left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dalex78 commented Jul 20, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dalex78 Jul 18, 2022 •

edited

dalex78 Jul 18, 2022 •

edited

dalex78 Jul 18, 2022 •

edited