Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework codec #822

Open
wants to merge 17 commits into
base: master
Choose a base branch
from
Open

Rework codec #822

wants to merge 17 commits into from

Conversation

dalex78
Copy link
Contributor

@dalex78 dalex78 commented Apr 28, 2022

Linked with issue #818

The proposed modifications on the codec package are retro-compatible. The tb_codec.vhd has not been modified to underline this. The tb of the com package tb_com_codec.vhd has not been touched either.

The tests compile and pass for data_type and com package with Modelsim 2020.1. I will try with GHDL soon.

Here is a list of changes (non-exhaustive):

  • Add comments into the code which explains the purpose of this package and how each type is encoded.
  • Removed hard coded value in the code
  • Replace the type 'string' used to encode by an alias named 'code_t' to ease readability and maintainbility.
  • Inverse the function and alias names to ease readability, maintainbility and ease ctrl+F:
    function encode(constant data : std_ulogic_vector) return string;
    function decode(constant code : string) return std_ulogic_vector;
    alias encode_std_ulogic_vector is encode[std_ulogic_vector return string];
    alias decode_std_ulogic_vector is decode[string return std_ulogic_vector];
    becomes
    function encode_std_ulogic_vector(data : std_ulogic_vector) return string;
    function decode_std_ulogic_vector(code : string) return std_ulogic_vector;
    alias encode is encode_std_ulogic_vector[std_ulogic_vector return string];
    alias decode is decode_std_ulogic_vector[string return std_ulogic_vector];
  • The code length for integer automatically adapts to the width of the integer (1993/2002/2008: 32bits, 2019: 64bits). (Not tested with VHDL-2019).
  • Added clearly which function should be used by a VUnit user vs a VUnit advanced user vs a VUnit developer.
  • Functions encode_array_header and get_range are replaced (these names are not consistent with the other functions):
    • function encode_range(range_left : integer; range_right : integer; is_ascending : boolean) return code_vec_t;
    • function decode_range(code : code_vec_t) return range_t;
    • encode_array_header and get_range are deprecated but preserved for backward compatibility.
    • encode_array_header and get_range take the encoded string of range_left, range_right and is_ascending while encode_range and decode_range directly take integers and a boolean.
    • Note that encode_array_header can encode two ranges but it was never used for that purpose (anywhere in the VUnit repo). The encode_range can only encode a single range.
  • Replace the ieee.numeric_std.unsigned by ieee.numeric_std.unresolved_unsigned
  • Replace the ieee.numeric_std.signed by ieee.numeric_std.unresolved_signed
  • Simplify the local log2 function used to encode the real type by using the ieee.math_real package.
    The same idea could be applied to string, but I did not see the use case so it was not implemented.
  • For each encoded type which is a scalar, an enumerated type or a record type, there is a function named function code_length_**type_name** return natural which indicate how many characters are needed to encode a value of the selected type.
  • For each encoded type which is an array type, there is a function named function code_length_**type_name**(length : natural) return natural; which indicate how many character are needed to encode a value of the selected type.
  • The type std_ulogic_array is similar to the std_ulogic_vector but with an integer range. However, it was not the case for bit_vector. There is now a type bit_array for that purpose.
  • The function to_byte_array and from_byte_array are replaced (these names are not consistent with the other functions and the introduction of bit_array required a change of bit_vector to bit_array):
    • function encode_raw_bit_array(data : bit_array) return code_t;
    • function decode_raw_bit_array(code : code_t) return bit_array;
    • to_byte_array and from_byte_array are deprecated but preserved for backward compatibility.
  • Create a new package common_pkg which contains constants/fonction used by the codec package but not specific to it like:
    -- Retrieve the integer width of the simulator
    function get_simulator_integer_width return positive;
  • Added a page in the online documentation which shows the declaration package of the codec.vhd and codec-2008p.vhd files.
  • Updated the python code to generate the encode/decode functions for user-specified type.

…per. Retro-compatibility is preserved. There is now an encode and decode procedure (for advanced user) for each predefined types as well as a simple function (for the user). You can now know the size of the futur encoded string by using the code_length_ functions.
…s which are scalar, enumerated or composite type (records only). Add the support deprecated functions (for backward compatibility)
…s which are composite type (arrays only). End of modification of codec_builder.vhd
… which are scalar, enumerated or composite types (records only).
…125 tests) and com (132 tests) with Modelsim 2020.1. All tests pass. Note that no tb nor the codec_2008 has been modified
…im 2020.1. All tests pass. Note that no tb has been modified.
…ch show the coded and codec_2008 package declaration. A small introduction of the codec is now present in the user_guide of the data_type.
@dalex78
Copy link
Contributor Author

dalex78 commented Apr 29, 2022

@std-max pointed that unresolved_unsigned does not exists in VHDL-93. I lauched test only with VHDL 2008. I will run some more tests in VHDL 93 and correct this (and eventually other mistakes of the same nature)

…DL when using VHDL-93 may work without errors or warnings.
…ying the old ones. All test with GHDL passes (in 93 and 2008) for all tests in vunit/vhdl (GHDL 3.0.0-dev (v2.0.0-101-g791ff0c1) [Dunoon edition] Compiled with GNAT Version: 9.3.0. GCC back-end code generator)
@dalex78
Copy link
Contributor Author

dalex78 commented Apr 30, 2022

I have lauched all tests inside the vunit/vhdl sub-directories with Modelsim 2020.1 and GHDL 3.0.0-dev (v2.0.0-101-g791ff0c1). The tests have been perfomed with VHDL-93 (when possible) and VHDL-2008.
All tests pass in all configurations (GHDL vs Modelsim and 93 vs 2008).

Note:
I made slight modifications because GHDL raises an error when casting a null range of type bit_array(-1 downto 0) to bit_vector.
Error: 'bound check failure'
Example:

  procedure decode_bit_vector(constant code : in code_t; variable index : inout code_index_t; variable result : out bit_vector) is
    variable ret_val : bit_array(result'range);
  begin
    decode_bit_array(code, index, ret_val);
    result := bit_vector(ret_val); -- ERROR 'bound check failure' with GDHL but not with modelsim
  end procedure;

This code has been transformed into:

  procedure decode_bit_vector(constant code : in code_t; variable index : inout code_index_t; variable result : out bit_vector) is
    variable ret_val : bit_array(result'range);
  begin
    if ret_val'length /= 0 then -- GHDL work-around
      decode_bit_array(code, index, ret_val);
      result := bit_vector(ret_val);
    else
      result := "";
    end if;
  end procedure;

LRM: VHDL-2008: '5.2.1 General'

A constraint is compatible with a subtype if each bound of the range belongs to the subtype or if the range constraint defines a null range. Otherwise, the range constraint is not compatible with the subtype.

So I guess this is a GHDL mistake. I will link the corresponding issue from the GHDL repo when created.

@dalex78
Copy link
Contributor Author

dalex78 commented May 7, 2022

@LarsAsplund ?
@umarcor ?

-------------------------------------------------------------------------------
-- Package declaration
-------------------------------------------------------------------------------
package common_pkg is
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be renamed to data_types_common_pkg since we can have common packages at various levels that will be compiled into the common vunit_lib

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll change that

vunit/vhdl/data_types/src/common_pkg.vhd Show resolved Hide resolved
end function;


function get_simulator_integer_width return positive is -- TBC Is this function written well enough ?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested this with a simulator using 64-bit integers? I tested with Riviera-PRO 2022.04 in VHDL-2019 mode but it is still using 32-bit integers.

From a standard point of view integers are implementation defined. Not being 32 bits doesn't necessary mean it is 64 bits. However, this function is only used to allocate sufficient number of bits. Maybe rename the function and the related constant to show that?

Copy link
Contributor Author

@dalex78 dalex78 Jul 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not tested with a 64-bits as I do not have one I believe. questasim/22.1.0 may be 64bits, I need to check that.

Would min_integer_width be ok ? (I am not convinced)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description of such a function would be that it returns the number of bits to allocate for an integer so maybe get_num_of_bits_to_allocate_for_integer. I don't mind long names if that's needed to describe the purpose.

end function;


function get_simulator_real_width return positive is -- TBC Is this function written well enough ?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment for this and following functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would min_real_width be ok ? (I am not convinced)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_num_of_bits_to_allocate_for_real

@@ -6,197 +6,231 @@
--
-- Copyright (c) 2014-2022, Lars Asplund lars.anders.asplund@gmail.com

library std;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Superfluous library clause. Access to library std is implicit. General remark for all files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove them.


library work;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Superfluous library clause. Access to library work is implicit. General remark for all files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove them.

constant code : string)
return character;

function encode_boolean(data : boolean) return code_t;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is fixed in a later commit but these type-specific functions have no bodies. The bodies are for the ocerloaded encode/decode functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried to segment the commits so that it is easier for you to review. Yes the body will come later.

Not sure what you mean by:

The bodies are for the ocerloaded encode/decode functions

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that the function declaration is for encode_boolean and then there is an alias for encode. That means that the function definition has to be for encode_boolean and not encode as it is now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I changed that. When developers work on the codec library (or user wants to now more about it), all they can actually see is a list of encode/decode functions/procedures. It is much more convenient to have the function definition for encode_boolean has it is easier t search for than encode.

Yes, the function definition has to be for encode_boolean and not encode as it is now as it is for the commit you must be currently reviewing but not the last one. Sorry if that's not convenient to you, I'll try to have better commits next time.

-- of that feature. We first encode/decode the real part, then the imaginary part.

-- Predefined enumerated types
procedure encode_boolean(constant data : in boolean; variable index : inout code_index_t; variable code : inout code_t);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to previous code. Function bodies are for the overloaded functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous

end function;

-----------------------------------------------------------------------------
-- String
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bit_array

-----------------------------------------------------------------------------
-- A string can directly be stored into the encoded string
-- We do not encode the range
function code_length_raw_string(length : natural) return natural is
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code length functions taking vector length as input assumes that the code length for every element in the vector is the same. At the same type the code length function for the element type takes the element as input which opens up for non-constant codes. This becomes a bit inconsistent

Copy link
Contributor Author

@dalex78 dalex78 Jul 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not understood.

The code length functions taking vector length as input assumes that the code length for every element in the vector is the same.

The code length of any elements which is not an array or a record (a composite type) is constant. The code length of a composite type is variable depending on :

  • the length of the array and the base element of the array and whether or not we encode the range of the vector,
  • the elements constituting the record.

At the same type the code length function for the element type takes the element as input which opens up for non-constant codes.

No, a code length of a non-composite type returns a constant regardless of the element given:

-- Non-composite type
signal some_bool : boolean;
constant LENGTH1 : string := code_length_boolean;
constant LENGTH2 : string := code_length_boolean(some_bool);
constant LENGTH3 : string := code_length(some_bool);
-- Note that LENGTH1 = LENGTH2 = LENGTH3

These functions take the boolean in parameter but this is optional. It was done this way to have the same way of writing between a composite and a non-composite type.

-- Composite type
signal some_vec : std_ulogic(7 downto 0);
constant LENGTH1 : string := code_length_std_ulogic(some_vec'length);
constant LENGTH2 : string := code_length_std_ulogic(some_vec); -- Same writing as above
constant LENGTH3 : string := code_length(some_vec); -- Same writing as above
-- Note that LENGTH1 = LENGTH2 = LENGTH3

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the public constant and only saw:

function code_length_std_ulogic(data : std_ulogic := std_ulogic'left) return natural

This opens up for the possibility of having a code length that depends on the data. This is not the case today but I'm thinking that it might be the case in the future. I've seen examples where many encoded data structures may lead to stack overflow. A possible solution to this would be to create a more compact encoding scheme. For example, rather than characters as building blocks for encoding we might find it more efficient to use bits and have variable length encoding, for example 0 and 1 are more common than L and H so they would be given a shorter code word. With a variable length encoding the code length for a std_ulogic_vector can't be calculated from the length of the array, only from the data.

All this is a bit speculative of course but the fewer dependencies to the constant length assumption we have the easier it would be for such a future update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll remove the function which only take the length into account.
I guess that I'll also remove the default values for functions like code_length_boolean or code_length_std_ulogic

-- * a sign (encoded as boolean)
-- * a mantisse (encoded as 2 integers)
-- * an exponent (encoded as 1 integer)
constant static_code_length_real : positive := code_length_boolean + 3 * code_length_integer;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a function of the simulator_real_width constant? This is a function of simulator_integer_width. In practice this is probably no issue if 64-bit integers and 64-bit reals come in pair.

Copy link
Contributor Author

@dalex78 dalex78 Jul 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good remark. I'll look into that.

Copy link
Collaborator

@LarsAsplund LarsAsplund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After 8b8c2db I can run all tests in ModelSim but GHDL fails compilation of codec_builder without error message. Riviera-PRO and Active-HDL compile (after changing the non-positive code_t range and adding the in mode) but they fail simulation with an internal error which I can't decrypt. I will have to do some bisecting of the changes to find the root cause. I guess you can't get hold of Riviera-PRO and Active-HDL but I recommend testing with free GHDL

end function;


function get_simulator_integer_width return positive is -- TBC Is this function written well enough ?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description of such a function would be that it returns the number of bits to allocate for an integer so maybe get_num_of_bits_to_allocate_for_integer. I don't mind long names if that's needed to describe the purpose.

end function;


function get_simulator_real_width return positive is -- TBC Is this function written well enough ?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_num_of_bits_to_allocate_for_real

constant code : string)
return character;

function encode_boolean(data : boolean) return code_t;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that the function declaration is for encode_boolean and then there is an alias for encode. That means that the function definition has to be for encode_boolean and not encode as it is now.

-- of that feature. We first encode/decode the real part, then the imaginary part.

-- Predefined enumerated types
procedure encode_boolean(constant data : in boolean; variable index : inout code_index_t; variable code : inout code_t);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous

-----------------------------------------------------------------------------
-- A string can directly be stored into the encoded string
-- We do not encode the range
function code_length_raw_string(length : natural) return natural is
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the public constant and only saw:

function code_length_std_ulogic(data : std_ulogic := std_ulogic'left) return natural

This opens up for the possibility of having a code length that depends on the data. This is not the case today but I'm thinking that it might be the case in the future. I've seen examples where many encoded data structures may lead to stack overflow. A possible solution to this would be to create a more compact encoding scheme. For example, rather than characters as building blocks for encoding we might find it more efficient to use bits and have variable length encoding, for example 0 and 1 are more common than L and H so they would be given a shorter code word. With a variable length encoding the code length for a std_ulogic_vector can't be calculated from the length of the array, only from the data.

All this is a bit speculative of course but the fewer dependencies to the constant length assumption we have the easier it would be for such a future update.


-- Deprecated. Maintained for backward compatibility.
function to_byte_array(value : bit_vector) return code_t is
variable ret_val : code_t(code_length_bit_array(bit_array(value))-1 downto 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code_t has a positive range so downto 0 is not legal. Riviera-PRO complains.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch

constant range_right2 : string := "";
constant is_ascending2 : string := "T")
return string is
constant range_left1 : in code_t;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function declaration has the mode in explicitly defined on all constants. Riviera-PRO doesn't like that there is a difference between declaration and definition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, there are no difference between the function declaration and definition of encode_array_header.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In one place it is:

constant range_left1 : code_t;

and in the other:

constant range_left1 : in code_t;

These are the same from a VHDL point of view since in is the default for a constant. However, Riviera-PRO will give a compile error if they are not literally the same.


function decode_raw_string(code : code_t; length : positive) return string is
variable ret_val : string(length-1 downto 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string must have a positive range

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, I'll also add tests for this one.

@dalex78
Copy link
Contributor Author

dalex78 commented Jul 20, 2022

After 8b8c2db I can run all tests in ModelSim but GHDL fails compilation of codec_builder without error message. Riviera-PRO and Active-HDL compile (after changing the non-positive code_t range and adding the in mode) but they fail simulation with an internal error which I can't decrypt. I will have to do some bisecting of the changes to find the root cause. I guess you can't get hold of Riviera-PRO and Active-HDL but I recommend testing with free GHDL

The commits 4e7fd60 and 7515a1f update the sources so that it compiles with GHDL (GHDL 3.0.0-dev (v2.0.0-101-g791ff0c1)). There is a comment about that earlier: #822 (comment).

I tested with Modelsim the commit 8b8c2db but no other tool.

And I tested with Modelsim and GHDL starting commit e2de513 (see comment #822 (comment)).

--===========================================================================

function encode_boolean_vector(data : boolean_vector) return code_t;
function decode_boolean_vector(code : string) return boolean_vector;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The declarations of the decode functions take string but the definition takes code_t. Riviera-PRO doesn't allow this although they are the same type.


function get_simulator_real_width return positive is -- TBC Is this function written well enough ?
begin
if real'high = 1.0E308 then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Max for single-precision floats is 3.4028235e38 and for float 1.7976931348623158e308. This will always be false and return 64. In general I find that the high/low values have a tendency to be unexpected. I recall working around such bugs in VUnit in the past. I would recommend that we do an inequality with some threshold that is chosen with some margin such that we allocate sufficient number of bits. Or we hardcode these values to always be 64.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants