New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework codec #822
base: master
Are you sure you want to change the base?
Rework codec #822
Conversation
…per. Retro-compatibility is preserved. There is now an encode and decode procedure (for advanced user) for each predefined types as well as a simple function (for the user). You can now know the size of the futur encoded string by using the code_length_ functions.
…ation on how each type is encoded.
…s which are scalar, enumerated or composite type (records only). Add the support deprecated functions (for backward compatibility)
…s which are composite type (arrays only). End of modification of codec_builder.vhd
… which are scalar, enumerated or composite types (records only).
… which are composite type (arrays only).
…125 tests) and com (132 tests) with Modelsim 2020.1. All tests pass. Note that no tb nor the codec_2008 has been modified
…im 2020.1. All tests pass. Note that no tb has been modified.
…ch show the coded and codec_2008 package declaration. A small introduction of the codec is now present in the user_guide of the data_type.
@std-max pointed that |
…DL when using VHDL-93 may work without errors or warnings.
…ying the old ones. All test with GHDL passes (in 93 and 2008) for all tests in vunit/vhdl (GHDL 3.0.0-dev (v2.0.0-101-g791ff0c1) [Dunoon edition] Compiled with GNAT Version: 9.3.0. GCC back-end code generator)
I have lauched all tests inside the Note: procedure decode_bit_vector(constant code : in code_t; variable index : inout code_index_t; variable result : out bit_vector) is
variable ret_val : bit_array(result'range);
begin
decode_bit_array(code, index, ret_val);
result := bit_vector(ret_val); -- ERROR 'bound check failure' with GDHL but not with modelsim
end procedure; This code has been transformed into: procedure decode_bit_vector(constant code : in code_t; variable index : inout code_index_t; variable result : out bit_vector) is
variable ret_val : bit_array(result'range);
begin
if ret_val'length /= 0 then -- GHDL work-around
decode_bit_array(code, index, ret_val);
result := bit_vector(ret_val);
else
result := "";
end if;
end procedure; LRM: VHDL-2008: '5.2.1 General'
So I guess this is a GHDL mistake. I will link the corresponding issue from the GHDL repo when created. |
@LarsAsplund ? |
------------------------------------------------------------------------------- | ||
-- Package declaration | ||
------------------------------------------------------------------------------- | ||
package common_pkg is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be renamed to data_types_common_pkg
since we can have common packages at various levels that will be compiled into the common vunit_lib
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll change that
end function; | ||
|
||
|
||
function get_simulator_integer_width return positive is -- TBC Is this function written well enough ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tested this with a simulator using 64-bit integers? I tested with Riviera-PRO 2022.04 in VHDL-2019 mode but it is still using 32-bit integers.
From a standard point of view integers are implementation defined. Not being 32 bits doesn't necessary mean it is 64 bits. However, this function is only used to allocate sufficient number of bits. Maybe rename the function and the related constant to show that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not tested with a 64-bits as I do not have one I believe. questasim/22.1.0
may be 64bits, I need to check that.
Would min_integer_width
be ok ? (I am not convinced)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description of such a function would be that it returns the number of bits to allocate for an integer so maybe get_num_of_bits_to_allocate_for_integer
. I don't mind long names if that's needed to describe the purpose.
end function; | ||
|
||
|
||
function get_simulator_real_width return positive is -- TBC Is this function written well enough ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment for this and following functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would min_real_width
be ok ? (I am not convinced)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_num_of_bits_to_allocate_for_real
@@ -6,197 +6,231 @@ | |||
-- | |||
-- Copyright (c) 2014-2022, Lars Asplund lars.anders.asplund@gmail.com | |||
|
|||
library std; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Superfluous library clause. Access to library std is implicit. General remark for all files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove them.
|
||
library work; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Superfluous library clause. Access to library work is implicit. General remark for all files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove them.
constant code : string) | ||
return character; | ||
|
||
function encode_boolean(data : boolean) return code_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is fixed in a later commit but these type-specific functions have no bodies. The bodies are for the ocerloaded encode/decode functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried to segment the commits so that it is easier for you to review. Yes the body will come later.
Not sure what you mean by:
The bodies are for the ocerloaded encode/decode functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that the function declaration is for encode_boolean
and then there is an alias for encode
. That means that the function definition has to be for encode_boolean
and not encode
as it is now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I changed that. When developers work on the codec library (or user wants to now more about it), all they can actually see is a list of encode/decode functions/procedures. It is much more convenient to have the function definition for encode_boolean
has it is easier t search for than encode
.
Yes, the function definition has to be for encode_boolean and not encode as it is now as it is for the commit you must be currently reviewing but not the last one. Sorry if that's not convenient to you, I'll try to have better commits next time.
-- of that feature. We first encode/decode the real part, then the imaginary part. | ||
|
||
-- Predefined enumerated types | ||
procedure encode_boolean(constant data : in boolean; variable index : inout code_index_t; variable code : inout code_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to previous code. Function bodies are for the overloaded functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous
end function; | ||
|
||
----------------------------------------------------------------------------- | ||
-- String |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bit_array
----------------------------------------------------------------------------- | ||
-- A string can directly be stored into the encoded string | ||
-- We do not encode the range | ||
function code_length_raw_string(length : natural) return natural is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code length functions taking vector length as input assumes that the code length for every element in the vector is the same. At the same type the code length function for the element type takes the element as input which opens up for non-constant codes. This becomes a bit inconsistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not understood.
The code length functions taking vector length as input assumes that the code length for every element in the vector is the same.
The code length of any elements which is not an array or a record (a composite type) is constant. The code length of a composite type is variable depending on :
- the length of the array and the base element of the array and whether or not we encode the range of the vector,
- the elements constituting the record.
At the same type the code length function for the element type takes the element as input which opens up for non-constant codes.
No, a code length of a non-composite type returns a constant regardless of the element given:
-- Non-composite type
signal some_bool : boolean;
constant LENGTH1 : string := code_length_boolean;
constant LENGTH2 : string := code_length_boolean(some_bool);
constant LENGTH3 : string := code_length(some_bool);
-- Note that LENGTH1 = LENGTH2 = LENGTH3
These functions take the boolean in parameter but this is optional. It was done this way to have the same way of writing between a composite and a non-composite type.
-- Composite type
signal some_vec : std_ulogic(7 downto 0);
constant LENGTH1 : string := code_length_std_ulogic(some_vec'length);
constant LENGTH2 : string := code_length_std_ulogic(some_vec); -- Same writing as above
constant LENGTH3 : string := code_length(some_vec); -- Same writing as above
-- Note that LENGTH1 = LENGTH2 = LENGTH3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed the public constant and only saw:
function code_length_std_ulogic(data : std_ulogic := std_ulogic'left) return natural
This opens up for the possibility of having a code length that depends on the data. This is not the case today but I'm thinking that it might be the case in the future. I've seen examples where many encoded data structures may lead to stack overflow. A possible solution to this would be to create a more compact encoding scheme. For example, rather than characters as building blocks for encoding we might find it more efficient to use bits and have variable length encoding, for example 0
and 1
are more common than L
and H
so they would be given a shorter code word. With a variable length encoding the code length for a std_ulogic_vector
can't be calculated from the length of the array, only from the data.
All this is a bit speculative of course but the fewer dependencies to the constant length assumption we have the easier it would be for such a future update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll remove the function which only take the length into account.
I guess that I'll also remove the default values for functions like code_length_boolean
or code_length_std_ulogic
-- * a sign (encoded as boolean) | ||
-- * a mantisse (encoded as 2 integers) | ||
-- * an exponent (encoded as 1 integer) | ||
constant static_code_length_real : positive := code_length_boolean + 3 * code_length_integer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be a function of the simulator_real_width
constant? This is a function of simulator_integer_width
. In practice this is probably no issue if 64-bit integers and 64-bit reals come in pair.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good remark. I'll look into that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After 8b8c2db I can run all tests in ModelSim but GHDL fails compilation of codec_builder without error message. Riviera-PRO and Active-HDL compile (after changing the non-positive code_t range and adding the in mode) but they fail simulation with an internal error which I can't decrypt. I will have to do some bisecting of the changes to find the root cause. I guess you can't get hold of Riviera-PRO and Active-HDL but I recommend testing with free GHDL
end function; | ||
|
||
|
||
function get_simulator_integer_width return positive is -- TBC Is this function written well enough ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description of such a function would be that it returns the number of bits to allocate for an integer so maybe get_num_of_bits_to_allocate_for_integer
. I don't mind long names if that's needed to describe the purpose.
end function; | ||
|
||
|
||
function get_simulator_real_width return positive is -- TBC Is this function written well enough ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_num_of_bits_to_allocate_for_real
constant code : string) | ||
return character; | ||
|
||
function encode_boolean(data : boolean) return code_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that the function declaration is for encode_boolean
and then there is an alias for encode
. That means that the function definition has to be for encode_boolean
and not encode
as it is now.
-- of that feature. We first encode/decode the real part, then the imaginary part. | ||
|
||
-- Predefined enumerated types | ||
procedure encode_boolean(constant data : in boolean; variable index : inout code_index_t; variable code : inout code_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous
----------------------------------------------------------------------------- | ||
-- A string can directly be stored into the encoded string | ||
-- We do not encode the range | ||
function code_length_raw_string(length : natural) return natural is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed the public constant and only saw:
function code_length_std_ulogic(data : std_ulogic := std_ulogic'left) return natural
This opens up for the possibility of having a code length that depends on the data. This is not the case today but I'm thinking that it might be the case in the future. I've seen examples where many encoded data structures may lead to stack overflow. A possible solution to this would be to create a more compact encoding scheme. For example, rather than characters as building blocks for encoding we might find it more efficient to use bits and have variable length encoding, for example 0
and 1
are more common than L
and H
so they would be given a shorter code word. With a variable length encoding the code length for a std_ulogic_vector
can't be calculated from the length of the array, only from the data.
All this is a bit speculative of course but the fewer dependencies to the constant length assumption we have the easier it would be for such a future update.
|
||
-- Deprecated. Maintained for backward compatibility. | ||
function to_byte_array(value : bit_vector) return code_t is | ||
variable ret_val : code_t(code_length_bit_array(bit_array(value))-1 downto 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code_t
has a positive range so downto 0 is not legal. Riviera-PRO complains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch
constant range_right2 : string := ""; | ||
constant is_ascending2 : string := "T") | ||
return string is | ||
constant range_left1 : in code_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function declaration has the mode in
explicitly defined on all constants. Riviera-PRO doesn't like that there is a difference between declaration and definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can see, there are no difference between the function declaration and definition of encode_array_header
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In one place it is:
constant range_left1 : code_t;
and in the other:
constant range_left1 : in code_t;
These are the same from a VHDL point of view since in
is the default for a constant. However, Riviera-PRO will give a compile error if they are not literally the same.
|
||
function decode_raw_string(code : code_t; length : positive) return string is | ||
variable ret_val : string(length-1 downto 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
string
must have a positive range
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, I'll also add tests for this one.
The commits 4e7fd60 and 7515a1f update the sources so that it compiles with GHDL (GHDL 3.0.0-dev (v2.0.0-101-g791ff0c1)). There is a comment about that earlier: #822 (comment). I tested with Modelsim the commit 8b8c2db but no other tool. And I tested with Modelsim and GHDL starting commit e2de513 (see comment #822 (comment)). |
--=========================================================================== | ||
|
||
function encode_boolean_vector(data : boolean_vector) return code_t; | ||
function decode_boolean_vector(code : string) return boolean_vector; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The declarations of the decode functions take string
but the definition takes code_t
. Riviera-PRO doesn't allow this although they are the same type.
|
||
function get_simulator_real_width return positive is -- TBC Is this function written well enough ? | ||
begin | ||
if real'high = 1.0E308 then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Max for single-precision floats is 3.4028235e38 and for float 1.7976931348623158e308. This will always be false and return 64. In general I find that the high/low values have a tendency to be unexpected. I recall working around such bugs in VUnit in the past. I would recommend that we do an inequality with some threshold that is chosen with some margin such that we allocate sufficient number of bits. Or we hardcode these values to always be 64.
Linked with issue #818
The proposed modifications on the
codec
package are retro-compatible. Thetb_codec.vhd
has not been modified to underline this. The tb of thecom
packagetb_com_codec.vhd
has not been touched either.The tests compile and pass for
data_type
andcom
package with Modelsim 2020.1. I will try with GHDL soon.Here is a list of changes (non-exhaustive):
function
and aliasnames
to ease readability, maintainbility and easectrl+F
:encode_array_header
andget_range
are replaced (these names are not consistent with the other functions):function encode_range(range_left : integer; range_right : integer; is_ascending : boolean) return code_vec_t;
function decode_range(code : code_vec_t) return range_t;
encode_array_header
andget_range
are deprecated but preserved for backward compatibility.encode_array_header
andget_range
take the encoded string ofrange_left
,range_right
andis_ascending
whileencode_range
anddecode_range
directly take integers and a boolean.encode_array_header
can encode two ranges but it was never used for that purpose (anywhere in the VUnit repo). Theencode_range
can only encode a single range.ieee.numeric_std.unsigned
byieee.numeric_std.unresolved_unsigned
ieee.numeric_std.signed
byieee.numeric_std.unresolved_signed
log2
function used to encode thereal
type by using theieee.math_real
package.The same idea could be applied to
string
, but I did not see the use case so it was not implemented.scalar
, anenumerated type
or arecord type
, there is a function namedfunction code_length_**type_name** return natural
which indicate how many characters are needed to encode a value of the selected type.array type
, there is a function namedfunction code_length_**type_name**(length : natural) return natural;
which indicate how many character are needed to encode a value of the selected type.std_ulogic_array
is similar to thestd_ulogic_vector
but with an integer range. However, it was not the case for bit_vector. There is now a typebit_array
for that purpose.to_byte_array
andfrom_byte_array
are replaced (these names are not consistent with the other functions and the introduction ofbit_array
required a change ofbit_vector
tobit_array
):function encode_raw_bit_array(data : bit_array) return code_t;
function decode_raw_bit_array(code : code_t) return bit_array;
to_byte_array
andfrom_byte_array
are deprecated but preserved for backward compatibility.common_pkg
which contains constants/fonction used by thecodec
package but not specific to it like:codec.vhd
andcodec-2008p.vhd
files.