
Knowing the amount of bits that will be written #11

Closed
cBournhonesque opened this issue Sep 17, 2023 · 6 comments

Comments

@cBournhonesque

Hi,

I'd like to use bitcode for game networking, and it would be useful to have a function that reports how many bits/bytes a structure would take if it were encoded, without doing the actual encoding (so that I know which packet the encoded data will fit in).

Something similar to https://docs.rs/bincode/latest/bincode/fn.serialized_size.html

@finnbear (Member) commented Sep 17, 2023

Unlike bincode, bitcode doesn't support serializing into a mutable packet structure or stream because performance would suffer from lack of alignment/wide-integer instructions. bitcode only serializes into Vec<u8> (via allocation) or &[u8] (via &mut bitcode::Buffer).

As a result, the minimal-allocation method is to reuse a bitcode::Buffer (or pool of them) and copy from the resulting &[u8] into your packet, at which point you know the number of bytes from <&[u8]>::len().
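The flow described above can be sketched in plain Rust. This is a minimal sketch, not bitcode's API: `fake_serialize`, `Packet`, and `UDP_PACKET_SIZE` are invented names, with `fake_serialize` standing in for reusing a `bitcode::Buffer` (which yields a `&[u8]` of the encoding). The point is that the byte count only becomes available after encoding, via `<&[u8]>::len()`.

```rust
// Placeholder for serializing via a reused `bitcode::Buffer`.
fn fake_serialize(value: u32) -> Vec<u8> {
    value.to_le_bytes().to_vec()
}

const UDP_PACKET_SIZE: usize = 1400; // typical UDP payload budget

struct Packet {
    bytes: [u8; UDP_PACKET_SIZE],
    len: usize,
}

impl Packet {
    fn new() -> Self {
        Self { bytes: [0; UDP_PACKET_SIZE], len: 0 }
    }

    /// Copies `encoded` into the packet if it fits, returning whether it did.
    /// The length check uses `encoded.len()`, known only after encoding.
    fn try_push(&mut self, encoded: &[u8]) -> bool {
        if self.len + encoded.len() > UDP_PACKET_SIZE {
            return false;
        }
        self.bytes[self.len..self.len + encoded.len()].copy_from_slice(encoded);
        self.len += encoded.len();
        true
    }
}

fn main() {
    let mut packet = Packet::new();
    let encoded = fake_serialize(42);
    assert!(packet.try_push(&encoded));
    println!("packet now holds {} bytes", packet.len);
}
```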

Feel free to give other/more specific reasons to implement this functionality, e.g. a code example, taking into account the above limitations.

@cBournhonesque (Author)

I'm not sure I fully understood your comment; what I meant was a trait like this: https://github.com/naia-lib/naia/blob/main/shared/serde/src/serde.rs#L4

Where there could be an additional function that simply returns the number of bytes the struct/enum will serialize into, without doing the actual serialization. For example, via these kinds of implementations: https://github.com/naia-lib/naia/blob/main/shared/serde/src/impls/string.rs#L28
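For concreteness, a hypothetical sketch of the kind of trait being described (the trait name, the fixed-width sizes, and the `u32` length prefix are all invented for illustration, loosely mirroring naia's approach of computing a size without serializing):

```rust
/// Hypothetical trait: report the encoded size without encoding.
trait SerializedSize {
    /// Number of bytes this value would occupy when serialized.
    fn serialized_size(&self) -> usize;
}

impl SerializedSize for u32 {
    fn serialized_size(&self) -> usize {
        4 // fixed-width encoding assumed for illustration
    }
}

impl SerializedSize for String {
    fn serialized_size(&self) -> usize {
        // assumed u32 length prefix + the UTF-8 bytes themselves
        4 + self.len()
    }
}

fn main() {
    let s = String::from("hello");
    assert_eq!(s.serialized_size(), 9);
    assert_eq!(7u32.serialized_size(), 4);
}
```

Note that a real bitcode implementation would have to mirror its variable-width bit packing exactly, which is part of why such a function is nontrivial to maintain.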

@finnbear (Member) commented Sep 19, 2023

> For example, via these kinds of implementations: https://github.com/naia-lib/naia/blob/main/shared/serde/src/impls/string.rs#L28

Thanks for providing a code example! It looks like you are using the bit length to decide whether to serialize the message at all, which could legitimately benefit from the functionality.

(Edit: FWIW, I tried implementing the desired functionality on the predict_len branch).

@caibear (Member) commented Sep 19, 2023

I avoided adding something similar to bincode::serialized_size since I've noticed lots of people misuse it to allocate buffers with capacity as an optimization. This usually results in half the performance and double the binary size for everything but the most trivial structures (see bincode-org/bincode#401).

> it would be useful to have a function to know how many bits/bytes a structure would take if it were encoded, but without doing the actual encoding (so that i know in which packet i can put the encoded data).

I would advise serializing each structure to a Vec<u8> with bitcode::encode and then appending as many as possible to another Vec<u8>, each with a length prefix such as a u16 or u32. The length prefix is required so you can pass a &[u8] of the original structure length to bitcode::decode.

While copying the bytes isn't ideal, it should be much faster than something like serialized_size.
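The length-prefixed packing suggested above can be sketched as follows. This is an illustrative sketch, not bitcode code: `encode` here is a stand-in for `bitcode::encode`, and the `u16` prefix and `MAX_PACKET` size are assumptions.

```rust
// Stand-in for `bitcode::encode`, which returns the encoded bytes.
fn encode(value: u32) -> Vec<u8> {
    value.to_le_bytes().to_vec()
}

const MAX_PACKET: usize = 1400; // assumed packet budget

/// Appends `encoded` to `packet` with a u16 length prefix if it fits.
fn append_with_prefix(packet: &mut Vec<u8>, encoded: &[u8]) -> bool {
    let needed = 2 + encoded.len();
    if packet.len() + needed > MAX_PACKET {
        return false;
    }
    packet.extend_from_slice(&(encoded.len() as u16).to_le_bytes());
    packet.extend_from_slice(encoded);
    true
}

/// Splits the next length-prefixed message off the front of `packet`,
/// returning (message, rest). The message slice is what you would pass
/// to the decoder, since it has the original structure's exact length.
fn split_next(packet: &[u8]) -> Option<(&[u8], &[u8])> {
    if packet.len() < 2 {
        return None;
    }
    let len = u16::from_le_bytes([packet[0], packet[1]]) as usize;
    let rest = &packet[2..];
    if rest.len() < len {
        return None;
    }
    Some(rest.split_at(len))
}

fn main() {
    let mut packet = Vec::new();
    assert!(append_with_prefix(&mut packet, &encode(1)));
    assert!(append_with_prefix(&mut packet, &encode(2)));
    let (first, rest) = split_next(&packet).unwrap();
    assert_eq!(first, encode(1).as_slice());
    let (second, _) = split_next(rest).unwrap();
    assert_eq!(second, encode(2).as_slice());
}
```

The prefix lets the receiver recover each structure's byte range without any out-of-band size information, at the cost of one copy per message.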

@finnbear (Member) commented Sep 19, 2023

@caibear brings up some good points against implementing this and a possible alternative for your code.

Here is one more possible alternative for you, in the form of code that you can drop into your project:

    use std::cell::RefCell;
    use serde::Serialize;
    use bitcode::{Encode, Buffer, Error};

    // for serde::Serialize
    fn serialize_len<T: Serialize + ?Sized>(t: &T) -> Result<usize, Error> {
        thread_local! {
            // Lazily-initialized, reusable buffer (one per thread).
            static BUFFER: RefCell<Option<Buffer>> = RefCell::new(None);
        }

        BUFFER.with(|buffer| {
            let mut buffer = buffer.borrow_mut();
            let buffer = buffer.get_or_insert_with(Buffer::default);
            buffer.serialize(t).map(|bytes| bytes.len())
        })
    }

    // for bitcode::Encode
    fn encode_len<T: Encode + ?Sized>(t: &T) -> Result<usize, Error> {
        thread_local! {
            static BUFFER: RefCell<Option<Buffer>> = RefCell::new(None);
        }

        BUFFER.with(|buffer| {
            let mut buffer = buffer.borrow_mut();
            let buffer = buffer.get_or_insert_with(Buffer::default);
            buffer.encode(t).map(|bytes| bytes.len())
        })
    }

Use these as a last resort if you can't refactor your code as suggested by @caibear. By reusing the Buffer, they avoid repeated memory allocations. They don't require additional codegen and won't be significantly slower than my predict_len changes mentioned above.

@cBournhonesque (Author)

Thank you!
In general I'll be encoding everything into a buffer of size UDP_PACKET_SIZE (around 1400 bytes), so I wouldn't be using this to optimize allocations.
Both options that you provided make sense to me.

finnbear closed this as not planned on Sep 20, 2023