Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling of corrupted frames in libFLAC #464

Open
melchor629 opened this issue Sep 24, 2022 · 7 comments
Open

Handling of corrupted frames in libFLAC #464

melchor629 opened this issue Sep 24, 2022 · 7 comments

Comments

@melchor629
Copy link

Hello flac team!! A user reopened an issue on my repo (a node bindings to libFLAC) pointing out that there is like half a second of silence when decoding certain flac file (included in the issue). libFLAC version in use are 1.3.4 (user's) and 1.4.0/1.4.1 (mine).

First investigation

The first thing I do is to check the file and effectively it has some kind of corruption in it. ffmpeg can decode the file fine without complains, but libFLAC inserts some silence just in the corrupted frame. Doing an analysis with flac -a ... to that file I get the following errors:

flac 1.4.0
Copyright (C) 2000-2009  Josh Coalson, 2011-2022  Xiph.Org Foundation
flac comes with ABSOLUTELY NO WARRANTY.  This is free software, and you are
welcome to redistribute it under certain conditions.  Type `flac' for details.

01. Fantastic Departure!.flac: *** Got error code 2:FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH
*** Got error code 2:FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH
*** Got error code 0:FLAC__STREAM_DECODER_ERROR_STATUS_LOST_SYNC
*** Got error code 1:FLAC__STREAM_DECODER_ERROR_STATUS_BAD_HEADER
*** Got error code 0:FLAC__STREAM_DECODER_ERROR_STATUS_LOST_SYNC
*** Got error code 1:FLAC__STREAM_DECODER_ERROR_STATUS_BAD_HEADER
*** Got error code 0:FLAC__STREAM_DECODER_ERROR_STATUS_LOST_SYNC
ERROR, MD5 signature mismatch

Debugging code!

So I setup my repo to debug some C/C++ code and run a test file to decode the file using the stream decoder using a file. Put a breakpoint here https://github.com/xiph/flac/blob/master/src/libFLAC/stream_decoder.c#L2118 and wait to trigger the error. And it hits twice (one per channel). The two hits of the breakpoint gives me the following values:

channel i decoder->private_->output[channel][i]
0 4050 54151
1 4050 -36808

NOTE: did not check read_subframe_, it is dense and I suppose it is correct.

After this, it goes into here

if(!FLAC__bitreader_rewind_to_after_last_seen_framesync(decoder->private_->input)){
in which it rewinds back to the start of the frame (well, the "start"). The decoder->private_->last_seen_framesync is 8392003 but the start of the frame is at 8392001. From the output of the analysis command (the corrupted frame):

frame=634	offset=8392001	bits=199496	blocksize=4096	sample_rate=44100	channels=2	channel_assignment=MID_SIDE
	subframe=0	wasted_bits=0	type=CONSTANT	value=0
	subframe=1	wasted_bits=0	type=CONSTANT	value=0

Not sure if the last_seen_framesync should be two bytes after the start of the frame, but in case this is wrong this line might be the issue (reads the position after reading two bytes of the frame):

if(!FLAC__stream_decoder_get_decode_position(decoder, &decoder->private_->last_seen_framesync))

Okey then, we've got a corrupted frame, the decoder then tries to rewind but it does not goes exactly at the start of the frame. Then when trying to find a frame from the FLAC__stream_decoder_process_single, it gets lost. When finds the next frame, it inserts silence for the whole corrupted frame (as stated here).

Trying something...

If I try to "fix" the last_seen_framesync by pointing at the start of the frame, the FLAC__stream_decoder_process_single will get stuck in an infinite loop because:

  1. looks for a frame
  2. process the frame but it is corrupted and rollbacks to the start of the frame
  3. go to 1

And it seems there is no way to stop the loop and tell the frame is corrupted (apart from the FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH error).

Recap

  • The last_seen_framesync might be pointing at +2 bytes offset from the start of the frame
  • The frame is corrupted and tries to rewind (see above)
  • The rewind does not point to the start of the frame but close (lost sync & bad header errors)
  • If last_seen_framesync points to the start of the frame, an infinite loop is caused

My opinion

I would like to clarify I am not completely sure about the internals of libFLAC, so I might get something wrong. In any case, I would like to open some discussion around the issues described above.

The first thing I see is because there is some corrupted data (samples outside the bits per sample range), the decoder emits FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH. I think this error is a bit confusing and it does not exactly reflect the real of the error. Or I just get this completely wrong. The description of the error says The frame's data did not match the CRC in the footer. but in this case that's not what happens. CRCs match but the data is not valid for the declared bits per sample.

Next thing is the FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH error (or whatever error is sent in that scenario). In my opinion, should be sent only once. It is not necessary, and maybe reduces some CPU time by stopping just when the error is detected - and not checking the rest of the channels.

And last, as a developer, I would like to get the corrupted frame anyway (either by this case or a real CRC mismatch) and do whatever I would like to do with that. Like maybe try to fix somehow the samples or whatever. I already know there is something wrong with that frame (because of the FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH). Maybe as an optional flag that could be set to send the frame instead of inserting silence... With a FLAC__stream_decoder_set_XXX()...

If there is anything not clear or need more information, I would be glad to help you. As well as I am open to discuss everything I put in this issue :)

@ktmf01
Copy link
Collaborator

ktmf01 commented Sep 24, 2022

Not sure if the last_seen_framesync should be two bytes after the start of the frame, but in case this is wrong this line might be the issue (reads the position after reading two bytes of the frame):

Yes, this is correct. If you rewind to the last seen framesync you get an infinite loop as you noticed. Previously, in FLAC 1.3.4 and before, libFLAC would start looking for a new frame directly after detecting an error. However, this could result in it skipping a valid frame as well, if it overread the previous frame. That's why it rewinds to just after the start of the corrupt frame for a valid one.

The first thing I see is because there is some corrupted data (samples outside the bits per sample range), the decoder emits FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH. I think this error is a bit confusing and it does not exactly reflect the real of the error. Or I just get this completely wrong. The description of the error says The frame's data did not match the CRC in the footer. but in this case that's not what happens. CRCs match but the data is not valid for the declared bits per sample.

Yes, perhaps the error isn't really clear. It could be replaced with FLAC__STREAM_DECODER_ERROR_STATUS_LOST_SYNC which is more generic. I am hesitant to add another error to that enum for this.

Next thing is the FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH error (or whatever error is sent in that scenario). In my opinion, should be sent only once. It is not necessary, and maybe reduces some CPU time by stopping just when the error is detected - and not checking the rest of the channels.

I agree that could be improved.

And last, as a developer, I would like to get the corrupted frame anyway (either by this case or a real CRC mismatch) and do whatever I would like to do with that. Like maybe try to fix somehow the samples or whatever. I already know there is something wrong with that frame (because of the FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH). Maybe as an optional flag that could be set to send the frame instead of inserting silence... With a FLAC__stream_decoder_set_XXX()...

I'm not sure this is a good idea. FLAC is a lossless codec, and error concealment does not fit with that. In case of an error I'd like it to be loud and clear.

Furthermore, here the stream contains samples that do not fit within the bit per sample of the stream. If further processing is done on these, that could result in undefined behaviour.

@ktmf01
Copy link
Collaborator

ktmf01 commented Sep 24, 2022

Oh, and by the way, thanks for reporting!

@melchor629
Copy link
Author

Yes, this is correct. If you rewind to the last seen framesync you get an infinite loop as you noticed. Previously, in FLAC 1.3.4 and before, libFLAC would start looking for a new frame directly after detecting an error. However, this could result in it skipping a valid frame as well, if it overread the previous frame. That's why it rewinds to just after the start of the corrupt frame for a valid one.

Completely understandable. Thanks for the clarification.

Yes, perhaps the error isn't really clear. It could be replaced with FLAC__STREAM_DECODER_ERROR_STATUS_LOST_SYNC which is more generic. I am hesitant to add another error to that enum for this.

Maybe, would be nice to know what exactly happens without having to debug for that specific case. I understand the hesitation about adding a new error to the enum, but currently it is a bit difficult to know what the error is (in this scenario). Anyway, this is a really strange scenario, so I'm happy to change the error by the one suggested FLAC__STREAM_DECODER_ERROR_STATUS_LOST_SYNC or keep the one that has now (maybe I prefer this option). That's fine. In any case, would be nice to add this scenario to the documentation.

Furthermore, here the stream contains samples that do not fit within the bit per sample of the stream. If further processing is done on these, that could result in undefined behaviour.

I know, that's why I suggested to add this behind a flag in the decoder. Client must explicitly ask for this. For what I can see in code, if the frame is correct, it should go straight to this block of code

*got_a_frame = true;

libFLAC here does nothing with the data (except in on case I don't quite understand) and passes it to the client code. Now is client's code responsibility to handle that frame properly and avoid the undefined behaviour. That's why this option should be explicitly enabled. I really would like to have that frame and tackle it on my side.

For the mentioned case:

if(delta > 0) {

It seems here that if the bits per sample defined originally does not match the one in the frame (and frame's is smaller), then it adds that difference into the samples (?). The comment says shift out the samples before target_sample but adds instead of shifts (?). Anyway, here it is doing some processing and could cause more undefined behaviour if that corrupted frame is sent to the client.

On the other hand, it does not seem it should affect the decoder state (I might be wrong here!). The frame has been already decoded, and the operations are on the raw samples, sum that with the error sent to client, it should know that this frame has something wrong.

I would really have that option, but if that is not acceptable, I will understand.

Oh, and by the way, thanks for reporting!

Thanks for the fast answer!!

@ktmf01
Copy link
Collaborator

ktmf01 commented Sep 24, 2022

Furthermore, here the stream contains samples that do not fit within the bit per sample of the stream. If further processing is done on these, that could result in undefined behaviour.

I know, that's why I suggested to add this behind a flag in the decoder. Client must explicitly ask for this.

If such a thing is added, an interface would need to be provided that gives the client as much information as possible: what happened and which sample is the problem (if that is known). Otherwise you won't know what happened anyway.

For the mentioned case:

if(delta > 0) {

It seems here that if the bits per sample defined originally does not match the one in the frame (and frame's is smaller), then it adds that difference into the samples (?). The comment says shift out the samples before target_sample but adds instead of shifts (?). Anyway, here it is doing some processing and could cause more undefined behaviour if that corrupted frame is sent to the client.

That code is used when seeking. It throws out delta samples if the sample that has been requested to seek to is halfway a frame. In that case, the first half of the frame must be thrown out. It is the difference between frame-accurate and sample-accurate seeking.

I would really have that option, but if that is not acceptable, I will understand.

I'll keep this issue open, but I won't make any promises.

@melchor629
Copy link
Author

If such a thing is added, an interface would need to be provided that gives the client as much information as possible: what happened and which sample is the problem (if that is known). Otherwise you won't know what happened anyway.

Could be a good idea, in terms of error handling for such scenario. If it is possible to add this new interface to the decoder, would be nice.

That code is used when seeking. It throws out delta samples if the sample that has been requested to seek to is halfway a frame. In that case, the first half of the frame must be thrown out. It is the difference between frame-accurate and sample-accurate seeking.

Understood, thanks for explaining to me!!

I'll keep this issue open, but I won't make any promises.

Understandable. I will keep an eye of this issue, and if any help is required, I will be glad to.

One thing I just though, should I create a separated issue for this?

Next thing is the FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH error (or whatever error is sent in that scenario). In my opinion, should be sent only once. It is not necessary, and maybe reduces some CPU time by stopping just when the error is detected - and not checking the rest of the channels.

@ktmf01
Copy link
Collaborator

ktmf01 commented Sep 24, 2022

One thing I just though, should I create a separated issue for this?

Next thing is the FLAC__STREAM_DECODER_ERROR_STATUS_FRAME_CRC_MISMATCH error (or whatever error is sent in that scenario). In my opinion, should be sent only once. It is not necessary, and maybe reduces some CPU time by stopping just when the error is detected - and not checking the rest of the channels.

No, I don't think it is necessary to open another issue. FLAC decoding is already extremely fast, so it doesn't seem pressing in any way.

@ktmf01
Copy link
Collaborator

ktmf01 commented Apr 12, 2023

Just as a note to self: to improve troubleshooting, maybe it is a good idea to add extra FLAC__Frame subframe types. Maybe something like FLAC__SUBFRAME_TYPE_OOB (out-of-bounds) and FLAC__SUBFRAME_TYPE_FILLER (for filling gaps between two frames when the frame in between was unreadable/unparsable). Then we have a simple way to pass data, so the client can do with it whatever it wants.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants