
Here are some ideas #36

Open
AlfredoSequeida opened this issue Feb 19, 2021 · 34 comments

@AlfredoSequeida
Owner

AlfredoSequeida commented Feb 19, 2021

Hi everyone, I've been busy, which is why I haven't been able to check in as often. So first of all, I want to thank everyone for their contributions, and a special thanks to @Theelgirl and @dobrosketchkun since I know you two have been putting a lot of work into the program. Your work does not go unnoticed.

I have some ideas I want to share and get some feedback on.

  1. Increasing storage capacity.
    Originally, the program was made with the simple thought that 1-bit (black and white) pixels would allow the program to be more efficient at data retrieval in the face of compression algorithms. But of course, using a single pixel to represent a bit leaves a lot to be desired as far as maximizing storage goes. I think we can keep the same logic by adding the option for another type of encoding (in addition to 1-bit color) that uses colors to represent a set of 2 bits per pixel, thus doubling the storage capacity while still keeping the colors simple enough to guard against compression. Here is the math/logic:

Using the RGB color spectrum, the simplest colors are red, green, and blue. This means that they are easy to distinguish from one another even if compression changes them a bit.

With that in mind, using binary numbers we can double storage by storing 2 bits in a single color, giving us 2^2 or 4 different combinations. Then we can assign a color to each of three combinations and use black (or white) for the remaining one:

00: Black
10: Red
01: Green
11: Blue

Then as far as decoding goes, the logic would be the same as for black and white: we would check which color the value is closest to and assume that color:

For example, if the pixel is (255,12,30), then the color must be Red (bin: 10), since the pixel contains more red than anything else.

I haven't done the research yet, but we might also be able to take advantage of an alpha channel using the RGBA color spectrum, but I would assume that A might not be as easy to guard against compression.
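To make the proposed 2-bits-per-pixel scheme concrete, here is a minimal sketch (hypothetical helper names, not fvid's actual code) of encoding a bit pair as a palette color and decoding by nearest color:

```python
# Hypothetical sketch of the proposed 2-bits-per-pixel scheme.
PALETTE = {
    (0, 0): (0, 0, 0),      # 00 -> black
    (1, 0): (255, 0, 0),    # 10 -> red
    (0, 1): (0, 255, 0),    # 01 -> green
    (1, 1): (0, 0, 255),    # 11 -> blue
}

def encode_pair(bits):
    """Map a 2-bit tuple to an RGB pixel."""
    return PALETTE[bits]

def decode_pixel(pixel):
    """Recover 2 bits by choosing the palette color nearest to the
    (possibly compression-damaged) pixel, by squared distance in RGB."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    bits, _ = min(PALETTE.items(), key=lambda kv: sq_dist(kv[1], pixel))
    return bits

# The example from the text: (255, 12, 30) is still closest to red -> 10.
print(decode_pixel((255, 12, 30)))  # (1, 0)
```

The same nearest-color rule generalizes to any palette, so an RGBA variant would only need a fourth channel in the distance computation.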

  2. Adding some sort of error checking/correction algorithm
    Even though the current implementation of fvid seems to work the times I have tried it, it's not perfect, and there have been some reports of it not working with certain files. Because of this, I think we should look into adding some error checking or correction. A simple implementation might be a parity check, where we add an extra bit (or set of bits) for every byte (8 bits) indicating whether the count of 1s is odd or even. However, that doesn't fix the data; it only tells us that the data is wrong, and that assumes the data was decoded correctly, which are probably too many assumptions for it to be a good solution. So I am open to hearing any suggestions on this.

  3. GUI
    Recently, the program has been getting more attention (because of a TikTok I made, lol), and I have seen more requests for a GUI. I know @dobrosketchkun made a GUI for the program, but I have not seen that implemented yet. I have not made a GUI for a Python program before, so I was doing some research and was thinking of building one using PyQt or Kivy, but since @dobrosketchkun already did some work using Tkinter, I'd rather their work not go to waste. In addition, I think it would be cool to make this optional, so maybe during the install process or with a different package.

  4. Changing the License
    MIT was the original option since that is the most open license I know, and I like the idea of sharing the source code and allowing people to do whatever they want with the program, but I don't want the work contributed by others to be "taken advantage of", for lack of a better phrase. With the amount of time contributors have put into their work, I would like the program and any copies to remain open source. So I propose we change the license to GPL v3, but of course I am open to suggestions.
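Circling back to the error-correction item: the parity-bit idea mentioned there could look like this minimal sketch (hypothetical helper names; it detects a single flipped bit per byte but cannot locate or repair it):

```python
def add_parity(byte):
    """Append an even-parity bit so the 9-bit word has an even number of 1s."""
    parity = bin(byte).count("1") % 2
    return (byte << 1) | parity

def parity_ok(word):
    """True if the word still has even parity (no single-bit error detected)."""
    return bin(word).count("1") % 2 == 0

word = add_parity(0b10110000)   # three 1s -> parity bit is 1
print(parity_ok(word))          # True: data looks intact
print(parity_ok(word ^ 0b10))   # False: one flipped bit is detected
```

Note that two flipped bits cancel out, which is part of why parity alone is a weak guarantee here.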

Everything here is just a suggestion, I want to hear what others have to say.

@Theelx
Collaborator

Theelx commented Feb 19, 2021

  1. I think that's a great idea to find a way to increase storage capacity! However, I think that we can already represent 4 combinations with 2 pixels:
00: Black-Black
01: Black-White
10: White-Black
11: White-White
  2. That could work. I'm not the best in that area, but I'm sure it's possible.
  3. I'd personally like to build on the existing GUI from dobrosketchkun, as I'm not an experienced GUI maker with other tools.
  4. I personally couldn't care less if people stole my portions of the code, so I propose the Unlicense. I've used it for my other projects, and it's basically the most permissive license you can get. Let's wait for dobrosketchkun's advice first, though. https://unlicense.org/

Edit: Revised what I said in parts 1 and 4.

Theelx pinned this issue Feb 19, 2021
@dobrosketchkun
Contributor

  1. Nice idea. I thought about it, but digressed to another idea and forgot about it, lol
  2. Well, you can always use Reed–Solomon, but it will blow up the volume of the data
  3. My GUI is a hacky, partially working, lazy few lines of code, plz make something better
  4. I second @Theelgirl; I usually post my "code" under public domain

@Theelx
Collaborator

Theelx commented Feb 19, 2021

Dobro, for 2, error correction, have you heard of LDPC error correction? I was just googling it and came across this: https://pypi.org/project/pyldpc/

Edit: Apparently it's much faster than Reed-Solomon: https://stackoverflow.com/questions/41883385/error-correction-with-python-and-reed-solomon-for-large-inputs

@dobrosketchkun
Contributor

Nope, I haven't; it seems nice.

@AlfredoSequeida
Owner Author

Awesome!

@Theelgirl This sounds like a great opportunity to get some experience with GUIs, so I'll leave that to you. Take your time, of course. I'll also take a look at LDPC; I haven't heard of that before.

@dobrosketchkun and I have heard about Reed–Solomon but haven't implemented it before; I will also take a look at that.

As for the license, since neither of you really minds the licensing issue, I think we should just leave it as is. I'll leave this issue open for a bit longer in case anyone else who wants to contribute has something to say. Thank you both for your time!

@AlfredoSequeida
Owner Author

AlfredoSequeida commented Feb 21, 2021

Excuse my typos ahead of time, I'm on mobile.

I spent some time today implementing Reed-Solomon error correction, and I got it working with some quick tests; however, it really slows down the process.

There are a few more things I need to look at before pushing a commit. I have noticed that PIL has trouble when too many images are open: during the decoding process, I received a "too many images are open" error. I also want to see if I can find a way to avoid having to encode a video with a new frame rate before extracting frames, since that's making the process even longer.

I have also noticed that with larger files, zipping the files sometimes fails. I think this might be a memory issue, since I experienced a case where zipping failed the first time and, after running it again, it worked.

Since I don't have time to write the algorithm from scratch, I am using this library I found:

https://github.com/tomerfiliba/reedsolomon

And I am adding correction data to every byte (8 bits) and testing with an ecc value of 12. This makes it possible to recover all 8 bits if they were to become damaged (the worst possible scenario), but of course that comes at the cost of taking longer and producing larger output videos, hence the current issues I want to investigate more. I also plan to make this an optional feature with the -r flag (or something else, so it won't be mistaken for a 'recursive' feature) for the same reasons.
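For rough numbers on that configuration (standard Reed–Solomon properties, assuming the 12 ecc symbols really are attached to each data byte as described): classic RS corrects up to nsym/2 symbol errors at unknown positions, or up to nsym erasures at known positions, which is why a fully damaged data byte stays recoverable, at the cost of a large size increase:

```python
# Back-of-the-envelope arithmetic for RS with nsym=12 applied per data byte.
data_bytes = 1                  # one byte (8 bits) per codeword, as described
nsym = 12                       # ecc symbols added to each codeword
codeword = data_bytes + nsym    # 13 bytes stored per original byte
max_errors = nsym // 2          # up to 6 corrupted bytes, positions unknown
max_erasures = nsym             # up to 12 corrupted bytes, positions known
print(codeword, max_errors, max_erasures)  # 13 6 12
```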

If anyone has anything to add, please feel free to do so.

@Theelx
Collaborator

Theelx commented Feb 21, 2021

I still think that LDPC error correction will be better, as it's much faster, but we'll see. At the moment, I'm setting up some documentation; I will push when done.

@Theelx
Collaborator

Theelx commented Feb 21, 2021

@AlfredoSequeida Can I get integration perms for this GitHub repo so I can set up the readthedocs thing? I pushed the docs here already, but apparently they need a working integration to publish them :(

Edit: Looks like it just needs the webhook so it can automatically update the docs with every commit: https://fvid.readthedocs.io/en/latest/

@AlfredoSequeida
Owner Author

@Theelgirl I don't see an option on the repo to give you integration permissions, so I was trying to see if I can set that up from my end, but looking at https://docs.readthedocs.io/en/stable/webhooks.html#github I don't see the Payload URL.

@Theelx
Collaborator

Theelx commented Feb 22, 2021

@AlfredoSequeida I'll email it to you (your outlook email that's on your GitHub page).

@Theelx
Collaborator

Theelx commented Feb 22, 2021

Also I just added what I believe is your readthedocs account to the project. Is your account name AlfredoSequeida on readthedocs.io?

@AlfredoSequeida
Owner Author

Yes, that's me, I added the webhook.

@Theelx
Collaborator

Theelx commented Feb 22, 2021

Ok cool. Can you push your current version of the reed-solomon error correction to a new branch so I can check it out and possibly speed it up?

@AlfredoSequeida
Owner Author

Yes, I'll do that.

@AlfredoSequeida
Owner Author

Here is the current Reed-Solomon implementation: c0f3ecd

@Theelx
Collaborator

Theelx commented Feb 22, 2021

Cool thanks!

@AlfredoSequeida
Owner Author

AlfredoSequeida commented Feb 22, 2021

Quick GUI mockup idea. I had some extra time today and wanted to see what I could come up with. It's just a simple idea; I don't expect it to look exactly like that. I am not even sure about the log-in idea. I figure people can do the uploading on their own if they wish to, but I added it since that was the original premise.

fvid_gui_mockup

@AlfredoSequeida
Owner Author

AlfredoSequeida commented Feb 22, 2021

And going off @dobrosketchkun's idea, there could also be an option to use a URL to decode:

fvid_gui_mockup_enter_a_url

@dobrosketchkun
Contributor

This is a very nice GUI, miles better than mine.

@Theelx
Collaborator

Theelx commented Feb 22, 2021

@AlfredoSequeida @dobrosketchkun I did a good. This zfec error correction takes about 6-7 seconds to encode my 580KB test image versus over a minute with Reed-Solo, bloats it to a 103MB mp4 instead of >200MB, and takes about 2s to decode it.

Branch:
https://github.com/AlfredoSequeida/fvid/tree/zfec-error-correction
Zfec repo:
https://github.com/tahoe-lafs/zfec

@dobrosketchkun
Contributor

Splendid!

@AlfredoSequeida
Owner Author

AlfredoSequeida commented Feb 22, 2021

@Theelgirl That's awesome! When I get some time, I'll try the file I was using to test Reed-Solomon.

@Theelx
Collaborator

Theelx commented Feb 22, 2021

Sounds good! I haven't tested it on any error-ridden files yet, only normal files, so hopefully it works.

@Theelx
Collaborator

Theelx commented Feb 23, 2021

@AlfredoSequeida Have you had time to test it yet? I'm going to bed really soon unless I need to fix something.

@AlfredoSequeida
Owner Author

@Theelgirl I haven't yet, but I will try to test it by the end of today. Get your rest lol.

@AlfredoSequeida
Owner Author

AlfredoSequeida commented Feb 23, 2021

@Theelgirl So, I checked out the zfec-error-correction branch, but I don't see any zfec logic. The only change I see is to the requirements.txt file. Did you push your changes? Or did I misunderstand your message? I thought you implemented the logic for zfec.

@Theelx
Collaborator

Theelx commented Feb 23, 2021

Oops. I probably forgot to commit, pushing now.

Edit: done

@Theelx
Collaborator

Theelx commented Feb 23, 2021

@AlfredoSequeida Yoink I did a dumb and forgot to import random to deal with errors, git pull the most recent changes and re-test.

@AlfredoSequeida
Owner Author

AlfredoSequeida commented Feb 23, 2021

@Theelgirl OK, so I tested your implementation, and it's definitely faster, but I was not able to decode a video downloaded from YouTube. With that said, I don't think it's your implementation, because I also can't decode it without the zfec error correction.

I keep getting this:

Unziping...
Traceback (most recent call last):
  File "/home/alfredo/.local/bin/fvid", line 33, in <module>
    sys.exit(load_entry_point('fvid==1.0.0', 'console_scripts', 'fvid')())
  File "/home/alfredo/.local/lib/python3.9/site-packages/fvid/fvid.py", line 505, in main
    save_bits_to_file(file_path, bits, key, args.zfec)
  File "/home/alfredo/.local/lib/python3.9/site-packages/fvid/fvid.py", line 294, in save_bits_to_file
    bitstring = fo.read()
  File "/usr/lib/python3.9/gzip.py", line 300, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.9/gzip.py", line 495, in read
    uncompress = self._decompressor.decompress(buf, size)
zlib.error: Error -3 while decompressing data: invalid literal/length code

Here are the YouTube videos for reference:
non-zfec
zfec

And here is the file I used to test, it's a 14.6MB PDF file

I have seen this before with other files, so it looks to be a problem with unzipping the data. Something is happening there that needs to be looked at more. Maybe @dobrosketchkun can help here.

Since decoding works before the YouTube upload, it makes me think that YouTube's compression is messing with the zipped data. A simple solution that I didn't get to try, but that could work, is to apply zfec after zipping the file (instead of before) and then see if the error correction can repair the data before unzipping it during decoding.
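The proposed reordering could be sketched like this, with identity stubs standing in for the real zfec calls (hypothetical function names, not zfec's API):

```python
import gzip

def ecc_protect(data: bytes) -> bytes:
    """Hypothetical stand-in for zfec encoding; real code would add parity blocks."""
    return data

def ecc_repair(data: bytes) -> bytes:
    """Hypothetical stand-in for zfec decoding/repair."""
    return data

payload = b"contents of the file to encode"

# Encode: zip first, then add error correction over the zipped bytes,
# so compression damage from YouTube can be repaired *before* unzipping.
on_the_wire = ecc_protect(gzip.compress(payload))

# Decode: repair first, then unzip.
recovered = gzip.decompress(ecc_repair(on_the_wire))
print(recovered == payload)  # True
```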

@Theelgirl I do have a question: I noticed that you decided to apply zfec every 50 bytes (I think that's what's happening), so how many of those bytes are actually recoverable with zfec?

@dobrosketchkun
Contributor

dobrosketchkun commented Feb 23, 2021

Sorry for not being useful recently; I'm kind of busy.

I can't write it in nice code right now, but hear me out: in the first version, we had a delimiter to determine where the end of the file is; in the current one, we have the zip machinery to help us out. The original problem is that no one can tell whether a string of black pixels at the end of the last frame is part of the file or not.

But what if we revive the byte-encoded JSON approach without that issue? We could place the encoded zip inside it, and I think it may also help with #33.

import json
from bitstring import BitArray, Bits

test = {"filename": "test",
        "data": "somedata"}

data_bytes = json.dumps(test).encode('utf-8')
# b'{"filename": "test", "data": "somedata"}'

bitarray = BitArray(data_bytes)
# BitArray('0x7b2266696c656e616d65223a202274657374222c202264617461223a2022736f6d6564617461227d')

bitarray_bin = bitarray.bin
# 01111011001000100110011001101001011011000110010101101110011000010110110101100101001000100011101000100000001000100111010001100101011100110111010000100010001011000010000000100010011001000110000101110100011000010010001000111010001000000010001001110011011011110110110101100101011001000110000101110100011000010010001001111101

bitarray_bin = bitarray_bin + '1' * 8 * 5  # last frame pad emulator
# 011110110010001001100110011010010110110001100101011011100110000101101101011001010010001000111010001000000010001001110100011001010111001101110100001000100010110000100000001000100110010001100001011101000110000100100010001110100010000000100010011100110110111101101101011001010110010001100001011101000110000100100010011111011111111111111111111111111111111111111111

##############################################

bitstring = Bits(bin=bitarray_bin)
# Bits('0x7b2266696c656e616d65223a202274657374222c202264617461223a2022736f6d6564617461227dffffffffff')
decoded_bytes = bitstring.bytes
_temp = str(decoded_bytes)
# b'{"filename": "test", "data": "somedata"}\xff\xff\xff\xff\xff'

# some sketchy way to strip the padding and recover the JSON; needs to be cleaner
_temp = _temp.lstrip("b'")
_temp = _temp.split('"}')
_temp = '"}'.join(_temp[:-1] + [''])
_temp = json.loads(_temp)

# {'data': 'somedata', 'filename': 'test'}

@Theelx
Collaborator

Theelx commented Feb 23, 2021

@AlfredoSequeida Actually, with a block size of 8, zfec is applied every 8 bytes. Zfec simply expands the data by a factor of MVAL/KVAL, a 25% increase here (I'm not sure how it recovers using the original symbols, but eh). In this case, with a KVAL of 4 and an MVAL of 5, that means that for every 5 blocks of data, we need any 4 to reconstruct the original. Since it bloated the block to 10 symbols, it's easily divisible by 5.
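The k/m arithmetic above works out as follows (just the numbers from the explanation, not actual zfec code):

```python
# Arithmetic for zfec-style erasure coding with KVAL=4, MVAL=5.
KVAL, MVAL = 4, 5
bloat = MVAL / KVAL                          # 1.25 -> 25% size increase
block_size = 8                               # input bytes per block, as configured
encoded_symbols = block_size * MVAL // KVAL  # 10 symbols per encoded block
survivable_losses = MVAL - KVAL              # any 1 of the 5 shares can be lost
print(bloat, encoded_symbols, survivable_losses)  # 1.25 10 1
```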

@Akul2010

Can I help? I'm good at building GUIs in tkinter, PysimpleGui, and PyQT5. (Ok at kivy and kivymd as well.)
Sorry for the typos

@Theelx
Collaborator

Theelx commented Sep 16, 2022

Thanks for the offer; however, this project has been unofficially superseded by https://github.com/MeViMo/youbit. That repo is better in essentially every way.

@AlfredoSequeida
Owner Author

AlfredoSequeida commented Sep 17, 2022

Can I help? I'm good at building GUIs in tkinter, PysimpleGui, and PyQT5. (Ok at kivy and kivymd as well.)
Sorry for the typos

Hey @Akul2010! Thank you for offering to help! Although that would be awesome, I honestly haven't had time to work on the project recently. I think @Theelx's idea of contributing to a more active project would be best.
