TODO

  • Fix the endianness inconsistency problem
  • Implement a trie data structure to speed up performance (see the trie sketch after this list)
  • Try different advanced data structures. Refer to https://www.zhihu.com/question/32163076.
  • Profile to find performance bottlenecks and optimize, so that we can set a bigger value for MAX_TEST_FILE_NUM in test_integration.py
  • Add unit tests for the code dict and the str dict
  • Add unit tests for the convert_bytes_bits_int module
  • Investigate why our encode output differs from the example one
  • Try a custom Hypothesis build strategy
  • Implement write_lzw_header for an existing lzwfile
  • Use Hypothesis' inference-from-type-hints feature
  • Add a test suite for the map-interface data structures
  • Use Hypothesis' iterables strategy instead of lists, to avoid accidentally relying on sequence methods (see the iterables sketch after this list)
  • Rename LZW package to lowercase lzw
  • Clean up code: remove dead code, polish comments, and add elaborate docstrings
  • Add tests for different code_bitsize values
  • Add `__all__` to modules
  • Add `__slots__` to classes
  • Add docstrings to all significant functions
  • Add module-level docstrings
  • Add icons to badges.
  • Customize badge style (color, size)
  • Add badge for Python version specification
  • Deal with the incompatibility between Hypothesis and pytest's tmp_path fixture
  • Add badge for LOC
  • Use Union[str, bytes, int, os.PathLike] to annotate the file argument passed to open() (see the annotation sketch after this list)
  • Use the pytest-subtesthack plugin to deal with the fixture integration problem
  • Test with more values of max_file_len and max_file_num
  • Type annotate the tmp_path fixture argument
    • Use pathlib.Path or Union[str, bytes, os.PathLike]
  • Try doctest
  • Should we pin dependencies' versions in requirements.txt?
  • Use the pylint_runner third-party tool
  • Don't misuse AnyStr in type annotations
  • Type annotate variadic positional arguments (`*args`) and variadic keyword arguments (`**kwargs`) (see the annotation sketch after this list)
  • Compare tox vs pytest vs nose
  • Increase the coverage rate to over 95%; 99% is better
  • Deal with TODOs spread across the project
  • Ref links:
  • Add README badge for LOC (Lines of Code)
  • Add README badge for pylint rating (10.0/10)
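
For the trie item above, a minimal sketch of a byte-keyed trie that could back the encoder's string table: extending the current match by one byte becomes a single child lookup instead of re-hashing a growing string. All names here (TrieNode, Trie, insert, longest_prefix) are hypothetical, not identifiers from this codebase.

```python
from typing import Dict, Optional, Tuple


class TrieNode:
    __slots__ = ("children", "code")

    def __init__(self) -> None:
        self.children: Dict[int, "TrieNode"] = {}
        self.code: Optional[int] = None  # LZW code if a dict entry ends here


class Trie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: bytes, code: int) -> None:
        node = self.root
        for byte in word:
            node = node.children.setdefault(byte, TrieNode())
        node.code = code

    def longest_prefix(self, data: bytes, start: int) -> Tuple[Optional[int], int]:
        """Return (code, length) of the longest entry matching data[start:]."""
        node, best, i = self.root, (None, 0), start
        while i < len(data) and data[i] in node.children:
            node = node.children[data[i]]
            i += 1
            if node.code is not None:
                best = (node.code, i - start)
        return best
```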
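
For the iterables item, a sketch of how the strategy swap might look. st.iterables() produces one-shot iterators, so any hidden len() call or indexing in the code under test raises immediately. The test name and body are illustrative only.

```python
from hypothesis import given, strategies as st


@given(st.iterables(st.integers(min_value=0, max_value=1), max_size=64))
def test_consumes_one_shot_iterator(bits) -> None:
    # bits is a one-shot iterator: len(bits) or bits[0] would raise here,
    # so any hidden reliance on sequence methods is caught by the test.
    assert all(bit in (0, 1) for bit in bits)
```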
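
For the open() annotation and variadic-argument items, one possible shape. write_codes is a hypothetical helper, not part of the codebase; the Union mirrors the types builtin open() accepts.

```python
import os
from typing import Union

PathType = Union[str, bytes, int, os.PathLike]  # what builtin open() accepts


def write_codes(file: PathType, *codes: int, **flags: bool) -> None:
    # "*codes: int" types each extra positional argument as an int;
    # "**flags: bool" types each extra keyword value as a bool.
    mode = "ab" if flags.get("append", False) else "wb"
    with open(file, mode) as fout:
        for code in codes:
            fout.write(code.to_bytes(2, "big"))
```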

Done

  • Compute the test coverage rate
  • Handle the corner case where zero files are input for compression
  • Handle the corner case of an empty file
  • Add a max_len restriction to the code dict
  • Reserve the last code of the code dict as a virtual EOF
  • Add test coverage for the case where the code dict grows beyond capacity
  • Consider rewriting in lazy-evaluation / generator style to save runtime space; stream style also lets the script handle arbitrarily large input (see the streaming sketch after this list)
  • Preserve the newline format: disable Python's built-in universal newline translation (see the newline sketch after this list)
  • Ask for clarification of the spec: should we handle the case of zero compressed files?
  • Add GitHub README badges (CI passing, test coverage rate, etc.)
  • Add CI: Travis, Codecov
  • Write README
  • Use sampled_from(Bit) to replace integers(min_value=0, max_value=1), because the integers() strategy shrinks towards 0 and leads to unbalanced examples, which is undesirable (see the sampled_from sketch after this list)
  • Expand wildcard imports from the typing module to concrete names
  • Use bytes as a shorthand type hint to replace ByteString
  • Add a project-wide .pylintrc
  • Rename CODE_BIT to CODE_BITSIZE
  • Rename test/ directory to tests/
  • Add .coveragerc, config file for coverage.py
  • Add badge for code style (black)
  • Remove .gitmodules
  • Expand all wildcard imports
  • Add license
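
For the generator-style item above, a minimal streaming sketch, not this project's actual encoder: it omits the max-capacity and virtual-EOF handling tracked elsewhere in this file. Input is consumed lazily and each code is yielded as soon as it is known, so the space cost is bounded by the dictionary rather than the input length.

```python
from typing import Iterable, Iterator


def lzw_encode_stream(data: Iterable[int]) -> Iterator[int]:
    codebook = {bytes([i]): i for i in range(256)}
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in codebook:
            current = candidate
        else:
            yield codebook[current]           # emit code for the match so far
            codebook[candidate] = len(codebook)
            current = bytes([byte])
    if current:                               # flush the final pending match
        yield codebook[current]
```

Since iterating bytes yields ints, list(lzw_encode_stream(b"abababab")) works directly; for very large input, any lazy iterable of byte values can be passed instead.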
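
For the newline-preservation item, the relevant knob: passing newline="" to a text-mode open() disables Python's universal-newlines translation, so "\r\n" and "\r" survive a round trip unchanged. The file names below are placeholders.

```python
# newline="" disables universal-newlines translation in text mode.
with open("original.txt", "r", newline="") as fin:
    text = fin.read()   # "\r\n" and "\r" come back untranslated

with open("restored.txt", "w", newline="") as fout:
    fout.write(text)    # no "\n" -> os.linesep rewriting on the way out
```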
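
For the sampled_from(Bit) item, a sketch of the swapped-in strategy. The Bit definition below is an assumed stand-in for the project's actual Bit type, and the test is illustrative only.

```python
from enum import IntEnum

from hypothesis import given, strategies as st


class Bit(IntEnum):  # assumed shape of the project's Bit type
    ZERO = 0
    ONE = 1


# Replaces integers(min_value=0, max_value=1), whose shrink-toward-0
# behavior the item above flags as producing unbalanced examples.
@given(st.lists(st.sampled_from(Bit)))
def test_bit_lists_are_binary(bits) -> None:
    assert all(bit in (Bit.ZERO, Bit.ONE) for bit in bits)
```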