Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

destinations.ml: add a compressed_file_destination that uses zstandard #247

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Lilydemi
Copy link

Trace files can be large and hard to share. Using zstd to compress trace
files allows users to decrease their file sizes, as demonstrated here:

$ du -sh trace.fxt trace.fxt.zst
472K    trace.fxt
56K     trace.fxt.zst

This is a preliminary attempt at adding a compressed file destination.
We have some open questions that we want to raise in the PR.

One thing we had to do is we had to make an original_filename labeled
argument. This is because, sometimes, magic-trace uses
/proc/self/fd/$FD paths as a way to prevent a race condition when
serving traces.

@Lilydemi Lilydemi force-pushed the compressed-file-destination branch from 10d0a2f to cd311c3 Compare July 14, 2022 18:39
@Lilydemi Lilydemi force-pushed the compressed-file-destination branch from 15a6a67 to 8a1eb3a Compare July 18, 2022 14:29
let com_size = buffer_size * 2 in
let com_level = 5 in
let compressed_buf = Iobuf.create ~len:com_size in
let file = Core_unix.openfile ~mode:[ O_CREAT; O_TRUNC; O_RDWR ] filename in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let file = Core_unix.openfile ~mode:[ O_CREAT; O_TRUNC; O_RDWR ] filename in
let file = Core_unix.openfile ~mode:[ O_CREAT; O_TRUNC; O_CLOEXEC; O_RDWR ] filename in

In case this ends up being used with a trace processor shell-based -serve.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think this review got accidentally marked as resolved.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added the O_CLOEXEC in direct_file_destination as well. Does that resolve this review?

vendor/tracing/zero/destinations.ml Outdated Show resolved Hide resolved
vendor/tracing/zero/destinations.ml Outdated Show resolved Hide resolved
vendor/tracing/zero/destinations.ml Outdated Show resolved Hide resolved
vendor/tracing/zero/destinations.ml Outdated Show resolved Hide resolved
@hlian
Copy link
Contributor

hlian commented Jul 20, 2022

This addresses #154 (just to link the two)

@hlian
Copy link
Contributor

hlian commented Jul 21, 2022

@Xyene ready for rereview if you have free time!

@Lilydemi Lilydemi force-pushed the compressed-file-destination branch from 83b8a37 to 246b494 Compare July 25, 2022 19:38
@hlian hlian force-pushed the compressed-file-destination branch from 246b494 to 7ae4414 Compare July 27, 2022 15:09
Trace files can be large and hard to share. Using zstd to compress trace
files allows users to decrease their file sizes, as demonstrated here:

```
$ du -sh trace.fxt trace.fxt.zst
472K    trace.fxt
56K     trace.fxt.zst
```

This is a preliminary attempt at adding a compressed file destination.
We have some open questions that we want to raise in the PR.

One thing we had to do is we had to make an `original_filename` labeled
argument. This is because, sometimes, magic-trace uses
`/proc/self/fd/$FD` paths as a way to prevent a race condition when
serving traces.

Signed-off-by: Lidya Demilew <si.demilew@gmail.com>
Signed-off-by: Lidya Demilew <si.demilew@gmail.com>
@hlian hlian force-pushed the compressed-file-destination branch from 7ae4414 to aa6935b Compare July 27, 2022 15:12
Signed-off-by: Hao Lian <hi@haolian.org>
@hlian hlian force-pushed the compressed-file-destination branch from a706045 to 734703e Compare July 29, 2022 15:37
let file_destination ~filename () = direct_file_destination ~filename ()
let compressed_file_destination ?(buffer_size = 64 * 1024) ~filename () =
let buf = Iobuf.create ~len:buffer_size in
let compressed_size = buffer_size * 2 in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not Zstandard.compression_output_size_bound, instead of compressed_size*2? One example where compressed_size*2 is not quite right is if the file is empty for some reason (the zstd'd file will not also be empty).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, very minor side note: comments go above the line you're commenting on, not below.

let file = Core_unix.openfile ~mode:[ O_CREAT; O_TRUNC; O_CLOEXEC; O_RDWR ] filename in
let written = ref 0 in
let compression_context = Zstandard.Compression_context.create () in
let flush () =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this compress the full trace file at once? Would it be easy to change this to stream the file out, instead? I worry that this may cause users with relatively large trace files to OOM for no good reason. In particular, those users are the most likely to care about compressing their trace files in the first place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants