Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a matcher for file contents equality/difference #138

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

AntoniMarcinek
Copy link
Contributor

The purpose of the PR

This pull request adds a matcher to be used in asserting equality/difference of files. The intent is to support something like this:

Describe 'Some program'
  It 'provides valid result in the default settings'
     When run the_program
     The file 'dump.txt' should diff equal 'dump.default.txt'
  End
End

The PR's purpose is to start the discussion of the the above use case on something which already works.

Note that other possibilities are with

The contents of file 'dump.txt' should equal "$(cat dump.default.txt)"

or

diffit()
{
  diff "${diffit:?}" "$1"
}
The file 'dump.txt' should satisfy diffit 'dump.default.txt'

but the first provides too verbose and unreadable message, while the second one does not add the diff to the failure message.

Potential problems with this solution

It adds dependence on cmp and diff I have no idea what is the significance of this, i.e. if there are platforms lacking these tools.

I am aware that the name of the matcher is not very pretty. Maybe diff-equal or contents-equal would be better? Is the - character allowed in the DSL? I was considering the below, but found out none of these is permitted/gives the expected outcome:

The contents of file 'dump.txt' should equal contents of file "dump.default.txt"
The file 'dump.txt' should equal 'dump.default.txt'
The file 'dump.txt' should equal file 'dump.default.txt'

@AntoniMarcinek
Copy link
Contributor Author

This is kind of related to #112 in the sense that #112 also seems to bring dependence on diff and if implemented could be used in this matcher to extend it also to comparison of stdout/stderr with reference files (I have an impression that the subject output is not a file but a variable).

@ko1nksm
Copy link
Member

ko1nksm commented Dec 1, 2020

Thanks for contribution. Eventually I want to merge this, but there are a few things to do.

First, this is a matcher to make sure the files are identical. It would be good to be able to see the differences in diff format, but It is better to output only if there is a difference or not in this PR.

About cmp

I try not to increase the number of dependent commands whenever possible, but I think cmp is one of the exceptions. Because the shell alone cannot handle null character.

# dash silently ignores the null character
dash -c '[ "_$(printf "\0")_" = "__" ]; echo $?'
0

# bash ignores the null character with warning
bash -c '[ "_$(printf "\0")_" = "__" ]; echo $?'
bash: warning: command substitution: ignored null byte in input
0

# only zsh can handle null character
$ zsh -c '[ "_$(printf "\0")_" = "__" ]; echo $?'
1

Therefore, it makes sense to use cmp. However, after some research, it seems that there are environments that do not contain cmp.

docker run -it fedora:32 /bin/sh
sh-5.0# cmp
sh: cmp: command not found

In reality I don't think this will be a problem, but I want to avoid it if possible. Fortunately, cmp can be easily simulated by using od or hexdump commands on which it already depends. Use cmp if there is a cmp command for speed of execution, otherwise fall back to code that uses the od or hexdump command. Therefore no new dependencies arise.

About DSL

The file 'dump.txt' should equal file 'dump.default.txt'

I think this is better. It's not as long as contents and is more obvious with file. However, to do so, you need to modify the existing equal matcher. It's a bit ad hoc, but I I think it can be distinguished by arguments. Specifically, define the shellspec_matcher_equal_file function and call it if there are two arguments and the first argument is file.

About diff

Do not use a diff from this PR, but only output whether the files are the same or not. As for the diff, I would like to think about it when dealing with #112, but I want to handle it on the reporter side.

There are several implementations of diffs, such as colordiff, icdiff, git diff, etc., and I want to allow the user to change them. If there is no diff, it will appear as it does now. Therefore this is also no new dependencies arise.

It will take some time the implementation using diff. but you can override YOUR shellspec_matcher_equal_file in the shellspec_spec_helper_configure function in shellspec_spec_helper.sh.

@AntoniMarcinek
Copy link
Contributor Author

About cmp

Do I understand correctly that with the fallback the comparison should be done like

[ "$(od "$1")" = "$(od "$2")" ]

so that the strings are compared, but in a way which assures that the null character is not ignored?

I am not sure if am able to properly (i.e. in the shell-independent way) implement the check for existence of cmp and a fallback to od and then hexdump if also od is missing (greping around for hexdump in the shellspec I realised that you already have a fallback from od to hexdump in some function). Should I use the od_command function for the fallback solution?

About DSL

This also looks good to me. Unfortunately I am not sure how to implement it, because I don't really understand the matcher design - how this is called, what different functions in shellspec_matcher_equal() do. Consequently I can probably modify shellspec_matcher__match(), but I don't know what should be changed in other parts.

Specifically what to do with shellspec_syntax_failure_message? This definitely has to be changed but how to differentiate file and normal mode of equal? Will this also get more than 2 arguments if we have equal file name? Furthermore, what is the role of the shellspec_syntax_param line?

Is there a way to assert that the subject is file? Because this The output should equal file 'dump.default.txt' looks like a natural extension to The file 'dump.txt' should ...., but I think it will fail bitterly with what we have in mind currently.

About diff

This looks clear from my side.

I do have some questions to #112 though, but will comment there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants