Invalid byte sequence in US-ASCII (ArgumentError) when running sitediff diff #132

Open
brnquester opened this issue Mar 15, 2022 · 1 comment

@brnquester

Summary

I have no problem running sitediff store or sitediff crawl; however, when running sitediff diff I keep getting the following error:

sitediff diff
/usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/cache.rb:44:in `split': invalid byte sequence in US-ASCII (ArgumentError)
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/cache.rb:44:in `get'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:46:in `block in queue_path'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:45:in `each'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:45:in `queue_path'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:35:in `block in run'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:35:in `each'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:35:in `run'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff.rb:184:in `run'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/api.rb:117:in `diff'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/cli.rb:127:in `diff'
	from /usr/local/bundle/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
	from /usr/local/bundle/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
	from /usr/local/bundle/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
	from /usr/local/bundle/gems/thor-0.20.3/lib/thor/base.rb:466:in `start'
	from /usr/local/bundle/gems/sitediff-1.1.1/bin/sitediff:12:in `<top (required)>'
	from /usr/local/bundle/bin/sitediff:23:in `load'
	from /usr/local/bundle/bin/sitediff:23:in `<main>'
Reading config file: /website/sitediff/sitediff.yaml
Read 4582 paths from: /website/sitediff/paths.txt
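
The trace fails on a `split` of a path string, so one way to narrow this down is to look for paths containing bytes outside 7-bit ASCII. The sketch below assumes the offending input comes from paths.txt (it could equally come from cached data); `non_ascii_lines` is a hypothetical helper, not part of sitediff.

```ruby
# Return [line_number, text] pairs for every line that contains a byte
# outside 7-bit ASCII. The file is read in binary mode so that no
# encoding check can raise while scanning.
def non_ascii_lines(file)
  File.open(file, 'rb') do |f|
    f.each_line.with_index(1)
     .reject { |line, _number| line.ascii_only? }
     .map { |line, number| [number, line.chomp] }
  end
end

# Example call (path taken from the log above):
#   non_ascii_lines('/website/sitediff/paths.txt').each do |number, text|
#     puts "#{number}: #{text.inspect}"
#   end
```

Printing with `inspect` shows the raw bytes as escapes, which makes it easier to tell whether a path holds UTF-8 or some other encoding.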

Solution attempts

I was able to get past that error by patching cache.rb with:

sed -i 's/path.split(File::SEPARATOR)/path.encode('\''UTF-8'\'', :invalid => :replace).split(File::SEPARATOR)/g' /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/cache.rb
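
For reference, here is a minimal Ruby sketch of what that patch does to a path string, next to an alternative that relabels the bytes instead of replacing them. The sample path is made up for illustration, and whether the real paths are UTF-8 is an assumption.

```ruby
# Bytes for "/café/menu" in UTF-8, but tagged US-ASCII, as Ruby labels
# input when its default external encoding is US-ASCII.
# (Illustrative path, not taken from the report.)
path = "/caf\xC3\xA9/menu".b.force_encoding(Encoding::US_ASCII)
puts path.valid_encoding? # false: bytes above 0x7F are not valid US-ASCII

# What the sed patch does: transcode to UTF-8, substituting a replacement
# character for every invalid byte. The split works, but the "é" is lost.
replaced = path.encode('UTF-8', invalid: :replace)
p replaced.split(File::SEPARATOR)

# Alternative: keep the bytes and only change the label to UTF-8.
# This preserves the original path if the bytes really are UTF-8.
relabeled = path.dup.force_encoding(Encoding::UTF_8)
p relabeled.split(File::SEPARATOR) if relabeled.valid_encoding?
```

With `:invalid => :replace` the patched path no longer matches the original bytes, so a lookup keyed on the path could miss. `force_encoding` avoids that, but only helps when the data is actually UTF-8.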

But then I started getting a different error, this time from addressable:

sitediff diff
/usr/local/bundle/gems/addressable-2.5.2/lib/addressable/uri.rb:107:in `scan': invalid byte sequence in US-ASCII (ArgumentError)
	from /usr/local/bundle/gems/addressable-2.5.2/lib/addressable/uri.rb:107:in `parse'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/uriwrapper.rb:52:in `initialize'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:54:in `new'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:54:in `block in queue_path'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:45:in `each'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:45:in `queue_path'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:35:in `block in run'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:35:in `each'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/fetch.rb:35:in `run'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff.rb:184:in `run'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/api.rb:117:in `diff'
	from /usr/local/bundle/gems/sitediff-1.1.1/lib/sitediff/cli.rb:127:in `diff'
	from /usr/local/bundle/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
	from /usr/local/bundle/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
	from /usr/local/bundle/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
	from /usr/local/bundle/gems/thor-0.20.3/lib/thor/base.rb:466:in `start'
	from /usr/local/bundle/gems/sitediff-1.1.1/bin/sitediff:12:in `<top (required)>'
	from /usr/local/bundle/bin/sitediff:23:in `load'
	from /usr/local/bundle/bin/sitediff:23:in `<main>'
Reading config file: /website/sitediff/sitediff.yaml
Read 4581 paths from: /website/sitediff/paths.txt
Using sites from cache: before

I have also tried setting the locale to UTF-8 in the container, both before installing and before running sitediff, with no success:

export LANG="en_US.UTF-8"
export LANGUAGE="en_US.UTF-8"
export LC_CTYPE="en_US.UTF-8"
export LC_NUMERIC="en_US.UTF-8"
export LC_TIME="en_US.UTF-8"
export LC_COLLATE="en_US.UTF-8"
export LC_MONETARY="en_US.UTF-8"
export LC_MESSAGES="en_US.UTF-8"
export LC_PAPER="en_US.UTF-8"
export LC_NAME="en_US.UTF-8"
export LC_ADDRESS="en_US.UTF-8"
export LC_TELEPHONE="en_US.UTF-8"
export LC_MEASUREMENT="en_US.UTF-8"
export LC_IDENTIFICATION="en_US.UTF-8"
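
To check whether those exports actually reach Ruby, a small script like the one below can be run inside the container. It is only a diagnostic sketch; `encoding_report` is a hypothetical helper name. If the requested locale is not installed in the image, the C library falls back to the C locale and Ruby reports US-ASCII regardless of what `LANG` says. `LC_ALL`, if set, also takes precedence over `LANG` and the individual `LC_*` variables.

```ruby
# Collect the encodings Ruby derived from the process environment.
def encoding_report
  {
    default_external: Encoding.default_external.name,
    locale_charmap: Encoding.locale_charmap,
    lang: ENV['LANG'],
    lc_all: ENV['LC_ALL']
  }
end

encoding_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

If `default_external` still shows US-ASCII, it may be worth trying a locale that exists in the image (for example `C.UTF-8` on Debian-based images) or setting `RUBYOPT=-Eutf-8`. Both are suggestions, not something verified against sitediff.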

Any thoughts?

Tech stack

  • Ubuntu 21.04
  • Docker container Ruby v2.6.9
  • Sitediff v1.1.1
@kirk-brown-ew
Collaborator

I'd recommend Ruby v2.7.5. We'll look into this.
