Error ECONNREFUSED from certain external links #118

Closed
amimas opened this issue Jul 17, 2019 · 3 comments · Fixed by #133

amimas commented Jul 17, 2019

I'm getting ECONNREFUSED from certain links. The errors don't appear on every run, but they appear often. Here are some examples of those links:

   7:22   error  https://tools.ietf.org/html/rfc6749 is dead. (request to https://tools.ietf.org/html/rfc6749 failed, reason: connect ECONNREFUSED 64.170.98.42:443)                  no-dead-link
  16:140  error  https://tools.ietf.org/html/rfc6749#page-10 is dead. (request to https://tools.ietf.org/html/rfc6749#page-10 failed, reason: connect ECONNREFUSED 64.170.98.42:443)  no-dead-link
   7:451  error    https://www.slf4j.org/ is dead. (request to https://www.slf4j.org/ failed, reason: connect ECONNREFUSED 83.173.251.158:443)  no-dead-link
   35:116  error    http://www.slf4j.org/manual.html is dead. (request to http://www.slf4j.org/manual.html failed, reason: connect ECONNREFUSED 83.173.251.158:80)  no-dead-link
    7:115  error    https://logback.qos.ch/ is dead. (request to https://logback.qos.ch/ failed, reason: connect ECONNREFUSED 83.173.251.158:443)                                                    no-dead-link
  110:84   error    https://logback.qos.ch/manual/configuration.html is dead. (request to https://logback.qos.ch/manual/configuration.html failed, reason: connect ECONNREFUSED 83.173.251.158:443)  no-dead-link
  151:201  error    http://lucene.apache.org/solr/ is dead. (request to http://lucene.apache.org/solr/ failed, reason: connect ECONNREFUSED 95.216.24.32:80)  no-dead-link

I'm using version 4.4.3 of this linter rule. I have been trying to debug this issue but haven't found any pattern or cause of this yet.
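
One way to look for a pattern is to group the failing URLs by the IP address in the error message. The sketch below assumes the textlint output format shown above; `parseFailure` and `groupByIp` are hypothetical helper names, not part of textlint or this rule.

```typescript
// Hypothetical shape for one parsed "is dead" line; not a textlint type.
interface Failure {
  url: string;
  code: string; // e.g. "ECONNREFUSED"
  ip: string;
  port: number;
}

// Matches lines such as:
//   "<url> is dead. (request to <url> failed, reason: connect ECONNREFUSED 1.2.3.4:443)"
const failurePattern =
  /(\S+) is dead\. \(request to \S+ failed, reason: connect (\w+) (\d+\.\d+\.\d+\.\d+):(\d+)\)/;

function parseFailure(line: string): Failure | undefined {
  const match = failurePattern.exec(line);
  if (match === null) {
    return undefined; // not a connection-failure line
  }
  return { url: match[1], code: match[2], ip: match[3], port: Number(match[4]) };
}

// Collect the failing URLs under the IP address each connection was refused by.
function groupByIp(lines: readonly string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const line of lines) {
    const failure = parseFailure(line);
    if (failure === undefined) {
      continue;
    }
    const urls = groups.get(failure.ip) ?? [];
    urls.push(failure.url);
    groups.set(failure.ip, urls);
  }
  return groups;
}
```

Applied to the seven lines above, this puts the slf4j.org and logback.qos.ch URLs under the same address (83.173.251.158), which hints that the failures may be tied to specific hosts rather than to individual links.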

Author

amimas commented Dec 2, 2019

@azu - I recently found a way to reproduce this issue consistently. Here are some details and a git repository that I think might help you.

First of all, the linter seems to have some sort of performance issue. From what I've observed, when the linter has to process a lot of files/content, it starts reporting various links as dead with the reason connect ECONNREFUSED, even though those links are not dead.

I've created a sample repository here: https://gitlab.com/elasticpath/poc/docs-ep-cloudops-aws

This repository is set up to build a documentation site using Docusaurus. I know that textlint.io is also based on Docusaurus, so I think you're familiar with the overall setup. Anyway, here are the steps to get started:

  1. Clone the git repo
  2. Browse to the website directory: cd website
  3. Install dependencies: yarn install
  4. Run textlint to validate dead links one docs directory at a time: yarn textlint-no-dead-link
    • This should pass unless some links are actually dead
  5. Run textlint to validate dead links across all docs directories at once: yarn textlint-no-dead-link-all
    • This is where you'll notice a lot of dead-link errors reported by textlint

The difference between the commands in steps 4 and 5 is how textlint is executed. In step 4, a wrapper script runs textlint on one docs folder at a time; each folder is simply a different version of the docs. In step 5, textlint is given all the docs directories at once. This is where it fails consistently.
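
If the failures in step 5 come from too many requests being in flight at once (an assumption; nothing in this thread confirms it), the effect of the step 4 workaround can be approximated by capping concurrency. A minimal sketch; `mapWithConcurrency` is a hypothetical helper, not an API of textlint or this rule:

```typescript
// Hypothetical helper: run `worker` over `items` with at most `limit` calls in flight.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims one item at a time until none are left.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const runners: Promise<void>[] = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results; // same order as `items`
}
```

Here `items` would be the links (or docs directories) to check and `worker` the function that checks one of them; a small `limit` trades speed for fewer simultaneous connections.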

Most of the time the errors contain ECONNREFUSED. Sometimes it fails with a 503 Service Unavailable error too.
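
Since both failure modes are intermittent, one possible mitigation is to re-check a link before reporting it as dead. The following is only a sketch under that assumption: `CheckResult`, `isPossiblyTransient`, and `checkWithRetry` are hypothetical names, and the retry count and delays are arbitrary.

```typescript
// Hypothetical result of checking one link; not the rule's real type.
interface CheckResult {
  ok: boolean;
  status?: number; // HTTP status, when a response was received
  errorCode?: string; // e.g. "ECONNREFUSED", when the connection failed
}

// Treat the two failure modes reported in this thread as possibly transient.
function isPossiblyTransient(result: CheckResult): boolean {
  return !result.ok && (result.errorCode === "ECONNREFUSED" || result.status === 503);
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Call `check` up to `maxAttempts` times, doubling the wait between attempts.
// Failures that do not look transient (for example a 404) are returned at once.
async function checkWithRetry(
  check: () => Promise<CheckResult>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<CheckResult> {
  let result = await check();
  for (let attempt = 1; attempt < maxAttempts && isPossiblyTransient(result); attempt++) {
    await sleep(baseDelayMs * 2 ** (attempt - 1));
    result = await check();
  }
  return result;
}
```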

I have another documentation repository with a lot more docs than the sample repository above. Because of that, the workaround in step 4 does not work for that repository; I can't get the linter to validate links properly. Sometimes it also reports the error mentioned in issue #119.

How do I go about debugging this? Would you be able to take a look? I'm using various other textlint rules and none of them seem to have this issue. Only this rule seems to have a problem, depending on the amount of files/content.


pa-eps commented Mar 19, 2020

@azu - I have been able to find some pattern to this issue.

I'm noticing that the external links the linter fails to validate always come from Apache websites. For example:

  • logging.apache.org
  • camel.apache.org
  • activemq.apache.org

My documentation markdown files have many links to the sites above and others, but they are all subdomains of apache.org.

When the issue starts, it will report random links (from those sites) as dead with this error message:

reason: connect ECONNREFUSED no-dead-link

This becomes an even bigger issue in my CI pipeline, because these random failures mark the pipeline as broken (a dead link was found). A broken pipeline blocks changes from being merged, even though the links are not actually dead when we browse to them manually.

I decided to run my documentation website through other link validators (online services, desktop tools, etc.). My observation so far is that some of the other tools also have trouble validating the apache.org websites: they time out.

I suspect that the apache.org sites deny access or don't respond when they receive connections from automated tools. That's why I think it might be useful to have an option to change the User-Agent value. This was also brought up in #128.

I'm not sure that #128 will solve this intermittent issue, but it will probably help, in addition to covering the use case mentioned in that issue.
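
To make the idea concrete, here is a rough sketch of what a configurable User-Agent could look like. Everything in it is an assumption for illustration: `LinkCheckOptions`, `RequestSpec`, and `buildRequestSpec` are hypothetical names, and the use of HEAD is a guess, not a description of how the rule sends requests.

```typescript
// Hypothetical option shape; not an existing option of the rule.
interface LinkCheckOptions {
  userAgent?: string;
}

// Minimal request description, kept independent of any specific HTTP client.
interface RequestSpec {
  method: "HEAD" | "GET";
  headers: Record<string, string>;
}

// Build the request settings, adding a User-Agent header only when one is configured.
function buildRequestSpec(options: LinkCheckOptions = {}): RequestSpec {
  const headers: Record<string, string> = {};
  if (options.userAgent !== undefined && options.userAgent.trim() !== "") {
    headers["User-Agent"] = options.userAgent;
  }
  // HEAD is assumed here purely for illustration.
  return { method: "HEAD", headers };
}
```

The returned object has the `method` and `headers` fields that fetch-style clients accept, so it could be passed along with the URL being checked. Whether a different User-Agent actually changes how apache.org responds is still untested.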

Thoughts?

Member

azu commented Mar 31, 2020

A User-Agent option is reasonable.
Pull requests are welcome.
