OSX: ssh docker setup is not working for maint and master #42

Open
yarikoptic opened this issue Jan 3, 2021 · 9 comments

@yarikoptic
Member

see e.g. https://github.com/datalad/git-annex/actions/runs/458535490 runs

2021-01-03T02:45:26.5758460Z 57a3a5a52691: Pull complete
2021-01-03T02:45:26.5818610Z Digest: sha256:f5e151dc378ce081e3009e0780d96ba96bd003be07f7da8be626ecce5511e0f1
2021-01-03T02:45:26.5837070Z Status: Downloaded newer image for dataladtester/docker-ssh-target:latest
2021-01-03T02:45:26.5843510Z  ---> eff5a230c1b6
2021-01-03T02:45:26.5848370Z Step 2/4 : RUN groupadd -og 20 dl &&     useradd -ms /bin/bash -ou 501 -g dl dl &&     mkdir -p /home/dl/.ssh &&     chown -R dl:dl /home/dl/ &&     echo 'dl:dl' | chpasswd
2021-01-03T02:45:26.7088880Z  ---> Running in c227b4f3181b
2021-01-03T02:45:27.1036730Z Removing intermediate container c227b4f3181b
2021-01-03T02:45:27.1042410Z  ---> 8c8f201d0430
2021-01-03T02:45:27.1046170Z Step 3/4 : CMD ["/usr/sbin/sshd", "-D"]
2021-01-03T02:45:27.1292980Z  ---> Running in dd6112c08e62
2021-01-03T02:45:27.1851620Z Removing intermediate container dd6112c08e62
2021-01-03T02:45:27.1855330Z  ---> 47602e520b19
2021-01-03T02:45:27.1856340Z Step 4/4 : RUN mkdir -p "/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T"
2021-01-03T02:45:27.2110890Z  ---> Running in 5c1475e5bade
2021-01-03T02:45:27.4972770Z Removing intermediate container 5c1475e5bade
2021-01-03T02:45:27.4976210Z  ---> ea7b055ec3e6
2021-01-03T02:45:27.4983950Z Successfully built ea7b055ec3e6
2021-01-03T02:45:27.5029840Z Successfully tagged datalad-tests-ssh:latest
2021-01-03T02:45:27.5756460Z ac072058773fb66ab0dec91d80af3fba86b5293374d92a49169f3de39fdec683
2021-01-03T02:45:27.9241170Z cfe67f0b48989a2f2d007dbf66614dd6ca7c1590be5c48bfb172bb767cd90f3c
2021-01-03T02:45:28.2400480Z nc: connectx to localhost port 42241 (tcp) failed: Connection refused
2021-01-03T02:45:28.2402450Z nc: connectx to localhost port 42241 (tcp) failed: Connection refused
2021-01-03T02:45:29.4298790Z nc: connectx to localhost port 42241 (tcp) failed: Connection refused
2021-01-03T02:45:29.4300300Z nc: connectx to localhost port 42241 (tcp) failed: Connection refused
.... the same is filling up the logs .... 

I did not look into how to resolve this, but it must be possible one way or another (maybe it is just a port conflict among multiple docker instances on the same box?)
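The repeated `nc` failures above look like a loop polling the forwarded port with no upper bound, which is what fills the logs. A minimal sketch of a bounded poll follows, assuming the setup script only needs to know whether the port accepts connections. `wait_for_port` and its arguments are hypothetical names used for illustration and are not taken from the actual SSH setup scripts.

```shell
# Hypothetical helper: poll a TCP port a bounded number of times.
# Arguments: host, port, max attempts (default 30), delay in seconds (default 1).
wait_for_port() {
    local host=$1
    local port=$2
    local attempts=${3:-30}
    local delay=${4:-1}
    local i=1
    while [ "$i" -le "$attempts" ]; do
        # nc -z only checks whether the port accepts a connection.
        if nc -z "$host" "$port" 2>/dev/null; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "port $port on $host still refused after $attempts attempts" >&2
    return 1
}
```

Called as `wait_for_port localhost 42241 30 1`, this would give up after roughly 30 seconds with a single error line instead of repeating "Connection refused" indefinitely.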

@yarikoptic
Member Author

Any immediate ideas on what is going wrong here? Given that master will soon be released as 0.14.0, and maint will thus jump over to current master, maybe this issue will disappear on its own, though.

@jwodder
Member

jwodder commented Jan 26, 2021

@yarikoptic I do not know what's going wrong. Further ad hoc customization of the SSH setup scripts would be needed in order to get any debugging information.

@yarikoptic
Member Author

Ah, let's then forget about it and wait for master release

@yarikoptic
Member Author

Actually, I take it back since I mixed it all up -- it works only on release and not on maint or master, so we are doomed to pin it down :-/ FWIW, it seems to be failing differently ATM.

maint:
==> docker-machine
Bash completion has been installed to:
  /usr/local/etc/bash_completion.d

To have launchd start docker-machine now and restart at login:
  brew services start docker-machine
Or, if you don't want/need a background service you can just run:
  docker-machine start
Creating CA: /Users/runner/.docker/machine/certs/ca.pem
Creating client certificate: /Users/runner/.docker/machine/certs/cert.pem
Running pre-create checks...
(default) Image cache directory does not exist, creating it at /Users/runner/.docker/machine/cache...
(default) No default Boot2Docker ISO found locally, downloading the latest release...
Error with pre-create check: "failure getting a version tag from the Github API response (are you getting rate limited by Github?)"
Error: Process completed with exit code 3.
master: connection refused
END datalad-tests-ssh2 LOGS --------
nc: connectx to localhost port 42241 (tcp) failed: Connection refused
nc: connectx to localhost port 42241 (tcp) failed: Connection refused
nc: connectx to localhost port 42241 (tcp) failed: Connection refused

so I guess it boils down to how datalad is installed (from pypi vs straight from git)?

@jwodder
Member

jwodder commented Jan 26, 2021

@yarikoptic Those errors are occurring before datalad is even installed. We've also known about the maint issue for a while, and it continues to fail despite doing what the docker-machine docs say.

@yarikoptic
Member Author

d'oh -- looked into our template:

    {% if ostype == "ubuntu" or ostype == "macos" %}
      - name: Set up SSH target
        shell: bash
        # TODO: Drop the release condition once 0.13.2 is released.
        run: |
          if [ "${{ matrix.version }}" != "release" ]; then
            {% if ostype == "macos" %}

That explains the difference between release and the rest. On release (which we should have started to test against SSH) we do not even bother to set up the target for running SSH tests. So, at least that mystery is not a mystery ;) I will submit a PR now to just disable the SSH setup on OSX, so we get green again, but we still need to figure out why we fail to establish that docker container on OSX.

yarikoptic added a commit that referenced this issue Feb 1, 2021
For not yet known reason we have a problem instantiating a proper docker
container on OSX for ssh testing.  Also I found some logic which we
should have removed after datalad 0.13.2 release while establishing
that docker container on osx and linux -- so I removed conditional

Re OSX: see #42
@jwodder
Member

jwodder commented Feb 11, 2021

@yarikoptic I believe I've finally fixed this. The problem was that SSH was configured to connect to localhost, but when using docker-machine, containers' ports aren't exposed on localhost, they're exposed on the IP address for the docker-machine VM.

PRs: datalad/datalad#5417, #55
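The idea behind the fix can be sketched as follows: ask docker-machine for the VM's address and fall back to localhost only when docker-machine is not in use. This is an illustration under stated assumptions, not the code from the PRs above. `ssh_target_host` is a hypothetical helper, and the VM name `default` is assumed from the `(default)` prefix in the maint log. `docker-machine ip` is the docker-machine subcommand that prints a machine's IP address.

```shell
# Hypothetical helper: print the host an SSH client should use to reach
# a container's published port.
# Argument: docker-machine VM name (assumed to be "default").
ssh_target_host() {
    local machine=${1:-default}
    if command -v docker-machine >/dev/null 2>&1; then
        # With docker-machine, ports are published on the VM's address,
        # not on localhost.
        docker-machine ip "$machine"
    else
        # No docker-machine found: assume a native Docker daemon
        # publishing ports on localhost.
        echo localhost
    fi
}
```

An SSH configuration or a port check would then use `$(ssh_target_host)` wherever `localhost` was previously hard-coded.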

@yarikoptic
Member Author

AWESOME, Thank you @jwodder !

@yarikoptic
Member Author

Eventually we should get back to this and finish either #55 or #58, but currently testing against datalad is still red overall since recent annex changes caused breakages (see datalad/datalad#6492), so we are pretty much blocked by that. We should get back to adding SSH testing as soon as datalad turns green again here.
