Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

checking for git repo produces confusing messages #594

Open
julie777 opened this issue Feb 10, 2022 · 9 comments
Open

checking for git repo produces confusing messages #594

julie777 opened this issue Feb 10, 2022 · 9 comments
Labels
enhancement New feature or request

Comments

@julie777
Copy link

Description of your problem

putup -vv myprojects

Please provide the full traceback using the --very-verbose flag.

     run      git.exe rev-parse --git-dir
    info      last command failed with "fatal: not a git repository (or any of the parent directories): .git"

Please provide any additional information below.

In this instance the output of the command is not important because the normal case is that the new project directory is not in an existing git repo. The error/return code of the git command is all that matters. This will be confusing to most users. It was to me.

I am running many scenarios trying to learn the codebase so that I can contribute. It took me a few minutes to look at the code and understand. I doubt all users want to do that.

I am working on a pull request.

Versions and main components

  • PyScaffold Version: 4.1.4.post1.dev16+gc4469f0.d20220210
  • Python Version: 3.10.2
  • Operating system: Windows 10 / git-bash
  • How did you install PyScaffold: pipx
@abravalheri
Copy link
Collaborator

Hi @julie777, thank you very much for investigating that.

That is a tricky one... I think we want to know exactly what is happening when the flag --very-verbose is on. That helps debugging.

For example a few months ago @FlorianWilhelm managed to fix a very scary bug that was happening because git was failing to run. Having the message being printed is useful for when the user is going to report the issue in GitHub.

Maybe we should change the text in the message instead, in a way it is less confusing? What do you think?

Question, does the message appear with the regular --verbose?

@FlorianWilhelm
Copy link
Member

Hi @julie777, thanks for looking into this. It's exactly as @abravalheri explained, we see --very-verbose as a flag for debugging and also list failing commands that are supposed to fail like git in this very context. I also see your point that this might be confusing to users. Do you have an idea how to avoid this but at the same time that we still get the information from the actual shell command?

I think in this context the actual culprit is the fact that we call shell commands, i.e. git. This is a very bad style and I always tell my junior programmer colleagues to never to this :-) In our case, there is no other way as Linus Torvalds decided to not have Git separated into a library and a small frontend application. For this reason there is no proper way to call Git from Python at least not that I know of. Maybe things changed and there is now?

@julie777
Copy link
Author

I understand now.
I'm still learning about the code base. Please be patient with me and continue to point me in the right direction.

One of the things that confused me was that in the code I saw with logger.indent() but I didn't really see the indent in the output. Having the first column be right justified hides the nested structure of the second column.

I really like the use of the logging module to provide verbose output. It is infinitely configurable. I also like the addition of indent nesting. I have always found that python logging is a good way to both produce actual log files and to provide output to the user in a very configurable way.

I think there are things that could help, and they relate to this issue (although perhaps the issue should be renamed):

  1. clarifying debugging output vs verbose output for the user could help
  2. indentation that matched the structure of what is actually happening would help to explain the output
  3. making sure that there is an entry in the log stating what is being done that has nested within it how something is being done.

I will attempt to explain what I mean now. Don't think of this as criticism as much as just how a new person might interpret the current output and some possible ways that it could be changed.

Debug Logging

Now that I understand the purpose of --very-verbose I completely understand showing every primitive including failure results that are expected. But, I didn't know or expect --very-verbose to mean --debug. Perhaps we could change --very-verbose to --debug or at least change the help doc for it to indicate its purpose.

Indentation Indicates Grouping

Here is a chunk of my log that has the original failure that confused me.

      invoke  pyscaffold.structure:define_structure
      invoke  pyscaffold.actions:verify_project_dir
       chdir    .
         run      git.exe rev-parse --git-dir
    last command failed with "fatal: not a git repository (or any of the parent directories): .git"
      invoke  pyscaffold.update:version_migration
      invoke  pyscaffold.structure:create_structure
      create    myporoject
      create    myproject\.gitignore
      create    myproject\.coveragerc
      create    myproject\.readthedocs.yml

If I ignore the first column then I can easily see that invoking the create_structure action is using primitives to implement the action.

pyscaffold.structure:create_structure
      create    myproject
      create    myproject\.gitignore

But, to a newcomer the first column right justification obscures the single space indent and nothing says anything about action and the verbose pyscafflod.structure:define_structure is also noisy.

If I change the output to

invoke action create_structure
      create    dummy2
      create    dummy2\.gitignore

it becomes obvious that create dummy2 is a sub-step of invoke action create structure. In addition I would like the indentation to be 4 spaces to be easily discernible and to match the standard indentation for Python. :-)

Logging What in Addition to How

Looking at this part of the log

 invoke  pyscaffold.actions:verify_project_dir
       chdir    .
         run      git.exe rev-parse --git-dir
    last command failed with "fatal: not a git repository (or any of the parent directories): .git"

Several things aren't as clear as they could be. I see verify_project_dir and I think that it will check if the project directory exists. That is just how I interpret verify. But it is in fact doing other things. I am not going to look at the code for this, but rather just toss something out to give you an idea of how adding the what could help.

invoking action validate_project_dir
    checking if directory already exists                # This is the "what"
        ERROR: directory exists: use "--force" to overwrite it         # would  exit here if this really happened
    changing working directrory to parent of project
        chdir .            # this might be better if it was expanded to a $HOME relative path or a full path
    checking to make sure project will not be in an existing git repo    # This is the "what"
        run "git.exe rev-parse --git-dir"
            exit code: 1
            stdout:
            stderr:  last command failed with "fatal: not a git repository (or any of the parent directories): .git"
        project dir is not in existing git repo
invoking action create_project_directory_structure in ~/somepath/projects
    create directory myproject
    create directory myproject/docs
    create directory myproject/src
    create directory myproject/tests
    etc.
invoking action create_project_files
    create file README.txt from template somepath/README.txt.template
    etc.

There are still some inconsistencies, such as chdir which is a "primitive" and create which is a description of the primitive "mkdir". That said, I think you can understand my thoughts.

Using python wraps to format captured stdout and stderr and presenting in a way that doesn't just wrap in an uncontrolled way making the output messy can help. (This conflicts with the standard of keeping log entries to a single line, but I believe that only applies to applications like web servers that run continuously and have log aggregated and processed by other tools. I think pyscaffold logs will only be used by people.)

Please remember that the example output is both fictional and not tied to actual current output, and that I tried to keep it brief.

I haven't yet spent the time to understand the log wrapper that is doing all the logging, why it is there, what all it does, etc. I will do that before my next comment.

@abravalheri
Copy link
Collaborator

That sounds like a very interesting change!

@abravalheri abravalheri added the enhancement New feature or request label Feb 14, 2022
@julie777
Copy link
Author

I think in this context the actual culprit is the fact that we call shell commands, i.e. git. This is a very bad style and I always tell my junior programmer colleagues to never to this :-) In our case, there is no other way as Linus Torvalds decided to not have Git separated into a library and a small frontend application. For this reason there is no proper way to call Git from Python at least not that I know of. Maybe things changed and there is now?

Years ago when I was using bazaar for SCM I used a python library to access it. It worked quite well.

I know that there are at least two git libraries for python. I will take a look at them and let you know.

I don't have any real objection to running sub-processes when necessary. As you suggest libraries can be nicer in many ways.

I did notice that shell.py default to running subprocesses with shell=True. I am curios if you remember the reason for this. I have always tried to avoid it for a couple of reasons.

  • according to the python documentation it is a security hole
  • I really prefer to use a list of arguments and not have to manually handle quoting.
    • It used to be required to turn arguments into a single string when using shell=True but according to the current doc that is no longer required.

@abravalheri
Copy link
Collaborator

abravalheri commented Feb 14, 2022

At some point I studied running the command just from a list of string without shell=True. The code for that should be somewhere in the git history or in the closed PRs... it should not be very difficult to make it though.

I think the main reason here is to maximise compatibility. Git might be tricky to run. The program in $EDITOR as well...

@julie777
Copy link
Author

After doing some looking for a python git package, I only found one, GitPython.
I also remembered why the bazaar python package was so nice - because bazaar was written in Python and had an api.

As for GitPython I will look at it more closely to see if it provides anything that helps us. Since git isn't written in Python it is a wrapper using subprocess just like we already are doing. It might provide an easier to use API so I will review it.

@julie777
Copy link
Author

@abravalheri

Question, does the message appear with the regular --verbose?

Yes it does. I have tried all the possible options, but for a simple putup <project> both -v and -vv seem to produce exactly the same output.

@abravalheri
Copy link
Collaborator

Yes it does. I have tried all the possible options, but for a simple putup <project> both -v and -vv seem to produce exactly the same output.

I think that last time we had a bug, we made all the log messages more verbose because it was hard to find out what was going on.

Maybe we can hide some of the lines that can be confusing and just display them on --very-verbose.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants