[Bug] Malformed committer names/e-Mail addresses cause crashes #76

Open
philip-iii opened this issue Apr 12, 2011 · 4 comments

Comments

@philip-iii
Contributor

It appears that the git parser delivers an empty committer/author, which eventually leads to a crash in DBContentHandler:

DBG: DBProxyContentHandler: commit: a2190644e03b4e1918331a3e4f0459971f66c1e0
Traceback (most recent call last):
  File "./cvsanaly2", line 37, in <module>
    retval = pycvsanaly2.main.main (sys.argv[1:])
  File "/media/DATA/Resources/Tools/cvsanaly/cvsanaly-fork-latest/pycvsanaly2/main.py", line 398, in main
    parser.end()
  File "/media/DATA/Resources/Tools/cvsanaly/cvsanaly-fork-latest/pycvsanaly2/Parser.py", line 63, in end
    self.handler.end()        
  File "/media/DATA/Resources/Tools/cvsanaly/cvsanaly-fork-latest/pycvsanaly2/DBProxyContentHandler.py", line 84, in end
    self.db_handler.commit(item)
  File "/media/DATA/Resources/Tools/cvsanaly/cvsanaly-fork-latest/pycvsanaly2/DBContentHandler.py", line 634, in commit
    log.committer = self.__get_person(commit.committer)
  File "/media/DATA/Resources/Tools/cvsanaly/cvsanaly-fork-latest/pycvsanaly2/DBContentHandler.py", line 256, in __get_person
    name = to_utf8(person.name)
AttributeError: 'NoneType' object has no attribute 'name'

This occurs when the author field from the git log is not properly structured, e.g.

commit 8ba37b53db628bdd4ec2e8d54ada008e66a46100
Author: Keith Schwarz <keith@keithschwarz.com>
Date:   Sun Sep 21 07:14:52 2008 +0200

is fine, while

commit a2190644e03b4e1918331a3e4f0459971f66c1e0
Author: Armen Zambrano Gasparnian (armenzg@mozilla.com>
Date:   Thu Jan 1 00:00:00 1970 +0000

causes a crash (due to the left parenthesis in place of the opening angle bracket). Both examples are taken from the Firefox Mercurial repo converted to Git. The Mercurial repo has the same malformed user ID.

I guess there are two options:

  • make the git parser's regexes more flexible, which in this case may well be overkill, since such malformed inputs are (as far as my observations go) not very common, or
  • improve the error handling down the line (in particular in the DBContentHandler, but possibly elsewhere as well) so that malformed input results in a debug/warning message and a null value in the database rather than a crash; such cases could then be fixed manually if they are indeed not very common.
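
A minimal sketch of the second option, assuming the parser hands over None whenever it cannot extract a person (as the traceback above suggests). The Person class, the get_person_or_none helper, and the logger name are hypothetical stand-ins for illustration, not the actual cvsanaly code:

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("cvsanaly.sketch")


class Person(object):
    """Stand-in for the parser's person record (assumed fields: name, email)."""

    def __init__(self, name, email=None):
        self.name = name
        self.email = email


def get_person_or_none(person, commit_id):
    """Return (name, email) for a parsed person.

    If the parser could not extract a person, log a warning and return
    None so the caller can store a NULL instead of crashing.
    """
    if person is None:
        logger.warning(
            "commit %s: malformed author/committer, storing NULL", commit_id
        )
        return None
    return (person.name, person.email)
```

With a guard like this, the malformed commit would leave a warning in the log and a null value in the database, and the run would continue.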

As a temporary workaround, exporting the log file from git (-s switch), fixing the malformed entries, and then using the corrected log as input (-i switch) should help in this case. In the long term, however, a more robust solution would be preferable: as it stands, a run can waste hours of processing only to fail at a very late stage.

@philip-iii
Contributor Author

It appears that a missing space between the author name and their e-Mail address (in angled brackets) causes a crash as well.

This latter issue seems to be a lot more common (~50 occurrences in the Firefox repo, vs. 5 for the parenthesis case).

@cflewis

cflewis commented Apr 12, 2011

I would probably want to modify the regex in this case. I don't have time to work on it, linzhp might, but I don't know.

We're always happy to accept pull requests! (hint hint! :) )

@philip-iii
Contributor Author

I will give it a try later. I first need to construct a minimal test case so that I won't have to wait too long to see if it works.

@cflewis

cflewis commented Apr 12, 2011

The easiest way to do this is with a doctest, that way the use-case is also documented in code alongside the problem.

I personally run doctests with nose and the command nosetests --with-doctest. Once you get them all passing, send a pull request and I'll merge it in.
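
As a rough illustration of what such a doctest could look like, here is a sketch of a more tolerant author-line parser. The pattern and the parse_author function are hypothetical and are not the regex used in cvsanaly's Git parser; the "Jane Doe" line is an invented example of the missing-space case, while the other two inputs come from the report above:

```python
import re

# Hypothetical pattern (not cvsanaly's): accepts "<" or "(" before the
# address and tolerates a missing space between the name and the address.
AUTHOR_RE = re.compile(
    r"^Author:\s*(?P<name>[^<(]*?)\s*[<(](?P<email>[^<>()\s]+)[>)]\s*$"
)


def parse_author(line):
    """Split a git log Author line into (name, email), or return None.

    >>> parse_author("Author: Keith Schwarz <keith@keithschwarz.com>")
    ('Keith Schwarz', 'keith@keithschwarz.com')
    >>> parse_author("Author: Armen Zambrano Gasparnian (armenzg@mozilla.com>")
    ('Armen Zambrano Gasparnian', 'armenzg@mozilla.com')
    >>> parse_author("Author: Jane Doe<jane@example.com>")
    ('Jane Doe', 'jane@example.com')
    >>> parse_author("Author: no address here") is None
    True
    """
    match = AUTHOR_RE.match(line)
    if match is None:
        return None
    return match.group("name"), match.group("email")
```

The examples in the docstring double as the minimal test case: they run in well under a second with nosetests --with-doctest, or with python -m doctest on the file, so there is no need to re-parse a whole repository to check a regex change.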
