Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RiveScript Monolithic Version Control #1

Open
kirsle opened this issue Feb 9, 2017 · 2 comments
Open

RiveScript Monolithic Version Control #1

kirsle opened this issue Feb 9, 2017 · 2 comments

Comments

@kirsle
Copy link
Member

kirsle commented Feb 9, 2017

RiveScript may be able to benefit from having a monolithic version control repository rather than having one repo per implementation (Perl, Python, JavaScript, Go and Java). It could also encompass and make obsolete the other closely related repos: rivescript-wd and rsts.

High Level Rationale

This would involve a temporary disruption for third party contributors as the repos get rearranged and all the implementations are migrated into the common aichaos/rivescript repo. But the benefits gained from having a common repository for all implementations may include:

  • Adding a new feature to RiveScript as a whole can be made easier. For example, if I'm adding a new command to the language, I can have all 5 implementations get the update simultaneously with one GitHub pull request.

    The current state of affairs is that I would have to sit down all at once, clone all 5 repositories and deal with the overhead of making the changes and pull requests separately for each version. This is such an obstacle that I haven't added a new global feature to RiveScript in quite a while, because I don't want the published module versions to diverge from each other for any length of time.

  • Unit tests can be consolidated and kept in sync across all versions. Currently, each implementation has their own copies of unit test fixtures in their own repo so that Travis CI can run tests on push, and there is also the aichaos/rsts project which provides YAML test fixtures for the RiveScript language itself.

    Currently a lot of the YAML fixtures in rsts are duplicated from the tests in the individual implementations. And currently a lot of the implementations fail the RSTS test suite due to random little quirks in their implementations (for example, the exact wording they use for error messages).

    A mono-repo could bring the YAML fixtures from RSTS into the repo and all the unit tests can become RSTS tests, so that they are all kept in sync at all times and it will be easier to iron out their little quirks to all behave identically. And it's friendly for Travis CI which can still run automated unit tests.

  • I can make more use of GitHub features like Projects and Wikis. Currently most of the repos don't have wikis enabled until I have a repo-specific need for them (i.e., for the JavaScript version to document the Async issues that come up frequently in that project).

    If all the RiveScript implementations are merged together into the aichaos/rivescript project, the existing RiveScript Community Wiki can be merged with the random odd pages from the projects' own wikis.

Impact

Impact on Contributors

Having a single repo containing 5 different programming language implementations may deter contributions from developers who are only familiar with a subset of those languages: if they fix or change a behavior, the unit tests might pass for the language they modified but fail for the ones they didn't.

This may be somewhat alleviated by using the Git Flow workflow and have the master branch be the "production ready" one where all unit tests are passing and all builds are stable.

All pull requests would instead be directed at the develop branch which is allowed to fail tests from time to time. So if a contributor fixes the JavaScript implementation and breaks the other four, those can be patched by others (or me) later before the next production-ready merge to master can be done.

Impact on Implementations

The Perl, Python, Java and JavaScript implementations shouldn't be affected too much, as their GitHub URLs aren't very important in those languages. Their links on e.g. npm and PyPI would just be updated to point to the mono-repo.

The Go version would need to get a new module name, since Go ties the GitHub source address to the module's name by default. Options may include:

import (
    // The old (current) import path
    "github.com/aichaos/rivescript-go"

    // The new GitHub import path
    "github.com/aichaos/rivescript/go"

    // Could also make use of vanity import URLs...
    // but the gopkg.in one looks ugly IMHO.
    "gopkg.in/aichaos/rivescript/go.v1"

    // A self hosted vanity URL but it would require me to maintain the
    // URL and make sure it doesn't go down, or it will break `go get`
    "rivescript.com/go"

    // I could use GitHub Pages to remove the burden of self hosting
    // a vanity URL though; the Go version can have "/rivescript" because
    // no other language uses these kinds of import paths.
    "aichaos.github.io/rivescript"
)

On a bright note: I've been struggling with the code layout in rivescript-go because I don't like a large number of Go source files cluttering up the root of the repo alongside the meta files like README.md, Changes.md and .travis.yml; as such I kept the majority of the source code (with the tons of unit tests) under a src/ subdirectory and created a wrapper API in the root of the repo instead.

If the Go source for RiveScript were kept at aichaos/rivescript/go, then the /go directory can replace the src one and all the Go source files can be kept in a sane place together. The root of the repo, having nothing to do with the Go specific implementation, can contain only meta files and no program code at all.

This would simplify the Go API as much as I've been wanting.

Impact on Travis CI

The root of the repo could probably add a Makefile for Travis CI to use which would handle the necessary logic to run through the unit tests on each implementation.

@kaidesu
Copy link

kaidesu commented Feb 15, 2017

I think this unnecessarily raises the level of complexity too much to be helpful. The biggest wall it'll create is in collaboration with other developers.

And to address the file structure of the Go implementation - I'm very much a fan of keeping all my code within a src directory, where I will further categorize classes and files into subdirectories for organizational purposes. Tests I keep at the root within their own tests directory as well. This keeps the root of the project minimal and clean, and organized in a way that is easy to navigate.

If you take a look at my PHP implementation, you'll find a structure like the following:

+ logs/
+ resources/
+ src/
+ tests/
.gitattributes
.gitignore
LICENSE
README.md
composer.json
phpunit.xml
rivescript

@kirsle
Copy link
Member Author

kirsle commented Feb 16, 2017

With Go, the project structure becomes a more important problem because of the tight coupling between repository URLs (and therefore structure) and the packages in them. To put the bulk of the source code into a src/ subdirectory, I had to make a "wrapper" package at the root of the repository anyway so that I could provide a sane API to developers while also keeping the root of the repo as minimal as possible.

With the current layout, github.com/aichaos/rivescript-go is one package and github.com/aichaos/rivescript-go/src is a wholly separate one, which has a very similar API but that I encourage developers not to use... but the whole thing feels clumsy to me. The reason the src package has a similar API is that it has to export all its things up to the root package, because in Go any symbol with a capitalized first letter is exported to be used by other packages and lowercase names are internal use only (within the same package that defined it). For the high level wrapper class to get at all the functions and types of the src class, the src has to export all those symbols publicly.

Unit tests in Go become a whole other can of worms. It's possible to put the tests into a separate tests folder (and I played with that idea back and forth while working on the tests), but you end up losing a lot of the nice features of the Go tooling, such as code coverage reports. In order to get code coverage (and therefore nice highlighting of untested blocks of code in my text editor), the unit tests need to exist in the same package as the code they're testing (and be able to test internal functions and types). If I put the code into a separate tests directory, it can only test the public side of the API by treating it as a third party module. This works as far as running the tests go, but you lose the code coverage reports. It's not too bad of a loss, I suppose, because I don't bother caring about code coverage for the other implementations, but it's a really nice feature that's so easy to deal with in Go that it'd be a shame not to have it just for the sake of organizing my repository better.


Some ideas I have that could help me with the Go codebase without going to a mono-repo:

  • Find a way to use the rsts test suite in the Go version and get rid of all the current unit tests in favor of that.
    • Maybe by using git submodules to clone the rsts repo into a folder in the rivescript-go repo, and run the tests out of it; and make sure that all works with Travis CI for automated testing.
    • I'd lose code coverage reporting (most likely), but I could get rid of all of the *_test.go files and only have one (or less). Less source files altogether means I might be able to just bite the bullet and get rid of the src/ package and put everything into the root of the repo and solve that problem too.
  • Just move the Go implementation into the rivescript repo anyway, as though I'm going for a mono-repo, and I could get the benefits mentioned in the original post with organizing the repo. The src/ sub-package would become the go/ folder in rivescript without making the import URL look too silly or stuttery (e.g. github.com/aichaos/rivescript-go/go looks worse than github.com/aichaos/rivescript/go)
    • This would give off the impression that I'm showing favoritism toward the Go version, however, since it would get to use the rivescript repo name directly rather than the others having names like rivescript-python
  • Or just throw in the towel and move everything from the src subpackage into the main package and just deal with the fact that I'll have 20 *.go files sitting in the same place as README.md, Changes.md, .travis.yml, .gitignore and so on. This feels very cluttered to me but other Go developers seem to care a lot less about it than I do, so, :shrug:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants