Add wu-* commands for filesystem, etc... #14

Open
dieterichlawson opened this issue Feb 17, 2013 · 2 comments

@dieterichlawson
Contributor

  1. Add Gorillib::Runner.run command with childprocess gem

  2. Add job commands:

  • wu-ps
  • wu-kill

  3. Add filesystem commands:

  • wu-dump
  • wu-lign
  • wu-ls
  • wu-du
  • wu-cp
  • wu-mv
  • wu-rm
  • wu-mkdir
  • wu-put
  • wu-get
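
For illustration only, a runner of the kind item 1 asks for might look roughly like the sketch below. It is not the proposed Gorillib::Runner.run: the module name RunnerSketch and its Result struct are hypothetical, and it uses Ruby's standard-library Open3 as a stand-in for the childprocess gem so that it runs without extra dependencies.

```ruby
require 'open3'

# Hypothetical stand-in for the proposed Gorillib::Runner.run.
# Assumption: stdlib Open3 is used here in place of the childprocess gem.
module RunnerSketch
  # Captured result of one finished command.
  Result = Struct.new(:stdout, :stderr, :exit_code) do
    def success?
      exit_code == 0
    end
  end

  # Runs a command given as separate argv words and waits for it to finish.
  # With two or more words no shell is involved; a single string would be
  # handed to the shell by Open3, so callers should pass words separately.
  def self.run(*argv)
    stdout, stderr, status = Open3.capture3(*argv)
    Result.new(stdout, stderr, status.exitstatus)
  end
end
```

A wu-* command could then call something like `RunnerSketch.run('hadoop', 'fs', '-ls', '/')` and inspect `stdout` and `exit_code` on the result.
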
@dhruvbansal
Contributor

Many of these proposed wu- binaries seem like unnecessary aliases for already simple commands. wu-mv foo bar isn't so much better than hadoop fs -mv foo bar that it justifies a brand new executable on my $PATH, IMO. Or is the idea that we are "making it easier" or "providing an abstraction above Hadoop" ... ?

Stuff like wu-du and wu-lign (of course) are more useful but wu-du clearly belongs in wukong-hadoop.

I suppose what I'm saying is: can we think about this list a little more critically?

@mrflip
Member

mrflip commented Feb 20, 2013

Yes --

  • for now, these will offer pathname helpers that are extremely useful as your projects grow beyond one or two scripts.
  • they present a more familiar interface -- wu-rm -r, not hadoop fs -rmr -- and a unified experience, with configliere/gnu style options and help (--skip_trash, not -skipTrash)
  • they will let us swap in a thrift back end that is much more performant -- no more waiting 1000ms for the JVM to start.
  • they will later become file system and datastore agnostic: real fs vs hdfs, and local vs Infochimps platform; and maybe even into the data store as appropriate.
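
As a rough sketch of the "familiar interface" point above, the translation from wu-rm style options to the hadoop fs flags named in this thread could look like this. The module WuRmSketch and its methods are hypothetical, not the actual wu-rm implementation; the sketch only builds the argument list and never runs hadoop.

```ruby
# Hypothetical mapping from wu-rm style options to `hadoop fs` flags.
# Assumption: only -r and --skip_trash are supported, as named in the thread.
module WuRmSketch
  # Builds the argv for `hadoop fs`; nothing is executed here.
  def self.hadoop_argv(paths, recursive: false, skip_trash: false)
    raise ArgumentError, 'at least one path is required' if paths.empty?
    argv = ['hadoop', 'fs', recursive ? '-rmr' : '-rm']
    argv << '-skipTrash' if skip_trash
    argv + paths
  end

  # Splits a wu-rm style command line into options and paths,
  # rejects unknown options, and returns the equivalent hadoop argv.
  def self.translate(args)
    flags, paths = args.partition { |arg| arg.start_with?('-') }
    unknown = flags - ['-r', '--skip_trash']
    raise ArgumentError, "unknown option(s): #{unknown.join(' ')}" unless unknown.empty?
    hadoop_argv(paths,
                recursive: flags.include?('-r'),
                skip_trash: flags.include?('--skip_trash'))
  end
end

p WuRmSketch.translate(%w[-r --skip_trash /data/old])
# prints ["hadoop", "fs", "-rmr", "-skipTrash", "/data/old"]
```

Keeping the translation separate from execution is one way to let a different back end (such as the thrift one mentioned above) be swapped in later without changing the command-line surface.
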

The Hadoop file system commands are a true evil, and we've found these aliases really help when doing large project development. You already had most of them in your $PATH as hdp-*.

No strong opinion on wukong vs wukong-hadoop, though they will eventually become non-hadoop-only.

Flip

