[RFC] CLI Converter

Romain Dorgueil edited this page Oct 8, 2017 · 3 revisions
Subject: Command line converter from one format to another
Author: Romain Dorgueil
Created: Jul 16, 2017
Updated: Oct 8, 2017
Target: 0.5
Status: Minimal feature released (0.5), improvements in progress (0.6)

Abstract

This is a proposal to run simple file-to-file transformations directly from the console.

There is no point in writing a transformation file for the simplest jobs: file conversion is needed very often, and all the tools to do it are already available in bonobo. This would open up many new use cases for bonobo, such as calling it from bash scripts, and it would also make bonobo useful to people who just want a conversion, even if they do not know Python or do not want to write any.

Use cases

  • Convert one format to another (0.5).
  • Convert encoding (with or without format change) (0.6).
  • Convert one format to the same format with different options (0.6).
  • Download a file and write it in another format (0.6).
  • Read a file and POST it over HTTP in another format (0.6?).
  • Geo-related file formats (0.6?).

Special highlight from several people working with geographical data: they report that there are no good open source tools for converting between the various GIS file formats (KML, shapefiles, GeoJSON, ...), so bonobo could fill a niche here.

Proposal

  1. Convert one format to another, from the local filesystem, from HTTP, or from another readable storage source (S3, ...). The format can be either specified explicitly or guessed from the file's MIME type, using the mimetypes module from the stdlib.
bonobo convert test.csv test.json -r csv -w json
bonobo convert test.txt test.dat --reader=csv --writer=json
bonobo convert https://httpbin.org/xml test.json -r xml
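
The format-guessing rule described in this item could be sketched roughly as follows. This is only an illustration, not bonobo's actual implementation: the `guess_format` helper and the `MIME_TO_FORMAT` table are hypothetical names introduced here, and only `mimetypes.guess_type` comes from the standard library.

```python
import mimetypes

# Hypothetical mapping from MIME type to a short format name.
# This is an assumption for illustration, not bonobo's real registry.
MIME_TO_FORMAT = {
    "text/csv": "csv",
    "application/json": "json",
    "application/xml": "xml",
    "text/xml": "xml",
}


def guess_format(path, explicit=None):
    """Return the format name for `path`.

    An explicitly given format (as with -r/-w) always wins; otherwise the
    MIME type is guessed from the file name with the stdlib.
    """
    if explicit:
        return explicit
    mime, _encoding = mimetypes.guess_type(path)
    fmt = MIME_TO_FORMAT.get(mime)
    if fmt is None:
        raise ValueError(
            "Cannot guess the format of %r, please specify it explicitly." % path
        )
    return fmt
```

Under these assumptions, `guess_format("test.csv")` resolves through the `text/csv` MIME type, while a name without a usable extension (such as the `https://httpbin.org/xml` URL) cannot be guessed from the name alone, which is presumably why the example above passes `-r xml`.
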
  2. Override default options of the readers/writers?
bonobo convert https://httpbin.org/xml test.json -r xml --option encoding=utf-8 --reader-option foo=bar

How should the various arguments be named? "option", "reader-option" and "writer-option" look good for the long forms. If argparse supports multi-letter short forms, then -o/-wo/-ro would be fine. Otherwise, I suggest uppercase -O, -W, -R, which look more consistent.
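
To make the naming question more concrete, here is a minimal argparse sketch using the uppercase short forms. It is an illustration only, not the released bonobo command: the `build_parser`, `parse_option` and `resolve_options` helpers are hypothetical, and the rule that `--option` applies to both sides while `--reader-option`/`--writer-option` take precedence is an assumption made for this example.

```python
import argparse


def parse_option(text):
    """Turn a KEY=VALUE string into a (key, value) tuple."""
    key, sep, value = text.partition("=")
    if not sep or not key:
        raise argparse.ArgumentTypeError("expected KEY=VALUE, got %r" % text)
    return key, value


def build_parser():
    # Hypothetical parser layout; option names follow the proposal above.
    parser = argparse.ArgumentParser(prog="bonobo convert")
    parser.add_argument("input")
    parser.add_argument("output")
    parser.add_argument("-r", "--reader")
    parser.add_argument("-w", "--writer")
    # Each option flag may be repeated; values accumulate in a list.
    for flags, dest in (
        (("-O", "--option"), "options"),
        (("-R", "--reader-option"), "reader_options"),
        (("-W", "--writer-option"), "writer_options"),
    ):
        parser.add_argument(
            *flags,
            dest=dest,
            action="append",
            type=parse_option,
            default=[],
            metavar="KEY=VALUE",
        )
    return parser


def resolve_options(args):
    """Merge shared options with reader/writer specific ones.

    Assumption: specific options override shared ones with the same key.
    """
    shared = dict(args.options)
    reader = dict(shared, **dict(args.reader_options))
    writer = dict(shared, **dict(args.writer_options))
    return reader, writer
```

With this layout, the command line from the example above (`-r xml --option encoding=utf-8 --reader-option foo=bar`) would give the reader both `encoding` and `foo`, and the writer only `encoding`. argparse also accepts multi-character single-dash option strings such as `-ro`, but next to `-r` they are easy to misread as `-r` followed by an attached value, which is one argument in favour of the uppercase variants.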

  3. Add transformation?

Once the transformation factory feature is done, we can think about a DSL to configure simple transformations from the command line (something like the jq command-line tool?).

Work in progress

References