joining multiple fields #179

Open
janetvanderpuye opened this issue May 30, 2018 · 2 comments

janetvanderpuye commented May 30, 2018

Hi, I'm new to rq, so forgive me if this is a noob question. This is more of a question than a bug report. I looked around and could not find a solution, which is why I'm posting here. I have a JSON structure similar to this:

 {
	"id":"123456",
	"header":{"more":false},
	"result":[
		{
			"e_id":"XXX",
			"e_type":"ENTITY", 
			"identifiers": [
				{
					"type":"NAME",
					"id":"XXX_0",
					"name":"Segosan Itabachi",
					"modified_ts":"2017-04-06 20:27:02.0",
					"main":true
				},{
					"type":"TAG",
					"id":"XXX_1",
					"name":"Segosan",
					"modified_ts":"2017-04-06 20:27:02.0",
					"main":false
				},{
					"type":"NAME",
					"id":"XXX_2",
					"name":"Segosan Itabachi-san",
					"modified_ts":"2017-04-06 20:27:02.0",
					"main":false
				}
			]
		}
	]
}

What I want to do is filter and flatten the whole structure into a sort of CSV. The selection criterion is: go through all the objects in the identifiers array, check whether the identifier's type == "NAME", and if so output one row with the e_id followed by that identifier's id and name. For the above, the output would look like:

"XXX", "XXX_0", "Segosan Itabachi"
"XXX", "XXX_2", "Segosan Itabachi-san"

So far, I'm stuck at rq 'at "result"|spread ' < rds_tickers.json. If I use the map function with the e_id field, I can't access the identifiers. If I use the flatMap function with the identifiers array, then I no longer have access to the e_id field. Any tips or pointers in the right direction would be greatly appreciated.

mikaelstaldal commented Nov 20, 2018

You can do this:

rq 'at "result" | spread | map (x) => { _.map(_.filter(x.identifiers, function(e) {return e.type === "NAME"}), function(e) {return [x.e_id, e.id, e.name]}) } | spread'

And if you use the latest version just released, you can add -V to get CSV output.

It is admittedly quite clumsy, though. I'm not sure whether there is an easier way.

@mikaelstaldal

Slightly better version:

rq 'at "result" | spread | map (x) => { _.map(_.filter(x.identifiers, e=>e.type === "NAME"), e=>[x.e_id, e.id, e.name]) } | spread'
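For anyone who wants to see the logic without the rq pipeline around it, here is a minimal sketch in plain JavaScript, outside rq and without lodash. It assumes the document shape from the question (trimmed to the relevant fields); `nameRows` is a hypothetical helper name, not part of rq. The key point is that the filter and map over `identifiers` run inside a function that still has the parent entity in scope, so `e_id` stays reachable. The final `flatMap` does roughly what the trailing `spread` seems to do in the query above: it turns the per-entity lists of rows into one flat list.

```javascript
// Plain JavaScript sketch of the same transformation (no rq, no lodash).
// `doc` mirrors the sample from the question, trimmed to the relevant fields.
const doc = {
  id: "123456",
  header: { more: false },
  result: [
    {
      e_id: "XXX",
      e_type: "ENTITY",
      identifiers: [
        { type: "NAME", id: "XXX_0", name: "Segosan Itabachi", main: true },
        { type: "TAG", id: "XXX_1", name: "Segosan", main: false },
        { type: "NAME", id: "XXX_2", name: "Segosan Itabachi-san", main: false },
      ],
    },
  ],
};

// Hypothetical helper: build one row per NAME identifier of a single entity.
// `entity` stays in scope inside the callbacks, so e_id is still accessible.
function nameRows(entity) {
  return entity.identifiers
    .filter((e) => e.type === "NAME")
    .map((e) => [entity.e_id, e.id, e.name]);
}

// Flatten the per-entity row lists into a single list of rows.
const rows = doc.result.flatMap(nameRows);

for (const row of rows) {
  console.log(JSON.stringify(row));
}
```

This needs a runtime with `Array.prototype.flatMap` (Node 11 or later). Each printed line is one `[e_id, id, name]` row for an identifier whose type is "NAME".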
