How to transform a table into a csv file using pup? #159

Open
stephane-archer opened this issue Aug 18, 2021 · 3 comments

stephane-archer commented Aug 18, 2021

I have the following input

<!-- a lot of html -->
<table>
<tr>
 <td>
  <p>
   data1
  </p>
 </td>
 <td>
  <p>
   data2
  </p>
 </td>
</tr>
<tr>
 <td>
  <p>
   data3
  </p>
 </td>
 <td>
  <p>
   data4
  </p>
 </td>
</tr>
</table>
<!-- a lot of html -->

and I want the following output

data1 <separator> data2
data3 <separator> data4

Can I do this with pup?
pup.exe "table [magic display function]"

@Anshuman-bansal

Screenshot 2021-09-26 at 10 25 34 PM

It is not installing on Mac. I have both Go and Homebrew.


mazznoer commented Oct 8, 2021

@stephane-archer You can do this with pup and jq.

file.html:

<table>
  <thead>
    <tr>
      <th>Last Digit</th>
      <th>Color</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>black</td>
    </tr>
    <tr>
      <td>1</td>
      <td>red</td>
    </tr>
    <tr>
      <td>2</td>
      <td>green</td>
    </tr>
    <tr>
      <td>3</td>
      <td>yellow</td>
    </tr>
    <tr>
      <td>4</td>
      <td>blue</td>
    </tr>
    <tr>
      <td>5</td>
      <td>magenta</td>
    </tr>
    <tr>
      <td>6</td>
      <td>cyan</td>
    </tr>
    <tr>
      <td>7</td>
      <td>white</td>
    </tr>
  </tbody>
</table>

With the command:

pup -f file.html 'table tbody tr json{}' | jq '.[] | .children | "\(.[0].text),\(.[1].text)"' -r

We get:

0,black
1,red
2,green
3,yellow
4,blue
5,magenta
6,cyan
7,white

@MarceloAmigo

Or, more simply:

pup -f file.html 'table tbody tr td:nth-last-of-type(n+2) text{}'  

For the first case (the table in the original question):

paste <(pup -f file.html 'table tr td:first-child p text{}') <(pup -f file.html 'table tr td:last-child p text{}')
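
Note that paste joins columns with a tab by default; a comma can be requested with -d. As a rough sketch of the same idea with a single pass: if all cells arrive one per line, "paste -d, - -" pairs them into two-column rows. Below, the cells function is a hypothetical stand-in for the pup output, and the stray indentation and blank lines it prints are an assumption about what text{} may emit for the indented HTML in the question, not captured output.

```shell
# Hypothetical stand-in for: pup -f file.html 'table tr td p text{}'
# Assumption: cells arrive in document order, possibly with
# leading/trailing whitespace and blank lines in between.
cells() {
  printf '%s\n' '' '   data1' '' '   data2  ' '' '   data3' '   data4'
}

# 1. sed strips leading and trailing whitespace from every line.
# 2. grep drops the lines that are now empty.
# 3. "paste -d, - -" reads two lines at a time from stdin and joins
#    them with a comma, so each pair of cells becomes one row.
csv=$(cells \
  | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
  | grep -v '^$' \
  | paste -d, - -)

printf '%s\n' "$csv"
```

For a table with three columns the last step would become "paste -d, - - -". This does no CSV quoting, so cells that themselves contain commas or quotes would need extra handling.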
