A declarative struct-tag-based HTML unmarshaling or scraping package for Go built on top of the htmlquery library

goxtag

This package is the XPath counterpart of github.com/andrewstuart/goq: it fills struct fields from HTML using XPath expressions instead of CSS selectors.

Install

go get -u github.com/azlotnikov/goxtag

Example

package main

import (
    "log"
    "net/http"

    "github.com/azlotnikov/goxtag"
)

// example is a structured representation of the GitHub file name table.
type example struct {
    Title string `xpath:"//h1"`
    Files []string `xpath:".//table[contains(concat(' ',normalize-space(@class),' '),' files ')]//tbody//tr[contains(concat(' ',normalize-space(@class),' '),' js-navigation-item ')]//td[contains(concat(' ',normalize-space(@class),' '),' content ')]"`
}

func main() {
    res, err := http.Get("https://github.com/azlotnikov/goxtag")
    if err != nil {
        log.Fatal(err)
    }
    defer res.Body.Close()

    var ex example

    err = goxtag.NewDecoder(res.Body).Decode(&ex)
    if err != nil {
        log.Fatal(err)
    }

    log.Println(ex.Title, ex.Files)
}

Details

  • CannotUnmarshalError is described in unmarshal-error.go
  • Use xpath_required:"false" to suppress the "node not found in document" error when a field's XPath matches no nodes
  • Use xpath:"-" to ignore a field
  • Use Unmarshal(b []byte, v interface{}) error for custom unmarshaling
