Error causing frequent crashes #178

bmueller · 2020-09-27T20:28:21Z

I keep getting the following timeout errors:

Post https://api.push.apple.com/3/device/[xxx]: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

This is quickly followed by the following crash:

panic: runtime error: invalid memory address or nil pointer dereference

They're happening every 10-15 minutes, but only started happening in the last week or so. I tried throttling down the number of requests that were going through the server from around 1k per minute to just 50 or so per minute, but I'm still getting the crash.

I am running go1.12.17 and am using the latest version of apns2 from the master branch.

I'm a relative newbie when it comes to golang, so I'm going to post my code below. Perhaps I'm doing something wrong - but this push notification server has been working without issue for over a year and only recently started having a problem.

Any help you can provide (especially @sideshow ) would be greatly appreciated!

type PushBatch struct {
	PushNotifications []PushNotification `json:"pushNotification"`
}

type PushNotification struct {
	ObjectID    string `json:"objectId"`
	DeviceToken string `json:"deviceToken"`
	Topic       string `json:"topic"`
	CollapseID  string `json:"collapseId"`
	Payload     string `json:"payload"`
	Type        string `json:"pushType"`
}

var prodClient *apns2.Client

func collectPush(rw http.ResponseWriter, req *http.Request) {

	decoder := json.NewDecoder(req.Body)
	var p PushBatch
	err := decoder.Decode(&p)
	if err != nil {
		panic(err)
	}

	count := len(p.PushNotifications)
	notifications := make(chan *apns2.Notification, 50)
	responses := make(chan *apns2.Response, count)

	for i := 0; i < 20; i++ {
		go worker(notifications, responses)
	}

	for _, p := range p.PushNotifications {
		notifications <- n
	}

	for _, p := range p.PushNotifications {
		res := <-responses
	}

	close(notifications)
	close(responses)

	rw.WriteHeader(200)

}

func worker(notifications <-chan *apns2.Notification, responses chan<- *apns2.Response) {
	for n := range notifications {

		// initiate push
		res, err := prodClient.Push(n)

		if err != nil {
			fmt.Printf("Push Error: %v", err)
		}

		responses <- res

	}
}

func main() {

	// auth key
	authKey, err := token.AuthKeyFromFile("XXXX")
	if err != nil {
		log.Fatal("token error:", err)
	}

	// token
	token := &token.Token{
		AuthKey: authKey,
		// KeyID from developer account (Certificates, Identifiers & Profiles -> Keys)
		KeyID: "XXXX",
		// TeamID from developer account (View Account -> Membership)
		TeamID: "XXXXX",
	}

	// client setup
	devClient = apns2.NewTokenClient(token)
	prodClient = apns2.NewTokenClient(token)
	prodClient.Production()

	// apns2.HTTPClientTimeout = 200 * time.Second
	// apns2.TCPKeepAlive = 200 * time.Second

	// get port
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
	}

	// start server
	http.HandleFunc("/", collectPush)
	log.Fatal(http.ListenAndServe(":"+port, nil))

}

One thing I'm wondering - when I call into my server with a new batch of push notifications, does my current code spawn a bunch of new workers on top of any existing workers, thus creating a potentially huge group of workers? What I want to do is just add any new batches of push notifications to an existing queue - if this isn't doing that, would appreciate some sample code showing how to do this.

The text was updated successfully, but these errors were encountered:

bmueller · 2020-09-29T03:37:47Z

To follow up on this, the crashes were my fault - I should only have been calling responses <- res in the worker method if the error from the push was nil. But I'm still seeing Client.Timeout quite frequently, as often as every 5-10 minutes.

Note that the connection to apns isn't idle at all during this time, as I'm sending 50+ (and sometimes 1000+) pushes per minute.

sideshow · 2020-09-29T04:35:37Z

@bmueller The way you've got it set up you're creating a notifications channel inside every HTTP request and then spawning 20 go workers to consume off this channel. This is bound to get you into trouble as you have no back pressure. ie, If you were to get 20,000 requests on your endpoint you would have 400,000 go routines (20,000 * 20) trying to use the same connection to send push notifications.

Granted 20,000 requests at once may be an exaggeration, but the point is that the first thing that would occur under this scenario with push notifications backed up is a timeout.

To avoid this you should create backpressure. One way of doing this under your current code is to create your notifications go channel and your workers inside your main function as opposed to inside the http handler. That way your http handler just needs to deserialise your http payload and push the notifications into the global channel.

The global channel should be buffered (you currently have it at 50 but could probably be 1000). This way you wont wont be setting up and tearing down workers for every request and if you get an influx of http requests they will wait until the channel emptys out a bit. You can then experiment with the amount of workers etc consuming off the channel

bmueller · 2020-09-29T05:23:30Z

@sideshow I figured I was doing something silly like that. Your explanation definitely makes sense for explaining what I'm doing wrong here.

I tried moving the notifications channel but am still getting the Client.Timeout errors fairly frequently. Perhaps I'm still doing something wrong? For example, I was basing the size of the responses channel on the count in the decoded push array but now that it's been moved to main I just set it to 1000. Not sure if that's the right way to do it.

I was also seeing response times getting way, way worse - before this change, response times were usually around 30ms; after the change, response times were over 2,000ms.

var notifications chan *apns2.Notification
var responses chan *apns2.Response

func collectPush(rw http.ResponseWriter, req *http.Request) {

	decoder := json.NewDecoder(req.Body)
	var p PushBatch
	err := decoder.Decode(&p)
	if err != nil {
		panic(err)
	}

	count := len(p.PushNotifications)

	for i := 0; i < count; i++ {
		notifications <- n
	}

	for i := 0; i < count; i++ {
		res := <-responses
	}

	rw.WriteHeader(200)

}

func main() {

	startClients()

	notifications = make(chan *apns2.Notification, 1000)
	responses = make(chan *apns2.Response, 1000)

	for i := 0; i < 50; i++ {
		go worker(notifications, responses)
	}

	// start server
	http.HandleFunc("/", collectPush)
	log.Fatal(http.ListenAndServe(":8080", nil))

}

bmueller · 2020-10-19T21:21:31Z

@sideshow I'm still getting frequent "Client.Timeout exceeded while awaiting headers" errors - at least one every 5 minutes - despite implementing the changes in the code above. Any thoughts on what could be going wrong here?

sideshow changed the title ~~Client.Timeout error causing frequent crashes~~ Error causing frequent crashes Sep 29, 2020

sideshow added the unconfirmed label Sep 29, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error causing frequent crashes #178

Error causing frequent crashes #178

bmueller commented Sep 27, 2020 •

edited

bmueller commented Sep 29, 2020 •

edited

sideshow commented Sep 29, 2020

bmueller commented Sep 29, 2020 •

edited

bmueller commented Oct 19, 2020

Error causing frequent crashes #178

Error causing frequent crashes #178

Comments

bmueller commented Sep 27, 2020 • edited

bmueller commented Sep 29, 2020 • edited

sideshow commented Sep 29, 2020

bmueller commented Sep 29, 2020 • edited

bmueller commented Oct 19, 2020

bmueller commented Sep 27, 2020 •

edited

bmueller commented Sep 29, 2020 •

edited

bmueller commented Sep 29, 2020 •

edited