Bug/Feat: Performance degradation with multiple requests #86

Open

voultest opened this issue Aug 27, 2023 · 8 comments

@voultest

Issue: With 50 simultaneous requests at a time we see 5-8 times slower performance compared to Node.js. Results are returned much more slowly.

Possible cause: Is it because of a semaphore lock? I also tried calling dbConnect within qHandler.

Test Code:

package main

import (
    "crypto/tls"
    "database/sql"
    "encoding/json"
    "fmt"
    "log"
    "net"
    "net/http"
    "time"

    "github.com/trinodb/trino-go-client/trino"
)

// Response is the JSON body returned by qHandler.
type Response struct {
    Status  string `json:"status"`
    Message string `json:"message"`
}

// Payload is the JSON request body; it carries the SQL text to run.
type Payload struct {
    Query string `json:"query"`
}

var dbConn *sql.DB

func dbConnect() *sql.DB {
    foobarClient := &http.Client{
        Transport: &http.Transport{
            Proxy: http.ProxyFromEnvironment,
            DialContext: (&net.Dialer{
                Timeout:   30 * time.Second,
                KeepAlive: 30 * time.Second,
            }).DialContext,
            MaxIdleConns:          500,
            IdleConnTimeout:       90 * time.Second,
            TLSHandshakeTimeout:   1 * time.Second,
            ExpectContinueTimeout: 1 * time.Second,
            TLSClientConfig: &tls.Config{
                InsecureSkipVerify: true,
            },
        },
    }

    // Register the custom HTTP client and reference it from the DSN;
    // otherwise the driver falls back to its default client.
    trino.RegisterCustomClient("foobar", foobarClient)
    dsn := "https://" + "<username>" + ":" + "<password>" + "@<hostname>" + "?custom_client=foobar"
    db, err := sql.Open("trino", dsn)
    if err != nil {
        panic(err)
    }
    return db
}

func qHandler(w http.ResponseWriter, r *http.Request) {
    var p Payload
    if err := json.NewDecoder(r.Body).Decode(&p); err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }

    rows, err := dbConn.Query(p.Query)
    if err != nil {
        fmt.Println(err)
        return
    }
    defer rows.Close()

    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(http.StatusOK)
    json.NewEncoder(w).Encode(&Response{"200", "hello"})
}

func main() {
    dbConn = dbConnect()
    defer dbConn.Close()
    http.HandleFunc("/api", qHandler)
    fmt.Printf("Starting server at port 8080\n")
    if err := http.ListenAndServe(":8080", nil); err != nil {
        log.Fatal(err)
    }
}

Please find some profiles below.

CPU profile:

(pprof) top5
Showing nodes accounting for 3160ms, 30.30% of 10430ms total
Dropped 530 nodes (cum <= 52.15ms)
Showing top 5 nodes out of 335
      flat  flat%   sum%        cum   cum%
    1490ms 14.29% 14.29%     1490ms 14.29%  runtime/internal/syscall.Syscall6
     780ms  7.48% 21.76%      780ms  7.48%  runtime.futex
     340ms  3.26% 25.02%      370ms  3.55%  crypto/internal/bigmod.(*Nat).shiftIn
     290ms  2.78% 27.80%      310ms  2.97%  runtime.step
     260ms  2.49% 30.30%      260ms  2.49%  crypto/internal/edwards25519/field.feMul
(pprof) top5 -cum
Showing nodes accounting for 0.03s, 0.29% of 10.43s total
Dropped 530 nodes (cum <= 0.05s)
Showing top 5 nodes out of 335
      flat  flat%   sum%        cum   cum%
         0     0%     0%      2.60s 24.93%  crypto/tls.(*Conn).HandshakeContext (inline)
     0.01s 0.096% 0.096%      2.60s 24.93%  crypto/tls.(*Conn).handshakeContext
         0     0% 0.096%      2.59s 24.83%  net/http.(*persistConn).addTLS.func2
         0     0% 0.096%      2.52s 24.16%  crypto/tls.(*Conn).clientHandshake
     0.02s  0.19%  0.29%      2.44s 23.39%  runtime.systemstack

Block profile:

(pprof) top5
Showing nodes accounting for 42.02hrs, 99.95% of 42.04hrs total
Dropped 90 nodes (cum <= 0.21hrs)
Showing top 5 nodes out of 28
      flat  flat%   sum%        cum   cum%
  42.02hrs 99.95% 99.95%   42.02hrs 99.95%  runtime.selectgo
         0     0% 99.95%    0.41hrs  0.97%  database/sql.(*DB).Query
         0     0% 99.95%    0.41hrs  0.97%  database/sql.(*DB).QueryContext
         0     0% 99.95%    0.41hrs  0.97%  database/sql.(*DB).QueryContext.func1
         0     0% 99.95%    0.41hrs  0.97%  database/sql.(*DB).query
(pprof) top5 -cum
Showing nodes accounting for 151274.19s, 99.95% of 151354.15s total
Dropped 90 nodes (cum <= 756.77s)
Showing top 5 nodes out of 28
      flat  flat%   sum%        cum   cum%
151274.19s 99.95% 99.95% 151274.19s 99.95%  runtime.selectgo
         0     0% 99.95%  74079.35s 48.94%  net/http.(*persistConn).writeLoop
         0     0% 99.95%  72588.09s 47.96%  net/http.(*persistConn).readLoop
         0     0% 99.95%   1607.39s  1.06%  net/http.(*conn).serve
         0     0% 99.95%   1607.39s  1.06%  net/http.(*ServeMux).ServeHTTP

Mutex profile:

(pprof) top5
Showing nodes accounting for 7.43ms, 99.79% of 7.44ms total
Dropped 12 nodes (cum <= 0.04ms)
Showing top 5 nodes out of 32
      flat  flat%   sum%        cum   cum%
    7.32ms 98.40% 98.40%     7.32ms 98.40%  sync.(*Mutex).Unlock
    0.06ms  0.84% 99.24%     0.06ms  0.84%  context.(*cancelCtx).cancel
    0.04ms  0.56% 99.79%     0.04ms  0.56%  internal/singleflight.(*Group).doCall
         0     0% 99.79%     0.06ms  0.84%  context.(*timerCtx).cancel
         0     0% 99.79%     0.06ms  0.84%  context.WithDeadlineCause.func2
(pprof) top5 -cum
Showing nodes accounting for 7.32ms, 98.40% of 7.44ms total
Dropped 12 nodes (cum <= 0.04ms)
Showing top 5 nodes out of 32
      flat  flat%   sum%        cum   cum%
    7.32ms 98.40% 98.40%     7.32ms 98.40%  sync.(*Mutex).Unlock
         0     0% 98.40%     5.41ms 72.73%  net/http.(*persistConn).readLoop
         0     0% 98.40%     5.36ms 71.96%  net/http.(*Transport).cancelRequest
         0     0% 98.40%     1.25ms 16.74%  net/http.(*Transport).dialConnFor
         0     0% 98.40%     1.25ms 16.74%  net/http.(*Transport).putOrCloseIdleConn

Heap profile:

(pprof) top5
Showing nodes accounting for 5.01MB, 19.23% of 26.06MB total
Dropped 18 nodes (cum <= 0.13MB)
Showing top 5 nodes out of 119
      flat  flat%   sum%        cum   cum%
    2.51MB  9.63%  9.63%     2.51MB  9.63%  bufio.NewWriterSize (inline)
      -2MB  7.68%  1.95%     3.51MB 13.48%  net/http.(*Transport).dialConn
    1.50MB  5.76%  7.71%     1.50MB  5.76%  crypto/tls.Client (inline)
    1.50MB  5.76% 13.47%     1.50MB  5.76%  crypto/aes.(*aesCipherGCM).NewGCM
    1.50MB  5.76% 19.23%     1.50MB  5.76%  net/http.(*persistConn).readLoop
(pprof) top5 -cum
Showing nodes accounting for 0.51MB, 1.95% of 26.06MB total
Dropped 18 nodes (cum <= 0.13MB)
Showing top 5 nodes out of 119
      flat  flat%   sum%        cum   cum%
      -2MB  7.68%  7.68%     3.51MB 13.48%  net/http.(*Transport).dialConn
         0     0%  7.68%     3.51MB 13.48%  net/http.(*Transport).dialConnFor
    2.51MB  9.63%  1.95%     2.51MB  9.63%  bufio.NewWriterSize (inline)
         0     0%  1.95%    -2.03MB  7.80%  encoding/json.(*Decoder).Decode
         0     0%  1.95%    -2.03MB  7.80%  github.com/trinodb/trino-go-client/trino.(*driverStmt).exec.func2
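
For context, the block and mutex profiles only contain data when they are explicitly enabled; a setup roughly like the following exposes all of the profiles above (an assumption for illustration, not part of the code shown):

// profiling.go - one way to expose the profiles above via net/http/pprof;
// illustrative only, the actual collection setup may differ.
package main

import (
    _ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
    "runtime"
)

// enableProfiling should be called at the start of main.
func enableProfiling() {
    runtime.SetBlockProfileRate(1)     // required, or the block profile stays empty
    runtime.SetMutexProfileFraction(1) // required, or the mutex profile stays empty
}

The data can then be pulled with go tool pprof http://localhost:8080/debug/pprof/profile, .../block, .../mutex, and .../heap.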

@nineinchnick
Member

Thanks for the report, but can you provide a bit more detail so we can try to reproduce this? We need something to compare these profiler results with.

What queries are you running? How much data do they return? What's your Trino server configuration? When you say slower, what's the expected speed?

@nineinchnick
Member

Which versions of trino-go-client and Trino server are you using?

@voultest
Author

Thanks for the report, but can you provide a bit more detail so we can try to reproduce this? We need something to compare these profiler results with.

What queries are you running? How much data do they return? What's your Trino server configuration? When you say slower, what's the expected speed?

We used Apache JMeter. We ran select * from limit by 5500. If we run this alone we get the result within 500 ms.

What we did was run about 100 select statements with random values to query a table from JMeter. With the ramp-up factor set to 50 and the number of threads set to 100, the same select statement took about 30 seconds. The same test run against Node.js returned results in about 5 seconds.
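
For reference, a rough Go equivalent of that load pattern (the endpoint, request body, table name, and concurrency numbers here are illustrative assumptions, not the actual JMeter plan):

// loadtest.go - a rough Go equivalent of the JMeter run; the endpoint, request
// body, and numbers are illustrative assumptions, not the actual test plan.
package main

import (
    "bytes"
    "fmt"
    "io"
    "net/http"
    "sync"
    "time"
)

func main() {
    const concurrency = 50 // keep ~50 requests in flight
    const requests = 100   // ~100 select statements

    var wg sync.WaitGroup
    sem := make(chan struct{}, concurrency)

    start := time.Now()
    for i := 0; i < requests; i++ {
        wg.Add(1)
        sem <- struct{}{}
        go func(i int) {
            defer wg.Done()
            defer func() { <-sem }()

            body := bytes.NewBufferString(fmt.Sprintf(
                `{"query": "select * from <table> where id='%d' limit 5500"}`, i))
            resp, err := http.Post("http://localhost:8080/api", "application/json", body)
            if err != nil {
                fmt.Println(err)
                return
            }
            defer resp.Body.Close()
            io.Copy(io.Discard, resp.Body)
        }(i)
    }
    wg.Wait()
    fmt.Printf("%d requests in %s\n", requests, time.Since(start))
}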

@voultest
Author

Which versions of trino-go-client and Trino server are you using?

trino-go-client 0.312.0. Trino server is 420.

@nineinchnick
Member

select * from limit by 5500

That's not a correct query.

Which node.js driver are you using?

@voultest
Author

select * from limit by 5500

That's not a correct query.

Which node.js driver are you using?

Sorry, some of the queries are:
select * from details_table limit 5500; // table contains millions of rows
select * from procured_table where data_id='${d_id}' limit 4; // table contains millions of rows

Package: node-jdbc, with some modifications to use a closure when looping over each row.

@voultest
Author

Most, if not all, of the individual queries are very fast (sub-100 ms). Kindly provide a direction, if possible, for debugging where the latency could be coming from.

@nineinchnick
Member

nineinchnick commented Aug 30, 2023

I can recommend instrumenting your app with OpenTelemetry and collecting traces, both from your app and from the Trino server. That should show where the congestion happens - which span in a trace is unexpectedly long. Unfortunately, this driver is not yet instrumented, but I'll try to work on that in the near future.
To enable tracing in Trino, set these configuration properties:
To enable tracing in Trino, set these configuration properties:

tracing.enabled=true
tracing.exporter.endpoint=http://jaeger-or-other-otel-collector:4317
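
On the application side, a minimal sketch of what wiring OpenTelemetry into the server above could look like (the exporter endpoint and span name are placeholder assumptions, and dbConn/dbConnect/qHandler are the ones from the snippet in the issue):

// tracing.go - a sketch of instrumenting the server from the issue with
// OpenTelemetry; endpoint and span name are placeholders.
package main

import (
    "context"
    "log"
    "net/http"

    "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func initTracing(ctx context.Context) func() {
    // Export spans over OTLP/gRPC to the same collector Trino reports to.
    exporter, err := otlptracegrpc.New(ctx,
        otlptracegrpc.WithEndpoint("jaeger-or-other-otel-collector:4317"),
        otlptracegrpc.WithInsecure(),
    )
    if err != nil {
        log.Fatal(err)
    }
    tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exporter))
    otel.SetTracerProvider(tp)
    return func() { _ = tp.Shutdown(ctx) }
}

// main here replaces the main from the original snippet.
func main() {
    ctx := context.Background()
    shutdown := initTracing(ctx)
    defer shutdown()

    dbConn = dbConnect()
    defer dbConn.Close()

    // Wrap the handler so every request gets a server span with its timing.
    http.Handle("/api", otelhttp.NewHandler(http.HandlerFunc(qHandler), "api"))
    log.Fatal(http.ListenAndServe(":8080", nil))
}

With both the app and Trino exporting to the same collector, comparing the /api server spans with the Trino query spans should show where the delay accumulates.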

We have one simple benchmark here: https://github.com/trinodb/trino-go-client/blob/master/trino/trino_test.go#L1697
If you could replicate the issue using a benchmark like this, it would also help a lot.
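
For example, a parallel benchmark along these lines (the DSN, table name, and parallelism are placeholders; this is not the linked benchmark itself) could exercise the same concurrent-query pattern:

// bench_test.go - a sketch of a parallel benchmark in that spirit.
package bench

import (
    "database/sql"
    "testing"

    _ "github.com/trinodb/trino-go-client/trino"
)

func BenchmarkConcurrentQuery(b *testing.B) {
    db, err := sql.Open("trino", "https://<username>:<password>@<hostname>")
    if err != nil {
        b.Fatal(err)
    }
    defer db.Close()

    b.SetParallelism(50) // RunParallel then uses 50*GOMAXPROCS goroutines
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            rows, err := db.Query("select * from details_table limit 5500")
            if err != nil {
                b.Error(err)
                return
            }
            // Drain the result set so the whole response is actually fetched.
            for rows.Next() {
            }
            rows.Close()
        }
    })
}

Running it with something like go test -bench ConcurrentQuery -benchtime 100x would roughly mirror the ~100 requests from the JMeter test.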
