Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Postgres dataWriter.go by Implementing Batch Insert and Update Operations #1252

Open
tolgaOzen opened this issue May 9, 2024 · 4 comments · May be fixed by #1266
Open

Optimize Postgres dataWriter.go by Implementing Batch Insert and Update Operations #1252

tolgaOzen opened this issue May 9, 2024 · 4 comments · May be fixed by #1266
Assignees
Labels
area/database Issues related with database configurations 💎 Bounty feature request New feature or request iteration Improvement on an existing feature

Comments

@tolgaOzen
Copy link
Member

Problem Description

Currently, in the internal/storage/postgres/dataWriter.go file, the INSERT and UPDATE operations are performed individually for each tuple in the RelationTuplesTable. This method can be inefficient, particularly when dealing with large volumes of data, as it increases network latency and places a higher load on the database server due to multiple query executions.

Proposed Solution

Implement batch processing for INSERT and UPDATE operations using pgx batch capabilities. This will reduce the number of round-trips to the database and improve the overall performance of the data writing process.

Implementation Details

  1. Batch Inserts:

    • Utilize pgx.Batch to queue multiple INSERT statements and execute them in a single batch. This is particularly useful when inserting multiple tuples into RelationTuplesTable.
    • Example:
      batch := &pgx.Batch{}
      titer := tb.Write.CreateTupleIterator()
      for titer.HasNext() {
          t := titer.GetNext()
          batch.Queue("INSERT INTO relation_tuples (fields...) VALUES ($values...)",
                      t.GetEntity().GetType(), t.GetEntity().GetId(), ...)
      }
      br := w.database.SendBatch(ctx, batch)
      defer br.Close()
      _, err = br.Exec()
      if err != nil {
          return err
      }
  2. Batch Updates:

    • Similarly, accumulate multiple UPDATE conditions into a single pgx.Batch and execute them together to minimize database interactions.
    • Example:
      batch = &pgx.Batch{}
      for _, condition := range deleteClauses {
          query, args, _ := squirrel.Update("relation_tuples").Set(...).Where(...).ToSql()
          batch.Queue(query, args...)
      }
      br = w.database.SendBatch(ctx, batch)
      defer br.Close()
      _, err = br.Exec()
      if err != nil {
          return err
      }

Benefits

  • Reduced Network Latency: Minimizes the number of database round-trips required for executing multiple queries.
  • Enhanced Performance: Improves the throughput of database operations, especially under high load.
  • Resource Optimization: Decreases the CPU and memory footprint on both the application and database server.

Considerations

  • Ensure proper transaction management and error handling are in place to handle possible failures in batch processing.
  • Conduct thorough testing to validate the performance improvements and ensure data integrity.

This change is expected to significantly enhance the efficiency of data handling in the system, particularly when processing large datasets or under high transaction scenarios.

@tolgaOzen tolgaOzen added feature request New feature or request iteration Improvement on an existing feature area/database Issues related with database configurations labels May 9, 2024
@tolgaOzen tolgaOzen added this to the Beta Graduation 🎉 milestone May 10, 2024
@tolgaOzen
Copy link
Member Author

/bounty $200

Copy link

algora-pbc bot commented May 24, 2024

💎 $200 bounty • Permify.co

Steps to solve:

  1. Start working: Comment /attempt #1252 with your implementation plan
  2. Submit work: Create a pull request including /claim #1252 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Don't wait to be assigned. A reward will be given for the best PR.

Additional opportunities:

Thank you for contributing to Permify/permify!

Add a bountyShare on socials

Attempt Started (GMT+0) Solution
🟢 @Gmin2 #1266

@Gmin2
Copy link
Contributor

Gmin2 commented May 24, 2024

Hey @tolgaOzen i am working ont it, can i get assigned , Thanks !

Copy link

algora-pbc bot commented May 25, 2024

💡 @Gmin2 submitted a pull request that claims the bounty. You can visit your bounty board to reward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/database Issues related with database configurations 💎 Bounty feature request New feature or request iteration Improvement on an existing feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants