Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chronicle Queue Heap memory issue #1534

Open
abhinavece opened this issue Feb 16, 2024 · 3 comments
Open

Chronicle Queue Heap memory issue #1534

abhinavece opened this issue Feb 16, 2024 · 3 comments
Assignees

Comments

@abhinavece
Copy link

abhinavece commented Feb 16, 2024

As per the heap dump, it seems there are huge number of StoreTailer instances getting created and it's occupying 25% of memory. can you help me understand what I am doing wrong?

    3,14,071 instances of "net.openhft.chronicle.queue.impl.single.StoreTailer", \n loaded by "org.springframework.boot.loader.LaunchedURLClassLoader @ 0x5339526d0" occupy 37,01,94,288 (24.97%) bytes. These instances are referenced from one instance of "java.lang.Object[]", loaded by "<system class loader>"
    
    3,12,605 instances of "net.openhft.chronicle.queue.impl.single.SCQIndexing", loaded by "org.springframework.boot.loader.LaunchedURLClassLoader @ 0x5339526d0" occupy 26,75,84,536 (18.05%) bytes. These instances are referenced from one instance of "java.lang.Object[]", loaded by "<system class loader>"
    
    
    9,37,551 instances of "net.openhft.chronicle.bytes.internal.ChunkedMappedBytes", loaded by "org.springframework.boot.loader.LaunchedURLClassLoader @ 0x5339526d0" occupy 24,75,13,464 (16.70%) bytes. These instances are referenced from one instance of "java.lang.Object[]", loaded by "<system class loader>"

Here is the implementation:

// This runs every seconds
protected void runOneIteration() {
    // service will terminate if exception is not caught.
    try {
      sampler.updateTime();
      sampler.sampled(() -> log.info("Checking for messages to publish"));
      Batch batchToSend = new Batch(MAX_BATCH_BYTES, MAX_BATCH_COUNT);
      while (!batchToSend.isFull()) {
        long endIndex = queue.createTailer().toEnd().index();
        try (DocumentContext dc = readTailer.readingDocument()) {
          if (!dc.isPresent()) {
            sampler.sampled(() -> log.info("Reached end of queue"));
            long readIndex = readTailer.index();
            if (readIndex < endIndex) {
              readTailer.moveToIndex(endIndex);
              fileDeletionManager.setSentIndex(endIndex);
              log.warn(
                  "Observed readTailer not at end with no document context. Moved from {} to {}", readIndex, endIndex);
            }
            break;
          }
          try {
            verify(dc.wire() != null, "Null wire with document context present");
            byte[] bytes = requireNonNull(dc.wire()).read().bytes();
            if (bytes != null) {
              PublishMessage message = PublishMessage.parseFrom(bytes);
              batchToSend.add(message);
            } else {
              // could happen in case of an error during append with document context open.
              log.warn("Read NULL message. Skipping");
            }
          } catch (Exception e) {
            log.error("Exception while parsing message", e);
          }
        }
      }
      if (batchToSend.isFull()) {
        log.info("Batch is full");
      }
      if (!batchToSend.isEmpty()) {
        PublishRequest publishRequest = PublishRequest.newBuilder().addAllMessages(batchToSend.getMessages()).build();
        try {
          publishMessagesOverRest(publishRequest);
          fileDeletionManager.setSentIndex(readTailer.index());
          scheduler.recordSuccess();
          log.info("Published {} messages successfully over rest", batchToSend.size());
        } catch (IOException e) {
            log.error("Something wrong with publishing over rest", e);
            fileDeletionManager.setSentIndex(readTailer.index());
            scheduler.recordSuccess();
          } catch (Exception err) {
            log.warn("Exception during message publish", err);
            QueueUtils.moveToIndex(readTailer, fileDeletionManager.getSentIndex());
            scheduler.recordFailure();
          }
        }
      } else {
        fileDeletionManager.setSentIndex(readTailer.index());
        sampler.sampled(() -> log.info("Skipping message publish as batch is empty"));
      }
    } catch (Exception e) {
      log.error("Encountered exception", e);
    } finally {
      try {
        sampler.sampled(this::printStats);
        sampler.sampled(fileDeletionManager::deleteOlderFiles);
      } catch (Exception e) {
        log.error("Encountered exception in finally", e);
      }
    }
  }

@tgd
Copy link
Contributor

tgd commented Feb 16, 2024

@abhinavece - queue.createTailer() will create a fresh tailer instance each time and this is happening on every iteration of your code.

/**
* Creates and returns a new ExcerptTailer for this ChronicleQueue.
* <b>
* A Tailer is <em>NOT thread-safe</em>. A Tailer can be created by one Thread and might be used by at most one other Thread.</em>.
* Sharing a Tailer across threads is unsafe and will inevitably lead to errors and unspecified behaviour.
* </b>
* <p>
* The tailor is created at the start, so unless you are using named tailors,
* this method is the same as calling `net.openhft.chronicle.queue.ChronicleQueue#createTailer(java.lang.String).toStart()`
*
* @return a new ExcerptTailer to read sequentially.
* @see #createTailer(String)
*/
@NotNull
ExcerptTailer createTailer();

@tgd tgd self-assigned this Feb 16, 2024
@abhinavece
Copy link
Author

@tgd - Can this also be related to the other 2 types of Objects creation in my heap: net.openhft.chronicle.queue.impl.single.SCQIndexing and net.openhft.chronicle.bytes.internal.ChunkedMappedBytes ?

Also, with the same code, I see the older version of ChroncileQueue (5.19.2) is behaving correct and there seems to be no such memory issue. Has there been any recent change in the design?

@tgd
Copy link
Contributor

tgd commented Feb 16, 2024

Hi @abhinavece

Can this also be related to the other 2 types of Objects creation in my heap

Most likely yes.

Also, with the same code, I see the older version of ChroncileQueue (5.19.2) is behaving correct and there seems to be no such memory issue. Has there been any recent change in the design?

Thanks for flagging this - please can you share with us a runnable unit test that reproduces this.

If you need expedited commercial support we offer that - please get in touch here https://chronicle.software/contact-us/. We do review these tickets but with a lower priority than commercial support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants