NPOI leaks memory big time #262

Open
mercenaryntx opened this issue Jun 21, 2023 · 4 comments

Comments

@mercenaryntx

Given a pretty large Excel file (160 MB, with millions and millions of rows), you can't even open it, because it floods the heap with hundreds of millions of objects.

I made a little test and tried to get past instantiation, but I always ran out of memory:
[image: screenshot not preserved]

@mganss
Owner

mganss commented Jun 30, 2023

Can you open it using NPOI alone somehow?

@tonyqus

tonyqus commented Sep 28, 2023

A 160 MB Excel file is too huge. If you unzip it (rename the .xlsx to .zip first), you will see how large the OpenXML files inside are. Why don't you use CSV instead for this kind of data export?
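
For anyone who wants to run that unzip check without renaming the file, here is a minimal Python sketch. It relies only on the fact that an .xlsx file is a ZIP archive; the helper name `part_sizes` is made up for this example, and the in-memory archive in the demo is a stand-in for a real workbook.

```python
import io
import zipfile


def part_sizes(xlsx):
    """Return (part name, compressed bytes, uncompressed bytes), largest part first.

    `xlsx` may be a file path or a binary file-like object.
    """
    # An .xlsx file is a ZIP archive of XML parts, so zipfile reads it directly.
    with zipfile.ZipFile(xlsx) as archive:
        sizes = [(info.filename, info.compress_size, info.file_size)
                 for info in archive.infolist()]
    # Sort by uncompressed size so the heaviest XML part comes first.
    return sorted(sizes, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    # Build a tiny stand-in archive in memory; a real workbook path works the same way.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("xl/worksheets/sheet1.xml", "<row/>" * 100_000)
        z.writestr("xl/styles.xml", "<styleSheet/>")
    for name, packed, unpacked in part_sizes(buf):
        print(f"{name}: {packed} -> {unpacked} bytes")
```

Only the archive's directory is read here, so the check itself stays cheap even for a very large workbook.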

@mercenaryntx
Author

> A 160 MB Excel file is too huge. If you unzip it (rename the .xlsx to .zip first), you will see how large the OpenXML files inside are. Why don't you use CSV instead for this kind of data export?

It's not my call, obviously. The Excel file is given as an external resource, and the number one rule of working with files is not to load them into memory. That's why we have streams in .NET in the first place. I switched to ExcelDataReader and all my problems disappeared. No offense, but building this lib on NPOI was clearly a mistake.

@tonyqus

tonyqus commented Sep 29, 2023

@mercenaryntx Good to know that ExcelDataReader solves your issue. The NPOI community is working hard on performance improvements.

But there is a tradeoff between supporting more Excel features and performance. NPOI could certainly read the file faster if it ignored some of Excel's properties. Usually, reading a huge file like this is optimized by reading only part of the OpenXML data (skipping cell style info and other advanced properties).
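
To make the partial-read idea concrete, here is a minimal Python sketch. It is not NPOI's or ExcelDataReader's implementation, just an illustration of streaming one worksheet part and ignoring styles. The function name `iter_rows` and the default part name `xl/worksheets/sheet1.xml` are assumptions (the real part name comes from the workbook's relationships), and values are returned raw, so shared-string cells come back as indexes rather than text.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace, as used by worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def iter_rows(xlsx, sheet_part="xl/worksheets/sheet1.xml"):
    """Yield each row as a list of raw <v> strings, one row at a time.

    Styles, formulas and shared-string lookups are deliberately ignored.
    """
    with zipfile.ZipFile(xlsx) as archive:
        # archive.open() decompresses lazily, so the XML is never fully in memory.
        with archive.open(sheet_part) as stream:
            sheet_data = None
            for event, elem in ET.iterparse(stream, events=("start", "end")):
                if event == "start" and elem.tag == NS + "sheetData":
                    sheet_data = elem
                elif event == "end" and elem.tag == NS + "row":
                    yield [c.findtext(NS + "v", default="")
                           for c in elem.iter(NS + "c")]
                    # Drop finished rows from the tree so memory stays flat.
                    if sheet_data is not None:
                        sheet_data.clear()


if __name__ == "__main__":
    # Tiny stand-in workbook with only a worksheet part.
    sheet = (
        '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
        '<sheetData><row r="1"><c r="A1" s="3"><v>1</v></c>'
        '<c r="B1"><v>2.5</v></c></row></sheetData></worksheet>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("xl/worksheets/sheet1.xml", sheet)
    for row in iter_rows(buf):
        print(row)
```

Because rows are yielded one at a time and discarded afterwards, peak memory depends on the widest row rather than on the row count, which is the tradeoff described above: less Excel fidelity in exchange for a much smaller footprint.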

Anyway, it's good to hear that you solved your issue. Good luck with ExcelDataReader.
