Result was on the console for huge records #29

Open
tharun06 opened this issue Oct 12, 2017 · 8 comments

Comments

@tharun06

For small tables it worked fine, but when I tried it on a table of around 50,000 rows, the output went to the console and the CSV file was empty.

@edasque
Owner

edasque commented Oct 12, 2017

I no longer have access to a DynamoDB instance. When I did, the table was millions of rows long, so I am surprised. If I get access to a large database again, I'll look into this.

@shubho-acc

I am facing the same issue. I am unable to fetch 65,000 records from the same table.

@edasque
Owner

edasque commented Feb 13, 2018

Are you though? Do you see the output on the console?

@shubho-acc

No, I don't see any error or log output. I had to wait a couple of minutes, but when I checked the output file it was blank.
I tested on an 8 GB server.

@ghost

ghost commented Mar 8, 2018

Why not migrate to using a DataFrame and export to .csv from there? (Most libraries have a "low memory" flag to avoid loading everything into memory.)

PS: I'm a Python guy, so here is my little contribution.

@tcchau
Contributor

tcchau commented Mar 8, 2018

@shubho-acc @tharun06 This is related to changes I made to handle tables where the "schema" is not fixed. Essentially, we have to read all the records to figure out what the headers should be. We can create a new mode for the case where you know the schema is fixed: the records can then be streamed to the output as we receive them, rather than being stored in memory, which is what is causing the problem. Reply to this thread if you're still interested in seeing a fix for this.
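
A minimal sketch of why a variable schema forces every record into memory. This is illustrative Python, not the project's actual code; `export_buffered` and the shape of `pages` (an iterable of lists of dicts, one list per scan page) are assumptions made for the example.

```python
import csv
import io


def export_buffered(pages, out):
    """Variable-schema export: the header row cannot be written until
    every record has been seen, so all records are held in memory."""
    records = []
    headers = []  # column names in first-seen order
    for page in pages:
        for item in page:
            records.append(item)  # memory grows with the table size
            for key in item:
                if key not in headers:
                    headers.append(key)
    # Only now is the full set of columns known.
    writer = csv.DictWriter(out, fieldnames=headers, restval="")
    writer.writeheader()
    writer.writerows(records)
    return len(records)


if __name__ == "__main__":
    # The second record introduces a column the first one lacks.
    pages = [[{"id": 1, "a": "x"}], [{"id": 2, "b": "y"}]]
    buf = io.StringIO()
    export_buffered(pages, buf)
    print(buf.getvalue())
```

The `records` list is the cost of supporting a schema that can change from row to row: nothing can be written until the last page has arrived.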

@ghost

ghost commented Mar 8, 2018

@tcchau Why not add a simple flag for a fixed schema that would just read the schema from the first record and generate the output based on that?

@tcchau
Contributor

tcchau commented Mar 8, 2018

@MarcoPorracin Yes, that's exactly what I mean. The new mode of operation would be triggered by the flag.
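
A rough sketch of what such a fixed-schema mode could look like, in illustrative Python rather than the project's own code. The function name `export_fixed_schema` is hypothetical, and `items` is assumed to be any iterable of dicts (for example, records yielded page by page from a scan).

```python
import csv
import io


def export_fixed_schema(items, out):
    """Fixed-schema export: take the header from the first record,
    then write each record as it arrives without retaining it."""
    iterator = iter(items)
    try:
        first = next(iterator)
    except StopIteration:
        return 0  # empty table: nothing to write
    # extrasaction="raise" makes a record with an unexpected column fail
    # loudly instead of silently dropping data.
    writer = csv.DictWriter(
        out, fieldnames=list(first), restval="", extrasaction="raise"
    )
    writer.writeheader()
    writer.writerow(first)
    count = 1
    for item in iterator:
        writer.writerow(item)
        count += 1
    return count


if __name__ == "__main__":
    rows = ({"id": i, "name": f"row{i}"} for i in range(3))
    buf = io.StringIO()
    export_fixed_schema(rows, buf)
    print(buf.getvalue())
```

Because each record is written and then discarded, memory use stays flat regardless of table size. The trade-off is that a record carrying a column absent from the first record raises a `ValueError`, which is a signal that the table needs the buffered, variable-schema mode instead.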
