Multiple rows and columns introduce false index items #573
Comments
Another example: This one fails. Removing the header fixes the issue. Interestingly, the behavior changes depending on whether or not you are using a debugger.
data_sources:
  # Initial setup
  initial_tech_capacity_params:
    source: data_sources/initial_capacity_techs_kw.csv
    rows: [nodes, techs, parameters]
|
OK, so this is a limitation of what we can ask of pandas. A workaround:
data_sources:
  # Initial setup
  initial_tech_capacity_params:
    source: data_sources/initial_capacity_techs_kw.csv
    rows: [nodes, techs, parameters]
    columns: [values]
    drop: values
|
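To make the workaround above concrete, here is a minimal toy sketch of the idea: treat the header row as a named column dimension (here `values`), then drop that dimension after loading. This is not Calliope's or pandas' actual loading code. The `load` helper, the CSV contents, and the parameter names mirroring `rows` / `columns` / `drop` are all assumptions made for illustration, and the sketch only handles a single column dimension.

```python
import csv
import io

# Hypothetical file contents (the real CSV from the issue is not shown):
# three index columns plus one data column, with a header row.
CSV_TEXT = """nodes,techs,parameters,value
region1,pv,flow_cap,100
region2,wind,flow_cap,200
"""


def load(text, rows, columns, drop=None):
    """Toy loader: return (dimension names, {index tuple: float}).

    `rows` names the leading index columns, `columns` names the dimension
    formed by the header labels, and `drop` removes one dimension afterwards.
    """
    header, *body = list(csv.reader(io.StringIO(text)))
    n = len(rows)
    dims = [*rows, *columns]
    keep = [i for i, dim in enumerate(dims) if dim != drop]
    data = {}
    for record in body:
        # Each header label after the index columns is one item of the
        # column dimension.
        for label, cell in zip(header[n:], record[n:]):
            key = (*record[:n], label)
            data[tuple(key[i] for i in keep)] = float(cell)
    return [dims[i] for i in keep], data


dims, data = load(
    CSV_TEXT,
    rows=["nodes", "techs", "parameters"],
    columns=["values"],
    drop="values",
)
```

In this toy version, the header row is consumed as the single item of the `values` dimension, and dropping that dimension leaves only `nodes`, `techs`, and `parameters` as index levels.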
I can only reproduce this issue with your second example. The first one loads just fine. |
Odd, that's the one I saw as most problematic. I'll give an update if I can reproduce it... For the second: I would like to propose that this type of "dropping" should be the standard, to ensure the files given to the model are "stand alone". Otherwise, you'd need to always consult two files. This way, you have good data practices "baked in". What do you think? |
It's impossible to make it standard, as we have to know whether the top row is data or not. As soon as you say it isn't data (columns: [...]), it becomes an index / column, and you wouldn't want that data deleted automatically if you actually _wanted_ it to be part of your model. So the only way to handle it consistently is to force the user to be explicit, or to tell them that there must _always_ be a header and that if they miss it then their top row of data may be silently(?) lost...
(Sent by email in reply to Ivan Ruiz Manuel's comment of Wed, 21 Feb 2024, 19:35.)
|
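The ambiguity described in the comment above can be sketched with the standard library alone: a reader cannot tell from the file whether the top row is a header, so always assuming one silently discards a row of data when the file has none. The file contents and the `read_rows` helper are hypothetical and only illustrate the point. They are not taken from Calliope.

```python
import csv
import io

# Hypothetical data file with NO header row: every line is data.
NO_HEADER = "region1,pv,flow_cap,100\nregion2,wind,flow_cap,200\n"


def read_rows(text, first_row_is_header):
    """Split CSV text into (header, data rows) under a stated assumption."""
    rows = list(csv.reader(io.StringIO(text)))
    if first_row_is_header:
        return rows[0], rows[1:]
    return None, rows


# Assuming a header that is not there: the first data row is taken as the
# header and disappears from the data without any error being raised.
header, data = read_rows(NO_HEADER, first_row_is_header=True)

# Being explicit that there is no header keeps both rows.
no_header, all_data = read_rows(NO_HEADER, first_row_is_header=False)
```

Nothing in the file itself distinguishes the two readings, which is why the choice has to come either from the user's configuration or from a fixed convention.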
Hmmm, that is true... |
Plan: enforce a header to always exist in a CSV, even if it is just one row. We will set |
What happened?
Loading a file with multiple rows and multiple columns adds fake indexes sometimes.
See the file:
Loaded via:
In this case, a fake index called techs will be added.
Which operating systems have you used?
Version
v0.7
Relevant log output
No response