Grouped density charts have artifacts (again) #9078

Closed
jonmmease opened this issue Sep 4, 2023 · 6 comments · Fixed by #9106
Labels
Altair Issue that is blocking Altair Bug 🐛

Comments

@jonmmease
Contributor

This was originally reported in vega/vegafusion#381, but it appears to be a Vega-Lite issue. Given this grouped density Vega-Lite spec:

{
  "config": {"view": {"continuousWidth": 300, "continuousHeight": 300}},
  "data": {
    "url": "https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/movies.json"
  },
  "mark": {"type": "area"},
  "encoding": {
    "color": {"field": "Major_Genre", "type": "nominal"},
    "x": {"field": "IMDB_Rating", "type": "quantitative"},
    "y": {"field": "density", "type": "quantitative"}
  },
  "transform": [
    {"filter": {"field": "Major_Genre", "oneOf": ["Action", "Drama"]}},
    {
      "density": "IMDB_Rating",
      "groupby": ["Major_Genre"],
      "as": ["IMDB_Rating", "density"]
    }
  ],
  "$schema": "https://vega.github.io/schema/vega-lite/v5.14.1.json"
}

The resulting density curves have a lot of artifacts.

visualization (1)

A related case was reported in #8049, and the suggested fix was to add an explicit steps argument to the Vega kde transform that Vega-Lite produces. This change was made in #8088, but this doesn't seem to take care of the issue universally.

Rather than (or in addition to) steps, I think we need to specify "resolve": "shared" in the kde transform so that the generated x-values are in sync across groups. This chart is fixed when manually adding "resolve": "shared" to the generated Vega spec:

Open the Chart in the Vega Editor

visualization (2)
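
The manual patch described above can be sketched in Python, assuming the compiled Vega spec is available as a plain dict (for example, loaded with `json.load`). The helper name `share_kde_resolve` and the trimmed-down spec fragment are hypothetical, and the sketch assumes the kde transform sits in a top-level `data` entry; transforms nested elsewhere would need extra handling.

```python
import copy


def share_kde_resolve(vega_spec):
    """Return a copy of a Vega spec dict with "resolve": "shared" set on every kde transform.

    Assumption: kde transforms live in the top-level "data" array.
    """
    patched = copy.deepcopy(vega_spec)
    for dataset in patched.get("data", []):
        for transform in dataset.get("transform", []):
            if transform.get("type") == "kde":
                # Ask Vega to sample all groups over a shared domain and steps
                transform["resolve"] = "shared"
    return patched


# Hypothetical, trimmed-down fragment of a compiled Vega spec
spec = {
    "data": [
        {
            "name": "source_0",
            "transform": [
                {
                    "type": "kde",
                    "field": "IMDB_Rating",
                    "groupby": ["Major_Genre"],
                    "as": ["IMDB_Rating", "density"],
                }
            ],
        }
    ]
}

patched = share_kde_resolve(spec)
```

The function copies the spec rather than mutating it, so the original and patched versions can be rendered side by side for comparison.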

@joelostblom
Contributor

joelostblom commented Sep 4, 2023

It seems like this regression happened some time after VL 5.8 because it works fine in Altair 5.0.1, but not in 5.1.1 where VL 5.14 is used.

@joelostblom
Contributor

Wohoo, thank you @jonmmease !

@joelostblom
Contributor

@jonmmease @domoritz Unfortunately, it seems like this bug is back in the current version of Vega-Lite, so I'm reopening this issue:

image
Open the Chart in the Vega Editor

One concern complicates fixing this: when the axis resolve is "shared" as per @jonmmease's last fix, grouped density transforms incorrectly use the min/max of the entire dataset as their extent instead of the min/max of each group. This results in long lines where there are no observations at all, instead of the density stopping at the last data point in the group.

This chart was created in Altair 5.1.2, which uses VL 5.15.1, and shows the undesired behavior:

image

If you instead Open the Chart in the Vega Editor, which runs the most recent version of Vega-Lite, you will see the desired behavior where each density is cut at the min/max values of each group:

image

It would be ideal if there is a fix for the grouped density issues that still allows for grouped densities to be cut at the min and max value of each group.

Altair code
import altair as alt
from vega_datasets import data

source = data.iris.url

alt.Chart(source, height=100).transform_density(
    'petalWidth',
    groupby=['species']
).mark_area(stroke='black').encode(
    alt.X('value:Q'),
    alt.Y('density:Q').stack(False),
    alt.Facet('species:N', columns=1, title=None).header(labelFontWeight='bold', labelFontSize=12)
)
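
As a rough illustration of the desired behavior (not of how Vega-Lite implements it), the sketch below clips density samples computed on a shared grid to the min/max of one group's observations. The function name `clip_to_group_extent` and the sample values are made up for the example.

```python
def clip_to_group_extent(samples, observations):
    """Keep only (x, density) samples whose x lies within the group's observed range."""
    lo, hi = min(observations), max(observations)
    return [(x, d) for x, d in samples if lo <= x <= hi]


# Hypothetical density samples for one group, evaluated on a grid shared by all groups
shared_grid = [(0.0, 0.0), (0.5, 0.1), (1.0, 0.8), (1.5, 0.4), (2.0, 0.01), (2.5, 0.0)]
# Hypothetical raw observations for that group
group_obs = [0.5, 1.0, 1.2, 1.5]

clipped = clip_to_group_extent(shared_grid, group_obs)
```

Here the samples at 0.0, 2.0, and 2.5 fall outside the group's range and are dropped, so the curve stops at the last data point in the group.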

@joelostblom joelostblom reopened this Oct 19, 2023
@jonmmease
Contributor Author

@joelostblom are you saying that you're still seeing the jagged/sawtooth behavior in Vega-Lite 5.15.1+? Looks like the Vega editor is still on 5.15.0, so it's not surprising that the example you pasted above still has the issue.

Regarding the long tails, I'm not sure what the best approach is. Maybe we could filter out the zeros in some cases.

@joelostblom
Contributor

Oh sorry, I completely missed that the Vega editor is not on the latest Vega-Lite version. I can see now that it says 5.15.0 in the grey text in the bottom-right corner, so that explains why the densities are jagged in the editor.

So only the second part of my comment, regarding the long tails when grouping, is still relevant. Filtering zeros sounds like a potential solution, but it would have to filter only the zeros before the first and after the last non-zero value. We wouldn't want to filter zeros between peaks in multi-modal distributions. Do you think this is something that should be fixed at the Vega level, in the KDE transform? It seems like that might be appropriate, and I can open an issue there instead if that's the case.
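
A minimal sketch of that trimming rule, assuming each group's density is a list of (x, density) pairs sorted by x. The function name `trim_outer_zeros` and the `eps` threshold are illustrative assumptions; a threshold is included because a kernel estimate may produce very small values rather than exact zeros in the tails.

```python
def trim_outer_zeros(samples, eps=0.0):
    """Drop leading and trailing (x, density) samples whose density is <= eps.

    Samples between the first and last above-threshold value are kept,
    so zeros between peaks of a multi-modal distribution survive.
    """
    above = [i for i, (_, d) in enumerate(samples) if d > eps]
    if not above:
        return []
    return samples[above[0]: above[-1] + 1]


# Hypothetical bimodal density with a zero between the two peaks
samples = [(0, 0.0), (1, 0.0), (2, 0.3), (3, 0.0), (4, 0.5), (5, 0.0)]
trimmed = trim_outer_zeros(samples)
```

The interior zero at x = 3 is preserved, while the zeros at x = 0, 1, and 5 are removed.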

@kanitw
Member

kanitw commented Oct 28, 2023

From the latest editor, this seems fixed. Feel free to reopen if you can still reproduce.
