Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Preview date_axis #2402

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

WIP: Preview date_axis #2402

wants to merge 3 commits into from

Conversation

Bliss3d
Copy link
Contributor

@Bliss3d Bliss3d commented Oct 8, 2022

Hey, I'm looking for some feedback.
Does the code need polish - yes, a lot
Pandas shouldn't be used - yes
Are there some bugs - yes

I'm more interested if the functionality is considered to be good enough to be included in vispy if the issues are ironed out.

python_2022-10-08_14-44-12-63.mp4

The axis keeps scaling down until 1 mircosecond, other vispy componemts break down at that zoom first
vispy_microseconds

@djhoese
Copy link
Member

djhoese commented Oct 10, 2022

Thanks for putting together. This entire general idea is something that VisPy could really benefit from. It isn't clear to what exact functionality you are worried about being "good enough". Converting some X values to formatted dates in the tick labels is absolutely, from a high level point of view, something we want. Having the tick labels auto-update positions and values as you zoom in is less necessary but also very cool.

Some things that come to mind as I read your code:

  1. Is there any way we can model this after the formatting objects that matplotlib has?
  2. There is a lot of logic being added here and a lot of it, as you say, needs to be cleaned up. The logic is pretty hard to understand and the fact that it is a lot of if statements and lists of formats or dividing of the total seconds between two times makes me concerned that there should be a better approach. There is a very good chance that I'm missing something that drives the complexity of this through the roof, but off the top of my head I feel like this logic should be split into a few key functions/steps and put into a separate class or two:
    a. Compute date range for current view
    b. Determine time span/step for this date range and specified format. "We can have 5 ticks formatted with format F that fit in this number of pixels".
    c. Compute list of dates (datetime object or datetime64) that fill this range with the step size calculated in b above.
    d. Format dates
  3. You note that a numpy array of date-like objects should be made a list for better performance. Is there no way to do this type of logic with numpy datetime64 objects?
  4. There really isn't much point in having floating point precision beyond 4-6 decimal places.
  5. Why does the user need to provide a separate array of dates? Can't the Axis object use the domain it is being set to to determine what values should be shown? I mean, does the user need to pass a whole list or would some start/end limits work?

Bottom line is that at this point it is hard to tell if your approach is a good one or a bad one because I'm having trouble following the code. The general feature though 👍 and thank you for putting in the work to try to get this figured out.

@Bliss3d
Copy link
Contributor Author

Bliss3d commented Oct 12, 2022

Thank you for your feedback.

At the beginning I just wanted to map some string to the integers of the axis. for example group A, group B, control group or just a list of dates. Now I just need a start date and the delta between date@int(1) and date@int(2) and it interpolates between integers and goes on forever.
But I still think "string mapping" would be a good idea, some people probebly want to name there histogramm, bar or what ever.
Due to that there is some overlap which needs to be cleaned up, im not sure if string mapping in this example even works

  1. I dont quite understand, I found this: https://matplotlib.org/stable/devel/style_guide.html but I dont know if its what you mean

  2. I guess you mean the main logic @ line 1004 onwards:
    I not sure how you would check if the timedelta is in the range of ms, seconds, minutes, hour, days, month, years also I dont think there a sooo much if statments - is there a lot of redundant code yes and you function names are very sensible and will be implemnted

  3. I'll chack that, I think datetime64 also works, I just used pandas because it was more familiar with generating date ranges with it

  4. You would think that, but its cruical because any kind of rounding messes everything up - You need this level of precision and there are still rare instantes where I get 1.1.2023 23:59:59:999999, but thats only on some date deltas and every tenth date but rounding makes it much worse, I even thought about using numpy.longdouble (fp80) but I think its possible with fp64 and fp80 is its own kind of headache

  5. Yes I did the inputes quite a while ago. I took 4 attempts from scratch to get a good working logic which wasnt insanely convoluted - it took quite some time

  6. This relies heavyly on relativdelta from python-dateutil, it that a problem?

@djhoese
Copy link
Member

djhoese commented Oct 12, 2022

I like the idea of this being flexible enough for people to name their histogram bins and stuff. It'd be great if this is flexible for that to happen easily in the future.

  1. I was more thinking about the tick formatting logic that matplotlib has: https://matplotlib.org/3.1.1/gallery/ticks_and_spines/tick-formatters.html. Or this date specific example: https://matplotlib.org/stable/gallery/ticks/date_demo_convert.html

  2. Ok. I was mostly thinking a datetime format string would handle the splitting of things but yeah for intervals that makes sense.

  3. Ok.

  4. I was mostly thinking about 32-bit float decimal precision which has a maximum of 7 decimal places. 64-bit is more, but most GPUs are still on 32-bit precision for vertices. So you will probably be able to format the dates with 64-bit precision, but the vertices they represent and display next to won't be.

  5. Ok.

  6. I've never used it but it doesn't look like it has any dependencies so that's nice. I'd be OK adding it or making it optional later. When the rest of the functionality is figured out we can discuss the best way forward.

@Bliss3d
Copy link
Contributor Author

Bliss3d commented Oct 13, 2022

I like the idea of this being flexible enough for people to name their histogram bins and stuff. It'd be great if this is flexible for that to happen easily in the future.

Yes that would be the plan.

Regarding those Locators https://matplotlib.org/stable/gallery/ticks/date_demo_convert.html I feel like they are bit manual shouldnt the scaling be automatic?

@djhoese
Copy link
Member

djhoese commented Oct 13, 2022

It is difficult to compare because I think most of matplotlib's examples are going to assume a static plot where vispy will typically assume an animate/dynamic plot. That said, users will probably know their data and choose what they want most. If there data is hourly there is no need to automatically choose per-minute ticks. Of course, on the other side of that argument is that if you zoom in enough you won't see any ticks, right?

What does matplotlib do if you don't specify a locator at all?

@Bliss3d
Copy link
Contributor Author

Bliss3d commented Oct 13, 2022

Since i use vispy I dont use mpl at all, all I remember is that if I gave it a data with datetime it just plotted it without providing a locatior, but I dont know what happened if you zoom in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants