Decide on best-of-breed future approach to .ly video generation #67

Open
3 tasks
aspiers opened this issue Sep 4, 2017 · 8 comments

Comments

@aspiers
Owner

aspiers commented Sep 4, 2017

In addition to ly2video, there are at least two alternative approaches to generating video from LilyPond files:

All three have their pros and cons. Ideally the community should get together and figure out a way forward which provides a single best-of-breed tool, since the community is too small to be able to afford to needlessly fragment and duplicate effort. If that means ditching ly2video altogether, so be it. Suggested steps:

  • Document high- and mid-level summaries of the approaches made by all three. For example, "ly2video essentially generates an extremely wide and low single-system png of the score, and then crops that at different places to generate video frames which are then synchronised with playback" would be the high-level summary, but then the mid-level summary would go into more detail about how it figures out the exact placement and playback timing of each grob.
  • Compile a matrix of the advantages of each.
  • Discuss as a community and figure out the best way forward.
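
As a rough illustration of the high-level ly2video summary above (crop a very wide single-system image at a different place for each frame), here is a minimal Python sketch. It is not ly2video's actual code: the function name, the linear interpolation between notes and the centred playback position are all assumptions made for illustration.

```python
from bisect import bisect_right


def crop_left_edges(note_times, note_xs, fps, frame_width):
    """Return the left edge (pixels) of the crop window for each video frame.

    note_times: ascending playback times in seconds, one per note column
    note_xs:    matching x positions (pixels) in the wide single-system image
    Hypothetical helper: assumes the current position stays centred in the frame.
    """
    if len(note_times) != len(note_xs) or not note_times:
        raise ValueError("need matching, non-empty time and position lists")
    n_frames = int(note_times[-1] * fps) + 1
    edges = []
    for frame in range(n_frames):
        t = frame / fps
        # Index of the last note that starts at or before time t.
        i = max(bisect_right(note_times, t) - 1, 0)
        if i + 1 < len(note_times):
            # Interpolate linearly between this note and the next one.
            span = note_times[i + 1] - note_times[i]
            frac = max(0.0, (t - note_times[i]) / span) if span else 0.0
            x = note_xs[i] + frac * (note_xs[i + 1] - note_xs[i])
        else:
            x = note_xs[i]
        # Centre the position in the frame, but never crop left of pixel 0.
        edges.append(max(0, round(x - frame_width / 2)))
    return edges
```

Each returned value would then be used as the left edge of a `frame_width`-wide crop of the score image for that frame.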
@uliska

uliska commented Sep 4, 2017

This is a very useful suggestion. I have never been involved in LilyPond video generation but I would love to see an easy-to-use, powerful (and better-known) solution available. So I would suggest (and to some extent volunteer) to also add a Frescobaldi interface for whatever tool results from this.

I suggest you (@aspiers) start with creating a stub of one or two pages on your repo's Wiki, with places for the first two items in the original issue description.

@aspiers
Owner Author

aspiers commented Sep 4, 2017

I see Knut replied to my request for him to publish his source code as a git repo. Unfortunately for some reason the reply didn't reach my inbox, so I can't easily reply to it.

@knupero

knupero commented Sep 4, 2017

So here I am. mail.adamspiers.org uses spamcop.net and that site decides that it is wise to block all mail from the mailout servers of the biggest German ISP, t-online.de.

High-level summary:

I don't believe in the "generate one system automatically" approach because it's impossible to find a good solution if the number of voices / instruments / staffs change within a video.

Therefore my system relies on the user to prepare a pdf with lilypond. That pdf is split into individual pages using pdftk. Every page is translated to a raster image using ghostscript, every raster image is translated to an x264 video using ffmpeg, all temporary x264 videos are combined together and audio generated from lilypond midi files is added.
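
The steps above can be sketched as a list of external commands. This is only an illustration under assumptions, not the published script: the file names, the 150 dpi default, the `png16m` device and the AAC audio codec are guesses, and the audio is assumed to have already been rendered from the MIDI file to something ffmpeg can read.

```python
def build_pipeline(pdf, page_durations, audio, output, dpi=150):
    """Return (commands, concat_list_text) for split, rasterise, encode, join.

    Hypothetical helper: every file name below is an illustrative choice.
    Assumes the rasterised pages have even pixel dimensions (needed for yuv420p).
    """
    # 1. Split the PDF into one file per page (pdftk numbers pages from 1).
    commands = [["pdftk", pdf, "burst", "output", "page_%04d.pdf"]]
    clips = []
    for n, seconds in enumerate(page_durations, start=1):
        page = f"page_{n:04d}.pdf"
        png = f"page_{n:04d}.png"
        clip = f"page_{n:04d}.mp4"
        # 2. Rasterise the page with Ghostscript.
        commands.append(["gs", "-dBATCH", "-dNOPAUSE", "-sDEVICE=png16m",
                         f"-r{dpi}", f"-sOutputFile={png}", page])
        # 3. Turn the still image into a short H.264 clip of the right length.
        commands.append(["ffmpeg", "-y", "-loop", "1", "-i", png,
                         "-t", f"{seconds:.3f}", "-c:v", "libx264",
                         "-pix_fmt", "yuv420p", clip])
        clips.append(clip)
    # 4. Join the clips with ffmpeg's concat demuxer and add the audio track.
    concat_list = "".join(f"file '{c}'\n" for c in clips)
    commands.append(["ffmpeg", "-y", "-f", "concat", "-safe", "0",
                     "-i", "clips.txt", "-i", audio,
                     "-c:v", "copy", "-c:a", "aac", output])
    return commands, concat_list
```

The returned `concat_list` text would be written to `clips.txt` before the last command runs; the commands themselves could be passed to `subprocess.run` one by one.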

It is possible (and enabled by default) to instruct a patched lilypond to generate one pdf page for every moment with a note on / note off event and to highlight noteheads, stems, flags, rests and multimeasure rests for the time they are active.
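
To make the "one page per moment" idea concrete: if every note on / note off moment produces one page, each page has to stay on screen until the next moment. A minimal Python sketch, assuming the moments have already been converted to seconds (the function name and the `end_time` parameter are hypothetical):

```python
def page_durations(event_times, end_time):
    """Map note-on/note-off moments (seconds) to the display time of each page.

    One page per distinct moment; a page is shown until the next moment,
    and the last page is shown until end_time.
    """
    moments = sorted(set(event_times))
    if not moments:
        raise ValueError("no events")
    if end_time <= moments[-1]:
        raise ValueError("end_time must lie after the last event")
    bounds = moments + [end_time]
    return [later - earlier for earlier, later in zip(bounds, bounds[1:])]
```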

Inside the ly source prepared by the user one book needs to be marked as source for video generation, one or more books need to be marked as source for audio. One video will be generated for every audio source. That way it is easy and fast to generate e.g. one video for every voice of a choir that emphasizes the individual voice and one video without such an emphasis. Have a look at the Hugo Wolf videos to be found at https://www.youtube.com/channel/UCZigkAdCNu_y9upXrbIyQUQ/videos?shelf_id=0&view=0&sort=dd

The published version is tuned for speed - generating the five Hugo Wolf videos mentioned above takes about a minute on my system (i4790K). That's possible because most steps use all available cores of a modern cpu.
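
How the published script spreads work across cores is not described here, so the following is only one possible approach, sketched in Python: since the per-page steps are independent external processes, a thread pool sized to the core count is enough to keep all cores busy. The function names are hypothetical.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def _run_checked(command):
    # Raises CalledProcessError if the external tool exits with an error.
    return subprocess.run(command, check=True)


def run_in_parallel(commands, runner=_run_checked, workers=None):
    """Run independent commands concurrently, one worker per CPU core by default.

    Threads suffice because the heavy lifting happens in the child processes.
    Results are returned in the same order as the input commands.
    """
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(runner, commands))
```

Only steps that do not depend on each other (for example rasterising different pages) should be batched together this way.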

Known bugs of the published version: pages to be duplicated (that means all but the title page) must not include eps files.

Limitations: No cursor, no scrolling. I don't like the speed changes of a moving cursor line. But it would be relatively easy to add a moving cursor line to my system. A standing cursor with a moving score would be impossible as it is bound to the one-system approach. A cursor bar at the bottom marking the current bar is also possible.

Within the close limits of my available time I am working on a much revised version. A lot of code has already moved from the bash script to Scheme, and the eps limitation has been eliminated.

Some thoughts

The long-term objective should be to make video generation a part of lilypond. Because of this we should not use languages not already used by lilypond. That means neither bash nor perl nor python is a real option.

If we need to change lilypond, we should limit ourselves to modifying only its scm, ly and ps files. As long as we follow that principle, every user can use an existing installation as a base without the need to recompile anything.

We should only use external tools that are available for linux, windows and apple systems.

Are you sure that all this should be discussed here? I think lilypond-devel would be a good place.

Knut

@uliska

uliska commented Sep 4, 2017

> The long-term objective should be to make video generation a part of lilypond.

Of course it would be extremely cool if it could be an actual backend:

\score {
  \myScoreExpression
  \layout {}
  \midi {}
  \video {
    % ... configuration options like container format, frame rate, encoding ...
  }
}

> If we need to change lilypond, we should limit ourselves to modifying only its scm, ly and ps files. ...

Of course that would make things simpler for the user. But I could also imagine "really" doing it inside LilyPond, i.e. not limiting ourselves that way but optionally modifying the C++ part itself. At least if it could get into LilyPond proper. The biggest problem with that approach is related to

> We should only use external tools that are available for linux, windows and apple systems.

This is true. The thing is: if we provide a Scheme-only solution (which could for example conveniently be wrapped in an openLilyLib package) it's OK to specify dependencies like ffmpeg, ImageMagick etc. which users have to provide themselves. But I'm not sure it would be acceptable to provide a "built-in" solution like a backend with such external dependencies. And bundling all dependencies may also not be a viable solution.

> Are you sure that all this should be discussed here? I think lilypond-devel would be a good place.

I think assembling the lists of high-level summaries and a comparison matrix would be good to have here. For discussing actual implementation or roadmaps a more widely used channel like lilypond-devel might indeed be a better platform.

@knupero

knupero commented Sep 5, 2017

A real video consists of several parts: title credits, the main score video, end credits. Maybe you also want to have some metronome ticks and some kind of intonation before the main score video starts. Because of that a \video{...} block would be useful only for pretty trivial music.

To define a source for a title page video I propose the following syntax

\book {
    \bookOutputSuffix "OurUniqueNameB"
    \markupVideo
    \markup { ... }
}

To define a source for a score video I propose the following syntax

\book{
    \bookOutputSuffix "OurUniqueNameA"
    \scoreVideo
    \score {
        \someMusic
        \layout {}
    }
}

To define an audio source I propose the following syntax:

\book{
    \bookOutputSuffix "OurUniqueNameC"
    \audioTrack
    \score {
        \someMusic
        \midi {}
    }
}

Every implicit or explicit \book inside a ly source without \markupVideo, \scoreVideo or \audioTrack shall be processed as usual.

\markupVideo, \scoreVideo, \audioTrack inside of an implicit or explicit \book must be used at the top of that block, prior to all other commands that produce output, but after \bookOutputSuffix. If one of those keywords is used, lilypond shall construct a list of commands necessary to produce a video or audio file from the output of processing the book.

It must be possible to have multiple books with \markupVideo, \scoreVideo and \audioTrack definitions inside a ly file.

If one of those new commands is used, the output of the book block may be different from normal operation, but there must be some output that can be checked with standard tools. If, for example, \scoreVideo is used where the book would otherwise produce a pdf with one page, the result may be a pdf with a huge number of pages with differently colored noteheads or added cursor lines.

If the user believes that the output of the marked book blocks is ok he defines how the audio and video track of his final video shall be constructed, e.g.

#(define audioTrackList '("OurUniqueNameX" ...))
#(define videoTrackList '("OurUniqueNameY" ...))

(probably we want a nicer syntax) and adds

\makeVideo

\makeVideo shall combine and execute all the command lists recorded above and probably some additional commands to produce the final video.

Obviously we also need some additional commands, e.g. to define video resolution, to enable or to disable a cursor line, to enable or disable coloring of grobs etc. This could be done in scheme, e.g.

#(videoResolution 1280 720 )
#(videoPreset "veryslow")
#(audioBitRate "128k")

but probably we want a nicer syntax.
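
For what it's worth, the three example settings above map fairly directly onto ffmpeg output options. A hedged Python sketch of that mapping (the function name is hypothetical, and choosing libx264 and AAC as codecs is an assumption, not something decided in this thread):

```python
# Preset names accepted by the x264 encoder.
X264_PRESETS = {"ultrafast", "superfast", "veryfast", "faster", "fast",
                "medium", "slow", "slower", "veryslow", "placebo"}


def encoder_args(width, height, preset, audio_bitrate):
    """Translate resolution, preset and audio bit rate into ffmpeg options."""
    if preset not in X264_PRESETS:
        raise ValueError(f"unknown x264 preset: {preset}")
    return ["-s", f"{width}x{height}",       # output frame size
            "-c:v", "libx264", "-preset", preset,
            "-c:a", "aac", "-b:a", audio_bitrate]
```

With the values from the example, `encoder_args(1280, 720, "veryslow", "128k")` yields the options to place before the output file name on an ffmpeg command line.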

Knut

@uliska

uliska commented Sep 5, 2017

> To define a source for a title page video I propose the following syntax

In such a book \pageBreaks could be used to define "slides". There would have to be a way to specify the duration of the slide and probably a transition time (I don't think we need to support transitions beyond a simple crossfade).

Of course the main question is: is it possible to implement such far-reaching functionality on the Scheme-addon level or would we have to go to modifying LilyPond itself?

@knupero

knupero commented Sep 5, 2017

> In such a book \pageBreaks could be used to define "slides". There would have to be a way to specify the duration of the slide and probably a transition time (I don't think we need to support transitions beyond a simple crossfade).

A simple scheme list with the desired times would be enough for a start.
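
To illustrate what such a list of times would have to produce, here is a small Python sketch that turns slide durations and an optional crossfade time into start and end times. The function name and the rule that a crossfade overlaps the end of one slide with the start of the next are assumptions for illustration only.

```python
def slide_schedule(durations, crossfade=0.0):
    """Return (start, end) times in seconds for each slide.

    Every slide after the first starts `crossfade` seconds before the
    previous one ends, so that the two can be blended during the overlap.
    """
    schedule = []
    start = 0.0
    for seconds in durations:
        if seconds <= crossfade:
            raise ValueError("each slide must be longer than the crossfade")
        schedule.append((start, start + seconds))
        start += seconds - crossfade
    return schedule
```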

> Of course the main question is: is it possible to implement such far-reaching functionality on the Scheme-addon level or would we have to go to modifying LilyPond itself?

Changes to the compiled lilypond files are definitely not necessary. But it's impossible without changes to some of lilypond's scm and ly (and possibly ps) files. A simple file to be included cannot work.

We need to intercept some events, define an after-line-breaking function and a page-post-process function:

\layout {
  \context { \Score
    \consists #(make-engraver (listeners (time-signature-event . format-time)))
    \consists #(make-engraver (listeners (tempo-change-event   . format-tempo)))
    \override NoteHead         #'after-line-breaking = #mkvideo-dump
    \override Rest             #'after-line-breaking = #mkvideo-dump
    \override MultiMeasureRest #'after-line-breaking = #mkvideo-dump
  }
}

\paper {
  #(define (page-post-process layout pages) (after-pb-processing layout pages))
}

With the information gathered above an extended ps dump-page function can be used to duplicate and change pages as necessary. My current code changes the color of noteheads etc at this place based on information gathered in the events / callbacks sketched above, and it would also be possible to add a colored cursor line here with a bit of postscript code.
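
The bookkeeping behind this can be sketched outside of LilyPond. The Python below is only an illustration of the idea, not the Scheme/PostScript code described above: it assumes each callback has recorded a hypothetical `(grob_id, start, end)` triple, and works out which grobs should be coloured on the page generated for each moment.

```python
def highlights_per_page(notes):
    """Return [(moment, [active grob ids])], one entry per page.

    notes: iterable of (grob_id, start, end) triples, in any common time unit.
    A grob counts as active from its start up to, but not including, its end.
    """
    notes = list(notes)
    # Every distinct start or end moment gets its own page.
    moments = sorted({t for _, start, end in notes for t in (start, end)})
    pages = []
    for moment in moments:
        active = sorted(g for g, start, end in notes if start <= moment < end)
        pages.append((moment, active))
    return pages
```

The page for the final moment has nothing highlighted, since every note has ended by then.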

@aspiers
Owner Author

aspiers commented Feb 11, 2018

Hi all, thanks a lot for all the amazing suggestions here, and really sorry for the incredibly slow response! I've had a huge email backlog for months and am only beginning to chew through it now.

@knupero commented on 4 Sep 2017, 13:41 BST:

> So here I am. mail.adamspiers.org uses spamcop.net and that site decides that it is wise to block all mail from the mailout servers of the biggest German ISP, t-online.de.

Ugh sorry, that's super annoying :-/ I thought spamcop.net was a reputable and reliable block list but maybe I was mistaken :-( Any chance you could give me some IPs of the outbound t-online servers in question so I can check whether this is still an issue?

Anyway, regarding the real topic here of a future best-of-breed video generator ...

> I don't believe in the "generate one system automatically" approach because it's impossible to find a good solution if the number of voices / instruments / staffs change within a video.

Please could you elaborate on this? I don't understand why it would be impossible to handle a changing number of voices / instruments / staffs; in fact I can already imagine what it would look like and a basic concept for the implementation (e.g. gradually zooming in and out depending on the height of the system, or simply going with a zoom level which fits the highest system in the music and leaves vertical space for the other parts of the music).

But maybe I'm missing something. One advantage of rendering to a single system is that LilyPond's natural tendency to create line breaks doesn't get in the way of rendering something which can continuously scroll horizontally. OTOH I appreciate that not everyone would want the final video to only have a single system.
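
The second option mentioned above (one fixed zoom level that fits the highest system) comes down to a small calculation. A minimal Python sketch, assuming system heights are known in pixels at zoom 1.0; the function name and the `margin` parameter are hypothetical:

```python
def fixed_zoom(system_heights, frame_height, margin=0):
    """Return (zoom, spare) for a single zoom level that fits every system.

    zoom:  scale factor at which the tallest system fills the usable height
    spare: vertical space (pixels) left over for each system at that zoom
    """
    usable = frame_height - 2 * margin
    if usable <= 0 or not system_heights:
        raise ValueError("need a positive usable height and at least one system")
    zoom = usable / max(system_heights)
    spare = [usable - height * zoom for height in system_heights]
    return zoom, spare
```

The gradual zoom variant would instead compute a separate zoom per system and interpolate between neighbouring values over time.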

> Therefore my system relies on the user to prepare a pdf with lilypond. That pdf is split into individual pages using pdftk. Every page is translated to a raster image using ghostscript, every raster image is translated to an x264 video using ffmpeg, all temporary x264 videos are combined together and audio generated from lilypond midi files is added.

This is very similar to how ly2video does it currently, the most significant differences being that ly2video has LilyPond render directly to PNG and uses those images as the basis for generating each frame of the video, and of course that it renders to a single system.

Out of curiosity, why did you choose to render to PDF rather than PNG, given that the end result is a video?

> It is possible (and enabled by default) to instruct a patched lilypond to generate one pdf page for every moment with a note on / note off event and to highlight noteheads, stems, flags, rests and multimeasure rests for the time they are active.
>
> Inside the ly source prepared by the user one book needs to be marked as source for video generation, one or more books need to be marked as source for audio. One video will be generated for every audio source. That way it is easy and fast to generate e.g. one video for every voice of a choir that emphasizes the individual voice and one video without such an emphasis. Have a look at the Hugo Wolf videos to be found at https://www.youtube.com/channel/UCZigkAdCNu_y9upXrbIyQUQ/videos?shelf_id=0&view=0&sort=dd

They look great! I really like the approach of highlighting in red, perhaps even better than all three modes offered by ly2video. However an ideal tool would offer all four rendering modes, and maybe others too.

> Limitations: No cursor, no scrolling. I don't like the speed changes of a moving cursor line.

Yeah I agree it's aesthetically slightly unappealing. But different users will have different preferences.

> But it would be relatively easy to add a moving cursor line to my system. A standing cursor with a moving score would be impossible as it is bound to the one-system approach.

Yep, good point.

> A cursor bar at the bottom marking the current bar is also possible.
>
> Within the close limits of my available time I am working on a much revised version. A lot of code has already moved from the bash script to Scheme, and the eps limitation has been eliminated.

Sounds cool, looking forward to seeing that!

> Some thoughts
>
> The long-term objective should be to make video generation a part of lilypond. Because of this we should not use languages not already used by lilypond. That means neither bash nor perl nor python is a real option.
>
> If we need to change lilypond, we should limit ourselves to modifying only its scm, ly and ps files. As long as we follow that principle, every user can use an existing installation as a base without the need to recompile anything.
>
> We should only use external tools that are available for linux, windows and apple systems.

I agree with the sentiment behind this, but I have some reservations about how realistically achievable it would be in practice in a cross-platform manner. I see that Urs already made the good point in his reply about the impact of a large number of dependencies on the core LilyPond distribution. I'd be happy to be proved wrong though :-)

> Are you sure that all this should be discussed here? I think lilypond-devel would be a good place.

Makes sense; in fact I see that there have already been a few discussions on lilypond-user in my absence relating to GSoC.

Due to lack of time I won't comment on the rest of the comments made above, which seem to mostly relate to brainstorming what a new built-in video generation feature would look like. And yes, that should probably be continued on lilypond-devel anyway.

I should also offer the disappointing caveat that unfortunately I'm unlikely to be able to contribute much in this area other than basic maintenance of ly2video (and even there I'm not doing a good job). This might change if I have a real-world need to do more video generation again myself, which is not out of the question but currently I can't foresee it happening much this year at least.

Thanks again for this excellent discussion and all your great contributions to the world of LilyPond!
