Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Notification handling #43

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

lpsmith
Copy link
Contributor

@lpsmith lpsmith commented Dec 26, 2015

This pull request is dependent upon #42, of course.

@begriffs
Copy link

Looking forward to this being polished up and merged. It's important for PostgREST/postgrest#475

@nikita-volkov
Copy link
Owner

I'm currently swamped. It'll require me at least another week to come back to this.

@begriffs
Copy link

No problem, don't mean to stress you out. Wanted to post the update here so @lpsmith also knows I appreciate the PR and that it will be helpful once merged.

@begriffs
Copy link

Do you have some time to review this PR?

@nikita-volkov
Copy link
Owner

First I need to fix the bug with multithreading, which you know of, then I plan to switch to this issue. Can't make promises on the terms though. A month seems as likely as a week :)

@nikita-volkov
Copy link
Owner

Okay, I've reviewed this. I must say I'm rather conflicted. Everything that gets done in this PR actually only needs "postgresql-libpq", and I've actually refactored it to be so. This then raises the question of whether it makes sense at all to put this in the "hasql" library.

I think the best way to do this is for me to expose the low-level "postgresql-libpq" connection, and for the functionality of this PR to be released as a general extension-library over "postgresql-libpq", which itself won't have any dependency on Hasql. Thus it'll become possible to use this functionality with Hasql as well as any other libpq-based Postgresql driver.

What do you guys say?

@lpsmith
Copy link
Contributor Author

lpsmith commented Mar 7, 2016

There is a dependency between getNotification and a higher-level binding: in hasql's and postgresql-simple's case, it is the MVar. Your update has drastically changed the concurrent semantics for the worse. You can no longer wait for notifications in one thread while also continuing to use that same connection in another thread. There really isn't a generic way of implementing getNotification properly without dealing with issues that postgresql-libpq purposefully doesn't deal with.

@nikita-volkov
Copy link
Owner

Your update has drastically changed the concurrent semantics for the worse.

Right. I see now that the connection needs to be released in-between the loop cycles. However,

There really isn't a generic way of implementing getNotification properly

There is. The "postgresql-libpq"-based library could implement this using a more general interface like this:

getNotification ::
  (forall a. (PQ.Connection -> IO a) -> IO a) ->
  IO (Either IOError Notification)

Where the first parameter is essentially the withConnection function that you had in your implementation. In substance this would be similar to, but much more general than

getNotification ::
  MVar PQ.Connection ->
  IO (Either IOError Notification)

Thus that library could then be used both inside "postgresql-simple" and to extend the functionality of such libraries as "hasql". This is separation of concerns and composition on library-scale in action. Of course, I imply that it would be best for you to do it, since it's your code after all.

@nikita-volkov
Copy link
Owner

Oh, and I forgot to say that I'm sorry for taking so long to react to this PR, Leon.

@lpsmith
Copy link
Contributor Author

lpsmith commented Mar 7, 2016

Oh, no apology needed regarding the time taken, I am certainly in no rush. (Though I don't speak for begriffs)

As for the "generic" getNotification, that's not an interface that (A) anybody except for library implementors would want to use, and would also imply that (B) HaSQL would have to expose some if its internals via an Internal module.

Now, while I fully support (B), in which case you could have a separate hasql-notifications package if you really want, I doubt that your proposed getNotification is really all that much more general, as it's making an awful lot of assumptions as to what its withConnection parameter does and does not do; for example, it wouldn't really work for withConnection to take a connection out of a pool and put it back every time.

This construct is also isn't great from a code generation perspective, as I doubt the object code for the "generalized" getNotification is really of much use in and of itself, so the end-user executable doesn't really want it, and that you'd always want to inline that code (and often also inline its first parameter) in the user-exported getNotification in order to cut down on indirect jumps and whatnot. Basically, it should be treated more like a macro than a higher-order function. And getting that sort of transformation to work robustly with GHC is not worth the effort in this case.

Also, I don't like this idea from the point of dependency hygiene on hasql's part or API hygiene on postgresql-libpq's part.

@nikita-volkov
Copy link
Owner

(B) HaSQL would have to expose some if its internals via an Internal module.

That's exactly what I propose to do on the Hasql side. E.g., I'd export a function like the following from the Hasql.Connection module:

withLibPQConnection :: Hasql.Connection -> (PQ.Connection -> IO a) -> IO a

Then that would not only provide a way to solve this particular issue, but also a general way to provide other "libpq"-based extensions, without any pull-requests or any other kind of my involvement, which I find an incredibly important point from the maintenance perspective.

that's not an interface that (A) anybody except for library implementors would want to use

Probably, but then we're talking about this in the context of a feature of which the same could be said just as well. I mean I've never used notifications in my life and I suspect the same could be said of most of the users. Actually, @begriffs is the only user requesting such hardcore features so far, and his case could be as well put in the "library implementors" group.

However, look at this:

getNotification (withLibPQConnection hasqlConnection)

Apart from error-handling, that would be all it'd take to bind the proposed Hasql API with getNotification of the following signature:

getNotification ::
  (forall a. (PQ.Connection -> IO a) -> IO a) ->
  IO (Either IOError Notification)

I doubt that your proposed getNotification is really all that much more general, as it's making an awful lot of assumptions as to what its withConnection parameter does and does not do

It is more general, it does however indeed come with assumptions, so I think you meant to say that it's not safe. It's not safe indeed, but then it's almost the same level of abstraction as "libpq" itself that we're talking about. It does however solve a problem, which is common to separate families of libraries, and it doesn't impose any assumptions on how those libraries should wrap "libpq".

This construct is also isn't great from a code generation perspective, as I doubt the object code for the "generalized" getNotification is really of much use in and of itself, so the end-user executable doesn't really want it, and that you'd always want to inline that code (and often also inline its first parameter) in the user-exported getNotification in order to cut down on indirect jumps and whatnot. Basically, it should be treated more like a macro than a higher-order function. And getting that sort of transformation to work robustly with GHC is not worth the effort in this case.

I don't get this concern. It's just a basic continuation. Even if it somehow will in fact impose any noticeable overhead it'll be trivially solvable with either the INLINE pragma or a combination of INLINABLE pragma and the inline function at the use site.

Also, I don't like this idea from the point of dependency hygiene on hasql's part or API hygiene on postgresql-libpq's part.

I don't propose to change anything in "postgresql-libpq". I propose to release an extension library for it.

Concerning the "hasql" side, following is what concerns me most of all.

You see, lately I came to conclusion that the benefits of separation of concerns are not only applicable to modules, but the whole packages as well. And here's a specific practical motivation of mine: I'd prefer not to take on this functionality for maintenance, since I don't use it and don't plan to, hence there's not much sense for me to be responsible for it. I would much rather open the necessary doors in the API for it to be implementable as an extension by someone else, with him becoming both the copyright owner and the maintainer of that particular extension.

At the same time I'll keep my self able and the API flexible enough to continue the research in the area of the usability of the features, which this library already has. Those features are intentionally few: the library is basically just a mapping API, which can also execute statements. That's it. I'd love to be able to have the rest done in separate libraries and I provide more motivation on this in the readme.

@begriffs
Copy link

So should we go ahead with withLibPQConnection and create a hasql-notifications package? @lpsmith would that function give you what you need to implement your notification functions?

@nikita-volkov
Copy link
Owner

I plan to expose that function either way.

@lpsmith
Copy link
Contributor Author

lpsmith commented Apr 30, 2016

Yes, that function should be sufficient for my purposes.

@lpsmith
Copy link
Contributor Author

lpsmith commented Apr 30, 2016

However, while I'm 100% ok with such a solution, the "generalized" getNotification that Nikita proposed still isn't very general: it's assuming that it gives you exclusive access to a libpq connection object. So this should be considered part of the semantics of withLibPQConnection. So #42 is critical for implementing getNotification as well, which was merged long ago.

@mrijkeboer
Copy link

Any progress on this? I'm interested to use polling with HaSQL.

cocreature added a commit to cocreature/hasql that referenced this pull request Jul 23, 2016
@cocreature
Copy link
Contributor

@lpsmith @begriffs Alright, we now have withLibPQConnection so there should be nothing in the way of a hasql-notification package. I’d be happy to help out or even convert your code into a package myself (giving appropriate credit ofc) but I don’t want to just “steal” it without your permission.

@lpsmith
Copy link
Contributor Author

lpsmith commented Jul 23, 2016

The code is BSD3 licensed in postgresql-simple, so you already have permission. If you want explicit permission otherwise, by all means, please. :)

@cocreature
Copy link
Contributor

FYI I have now created a hasql-notifications package. It’s not yet on hackage because there seems to be a race condition with threadWaitRead that does not occur using postgresql-simple but I hope I’ll be able to figure this out during the next days. (There is a testcase illustrating the race condition in case anybody else wants to look at it).

@lpsmith
Copy link
Contributor Author

lpsmith commented Jul 31, 2016

I would be extremely interested in looking at the test case. Cheers!

@lpsmith
Copy link
Contributor Author

lpsmith commented Jul 31, 2016

Well, I'm not surprised that there is a race condition; you removed the code to prevent several race conditions!

@lpsmith
Copy link
Contributor Author

lpsmith commented Jul 31, 2016

Ok, so the join $ withLock fooLock $ do {...} and the join $ atomically $ do {...} idioms are quite useful when dealing with concurrent programming; the idiom allows you to decide what to do next while a lock is held (or in an atomic transaction!) and then the action that is returned is executed outside the lock/transaction. (because, for example, the code returned may block, or the code executes some IO actions that can't be performed safely inside STM.)

It's a highly useful idiom that I "discovered" myself, but am by no means the first to discover it, and this idiom was even used in hackage packages before I figured it out. For example, SPJ has at least once discussed this idiom on haskell-cafe in 2006 and this idiom is also used in resource-pool.

So by modifying code that perhaps looked like it was written in a needlessly obtuse way, you are actually modifying the concurrent semantics of the code. I discussed a very similar type of race condition on my blog, and also added comments explaining the cases to postgresql-simple

The only substantial difference between my pull request is that the code returns IOErrors instead of throws them. I actually rewrote the notification handling code for hasql, then backported the results back to postgresql-simple.

So, what I really would do, is stick to my PR and add the comments I added, or stick with the postgresql-simple code and change it so that it returns IOErrors. This will fix your errors.

I'm still extremely interested in a test case that demonstrates one of the race conditions!

@cocreature
Copy link
Contributor

@lpsmith Thanks a lot for looking at it so quickly. I’m not a big fan of the join $ withLock idiom since it makes it hard to see what is done inside and outside of the lock, but it is definitely better in this case.

However that is not what is causing the race condition, in fact I rewrote the join stuff in the hope of fixing a race condition that I might not have seen before. I am now back to the old code and enabled the code causing the race condition. The problem is in the third test. If I send the notification using postgresql-simple’s exec function (a comment illustrating why it is used instead of LibPQ.exec might be nice :)) everything works fine. However if I use a simple combination of PQ.send and PQ.getResult or hasql to send that notification, I sometimes run into a deadlock where no changes to the file descriptor are noticed.

@begriffs
Copy link

Hey @nikita-volkov any interest in this PR?

Notifications would be the missing link that allows postgrest to reload its cache of the database structure automatically on changes (since we can create a DDL trigger that sends a NOTIFY to the server). The inability to auto refresh the cache has been a source of confusion for a long time.

If I understand correctly, the standalone hasql-notifications project ran into race condition problems that they could avoid if they were able to add something to this library to hook into the internals. @lpsmith told me he'd be willing to help move the feature forward in hasql if you're OK with it.

@lpsmith
Copy link
Contributor Author

lpsmith commented Jul 27, 2017

Well, to be clear, @cocreature pointed out a race condition that I didn't fully appreciate, namely issues where getNotification blocks, then a notification arrives without GHC's IO manager becoming aware of it. This frequently happens when a query causes a notification and the notification arrives in the same network packet with the result.

An acceptable fix is to add a threadWaitRead to a standard query. This ensures that the IO manager stays in the loop, and thus wake up getNotification on every query.

postgresql-simple was vulnerable to this issue on Linux before I moved to async libpq calls for other reasons and thus inadvertently fixed this issue as well.

So it turns out that, as I have been arguing all along, there isn't really a generic way to implement getNotification for all consumers of postgresql-libpq. It turns out that getNotification requires minor changes to the internal implementation details of hasql.

Also, it would probably be best to move withLibpqConnection into a module explicitly marked as Internal. A libpq connection object is a complicated, stateful, very un-Haskelly thing, and making use of it without consideration of the internal implementation details of everything else using it is asking for trouble.

@nikita-volkov
Copy link
Owner

The good news is that the coming native reimplementation based on the protocol will sport the Notification feature.

If anyone decides to bring this PR up to date and test it, I'll be willing to merge. However, beware that it's gonna go away in the next major release, because it'll be a complete reimplementation of the library with no Libpq dependency.

Also please notice that I am passionately against the Internal convention for plentiful reasons, so please don't move any modules there.

@begriffs
Copy link

begriffs commented Aug 6, 2017

Very much looking forward to the native implementation! 🎊 I know it'll be "done when it's done," but any rough estimate of when you might release it?

@nikita-volkov
Copy link
Owner

I plan to present the new version at Haskell Exchange this year. So, some time before October 12.

@nikita-volkov
Copy link
Owner

In case you're interested in the updates on the subject, my talk from HaskellX is already published.

In short, the work is ongoing. The testing "beta" releases will from now on be in the 0.x version space and the stable 0.19 has become 1. The latest beta release is 0.20, it natively implements the Postgres protocol, and it already supports Notification. However there still are issues (GC, in particular), which I'm working on, so it's not a major release yet.

@begriffs
Copy link

begriffs commented Mar 8, 2018

@nikita-volkov I see that 0.20.1 is now available. Does it fix the GC issues you were talking about?

@nikita-volkov
Copy link
Owner

@begriffs It's still in the works. Once it's production-ready it'll be released as v2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants