Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Demo database server with REST API #2182

Draft
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

AlexanderMihail
Copy link

@AlexanderMihail AlexanderMihail commented Apr 13, 2024

This PR adds db_demo.cc to the build structures of Seastar.

The demo is for a MongoDB-style of a database where tables are pairs of (key,data) where data can be anything text, possibly json. The root of the HTTP server explains its use and features. Most notably:

  1. Table create/drop/use
  2. Per table insert/update/get
  3. Statistics and self tests.

Persistence is a pair of files db_<table_name>keys.txt and db<table_name>_values.txt where the keys file holds off_t markers into the values file. For performance reasons, keys are of fixed size so that they can be directly located with seek. New keys are appended to the keys and values files, updates are reusing the value space if the new value fits in the space of the old value and appended otherwise, and deleted entries are marked as such in the index file. This mechanism creates internal fragmentation within the keys and values files. Deleted keys are progressively reclaimed during searches, while reclaiming the full index and the space of deleted or resized values is left to a manual operation called "compact".

Temporary demo server at: http://demo.proximacentauri.org:10000/.

Please use the demo site responsibly, and remember to click on the ?use hyperlink before attempting any table-oriented operations. Table files are loaded only when the current table is in use.

Alexander Mihail added 11 commits April 9, 2024 13:27
…e, query key-values. Data is currently held in a map.
… text files for Index and Data. Keys go to Index, whereas Values go to the Data file.
…, /?op=purge, /?op=compact. Repaired the overall logic of the server and disk persistence. More REST commands to: /quit, /config?trace_level=1111, /config?max_cache=1. Enhanced the demos/rest.cmd with a more complex workflow to demonstrate fragmentation and defragmentation of keys and values files.
… responses. Test batch rest.cmd exits after 100 iterations.
… table as well as any other user-table. The root of the web server is a generated HTML page with navigation for all the system features. There is now a map of tables. Selecting the current table is with the /tablename/?use rest command. Tables can also be dropped. Table::compact was reworked to shift-up deleted keys in the index. An index_revolver coroutine was introduced to return one index per invocation. This logic also has index cleanup capabilities. The Table::list() command uses this coroutine to produce the list of all keys. Table::get/set are now boolean. The set prevents overinserting and updating of unknown keys. The get returns false if the key is unknown. This allows now for inserting or updating "null" values. Better statistics. Better formatted output. All REST logic is in one DefaultHandle where the table list lock is maneaged. Everything the DB server does is listed in the / page which is built by DefaultHandle::build_help. Added the new Table::self_test method and the corresponding /?selftest=0 REST command. This system of self-testing obsoletes the rest.bat file while also aiming at raw performance.

This version has problems with deleting keys that are in cache. TODO.
This version closes issues: #1, #2, #3, #4, #5, #6, #8. It advances on issues #13 and #11.
@nyh
Copy link
Contributor

nyh commented Apr 14, 2024

To be honest, I don't think that the Seastar repository should contain large "useful" projects, such as servers of different types, etc. A project which is likely to continue getting developed later, contains multiple files, etc., can - and I think should - be stored in a separate github project.

Perhaps it makes more sense to hold your project, and similar ones, in separate repositories, and then add somewhere (e.g., doc/examples.md) a document listing the various known projects that use Seastar?

Alexander Mihail added 2 commits April 15, 2024 01:22
…IO code should be reduced to this. Much better trace messages. Self tests. Corrections so that the self tests pass.
…s issue #16 regarding the encapsulation of IO operations. This version has a self-test as per issue #11.
…s issue #16 regarding the encapsulation of IO operations. This version has a self-test as per issue #11.
@AlexanderMihail
Copy link
Author

To be honest, I don't think that the Seastar repository should contain large "useful" projects, such as servers of different types, etc. A project which is likely to continue getting developed later, contains multiple files, etc., can - and I think should - be stored in a separate github project.

Perhaps it makes more sense to hold your project, and similar ones, in separate repositories, and then add somewhere (e.g., doc/examples.md) a document listing the various known projects that use Seastar?

Hi, Nadav. It's just the one file of 1200 lines at demos/db_demo.cc. It doesn't make much sense as yet another project for just one file and one class. Regards.

… to the main page before that. This is to avoid quit-restarting in a loop. Perparing to async-ify. A global and configurable mode_async. Added a rest provision to read the console log, if available. Reporting the sub-log of a Failed test when running in a loop. Git-ignoring Visual Studio temporary files and user-files of cmake. Working on issue #17.
…e back to the main page before that. This is to avoid quit-restarting in a loop. Perparing to async-ify. A global and configurable mode_async.

Added the static Table::co_self_test coroutine to wrap around run Table::self_test. The HTTP DefaultHandler calls either the self_test or the co_self_test depending on mode_async. In async mode, the self-test, which is a huge time penalty, is coroutined in the hope that it would split-up into individual, smaller, batches for the SHARD executor. This is to increase parallelism of the HTTP server when multiple users are concurrently active. Table::self_test is recursive in loop mode, and potentially endless. Merged the Prometheus sats server into the Database Demo server so only one port is exposed from the demo container. Added a rest provision to read the console log, if available. Reporting the sub-log of a Failed test when running in a loop. Git-ignoring Visual Studio temporary files and user-files of cmake. Working on issue #17.
@nyh
Copy link
Contributor

nyh commented Apr 17, 2024

Hi, Nadav. It's just the one file of 1200 lines at demos/db_demo.cc. It doesn't make much sense as yet another project for just one file and one class. Regards.

I saw 17 commits and many different files. I realize now that you have numerous commits adding things and then removing them. If your PR is not ready for review at all, please use the "Convert to draft" button (this is the appropriate way to mark a PR not ready for review - not the string "[WIP]" in the title which I missed), and only move the PR out of draft mode when it's ready to be reviewed.

Anyway, the question of whether it makes sense to have this demo inside Seastar doesn't only depend on whether it's a single file or not. Why do you want to include it in Seastar? Is it meant to demonstrate something which other Seastar tests/examples/demos doesn't? Also note if you're planning to use this demo or to continue develop it after the first commit, it will probably complicate your life (and ours) if you include it inside Seastar.

@AlexanderMihail AlexanderMihail marked this pull request as draft April 17, 2024 18:00
@AlexanderMihail
Copy link
Author

AlexanderMihail commented Apr 17, 2024

You're right, of course, Nadav. I've complied with your comments on WIP and Draft.

The reason of my PR with WIP was that without those, and without the ability to tag one specific person in the Assignee field, I couldn't have submitted my 5-cent for any review.

About what it is, what does it actually demonstrate, and why should it be included... A few points here:

  1. It is essentially.... a contribution. It's usefulness in the ecosystem is debatable if not doubtful, indeed.
  2. It does not really demonstrate anything new, it is a demonstration of aggregating Seastar features into something directly usable without intervention. It should demonstrate a sum of HTTPD, REST, DB, FileIO, and SHARD-async, all put to work.
  3. Perhaps it is less suited for your demos space, and shold be in the apps, like your existing HTTPD app which I used as a base.
  4. About demos, in general... I believe that there is a tradeoff somewhere. When something becomes small enough, usefull enough, designed well enough, and with just the right amount of complexity, it becomes a candidate for a demo of the product. Also, when it encapsulates logic that everybody needs/wants sooner or later.

About "projects using seastar"... This thing of mine is hardly "a project"... Not big enough to be a project, not small enough to be a demo, it seems.

It is actually not WIP anymore, simply because without a review I cannot do anything meaningful to it, so I have to stop updating it.

The most serious problem with it, though, is probably that it is not Seastar-compliant. I only heard of Seastar last week. I am sure that what I've done there is not at all Seastar-like. This is why I need review, help and advise. So, would you be so kind as to stay with me for a code review?

I thank you anyway for taking the interest in my PR.

Regards,
Alexander

@AlexanderMihail AlexanderMihail changed the title [WIP] Demo database server with REST API Demo database server with REST API Apr 17, 2024
@nyh
Copy link
Contributor

nyh commented Apr 18, 2024

The reason of my PR with WIP was that without those, and without the ability to tag one specific person in the Assignee field, I couldn't have submitted my 5-cent for any review.

Why would you want to submit something which isn't ready, for review?

The most serious problem with it, though, is probably that it is not Seastar-compliant. I only heard of Seastar last week. I am sure that what I've done there is not at all Seastar-like.

If you only heard of Seastar last week, maybe this isn't yet the best time to contribute code to the project - and you should begin by experimenting on your own and asking questions, and only submit a patch once you believe it's ready?

This is why I need review, help and advise. So, would you be so kind as to stay with me for a code review?

If you have specific questions, please ask them on the Seastar mailing list (seastar-dev in groups.google.com). If you think you find bugs in Seastar or holes in its API, you can open issues about those. But I can't do a code review to a PR that has 17 patches, 1200 lines of code, and you yourself admit isn't yet ready. Moreover, as I noted earlier, if this isn't a patch for Seastar but rather a full application on top of Seastar, I'm not sure I even want to review the entire application and I am not sure it belongs in the Seastar repository in the first place.

Alexander Mihail added 2 commits April 18, 2024 17:55
…he index revolver, in favor of co_await table.co_index_locked() which is used by future Table::list and future Table::rownext.
…configurable sync/async table mode, as now there's just the async mode. Simplified get to return bool, internally use _get that returns iterator, which is also used by set. And it seems that I'm nolonger getting SIG35, so perhaps this closes #17.
@AlexanderMihail
Copy link
Author

Alright. Had another go at it to sort-out two outstanding issues with this DB server. So annoying having to do things with an entirely new framework without any help. But such was my whole life. I've coroutined my DB operations and, surprisingly, the self-tests are much faster now. I also feel like it's more Seastar-like now. Well, maybe.

17 patches? Why does that matter? It's just one file for review.

Anyway, I thank you for the comments, even though they feel more like "no" across the board. Talking to an opponent is preferable to no human contact at all. Regards.

@nyh
Copy link
Contributor

nyh commented Apr 21, 2024

So annoying having to do things with an entirely new framework without any help.

I specifically suggested to you a way to get help - namely, a mailing list. There are other places as well (e.g., stackoverflow is one of my favorites). What I told you would not work is you sending 1000 lines of code for review as a way to ask questions or get help. While developers will be happy to review your 1000-line code if you say it's ready and if they think it might be useful - nobody will review your 1000 lines of code just to "help" you learn Seastar.

17 patches? Why does that matter? It's just one file for review.

I immediately stopped reviewing when I noticed the first patch adds several files, that a later patch removes. I can't review code seriously when I don't know if a later patch will remove it or not. If you want to submit pull requests (not just in this case - anywhere) you need to clean up your pull request

Anyway, I thank you for the comments, even though they feel more like "no" across the board. Talking to an opponent is preferable to no human contact at all. Regards.

You should understand that even if your code doesn't go in immediately, it doesn't mean you must delete it immediately :-) It's not a "yes" vs "no" thing. You can continue to develop your code in your own repository, and when you feel the code is ready (and you don't rewrite it every week to use a new style), and when you can make a convincing argument why it will be more useful in the main Seastar repository and not in a separate one, you can submit a PR. Until then you can ask questions on the mailing list, ask for advice on short code snippets (not 1000 lines of code), and even point newcomers who have questions to your code as an example.

@AlexanderMihail
Copy link
Author

You’ve picked-up a [WIP] and prosecuted me ever since, rendering this PR unreadable in the meanwhile. A “no can do”, or “you don’t have the necessary credentials” would have sufficed for me. As for the convincing argument that I didn’t make, you may have gotten this backwards: this PR is actually me making a case for Seastar. I know you are an important man in this ecosystem and I am really trying to thank you for taking the time here with me, but, as I’ve explained outright, this PR was for some specific people that I couldn’t tag in the Assignee field. I’ll have to keep it for a few more days until the whole matter fizzles-out, as there were email references to it, then I’ll close this and bother nobody no more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants