Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create more labels #505

Closed
cenkalti opened this issue May 17, 2023 · 12 comments
Closed

Create more labels #505

cenkalti opened this issue May 17, 2023 · 12 comments

Comments

@cenkalti
Copy link
Member

@ahrtr Can you create following labels to help tracking issues? (I have permission to tag issues but not create new labels)

  • corruption
  • defrag
  • documentation

Can you also rename "type/performance" as "performance"?

@ahrtr
Copy link
Member

ahrtr commented May 17, 2023

delegate to @jmhbnz to manage this.

We can follow the same pattern as etcd.

@jmhbnz
Copy link
Member

jmhbnz commented May 17, 2023

Thanks for the suggestion @ahrtr, agree there is some crossover here with etcd-io/etcd#15773.

I haven't looked across other subproject labels just yet as part of my initial updates to etcd triage issues but the key thing I would push for bbolt to align on is the use of area/x convention for these labels. Refer etcd list here: https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/triage_issues.md#step-4---define-the-areas-impacted. Or kubernetes list here: https://github.com/kubernetes/kubernetes/labels?q=area

In the interest of not slowing down @cenkalti I would propose the following new labels are created here in bbolt:

  • area/documentation - matches existing etcd area/documentation label.
  • area/defrag - new label specifically for bbolt
  • area/corruption - new label specifically for bbolt

We can come back at a later date to review the exact labels, I'm happy provided they follow the area/x standard.

In relation to type/performance my suggestion would be to retain the type/x standard to align with etcd: https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/triage_issues.md#step-3---apply-the-appropriate-type-label

@ahrtr
Copy link
Member

ahrtr commented May 17, 2023

Thanks @jmhbnz .

So let add:

  • type/bug (renamed from bug )

  • type/flake

  • type/feature (renamed from enhancement)

  • type/question (renamed from question)

  • type/support

  • area/performance (renamed from type/performance)

  • area/community

  • area/bbolt-core

  • area/corruption

  • area/bbolt-cli (renamed from bbolt-cli)

  • area/documentation

  • area/testing

  • area/security

defrag is an etcd concept. bbolt only knows compact.

@ahrtr
Copy link
Member

ahrtr commented May 17, 2023

Just updated and added some labels. Generally speaking, there are three kinds labels:

  • type/xxx: what's the type of the issue? e.g. a bug a feature request?
  • area/xxx: what's the area? e.g. documentation, bbolt-cli
  • stage/xxx: what's the current stage? triaged or in progress of investigation?

@ahrtr
Copy link
Member

ahrtr commented May 17, 2023

Done.

@ahrtr ahrtr closed this as completed May 17, 2023
@cenkalti
Copy link
Member Author

  • I don't like adding prefix to labels. Aesthetically, they don't look good. Kubernetes has 201 labels, etcd has 44 labels. bbolt is much more smaller project and don't need that level of moderation. That's just my personal opinion though.
  • Defragmentation (or compaction) is an issue that causes the database to grow larger than the actual data size. It's one of my primary area that I want to improve. Can we have a separate label for it? Or, does it go under "area/bbolt-core"?
  • I see that "dependencies" is also renamed to "area/dependencies". This label is created automatically by dependabot. If we want to have it with "area/" prefix, we must also add it to .dependabot.yml file. (docs)

@cenkalti cenkalti reopened this May 18, 2023
@jmhbnz
Copy link
Member

jmhbnz commented May 18, 2023

I don't like adding prefix to labels. Aesthetically, they don't look good. Kubernetes has 201 labels, etcd has 44 labels. bbolt is much more smaller project and don't need that level of moderation. That's just my personal opinion though.

As you say, personal view. I prefer the grouped approach aesthetically, however more importantly I would suggest that the grouping of labels into area/x, type/x priority/x etc helps contributors to understand how the labels work together and helps contributors perform issue triage in a consistent way across all etcd projects. I realise that it is early days and we still have a lot of work to do on attracting new contributors and standardising our triage but we have made a start on this and having clear guidelines that members can follow along with a consistent experience for them across our projects will help ease cognitive burden.

Defragmentation (or compaction) is an issue that causes the database to grow larger than the actual data size. It's one of my primary area that I want to improve. Can we have a separate label for it? Or, does it go under "area/bbolt-core"?

Definitely one for @ahrtr, I have no objection either way.

I see that "dependencies" is also renamed to "area/dependencies". This label is created automatically by dependabot. If we want to have it with "area/" prefix, we must also add it to .dependabot.yml file. (docs)

That's great spotting, thanks!

@ahrtr suggest we update etcd and subproject files to add area/dependencies as the label instead of standard dependencies, ref:https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#labels

Let me know what you think, I'm happy to get the pr's raised later today.

@ahrtr
Copy link
Member

ahrtr commented May 18, 2023

  • Defragmentation (or compaction) is an issue that causes the database to grow larger than the actual data size. It's one of my primary area that I want to improve. Can we have a separate label for it? Or, does it go under "area/bbolt-core"?

I would suggest to go under area/bbolt-core. bbolt isn't a big project, I don't expect too much labels on it. But data corruption is a serious problem, and I vaguely remember the first time I saw it is about 2 years ago. We are still struggling to reproduce it. So adding a area/corruption label for it.

  • I see that "dependencies" is also renamed to "area/dependencies". This label is created automatically by dependabot. If we want to have it with "area/" prefix, we must also add it to .dependabot.yml file. (docs)

Good catch. It isn't a big problem. I have no strong opinion here, and either label works for me. I reverted the label to dependencies for now. If we need to change it, then we'd better to change for all projects & sub-projects, including etcd, bbolt and raft. @jmhbnz Please feel free to raise an issue or draft PR to discuss it separately.

@cenkalti
Copy link
Member Author

Corruption is a serious problem but all cases may not be valid and do not need to get addressed by bbolt project. If we cannot reproduce the reported issue, we should close it.

There are many sources of corruption that we cannot control:

  • Faulty hardware (disk, memory, CPU)
  • Unsafe filesystems, misconfiguration
  • Other faulty/malicious processes on the same host
  • Cosmic rays

If it can be done, I see #492 as great improvement for detecting corruptions. Most databases put checksum to each page.

@cenkalti
Copy link
Member Author

Closing this as the task about labels is done.

@ahrtr
Copy link
Member

ahrtr commented May 19, 2023

all cases may not be valid

How and where is your conclusion coming from?

Both #402 and etcd-io/etcd#15498 indicate that there may be potential memory stomp bugs. Can't reproduce is a big concern, but it doesn't mean we can close the issues directly!

@cenkalti
Copy link
Member Author

My statement was not a conclusion. Of course there may be valid cases. However, we cannot be sure. Although we control the source code we don't have control over other factors (hardware, configuration, environment, universe).

bbolt is a mature and widely used project. If there was any serious bug, there would be more reports with common symptoms.

Any reported corruption may be a result of a single bitflip. We can never be sure.

That's why I want to prioritize #492 . First, we need to make sure physical integrity of the file is not broken. Otherwise there is a high possibility of wasting time trying to fix non-existent bugs in the code.

I have motivation and time to work on this. Just want to confirm we're on the same path before moving on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants