
docs(notebooks): 📝 json and csv sink cookbooks are added #975

Open · wants to merge 2 commits into base: develop
Conversation

onuralpszr (Collaborator)

Description

New JSON and CSV sink cookbooks are added. The JSON one needs a version bump; I used the develop branch directly while preparing it.

JSON Colab URL: https://colab.research.google.com/drive/19aqX0QwKXxRk4sYtdEmghO2rzEAKMciM?usp=sharing
CSV Colab URL: https://colab.research.google.com/drive/1-qetEcyzrBCOCq-TfVVT8AgUG3rJUhT7?usp=sharing

Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
@onuralpszr onuralpszr added the doc:sv:notebooks Supervision Notebook Related Updates label Mar 5, 2024
@onuralpszr onuralpszr self-assigned this Mar 5, 2024
@onuralpszr (Collaborator Author)

@SkalskiP do you want to merge/review this?

@SkalskiP (Collaborator)

Hi @onuralpszr 👋🏻, thanks for the reminder.

  • It would be awesome to add some life to those cookbooks. Display images and/or videos. People see that you are downloading the VideoAssets.PEOPLE_WALKING video, but it would be cool to display it. Take a look here for reference.

  • You used the supervision==0.19.0rc5 version. Let's bump it to supervision==0.21.0rc5. We ironed out some tracker bugs in supervision==0.20.0.

  • That link https://inference.roboflow.com/reference_pages/model_aliases/ is no longer live.

  • I'd move this 'detections.csv' to a constant.

  • You no longer need that. You can use aliases directly.

    REGISTERED_ALIASES = {
        "yolov8n-640": "coco/3",
        "yolov8n-1280": "coco/9",
        "yolov8m-640": "coco/8",
        "yolov8x-1280": "coco/10",
    }
    
  • Call byte_track.reset() after byte_track = sv.ByteTrack(). Tracker IDs in your CSV start at 180.

  • Use minimum_consecutive_frames=3 in byte_track = sv.ByteTrack(). It will make results more stable.

@onuralpszr (Collaborator Author)

@SkalskiP thank you for the tips, let me update all of them.

@SkalskiP (Collaborator)

Pleasure 💜

@onuralpszr (Collaborator Author)

@SkalskiP I updated both Colabs at the links above. Could you check? After you confirm, I will commit everything.

@onuralpszr (Collaborator Author)

@SkalskiP friendly reminder ping :)

@LinasKo (Collaborator) commented May 22, 2024

Hi @onuralpszr, we spoke with Piotr and I'll be reviewing this one.

@LinasKo (Collaborator) commented May 22, 2024

Hi @onuralpszr,

Here's my review comments:

  1. I see you've added byte_track.reset(), but it should instead be called on the next line after sv.ByteTrack(minimum_consecutive_frames=3) construction. We believe ByteTrack relies on global state, so if someone reruns the notebook, simply calling the constructor may not reset it.

  2. I'm not sure our ipynb->docs converter can handle the progress bar element returned in the video download cell. As a precaution, I'd like to remove the output of that cell.

  3. There's an empty line in the JSON notebook:

   # <- empty line here - let's remove it
SOURCE_VIDEO_PATH = download_assets(VideoAssets.PEOPLE_WALKING)
CONFIDENCE_THRESHOLD = 0.5
IOU_THRESHOLD = 0.7
FILE_NAME = "detections.json"
INFERENCE_MODEL = "yolov8n-640"
  4. Generally, it's a great idea to have bonus content, but we don't think the UMAP example fits here. Instead, we're thinking of the following:

Sinks allow users to serialize data into formats suitable for sharing with others or storage in the filesystem. We should show the deserialization case too!
1) We've got some detections that we serialize to JSON / CSV.
2) It's great to see that we can read these in pandas.
3) Next, we could show how to form Detections from these. For example, we could pick frame 1, take the columns and pass them to the Detections constructor.
4) An ideal notebook would also show one annotated frame before starting the sink, and one annotated frame with Detections from reading the CSV/JSON, making it obvious that it's the same.
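The round trip described in the four steps above can be sketched with the standard library alone. The row field names (frame, x_min, …, tracker_id) are assumptions loosely modeled on the sink output, and the rebuilt lists are what you would wrap in numpy arrays and pass to the sv.Detections constructor:

```python
import json

# Hypothetical rows, shaped like the per-detection dicts a sink might write.
rows = [
    {"frame": 1, "x_min": 10.0, "y_min": 20.0, "x_max": 50.0, "y_max": 80.0,
     "class_id": 0, "confidence": 0.91, "tracker_id": 1},
    {"frame": 1, "x_min": 60.0, "y_min": 25.0, "x_max": 95.0, "y_max": 85.0,
     "class_id": 0, "confidence": 0.84, "tracker_id": 2},
    {"frame": 2, "x_min": 12.0, "y_min": 21.0, "x_max": 52.0, "y_max": 81.0,
     "class_id": 0, "confidence": 0.90, "tracker_id": 1},
]

# 1) Serialize the detections to JSON.
payload = json.dumps(rows)

# 2) Deserialize ("read it back in") and pick frame 1.
loaded = json.loads(payload)
frame_1 = [r for r in loaded if r["frame"] == 1]

# 3) Rebuild the columns a Detections constructor expects; wrapped in
#    np.array(...), these would become xyxy, class_id, confidence, tracker_id.
xyxy = [[r["x_min"], r["y_min"], r["x_max"], r["y_max"]] for r in frame_1]
class_id = [r["class_id"] for r in frame_1]
confidence = [r["confidence"] for r in frame_1]
tracker_id = [r["tracker_id"] for r in frame_1]

print(len(frame_1))  # 2 detections in frame 1
```

Step 4 (annotating one frame before the sink and one frame rebuilt from the file) would then use these arrays, making it obvious the data survived the round trip.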

  5. There's an empty block at the end of the CSV notebook - let's remove that.

  6. Something you're probably already aware of - the HTML file links to an older version of supervision - we should update data-version to 0.21.0.

  7. Since the previous review, the aliases page is back. If we could mention it briefly in one of the markdown blocks, it would be helpful.
    https://inference.roboflow.com/quickstart/aliases/

  8. We should briefly explain InferencePipeline and link to it. There are two pages. I suggest mentioning "InferencePipeline (reference)" in the explanation.

  9. Similarly, let's link to ByteTrack https://supervision.roboflow.com/latest/trackers/ as well.

  10. Did I miss any links? Whenever we use a major Roboflow class, we should link to its docs in https://supervision.roboflow.com/ and https://inference.roboflow.com/

Seems like a lot of changes - apologies for the hassle! However, we do wish to equip the devs with the exact tools they need to get their problems solved 😉

Let me know if you have any questions!

@SkalskiP (Collaborator)

@LinasKo, thanks for putting the list together! 🙏🏻

@LinasKo (Collaborator) commented May 28, 2024

Hey @onuralpszr,

How's it going? Got any updates for this one?

I know you're pretty busy so no rush 🙂

@onuralpszr (Collaborator Author)

> Hey @onuralpszr,
>
> How's it going? Got any updates for this one?
>
> I know you're pretty busy so no rush 🙂

Check the JSON one at the development URL. I will do the HTML update as the last commit; don't worry, I know about that problem ;) I will also remove the "loading bar"; I was well aware of that problem from the beginning ;)

I added the docs and changed most of it. I wrote a complete conversion from JSON to Detections.

Check only the JSON one please; I can do the docs changes and the rest for the CSV afterwards (it's 99% the same).

Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
@onuralpszr (Collaborator Author)

@LinasKo I updated both Colabs and made some changes. Please check when you have time, thank you.

@LinasKo (Collaborator) commented May 29, 2024

Will do, thank you @onuralpszr !

@LinasKo (Collaborator) commented May 29, 2024

Hi @onuralpszr,

Thank you very much for the adjustments! It looks much better now, and I think as a user I'd be able to find what I need. I still have a few requests:

  1. The block explaining the inference model and parameters could be made clearer. For example:

"""
Here are the parameters you will need to set the model, configure InferencePipeline, and store the results in a file:

  • SOURCE_VIDEO_PATH - the path to the input video
  • CONFIDENCE_THRESHOLD - do not include detections below this confidence level
  • IOU_THRESHOLD - do not include detections that overlap with others by more than this ratio
  • FILE_NAME - write the json output to this file
  • INFERENCE_MODEL - model ID for the model you wish to use. This can be a model alias, a fine-tuned model, or a model from the Universe.

"""

  2. The details on ByteTrack are scarce.

"""
ByteTrack tracks the detections, checking which are new, and computing trajectories of those seen before.
"""

  3. The with sv.CSVSink context manager is easy to use and promoted in https://supervision.roboflow.com/0.20.0/detection/tools/save_detections/#supervision.detection.tools.csv_sink.CSVSink - that's something we'd like to promote here too. The order could be:

    1. Define the callback
    2. Create the inference pipeline
    3. Run the pipeline commands inside with sv.CSVSink
  4. "It will also be created in your current directory with the name detections.csv as well." - repetition (a mistake I frequently make 😄).

  5. There should never be None inside Detections variables. A less concise but better way would be to check whether the values read from the JSON / CSV are all empty or all present, and only then set the variables; otherwise we could raise ValueError. Even if it is a small single-use function, we can nudge users towards a safer approach.
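The check-and-set idea from the point above could be sketched as a tiny helper (column_from_rows is a hypothetical name, and sv.Detections itself is not needed to show the logic):

```python
def column_from_rows(values):
    """Return None if every value is missing (''), the parsed column if none
    are missing, and raise ValueError for a partially filled column."""
    missing = [v == "" for v in values]
    if all(missing):
        return None  # e.g. class_id=None is a valid optional Detections field
    if any(missing):
        raise ValueError("column is only partially filled; refusing to guess")
    return [int(v) for v in values]

print(column_from_rows(["", "", ""]))     # None
print(column_from_rows(["1", "1", "2"]))  # [1, 1, 2]
```

The ValueError branch is the safety net: a half-filled column almost certainly means a corrupted file, so failing loudly beats silently fabricating values.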

  6. Here I'm less certain: I like that you made csv_to_detections work without extra dependencies like pandas. However, pandas is very popular for parsing CSV, and it can handle JSON with read_json. Is there a more elegant solution with pandas? Or do you think we shouldn't use it here?

  7. "A frame from video (Before Annotate)" -> "A frame from the video (before annotation)"
    "Annotate Image with Detections" -> "Annotate Frame with Detections"

I believe that's everything. Thanks again for the awesome work that you do!

@onuralpszr (Collaborator Author)

@LinasKo could you explain point 6 again? I think you meant the JSON one, for using nothing extra like pandas?

@LinasKo (Collaborator) commented May 29, 2024

Certainly.

I see now, CSV example does use pandas. In that case, I suggest the following edit:

xyxy = np.array([[row["x_min"], row["y_min"], row["x_max"], row["y_max"]] for row in data]) can be rewritten as data[['x_min', 'y_min', 'x_max', 'y_max']].to_numpy(). Other rows as well, albeit we need to check and set, for example, class_id=None if all values are '' (empty string).

For the JSON example, I am torn - I'm not sure if we should try using pandas.read_json, or keep it as it is, and only modify it to set variables such as class_id to None if all entries are empty.
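A minimal sketch of that edit, assuming pandas is available and the CSV uses the column names from the discussion (the small DataFrame below just stands in for pd.read_csv("detections.csv")):

```python
import pandas as pd

# Stand-in for pd.read_csv("detections.csv"); column names are assumptions.
df = pd.DataFrame({
    "x_min": [10.0, 60.0], "y_min": [20.0, 25.0],
    "x_max": [50.0, 95.0], "y_max": [80.0, 85.0],
    "class_id": ["", ""],        # entirely empty column -> should map to None
    "confidence": [0.91, 0.84],  # fully populated column -> keep the values
})

# The suggested one-liner for the bounding boxes:
xyxy = df[["x_min", "y_min", "x_max", "y_max"]].to_numpy()

# Check-and-set for optional columns: None when every entry is '' (empty string).
class_id = None if (df["class_id"] == "").all() else df["class_id"].to_numpy(int)
confidence = None if (df["confidence"] == "").all() else df["confidence"].to_numpy()

print(xyxy.shape)  # (2, 4)
print(class_id)    # None
```

The resulting arrays (or None values) are exactly what one would hand to the Detections constructor, keeping the no-partial-None rule from the earlier review point.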

@onuralpszr (Collaborator Author)

> Certainly.
>
> I see now, CSV example does use pandas. In that case, I suggest the following edit:
>
> xyxy = np.array([[row["x_min"], row["y_min"], row["x_max"], row["y_max"]] for row in data]) can be rewritten as data[['x_min', 'y_min', 'x_max', 'y_max']].to_numpy(). Other rows as well, albeit we need to check and set, for example, class_id=None if all values are '' (empty string).
>
> For the JSON example, I am torn - I'm not sure if we should try using pandas.read_json, or keep it as it is, and only modify it to set variables such as class_id to None if all entries are empty.

JSON doesn't use any pandas at all, just numpy, and numpy already comes with sv :) I am confused. Where is pandas? It is "np", not "pd" :)

@onuralpszr (Collaborator Author) commented May 29, 2024

> Certainly.
>
> I see now, CSV example does use pandas. In that case, I suggest the following edit:
>
> xyxy = np.array([[row["x_min"], row["y_min"], row["x_max"], row["y_max"]] for row in data]) can be rewritten as data[['x_min', 'y_min', 'x_max', 'y_max']].to_numpy(). Other rows as well, albeit we need to check and set, for example, class_id=None if all values are '' (empty string).

Let me clean up a bit more, okay? But I am close to deciding to leave this PR as it is, open a new one with an actual "parse JSON/CSV to sv.Detections" helper, and add those into the notebook :)

@LinasKo
Copy link
Collaborator

LinasKo commented May 29, 2024

Yeah, I know it feels like we ought to have it, right? 🙂

Let's not be too hasty though. Let's see whether it's needed first.
If people start asking about it, if we see a few discussion questions - then we can think about extending supervision.
