Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validate_asset_path should not "leak" exception in case of user's regex misspecification #1913

Open
yarikoptic opened this issue Mar 30, 2024 · 8 comments

Comments

@yarikoptic
Copy link
Member

Look at SEntry report from today (says id ee7345f8563743bfb9f288c1fbea145e) where @TheChymera (congrats! Achievement unlocked ;) ) managed to make SEntry to react and log traceback

  File "dandiapi/api/views/asset.py", line 316, in create                                                                                                                           
    serializer.is_valid(raise_exception=True)                                                                                                                                       
  File "dandiapi/api/views/asset.py", line 220, in validate                                                                                                                         
    validate_asset_path(data['metadata']['path'])                                                                                                                                   
  File "dandiapi/api/models/asset.py", line 46, in validate_asset_path                                                                                                              
    if not re.match(ASSET_PATH_REGEX, path):                                                                                                                                        
  File "__init__.py", line 166, in match                                                                                                                                            
    return _compile(pattern, flags).match(string) 

so feels like that invocation should be guarded and some correct HTTP error code be returned instead of server "blowing up" with some 5xx error (I guess that is what happened, @TheChymera might be able to confirm)

@TheChymera
Copy link

This is the tail of the invocation log on my end. The error code is 400, not 5xx

sub-M481/ses-20240115165152/ephys/channels.tsv     15.9 kB         0                   done
sub-M481/ses-20240115165152/ephys/contacts.tsv     8.1 kB          0                   done
sub-M481/ses-20240115165152/ephys/probes.tsv       240 Bytes       0              100% done
...152/ephys/sub-M481_ses-20240115165152_ephys.nwb 18.6 GB         1                   ERROR                 failed validation
sub-M481/sessions.json                             2 Bytes         0                   done
sub-M481/sessions.tsv                              69 Bytes        0              100% done
sub-M483/ses-20240122121030/ephys/channels.tsv     15.8 kB         0              100% done
sub-M483/ses-20240122121030/ephys/contacts.tsv     8.1 kB          0              100% done
sub-M483/ses-20240122121030/ephys/probes.tsv       240 Bytes       0                   done
...030/ephys/sub-M483_ses-20240122121030_ephys.nwb 18.6 GB         1                   ERROR                 failed validation
sub-M483/sessions.json                             2 Bytes         0                   done
sub-M483/sessions.tsv                              69 Bytes        0              100% done
Summary:                                           56.9 GB   3 with errors   23.1 kB/s 16 skipped            1 should be edited online
                                                                                       28273 done            15 file exists
                                                                                       676 ERROR             652 Error 400 while sending ...
                                                                                                             21 503 Server Error: Service...
                                                                                                             3 failed validation
2024-03-30 06:46:20,755 [ WARNING] One or more assets failed validation.  Consult the logfile for details.
2024-03-30 06:46:21,200 [    INFO] Logs saved in /home/chymera/.local/state/dandi-cli/log/20240329222222Z-3828524.log
Traceback (most recent call last):
  File "/home/chymera/src/dandi-cli/venvs/mydev/bin/dandi", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/chymera/src/dandi-cli/venvs/mydev/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chymera/src/dandi-cli/venvs/mydev/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/chymera/src/dandi-cli/venvs/mydev/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chymera/src/dandi-cli/venvs/mydev/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chymera/src/dandi-cli/venvs/mydev/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chymera/src/dandi-cli/venvs/mydev/lib/python3.11/site-packages/click/decorators.py", line 45, in new_func
    return f(get_current_context().obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chymera/src/dandi-cli/dandi/cli/base.py", line 126, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/chymera/src/dandi-cli/dandi/cli/cmd_upload.py", line 103, in upload
    upload(
  File "/home/chymera/src/dandi-cli/dandi/upload.py", line 459, in upload
    raise upload_err
  File "/home/chymera/src/dandi-cli/dandi/upload.py", line 357, in process_path
    for r in dfile.iter_upload(
  File "/home/chymera/src/dandi-cli/dandi/files/bases.py", line 354, in iter_upload
    resp = client.post(
           ^^^^^^^^^^^^
  File "/home/chymera/src/dandi-cli/dandi/dandiapi.py", line 306, in post
    return self.request("POST", path, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chymera/src/dandi-cli/dandi/dandiapi.py", line 276, in request
    raise requests.HTTPError(msg, response=result)
requests.exceptions.HTTPError: Error 400 while sending POST request to https://api.dandiarchive.org/api/uploads/initialize/: {"contentSize":["Ensure this value is greater than or equal to 1."]}

@yarikoptic
Copy link
Member Author

thank you! it is those 21 503 Server Error: Service . Please check details in that /home/chymera/.local/state/dandi-cli/log/20240329222222Z-3828524.log on what files got those 503 .
Then what 400 for upload initialization and why, and open issue or PR against dandi-cli as well with those details/fix.

@TheChymera
Copy link

TheChymera commented Apr 1, 2024

This is everything that got 503 → https://ppb.chymera.eu/aa98fa

@yarikoptic
Copy link
Member Author

yarikoptic commented Apr 1, 2024

@jwodder -- from above log could you figure out what specific asset/path triggrered those

1067640:2024-03-30T06:16:25-0400 [DEBUG   ] urllib3.connectionpool 3828524:140691300669120 https://api.dandiarchive.org:443 "POST /api/dandisets/000876/versions/draft/assets/ HTTP/1.1" 503 506
1067641:2024-03-30T06:16:25-0400 [WARNING ] dandi 3828524:140691300669120 Will retry: Error 503 while sending POST request to https://api.dandiarchive.org/api/dandisets/000876/versions/draft/assets/: <!DOCTYPE html>

if not -- could we (in dandi-cli) then log more there for POST , e.g. what path etc?

@jwodder
Copy link
Member

jwodder commented Apr 1, 2024

@yarikoptic

from above log could you figure out what specific asset/path triggrered those

Perhaps if I had the full log rather than a filtered-down version (CC @TheChymera).

could we (in dandi-cli) then log more there for POST , e.g. what path etc?

The logging of failed requests is endpoint-agnostic, so the only way to log the path would be to log the entire request body for all failed POSTs, which would quite often be too much information. (It's also conceivable that, when failing to create an asset in an embargoed Dandiset, this approach could end up logging NWB fields that should arguably be kept secret.)

@TheChymera
Copy link

@jwodder the whole log is over 200MB, uploaded it here → https://ppb.chymera.eu/471899.log

@jwodder
Copy link
Member

jwodder commented Apr 2, 2024

@yarikoptic @TheChymera The assets that produced the 503 were those whose paths contained a @ character, such as code/venvs/reposit/share/jupyter/labextensions/@jupyter-widgets/jupyterlab-manager/install.json.

@jwodder
Copy link
Member

jwodder commented Apr 2, 2024

I can reproduce the 503 error when just trying to create an asset with the same path as one of the failing ones, but if I try something simpler like foo@bar.txt or foo/@bar.txt, I get a 400 error with a "Path improperly formatted" error message instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants