Error while writing annData #1143

Closed
3 tasks done
mari-ga opened this issue Sep 20, 2023 · 4 comments

mari-ga commented Sep 20, 2023

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of anndata.
  • (optional) I have confirmed this bug exists on the master branch of anndata.

Report

Hi everyone,
First of all, thank you very much for this wonderful tool.
I'm reopening an issue that seems to have affected other users, such as https://github.com/scverse/anndata/issues/726.
I created an AnnData object based on RNA and HTO data for hashing demultiplexing. No problem there; however, when I try to add the results from demultiplexing, I run into the already-seen error:

TypeError: Can't implicitly convert non-string objects to strings

Above error raised while writing key 'bff' of <class 'h5py._hl.group.Group'> to /

I opened a CSV file as a dataframe; it contains the results from demultiplexing with BFF. The results look like this:
Barcode | BFF
TTTGTCATCTTACCTA-1 | negative
TTTGTCATCTTAGAGC-1 | negative
TTTGTCATCTTAGCCC-1 | HTO_BBX-1
TTTGTCATCTTATCTG-1 | negative
TTTGTCATCTTCAACT-1 | negative
TTTGTCATCTTCATGT-1 | negative
TTTGTCATCTTCCTTC-1 | negative

This is a pretty standard format for demultiplexing results.

The code throwing this error is:

import scanpy as sc
import pandas as pd
from mudata import MuData
import numpy as np

rna_data = sc.read_10x_mtx("/path/raw/rna_raw_matrix")
data = pd.read_csv("bff_assignment.csv")
rna_data.obs = rna_data.obs.merge(data, left_index=True, right_index=True, how='left')
rna_data.obs.rename(columns={rna_data.obs.columns[0]: 'donor'}, inplace=True)
rna_data.obs.donor = rna_data.obs.donor.fillna("negative")
rna_data.obs.donor = rna_data.obs.donor.astype(str)

# code creating the error:
rna_data.write("/bff/adata.h5ad")

The error looks like this:

Traceback (most recent call last):
  File "/Users/me/miniconda3/envs/summary/lib/python3.9/site-packages/h5py/_hl/dataset.py", line 166, in make_new_dset
    dset_id.write(h5s.ALL, h5s.ALL, data)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 283, in h5py.h5d.DatasetID.write
  File "h5py/_proxy.pyx", line 145, in h5py._proxy.dset_rw
  File "h5py/_conv.pyx", line 444, in h5py._conv.str2vlen
  File "h5py/_conv.pyx", line 95, in h5py._conv.generic_converter
  File "h5py/_conv.pyx", line 249, in h5py._conv.conv_str2vlen
TypeError: Can't implicitly convert non-string objects to strings


  File "/Users/ne/miniconda3/envs/summary/lib/python3.9/site-packages/anndata/_io/utils.py", line 229, in re_raise_error
    raise type(e)(
TypeError: Can't implicitly convert non-string objects to strings

Above error raised while writing key 'bff' of <class 'h5py._hl.group.Group'> to /

I would appreciate any assistance you could provide.
Thank you very much in advance!

Versions

anndata 0.9.2
mudata 0.2.3
numpy 1.24.4
pandas 2.0.3
scanpy 1.9.3
session_info 1.0.0


c-westhoven commented Oct 4, 2023

#1068 and #1141 also report this issue.

ivirshup (Member) commented Oct 4, 2023

It looks like the problematic column (bff) isn't being converted to string, while your code is doing some work to cover the column "donor".

I would suggest trying something like: adata.obs["bff"] = adata.obs["bff"].astype("category") before writing.
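
The suggestion above can be sketched with plain pandas. This is a minimal illustration, not the exact code from the issue: the toy `obs` frame stands in for `adata.obs`, and the `fillna("negative")` step is an assumption that mirrors how the issue's code already treats the `donor` column. A left merge can leave `NaN` (a float) next to the string labels, which gives the column an `object` dtype with mixed contents.

```python
import numpy as np
import pandas as pd

# Toy stand-in for adata.obs (hypothetical data): one barcode has no
# demultiplexing result, so the column mixes strings with a float NaN.
obs = pd.DataFrame(
    {"bff": pd.Series(["negative", np.nan, "HTO_BBX-1"], dtype=object)},
)
obs.index = ["TTTGTCATCTTACCTA-1", "TTTGTCATCTTAGAGC-1", "TTTGTCATCTTAGCCC-1"]

# Optionally label the missing assignments (assumption: they count as
# "negative"), then store the column as a categorical before writing.
obs["bff"] = obs["bff"].fillna("negative").astype("category")

print(obs["bff"].dtype)  # category
```

With a real object the same line would be applied to `adata.obs["bff"]` right before the write call.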

c-westhoven commented

@ivirshup Is there a better way of doing this?
I'm testing functions that are new to me, so I don't always know whether columns are added to .obs/.var. And especially because some of the processing time is quite lengthy, it's quite annoying to have to re-run due to this TypeError.
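
One way to avoid discovering the problem only at write time is a quick check that can be run after each new processing step. The sketch below is an assumption, not an anndata feature: `find_mixed_object_columns` is a hypothetical helper that lists `object`-dtype columns containing anything other than Python strings, which are the columns likely to trigger this TypeError.

```python
import pandas as pd


def find_mixed_object_columns(df: pd.DataFrame) -> list:
    """Return names of object-dtype columns holding any non-string value."""
    suspicious = []
    for name in df.columns:
        col = df[name]
        # Only object columns are candidates; numeric, categorical and
        # other typed columns are skipped.
        if not pd.api.types.is_object_dtype(col):
            continue
        # NaN is a float, so a single missing value flags the column.
        if not all(isinstance(value, str) for value in col):
            suspicious.append(name)
    return suspicious


# Hypothetical example frame standing in for adata.obs or adata.var.
example = pd.DataFrame(
    {
        "donor": pd.Series(["negative", "negative"], dtype=object),
        "bff": pd.Series(["HTO_BBX-1", float("nan")], dtype=object),
        "n_counts": [120, 87],
    }
)
flagged = find_mixed_object_columns(example)
print(flagged)  # ['bff']
```

Calling it on both `adata.obs` and `adata.var` before a long computation or before writing shows which columns need attention.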

Additionally, just saving everything as a category isn't always useful, since some "object"-type columns are actually int/float columns. So I need to go through each column and check whether it is better saved as a category or as float/int before saving.
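
The column-by-column check described above could be automated along these lines. This is a sketch under stated assumptions: `tidy_object_columns` is a hypothetical helper, and the rule it applies (numeric if every non-missing value parses as a number, categorical of strings otherwise) is one possible policy, not something anndata prescribes.

```python
import pandas as pd


def tidy_object_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which object columns are numeric when every
    non-missing value parses as a number, and categorical otherwise."""
    out = df.copy()
    for name in out.columns:
        col = out[name]
        if not pd.api.types.is_object_dtype(col):
            continue
        numeric = pd.to_numeric(col, errors="coerce")
        # If coercion created no new missing values, the column was
        # numeric all along and only its dtype was wrong.
        if col.notna().any() and numeric.notna().sum() == col.notna().sum():
            out[name] = numeric
        else:
            # Stringify the non-missing values, keep missing ones as NaN,
            # then store the result as a categorical.
            as_text = col.where(col.isna(), col.astype(str))
            out[name] = as_text.astype("category")
    return out


# Hypothetical example frame standing in for adata.obs.
raw = pd.DataFrame(
    {
        "n_genes": pd.Series([10, 20, None], dtype=object),
        "bff": pd.Series(["negative", float("nan"), "HTO_BBX-1"], dtype=object),
    }
)
clean = tidy_object_columns(raw)
print(clean.dtypes)
```

Note that strings such as "1" also parse as numbers under this rule, so columns of numeric-looking labels would need to be excluded by hand.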

mari-ga (Author) commented Oct 10, 2023

@ivirshup Thank you very much, it worked!

mari-ga closed this as completed Oct 10, 2023