General practices to convert existing SQLAlchemy tables into SQLModel #521

Open · 8 tasks done · priamai opened this issue Jan 2, 2023 · 6 comments
Labels: question (Further information is requested)

priamai commented Jan 2, 2023

First Check

  • I added a very descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the SQLModel documentation, with the integrated search.
  • I already searched in Google "How to X in SQLModel" and didn't find any information.
  • I already read and followed all of the tutorial in the docs and didn't find an answer.
  • I already checked if it is not related to SQLModel but to Pydantic.
  • I already checked if it is not related to SQLModel but to SQLAlchemy.

Commit to Help

  • I commit to help with one of those options 👆

Example Code

from typing import Dict, List, Optional

from sqlmodel import Field, SQLModel

class MetricBase(SQLModel):
    id: Optional[int] = Field(default=None, primary_key=True)
    fact_name: str
    dimensions: Optional[List] = Field(default=[])
    measures: Optional[List] = Field(default=[])
    params: Optional[Dict]

Description

It would be nice to show a few examples of how to model array and JSON SQL columns.
In general, what principles should I follow to convert from a SQLAlchemy table definition?

Operating System

Linux

Operating System Details

Ubuntu 21.0

SQLModel Version

0.0.8

Python Version

3.8.10

Additional Context

For example, I am trying to convert this existing table:


from sqlalchemy import Column, Integer, String, DateTime, BigInteger, SmallInteger, LargeBinary, ForeignKey, Table, Float, Boolean
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.dialects.postgresql import JSONB, JSON, ARRAY

Base = declarative_base()

class Metrics(Base):
    __tablename__ = 'metrics'
    __table_args__ = {'extend_existing': True}

    # billing item id
    id = Column(BigInteger, autoincrement=True, primary_key=True)

    # the fact name
    fact_name = Column(String, nullable=False)
    dimensions = Column(ARRAY(String))
    measures = Column(ARRAY(String))
    sql_query = Column(String)
    rest_query = Column(String)
    params_query = Column(JSON)
    chart_types = Column(JSON)
    chart_title = Column(String)

priamai added the question label on Jan 2, 2023

meirdev commented Jan 2, 2023

In general, you can use any column type from SQLAlchemy by passing it to sa_column:

from typing import Dict, List, Optional
from sqlalchemy import Column, String
from sqlalchemy.dialects.postgresql import ARRAY, JSON
from sqlmodel import Field, SQLModel

class MetricBase(SQLModel):
    id: Optional[int] = Field(default=None, primary_key=True)
    fact_name: str
    dimensions: Optional[List] = Field(default_factory=list, sa_column=Column(ARRAY(String)))
    measures: Optional[List] = Field(default_factory=list, sa_column=Column(ARRAY(String)))
    params: Optional[Dict] = Field(default_factory=dict, sa_column=Column(JSON))
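
To round this out, here is a minimal end-to-end sketch of turning that into an actual table and writing a row. The table=True subclass, the connection string, and the sample values are illustrative assumptions, not something from this thread; also note that ARRAY is PostgreSQL-specific, so this needs a PostgreSQL database:

from typing import Dict, List, Optional

from sqlalchemy import Column, String
from sqlalchemy.dialects.postgresql import ARRAY, JSON
from sqlmodel import Field, Session, SQLModel, create_engine

class Metric(SQLModel, table=True):
    __tablename__ = "metrics"  # point at the existing table name from the issue

    id: Optional[int] = Field(default=None, primary_key=True)
    fact_name: str
    dimensions: Optional[List] = Field(default_factory=list, sa_column=Column(ARRAY(String)))
    measures: Optional[List] = Field(default_factory=list, sa_column=Column(ARRAY(String)))
    params: Optional[Dict] = Field(default_factory=dict, sa_column=Column(JSON))

# Assumed connection string; ARRAY columns require PostgreSQL.
engine = create_engine("postgresql://user:password@localhost/metrics")
SQLModel.metadata.create_all(engine)  # skips tables that already exist

with Session(engine) as session:
    session.add(Metric(fact_name="test", dimensions=["a", "b"], params={"foo": "bar"}))
    session.commit()

Setting __tablename__ explicitly keeps the model pointed at the existing metrics table instead of the name derived from the class name.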


priamai commented Jan 2, 2023

Hello, thanks, that makes sense. What will happen if the user gets confused?
A silly example:

class MetricBase(SQLModel):
    id: Optional[int] = Field(default=None, primary_key=True)
    fact_name: str
    dimensions: Optional[List[int]] = Field(default_factory=list, sa_column=Column(ARRAY(String)))
    measures: Optional[List[int]] = Field(default_factory=list, sa_column=Column(ARRAY(String)))
    params: Optional[Dict] = Field(default_factory=dict, sa_column=Column(JSON))

I am actually testing it now...


meirdev commented Jan 2, 2023

This will cause a mismatch between the DB and your API (if you're using FastAPI, for example, you won't be able to send non-int values through the API), but it works when you set the values manually.

This works:

# assuming an engine and session have already been created
metric = Metric()
metric.fact_name = "test"
metric.dimensions = ["a", "b", "c"]  # works when assigned directly, despite the List[int] annotation
metric.measures = [4, 5, 6]
metric.params = {"foo": "bar"}

session.add(metric)
session.commit()

This does not work:

from fastapi import FastAPI

app = FastAPI()

@app.post("/")
async def test_post(metric: Metric):
    return metric

Request body sent to POST /:
{
  "id": 0,
  "fact_name": "test",
  "dimensions": [
    "a",
    "b",
    "c"
  ],
  "measures": [
    4,
    5,
    6
  ],
  "params": {
    "foo": "bar"
  }
}

Response with the validation error:

{
  "detail": [
    {
      "loc": [
        "body",
        "dimensions",
        0
      ],
      "msg": "value is not a valid integer",
      "type": "type_error.integer"
    },
    {
      "loc": [
        "body",
        "dimensions",
        1
      ],
      "msg": "value is not a valid integer",
      "type": "type_error.integer"
    },
    {
      "loc": [
        "body",
        "dimensions",
        2
      ],
      "msg": "value is not a valid integer",
      "type": "type_error.integer"
    }
  ]
}
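
If the intent is for the API to accept the same string values the column stores, the usual fix (my suggestion, not something stated above) is to make the Python annotation agree with the SQLAlchemy column type, e.g. List[str] for ARRAY(String):

from typing import List, Optional

from sqlalchemy import Column, String
from sqlalchemy.dialects.postgresql import ARRAY
from sqlmodel import Field, SQLModel

class Metric(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    fact_name: str
    # List[str] matches ARRAY(String), so FastAPI validation and the database agree
    dimensions: Optional[List[str]] = Field(default_factory=list, sa_column=Column(ARRAY(String)))
    measures: Optional[List[str]] = Field(default_factory=list, sa_column=Column(ARRAY(String)))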


priamai commented Jan 2, 2023

Hello, thanks, this makes total sense now.
What happens when you don't specify the column in the Field? Will the backend automatically infer the nearest possible SQL column type?
Also, is it best practice to specify both default_factory (Pydantic) and default (SQLAlchemy)?
Thanks again for the spoon-feeding!


priamai commented Jan 2, 2023

Answering my own question: I get this error:

ValueError: cannot specify both default and default_factory
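
For reference, a minimal sketch of what triggers that error and the usual way around it (the class names here are just for illustration): Pydantic rejects a field that sets both default and default_factory, so you pick one of them; for mutable types, default_factory alone is enough, and the sa_column still controls the database-side type.

from typing import Dict, Optional

from sqlalchemy import Column, JSON
from sqlmodel import Field, SQLModel

try:
    class Bad(SQLModel):
        # setting both a default and a default_factory is rejected by Pydantic
        params: Optional[Dict] = Field(default={}, default_factory=dict, sa_column=Column(JSON))
except ValueError as exc:
    print(exc)  # cannot specify both default and default_factory

class Good(SQLModel):
    # pick one: default_factory gives each instance its own fresh dict
    params: Optional[Dict] = Field(default_factory=dict, sa_column=Column(JSON))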


meirdev commented Jan 3, 2023

You can find all the conversions between Python types and SQLAlchemy types here:

sqlmodel/sqlmodel/main.py, lines 374 to 414 in 7b3148c:

def get_sqlalchemy_type(field: ModelField) -> Any:
    if issubclass(field.type_, str):
        if field.field_info.max_length:
            return AutoString(length=field.field_info.max_length)
        return AutoString
    if issubclass(field.type_, float):
        return Float
    if issubclass(field.type_, bool):
        return Boolean
    if issubclass(field.type_, int):
        return Integer
    if issubclass(field.type_, datetime):
        return DateTime
    if issubclass(field.type_, date):
        return Date
    if issubclass(field.type_, timedelta):
        return Interval
    if issubclass(field.type_, time):
        return Time
    if issubclass(field.type_, Enum):
        return sa_Enum(field.type_)
    if issubclass(field.type_, bytes):
        return LargeBinary
    if issubclass(field.type_, Decimal):
        return Numeric(
            precision=getattr(field.type_, "max_digits", None),
            scale=getattr(field.type_, "decimal_places", None),
        )
    if issubclass(field.type_, ipaddress.IPv4Address):
        return AutoString
    if issubclass(field.type_, ipaddress.IPv4Network):
        return AutoString
    if issubclass(field.type_, ipaddress.IPv6Address):
        return AutoString
    if issubclass(field.type_, ipaddress.IPv6Network):
        return AutoString
    if issubclass(field.type_, Path):
        return AutoString
    if issubclass(field.type_, uuid.UUID):
        return GUID
    raise ValueError(f"The field {field.name} has no matching SQLAlchemy type")

It's preferred to use default_factory when you're dealing with mutable types.
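
To illustrate the inference above with a small sketch (the Event model is made up for this example): fields declared with plain Python annotations and no sa_column go through get_sqlalchemy_type, so str maps to AutoString, int to Integer, float to Float, bool to Boolean, datetime to DateTime, and so on; types not in that list, such as List or Dict, hit the ValueError at the end and need an explicit sa_column as shown earlier.

from datetime import datetime
from typing import Optional

from sqlmodel import Field, SQLModel, create_engine

class Event(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)  # -> Integer, primary key
    name: str                                                  # -> AutoString (VARCHAR)
    score: float                                               # -> Float
    active: bool = True                                        # -> Boolean
    created_at: Optional[datetime] = None                      # -> DateTime

engine = create_engine("sqlite://")  # in-memory SQLite, just to show the generated schema
SQLModel.metadata.create_all(engine)

# Print the column types SQLModel inferred from the annotations
for column in Event.__table__.columns:
    print(column.name, column.type)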
