Enable data questions in natural language #324

Open
diogommartins opened this issue Mar 7, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@diogommartins
Contributor

Interfaces involved

  • amora.questions.Question.from_natural_language_prompt(prompt: str) -> Question: Classmethod that creates a Data Question from a question asked in natural language
  • amora.questions.Question.to_model(model_config: Optional[ModelConfig]) -> AmoraModel: Generates an AmoraModel from a data question
  • amora.ai.prompt_context() -> str: Returns the prompt context, built from the schema of the models found in AMORA_PROJECT_PATH
  • amora.ai.SQLPromptAnswer(completion: str, request_params: dict, question: Question):
  • amora.ai.sql_translate(human_question: str) -> SQLPromptAnswer:
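The interfaces above could fit together roughly as follows. This is a minimal sketch, not the project's implementation: `Question` is a hypothetical stand-in for `amora.questions.Question`, and the extra `context` and `complete` parameters are assumptions added only so the example runs without Amora or a network call. Only the `SQLPromptAnswer` field names come from the list above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Question:
    """Hypothetical stand-in for amora.questions.Question."""
    sql: str


@dataclass
class SQLPromptAnswer:
    """Carries the fields listed above: completion, request_params, question."""
    completion: str
    request_params: dict
    question: Question


def sql_translate(
    human_question: str,
    context: str,
    complete: Callable[[dict], str],
) -> SQLPromptAnswer:
    # `context` and `complete` are injected to keep the sketch self-contained;
    # the proposed signature takes only `human_question`.
    prompt = f"{context}\n### A query to answer '{human_question}'\nSELECT"
    request_params = {"prompt": prompt}
    completion = complete(request_params)
    # The prompt ends with SELECT, so the completion is the rest of the query.
    question = Question(sql=f"SELECT {completion.strip()}")
    return SQLPromptAnswer(completion, request_params, question)
```

Passing the completion call in as a function also makes the translation step easy to exercise in tests with a canned answer.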

Examples

{ "prompt": "What are the available devices?", "completion": "DISTINCT device FROM `health`" }
{ "prompt": "What is the maximum heart rate observed today?", "completion": "MAX(value) FROM `heart_rate` WHERE DATE(creationDate) = CURRENT_DATE()" }
{ "prompt": "When was the maximum heart rate observed today?", "completion": "creationDate FROM `heart_rate` WHERE DATE(creationDate) = CURRENT_DATE() ORDER BY value DESC LIMIT 1" }
{ "prompt": "How many steps were taken today?", "completion": "SUM(value) FROM `diogo.steps` WHERE DATE(creationDate) = CURRENT_DATE()" }
{ "prompt": "How many steps were walked yesterday?", "completion": "SUM(value) FROM `apolo.steps` WHERE DATE(creationDate) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)" }
{ "prompt": "What's the average hours of sleep per night?", "completion": "AVG() FROM `sleep` WHERE DATE(creationDate) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) GROUP BY DATE..." }
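Each example pairs a question with the part of the query that follows `SELECT`. A minimal sketch of reading such pairs, assuming they are stored one JSON object per line; `load_examples` and `full_query` are hypothetical helper names, not part of Amora:

```python
import json


def load_examples(jsonl_text: str) -> list:
    """Parse one prompt/completion pair per non-empty line."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def full_query(example: dict) -> str:
    """Prepend the SELECT keyword that the completion leaves out."""
    return f"SELECT {example['completion'].strip()}"


EXAMPLES = """
{ "prompt": "What are the available devices?", "completion": "DISTINCT device FROM `health`" }
"""

for example in load_examples(EXAMPLES):
    print(example["prompt"], "->", full_query(example))
```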

Prompt in English:

### BigQuery SQL tables, with their properties:
#
# amora-data-build-tool.amora.steps(id,sourceName,unit,value,device,creationDate,startDate,endDate)
# amora-data-build-tool.amora.heart_rate_agg(avg,sum,count,year,month)
# amora-data-build-tool.amora.health(id,type,sourceName,sourceVersion,unit,value,device,creationDate,startDate,endDate)
# amora-data-build-tool.amora.array_repeated_fields(str_arr,int_arr,id)
# amora-data-build-tool.amora.step_count_by_source(value_avg,value_sum,value_count,source_name,event_timestamp)
# amora-data-build-tool.amora.edge(from_node,to_node)
# amora-data-build-tool.amora.heart_rate_over_100(unit,value,creationDate,id)
# amora-data-build-tool.amora.heart_rate(id,sourceName,unit,value,device,creationDate,startDate,endDate)
# amora-data-build-tool.amora.steps_agg(avg,sum,count,year,month)
#
### A query to answer 'What are the available health devices?'
SELECT
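A prompt with this layout could be assembled from a mapping of table names to column names. The sketch below is an assumption about how that might look: the proposed `amora.ai.prompt_context()` takes no arguments and reads the models from AMORA_PROJECT_PATH, whereas this version receives the schema explicitly so it can run on its own, and `build_prompt` is a hypothetical helper.

```python
def prompt_context(tables: dict) -> str:
    """Render each table as `# name(col1,col2,...)` under the header line."""
    lines = ["### BigQuery SQL tables, with their properties:", "#"]
    for name, columns in tables.items():
        lines.append(f"# {name}({','.join(columns)})")
    lines.append("#")
    return "\n".join(lines)


def build_prompt(tables: dict, human_question: str) -> str:
    """Append the question and a trailing SELECT for the model to continue."""
    context = prompt_context(tables)
    return f"{context}\n### A query to answer '{human_question}'\nSELECT"


tables = {
    "amora-data-build-tool.amora.edge": ["from_node", "to_node"],
}
print(build_prompt(tables, "What are the available health devices?"))
```

Ending the prompt with `SELECT` nudges the model to continue with the body of a single query instead of free text.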

Completion:

{
  "id": "cmpl-6mPmvIIpWAmcB2te9qAQHMhEp13V3",
  "object": "text_completion",
  "created": 1676996893,
  "model": "code-davinci-002",
  "choices": [
    {
      "text": " DISTINCT sourceName FROM `amora-data-build-tool.amora.health`\n",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 335,
    "completion_tokens": 23,
    "total_tokens": 358
  }
}
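Because the prompt already ends with `SELECT`, the returned text is only the remainder of the query. A minimal sketch of turning a response shaped like the one above into a complete statement; `sql_from_completion` is a hypothetical helper, and rejecting any `finish_reason` other than `"stop"` is an assumption about what should count as a usable answer:

```python
def sql_from_completion(response: dict) -> str:
    """Rebuild the full query from the first choice of a completion response."""
    choice = response["choices"][0]
    if choice["finish_reason"] != "stop":
        # Any other finish reason is treated as an unusable completion.
        raise ValueError(f"unusable completion: {choice['finish_reason']}")
    return f"SELECT {choice['text'].strip()}"


response = {
    "choices": [
        {
            "text": " DISTINCT sourceName FROM `amora-data-build-tool.amora.health`\n",
            "index": 0,
            "logprobs": None,
            "finish_reason": "stop",
        }
    ]
}
print(sql_from_completion(response))
```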

Prompt in Portuguese (the question reads "What is the total number of steps taken to date?"):

### BigQuery SQL tables, with their properties:
#
# amora-data-build-tool.amora.steps(id,sourceName,unit,value,device,creationDate,startDate,endDate)
# amora-data-build-tool.amora.heart_rate_agg(avg,sum,count,year,month)
# amora-data-build-tool.amora.health(id,type,sourceName,sourceVersion,unit,value,device,creationDate,startDate,endDate)
# amora-data-build-tool.amora.array_repeated_fields(str_arr,int_arr,id)
# amora-data-build-tool.amora.step_count_by_source(value_avg,value_sum,value_count,source_name,event_timestamp)
# amora-data-build-tool.amora.edge(from_node,to_node)
# amora-data-build-tool.amora.heart_rate_over_100(unit,value,creationDate,id)
# amora-data-build-tool.amora.heart_rate(id,sourceName,unit,value,device,creationDate,startDate,endDate)
# amora-data-build-tool.amora.steps_agg(avg,sum,count,year,month)
#
### A query to answer 'Qual o total de passos dados até hoje?'
SELECT

Completion:

{
  "id": "cmpl-6mPhmoWY7sww6sjmarpghTU2Jwzii",
  "object": "text_completion",
  "created": 1676996574,
  "model": "code-davinci-002",
  "choices": [
    {
      "text": " SUM(value) AS total_steps\nFROM `amora-data-build-tool.amora.steps`\nWHERE sourceName = 'HKQuantityTypeIdentifierStepCount'\n\n",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 341,
    "completion_tokens": 42,
    "total_tokens": 383
  }
}