RAFT Dataset #409

Open
hieudx149 opened this issue May 4, 2024 · 0 comments

Hello,

Your RAFT paper is truly fantastic. It addresses the issues we've encountered when deploying generative AI applications on enterprise data. I have two questions about the data used in RAFT:

  1. How many data samples did you use to train Llama 2?
  2. Could you share some examples of the actual data you used to train Llama 2? Specifically:
    • An example from the P% of the data of the form Q + D* + D2 + ... + Dk → A*
    • An example from the (1 − P)% of the data of the form Q + D1 + D2 + ... + Dk → A* (when there is no 'oracle' document, what would A* look like? I can't tell whether it would be an answer like "no information to answer")
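To make the two formats concrete, here is a minimal Python sketch of how I read the split from the notation above. It is not the authors' code: the function name `build_raft_example`, the parameter `p_oracle`, the `[doc i]` prompt layout, and the returned dict keys are all my own assumptions for illustration.

```python
import random


def build_raft_example(question, oracle_doc, distractor_docs, answer,
                       k, p_oracle, rng):
    """Build one training example in the two formats described above.

    Hypothetical helper (assumption, not from the RAFT repository).
    With probability p_oracle the context is D* + (k - 1) distractors;
    otherwise it is k distractors and no oracle document.
    """
    include_oracle = rng.random() < p_oracle
    if include_oracle:
        # Q + D* + D2 + ... + Dk
        docs = [oracle_doc] + rng.sample(distractor_docs, k - 1)
    else:
        # Q + D1 + D2 + ... + Dk (distractors only)
        docs = rng.sample(distractor_docs, k)

    # Assumed prompt layout: numbered documents followed by the question.
    context = "\n".join(f"[doc {i + 1}] {d}" for i, d in enumerate(docs))
    return {
        "prompt": f"{context}\n\nQuestion: {question}",
        "target": answer,  # A* in both cases, as the notation is written
        "has_oracle": include_oracle,
    }
```

In this sketch the target is the same A* in both branches, because that is what the notation states; what A* should actually contain when no oracle document is present is exactly what I am asking about.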

Answering these questions would greatly assist me in training our LLM. Thank you very much.
