A fleeting moment of joy. Vibe analytics.
Thank you very much for the supportive reactions to the previous post about the RAG competition.
A few words about the progress. On Sunday I spent a solid chunk of time playing with the available data.
The idea of the competition is that you get an initial portion of data: 30 PDF documents and 100 questions. You have to build a pipeline that digests these documents and prepares a system to answer questions quickly.
I started with the questions. The prompt "figure out groups of questions with minimal size so questions in each group are as similar as possible" gave me 7 groups of questions.
Then "analyze these questions and figure out a relational scheme which allows us to address them in the most efficient way" gave me quite a complex relational schema, which was implemented in SQLite. It is definitely not an architecture that will win the contest, but it is a solid basis for creating a golden set of answers against which we can compare the answers of our quick RAG system.
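To give a feeling for what such a store can look like, here is a minimal sketch. It is not the schema the model actually produced (that one was much more complex). The table names `documents` and `facts`, their columns, and the helpers `build_store`, `add_fact` and `lookup` are all hypothetical and exist only for illustration.

```python
import sqlite3


def build_store():
    """Create an in-memory SQLite database with a toy two-table schema (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE documents (
            id INTEGER PRIMARY KEY,
            filename TEXT NOT NULL UNIQUE
        );
        CREATE TABLE facts (
            id INTEGER PRIMARY KEY,
            document_id INTEGER NOT NULL REFERENCES documents(id),
            entity TEXT NOT NULL,   -- e.g. a company name
            metric TEXT NOT NULL,   -- e.g. the quantity a question asks about
            value TEXT NOT NULL,    -- stored as text; parsed per question type
            page INTEGER            -- kept so every answer carries a reference
        );
        """
    )
    return conn


def add_fact(conn, filename, entity, metric, value, page):
    """Register the document if needed, then store one extracted fact."""
    conn.execute("INSERT OR IGNORE INTO documents (filename) VALUES (?)", (filename,))
    doc_id = conn.execute(
        "SELECT id FROM documents WHERE filename = ?", (filename,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO facts (document_id, entity, metric, value, page) "
        "VALUES (?, ?, ?, ?, ?)",
        (doc_id, entity, metric, value, page),
    )


def lookup(conn, entity, metric):
    """Return (value, filename, page) rows for a given entity and metric."""
    return conn.execute(
        """
        SELECT f.value, d.filename, f.page
        FROM facts AS f
        JOIN documents AS d ON d.id = f.document_id
        WHERE f.entity = ? AND f.metric = ?
        """,
        (entity, metric),
    ).fetchall()
```

The point of keeping `filename` and `page` next to every value is that each golden answer can be traced back to its source document.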
Then "work without interruptions, create and check hypotheses, evaluate them, find proper answers to all questions in the contest, save explanations" gave me 100 answers. A short review of a few cases showed that the references to documents are quite solid, so it totally makes sense to try this approach as the first submission.
There are several categories of questions. Some of them require a precise answer (a number, a list, a name or a boolean), others a free-text answer. We got a score of 1.0 on precise answers and a nice 0.693 on free-form answers.
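Comparing a quick RAG system against a golden set, split by question category, could be sketched roughly like this. Important assumptions: the category label `free_text`, the function names, and especially the token-overlap F1 used for free-form answers are my own stand-ins. This is not the competition's official metric, which is not described here.

```python
from collections import Counter, defaultdict


def normalize(answer):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(str(answer).lower().split())


def token_f1(predicted, golden):
    """Token-overlap F1, used here as a stand-in metric for free-form answers."""
    pred_tokens = normalize(predicted).split()
    gold_tokens = normalize(golden).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match, one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def score_by_category(rows):
    """rows: iterable of (category, predicted, golden). Returns mean score per category."""
    scores = defaultdict(list)
    for category, predicted, golden in rows:
        if category == "free_text":
            scores[category].append(token_f1(predicted, golden))
        else:
            # Precise categories (number, list, name, boolean): exact match after normalization.
            scores[category].append(float(normalize(predicted) == normalize(golden)))
    return {category: sum(vals) / len(vals) for category, vals in scores.items()}
```

Keeping the precise categories on exact match makes regressions there easy to spot, while the free-form score only gives a rough signal.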
I think it's a big step for our team, and it's very rewarding to reach 7th place after one day of work, out of 144 participants with non-zero submissions and 300 registered teams.
