GBDTE. The culprit is found.

I started digging really deep and, with the help of my iron friend, checked several hypotheses. I found the optimal expression for the ideal model on this dataset, and this result is worth a separate publication. It was interesting to find that my intuition about the linear model and linear lift was totally correct: in the dataset I constructed, the optimal model's score depends linearly on time.

Today I decided to compare all these theoretical results with one step of my model. You can see the picture at the beginning of the article. This time the problem comes from vibe coding: the model decided it was a good idea to put all the features into an extra part of the dataset. What to do next is quite obvious: separate them and test again. Stay tuned.