in-context examples with performance comparable to the optimal least squares estimator
Voting ended over 3 years agoSucceeded
We also show that we can train Transformers to in-context learn more complex function classes -- namely sparse linear functions, two-layer neural networks, and decision trees -- with pe