Zafer Doğan - Random feature model on reducing the attention cost in transformers
| Study location | Türkiye, Istanbul |
| --- | --- |
| Type | Summer Research Program - Graduate, full-time |
| Language requirements | English |
| Other requirements | At least 2 references must be provided. |
Overview
Random Feature Attention (RFA) is a linear-time and linear-space attention mechanism that addresses the efficiency challenge of conventional softmax attention in transformers, whose cost grows quadratically with the sequence length. By using random feature methods to approximate the softmax function, RFA offers a more scalable alternative for processing long sequences. The goal of this project is to characterize the training and generalization performance of this model under suitable universality assumptions.
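To make the mechanism concrete, below is a minimal NumPy sketch, not the project's reference implementation. It assumes queries and keys are normalized to unit norm, so the softmax exponentials reduce to a Gaussian-kernel ratio, which random Fourier features then approximate; all names (`rfa_attention`, `random_feature_map`, the dimensions `n`, `d`, `D`) are illustrative.

```python
import numpy as np

def random_feature_map(X, W):
    """Random Fourier features: phi(q) @ phi(k) approximates the
    Gaussian kernel exp(-||q - k||^2 / 2) when the rows of W are
    drawn from N(0, I)."""
    proj = X @ W.T                                   # (n, D)
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=-1) / np.sqrt(W.shape[0])

def rfa_attention(Q, K, V, W):
    """Linear-complexity approximation of softmax attention.

    With unit-norm queries and keys, exp(q . k) equals a constant
    times exp(-||q - k||^2 / 2), and the constant cancels in the
    softmax ratio. Attention then becomes
        phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1),
    computed without ever forming the n x n attention matrix."""
    Q = Q / np.linalg.norm(Q, axis=-1, keepdims=True)
    K = K / np.linalg.norm(K, axis=-1, keepdims=True)
    phi_q = random_feature_map(Q, W)                 # (n, 2D)
    phi_k = random_feature_map(K, W)                 # (n, 2D)
    numerator = phi_q @ (phi_k.T @ V)                # O(n * D * d_v)
    denominator = phi_q @ phi_k.sum(axis=0)          # O(n * D)
    return numerator / denominator[:, None]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, D = 128, 16, 256                           # sequence length, head dim, feature count
    Q, K, V = rng.standard_normal((3, n, d))
    W = rng.standard_normal((D, d))                  # random projection directions

    approx = rfa_attention(Q, K, V, W)

    # Exact softmax attention on the same unit-normalized inputs, for comparison.
    Qn = Q / np.linalg.norm(Q, axis=-1, keepdims=True)
    Kn = K / np.linalg.norm(K, axis=-1, keepdims=True)
    A = np.exp(Qn @ Kn.T)
    exact = (A / A.sum(axis=-1, keepdims=True)) @ V
    print("mean abs error:", np.abs(approx - exact).mean())
```

Because `phi_k.T @ V` is a fixed-size (2D x d_v) summary of the whole sequence, the cost grows linearly in the sequence length n rather than quadratically as with the explicit n x n attention matrix; this trade-off between the number of random features D and the approximation quality is exactly the kind of behavior the project aims to characterize.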