Compression alone is not enough, even with unlimited mental capacity:
to predict, partial or variational concepts have to be detected.
Deep networks = Compression + Association
Multi-Layer = Compression
Self-Attention = Association
Layers = semantic compression (turn raw data into abstract representations)
Attention = contextual association (relate parts to make sense of the whole)
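The two claims above can be sketched as a toy NumPy example. The shapes, the single linear "layer", and the one attention head with Q = K = V are illustrative assumptions, not any particular architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def compress(x, w):
    """A layer as semantic compression: project raw features into a
    lower-dimensional abstract representation (linear map + ReLU)."""
    return np.maximum(x @ w, 0.0)

def self_attention(x):
    """Self-attention as contextual association: each position re-weights
    all positions by similarity, relating parts to form a whole."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                   # pairwise similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ x                              # mix values by weight

seq_len, d_raw, d_abs = 5, 16, 4
tokens = rng.normal(size=(seq_len, d_raw))          # raw data
w = rng.normal(size=(d_raw, d_abs)) / np.sqrt(d_raw)

abstract = compress(tokens, w)          # Multi-Layer = Compression: 16 -> 4
contextual = self_attention(abstract)   # Self-Attention = Association

print(abstract.shape, contextual.shape)
```

Stacking more `compress` layers deepens the abstraction; interleaving `self_attention` lets each abstract token borrow context from the rest of the sequence.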