News

By eliminating matrix multiplication and running their algorithm on custom hardware, the researchers found that they could power a billion-parameter-scale language model on just 13 watts, about ...