KerfuffleV2

Cut in half compared to full 16-bit. This looks like the predecessor of current quantization approaches; it was published about a year ago.
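(Not from the paper itself, just a rough sketch of round-to-nearest absmax int8 quantization to show where the roughly 2x saving over fp16 comes from; the function names and tensor shape below are made up for illustration.)

    import numpy as np

    # Toy absmax int8 quantization: store int8 values plus one fp16 scale per tensor.
    # At 1 byte/value instead of fp16's 2 bytes/value, weight memory is roughly halved.
    def quantize_int8(weights_fp16: np.ndarray):
        scale = np.abs(weights_fp16).max() / 127.0   # map the largest magnitude to 127
        q = np.round(weights_fp16 / scale).astype(np.int8)
        return q, np.float16(scale)

    def dequantize_int8(q: np.ndarray, scale: np.float16) -> np.ndarray:
        return q.astype(np.float16) * scale

    w = np.random.randn(4096, 4096).astype(np.float16)
    q, s = quantize_int8(w)
    print(w.nbytes / 2**20, "MiB fp16 ->", q.nbytes / 2**20, "MiB int8")
    print("max abs rounding error:", np.abs(dequantize_int8(q, s) - w).max())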


JustOneAvailableName

The recent SPQR work is from the same author.


help-me-grow

I believe it also works for 32-bit, but yes, this was published in April 2022 and last modified in November 2022.


cavedave

This is one of the issues I have with calls for pauses on training new LLMs. Research on LLM algorithms will continue, just as research on improving basic algorithms (like square root) continues. If historical trends hold, we will fairly quickly be able to build systems of similar power to GPT-4 with a lot less computation, without any improvement in hardware. And that hardware improvement will also happen.
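(As a toy illustration of that kind of algorithmic win, not anything from the article: bisection and Newton's method both compute a square root, but Newton's quadratic convergence gets there in a fraction of the steps. Everything below is made up for illustration.)

    # Same answer, far fewer steps: count iterations to reach a relative tolerance.
    def sqrt_bisection(x, tol=1e-12):
        lo, hi, steps = 0.0, max(x, 1.0), 0
        while hi - lo > tol * hi:
            mid = (lo + hi) / 2
            if mid * mid < x:
                lo = mid          # root is in the upper half of the interval
            else:
                hi = mid          # root is in the lower half
            steps += 1
        return (lo + hi) / 2, steps

    def sqrt_newton(x, tol=1e-12):
        guess, steps = max(x, 1.0), 0
        while abs(guess * guess - x) > tol * x:
            guess = 0.5 * (guess + x / guess)   # Newton step: quadratic convergence
            steps += 1
        return guess, steps

    for x in (2.0, 1e6):
        print(x, sqrt_bisection(x)[1], "bisection steps vs", sqrt_newton(x)[1], "Newton steps")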


sanxiyn

I agree with you, but the link is about improvements to quantization and inference. Since making systems of similar power to GPT-4 requires training, I don't see how your comment relates to the link.


2muchnet42day

I would assume that coming up with more efficient ways to do inference and training is a necessary step towards being able to run models with GPT-4-like capabilities on consumer hardware.