| Author | Hash | Commit message | Date |
| --- | --- | --- | --- |
| catalpaaa | 4ab679480e | allow quantized model to be loaded from model dir (#760) | 2 years ago |
| oobabooga | 3a47a602a3 | Detect ggml*.bin files automatically | 2 years ago |
| oobabooga | 4c27562157 | Minor changes | 2 years ago |
| Thomas Antony | 79fa2b6d7e | Add support for alpaca | 2 years ago |
| Thomas Antony | 7745faa7bb | Add llamacpp to models.py | 2 years ago |
| oobabooga | 1cb9246160 | Adapt to the new model names | 2 years ago |
| oobabooga | 53da672315 | Fix FlexGen | 2 years ago |
| oobabooga | ee95e55df6 | Fix RWKV tokenizer | 2 years ago |
| oobabooga | fde92048af | Merge branch 'main' into catalpaaa-lora-and-model-dir | 2 years ago |
| oobabooga | 49c10c5570 | Add support for the latest GPTQ models with group-size (#530) | 2 years ago |
| catalpaaa | b37c54edcf | lora-dir, model-dir and login auth | 2 years ago |
| oobabooga | a6bf54739c | Revert models.py (accident) | 2 years ago |
| oobabooga | a80aa65986 | Update models.py | 2 years ago |
| oobabooga | ddb62470e9 | --no-cache and --gpu-memory in MiB for fine VRAM control | 2 years ago |
| oobabooga | e26763a510 | Minor changes | 2 years ago |
| Wojtek Kowaluk | 7994b580d5 | clean up duplicated code | 2 years ago |
| Wojtek Kowaluk | 30939e2aee | add mps support on apple silicon | 2 years ago |
| oobabooga | ee164d1821 | Don't split the layers in 8-bit mode by default | 2 years ago |
| oobabooga | e085cb4333 | Small changes | 2 years ago |
| awoo | 83cb20aad8 | Add support for --gpu-memory witn --load-in-8bit | 2 years ago |
| oobabooga | 1c378965e1 | Remove unused imports | 2 years ago |
| oobabooga | 66256ac1dd | Make the "no GPU has been detected" message more descriptive | 2 years ago |
| oobabooga | 265ba384b7 | Rename a file, add deprecation warning for --load-in-4bit | 2 years ago |
| Ayanami Rei | 8778b756e6 | use updated load_quantized | 2 years ago |
| Ayanami Rei | e1c952c41c | make argument non case-sensitive | 2 years ago |
| Ayanami Rei | 3c9afd5ca3 | rename method | 2 years ago |
| Ayanami Rei | edbc61139f | use new quant loader | 2 years ago |
| oobabooga | 65dda28c9d | Rename --llama-bits to --gptq-bits | 2 years ago |
| oobabooga | fed3617f07 | Move LLaMA 4-bit into a separate file | 2 years ago |
| draff | 001e638b47 | Make it actually work | 2 years ago |