Commit Graph

1031 Commits

Author SHA1 Message Date
Alex "mcmonkey" Goodwin
566898a79a initial lora training tab 2023-03-25 12:08:26 -07:00
oobabooga
8c8e8b4450 Fix the early stopping callback #559 2023-03-25 12:35:52 -03:00
oobabooga
a1f12d607f Merge pull request #538 from Ph0rk0z/display-input-context
Add display of context when input was generated
2023-03-25 11:56:18 -03:00
oobabooga
70f9565f37 Update README.md 2023-03-25 02:35:30 -03:00
oobabooga
25be9698c7 Fix LoRA on mps 2023-03-25 01:18:32 -03:00
oobabooga
3da633a497 Merge pull request #529 from EyeDeck/main
Allow loading of .safetensors through GPTQ-for-LLaMa
2023-03-24 23:51:01 -03:00
oobabooga
9fa47c0eed Revert GPTQ_loader.py (accident) 2023-03-24 19:57:12 -03:00
oobabooga
a6bf54739c Revert models.py (accident) 2023-03-24 19:56:45 -03:00
oobabooga
0a16224451 Update GPTQ_loader.py 2023-03-24 19:54:36 -03:00
oobabooga
a80aa65986 Update models.py 2023-03-24 19:53:20 -03:00
oobabooga
507db0929d Do not use empty user messages in chat mode
This allows the bot to send messages by clicking on Generate with empty inputs.
2023-03-24 17:22:22 -03:00
oobabooga
6e1b16c2aa Update html_generator.py 2023-03-24 17:18:27 -03:00
oobabooga
ffb0187e83 Update chat.py 2023-03-24 17:17:29 -03:00
oobabooga
c14e598f14 Merge pull request #433 from mayaeary/fix/api-reload
Fix api extension duplicating
2023-03-24 16:56:10 -03:00
oobabooga
bfe960731f Merge branch 'main' into fix/api-reload 2023-03-24 16:54:41 -03:00
oobabooga
4a724ed22f Reorder imports 2023-03-24 16:53:56 -03:00
oobabooga
8fad84abc2 Update extensions.py 2023-03-24 16:51:27 -03:00
oobabooga
d8e950d6bd Don't load the model twice when using --lora 2023-03-24 16:30:32 -03:00
oobabooga
fd99995b01 Make the Stop button more consistent in chat mode 2023-03-24 15:59:27 -03:00
Forkoz
b740c5b284 Add display of context when input was generated
Not sure if I did this right but it does move with the conversation and seems to match value.
2023-03-24 08:56:07 -05:00
oobabooga
4f5c2ce785 Fix chat_generation_attempts 2023-03-24 02:03:30 -03:00
oobabooga
04417b658b Update README.md 2023-03-24 01:40:43 -03:00
oobabooga
bb4cb22453 Download .pt files using download-model.py (for 4-bit models) 2023-03-24 00:49:04 -03:00
oobabooga
143b5b5edf Mention one-click-bandaid in the README 2023-03-23 23:28:50 -03:00
EyeDeck
dcfd866402 Allow loading of .safetensors through GPTQ-for-LLaMa 2023-03-23 21:31:34 -04:00
oobabooga
8747c74339 Another missing import 2023-03-23 22:19:01 -03:00
oobabooga
7078d168c3 Missing import 2023-03-23 22:16:08 -03:00
oobabooga
d1327f99f9 Fix broken callbacks.py 2023-03-23 22:12:24 -03:00
oobabooga
9bdb3c784d Minor fix 2023-03-23 22:02:40 -03:00
oobabooga
b0abb327d8 Update LoRA.py 2023-03-23 22:02:09 -03:00
oobabooga
bf22d16ebc Clear cache while switching LoRAs 2023-03-23 21:56:26 -03:00
oobabooga
4578e88ffd Stop the bot from talking for you in chat mode 2023-03-23 21:38:20 -03:00
oobabooga
9bf6ecf9e2 Fix LoRA device map (attempt) 2023-03-23 16:49:41 -03:00
oobabooga
c5ebcc5f7e Change the default names (#518)
* Update shared.py

* Update settings-template.json
2023-03-23 13:36:00 -03:00
oobabooga
29bd41d453 Fix LoRA in CPU mode 2023-03-23 01:05:13 -03:00
oobabooga
eac27f4f55 Make LoRAs work in 16-bit mode 2023-03-23 00:55:33 -03:00
oobabooga
bfa81e105e Fix FlexGen streaming 2023-03-23 00:22:14 -03:00
oobabooga
7b6f85d327 Fix markdown headers in light mode 2023-03-23 00:13:34 -03:00
oobabooga
de6a09dc7f Properly separate the original prompt from the reply 2023-03-23 00:12:40 -03:00
oobabooga
d5fc1bead7 Merge pull request #489 from Brawlence/ext-fixes
Extensions performance & memory optimisations
2023-03-22 16:10:59 -03:00
oobabooga
bfb1be2820 Minor fix 2023-03-22 16:09:48 -03:00
oobabooga
0abff499e2 Use image.thumbnail 2023-03-22 16:03:05 -03:00
oobabooga
104212529f Minor changes 2023-03-22 15:55:03 -03:00
wywywywy
61346b88ea Add "seed" menu in the Parameters tab 2023-03-22 15:40:20 -03:00
Φφ
5389fce8e1 Extensions performance & memory optimisations
Reworked remove_surrounded_chars() to use a regular expression ( https://regexr.com/7alb5 ) instead of repeated string concatenation for elevenlabs_tts, silero_tts, and sd_api_pictures. This should be both faster and more robust in handling asterisks.

Reduced the memory footprint of send_pictures and sd_api_pictures by scaling the images in the chat to at most 300 pixels on the longest side. (The user already has the original of a sent picture, and there is an option to save the SD generation.)
This should keep the history from growing annoyingly large when multiple pictures are present.
2023-03-22 11:51:00 +03:00
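Note on 5389fce8e1: the regex approach described in that message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit. The pattern is an assumption chosen to drop text between asterisks (and text after an unclosed asterisk), and only the function name comes from the message above.

```python
import re

# Assumed pattern: an asterisk, then as few non-asterisk characters as
# possible, then either a closing asterisk or the end of the string.
_SURROUNDED = re.compile(r'\*[^*]*?(\*|$)')


def remove_surrounded_chars(string: str) -> str:
    """Strip *emphasised* spans, including an unterminated trailing one."""
    return _SURROUNDED.sub('', string)


if __name__ == "__main__":
    print(repr(remove_surrounded_chars("Hello *waves* there")))
    print(repr(remove_surrounded_chars("She smiles *and leaves")))
```

A single substitution pass avoids building the result by repeated string concatenation, which is the cost the commit message refers to.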
oobabooga
45b7e53565 Only catch proper Exceptions in the text generation function 2023-03-20 20:36:02 -03:00
oobabooga
6872ffd976 Update README.md 2023-03-20 16:53:14 -03:00
oobabooga
db4219a340 Update comments 2023-03-20 16:40:08 -03:00
oobabooga
7618f3fe8c Add --gptq-preload for 4-bit offloading (#460)
This works on a 4GB card now:

```
python server.py --model llama-7b-hf --gptq-bits 4 --gptq-pre-layer 20
```
2023-03-20 16:30:56 -03:00
Vladimir Belitskiy
e96687b1d6 Do not send empty user input as part of the prompt.
However, if extensions modify the empty prompt to be non-empty,
it'll still work as before.
2023-03-20 14:27:39 -04:00