diff --git a/README.md b/README.md
index 8094c3d..c4994d5 100644
--- a/README.md
+++ b/README.md
@@ -74,11 +74,11 @@ Should run without errors.
 If using option 2 (existing build), ensure it was compiled with shared libraries:
 
 ```bash
-cmake -B build -DBUILD_SHARED_LIBS=ON -DGGML_CUDA=ON  # or -DGGML_HIP=ON / -DGGML_VULKAN=1
+cmake -B build -DBUILD_SHARED_LIBS=ON  # append backend options as needed, e.g. -DGGML_CUDA=ON
 cmake --build build --config Release -j$(nproc)
 ```
 
-The build must contain `libllama.so` (typically at `build/libllama.so`).
+The build must contain `libllama.so` (typically at `build/bin/libllama.so`).
 
 ## Scripts
 
@@ -128,7 +128,7 @@ Fine-tunes a model using Unsloth with LoRA adapters. Saves LoRA adapter to `./mo
 | `LEARNING_RATE` | Training learning rate | `2e-4` |
 | `MAX_LENGTH` | Maximum sequence length | `4096` |
 | `TRAIN_EPOCHS` | Number of training epochs | `1` |
-| `model_name` (line 74) | Base model to fine-tune | `"unsloth/Llama-3.2-3B-Instruct"` |
+| `model_name` (line 74) | Base model to fine-tune | `"Qwen/Qwen3.5-2B"` |
 
 ```bash
 bash scripts/finetune.sh
@@ -190,10 +190,11 @@ Common issues:
 
 ### Out of memory during training
 
-- Reduce `BATCH_SIZE` in `finetune.py`
-- Increase `GRADIENT_ACCUMULATION_STEPS` to compensate
+- Reduce `BATCH_SIZE` in `finetune.py` (lower = less VRAM usage)
+- Increase `GRADIENT_ACCUMULATION_STEPS` to compensate (higher = longer finetuning time)
+- Keep the effective batch size, `BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS`, at 16 or more (e.g. `2 * 8`)
 - Reduce `MAX_LENGTH` to fit shorter sequences
-- Set `load_in_4bit=True` in `finetune.py` (line 77)
+- Set `load_in_4bit=True` in `finetune.py` (line 77) for QLoRA
 
 ### llama-cpp-python install fails