Mirror of https://github.com/SakanaAI/doc-to-lora.git (synced 2026-04-25 08:06:22 +02:00)
Doc-to-LoRA release
commit 1abe8ae16d
92 changed files with 22131 additions and 0 deletions
25
scripts/main_exp/README.md
Normal file
@@ -0,0 +1,25 @@
# D2L pipeline
### Data
You can either download the pre-generated data (recommended, ~100 GB per model) or generate it yourself.
See [`0-download_data.sh`](0-download_data.sh) for how to download the data for a specific model.
```bash
# download training data for all three models (328 GB)
uv run bash scripts/main_exp/0-download_data.sh
```
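
Since the full download is roughly 328 GB, it may be worth checking free disk space before starting. The following is a minimal sketch, not part of the repository: the helper name `has_free_gb` is hypothetical, and it assumes a `df` that supports the POSIX `-P` and `-k` options.

```bash
#!/usr/bin/env bash
# Hypothetical helper (NOT part of the repository): succeed only if the
# filesystem holding DIR has at least NEED_GB GiB available.
has_free_gb() {
  local dir="$1" need_gb="$2"
  local avail_kb
  # POSIX df: -P prints one line per filesystem, -k reports 1024-byte blocks.
  # The fourth column of the second line is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Usage (assumes the data is stored on the current filesystem):
# has_free_gb . 328 && uv run bash scripts/main_exp/0-download_data.sh
```

The check only looks at the filesystem of the directory you pass in, so point it at wherever the download script actually writes its data.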
Generating the data from scratch can take a very long time unless it is parallelized across multiple GPUs.
```bash
# generate training data (takes very long if not parallelized across multiple GPUs)
# optional: use the command below for generating data from scratch
# uv run bash scripts/main_exp/gen_data.sh
```

### Training
Simply run the training script once the data is ready.
```bash
# train
uv run bash scripts/main_exp/1-train.sh
```
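
If you prefer to chain the download and training steps unattended, one option is a small fail-fast wrapper so that training never starts on an incomplete download. This is only a sketch: `run_steps` is a hypothetical helper, not something the repository provides.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (NOT part of the repository): run each command
# string in order and stop at the first one that fails.
run_steps() {
  local step
  for step in "$@"; do
    echo "==> $step"
    bash -c "$step" || { echo "step failed: $step" >&2; return 1; }
  done
}

# Usage:
# run_steps \
#   "uv run bash scripts/main_exp/0-download_data.sh" \
#   "uv run bash scripts/main_exp/1-train.sh"
```

Each step runs in its own `bash -c`, so a failure in the download leaves the training command untouched.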
### Evaluation
All evaluation scripts for reproducing the main results in the paper are included in the [eval](eval/) directory.