Deep_Learning_M502019B/LabFinal/train_tokenizer.sh
2025-09-21 02:59:47 +00:00

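#!/usr/bin/env bash
# Train a tokenizer with a 30,000-entry vocabulary on the raw dataset.
set -euo pipefail  # stop on the first error, unset variable, or failed pipe

python src/tokenizer/train.py \
    --dataset_path "data/raw_dataset/VirusComment.py" \
    --template_tokenizer "BAAI/bge-reranker-v2-m3" \
    --vocab_size 30000 \
    --output_dir "data/tokenizer"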