If you want to train on a personal server instead of using Kaggle notebooks, you need to download all the Kaggle datasets to the server. If you’re only using the train/test files provided by the competition, you don’t strictly need the Kaggle API.
But if you want to run the various code snippets posted in the competition’s discussion threads, you’ll need to download a lot of extra datasets. Doing that one by one is tedious and time-consuming, so I wrote a shell script that uses the Kaggle API to batch-download everything, and it made things much easier.
kaggle datasets download -d kishalmandal/extra-data
kaggle competitions download -c chaii-hindi-and-tamil-question-answering
kaggle datasets download -d kishalmandal/cleaned-data-for-chaii
kaggle datasets download -d kishalmandal/input
kaggle datasets download -d msafi04/squad-translated-to-tamil-for-chaii
# Extract each downloaded archive into a directory of the same name.
files=("extra-data" "cleaned-data-for-chaii" "input" "squad-translated-to-tamil-for-chaii" "chaii-hindi-and-tamil-question-answering")
for i in "${files[@]}"; do
  unzip "$i.zip" -d "$i"
done
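The download and unzip steps can also be folded into one pass, so the list of names only has to be written once. The following is a minimal sketch, not the script I actually ran: the function names slug_to_dir, fetch_datasets and fetch_competition are made up for illustration, and it assumes a Kaggle CLI version where `datasets download` accepts `-p` (target directory) and `--unzip` (extract after download). Competition downloads have no such option here, so that archive is still extracted with unzip.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not from the original script.

# Strip the "owner/" prefix from a dataset slug:
# "kishalmandal/input" -> "input".
slug_to_dir() {
  printf '%s\n' "${1##*/}"
}

# Download every dataset slug passed as an argument into its own directory.
# Assumes the Kaggle CLI supports -p (target path) and --unzip for datasets.
fetch_datasets() {
  for slug in "$@"; do
    kaggle datasets download -d "$slug" -p "$(slug_to_dir "$slug")" --unzip || return 1
  done
}

# Download a competition archive into the current directory, then extract it
# into a directory named after the competition.
fetch_competition() {
  kaggle competitions download -c "$1" || return 1
  unzip -o "$1.zip" -d "$1"
}
```

To use it, call `fetch_datasets` with the four dataset slugs listed above, then `fetch_competition chaii-hindi-and-tamil-question-answering`. Passing the slugs as arguments keeps the sketch free of bash-only arrays, so it should also run under plain sh.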