A Few Cautions When Fine-Tuning Decoder-Only Language Models (e.g. GPT-2)

William Briggs