val223•18d ago

I thought fine-tuning a small model would be a weekend project. It took three weeks.

I needed a model to sort support tickets into specific categories for a client project. I figured I could grab a small open-source model and fine-tune it on my dataset of about 5,000 examples. How hard could it be? The first week went entirely to getting the data cleaned and formatted correctly, which was way messier than I expected. Then I hit a wall with compute costs on a cloud platform: my initial budget of $50 was gone in two days with almost no progress. I had to switch to a different training method entirely, which meant learning a new framework from scratch. The actual training and testing loop, with all the failed runs, ate up the rest of the time. Has anyone else been totally wrong about how long a 'simple' AI tuning job would take? What was your time sink?

2 comments

kevin_flores•18d ago

Oh man, that sounds painfully familiar. The data cleaning stage ALWAYS takes ten times longer than you budget for. My big time sink was dealing with weird memory errors during training that made no sense for days.

rubyw83•18d ago

Tell me about it, @kevin_flores. It's like a law of the universe that the boring prep work takes forever. I see this everywhere, like when you spend an hour looking for a screwdriver for a five-minute fix. Why does the setup always eat up the time?