Hey, I actually had to do this exact thing once. I found I had to get fairly manual with it, using different expressions based on the frame number.
I'm on my phone, so I'm not 100% sure about this and will correct it later, but as an example, here's a counter showing seconds and 30ths of a second (assuming you're replicating an NTSC display) at 24 fps:
[floor((frame/24)%60)]:[floor((frame%24)/24*30)]
You most likely don't need to worry about dynamic hours and minutes (unless it's a really long shot!) and can simply enter them as fixed numbers. To start the counter from a specific time, offset the frame number.
Again, that expression probably isn't correct, but hopefully it gives you some ideas!
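If it helps to sanity-check the arithmetic outside your compositing app, here's a rough Python sketch of the same idea. The function name `counter` and the parameters `fps`, `subdivisions`, and `start_frame` are all made up for illustration, and I'm assuming you want the 30ths value rounded down to a whole number and both fields zero-padded to two digits.

```python
def counter(frame, fps=24, subdivisions=30, start_frame=0):
    """Format a frame number as 'SS:TT' (seconds : 30ths of a second).

    All names here are hypothetical. start_frame is an assumed offset,
    in frames, that shifts where the counter begins.
    """
    f = frame + start_frame

    # Whole seconds elapsed, wrapped at 60 so the field rolls over each minute.
    seconds = (f // fps) % 60

    # Position within the current second, rescaled from fps steps to
    # `subdivisions` steps and rounded down (assumption: truncate, don't round).
    ticks = (f % fps) * subdivisions // fps

    return f"{seconds:02d}:{ticks:02d}"


if __name__ == "__main__":
    for n in (0, 1, 12, 23, 24):
        print(n, counter(n))
```

Because 24 doesn't divide evenly into 30, some of the 30ths values get skipped from one frame to the next. That's expected when you fake a 30-step display from 24 fps footage.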