Your attempt isn't as good as it could have been, mainly because the part where you zoom in doesn't match the second clip. Say, for example, you zoomed in on one emoticon, but the next scene opened already zoomed in on a different one instead. It wouldn't work.
Also, on the technical side, using the pan control in Sony Vegas is the conventional way of doing this, if that is the only thing you are using. By playing around with zoom blurs and motion blurs, and using masking to blur only particular parts, you can create a much smoother transition.
Hope this helps.