Regarding camera zooms and positioning for different areas:
VS has a built-in plugin system, so you could create new actions that handle that.
Given how VS is built, I doubt the top-to-bottom order for creating cutscenes will change, but all in all you can handle similar stuff to what Double Fine did in their cutscene editor.
Lipsync is done automatically for the currently speaking character as long as a lipsync file (.tsv) is available.
Doing gestures while talking may need tweaking of these files, as AFRLme already mentioned.
Parallax stuff already works, and zooming with the shader toolkit can do a lot of stuff, too. For these functions you could probably also create a plugin with new actions that control the Lua part, so the user just selects characters, objects, or coordinates and a zoom level in that action...
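
To give a rough idea of what the Lua part behind such an action might calculate, here is a minimal sketch in plain Lua. It does not call the VS or shader toolkit API at all; `zoom_window` and all of its parameters are hypothetical names made up for this example. It only works out which part of the scene would be visible for a given focus point and zoom level, keeping the view inside the scene bounds.

```lua
-- Hypothetical helper, NOT part of the Visionaire Studio API.
-- focus_x, focus_y : point to centre on, in scene coordinates
-- zoom             : zoom level, 1 = no zoom, 2 = twice as close, ...
-- view_w, view_h   : size of the game viewport in pixels
-- scene_w, scene_h : size of the scene background in pixels
-- Returns the top-left corner and the size of the visible area.
local function zoom_window(focus_x, focus_y, zoom, view_w, view_h, scene_w, scene_h)
  assert(zoom >= 1, "zoom must be >= 1")

  -- The visible area shrinks as the zoom level grows.
  local w, h = view_w / zoom, view_h / zoom

  local function clamp(v, lo, hi)
    return math.max(lo, math.min(hi, v))
  end

  -- Centre the area on the focus point, but never show anything outside the scene.
  local x = clamp(focus_x - w / 2, 0, scene_w - w)
  local y = clamp(focus_y - h / 2, 0, scene_h - h)
  return x, y, w, h
end

-- Example: zoom 2x on the centre of a 1920x1080 scene.
local x, y, w, h = zoom_window(960, 540, 2, 1920, 1080, 1920, 1080)
print(x, y, w, h)
```

A plugin action would then pass the position of the selected character or object (or the raw coordinates) as the focus point and hand the result to whatever does the actual zooming.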