So is that what the paper is doing? Either you tell the robot "function high5() set pose = 0.1 0.4 0.9 1;" or you feed that string into a neural network, something like this:
View attachment 1020322
and out comes "pose 0.11 0.41 0.88 1.01", which is basically the same, but the neural network figures it is better and can adapt to the environment.
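To make the comparison concrete, here is a rough Python sketch of the two options as I understand them. This is not the paper's method: `TinyPolicy`, its random weights, and the `obs` values are all made up for illustration, and a real network would be trained rather than randomly initialised.

```python
import random

# Option 1: hard-coded pose, like the "function high5()" string above.
def high5():
    return [0.1, 0.4, 0.9, 1.0]

# Option 2: a stand-in for a learned policy (hypothetical, untrained).
# One linear layer with small random weights nudges the nominal pose
# based on an observation of the environment.
class TinyPolicy:
    def __init__(self, n_pose=4, n_obs=2, scale=0.05, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-scale, scale) for _ in range(n_obs)]
                  for _ in range(n_pose)]

    def __call__(self, nominal_pose, obs):
        # adjusted joint = nominal joint + weighted sum of the observation
        return [p + sum(w * o for w, o in zip(row, obs))
                for p, row in zip(nominal_pose, self.w)]

policy = TinyPolicy()
obs = [0.3, -0.2]  # made-up environment reading, e.g. where the other hand is
adjusted = policy(high5(), obs)
print(adjusted)  # close to the hard-coded pose, shifted slightly by obs
```

With a zero observation the sketch returns the hard-coded pose unchanged, so the "network" only matters when the environment differs from the nominal case.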
Then you extend this to include language in the input, e.g. "go fetch me a beer and then take the dog out for a walk".