Then there's also the problem that if you put a neural network simulation in charge of the car, you can train it with all sorts of material but you can't explicitly make it do anything - it either does it or it doesn't, or it may sometimes do it and other times not. You never know why it does what it does, you can't explicitly prove that it won't do something wrong, and you can't ask it to explain what it's doing because it's not a general intelligence and doesn't understand your question.
So again it's like training a chimp to drive a car. It may do it very well 99% of the time, and then freak out and crap on the seat and kill everybody - only, the AI has far fewer neurons to play with than a chimp, or a mouse, or even a bumblebee, because of the price and power constraints of your computer system.
For the computer to do as well as people do, it must work 99.99999999% of the time without a major malfunction or incident, because that would be the equivalent of 1 fatal mistake over 100 million miles. Meanwhile, state-of-the-art image recognition software, pre-trained on particular objects, has a mean accuracy of about 81% in detecting those objects from video, as of 2016:
That would be like you driving along the road and being unable to see every fifth car or cyclist etc. you come across, even as they stand right in front of you and you are looking right at them. It's just terrible. It's nowhere near good enough.
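If you want to check the arithmetic yourself, here is a rough Python sketch. It turns "1 fatal mistake over 100 million miles" into a failure rate per mile and per second, and turns 81% accuracy into "one miss in every N objects". The 30 mph average speed is purely my assumption for illustration, and treating mean accuracy as a per-object chance of detection is a simplification. How many nines you end up with depends on whether you count per mile or per second of driving.

```python
# Back-of-the-envelope check of the reliability figures quoted above.
# ASSUMPTION: the average speed is an illustrative value, not a measured one.
# ASSUMPTION: "mean accuracy" is treated as a per-object chance of detection.

MILES_PER_FATAL_MISTAKE = 100_000_000  # "1 fatal mistake over 100 million miles"
ASSUMED_AVG_SPEED_MPH = 30.0           # hypothetical average driving speed
DETECTION_ACCURACY = 0.81              # "about 81%" as quoted above


def failure_rate_per_mile(miles_per_failure):
    """Allowed chance of a fatal mistake in any single mile."""
    return 1.0 / miles_per_failure


def failure_rate_per_second(miles_per_failure, avg_speed_mph):
    """Allowed chance of a fatal mistake in any single second of driving."""
    hours_per_failure = miles_per_failure / avg_speed_mph
    seconds_per_failure = hours_per_failure * 3600.0
    return 1.0 / seconds_per_failure


def objects_per_miss(accuracy):
    """On average, one object is missed out of this many."""
    return 1.0 / (1.0 - accuracy)


if __name__ == "__main__":
    per_mile = failure_rate_per_mile(MILES_PER_FATAL_MISTAKE)
    per_second = failure_rate_per_second(
        MILES_PER_FATAL_MISTAKE, ASSUMED_AVG_SPEED_MPH
    )
    print(f"allowed failure rate per mile:   {per_mile:.2e}")
    print(f"allowed failure rate per second: {per_second:.2e}")
    print(f"one missed object in every {objects_per_miss(DETECTION_ACCURACY):.1f}")
```

With these assumptions the per-second budget comes out close to the ten-nines figure, and an 81% detector misses roughly one object in five, which is where "every fifth car or cyclist" comes from.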