A self-driving car has to cope with many kinds of situations, and its working environment changes all the time.
So training a good L4 autonomous driving system is not simple. It depends on a reward function and a cost function.
As a result, researchers spend a great deal of effort tuning the parameters of these functions in reinforcement learning. The more complex the environment, the harder the tuning becomes.
The people in Baidu's autonomous driving division, however, want to free up their hands and pass the heavy job of parameter tuning to the AI itself.
So they developed an automatic tuning method that lets the AI learn to handle complex driving scenarios in less training time.
Key takeaway: it adapts quickly to a variety of environments.
Offline tuning is safer
A self-driving car needs an AI system that can handle all kinds of scenarios.
This motion planning system was developed on top of Baidu's Apollo autonomous driving framework.
The system is data-driven. The data it uses includes expert driving data and environment data.
As the figure above shows, the system is split into two parts, online and offline:
1. The online module is responsible for generating an optimal motion trajectory. It uses the reward function.
2. The offline tuning module is what produces the reward function and the cost function, and these functions can be adjusted as the environment changes.
So the second part is the key. To judge whether a set of parameters is any good, both simulation testing and road testing are indispensable (see the figure below).
To cut the time spent on these feedback cycles, Baidu uses a rank-based conditional inverse reinforcement learning framework (Rank Based Conditional IRL) to tune the reward/cost function automatically, replacing endless manual tuning.
How the model is built
So, here is what the model actually looks like:
It is still split into online and offline parts, but now you can see where the new reinforcement learning tuning framework (RC-IRL) sits.
Workflow
The raw feature generator (Raw Feature Generator) takes input from the environment and evaluates sampled trajectories or expert driving trajectories. A few trajectories are picked from these and shared by the online module and the offline module.
Once raw features have been extracted from the trajectories, the reward/cost function in the online evaluator gives each one a score.
Finally, the output trajectory is chosen either by ranking the scores or by dynamic programming (Dynamic Programming).
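The score-then-rank step can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the feature names, the weights, and the simple weighted-sum form of the cost are hypothetical and are not taken from Baidu's system, and only the ranking variant is shown, not dynamic programming.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trajectory:
    name: str
    # Raw features of the trajectory (hypothetical names, for illustration only).
    features: Dict[str, float]


def cost(traj: Trajectory, weights: Dict[str, float]) -> float:
    """Weighted sum of raw features; a lower cost means a better trajectory."""
    return sum(weights[key] * value for key, value in traj.features.items())


def select_best(trajs: List[Trajectory], weights: Dict[str, float]) -> Trajectory:
    """Rank candidates by cost and return the cheapest one."""
    ranked = sorted(trajs, key=lambda t: cost(t, weights))
    return ranked[0]


# Hypothetical weights; in the article's setup these are what gets tuned offline.
weights = {"jerk": 1.0, "lateral_offset": 2.0, "time": 0.5}

candidates = [
    Trajectory("keep_lane", {"jerk": 0.2, "lateral_offset": 0.0, "time": 10.0}),
    Trajectory("change_lane", {"jerk": 0.8, "lateral_offset": 1.0, "time": 8.0}),
    Trajectory("hard_brake", {"jerk": 3.0, "lateral_offset": 0.0, "time": 12.0}),
]

best = select_best(candidates, weights)
```

With these made-up numbers the lane-keeping candidate has the lowest cost and is selected. Changing the weights changes the winner, which is why the choice of weights matters so much.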
Training process
The training data was selected from more than 1,000 hours of expert driving data. Segments with neither a change in speed nor any steering were removed, and the remaining 718 million frames keep the training suitably difficult.
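A filter of this kind might look roughly like the Python sketch below. The frame fields, the thresholds, and the frame-by-frame comparison are all assumptions made for illustration; the article does not say how Baidu implemented the filtering.

```python
from typing import List, NamedTuple


class Frame(NamedTuple):
    speed: float     # vehicle speed in m/s (hypothetical field)
    steering: float  # steering angle in rad (hypothetical field)


def is_informative(prev: Frame, cur: Frame,
                   speed_eps: float = 0.1, steer_eps: float = 0.01) -> bool:
    """Keep a frame if the speed changed or the car is steering.

    The thresholds are illustrative assumptions, not values from the source.
    """
    speed_changed = abs(cur.speed - prev.speed) > speed_eps
    steering_now = abs(cur.steering) > steer_eps
    return speed_changed or steering_now


def filter_frames(frames: List[Frame]) -> List[Frame]:
    """Drop frames with neither a speed change nor steering."""
    kept = []
    for prev, cur in zip(frames, frames[1:]):
        if is_informative(prev, cur):
            kept.append(cur)
    return kept


frames = [
    Frame(10.0, 0.0),
    Frame(10.0, 0.0),   # nothing happens: dropped
    Frame(10.5, 0.0),   # speed changes: kept
    Frame(10.5, 0.2),   # steering: kept
    Frame(10.5, 0.0),   # nothing happens: dropped
]
kept = filter_frames(frames)
```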
Training happens offline, which suits large-scale testing and also the handling of corner cases (Corner Cases).
The data is also collected and labeled automatically, which again saves human labor.
The value function is trained with a Siamese network. This part is there to capture driving behavior, and it does so from a large number of features.
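To make the idea concrete, here is a minimal Python sketch of Siamese-style training with a ranking loss. It is a toy under stated assumptions, not Baidu's network: the value function is a plain linear model standing in for a neural network, the value is treated as a cost (expert trajectories should score lower), and the margin, learning rate, and two-dimensional feature vectors are all hypothetical. The "Siamese" part is that the same weights score both trajectories in each pair.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[Sequence[float], Sequence[float]]  # (expert features, sampled features)


def value(weights: Sequence[float], features: Sequence[float]) -> float:
    """Shared value function: both branches of the pair use the same weights."""
    return sum(w * f for w, f in zip(weights, features))


def pair_loss(weights: Sequence[float], expert: Sequence[float],
              sample: Sequence[float], margin: float = 1.0) -> float:
    """Hinge ranking loss: the expert should cost less than the sample by `margin`."""
    return max(0.0, value(weights, expert) - value(weights, sample) + margin)


def train(pairs: List[Pair], dim: int, lr: float = 0.1,
          epochs: int = 50, margin: float = 1.0) -> List[float]:
    """Plain stochastic gradient descent on the hinge ranking loss."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for expert, sample in pairs:
            if pair_loss(weights, expert, sample, margin) > 0.0:
                # Gradient of the active hinge w.r.t. the weights is (expert - sample).
                weights = [w - lr * (e - s)
                           for w, e, s in zip(weights, expert, sample)]
    return weights


# Hypothetical features, e.g. [jerk, lateral offset]; the expert drives smoothly.
pairs = [
    ([0.1, 0.2], [0.9, 0.3]),
    ([0.1, 0.2], [0.5, 1.0]),
]
learned = train(pairs, dim=2)
```

After training on these made-up pairs, the learned weights rank the expert trajectory ahead of both sampled ones, with no hand tuning of the weights involved.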
Once training is done, it is time for testing. The simulation tests cover jockeying, turning, lane changes, overtaking, and more complex scenarios.
After the simulator comes road testing. As of July 25 this year, the system had logged more than 25,000 miles of road tests. The team says the AI has performed well so far.
Win again?
Two days ago, Baidu announced a partnership with UCAR (Shenzhou Youche) to jointly explore the commercialization of autonomous driving.
A week ago, Waymo's subsidiary "Huimo" set up shop in Shanghai.
On the autonomous driving battlefield, nobody is slowing down.
If Waymo's driverless cars do come to China one day, who knows whether Baidu can "win again".
Well, as far as Baidu is concerned, we will say no more and just watch the show.