MIT develops cheap flash storage device that rivals a server at processing graph data

Compiled by Xin Zhiyuan

[Xin Zhiyuan overview] Researchers at the Massachusetts Institute of Technology's Computer Science and Artificial Intelligence Laboratory (CSAIL) have designed a device that uses cheap flash storage to process massive graphs with only a single personal computer, matching the performance of a traditional server costing thousands of dollars. Researchers believe this could fundamentally change the way we process big data.

In data science, a graph is a structure of nodes and connecting lines used to map large numbers of complex data relationships. Graph analysis is useful in many applications, such as ranking webpages, analyzing social networks to gauge political opinion, or mapping the neuron structures of the brain.

However, large graphs made up of billions of nodes and lines can reach terabytes in size. Graph data is typically processed in expensive dynamic random-access memory (DRAM), spread across multiple power-hungry servers.

Recently, CSAIL researchers designed a device that uses cheap flash storage (the kind used in smartphones) to process massive graphs with only a single personal computer.

The device consists of a flash chip array (the eight black chips in the image) and a computation "accelerator" (to the left of the chip array). The researchers propose a new algorithm that sorts all access requests for graph data into a sequential order that flash can access quickly and easily, while merging some requests to reduce the overhead of sorting.

Flash chip array plus computation accelerator: server-class performance on a personal computer

Flash storage is normally much slower than DRAM at processing graph data. But the researchers developed a device that combines a flash chip array with a computation "accelerator," allowing flash to reach performance similar to DRAM.

The device is driven by a new algorithm that sorts all access requests for graph data into a sequential order that flash can access quickly and easily. It also merges some requests to reduce the overhead of sorting, meaning the combined computation time, memory, bandwidth, and other computing resources.

The researchers ran the device against several traditional high-performance systems on several large graphs, including the massive Web Data Commons Hyperlink Graph, which has 3.5 billion nodes and 128 billion connecting lines. To process that graph, the traditional systems required a server costing thousands of dollars with 128 GB of DRAM. The researchers achieved the same performance by plugging two of their devices (totaling 1 GB of DRAM and 1 TB of flash) into a desktop computer. Moreover, by combining several devices, they could process even larger graphs, up to 4 billion nodes and 128 billion connecting lines, which the other systems could not handle on the 128 GB server.

The researchers plugged two devices (totaling 1 GB of DRAM and 1 TB of flash) into a desktop computer and achieved the same performance as a traditional server costing thousands of dollars.

Sang-Woo Jun, a CSAIL graduate student and first author of the paper, says: "Most importantly, we can maintain the same performance with fewer, smaller, and cooler devices that consume less power." The paper is to be presented this year at the International Symposium on Computer Architecture (ISCA).

The device could be used to cut the cost and energy consumption associated with graph analytics, and could even improve performance in many applications. For example, the researchers are currently developing a program that can identify genes that cause cancer. Large technology companies such as Google could also use the devices to run analytics on fewer machines and so reduce their energy use.

"Graph processing is a very general idea," says Arvind, a co-author of the study and a professor of computer science and engineering. "What does webpage ranking have in common with gene detection? For us, they are the same computation problem, just different graphs with different meanings."

The paper's other co-authors are CSAIL graduate students Shuotao Xu and Andy Wright, and Sizhuo Zhang of the Department of Electrical Engineering and Computer Science.

Sort-reduce algorithm

In graph analytics, a system searches for and updates a node's value based on its connections with other nodes, among other metrics. In webpage ranking, for example, each node represents a webpage. If node A has a high value and connects to node B, then node B's value will also increase.
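The node-value update described above can be illustrated with a minimal sketch of one PageRank-style step. This is only an illustration, not the system from the paper: the function name `propagate`, the toy edge list, and the damping factor of 0.85 (a conventional choice) are all assumptions, and nodes without outgoing links are not handled specially.

```python
from collections import defaultdict

def propagate(edges, values, damping=0.85):
    """One simplified PageRank-style update over a directed edge list.

    `damping` is the conventional 0.85, an assumption for this sketch.
    """
    out_degree = defaultdict(int)
    for src, _ in edges:
        out_degree[src] += 1

    incoming = defaultdict(float)
    for src, dst in edges:
        # Each node shares its current value equally among its out-links.
        incoming[dst] += values[src] / out_degree[src]

    n = len(values)
    return {node: (1 - damping) / n + damping * incoming[node] for node in values}

# Toy graph: high-valued A links to B, so B's value rises after the update.
edges = [("A", "B"), ("C", "B"), ("B", "C")]
values = {"A": 0.6, "B": 0.2, "C": 0.2}
updated = propagate(edges, values)
```

Every edge produces one small update to a destination node, which is the access pattern the rest of the article is concerned with.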

Traditional systems store all graph data in DRAM, which makes them fast at processing the data but also expensive and power-hungry. Some systems offload part of the data storage to flash, which is cheaper but slower and less efficient, so they still need a substantial amount of DRAM.

The new CSAIL device runs on an algorithm called "sort-reduce," which solves a major problem with using flash as the primary storage source: waste.

Graph analytics systems need to access nodes that may be very far apart across a massive, sparse graph structure. A system typically requests direct access to 4 to 8 bytes of data to update a node's value. DRAM provides that direct access very quickly. Flash, however, can only access data in chunks of 4 KB to 8 KB, yet still updates only a few bytes. Repeating that access for every request while jumping across the graph wastes bandwidth.
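A rough back-of-the-envelope sketch of that waste, using the sizes quoted above (8-byte updates, 4 KB pages). The function names, the one-million-update workload, and the worst-case assumption that every random update lands on a different page are illustrative assumptions, not figures from the paper.

```python
PAGE_BYTES = 4 * 1024   # smallest flash access unit quoted in the article
UPDATE_BYTES = 8        # largest per-node update quoted in the article

def random_access_traffic(num_updates, page_bytes=PAGE_BYTES):
    """Bytes moved if every update touches a different page (assumed worst case)."""
    return num_updates * page_bytes

def sequential_traffic(num_updates, update_bytes=UPDATE_BYTES, page_bytes=PAGE_BYTES):
    """Bytes moved if updates are packed contiguously into whole pages."""
    pages = -(-num_updates * update_bytes // page_bytes)  # ceiling division
    return pages * page_bytes

# Share of each page that a single random update actually uses.
utilization = UPDATE_BYTES / PAGE_BYTES

# Hypothetical workload of one million updates.
n = 1_000_000
amplification = random_access_traffic(n) / sequential_traffic(n)
```

Under these assumptions a random update uses about 0.2 percent of the page it reads, and packing the same updates sequentially moves roughly 500 times less data.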

The sort-reduce algorithm takes all direct access requests and sorts them in sequential order by identifier, which indicates the destination of each request. For example, all updates for node A are grouped together, all updates for node B are grouped together, and so on. Flash can then access kilobyte-sized chunks holding thousands of requests at once, which makes it far more efficient.

To further save computation power and bandwidth, the algorithm simultaneously merges the data into the smallest groupings possible. Whenever the algorithm finds matching identifiers, it combines those updates into a single data packet, for example merging A1 and A2 into A3. It repeats this many times, creating smaller and smaller packets of data with matching identifiers, until it produces the smallest possible packet to sort. This drastically reduces the number of duplicate access requests.
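The sort-then-merge idea can be sketched in a few lines. This is a simplified two-stage illustration rather than the researchers' implementation: the names `merge_same_key` and `sort_reduce`, the `chunk_size` parameter, and the use of addition as the merge function are assumptions made for the example.

```python
import heapq
from functools import reduce
from itertools import groupby
from operator import add, itemgetter

def merge_same_key(sorted_updates, reduce_fn):
    """Collapse consecutive (key, value) pairs with equal keys into one pair."""
    return [
        (key, reduce(reduce_fn, (value for _, value in group)))
        for key, group in groupby(sorted_updates, key=itemgetter(0))
    ]

def sort_reduce(updates, chunk_size, reduce_fn=add):
    """Sort updates by destination node, merging duplicates at every stage."""
    # Stage 1: sort and reduce each chunk small enough for fast memory.
    runs = [
        merge_same_key(sorted(updates[i:i + chunk_size], key=itemgetter(0)), reduce_fn)
        for i in range(0, len(updates), chunk_size)
    ]
    # Stage 2: merge the sorted runs and reduce again across runs.
    merged = heapq.merge(*runs, key=itemgetter(0))
    return merge_same_key(merged, reduce_fn)

# Six update requests aimed at three destination nodes.
updates = [("B", 1), ("A", 2), ("A", 3), ("C", 1), ("A", 4), ("B", 5)]
result = sort_reduce(updates, chunk_size=3)
# result holds one merged update per node, in key order
```

Because duplicates are merged inside each chunk before the runs are combined, later stages have less data to sort, and the final output can be written to storage sequentially with one entry per destination.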

Using the sort-reduce algorithm on two large graphs, the researchers reduced the total data that needed to be updated in flash by about 90 percent.

Custom-built accelerator

The sort-reduce algorithm is computation-intensive for a host computer, however, so the researchers added a custom accelerator to the device. The accelerator acts as a midway point between the host and the flash chips and executes all of the algorithm's computation. This offloads so much work to the accelerator that the host can be a low-powered PC or laptop that manages the sorted data and executes other minor tasks.

Arvind says: "Accelerators are supposed to help the host compute, but the current results show that the host has become unimportant."

"This MIT work shows a new way to perform analytics on very large graphs: using flash memory to store the graphs, and using FPGAs (custom integrated circuits) in an ingenious way to perform the required data processing and analytics," says Keshav Pingali, a professor of computer science at the University of Texas at Austin. "In the long run, this may lead to systems that can process large amounts of data efficiently, which will fundamentally change the way we process big data."

The MIT researchers say that because the host can be so inexpensive, their long-term goal is to create a general-purpose platform and software library so that users can develop their own algorithms for applications beyond graph analytics. Jun says: "You could plug this platform into a laptop, download the software, write simple programs, and get server-class performance on your laptop."

Original article: http://news.mit.edu/2018/device-allows-personal-computer-process-huge-graphs-0531