HPC Wire

Since 1987 - Covering the Fastest Computers in the World and the People Who Run Them

Advanced Clustering Integrating NVIDIA Tesla P100 Accelerators Into Line of HPC Clusters

Fri, 02/17/2017 - 07:17

Feb. 17 — Advanced Clustering Technologies is helping customers solve challenges by integrating NVIDIA Tesla P100 accelerators into its line of high performance computing clusters. Advanced Clustering Technologies builds custom, turn-key HPC clusters that are used for a wide range of workloads including analytics, deep learning, life sciences, engineering simulation and modeling, climate and weather study, energy exploration, and improving manufacturing processes.

“NVIDIA-enabled GPU clusters are proving very effective for our customers in academia, research and industry,” said Jim Paugh, Director of Sales at Advanced Clustering. “The Tesla P100 is a giant step forward in accelerating scientific research, which leads to breakthroughs in a wide variety of disciplines.”

Tesla P100 GPU accelerators are based on NVIDIA’s latest Pascal GPU architecture, which provides the throughput of more than 32 commodity CPU-based nodes. The Tesla P100 specifications are:

  • 5.3 teraflops double-precision performance
  • 10.6 teraflops single-precision performance
  • 21.2 teraflops half-precision performance
  • 732GB/sec memory bandwidth with CoWoS HBM2 stacked memory
  • ECC protection for increased reliability
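
The quoted figures are internally consistent with Pascal’s design, in which each halving of precision doubles peak throughput. A quick arithmetic check (numbers taken from the list above):

```python
# Sanity-check the P100 peak-throughput figures quoted above:
# Pascal's FP64:FP32:FP16 peak ratio is 1:2:4.
fp64, fp32, fp16 = 5.3, 10.6, 21.2  # teraflops, as listed

assert abs(fp32 / fp64 - 2.0) < 0.01
assert abs(fp16 / fp32 - 2.0) < 0.01
print(f"FP32/FP64: {fp32 / fp64:.1f}x, FP16/FP32: {fp16 / fp32:.1f}x")
```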

“Customers taking advantage of Advanced Clustering’s high performance computing clusters with integrated NVIDIA Tesla P100 GPUs benefit from the most technologically advanced accelerated computing solution in the market – greatly speeding workload performance across analytics, simulation and modeling, deep learning and more,” said Randy Lewis, Senior Director of Worldwide Field Operations at NVIDIA.

About Advanced Clustering 

Advanced Clustering Technologies, a privately held corporation based in Kansas City, Missouri, is dedicated to developing high-performance computing (HPC) solutions. The company provides highly customized turn-key cluster systems — utilizing out-of-the-box technology — to companies and organizations with specialized computing needs.

The technical and sales teams have more than 50 years of combined industry experience and comprehensive knowledge in the areas of cluster topologies and cluster configurations. In business since 2001, Advanced Clustering Technologies’ commitment to exceeding client expectations has earned the company the reputation as one of the nation’s premier providers of high performance computing systems. For more details, please visit http://www.advancedclustering.com/technologies/gpu-computing/.

Source: Advanced Clustering

The post Advanced Clustering Integrating NVIDIA Tesla P100 Accelerators Into Line of HPC Clusters appeared first on HPCwire.

Tokyo Tech’s TSUBAME3.0 Will Be First HPE-SGI Super

Thu, 02/16/2017 - 22:37

In a press event Friday afternoon local time in Japan, Tokyo Institute of Technology (Tokyo Tech) announced its plans for the TSUBAME3.0 supercomputer, which will be Japan’s “fastest AI supercomputer” when it comes online this summer (2017). Projections are that it will deliver 12.2 double-precision petaflops and, operating in tandem with TSUBAME2.5, 64.3 half-precision petaflops (peak specs).

Nvidia was the first vendor to publicly share the news in the US. We know that Nvidia will be supplying Pascal P100 GPUs, but the big surprise here is the system vendor. The Nvidia blog did not specifically mention HPE or SGI but it did include this photo with a caption referencing it as TSUBAME3.0:

TSUBAME3.0 (Source: Nvidia)

That is almost certainly an HPE rebrand of the SGI ICE XA supercomputer, which would make this the first SGI system win since the supercomputer maker was brought into the HPE fold. For fun, here’s a photo of the University of Tokyo’s “supercomputer system B,” an SGI ICE XA/UV hybrid system:

Source: University of Tokyo-Institute for Solid State Physics

TSUBAME3.0 is on track to deliver more than two times the performance of its predecessor, TSUBAME2.5, which ranks 40th on the latest Top500 list (Nov. 2016) with a LINPACK score of 2.8 petaflops (peak: 5.6 petaflops). When TSUBAME was upgraded from 2.0 to 2.5 in the fall of 2013, the HP Proliant SL390s hardware stayed the same, but the GPU was switched from the NVIDIA (Fermi) Tesla M2050 to the (Kepler) Tesla K20X.

Increasingly, we’re seeing Nvidia refer to half-precision floating point capability as “AI computation.” Half-precision is suitable for many AI training workloads (but by no means all) and it’s usually sufficient for inferencing tasks.

With this rubric in mind, Nvidia says TSUBAME3.0 is expected to deliver more than 47 petaflops of “AI horsepower” and when operated in tandem with TSUBAME2.5, the top speed increases to 64.3 petaflops, which would give it the distinction of being Japan’s highest performing AI supercomputer.
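
A side note on the arithmetic: since TSUBAME3.0 alone is quoted at “more than 47” petaflops, the combined 64.3-petaflop figure implies an upper bound on TSUBAME2.5’s half-precision contribution. A quick check:

```python
# Derive TSUBAME2.5's implied "AI" contribution from the article's figures.
# TSUBAME3.0's share is quoted as "more than 47", so this is an upper bound.
combined_pf = 64.3   # TSUBAME3.0 + TSUBAME2.5, half-precision petaflops
tsubame3_pf = 47.0   # TSUBAME3.0 alone, quoted lower bound

tsubame25_pf = combined_pf - tsubame3_pf
print(f"Implied TSUBAME2.5 contribution: at most {tsubame25_pf:.1f} petaflops")
```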

According to a press release issued in Japan, DDN will supply the storage infrastructure for TSUBAME3.0. The high-end storage vendor is providing a combination of high-speed in-node NVMe SSDs and its Lustre-based EXAScaler parallel file system, consisting of three racks of DDN’s high-end ES14KX appliance with a capacity of 15.9 petabytes and a peak performance of 150 GB/sec.

TSUBAME3.0 is expected to be up and running this summer. The Nvidia release notes, “It will be used for education and high-technology research at Tokyo Tech, and be accessible to outside researchers in the private sector. It will also serve as an information infrastructure center for leading Japanese universities.”

“NVIDIA’s broad AI ecosystem, including thousands of deep learning and inference applications, will enable Tokyo Tech to begin training TSUBAME3.0 immediately to help us more quickly solve some of the world’s once unsolvable problems,” said Tokyo Tech Professor Satoshi Matsuoka, who has been leading the TSUBAME program since it began.

“Artificial intelligence is rapidly becoming a key application for supercomputing,” said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. “NVIDIA’s GPU computing platform merges AI with HPC, accelerating computation so that scientists and researchers can drive life-changing advances in such fields as healthcare, energy and transportation.”

We remind you the story is still breaking, but wanted to share what we know at this point. We’ll add further details as they become available.


Drug Developers Use Google Cloud HPC in the Fight against ALS

Thu, 02/16/2017 - 21:00

Within the haystack of a lethal disease such as ALS (amyotrophic lateral sclerosis / Lou Gehrig’s Disease) there exists, somewhere, the needle that will pierce this therapy-resistant affliction. Finding the needle is a trial-and-error process of monumental proportions for scientists at pharmaceutical companies, medical research centers and academic institutions. As models grow in scale so too does the need for HPC resources to run simulations iteratively, to try-and-fail fast until success is found.

That’s all well and good if there’s ready access to HPC on premises. If not, drug developers, such as ALS researcher Dr. May Khanna, Pharmacology Department assistant professor at the University of Arizona, have turned to HPC resources provided by public cloud services. But using AWS, Azure or Google introduces a host of daunting compute management problems that tax the skills and time availability of most on-site IT staffs.

These tasks include data placement, instance provisioning, job scheduling, configuring software and networks, cluster startup and tear-down, cloud provider setup, cost management and instance health checking. To handle these cloud orchestration functions across 5,000 cores of Google Cloud Preemptible VMs (PVMs), Dr. Khanna and her team at Arizona turned to Cycle Computing to run “molecular docking” simulations at scale with Schrödinger’s Glide molecular modeling drug design software.

The results: simulations that would otherwise take months have been compressed to a few hours, short enough to be run during one of Dr. Khanna’s seminars and the output shared with students.

Dr. May Khanna

Developing new drugs to target a specific disease often starts with the building blocks of the compounds that become the drugs. The process begins with finding small molecules that can target specific proteins that, when combined, can interact in a way that becomes the disease’s starting point. The goal is to find a molecule that breaks the proteins apart. This is done by simulating how the small molecules dock to the specific protein locations. These simulations are computationally intensive, and many molecules need to be simulated to find a few good candidates.
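
As a hypothetical illustration of why this screen parallelizes so well, the sketch below scores candidate molecules independently against a binding site and keeps only the top few. The scoring function is a deterministic stand-in, not Glide, and all names are invented:

```python
import heapq
import random

def docking_score(molecule_id: int, site: str) -> float:
    """Stand-in for a real docking engine such as Glide: returns a
    deterministic pseudo binding score (lower is better) for one
    molecule against one protein site."""
    rng = random.Random(f"{molecule_id}:{site}")
    return rng.uniform(-12.0, 0.0)

def screen(molecules, site, keep=5):
    """Score every candidate independently and keep the `keep` best
    binders. Because each score is independent of the rest, the
    workload splits cleanly into thousands of parallel sub-jobs."""
    scored = ((docking_score(m, site), m) for m in molecules)
    return heapq.nsmallest(keep, scored)

hits = screen(range(10_000), site="active_site_A")
for score, mol in hits:
    print(f"molecule {mol}: score {score:.2f}")
```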

Without powerful compute resources, researchers must artificially constrain their searches, limiting the number of molecules to simulate. And they only check an area of the protein known to be biologically active. Even with these constraints, running simulations takes a long time. Done right, molecular docking is an iterative process that requires simulation, biological verification, and then further refinement. Shortening the iteration time is important to advancing the research.

The objective of Dr. Khanna’s work was to simulate the docking of 1 million compounds to one target protein. After a simulation was complete, the protein was produced in the lab, and compounds were then tested with nuclear magnetic resonance spectroscopy.

“It’s a target (protein) that’s been implicated in ALS,” the energetic Dr. Khanna told EnterpriseTech. “The idea is that the particular protein was very interesting, people who modulated it in different ways found some significant improvement in the ALS models they have with (lab) mice. The closer we can link biology to what we’re seeing as a target, the better chance of actually getting to a real therapeutic.”

“Modulating,” Dr. Khanna explained, is disrupting two proteins interacting in a way that is associated with ALS, a disease that currently afflicts about 20,000 Americans and for which there is no cure. “We’re trying to disrupt them, to release them to do their normal jobs,” she said.

She said CycleCloud plays a central role in running Schrödinger Glide simulations. Without Google Cloud PVMs, simulations would take too long and model sizes would be too small to generate meaningful results. Without CycleCloud, the management of 5,000 PVM nodes would not be possible.

CycleCloud provides a web-based GUI, a command line interface and APIs to define cloud-based clusters. It auto-scales clusters by instance type, maximum cluster size and costing parameters, deploying systems of up to 156,000 cores while validating each piece of the infrastructure. Additionally, it syncs in-house data repositories with cloud locations in a policy- and job-driven fashion to lower costs.
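
CycleCloud’s actual configuration syntax isn’t shown in the article, so the sketch below expresses only the scaling policy described above, sizing the cluster to the queue, capped by a maximum cluster size and a cost limit, with hypothetical names and prices throughout:

```python
def target_nodes(jobs_queued: int, cores_per_node: int,
                 max_nodes: int, budget_per_hour: float,
                 price_per_node_hour: float) -> int:
    """Pick a cluster size: one core per queued job, capped by the
    configured maximum cluster size and an hourly budget. This mirrors
    the auto-scaling behavior described above, not CycleCloud's API."""
    wanted = -(-jobs_queued // cores_per_node)          # ceiling division
    affordable = int(budget_per_hour // price_per_node_hour)
    return max(0, min(wanted, max_nodes, affordable))

# 300 sub-jobs of 16 cores each on 16-core instances (illustrative prices):
print(target_nodes(jobs_queued=300 * 16, cores_per_node=16,
                   max_nodes=313, budget_per_hour=60.0,
                   price_per_node_hour=0.16))
```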

It should be noted that the use of Google Cloud’s PVMs, while helping to hold the cost of running the simulations down to $200, contributes an additional degree of complexity to Dr. Khanna’s project work. Preemptible compute capacity offers the advantage of a consistent price not subject to the dynamic demand pricing of other public cloud instances. PVMs are assigned to a job for a finite period of time but – here’s the rub – they can be revoked at any moment. Dr. Khanna’s workflow, consisting of small, short-running jobs, was ideal for leveraging PVMs, but the nodes can still disappear without warning.

In the case of Dr. Khanna’s ALS research work, said Jason Stowe, CEO of Cycle Computing, “if you’re willing to risk getting rid of the node but are able to use it during that timeframe at substantially lower cost, that allows you to get a lot more computing bang for your buck. CycleCloud automates the process, taking care of nodes that go away, making sure the environment isn’t corrupted, and other technical aspects that we take care of so the user doesn’t have to.”
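
A minimal sketch of the requeue-on-preemption pattern Stowe describes, assuming independent short jobs and an invented preemption rate (this is not CycleCloud code):

```python
import random
from collections import deque

def run_with_preemption(jobs, preempt_rate=0.1, seed=42):
    """Run a queue of short, independent jobs on preemptible capacity:
    a preempted job is simply requeued and retried, so a revoked node
    costs only the single job it was running at the time."""
    rng = random.Random(seed)
    queue = deque(jobs)
    attempts, done = 0, []
    while queue:
        job = queue.popleft()
        attempts += 1
        if rng.random() < preempt_rate:   # node revoked mid-job
            queue.append(job)             # retry later; nothing else lost
        else:
            done.append(job)
    return done, attempts

done, attempts = run_with_preemption(range(300))
print(f"{len(done)} jobs finished in {attempts} attempts")
```

Because each job is short, the work lost per revocation is small, which is why this workload tolerates PVMs so well.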

The simulation process is divided into two parts. The first step uses the Schrödinger LigPrep package, which converts 2D structures to the 3D format used in the next stage. This stage started with 4 GB of input data staged to an NFS filer. The output data was approximately 800KB and was stored on the NFS filer as well. To get the simulation done as efficiently as possible, the workload was split into 300 smaller jobs to assist in scaling the next stage of the workflow. In total, the first stage consumed 1500 core-hours of computation.

The Schrödinger Glide software package performs the second stage of the process, where the actual docking simulation is performed. Each of the 300 sub-jobs consists of four stages, each with an attendant prep stage. The total consumption was approximately 20,000 core-hours using 5,000 cores of n1-highcpu-16 instances. Each instance had 16 virtual cores with 60 gigabytes of RAM. The CycleCloud software dynamically sized the cluster based on the number of jobs in queue and replaced preempted instances.
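
A back-of-envelope check on these figures, consistent with the “few hours” runtime mentioned earlier:

```python
# Figures quoted for the second (Glide) stage of the workflow.
cores_total = 5_000          # total cores in the cluster
core_hours = 20_000          # total computation consumed
jobs = 300                   # number of sub-jobs
vcores_per_instance = 16     # n1-highcpu-16 virtual cores

instances = -(-cores_total // vcores_per_instance)   # ceiling division
wall_clock_hours = core_hours / cores_total          # ideal, fully utilized
core_hours_per_job = core_hours / jobs

print(f"~{instances} instances, ~{wall_clock_hours:.0f} h wall clock, "
      f"~{core_hours_per_job:.0f} core-hours per sub-job")
```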

Dr. Khanna’s research is in the early stages of a process that, if successful, could take several years to reach human clinical trials.

“The faster we can do this, the less time we have to wait for results, so we can go back and test it again and try to figure out what compounds are really binding,” she said, “the faster the process can move along.”

Dr. Khanna said plans are in place to increase the size of the pool of potential compounds, as well as to include other proteins in the simulation to look for interactions that would not typically be seen until later in the process. The team will also simulate over the entire surface of the protein instead of just a known-active area, unlocking “an amazing amount of power” in the search process, she said.

“That jump between docking to binding to biological testing takes a really long time, but I think we can move forward on that with this cloud computing capacity,” she said. “The mice data that we saw was really exciting…, you could see true significant changes with the mice. I can’t tell you we’ve discovered the greatest thing for ALS, but showing that if we take these small molecules and we can see improvement, even that is so significant.”


Dutch Startup Offers Immersive Cooling for Cloud, Edge and HPC Datacenter

Thu, 02/16/2017 - 19:53

HAARLEM, The Netherlands, Feb. 16, 2017 — Asperitas, a cleantech startup from the Amsterdam area, one of the world’s datacentre hotspots, is introducing a unique solution based on a total liquid cooling concept called Immersed Computing.

After 1.5 years of research and development with an ecosystem of partners, Asperitas is launching its first market-ready solution, the AIC24, at the leading international industry event Data Centre World & Cloud Expo Europe.

The AIC24

The Asperitas AIC24 is at the centre of Immersed Computing. It is a closed system and the first water-cooled oil-immersion system that relies on natural convection for circulation of the dielectric liquid. The result is a fully self-contained and Plug and Play modular system. The AIC24 needs far less infrastructure than any other liquid-cooling installation, saving energy and costs at all levels of datacentre operations. The company bills it as the most sustainable solution available for IT environments today, ensuring the highest possible efficiency in availability, energy reduction and reuse while increasing capacity and greatly improving density.

The AIC24 is designed to ensure the highest possible continuity for cloud providers. Total immersion ensures no oxygen comes into contact with the IT components, preventing oxidation. Thermal shock is greatly reduced thanks to the high heat capacity of the liquid, and the immersed environment sees only minor temperature fluctuations, greatly reducing the stress that thermal expansion places on micro-electronics. Together, these factors eliminate the root causes of most physical degradation of micro-electronics over time.

Plug and Play green advanced computing anywhere

The AIC24 is Plug and Play: a single module requires only power, access to a water loop and data connectivity to operate. Combined with its silent operation, these limited requirements allow high flexibility in deployment sites and scenarios for the AIC24.

Two specially designed Convection Drives, which combine forced water flow with natural circulation of the oil, are capable of transferring 24 kW of heat from the oil while keeping all the IT components at allowable operating temperatures.
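
For context, a standard heat-balance estimate shows the water flow such a module would need; the 10 K water temperature rise is an assumption for illustration, not an Asperitas specification:

```python
# Water flow needed to carry away 24 kW, from Q = m_dot * c * delta_T.
# The 10 K inlet/outlet temperature rise is an assumed value.
q_watts = 24_000          # heat load per AIC24 module, W
c_water = 4186            # specific heat of water, J/(kg*K)
delta_t = 10              # assumed water temperature rise, K

m_dot = q_watts / (c_water * delta_t)   # mass flow, kg/s
liters_per_min = m_dot * 60             # ~1 kg of water per liter
print(f"~{m_dot:.2f} kg/s ({liters_per_min:.0f} L/min) of water")
```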

To maximise IT capacity, the Asperitas Universal Cassette (AUC) can contain multiple physical servers. Each module accommodates 24 AUCs, as well as two Universal Switching Cassettes. This currently adds up to 48 immersed servers and two immersed switches.

Immersed Computing

Immersed Computing is a concept driven by sustainability, efficiency and flexibility, and goes far beyond just technology. In many situations, Immersed Computing can save more than 50% of the total energy footprint: immersion reduces IT energy by 10–45% by removing fans, while other consumers such as cooling installations can achieve up to 95% energy reduction. It also allows warm-water cooling, which provides further energy savings on cooling installations, and it enables high-temperature heat reuse.

Immersed Computing includes an optimised way of working, highly effective deployment, flexible choice of IT and drastic simplification of datacentre design. Offering great advantages at all levels of any datacentre value chain, Immersed Computing realises maximum results in cloud, private and edge environments.

About Asperitas

Asperitas is a cleantech company focused on greening the datacentre industry by introducing Immersed Computing.

The Asperitas Development partners include University of Leeds, Aircraft Development and Systems Engineering (ADSE), Vienna Scientific Cluster, Super Micro, Schleifenbauer and Brink Industrial. Asperitas is furthermore recognised and supported by the Netherlands Enterprise Agency as a promising new cleantech company.

Source: Asperitas


Weekly Twitter Roundup (Feb. 16, 2017)

Thu, 02/16/2017 - 13:53

Here at HPCwire, we aim to keep the HPC community apprised of the most relevant and interesting news items that get tweeted throughout the week. The tweets that caught our eye this past week are presented below.

Paul Messina of @argonne: capable #exascale system will have well-balanced ecosystem of software, hardware & applications | #HPC pic.twitter.com/dhidodsbod

— Chris Mustain (@ChrisMustain) February 16, 2017

The world largest student supercomputer challenge -ASC 2017 kicks off today in Zhengzhou University, over 230 teams from 15 countries.#HPC pic.twitter.com/CvWpU2rn6N

— Data Center Systems (@InspurServer) February 16, 2017

Alexander Heinecke from @IntelHPC gives a @KaustCemse seminar on Efficient Vectorization using #Intel AVX-512 #HPC pic.twitter.com/YktH7x6Mli

— Bilel Hadri (@mnoukhiya) February 13, 2017

My German immigrant hubby contributes to USA's supercomputing capabilities every day. Danke, Schatz! #HPC #Valentines #ToImmigrantsWithLove

— LauraSchulz (@lauraschulz) February 15, 2017

Wayne State C&IT was happy to be a part of the XSEDE HPC Workshop: Big Data Fri, Feb. 10. Did you attend? #bigdata #hpc pic.twitter.com/eGdM1JSF7K

— Wayne State C&IT (@WayneStateCIT) February 13, 2017

Doug Kothe of @ORNL: US leads in advanced computing applications, but China is rapidly closing the gap | #HPC @CompeteNow @exascaleproject

— Chris Mustain (@ChrisMustain) February 16, 2017

Thank you @ComputeCanada for the tuque … and for your support of the Canadian and global #HPC community! #tuques4compute pic.twitter.com/0DB7WMsQim

— Sharon Broude Geva (@SBroudeGeva) February 15, 2017

As seen in a source file today "If you edit this, you'll get what you deserve" #gemsofHPC #HPC #donteventrytoeditthatline #justdont

— Fernanda Foertter (@hpcprogrammer) February 15, 2017

Thanks to everyone that attended NCI's first ever #HPC Summer School last week @our_ANU. Check out more photos over on our Facebook page. pic.twitter.com/6eSKk7nbpA

— NCI Australia (@NCInews) February 12, 2017

What will rely on #exascale computing? A short list of applications | #HPC pic.twitter.com/xPLKMCnBrV

— Chris Mustain (@ChrisMustain) February 16, 2017

Alexander Heinecke from #intel gives a talk about efficient vectorization of deep learning kernels using #Intel AVX-512 #HPC @KAUST_HPC pic.twitter.com/bFIb5XNGAh

— George Markomanolis (@geomark) February 13, 2017

#HPC Summer School a great chance to grad/PhD students from across the world to learn newest innovations in #HPC.https://t.co/ce4XFOhFWN pic.twitter.com/rW8BgmNHY9

— NCSAatIllinois (@NCSAatIllinois) February 13, 2017

My funny valentine, you're my fav work of art #valentinesday #HPC pic.twitter.com/feL84ASvVZ

— Fernanda Foertter (@hpcprogrammer) February 14, 2017



230 Teams worldwide join ASC17 to challenge AI and TaihuLight

Thu, 02/16/2017 - 11:00

February 16, 2017 — The 2017 ASC Student Supercomputer Challenge (ASC17) held its opening ceremony at Zhengzhou University. 230 teams from all over the world will take on the world’s fastest supercomputer, Sunway TaihuLight, an artificial intelligence application and a Gordon Bell Prize finalist application, competing for 20 places in the finals. Hundreds of supercomputing experts and team representatives from around the world attended the opening ceremony.

The number of teams registered for the ASC17 Challenge has reached a new high, up 31% compared to last year. The competition platforms and applications have been designed to reflect the leading edge: Sunway TaihuLight and the most advanced supercomputer in Henan province (in central China) will run the different competition applications. Baidu’s AI application for intelligent driving traffic prediction and MASNUM_WAVE, a high-resolution global surface wave simulation and 2016 Gordon Bell Prize finalist, will give the teams the opportunity to challenge the “Super Brain” and “Big Science.” Meanwhile, the ASC17 finals will include 20 teams, up from the original 16.

Wang Endong, initiator of the ASC challenge, academician of the Chinese Academy of Engineering and chief scientist at Inspur, said that with the convergence of HPC, big data and cloud computing, intelligent computing as represented by artificial intelligence will become the most important and significant component of the coming computing industry and bring new challenges in computing technologies. For two consecutive seasons, the ASC Challenge has set AI applications in the hope that students will come to understand deep learning algorithms and acquire knowledge of big data and cutting-edge computing technologies, thereby grooming interdisciplinary supercomputing talent for the future.

On the day of the opening ceremony, Henan province’s fastest supercomputer was launched at the Zhengzhou University (Zhengzhou City) Supercomputing Center and became one of the competition platforms for ASC17. Liu Jiongtian, academician of the Chinese Academy of Engineering and president of Zhengzhou University, was unable to attend the event but said he believed the system will allow teams worldwide to experience the latest technology, such as the KNL many-core architecture. At the same time, it will help accelerate supercomputing application innovation in Zhengzhou and Henan province, groom supercomputing talent in the region, promote smart-city development in Zhengzhou, and support rapid economic development across central China.

Yang Guangwen, director of the National Supercomputing Center in Wuxi, said that all the processors used in Sunway TaihuLight were designed in China, and that it is the world’s first supercomputer to achieve 100 petaflops. Using Sunway TaihuLight as a competition platform will give each team the opportunity to experience the world’s fastest supercomputer and so better promote the training of young talent. At the same time, the international exchanges resulting from the ASC17 Challenge will help more people appreciate China’s capability for independent design in the supercomputing domain.

The organizers of the ASC17 Challenge have also arranged a two-day intensive training camp for the participants, where experts from the National Supercomputing Center in Wuxi, Baidu and Inspur conducted comprehensive and systematic lectures. Topics included supercomputer system design, the KNL architecture, deep learning application optimization, and techniques for using Sunway TaihuLight.

The ASC Student Supercomputer Challenge was initiated by China and is supported by experts and institutions worldwide. The competition aims to be a platform that promotes exchanges among young supercomputing talent from different countries and regions, as well as to groom young talent. It also aims to be a key driving force in promoting technological and industrial innovation by raising the standard of supercomputing applications and research. The ASC Challenge has been held for six years; this year’s ASC17 Challenge is co-organized by Zhengzhou University, the National Supercomputing Center in Wuxi, and Inspur.


Data Analytics Gets the Spotlight in Distinguished Talk Series at ISC 2017

Thu, 02/16/2017 - 08:42

FRANKFURT, Germany, Feb. 16 — In a continuous effort to diversify topics at the ISC High Performance conference, the organizers are pleased to announce that two of this year’s presentations in the Distinguished Talk series will focus on data analytics in manufacturing and scientific applications.

The ISC 2017 Distinguished Talk series will offer five talks, spread over Tuesday, June 20 and Wednesday, June 21. The five-day technical program sessions will be held from Sunday, June 19 through Thursday, June 22. Over 3,000 attendees are expected at this year’s conference.

On Tuesday afternoon at 1:45 PM, cybernetics expert Dr. Sabine Jeschke, who heads the Cybernetics Lab at RWTH Aachen University, will deliver a talk titled “Robots in Crowds – Robots and Clouds.” Jeschke’s presentation will be followed by one from physicist Kerstin Tackmann of the German Electron Synchrotron (DESY) research center, who will discuss big data and machine learning techniques used for the ATLAS experiment at the Large Hadron Collider.

Jeschke’s research expertise lies in distributed artificial intelligence, robotics, automation, and virtual worlds, among other areas. In her talk, she will discuss new trends in mobile robotic systems, with special emphasis on the relationship between AI, cognitive systems and robotics. She will also present new paradigms for robotic platforms, for example, humanoids, robots on wheels, animal-like robots and industrial robots, with respect to their application areas and their physical realization.

In her abstract, she specifies the need to consider big data and its analytics as a critical aspect of robotics. At the same time, Jeschke also identifies how high performance computing will need to be applied to robotic systems. She will be sharing all of these topics and more with the ISC 2017 audience.

Tackmann, who will be speaking immediately after Jeschke, will give an overview of the ATLAS experiment, with particular attention to its enormous flow of data generated by the ATLAS detectors. She will present some of the experiment’s results, and give an overview of the technologies employed to store, search and retrieve experimental data and metadata, including the use of analytics tools and machine learning techniques.

The ATLAS experiment is one of the two multi-purpose experiments at the Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN) in Geneva. Since ATLAS began collecting data in 2009, it has been used to understand the processes described by the Standard Model of elementary particle physics, identify the Higgs boson, and search for particles and phenomena beyond the Standard Model.

Tackmann, who earned her PhD in experimental particle physics from the University of California, Berkeley, has been involved in the ATLAS experiment since 2011. She leads the Helmholtz Young Investigators Group “Higgs Physics with Photons,” which is part of the ATLAS group at the DESY center.

Source: ISC


ExxonMobil and NCSA Achieve Simulation Breakthrough

Thu, 02/16/2017 - 07:27

SPRING, Tex., Feb. 16 — ExxonMobil, working with the National Center for Supercomputing Applications (NCSA), has achieved a major breakthrough with proprietary software using more than four times the previous number of processors used on complex oil and gas reservoir simulation models to improve exploration and production results.

The breakthrough in parallel simulation used 716,800 processors, the equivalent of harnessing the power of 22,400 computers with 32 processors per computer. ExxonMobil geoscientists and engineers can now make better investment decisions by more efficiently predicting reservoir performance under geological uncertainty to assess a higher volume of alternative development plans in less time.

The record run produced data output thousands of times faster than typical oil and gas industry reservoir simulations. It was the highest processor count reported by the oil and gas industry, and one of the largest simulations reported by any industry in engineering disciplines such as aerospace and manufacturing.

“This breakthrough has unlocked new potential for ExxonMobil’s geoscientists and engineers to make more informed and timely decisions on the development and management of oil and gas reservoirs,” said Tom Schuessler, president of ExxonMobil Upstream Research Company. “As our industry looks for cost-effective and environmentally responsible ways to find and develop oil and gas fields, we rely on this type of technology to model the complex processes that govern the flow of oil, water and gas in various reservoirs.”

The major breakthrough in parallel simulation results in dramatic reductions in the amount of time previously taken to study oil and gas reservoirs. Reservoir simulation studies are used to guide decisions such as well placement, the design of facilities and development of operational strategies to minimize financial and environmental risk. To model complex processes accurately for the flow of oil, water, and natural gas in the reservoir, simulation software must solve a number of complex equations. Current reservoir management practices in the oil and gas industry are often hampered by the slow speed of reservoir simulation.

ExxonMobil’s scientists worked closely with the NCSA to benchmark a series of multi-million to billion cell models on NCSA’s Blue Waters supercomputer. This new reservoir simulation capability efficiently uses hundreds of thousands of processors simultaneously and will have dramatic impact on reservoir management workflows.

“NCSA’s Blue Waters sustained petascale system, which has benefited the open science community so tremendously, is also helping industry break through barriers in massively parallel computing,” said Bill Gropp, NCSA’s acting director. “NCSA is thrilled to have worked closely with ExxonMobil to achieve the kind of sustained performance that is so critical in advancing science and engineering.”

ExxonMobil’s collaboration with the NCSA required careful planning and optimization of every aspect of the reservoir simulator, from input/output to communication across hundreds of thousands of processors. These efforts delivered strong scalability at processor counts ranging from more than 1,000 to nearly 717,000, the latter being the full capacity of NCSA’s Cray XE6 system.

About ExxonMobil

ExxonMobil, the largest publicly traded international oil and gas company, uses technology and innovation to help meet the world’s growing energy needs. We hold an industry-leading inventory of resources and are one of the largest integrated refiners, marketers of petroleum products and chemical manufacturers. For more information, visit www.exxonmobil.com or follow us on Twitter www.twitter.com/exxonmobil.

About the National Center for Supercomputing Applications

The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign provides supercomputing and advanced digital resources for the nation’s science enterprise. At NCSA, University of Illinois faculty, staff, students, and collaborators from around the globe use advanced digital resources to address research grand challenges for the benefit of science and society. NCSA has been advancing one third of the Fortune 50 for more than 30 years by bringing industry, researchers and students together to solve grand challenges at rapid speed and scale. The Blue Waters Project is supported by the National Science Foundation through awards ACI-0725070 and ACI-1238993.

Source: ExxonMobil

The post ExxonMobil and NCSA Achieve Simulation Breakthrough appeared first on HPCwire.

Submissions Open for the SC Test of Time Award

Thu, 02/16/2017 - 07:10

Feb. 16 — The SC Test of Time Award (ToTA) Committee is soliciting nominations for this year’s Test of Time Award, to be given at the SC17 Conference in November 2017 in Denver, CO. The ToTA recognizes an outstanding paper that has deeply influenced the HPC discipline. It is a mark of historical impact, recognizing that the paper has changed HPC trends.

The award is also an incentive for researchers and students to send their best work to the SC Conference and a tool to understand why and how results last in the HPC discipline.  Papers that appeared in the SC Conference Series are considered for this award.  A paper must be at least 10 years old, from the twenty conference years 1988 to 2007, inclusive.

Source: SC17

The post Submissions Open for the SC Test of Time Award appeared first on HPCwire.

Alexander Named Dep. Dir. of Brookhaven Computational Initiative

Wed, 02/15/2017 - 14:09

Francis Alexander, a physicist with extensive management and leadership experience in computational science research, has been named Deputy Director of the Computational Science Initiative at the U.S. Department of Energy’s (DOE) Brookhaven National Laboratory. Alexander comes to Brookhaven Lab from DOE’s Los Alamos National Laboratory, where he was the acting division leader of the Computer, Computational, and Statistical Sciences (CCS) Division.

In his new role as deputy director, Alexander will work with CSI Director Kerstin Kleese van Dam to expand CSI’s research portfolio and realize its potential in data-driven discovery. He will serve as the primary liaison to national security agencies, as well as develop strategic partnerships with other national laboratories, universities, and research institutions. His current research interest is the intersection of machine learning and physics (and other domain sciences).

Francis Alexander

“I was drawn to Brookhaven by the exciting opportunity to strengthen the ties between computational science and the significant experimental facilities—the Relativistic Heavy Ion Collider, the National Synchrotron Light Source II, and the Center for Functional Nanomaterials [all DOE Office of Science User Facilities],” said Alexander. “The challenge of getting the most out of high-throughput and data-rich science experiments is extremely exciting to me. I very much look forward to working with the talented individuals at Brookhaven on a variety of projects, and am grateful for the opportunity to be part of such a respected institution.”

During his more than 20 years at Los Alamos, he held several leadership roles, including as leader of the CCS Division’s Information Sciences Group and leader of the Information Science and Technology Institute. Alexander first joined Los Alamos in 1991 as a postdoctoral researcher at the Center for Nonlinear Studies. He returned to Los Alamos in 1998 after doing postdoctoral work at the Institute for Scientific Computing Research at DOE’s Lawrence Livermore National Laboratory and serving as a research assistant professor at Boston University’s Center for Computational Science.

Link to full article on the Brookhaven website: https://www.bnl.gov/newsroom/news.php?a=112057

The post Alexander Named Dep. Dir. of Brookhaven Computational Initiative appeared first on HPCwire.

Here’s What a Neural Net Looks Like On the Inside

Wed, 02/15/2017 - 12:54

Ever wonder what the inside of a machine learning model looks like? Today Graphcore released fascinating images that show how the computational graph concept maps to a new graph processor and graph programming framework it’s creating.

Alexnet graph (Image Source: Graphcore)

Graphcore is a UK-based startup building a new processor, called the Intelligent Processing Unit (IPU), designed specifically to run machine learning workloads. Graphcore says systems with its IPU processors, which will plug into traditional x86 servers via PCIe interfaces, will have more than 100x the memory bandwidth of scalar CPUs, and will outperform both CPUs and vector GPUs on emerging machine learning workloads in both the training and scoring stages.

The company is also developing a software framework called Poplar that will abstract the machine learning application development process from the underlying IPU-based hardware. Poplar was written in C++ and will be able to take applications written in other frameworks, like TensorFlow and MXNet, and compile them into optimized code to execute on IPU-boosted hardware. It will feature C++ and Python interfaces.

All modern machine learning frameworks, such as TensorFlow, MXNet, Caffe, Theano, and Torch, use the concept of a computational graph as an abstraction, says Graphcore’s Matt Fyles, who wrote today’s blog post.

“The graph compiler builds up an intermediate representation of the computational graph to be scheduled and deployed across one or many IPU devices,” Fyles writes. “The compiler can display this computational graph, so an application written at the level of a machine learning framework reveals an image of the computational graph which runs on the IPU.”

Resnet-50 graph execution plan (Image Source: Graphcore)

This is where the images come from. The image at the top of the page shows a graph based on the AlexNet architecture, which is a powerful deep neural network used in image classification workloads among others.

“Our Poplar graph compiler has converted a description of the network into a computational graph of 18.7 million vertices and 115.8 million edges,” Fyles writes. “This graph represents AlexNet as a highly-parallel execution plan for the IPU. The vertices of the graph represent computation processes and the edges represent communication between processes. The layers in the graph are labelled with the corresponding layers from the high level description of the network. The clearly visible clustering is the result of intensive communication between processes in each layer of the network, with lighter communication between layers.”

Graphcore also generated images of the graph execution plan for a deep neural network built on ResNet, which Microsoft Research released in 2015. The compiler turned a 50-layer deep neural network into a graph execution plan with 3.22 million vertices and 6.21 million edges.

One of the unique aspects of the ResNet architecture is that it allows deep networks to be assembled from repeated sections. Graphcore says its IPU needs to define these sections only once, and can then call them repeatedly, using the same code but with different “network weight data.”
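The reuse idea can be sketched in a few lines of NumPy (a hypothetical illustration only, not Graphcore’s actual code or Poplar output): a single residual-block function is defined once and then called repeatedly, each time with different weight data.

```python
import numpy as np

def residual_block(x, w):
    """One residual section: a nonlinear transform plus the identity shortcut.
    Defined once; reused for every repeated section of the network."""
    return np.maximum(0.0, x @ w) + x  # ReLU(x @ W) plus the skip connection

rng = np.random.default_rng(0)
dim, depth = 8, 4
# One weight matrix per repetition -- same code, different "network weight data".
weights = [rng.standard_normal((dim, dim)) * 0.1 for _ in range(depth)]

x = np.ones(dim)
for w in weights:  # the single block definition is invoked repeatedly
    x = residual_block(x, w)

print(x.shape)  # → (8,): the activation keeps its shape through every section
```

Because the code for each section is identical, only the weight data changes between calls, which is why a whole deep network can stay resident on one processor.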

The graph computational execution plan for LIGO data (Image Source: Graphcore)

“Deep networks of this style are executed very efficiently as the whole model can be permanently hosted on an IPU, escaping the external memory bottleneck which limits GPU performance,” the company says.

Finally, Graphcore shared a computational graph execution plan involving time-series data gathered from astrophysicists at the University of Illinois. The researchers used the MXNet deep learning framework to analyze data collected from the LIGO detector, which looks for gravitational waves, such as those produced by merging black holes. The image that Graphcore shared is the “full forward and backward pass of the neural network trained on the LIGO data to be used for signal analysis,” the company says.

“These images are striking because they look so much like a human brain scan once the complexity of the connections is revealed,” Fyles writes, “and they are incredibly beautiful too.”

Graphcore emerged from stealth mode last October, when it announced a $30 million Series A round to help finance development of products. Its machine learning (ML) and deep learning acceleration solutions include a PCIe card that plugs directly into a server’s bus.


The post Here’s What a Neural Net Looks Like On the Inside appeared first on HPCwire.

Azure Edges AWS in Linpack Benchmark Study

Wed, 02/15/2017 - 09:43

The “when will clouds be ready for HPC” question has ebbed and flowed for years. It seems clear that for at least some workloads and on some clouds, the answer is now. HPC cloud specialist Nimbix, for example, focuses on providing fast interconnect, large memory, and heterogeneous architecture specifically tailored for HPC. The goliath public clouds have likewise steadily incorporated needed technology and (perhaps less decisively) pricing options.

A new study posted on arXiv.org last week – Comparative benchmarking of cloud computing vendors with High Performance Linpack – authored by Exabyte.io, an admittedly biased source, reports the answer is an unambiguous yes to the question of whether popular clouds can accommodate HPC and further examines some of the differences between a few of the major players.

“For high performance computing (HPC) workloads that traditionally required large and cost-intensive hardware procurement, the feasibility and advantages of cloud computing are still debated. In particular, it is often questioned whether software applications that require distributed memory can be efficiently run on ‘commodity’ compute infrastructure publicly available from cloud computing vendors,” write the authors, Mohammad Mohammadi and Timur Bazhirov of Exabyte.io.

“We benchmarked the performance of the best available computing hardware from public cloud providers with high performance Linpack. We optimized the benchmark for each computing environment and evaluated the relative performance for distributed memory calculations. We found Microsoft Azure to deliver the best results, and demonstrated that the performance per single computing core on public cloud to be comparable to modern traditional supercomputing systems.

“Based on our findings we suggest that the concept of high performance computing in the cloud is ready for a widespread adoption and can provide a viable and cost-efficient alternative to capital-intensive on-premises hardware deployments.”

Exabyte.io is a young company building a cloud-based environment to assist organizations with materials design – hence it has a horse in the race. Company marketing info on its website states, “Exabyte.io powers the adoption of high-performance cloud computing for design and discovery of advanced materials, devices and chemicals from nanoscale. We combine high fidelity simulation techniques, large-scale data analytics and machine learning tools into a hosted environment available for public, private and hybrid cloud deployments.”

Leaving its interest aside the study is interesting. Here’s a list of the cloud offerings evaluated:

The benchmarking was done using the High Performance Linpack (HPL) program, which solves a random system of linear equations, represented by a dense matrix, in double precision (64-bit) arithmetic on distributed-memory computers. “It does so through a two-dimensional block-cyclic data distribution, and a right-looking variant of the LU factorization with row partial pivoting.” HPL is a portable and freely available software package.
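As an illustration of what HPL measures (a single-node NumPy sketch only; the real benchmark runs a distributed, block-cyclic LU across many nodes), the essence is to time a dense solve and convert the elapsed time into a flop rate using HPL’s standard operation count:

```python
import time
import numpy as np

n = 1000
rng = np.random.default_rng(1)
A = rng.standard_normal((n, n))  # random dense system, as in HPL
b = rng.standard_normal(n)

t0 = time.perf_counter()
x = np.linalg.solve(A, b)  # LAPACK LU with partial pivoting under the hood
elapsed = time.perf_counter() - t0

# HPL scores a run as (2/3)*n^3 + 2*n^2 floating-point operations.
gflops = ((2.0 / 3.0) * n**3 + 2.0 * n**2) / elapsed / 1e9

# A scaled residual confirms the solve actually succeeded.
residual = np.linalg.norm(A @ x - b) / (np.linalg.norm(A) * np.linalg.norm(x))
print(f"{gflops:.1f} GFLOP/s, scaled residual {residual:.2e}")
```

The measured rate from a run like this is the Rmax figure that the study compares across cloud providers, while Rpeak is the hardware’s theoretical maximum.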

Three different AWS scenarios were tested: hyper-threaded, non-hyper-threaded, and non-hyper-threaded with placement groups. On Azure, three different instance types were used: F-series, A-series, and H-series. Compute1-60 instances were used on Rackspace. The benchmark was also run on NERSC’s Edison supercomputer with hyper-threading enabled. Edison, of course, is a Cray XC30 with a peak performance of 2.57 PFLOPS, 133,824 compute cores, 357 terabytes of memory, and 7.56 petabytes of disk, ranking number 60 on the Top500 list. Specific configurations are shown below.

In many cases the performance was quite similar, but each platform also had strengths and weaknesses. For example, network saturation at scale and slower processor clock speeds affected IBM Softlayer’s performance, according to the study. The authors also noted: “AWS and Rackspace show a significant degree of parallel performance degradation, such that at 32 nodes the measured performance is about one-half of the peak value.”

The brief paper is best read in full for the details. The performance data for each of the clouds is presented. Below is a summary figure of cloud performances.

Figure 1: Speedup ratios (the ratios of maximum speedup Rmax to peak speedup Rpeak) against the number of nodes for all benchmarked cases. Speedup ratios for 1, 2, 4, 8, 16 and 32 nodes are investigated and given by points. Lines are drawn to guide the eye. The legend is as follows: AWS – Amazon Web Services in the default hyper-threaded regime; AWS-NHT – same, with hyper-threading disabled; AWS-NHT-PG – same, with the placement group option enabled; AZ – Microsoft Azure standard F16 instances; AZ-IB-A – same provider, A9 instances; AZ-IB-H – same provider, H16 instances; RS – Rackspace compute1-60 instances; SL – IBM/Softlayer virtual servers; NERSC – Edison computing facility of the National Energy Research Scientific Computing Center.
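The speedup ratio plotted in Figure 1 is simply Rmax divided by Rpeak at each node count. A small sketch with made-up numbers (not the study’s actual measurements) shows how such an efficiency curve is computed, mirroring the observation that some clouds fall to about one-half of peak at 32 nodes:

```python
def speedup_ratio(r_max, r_peak):
    """Ratio of measured (Rmax) to theoretical peak (Rpeak) performance."""
    return r_max / r_peak

# Hypothetical figures only: assumed per-node peak and assumed Rmax values.
single_node_peak = 0.5                          # TFLOPS per node (assumed)
nodes = [1, 2, 4, 8, 16, 32]
measured = [0.45, 0.88, 1.70, 3.20, 5.80, 8.00]  # assumed Rmax, TFLOPS

for n, r in zip(nodes, measured):
    ratio = speedup_ratio(r, n * single_node_peak)
    print(f"{n:2d} nodes: efficiency {ratio:.2f}")
```

With these invented numbers the 32-node entry gives 8.0 / 16.0 = 0.50, i.e. half of peak, the same shape of degradation the authors report for AWS and Rackspace.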

On balance, argue the authors, “Our results demonstrate that the current generation of publicly available cloud computing systems are capable of delivering comparable, if not better, performance than the top-tier traditional high performance computing systems. This fact confirms that cloud computing is already a viable and cost-effective alternative to traditional cost-intensive supercomputing procurement.”

Here is a link to the paper on arXiv.org: https://arxiv.org/pdf/1702.02968.pdf

The post Azure Edges AWS in Linpack Benchmark Study appeared first on HPCwire.

NCSA Introduces Blue Waters Weekly Webinar Series

Wed, 02/15/2017 - 07:54

Feb. 15 — The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign is pleased to announce the Blue Waters Weekly Webinar Series. The series will provide the research and education communities with a variety of opportunities to learn about methods, tools, and resources available to advance their computational and data analytics skills, with an emphasis on scaling to petascale and beyond.

Webinars will generally occur every Wednesday, with a few exceptions to avoid conflicts with major HPC conferences and events. All sessions will be free and open to everyone who registers. Registered participants will be able to pose questions using NCSA’s Blue Waters Slack environment. Registration is required for access to the YouTube Live broadcasts. Webinars will begin at 10 a.m. Central Time (UTC-6).

Each webinar will be led by a developer or an expert on the topic. The first visualization webinar, “Introduction to Data Visualization” hosted by Vetria Byrd, Purdue University, will take place on March 1, 2017; the first workflows webinar, “Overview of Scientific Workflows” will be hosted by Scott Callaghan, University of Southern California, on March 8, 2017; and the first petascale application improvement discovery webinar, “Getting I/O Done with Parallel HDF5 on Blue Waters” hosted by Gerd Heber, HDF Group, will take place March 29, 2017. The list of webinar tracks as well as specific sessions will be refined and expanded over time.

For more information about the webinar series, including registration, abstracts, speakers, and links to YouTube recordings, please visit the Blue Waters webinar series webpage.

Source: NCSA

The post NCSA Introduces Blue Waters Weekly Webinar Series appeared first on HPCwire.

JuliaPro Featured in Danske Bank’s Business Analytics Challenge 2017

Wed, 02/15/2017 - 06:46

COPENHAGEN, Denmark, Feb. 15 — Danske Bank, Denmark’s largest bank, announced that JuliaPro will be available on Microsoft Azure’s Data Science Virtual Machine (DSVM) for participants in the Business Analytics Challenge 2017.

The Business Analytics Challenge 2017 is sponsored by Danske Bank, Microsoft and KMD. The competition is open to all undergraduate and master’s degree students in Denmark, and the first prize is 75,000 kroner. Registration is open until March 31.

This announcement comes two months after the release of JuliaPro and one month after JuliaPro launched on Microsoft Azure’s Data Science Virtual Machine (DSVM).

Viral Shah, Julia Computing CEO says, “We are thrilled that Julia adoption is accelerating so rapidly during the first quarter of 2017. In the last three months, we introduced the new JuliaPro and launched it on the world’s two largest cloud environments: Amazon’s AWS and Microsoft Azure’s Data Science Virtual Machine (DSVM). Julia Computing wishes the best of luck to all contestants in the Danske Bank Business Analytics Challenge 2017.”

About Julia Computing

Julia Computing (JuliaComputing.com) was founded in 2015 by the co-creators of the Julia language to provide support to businesses and researchers who use Julia.

Julia is the fastest modern high performance open source computing language for data and analytics. It combines the functionality and ease of use of Python, R, Matlab, SAS and Stata with the speed of Java and C++. Julia delivers dramatic improvements in simplicity, speed, capacity and productivity.

Source: Julia Computing

The post JuliaPro Featured in Danske Bank’s Business Analytics Challenge 2017 appeared first on HPCwire.

Baidu Joins ASC17 Supercomputer Competition with AI Challenge

Wed, 02/15/2017 - 01:01

The 2017 ASC Student Supercomputer Challenge (ASC17) has announced that the contest will include an AI traffic prediction application provided by the Baidu Institute of Deep Learning. A key component of autonomous vehicle technology, the application analyzes spatial and temporal relations to predict traffic conditions, helping vehicles choose the most appropriate route, especially in times of congestion.

For the preliminary contest, Baidu will provide the teams with real traffic data from a city, covering the past 50 weekdays, for training. Each team will train on this data using Baidu’s deep learning framework, PaddlePaddle, to predict traffic at five-minute intervals during the morning rush hour on the 51st weekday. Baidu will then judge each team on the accuracy of its traffic predictions.
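As a rough illustration of the task’s shape (the contest data and any real PaddlePaddle model are not shown here; this synthetic NumPy baseline is purely hypothetical), even a per-slot historical average over the 50 training weekdays yields a scoreable five-minute forecast for day 51:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic stand-in for the contest data: traffic speed at one road segment,
# sampled every 5 minutes over a 2-hour morning rush (24 slots), 50 weekdays.
days, slots = 50, 24
daily_pattern = 60 - 25 * np.sin(np.linspace(0, np.pi, slots))  # rush-hour dip
history = daily_pattern + rng.standard_normal((days, slots)) * 3

# Simplest possible predictor for day 51: the per-slot historical mean.
prediction = history.mean(axis=0)

# Score it the way the contest would: error against a held-out "day 51".
day51 = daily_pattern + rng.standard_normal(slots) * 3
mae = np.abs(prediction - day51).mean()
print(f"predicted {slots} five-minute slots, MAE {mae:.1f}")
```

A deep learning model trained with PaddlePaddle would aim to beat a baseline like this by also exploiting spatial relations between road segments.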

This year’s ASC Student Supercomputer Challenge, the largest supercomputer contest in the world, is jointly organized by the Asia Supercomputer Community, Inspur, the National Supercomputing Center in Wuxi and Zhengzhou University. There are a total of 230 university teams from 15 countries and regions participating in the 2017 contest with the finalists announced on March 13th and the final competition held April 24th-28th.

The contest aims to inspire innovation in supercomputer applications and cultivate young talent. The era of intelligent computing is here, and it is being driven by AI. High performance computing is one of the main technologies supporting AI and is facing changes and new challenges. For this reason, ASC has incorporated AI into the competition, in the hope that more university students will engage with these emerging applications and develop their enthusiasm for innovation.

For more information on the ASC17 preliminary contest, please visit http://www.asc-events.org/ASC17/Preliminary.php

The post Baidu Joins ASC17 Supercomputer Competition with AI Challenge appeared first on HPCwire.

LBNL Develops Machine Learning Tool for Alloy Research

Tue, 02/14/2017 - 10:48

New and interesting uses for machine learning seem to arise daily. Researchers from Lawrence Berkeley National Laboratory recently trained an algorithm to predict structural characteristics of certain metal alloys, slashing the computational power usually required for such calculations. The new approach is expected to accelerate research into advanced alloys for applications spanning automotive, aerospace and much more.

Their work – Predicting defect behavior in B2 intermetallics by merging ab initio modeling and machine learning – is published in a recent issue of npj Computational Materials. As an extension of this proof-of-concept work, the team has developed PyCDT, an open source Python toolkit for modeling point defects in semiconductors and insulators.

Traditionally, researchers have used a computational quantum mechanical method known as density functional calculations to predict what kinds of defects can be formed in a given structure and how they affect the material’s properties. This approach is computationally challenging and has limited its use, according to an article on the work posted on the NERSC site.

“Density functional calculations work well if you are modeling one small unit, but if you want to make your modeling cell bigger the computational power required to do this increases substantially,” says Bharat Medasani, a former Berkeley Lab postdoc and lead author of the paper. “And because it is computationally expensive to model defects in a single material, doing this kind of brute force modeling for tens of thousands of materials is not feasible.”

Medasani and his colleagues developed and trained machine learning algorithms to predict point defects in intermetallic compounds, focusing on the widely observed B2 crystal structure. Initially, they selected a sample of 100 of these compounds from the Materials Project Database and ran density functional calculations on supercomputers at the National Energy Research Scientific Computing Center (NERSC), a DOE Office of Science User Facility at Berkeley Lab, to identify their defects.

The overall result, say the researchers, is that it is no longer necessary to run costly first-principles calculations to identify defect properties for every new metallic compound.
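The surrogate idea can be sketched as follows (a hypothetical toy, not the paper’s actual model, features, or data): fit a cheap regressor to a small sample of expensively computed defect energies, then predict for unseen compounds at negligible cost.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for DFT training data: each row is a feature vector
# describing a B2 compound (e.g. electronegativity difference, atomic radii),
# and y is a computed defect formation energy. All values here are invented.
X = rng.standard_normal((100, 5))
true_w = np.array([1.5, -0.7, 0.3, 0.0, 2.0])
y = X @ true_w + 0.05 * rng.standard_normal(100)

# Ridge regression in closed form: the cheap surrogate replaces a fresh
# density functional calculation for each new compound.
lam = 1e-3
w = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)

X_new = rng.standard_normal((3, 5))  # three unseen "compounds"
predictions = X_new @ w              # milliseconds instead of CPU-hours
print(predictions.shape)             # → (3,)
```

The trade the paper describes is exactly this: pay for ~100 expensive first-principles calculations once, then amortize them over tens of thousands of candidate materials.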

“This tool enables us to predict metallic defects faster and robustly, which will in turn accelerate materials design,” says Kristin Persson, a Berkeley Lab Scientist and Director of the Materials Project, an initiative aimed at drastically reducing the time needed to invent new materials by providing open web-based access to computed information on known and predicted materials.

The post LBNL Develops Machine Learning Tool for Alloy Research appeared first on HPCwire.

Supermicro Introduces New BigTwin Server Architecture

Tue, 02/14/2017 - 07:45

SAN JOSE, Calif., Feb. 14 — Super Micro Computer, Inc. (NASDAQ: SMCI), a global leader in compute, storage and networking technologies including green computing, has announced the fifth generation of its Twin family, the new BigTwin server architecture.

The Supermicro BigTwin is a breakthrough multi-node server system with a multitude of innovations and industry firsts. BigTwin supports maximum system performance and efficiency by delivering 30% better thermal capacity in a compact 2U form factor, enabling solutions with the highest-performance processors, memory, storage and I/O. Continuing Supermicro’s NVMe leadership, the BigTwin is the first all-flash NVMe multi-node system. BigTwin doubles the I/O capacity with three PCI-E 3.0 x16 I/O options and provides added flexibility with more than 10 networking options, including 1GbE, 10G, 25G, 100G, and InfiniBand, via its industry-leading SIOM modular interconnect. Each node can support current and next-generation dual Intel Xeon processors with up to 3TB of memory; 24 drives of all-flash NVMe, hybrid NVMe/SATA/SAS, SSD and HDD; and two M.2 NVMe/SATA drives. Extending the industry’s largest portfolio of server and storage systems, the BigTwin is ideal for customers looking to create a simple-to-deploy, easy-to-manage, blazing-fast high-density compute infrastructure. The new system is targeted at cloud, big data, enterprise, hyper-converged and IoT workloads that demand maximum performance, efficiency and flexibility.

“Exceeding our customers’ computing performance and efficiency demands has been our hallmark and our new BigTwin server is no exception. As our fifth generation Twin platform, BigTwin optimizes multi-node server density with maximum performance per watt, per square foot and per dollar with support for free-air cooled data centers,” said Charles Liang, President and CEO of Supermicro. “BigTwin is also the first and only multi-node system that supports up to 205-watt Xeon CPUs, a full 24 DIMMs of memory per node and 24 All-Flash NVMe drives ensuring that this architecture is optimized for today and future proofed for the next generation of technology advancements, including next generation Intel Skylake processors.”

BigTwin is a 2U server configuration that supports four compute nodes. Each node supports all of the following:

  • 24 DIMMs of ECC DDR4-2400MHz or higher, for up to 3TB of memory
  • Flexible networking via SIOM add-on cards with quad/dual 1GbE, quad/dual 10GbE/10G SFP+, dual 25G, 100G, FDR or EDR InfiniBand options
  • 24 hot-swap 2.5″ NVMe/SAS3/SATA3 drives
  • Two PCI-E 3.0 x16 slots
  • M.2 and SATADOM
  • Dual Intel Xeon processor E5-2600 v4/v3 product families up to 145W
  • Supermicro’s PowerStick fully redundant high-efficiency power supplies (2200W, 2600W)
  • Support for free-air cooled datacenters

Sold as a complete system for the highest product quality, delivery, and performance, the BigTwin is supported by Supermicro IPMI software and Global Services and is optimized for HPC, data center, cloud and enterprise environments.

About Super Micro Computer, Inc.

Supermicro (NASDAQ: SMCI), the leading innovator in high-performance, high-efficiency server technology is a premier provider of advanced server Building Block Solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and Embedded Systems worldwide. Supermicro is committed to protecting the environment through its “We Keep IT Green” initiative and provides customers with the most energy-efficient, environmentally-friendly solutions available on the market.

Source: Supermicro

The post Supermicro Introduces New BigTwin Server Architecture appeared first on HPCwire.

PRACE Issues Call for Posters for ACM Europe Celebration of Women in Computing

Tue, 02/14/2017 - 06:45

Feb. 14 — The ACM Europe Celebration of Women in Computing: womENcourage 2017 aims to celebrate, connect, inspire, and encourage women in computing. The conference brings together undergraduate, MSc, and PhD students, as well as researchers and professionals, to present and share their achievements and experience in computer science.

WomENcourage solicits posters from all areas of computer science. Posters offer the opportunity to engage with other conference attendees, disseminate research work, receive feedback, practice presentation skills, and discuss ideas with other researchers from the same field. Submissions should present novel ideas, designs, techniques, systems, tools, evaluations, scientific investigations, methodologies, social issues or policy issues related to any area of computing. Authors may submit original work or versions of previously published work. Posters are ideal for presenting early-stage research.

Poster abstracts are to be submitted electronically through EasyChair at https://easychair.org/conferences/?conf=womencourage2017. Submissions should introduce the area in which the work has been done and should emphasize the originality and importance of the contribution. All submissions must be in English and in PDF format. They must not exceed one page in length, and they must use the ACM conference publication format. This one-page extended abstract must be submitted to EasyChair as a paper that also contains a short (one-paragraph) abstract. Poster abstracts that do not follow the submission guidelines will not be reviewed.

All submissions will be peer reviewed by an international Poster Evaluation Committee. Accepted submissions will be archived on the conference website (but there will be no proceedings). The Guide to a Successful Submission provides tips for preparing a good poster and provides information about the reviewing criteria. A submission may have one or more authors of any gender.

At least one author of each accepted submission is expected to attend the conference to present the ideas discussed in the submission. Information about student scholarships is available here.

Important Dates:

  • Poster abstracts due: April 30, 2017
  • Notification of accepted posters: June 5, 2017
  • Final poster abstracts due: July 3, 2017
  • Poster PDF due: July 31, 2017

Source: PRACE

The post PRACE Issues Call for Posters for ACM Europe Celebration of Women in Computing appeared first on HPCwire.

ASC Challenges TaihuLight and Gordon Bell Application

Tue, 02/14/2017 - 01:01

A high-resolution global surface wave simulation, MASNUM_WAVE, a 2016 Gordon Bell Prize finalist, has been entered into the ASC Student Supercomputer Challenge 2017. In the preliminary contest, all teams are to optimize this numerical model on Sunway TaihuLight, the world’s fastest computer.

In the preliminary contest, the MASNUM workload contains two datasets: one from the Western Pacific Ocean, the other covering all global oceans. Both are real data, but at different granularities. ASC supplies each team with over 1,000 cores on the Sunway TaihuLight.

Maximizing MASNUM’s scalability on Sunway TaihuLight will be critical for all teams. The Sunway TaihuLight system uses China’s home-grown manycore processor, the SW26010. Excellent performance will require a grasp of this unique computing architecture, the network between nodes, and efficient utilization.

Beyond optimizing this Gordon Bell Prize application, each team will also work on an AI traffic prediction task, run the HPL benchmark on a Xeon Phi cluster, and design a supercomputing system within a 3,000W power budget. The finalists of the ASC Student Supercomputer Challenge 2017 will be announced on March 13. The final competition will take place at the National Supercomputing Center in Wuxi from April 24 to April 28.

In ASC17, over 230 teams from 15 countries have registered to compete. Details about the ASC17 preliminary contest can be found at http://www.asc-events.org/ASC17/Preliminary.php

The post ASC Challenges TaihuLight and Gordon Bell Application appeared first on HPCwire.

Mellanox Demonstrates Improvement in Crypto Performance With Innova IPsec 40G Ethernet Network Adapter

Mon, 02/13/2017 - 06:53

SUNNYVALE, Calif. and YOKNEAM, Israel, Feb. 13 — Mellanox Technologies, Ltd. (NASDAQ: MLNX), a leading supplier of high-performance, end-to-end interconnect solutions for data center servers and storage systems, today announced line-rate crypto throughput using Mellanox’s Innova IPsec Network Adapter, demonstrating more than three times the throughput and more than four times better CPU utilization compared with x86 software-based server offerings. Mellanox’s Innova IPsec adapter provides seamless crypto capabilities and advanced network accelerations to modern data centers, enabling the ubiquitous use of encryption across the network while sustaining unmatched performance, scalability and efficiency. By replacing software-based offerings, Innova can reduce data center expenses by 60 percent or more.

As security concerns in data centers continue to rise, along with the inability of CPU-based products to handle today’s exponential data growth, delivering cost effective and performant hardware-accelerated crypto solutions on a per-server basis has become paramount to maintaining the integrity and confidentiality of the data exchanged over the network infrastructure.

The Innova IPsec adapter addresses the growing need for security and “encryption by default” by combining Mellanox ConnectX advanced network adapter accelerations with IPsec offload capabilities to deliver end-to-end data protection in a low profile PCIe form factor. The Innova IPsec adapter offers multiple integrated crypto and security protocols and performs the encryption/decryption of data-in-motion, freeing up costly CPU cycles.
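To see why offloading per-packet crypto matters at 40 Gb/s, consider the cycle budget a software implementation faces. A back-of-the-envelope sketch (the 1500-byte frame size and single 3 GHz core are illustrative assumptions, not figures from Mellanox):

```python
# Cycle budget per packet for software crypto at 40 Gb/s line rate.
LINE_RATE_BPS = 40e9      # 40 Gb/s Ethernet
FRAME_BITS = 1500 * 8     # 1500-byte frames (ignoring framing overhead)
CORE_HZ = 3.0e9           # one hypothetical 3 GHz x86 core

packets_per_sec = LINE_RATE_BPS / FRAME_BITS     # ~3.3 million packets/s
cycles_per_packet = CORE_HZ / packets_per_sec    # ~900 cycles per packet
print(f"{packets_per_sec/1e6:.1f} Mpps -> {cycles_per_packet:.0f} cycles/packet")
```

With encryption, authentication, and IPsec header processing all competing for that per-packet budget, a single core cannot sustain line rate, which is the gap hardware offload closes.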

“The Innova security adapter product line enables the use of secure communications in a cost effective and performant manner,” said Gilad Shainer, vice president of marketing at Mellanox Technologies. “Whether used within an appliance such as a firewall or gateway, or as an intelligent adapter that ensures data-in-motion protection, Innova IPsec adapters are the ideal solution for cloud, Web 2.0, telecommunication, high-performance compute, storage systems and other applications.”

Innova products deliver industry-leading technologies and accelerations through the integrated ConnectX-4 Lx network adapter, such as support for RDMA over Converged Ethernet (RoCE), Ethernet stateless offload engines, Overlay Networks, and more.

As part of the Innova product line, the Innova Flex Intelligent Network Adapter enables customers to leverage the flexibility of the embedded FPGA to develop their own logic within the adapter. The Innova Flex and IPsec network adapters are currently available in volume quantities.

Mellanox will be exhibiting at the RSA Conference 2017, Feb. 13-17, booth no. 406, in the South Hall of Moscone Center, San Francisco. At the show, Mellanox will showcase its Innova solutions, as well as the Company’s Ethernet and InfiniBand intelligent interconnect products.

About Mellanox

Mellanox Technologies (NASDAQ: MLNX) is a leading supplier of end-to-end Ethernet and InfiniBand intelligent interconnect solutions and services for servers, storage, and hyper-converged infrastructure. Mellanox intelligent interconnect solutions increase data center efficiency by providing the highest throughput and lowest latency, delivering data faster to applications and unlocking system performance. Mellanox offers a choice of high performance solutions: network and multicore processors, network adapters, switches, cables, software and silicon, that accelerate application runtime and maximize business results for a wide range of markets including high performance computing, enterprise data centers, Web 2.0, cloud, storage, network security, telecom and financial services. More information is available at: www.mellanox.com.

Source: Mellanox Technologies

The post Mellanox Demonstrates Improvement in Crypto Performance With Innova IPsec 40G Ethernet Network Adapter appeared first on HPCwire.