In a scaling breakthrough for oil and gas discovery, ExxonMobil geoscientists report they have harnessed the power of 717,000 processors – the equivalent of roughly 22,400 32-processor computers – to run complex oil and gas reservoir simulation models.
This is more than four times the previous number of processors used in energy exploration HPC implementations, according to ExxonMobil, which worked with the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign and its Cray XE6 “Blue Waters” petascale system.
“Reservoir simulation has long been seen as difficult to scale beyond a few thousand processors,” John D. Kuzan, manager, reservoir function for ExxonMobil Upstream Research Company, told HPCwire’s sister publication EnterpriseTech, “and even then, ‘scale’ might mean (at best on a simple problem) ~50 percent efficiency. The ability to scale to 700,000-plus is remarkable – and gives us confidence that in the day-to-day use of this capability we will be efficient at a few thousand processors for a given simulation run (knowing that on any given day in the corporation there are many simulations being run).”
The objective of the scaling effort is for ExxonMobil geoscientists and engineers to make more, and better-informed, drilling decisions by predicting reservoir performance more efficiently. The company said the run produced data output thousands of times faster than typical oil and gas industry reservoir simulations and represented the largest processor count reported by the energy industry.
ExxonMobil’s scientists, who have worked with the NCSA on various projects since 2008, began work on the “half million” challenge – i.e., scaling reservoir simulations past half a million processors – in 2015. NCSA’s Blue Waters system is one of the most powerful supercomputers in the world. Scientists and engineers use the system on a range of engineering and scientific problems. It uses hundreds of thousands of computational cores to achieve peak performance of more than 13 quadrillion calculations per second and has more than 1.5 PB of memory, 25 PB of disk storage and 500 PB of tape storage.
The reservoir simulation benchmark involved a series of multi-million to billion cell models on Blue Waters using hundreds of thousands of processors simultaneously. The project required optimization of all aspects of the reservoir simulator, from input/output to communications across hundreds of thousands of processors.
“The partnership with NCSA was important because we had the opportunity to use ‘all’ of Blue Waters,” said Kuzan, “and when trying to use the full capability/capacity of a machine the logistics can be a challenge. It means not having the machine for some other project (even if it is for only a few minutes per run). The NCSA was willing to accommodate this and worked very hard not to disrupt others using the machine.”
The simulations were run on a proprietary ExxonMobil application, one that Kuzan said has not yet been named but is referred to as the “integrated reservoir modeling and simulation platform.”
Reservoir simulation studies are used to guide decisions, such as well placement, the design of facilities and development of operational strategies, to minimize financial and environmental costs. To model complex processes accurately for the flow of oil, water, and natural gas in the reservoir, simulation software solves a number of complex equations. Current reservoir management practices in the oil and gas industry are often hampered by the slow speed of reservoir simulation.
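To give a feel for what a reservoir simulator solves, here is a deliberately tiny sketch: a one-dimensional, single-phase pressure-diffusion update by explicit finite differences. This is a toy stand-in of my own, not ExxonMobil's method; production simulators solve coupled multiphase equations implicitly over millions to billions of cells.

```python
# Illustrative only: a minimal 1D pressure-diffusion step, a toy stand-in
# for the coupled multiphase flow equations a production reservoir
# simulator solves over millions of cells.
def diffuse_pressure(p, alpha=0.2, steps=100):
    """Explicit finite-difference update; stable for alpha <= 0.5.
    Endpoints are held fixed (Dirichlet boundaries)."""
    p = list(p)
    for _ in range(steps):
        nxt = p[:]
        for i in range(1, len(p) - 1):
            nxt[i] = p[i] + alpha * (p[i - 1] - 2 * p[i] + p[i + 1])
        p = nxt
    return p

# A pressure pulse in the middle of the grid spreads out and decays.
field = [0.0] * 4 + [1.0] + [0.0] * 4
print(diffuse_pressure(field)[4] < 1.0)  # True: the peak decays as pressure diffuses
```

Scaling such a solver to hundreds of thousands of processors is dominated by exactly the concerns the article names: input/output and inter-processor communication, not the per-cell arithmetic.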
“NCSA’s Blue Waters sustained petascale system, which has benefited the open science community so tremendously, is also helping industry break through barriers in massively parallel computing,” said Bill Gropp, NCSA’s acting director.
The post ExxonMobil, NCSA, Cray Scale Reservoir Simulation to 700,000+ Processors appeared first on HPCwire.
Since our initial coverage of the TSUBAME3.0 supercomputer yesterday, more details have come to light on this innovative project. Of particular interest is a new board design for NVLink-equipped Pascal P100 GPUs that will create another entrant to the space currently occupied by Nvidia’s DGX-1 system, IBM’s “Minsky” platform and the Supermicro SuperServer (1028GQ-TXR).
The press photo shared by Tokyo Tech revealed TSUBAME3.0 to be an HPE-branded SGI ICE supercomputer. The choice is not surprising considering that SGI has long held a strong presence in Japan. SGI Japan, the primary contractor here, has collaborated with Tokyo Tech on a brand-new board design that we’ve been told is destined for the HPE product line.
TSUBAME3.0 node design (source: Tokyo Tech)
The board is the first of its kind in employing Nvidia GPUs (four), NVLink processor interconnect technology, Intel processors (two) and the Intel Omni-Path Architecture (OPA) fabric. Four SXM2 P100s are configured into a hybrid mesh cube, making full use of the NVLink (1.0) interconnect to offer a large amount of memory bandwidth between the GPUs. As you can see in the figure on the right, each half of the quad connects to its own PLX PCIe switch, which links to an Intel Xeon CPU. The PCIe switches also enable direct one-to-one connections between the GPUs and an Omni-Path link. A slide from a presentation shared by Tokyo Tech depicts how this hooks into the fabric.
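The node layout described above can be sketched as a pair of adjacency maps. This is an illustration, not an official wiring spec: all device names are invented, the NVLink quad is shown fully connected for simplicity (the actual hybrid mesh cube may double some links), and one Omni-Path HFI per PCIe switch is assumed.

```python
# Sketch of the TSUBAME3.0 node topology described in the article.
# Device names (gpu0..3, plx0/1, cpu0/1, opa0/1) are illustrative only.
nvlink = {  # GPU-to-GPU NVLink edges; shown fully connected for illustration
    "gpu0": {"gpu1", "gpu2", "gpu3"},
    "gpu1": {"gpu0", "gpu2", "gpu3"},
    "gpu2": {"gpu0", "gpu1", "gpu3"},
    "gpu3": {"gpu0", "gpu1", "gpu2"},
}
pcie = {    # each half of the quad hangs off its own PLX switch,
            # which also connects a Xeon and an Omni-Path link
    "plx0": {"cpu0", "gpu0", "gpu1", "opa0"},
    "plx1": {"cpu1", "gpu2", "gpu3", "opa1"},
}

def gpu_peers(g):
    """GPUs reachable directly over NVLink, without crossing PCIe."""
    return sorted(nvlink[g])

print(gpu_peers("gpu0"))  # ['gpu1', 'gpu2', 'gpu3']
```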
TSUBAME3.0 will comprise 540 such nodes for a total of 2,160 SXM2 P100s and 1,080 Xeon E5-2680 V4 (14 core) CPUs.
At the rack level, 36 server blades house a total of 144 Pascals and 72 Xeons. The components are water cooled with an inlet water temperature of a warm 32 degrees Celsius, for a PUE of 1.033. “That’s lower than any other supercomputer I know,” commented Tokyo Tech Professor Satoshi Matsuoka, who is leading the design. (Here’s a diagram of the entire cooling system.)
Each node also has 2TB of NVMe SSD for I/O acceleration, totaling more than 1 petabyte for the entire system. It can be used locally, or aggregated on the fly with BeeGFS as an ad-hoc “burst buffer” filesystem, Matsuoka told us.
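The per-node and system-wide SSD figures are easy to reconcile:

```python
# Quick check of the aggregate NVMe capacity from the figures above.
nodes, ssd_tb_per_node = 540, 2
total_tb = nodes * ssd_tb_per_node
print(total_tb)  # 1080 TB, i.e. just over a petabyte across the system
```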
The second-tier storage is composed of DDN’s EXAScaler technology, which uses controller integration to achieve a 15.9PB Lustre parallel file system in three racks.
TSUBAME3.0 node overview (source: Tokyo Tech)
With 15 SGI ICE XA racks and two network racks, TSUBAME3.0 delivers 12.2 petaflops of spec’d computational power within 20 racks (excluding the in-row chillers). This makes TSUBAME 3.0 the smallest >10 petaflops machine in the world, said Matsuoka, who offered for comparison the K computer (10.5 Linpack petaflops, 11.3 peak) which extends to 1,000 racks, a 66X delta.
Like TSUBAME2.0/2.5, the new system continues the endorsement of smart partitioning. “The TSUBAME3.0 node is ‘fat’ but we want flexible partitioning,” said Matsuoka. “We will be using container technology as a default, being able to partition the nodes arbitrarily into pieces for flexible scheduling and achieving very high utilization. A job that uses only CPUs or just one GPU won’t waste the remaining resources on the node.”
As we noted in our earlier coverage, total rated system performance is 12.15 double-precision petaflops, 24.3 single-precision petaflops and 47.2 half-precision petaflops, aka “AI-Petaflops.”
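A back-of-envelope check shows where these figures come from. The assumption here is that the quoted peaks follow the P100’s 1:2:4 DP:SP:HP throughput ratio; the Xeons’ modest double-precision contribution (the gap between ~11.45 and 12.15 petaflops) is ignored for simplicity.

```python
# Rough reconstruction of the rated peaks from the GPU count alone.
p100_dp_tf = 5.3          # Tesla P100 double-precision peak, teraflops
gpus = 2160               # 540 nodes x 4 GPUs
gpu_dp_pf = gpus * p100_dp_tf / 1000
print(round(gpu_dp_pf, 2))      # ~11.45 PF DP from GPUs; Xeons supply the rest of 12.15
print(round(2 * gpu_dp_pf, 2))  # single precision is 2x DP on the P100
print(round(4 * gpu_dp_pf, 2))  # half precision is 4x DP -> most of the 47.2 "AI petaflops"
```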
“Since we will keep TSUBAME2.5 and KFC alive, the combined ‘AI-capable’ performances of the three machines will reach 65.8 petaflops, making it the biggest capacity infrastructure for ML/AI in Japan, or 6 times faster than the K computer,” said Matsuoka.
Satoshi Matsuoka with the TSUBAME3.0 blade
At yesterday’s press event in Japan, Professor Matsuoka also revealed that Tokyo Tech and the National Institute of Advanced Industrial Science and Technology (AIST) are going to open their joint “Open Innovation Laboratory” (OIL) next Monday, Feb. 20. Prof. Matsuoka will lead this organization and TSUBAME3.0 will be partially used for these joint efforts. The main resource of OIL will be an upcoming massive AI supercomputer, named “ABCI,” announced in late November (2016). So in some respects, TSUBAME3.0, with an operational target of summer 2017, will be a prototype machine to ABCI, which has a targeted installation of Q1 2018.
“Overall, I believe TSUBAME3.0 to be way above class compared to any supercomputers that exist, including the [other] GPU-based ones,” Professor Matsuoka told HPCwire. “There are not really any technical compromises, and thus the efficiency of the machine by every metric will be extremely good.”
The post TSUBAME3.0 Points to Future HPE Pascal-NVLink-OPA Server appeared first on HPCwire.
Blog originally posted here.
Members of the CTSC team are presenting the talk “Cybersecurity Program for Small Projects,” on February 27th at 11am (EST). Our presenters are Susan Sons, Craig Jackson, and Bob Cowles (speaker info).
Please register here. Be sure to check spam/junk folder for registration confirmation with attached calendar file.
Based on CTSC’s cybersecurity program development guide (see trustedci.org/guide), this webinar addresses practical information security tasks for small and medium science projects. The NSF CCoE’s work spans the full range of NSF-funded projects and facilities, and cybersecurity is certainly not a one-size-fits-all endeavor.
Some of the topics covered include:
· Cybersecurity’s relevance to science projects.
· The complexity and scope of cybersecurity, and how cybersecurity programs can help you cope with that complexity (and protect your science).
· A handful of “must-do” (and doable!) action items.
This session is appropriate for principal investigators, program officers, IT professionals in research and higher education, research facility managers, and security professionals interested in information security approaches tailored to particular communities. It is not a detailed technical training. There will be significant opportunities for Q&A.
More information about this presentation is on the event page.
Presentations are recorded and include time for questions with the audience.
A physics professor’s recent trip to the city of Al Ain in the United Arab Emirates to deliver several talks exemplifies Colorado School of Mines’ ongoing efforts to embody technical excellence while embracing a broader worldview.
Professor Lincoln Carr was invited to United Arab Emirates University by Usama Al Khawaja, an associate professor of physics and collaborator who has visited Mines several times. While the original plan was for Carr to speak about quantum mechanics, he ended up delivering three talks there that varied greatly in scope.
In addition to discussing a new direction for analog quantum computations—quantum complexity—Carr also delivered a college-level talk on recent massive investments into quantum research by the U.S., China and Europe, and a university-wide lecture about bridging the divide between the sciences and the humanities.
This last lecture, covering centuries of history, “is an extremely radical talk,” Carr said. He noted that while many Americans are unfamiliar with fundamental science concepts, leading to a rejection of many scientific discoveries, that’s only part of the problem: on the other hand, many Americans in STEM fields have little appreciation for the humanities.
This belief that the scientific perspective is true and the rest is a matter of opinion, Carr said, “stands in strong contrast to the attitude of Western Enlightenment thinkers as well as their predecessors in Arab and Islamic civilizations, who approached the world around them with an attitude of curiosity and openness synthesizing many paradigms of thought.”
Carr cited the study of lucid dreaming, once dismissed as New Age pseudoscience but now the subject of serious research, as a point where science and the humanities have come together. He was pleasantly surprised by the reception to this lecture, having expected a more conservative audience—even avoiding using images of women. Instead, he ended up delivering his talk on the women’s half of the campus. “They loved it. The audience was really open-minded,” said Carr, who noted that 80 percent of the population of the UAE comes from other countries, leading to really diverse perspectives.
Professors at UAE University—one of the top two public research universities in the Middle East—come from all over Africa, southeast Europe, anywhere Islam has touched people, Carr said. Al Ain, a city of about 650,000, is “an intellectual city—a giant college town,” he said. During this trip, Carr also presented his talk on quantum complexity at Texas A&M University at Qatar.
Carr credits his fellow faculty in the McBride Honors Program, which he has taught in for several years, with teaching him to think beyond physics. “Everything that I spoke about there came straight out of honors courses I’ve taught here,” he said. Carr praised President Paul Johnson for his support of the program. “I think it will create the kind of future scientists and engineers who will solve key world problems,” he said. “You can’t solve the clean water problem by working only in your discipline and not understanding the global picture.”
Carr said Mines, along with the “Renaissance engineers” it will produce, will have a major role in these future solutions. “Mines is a place where we can do something really different,” he said. “I really believe in that mission.”
SANTA CLARA, Calif., Feb. 17 — DataDirect Networks (DDN) today announced that the Tokyo Institute of Technology (Tokyo Tech) has selected DDN as its strategic storage infrastructure provider for the new TSUBAME3.0 supercomputing system. The innovative design of the TSUBAME3.0 is a major step along an evolutionary path toward a fundamental convergence of data and compute. TSUBAME3.0 breaks with many of the conventions of the world’s top supercomputers, incorporating elements and design points from containerization, cloud, artificial intelligence (AI) and Big Data, and it exhibits extreme innovation in the area of power consumption and system efficiency.
“As we run out the clock on Moore’s law, performance enhancements will increasingly be driven by improvements in data access times that come from faster storage media and networks, innovative data access approaches and the improvement of algorithms that interact with data subsystems,” said Satoshi Matsuoka, Professor, Ph.D., of the High Performance Computing Systems Group, GSIC, Tokyo Institute of Technology.
The IO infrastructure of TSUBAME3.0 combines fast in-node NVMe SSDs and a large, fast, Lustre-based system from DDN. The 15.9PB Lustre parallel file system, composed of three of DDN’s high-end ES14KX storage appliances, is rated at a peak performance of 150GB/s. The TSUBAME collaboration represents an evolutionary branch of HPC that could well develop into the dominant HPC paradigm at about the time the most advanced supercomputing nations and consortia achieve Exascale computing.
DDN and Tokyo Tech have worked together since TSUBAME2.0, the previous-generation supercomputer at Tokyo Tech, which debuted in the #4 spot on the Top500 and was certified as “the Greenest Production Supercomputer in the World.”
“Our collaboration with Tokyo Tech began more than six years ago and has spanned several implementations of the TSUBAME system,” said Robert Triendl, senior vice president of global sales, marketing and field services, DDN. “What is exciting about working with Satoshi and his team is the clear vision of advancing research computing from systems that support tightly-coupled simulations toward a new generation of data-centric infrastructures for the future of research big data but also AI and machine learning.”
Operated by the Global Scientific Information and Computing Center at Tokyo Tech, the TSUBAME systems are used by a variety of scientific disciplines and a wide-ranging community of users. Tokyo Tech researchers – professors and students – are the top users of the system, followed by industrial users, foreign researchers and external researchers working in collaboration with Tokyo Tech professors.
“Tokyo Tech is very pleased with our DDN solution and long-term partnership, and we are looking forward to teaming with DDN on future storage technologies for new application areas, such as graph computing and machine learning,” added Matsuoka.
DataDirect Networks (DDN) is the world’s leading big data storage supplier to data-intensive, global organizations. For more than 18 years, DDN has designed, developed, deployed and optimized systems, software and storage solutions that enable enterprises, service providers, universities and government agencies to generate more value and to accelerate time to insight from their data and information, on premise and in the cloud. Organizations leverage the power of DDN storage technology and the deep technical expertise of its team to capture, store, process, analyze, collaborate and distribute data, information and content at the largest scale in the most efficient, reliable and cost-effective manner. DDN customers include many of the world’s leading financial services firms and banks, healthcare and life science organizations, manufacturing and energy companies, government and research facilities, and web and cloud service providers. For more information, go to www.ddn.com or call 1-800-837-2298.
The post Tokyo Tech Selects DDN as Storage Infrastructure Provider for New TSUBAME3.0 Supercomputer appeared first on HPCwire.
Feb. 17 — One of the main tools doctors use to detect diseases and injuries in cases ranging from multiple sclerosis to broken bones is magnetic resonance imaging (MRI). However, the results of an MRI scan take hours or days to interpret and analyze. This means that if a more detailed investigation is needed, or there is a problem with the scan, the patient needs to return for a follow-up.
A new, supercomputing-powered, real-time analysis system may change that.
Researchers from the Texas Advanced Computing Center (TACC), The University of Texas Health Science Center (UTHSC) and Philips Healthcare have developed a new, automated platform capable of returning in-depth analyses of MRI scans in minutes, thereby minimizing patient callbacks, saving millions of dollars annually, and advancing precision medicine.
The team presented a proof-of-concept demonstration of the platform at the International Conference on Biomedical and Health Informatics this week in Orlando, Florida.
The platform they developed combines the imaging capabilities of the Philips MRI scanner with the processing power of the Stampede supercomputer – one of the fastest in the world – using the TACC-developed Agave API Platform infrastructure to facilitate communication, data transfer, and job control between the two.
An API, or Application Program Interface, is a set of protocols and tools that specify how software components should interact. Agave manages the execution of the computing jobs and handles the flow of data from site to site. It has been used for a range of problems, from plant genomics to molecular simulations, and allows researchers to access cyberinfrastructure resources like Stampede via the web.
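To make the job-submission idea concrete, here is a hypothetical sketch of the kind of JSON job description a client might send to a REST API such as Agave. The app id, queue name, input key and callback URL below are all invented for illustration; they are not taken from the actual MRI pipeline.

```python
import json

# Hypothetical job description for submission through a REST jobs API.
# Every identifier below (app id, queue, input key, URL) is illustrative.
job = {
    "name": "mri-analysis",
    "appId": "mri-pipeline-1.0",     # a registered analysis application (hypothetical)
    "batchQueue": "normal",          # target queue on the HPC system (hypothetical)
    "inputs": {"scan": "agave://storage/scans/patient-001.nii"},
    "notifications": [               # webhook fired when the job finishes
        {"event": "FINISHED", "url": "https://clinic.example/hook"}
    ],
}
payload = json.dumps(job)            # this JSON would be POSTed to the jobs endpoint
print("scan" in json.loads(payload)["inputs"])  # True
```

The platform then handles staging the scan data to the compute site, launching the job, and notifying the clinic when results are ready, which is what makes minutes-scale turnaround feasible.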
“The Agave Platform brings the power of high-performance computing into the clinic,” said William (Joe) Allen, a life science researcher for TACC and lead author on the paper. “This gives radiologists and other clinical staff the means to provide real-time quality control, precision medicine, and overall better care to the patient.”
The entire article can be found here.
Source: Aaron Dubrow, TACC
The post Stampede Supercomputer Assists With Real-Time MRI Analysis appeared first on HPCwire.
Feb. 17 — Advanced Clustering Technologies is helping customers solve challenges by integrating NVIDIA Tesla P100 accelerators into its line of high performance computing clusters. Advanced Clustering Technologies builds custom, turn-key HPC clusters that are used for a wide range of workloads including analytics, deep learning, life sciences, engineering simulation and modeling, climate and weather study, energy exploration, and improving manufacturing processes.
“NVIDIA-enabled GPU clusters are proving very effective for our customers in academia, research and industry,” said Jim Paugh, Director of Sales at Advanced Clustering. “The Tesla P100 is a giant step forward in accelerating scientific research, which leads to breakthroughs in a wide variety of disciplines.”
Tesla P100 GPU accelerators are based on NVIDIA’s latest Pascal GPU architecture, which provides the throughput of more than 32 commodity CPU-based nodes. The Tesla P100 specifications are:
- 5.3 teraflops double-precision performance
- 10.6 teraflops single-precision performance
- 21.2 teraflops half-precision performance
- 732GB/sec memory bandwidth with CoWoS HBM2 stacked memory
- ECC protection for increased reliability
“Customers taking advantage of Advanced Clustering’s high performance computing clusters with integrated NVIDIA Tesla P100 GPUs benefit from the most technologically advanced accelerated computing solution in the market – greatly speeding workload performance across analytics, simulation and modeling, deep learning and more,” said Randy Lewis, Senior Director of Worldwide Field Operations at NVIDIA.
About Advanced Clustering
Advanced Clustering Technologies, a privately held corporation based in Kansas City, Missouri, is dedicated to developing high-performance computing (HPC) solutions. The company provides highly customized turn-key cluster systems — utilizing out-of-the-box technology — to companies and organizations with specialized computing needs.
The technical and sales teams have more than 50 years of combined industry experience and comprehensive knowledge in the areas of cluster topologies and cluster configurations. In business since 2001, Advanced Clustering Technologies’ commitment to exceeding client expectations has earned the company the reputation as one of the nation’s premier providers of high performance computing systems. For more details, please visit http://www.advancedclustering.com/technologies/gpu-computing/.
Source: Advanced Clustering
The post Advanced Clustering Integrating NVIDIA Tesla P100 Accelerators Into Line of HPC Clusters appeared first on HPCwire.
In a press event Friday afternoon local time in Japan, Tokyo Institute of Technology (Tokyo Tech) announced its plans for the TSUBAME 3.0 supercomputer, which will be Japan’s “fastest AI supercomputer,” when it comes online this summer (2017). Projections are that it will deliver 12.2 double-precision petaflops and 47.2 half-precision petaflops (peak specs), rising to 64.3 half-precision petaflops when operated together with TSUBAME2.5.
Nvidia was the first vendor to publicly share the news in the US. We know that Nvidia will be supplying Pascal P100 GPUs, but the big surprise here is the system vendor. The Nvidia blog did not specifically mention HPE or SGI but it did include this photo with a caption referencing it as TSUBAME3.0:
TSUBAME3.0 – click to expand (Source: Nvidia)
That is most certainly an HPE-rebrand of the SGI ICE XA supercomputer, which would make this the first SGI system win since the supercomputer maker was brought into the HPE fold. For fun, here’s a photo of the University of Tokyo’s “supercomputer system B,” an SGI ICE XA/UV hybrid system:
Source: University of Tokyo-Institute for Solid State Physics
TSUBAME3.0 is on track to deliver more than two times the performance of its predecessor, TSUBAME2.5, which ranks 40th on the latest Top500 list (Nov. 2016) with a LINPACK score of 2.8 petaflops (peak: 5.6 petaflops). When TSUBAME was upgraded from 2.0 to 2.5 in the fall of 2013, the HP Proliant SL390s hardware stayed the same, but the GPU was switched from the NVIDIA (Fermi) Tesla M2050 to the (Kepler) Tesla K20X.
Increasingly, we’re seeing Nvidia refer to half-precision floating point capability as “AI computation.” Half-precision is suitable for many AI training workloads (but by no means all) and it’s usually sufficient for inferencing tasks.
With this rubric in mind, Nvidia says TSUBAME3.0 is expected to deliver more than 47 petaflops of “AI horsepower” and when operated in tandem with TSUBAME2.5, the top speed increases to 64.3 petaflops, which would give it the distinction of being Japan’s highest performing AI supercomputer.
According to a Japanese-issue press release, DDN will be supplying the storage infrastructure for TSUBAME 3.0. The high-end storage vendor is providing a combination of high-speed in-node NVMe SSD and its high-speed Lustre-based EXAScaler parallel file system, consisting of three racks of DDN’s high-end ES14KX appliance with capacity of 15.9 petabytes and a peak performance of 150 GB/sec.
TSUBAME3.0 is expected to be up and running this summer. The Nvidia release notes, “It will be used for education and high-technology research at Tokyo Tech, and be accessible to outside researchers in the private sector. It will also serve as an information infrastructure center for leading Japanese universities.”
“NVIDIA’s broad AI ecosystem, including thousands of deep learning and inference applications, will enable Tokyo Tech to begin training TSUBAME3.0 immediately to help us more quickly solve some of the world’s once unsolvable problems,” said Tokyo Tech Professor Satoshi Matsuoka, who has been leading the TSUBAME program since it began.
“Artificial intelligence is rapidly becoming a key application for supercomputing,” said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. “NVIDIA’s GPU computing platform merges AI with HPC, accelerating computation so that scientists and researchers can drive life-changing advances in such fields as healthcare, energy and transportation.”
We remind you the story is still breaking, but wanted to share what we know at this point. We’ll add further details as they become available.
The post Tokyo Tech’s TSUBAME3.0 Will Be First HPE-SGI Super appeared first on HPCwire.
Within the haystack of a lethal disease such as ALS (amyotrophic lateral sclerosis / Lou Gehrig’s Disease) there exists, somewhere, the needle that will pierce this therapy-resistant affliction. Finding the needle is a trial-and-error process of monumental proportions for scientists at pharmaceutical companies, medical research centers and academic institutions. As models grow in scale so too does the need for HPC resources to run simulations iteratively, to try-and-fail fast until success is found.
That’s all well and good if there’s ready access to HPC on premises. If not, drug developers, such as ALS researcher Dr. May Khanna, Pharmacology Department assistant professor at the University of Arizona, have turned to HPC resources provided by public cloud services. But using AWS, Azure or Google introduces a host of daunting compute management problems that tax the skills and time availability of most on-site IT staffs.
These tasks include data placement, instance provisioning, job scheduling, configuring software and networks, cluster startup and tear-down, cloud provider setup, cost management and instance health checking. To handle these cloud orchestration functions tied to 5,000 cores of Google Cloud Preemptible VMs (PVMs), Dr. Khanna and her team at Arizona turned to Cycle Computing to run “molecular docking” simulations at scale with Schrödinger’s Glide molecular modeling drug design software.
The results: simulations that would otherwise take months have been compressed to a few hours, short enough to be run during one of Dr. Khanna’s seminars and the output shared with students.
Developing new drugs to target a specific disease often starts with the building blocks of the compounds that become the drugs. The process begins with finding small molecules that can target specific proteins; when those proteins combine, they interact in a way that becomes the disease’s starting point. The goal is to find a molecule that breaks the proteins apart. This is done by simulating how the small molecules dock to the specific protein locations. These simulations are computationally intensive, and many molecules need to be simulated to find a few good candidates.
Without powerful compute resources, researchers must artificially constrain their searches, limiting the number of molecules to simulate. And they only check an area of the protein known to be biologically active. Even with these constraints, running simulations takes a long time. Done right, molecular docking is an iterative process that requires simulation, biological verification, and then further refinement. Shortening the iteration time is important to advancing the research.
The objective of Dr. Khanna’s work was to simulate the docking of 1 million compounds to one target protein. After a simulation was complete, the protein was produced in the lab, and compounds were then tested with nuclear magnetic resonance spectroscopy.
“It’s a target (protein) that’s been implicated in ALS,” the energetic Dr. Khanna told EnterpriseTech. “The idea is that the particular protein was very interesting, people who modulated it in different ways found some significant improvement in the ALS models they have with (lab) mice. The closer we can link biology to what we’re seeing as a target, the better chance of actually getting to a real therapeutic.”
“Modulating,” Dr. Khanna explained, is disrupting two proteins interacting in a way that is associated with ALS, a disease that currently afflicts about 20,000 Americans and for which there is no cure. “We’re trying to disrupt them, to release them to do their normal jobs,” she said.
She said CycleCloud plays a central role in running Schrödinger Glide simulations. Without Google Cloud PVMs, simulations would take too long and model sizes would be too small to generate meaningful results. Without CycleCloud, managing a 5,000-core pool of PVMs would not be possible.
CycleCloud provides a web-based GUI, a command line interface and APIs to define cloud-based clusters. It auto-scales clusters by instance type, maximum cluster size and costing parameters, deploying systems of up to 156,000 cores while validating each piece of the infrastructure. Additionally, it syncs in-house data repositories with cloud locations in a policy- and job-driven fashion to lower costs.
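The autoscaling policy described above can be sketched in a few lines. This is my own illustration of the general technique (size the cluster to the queue, cap it at a maximum, replace preempted instances), not CycleCloud’s actual algorithm; the function names and the one-job-per-node default are assumptions.

```python
import math

# Illustrative autoscaling policy, not CycleCloud's implementation.
def target_cluster_size(jobs_queued, max_nodes, jobs_per_node=1):
    """Nodes needed to drain the queue, capped at the configured maximum."""
    wanted = math.ceil(jobs_queued / jobs_per_node)
    return min(max_nodes, max(wanted, 0))

def nodes_to_launch(jobs_queued, healthy_nodes, max_nodes):
    """New instances to request; preempted nodes have already dropped out
    of the healthy count, so they get replaced automatically."""
    return max(0, target_cluster_size(jobs_queued, max_nodes) - healthy_nodes)

# 300 queued jobs, 250 surviving nodes, cap of 313: launch 50 replacements.
print(nodes_to_launch(jobs_queued=300, healthy_nodes=250, max_nodes=313))  # 50
```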
It should be noted that the use of Google Cloud’s PVMs, while helping to hold down the cost of running simulations to $200, contributes an additional degree of complexity to Dr. Khanna’s project work. Preemptible compute capacity offers the advantage of a consistent price not subject to dynamic demand pricing, as other public cloud instances are. PVMs are assigned to a job for a finite period of time but – here’s the rub – they can be revoked at any moment. Dr. Khanna’s workflow was well suited to PVMs, since it consists of small, short-running jobs, but the instances can still disappear without warning.
In the case of Dr. Khanna’s ALS research work, said Jason Stowe, CEO of Cycle Computing, “if you’re willing to risk losing the node, but you’re able to use it during that timeframe at substantially lower cost, that allows you to get a lot more computing bang for your buck. CycleCloud automates the process, taking care of nodes that go away, making sure the environment isn’t corrupted, and other technical aspects that we take care of so the user doesn’t have to.”
The simulation process is divided into two parts. The first step uses the Schrödinger LigPrep package, which converts 2D structures to the 3D format used in the next stage. This stage started with 4 GB of input data staged to an NFS filer. The output data was approximately 800KB and was stored on the NFS filer as well. To get the simulation done as efficiently as possible, the workload was split into 300 smaller jobs to assist in scaling the next stage of the workflow. In total, the first stage consumed 1,500 core-hours of computation.
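Splitting a library into a fixed number of sub-jobs, as the LigPrep stage does with its 300 pieces, is a simple balanced-chunking problem. The sketch below is illustrative; the actual splitting mechanics used in the workflow are not described in the source.

```python
# Split n_items across n_jobs so no two jobs differ by more than one item.
def split_into_jobs(n_items, n_jobs):
    base, extra = divmod(n_items, n_jobs)
    return [base + (1 if i < extra else 0) for i in range(n_jobs)]

# E.g., a 1,000,000-compound library across 300 sub-jobs.
sizes = split_into_jobs(1_000_000, 300)
print(len(sizes), sum(sizes), max(sizes) - min(sizes))  # 300 1000000 1
```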
The Schrödinger Glide software package performs the second stage of the process, where the actual docking simulation is performed. Each of the 300 sub-jobs consists of four stages, each with an attendant prep stage. The total consumption was approximately 20,000 core-hours using 5,000 cores of n1-highcpu-16 instances. Each instance had 16 virtual cores with 60 gigabytes of RAM. The CycleCloud software dynamically sized the cluster based on the number of jobs in queue and replaced preempted instances.
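The core-hour figures square with the runtime claim made earlier in the article, assuming near-perfect parallel efficiency:

```python
# 20,000 core-hours spread across 5,000 cores is about four hours of
# wall-clock time -- consistent with the "few hours" runtime reported.
core_hours, cores = 20_000, 5_000
print(core_hours / cores)  # 4.0 hours
```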
Dr. Khanna’s research is in the early stages of a process that, even if successful, could take several years to reach human clinical trials.
“The faster we can do this, the less time we have to wait for results, so we can go back and test it again and try to figure out what compounds are really binding,” she said, “the faster the process can move along.”
Dr. Khanna said plans are in place to increase the size of the pool of potential compounds, as well as to include other proteins in the simulation to look for interactions that would not typically be seen until later in the process. The team will also simulate over the entire surface of the protein instead of just a known active area, unlocking “an amazing amount of power” in the search process, she said.
“That jump between docking to binding to biological testing takes a really long time, but I think we can move forward on that with this cloud computing capacity,” she said. “The mice data that we saw was really exciting…, you could see true significant changes with the mice. I can’t tell you we’ve discovered the greatest thing for ALS, but showing that if we take these small molecules and we can see improvement, even that is so significant.”
The post Drug Developers Use Google Cloud HPC in the Fight against ALS appeared first on HPCwire.
HAARLEM, The Netherlands, Feb. 16, 2017 — Asperitas, a cleantech startup from the Amsterdam area, one of the world’s datacentre hotspots, is introducing a unique solution based on a total liquid cooling concept called Immersed Computing.
After 1.5 years of research and development with an ecosystem of partners, Asperitas is launching its first market-ready solution, the AIC24, at the leading international industry event Data Centre World & Cloud Expo Europe.
The Asperitas AIC24 is at the centre of Immersed Computing. It is a closed system and the first water-cooled oil-immersion system that relies on natural convection to circulate the dielectric liquid, resulting in a fully self-contained, Plug and Play modular system. The AIC24 needs far less infrastructure than any other liquid installation, saving energy and costs at every level of datacentre operations. Asperitas positions the AIC24 as the most sustainable solution available for IT environments today, ensuring the highest possible efficiency in availability, energy reduction and reuse while increasing capacity and improving density.
The AIC24 is designed to ensure the highest possible continuity for cloud providers. Total immersion ensures no oxygen comes into contact with the IT components, preventing oxidation. Thermal shock is greatly reduced by the high heat capacity of the liquid, and the immersed environment sees only minor temperature fluctuations, greatly reducing the stress that thermal expansion places on micro-electronics. Together, these factors eliminate the root causes of most physical degradation of micro-electronics over time.
Plug and Play green advanced computing anywhere
The AIC24 is Plug and Play: a single module requires only power, access to a water loop and data connectivity to operate. Combined with its silent operation, these limited requirements enable high flexibility in deployment sites and scenarios for the AIC24.
Two specially designed Convection Drives, for forced water flow and natural flow of oil, are capable of transferring 24 kW of heat out of the oil while keeping all the IT components at allowable operating temperatures.
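As a back-of-the-envelope check on the 24 kW figure, the water flow needed to carry that heat away follows from Q = ṁ · c_p · ΔT. The 10 K temperature rise below is an assumed value for illustration, not an Asperitas specification:

```python
# Water flow required to remove 24 kW at an assumed 10 K temperature rise.
Q = 24_000        # heat load in watts (24 kW per AIC24 module)
c_p = 4186        # specific heat of water, J/(kg*K)
delta_T = 10      # assumed water temperature rise across the module, K

mass_flow = Q / (c_p * delta_T)   # kg/s of water
litres_per_min = mass_flow * 60   # water is ~1 kg per litre
print(round(mass_flow, 3), round(litres_per_min, 1))
```

Roughly 0.57 kg/s, or about 34 litres of water per minute, suffices at that assumed ΔT; a smaller temperature rise would require proportionally more flow.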
To maximise IT capacity, the Asperitas Universal Cassette (AUC) can contain multiple physical servers. Each module accommodates 24 AUCs as well as two Universal Switching Cassettes, which currently adds up to 48 immersed servers and two immersed switches.
Immersed Computing is a concept driven by sustainability, efficiency and flexibility, and it goes far beyond the technology alone. In many situations, Immersed Computing can save more than 50% of the total energy footprint: immersion removes the need for fans, cutting IT energy by 10-45%, while other consumers such as cooling installations can achieve up to 95% energy reduction. It also allows warm-water cooling, which saves even more energy on cooling installations, and it enables high-temperature heat reuse.
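To make those percentages concrete, here is a worked example under assumed loads; the 100 kW IT baseline and the 40% cooling overhead are hypothetical figures for illustration, not Asperitas data:

```python
# Hypothetical facility: 100 kW of IT load plus conventional cooling
# overhead at 40% of IT load (an assumed air-cooled figure).
it_load = 100.0                  # kW
cooling = 0.40 * it_load         # kW of cooling overhead
baseline = it_load + cooling     # 140 kW total

# Immersion: removing fans saves 10-45% of IT energy (take the low end),
# and the cooling installation shrinks by up to 95%.
it_immersed = it_load * (1 - 0.10)
cooling_immersed = cooling * (1 - 0.95)
immersed = it_immersed + cooling_immersed

saving = 1 - immersed / baseline
print(f"{saving:.0%}")  # prints 34%
```

Even with the conservative 10% fan saving, the total footprint drops by about a third; at the 45% end of the fan-saving range the same arithmetic gives roughly 59%, consistent with the claim of more than 50%.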
Immersed Computing includes an optimised way of working, highly effective deployment, a flexible choice of IT and a drastic simplification of datacentre design. Offering advantages at every level of the datacentre value chain, Immersed Computing realises maximum results in Cloud, Private and Edge environments.
Asperitas is a cleantech company focused on greening the datacentre industry by introducing Immersed Computing.
The Asperitas Development partners include University of Leeds, Aircraft Development and Systems Engineering (ADSE), Vienna Scientific Cluster, Super Micro, Schleifenbauer and Brink Industrial. Asperitas is furthermore recognised and supported by the Netherlands Enterprise Agency as a promising new cleantech company.
The post Dutch Startup Offers Immersive Cooling for Cloud, Edge and HPC Datacenter appeared first on HPCwire.
The series starts Wednesday, March 1
The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign is pleased to announce the Blue Waters Weekly Webinar Series. The series will provide the research and education communities with a variety of opportunities to learn about methods, tools, and resources available to advance their computational and data analytics skills, with an emphasis on scaling to petascale and beyond.
Webinars will generally occur every Wednesday, with a few exceptions to avoid conflicts with major HPC conferences and events. All sessions will be free and open to everyone who registers. Registered participants will be able to pose questions using NCSA’s Blue Waters Slack environment. Registration is required for access to YouTube Live broadcasts. Webinars will begin at 10 a.m. Central Time (UTC-6).
Each webinar will be led by a developer or an expert on the topic. The first visualization webinar, “Introduction to Data Visualization” hosted by Vetria Byrd, Purdue University, will take place on March 1, 2017; the first workflows webinar, “Overview of Scientific Workflows” will be hosted by Scott Callaghan, University of Southern California, on March 8, 2017; and the first petascale application improvement discovery webinar, “Getting I/O Done with Parallel HDF5 on Blue Waters” hosted by Gerd Heber, HDF Group, will take place March 29, 2017. The list of webinar tracks as well as specific sessions will be refined and expanded over time.
For more information about the webinar series, including registration, abstracts, speakers, and links to YouTube recordings, please visit the Blue Waters webinar series webpage.
About Blue Waters
Blue Waters, managed by the National Center for Supercomputing Applications, is one of the most powerful supercomputers in the world. It can complete more than 1 quadrillion calculations per second on a sustained basis and more than 13 times that at peak speed. Blue Waters is the fastest supercomputer at a university anywhere in the world. Blue Waters also collaborates with other national HPC programs to prepare current and future faculty, and both undergraduate and graduate students, to gain the knowledge and skill sets necessary to capitalize on high-performance computing resources. Activities include training and workshops for faculty and students, campus visits, undergraduate internships, and graduate fellowships. Blue Waters is supported by the University of Illinois and the National Science Foundation through awards ACI-0725070 and ACI-1238993.
The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign provides supercomputing and advanced digital resources for the nation’s science enterprise. At NCSA, University of Illinois faculty, staff, students, and collaborators from around the globe use advanced digital resources to address research grand challenges for the benefit of science and society. NCSA has been advancing one third of the Fortune 50 for more than 30 years by bringing industry, researchers, and students together to solve grand challenges at rapid speed and scale.
Assistant Director, National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Public Affairs / Marketing / Facilities
Here at HPCwire, we aim to keep the HPC community apprised of the most relevant and interesting news items that get tweeted throughout the week.
February 16, 2017 − The 2017 ASC Student Supercomputer Challenge (ASC17) held its opening ceremony at Zhengzhou University. 230 teams from around the world will take on the world’s fastest supercomputer, Sunway TaihuLight, an artificial intelligence application, and a Gordon Bell Prize-nominated application, competing for 20 places in the finals. Hundreds of supercomputing experts and team representatives from around the world attended the opening ceremony.
The number of teams registered for the ASC17 Challenge has reached a new high, up 31% compared to last year. The competition platforms and applications have been designed to reflect the leading edge: Sunway TaihuLight and the most advanced supercomputer in Henan province (in central China) will run the various competition applications. Baidu’s AI application for intelligent-driving traffic prediction, together with MASNUM_WAVE, a high-resolution global surface wave simulation and 2016 Gordon Bell Prize finalist, will give the teams the opportunity to take on the “Super Brain” and “Big Science.” Meanwhile, the ASC17 finals will include 20 teams instead of the original 16.
Wang Endong, initiator of the ASC challenge, academician of the Chinese Academy of Engineering and chief scientist at Inspur, said that with the convergence of HPC, big data and cloud computing, intelligent computing as represented by artificial intelligence will become the most important component of the coming computing industry and will bring new challenges in computing technologies. For two consecutive seasons, the ASC Challenge has included AI applications in the hope that students will come to understand deep learning algorithms and acquire knowledge of big data and cutting-edge computing technologies, thereby grooming interdisciplinary supercomputing talent for the future.
On the day of the opening ceremony, Henan province’s fastest supercomputer, at the Zhengzhou University (Zhengzhou City) Supercomputing Center, was launched and became one of the competition platforms for ASC17. Liu Jiongtian, academician of the Chinese Academy of Engineering and president of Zhengzhou University, was unable to attend the event but said he believed the new system will allow teams worldwide to experience the latest technology, such as the KNL many-core architecture. At the same time, it will help accelerate supercomputing application innovation in Zhengzhou and Henan Province, groom supercomputing talent in the region, promote smart-city development in Zhengzhou, and support the rapid economic development of central China.
Yang Guangwen, director of the National Supercomputing Center in Wuxi, said that all the processors used in Sunway TaihuLight were developed in China, and that it is the world’s first supercomputer to achieve 100 petaflops. Using Sunway TaihuLight as the competition platform will give each team the opportunity to experience the world’s fastest supercomputer and so better promote the training of young talent. At the same time, the international exchanges resulting from the ASC17 Challenge will help more people appreciate China’s capability for independent design in the supercomputing domain.
The organizers of the ASC17 Challenge have also arranged a two-day intensive training camp for the participants, where experts from the National Supercomputing Center in Wuxi, Baidu, and Inspur conducted comprehensive and systematic lectures. Topics included supercomputer system design, the KNL architecture, deep learning application optimization, and techniques for using Sunway TaihuLight.
The ASC Student Supercomputer Challenge was initiated by China and is supported by experts and institutions worldwide. The competition aims to be a platform for exchanges among young supercomputing talent from different countries and regions, and to groom that talent. It also aims to be a key driving force in promoting technological and industrial innovation by raising the standard of supercomputing applications and research. The ASC Challenge has been held for six years; this year, ASC17 is co-organized by Zhengzhou University, the National Supercomputing Centre in Wuxi, and Inspur.
The post 230 Teams worldwide join ASC17 to challenge AI and TaihuLight appeared first on HPCwire.
Title: ‘IKE WAI DATABASE MANAGER
Deadline to Apply: 2017-02-20
Deadline to Remove: 2017-02-20
Job Summary: Develops and maintains database and research storage applications in support of the Information Technology Services Cyberinfrastructure group and the EPSCoR Track 1 ‘Ike Wai Project, supporting the full data lifecycle. Works with ITS Cyberinfrastructure (CI) developers on software applications, gateways, and pipelines that use and store scientific research data. Works with researchers on managing and disseminating their scientific data. Aids in applying ontologies and controlled vocabularies for metadata and data organization. Develops and maintains databases for scientific research data. Provides training and technical assistance to end users, including developing technical documents.
Job URL: http://www.hawaii.edu/epscor/database-manager/
Job Location: Oahu, HI
Institution: University of Hawai‘i Information Technology Services
Requisition Number: 17048
Posting Date: 2017-02-16
Job Posting Type: Job
Please visit http://hpcuniversity.org/careers/ to view this job on HPCU.
Please contact email@example.com with questions.