HPC Wire

Since 1987 - Covering the Fastest Computers in the World and the People Who Run Them

Two ORNL-Led Research Teams Receive $10.5 Million to Advance Quantum Computing

Fri, 10/20/2017 - 10:43

OAK RIDGE, Tenn., Oct. 20, 2017 — By harnessing the power of quantum mechanics, researchers hope to create quantum computers capable of simulating phenomena at a scale and speed unthinkable on traditional architectures, an effort of great interest to agencies such as the Department of Energy, which is tasked with tackling some of the world’s most complex science problems.

DOE’s Office of Science has awarded two research teams, each headed by a member of Oak Ridge National Laboratory’s Quantum Information Science Group, more than $10 million over five years to both assess the feasibility of quantum architectures in addressing big science problems and to develop algorithms capable of harnessing the massive power predicted of quantum computing systems. The two projects are intended to work in concert to ensure synergy across DOE’s quantum computing research spectrum and maximize mutual benefits.

Caption: ORNL’s Pavel Lougovski (left) and Raphael Pooser will lead research teams working to advance quantum computing for scientific applications. Credit: Oak Ridge National Laboratory, U.S. Dept. of Energy

ORNL’s Raphael Pooser will oversee an effort titled “Methods and Interfaces for Quantum Acceleration of Scientific Applications,” part of the larger Quantum Computing Testbed Pathfinder program funded by DOE’s Advanced Scientific Computing Research office.

Pooser’s team, which includes partners from IBM, commercial quantum computing developer IonQ, Georgia Tech and Virginia Tech, received $7.5 million over five years to evaluate the performance of a suite of applications on near-term quantum architectures.

The idea, Pooser said, is to work with industry leaders to understand the potential of quantum architectures in solving scientific challenges on the scale of those being tackled by DOE. ORNL will focus on scientific applications spanning three fields of study: quantum field theory, quantum chemistry and quantum machine learning.

“Quantum applications that are more exact and faster than their classical counterparts exist or have been proposed in all of these fields, at least theoretically,” said Pooser. “Our job is to determine whether we can get them to work on today’s quantum hardware and on the hardware of the near future.”

Many of these applications have never been programmed for quantum architectures before, which presents a unique challenge. Because today’s quantum computers are relatively small, applications must be tuned to the hardware to maximize performance and accuracy. This requires a deep understanding of the uniquely quantum areas of the programs, and it requires running them on various quantum architectures to assess their validity, and ultimately their feasibility.

“Many new quantum programming techniques have evolved to address this problem,” said Pooser, adding that his team would “implement new programming models that leverage the analog nature of quantum simulators.”
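
To give a flavor of what even the smallest quantum program involves, the sketch below (illustrative only, and not code from the ORNL project) uses plain NumPy to simulate a two-qubit circuit as a state vector, preparing a Bell state and printing its measurement probabilities. Real quantum programming models hide this bookkeeping, but the need to map such circuits onto small, noisy devices is what drives the hardware-specific tuning Pooser describes.

```python
import numpy as np

# Illustrative only: a two-qubit circuit simulated classically as a state vector.
# Real quantum programs target actual hardware, where noise and connectivity
# force the kind of application tuning described above.

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate
I = np.eye(2)

# CNOT with qubit 0 (most significant) as control, qubit 1 as target.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

# Start in |00>, apply H to qubit 0, then CNOT -> Bell state (|00> + |11>)/sqrt(2).
state = np.zeros(4)
state[0] = 1.0
state = np.kron(H, I) @ state
state = CNOT @ state

# Measurement probabilities for |00>, |01>, |10>, |11>.
print(np.round(np.abs(state) ** 2, 3))   # [0.5 0.  0.  0.5]
```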

To increase their chances of success, Pooser’s team will work closely with his ORNL colleague Pavel Lougovski, who is overseeing the “Heterogeneous Digital-Analog Quantum Dynamics Simulations” effort, which has received $3 million over three years.

Lougovski has partnered with the University of Washington’s Institute for Nuclear Theory and the University of the Basque Country UPV/EHU in Bilbao, Spain, to develop quantum simulation algorithms for applications in condensed matter and nuclear physics, specifically large-scale, many-body systems of particular interest to DOE’s Office of Science.

Lougovski’s team will pursue an algorithm design approach that combines best features of digital and analog quantum computing with the end goal of matching the complexity of quantum simulation algorithms to available quantum architectures. Because development and deployment of quantum hardware is a nascent field compared to traditional computing platforms, the team will also harness the power of hybrid quantum systems that use a combination of quantum computers and traditional processors.

“We have assembled a multidisciplinary team of computer scientists, applied mathematicians, scientific application domain experts, and quantum computing researchers,” Lougovski said. “Quantum simulation algorithms, much like our team, are a melting pot of various quantum and classical computing primitives. Striking a right balance between them and available hardware will enable new science beyond the reach of conventional approaches.”

ORNL’s quantum information researchers have decades of quantum computing research experience, and the laboratory has also made significant investments across the quantum spectrum, including in quantum communications and quantum sensing, and has strong relationships with industry leaders. The lab’s Quantum Computing Institute brings together expertise across the quantum spectrum and fosters collaboration across domains, from nanotechnology to physics to chemistry to biology.

These assets, along with ORNL’s rich history in traditional high-performance computing and ramping up applications to exploit powerful computing resources, will be critical in realizing the potential of the quantum platform to greatly accelerate scientific understanding of the natural world.

ORNL is managed by UT-Battelle for the DOE Office of Science. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, please visit http://science.energy.gov/.

Source: ORNL


NCSA Calls on HTCondor Partnership to Process Data for DES

Fri, 10/20/2017 - 07:54

Oct. 20, 2017 — The Laser Interferometer Gravitational-Wave Observatory (LIGO) has detected gravitational waves from a neutron star-neutron star merger. The event reveals a direct association between the merger and the galaxy where it occurred, something scientists have been trying to demonstrate for decades but have only now been able to prove. What distinguishes this event from the four gravitational-wave detections that preceded it is that this neutron star-neutron star merger was detected in three different ways.

LIGO detected gravitational waves. The Fermi satellite detected gamma rays, and hours later, as the sun set in Chile, the Dark Energy Camera saw an optical source (light) from the neutron star merger. This multi-messenger astronomy event was the first detection of its kind in history. The images from the Dark Energy Camera were processed using the Dark Energy Survey (DES) data reduction pipelines at NCSA using HTCondor.

“HTCondor has made it possible for us to take raw data from a telescope and process and disseminate the results within hours of the observations occurring,” said Professor Robert Gruendl, production scientist for DES and senior research scientist at NCSA.

HTCondor is a specialized workload management system for compute-intensive jobs. Unlike simple batch systems, HTCondor has the ability to distribute workloads across many sites. DES is actively running workloads on Blue Waters, the Illinois Campus Cluster, and the Open Science Grid at Fermilab. “HTCondor’s central role in the production system is to make data available to scientists within hours of being observed,” said Miron Livny.
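
For readers unfamiliar with HTCondor, the sketch below shows roughly what submitting a batch of independent processing jobs to an HTCondor pool looks like. It is a generic illustration rather than the actual DES pipeline; the executable name, resource requests and input list are placeholders.

```python
import subprocess
from pathlib import Path

# Generic illustration of an HTCondor batch submission; the executable,
# arguments, resource requests and exposure_list.txt are placeholders,
# not the real DES pipeline configuration.
submit_description = """
universe       = vanilla
executable     = process_exposure.sh
arguments      = $(exposure_id)
request_cpus   = 4
request_memory = 8GB
output         = logs/$(exposure_id).out
error          = logs/$(exposure_id).err
log            = logs/des_pipeline.log
queue exposure_id from exposure_list.txt
"""

Path("logs").mkdir(exist_ok=True)
Path("des_pipeline.sub").write_text(submit_description)

# condor_submit hands the jobs to the scheduler, which can route them to
# whichever pool (e.g., a campus cluster or grid site) has free slots.
subprocess.run(["condor_submit", "des_pipeline.sub"], check=True)
```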

The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign and the HTCondor team at the University of Wisconsin-Madison Center for High Throughput Computing (CHTC) have been collaborating on projects for 30 years.

“This collaboration will be a powerful means to develop high-throughput computing (HTC) data processing and analysis capability that is also beginning to address the unique and evolving needs of the LSST community while advancing the state of the art of HTC,” said Miron Livny, senior researcher in distributed computing at the University of Wisconsin-Madison. “This mutually beneficial partnership will deliver better astronomy science and distributed data intensive computing science,” said Livny. The Condor project also contributed to the TeraGrid and GRIDS projects, both of which involved significant NCSA participation.

Originally known simply as Condor, this system reprocessed radio data from the Berkeley-Illinois-Maryland Association (BIMA) in the late eighties. This collaboration between the Universities of California, Illinois, and Maryland built and operated the BIMA radio telescope array, which in 1986 was the premier millimeter-wavelength imaging instrument in radio astronomy.

The Dark Energy Survey Data Management (DESDM), led by NCSA, has relied on HTCondor software to enable data processing on the Blue Waters supercomputer for DES. DES is an international, collaborative effort to map hundreds of millions of galaxies, detect thousands of supernovae, and find patterns of cosmic structure in an effort to understand dark matter and the expansion of the Universe.

Source: NCSA


Scientists Use CSCS Supercomputer to Search for “Memory Molecules”

Fri, 10/20/2017 - 07:00

LUGANO, Switzerland, Oct. 20, 2017 — Until now, searching for genes related to memory capacity has been comparable to seeking out the proverbial “needle in a haystack”. Scientists at the University of Basel made use of the CSCS supercomputer “Piz Daint” to discover interrelationships in the human genome that might simplify the search for “memory molecules” and eventually lead to more effective medical treatment for people with diseases that are accompanied by memory disturbance.

Every human being’s physical and mental constitution is the outcome of a complex interaction between environmental factors and the individual genetic make-up (DNA). The complete set of genes and the genetic information it stores is called the genotype. This genotype influences, among other things, a person’s memory and hence the ability to remember the past. In addition to the DNA coding, there are other factors contributing to memory capacity, such as nutrition, schooling and family home life.

Memories are made of… what?

Scientists at the University of Basel from the Transfaculty Research Platform Molecular and Cognitive Neuroscience (MCN) are interested in processes related to memory performance by investigating the molecular basis of memory. “Molecular neuroscience is a dynamic field, with work being done around the world by a community that ranges from mathematicians and computer scientists to applied psychologists,” explains Annette Milnik, a post-doctoral fellow in the research group under Professor Andreas Papassotiropoulos, co-head of the research platform. The goal of Milnik’s research is to find patterns in genes that are related to memory capacity and that might explain how memory works and how it can be influenced. “There is no such thing as ‘the’ memory gene, but rather many variations in the genome that, combined with numerous other factors, form our memory,” says Milnik.

To investigate memory capacity, researchers from the fields of medical science, psychiatry, psychology and biology make use of brainwave measurements, memory tests and imaging techniques while the brain is subjected to various stimuli. The researchers also make use of animal models, as well as genetic and epigenetic studies. The latter examine phenomena and mechanisms that cause chemical changes in the chromosomes and genes without altering their actual DNA sequence.

One quadrillion tests

To decipher the molecular basis of memory capacity, researchers “zoom” deep into the human DNA. For this purpose, Milnik examines particular gene segments and their variants. Milnik originally studied psychology and human medicine, but five years ago she traded her medical career for research and the associated statistical analysis. In her current work, the statistical tests numbered one quadrillion (10^15) in total. Analysing such a quantity of data would not be possible without a supercomputer like “Piz Daint”, she notes. Yet her results might significantly simplify future analysis of large datasets in the search for the “memory molecule”.

Although the genetic code (DNA) is fixed in all cells, mechanisms like epigenetic processes exist that regulate which parts of the code are expressed. As an example, kidney and liver cells each use different parts of the genome. One process that performs this “functional attribution” is called methylation. “You could imagine flags marking the spots on the human genome where methylation takes place,” explains Milnik. A pattern of flags typical for a particular gene thus identifies a given cell function like a pointer. The cell function is influenced by how the genes are specifically read. Moreover, according to Milnik, the environment too may influence the flag-patterns. “We are facing highly complex relationships between genes, the environment, and how they interact. This is why we want to take a step back and seek out a simplified model for these relationships,” says Milnik.

Influence of genetic variations 

Using material sampled from healthy young volunteers, Milnik and her colleagues examined 500,000 genetic variations known as Single Nucleotide Polymorphisms (SNPs), variations affecting individual building blocks of the DNA, in conjunction with 400,000 flag-patterns. They wanted to investigate the impact of the genetic code on methylation. According to the researchers, the results of their study show not only that single SNPs located near the flags have an impact on the flag-pattern (methylation), but also that combinations of genetic variants, both in proximity and farther apart in the genome, affect this flag-pattern. “This shows us that genetic variants exert a complex influence on methylation,” says Milnik. The flag-pattern thus unifies the impact of a larger set of genetic variants that is then represented in one signal.
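
As a rough illustration of the kind of single SNP-to-methylation association test underlying those quadrillion comparisons, the sketch below runs one such test on synthetic data. The variable names and effect size are invented, and the real analyses are considerably more sophisticated.

```python
import numpy as np
from scipy import stats

# Synthetic illustration of one SNP-to-methylation association test; the real
# study runs on the order of 10**15 such comparisons, hence the need for Piz Daint.
rng = np.random.default_rng(0)

n_subjects = 500
snp = rng.integers(0, 3, size=n_subjects)        # genotype coded as 0/1/2 minor alleles
noise = rng.normal(0, 1, size=n_subjects)
methylation = 0.1 * snp + noise                  # a weak, artificial SNP effect

# Linear association between genotype and methylation level at one "flag".
result = stats.linregress(snp, methylation)
print(f"slope={result.slope:.3f}  p-value={result.pvalue:.2e}")
```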

For Milnik, this means she has found a kind of intermediate filter that reduces the large datasets that have been used to investigate memory capacity. In the past, each single genetic variation has been related individually to memory capacity. But now, according to Milnik, we know that the flags accumulate information from a complex system of SNP effects. So in the future, instead of using all of the individual SNPs to explore memory capacity and other complex human characteristics, the methylation flag-pattern could be used as well. As it currently stands, this approach is still basic research. However, once the molecules relevant to memory capacity can be identified in this way, a next step could be to investigate whether medicines exist that interact with the corresponding gene products and might be able to influence memory capacity, explains Milnik. This would offer a gleam of hope for the treatment of diseases that are accompanied by memory disturbance, such as dementia or schizophrenia.

Source: CSCS


OpenSFS Offers Maintenance Release 2.10.1 for the Lustre File System

Thu, 10/19/2017 - 17:25

Oct. 19 — In a move to solidify and expand the use of the open-source Lustre file system for the greater high-performance computing (HPC) community, Open Scalable File Systems, Inc., or OpenSFS, today announced Lustre 2.10.1, the first maintenance release of the Lustre 2.10 Long Term Support (LTS) stream.

This latest advance, in effect, fulfills a commitment made by Intel last April to align its efforts and support around the community release, including efforts to design and release a maintenance version. This transition enables growth in the adoption and rate of innovation for Lustre.

“When OpenSFS transitioned to being community-driven, it was exactly things like this that we were hoping for,” stated Steve Simms, former OpenSFS president. “It’s a real milestone for Lustre as an open-source project.”

“The Lustre file system is a critical tool for HPC to meet the growing demands of users who are using vast amounts of data to tackle increasingly complex problems,” added Trish Damkroger, Vice President of Technical Computing at Intel. “Intel continues to invest in the Lustre community, as shown by the 2.10.1 release, and we are looking forward to continued collaboration with OpenSFS on the Lustre 2.11 release.”

Aside from Intel, other members of the Lustre Working Group (LWG), a technical group of senior Lustre contributors from multiple organizations, played major roles in the development of this new software package. (See list below).

“We really appreciate all the contributions to the Lustre code base from all the different organizations of LWG,” said Dustin Leverman, LWG co-chair.

Though excitement about new software often revolves around new features and performance, this new release focuses on stability and reliability and is critical for researchers and other HPC users who want to focus on HPC rather than fixing bugs. This maintenance release improves the Lustre 2.10.0 code base, which included several new features, such as progressive file layouts, multi-rail LNet, and project quotas. The next feature release will be Lustre 2.11, currently slated for Q1 of 2018.

The new release supports a variety of recently updated popular open-source technologies including Red Hat 7.4, CentOS 7.4, and ZFS 0.7.1. This should help expand the adoption of Lustre in many organizations, particularly smaller universities and HPC centers that rely on open-source software for their advanced computing needs. The announcement further underscores Lustre’s status as community-supported open-source software. (A full list of changes is linked below.)

“It is exciting to see the development community and the vendors supporting Lustre come together to deliver the first maintenance release of the 2.10 LTS stream. This is a very important achievement for the Lustre community,” said Sarp Oral, current OpenSFS president.

OpenSFS encourages contributions from any group interested in supporting Lustre, and not just through software development. Contributors to the Lustre code base and LWG include:

Canonical, CEA, Cray, DDN, Hewlett Packard Enterprise, Indiana University, Intel HPDD, Lawrence Livermore National Laboratory, Oak Ridge National Laboratory, Sandia National Laboratory, Seagate, SuperMicro.

About OpenSFS

OpenSFS is a nonprofit organization founded in 2010 to advance Lustre development, ensuring it remains vendor-neutral, open, and freely downloadable (http://lustre.org/download/). OpenSFS participants include vendors and customers who employ the world’s best Lustre file system experts, implementing and supporting Lustre solutions across HPC and commercial enterprises. OpenSFS actively promotes the growth, stability and vendor neutrality of the Lustre file system.

OpenSFS web site: http://opensfs.org

Lustre Working Group: http://wiki.opensfs.org/Lustre_Working_Group

Lustre 2.10.1 Changelog: http://wiki.lustre.org/Lustre_2.10.1_Changelog


Data Vortex Users Contemplate the Future of Supercomputing

Thu, 10/19/2017 - 16:01

Last month (Sept. 11-12), HPC networking company Data Vortex held its inaugural users group meeting at Pacific Northwest National Laboratory (PNNL), bringing together about 30 participants from industry, government and academia to share their experiences with Data Vortex machines and have a larger conversation about transformational computer science and what future computers are going to look like.

Coke Reed and John Johnson with PEPSY at PNNL

The meeting opened with Data Vortex Founder and Chairman Dr. Coke Reed describing the “Spirit of Data Vortex,” the self-routing congestion-free computing network that he invented. Reed’s talk was followed by a series of tutorials and sessions related to programming, software, and architectural decisions for the Data Vortex. A lively panel discussion got everyone thinking about the limits of current computing and the exciting potential of revolutionary approaches. Day two included presentations from the user community on the real science being conducted on Data Vortex computers. Beowulf cluster inventor Thomas Sterling gave the closing keynote, tracing the history of computer science from antiquity to the present.

“This is a new technology but it’s mostly from my perspective an opportunity to start rethinking from the ground up and move a little bit from the evolutionary to the revolutionary aspect,” shared user meeting host PNNL research scientist Roberto Gioiosa in an interview with HPCwire. “It’s an opportunity to start doing something different and working on how you design your algorithm, run your programs. The idea that it’s okay to do something revolutionary is an important driver and it makes people start thinking differently.”

Roberto Gioiosa with JOLT at PNNL

“You had that technical exchange that you’d typically see in a user group,” added John Johnson, PNNL’s deputy director for the computing division. “But since we’re looking at a transformational technology, it provided the opportunity for folks to step back and look at computing at a broader level. There was a lot of discussion about how we’re reaching the end of Moore’s law and what’s beyond Moore’s computing – the kind of technologies we are trying to focus on, the transformational computer science. The discussion actually was in some sense, do we need to rethink the entire computing paradigm? When you have new technologies that do things in a very very different way and are very successful in doing that, does that give you the opportunity to start rethinking not just the network, but rethinking the processor, rethinking the memory, rethinking input and output and also rethinking how those are integrated as well?”

The heart of the Data Vortex supercomputer is the Data Vortex interconnection network, designed for both traditional HPC and emerging irregular and data analytics workloads. Consisting of a congestion-free, high-radix network switch and a Vortex Interconnection Controller (VIC) installed on commodity compute nodes, the Data Vortex network enables the transfer of fine-grained network packets at a high injection rate.

The approach stands in contrast to existing crossbar-based networks. Reed explained, “The crossbar switch is set with software and as the switches grow in size and clock-rate, that’s what forces packets to be so long. We have a self-routing network. There is no software management system of the network and that’s how we’re able to have packets with 64-bit headers and 64-bit payloads. Our next-gen machine will have different networks to carry different sized packets. It’s kind of complicated really but it’s really beautiful. We believe we will be a very attractive network choice for exascale.”
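
To make the fine-grained packet format concrete, the sketch below packs one packet of the size Reed describes, a 64-bit header plus a 64-bit payload, using Python's standard struct module. The header contents here are a placeholder, not the actual Data Vortex wire format.

```python
import struct

# A Data Vortex packet as described above: a 64-bit header plus a 64-bit payload,
# i.e. 16 bytes in total. The header bit layout here is purely illustrative;
# the real routing format is proprietary.
def pack_packet(dest_node: int, payload: int) -> bytes:
    header = dest_node & 0xFFFFFFFFFFFFFFFF   # placeholder: header carries routing info
    return struct.pack("<QQ", header, payload)

pkt = pack_packet(dest_node=42, payload=0xDEADBEEF)
print(len(pkt))      # 16 bytes per packet
print(pkt.hex())

# Contrast: networks built around software-managed crossbars tend to amortize
# setup cost over much longer packets, which is the design tension Reed describes.
```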

Data Vortex is targeting all problems that require either massive data movement, short packet movement or non-deterministic data movement — examples include sparse linear algebra, big data analytics, branching algorithms and fast Fourier transforms.

The inspiration for the Data Vortex Network came to Dr. Reed in 1976. That was the year that he and Polish mathematician Dr. Krystyna Kuperberg solved Problem 110 posed by Dr. Stanislaw Ulam in the Scottish Book. The idea of Data Vortex as a data carrying, dynamical system was born and now there are more than 30 patents on the technology.

Data Vortex debuted its demonstration system, KARMA, at SC13 in Denver. A year later, the Data Vortex team publicly launched DV206 during the Supercomputing 2014 conference in New Orleans. Not long after, PNNL purchased its first Data Vortex system and named it PEPSY in honor of Coke Reed and as a nod to Python scientific libraries. In 2016, CENATE — PNNL’s proving ground for measuring, analyzing and testing new architectures — took delivery of another Data Vortex machine, which they named JOLT. In August 2017, CENATE received its second machine (PNNL’s third), MOUNTAIN DAO.

MOUNTAIN DAO comprises sixteen compute nodes (2 Supermicro F627R3-FTPT+ FatTwin Chassis with 4 servers each), each containing two Data Vortex interface cards (VICs), and 2 Data Vortex Switch Boxes (16 Data Vortex 2 level networks, on 3 switch boards, configured as 4 groups of 4).

MOUNTAIN DAO is the first multi-level Data Vortex system. Up until this generation, the Data Vortex systems were all one-level machines, capable of scaling up to 64 nodes. Two-level systems extend the potential node count to 2,048. The company is also planning three-level systems that will be scalable up to 65,536 nodes, which will push Data Vortex closer to its exascale goals.

With all ports utilized on the two-level MOUNTAIN DAO, applications show negligible performance differences between the one-level and two-level networks.

PNNL scientists Gioiosa and Johnson are eager to be exploring the capabilities of their newest Data Vortex system.

“If you think about traditional supercomputers, the application has specific characteristics and parameters that have evolved to match those characteristics. Scientific simulation workloads tend to be fairly regular; they send fairly large messages so the networks we’ve been using so far are very good at doing that, but we are facing a new set of workloads coming up — big data, data analytics, machine learning, machine intelligence — these applications do not look very much like the traditional scientific computing so it’s not surprising that the hardware we’ve been using so far is not performing very well,” said Gioiosa.

“Data Vortex provides an opportunity to run both sets of workloads, both traditional scientific application and matching data analytics application in an efficient way so we were very interested to see how that was actually working in practice,” Gioiosa continued. “So as we received the first and second system, we started porting workloads, porting applications. We have done a lot of different implementations of the same algorithm to see what is the best way to implement things in these systems and we learned while doing this and making mistakes and talking to the vendor. The more we understood about the system the more we changed our programs and they were more efficient. We implement these algorithms in ways that we couldn’t do on traditional supercomputers.”

Johnson explained that having multiple systems lets them focus on multiple aspects of computer science. “On the one hand you want to take a system and understand how to write algorithms for that system that take advantage of the existing hardware and existing structure of the system but the other type of research that we like to do is we liked to get in there and sort of rewire it and do different things, and put in the sensors and probes and all different things, which can help you bring different technologies together but would get in the way of porting algorithms directly to the existing architecture so having different machines that have different purposes. It goes back to one of the philosophies we have, looking at the computer as a very specialized scientific instrument and as such we want it to be able to perform optimally on the greatest scientific challenges in energy, environment and national security but we also want to make sure that we are helping to design and construct and tune that system so that it can do that.”

The PNNL researchers emphasized that even though these are exploratory systems they are already running production codes.

“We can run very large applications,” said Gioiosa. “These applications are on the order of hundreds of thousands of lines of code. These are production applications, not test apps that we are just running to extract the FLOPS.”

At the forum, researchers shared how they were using Data Vortex for cutting-edge applications: quantum computer simulation and density functional theory, a core component of computational chemistry. “These are big science codes, the kind you would expect to see running on leadership-class systems, and we heard from users who ported either the full application or parts of the application to Data Vortex,” said Johnson.

“This system is usable,” said Gioiosa. “You can run your application, you can do real science. We saw a simulation of quantum computers and people in the audience who are actually using a quantum computer said this is great because in quantum computing we cannot see the inside of the computer, we only see outside. It’s advancing understanding of how quantum algorithms work and how quantum machines are progressing and what we need to do to make them mainstream. I call it science, but this means production for us; we don’t produce carts but we produce tests and problems and come up with solutions and increase discovery and knowledge so that is our production.”

Having held a successful first user forum, the organizers are looking ahead to future gatherings. “There are events that naturally bring us together, like Supercomputing and other big conferences, but we are keen to have this forum once every six months or every year depending on how fast we progress,” said Gioiosa. “We expect it will grow as more people who attend will go back to their institution and say, oh this was great, next time you should come too.”

What’s Next for Data Vortex

The next major step on the Data Vortex roadmap is to move away from the commodity server approach they have employed in all their machines so far to something more “custom.”

“What we had in this generation is a method of connecting commodity processors,” said Dr. Reed. “We did Intel processors connected over an x86 (PCIe) bus. Everything is fine grained in this computer except the Intel processor and the x86 bus and so the next generation we’re taking the PCIe bus out of the critical path. Our exploratory units [with commodity components] have done well but now we’re going full custom. It’s pretty exciting. We’re using exotic memories and other things.”

Data Vortex expects to come out with an interim approach using FPGA-based compute nodes by this time next year. Xilinx technology is being given serious consideration, but specific details of the implementation are still under wraps. (We expect more will be revealed at SC17.) Current generation Data Vortex switches and VICs are built with Altera Stratix V FPGAs and future network chip sets will be built with Altera Stratix 10 FPGAs.

Data Vortex has up to this point primarily focused on big science and Department of Defense style problems, but now they are looking at expanding the user space to explore anywhere there’s a communication bottleneck. Hyperscale and embedded systems hold potential as new market vistas.

In addition to building its own machines, Data Vortex is inviting other people to use its interconnect in their computers or devices. In fact, the company’s primary business model is not to become a deliverer of systems. “We’ve got the core communication piece so we’re in a position now where we’re looking at compatible technologies and larger entities to incorporate this differentiating piece to their current but more importantly next-generation designs,” Data Vortex President Carolyn Coke Reed Devany explained. “What we’re all about is fine-grained data movement and that doesn’t necessarily have to be in a big system, that can be fine-grained data movement in lots of places.”


AI Self-Training Goes Forward at Google DeepMind

Thu, 10/19/2017 - 14:23

Imagine if all the atoms in the universe could be added up into a single number. Big number, right? Maybe the biggest number conceivable. But wait, there’s a bigger number out there. We’re told that Go, the world’s oldest board game, has more possible board positions than there are atoms in the universe. Urban myth? All right, let’s say Go has half as many positions as there are atoms. Make it a tenth. The point is: Go complexity is beyond measure.

DeepMind, Google’s AI research organization, announced today in a blog that AlphaGo Zero, the latest evolution of AlphaGo (the first computer program to defeat a Go world champion), trained itself within three days to play Go at a superhuman level (i.e., better than any human) – and to beat the old version of AlphaGo – without leveraging human expertise, data or training.

The absence of human training may have “liberated” AlphaGo Zero to find new ways to play Go that humans don’t know, putting the new system beyond the talents of the human-trained AlphaGo.

Richard Windsor, analyst at Edison Investment Research, London, notes that today’s announcement is an important step forward on one of the three big AI challenges which, he said, are:

  • AI systems that can be trained with less data
  • AI that takes lessons learned from one task and applies it across multiple tasks
  • AI that builds its own models

“DeepMind has been able to build a new Go (AlphaGo Zero) algorithm that relies solely on self-play to improve and within 36 hours was able to defeat AlphaGo Lee (the one that beat [professional Go player] Lee Sedol) 100 games to 0…,” Windsor said. “DeepMind’s achievement represents a huge step forward in addressing the first challenge as AlphaGo Zero used no data at all…”

According to DeepMind, previous versions of AlphaGo were trained on the basis of thousands of human games. But AlphaGo Zero “skips this step and learns to play simply by playing games against itself, starting from completely random play.” In doing so, it quickly surpassed human level of play and went undefeated against AlphaGo.

The new self-training algorithm, according to the DeepMind blog, is significant for AI systems to take on problems for which “human knowledge may be too expensive, too unreliable or simply unavailable. As a result, a long-standing ambition of AI research is to bypass this step, creating algorithms that achieve superhuman performance in the most challenging domains with no human input.”

DeepMind said AlphaGo Zero uses a novel form of reinforcement learning in which the system starts off with a neural network that knows nothing about Go. “It then plays games against itself, by combining this neural network with a powerful search algorithm. As it plays, the neural network is tuned and updated to predict moves, as well as the eventual winner of the games.”

AlphaGo has become progressively more efficient thanks to hardware gains and, more recently, algorithmic advances (Source: DeepMind)

The updated neural network is then recombined with the search algorithm to create a new, stronger version of AlphaGo Zero, and the process begins again, improving incrementally with each game. (The algorithmic change also significantly improves system efficiency, see graphic at right.)
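
The cycle DeepMind describes can be summarized schematically. The sketch below is a toy, not DeepMind's code: the game, the search and the training step are trivial stand-ins, but the control flow mirrors the described loop of self-play, collecting (state, move, winner) examples, and retraining the policy before the next round.

```python
import random

# Schematic only: the control flow of self-play reinforcement learning as described,
# with trivial stand-ins for the game, the neural network, and the search.

def legal_moves(state):
    return [0, 1, 2]                      # placeholder game with three moves per turn

def search(policy, state):
    # Stand-in for the tree search the real system combines with its network.
    return random.choice(legal_moves(state))

def play_one_game(policy):
    state, history = (), []
    for _ in range(10):                   # fixed-length placeholder game
        move = search(policy, state)
        history.append((state, move))
        state = state + (move,)
    winner = random.choice([+1, -1])      # placeholder outcome
    return [(s, m, winner) for s, m in history]

def train(policy, examples):
    # Stand-in for updating the network to predict moves and winners from examples.
    policy["games_seen"] = policy.get("games_seen", 0) + len(examples)
    return policy

policy = {}                               # starts knowing nothing about the game
for iteration in range(3):                # each iteration: self-play, then retrain
    examples = []
    for _ in range(5):
        examples.extend(play_one_game(policy))
    policy = train(policy, examples)
    print(f"iteration {iteration}: trained on {policy['games_seen']} positions")
```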

“This technique is more powerful than previous versions of AlphaGo because it is no longer constrained by the limits of human knowledge. Instead, it is able to learn tabula rasa from the strongest player in the world: AlphaGo itself,” said DeepMind.

Put another way by Windsor: “It is almost as if the use of human data limited the potential of the machine’s ability to maximize its potential.”

While the new system makes strides against the self-training Big AI Challenge, Windsor expressed doubts that it addresses the third challenge (automated model building) because it used a model already used by the previous version of AlphaGo.

“…the system of board assessment and move prediction (but not the experience) used in AlphaGo Lee was also built into AlphaGo Zero,” said Windsor. “Hence, we think that this system was instead using a framework that had already been developed to play and applying reinforcement learning to improve, rather than building its own models.”

But this isn’t to minimize the achievement of AlphaGo Zero, nor to quell those (such as Elon Musk) who worry that human intelligence will eventually be dwarfed by AI, with potential dystopic implications.

“What will really have the likes of Elon Musk quaking in their boots is the fact that AlphaGo Zero was able to obtain a level of expertise of Go that has never been achieved by a human mind,” Windsor said.

Having said that, include Windsor among those who don’t believe machines will enslave the human race. He also said that DeepMind may have trouble applying its achievement elsewhere.

“Many of the other digital ecosystems have been trying to use computer generated images to train image and video recognition algorithms but there has been no real success to date and we suspect that taking what DeepMind has achieved and applying it to real world AI problems like image and video recognition will be very difficult,” he said, explaining that “the Go problem is based on highly structured data in a clearly defined environment whereas images, video, text, speech and so on are completely unstructured.”

But DeepMind sounded a more optimistic note on the broader applicability of AlphaGo Zero teaching itself new and incredibly complicated tricks.

“These moments of creativity give us confidence that AI will be a multiplier for human ingenuity, helping us with our mission to solve some of the most important challenges humanity is facing…. If similar techniques can be applied to other structured problems, such as protein folding, reducing energy consumption or searching for revolutionary new materials, the resulting breakthroughs have the potential to positively impact society.”


SC17 Video: How Supercomputing Helps Explain the Ocean’s Role in Weather and Climate

Thu, 10/19/2017 - 13:10

DENVER, Oct. 19, 2017 — Using the power of today’s high performance computers, Earth scientists are working hand in hand with visualization experts to bring exquisitely detailed views of Earth’s oceans into sharper focus than ever before.

A video just released by the SC17 conference relates how scientists are zooming in on one of the highest-resolution computer simulations in the world to explore never-before-seen features of global ocean eddies and circulation.

“The ocean is what makes life possible on this beautiful planet,” said Dr. Dimitris Menemenlis, Research Scientist in the Earth Science Section at NASA’s Jet Propulsion Laboratory (JPL), Pasadena, Calif. “We should therefore try to understand and study and know how it works.”

Menemenlis has been doing just that—collaborating with other experts for two decades to continually improve data assimilation and numerical modeling techniques in order to achieve increasingly accurate descriptions of the global ocean circulation. Numerical global ocean simulations today have horizontal grid cells spaced 1 to 2 kilometers apart, compared to 25 to 100 kilometers 20 years ago.
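
A back-of-the-envelope estimate shows why that jump in resolution demands supercomputing. The sketch below assumes a global ocean surface area of roughly 3.6 x 10^8 square kilometers and uniform square cells, a simplification of the real model grids, so the counts are order-of-magnitude only.

```python
# Rough estimate of how horizontal grid-cell counts grow with resolution.
# Ocean surface area is approximated as 3.6e8 km^2 and cells as uniform squares;
# real model grids are more complicated, so treat these as order-of-magnitude figures.
ocean_area_km2 = 3.6e8

for spacing_km in (100, 25, 2, 1):
    cells = ocean_area_km2 / spacing_km**2
    print(f"{spacing_km:>4} km spacing -> ~{cells:,.0f} horizontal cells")
```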

“We are working with people at NASA centers, universities, and labs around the world who are looking for answers to important questions such as how ocean heat interacts with land and sea ice, how ice melt could raise sea levels and affect coastal areas, how carbon in the atmosphere is changing seawater chemistry, and how currents impact the ocean carbon cycle,” stated Menemenlis.

The new simulation accurately represents temperature and salinity variations in the ocean caused by a wide range of processes, from mesoscale eddies to internal tides. This simulation gives scientists a better picture of how ocean currents carry nutrients, carbon dioxide, and other chemicals to various locations around the world. These improvements are made possible by evolving supercomputer capabilities, satellite and other observational methods, and visualization techniques.

In particular, visualization and data analysis experts in the NASA Advanced Supercomputing (NAS) Division at NASA’s Ames Research Center in Silicon Valley have developed an interactive visualization technique that allows scientists to explore the entire global ocean on NAS’s 128-screen hyperwall and then zoom in on specific regions in near-real-time. Menemenlis says the new capability helps to quickly identify interesting ocean phenomena in the numerical simulation that would otherwise be difficult to discover.

Scientists making satellite and in situ ocean observations can use the results from the simulation to better understand the observations and what they tell us about the ocean’s role in our planet’s weather and climate. The ultimate goal is to create a global, full-depth, time-evolving description of ocean circulation that is consistent with the model equations as well as with all the available observations.

“The ocean is vast and there are still a lot of unknowns. We still can’t represent all the conditions and are pushing the boundaries of current supercomputer power,” said Menemenlis. “This is an exciting time to be an oceanographer who can use satellite observations and numerical simulations to push our understanding of ocean circulation forward.”

Source: SC17


Intel FPGAs Power Acceleration-as-a-Service for Alibaba Cloud

Thu, 10/19/2017 - 08:03

Oct. 19, 2017 — Intel today announced that Intel field programmable gate arrays (FPGAs) are now powering the Acceleration-as-a-Service of Alibaba Cloud, the cloud computing arm of Alibaba Group. The acceleration service, which can be launched from the Alibaba Cloud website, enables customers to develop and deploy accelerator solutions in the cloud for Artificial Intelligence inference, video streaming analytics, database acceleration and other fields where intense computing is required.

The Acceleration-as-a-Service with Intel FPGAs, also known as Alibaba Cloud’s F1 Instance, provides users access to cloud acceleration in a pay-as-you-go model, with no need for upfront hardware investments.

“Intel FPGAs offer us a more cost-effective way to accelerate cloud-based application performance for our customers who are running business applications and demanding data and scientific workloads,” said Jin Li, vice president of Alibaba Cloud. “Another key value of FPGAs is that they provide high performance at low power, and the flexibility for managing diverse computing workloads.”

“Our collaboration with Alibaba Cloud brings forward FPGA-based accelerator capabilities and tools that will be offered to developers and end users as they work on large and intense computing workloads,” said John Sakamoto, vice president, Communications and Data Center Solutions, Intel Programmable Solutions Group. “A public cloud environment offers developers a place to start the FPGA journey, with virtually no initial capital outlay and a low-risk environment to experiment, that can scale to meet growing capacity requirements.”

As part of the Intel deployment, Alibaba Cloud users will have access to the Acceleration Stack for Intel Xeon CPU with FPGAs, which offers a common developer interface, abstracted hardware design, and development tools that support hardware or software development flows (OpenCL or RTL) that the developer is most familiar with. Users will also have access to a rich ecosystem of IP for genomics, machine learning, data analytics, cyber security, financial computation and video transcoding.
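
As a taste of the OpenCL development flow mentioned above, the sketch below runs a generic vector-add kernel through PyOpenCL. It is not specific to Alibaba Cloud's F1 instances or Intel's Acceleration Stack, and targeting an FPGA rather than a CPU or GPU would additionally involve an offline kernel-compilation step not shown here.

```python
import numpy as np
import pyopencl as cl

# Generic OpenCL vector add via PyOpenCL; illustrative of the host/kernel flow only.
a = np.random.rand(1024).astype(np.float32)
b = np.random.rand(1024).astype(np.float32)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags

a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

program = cl.Program(ctx, """
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global float *out) {
    int i = get_global_id(0);
    out[i] = a[i] + b[i];
}
""").build()

program.vadd(queue, a.shape, None, a_buf, b_buf, out_buf)

out = np.empty_like(a)
cl.enqueue_copy(queue, out, out_buf)
print(np.allclose(out, a + b))   # True if the kernel ran correctly
```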

Source: Intel


Dassault Systèmes’ Living Heart Project Reaches Next Milestones

Wed, 10/18/2017 - 10:31

HOLLYWOOD, Fla., Oct. 18, 2017 — Dassault Systèmes (Paris:DSY) (Euronext Paris: #13065, DSY.PA) today outlined, at the 3DEXPERIENCE Forum North America, multiple milestones in its Living Heart Project aimed to drive the creation and use of simulated 3D personalized hearts in the treatment, diagnosis and prevention of heart diseases. As the scientific and medical community seeks faster and more targeted ways to improve patient care, the Living Heart Project is extending its reach through new partnerships and applications while lowering the barriers to access.

The Living Heart is now available through the 3DEXPERIENCE platform on the cloud, offering the speed and flexibility of high-performance computing (HPC) to even the smallest medical device companies. Any life sciences company can immediately access a complete, on-demand HPC environment to scale up virtual testing securely and collaboratively while managing infrastructure costs. This also crosses an important boundary toward the use of the Living Heart directly in a clinical setting.

“Medical devices need thousands of tests in the development stage,” said Joe Formicola, President and Chief Engineer, Caelynx. “With the move of the Living Heart to the cloud, effectively an unlimited number of tests of a new design can be carried out simultaneously using the simulated heart rather than one at a time, dramatically lowering the barrier to innovation, not to mention the time and cost.”

Since signing a 5-year agreement with the FDA in 2014, Dassault Systèmes continues to align with the regulatory agency on the use of simulation and modeling to accelerate approvals. Bernard Charles, CEO and vice chairman of the board of directors of Dassault Systèmes, gave the keynote at the 4th Annual FDA Scientific Computing Day in October 2016. Later, in July 2017, FDA Commissioner Dr. Scott Gottlieb publicly outlined the FDA plan to help consumers capitalize on advances in science stating, “Modeling and simulation plays a critical role in organizing diverse data sets and exploring alternate study designs. This enables safe and effective new therapeutics to advance more efficiently through the different stages of clinical trials.”

The Living Heart Project has grown to more than 95 member organizations worldwide including medical researchers, practitioners, device manufacturers and regulatory agencies united in a mission of open innovation to solve healthcare challenges. The project has supported 15 research grant proposals by providing access to the model, associated technologies and project expertise. Novel use of the model to understand heart disease and study the safety and effectiveness of medical devices has appeared in eight articles published in peer-reviewed journals to date.

For the first time, the Living Heart was used to simulate detailed drug interactions affecting the entire organ function. Researchers at Stanford University working with UberCloud recently used the Living Heart as a platform for a model that would enable pharmaceutical companies to test drugs for the risk of inducing cardiac arrhythmias, the leading negative side effect preventing FDA approval.

“The Living Heart Project is a strategic part of a broader effort by Dassault Systèmes to leverage its advanced simulation applications to push the boundaries of science,” said Jean Colombel, Vice President Life Sciences, Dassault Systèmes. “By creating both a community and a transformational platform, we are beginning to see the advances from the Living Heart Project being used for additional aspects of cardiovascular research as well as for other parts of the body, for example the brain, the spine, the foot, and the eye, to reach new frontiers in patient care.”

Source: Dassault Systèmes


IBM Reports 2017 Third-Quarter Results

Wed, 10/18/2017 - 08:32

ARMONK, N.Y., Oct. 18, 2017 — IBM (NYSE: IBM) has announced third-quarter earnings results.

“In the third quarter we achieved double-digit growth in our strategic imperatives, extended our enterprise cloud leadership, and expanded our cognitive solutions business,” said Ginni Rometty, IBM chairman, president and chief executive officer.  “There was enthusiastic adoption of IBM’s new z Systems mainframe, which delivers breakthrough security capabilities to our clients.”

THIRD QUARTER 2017

                                        Diluted EPS   Net Income   Gross Profit Margin
  GAAP from Continuing Operations       $2.92         $2.7B        45.9%
     Year/Year                          -2%           -4%          -0.9 Pts
  Operating (Non-GAAP)                  $3.30         $3.1B        47.6%
     Year/Year                          0%            -2%          -0.4 Pts

  REVENUE
                                        Total IBM   Strategic Imperatives   Cloud    As-a-service annual exit run rate
  As reported (US$)                     $19.2B      $8.8B                   $4.1B    $9.4B
     Year/Year                          0%          11%                     20%      25%
     Year/Year adjusting for currency   -1%         10%                     20%      24%

“During the first three quarters of the year, our strong free cash flow has enabled us to maintain our R&D investments and to expand IBM’s cloud and cognitive capabilities through capital investments,” said Martin Schroeter, IBM senior vice president and chief financial officer.  “In addition, we have returned nearly $8 billion to shareholders through dividends and share repurchases.”

Strategic Imperatives Revenue

Third-quarter cloud revenues increased 20 percent to $4.1 billion.  Cloud revenue over the last 12 months was $15.8 billion, including $8.8 billion delivered as-a-service and $7.0 billion for hardware, software and services to enable IBM clients to implement comprehensive cloud solutions.  The annual exit run rate for as-a-service revenue increased to $9.4 billion from $7.5 billion in the third quarter of 2016.  In the quarter, revenues from analytics increased 5 percent.  Revenues from mobile increased 7 percent and revenues from security increased 51 percent (up 49 percent adjusting for currency).

Full-Year 2017 Expectations

The company continues to expect operating (non-GAAP) diluted earnings per share of at least $13.80 and GAAP diluted earnings per share of at least $11.95.  Operating (non-GAAP) diluted earnings per share exclude $1.85 per share of charges for amortization of purchased intangible assets, other acquisition-related charges and retirement-related charges.  IBM continues to expect free cash flow to be relatively flat year to year.

Cash Flow and Balance Sheet

In the third quarter, the company generated net cash from operating activities of $3.6 billion, or $3.3 billion excluding Global Financing receivables.  IBM’s free cash flow was $2.5 billion.  IBM returned $1.4 billion in dividends and $0.9 billion of gross share repurchases to shareholders.  At the end of September 2017, IBM had $1.5 billion remaining in the current share repurchase authorization.

IBM ended the third quarter of 2017 with $11.5 billion of cash on hand.  Debt totaled $45.6 billion, including Global Financing debt of $29.4 billion.  The balance sheet remains strong and is well positioned over the long term.

Segment Results for Third Quarter

  • Cognitive Solutions (includes solutions software and transaction processing software) —revenues of $4.4 billion, up 4 percent (up 3 percent adjusting for currency), driven by solutions software, including security and analytics, and transaction processing software.
  • Global Business Services (includes consulting, global process services and application management) — revenues of $4.1 billion, down 2 percent.  Strategic imperatives revenue grew 10 percent led by the cloud practice.
  • Technology Services & Cloud Platforms (includes infrastructure services, technical support services and integration software) — revenues of $8.5 billion, down 3 percent (down 4 percent adjusting for currency).  Strategic imperatives revenue grew 12 percent, driven by hybrid cloud services, security and mobile.
  • Systems (includes systems hardware and operating systems software) — revenues of $1.7 billion, up 10 percent, driven by growth in z Systems and storage.
  • Global Financing (includes financing and used equipment sales) — revenues of $427 million, up 4 percent (up 3 percent adjusting for currency).

Expense and Other Income

Third-quarter GAAP expense and other income year-to-year performance reflects lower IP income of $221 million, an impact of $105 million year to year related to several commercial disputes and a benefit of $91 million resulting from the favorable resolution of pension-related litigation in the U.K.

Operating (non-GAAP) expense and other income for the third quarter of 2017 compared to 2016 reflects lower IP income of $221 million and an impact of $105 million year to year related to several commercial disputes.

Tax Rate

IBM’s third-quarter effective GAAP and operating (non-GAAP) tax rates were 11.0 percent and 14.7 percent, respectively.  The company continues to expect a full-year effective operating (non-GAAP) tax rate of 15 percent, plus or minus 3 points, excluding discrete items.

Year-To-Date 2017 Results

Consolidated diluted earnings per share were $7.24 compared to $7.67, down 6 percent year to year.  Consolidated net income was $6.8 billion compared to $7.4 billion in the year-ago period, a decrease of 8 percent.  Revenues from continuing operations for the nine-month period totaled $56.6 billion, a decrease of 3 percent year to year (decrease of 2 percent adjusting for currency) compared with $58.1 billion for the first nine months of 2016.

Operating (non-GAAP) diluted earnings per share from continuing operations were $8.64 compared with $8.59 per diluted share for the 2016 period, an increase of 1 percent.  Operating (non-GAAP) net income for the nine months ended September 30, 2017 was $8.1 billion compared with $8.3 billion in the year-ago period, a decrease of 2 percent.

Source: IBM


Multiscale Coupled Urban Systems Project to Use Exascale Computing to Aid City Management

Wed, 10/18/2017 - 08:23

Oct. 18, 2017 — Walk around any city neighborhood and chances are it looks nothing like it did 20 years ago. Thanks to growing urbanization, cities globally are rapidly expanding and accounting for more of our world’s population, gross domestic product and greenhouse gases.

Adapting a city to keep up with evolving needs is one of the greatest daily challenges that city planners, designers and managers face. They must consider how proposed changes will affect systems and processes such as our power grid, green spaces and public health facilities. They also need to understand how these systems and processes will influence each other.

Charlie Catlett wants to make their job easier by using the power of exascale – supercomputers that will be at least 50 times faster than those in use today. Catlett, a senior computer scientist at the U.S. Department of Energy’s (DOE) Argonne National Laboratory and a senior fellow at the Computation Institute, a joint institute of Argonne and the University of Chicago, leads the Multiscale Coupled Urban Systems project, which will create a computational framework for urban developers and planners to evaluate integrated models of city systems and processes.

With this framework, city planners can better examine complex systems, understand the relationships between them and predict how changes will affect them. It can ultimately help officials identify the best solutions to benefit urban communities.

“We’re focused on coupling models for urban atmosphere, building energy, socioeconomic activity and transportation, and we will later expand to energy systems models,” Catlett said. “The framework will define what data will be exchanged between these models and how that data will be structured.”

Once the framework is complete, city planners such as those within the City of Chicago’s Department of Planning and Development can work with researchers to answer questions, raise their own and optimize design proposals.

“It’s a whole new frontier for us,” said Eleanor Gorski, the deputy commissioner of planning, design and historic preservation for the City of Chicago’s Department of Planning and Development.

“I think the most valuable aspect for us in city planning is being able to see how different conditions and parameters can affect different systems,” said Gorski. “For example, if you have a building that is 10 stories and the developers want to add five stories, one of the things we’d want to know is what effect that will have on transportation. Is it going to cause congestion? What we don’t have, and what I’m interested in, are those links between the data and the influence that one system has over another.”

Two models that Catlett and his collaborators are working to couple are EnergyPlus, a DOE program to model the energy demands of buildings, and Nek5000, a turbulence model that will track heat and airflow going through a city.

By pairing these two, researchers can, for example, capture how variations in local climate can influence heat transfer, ventilation and energy demands. From there, policy experts could propose ways to improve structure design in future developments.

First, however, researchers must determine what kind of information to share between models. Temperature, for example, is something Nek5000 could send to EnergyPlus, since air temperature naturally affects the temperature along building surfaces, as well as heating and cooling costs. Yet even though the systems these models describe are interconnected, today most such models run independently and are not generally coupled with one another, Catlett said.

The coupling framework will also aim to incorporate data from sensory devices, like those used in Argonne’s urban Array of Things project. These sensors measure key components of the environment, such as ultraviolet and infrared light, cloud cover, temperature and humidity. These measurements can validate and improve existing models.

“The framework is key to solving these problems. It will essentially act as a data cache (short-term storage) through which a model can feed and receive information from another model or obtain data from sensory devices,” Catlett said.
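The data-cache pattern Catlett describes can be pictured as a small publish/fetch layer sitting between models. The Python sketch below is only an illustration of that idea; the class, method and field names are hypothetical and are not drawn from the project’s actual framework.

```python
class Coupler:
    """Toy stand-in for a coupling framework: models publish named fields for a
    given simulated time, and other models (or sensor feeds) fetch them."""

    def __init__(self):
        self._cache = {}  # (field_name, sim_time) -> data

    def publish(self, field_name, sim_time, data):
        self._cache[(field_name, sim_time)] = data

    def fetch(self, field_name, sim_time):
        return self._cache[(field_name, sim_time)]


coupler = Coupler()
# A hypothetical airflow model publishes surface air temperatures for hour 12...
coupler.publish("surface_air_temperature_K", 12, {"building_17": 301.4})
# ...and a building-energy model reads them to set its boundary conditions.
boundary = coupler.fetch("surface_air_temperature_K", 12)
print(boundary)
```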

One of the challenges is that simulations of models run at different rates. For example, simulating one hour of time with an atmospheric model may take a day of computing, while simulating the same amount of time with a building energy model may take half a second. To overcome this problem, researchers are examining various techniques.

“We’re exploring ways to match speeds by experimenting with the resolution of the simulations and by redistributing the resources on the machines, for example, having the more time-intensive simulation run on more computer cores than the less time-intensive one,” Catlett said.
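One way to picture the resource-redistribution idea is to hand each coupled component a share of the core budget proportional to its computational cost, so that the components advance in simulated time at roughly the same wall-clock rate. The sketch below uses invented cost figures purely for illustration; it is not the project’s actual scheduling strategy.

```python
def balance_cores(total_cores, cost_per_sim_hour):
    """Split a core budget in proportion to each model's cost (core-seconds per
    simulated hour) so coupled components finish a simulated hour in roughly
    the same wall-clock time. Rounding is kept deliberately simple."""
    total_cost = sum(cost_per_sim_hour.values())
    return {name: max(1, round(total_cores * cost / total_cost))
            for name, cost in cost_per_sim_hour.items()}


# Illustrative (not measured) costs: the atmospheric model is far more
# expensive per simulated hour than the building-energy model.
print(balance_cores(4096, {"atmosphere": 86400.0, "building_energy": 0.5}))
```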

Researchers are also examining how to make the framework flexible enough to handle a wide variety of models. With a more broad-based design, developers can use the framework to answer many different kinds of questions.

“To couple models, you’d traditionally have a laboratory such as Argonne or Oak Ridge develop a custom package. The problem is that it ends up being so specific that others can’t work with it, even if they’re trying to address similar questions. In that case, they have to get another custom package developed to address their study,” Catlett said.

“With our framework, we can eliminate this duplication of effort, but only if we design it in a general way such that other researchers can plug in their model with any of the others,” he said.

This project is funded by, and is one of the applications of, the Exascale Computing Project (ECP), a collaborative effort of the DOE Office of Science and the National Nuclear Security Administration that seeks to provide breakthrough modeling and simulation solutions through exascale computing.

Laboratories participating in the Multiscale Coupled Urban Systems project include Argonne National Laboratory, Lawrence Berkeley National Laboratory, National Renewable Energy Laboratory, Oak Ridge National Laboratory and Pacific Northwest National Laboratory.

The Array of Things project is supported by the National Science Foundation, with additional support from Argonne National Laboratory and the Chicago Innovation Exchange.

Source: Joan Koka, Argonne National Laboratory

The post Multiscale Coupled Urban Systems Project to Use Exascale Computing to Aid City Management appeared first on HPCwire.

U.S. Industries to Benefit from Exascale Computing

Wed, 10/18/2017 - 08:11

Oct. 18, 2017 — Computer-aided design and engineering have come a long way since the first advanced CAD and CAE software programs appeared in the 1970s, and as manufacturing techniques, modeling and simulation have become increasingly complex over the years, computing power has had to keep up to meet the demand.

The need for exascale computing to handle the advanced physics and massive data sizes of today’s multimodal simulations is perhaps nowhere more apparent than in the product development industry. At Altair, a global software development and services company headquartered in Michigan with more than 2,600 employees and 64 offices worldwide, high-performance computing (HPC) is essential to providing the company and its clients with the tools to optimize product design. Altair is a member organization of the DOE’s Exascale Computing Project (ECP) Industry Council, an external advisory group of prominent US companies helping to define the industrial computing requirements for a future exascale ecosystem.

Exascale will impact a wide range of US industries.

Through its proprietary CAE software suite HyperWorks and its HPC workload management solution PBS Works, Altair relies on HPC to explore the vast design space afforded by advanced manufacturing processes, and to study the physics behind the designs to validate them. Increasingly, Altair’s 5,000-plus customers — in industries ranging from automotive and aerospace to heavy equipment and high-end electronics — need simulations that combine multiple physics-based solvers to predict performance, including structural optimization, electromagnetics, and computational fluid dynamics.

One example is the auto industry, which is designing cars to adhere to stricter carbon emissions guidelines. Meeting these standards requires manufacturing lightweight vehicles that are also strong enough to meet crash ratings, meaning engineers need to simultaneously model processes such as fluid-structural interaction, thermal interaction and crash dynamics. Typically, these multidisciplinary simulations take a long time and use a lot of computational power. To truly optimize these combined studies, and get the results back quickly, Altair and other industry leaders will need a higher level of computation than is available today, according to the company’s Chief Technical Officer Sam Mahalingam.

“The need for exascale really becomes extremely important because the size and complexity of the model increases as you do multiphysics simulations,” Mahalingam said. “This is a lot more complex model that allows you to truly understand what the interference and interactions are from one domain to another. In my opinion, exascale is truly going to contribute to capability computing in solving problems we have not solved before, and it’s going to make sure the products are a lot more optimized and introduced to the market a lot faster.”

Multiphysics simulations also generate tremendous amounts of data. When launching a product, manufacturers typically go through several iterations of simulations, creating file sizes too large to download to desktop computers. While Altair has a large infrastructure of high performance machines to store data for validation and support its cloud-based storage, the sheer amount of data stretches the limits of existing hardware. Exascale machines might be able to store the data where it is generated and enable engineers to visualize it remotely, Mahalingam said.

“The data you’re going to get cannot be visualized without exascale computing power and without parallelization,” he said.

While any product that is engineered or designed could benefit from exascale computing, Mahalingam said, it could be most transformational in industries where prototyping is difficult or impossible, such as aerospace or shipbuilding. Currently, companies in these industries must set up internal laboratories to test designs, which can be extremely cost-prohibitive. Exascale would allow for virtual labs that could completely simulate the physical experience, Mahalingam said, and instead of having to do individual studies sequentially, they could be done in parallel, saving time for engineers.

The benefits of exascale could even extend after a product launch, Mahalingam explained, when companies typically obtain real-world operational data and perform simulations to determine the remaining usable life of their products. If product developers could get the answer back in seconds instead of days, Mahalingam said, it could enhance preventative maintenance. “By superimposing the real-world operational data onto a digital model, we will be able to come back and predict where/when this part is going to fail depending on its design requirements.”

“Today we model everything first and then we basically validate that model. But can we turn it around?” Mahalingam said. “Based on the real-world operational data we’re collecting, can we truly come out with a data-driven model, a prescribed model, as a starting point that we can say will deliver a design a lot faster?”

Mahalingam said exascale will be “critical” to running the deep learning and machine learning algorithms necessary to create data-driven models that are much closer to a final, polished model. Also, it will allow engineers to shrink the design space instantly because it will incorporate historical data. The result, Mahalingam said, is that engineers will be freed up to think about more complex problems to solve, and in turn come up with more innovative products.

To stay competitive, Mahalingam said, product development companies will need to scale up solvers and make sure multiphysics simulations work on next-generation systems. In preparation, Altair is already looking at newer programming paradigms such as CHARM++ and PMIx, as well as middleware designed for exascale applications. The company is exploring scheduling that will cater to exascale and is keeping a close watch on hardware announcements.

Logistically, the move to exascale isn’t without its challenges in hardware, applications and software, Mahalingam said. Hardware will be challenged in meeting higher performance standards while using less power. As computing moves beyond Moore’s Law, software will need to be highly parallelized, and the onus will fall on resource managers to perform dynamic scheduling and place computing jobs as fast as they can to make full use of exascale capability. Systems will also need to be more “fault-tolerant,” Mahalingam said, and less dependent on a single node.

More broadly, exascale computing will likely shift the paradigm away from capacity computing (brute force/trial and error) to cognitive computing, Mahalingam said. From a national perspective, he added, exascale could have widespread implications, not just in manufacturing, but also life sciences, personalized medicine and agriculture.

“It’s all about real time simulations, predicting what’s going to happen, and prescribing what needs to be done to make sure failures can be avoided or preempted,” Mahalingam said. “This is much bigger than any one company or any one industry. If you consider any industry, exascale is truly going to have a sizeable impact, and if a country like ours is going to be a leader in industrial design, engineering and manufacturing, we need exascale to keep the innovation edge.”

Source: Jeremy Thomas, Lawrence Livermore National Laboratory

The post U.S. Industries to Benefit from Exascale Computing appeared first on HPCwire.

LIGO and Virgo Detect Gravitational Waves from Colliding Neutron Stars

Tue, 10/17/2017 - 12:05

Oct. 17, 2017 — For the first time, scientists have directly detected gravitational waves—ripples in space and time—in addition to light from the spectacular collision of two neutron stars. This marks the first time that a cosmic event has been viewed in both gravitational waves and light.

The discovery was made using the U.S.-based Laser Interferometer Gravitational-Wave Observatory (LIGO); the Europe-based Virgo detector; and some 70 ground- and space-based observatories.

Neutron stars are the smallest, densest stars known to exist and are formed when massive stars explode in supernovas. As these neutron stars spiraled together, they emitted gravitational waves that were detectable for about 100 seconds; when they collided, a flash of light in the form of gamma rays was emitted and seen on Earth about two seconds after the gravitational waves. In the days and weeks following the smashup, other forms of light, or electromagnetic radiation—including X-ray, ultraviolet, optical, infrared, and radio waves—were detected.

The observations have given astronomers an unprecedented opportunity to probe a collision of two neutron stars. For example, observations made by the U.S. Gemini Observatory, the European Very Large Telescope, and NASA’s Hubble Space Telescope reveal signatures of recently synthesized material, including gold and platinum, solving a decades-long mystery of where about half of all elements heavier than iron are produced.

The LIGO-Virgo results are published today in the journal Physical Review Letters; additional papers from the LIGO and Virgo collaborations and the astronomical community have been either submitted or accepted for publication in various journals.

“It is tremendously exciting to experience a rare event that transforms our understanding of the workings of the universe,” says France A. Córdova, director of the National Science Foundation (NSF), which funds LIGO. “This discovery realizes a long-standing goal many of us have had, that is, to simultaneously observe rare cosmic events using both traditional as well as gravitational-wave observatories. Only through NSF’s four-decade investment in gravitational-wave observatories, coupled with telescopes that observe from radio to gamma-ray wavelengths, are we able to expand our opportunities to detect new cosmic phenomena and piece together a fresh narrative of the physics of stars in their death throes.”

“This new observation by LIGO begins a new era of multi-messenger astronomy, combining gravitational waves with optical observations,” said Dr. William “Bill” Gropp, director of the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. “NCSA is proud to be part of the LIGO consortium and especially for our contribution in the use of HPC to model and understand that event.”

To read more, see the original article: http://www.ncsa.illinois.edu/news/story/ligo_and_virgo_make_first_detection_of_gravitational_waves_produced_by_coll

Source: National Center for Supercomputing Applications

The post LIGO and Virgo Detect Gravitational Waves from Colliding Neutron Stars appeared first on HPCwire.

Flexible Elastic Scaling with Stability and Without Compromise

Tue, 10/17/2017 - 10:45

This Server StorageIO® Industry Trends Perspective report looks at common issues and trends and at how to address different application server storage I/O challenges. In this report, we look at WekaIO Matrix™, an elastic, flexible, highly scalable, easy-to-use (and manage) software-defined (e.g., software-based) storage solution. The report applies to environments with multi-dimensional server, storage and I/O management challenges, including applications with diverse performance, availability, capacity and economic (PACE) needs that also require scaling with ease as well as stability.

The post Flexible Elastic Scaling with Stability and Without Compromise appeared first on HPCwire.

Where Security Meets High Performance Computing

Tue, 10/17/2017 - 10:24

As its power increases and its cost declines, High Performance Computing (HPC) is making an impact on the security field. The ability to use parallel processing to run at speeds of a teraflop or higher is now contributing to improved security in airports, online and elsewhere. At the same time, HPC itself creates a number of new security risks for organizations that employ it. This article looks at HPC’s impact on security. It also explores HPC’s own vulnerabilities and discusses how new solutions from Dell EMC and Intel help address them.

The Impact of HPC on Security

HPC has the potential to improve security on both physical and digital fronts. With airport security, for example, HPC makes it possible to analyze and correlate vast amounts of disparate data in a rapid time cycle. An HPC-assisted airport security system is able to compare results from facial recognition systems, other video input, flight data, “watch list” data and threat intelligence sources to identify security risks as they arise. It would be impossible for a person to put all these data sources together manually, nor would it even be feasible with standard computers. It’s early in the lifecycle for these sorts of HPC-driven physical security technologies, but their drivers include lower-cost, higher-performing HPC clusters along with advances in APIs and integration.

The finance world presents another example of HPC’s applicability for security and crime prevention. Just as HPC can be used for high-frequency stock trading, the technology can also be put to work detecting a number of fraudulent trading practices. Trading irregularities like “front running,” where a financial institution trades on its own account ahead of its clients, can be extremely hard to detect in a busy trading environment. False positives abound. HPC gives fraud analysts and compliance managers sharper tools to use in the kind of sophisticated pattern recognition it takes to identify true fraud and distinguish it from other trades occurring simultaneously.

Online security benefits from HPC as well. Like the airport, a large social site, for example, may have tens of millions of people interacting with one another at any given moment. Most likely, 99.999% of these interactions are completely innocent. Yet, as we have seen, there are some situations where predatory interactions occur online but escape detection. HPC gives social networks and other online venues the ability to analyze interactions and profiles quickly enough to block harmful or illegal activity.

Security Issues in HPC

The nature of HPC, as well as the settings in which it’s used, exposes it to a variety of security risks. For one thing, when HPC is a key part of a security system, as in the airport scenario, it becomes a high-profile target of attack. Other risks arise from the following factors:

  • Openness – HPC is often used in scientific research and government settings. In a research institution like a university, there may be fewer controls over system access as well as back-end administrative access. Data sources may not be as well governed or secured as they might be in a private enterprise. Alternatively, the system may be accessible by people from multiple entities, given the open and sharing spirit of much research.
  • Distributed data sources – Security in “big data” is related to security for HPC, as the two technologies overlap much of the time. The creation of “data lakes” used in big data, powered by HPC, may not have adequate security controls in place. Reasons for these deficiencies vary, but they often emerge from an ad-hoc creation of a data repository, perhaps from multiple sources. A research project might, for example, merge data streams from inputs as diverse as weather reporting, financial markets, device logs, geological instruments and so forth.
  • Clusters – HPC environments are typically clustered, an architecture which exposes them to multiple risks. Given their generally heterogeneous nature, clustered HPC systems may require multiple management systems to operate. This can slow down the implementation of security policies and processes like security patch management – with vulnerabilities un-remediated as a result.
Mitigating HPC Risk at the Hardware Level

Dell EMC and Intel have devised numerous countermeasures to protect against HPC security risks. Dell EMC PowerEdge servers embed new hardware and system-level security features. They make it possible to recover to a trusted base in the event of a breach. PowerEdge servers are also designed to prevent unauthorized or inadvertent changes to their configurations through System Lockdown. An industry-first, System Lockdown prevents security-weakening changes to an HPC system’s administrative backend.

Other Dell EMC security features like SecureBoot, BIOS Recovery capabilities, signed firmware and iDRAC RESTful API (compliant with Redfish standards) provide further cyber protections. Dell EMC System Erase ensures data privacy by quickly and securely erasing user data from drives or wiping all non-volatile media when an HPC system is retired.

Intel processors form the basis for robust, vulnerability-resistant platforms. Security features are embedded in each processor, including the “Skylake” family of processors used in Dell EMC HPC solutions. These include Intel® Identity Protection Technology, Intel® Advanced Encryption Standard New Instructions (Intel® AES–NI) and Intel® Trusted Execution Technology.

Conclusion

As new HPC capabilities transform the security field, the technology also exposes organizations to new security risks. Common HPC traits like openness, clusters and distributed data sources create potential vulnerabilities for data breaches. Defending HPC requires a multi-layered approach. The hardware itself, however, should play a key role in mitigating HPC security risks. Dell EMC PowerEdge addresses this need by offering servers and processors with built-in security capabilities. It’s a never-ending, constantly evolving task. Dell EMC is committed to working with the HPC community to devise security solutions for HPC environments.

The post Where Security Meets High Performance Computing appeared first on HPCwire.

2017 MacArthur Fellows Class Includes 2 Computer Scientists

Tue, 10/17/2017 - 09:49

Oct. 17, 2017 — The MacArthur Foundation recently announced its 2017 MacArthur Fellows – 24 individuals whose achievements show “exceptional creativity, promise for important future advances based on a track record of significant accomplishments, and potential for the fellowship to facilitate subsequent creative work.”

The MacArthur Fellows program grants each recipient a no-strings-attached stipend of $625,000 to support his or her own creative and professional ambitions. The program features scientists, artists, historians, and writers.

The 2017 Fellows class features two computer scientists: Regina Barzilay, Delta Electronics professor and a member of the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology, and Stefan Savage, professor of computer science and engineering at the University of California, San Diego.

Dr. Barzilay was selected for her work on machine learning methods that allow computers to read and analyze unstructured documents. Language decipherment can take decades when done by a human. Barzilay created algorithms that can rapidly decipher the ancient Semitic language of Ugaritic by using its similarities to Hebrew. Her recent work has used machine learning to interpret and evaluate oncology medical documents to create a database for preventative methods and treatment.

Dr. Savage researches cybersecurity and cybercrime, using an interdisciplinary method that considers the economic and social context of crime in addition to technological solutions. One of Dr. Savage’s projects focused on email spam – rather than trying to block spam messages outright, he focused on undermining their profitability. After finding that a small number of banks processed the bulk of the associated transactions, the various stakeholders were able to track and shut down those bank accounts. Dr. Savage was also a recent participant in the CCC workshop series on Sociotechnical Cybersecurity.

Original article: http://www.cccblog.org/2017/10/16/2017-macarthur-fellows-class-includes-2-computer-scientists/

Source: Khari Douglas, CCC Blog

The post 2017 MacArthur Fellows Class Includes 2 Computer Scientists appeared first on HPCwire.

Jesús Labarta Recognized with ACM-IEEE CS Ken Kennedy Award

Tue, 10/17/2017 - 09:44

NEW YORK, Oct. 17, 2017 — The Association for Computing Machinery (ACM) and IEEE Computer Society (IEEE CS) have named Jesús Labarta of the Barcelona Supercomputing Center (BSC) and Universitat Politècnica de Catalunya (UPC) as the recipient of the 2017 ACM-IEEE CS Ken Kennedy Award.  Labarta is recognized for his seminal contributions to programming models and performance analysis tools for high performance computing. The award will be presented at SC17: The International Conference for High Performance Computing, Networking, Storage and Analysis, November 12-17, in Denver, Colorado.

Throughout his career, Labarta has developed tools for scientists and engineers working in parallel programming.  In the programming models area, he made fundamental contributions to the concept of asynchronous task-based models and intelligent runtime systems. In Labarta’s approach, the programmer uses pragma directives to mark the regions of code that constitute tasks and to declare the directionality of the data they use, giving an intelligent runtime system a unified mechanism for detecting and exploiting concurrency as well as managing locality. These ideas have been developed by Labarta’s team in the OmpSs model and the Nanos runtime.  His team’s work has also enhanced the interoperability between OmpSs (many of whose ideas were later adopted in OpenMP) and the Message Passing Interface (MPI).
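As a rough illustration of the declared-directionality idea (not OmpSs or Nanos themselves, and with none of their syntax), the Python sketch below lets each task name the data it reads and writes; a toy runtime derives the dependency graph from those declarations and runs independent tasks concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyTaskRuntime:
    """Minimal stand-in for a task-based runtime: each task declares the data
    it reads (ins) and writes (outs); the runtime infers dependencies from
    those declarations and lets independent tasks execute concurrently."""

    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._last_writer = {}  # data name -> future of the task producing it

    def task(self, fn, ins=(), outs=()):
        deps = [self._last_writer[n] for n in ins if n in self._last_writer]

        def run():
            for dep in deps:      # block until every declared input is ready
                dep.result()
            return fn()

        future = self._pool.submit(run)
        for n in outs:            # this task is now the producer of its outputs
            self._last_writer[n] = future
        return future

    def wait_all(self):
        self._pool.shutdown(wait=True)


rt = ToyTaskRuntime()
rt.task(lambda: print("produce a"), outs=["a"])
rt.task(lambda: print("use a, produce b"), ins=["a"], outs=["b"])
rt.task(lambda: print("use a, produce c"), ins=["a"], outs=["c"])  # can overlap with b
rt.task(lambda: print("combine b and c"), ins=["b", "c"])
rt.wait_all()
```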

In the performance tools area, Labarta’s team develops and distributes open-source Barcelona Supercomputing Center (BSC) tools that are employed throughout the field. These BSC tools are designed to analyze an application’s behavior and identify issues that may impact performance. Paraver, the most widely used BSC tool, is a trace-based performance analyzer that processes execution traces and extracts information from them. Other tools, such as Dimemas or the Performance Analytics modules developed by Labarta’s team, help extract relevant insights and perform predictive analyses from the raw performance data captured by the instrumentation packages.
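To give a flavor of what trace-based analysis involves (this is not Paraver, and the trace format is invented for the example), the sketch below reads timestamped enter/exit events and reports how much time each code region spent on each thread.

```python
from collections import defaultdict


def time_per_region(events):
    """Accumulate time spent in each (thread, region) from a list of
    (timestamp_sec, thread_id, 'enter' or 'exit', region_name) trace events.
    Assumes non-overlapping enter/exit pairs per (thread, region); the format
    is made up for this example, not Paraver's."""
    totals = defaultdict(float)
    open_since = {}  # (thread, region) -> timestamp of the matching 'enter'
    for ts, thread, kind, region in sorted(events):
        key = (thread, region)
        if kind == "enter":
            open_since[key] = ts
        else:  # "exit"
            totals[key] += ts - open_since.pop(key)
    return dict(totals)


trace = [
    (0.00, 0, "enter", "solver"), (0.80, 0, "exit", "solver"),
    (0.10, 1, "enter", "mpi_wait"), (0.75, 1, "exit", "mpi_wait"),
]
print(time_per_region(trace))  # roughly {(0, 'solver'): 0.8, (1, 'mpi_wait'): 0.65}
```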

Labarta is Director of the Computer Science Department at the Barcelona Supercomputing Center and a Professor of Computer Architecture at the Universitat Politècnica de Catalunya.  From 1996 to 2004 he served as the Director of the European Center of Parallelism of Barcelona (CEPBA). He has published more than 250 articles in conferences and journals in areas including high performance architectures and systems software.  He has been involved in research and cooperation with many leading companies on HPC-related topics. Currently Labarta is the leader of the Performance Optimization and Productivity EU Center of Excellence, where more than 100 users (both academic and SMEs) from a very wide range of application sectors receive performance assessments and suggestions for code refactoring efforts.

ACM and the IEEE Computer Society co-sponsor the Kennedy Award, which was established in 2009, to recognize substantial contributions to programmability and productivity in computing and significant community service or mentoring contributions. It was named for the late Ken Kennedy, founder of Rice University’s computer science program and a world expert on high performance computing. The Kennedy Award carries a US $5,000 honorarium endowed by the SC Conference Steering Committee.

About ACM

ACM, the Association for Computing Machinery www.acm.org, is the world’s largest educational and scientific computing society, uniting computing educators, researchers and professionals to inspire dialogue, share resources and address the field’s challenges. ACM strengthens the computing profession’s collective voice through strong leadership, promotion of the highest standards, and recognition of technical excellence. ACM supports the professional growth of its members by providing opportunities for life-long learning, career development, and professional networking.

About IEEE Computer Society

IEEE Computer Society, www.computer.org, is one of the world’s leading computing membership organizations and a trusted information and career-development source for a global workforce of technology leaders including: professors, researchers, software engineers, IT professionals, employers, and students. IEEE Computer Society provides high-quality, state-of-the-art information on an on-demand basis. The Computer Society provides a wide range of forums for top minds to come together, including technical conferences, publications, a comprehensive digital library, unique training webinars, and professional training. IEEE is the world’s largest professional association for advancement of technology and the Computer Society is the largest society within IEEE.

About SC17

SC17, the International Conference for High Performance Computing, sc17.supercomputing.org, sponsored by ACM and IEEE-CS offers a complete technical education program and exhibition to showcase the many ways high performance computing, networking, storage and analysis lead to advances in scientific discovery, research, education and commerce. This premier international conference includes a globally attended technical program, workshops, tutorials, a world class exhibit area, demonstrations and opportunities for hands-on learning.

Source: ACM

The post Jesús Labarta Recognized with ACM-IEEE CS Ken Kennedy Award appeared first on HPCwire.

Researchers Scale COSMO Climate Code to 4888 GPUs on Piz Daint

Tue, 10/17/2017 - 08:23

Effective global climate simulation, sorely needed to anticipate and cope with global warming, has long been computationally challenging. Two of the major obstacles are the resolution required and the long time to solution. This month a group of researchers from ETH Zurich, MeteoSwiss, and the Swiss National Supercomputing Center (CSCS) report scaling the popular COSMO code to run on all 4888 GPUs of CSCS’s Piz Daint supercomputer and achieving ultra-high resolution.

In their paper, ‘Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0’, posted on the open-access site Geoscientific Model Development Discussions, the authors present the rather extensive effort needed to port the code. Previously, COSMO had only been scaled to 1000 GPUs on Piz Daint.

“To our knowledge this represents the first complete atmospheric model being run entirely on accelerators at this scale. At a grid spacing of 930 m (1.9 km), we achieve a simulation throughput of 0.043 (0.23) simulated years per day and an energy consumption of 596 MWh per simulated year. Furthermore, we propose the new memory usage efficiency metric that considers how efficiently the memory bandwidth – the dominant bottleneck of climate codes – is being used,” write the researchers led by Oliver Fuhrer of MeteoSwiss.

Not surprisingly, adapting the COSMO (Consortium for Small-Scale Modeling) code was a significant task. For example, the core of the model had to be rewritten using a domain-specific language so that it could run efficiently across different hardware architectures, the researchers say. Here are a few more details from the paper:

“To enable COSMO on hybrid high performance computing systems with GPU-accelerated compute nodes, we rewrote the dynamical core of the model, that implements the solution to the non-hydrostatic Euler equations, from Fortran to C++. This enabled us to introduce a new C++ template library-based domain specific language (DSL) we call STELLA, to provide a performance-portable implementation for the stencil algorithmic motifs by abstracting hardware dependent optimization.

“Specialized backends of the library produce efficient code for the target computing architecture…[T]he DSL supports an analysis backend that records the access patterns and data dependencies of the kernels. This information is then used to determine the amount of memory accesses and assess the memory utilization efficiency. For GPUs, the STELLA backend is written in CUDA, and other parts of the refactored COSMO implementation use OpenACC directives.”
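The division of labor between a single stencil definition and interchangeable backends can be sketched in a few lines. The example below is not STELLA (which is a C++ template DSL); it only illustrates, in Python, the idea of writing a stencil motif once and executing it with different backends.

```python
import numpy as np


def laplacian(get):
    """The stencil motif, written once in terms of neighbor offsets."""
    return get(-1, 0) + get(1, 0) + get(0, -1) + get(0, 1) - 4.0 * get(0, 0)


def numpy_backend(field, stencil):
    """'Vectorized' backend: evaluates the stencil on the interior via slicing."""
    ni, nj = field.shape

    def get(di, dj):
        return field[1 + di:ni - 1 + di, 1 + dj:nj - 1 + dj]

    out = field.copy()
    out[1:-1, 1:-1] = stencil(get)
    return out


def loop_backend(field, stencil):
    """Naive scalar backend: same stencil definition, point-by-point loops."""
    out = field.copy()
    ni, nj = field.shape
    for i in range(1, ni - 1):
        for j in range(1, nj - 1):
            out[i, j] = stencil(lambda di, dj: field[i + di, j + dj])
    return out


field = np.random.rand(32, 32)
assert np.allclose(numpy_backend(field, laplacian), loop_backend(field, laplacian))
```

A real backend would emit CUDA or vectorized CPU code rather than calling NumPy, but the split is the same: the science code states the motif, and the backend decides how to execute it on the target hardware.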

The researchers report the code shows excellent strong scaling up to the full machine size when running at a grid spacing of 4 km and below, on both the P100 GPU accelerators and the Haswell CPU. For smaller problems, “e.g., at coarser grid spacing of 47 km,” the GPUs run out of parallelism and strong scalability is limited to about 100 nodes, while the same problem continues to scale on multi-core processors to 1000 nodes.

Link to paper: https://www.geosci-model-dev-discuss.net/gmd-2017-230/

The post Researchers Scale COSMO Climate Code to 4888 GPUs on Piz Daint appeared first on HPCwire.

In-Memory Computing Summit North America 2017 Announces Keynote Speakers

Tue, 10/17/2017 - 08:21

FOSTER CITY, Calif., Oct. 17, 2017 — GridGain Systems, provider of enterprise-grade in-memory computing solutions based on Apache Ignite, today announced the keynote speakers for the third annual In-Memory Computing Summit North America, taking place October 24-25, 2017, at the South San Francisco Conference Center. Speakers from Sberbank, Workday, Wellington Management, the Storage Networking Industry Association (SNIA) and GridGain Systems will discuss how in-memory computing is solving processing speed and scalability challenges in a variety of industries and the evolution of in-memory computing-related technologies.

The In-Memory Computing Summit (IMCS) is held annually in both Europe and North America. The conferences are the only industry-wide events that focus on the full range of in-memory computing-related technologies and solutions. The conferences are attended by technical decision makers, business decision makers, operations experts, DevOps professionals, architects and developers. The attendees make or influence purchasing decisions about in-memory computing, Big Data, Fast Data, IoT, HTAP and HPC solutions.

Keynote Speakers

Tuesday, October 24

  • Abe Kleinfeld, President & CEO, GridGain Systems – “The Evolving In-Memory Computing Platform” – 9:25 a.m. to 9:55 a.m.
    In-memory computing (IMC) is already profoundly changing a variety of industries, including financial services, fintech, healthcare, IoT, online travel and web-scale SaaS. But we are still just at the beginning of the IMC revolution, and new innovations and the adoption of memory-centric architectures will continue to redefine the datacenter.
  • Mikhail Khasin, Senior Managing Director & Head of Core Banking Transformation Program, Sberbank – “The Evolving Role of In-Memory Computing in the Banking Industry” – 10:00 a.m. to 10:30 a.m.
    Traditional core banking platforms face new challenges, including processing high volumes of data in real time, workloads in the many thousands of transactions per second, and 24×7 availability. Distributed in-memory computing unlocks a web-scale, client-centric architecture for next-generation banking platforms that can handle hundreds of thousands of transactions per second and are capable of built-in machine learning algorithms and AI.

Wednesday October 25

  • Jim Pappas, Vice Chairman, SNIA – “Realizing the Benefits of Persistent Memory with the NVM Programming Model and NVDIMMs” – 9:20 a.m. to 9:40 a.m.
    The convergence of memory and storage has been realized with the help of the NVM Programming Model and NVDIMMs. Many end users are taking advantage of high speed byte addressable access to Persistent Memory. Learn how the NVM Programming Model and NVDIMMs are helping to accelerate the availability of software that enables Persistent Memory hardware.
  • Noah Arliss, Senior Development Manager, Workday – “The Intersection of In-Memory and Scale-Out in the Age of Internet Proportions” – 9:40 a.m. to 10:05 a.m.
    From day one, in-memory computing has been part of our DNA at Workday. However, as our customers and data sets grow, we continue to push the boundaries at the intersection of in-memory and scale-out computing. Distributed computing is not for the faint of heart, and key design principles are often understood only by domain experts with deep knowledge. Moving forward, we need to simplify these paradigms to make them easier to understand and more readily adopted in the industry.
  • Rafique Awan, Lead Architect, Wellington Management – “Implementation of Investment Book of Record (IBOR) Using Apache Ignite/GridGain” – 10:05 a.m. to 10:30 a.m.
    This talk will include an overview of the IBOR use case, a perfect example of using fast and big data together in the financial industry. The use case includes using Apache Ignite/GridGain to solve one of the most complex problems in the financial world and using Apache Spark and Apache Ignite together in order to solve complex big data ETL processing.

Raffle for Complimentary Passes

Enter the weekly raffle for complimentary passes to the conference. Visit the conference website raffle page now to enter the drawing.

Sponsors

The In-Memory Computing Summit North America 2017 is sponsored by leading technology vendors. Current sponsors include:

Platinum Sponsor – GridGain Systems
Gold Sponsor – YADRO
Silver Sponsors – Fujitsu, Hazelcast, ScaleOut Software, Neeve Research, Starcounter, Striim
Association Sponsors – Storage Networking Industry Association (SNIA), Apache Software Foundation
Media Sponsors – CMSWire, Datanami, InsideBigData, InsideHPC, ODBMS.org

About the In-Memory Computing Summit

The In-Memory Computing Summits are the only industry-wide events of their kind, tailored to in-memory computing-related technologies and solutions. They are the perfect opportunity to reach technical IT decision makers, IT implementers, architects, developers, and business decision makers who make or influence purchasing decisions in the areas of in-memory computing, Big Data, Fast Data, IoT, HTAP and HPC. Attendees include CEOs, CIOs, CTOs, VPs, IT directors, IT managers, data scientists, senior engineers, senior developers, architects and more. The events are unique forums for networking, education and the exchange of ideas that power digital transformation and the future of Fast Data. For more information, visit https://imcsummit.org/us/ and follow the events on Twitter @IMCSummit.

About GridGain Systems

GridGain Systems offers an in-memory computing platform built on Apache Ignite. GridGain solutions are used by global enterprises in financial, software, e-commerce, retail, online business services, healthcare, telecom and other major sectors, with a client list that includes Barclays, ING, Sberbank, Misys, IHS Markit, Workday, and Huawei. GridGain delivers unprecedented speed and massive scalability to both legacy and greenfield applications. Deployed on a distributed cluster of commodity servers, GridGain software can reside between the application and data layers (RDBMS, NoSQL and Apache Hadoop), requiring no rip-and-replace of the existing databases, or it can be deployed as an in-memory transactional SQL database. GridGain is the most comprehensive in-memory computing platform for high-volume ACID transactions, real-time analytics, web-scale applications and HTAP. For more information, visit gridgain.com.

Source: GridGain Systems

The post In-Memory Computing Summit North America 2017 Announces Keynote Speakers appeared first on HPCwire.

ESnet’s Science DMZ Design Could Help Transfer, Protect Medical Research Data

Tue, 10/17/2017 - 07:15

Oct. 17, 2017 — Like other sciences, medical research is generating increasingly large datasets as doctors track health trends, the spread of diseases, genetic causes of illness and the like. Effectively using this data for efforts ranging from stopping the spread of deadly viruses to creating precision medicine treatments for individuals will be greatly accelerated by the secure sharing of the data, while also protecting individual privacy.

In a paper published Friday, Oct. 6 by the Journal of the American Medical Informatics Association, a group of researchers led by Sean Peisert of the Department of Energy’s (DOE) Lawrence Berkeley National Laboratory (Berkeley Lab) wrote that the Science DMZ architecture developed for moving large data sets quickly and securely could be adapted to meet the needs of the medical research community.

The Science DMZ traces its name to an element of network security architecture. Typically located at the network perimeter, a DMZ has its own security policy because of its dedicated purpose – exchanging data with the outside world.

Exponentially increasing amounts of data from genomics, high quality imaging and other clinical data sets could provide valuable resources for preventing and treating medical conditions. But unlike most scientific data, medical information is subject to strict privacy protections under the Health Insurance Portability and Accountability Act (HIPAA) so any sharing of data must ensure that these protections are met.

Image courtesy of Lawrence Berkeley National Lab.

“You can’t just take the medical data from one site and drop it straight in to another site because of the policy constraints on that data,” said Eli Dart, a network engineer at the Department of Energy’s Energy Sciences Network (ESnet) who is a co-author of the paper. “But as members of a society, our health could benefit if the medical science community can become more productive in terms of accessing relevant data.”

For example, an authenticated user could query a very large database stored at multiple sites to learn more about an emerging medical issue, such as the appearance of a new virus, said Peisert, who works in Berkeley Lab’s Computational Research Division. In this way, teams of widely dispersed experts could collaborate in real time to address the problem.

According to the authors of the paper, the storage, analysis and network resources needed to handle the data and integrate it into patient diagnoses and treatments have grown so much that they strain the capabilities of academic health centers. At the same time, shared data repositories like those at the National Library of Medicine, the National Cancer Institute and international partners such as the European Bioinformatics Institute are rapidly growing.

“But by implementing a Medical Science DMZ architecture, we believe biomedical researchers can leverage the scale provided by high performance computer and cloud storage facilities and national high-speed research networks while preserving privacy and meeting regulatory requirements,” Peisert said. “Access would of course need to be properly authenticated, but unlocking the world’s medical information could yield enormous benefits.”

The authors define a “Medical Science DMZ” as “a method or approach that allows data flows at scale while simultaneously addressing the HIPAA Security Rule and related regulations governing biomedical data and appropriately managing risk.” Their network design pattern addresses Big Data and can be implemented using a combination of physical, administrative and technical safeguards.

The paper was written as the National Institutes of Health (NIH) is spearheading a “Commons Initiative” for sharing data; the NIH has long provided reference data through the National Library of Medicine. The National Cancer Institute funded a number of pilot projects to use cloud computing for cancer genomics in 2016, and the initiative has since continued and expanded beyond the pilot phase. Many universities with high-performance computing facilities are increasingly applying that capacity to biomedical research.

The Science DMZ network architecture, which is used by more than 100 research institutions across the country, provides speed and security for moving large data sets. Dart led the development of the Science DMZ concept, formalized it in 2010, and has been helping organizations deploy it ever since.

A Science DMZ is specifically dedicated to external-facing high-performance science services and is separate from an organization’s production network, which allows bulk science data transfers to be secured without inheriting the performance limitations of the infrastructure used to defend enterprise applications.

Data transfers using Science DMZs are straightforward from a network security perspective: the data transfer nodes (specially tuned servers) exchange security credentials to authenticate the transfer and then open several connections to move the specified data. Once the job is completed, the connections close down. In the case of moving medical data, the information is encrypted both while it is being stored and while it’s moving across the network.

“There’s no magic,” Dart said. “The security is easy to manage in that the sites are known entities and nothing moves without proper security credentials.”

In fact, Dart said, such transfers pose less of a security problem than surfing the web on a personal computer connected to an open network. When someone browses a web site, the user’s computer downloads content from many different locations as specified by the web page, including ads that are sold and resold by firms around the world and may contain malware or other security threats. A data transfer between Science DMZs is a comparatively simple operation that doesn’t involve image rendering or media players (which are common attack surfaces), and only transfers data from approved endpoints.
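In spirit (though not in the actual tooling, which typically means dedicated data transfer services such as Globus/GridFTP rather than plain HTTPS), the multi-stream transfer pattern described above looks like the sketch below: present a credential, pull disjoint byte ranges over several encrypted connections in parallel, and reassemble them in order. The endpoint URL, the bearer token and the server’s support for range requests are all assumptions made for illustration.

```python
import concurrent.futures

import requests  # third-party HTTP client


def fetch_range(url, token, start, end):
    """Fetch one byte range over a TLS-encrypted, token-authenticated connection."""
    headers = {"Authorization": f"Bearer {token}", "Range": f"bytes={start}-{end}"}
    resp = requests.get(url, headers=headers, timeout=120)
    resp.raise_for_status()
    return start, resp.content


def parallel_transfer(url, token, size, streams=4):
    """Split an object of known size into ranges, move them over several
    concurrent connections, and reassemble the pieces in order."""
    chunk = size // streams
    ranges = [(i * chunk, size - 1 if i == streams - 1 else (i + 1) * chunk - 1)
              for i in range(streams)]
    with concurrent.futures.ThreadPoolExecutor(max_workers=streams) as pool:
        parts = pool.map(lambda r: fetch_range(url, token, *r), ranges)
    return b"".join(data for _, data in sorted(parts))


# Hypothetical endpoint and credential; the server must support HTTP range requests.
# blob = parallel_transfer("https://dtn.example.org/dataset.bin", "ACCESS_TOKEN",
#                          size=1 << 30, streams=8)
```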

In their paper, the authors present the details of three implementations and describe how each balances the two key aspects of a Medical Science DMZ: high throughput and regulatory compliance. Indiana University, Harvard University, and the University of Chicago all use a non-firewalled approach to moving HIPAA-protected data in their Medical Science DMZs. Each site has implemented frameworks that allow the free flow of data where needed and address HIPAA using alternate, reasonable and appropriate controls that manage risk.

In each case the data transfers are encrypted, and can only be initiated by authenticated and authorized users. The interactive network traffic needed to initiate such transfers still passes through one or more systems that are heavily protected and monitored. Although firewalls are not removed entirely from the system, they are used intelligently and overall system security is maintained while still permitting the transfer of sensitive data, such as large biomedical datasets.

“We wrote this paper as a starting point,” Peisert said, “and hope that it will allow a lot of great things to happen.”

ESnet is a DOE Office of Science User Facility. DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.

Source: Lawrence Berkeley National Laboratory

The post ESnet’s Science DMZ Design Could Help Transfer, Protect Medical Research Data appeared first on HPCwire.
