HPCwire

Since 1987 - Covering the Fastest Computers in the World and the People Who Run Them

ISC 2017 Student Day Tackles the Worldwide HPC Workforce Shortage

Mon, 06/12/2017 - 10:12

At a time when university graduates in science and engineering may find job markets difficult, thousands of well-paying jobs go unfilled across the world because the graduates lack basic HPC competency and because supercomputing is sometimes mistakenly seen as an old technology. As part of the new STEM Student Day at the ISC17 conference (Wednesday, June 21), Michael Bader of the Technical University of Munich (TUM) and I will attempt to describe the problem and the opportunities.

This issue has been near and dear to me for quite a while. It affects the futures of young people coming out of universities, and to the extent that it’s not addressed, it will constrain the future of the global HPC industry. I invited Dr. Bader to team with me because TUM is a global leader in responding to this issue on behalf of its students.

The Problem
The HPC personnel shortage is no accident. When HPC funding from the U.S. Government and allied nations declined sharply after the end of the Cold War, the HPC market entered a period of slowdown from which it did not start to recover until about the year 2002, when the fast rise of HPC clusters caused a five-year spurt of 20% annual revenue growth. Between 2000 and 2016, the HPC market doubled in size, from about $11 billion to $22.4 billion—creating the need for many new employees in the process.

Steve Conway, Hyperion SVP

The period of HPC slowdown, occurring as it did alongside the explosive growth of Internet companies, helped to transform the image of HPC into that of a maturing and even a dying, “old technology” market. The number of university programs in computational science and related fields plummeted, as did HPC-related internship and postgraduate fellowship opportunities. Young people who might have chosen an HPC career a decade earlier all too often opted instead for employment with “new technology” Internet, PC or gaming companies. As a result, a high proportion of today’s graying HPC workforce is within a decade of retirement age and educational institutions are not producing enough HPC-trained graduates to replace them.

Worldwide Study
An extensive study we conducted for the U.S. Department of Energy in 2010—and subsequent studies confirm that the situation has not changed much since then—found that the HPC community has only begun to address this job candidate shortage through new curricular and internship offerings, as well as accelerated on-the-job training. There is still a long way to go, especially in view of the challenges involved in harnessing the potential of exascale computers.

  • The most-needed science and mathematics skills are multidisciplinary abilities, along with physicists and mathematicians who can code. The highest-need HPC roles are computer scientists, operating systems experts, parallel programmers, programmers for heterogeneous (CPU-accelerator) systems, systems administrators, tools developers (especially for large-scale systems), and experts in verification and validation.
  • The important future inflection points cited by the study respondents from HPC sites fell into the categories of parallelism, petascale/exascale computing, HPC system heterogeneity, HPC system architectural balance, HPC system reliability, and HPC system and data center power and cooling. These inflection points are closely related to one another and therefore represent a complex of issues that in most cases cannot be addressed in isolation.
  • Nearly all (93%) of the HPC centers said it is “somewhat hard” or “very hard” to hire staff with the requisite skills. It is especially telling that the majority of the centers (56%) fell into the “very hard” category.
  • The most fruitful source of qualified candidates for HPC positions is “university graduates in mathematics, engineering, or the physical sciences” (cited by 63% of the respondents). A smaller but substantial percentage of the respondents (48%) pointed to “university graduates in computer science.”

Many Universities Are in a Bind
The single biggest recommendation offered by HPC-knowledgeable study respondents in academic and training organizations was for universities to expand coursework in computational science, and to integrate computational science methods into the requirements for science and engineering degrees, certainly at the graduate level and preferably also at the undergraduate level.

If simulation really has become the third branch of the scientific method, complementing theory and physical experimentation, it stands to reason that science and engineering majors should be required to attain basic competency in computer simulation. But for many universities, that’s far easier said than done. Science and engineering curricula often are so tightly packed with requirements that adding anything new is a herculean task.

Yet, for the sake of graduating students and HPC’s growing importance in our societies, innovative solutions to this education problem need to be found. Michael Bader will present TUM’s approach at ISC. There are other universities pursuing innovative approaches. Hyperion plans to continue tracking this issue closely.

Description of ISC2017 STEM STUDENT DAY & GALA (from ISC web site)

ISC High Performance is a forum for HPC community members to network and explore opportunities – for current experts but also for future generations. We have created a program to welcome STEM students into the world of high performance computing, demonstrate how technical skills can propel their future careers in this area, introduce them to the current job landscape, and also show them what the European HPC workforce will look like in 2020 and beyond.

The ISC STEM Student Day & Gala will take place on Wednesday, June 21 and is free for students.

Author Bio:
Steve Conway, Senior Research Vice President, Hyperion Research, directs research related to the worldwide HPC and high performance data analysis (HPDA) markets. He is a member of the steering committee of the HPC User Forum and a frequent conference speaker and author; his work includes the global study for the U.S. Department of Energy, Talent and Skill Sets Issues Impacting HPC Data Centers. Mr. Conway was an executive at Cray, SGI and CompuServe, and had a 12-year career in university teaching and administration at Boston University and Harvard University. A former Senior Fulbright Fellow, he holds advanced degrees in German from Columbia University and in comparative literature from Brandeis University.


Inflection Point in HPC Wattage Driving Demand for Distributed Liquid Cooling

Mon, 06/12/2017 - 01:01

Increasing wattage of CPUs and GPUs has been the trend in HPC for many years. So why is it that in the run-up to ISC17 there is an increasing amount of buzz about this increased wattage? Many sense that an inflection point has been reached in the relationship between server density, the wattage of key silicon components and heat rejection.

This is becoming critical first in HPC, where sustained computing throughput is paramount for the types of applications run in high-density racks and clusters. Cutting-edge HPC requires the highest performance versions of the latest CPUs and GPUs. As ISC17 approaches, we are seeing the highest-frequency offerings of Intel’s Knights Landing, Nvidia’s P100 and Intel’s Skylake (Xeon) processors. With them come high wattages at the node level, with even higher wattages coming in the near future. HPC cooling requirements mean that good-enough levels of heat removal are no longer sufficient.

For HPC, the cooling target must not only assure reliability but also prevent down-clocking and throttling of nodes running at 100% compute for sustained periods. This recent inflection point means that to cool high-wattage nodes while maintaining reasonable rack densities, there is little choice other than node-level liquid cooling. Even if massive air heat sinks with substantially higher airflows could cool some of the silicon on the horizon, the result would be extremely low compute density: 4U nodes, partially populated racks taking up expensive floor space, and costly data center build-outs and expansions would become the norm.

OEMs and a growing number of TOP500 HPC sites such as JCAHPC’s Oakforest-PACS, Lawrence Livermore National Laboratory, Los Alamos National Laboratory, Sandia National Laboratories and the University of Regensburg have addressed both the near term and anticipated cooling needs using Asetek RackCDU D2C™ hot water liquid cooling.

Asetek’s direct to chip distributed cooling architecture addresses the full range of heat rejection scenarios. The distributed pumping architecture is based on low pressure, redundant pumps and closed loop liquid cooling within each server node. This approach allows for a high level of flexibility.

The lowest data center impact comes with the Asetek ServerLSL server-level, liquid-enhanced air cooling solution. ServerLSL replaces less efficient air coolers inside the servers with redundant liquid coolers (cold plates/pumps) and exhausts 100% of the captured heat as hot air into the data center. It can be viewed as a transitional stage in the introduction of liquid cooling, or as a tool for HPC sites to immediately incorporate the highest performance computing into the data center. At the site level, all the heat is handled by existing CRACs and chillers with no changes to the infrastructure.

While ServerLSL isolates the liquid cooling system within each server, the wattage trend is pushing the HPC industry toward all liquid-cooled nodes and racks. Enter Asetek’s RackCDU system, which is rack-level focused, enabling a much greater impact on cooling costs for the data center. In fact, RackCDU is used by all of the current TOP500 sites using Asetek liquid cooling.

Asetek RackCDU provides the answer both at the node level and for the facility overall. As with ServerLSL, RackCDU D2C (Direct-to-Chip) utilizes redundant pumps/cold plates atop server CPUs and GPUs, cooling those components and, optionally, memory and other high-heat components. The collected heat is moved via a sealed liquid path to heat exchangers in the RackCDU, which transfer it into facilities water. RackCDU D2C captures between 60 percent and 80 percent of server heat into liquid, reducing data center cooling costs by over 50 percent and allowing 2.5x-5x increases in data center server density.

KNL Server Node with Asetek RackCDU D2C Liquid Cooling

The remaining heat in the data center air is removed by existing HVAC systems in this hybrid liquid/air approach. When unused cooling capacity is available, data centers may choose to cool the facilities water coming from the RackCDU with existing CRACs and cooling towers.

Asetek’s distributed pumping approach to cooling at the server, rack, cluster and site levels delivers flexibility in heat capture, coolant distribution and heat rejection that centralized pumping cannot match.

As HPC requirements in 2017 and beyond require more efficient cooling, Asetek continues to demonstrate global leadership in liquid cooling as its OEM and installation bases grow. To learn more about Asetek liquid cooling, stop by booth J-600 at ISC 2017 in Germany.

Appointments for in-depth discussions about Asetek’s data center liquid cooling solutions at ISC17 may be scheduled by sending an email to questions@asetek.com.

 


Code Modernization: Bringing Codes Into the Parallel Age

Thu, 06/08/2017 - 19:49

The ways that advanced computing performance depends on more – much more – than the processor take many forms. Whatever the validity of Moore’s Law, it’s indisputable that other aspects of the computing ecosystem must keep pace with processor development if the system is to deliver the results everyone’s after.

One of those aspects is application code, some of which dates back to the late 1950s, when parallel computing was a futuristic computer science vision. Still in use today, those codes have been goosed, tickled and jolted for better performance, but they remain at their core what they’ve always been: serial applications.

We recently caught up with Joe Curley, senior director of Intel’s code modernization organization, who shared observations about Intel’s effort to optimize, or parallelize, widely used public codes for the latest generations of highly parallel x86 CPUs.

This includes work around applications used by manufacturers in product design, such as OpenFOAM for CFD; advanced MRI diagnostics programs used in the medical industry; seismic code for the oil and gas industry and applications used by banks and other financial services organizations.

Obviously, it’s in Intel’s self-interest to extend the life of the 40-year-old x86 architecture by maintaining an up-to-date code library. But organizations all over the world are hampered by the old code that, unoptimized, drags down the throughput of high performance clusters and impedes the work they do.

Intel’s Joe Curley

A recent development in code modernization, Curley said, has been the incorporation of AI and machine learning techniques, which – when done right – can boost performance dramatically, well beyond conventional, processor-focused code modernization work.

Much of Intel’s code modernization work comes out of its global network of Intel Parallel Computing Centers (IPCCs). Begun four years ago with six centers, the program has expanded to 72 and has worked on 120 codes in more than 21 domains.

The following are excerpts from our interview with Curley, some of which have been re-ordered for clarity.

Definition and Need

Code modernization can mean many things, from using a modern language to optimizing performance. We use code modernization in the literal sense: to become modern, using the newest information methods with technology.

The typical impact of a code modernization project is giving someone the ability to take on a problem that was just too big to get at before. We’re trying to extract the maximum performance from an application and take full advantage of modern hardware. Other words have been used: optimization, parallelization and some others. But you can be parallel without being optimal, and you can be optimal without being parallel. So we chose a slightly different term. It’s imperfect, but it gets the idea across.

Modern, general-purpose server processors have 18-22 processing cores, each with two threads and a vector unit built in. They’re massively parallel processors. But by and large the applications we run on them have been derived from code that was generated in a sequential processing era. The fundamental problem we work with is that many of the codes used in industry or in the enterprise today are derived from algorithms written anywhere from the 1950s to the 2000s. The microprocessors of that era were primarily single-core machines, so you have very serial applications.

In order to use a modern processor you could just take that serial application, create many copies of it and try to run it in parallel. And that’s been done for years. But the real power performance breakthroughs happen when someone steps back and asks: How can I start using all of these cores together computationally and in parallel?
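As a purely illustrative sketch (not Intel's code or tooling), the Python example below shows that first mechanical step in miniature: the same independent per-element computation run once on a single core and then spread across every core with the standard multiprocessing module. The kernel, simulate_cell, is a made-up stand-in for a real workload.

```python
# Minimal sketch (hypothetical workload): turning a serial per-element
# computation into a parallel one across all available cores.
from multiprocessing import Pool, cpu_count
import math

def simulate_cell(i):
    # Stand-in for an expensive, independent per-element kernel.
    return sum(math.sin(i * k) for k in range(1, 2000))

if __name__ == "__main__":
    data = range(10_000)

    # Serial version: one core does all the work.
    serial = [simulate_cell(i) for i in data]

    # Parallel version: the same work spread across every core.
    with Pool(cpu_count()) as pool:
        parallel = pool.map(simulate_cell, data, chunksize=100)

    assert serial == parallel
```

As Curley notes, this kind of mechanical parallelization is only the starting point; the bigger gains come from rethinking the algorithm so that all the cores cooperate on one problem.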

What’s encompassed by code modernization

Our group does everything from training, academic engagement and building sample codes to working with ISVs and communities, both internally and externally. We focus efforts on open source communities and open source codes. The reason is that we’re not only trying to improve science, we’re also trying to improve the understanding of how to program in parallel and how to solve problems, so having the teaching example, or the example that a developer can look at, is incredibly important.

We’ve taken the output from the IPCCs, we’ve written it down, we’ve created case studies, we’ve created open source examples, graphs, charts – teaching examples – and then put it out through a series of textbooks. But importantly, all of the (output) can be used either by a software developer or an academic to teach people the state of the art.

For the IPCCs, the idea was to find really good problems that would most benefit from using the modern machine if only we could unlock the performance of the code. Our work ranges from practical academics to communities that generate community codes. In some cases they’re industrial and academic partnerships; some are in the oil and gas industry, working on refinement of core codes that will then go back in for use in seismic imaging. The idea is for these to be real hands-on workshops between domain scientists, computer scientists and Intel that have actual practical use within the life of our products.

So not only are we getting the first order of benefit if, say, an auto manufacturer was using OpenFOAM and got a result faster. That’s great, we’ve made it more efficient. But we’re also creating a pool of programmers and developers who’ll be building code for the next 20 years, making them more efficient as well.

Example: Medical/Life Sciences

One of our IPCCs was with Princeton University. They were trying to get a better understanding of what was happening inside the human brain through imaging while a patient was in the medical imaging apparatus. It’s a form of MRI called fMRI. The science on that is pretty well established. They knew how to take the data coming from the MRI, and they could compute on it and create a model of what’s going on inside the brain. But in 2012, when we started the project, they estimated the calculation would take 44 years on their cluster. It wasn’t a practical problem to solve.

So instead of the serial method they had been using, they could start computing in parallel on more energy-efficient, modern equipment. They came up with a couple of things. One: they parallelized their code and saw huge increases in performance. But they also looked at it algorithmically; they began to look at the practicality of machine learning and AI, and how you could use that for science. Since these researchers happened to be from neural medicine centers, they understood how the brain works. They were trying to use the same kind of cognition, or inference, that you have inside your brain algorithmically on the data coming from the medical imaging instrument.

They changed the algorithm, they parallelized their code, they put it all together and ended up with a 10,000X increase in performance. More practically, they were able to take something that would have taken 44 years down to a couple of minutes. They went from something requiring a supercomputing project at a national lab to something that could be done clinically inside a hospital.

That really captures what you can try to do inside a code modernization project. If you can challenge your algorithms, you can look at the best ways to compute, you can look at the parallelization, you can look at energy efficiency, and you can achieve massive increases in performance.

So now, how that hospital treats the neurology of the brain is different because of the advances offered by code modernization. Of course the application of that goes out into the medical community, and you can start looking at fMRI in more clinical environments.

Example: Industrial Design

One of the community applications, OpenFOAM, is used heavily in automobile manufacturing. We’ve worked with a number of fellow researchers to deliver power and performance improvements of 2-3x, which, across an application of the size and magnitude of OpenFOAM, is really substantial.

It also creates a lighthouse example for commercial ISVs of what can be done. This clearly showed that for computational fluid dynamics at scale, entirely new methods can be applied to the problem. We’ve had a lot of interest and pick-up from commercial ISVs on some of the work being done using some of the community codes.

Here’s the thing we want to get at: What’s the real value in computing a model faster? Most people tend to think of code modernization simply as making a simulation run faster. But one of the things we’ve done is develop software that can help you better visualize your physical design.

Audi, for example, has worked with Autodesk as an ISV partner to develop a modern ray tracer (rendering engine), an example of the things we work on inside our code modernization group. We have another group that works on visualization and how to make your images look lifelike. Autodesk has come up with clever ways of doing that and building it into their product line, allowing Audi to remove physical prototypes, both for assembly and for interior and exterior design, from their process.

Think of someone building a clay model of a car and taking it to a wind tunnel, or building a fit-and-finish model of a car, to see how the interior design will look and to see if it’s pleasing to the customer. They’ve removed all that modeling. It’s all being done digitally, not only the digital design and simulation but also the digital prototyping, and then visualizing it through modern software on a departmental-sized computer.

The impact of that, according to Audi when they spoke at ISC, is that it removed seven months from their process for the fit-and-finish prototypes and six for the physical prototypes. If you can shave that much time out of your process you can gain major competitive advantage from HPC.

It’s all made possible by new highly parallel codes and interestingly, all the visualization is done entirely on general-purpose CPUs.

Example: Financial Services

For financial services companies, code modernization offers the opportunity to use the same cluster you’d use for the rest of the bank’s operations for the most demanding high performance tasks. Whether it’s options valuation or risk management or other tasks you use HPC for, we can do that on general-purpose Xeon CPUs.

In banking, one of the problems is that most of those codes are the crown jewels of the banks. So we can’t talk about them. In many cases we don’t even see them. But we can work on the STAC-A2 benchmark, built by a consortium of banks as a suite of benchmarks for problems that behave sufficiently like their own workloads to give an idea of how fast they can run their software, and the STAC-A2 results get published.

On both our general-purpose Xeon and Xeon Phi CPUs through code modernization we’ve set world records for the STAC-A2 repeatedly. It’s an arms race. But we’ve done it multiple times with general purpose code.

That allows the bank to take that code as an exemplar, and apply it to their own special algorithms and their own financial science, and get the most performance out of their general-purpose infrastructure.


Erik Lindahl on Bio-Research Advances, the March to Mixed-Precision and AI-HPC Synergies

Thu, 06/08/2017 - 15:53

At PRACEdays in Barcelona, HPCwire had the opportunity to interview Dr. Erik Lindahl, Stockholm University biophysics professor and chair of the PRACE Scientific Steering Committee, about the goals of PRACE, the evolution of PRACEdays, and the latest bioscience and computing trends. Part one of that interview, available here, takes an in-depth look at how PRACE is enabling European HPC research. In part two, below, Lindahl offers his perspective on some of the trends making the biggest impact on HPC today, including the momentum for mixed-precision and the potential for AI synergies.

HPCwire: What excites you most about the field of bioinformatics right now?

Erik Lindahl: When it comes to bioinformatics in general, that’s very much dominated by sequencing today, which is an amazing technology, but it’s also one of the fields where it’s been hardest to use supercomputers because people are very much dependent on lots of scripts and things that aren’t really that parallelized yet. When it comes to structural biology, where I’m working, there’s an amazing generation of techniques that can determine the structures of the molecules that we have in our bodies — proteins, DNA, RNA — that are basically the workhorses for everything. Almost anything in your body that does anything is a protein. We’ve been able to determine structures of proteins for decades, but historically we’ve always seen these as small rigid molecules, because by the time you determine a structure from a crystal at 100 Kelvin, the molecules don’t move.

Erik Lindahl

The challenge with all these molecules is that they would not work unless they actually moved, because an ion channel is literally like a door or window in your cell. It has to open hundreds or thousands of times per second to let through an ion, and that’s when you get a nerve signal. But with the traditional techniques we’ve used to study them, you only get the still images; you never get the movie. Both we and other groups have worked for years on simulating these channels. It’s just in the last five years or so that computers have become fast enough that we can reach these biological timescales and actually see these channels opening and closing on a computer. And I get super excited because suddenly we can start to use these as computational microscopes that actually go beyond what we see in the lab, because in the lab you get either the open or the closed state, but in the computer we see the opening and closing. Then you can start to understand what happens if something goes wrong, say a small mutation of this channel that causes disease. Can you start to understand how you should design a drug that doesn’t just bind, but actually, say, prevents the channel from closing so easily? So I’m very much on the research side here and my interest is understanding fundamental biology, but there are amazing applications opening up in the next few years. Give this 10 years and I think the majority of drugs are going to be designed this way.

HPCwire: Where does GROMACS fit into your research?

Lindahl: GROMACS is a project that started in the mid-1990s; it’s a program for simulating the motion of molecules, in particular bio-molecules. Molecular simulation is in principle a very easy problem, but then it gets very hard. The idea is that everything in life, including biology, is really deterministic, so if you know all of the positions of atoms, and we know that these obey the normal laws of physics, you can actually calculate the forces on atoms: for instance, two charges repel each other if they have the same sign; they attract each other if they have different signs.

So in principle these equations are not that difficult. The only problem is you have many of them; not just many, you have billions of equations that you need to solve. And just solving these equations is not going to be enough, because once you have calculated all these forces you can move the atoms, say, by a femtosecond, so you’re going to need to repeat this billions of times to get to the relevant timescales, which would be milliseconds or so. So GROMACS started out as a code we developed as students [at the University of Groningen]. I still remember when we were proud that we could run on over 20 cores. But over the years of course we’ve had to push this to hundreds of cores and thousands of cores and then tens of thousands of cores and, for the last few years, hundreds of thousands of cores in some cases.
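To make the loop Lindahl describes concrete, here is a minimal toy sketch. It is not GROMACS code; the units, parameters and force model are made up purely to show the shape of the computation: evaluate pairwise Coulomb-style forces on every atom, advance the positions by a femtosecond-scale step, and repeat.

```python
# Toy molecular dynamics loop (not GROMACS; units and parameters are made up):
# compute pairwise Coulomb-like forces, take a tiny time step, repeat.
import numpy as np

rng = np.random.default_rng(0)
n_atoms = 64
pos = rng.standard_normal((n_atoms, 3))          # positions
vel = np.zeros((n_atoms, 3))                     # velocities
charge = rng.choice([-1.0, 1.0], size=n_atoms)   # like charges repel, unlike attract
mass, dt = 1.0, 1e-3                             # stand-ins for real masses and a fs-scale step

def forces(pos, charge):
    # Force on atom i: sum over j of q_i * q_j * (r_i - r_j) / |r_i - r_j|**3
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)               # exclude self-interaction
    coef = charge[:, None] * charge[None, :] / dist**3
    return (coef[..., None] * diff).sum(axis=1)

for step in range(10_000):                       # real runs need billions of steps
    f = forces(pos, charge)
    vel += (f / mass) * dt                       # simple Euler update, for illustration only
    pos += vel * dt
```

Even in this toy version, essentially all the time goes into the force evaluation, which is why that kernel is where the parallelization and precision work discussed below pays off.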

Initially this was just meant as our own research tool, and we’re very happy that it has caught on in the field. While we lead the project, the wonderful thing is that dozens of wonderful students and professors help develop it, so it has turned into a community project where we jointly try to make the simulations more advanced. What is happening now is that we’re trying to connect to a whole range of experimental techniques. Twenty years ago we [as a field] were pretty happy to just sit in front of our computers, but in the last decade that has become insufficient: you need to do both experiments and simulations, and you need to constantly couple the simulations to the experiments. To me this has been like doing a second PhD. It’s wonderful; I’m very much a beginner in it, but it’s revolutionizing everything we know about science.

HPCwire: So it’s more interdisciplinary…

Lindahl: First, it’s making it more interdisciplinary but the other thing is that when simulations first appeared, they were kind of tier two in science in the sense that you first determined the structure and then you also tried to simulate it a bit basically to confirm your ideas. This was when simulations were difficult, we didn’t necessarily always trust them, there were errors in many of these programs, there were errors in our models; we couldn’t simulate far enough to actually make biological predictions.

But what’s changed in the last few years is that the simulations have become so powerful that not just we but even experimentalists tend to trust them. I wouldn’t say that they replace experiments (in some cases they do), but they have increasingly become a complement used very early in the pipeline. Nowadays we frequently use simulations and computational methods while we are determining the structure, and that of course means taking a bit of a leap of faith on the experimental side, which has forced us to be way more strict about the quality control in these programs.

My focus is more on the software side but one of the reasons why I love this field is how software and hardware develop together. The hardware is pointless without software that can use it, but the software is just as futile an exercise unless we have faster computers all the time.

HPCwire: Hardware, software and wet lab.

Lindahl: If we didn’t have the wet lab we wouldn’t have any biological knowledge at all and we couldn’t test our ideas.

HPCwire: You are also well-known for retooling GROMACS to take advantage of the increased performance of single-precision.

Lindahl: When I was a student in the late 1990s, we were sitting down and trying to get these programs faster. We were fairly proud that these codes were fast, and if you could get your code a couple of percent faster you were really happy – and then at some point we noticed that there were these new gaming instructions in modern CPUs. They were only published for gaming; they had very low accuracy, and the idea was that they helped with a couple of operations, in particular when you’re drawing shadows. It turns out that when you’re drawing shadows, calculating distances is very important, and you calculate distance by calculating inverse square roots.

I don’t remember exactly how I noticed, but at the time we spent about 85 percent of the time in our code calculating inverse square roots – that’s the bottleneck in all of these codes. At this point it dawned upon me that if we could use these inverse square root instructions, we could probably double the performance of these programs. The only problem is these instructions are meant for games; you typically don’t need 16 digits of accuracy in a game. So the hardest part was that we had to move everything over to get by with single-precision. Of course, changing to single-precision is easy, but changing to single-precision and still maintaining your accuracy, that’s hard. There are quite a few algorithms where you need to redo the way you sum things or change the order in which you do operations, and in some cases even come up with a different algorithm, so that you don’t need to rely on brute-force double-precision.
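One classic example of “redoing the way you sum things” is compensated (Kahan) summation, which keeps a single-precision running sum accurate without falling back to double precision. The snippet below is a generic illustration of the idea, not GROMACS code, and the values being summed are arbitrary.

```python
# Generic illustration (not GROMACS code): compensated (Kahan) summation recovers
# most of the accuracy that a naive single-precision accumulation loses.
import numpy as np

def kahan_sum_f32(values):
    total = np.float32(0.0)
    comp = np.float32(0.0)                    # running compensation for lost low-order bits
    for v in values:
        y = np.float32(v) - comp
        t = np.float32(total + y)
        comp = np.float32(t - total) - y
        total = t
    return total

values = np.full(1_000_000, 0.1, dtype=np.float32)   # true sum is 100,000

naive = np.float32(0.0)
for v in values:
    naive = np.float32(naive + v)             # plain float32 accumulation drifts noticeably

print("naive float32 :", naive)
print("Kahan float32 :", kahan_sum_f32(values))
print("float64 check :", values.astype(np.float64).sum())
```

The point is the one Lindahl makes: the win comes from reordering and compensating the operations, not from simply widening every variable to double precision.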

So this actually worked. It probably took us a year, and I don’t even want to remember the number of nights I spent coding assembly, because that was the only way to access these instructions at the time, but we doubled our performance by using single-precision with these instructions. This kept us very happy for about a decade, and then in the early 2000s or so everybody started using GPUs and it was the same story all over again: initially, GPUs were targeting only games.

We were lucky. We had already done the strength reduction and we could code everything in single-precision. It was fairly easy for us to just switch over and use all the same algorithms on the GPUs. And of course since then GPUs have become better at double-precision, but there is still a factor-of-two difference, and I think we are seeing that in all modern processors, not just in the floating point units. It’s because double-precision data also takes twice as much space. We’re heading into this big data and artificial intelligence era, and data is becoming more important than the compute in many ways. If you can save a factor of two when it comes to storing your data, that is increasingly important. So I think that single-precision is here to stay; give us 5 to 10 years and I think we’re increasingly going to see that double-precision is a niche.

HPCwire: Will computational scientists be willing to make the trade-off between compute power and time to solution?

Lindahl: I think they will have to, because I think we’ve seen this development a couple of times. There are certain aspects of double-precision that are important, particularly to the national labs on the largest supercomputers, but most chip design is driven by the mass market, and the consumer market doesn’t really need double-precision that much. So I think it will still be around, but I suspect that it’s going to cost you more and more if you absolutely need double-precision. At some point I suspect that we’re going to be in situations similar to the vector machines we had in the 1990s. There were certainly some codes that only worked on a vector machine, and there were centers that kept buying vector machines because they were so dependent on code that absolutely needed one. Then the vendors stopped producing these machines, and it doesn’t matter how important your code is if you can’t buy such a machine; today there are essentially no vector codes. We probably won’t have exactly the same development here, but this is an important memento mori. It doesn’t matter how important your code is or what amazing science you can do with it; if there is no computer that can run it, it is no longer a useful code.

HPCwire: What potential do you see to bring AI into traditional modeling and simulation workflows?

Lindahl: I think there are two parts, maybe even three. With AI the data is the most valuable part, and if you look at traditional computing applications we have huge amounts of data; this is what’s produced in all these simulations. I think there’s tremendous potential immediately to start using artificial intelligence to analyze all of the data produced in simulations – not just in life sciences, but fluid dynamics, everything that you see presented here. Give this two or three years and I bet we will suddenly say, why didn’t we do this three years ago, because the programs are already available. There are amazing algorithms to mine the data and find important events, but that’s kind of the low-hanging fruit.

The harder part, I think, will force us to completely revisit how we do modeling, and this is where we have a problem. We are frequently good at what we do because we’ve been doing it for 25 years. That’s good in many ways, but it’s also very dangerous, because the natural instinct is of course to apply the stuff you know to a problem. The difference with AI is, just as I mentioned, that computers have developed new instructions, new architectures, and accelerators you can use in new ways. The latest processors have roughly an order of magnitude more power when used for machine learning than for traditional calculations. Now this is even worse because you don’t even have single-precision, you have half-precision, and somewhere there most traditional scientists – and I’ve done this too – start to say, sorry, half-precision is too little, I can’t get by with that. And that is true; you can’t do molecular simulation with half-precision, you would lose too much. But I keep looking: if it’s a factor of 10 more powerful, maybe we should even forget about trying to simulate all of the motions of atoms. Can we find other ways to mine experimental data? If you have a protein that moves from one state to another, rather than simulating how each atom moves, maybe we can use machine learning to predict how the protein would move. That’s of course a bit of blasphemy; as scientists in my own field, we’re not supposed to do it this way. But I think we will gradually be forced to. Or rather, the same thing applies here: if you accept a bit of blasphemy, you will suddenly realize that if computers are a factor of 10 faster today when used this way, in a decade they might be a factor of 100 or 1,000 faster if you accept doing things in new ways, and when that happens I think we will all need to move over.

HPCwire: Who is doing early work in this direction?

Lindahl: One of the works I am most impressed with is where researchers have started to use deep learning to solve a problem for which you would historically always use expensive quantum chemistry codes. In quantum chemistry, you have positions of atoms and then you solve extremely expensive equations to tell what the energy is; in other words, given a set of coordinates for your atoms you should predict what the energy is. This is used a lot in materials science, and occasionally in life sciences too. It’s extremely costly; it’s orders of magnitude more costly than the problems I work with, and you can’t even imagine doing this as a function of time over milliseconds. But what people have done is train machine learning, deep learning networks, to do this: given a set of coordinates, what should the energy be.

Here you’ve combined this with the traditional way of doing simulations, because you need to create a training set from millions of small simulations saying: given these coordinates, this is what the energy should be. Then you train your network to predict energies based on coordinates, and then you can start feeding this network new sets of coordinates. But instead of taking 24 hours, you get the answer from the network in a millisecond – and then you get what the energy is.
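Below is a minimal sketch of that coordinates-to-energy workflow. Everything in it is a stand-in: the “reference energy” is a cheap toy function playing the role of an expensive quantum chemistry code, and the model is a small off-the-shelf scikit-learn regressor rather than the deep networks used in the published work.

```python
# Sketch of the coordinates-to-energy workflow described above. The reference
# energy is a synthetic stand-in for an expensive quantum chemistry calculation;
# a real application would train on real QM data with a purpose-built network.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(42)
n_samples, n_atoms = 5000, 8
coords = rng.uniform(-1.0, 1.0, size=(n_samples, n_atoms, 3))   # random configurations

def reference_energy(c):
    # Toy pairwise energy (sum of Gaussians of interatomic distances),
    # standing in for the output of a quantum chemistry code.
    diff = c[:, None, :] - c[None, :, :]
    r = np.linalg.norm(diff, axis=-1)[np.triu_indices(len(c), k=1)]
    return float(np.sum(np.exp(-r**2)))

energies = np.array([reference_energy(c) for c in coords])       # the "expensive" step
X = coords.reshape(n_samples, -1)                                # flatten (x, y, z) per atom

X_train, X_test, y_train, y_test = train_test_split(X, energies, random_state=0)

# Train a small network to map coordinates directly to energy.
model = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=1000, random_state=0)
model.fit(X_train, y_train)

# Once trained, a prediction is near-instant; in the real workflow this replaces
# hours of quantum chemistry per configuration.
print("R^2 on held-out configurations:", model.score(X_test, y_test))
```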

There are of course cases where it’s not as good as quantum chemistry, but in a few examples these networks do surprisingly well. [Here’s an example of this research published earlier this year in the journal Chemical Science.] Particularly if you are in industry, the advantage of being able to do things in less than a second frequently outweighs, I think, the added accuracy you would get in 24 hours. Here again we see this marriage: you can’t train the network unless people have access to these very large resources to create the training data, but once you’ve trained the machine learning algorithm, it becomes something you can apply directly in industry.


Neutrons Zero in on the Elusive Magnetic Majorana Fermion

Thu, 06/08/2017 - 14:25

OAK RIDGE, Tenn., June 8, 2017 — Neutron scattering has revealed in unprecedented detail new insights into the exotic magnetic behavior of a material that, with a fuller understanding, could pave the way for quantum calculations far beyond the limits of the ones and zeros of a computer’s binary code.

A research team led by the Department of Energy’s Oak Ridge National Laboratory has confirmed magnetic signatures likely related to Majorana fermions—elusive particles that could be the basis for a quantum bit, or qubit, in a two-dimensional graphene-like material, alpha-ruthenium trichloride. The results, published in the journal Science, verify and extend a 2016 Nature Materials study in which the team of researchers from ORNL, the University of Tennessee, the Max Planck Institute and Cambridge University first proposed this unusual behavior in the material.

“This research is a promise delivered,” said lead author Arnab Banerjee, a postdoctoral researcher at ORNL. “Before, we suggested that this compound, alpha-ruthenium trichloride, showed the physics of Majorana fermions, but the material we used was a powder and obscured many important details. Now, we’re looking at a large single crystal that confirms that the unusual magnetic spectrum is consistent with the idea of magnetic Majorana fermions.”

Majorana fermions were theorized in 1937 by physicist Ettore Majorana. They are unique in that, unlike electrons and protons whose antiparticle counterparts are the positron and the antiproton, particles with equal but opposite charges, Majorana fermions are their own antiparticle and have no charge.

In 2006, physicist Alexei Kitaev developed a solvable theoretical model describing how topologically protected quantum computations could be achieved in a material using quantum spin liquids, or QSLs. QSLs are strange states achieved in solid materials where the magnetic moments, or “spins,” associated with electrons exhibit a fluidlike behavior.

“Our neutron scattering measurements are showing us clear signatures of magnetic excitations that closely resemble the model of the Kitaev QSL,” said corresponding author Steve Nagler, director of the Quantum Condensed Matter Division at ORNL. “The improvements in the new measurements are like looking at Saturn through a telescope and discovering the rings for the first time.”

Because neutrons are microscopic magnets that carry no charge, they can be used to interact with and excite other magnetic particles in the system without compromising the integrity of the material’s atomic structure. Neutrons can measure the magnetic spectrum of excitations, revealing how particles behave. The team cooled the material to temperatures near absolute zero (about minus 450 degrees Fahrenheit) to allow a direct observation of purely quantum motions.

Using the SEQUOIA instrument at ORNL’s Spallation Neutron Source allowed the investigators to map out an image of the crystal’s magnetic motions in both space and time.

“We can see the magnetic spectrum manifesting itself in the shape of a six-pointed star and how it reflects the underlying honeycomb lattice of the material,” said Banerjee. “If we can understand these magnetic excitations in detail then we will be one step closer to finding a material that would enable us to pursue the ultimate dream of quantum computations.”

Banerjee and his colleagues are pursuing additional experiments with applied magnetic fields and varying pressures.

“We’ve applied a very powerful measurement technique to get these exquisite visualizations that are allowing us to directly see the quantum nature of the material,” said coauthor Alan Tennant, chief scientist for ORNL’s Neutron Sciences Directorate. “Part of the excitement of the experiments is that they’re leading the theory. We’re seeing these things, and we know they’re real.”

The paper’s authors also include ORNL’s Jiaqiang Yan, Craig A. Bridges, Matthew B. Stone, and Mark D. Lumsden; Cambridge University’s Johannes Knolle; the University of Tennessee’s David G. Mandrus; and Roderich Moessner from the Max Planck Institute for the Physics of Complex Systems in Dresden.

The study was supported by DOE’s Office of Science. The Spallation Neutron Source is a DOE Office of Science User Facility. UT-Battelle manages ORNL for the DOE Office of Science. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, please visit http://science.energy.gov/.

Source: ORNL


OCF Achieves Elite Partner Status with NVIDIA

Thu, 06/08/2017 - 14:23

LONDON, June 8, 2017 — High performance computing, storage and analytics integrator OCF has successfully achieved Elite Partner status with NVIDIA for Accelerated Computing, becoming only the second business partner in Northern Europe to achieve this level.

Awarded in recognition of OCF’s ability and competency to integrate a wide portfolio of NVIDIA’s Accelerated Computing products, including Tesla P100 and DGX-1, Elite Partner status is granted only to partners that have the knowledge and skills to support the integration of GPUs, as well as the industry reach to support and attract the right companies and customers using accelerators.

“For customers using GPUs, or potential customers, earning this specialty ‘underwrites’ our service and gives them extra confidence that we possess the skills and knowledge to deliver the processing power to support their businesses,” says Steve Reynolds, Sales Director, OCF plc. “This award complements OCF’s portfolio of partner accreditations and demonstrates our commitment to the vendor.”

OCF has been a business partner with NVIDIA for over a decade and has designed, built, installed and supported a number of systems throughout the UK that include GPUs. Most recently, OCF designed, integrated and configured ‘Blue Crystal 4’, a High Performance Computing (HPC) system at the University of Bristol, which includes 32 nodes, each with two NVIDIA Tesla P100 GPU accelerators.

In addition, as a partner of IBM and NVIDIA via the OpenPOWER Foundation, OCF has supplied two IBM Power Systems S822LC for HPC systems, codenamed ‘Minsky’, to Queen Mary University of London (QMUL).

The two systems, which pair a POWER8 CPU with 4 NVIDIA Tesla P100 GPU accelerators, are being used to aid world-leading scientific research projects as well as teaching, making QMUL one of the first universities in Britain to use these powerful deep learning machines. The university was also the first in Europe to deploy an NVIDIA DGX-1 system, described as the world’s first AI supercomputer in a box.

Source: OCF


Promising Spintronic Switch Proposed by UTD Researcher-led Team

Thu, 06/08/2017 - 11:00

Spintronics has for some time been a promising research area in efforts to develop alternative computing systems. Among its many prospects are smaller ‘transistor’ size, higher speeds, lower power consumption, and innovative architecture. This week a group of researchers led by Joseph Friedman of the University of Texas at Dallas reports in Nature Communications the development of an all-carbon spintronic system in which spintronic switches function as logic gates.

“The all-carbon spintronic switch functions as a logic gate that relies on the magnetic field generated when an electric current moves through a wire. In addition, the UTD researchers say a magnetic field near a graphene nanoribbon affects the current flowing through the ribbon. Transistors cannot exploit this phenomenon in silicon-based computers, but in the new spintronic circuit design, electrons moving through carbon nanotubes (CNT) create a magnetic field that impacts the flow of current in a nearby graphene nanoribbon (GNR), providing cascaded logic gates that are not physically connected,” notes ACM TechNews in a description of the work.

Since communication between each of the graphene nanoribbons takes place via an electromagnetic wave, the researchers predict communication will be much faster, with the potential for terahertz clock speeds. At least that’s the hope. The new work is significant on several fronts, not least for providing a design that can be tested.

Friedman and his collaborators write in their Nature Communications article (Cascaded spintronic logic with low-dimensional carbon), “Though a complete all-carbon spin logic system is several years away from realization, currently available technology permits experimental proof of the concept as shown [here]. By exploiting the exotic behavior of GNRs and CNTs, all-carbon spin logic enables a spintronic paradigm for the next generation of high-performance computing.”

“The concept brings together an assortment of existing nanoscale technologies and combines them in a new way,” says Friedman in an account of the work on the UT Dallas website (Engineer Unveils New Spin on Future of Transistors with Novel Design).

As shown below (taken from the paper), the active switching element is a zigzag GNR field-effect transistor with a constant gate voltage and two CNT control wires. The gate voltage is held constant, and the GNR conductivity is therefore modulated solely by the magnetic fields generated by the CNTs. “These magnetic fields can flip the orientation of the strong on-site magnetization at the GNR edges, which display local antiferromagnetic (AFM) ordering due to Hubbard interactions,” according to the paper.

Magnetoresistive GNR unzipped from carbon nanotube and controlled by two parallel CNTs on an insulating material above a metallic gate. As all voltages are held constant, all currents are unidirectional. The magnitudes and relative directions of the input CNT control currents ICTRL determine the magnetic fields B and GNR edge magnetization, and thus the magnitude of the output current IGNR.

“Remarkable breakthroughs have established the functionality of graphene and carbon nanotube transistors as replacements to silicon in conventional computing structures, and numerous spintronic logic gates have been presented. However, an efficient cascaded logic structure that exploits electron spin has not yet been demonstrated. In this work, we introduce and analyse a cascaded spintronic computing system composed solely of low-dimensional carbon materials,” write the researchers.

“We propose a spintronic switch based on the recent discovery of negative magnetoresistance in graphene nanoribbons, and demonstrate its feasibility through tight-binding calculations of the band structure. Covalently connected carbon nanotubes create magnetic fields through graphene nanoribbons, cascading logic gates through incoherent spintronic switching. The exceptional material properties of carbon materials permit Terahertz operation and two orders of magnitude decrease in power-delay product compared to cutting-edge microprocessors. We hope to inspire the fabrication of these cascaded logic circuits to stimulate a transformative generation of energy-efficient computing.”

Shown here is a conceptual spintronic-based 1-bit adder circuit.

(a) The physical structure of a spintronic one-bit full adder with magnetoresistive GNR FETs (yellow) partially unzipped from CNTs (green), some of which are insulated (brown) to prevent electrical connection. The all-carbon circuit is placed on an insulator above a metallic gate with constant voltage VG. Binary CNT input currents A and B control the state of the unzipped GNR labelled XOR1, which outputs a current with binary magnitude A⊕B. The output of XOR1 flows through a CNT that functions as an input to XOR2 and XOR3 before reaching the wired-OR gate OR2, which merges currents to compute CIN⊕(A⊕B). This current controls XOR4 and terminates at V. The other currents operate similarly, computing the one-bit addition function with output current signals S and COUT. (b) The symbolic circuit diagram, drawn with conventional symbols.

“This was a great interdisciplinary collaborative team effort,” Friedman said, “combining my circuit proposal with physics analysis by Jean-Pierre Leburton and Anuj Girdhar at the University of Illinois at Urbana-Champaign; technology guidance from Ryan Gelfand at the University of Central Florida; and systems insight from Alan Sahakian, Allen Taflove, Bruce Wessels, Hooman Mohseni and Gokhan Memik at Northwestern.”

Link to Nature Communications article: https://www.nature.com/articles/ncomms15635

Link to UTD article: http://www.utdallas.edu/news/2017/6/5-32589_Engineer-Unveils-New-Spin-on-Future-of-Transistors_story-wide.html?WT.mc_id=NewsHomePageCenterColumn

Images: Nature


TACC Mines Cancer Data for Treatment Clues

Thu, 06/08/2017 - 10:28

AUSTIN, Texas, June 8, 2017 — There is an enormous amount that we do not understand about the fundamental causes and behavior of cancer cells, but at some level, experts believe that cancer must relate to DNA and the genome.

In their seminal 2011 paper, “The Hallmarks of Cancer: The Next Generation,” biologists Douglas Hanahan and Robert Weinberg identified six hallmarks, or commonalities, shared by all cancer cells.

“Underlying these hallmarks are genome instability, which generates the genetic diversity that expedites their acquisition, and inflammation, which fosters multiple hallmark functions,” they wrote.

An approach that has proved very successful in uncovering the complex nature of cancer is genomics — the branch of molecular biology concerned with the structure, function, evolution, and mapping of genomes.

Since the human genome consists of three billion base pairs, it is impossible for an individual to identify single mutations by sight. Hence, scientists use computing and scientific software to find connections in biological data. But genomics is more than simple pattern matching.

“When you move into multi-dimensional, structural, time-series, and population-level studies, the algorithms get a lot harder and they also tend to be more computationally intensive,” said Matt Vaughn, Director of Life Sciences Computing at the Texas Advanced Computing Center (TACC). “This requires resources like those at TACC, which help large numbers of researchers explore the complexity of cancer genomes.”

Fishing in Big Data Ponds

A group led by Karen Vasquez, professor of pharmacology and toxicology at The University of Texas at Austin, has been working to find correlations between chromosomal rearrangements — one of the hallmarks of cancer genomes — and certain DNA sequences with the potential to fold into secondary structures.

These structures, including hairpin or cruciform shapes, triple or quadruple-stranded DNA, and other naturally-occurring, but alternative, forms, are collectively known as “potential non-B DNA structures” or PONDS.

PONDS enable genes to replicate and generate proteins and are therefore essential for human life. But scientists also suspect they may be linked to mutations that can elevate cancer risk.

Using the Stampede and Lonestar supercomputers at TACC, Vasquez worked with researchers from the University of Texas MD Anderson Cancer Center and Cardiff University to test the hypothesis that PONDS might be found at, or near, rearrangement breakpoints — locations on a chromosome where DNA might get deleted, inverted, or swapped around.

By analyzing the distribution of PONDS-forming sequences within about 1,000 bases of approximately 20,000 translocations and more than 40,000 deletion breakpoints in cancer genomes, they found a significant association between PONDS-forming sequences and cancer. They published their results in the July 2016 issue of Nucleic Acids Research.

“We found that short inverted repeats are indeed enriched at translocation breakpoints in human cancer genomes,” said Vasquez.

The correlation recurred in different individuals and patient tumor samples. They concluded that PONDS-forming sequences represent an intrinsic risk factor for genomic rearrangements in cancer genomes.

“In many cases, translocations are what turn a normal cell into a cancer cell,” said co-author Albino Bacolla, a research investigator in molecular and cellular oncology at MD Anderson. “What we found in our study was that the sites of chromosome breaks are not random along the DNA double helix; instead, they occur preferentially at specific locations. Cruciform structures in the DNA, built by the short, inverted repeats, mark the spots for chromosome breaks, mutations, and potentially initiate cancer development.”
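To make the term concrete: a short inverted repeat is a stretch of DNA followed, after a short spacer, by its own reverse complement, which lets the strand fold back on itself into a hairpin or cruciform. The toy scan below is not the study’s analysis pipeline, and the sequence window is made up; it only shows the basic idea of searching a breakpoint-centered window for such repeats.

```python
# Toy scan (not the study's pipeline): find short inverted repeats, i.e. an "arm"
# followed within a few bases by its reverse complement, inside a window of
# sequence around a breakpoint. The window below is a made-up example.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def find_inverted_repeats(seq, arm_len=6, max_spacer=8):
    """Return (position, arm, spacer) for each short inverted repeat found."""
    hits = []
    for i in range(len(seq) - 2 * arm_len + 1):
        arm = seq[i:i + arm_len]
        target = reverse_complement(arm)
        for spacer in range(max_spacer + 1):
            j = i + arm_len + spacer
            if seq[j:j + arm_len] == target:
                hits.append((i, arm, spacer))
    return hits

# Hypothetical 60-base window centered on a breakpoint coordinate.
window = "ATGCGTACGGATCCGTTAACGGATCCGTACGCATGGCCAATTGGCCATGAATTCTTAAGA"
for pos, arm, spacer in find_inverted_repeats(window):
    print(f"position {pos}: {arm} ... {reverse_complement(arm)} (spacer {spacer})")
```

The published analysis did this kind of search at genome scale, across roughly 60,000 breakpoints with windows of about 1,000 bases, which is where supercomputers like Stampede and Lonestar come in.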

While the study provides evidence that PONDS-forming repeats promote genomic rearrangements in cancer genomes, it also raises new questions, such as why PONDS are more strongly associated with translocations than with deletions.

Vasquez and her collaborators have followed up their computational research with laboratory experiments that explore the specific conditions under which translocations form cancer-inducing defects. Writing in Nucleic Acids Research in May 2017, she described how a specific 23-base pair-long translocation breakpoint can form a potential non-B DNA structure known as H-DNA, in the presence of sodium and magnesium ions.

“The predominance of H-DNA implicates this structure in the instability associated with the human c-MYC oncogene,” Vasquez and her collaborators wrote.

Understanding the processes by which PONDS lead to chromosomal rearrangements, and how these rearrangements impact cancer, will be important for future diagnostic and treatment purposes.

[The National Cancer Institute, part of the National Institutes of Health, funded these studies.]

Analyzing the Genome in Action

With the exception of mutations, the genome remains roughly fixed for a given cell line. On the other hand, the transcriptome — the set of all messenger RNA molecules in one cell or a population of cells — can vary with external conditions.

Messenger RNA (mRNA) conveys genetic information from DNA to the ribosome, where it specifies what proteins the cell should make — a process known as gene expression. Understanding which genes are being expressed in a tumor helps to more precisely classify tumors into subgroups so they can be properly treated.

Vishy Iyer, a professor of molecular biosciences at The University of Texas at Austin, has developed a way to identify sections of DNA that correlate with variations in specific traits, as well as epigenetic, or non-DNA related, factors that impact gene expression levels.

He and his group use this approach on data from The Cancer Genome Atlas (TCGA) to study the effects of genetic variation and mutations on gene expression in tumors. TACC’s Stampede supercomputer helps them mine petabytes of data from TCGA to identify genetic variants and subtle correlations that relate to various forms of cancer.
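The kind of association the group searches for can be illustrated in miniature: correlate a variant's genotype, coded as 0, 1 or 2 copies of the alternate allele, with a gene's expression level across samples. A hedged NumPy sketch with invented numbers; real analyses repeat such tests across millions of variant-gene pairs and thousands of tumors, with proper significance testing and covariate correction:

```python
import numpy as np

# Hypothetical data: 8 tumor samples, one variant and one gene.
# Genotype is coded as copies of the alternate allele (0, 1 or 2);
# expression is a normalized abundance value. All numbers are invented.
genotype   = np.array([0, 0, 1, 1, 1, 2, 2, 2])
expression = np.array([5.1, 4.8, 6.0, 6.3, 5.9, 7.2, 7.5, 7.1])

# Pearson correlation between genotype dosage and expression level.
r = np.corrcoef(genotype, expression)[0, 1]
print(f"genotype-expression correlation: r = {r:.2f}")
```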

“TACC has been vital to our analysis of cancer genomics data, both for providing the necessary computational power and the security needed for handling sensitive patient genomic datasets,” Iyer said.

In February 2016, Iyer and a team of researchers from UT Austin and MD Anderson Cancer Center reported in Nature Communications on a genome-wide transcriptome analysis of the two types of cells that make up the prostate gland — prostatic basal and luminal epithelial populations. They studied the cells’ gene expression in healthy individuals as well as individuals with cancer, and identified cell-type-specific gene signatures associated with aggressive subtypes of prostate cancer that showed adverse clinical responses.

“By analyzing gene expression programs, we found that the basal cells in the human prostate showed a strong signature associated with cancer stem cells, which are the tumor originating cells,” Iyer said. “This knowledge can be helpful in the development of more targeted therapies that seek to eliminate cancer at its origin.”

Using a similar methodology, Iyer and a separate team of researchers from UT Austin and the National Cancer Institute identified a specific transcription factor associated with an aggressive type of lymphoma that is highly correlated with poor therapeutic outcomes. They published their results in the Proceedings of the National Academy of Sciences in January 2016.

By identifying these subtle indicators, not just in DNA but in mRNA expression, the work will help improve patient diagnoses and provide the proper treatment based on the specific cancers involved.

“Next-generation sequencing technology allows us to observe genomes and their activity in unprecedented detail,” he said. “It’s also making a lot of biomedical research increasingly computational, so it’s great to have a resource like TACC available to us.”

[These projects were supported, in part, by grants from NIH, DOD, Cancer Prevention Research Institute of Texas, MD Anderson Cancer Center Center for Cancer Epigenetics, Center for Cancer Research, Lymphoma Research Foundation and the Marie Betzner Morrow Centennial Endowment.]

Powering Cancer Research Through Web Portals

With more than 30,000 biomedical researchers running more than 3,000 computing jobs a day, Galaxy represents one of the world’s largest, most successful, web-based bioinformatics platforms.

Since 2014, TACC has powered the data analyses for a large percentage of Galaxy users, allowing researchers to quickly and seamlessly solve tough problems in cases where their personal computer or campus cluster is not sufficient.

Though Galaxy supports scientists studying a range of biomedical problems, a significant number use the platform to study cancer.

“Galaxy is like a Swiss army knife. You can run many different kinds of analyses, from text processing to identifying genomic mutations to quantifying gene expression and more,” said Jeremy Goecks, Assistant Professor of Biomedical Engineering and Computational Biology at Oregon Health and Science University and one of the principal investigators for the project. “For cancer, Galaxy can be used to identify tumor mutations that drive cancer growth, find proteins that are overexpressed in a tumor, as well as for chemo-informatics and drug discovery.”

He estimates that hundreds of researchers each year use the platform for cancer research, himself included. Because cancer patient data is closely protected, the bulk of this usage involves either publicly available cancer data or data on cancer cell lines – immortalized cells that reproduce in the lab and are used to study how cancer reacts to different drugs or conditions.

In his own research, Goecks develops data analysis pipelines that generate genomic profiles of pancreatic cancer and uses those profiles to find mutations associated with the disease and potentially useful drugs.

His work on exome and transcriptome tumor sequencing pipelines, published in Cancer Research in January 2015, analyzed sequence data from six tumors and three common cell lines. He showed that the tumors shared common KRAS-related mutations with the cell lines but also exhibited mutations not found in the cell lines, indicating the need to re-evaluate preclinical models of therapeutic response in the context of genomic medicine.
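At its core, that comparison between patient tumors and laboratory cell lines reduces to set arithmetic over variant calls. A simplified Python sketch with hypothetical variant identifiers:

```python
# Hypothetical variant call sets, written as chromosome:position:ref>alt
# strings. Real pipelines derive these from aligned sequencing reads and
# the call sets are far larger; the identifiers below are placeholders.
tumor_variants = {
    "12:25398284:C>T",   # a KRAS-region change (coordinates illustrative)
    "17:7577121:G>A",
    "9:21971120:C>T",
}
cell_line_variants = {
    "12:25398284:C>T",
    "3:178936091:G>A",
}

shared     = tumor_variants & cell_line_variants   # present in both
tumor_only = tumor_variants - cell_line_variants   # absent from the models

print("shared with cell lines:", sorted(shared))
print("tumor-specific:", sorted(tumor_only))
```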

Broadly speaking, Galaxy helps researchers identify biomarkers that give an indication of a patient’s prognosis and drug responses by placing individuals’ genomic data in the context of larger cohorts of cancer patients, often from the International Cancer Genome Consortium or the Genomic Data Commons, both of which encompass more than 10,000 tumor genomes.

“Whenever you get a person’s genomic data and a list of mutations which have arisen in the tumor but not in the rest of the body, the question is: ‘Have we seen these mutations before?'” he explained. “That requires us to connect our individual patient data with these large cohorts, which tells us if we’ve seen it before and know how to treat it. This helps us determine if the cancer is aggressive or benign, or if we know particular drugs that will work given this particular mutation profile that the patient has.”
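That lookup, "have we seen these mutations before?", is essentially a join between a patient's variant list and a cohort-scale catalog. A minimal sketch, with a small hypothetical in-memory catalog standing in for resources like the Genomic Data Commons:

```python
# Hypothetical in-memory catalog standing in for a cohort-scale resource.
# In practice this is a query against thousands of tumor genomes held by
# consortia such as ICGC or the Genomic Data Commons, not a Python dict,
# and the counts and notes below are invented.
cohort_catalog = {
    "7:140453136:A>T":  {"seen_in": 412, "note": "recurrent; targeted therapy studied"},
    "12:25398284:C>T":  {"seen_in": 951, "note": "recurrent; prognostic marker"},
}

patient_variants = ["7:140453136:A>T", "X:12345678:G>C"]

for variant in patient_variants:
    hit = cohort_catalog.get(variant)
    if hit:
        print(f"{variant}: seen {hit['seen_in']} times in cohort, {hit['note']}")
    else:
        print(f"{variant}: not previously observed in this cohort")
```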

The fact that it’s now fast and inexpensive to generate DNA sequence data means lots of data is being produced, which in turn requires massive supercomputers like TACC’s Stampede, Jetstream and Corral systems for analysis, storage and distribution.

“This is an ideal marriage of TACC having tremendous computing power with scalable architecture and Galaxy coming along and saying, ‘we’re going to go the last mile and make sure that people who can’t normally use this hardware are able to,’” Goecks said.

As biology becomes an increasingly data-driven discipline, high-performance computing grows in importance as a critical component for the science.

“It’s so easy to collect data from sequencing, proteomics, imaging. But when you have all of these datasets, you have to be able to process them automatically,” he said. “The value of Galaxy is hiding some of the complexity that comes with that computing so that the scientist can focus on what matters to them: how to analyze a dataset to extract meaningful information, whether an analysis was successful, and how to produce knowledge by connecting analysis results with those in the broader biomedical community.”

[The Galaxy Project is supported in part by NSF, NHGRI, The Huck Institutes of the Life Sciences, The Institute for CyberScience at Penn State, and Johns Hopkins University.]

Source: Aaron Dubrow, TACC

The post TACC Mines Cancer Data for Treatment Clues appeared first on HPCwire.

Dell Technologies Reports Fiscal Year 2018 Q1 Financial Results

Thu, 06/08/2017 - 08:43

ROUND ROCK, Texas, June 8, 2017 — Dell Technologies (NYSE: DVMT) announces its fiscal 2018 first quarter results, which reflect the growth and impact of the EMC transaction.

For the first quarter, consolidated revenue was $17.8 billion and non-GAAP revenue was $18.2 billion. During the quarter, the company generated an operating loss of $1.5 billion, with a non-GAAP operating income of $1.2 billion.

“We’re pleased with overall results in the first quarter of our new go-to-market structure and the demand velocity we saw in a challenging component cost environment,” said Tom Sweet, chief financial officer, Dell Technologies Inc. “I’m encouraged by these achievements and excited about the opportunities ahead as we continue to provide a broad portfolio of solutions for our customers’ digital transformations.”

The company ended the quarter with a cash and investments balance of $14.9 billion. Since closing the EMC transaction, Dell Technologies has paid down approximately $7.1 billion in gross debt, resulting in a $200 million reduction in annualized interest expense on a run-rate basis. The company also has repurchased $1.1 billion of Class V Common Stock under both the previously announced Class V Group and DHI Group repurchase programs.

Fiscal first quarter 2018 results

(in millions, except percentages; unaudited)

Three Months Ended                                    May 5, 2017     April 29, 2016      Change
Net revenue                                               $17,816            $12,241        46 %
Operating loss                                           $(1,500)             $(139)      (979)%
Net loss from continuing operations                      $(1,383)             $(424)      (226)%
Non-GAAP net revenue                                      $18,171            $12,319        48 %
Non-GAAP operating income                                  $1,197               $539       122 %
Non-GAAP net income from continuing operations               $581               $264       120 %
Adjusted EBITDA                                            $1,567               $643       144 %

Information about Dell Technologies’ use of non-GAAP financial information is provided under “Non-GAAP Financial Measures” below. All comparisons in this press release are year-over-year unless otherwise noted.

Operating segments summary

Client Solutions Group continued to outgrow the market worldwide in unit shipments for both commercial and consumer product categories on a calendar year basis. Revenue for the first quarter was $9.1 billion, up 6 percent year over year. Operating income was $374 million for the quarter, or 4.1 percent of revenue.

Key highlights:

  • Increased PC shipments by 6.2 percent year-over-year, with 17 consecutive quarters of year-over-year PC unit share growth
  • Maintained No. 1 share position worldwide for displays, gaining unit share year-over-year for the 17th consecutive quarter
  • Only vendor to gain share year-over-year in both Fixed and Mobile workstation categories

Infrastructure Solutions Group generated $6.9 billion of revenue in the first quarter, which includes $3.2 billion in servers and networking and $3.7 billion in storage, with an operating income of $323 million.

Key highlights:

  • Remained the worldwide market share leader in x86 servers, with PowerEdge units and revenue growth up by double digits in the fiscal quarter
  • Demand for hyperconverged portfolio grew at a triple-digit rate, while demand for all-flash solutions grew at a very high double-digit rate
  • Increased demand for Virtustream Public Cloud for mission-critical applications by approximately 100 percent

VMware segment revenue for the first quarter was $1.7 billion, with operating income of $486 million, or 28 percent of revenue.

Early in the first quarter, the company successfully integrated the combined sales organization and is now operating with one common go-to-market sales motion for customers. Immediately following the quarter close, Dell Technologies hosted its second annual Dell EMC World conference last month in Las Vegas for 13,000 customers and partners. During the event the company launched approximately 40 innovative products and solutions, including the new 14th generation of Dell EMC PowerEdge servers, four flexible consumption models, seven all-flash and hybrid storage systems and the world’s first artificial intelligence platform for women entrepreneurs. In addition, the company announced Dell Technologies Capital, its venture practice for the entire Dell Technologies family of businesses aimed at investments in early-stage startups.

Conference call information

As previously announced, the company will hold a conference call to discuss its first quarter performance today at 7 a.m. CDT. The conference call will be broadcast live over the internet and can be accessed at investors.delltechnologies.com. For those unable to listen to the live broadcast, an archived version will be available at the same location for 30 days.

A slide presentation containing additional financial and operating information may be downloaded from Dell Technologies’ website at investors.delltechnologies.com.

About Dell Technologies

Dell Technologies is a unique family of businesses that provides the essential infrastructure for organizations to build their digital future, transform IT and protect their most important asset, information. The company services customers of all sizes across 180 countries – ranging from 98 percent of the Fortune 500 to individual consumers – with the industry’s most comprehensive and innovative portfolio from the edge to the core to the cloud.

Source: Dell Technologies

The post Dell Technologies Reports Fiscal Year 2018 Q1 Financial Results appeared first on HPCwire.

Leidos, Cray Announce Strategic Alliance for Multi-Level Security

Wed, 06/07/2017 - 14:51

RESTON, Va., and SEATTLE, Wash., June 7, 2017 — Leidos (NYSE: LDOS), a global science and technology solutions leader, and global supercomputer leader Cray Inc. (Nasdaq: CRAY) today announced the companies have signed a strategic alliance agreement to develop, market, and sell Multi-Level Security (MLS) solutions that include the Cray CS series of cluster supercomputers to Federal and commercial customers.

Customers are facing rapidly evolving challenges: increasing cyberattacks, competition-driven need to reduce “time to market,” and a constant focus on increasing efficiencies.  Through this strategic alliance, Leidos and Cray can now work together to expand current MLS solutions that are designed to give customers the ability to:

  • manage risk and collaborate more efficiently by allowing teams at varying security clearances to access the same system in a single environment, while maintaining data access levels;
  • save money and time by being able to consolidate multiple computing systems; and
  • streamline implementation by getting a comprehensive MLS solution from a single vendor.

“We look forward to working with Cray to evolve the capabilities and technologies necessary to offer innovative, robust MLS solutions,” said Keith Johnson, Leidos Defense & Intelligence chief technology officer. “We remain committed to delivering the best technology and efficiencies that directly support our customers’ most pressing requirements.”

“The point is simple – it’s all about securing the data and the systems that will analyze that data, and Leidos brings the expertise to play a key role in developing powerful MLS solutions built on our distributed memory clusters,” said Fred Kohout, Cray’s senior vice president of products and chief marketing officer. “Our strategic alliance with Leidos gives us a strong go-to-market strategy for Federal and commercial customers that require supercomputing performance and separation of data.”

For more information on the Cray CS series of cluster supercomputers, please visit the Cray website at www.cray.com.

About Leidos

Leidos is a global science and technology solutions and services leader working to solve the world’s toughest challenges in the defense, intelligence, homeland security, civil, and health markets. The company’s 32,000 employees support vital missions for government and commercial customers. Headquartered in Reston, Virginia, Leidos reported annual revenues of approximately $7.04 billion for the fiscal year ended December 30, 2016. For more information, visit www.Leidos.com.

About Cray Inc.

Global supercomputing leader Cray Inc. (Nasdaq: CRAY) provides innovative systems and solutions enabling scientists and engineers in industry, academia and government to meet existing and future simulation and analytics challenges. Leveraging more than 40 years of experience in developing and servicing the world’s most advanced supercomputers, Cray offers a comprehensive portfolio of supercomputers and big data storage and analytics solutions delivering unrivaled performance, efficiency and scalability. Cray’s Adaptive Supercomputing vision is focused on delivering innovative next-generation products that integrate diverse processing technologies into a unified architecture, allowing customers to meet the market’s continued demand for realized performance. Go to www.cray.com for more information.

Source: Cray

The post Leidos, Cray Announce Strategic Alliance for Multi-Level Security appeared first on HPCwire.

Gartner: Server Shipments Down 4.2 Percent in Q1 2017

Wed, 06/07/2017 - 14:46

STAMFORD, Conn., June 7, 2017 — In the first quarter of 2017, worldwide server revenue declined 4.5 percent year over year, while shipments fell 4.2 percent from the first quarter of 2016, according to Gartner, Inc.

“The first quarter of 2017 showed declines on a global level with a slight variation in results by region,” said Jeffrey Hewitt, research vice president at Gartner. “Asia/Pacific bucked the trend and posted growth while all other regions fell.

“Although purchases in the hyperscale data center segment have been increasing, the enterprise and SMB segments remain constrained as end users in these segments accommodate their increased application requirements through virtualization and consider cloud alternatives,” Mr. Hewitt said.

Hewlett Packard Enterprise (HPE) continued to lead in the worldwide server market based on revenue. The company posted just more than $3 billion in revenue for a total share of 24.1 percent for the first quarter of 2017 (see Table 1). Dell EMC maintained the No. 2 position with 19 percent market share. Dell EMC was the only vendor in the top five to experience growth in the first quarter of 2017.

Table 1: Worldwide: Server Vendor Revenue Estimates, 1Q17 (U.S. Dollars)

Company        1Q17 Revenue      1Q17 Market Share (%)     1Q16 Revenue      1Q16 Market Share (%)    1Q17-1Q16 Growth (%)
HPE            3,009,569,241              24.1             3,296,591,967              25.2                    -8.7
Dell EMC       2,373,171,860              19.0             2,265,272,258              17.3                     4.8
IBM              831,622,879               6.6             1,270,901,371               9.7                   -34.6
Cisco            825,610,000               6.6               850,230,000               6.5                    -2.9
Lenovo           731,647,279               5.8               871,335,542               6.7                   -16.0
Others         4,737,196,847              37.9             4,537,261,457              34.7                     4.4
Total         12,508,818,106             100.0            13,091,592,596             100.0                    -4.5

Source: Gartner (June 2017)

In server shipments, Dell EMC secured the No. 1 position in the first quarter of 2017 with 17.9 percent market share (see Table 2). The company posted slight growth of 0.5 percent over the first quarter of 2016. Despite a decline of 16.7 percent, HPE secured the second spot with 16.8 percent of the market. Inspur Electronics experienced the highest growth in shipments with 27.3 percent.

Table 2: Worldwide: Server Vendor Shipments Estimates, 1Q17 (Units)

Company               1Q17 Shipments    1Q17 Market Share (%)    1Q16 Shipments    1Q16 Market Share (%)    1Q17-1Q16 Growth (%)
Dell EMC                     466,800             17.9                  464,292              17.1                     0.5
HPE                          438,169             16.8                  526,115              19.4                   -16.7
Huawei                       156,559              6.0                  130,755               4.8                    19.7
Lenovo                       145,977              5.6                  199,189               7.3                   -26.7
Inspur Electronics           139,203              5.4                  109,390               4.0                    27.3
Others                     1,254,892             48.2                1,286,097              47.4                    -2.4
Total                      2,601,600            100.0                2,715,138             100.0                    -4.2

Source: Gartner (June 2017)

Additional information is available to subscribers of the Gartner Servers Quarterly Statistics Worldwide program. This program provides worldwide market size and share data by vendor revenue and unit shipments. Segments include: region, vendor, vendor brand, subbrand, CPU type, CPU group, max CPU, platform, price band, operating systems and distribution channels.

About Gartner

Gartner, Inc. (NYSE: IT) is the world’s leading research and advisory company. The company helps business leaders across all major functions in every industry and enterprise size with the objective insights they need to make the right decisions. Gartner’s comprehensive suite of services delivers strategic advice and proven best practices to help clients succeed in their mission-critical priorities. Gartner is headquartered in Stamford, Connecticut, U.S.A., and has more than 13,000 associates serving clients in 11,000 enterprises in 100 countries. For more information, visit www.gartner.com.

Source: Gartner

The post Gartner: Server Shipments Down 4.2 Percent in Q1 2017 appeared first on HPCwire.

‘Charliecloud’ Simplifies Big Data Supercomputing

Wed, 06/07/2017 - 13:12

LOS ALAMOS, N.M., June 7, 2017 — At Los Alamos National Laboratory, home to more than 100 supercomputers since the dawn of the computing era, elegance and simplicity of programming are highly valued but not always achieved. In the case of a new product, dubbed “Charliecloud,” a crisp 800-line code helps supercomputer users operate in the high-performance world of Big Data without burdening computer center staff with the peculiarities of their particular software needs.

“Charliecloud lets users easily run crazy new things on our supercomputers,” said lead developer Reid Priedhorsky of the High Performance Computing Division at Los Alamos. “Los Alamos has lots of supercomputing power, and we do lots of simulations that are well supported here. But we’ve found that Big Data analysis projects need to use different frameworks, which often have dependencies that differ from what we have already on the supercomputer. So, we’ve developed a lightweight ‘container’ approach that lets users package their own user defined software stack in isolation from the host operating system.”

To build container images, Charliecloud sits atop the open-source Docker product that users install on their own system to customize the software choices as they wish. Users then import the image to the designated supercomputer and execute their application with the Charliecloud runtime, which is independent of Docker. This maintains a “convenience bubble” of administrative freedom while protecting the security of the larger system. “This is the easiest container solution for both system administrators and users to deal with,” said Tim Randles, co-developer of Charliecloud, also of the High Performance Computing Division. “It’s not rocket science; it’s a matter of putting the pieces together in the right way. Once we did that, a simple and straightforward solution fell right out.”
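The workflow described above, build an image with Docker on a personal system, move it to the cluster, then execute it with the Charliecloud runtime, can be driven from a short script. The sketch below uses command names taken from Charliecloud's documentation of this period (ch-build, ch-docker2tar, ch-tar2dir, ch-run); treat the exact names, arguments and paths as assumptions to verify against the locally installed version:

```python
# Sketch of the Charliecloud workflow described above, driven from
# Python. Command names and arguments follow Charliecloud documentation
# of this period but should be treated as assumptions and checked
# against the installed version; the image tag and paths are invented.
# Step 1 normally runs on a workstation and step 2 on the supercomputer,
# so a single script like this is purely illustrative.
import subprocess

TAG = "my-analysis"          # hypothetical image name
IMAGE_DIR = "/var/tmp"       # where the unpacked image will live

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. On a workstation: build the user-defined software stack with Docker.
run(["ch-build", "-t", TAG, "./my-analysis"])    # ./my-analysis holds a Dockerfile
run(["ch-docker2tar", TAG, "."])                 # flatten the image to my-analysis.tar.gz

# 2. On the supercomputer (after copying the tarball): unpack and run.
run(["ch-tar2dir", f"{TAG}.tar.gz", IMAGE_DIR])
run(["ch-run", f"{IMAGE_DIR}/{TAG}", "--", "python3", "/opt/analysis.py"])
```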

The open-source product is currently being used on two supercomputers at Los Alamos, Woodchuck and Darwin, and at-scale evaluation on dozens of nodes shows the same operational performance as programs running natively on the machines without a container. “Not only is Charliecloud efficient in compute time, it’s efficient in human time,” said Priedhorsky. “What costs the most money is people thinking and doing. So we developed simple yet functional software that’s easy to understand and costs less to maintain.”

Charliecloud is very small, only 800 lines of code, and was built following two bedrock principles of computing: least privilege and the Unix philosophy of making each program “do one thing well.” Competing products range from 4,000 to over 100,000 lines of code. Charliecloud is described in detail in a technical report online (http://permalink.lanl.gov/object/tr?what=info:lanl-repo/lareport/LA-UR-16-22370).

Los Alamos National Laboratory and supercomputing have a long, entwined history. Los Alamos holds many “firsts,” from bringing the first problem to the nation’s first computer to building the first machine to break the petaflop barrier. Supercomputers are integral to stockpile stewardship and the national security science mission at Los Alamos.

About Los Alamos National Laboratory (www.lanl.gov)

Los Alamos National Laboratory, a multidisciplinary research institution engaged in strategic science on behalf of national security, is operated by Los Alamos National Security, LLC, a team composed of Bechtel National, the University of California, BWX Technologies, Inc. and URS Corporation for the Department of Energy’s National Nuclear Security Administration.

Los Alamos enhances national security by ensuring the safety and reliability of the U.S. nuclear stockpile, developing technologies to reduce threats from weapons of mass destruction, and solving problems related to energy, environment, infrastructure, health and global security concerns.

Source: Los Alamos National Laboratory

The post ‘Charliecloud’ Simplifies Big Data Supercomputing appeared first on HPCwire.

Micron Foundation Supports STEM-Trek & the PEARC17 Student Program

Wed, 06/07/2017 - 10:09

SYCAMORE, Ill., June 7, 2017 — STEM-Trek, a global, grassroots, nonprofit organization that supports travel-related professional development opportunities for science, technology, engineering and mathematics (STEM) scholars, is pleased to announce that a donation to STEM-Trek from the Micron Foundation will make it possible for more scholars to participate in the PEARC17 conference and student program. The Practice & Experience in Advanced Research Computing (PEARC17) conference will be held in New Orleans, Louisiana, July 9-13, 2017.

PEARC student-participants will have access to the general conference technical program, plus activities that are being developed specifically for them by experienced education and outreach specialists from national laboratories and research institutions. The student program begins with a cybersecurity presentation by New Orleans-based Federal Bureau of Investigation agents, and includes an intensive collaborative modeling and analysis challenge; a session on careers in modeling and large data analytics; a mentoring program; and a volunteer effort where they’ll learn how conferences are facilitated, from the inside-out!

“We’re delighted to support STEM-Trek and the PEARC17 Student Program,” said Dee Mooney, Micron Foundation Executive Director. “It’s a great opportunity for students to learn in a professional engineering conference environment and engage with industry icons. As they prepare to fill the STEM pipeline, professional development and networking experiences will strengthen their success in the microelectronics industry and beyond,” she added.

General Conference Chair Dave Hart (National Center for Atmospheric Research) is looking forward to welcoming HPC professionals and students to PEARC17. “We want attendees to expand their professional networks, rekindle old relationships and learn what’s new in advanced research computing,” he said. “Thanks to Micron Foundation, our student program will have an even bigger impact on the national HPC workforce pipeline,” he added.

About PEARC

PEARC17 is for those engaged with the challenges of using and operating advanced research computing on campuses or for the academic and open science communities. This year’s inaugural conference offers a robust technical program, as well as networking, professional growth and multiple student participation opportunities.

Organizations supporting the new conference include the Advancing Research Computing on Campuses: Best Practices Workshop (ARCC); XSEDE; the Science Gateways Community Institute (SGCI); the Campus Research Computing Consortium (CaRC); the ACI-REF consortium; the Blue Waters project; ESnet; Open Science Grid; Compute Canada; the EGI Foundation; the Coalition for Academic Scientific Computation (CASC); and Internet2.

See http://pearc17.pearc.org/ for details, and follow PEARC on Twitter (@PEARC_17) and on Facebook (PEARChpc).

About Micron Foundation


The Micron Technology Foundation, Inc., a private, nonprofit organization established in 1999 with a gift from Micron Technology, Inc., is committed to the advancement of education and local communities. The Micron Foundation partners with educators to spark a passion in youth for science, technology, engineering and math; engineers the future for students; and enriches the communities through strategic giving where team members live, work and volunteer. To learn more, visit www.micron.com/foundation.

STEM-Trek and Social Responsibility

STEM-Trek beneficiaries are encouraged to “pay it forward” by volunteering to serve as technology evangelists in their home communities or in a way that helps STEM-Trek achieve its objectives. When scholars pause to help others who struggle with the technical interfaces they need to find work or to perform in the workplace, the exchange is useful for everyone. The people they help gain useful skills and a better understanding of how STEM impacts their lives, and the volunteer gains empathy for others—especially members of the workforce who are adapting to disability or aging. In doing so, they become more considerate future innovators. Read about STEM-Trek programs at www.stem-trek.org.

Source: STEM-Trek

The post Micron Foundation Supports STEM-Trek & the PEARC17 Student Program appeared first on HPCwire.

Mellanox Powers the First 25, 50 and 100 Gigabit Ethernet Fabric for HPE Synergy Platform

Wed, 06/07/2017 - 08:42

LAS VEGAS, June 7, 2017 — Mellanox Technologies, Ltd. (NASDAQ:MLNX), a leading supplier of high-performance, end-to-end smart interconnect solutions for data center servers and storage systems, today announced that its Spectrum Ethernet switch ASIC will power the first Hewlett Packard Enterprise (HPE) Synergy Switch Module supporting native 25, 50, and 100 Gb/s Ethernet connectivity. The Spectrum switch module connects HPE Synergy compute modules with an Ethernet switch fabric of unmatched performance and latency, which is ideal for cloud, financial services, telco and HPC.

This new switch module helps HPE maintain its leadership by enabling the transition to the next generation of Ethernet performance, offering 25 Gb/s connectivity options for the Synergy platform. The Mellanox SH2200 Synergy Switch Module enables 25 and 50 Gb/s Ethernet compute and storage connectivity while benefiting from even higher speed 100 Gb/s uplinks. In addition to providing future-proof connectivity, this is an important addition to the HPE Synergy fabric portfolio, bringing high-performance Ethernet connectivity into applications previously unserved by this class of high-performance switches. Applications including financial trading and analytics, scientific computing, cloud, and NFV (Network Function Virtualization) greatly benefit from full line rate, zero packet loss, and ultra-low 300ns latency Ethernet switches.

“HPE Synergy is the first truly composable infrastructure, a new category of infrastructure designed to accelerate application and services delivery for both traditional and new cloud native and DevOps environments on the same infrastructure,” said Paul Miller, vice president of marketing, HPE. “We are pleased to partner with Mellanox to now offer even higher performance networking as an integrated component of the HPE Synergy platform. The unmatched Ethernet switch fabric performance and latency will enable our financial, business analytics, telco, and scientific customers to improve efficiency and achieve business results faster than ever before.”

“The complete integration of compute and storage with our Spectrum 25, 50, and 100 Gb/s Ethernet networking fabric makes Synergy an unmatched business, analytics, and telco platform,” said Kevin Deierling, vice president of marketing at Mellanox Technologies. “Faster compute and storage needs faster networks to access and process data in real-time, and HPE Synergy delivers, enabling enterprises and service providers to achieve total infrastructure efficiency across a broad range of business-critical workloads.”

HPE Synergy comes complete with compute, storage, and built-in management, and now offers the industry’s most advanced Ethernet fabric option. The SH2200 HPE Synergy Fabric, powered by Mellanox’s Spectrum Ethernet switch, is a critical building block in making enterprise applications more efficient and by enabling data center operators to analyze data in real-time and drive their business forward. Together, HPE and Mellanox have opened the door for businesses to reimagine what the data center is truly capable of delivering, much faster than ever before.

Availability of HPE Synergy compute modules with the Mellanox SH2200 Synergy Switch Module is targeted for the third quarter of calendar year, 2017.

Visit Mellanox Technologies at HPE Discover 2017

Visit Mellanox during HPE Discover at the Sands Expo Center, Las Vegas, NV; June 5 – 8, 2017, booth no. 209, to learn more about Mellanox’s SH2200 Synergy Switch Module.

About Mellanox

Mellanox Technologies (NASDAQ: MLNX) is a leading supplier of end-to-end InfiniBand and Ethernet smart interconnect solutions and services for servers and storage. Mellanox interconnect solutions increase data center efficiency by providing the highest throughput and lowest latency, delivering data faster to applications and unlocking system performance capability. Mellanox offers a choice of fast interconnect products: adapters, switches, software and silicon that accelerate application runtime and maximize business results for a wide range of markets including high performance computing, enterprise data centers, Web 2.0, cloud, storage and financial services. More information is available at: www.mellanox.com.

Source: Mellanox

The post Mellanox Powers the First 25, 50 and 100 Gigabit Ethernet Fabric for HPE Synergy Platform appeared first on HPCwire.

LIGO Detects Gravitational Waves for Third Time

Tue, 06/06/2017 - 15:57

URBANA, Ill., June 6, 2017 — A new window in astronomy has been firmly opened with a third detection of gravitational waves. The Laser Interferometer Gravitational-wave Observatory (LIGO) has made yet another detection of ripples in space and time, demonstrating that the detection of gravitational waves may soon become commonplace. As was the case with the first two detections, the waves were generated when two black holes collided to form a larger black hole.

The newfound black hole, formed by the merger, has a mass about 49 times that of our sun. This fills in a gap between the masses of the two merged black holes detected previously by LIGO, with solar masses of 62 (first detection) and 21 (second detection).

“We have further confirmation of the existence of stellar-mass black holes that are larger than 20 solar masses—these are objects we didn’t know existed before LIGO detected them,” says MIT’s David Shoemaker, the newly elected spokesperson for the LIGO Scientific Collaboration, a body of more than 1,000 international scientists who perform LIGO research together with the European-based Virgo Collaboration. “It is remarkable that humans can put together a story, and test it, for such strange and extreme events that took place billions of years ago and billions of light-years distant from us. The entire LIGO and Virgo scientific collaborations worked to put all these pieces together.”

The new detection occurred during LIGO’s current observing run, which began November 30, 2016, and will continue through the summer. Its observations are carried out by twin detectors—one in Hanford, Washington, the other in Livingston, Louisiana—operated by Caltech and MIT with funding from the National Science Foundation (NSF).

In all three cases, each of the twin detectors of LIGO detected gravitational waves from the tremendously energetic mergers of black hole pairs. These are collisions that produce more power than is radiated as light by all the stars and galaxies in the universe at any given time.

The recent detection appears to be the farthest yet, with the black holes located about three billion light-years away. The black holes in the first and second detections are located 1.3 and 1.4 billion light-years away, respectively.

NCSA’S Role in the Detection

“NCSA is proud to be part of the LIGO Consortium. In addition to supporting the development of state-of-the-art algorithms for the detection and characterization of new gravitational wave sources, NCSA provides world-leading expertise in identity management, cyber-security, and network engineering, which are critical for the LIGO mission. Ongoing work by our gravity group will enable the use of Blue Waters, the NSF-supported leadership-class supercomputer operated by NCSA, to further accelerate these discoveries,” says Bill Gropp, NCSA’s interim director and chief scientist.

Achieving new insights about the astrophysical nature of gravitational wave sources in completely uncharted territory is one of the most fascinating activities in contemporary astrophysics, says Eliu Huerta, Gravity Group lead at NCSA. “The gravitational wave spectrum is full of surprises. Three events detected and all of them are binary black hole mergers. NCSA Gravity Group is actively contributing to this work with a transdisciplinary research program that involves the application of advanced cyber-infrastructure facilities, and innovative computational tools to tackle problems that range from detector characterization to analytical and numerical gravitational wave source modeling.”

Ed Seidel, founder professor of physics and vice president for economic development and innovation for the University of Illinois System, notes, “This third detection firmly establishes the emergent field of gravitational wave astrophysics, and confirms that the LIGO detectors will soon transition from their current discovery mode into an astronomical observatory. The gravitational wave spectrum has so far provided us with a glimpse of black hole collisions, and we eagerly wait to hear about new classes of objects, in particular those that are expected to generate electromagnetic and astro-particle counterparts.”

“Numerical relativity has played a significant role in the validation of these three remarkable events. In addition to providing insights into the nature of the ultra compact objects that generate these signals, numerical relativity will play a critical role in the future identification and validation of events that involve matter, such as neutron stars mergers and black hole-neutron star collisions,” says Gabrielle Allen, astronomy professor and associate dean of the U of I College of Education.

About LIGO

LIGO is funded by the National Science Foundation (NSF), and operated by MIT and Caltech, which conceived and built the project. Financial support for the Advanced LIGO project was led by NSF with Germany (Max Planck Society), the U.K. (Science and Technology Facilities Council) and Australia (Australian Research Council) making significant commitments and contributions to the project. More than 1,000 scientists from around the world participate in the effort through the LIGO Scientific Collaboration, which includes the GEO Collaboration. LIGO partners with the Virgo Collaboration, a consortium including 280 additional scientists throughout Europe supported by the Centre National de la Recherche Scientifique (CNRS), the Istituto Nazionale di Fisica Nucleare (INFN), and Nikhef, as well as Virgo’s host institution, the European Gravitational Observatory. Additional partners are listed at http://ligo.org/partners.php.

About NCSA

The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign provides supercomputing and advanced digital resources for the nation’s science enterprise. At NCSA, University of Illinois faculty, staff, students, and collaborators from around the globe use advanced digital resources to address research grand challenges for the benefit of science and society. NCSA has been advancing one third of the Fortune 50 for more than 30 years by bringing industry, researchers, and students together to solve grand challenges at rapid speed and scale.

Source: NCSA

The post LIGO Detects Gravitational Waves for Third Time appeared first on HPCwire.

In Memoriam: Jean Sammet, Computing Pioneer and First Woman President of ACM

Tue, 06/06/2017 - 10:02

Jean E. Sammet, a pioneering American computer scientist who developed the FORMAC programming language and who served as the first female president of ACM (Association for Computing Machinery), passed away May 21 at the age of 89. Despite her strong interest in mathematics, she was unable to attend the Bronx High School of Science because it did not accept girls; she attended instead the now-defunct all-girls Julia Richman High School, where she took every available math course.

Her life, efforts, and triumphs are a good reminder of the formidable obstacles women faced then and of the significant obstacles they still often face in pursuing science as a career in modern society. Excellent tribute articles are posted on The New York Times web site (Jean Sammet, Co-Designer of a Pioneering Computer Language, Dies at 89) and the ACM web site (In Memoriam: Jean E. Sammet 1928-2017). The ACM piece recounts the circuitous path she was forced to take to rise in computer science and is well worth reading.

She is famously remembered for saying, “I thought of a computer as some obscene piece of hardware that I wanted nothing to do with,” in an interview in 2000. As noted in the New York Times piece written by Steve Lohr, “[H]er initial aversion was not unusual among the math purists of the time, long before computer science emerged as an academic discipline. Later, Ms. Sammet tried programming calculations onto cardboard punched cards, which were then fed into a computer.”

“To my utter astonishment,” she said, “I loved it.”

Link to YouTube video of Sammet receiving the 2009 IEEE Computer Society Pioneer Award: https://www.youtube.com/watch?v=5PVqBBAxFlU

According to the NYT article, both her parents, Harry and Ruth Sammet, were lawyers. Jean excelled in math starting in the first grade and chose to attend college at Mount Holyoke because it had an excellent mathematics department. Her programming career included stints at Sperry Gyroscope and its successor Sperry Rand, and at Sylvania Electric, before she joined IBM in 1961. She was also a historian and advocate for her profession. Her book, “Programming Languages: History and Fundamentals,” published in 1969, “was, and remains, a classic” in the field, said Dag Spicer, senior curator of the Computer History Museum in Mountain View, Calif.

Ms. Sammet was a graduate student in mathematics when she first encountered a computer in 1949 at the University of Illinois at Urbana-Champaign. She wasn’t impressed.

Link to ACM article: https://cacm.acm.org/news/217652-in-memoriam-jean-e-sammet-1928-2017/fulltext

Link to The New York Times article: https://www.nytimes.com/2017/06/04/technology/obituary-jean-sammet-software-designer-cobol.html?_r=0

Feature Image: The New York Times/Ben Shneiderman

The post In Memoriam: Jean Sammet, Computing Pioneer and First Woman President of ACM appeared first on HPCwire.

ISC High Performance Names Erich Strohmaier and Yutong Lu as Conference Fellows

Tue, 06/06/2017 - 08:25

FRANKFURT, Germany, June 6, 2017 — ISC High Performance is excited to announce the appointment of Dr. Erich Strohmaier of Lawrence Berkeley National Laboratory, USA, and Prof. Yutong Lu from the National Supercomputing Center in Guangzhou, China, as ISC Fellows. ISC Fellows are individuals who have made and continue to make important contributions to the advancement of high performance computing (HPC), the community and the ISC High Performance conference series.

Dr. Strohmaier and Prof. Lu will be added to the list of ISC High Performance Fellows, which currently boasts eight fellows.

The awards will be presented on Monday, June 19, during the opening session of the ISC High Performance conference. This year’s event will be held at Messe Frankfurt from June 18 – 22, and will be attended by over 3,000 HPC community members, including researchers, scientists and business people.

Erich Strohmaier is a Senior Scientist and leads the Performance and Algorithms Research Group of the Computational Research Division at the Lawrence Berkeley National Laboratory. His current research focuses on performance characterization, evaluation, modeling, and prediction for HPC systems; analysis of advanced computer architectures and parallel programming paradigms; classification of and programming patterns for scientific computational kernels; and analysis and optimization of data-intensive large scale scientific workflows. Together with Prof. Dr. Hans W. Meuer he devised and founded the TOP500 project in 1993.

Strohmaier was awarded the 2008 ACM Gordon Bell Prize for parallel processing research in the special category for algorithmic innovation. He received a Diploma in Physics and a Dr. rer. nat. in Theoretical Physics from the Ruprecht-Karls-University of Heidelberg, Germany.

“After finishing with university, I started my professional career in HPC by attending ISC – then called the Mannheimer Supercomputer Seminar – on the first day of my new job,” recalled Strohmaier. “ISC has remained the annual highlight of my professional life ever since. The event has come a long way from its early days, and is now rightfully considered one of the top conferences and exhibitions in the larger field of HPC. I am very happy and feel extremely honored to be named an ISC Fellow and hope that I will be able to contribute to it in the future.”

Yutong Lu is the Director of National Supercomputing Center in Guangzhou, China. She is also the professor in the School of Computer Science at Sun Yat-sen University, as well as at the National University of Defense Technology (NUDT). Her extensive research and development work has spanned several generations of domestic supercomputers in China, which includes her role as the deputy chief designer of the Tianhe supercomputers. She is also leading a number of HPC and big data projects under the support of the Chinese Ministry of Science and Technology, the National Natural Science Foundation of China, and Guangdong Province.

Lu received a first-class award and an outstanding award for Chinese national science and technology progress in 2009 and 2014. Her continuing research interests include parallel operating systems, high-speed communication, large-scale file systems and data management, and advanced programming environments and applications.

“It is a great honor to be appointed as an ISC Fellow,” said Lu. “The conference series has helped expand our international exchanges. ISC is a family that connects HPC communities from east to west, and from academia to industry. I have met many interesting, talented and creative people as a result of my participation in the event.”

“In the future, I want to continue to work with my colleagues to help eliminate the gap between HPC systems and applications, and lessen regional imbalances of HPC technology. I believe that the world should share research achievements in HPC, allowing science, engineering and society to take advantage of the latest developments. I hope to contribute to this effort.”

About ISC High Performance

First held in 1986, ISC High Performance is the world’s oldest and Europe’s most important conference and networking event for the HPC community. It offers a strong five-day technical program focusing on HPC technological development and its application in scientific fields, as well as its adoption in commercial environments.

Over 400 speakers and 150 exhibitors, consisting of leading research centers and vendors, will greet attendees at ISC High Performance. A number of events complement the Monday – Wednesday keynotes, including the Distinguished Speaker Series, the Industrial Day, The Deep Learning Day, Tutorials, Workshops, the Research Paper Sessions, Birds-of-a-Feather (BoF) Sessions, Research Poster, the PhD Forum, Project Poster Sessions and Exhibitor Forums.

Source: ISC

The post ISC High Performance Names Erich Strohmaier and Yutong Lu as Conference Fellows appeared first on HPCwire.

Synopsys Announces Availability of Complete DesignWare CCIX IP Solution

Tue, 06/06/2017 - 08:19

MOUNTAIN VIEW, Calif., June 6, 2017 — Synopsys, Inc. (Nasdaq: SNPS) today announced immediate availability of its complete DesignWare CCIX IP solution, consisting of controller, PHY and verification IP delivering data transfer speeds up to 25Gbps and supporting cache-coherency for high-performance cloud computing applications. The Cache Coherent Interconnect for Accelerators (CCIX) standard allows accelerators and processors to access shared memory in a heterogeneous multi-processor system for significantly lower latency. In addition, CCIX leverages the PCI Express 4.0 line rates with extended speed modes to accelerate throughput up to 25Gbps for applications such as machine learning, network processing and storage off-load. The new DesignWare CCIX IP solution is built on Synopsys’ silicon-proven PCI Express 4.0 architecture, which has been validated in over 1500 designs and shipped in billions of units, enabling designers to lower integration risk, while accelerating adoption of the new standard.

Synopsys also announced yesterday that it has collaborated with Mellanox Technologies to successfully demonstrate full system interoperability between their two independently developed PCI Express 4.0 solutions. The demonstration includes a host, which contains the DesignWare Root Port Controller IP for PCI Express 4.0 specification in a DesignWare IP Prototyping Kit, connected to the Mellanox ConnectX-5 network adapter as a device. The demonstration shows full initialization of the PCI Express 4.0 interface including link up, configuration and enumeration at 16 GT/s. In addition, the two companies used the lane margining functionality between the host and device to assess the link quality for a more robust design. The demonstration will be presented at the PCI-SIG® Developers Conference 2017 in Synopsys’ Booth #2.

“CCIX leverages the PCI Express protocol to support several line rates with additional high-speed 25Gbps option to address the need for higher bandwidth, lower latency and ease of programming in data center applications,” said Gaurav Singh, chairman of the CCIX Consortium. “As a contributing member of the CCIX Consortium and with the availability of the DesignWare CCIX IP solution, Synopsys helps accelerate adoption of the standard and allows designers to deliver emerging data-intensive computing SoCs with a new class of interconnects.”

Synopsys’ interoperable CCIX controller, PHY and verification IP solution enables faster system integration. The RAS features in the DesignWare CCIX controller offer data protection and integrity in the datapath and read-access memory (RAM). In addition, debug capabilities, error injection and statistics monitoring give visibility into components such as link training and status state machine (LTSSM) and PHY equalization process for a more comprehensive system testing. The CCIX PHY IP uses power management features such as I/O supply underdrive, V-Boost OFF and decision feedback equalization (DFE) bypass to significantly reduce power consumption. The PHY optimizes performance across voltage and temperature variations and includes adaptive continuous time linear equalizer (CTLE), DFE and feed forward equalization (FFE) for superior signal integrity and jitter performance. Synopsys’ Verification IP for CCIX includes configurable environments, complete port-level checks and system-wide coherency checks for rapid coherency verification.

“As the industry’s most trusted IP provider for nearly two decades, Synopsys has consistently provided our customers with a broad portfolio of high-quality IP for emerging standards such as CCIX,” said John Koeter, vice president of marketing for IP at Synopsys. “By providing a complete CCIX IP solution based on our silicon-proven PCI Express architecture that has been used by more than 250 companies, Synopsys enables designers to achieve the multi-gigabit performance and cache coherency requirements of their cloud computing designs with less risk.”

Availability & Additional Resources

The DesignWare CCIX Controller, PHY and Verification IP for CCIX are available now.

For more information, visit DesignWare CCIX IP Solutions website.

Synopsys’ complete DesignWare IP solution for PCI Express 4.0 specification consisting of controllers, PHYs, IP Prototyping Kits, IP subsystems and verification IP are available now. In addition, software drivers for DesignWare IP for PCI Express Root Complex are available in the Linux kernel.

Visit Synopsys at the PCI-SIG Developers Conference (Booth #2) to see the demonstration.

About DesignWare IP

Synopsys is a leading provider of high-quality, silicon-proven IP solutions for SoC designs. The broad DesignWare IP portfolio includes logic libraries, embedded memories, embedded test, analog IP, wired and wireless interface IP, security IP, embedded processors and subsystems. To accelerate prototyping, software development and integration of IP into SoCs, Synopsys’ IP Accelerated initiative offers IP prototyping kits, IP software development kits and IP subsystems. Synopsys’ extensive investment in IP quality, comprehensive technical support and robust IP development methodology enable designers to reduce integration risk and accelerate time-to-market. For more information on DesignWare IP, visit www.synopsys.com/designware.

About Synopsys

Synopsys, Inc. (Nasdaq: SNPS) is the Silicon to Software partner for innovative companies developing the electronic products and software applications we rely on every day. As the world’s 15th largest software company, Synopsys has a long history of being a global leader in electronic design automation (EDA) and semiconductor IP and is also growing its leadership in software security and quality solutions. Whether you’re a system-on-chip (SoC) designer creating advanced semiconductors, or a software developer writing applications that require the highest security and quality, Synopsys has the solutions needed to deliver innovative, high-quality, secure products. Learn more at www.synopsys.com.

Source: Synopsys

The post Synopsys Announces Availability of Complete DesignWare CCIX IP Solution appeared first on HPCwire.

Ohio Supercomputer Center Runs Its Largest-Scale Calculation Ever

Tue, 06/06/2017 - 08:09

COLUMBUS, Ohio, June 6, 2017 — Scientel IT Corp used 16,800 cores of the Owens Cluster on May 24 to test database software optimized to run on supercomputer systems. The seamless run created 1.25 terabytes of synthetic data. Big Data specialist Scientel developed Gensonix Super DB, a software product designed for big data environments that can use thousands of data-processing nodes, compared to other database software that uses considerably fewer nodes at a time. Scientel CEO Norman Kutemperor said Gensonix Super DB is the only product designed and optimized for supercomputers to take full advantage of high-performance computing architecture that helps support big data processing.

“This is a wonderful testimonial of the capabilities of Gensonix Super DB for Big Data,” Kutemperor said. “The robust nature of the OSC Owens Cluster provided the reliability for this large parallel job.”

To demonstrate the power of Gensonix Super DB, the Scientel team created a sample weather database application to run on OSC’s Owens Cluster. For this rare large run, Scientel used 600 of the system’s 648 available compute nodes. The Owens Cluster has additional nodes dedicated to GPU use and data analytics, for a total of 824 nodes on the Dell-built supercomputer. During the run, the Owens Cluster reached a processing speed of over 86 million data transactions per minute with no errors.

“As the largest run ever completed on OSC’s systems, Scientel helped us demonstrate the power of the Owens Cluster,” said David Hudak, Ph.D., OSC interim executive director. “Owens regularly delivers a high volume of smaller-scale runs, providing outstanding price performance for OSC’s clients. The ability to scale calculations to this size demonstrates another unique capability of Owens not found elsewhere in the state and unmatched by our previous systems.”

With satisfactory test results on the software, Scientel plans to apply Gensonix Super DB to processing large varieties of data and compute-intensive problems in areas such as cancer research, drug development, traffic analysis, and space exploration. A single application written for Gensonix Super DB can use more than 100,000 cores to handle multiple petabytes of data in real time.

“[The OSC staff] are extremely knowledgeable and very capable of understanding customer requirements, even when jobs are super scaled,” Kutemperor said. “Their support and enthusiasm for projects of this nature are outstanding.”

The Ohio Supercomputer Center recently displayed the power of its new Owens Cluster by running the largest-scale calculation in the Center’s history.

The Ohio Supercomputer Center (OSC), a member of the Ohio Technology Consortium of the Ohio Department of Higher Education, addresses the expanding computational demands of academic and industrial research communities by providing a robust shared infrastructure and proven expertise in advanced modeling, simulation and analysis. OSC empowers researchers with the vital services essential to make extraordinary discoveries and innovations, partners with businesses and industry to leverage computational science as a competitive force in the global knowledge economy, and leads efforts to equip the workforce with the key technology skills required to secure 21st century jobs. For more, visit www.osc.edu.

Scientel IT Corp. has been in the systems design and development business since 1977, supporting U.S. and international operations, with additional software development capabilities in Asia. Scientel’s expertise is NoSQL database design. Scientel also designs and produces highly optimized, high-end servers, which can be bundled with its Gensonix ENTERPRISE database software, making Scientel a single-source supplier of complete systems for Big Data environments. Scientel can also customize hardware and software for specific needs, resulting in significantly higher performance. For more, visit www.scientel.com.

Source: OSC

The post Ohio Supercomputer Center Runs Its Largest-Scale Calculation Ever appeared first on HPCwire.

SPEC Releases New Version of PTDaemon

Tue, 06/06/2017 - 08:04

GAINESVILLE, Va., June 6, 2017 — The Standard Performance Evaluation Corp. (SPEC) has released a new version of PTDaemon, the software that allows power measurement devices to be incorporated into performance evaluation software. PTDaemon provides a common TCP/IP-based interface that can be integrated into different benchmark harnesses. It runs in the background to offload control of the power analyzer or temperature sensor to a system other than the one under test. Benchmarks using PTDaemon include SPECpower_ssj2008, SPECvirt_sc2013 and the upcoming SPEC CPU2017. PTDaemon is also used in the Server Efficiency Rating Tool (SERT) suite and the Chauffeur WDK.
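As a rough illustration of how a benchmark harness can talk to that TCP/IP interface, the sketch below opens a socket to the machine running PTDaemon and exchanges plain-text commands. The host, port and command strings are illustrative assumptions, not the documented PTDaemon command set; the real protocol is described in SPEC’s PTDaemon documentation.

```python
import socket

# Minimal sketch of a harness polling a PTDaemon instance over TCP.
# PTD_HOST, PTD_PORT and the command strings are assumptions for illustration,
# not the documented PTDaemon protocol.
PTD_HOST = "192.168.0.50"   # controller machine attached to the power analyzer (not the SUT)
PTD_PORT = 8888             # port PTDaemon was started on (assumed)

def send_command(sock, command):
    """Send one newline-terminated text command and return a single response line."""
    sock.sendall((command + "\r\n").encode("ascii"))
    return sock.makefile("r", encoding="ascii").readline().strip()

with socket.create_connection((PTD_HOST, PTD_PORT), timeout=5) as sock:
    print(send_command(sock, "identify"))    # hypothetical command: query the attached analyzer
    print(send_command(sock, "read power"))  # hypothetical command: fetch the latest watt reading
```

The point of the design is visible even in this toy client: because PTDaemon owns the instrument connection, the system under test never has to load analyzer drivers, and any harness that can open a TCP socket can collect power readings.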

The latest version of the software, PTDaemon 1.8.1, includes new support for Windows USBTMC using Keysight drivers and the Hioki PW3335 (single channel) power measurement device.

Users of SPECpower_ssj2008 and the SERT suite are encouraged to download and apply the update.

Award-winning paper available

In other server efficiency news, the Autopilot tool created by the SPEC RG Power Working Group won the best demo/poster award at the International Conference on Performance Engineering 2017 (ICPE 2017) in L’Aquila, Italy. Autopilot, a plug-in for the Eclipse IDE, enables fast and easy building and deployment of a workload for testing server efficiency. A paper describing the tool can be downloaded from the SPEC website.

About SPEC

SPEC is a non-profit organization that establishes, maintains and endorses standardized benchmarks and tools to evaluate performance and energy consumption for the newest generation of computing systems. Its membership comprises more than 120 leading computer hardware and software vendors, educational institutions, research organizations, and government agencies worldwide.

Source: SPEC

The post SPEC Releases New Version of PTDaemon appeared first on HPCwire.
