HPC Wire

Since 1987 - Covering the Fastest Computers in the World and the People Who Run Them

Dataiku Academic Brings Data Science Software to Students

Tue, 05/09/2017 - 11:13

NEW YORK, May 9, 2017 — Dataiku, the maker of the enterprise data science platform Dataiku Data Science Studio (DSS), has announced the Dataiku Academic Program – an initiative to provide completely free licensing, support, and learning resources for teachers, researchers, and students in data science and business analytics fields. The company looks to enable the next generation of data scientists by helping students learn data science concepts within Dataiku DSS, the software platform that companies around the world use to develop and run advanced data science solutions.

The platform allows users to leverage the fundamental data science coding languages (R, Python, SQL) as well as the most state-of-the-art machine learning algorithms, open source, and big data technologies (Scala, Spark, Hadoop) to provide a complete tool for data science students, as well as their teachers.

The visual interface of Dataiku DSS – further improved with the recent release of version 4.0 – empowers students with a less technical background to learn the data mining process and build projects from raw data to predictive application without having to write a single line of code. In addition, real-time collaboration allows students to work on group projects simultaneously or follow a professor’s actions live.

Learn how Professor Dan Adelman at the University of Chicago uses Dataiku DSS to teach students

For researchers, the platform can be used to create powerful models and predictions using the most recent technologies in data science and machine learning.

“Our goal at Dataiku is to help everyone, everywhere, with their data analysis and predictive modeling skills,” said Florian Douetteau, CEO of Dataiku. “A vital part of that is to provide free licenses for our software and specific support resources for academics, researchers, and personal learning to help the next generation of data scientists learn, create, and succeed.”

Free academic licenses of the Dataiku DSS platform include access to:

  • Exclusive training material.
  • Community support.
  • Visibility on the Dataiku blog, social media, and recruitment channels.
  • Student events, meetups, and hackathons.
  • Dataiku DSS Enterprise Edition setup for students on the cloud or in-house servers.
  • Certifications for students.
  • Expert guest lectures from Dataiku data scientists.

To learn more and to request a Dataiku Academic License visit: http://pages.dataiku.com/dataiku-academic-program

Source: Dataiku

The post Dataiku Academic Brings Data Science Software to Students appeared first on HPCwire.

Cavium QLogic Accelerates NVMe over Fabrics Adoption

Tue, 05/09/2017 - 10:26

SAN JOSE, Calif., May 9, 2017 — Today, Cavium, Inc. (NASDAQ: CAVM), a leading provider of products that enable intelligent processing for enterprise and cloud data centers, announced that its family of QLogic Gen 6 Fibre Channel and FastLinQ Ethernet adapters will support NVMe over Fabrics (NVMe-oF) technology, enabling the scale out of flash storage platforms.

NVMe-oF defines an efficient mechanism for using NVMe devices in large-scale storage deployments and provides investment protection by allowing the latest advances in low-latency SSD flash to be realized over proven Fibre Channel and Ethernet RDMA fabrics. NVMe over Fabrics enables NVMe storage devices to be shared, pooled and managed more effectively across a fabric.

With support for 32GFC and 100Gbps Ethernet, the 2690 Series Enhanced Gen 5 and 2700 Series Gen 6 Fibre Channel adapters and the FastLinQ 45000 Series Ethernet adapters provide streamlined access to low-latency NVMe SSDs, enabling scale-out architectures, simplifying design, reducing overhead and improving performance.

Cavium has commenced an early access program for NVMe over Fibre Channel (FC-NVMe) and NVMe over Universal RDMA (NVMe-oF) for select partners, with plans to expand the program in Q2’17.

Cavium QLogic NVMe over Fibre Channel (FC-NVMe) Solution

Next-generation data intensive workloads are utilizing low latency NVMe flash-based storage to meet ever increasing user demand. By combining the lossless, highly deterministic nature of Fibre Channel with NVMe, FC-NVMe delivers the performance, application response time, and scalability needed for next generation data centers, while leveraging existing Fibre Channel infrastructure. Currently shipping QLogic 2690 Series of Enhanced Gen 5 and 2700 Series of Gen 6 Fibre Channel adapters are FC-NVMe ready and are being evaluated in FC-NVMe fabrics by multiple customers and partners across the industry.

Key benefits of QLogic FC-NVMe Solution are:

  • Leading the standard: Cavium chairs the T11 committee defining the FC-NVMe specification. With broad support from industry, the specification achieved internal Letter Ballot status on December 8th, 2016, and is on track to become a ratified standard later this year.
  • Industry’s Highest Performance: QLogic Fibre Channel adapters deliver the industry’s highest transactional performance of up to 2.6 Million IOPS, enabling the most demanding enterprise workloads that have the potential to leverage flash and NVMe. The QLogic 2700 series Gen 6 Fibre Channel adapter portfolio is the broadest in the industry with single, dual and quad port variants.
  • Concurrent Operation and Investment Protection: With QLogic technology, FC-NVMe workloads can be seamlessly introduced into existing FCP-SCSI fabrics. With QLogic 2690 and 2700 Series FC HBAs, FCP-SCSI and FC-NVMe protocol traffic can run concurrently without requiring any rip and replace of existing infrastructure.
  • Advanced SAN Fabric Management – QLogic StorFusion™ technology delivers a full suite of diagnostics, rapid provisioning and QoS services throughout the fabric which automate and simplify SAN deployment and orchestration.
  • Software Defined Storage – Cavium is driving innovation in software defined storage platforms with target mode drivers for the Storage Performance Developer Kit (SPDK) project, which provides a high performance user-space device implementation under the BSD license.

Cavium FastLinQ NVMe over Ethernet Universal RDMA (NVMe-oF) Solution

Driven by the performance demands of NVMe, high performance, low latency networking is a fundamental requirement for a fabric to scale out. Ethernet based RDMA fabrics with their exceptional low latency and offload capabilities will become a popular choice for NVMe over Fabrics. Cavium FastLinQ 45000 Series 10/25/40/50/100GbE Ethernet adapters support Universal RDMA (RoCE, RoCEv2 and iWARP) and deliver the ultimate choice to customers for scaling out NVMe over a general-purpose Ethernet fabric.

Key benefits of Cavium FastLinQ NVMe over Ethernet RDMA Solution are:

  • Broad Spectrum of Ethernet connectivity speeds – 10/25/40/50/100GbE to host the most demanding enterprise, telco and cloud applications and deliver scalability to drive business growth, available in cost optimized standard, blade and OCP form factors.
  • Universal RDMA –  Industry’s only network adapter that delivers customers technology choice and investment protection with support for concurrent RoCE, RoCEv2 and iWARP for NVMe over Fabrics.
  • Server Virtualization – Optimize infrastructure costs and increase virtual machine density by leveraging in-built technologies like SR-IOV and Network Partitioning (NPAR) that deliver acceleration and QoS for virtual workloads and infrastructure traffic.
  • Standards Compliant: Cavium NVMe over Ethernet RDMA solutions are compliant with the currently released version of the NVMe over RDMA Fabrics specification.

Cavium 45000 Series 10/25/40/50/100GbE, QLogic 2690 Series Enhanced Gen 5 16GFC and 2700 Series Gen 6 32GFC adapters are available from Cavium and multiple leading OEMs and ODMs.

To get access to the Cavium QLogic NVMe-oF Evaluation Kit and for more information, visit www.qlogic.com/nvmeof

“NVMe over Fabric technologies are unlocking the value of data by delivering a low latency, scalable, secure and proven fabric to extend the reach of NVMe across a large-scale fabric,” said Raghib Hussain, Chief Operating Officer, Cavium. “Cavium QLogic Fibre Channel and FastLinQ Ethernet Universal RDMA solutions for NVMe deliver the ultimate choice to customers looking to bring the performance and economics of NVMe to enterprise and cloud datacenters.”



About Cavium

Cavium, Inc. (NASDAQ: CAVM), offers a broad portfolio of infrastructure solutions for compute, security, storage, switching, connectivity and baseband processing. Cavium’s highly integrated multi-core SoC products deliver software compatible solutions across low to high performance points enabling secure and intelligent functionality in Enterprise, Datacenter and Service Provider Equipment. Cavium processors and solutions are supported by an extensive ecosystem of operating systems, tools, application stacks, hardware reference designs and other products. Cavium is headquartered in San Jose, CA with design centers in California, Massachusetts, India, Israel, China and Taiwan.

Source: Cavium


One Stop Systems Exhibits Various High-Density GPU Appliances at GTC

Tue, 05/09/2017 - 10:18

SAN JOSE, Calif., May 9, 2017 – One Stop Systems (OSS), a leading provider of high performance computing appliances for a multitude of HPC applications, today will exhibit a wide array of its high-density GPU appliances. The ExpressBox 3600 (EB3600) and the High Density Compute Accelerator (HDCA) provide 168 and 299 Tflops of half-precision performance respectively using NVIDIA Tesla P100 for PCIe GPU accelerators. The GPUltima consists of eight nodes of the HDCA and provides up to 2.4 Petaflops of half-precision performance with a total of 128 NVIDIA Tesla P100 for PCIe GPU Accelerators. OSS will also exhibit the OSS-PASCAL4 for the first time. The OSS-PASCAL4 is a GPU Accelerated Server that can accommodate up to four NVIDIA P100 SXM2 GPUs and provide 84 Tflops of half-precision performance.
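The quoted totals can be sanity-checked with simple arithmetic. The sketch below is only an illustration: the per-GPU peak figures (18.7 Tflops for the PCIe P100 and 21.2 Tflops for the SXM2 P100) are assumptions taken from NVIDIA's published specifications rather than from the announcement, and the GPU counts for the EB3600 and HDCA are inferred from the totals, not stated by OSS.

```python
# Assumed per-GPU peak half-precision (FP16) throughput in Tflops, from
# NVIDIA's published Tesla P100 specifications (not from the announcement).
P100_PCIE_FP16_TFLOPS = 18.7
P100_SXM2_FP16_TFLOPS = 21.2


def aggregate_tflops(gpu_count, per_gpu_tflops):
    """Return the theoretical peak of gpu_count identical accelerators."""
    return gpu_count * per_gpu_tflops


def implied_gpu_count(total_tflops, per_gpu_tflops):
    """Return the whole number of GPUs that best explains a quoted total."""
    return round(total_tflops / per_gpu_tflops)


# GPU counts stated in the announcement.
gpultima_tflops = aggregate_tflops(128, P100_PCIE_FP16_TFLOPS)
pascal4_tflops = aggregate_tflops(4, P100_SXM2_FP16_TFLOPS)

# GPU counts NOT stated in the announcement; inferred from the quoted totals.
eb3600_gpus = implied_gpu_count(168, P100_PCIE_FP16_TFLOPS)
hdca_gpus = implied_gpu_count(299, P100_PCIE_FP16_TFLOPS)

print(f"GPUltima peak: {gpultima_tflops / 1000:.2f} Pflops")
print(f"OSS-PASCAL4 peak: {pascal4_tflops:.1f} Tflops")
print(f"Implied GPUs per EB3600: {eb3600_gpus}")
print(f"Implied GPUs per HDCA: {hdca_gpus}")
```

Under those assumed per-GPU figures, 128 PCIe cards come to about 2.4 Petaflops and eight HDCA nodes at the implied 16 cards each account for the GPUltima's 128 accelerators. All of these are theoretical peaks, not measured application performance.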

Visitors to the GPU Technology Conference (GTC) this week in San Jose can view these GPU Appliances in the One Stop Systems’ booth #304.

“One Stop Systems offers a wide variety of GPU appliances with different power density solutions to support a range of customer needs,” said Steve Cooper, OSS CEO. “Customers can choose a solution that fits their rack space and budget while still ensuring they get the compute power they need for their application. OSS GPU Appliances allow for tremendous performance gains in many applications like deep learning, oil and gas exploration, financial calculations, and medical devices. As GPU technology continues to improve, OSS products are immediately able to accommodate the newest and most powerful GPUs.”

All of One Stop Systems’ GPU Appliances are available with the latest NVIDIA Tesla GPUs. Pricing is determined by the customer’s specific configuration. OSS sales engineers are available to assist customers in designing the right system for their requirements.

About One Stop Systems

One Stop Systems designs and manufactures supercomputers for high performance computing (HPC) applications such as deep learning, oil and gas exploration, financial trading, defense and any other applications that require the fastest and most efficient data processing. By utilizing the power of the latest GPU accelerators and flash storage cards, our systems stay on the cutting edge of the latest technologies. Our equipment also costs less than other solutions and we occupy much less rack space in the data center. We have a reputation as innovators using the very latest technology and design equipment to operate with the highest efficiency. Visit www.onestopsystems.com for more information.

Source: One Stop Systems


SkyScale Announces the World’s Fastest Cloud Computing Service

Tue, 05/09/2017 - 10:15

SAN JOSE, Calif., May 9, 2017 – SkyScale announced the launch of its world-class, ultra-fast multi-GPU hardware platforms in the cloud, available for lease to customers desiring the fastest performance available as a service anywhere on the globe.

SkyScale, a High Performance Computing-as-a-Service (HPCaaS) partnership with One Stop Systems (OSS), features OSS’s renowned compute and flash storage systems. OSS is the only company to have successfully commercialized systems with sixteen of the latest NVIDIA P100 GPU accelerators per node, which can be clustered and scaled. The dedicated, non-virtualized systems provide a competitive edge, especially in deep learning, oil and gas simulation, financial analysis, product and software development, genomics, medical imaging, and more.

“Employing OSS compute and flash storage systems gives SkyScale customers the overwhelming competitive advantage in the cloud they have been asking for,” said Tim Miller, President of SkyScale. “The systems we deploy at SkyScale in the cloud are identical to the hundreds of systems in the field and in the sky, trusted in the most rigorous defense, space, and medical deployments with cutting-edge, rock-solid performance, stability, and reliability. By making these systems accessible on a time-rental basis, developers have the advantage of using the most sophisticated systems available to run their algorithms without having to own them, and they can scale up or down as needs change. SkyScale is the only service that gives customers the opportunity to lease time on these costly systems, saving time-to-deployment and money.”

“But it wasn’t enough to just have world-class equipment and an experienced team,” emphasized Steve Cooper, CEO of equipment maker OSS. “Customers, current and prospective, have been asking us for more than a year to make our equipment available in the cloud because the processing power and speed we offer is unavailable elsewhere. But it was critical to us to have a colocation partner that could provide enterprise-level security that was acceptable to our most demanding customers before we would consider offering the service. Once we found that partner, which not only met, but exceeded those demands, we moved ahead quickly and SkyScale is the fantastic result!”

SkyScale also offers extensive supporting software, at no additional cost. Machine learning framework options include Caffe, Torch, TensorFlow, and Theano, plus preinstalled machine learning libraries including MLPython, cuDNN, DIGITS, Caffe on Spark and more. An expanding list of partnerships with advanced software suppliers provides for easier license management as well.

SkyScale’s standard plans feature:

  • Dedicated, non-virtualized platforms, built to scale up with nearly limitless power
  • SkyScale’s full software library pre-loaded, or load your own applications
  • Enterprise-grade intrusion prevention systems
  • 24/7 manned-security, biometric identity verification and HD camera coverage
  • Weekly, monthly, or yearly rentals, with discounts for 6 months or more
  • Rent to own scenarios

Prices range from $1,300 per week for rental of a machine with four P100 GPUs, to $4,800 per week for a system with sixteen P100 GPUs. Visitors to GTC 2017 can learn more about SkyScale’s offerings at booth #432. Visit www.SkyScale.com for additional information and pricing, or call the SkyScale team for a 30-minute demo or a quote for specific requirements (888-236-5454). Systems are available for immediate scheduling.
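For comparison with hourly cloud pricing, the two quoted weekly rates can be converted to a cost per GPU-hour. This is a rough sketch that assumes the machine is used around the clock for the full week (168 hours) and ignores the long-term discounts mentioned above; the plan labels are made up for illustration.

```python
HOURS_PER_WEEK = 7 * 24

# Weekly prices and GPU counts quoted in the announcement.
# The dictionary keys are illustrative labels, not SkyScale plan names.
plans = {
    "4x P100": {"gpus": 4, "usd_per_week": 1300},
    "16x P100": {"gpus": 16, "usd_per_week": 4800},
}


def cost_per_gpu_hour(gpus, usd_per_week):
    """Weekly price spread over every GPU for every hour of the week."""
    return usd_per_week / (gpus * HOURS_PER_WEEK)


rates = {name: cost_per_gpu_hour(**plan) for name, plan in plans.items()}

for name, rate in rates.items():
    print(f"{name}: ${rate:.2f} per GPU-hour")
```

On these assumptions the larger system works out slightly cheaper per GPU-hour than the four-GPU machine; lower utilization raises the effective hourly cost of both in proportion.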

About SkyScale

SkyScale is a world-class provider of cloud-based, ultra-fast multi-GPU hardware platforms for lease to customers desiring the fastest performance available as a service anywhere in the world. SkyScale builds, configures, and manages dedicated systems strategically located in maximum-security facilities, allowing customers to focus on results while minimizing capital equipment investment.

Source: SkyScale


‘Retired’ UC Berkeley Professor Tackles Google Tensor Processing Unit

Tue, 05/09/2017 - 09:53

Retirement, it seems, isn’t for everyone. Indeed, you may already know that David Patterson joined Google’s Tensor Processing Unit (TPU) development effort after a 40-year career in academia at the University of California at Berkeley. On Saturday, CNBC posted an engaging account of Patterson’s leap into his next career.

David Patterson

“Four years ago they (Google) had this worry and it went to the top of the corporation,” said Patterson, 69, while sporting a T-shirt for Google Brain, the company’s research group. The fear was that if every Android user had three minutes of conversation translated a day using Google’s machine learning technology, “we’d have to double our data centers,” Patterson is quoted as saying in the article* by Ari Levy, who covered a talk Patterson gave last week on roughly the one-year anniversary of his UC Berkeley retirement.

The short piece is a fun ride tracing the indefatigable Patterson’s path forward. It’s also an interesting reminder of shifting dynamics in computer technology development. Today’s big cloud providers – Google, Amazon, and Microsoft to name three – are far from mere technology consumers; they are in fact leading-edge developers. It’s no overstatement to say that the big system and semiconductor suppliers take many of their cues from hyperscalers’ efforts and directions.

Patterson is one of the lead authors on a report from Google last month on the TPU’s performance. The report concludes that the TPU runs 15 to 30 times faster and is 30 to 80 times more efficient than contemporary processors from Intel and Nvidia. The paper, written by 75 engineers, will be delivered next month at the International Symposium on Computer Architecture in Toronto.

Google TPU

According to the article, Google says the TPU is being tested broadly across the company. It’s used for every search query as well as for improving maps and navigation, and it was the technology used to power DeepMind’s AlphaGo victory over Go legend Lee Sedol last year in Seoul.

Levy writes “It’s still very early days for the TPU. Thus far, the processor has proven effective at what’s called inference, or the second phase of deep learning. The first phase is training and — as far as we know — for that Google still counts on off-the-shelf processors.”

* Link to CNBC article (Meet the 69-year-old professor who left retirement to help lead one of Google’s most crucial projects): http://www.cnbc.com/2017/05/06/googles-tpu-for-machine-learning-being-evangelized-by-david-patterson.html


Microsoft Azure Will Debut Pascal GPU Instances This Year

Mon, 05/08/2017 - 18:21

As Nvidia’s GPU Technology Conference gets underway in San Jose, Calif., Microsoft today revealed plans to add Pascal-generation GPU horsepower to its Azure cloud. Azure, which already includes an M60 and a K80 GPU-backed instance, will be adding P40 and P100-powered virtual machines to its lineup. The new instance families will not be available until “later in the year” according to Microsoft.

The P40 accelerators will be rolled out as part of the brand-new ND series instance, while the (PCIe-based) P100 will be included in the next generation NC-series, NCv2. Missing from the announcement was any mention of the open source HGX-1 servers, announced in March. We’re still waiting to hear how Azure will utilize the eight-way NVLink-connected P100 boxes developed under Project Olympus. A future NCv3 instance perhaps?

Since the N+X naming convention can get confusing, here’s a quick cheat sheet.

NV series: M60 GPUs
NC series: K80 GPUs
NCv2 series: P100 GPUs (new, not yet available)
ND series: P40 GPUs (new, not yet available)

Microsoft reports that the new ND-series, based on Pascal-generation P40 architecture, is excellent for training and inference. “These instances provide over 2x the performance over the previous generation for FP32 (single precision floating point operations), for AI workloads utilizing CNTK, TensorFlow, Caffe, and other frameworks,” said Corey Sanders, director of compute, Azure, in a blog post. “The ND-series also offers a much larger GPU memory size (24GB), enabling customers to fit much larger neural net models.”

Microsoft and Nvidia emphasized the performance boost provided by GPUs for AI and deep learning workloads, including image recognition, speech training, and natural language processing, but also identified the benefit to traditional HPC workloads, such as reservoir modeling, DNA sequencing, protein analysis, Monte Carlo simulations and rendering.

Both Pascal based offerings in Azure provide a VM option with RDMA and InfiniBand connectivity to support HPC workloads and speed large-scale neural net training jobs spanning up to hundreds of GPUs.

ND Instance sizes

NCv2 Instance sizes

There’s a sign-up page here for those seeking a private preview of the new instance types. Microsoft says it will respond if “additional preview participants are needed.”

Other cloud providers, notably Nimbix, IBM and Cirrascale, have already deployed the Pascal-gen P100s in their clouds. Google says that P100s will be “coming soon” to its cloud. Tencent is in the process of incorporating P100 and P40 accelerators into its datacenters.


GOAI Publishes Python Data Frame for GPU Analytics

Mon, 05/08/2017 - 16:29

A group of data analytics vendors joined forces today at the GPU Technology Conference to create the GPU Open Analytics Initiative (GOAI), with the goal of fostering a community around data science and deep learning workloads running on GPUs. The group also unveiled a Python-based API as a first step toward that goal.

Continuum Analytics, H2O.ai and MapD Technologies are the founding members of GOAI, which was unveiled at Nvidia’s annual GPU Technology Conference in San Jose, California. The vendors say that, while each of them has a powerful framework, the lack of a common standard data format hinders intercommunication among the various applications.

Without the capability to access and work with the same data in a GPU environment, the vendors say, workflows slow down, latency rises, and analytic workflows running on GPUs become more complex.

The group proposed a new data standard to address this concern. Called the GPU Data Frame, the standard facilitates the interchange of data among various processes running on the GPU. It currently exposes a Python API.

The new GPU Data Frame API enables end-to-end computation on the GPU, which therefore “avoids transfers back to the CPU or copying of in-memory data, reducing compute time and cost for high-performance analytics common in artificial intelligence workloads,” the group says in a press release.

The announcement continues:

“Users of the MapD Core database can output the results of a SQL query into the GPU Data Frame, which then can be manipulated by the Continuum Analytics’ Anaconda NumPy-like Python API or used as input into the H2O suite of machine learning algorithms without additional data manipulation.”

Early tests show that, by keeping the data resident in the GPU and avoiding round-trips back to the CPU, processing times decreased by an order of magnitude, the group says.
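The idea behind that saving can be shown with a toy model that counts copies across the CPU-GPU boundary. This is a sketch only: it does not use the GPU Data Frame API or any GPU library, the class, function, and stage names are hypothetical, and it counts transfers rather than measuring time.

```python
from dataclasses import dataclass


@dataclass
class TransferLog:
    """Counts simulated copies across the CPU-GPU boundary."""
    host_to_device: int = 0
    device_to_host: int = 0

    @property
    def total(self):
        return self.host_to_device + self.device_to_host


def run_with_round_trips(stages):
    """Every stage uploads its input to the GPU and downloads its output."""
    log = TransferLog()
    for _stage in stages:
        log.host_to_device += 1
        log.device_to_host += 1
    return log


def run_gpu_resident(stages):
    """Stages share one GPU-resident frame, however many stages there are."""
    log = TransferLog()
    log.host_to_device += 1   # initial load of the source data
    # Each stage in `stages` reads and writes GPU memory, so none adds a copy.
    log.device_to_host += 1   # final result returned to the caller
    return log


# Hypothetical pipeline mirroring the query -> preprocess -> train flow.
STAGES = ["sql_query", "preprocess", "train"]

round_trip_log = run_with_round_trips(STAGES)
resident_log = run_gpu_resident(STAGES)

print("copies with CPU round-trips:", round_trip_log.total)
print("copies with a shared GPU frame:", resident_log.total)
```

In this model the number of copies grows with the number of stages when each tool stages its data through the CPU, but stays constant when the stages share a GPU-resident frame. The actual speedup depends on data size and interconnect bandwidth, which the sketch does not model.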

Todd Mostak, CEO and co-founder of MapD Technologies and one of Datanami’s 2017 People to Watch, says that, while the data science community is rapidly adopting GPUs for machine learning and deep learning workloads, the need to involve CPUs for tasks like subsetting and preprocessing of training data is creating a bottleneck.

“The GPU Data Frame makes it easy to run everything from ingestion to preprocessing to training and visualization directly on the GPU,” he says in the announcement. “This efficient data interchange will improve performance, encouraging development of ever more sophisticated GPU-based applications.”

Travis Oliphant, co-founder and chief data scientist of Continuum Analytics and also one of Datanami’s 2017 People to Watch, says the approach will benefit Anaconda users who are using GPUs.

“Using NVIDIA’s technology, Anaconda is mobilizing the Open Data Science movement by helping teams avoid the data transfer process between CPUs and GPUs and move nimbly toward their larger business goals,” he says in the press release.

Sri Ambati, CEO and co-founder of H2O.ai, says he’s excited about GOAI’s potential to drive a truly diverse open source ecosystem. “GOAI is a call for the community of data developers and researchers to join the movement to speed up analytics and GPU adoption in the enterprise,” he says.

Joining the three co-founders of GOAI are three additional data outfits: BlazingDB, a scale-out data warehousing outfit with a proprietary file format for petabyte-scale data sets; Graphistry, which develops a GPU-based data store and a visual analytics language; and Gunrock, an open source, high-performance graph primitive for GPU led by UC Davis’s John Owens.

GOAI has published some of its specs at github.com/gpuopenanalytics.

In other news, MapD also announced that its database is now open source, which matches the code status of its two GOAI co-founders.


IBM Power Systems Academic Initiative Reaches Nearly 600 Schools Worldwide

Mon, 05/08/2017 - 11:50

ARMONK, N.Y., May 6, 2017 — The IBM (NYSE: IBM) Power Systems Academic Initiative (PSAI) recently reached a major milestone with nearly 600 participating schools now in the program.

The PSAI is an innovative, flexible program designed to assist colleges and universities in the education of students in Power Systems technologies and concepts. The program offers college and high school faculty a variety of educational materials and resources to expand and enhance their curricula. Best of all, there are no costs associated with participating in the program.

Spanning the globe

The PSAI has witnessed more than 300 percent growth in the number of member schools over the last four years alone. PSAI’s 600 schools span 67 countries across six continents. This growth will provide IBM clients around the world with a broad range of qualified graduates with Power Systems skills.

PSAI educational offerings include IBM Power Systems courses covering Linux on Power, IBM i, and AIX operations and administration. The course catalog, which is available at PSAI’s OnTheHub storefront, includes beginner, intermediate and advanced courses for each operating system. In addition, the Power Systems Academic Cloud is available to faculty members for research and teaching purposes and contains Power Systems that run Linux on Power, IBM i and AIX. Students can also access PSAI’s North America Job Board, which is updated daily with entry-level and internship opportunities.

Power Systems Academic Initiative

Check out the Power Systems Academic Initiative and learn more about the schools around the world that are educating the next generation of IBM Power Systems administrators, programmers, and information technology specialists.

For a complete list of PSAI participating schools or for more information about the program, please visit: http://www.ibm.com/university/power.

Source: IBM


PNY Demos NVIDIA Quadro VCA Certified Systems, GP100 CAE at GTC

Mon, 05/08/2017 - 11:40

SAN JOSE, Calif., May 8, 2017 — PNY, a leading supplier of NVIDIA Quadro professional graphics solutions to system integrators, value-added resellers, and distributors, is showcasing the new family of NVIDIA Quadro Pascal architecture professional GPU-fueled solutions, including:  NVIDIA Quadro VCA (Visual Computing Appliance) Certified Systems featuring eight ultra-high-end Quadro P6000 GPUs for interactive photorealistic rendering with unmatched performance, the unique mixed-mode compute and NVLink capable Quadro GP100 running advanced CAE software, as well as exciting new Quadro embedded (MXM) solutions.

Announced at GTC, the Quadro VCA Certified System Program enables select PNY partners to offer powerful turnkey rendering appliances, capable of supporting multiple users simultaneously, over network environments ranging from departmental LANs to the Internet.  Running NVIDIA’s innovative VCA software, these systems offer batch or interactive streaming rendering options, intuitive browser-based queue management, support for NVIDIA Iray and MDL technology for ease-of-use, and are exactingly specified, tested, and certified for maximum uptime.  Other GPU-accelerated rendering options like SOLIDWORKS Visualize Professional and Chaos Group’s V-Ray are also supported.

Demonstrations and displays at the PNY booth will include:


  • Features eight ultra-high-end Quadro P6000 GPUs running SOLIDWORKS Visualize Professional and NVIDIA VCA software on a VCA Certified System, which offers unmatched stability and performance for mission-critical raytracing rendering workflows driven directly from CAD files.
  • Client workstation networked to the VCA Certified System for accelerated photorealistic raytracing rendering.
  • New embedded solutions offering the same powerful Quadro Pascal GPU performance in a small, low-power MXM form factor for ruggedized or custom hardware applications.
  • NVIDIA Quadro GP100 graphics boards paired with NVLink to double the GPU memory footprint and scale application performance by enabling high-bandwidth GPU-to-GPU data transfers at up to 80 GB/s.
  • ANSYS 18 engineering simulation software GPU accelerated with the Quadro GP100 using double precision (FP64) and fast HBM2 memory to compute, validate and visualize engineering simulations.

“We invite developers and designers to come by our booth to see how the latest Pascal family of Quadro GPUs, including the uniquely compute enabled ultra-high-end GP100, are changing the future of manufacturing, computation, visualization, simulation and VR workflows,” said Steven Kaner, Vice President Sales & Marketing, PNY. “This transformative product, and offerings from our new VCA Certified System Program are opening up new frontiers across disciplines ranging from CAE to deep learning and AI, visualization and simulation, and beyond.”

NVIDIA Quadro graphics solutions from PNY are certified on 100+ professional software applications, come with a three-year warranty and are available from system integrators, value-added resellers, and distributors.  PNY Technologies, Inc. is the authorized NVIDIA Quadro channel partner for the Americas and Europe. For additional information, visit PNY at www.pny.com/pnypro or contact gopny@pny.com. 

About PNY Technologies, Inc.

Celebrating over 30 years of B2B, OEM and Channel expertise, PNY Technologies, Inc. is a leading supplier of NVIDIA Quadro, NVS and GeForce Graphics Boards, and manufacturer of PNY GeForce Graphics Boards, Solid State Drives, and USB Flash Drives.  Headquartered in Parsippany, N.J., PNY maintains facilities across North America, Europe and Asia.  PNY’s commitment to process improvement, quality, and customer service and support, has made the company a supplier of record to vendors across markets ranging from photorealistic rendering to Deep Learning (AI).

Source: PNY


GTC17: Inspur to Unveil 2U 8GPU AI Supercomputer

Mon, 05/08/2017 - 09:40

San Jose, California, May 8, 2017 – Inspur will unveil a new ultra-high density AI computing server at the upcoming 2017 GPU Technology Conference (GTC2017). The server is designed to accelerate artificial intelligence workloads and to provide superior application performance for science and engineering computing.

Inspur will unveil the new AI supercomputer AGX-2 at its booth #911 on May 10. The AGX-2 supports up to 8 NVIDIA® Tesla® P100 GPUs, offering either a PCI-e interface or NVLink 2.0 for faster interconnection between the CPU and GPU. Inspur describes the system as a successful innovation in AI computing servers that will offer global customers more efficient computing power.

Presently, Inspur has the most complete CPU server product line in single-computer 2/4/8 card configurations. At IPF17, held before GTC17, Inspur and Baidu co-launched the AI-SR Rack Scale Server, a hyper-scale AI computing platform for larger-scale data collection and deep neural networks. It extends a single computer to 16 GPU accelerator cards and meets the model training requirements of several hundred billion samples and a trillion parameters.

As a leading cloud computing manufacturer in China, Inspur is dedicated to providing strong computing power for artificial intelligence. Inspur is now a major provider of AI GPU servers for three of the world’s Super 7 CSPs (Baidu, Ali and Tencent). It also cooperates closely on AI systems and applications with leading companies such as Iflytek, Qihoo 360, Sogou, Toutiao and Face++, helping customers achieve order-of-magnitude application performance improvements in voice, image, video, search and networking.

Inspur’s schedule for the GTC17:

Supermicro Systems Deliver 170 TFLOPS FP16 of Peak Performance for AI at GTC

Mon, 05/08/2017 - 07:49

SAN JOSE, Calif., May 8, 2017 — GPU Technology Conference – Super Micro Computer, Inc. (NASDAQ: SMCI), a global leader in compute, storage and networking technologies including green computing, will exhibit new GPU-based servers at the GPU Technology Conference (GTC) from May 8 to 11 at the San Jose Convention Center, Booth #111.

Optimized applications for Supermicro GPU supercomputing systems include Machine Learning, Artificial Intelligence, HPC, Cloud and Virtualized graphics, and Hyperscale Workloads. Supermicro will have on display the SYS-1028GQ-TXRT and SYS-4028GR-TXRT with support for four and eight NVIDIA Tesla P100 SXM 2.0 modules, respectively, both featuring NVIDIA NVLink™ interconnect technology. Supermicro will also be displaying its multi-node GPU solutions and high-performance workstations with support for 4 PCIe 3.0 x16 slots.

Supermicro’s GPU solutions can be found at: https://www.supermicro.com/products/nfo/GPU_MIC.cfm

“Leveraging our extensive portfolio of GPU solutions, customers can massively scale their compute clusters to accelerate their most demanding deep learning, scientific and hyperscale workloads with fastest time-to-results, while achieving maximum performance per watt, per square foot, and per dollar,” said Charles Liang, President and CEO of Supermicro. “With our latest innovations in performance and density optimized 1U and 4U architectures that incorporate the new NVIDIA P100 processors with NVLink, our customers can achieve exponential improvements in deep learning application performance to address some of the world’s most complex and important challenges, while also saving money.”

“NVIDIA’s GPU computing platform provides a dramatic boost in application throughput for HPC, advanced analytics and AI workloads,” said Paresh Kharya, Tesla Product Management Lead at NVIDIA. “With our Tesla data center GPUs, Supermicro’s new high-density servers offer customers high performance and superior efficiency to address their most demanding computing challenges.”

Showcased Systems will include:

  • Supermicro’s SuperServer, SYS-4028GR-TXR(T), supports eight NVIDIA Tesla P100 SXM2 accelerators in 4U to provide maximum bandwidth for mission critical HPC clusters and hyperscale workloads. This solution uses NVIDIA NVLink GPU interconnect technology in a cube mesh architecture in tandem with RDMA fabric to improve latency of data access and transfer and maximize performance. With independent GPU and CPU thermal zones, this eight-GPU parallel computing solution ensures uncompromised performance and stability with up to 170 TFLOPS FP16 of peak performance.
  • The 1U SuperServer, SYS-1028GQ-TXR(T), is optimized to support four NVIDIA Tesla P100 SXM2 accelerators. This highly scalable solution implements a fully-connected quad GPU architecture utilizing NVIDIA’s 160GB/s NVLink interconnects with over 5x the total bandwidth of PCI-E 3.0. This SuperServer provides a non-preheat GPU thermal zone design, which ensures highest performance and stability under the most demanding workloads and can tackle the largest DL Models, in conjunction with high speed connectivity optimized for latency and bandwidth.
  • The Supermicro 2U dual-node TwinPro SuperServer, SYS-2028TP-DTFR, offers data center customers unmatched benefits when configured with two GPUs per node, including eight 2.5″ hot-swap SATA drive bays for unprecedented storage capability, FDR 56Gbps InfiniBand onboard, highest processing power, and high energy efficiency.
  • The SuperWorkstation, SYS-7048GR-TR, supports up to four GPU cards in a 4U Tower form factor with highest efficiency Titanium Level power supplies to bring the unparalleled power of GPU supercomputing to individual digital content creators.
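The headline figures for the two NVLink systems in the list above can be checked with quick arithmetic. The sketch below is only a back-of-envelope estimate under stated assumptions: NVIDIA's published half-precision peak of 21.2 TFLOPS per Tesla P100 SXM2 module, four NVLink links per GPU at 40 GB/s each (both directions combined), and a PCIe 3.0 x16 slot at 8 GT/s per lane with 128b/130b encoding. None of these per-device values appear in the release itself, and the function names are illustrative.

```python
# Back-of-envelope check of the peak figures quoted in the release.
# Assumed per-device values (not stated in the release itself):
P100_SXM2_FP16_TFLOPS = 21.2   # published half-precision peak per module
NVLINK_LINKS_PER_GPU = 4       # NVLink links on a Tesla P100
NVLINK_GBPS_PER_LINK = 40.0    # GB/s per link, both directions combined


def pcie3_x16_bidirectional_gbps() -> float:
    """PCIe 3.0 x16: 8 GT/s per lane, 128b/130b encoding, both directions."""
    per_direction = 8.0 * 16 * (128 / 130) / 8  # GB/s in one direction
    return 2 * per_direction


def peak_fp16_tflops(gpu_count: int) -> float:
    """Aggregate FP16 peak for a server with `gpu_count` P100 SXM2 modules."""
    return gpu_count * P100_SXM2_FP16_TFLOPS


def nvlink_total_gbps() -> float:
    """Aggregate NVLink bandwidth per GPU across all of its links."""
    return NVLINK_LINKS_PER_GPU * NVLINK_GBPS_PER_LINK


if __name__ == "__main__":
    print(f"8 x P100 FP16 peak: {peak_fp16_tflops(8):.1f} TFLOPS")
    print(f"4 x P100 FP16 peak: {peak_fp16_tflops(4):.1f} TFLOPS")
    ratio = nvlink_total_gbps() / pcie3_x16_bidirectional_gbps()
    print(f"NVLink total: {nvlink_total_gbps():.0f} GB/s, {ratio:.2f}x PCIe 3.0 x16")
```

Under these assumptions the eight-GPU system comes to just under 170 TFLOPS, and the NVLink aggregate lands slightly above five times a PCIe 3.0 x16 slot, which is consistent with the rounded figures in the release. These are theoretical peaks, not measured application performance.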

Follow Supermicro on Facebook and Twitter to receive their latest news and announcements.

About Super Micro Computer, Inc. (NASDAQ: SMCI)

Supermicro (NASDAQ: SMCI), the leading innovator in high-performance, high-efficiency server technology is a premier provider of advanced server Building Block Solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and Embedded Systems worldwide. Supermicro is committed to protecting the environment through its “We Keep IT Green” initiative and provides customers with the most energy-efficient, environmentally-friendly solutions available on the market. For more information on Supermicro compute solutions please visit http://www.supermicro.com.

Source: Supermicro

API Released for the IBM Quantum Experience

Mon, 05/08/2017 - 07:45

YORKTOWN HEIGHTS, N.Y., May 8, 2017 — While technologies that currently run on classical computers, such as Watson, can help find patterns and insights buried in vast amounts of existing data, quantum computers will deliver solutions to important problems where patterns cannot be seen because the data doesn’t exist and the possibilities that you need to explore to get to the answer are too enormous to ever be processed by classical computers.

In March 2017, IBM (NYSE: IBM) announced the industry’s first initiative to build commercially available universal quantum computing systems. “IBM Q” quantum systems and services will be delivered via the IBM Cloud platform.

IBM Q systems will be designed to tackle problems that are currently seen as too complex and exponential in nature for classical computing systems to handle. One of the first and most promising applications for quantum computing will be in the area of chemistry. Even for simple molecules like caffeine, the number of quantum states in the molecule can be astoundingly large, so large that all the conventional computing memory and processing power scientists could ever build could not handle the problem.

The IBM Q systems promise to solve problems that today’s computers cannot tackle, for example:

  1. Drug and Materials Discovery: Untangling the complexity of molecular and chemical interactions leading to the discovery of new medicines and materials
  2. Supply Chain & Logistics: Finding the optimal path across global systems of systems for ultra-efficient logistics and supply chains, such as optimizing fleet operations for deliveries during the holiday season
  3. Financial Services: Finding new ways to model financial data and isolating key global risk factors to make better investments
  4. Artificial Intelligence: Making facets of artificial intelligence such as machine learning much more powerful when data sets are very large, such as when searching images or video
  5. Cloud Security: Making cloud computing more secure by using the laws of quantum physics to enhance private data safety

As part of the IBM Q System, IBM has released a new API (Application Program Interface) for the IBM Quantum Experience that enables developers and programmers to begin building interfaces between its existing five quantum bit (qubit) cloud-based quantum computer and classical computers, without needing a deep background in quantum physics.  IBM has also released an upgraded simulator on the IBM Quantum Experience that can model circuits with up to 20 qubits. In the first half of 2017, IBM plans to release a full SDK (Software Development Kit) on the IBM Quantum Experience for users to build simple quantum applications and software programs.
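To give a sense of what a circuit simulator of this kind has to track, here is a minimal state-vector sketch in plain Python. It is not IBM's simulator and does not use the IBM Quantum Experience API; the function names are hypothetical, and the 16 bytes per amplitude is an assumption (one complex number stored as two 64-bit floats). The one firm fact it relies on is that an n-qubit state is described by 2^n complex amplitudes, so each added qubit doubles the storage.

```python
import math

BYTES_PER_AMPLITUDE = 16  # assumption: one complex number as two 64-bit floats

INV_SQRT2 = 1 / math.sqrt(2)
HADAMARD = ((INV_SQRT2, INV_SQRT2), (INV_SQRT2, -INV_SQRT2))


def state_vector_bytes(num_qubits):
    """Memory needed to hold all 2**n amplitudes of an n-qubit state."""
    return BYTES_PER_AMPLITUDE * 2 ** num_qubits


def zero_state(num_qubits):
    """Return the all-zeros basis state as a list of complex amplitudes."""
    state = [0j] * (2 ** num_qubits)
    state[0] = 1 + 0j
    return state


def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 gate to qubit `target` (bit 0 is the least significant)."""
    new_state = list(state)
    mask = 1 << target
    for i in range(len(state)):
        if (i & mask) == 0:
            j = i | mask  # partner index with the target bit set
            a0, a1 = state[i], state[j]
            new_state[i] = gate[0][0] * a0 + gate[0][1] * a1
            new_state[j] = gate[1][0] * a0 + gate[1][1] * a1
    return new_state


def apply_cnot(state, control, target):
    """Flip qubit `target` wherever qubit `control` is 1."""
    new_state = list(state)
    c_mask, t_mask = 1 << control, 1 << target
    for i in range(len(state)):
        if (i & c_mask) and not (i & t_mask):
            j = i | t_mask
            new_state[i], new_state[j] = state[j], state[i]
    return new_state


def bell_state():
    """Hadamard on qubit 0, then CNOT from qubit 0 to qubit 1."""
    state = apply_single_qubit_gate(zero_state(2), HADAMARD, 0)
    return apply_cnot(state, 0, 1)


if __name__ == "__main__":
    probabilities = [round(abs(a) ** 2, 3) for a in bell_state()]
    print("Bell state outcome probabilities:", probabilities)
    for n in (5, 20):
        print(f"{n} qubits: {state_vector_bytes(n):,} bytes of amplitudes")
```

The two-qubit circuit here produces an entangled state in which only the outcomes 00 and 11 occur, each with probability one half. The loop at the end shows how the amplitude storage scales for the five-qubit and 20-qubit sizes mentioned above.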

The IBM Quantum Experience enables anyone to connect to IBM’s quantum processor via the IBM Cloud, to run algorithms and experiments, work with the individual quantum bits, and explore tutorials and simulations around what might be possible with quantum computing.

For more information on IBM’s universal quantum computing efforts, visit www.ibm.com/ibmq.

For more information on IBM Systems, visit www.ibm.com/systems.

IBM is making the specs for its new Quantum API available on GitHub (https://github.com/IBM/qiskit-api-py) and providing simple scripts (https://github.com/IBM/qiskit-sdk-py) to demonstrate how the API functions.

About IBM Research

For more than seven decades, IBM Research has defined the future of information technology with more than 3,000 researchers in 12 labs located across six continents. Scientists from IBM Research have produced six Nobel Laureates, 10 U.S. National Medals of Technology, five U.S. National Medals of Science, six Turing Awards, 19 inductees in the National Academy of Sciences and 20 inductees into the U.S. National Inventors Hall of Fame.

Source: IBM

Mellanox to Present at Upcoming Investor Conferences

Mon, 05/08/2017 - 07:40

SUNNYVALE, Calif. & YOKNEAM, ISRAEL, May 8, 2017 — Mellanox Technologies, Ltd. (NASDAQ:MLNX), a leading supplier of end-to-end interconnect solutions for servers and storage systems, today announced that it will present at the following conferences during the second quarter of 2017:

  • Oppenheimer 17th Annual Israel Conference in Tel Aviv, Israel, Sunday, May 21st at 10:15 a.m. Israel Standard Time.
  • 2017 JP Morgan 45th Annual TMT Conference in Boston, MA, Tuesday, May 23rd at 10:40 a.m. Eastern Daylight Time.
  • 2017 Stifel Technology Conference in San Francisco, CA, Monday, June 5th at 10:15 a.m. Pacific Daylight Time.
  • Jefferies Israel Tech Trek in Tel Aviv, Israel, Tuesday, June 6th at 12:20 p.m. Israel Standard Time.

When available, a webcast of the live event, as well as a replay, will be available on the company’s investor relations website at: http://ir.mellanox.com.

About Mellanox

Mellanox Technologies (NASDAQ:MLNX) is a leading supplier of end-to-end InfiniBand and Ethernet smart interconnect solutions and services for servers and storage. Mellanox interconnect solutions increase data center efficiency by providing the highest throughput and lowest latency, delivering data faster to applications and unlocking system performance capability. Mellanox offers a choice of fast interconnect products: adapters, switches, software and silicon that accelerate application runtime and maximize business results for a wide range of markets including high performance computing, enterprise data centers, Web 2.0, cloud, storage and financial services. More information is available at: www.mellanox.com.

Source: Mellanox

Exa, BP Sign Multi-Year Commercial Agreement for Exa DigitalROCK

Fri, 05/05/2017 - 16:03

BURLINGTON, Mass., May 05, 2017 — Exa Corporation (Nasdaq:EXA), a global innovator of engineering software for simulation-based design, has announced a multi-year commercial agreement with BP for Exa’s DigitalROCK relative permeability software solution based on Exa’s unique multi-phase fluid flow simulation technology.

As a result of three years of cooperative research and validation, Exa is now bringing to market the first predictive computational solver for relative permeability, a goal that has long been elusive for the oil and gas industry.  This new capability will help engineering teams make more informed decisions on wells, production facilities and resource progression, including enhanced oil recovery.

Dr. Joanne Fredrich, Upstream Technology Senior Advisor at BP, commented, “After a three-year program of cooperative development and testing, our extensive validation studies are drawing to a close. The ability to generate reliable relative permeability information directly from digital scans on a much faster time-scale than laboratory testing, and to gain insight into the underlying pore-scale dynamics, provides substantial business value during appraisal, development, and management of our reservoirs. We plan to deploy this technology across our global portfolio.”

“Predictive simulation of relative permeability from a digital scan of a rock sample is now a reality for oil exploration and production companies globally,” remarked David Freed, VP of Oil & Gas at Exa. “Our DigitalROCK technology is providing essential data on oil reservoir characteristics in a fraction of the time possible with physical lab tests.”

Exa’s President and CEO, Stephen Remondi, stated, “As we bring our patented flow simulation technology to this exciting new market for Exa, we are pleased that an integrated oil and gas company of BP’s scale has adopted our DigitalROCK multi-phase flow solution. We believe there is significant opportunity for Exa’s simulation technology to make a positive impact on the oil and gas industry, as it has for other industries we have served over the past twenty-five years.”

Learn more in BP’s press release “BP Expands Digital Rocks Technology.”

About Exa Corporation

Exa Corporation (Nasdaq:EXA) (www.exa.com) provides visualization and simulation software that helps designers and engineers enhance the performance of their products, reduce product development costs and improve the efficiency of their design and engineering processes. As a design evolves, Exa accurately predicts the performance of that design while providing actionable insight to optimize the performance of the product. With Exa, the need for costly physical prototypes and expensive late-stage changes is reduced. Now, designers and engineers are freed from the risk of producing compromised products that do not meet market and regulatory requirements. Exa currently focuses primarily on the ground transportation market, in which some of the most successful product companies in the world use Exa, including BMW, Delphi, Denso, Fiat Chrysler, Ford, Hino, Honda, Hyundai, Jaguar Land Rover, Kenworth, Komatsu, MAN, Nissan, Peterbilt, Peugeot, Renault, Scania, Toyota, Volkswagen and Volvo Trucks, and has recently expanded its technology offerings into the fields of aerospace and oil and gas production.

Source: Exa

Intel: Management Change at Data Center Group; Xeon Re-architecting on Way

Fri, 05/05/2017 - 13:29

Diane Bryant, EVP and GM of Intel’s Data Center Group and a prominent industry figure (she was an SC15 keynote speaker, for example), is taking an extended leave of absence from Intel “to tend to a personal family matter,” according to Intel. Navin Shenoy, most recently GM of the Client Computing Group, will take over DCG (fiscal 2016 revenue: $17 billion).

“Given the extended duration (of Bryant’s leave), an interim leader for the Data Center Group is not possible,” Intel CEO Brian Krzanich said in an email to Intel employees yesterday.

Shenoy, a 22-year veteran of Intel who will report directly to Krzanich, has held positions in the CEO’s office, sales and marketing, and the PC and tablet businesses, before running CCG.

Shenoy and Bryant “will work closely together for the next month to ensure a smooth transition for the organization and our customers…,” Krzanich said, adding that Bryant will be on leave for six to eight months. “And, it goes without saying that my thoughts are with Diane. I look forward to welcoming her back to her next challenging role.”

Krzanich said that over the past five years, Bryant has “transformed DCG from a server-centric group to a business that spans servers, network and storage across all end-user segments, and with product lines and business models that extend beyond the traditional.”

The Bryant news follows an April announcement that Brent Gorda, GM of Intel’s High Performance Data Division, would be leaving the company. Gorda is the former CEO of Whamcloud, the Lustre specialist acquired by Intel in 2012. In March, Intel announced the appointment of 11-year Intel veteran Aicha S. Evans as chief strategy officer.

Meanwhile, Intel continues to sharpen its processor focus on data centers and handling a broader set of workloads with the rebranding and “re-architecting” of its Xeon family.

The chip giant, which recently began shipping new Optane solid-state drives based on its 3D XPoint memory technology, continues to double down on data center applications and demanding data-driven workloads. Analysts noted that the repackaged “Xeon Processor Scalable Family,” partially unveiled on Thursday (May 4) and based on its Skylake processor architecture, is aimed at accelerating HPC workloads, along with machine learning, networking and security.

The new family is designed to deliver “four levels of performance and a variety of choices with regard to which integrations and accelerators customers need,” Intel said.

Added Jennifer Huffstetler, Intel’s director of data center product marketing: “We are bringing essentially two of our previously segmented platforms, the Xeon E5 and the Xeon E7, together into one unified platform that now will be able to provide that scalability and flexibility from entry compute, network and storage workloads all the way through to mission critical workloads, like in-memory database and analytics, so really bringing the best to bear of both of our Xeon families into this one scalable stack.”

The new data center processor family is scheduled for release this summer, and the chipmaker was stingy about details. Nevertheless, analysts said the scaling strategy was significant. “Intel is separating core count, [reliability, availability and serviceability] features and other special features from their ‘good-better-best’ [categories] they had with Xeon E3, E5 and E7,” chip analyst Patrick Moorhead of Moor Insights & Strategy told EnterpriseTech.

Huffstetler explained that workload characteristics would determine whether greater or lower core counts are required. For example, data center operators running demanding workloads involving HPC utilization or in-memory analytics “would be looking higher up the stack,” she said.

While full details won’t be released until the processor family is officially launched, Huffstetler did provide this nugget: “We are seeing new integrations [that are] able to address those shifting workload needs, including integrations of accelerators, like our quick-assist technology for crypto- or compression algorithms.”

NCSA Highlights Scientific Impacts from Three Years of Blue Waters

Fri, 05/05/2017 - 09:07

URBANA, Ill., May 5, 2017 —  “Build it and they will come” is one way to approach building a supercomputer, but it’s not what the National Center for Supercomputing Applications (NCSA) did with Blue Waters, the largest leadership-class National Science Foundation supercomputer. Prior to the system going online in April 2013, Blue Waters staff worked with more than 20 science teams to determine a unique, balanced hardware configuration—a process now known as co-design.

Three years later, a sample of 31 science teams that have used Blue Waters were surveyed and interviewed as part of a report meant to judge the effectiveness and productivity of this unique system—housed at NCSA’s home institution, the University of Illinois at Urbana-Champaign.

Using information gathered in the surveys, the report’s authors at International Data Corporation’s HPC division (now known as Hyperion Research) ranked the impact of each team’s findings into an “innovation index”—using a methodology they developed to analyze the effectiveness of 700-plus scientific projects, including international HPC projects. The Hyperion Research analysts noted in the report that “NCSA did an unusually thorough job of preparing [science teams] for Blue Waters.”

“These findings confirm that Blue Waters has proven to be an exceptionally—and in some cases uniquely—competent platform for accelerating scientific innovation,” the Hyperion authors wrote in the report. “The Blue Waters-enabled innovations described and ranked in this study will produce strong benefits for the scientific disciplines they belong to.”

The report’s release coincides with a comprehensive workload study of all the scientific applications run on the supercomputer since its start in 2013. That study, written about extensively here, found most of the applications run on the system have been extreme-scale. It also found there were a number of problems that couldn’t be run anywhere else, and confirms the balance of hardware the Blue Waters team chose—from high network bandwidth, to a high node count with large memory—is being used effectively.

The Hyperion report reinforces those points from the workload study by showing how Blue Waters projects rank with “Innovation Impact.” Hyperion has a database of similar surveys run for the Department of Energy and several governments in Europe and Asia, which gives them an approximate average to compare to. Notably, Hyperion found far fewer Blue Waters science innovations fell into the category of “least important” than average, and more Blue Waters projects were evaluated in the top three categories for impact.

“The innovations produced so far by Blue Waters users are crucially important as a group because they constitute substantial steps forward in major disciplines,” the authors wrote in the report.

“Not once during the 31 interviews [Hyperion] conducted with Blue Waters users for this study did any of them point to shortcomings of the supercomputer,” the authors added. “On the contrary, all of the researchers had praise for the system and enthusiastically reported on the progress it has enabled for their work.”

Outside of the science Blue Waters has enabled, it has also re-established the importance of a balanced system—a supercomputer that can effectively do a complete range of simulation problems, as well as data intensive problems at scale. Doing so required that the Blue Waters team not focus entirely on increasing the “peak calculations per second” metric used in the high-profile TOP500 list (which Blue Waters refused to participate in), instead disbursing project funds to invest in other vital components of the machine, like network bandwidth and memory.

“[Hyperion] applauds NCSA’s bold decision not to optimize Blue Waters for superior performance on the narrow benchmark test used to determine rankings on the semi-annual TOP500 supercomputers list. If more leadership-class HPC sites resisted this political temptation, vendors would be more motivated to design HPC systems that are applicable to a broader range of user needs,” the authors wrote.

Follow this link to download the report and learn more about its findings and methodologies. This includes the Scientific Breakthroughs section, which has direct quotes from researchers. Other sections focus on cost savings, preparing to support better future projects, as well as sections that call out specific projects that have helped society, or discovered something new.

Source: NCSA

University of Waterloo Launches New National Supercomputer

Fri, 05/05/2017 - 09:04

WATERLOO, Ont., May 5, 2017 — The University of Waterloo, Compute Canada and Compute Ontario today unveiled the largest supercomputer at any Canadian university. Located at Waterloo, it will provide expanded resources for researchers across the country working on a broad range of topics, including artificial intelligence, genomics and advanced manufacturing.

Named Graham, the supercomputer can handle more simultaneous computational jobs than any other academic supercomputer in Canada, ultimately generating more research results at one time. With its extraordinary computing power and a storage system of more than 50 petabytes — or 50 million gigabytes — Graham can support researchers who are collecting, analyzing, or sharing immense volumes of data.

“Research and innovation have helped define the University of Waterloo, and will remain important priorities for our future,” said Feridun Hamdullahpur, president and vice-chancellor of Waterloo. “Graham allows us to increase our capacity to be a global leader in advanced computing. Thanks to the support of both the federal and provincial governments, CFI, Compute Canada and Compute Ontario we will be even closer to realizing this vision.”

Graham is the result of an investment worth $17 million from the Canada Foundation for Innovation (CFI) and the Government of Ontario. It is one of four new supercomputing and data centres that are part of a national initiative valued at $75 million that involves CFI, and various provincial and industry partners. Compute Canada, in collaboration with its member institutions and partners, is implementing the improvements to facilities across the country. SHARCNET, a multi-university consortium in Ontario, led the implementation at Waterloo in partnership with Compute Ontario.

“Research today is increasingly data intensive. For the community of over 11,000 Canadian researchers that we serve today, Graham will give Canadian researchers and innovators the ability to compete and excel globally using big data and big compute tools,” said Mark Dietrich, president and CEO of Compute Canada. “We are honoured to collaborate with our partners at the University of Waterloo and Compute Ontario in this achievement.”

Supercomputers are a fundamental part of advanced research computing (ARC), which plays an essential role in scientific discovery, innovation and national competitiveness. Graham is the third of four new national systems at universities across Canada.

“We are excited to announce the launch of Graham for the benefit of the research community,” said Nizar Ladak, president and CEO of Compute Ontario. “With such a strong reputation for innovation, the University of Waterloo makes an excellent host site. Compute Ontario proudly supports this system, which will ensure Ontario is well positioned as a global leader in advanced computing and a global focal point for highly qualified personnel.”

Waterloo’s supercomputer takes its name from J. Wesley (Wes) Graham, a former professor at the University. His many contributions to the development of software and hardware have had a major impact on the computing industry, and he played a significant role in establishing the University’s international reputation for teaching and research in information technology.

About Compute Canada

Compute Canada, in partnership with regional organizations ACENET, Calcul Québec, Compute Ontario and WestGrid, leads the acceleration of research and innovation by deploying state-of-the-art advanced research computing (ARC) systems, storage and software solutions. Together we provide essential ARC services and infrastructure for Canadian researchers and their collaborators in all academic and industrial sectors. Our world-class team of more than 200 experts employed by 37 partner universities and research institutions across the country provide direct support to research teams. Compute Canada is a proud ambassador for Canadian excellence in advanced research computing nationally and internationally.

About Compute Ontario

Compute Ontario is the provincial agency that coordinates access to advanced research computing and Ontario’s Big Data Strategy. Access to this critical technology happens through our four consortia (SciNet, SHARCNET, Centre for Advanced Computing, and HPC4Health).

Nationally, it partners with Compute Canada and regional organizations ACENET, Calcul Québec and WestGrid to plan and coordinate the supply of advanced computing for Canadian academic researchers.

About the University of Waterloo

University of Waterloo is Canada’s top innovation university. With more than 36,000 students we are home to the world’s largest co-operative education system of its kind. Our unmatched entrepreneurial culture, combined with an intensive focus on research, powers one of the top innovation hubs in the world. Find out more at uwaterloo.ca

Source: Compute Canada

Budget Deal Spares Science Funding, Boosts Exascale Spend

Thu, 05/04/2017 - 23:15

Bipartisan congressional negotiators reached an agreement this week on the 2017 fiscal year budget, which funds the government through September 30. The bill is, as Computing Research Association Policy Analyst Brian Mosley put it, “not great, but not terrible for science.” The deal rejects most of the sweeping cuts to federal science agencies that Trump and his advisers have proposed, and some programs, notably the Department of Energy (DOE) and National Institutes of Health (NIH), will receive increases. Also up for a nice boost: the Exascale Computing Project.

Here’s a run-down:

The DOE Office of Science is set to receive a $43 million increase over FY16 levels to $5.39 billion in FY17. Advanced Scientific Computing Research (ASCR) — responsible for funding the bulk of DOE supercomputing — gets a bump from $621 million in FY16 to $647 million for FY17, an increase of 4.2 percent (but shy of President Obama’s requested 6.8 percent increase).

The Exascale Computing Project, which like so many government activities has been in the thicket of budget uncertainties, is designated to receive $259 million — that’s $10 million more than requested by the Obama administration. Of the total, $164 million falls under DOE ASCR and $95 million hits the National Nuclear Security Administration ledgers. Recall that the ECP was initiated as a joint ASCR/NNSA partnership.

As a DOE crosscut, total exascale funding, linked to the Exascale Computing Initiative, is set to go from FY16 enacted levels of $252.6 million to $295 million in FY17, an increase of nearly $42.4 million. One science official we spoke with off-record sees this appropriation as an encouraging sign from Congress.

The bill directs the DOE “to provide to the Committees on Appropriations of both Houses of Congress not later than 90 days after enactment of this Act a report that differentiates the roles and responsibilities of the NNSA and the Office of Science for carrying out the exascale computing initiative and describes how those respective roles and responsibilities are complementary and not duplicative.”

Science laboratories infrastructure will also get a boost, receiving $130 million, compared to $114 million in FY16. The new funding is marked for the Integrated Engineering Research Center at Fermi National Accelerator Laboratory and the Core Facility Revitalization project at Brookhaven National Laboratory (BNL).

The Advanced Research Projects Agency-Energy (ARPA-E), axed under the 2018 skinny budget proposal (see our coverage here), gets a robust 5.2 percent increase over FY16 enacted levels to $306 million.

The omnibus bill allots $7.46 billion for National Science Foundation (NSF) activities, a mere $9 million increase over FY16 numbers. The six research directorates and NSF’s education directorate remain at their 2016 totals ($6.033 billion and $880 million respectively). NSF did not receive the $43 million in operating funds it requested for a new headquarters building in northern Virginia.

NASA’s budget is set at $19.65 billion, up 1.9 percent from $19.28 billion last year. Funding for earth science, slated for a 7 percent cut under Trump’s 2018 skinny budget, remains at 2016 levels: $1.92 billion.

The NIH sees the healthiest increase; the budget raises FY17 spending by $2 billion to $34.1 billion (a 6.2 percent increase). Included is $352 million already approved as part of the 21st Century Cures Act that was part of the last continuing resolution in December. The Precision Medicine Initiative is slated to receive an additional $160 million, including $40 million from the Cures Act, and the BRAIN Initiative receives $120 million, including $10 million from Cures.

Hardest hit of the science agencies is the Environmental Protection Agency. It’s up for an $81 million cut, trimming roughly 1 percent from its $8.06 billion FY16 budget. But the agency avoided the steeper cuts and staff reductions proposed in the skinny budget.
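For readers who want to check the arithmetic, the percentages and totals quoted in the run-down can be reproduced with a few lines of Python. This is a minimal sketch that uses only the dollar figures given above, converted to millions; the function and variable names are illustrative and not drawn from any official source.

```python
def pct_change(old, new):
    """Return the percent change from old to new."""
    return (new - old) / old * 100.0


# (label, FY16 value, FY17 value) in millions of dollars, as quoted above.
# The NIH and EPA baselines are derived from the quoted dollar changes.
figures = [
    ("DOE ASCR", 621.0, 647.0),
    ("NASA", 19280.0, 19650.0),
    ("NIH", 34100.0 - 2000.0, 34100.0),
    ("EPA", 8060.0, 8060.0 - 81.0),
]

results = {label: round(pct_change(old, new), 1) for label, old, new in figures}

# Exascale Computing Project: ASCR share plus NNSA share.
ecp_total = 164 + 95

# DOE exascale crosscut: FY17 level minus FY16 enacted level.
exascale_delta = round(295.0 - 252.6, 1)

for label, pct in results.items():
    print(f"{label}: {pct:+.1f} percent")
print(f"ECP total: ${ecp_total} million")
print(f"Exascale crosscut increase: ${exascale_delta} million")
```

The rounded results agree with the figures reported in the article.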

The $1 trillion omnibus spending bill passed the House and Senate this week with strong bipartisan support. On Wednesday, the House voted 309-118 in favor, with four members abstaining. Thursday, the bill passed in the Senate with a vote of 79-18. The President has promised to sign it.

“After years of partisan bickering and gridlock, this bill is a clear win for the American people,” Trump said Tuesday. “It’s been a very hotly contested budget because, as you know, we have to go through a long and rigorous process.”

Senate Minority Leader Chuck Schumer of N.Y. called the budget “a good agreement for the American people,” and “a win for science.”

For in-depth coverage on how the budget impacts science, refer to the Science article “How science fares in the U.S. budget deal.”

The post Budget Deal Spares Science Funding, Boosts Exascale Spend appeared first on HPCwire.

Chameleon Speeds Development of Portable Hadoop Reader for Parallel File Systems

Thu, 05/04/2017 - 15:42

May 4 — Some scientists dream about big data. The dream bridges two divided realms. One realm holds lofty peaks of number-crunching scientific computation. Endless waves of big data analysis line the other realm. A deep chasm separates the two. Discoveries await those who cross these estranged lands.

Unfortunately, data cannot move seamlessly between Hadoop (HDFS) and parallel file systems (PFS). Scientists who want to take advantage of the big data analytics available on Hadoop must copy data from parallel file systems. That can slow workflows to a crawl, especially those with terabytes of data.

Computer scientists working in Xian-He Sun’s group are bridging the file system gap with a cross-platform Hadoop reader called PortHadoop, short for portable Hadoop. “PortHadoop, the system we developed, moves the data directly from the parallel file system to Hadoop’s memory instead of copying from disk to disk,” said Xian-He Sun, Distinguished Professor of Computer Science at the Illinois Institute of Technology. Sun’s PortHadoop research was funded by the National Science Foundation and the NASA Advanced Information Systems Technology Program (AIST).

The concept of ‘virtual blocks’ helps bridge the two systems by mapping data from parallel file systems directly into Hadoop memory, creating a virtual HDFS environment. These ‘virtual blocks’ reside in the centralized namespace in the HDFS NameNode. The HDFS MapReduce application cannot see the ‘virtual blocks’; a map task triggers the MPI file read procedure and fetches the data from the remote PFS before its Mapper function processes its data. In other words, a dexterous sleight of hand from PortHadoop tricks HDFS into skipping the costly I/O operations and data replications it usually expects.
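The virtual block idea can be illustrated with a small Python sketch. It is not PortHadoop’s code: the class and function names (VirtualBlock, VirtualNamespace, fetch_block, map_task) are hypothetical, an ordinary local file stands in for the remote parallel file system, and a plain seek-and-read stands in for the MPI file read. The point it shows is that the namespace holds only metadata, and the bytes are pulled into memory when a map task runs.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualBlock:
    """Metadata only: where a block's bytes live on the PFS (hypothetical type)."""
    pfs_path: str
    offset: int
    length: int


class VirtualNamespace:
    """Stand-in for the NameNode namespace; it stores block metadata, never data."""

    def __init__(self):
        self._blocks = {}

    def register(self, pfs_path, block_size):
        # Split the file into fixed-size virtual blocks without copying any bytes.
        size = os.path.getsize(pfs_path)
        blocks = [
            VirtualBlock(pfs_path, off, min(block_size, size - off))
            for off in range(0, size, block_size)
        ]
        self._blocks[pfs_path] = blocks
        return blocks


def fetch_block(block):
    """Read one block from the PFS straight into memory.

    Assumption: a plain file read replaces the MPI file read named in the article.
    """
    with open(block.pfs_path, "rb") as f:
        f.seek(block.offset)
        return f.read(block.length)


def map_task(block, mapper):
    # The fetch happens inside the map task, just before the mapper runs.
    return mapper(fetch_block(block))


def count_newlines(data):
    # Newline counts add up correctly even when a record spans two blocks.
    return data.count(b"\n")


def run_job(pfs_path, block_size=16):
    namespace = VirtualNamespace()
    blocks = namespace.register(pfs_path, block_size)
    return [map_task(block, count_newlines) for block in blocks]


# Small demonstration on a temporary file that plays the role of PFS data.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"".join(b"record %d\n" % i for i in range(10)))
    demo_path = tmp.name

total_records = sum(run_job(demo_path, block_size=16))
os.remove(demo_path)
print("records counted:", total_records)
```

Nothing is copied to an intermediate disk location in this sketch; each block goes from the source file to the mapper through memory, which is the property Sun describes.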

Sun said he sees PortHadoop as the consequence of the strong desire among scientists to merge high performance computing with cloud computing, which companies such as Facebook and Amazon use to ‘divide and conquer’ data-intensive MapReduce tasks among their seas of servers. “Traditional scientific computing is merging with big data analytics,” Sun said. “It creates a bigger class of scientific computing that is badly needed to solve today’s problems.”

PortHadoop was extended to PortHadoop-R to seamlessly link cross-platform data transfer with data analysis and visualization. Sun and colleagues developed PortHadoop-R specifically with the needs of NASA’s high-resolution cloud and regional scale modeling applications in mind. High performance computing has served NASA well for its simulations, which crunch data through various climate models. Sun said the data generated from models combined with observational data are unmanageably huge and have to be analyzed and also visualized in a timely fashion to more fully understand chaotic phenomena like hurricanes and hail storms.

PortHadoop faced a major problem in preparing to work with NASA applications. NASA’s production environment doesn’t allow any testing and development on its live data.

PortHadoop developers overcame the problem with the Chameleon cloud testbed system, funded by the National Science Foundation (NSF). Chameleon is a large-scale, reconfigurable environment for cloud computing research co-located at the Texas Advanced Computing Center of the University of Texas at Austin and the Computation Institute of the University of Chicago. Chameleon gives researchers bare-metal access, i.e., it lets them fully reconfigure the environment on its nodes, including support for operations such as customizing the operating system kernel and console access.

What’s more, the Chameleon system of ~15,000 cores with InfiniBand interconnect and 5 petabytes of storage adeptly blends in a variety of heterogeneous architectures, such as low-power processors, graphics processing units, and field-programmable gate arrays.

Read the rest of the story at TACC’s website.

The post Chameleon Speeds Development of Portable Hadoop Reader for Parallel File Systems appeared first on HPCwire.

UMU Computer Scientist Unveils Scheduling Model, Techniques to Speed HPC

Thu, 05/04/2017 - 10:13

UMEÅ, Sweden, May 4, 2017 — Computer scientist Gonzalo Rodrigo at Umeå University has developed new techniques and tools to manage high performance computing systems more efficiently. The work is an effort to meet the increasing demand to handle large amounts of data in research and to allow for advanced simulations.

In a world paralysed by fear of global warming, energy shortage, and resource depletion, an unexpected hero arises: High Performance Computing (HPC). An HPC system aggregates the power of tens of thousands of processors interconnected by low latency optical networks to run large-scale scientific applications. Such systems support research in fields where it is practically impossible to advance only through experimentation and observation. For instance, research on weather models, ground water movements, or new energy sources relies on simulations and data analysis performed on increasingly large HPC systems.

However, traditional HPC schedulers can no longer efficiently manage the new, complex scientific applications running on ever more sophisticated systems. In his doctoral dissertation at Umeå University, Gonzalo Rodrigo has developed new techniques and tools to manage HPC systems more efficiently, increasing their capacity to support advanced scientific research.

Gonzalo Rodrigo in front of the High Performance Computing System SCinet. Photo: private

In his doctoral dissertation, Gonzalo Rodrigo at the Department of Computing Science at Umeå University presents methods and tools to efficiently schedule applications and workflows in High Performance Computing systems and increase the speed of the scientific work they can produce.

“In detail, I have provided a better understanding of trends of current workloads and I have developed a general application-oriented scheduling model in HPC systems, a scheduling simulation framework to support future research on scheduling algorithms, and a scheduling technique for efficient execution of complex scientific workflows,” says Gonzalo Rodrigo. The outcome of this work also includes two open-source projects that will enable future research on HPC scheduling.

Work leading to Gonzalo Rodrigo’s dissertation has been conducted in collaboration with researchers from the Data Science and Technology Department and the National Energy Research Scientific Computing Center at the Lawrence Berkeley National Lab in the US.

Gonzalo P. Rodrigo Álvarez comes from Spain. He holds a Master’s in Computer Engineering from Universidad de Zaragoza and a Master’s in Business Administration from ESIC, both in Spain. In December 2012, he commenced his doctoral studies at Umeå University in Sweden under the supervision of Professor Erik Elmroth and later also Dr Lavanya Ramakrishnan at the Lawrence Berkeley National Lab. During his doctoral studies, Gonzalo Rodrigo held a visiting research fellowship at the Lawrence Berkeley National Lab for a year and a half and a four-month internship at Google Inc.

Read the full dissertation

About the public defence of the dissertation:

On Friday 21 April, Gonzalo P. Rodrigo at the Department of Computing Science at Umeå University defends his dissertation entitled: HPC Scheduling in a Brave New World.
The public defence of the dissertation takes place at 10:15 in room MA121 in the MIT Building (the Mathematics and Information Technology Building) at Umeå University.
The faculty opponent is Professor Ewa Deelman, Information Sciences Institute, University of Southern California (USC), Los Angeles, USA.

Source: Umeå University

The post UMU Computer Scientist Unveils Scheduling Model, Techniques to Speed HPC appeared first on HPCwire.