Double-precision floating point (FP64) has been the de facto standard for scientific simulation for several decades. Most numerical methods used in engineering and scientific applications require the extra precision to compute correct answers, or even to reach an answer at all. However, FP64 also requires more computing resources and runtime to deliver that increased precision.

Problem complexity and the sheer magnitude of data coming from various instruments and sensors motivate researchers to mix and match approaches to optimize compute resources, including different levels of floating-point precision.

Researchers have experimented with single-precision (FP32) in fields such as life sciences and seismic processing for several years. More recently, the explosion of machine learning and deep learning has focused significant attention on half-precision (FP16).

Using reduced precision can accelerate data transfer rates, increase application performance, and reduce power consumption, especially on GPUs with Tensor Core support for mixed-precision. Figure 1 describes the IEEE 754 standard floating point formats for the FP64, FP32, and FP16 precision levels.

Figure 1: IEEE 754 standard floating point formats

The NVIDIA Tesla V100 includes both CUDA Cores and Tensor Cores, allowing computational scientists to dramatically accelerate their applications with mixed-precision. Using FP16 with Tensor Cores in V100 is just part of the picture. Accumulation to FP32 sets the Tesla V100 and Turing chip architectures apart from architectures that simply support lower precision levels: Volta V100 and Turing enable fast FP16 matrix math with FP32 accumulation, as Figure 2 shows.

Tensor Cores provide up to 125 TFlops of FP16 performance in the Tesla V100. That 16x multiple versus FP64 within the same power budget has prompted researchers to explore techniques for leveraging Tensor Cores in their scientific applications. Let’s look at a few examples discussed at SC18 of how researchers used Tensor Cores and mixed-precision for scientific computing. Tensor Cores provide fast matrix multiply-add with FP16 inputs and FP32 accumulation (a minimal CUDA sketch of this operation appears at the end of this section).

## Using Mixed-Precision for Earthquake Simulation

One of the Gordon Bell finalists simulates an earthquake using AI and transprecision computing (transprecision is synonymous with mixed-precision). Current seismic simulations can compute the properties of hard soil shaking deep underground. Scientists also want to include the shaking of soft soil near the surface, as well as building structures below and above ground level, as shown in Figure 3.

Researchers from the University of Tokyo, Oak Ridge National Laboratory (ORNL), and the Swiss National Supercomputing Centre collaborated on this new solver, called MOTHRA (iMplicit sOlver wiTH artificial intelligence and tRAnsprecision computing). Running on the Summit supercomputer with a combination of AI and mixed-precision, MOTHRA achieved a 25x speed-up compared to the standard solver.
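To make the FP16-input, FP32-accumulate operation concrete, here is a minimal sketch using CUDA's WMMA (warp matrix multiply-accumulate) API, in which a single warp multiplies two 16x16 FP16 tiles on Tensor Cores and accumulates the result in FP32. This is illustrative only; the kernel name, matrix contents, and launch configuration are our own choices, not code from any of the SC18 projects.

```cuda
// Minimal illustration (not from the article): one warp multiplies two 16x16
// FP16 tiles on Tensor Cores and accumulates into FP32 using the WMMA API.
// Requires a Volta-or-later GPU; compile with: nvcc -arch=sm_70 wmma_demo.cu
#include <cuda_fp16.h>
#include <mma.h>
#include <cstdio>

using namespace nvcuda;

__global__ void wmma_16x16x16(const half *a, const half *b, float *c) {
    // FP16 operand fragments, FP32 accumulator fragment.
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);               // C = 0
    wmma::load_matrix_sync(a_frag, a, 16);           // load A tile, leading dim 16
    wmma::load_matrix_sync(b_frag, b, 16);           // load B tile
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // C += A * B (FP32 accumulate)
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}

int main() {
    half *a, *b;
    float *c;
    cudaMallocManaged(&a, 16 * 16 * sizeof(half));   // error checks omitted for brevity
    cudaMallocManaged(&b, 16 * 16 * sizeof(half));
    cudaMallocManaged(&c, 16 * 16 * sizeof(float));
    for (int i = 0; i < 16 * 16; ++i) {
        a[i] = __float2half(1.0f);                   // A: all ones
        b[i] = __float2half(0.5f);                   // B: all halves
    }
    wmma_16x16x16<<<1, 32>>>(a, b, c);               // a single warp of 32 threads
    cudaDeviceSynchronize();
    printf("c[0] = %f (expected 8.0)\n", c[0]);      // 16 terms of 1.0 * 0.5
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

Each `mma_sync` call issues the fused FP16-multiply, FP32-accumulate operation described above; real applications tile large matrices across many warps, or simply call cuBLAS with FP16 inputs and an FP32 compute type.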
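This article does not reproduce MOTHRA's code, and its solver is considerably more sophisticated, but the core transprecision idea (run the expensive inner iterations in low precision, then compute residuals and corrections in high precision) can be sketched with classic mixed-precision iterative refinement. In the hypothetical example below, FP32 stands in for the low-precision stage and FP64 for the high-precision stage; the Jacobi inner solver, the small test matrix, and all names are invented for illustration.

```cuda
// Generic mixed-precision iterative refinement (a sketch of the transprecision
// idea only; this is NOT MOTHRA's algorithm, and the matrix, tolerances, and
// names are invented for illustration). Builds with nvcc or any C++ compiler.
#include <cstdio>
#include <cmath>

const int N = 4;

// Hypothetical low-precision "inner solver": a few Jacobi sweeps in FP32.
// A real transprecision solver would run this stage on the GPU, often in FP16.
void jacobi_fp32(const float A[N][N], const float b[N], float x[N], int sweeps) {
    for (int s = 0; s < sweeps; ++s) {
        float xn[N];
        for (int i = 0; i < N; ++i) {
            float sum = b[i];
            for (int j = 0; j < N; ++j)
                if (j != i) sum -= A[i][j] * x[j];
            xn[i] = sum / A[i][i];
        }
        for (int i = 0; i < N; ++i) x[i] = xn[i];
    }
}

int main() {
    // Small diagonally dominant test system (so the Jacobi sweeps converge).
    double A[N][N] = {{10,1,0,1},{1,12,2,0},{0,2,9,1},{1,0,1,11}};
    double b[N] = {12, 15, 12, 13}, x[N] = {0, 0, 0, 0};

    float Af[N][N], rf[N], df[N];
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) Af[i][j] = (float)A[i][j];

    for (int iter = 0; iter < 10; ++iter) {
        // 1. High-precision residual r = b - A*x (FP64).
        double r[N], rnorm = 0;
        for (int i = 0; i < N; ++i) {
            r[i] = b[i];
            for (int j = 0; j < N; ++j) r[i] -= A[i][j] * x[j];
            rnorm += r[i] * r[i];
        }
        printf("iter %d, residual norm %.3e\n", iter, std::sqrt(rnorm));
        if (std::sqrt(rnorm) < 1e-12) break;

        // 2. Low-precision correction solve A*d = r (this is the cheap step).
        for (int i = 0; i < N; ++i) { rf[i] = (float)r[i]; df[i] = 0.0f; }
        jacobi_fp32(Af, rf, df, 20);

        // 3. High-precision update x += d (FP64 keeps the accumulated answer).
        for (int i = 0; i < N; ++i) x[i] += (double)df[i];
    }
    return 0;
}
```

Even though each correction is computed in reduced precision, the FP64 residual steers the iteration, so the final answer converges to double-precision accuracy. That separation of cheap inner work from precise outer correction is the essence of transprecision computing.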