Research Papers

Parallelized Simulation of a Finite Element Method in Many Integrated Core Architecture

Moonho Tak

Computational Solid and Structural
Mechanics Laboratory,
Department of Civil and Environmental
Hanyang University,
222 Wangsimni-ro, Seongdong-gu,
Seoul 04763, South Korea
e-mail: pivotman@hanyang.ac.kr

Taehyo Park

Computational Solid and Structural
Mechanics Laboratory,
Department of Civil and Environmental
Hanyang University,
222 Wangsimni-ro, Seongdong-gu,
Seoul 04763, South Korea
e-mail: cepark@hanyang.ac.kr

1Corresponding author.

Contributed by the Materials Division of ASME for publication in the JOURNAL OF ENGINEERING MATERIALS AND TECHNOLOGY. Manuscript received June 1, 2016; final manuscript received October 25, 2016; published online February 7, 2017. Assoc. Editor: Xi Chen.

J. Eng. Mater. Technol. 139(2), 021009 (Feb. 7, 2017) (6 pages); Paper No: MATS-16-1162; doi: 10.1115/1.4035326. History: Received June 1, 2016; Revised October 25, 2016

We investigate a domain decomposition method (DDM) for the finite element method (FEM) on Intel's many integrated core (MIC) architecture in order to determine the most effective MIC usage. To this end, a recently introduced, highly scalable parallel DDM is first presented with its detailed procedure. Intel's Xeon Phi MIC architecture is then described to show how the parallel algorithm can be applied to a multicore architecture. Parallel simulation on the Xeon Phi MIC has the advantage that traditional parallel libraries such as the message passing interface (MPI) and open multiprocessing (OpenMP) can be used without any additional libraries. We implement the DDM using popular linear algebra libraries such as the linear algebra package (LAPACK) or the basic linear algebra subprograms (BLAS), and use both MPI and OpenMP to parallelize its solution. Finally, the parallel efficiency is validated with a two-dimensional numerical example.
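To make the idea of a direct, non-overlapping DDM concrete, the following is a minimal sketch in Python. It is not the authors' algorithm: it assumes a generic Schur-complement (substructuring) scheme on a one-dimensional model stiffness matrix tridiag(-1, 2, -1) with fixed ends, split into two subdomains that share a single interface node. The function names (thomas, solve_interior, condense, ddm_solve) are hypothetical and chosen only for this illustration. In an MPI or OpenMP setting, each call to condense could in principle run on its own rank or thread, since the subdomain solves are independent of each other.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm (no pivoting).

    lower[i] and upper[i] are the sub- and super-diagonal entries of row i;
    lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_interior(rhs):
    """Direct solve with the model matrix tridiag(-1, 2, -1)."""
    n = len(rhs)
    return thomas([-1.0] * n, [2.0] * n, [-1.0] * n, rhs)


def condense(f_i, iface_pos):
    """Local (independent) work of one subdomain.

    Returns v = K_ii^{-1} f_i, w = K_ii^{-1} K_ib, and the subdomain's
    contributions K_bi w and K_bi v to the interface problem.
    """
    k_ib = [0.0] * len(f_i)
    k_ib[iface_pos] = -1.0  # only the node next to the interface couples to it
    v = solve_interior(f_i)
    w = solve_interior(k_ib)
    s_loc = sum(a * b for a, b in zip(k_ib, w))
    g_loc = sum(a * b for a, b in zip(k_ib, v))
    return v, w, s_loc, g_loc


def ddm_solve(f):
    """Solve tridiag(-1, 2, -1) u = f by two-subdomain substructuring."""
    n = len(f)
    if n < 3 or n % 2 == 0:
        raise ValueError("this sketch expects an odd number (>= 3) of unknowns")
    m = (n - 1) // 2  # interior unknowns per subdomain; interface is node m

    # Step 1: independent subdomain condensations (parallelizable).
    left = condense(f[:m], m - 1)   # interface sits after the last left node
    right = condense(f[m + 1:], 0)  # interface sits before the first right node

    # Step 2: assemble and solve the (here scalar) interface problem.
    s = 2.0 - left[2] - right[2]
    g = f[m] - left[3] - right[3]
    u_b = g / s

    # Step 3: independent back-substitution, u_i = v - w * u_b.
    u_left = [v - w * u_b for v, w in zip(left[0], left[1])]
    u_right = [v - w * u_b for v, w in zip(right[0], right[1])]
    return u_left + [u_b] + u_right


if __name__ == "__main__":
    load = [1.0] * 7
    u_ddm = ddm_solve(load)
    u_ref = solve_interior(load)  # monolithic solve for comparison
    print(max(abs(a - b) for a, b in zip(u_ddm, u_ref)))
```

The sketch only illustrates the structure of such a method: local direct solves, a small interface problem, then local back-substitution. A production code would replace the hand-written tridiagonal solver with LAPACK or BLAS routines and distribute the subdomains across MPI ranks or OpenMP threads.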

Copyright © 2017 by ASME




Fig. 1 Flowchart for the direct DDM

Fig. 2 Half-circle ring model

Fig. 3 Elapsed time for the DDM with two-MIC and one-thread usage

Fig. 4 Time difference between steps with one-thread usage

Fig. 5 Time difference between steps for N = 16

Fig. 6 Elapsed time for the DDM with N = 1 for one MIC and N = 16 for four MICs


