S. M. McQuay -- Defense
SMBinterp: an N-th-Order Accurate, Distributed Interpolation Library
Committee: Dr. S. E. Gorrell, Dr. C. G. Jensen, Dr. S. L. Thomson

Outline:
- Start with some motivation behind the research
- Briefly describe the problem space, including competing products
- Then talk about what I made
- Discuss the constituent parts of the solution (numerical method, distribution method, plugin system); the results for each part will follow that part
- Last, conclusions and future work

Motivation
- Pratt & Whitney: a company in Hartford, CT
- They do not make light bulbs; they make the world's best jet engines
- Provided funding for the first year of this research
(No comment on the date, nor on how many subsequent years went unfunded ... :)

Pratt & Whitney: a History
- Co-op, 2005: geometric modeling, parametric jet engine
- Internship, Summer 2007: simulation integration, FORTRAN (summer of ❤), CHIMPS
A bit of history: I have been out there twice, and Dr. Jensen was gracious and instrumental in setting both up.
Co-op (a long internship): Unigraphics/NX -- a glorified CAD jockey.
Internship: more thermo-fluids-focused; I integrated two Pratt codes using the CHIMPS library (more on that later). THE SUMMER OF FORTRAN. Gained exposure to and experience in multiphysics.

Why would you want to pay an intern to do this?
- Pratt makes jet engines
(Oops, that was supposed to not happen.)
- Pratt makes jet engines
- The cost of fuel is a significant part of the cost of engine ownership
- To stay competitive: design for efficiency
- You must build prototypes to test out models, and prototypes are expensive
- Use numerical simulations to decrease the cost of making the best engine
For simple things, numerical simulations are ...
simple. A single regime, a single code: "I am only concerned with the effects of temperature on convective flow in a lake."

Multiphysics Simulation
- Large simulations
- Multiple physical regimes
- A different code per regime
A jet engine has many regimes: compressor, combustor, turbine, external flow, nacelle. Simulations at Pratt are getting very large; when I was there last, they were trying to perform Nacelle/Fan and Combustor/Turbine simulations.

Well, what's the problem?
- Mesh element disparity
- Coupling of disparate codes
- Disparate mesh densities
Disparate codes ⇒ meshes that are generally not coincident, which introduces the need for interpolation. Different codes can lead to different meshes, so mesh elements (cells and vertices) don't line up. For these points, no problem; for these points, what value do you use? This is what started this research.
(Optional side note: these principles also apply to overset/chimera grid applications and mesh refinement.)

State of Multiphysics Interpolation
What applications exist in this realm?

Multiphysics Applications
- Commercial: ADINA, COMSOL
- Open source: CHIMPS, AVUSINTERP, Elmer
- Questionable license: Open Engineering/Oofelie
Some applications are a canned solution. At Pratt we are not talking about workstation-scale problems; we are talking about clusters and hundreds of millions of cells.
Unix philosophy: a program does one thing and does it well. The authors of CHIMPS feel strongly about decomposing the entire domain according to the Unix philosophy. One aspect of this problem space concerns itself with controlling the flow of information from one regime to another; that is not what I concerned myself with. Highlighted: I deal specifically with interpolation. What is it that these two applications do?
Two Problems
- Time-efficient data lookup
- Accurate interpolation
☯ These are frameworks for multiphysics interpolation and data transfer; this class of library tries to solve two main problems: speed and accuracy.

CHIMPS
- Coupler for High-Performance Integrated Multi-Physics Simulation
(Read the name from the slide.) CITS: Center for Integrated Turbulence Simulations. CHIMPS controls the overall multiphysics simulation (not something I tried to do); essentially, it provides the framework for getting data from one code to another. If code 2 needs information from code 1, CHIMPS does that.

CHIMPS: Question 1 (Fast)
- Heavily parallel (MPI)
- Alternating Digital Tree (ADT) spatial lookup structure
- Think BST, but 3D
- Logarithmic scaling ☺
What does CHIMPS do well? It answers question 1 with a distributed ADT, which is ...

CHIMPS
- Tri-linear interpolation
- Inaccurate (2nd order) ☹
CHIMPS only provides linear interpolation. A later version of CHIMPS provides higher-order interpolation, but it requires extra input, and the availability of that information is cumbersome.

AVUSINTERP
- Air Vehicle Unstructured/Structured INTERPolation Tool
- Marshall Galbraith @ University of Cincinnati and US AFRL
(We'll get back to the image on the right.) The first implementation of an automatic, higher-order interpolation scheme. (Optional: its focus is on chimera and overset grid methods.)

AVUSINTERP: Answers Question 2 (Interpolation Accuracy)
- Linear
- Quadratic ☺
AVUSINTERP provides automatic higher-order interpolation, the answer to question 2: it yields linear/quadratic interpolation. So what's the drawback to just using AVUSINTERP?

AVUSINTERP
- Interpolation accuracy: linear, quadratic, ...
- Serial implementation ☹
The ellipsis means that the method can handle higher orders, but it was only implemented to third-order accuracy, and only as a serial implementation. What did I do?

What did I do?
- Wrote a Python module named SMBinterp
- GPL

Why Python?
- Popular scripting language
- Easily extensible; see NumPy/SciPy
I wrote the SMBinterp Python module; "module" is the Python word for library. It is an open-source library. Why Python?
- Growing popularity in the scientific community; Computing in Science and Engineering dedicated an entire issue to Python in scientific computing
- I only had limited time to write my code: Python is very quick for prototyping and gluing
(Define and emphasize the importance of extensibility.)

SMBinterp = CHIMPS ∪ AVUSINTERP ("AVUSINTERP++")
- Arbitrary order of interpolation (goes to 11)
- Parallel implementation: master/worker, kd-tree
- Plugin system
It is the union of the positive characteristics of the other two applications. It uses the same automatic interpolation numerical method, but that method was implemented in a dynamic way that allows the library to provide results for orders greater than 3. It is parallel (master/worker) and uses a kd-tree (a spatial data structure with performance characteristics similar to the ADT). I made it as mesh-agnostic as possible by implementing a plugin system. I will now describe these things individually, starting with Baker's method. Keep in mind that I will first describe each attribute, and then present results on that attribute.

Baker's Method
- T. J. Baker -- passed away
- Backbone of the interpolation library
- Non-trivial to grok/implement
The literature is not complete on how to implement this scheme (we'll get to the pattern). Those who have reviewed my thesis may argue with me, but having made a tractable explanation of it is part of my contribution.

Baker's Method
(Green = source, red = destination.)

Baker's Method
Performs interpolation using a compact stencil of participating points. (Define green/red; blue is the simplex, i.e. a triangle in 2D or a tetrahedron in 3D.) Baker's interpolation method comprises the adjustment of a linear interpolation by a least squares estimate of higher-order terms in an approximate, interpolatory surface. In math: ...
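The "in math" slide can be sketched as follows; the notation here is assumed for illustration and may not match the thesis symbol for symbol:

```latex
% Sketch of Baker's method (notation assumed; see the thesis for the
% exact formulation). Linear portion over the 2D simplex vertices:
\[
\begin{pmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{pmatrix}
\begin{pmatrix} a \\ b \\ c \end{pmatrix}
= \begin{pmatrix} q_1 \\ q_2 \\ q_3 \end{pmatrix},
\qquad p(x, y) = a + b x + c y .
\]
% Higher-order adjustment: least-squares coefficients on products of
% the linear basis functions \phi_i (pairs for quadratic, triples for
% cubic, ...), fit over the N extra points x_m outside the simplex:
\[
\tilde q(x) = p(x) + \sum_k a_k \prod_i \phi_{k_i}(x),
\qquad
\min_{a} \; \sum_{m=1}^{N}
  \Big( q(x_m) - p(x_m) - \sum_k a_k \prod_i \phi_{k_i}(x_m) \Big)^{2}.
\]
```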
Baker's Method
Highlight the constituent parts, starting with the linear interpolation, then higher orders and dimensions. Details on this were previously sparse; they are described in detail in McQuay2011.

Baker: Linear 2D
The linear system for the linear portion looked like this in 2D; 3D simply adds another vertex to the simplex (a tet in 3D).

Baker: Linear 3D

Baker: Error 2D, Quadratic
Quadratic uses products of pairs of basis functions; if we want cubic, we need triples ...

Baker: Error 2D, Cubic
What is this pattern!? That's what I asked myself.

Pattern
- Inputs: simplex size, ν
- Take the Cartesian product of 0, 1, ..., simplex_size with itself ν times (the (a, b, c) × (x, y) example)
- Filter terms with duplicated/triplicated/etc. indices
- Sort each term so we can make the set unique
- Make the set unique, and return it
This is expensive, and it is required for f() and for each row in B: cache the results!

Memoization
Memoization is the CS term for "I've calculated this already, and here's the value." Values are cached in this cache; if they're not there, then call the real function.

Interpolation Scheme Results
Performed a mesh resolution study: Gmsh resolution 1, Gmsh resolution 2, and two other, finer grids.

Exact Function
I needed a smoothly varying test equation. Choose carefully! If the test equation is linear or constant, the results are meaningless (perfect). It is similar to what had been previously used. (Optional: it is an artifact of mistyping the original equation, but it still works.)

Root Mean Square
A useful metric for determining error performance across the entire domain: RMS = sqrt((1/μ) Σ errorᵢ²), where μ is the number of points in the destination mesh.

Total RMS Error
As mesh spacing decreases, error decreases, with a larger decrease of error for higher-order interpolations. The slope of each line is indicative of numerical accuracy; it is not a perfect match, but by happy coincidence it is better than expected. Error also improves as ν is increased. We can change ν; what else can we change?

Parameter Tweaking
This represents a single mesh resolution. As expected, the RMS of the error is lower for higher ν. The sudden drop: I believe this is due to having a more complete picture in the B matrix.
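(Stepping back: the pattern and memoization steps just described can be sketched in a few lines. This is a hedged sketch, not SMBinterp's actual code; the function name `pattern` and the use of `lru_cache` for the cache are illustrative.)

```python
# Hedged sketch of the basis-function index "pattern" described above,
# memoized so it is computed only once per (simplex_size, nu) pair.
import functools
from itertools import product

@functools.lru_cache(maxsize=None)   # memoization: cached by arguments
def pattern(simplex_size, nu):
    # Cartesian product of the vertex indices with itself nu times ...
    tuples = product(range(simplex_size), repeat=nu)
    # ... filter terms with duplicated indices (e.g. phi_1 * phi_1),
    # and sort each tuple so (1, 2) and (2, 1) collapse together ...
    kept = (tuple(sorted(t)) for t in tuples if len(set(t)) == len(t))
    # ... then make the result unique and return it.
    return sorted(set(kept))

print(pattern(3, 2))   # quadratic, 2D simplex -> [(0, 1), (0, 2), (1, 2)]
```

A second call with the same arguments returns the cached list instead of redoing the product/filter/sort work.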
While the system will be complete regardless of the number of participating points, there is more encoded information in the B and w matrices when using more points. Results are generally better with more extra points, but it is dangerous to use too many; the upward slope is important for ν == 2.

Truthiness
The quadratic case drops because of smoothing. What does this excellence cost us?

Timing
Spatial querying: the matrices have larger rank for higher dimensions; cost scales exponentially with dimension and linearly with the number of points.

Timing
Worst case: a 10X slowdown for the tested parameters.

Distribution Scheme
In the simplest words: master/worker.

Distribution Method
(Walk through the diagram.)
- Requests queued into server.py
- Army of minions, each with access to the kd-tree
- Keep-alive
- Interpolations returned
- Request UID

Distribution Scheme Results
A few terms need to be explained before we can discuss results.

Speedup
- p ⇒ number of procs
- T1 ⇒ time to execute in serial
- Tp ⇒ time to execute on p procs
- Sp = T1 / Tp
As things speed up, ideally Sp should be a line of slope 1.

Distribution Method (Speedup)
This is what we see ... to a point. Misleading: it looks like the 200-node case is still awesome, but it's not. How do we know?

Efficiency
- Sp ⇒ speedup
- p ⇒ number of procs
- Ep = Sp / p, with 0 < Ep < 1
Efficiency is speedup normalized by processor count. (A typo I fixed last night in the thesis.)

Distribution Method (Efficiency)
Above 90% efficiency up to 180 participating nodes. Efficiency less than 1 is indicative of problems, usually bottlenecks.

Bottleneck?
Where is the bottleneck? I observed something interesting with the CPU load of server.py: it was parallel (tons of threads, above 100% CPU utilization). Interestingly, utilization was linear up to about -- guess -- 180 nodes, and maxed out at 200%.
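(Aside: the two metrics defined above, as a quick sketch. The timing numbers here are made up for illustration, not measurements from the thesis.)

```python
# Speedup and efficiency as defined above; example numbers are
# hypothetical.
def speedup(t1, tp):
    """S_p = T_1 / T_p: serial runtime over runtime on p processors."""
    return t1 / tp

def efficiency(t1, tp, p):
    """E_p = S_p / p: speedup normalized by processor count."""
    return speedup(t1, tp) / p

t1 = 100.0                      # hypothetical serial runtime, seconds
tp = 1.25                       # hypothetical runtime on 100 procs
print(speedup(t1, tp))          # 80.0
print(efficiency(t1, tp, 100))  # 0.8 -> 80% efficient
```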
server.py is the current bottleneck; addressing it is suggested for future work.

Plugin System
There are many, many types of meshes and packages; SMBinterp can't be aware of all of them. More important is the notion of pathological configurations.

Pathological Configurations
- Points collinear with an extended edge of the simplex
- Singular system
- Can use pinv, but the solution is non-unique
To overcome this, I introduced a plugin system.

The code in SMBinterp for meshes does two things:
- Vertex/cell connectivity, for speed
- Deciding what points to use
One may define a class that inherits from this superclass, and it can be used with the SMBinterp algorithm. It populates the kd-tree and connectivity info (rapid data lookup), with default implementations for simplex and nearest-point selection.

Provided Plugins
Left: the canonical mesh example for Gmsh; also, a representation of a Delaunay triangulation from Wikipedia.

Benefits of Plugins
- Abstract interface to functionality
- Use all/some/none of the defaults
- Defaults are good, but not universal
- The engineer can override: connectivity, alternative tree structure, simplex selection, extra-point selection
The two provided plugins are approximately 90 LOC each.

Conclusions

Conclusions
- The provided library performs Nth-order interpolation
- Parallel implementation
- Plugin system

Conclusions
- The implementation is accurate, but not a silver bullet
- For the tested equations: cubic/quartic, 32 extra points

Future Work
- Real-world application / multiphysics simulation
- A more performant message-queuing library (0MQ)
- Adaptive parameter selection
- Pathological configurations

Acknowledgments
- My committee, especially Dr. Gorrell for his patience and Dr. Jensen for his help securing the project. (Trust me, Dr. Thomson, you didn't want to be heavily involved until this point; you can ask either of them. :)
- Marshall Galbraith, who helped explain things
- Pratt for funding
- Open source: SciPy, NumPy, and Paul Rouget for the slides
- My dad for all his nagging and help
- Vanessa for keeping our home together while I've tried to do both school and work

fin

"We're the people your decision will affect."
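Returning to the plugin system described earlier, a minimal sketch of what such a mesh plugin might look like. The base-class and method names here are assumptions for illustration, not SMBinterp's actual API.

```python
# Hypothetical sketch of a mesh plugin; class and method names are
# assumed, not SMBinterp's real interface.
import numpy as np
from scipy.spatial import cKDTree

class Mesh:
    """Base class: populate the kd-tree (rapid data lookup) and
    provide default point selection."""
    def __init__(self, points):
        self.points = np.asarray(points, dtype=float)
        self.tree = cKDTree(self.points)

    def extra_points(self, x, k):
        """Default: indices of the k nearest source points."""
        _, idx = self.tree.query(x, k=k)
        return idx

class PickyMesh(Mesh):
    """Engineer override: custom extra-point selection, e.g. to avoid
    pathological (collinear) configurations."""
    def extra_points(self, x, k):
        # Query a larger candidate pool, then (hypothetically) filter
        # before returning k of them.
        _, idx = self.tree.query(x, k=2 * k)
        return idx[:k]
```

A user's mesh format would subclass the base class, override connectivity or point selection as needed, and use all, some, or none of the defaults.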