New FFT development notes: I have tested the ideas I've developed for a new FFT on a GPU for several 1D vector lengths. The speedups obtained were as follows: length 99, 44.5x; length 199, 105.5x; length 499, 207.7x; and length 1023, 5.6x. Speedup results for the 2D FFT will be announced soon.
Dale Mugler’s Post
More Relevant Posts
-
The 1D FFT speedups I shared in a previous post have now been extended to the important 2D FFT for image analysis. Over rectangular domains with odd side lengths, the new algorithms provide speedups from 22.4x to 81.1x for example regions ranging in size from 151x153 to 991x993. These results are for GPU computations compared with the built-in fft algorithm. The next goal is to speed up 2D filtering!
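For anyone who wants to run a similar comparison, here is a minimal CPU-only sketch of how a speedup figure against a built-in FFT can be measured. It is not the new algorithm from the post: `candidate_fft2` is a hypothetical stand-in (a plain row-column transform built from NumPy's 1D FFT), and the helper names, repeat count, and random test data are assumptions made for illustration.

```python
import time

import numpy as np


def candidate_fft2(x):
    """Hypothetical stand-in for the algorithm under test: row-column 2D FFT."""
    return np.fft.fft(np.fft.fft(x, axis=1), axis=0)


def best_time(fn, x, repeats=5):
    """Return the best wall-clock time of fn(x) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(x)
        best = min(best, time.perf_counter() - start)
    return best


def measure_speedup(shape=(151, 153), repeats=5, seed=0):
    """Speedup = baseline time / candidate time on a random complex grid."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    # Check correctness before timing; a fast wrong answer is not a speedup.
    assert np.allclose(candidate_fft2(x), np.fft.fft2(x))
    baseline = best_time(np.fft.fft2, x, repeats)
    candidate = best_time(candidate_fft2, x, repeats)
    return baseline / candidate


if __name__ == "__main__":
    print(f"speedup on 151x153: {measure_speedup():.2f}x")
```

Since the stand-in does the same work as the baseline, the ratio it reports should not be read as a meaningful speedup. On a GPU, the timed region would also need a device synchronization before the clock is stopped, because kernel launches return before the work finishes.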
-
⚡Fast Fluid Simulations with Sparse Volumes on the GPU⚡ City scene simulated with 29 million particles on a 512×256×512 grid with our spatially sparse, matrix-free FLIP solver on a Quadro GP100 GPU at an average 1.8 seconds/frame 📝Paper: https://bit.ly/3mN6TJ4 💻Code: https://lnkd.in/epCKUeJy #cfd #simulation #engineering
-
Tired of marching cubes producing skinny triangles and losing sharp geometric details? The NVIDIA Kaolin library for #3d #deeplearning and #AIresearch now includes FlexiCubes, a method for extracting meshes from scalar fields designated for gradient-based mesh optimization. Check out this tutorial: https://lnkd.in/g2zQP3aX
Mesh Optimization Using FlexiCubes with NVIDIA Kaolin Library v0.15.0
-
#Ansys Discovery™ GPU meshing offers a powerful combination of robustness, speed, and efficient GPU memory use, ideal for complex fluid simulations. With local fidelity control, it enables smooth mesh transitions along boundaries to enhance resolution in intricate areas like thin fluid channels. Its fault-tolerant approach effectively manages overlapping, imperfect, and non-manifold geometries, while conformal mesh generation automatically synchronizes mesh sizes across interfaces, reducing the need for multiple assignments. 👉 Explore the latest in Ansys Discovery here: https://lnkd.in/dpUsr-ZN #Discovery #GPUMeshing #FluidSimulation #Engineering #Simulation
-
498 million (one might even say half a billion) unknowns on 1000 cores with 16 TB RAM, solved in under 6 minutes. All you need is your laptop browser! That breaks our internal record for 3D #EM waves #FEM frequency simulation of optical metalenses in Quanscient #Allsolve and will certainly help our customers design and understand their devices even better. Order-2 edge shape functions. Fully automatic unstructured DDM partitioning. No tricks or problem-specific optimizations involved (because you wouldn't believe the numbers if it were even faster).
-
I don't usually share R&D... but this is pretty cool. Implicit ray marching on GPU (proof-of-concept). Dynamic SDF blending of the DeathStar with a Torus. SDF and its gradients evaluated in a fragment.glsl shader... #xcompute #implicit #geometry #engineering
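As a rough CPU illustration of the technique named in this post, the sketch below sphere-traces a ray against two signed distance functions joined by a smooth minimum, and estimates the SDF gradient with central differences. It is not the shader from the post: a plain sphere stands in for the DeathStar shape, and the radii, blend width `k`, step limits, and function names are all assumptions.

```python
import math


def sd_sphere(p, radius):
    """Signed distance from point p to a sphere centred at the origin."""
    x, y, z = p
    return math.sqrt(x * x + y * y + z * z) - radius


def sd_torus(p, major, minor):
    """Signed distance to a torus lying in the xz-plane (axis along y)."""
    x, y, z = p
    q = math.sqrt(x * x + z * z) - major
    return math.sqrt(q * q + y * y) - minor


def smooth_min(a, b, k):
    """Polynomial smooth minimum; k sets the width of the blend region."""
    h = max(k - abs(a - b), 0.0) / k
    return min(a, b) - h * h * k * 0.25


def scene(p, k=0.3):
    """Blended SDF: sphere (stand-in shape) smoothly joined with a torus."""
    return smooth_min(sd_sphere(p, 1.0), sd_torus(p, 1.5, 0.25), k)


def gradient(p, eps=1e-4):
    """Central-difference estimate of the SDF gradient (the surface normal)."""
    x, y, z = p
    return (
        (scene((x + eps, y, z)) - scene((x - eps, y, z))) / (2 * eps),
        (scene((x, y + eps, z)) - scene((x, y - eps, z))) / (2 * eps),
        (scene((x, y, z + eps)) - scene((x, y, z - eps))) / (2 * eps),
    )


def march(origin, direction, max_steps=128, hit_eps=1e-4, max_dist=20.0):
    """Sphere tracing: step along the ray by the current distance value."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = scene(p)
        if dist < hit_eps:
            return t  # ray parameter at the hit point
        t += dist
        if t > max_dist:
            break
    return None  # no hit within the step or distance budget
```

Calling `march((0.0, 5.0, 0.0), (0.0, -1.0, 0.0))` traces a ray straight down the torus axis onto the top of the sphere. In a fragment shader the same loop would run once per pixel, with `k` varied over time to animate the blend.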
-
Introducing the SWIR microscope system
This video shows the pattern of a silicon wafer imaged from the back side.
* 0 sec: back-side surface of the silicon wafer
* 7 sec: pattern image at 1100 nm
* 16 sec: pattern image at 1100 nm after image processing
* 27 sec: pattern image at 1300 nm
* 38 sec: pattern image at 1550 nm
Comparison images per wavelength
-
RISC-V vector processing leverages vector length agnosticism, which sets it apart from how other CPU vendors have approached vector processing. Worth a watch if you're not familiar with the concept. https://lnkd.in/ea2g4xiz
The Magic of RISC-V Vector Processing
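For readers new to the idea, here is a small Python model of the strip-mining loop that vector length agnostic code is built around. It is a simplified sketch, not real RISC-V code: `set_vl` is a hypothetical stand-in for the `vsetvli` instruction that only models the common "grant min(remaining, VLMAX)" behaviour and ignores element width and register grouping.

```python
def set_vl(remaining, vlmax):
    """Simplified model of vsetvli: grant at most vlmax elements per pass."""
    return min(remaining, vlmax)


def vla_add(a, b, vlmax):
    """Element-wise add written without hardcoding the hardware vector length."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    if vlmax < 1:
        raise ValueError("vlmax must be at least 1")
    out = []
    i = 0
    n = len(a)
    while i < n:
        # Ask the "hardware" how many elements this pass may process.
        vl = set_vl(n - i, vlmax)
        # One modeled vector instruction handles vl elements at once.
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
    return out


if __name__ == "__main__":
    a = list(range(10))
    b = list(range(10, 20))
    # The same loop gives the same result for any modeled vector length.
    for vlmax in (1, 4, 16):
        assert vla_add(a, b, vlmax) == [x + y for x, y in zip(a, b)]
```

The point of the pattern is that the loop body never mentions a fixed width, so the same code runs unchanged on implementations with short or long vector registers, and the tail of the array needs no separate scalar cleanup loop.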
-
⚡OpenFOAM - CFD - Sport Car - Aerodynamic Flow - DDES-SA⚡ Example case showing a numerical simulation of the airflow near a car. The OpenFOAM solver used is pimpleFoam with the DDES-SA turbulence model. The so-called marineFoam solver, entirely developed at G-MET Technologies, was used. - Mesh size: about 50 M cells. - CPU time: 50 h on a 128-core cluster with an InfiniBand bus. 🌎 Source: https://lnkd.in/eUhxnfET 📥 Engineering & Science newsletter: https://lnkd.in/d7B7fqA #cfd #simulation #engineering
Professor of Mathematics at Georgia Southern University
6mo: This is great, Dr. Mugler. Please keep the updates coming. Please let me know if you happen to visit the Savannah area. We will get together and catch up!