General-purpose computing on graphics processing units
Related publications (324)
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Deep learning has revolutionalized image-level tasks such as classification, but patch-level tasks, such as correspondence, still rely on hand-crafted features, e.g. SIFT. In this paper we use Convolutional Neural Networks (CNNs) to learn discriminant patc ...
With the availability of user oriented software tools, dedicated architectures, such as the parallel computing platform and programming model CUDA (Compute Unified Device Architecture) released by NVIDIA, one of the main producers of graphics cards, and of ...
To address the classic interior tomography problem where projections at each view extend only to the shadow of a circular region completely interior to the subject being scanned, previously we showed that the exact recovery of two- and three-dimensional pi ...
The RSA cryptosystem introduced in 1977 by Ron Rivest, Adi Shamir and Len Adleman is the most commonly deployed public-key cryptosystem. Elliptic curve cryptography (ECC) introduced in the mid 80's by Neal Koblitz and Victor Miller is becoming an increasin ...
General-purpose processors, while tremendously versatile, pay a huge cost for their flexibility by wasting over 99% of the energy in programmability overheads. We observe that reducing this waste requires tuning data storage and compute structures and thei ...
All computing platforms, from mobile to supercomputers, are becoming more and more heterogeneous and massively parallel. While they can provide higher power efficiency and computation throughput, effective and confident use of these systems always requires ...
One of the problems proven to be NP-hard in the field of many-core architectures is the partitioning of stream programs. In order to maximize the execution parallelism and obtain the maximal data throughput for a streaming application it is essential to fi ...
Deformation and fracture processes in engineering materials often require simultaneous descriptions over a range of length and time scales, with each scale using a different computational technique. Here we present a high-performance parallel 3D computing ...
The floating occupation molecular orbital-complete active space configuration interaction (FOMO-CASCI) method is a promising alternative to the state-averaged complete active space self-consistent field (SA-CASSCF) method. We have formulated the analytic f ...
We present a numerical solver for the fast conduction system in the heart using both a CPU and a hybrid CPU/GPU implementation. To verify both implementations, we construct analytical solutions and show that the L-2-error is similar in both implementations ...