iSuppli: CCDs fall in image sensor market as CMOS surges. CCDs are predicted to account for less than 5% of total shipments in 2014.
As expected, Fermi is finally powering the fastest supercomputers in the world. The new Tianhe-1A uses NVIDIA GPUs instead of the ATI GPUs found in the old Tianhe.
Can CPUs Match GPUs on Performance with Productivity?: Experiences with Optimizing a FLOP-intensive Application on CPUs and GPU, IBM Tech Report 2010. The plot looks similar to those in previous papers comparing CPUs and GPUs.
Stacked Silicon Interconnect Technology to appear in Virtex 7.
Motion history image: its variants and applications, in Machine Vision and Applications 2010.
Computer Graphics on a Stream Architecture, by John Owens in 2002. It makes some nice arguments for time-multiplexed architectures for computer graphics. It claims the same motivation as NVIDIA’s unified shader approach, which led to the GPU computing revolution.
Understanding throughput-oriented architectures, by Michael Garland and David B. Kirk from NVIDIA, in CACM Nov 2010. This article is excellent for educational purposes. It puts more emphasis on the historical aspects than previous articles from NVIDIA, and it also has some pointers to classic works.
Tera-Scale Performance Machine Learning SoC (MLSoC) With Dual Stream Processor Architecture for Multimedia Content Analysis, in JSSC Nov. 2010. Their MLSoC has impressive energy efficiency compared with Xetal, the KAIST vision chip, iVision, and CRISP.
Machine Vision and Applications has a special issue on Integrated Imaging and Vision Techniques for Industrial Inspection.
HPArch, a research group led by Hyesoon Kim.
Tom’s Hardware has a recent overview and benchmark of the Radeon HD 6800 series. The Radeon’s die is 50% smaller than that of its GeForce counterpart. Besides providing more architectural support for general-purpose computing, GeForce spends more area on its scalar PEs than Radeon spends on its VLIW PEs for the same raw performance.
In their paper “The Performance Potential for Single Application Heterogeneous Systems”, Henry Wong and Tor M. Aamodt touch on the topic of VLIW PEs in the context of read-after-write latency. However, it is unclear whether VLIW helps to improve computational density for their benchmarks.