NVIDIA Claims TSMC 20nm will not Scale?, in SemiWiki.com.
Reading CS classics, in Communications of the ACM April 2012. It points to some classic works in computer science. One of them is Dijkstra's 1972 Turing Lecture, titled “The Humble Programmer”. Most of the challenges Dijkstra raised in 1972 remain unsolved in 2012. The problems have only become worse.
Some interesting papers in Transactions on Architecture and Code Optimization (TACO) March 2012:
- When Prefetching Works, When It Doesn’t, and Why, by Jaekyu Lee, Hyesoon Kim and Richard Vuduc.
- A Massively Parallel, Energy Efficient Programmable Accelerator for Learning and Classification, the MAPLE accelerator from NEC.
- Comparability Graph Coloring for Optimizing Utilization of Software-Managed Stream Register Files for Stream Processors, from National University of Defense Technology in China.
MultiAmdahl: How Should I Divide My Heterogeneous Chip? (PDF), to appear in Computer Architecture Letters.
Crossbar NoCs Are Scalable Beyond 100 Nodes, in TCAD April 2012.
I have recently read the memoir of Alan Greenspan, titled “The Age of Turbulence: Adventures in a New World”. The time span of this book stretches from the 1920s to the new millennium. It is shocking to realize how much the world has changed in the past century. Despite, or perhaps because of, being one of the most experienced central bankers alive, he is somewhat hesitant to predict the future. That is why the subtitle speaks of “adventures” in a new world.
Two recent blog posts in HBR:
VLSI Design of an SVM Learning Core on Sequential Minimal Optimization Algorithm, in TVLSI April 2012. The architecture is implemented and evaluated on an FPGA.
Frontiers of audiovisual communications: new convergences of broadband communications, computing, and rich media, a special issue in Proceedings of the IEEE April 2012. Some interesting papers:
- The Evolution of Video Processing Technology and Its Main Drivers, from Intel.
- Next-Generation Applications on Cellular Networks: Trends, Challenges, and Solutions, from Ericsson.
- Multimedia Standards: Interfaces to Innovation
- Display Holography’s Digital Second Act, from MIT Media Lab.
- FTV for 3-D Spatial Communication, from Nagoya U.
- Large-Scale Situation Awareness With Camera Networks and Multimodal Sensing
- Haptic Communications, from TU Munich.
Some interesting papers from Transactions on Pattern Analysis and Machine Intelligence (TPAMI) May 2012:
- Meaningful Matches in Stereovision
- High Accuracy and Visibility-Consistent Dense Multiview Stereo
- Gradient Response Maps for Real-Time Detection of Textureless Objects
- The Light Field Camera: Extended Depth of Field, Aliasing, and Superresolution
- Tracking Pedestrians Using Local Spatio-Temporal Motion Patterns in Extremely Crowded Scenes
Reducing DRAM Image Data Access Energy Consumption in Video Processing, in Transactions on Multimedia (TMM) April 2012.
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays (FPGA) 2012 is available. Some related papers:
- A lean FPGA soft processor built using a DSP block, from Nanyang Tech. U. and Xilinx.
- A scalable approach for automated precision analysis, from Imperial College London.
- OCTAVO: an FPGA-centric processor family, from U. of Toronto.
- Accelerator compiler for the VENICE vector processor, from U. of British Columbia.
- FPGA-accelerated 3D reconstruction using compressive sensing, from UCLA.
- Compiling high throughput network processors, from University of Texas.
- Impact of FPGA architecture on resource sharing in high-level synthesis, from U. of Toronto and Altera.
- The VTR project: architecture and CAD for FPGAs from verilog to routing, from U. of Toronto, CUHK, U. of British Columbia, U. of New Brunswick, and Miami U. The source code is hosted on the Google Code page The Verilog-to-Routing (VTR) Project for FPGAs.
- A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications, from U. of Florida.
- CONNECT: re-examining conventional wisdom for designing nocs in the context of FPGAs, from Carnegie Mellon U.
Performance of a Real-Time EtherCAT Master Under Linux, to appear in Transactions on Industrial Informatics. The jitter of the EtherCAT interface is very low, measured at approx. 15 microseconds. A GPU workload using pinned memory significantly increases the jitter. A GPU workload using unpinned memory, however, has little impact on it. The paper does not mention which northbridge chipset is used, which may have a large impact on the jitter.
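As a side note, jitter figures of this kind are usually derived from the wake-up timestamps of a cyclic task. The following is a minimal sketch, assuming jitter is taken as the deviation of each observed cycle period from the nominal period; the paper's exact definition is not restated here, and the function name and sample values are hypothetical.

```python
from statistics import mean


def cycle_jitter(timestamps_us, nominal_period_us):
    """Return (worst, average) absolute period deviation in microseconds.

    Assumption: jitter is the deviation of each observed cycle period
    from the nominal period. This is one common definition, not
    necessarily the one used in the paper.
    """
    if len(timestamps_us) < 2:
        raise ValueError("need at least two timestamps")
    # Observed period between consecutive wake-ups of the cyclic task.
    periods = [b - a for a, b in zip(timestamps_us, timestamps_us[1:])]
    # Absolute deviation of each period from the nominal cycle time.
    deviations = [abs(p - nominal_period_us) for p in periods]
    return max(deviations), mean(deviations)


# Hypothetical wake-up times (in microseconds) of a 1000 us cyclic task.
samples = [0, 1003, 1998, 3010, 4000]
worst, average = cycle_jitter(samples, 1000)
print(f"worst-case jitter: {worst} us, average jitter: {average} us")
```

For a real-time system the worst-case value is normally the number that matters, since a single late cycle can already violate a deadline.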
A random collection of interesting papers by Danny Abramovitch:
- A Tutorial on the Mechanisms, Dynamics, and Control of Atomic Force Microscopes, in ACC 2007.
- A Survey of Non-Raster Scan Methods with Application to Atomic Force Microscopy, in ACC 2007.
- Combined Feedforward/Feedback Control of Atomic Force Microscopes, in ACC 2007.
- Semi-automatic tuning of PID gains for Atomic Force Microscopes, in ACC 2008. It breaks down the contributions of different methods on the frequency response.
- A tale of three actuators: How mechanics, business models and position sensing affect different mechatronic servo problems, in ACC 2009.
- Low latency demodulation for Atomic Force Microscopes, Part I efficient real-time integration, in ACC 2009.
- Low Latency Demodulation for Atomic Force Microscopes, Part II: Efficient Calculation of Magnitude and Phase (PDF), in Proceedings of the 18th IFAC World Congress, 2011.
- A Discrete-Time Single-Parameter Combined Feedforward/Feedback Adaptive-Delay Algorithm With Applications to Piezo-Based Raster Tracking, in TCST March 2012.
NVIDIA provides the details of its new GeForce GTX 680 in a whitepaper. The emphasis on energy efficiency leads to two important architectural changes. The first is the shift of instruction scheduling from hardware to software. The second is to lower the clock frequency of the processing elements, which in earlier generations ran at twice the clock frequency of the other components of the data path.
Are RTOSes Dead?, by Bryon Moyer in eejournal.com. The answer is “NO”, at least for the moment. There is room for a real-time operating system (RTOS) to fill the gap between bare metal and real-time Linux.