Most of computing’s carbon emissions are coming from manufacturing and infrastructure
Authors: Carole-Jean Wu, Udit Gupta
Date: March 2021
Reducing computing’s carbon footprint isn’t all about
optimizing hardware and software.
When it comes to reducing carbon emissions, tech companies have started
considering their complete carbon footprint. Since companies have stronger
operational control over their own facilities and energy procurement, many of
them have spent the last decade focusing on reducing their emissions related to
operational energy consumption (opex) and setting carbon-neutral or net-zero
operational goals. But as more companies approach their 100 percent
renewable energy targets, they’ve started looking into emissions related to
their value chain or capital energy consumption (capex): indirect emissions
that come from hardware manufacturing and infrastructure.
Read more...
Optimizing infrastructure for neural recommendation at scale
Authors: Carole-Jean Wu, Udit Gupta
Date: March 2020
We are sharing an in-depth characterization and analysis of the infrastructure used to deliver personalized results with deep neural network (DNN)-based recommendation at scale. Although DNNs are often used to help generate search results, to provide content suggestions, and for other common applications for internet services, relatively little research attention has been devoted to optimizing system infrastructure to serve such recommendations at scale. In addition to sharing insights about how this important class of neural recommendation models performs at production scale, we’ve also released the open source workloads and related performance metrics that we used, to help other researchers and engineers evaluate their DNNs.
Read more...
Deep Learning: It’s Not All About Recognizing Cats and Dogs
Authors: Carole-Jean Wu, David Brooks, Udit Gupta, Hsien-Hsin Lee, and Kim Hazelwood
Date: November 2019
Recommendation systems form the backbone of most internet
services: search engines use recommendation to order results,
social networks to suggest friends and content, shopping
websites to suggest purchases, and video streaming services to
recommend movies [Facebook, Google, Alibaba, YouTube].
Recent publications show that an important class of
Facebook’s recommendation use cases requires more than 10x the
datacenter inference capacity compared to common computer
vision and NLP tasks. In fact, major categories of
recommendation models account for over 70% of all AI inference
cycles in Facebook’s production datacenter.
In addition to their importance, DNN-based personalized
recommendation models process both continuous and categorical
input features, leading to unique performance bottlenecks
compared to CNNs and RNNs.
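
As a rough illustration of what handling both feature types can look like, here is a minimal pure-Python sketch. It assumes a common design in which each categorical feature is a list of IDs looked up in an embedding table and sum-pooled, while continuous features are passed through as plain numbers. The function names, table sizes, and feature values are hypothetical and not taken from the post.

```python
import random


def make_embedding_table(num_rows, dim, seed=0):
    """Build a toy embedding table: one dense vector per categorical ID."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_rows)]


def pooled_lookup(table, ids):
    """Gather the rows for the given IDs and sum them element-wise."""
    dim = len(table[0])
    pooled = [0.0] * dim
    for i in ids:
        row = table[i]  # data-dependent access into a potentially large table
        for d in range(dim):
            pooled[d] += row[d]
    return pooled


def build_model_input(dense_features, sparse_features, tables):
    """Concatenate continuous values with one pooled embedding per categorical feature."""
    combined = list(dense_features)
    for table, ids in zip(tables, sparse_features):
        combined.extend(pooled_lookup(table, ids))
    return combined


# Hypothetical example: 3 continuous features and 2 categorical features.
tables = [make_embedding_table(1000, 4, seed=1), make_embedding_table(50, 4, seed=2)]
dense = [0.3, 12.0, 0.7]
sparse = [[17, 256, 999], [8]]  # IDs per categorical feature

x = build_model_input(dense, sparse, tables)
print(len(x))  # 3 dense values + 2 * 4 embedding values = 11
```

In this sketch the continuous path is a simple copy, while the categorical path is dominated by table lookups whose addresses depend on the input IDs. That difference in access pattern is one plausible reason such models might stress a system differently from CNNs and RNNs.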
Read more...
Designing AI-Enabled Technology for Society
Authors: Udit Gupta, Lillian Pentecost
Date: October 2018
AI-enabled technology surrounds us in everyday life, from Face
ID on an iPhone X to Google searches and tailored advertisements
sent from the cloud. This means AI is implemented everywhere,
from smartphones to data centers all over the globe. How are
these devices designed to support AI, and how does this change
our daily interactions with technology? In this talk, we will
use three examples (intelligent personal assistants, serving
online search requests, and medical imaging) to discuss how AI
is implemented and its impact on how we interact with
technology.
Read more...
Software-Programmable FPGAs
Authors: Udit Gupta
Date: June 2016
Modern workloads demand higher computational capabilities at
low power consumption and cost. As traditional multi-core
machines do not meet the growing computing requirements,
architects are exploring alternative approaches. One solution
is hardware specialization in the form of application-specific
integrated circuits (ASICs) to perform tasks at higher
performance and lower power than software implementations. The
cost of developing custom ASICs, however, remains high.
Reconfigurable computing fabrics, such as field-programmable
gate arrays (FPGAs), offer a promising alternative to custom
ASICs. FPGAs couple the benefits of hardware acceleration with
flexibility and lower cost.
Read more...