diff --git a/_quarto.yml b/_quarto.yml index b43e72ba0..b041b6775 100644 --- a/_quarto.yml +++ b/_quarto.yml @@ -244,7 +244,7 @@ format: reference-location: margin citation-location: margin - sidenote: true Enable sidenotes for Tufte style + sidenote: true #Enable sidenotes for Tufte style linkcolor: "#A51C30" urlcolor: "#A51C30" highlight-style: github diff --git a/contents/core/conclusion/conclusion.qmd b/contents/core/conclusion/conclusion.qmd index 9e5d5b1a7..bc2e83bbf 100644 --- a/contents/core/conclusion/conclusion.qmd +++ b/contents/core/conclusion/conclusion.qmd @@ -44,7 +44,7 @@ In addition to distributed training, we discussed techniques for optimizing the Deploying trained ML models is more complex than simply running the networks; efficiency is critical (@sec-efficient_ai). In this chapter on AI efficiency, we emphasized that efficiency is not merely a luxury but a necessity in artificial intelligence systems. We dug into the key concepts underpinning AI systems' efficiency, recognizing that the computational demands on neural networks can be daunting, even for minimal systems. For AI to be seamlessly integrated into everyday devices and essential systems, it must perform optimally within the constraints of limited resources while maintaining its efficacy. -Throughout the book, we have highlighted the importance of pursuing efficiency to ensure that AI models are streamlined, rapid, and sustainable. By optimizing models for efficiency, we can widen their applicability across various platforms and scenarios, enabling AI to be deployed in resource-constrained environments such as embedded systems and edge devices. This pursuit of efficiency is crucial for the widespread adoption and practical implementation of AI technologies in real-world applications. +Throughout the book, we have highlighted the importance of pursuing efficiency to ensure that AI models are streamlined, rapid, and sustainable. 
By optimizing models for efficiency, we can widen their applicability across various platforms and scenarios, enabling AI to be deployed in resource-constrained environments such as embedded systems and edge devices. This pursuit of efficiency is necessary for the widespread adoption and practical implementation of AI technologies in real-world applications. ## Optimizing ML Model Architectures @@ -90,7 +90,7 @@ In addition to security, we addressed the critical issue of data privacy. Techni ## Upholding Ethical Considerations -As we embrace ML advancements in all facets of our lives, it is crucial to remain mindful of the ethical considerations that will shape the future of AI (@sec-responsible_ai). Fairness, transparency, accountability, and privacy in AI systems will be paramount as they become more integrated into our lives and decision-making processes. +As we embrace ML advancements in all facets of our lives, it is essential to remain mindful of the ethical considerations that will shape the future of AI (@sec-responsible_ai). Fairness, transparency, accountability, and privacy in AI systems will be paramount as they become more integrated into our lives and decision-making processes. As AI systems become more pervasive and influential, it is important to ensure that they are designed and deployed in a manner that upholds ethical principles. This means actively mitigating biases, promoting fairness, and preventing discriminatory outcomes. Additionally, ethical AI design ensures transparency in how AI systems make decisions, enabling users to understand and trust their outputs. @@ -98,11 +98,11 @@ Accountability is another critical ethical consideration. As AI systems take on Ethical frameworks, regulations, and standards will be essential to address these ethical challenges. 
These frameworks should guide the responsible development and deployment of AI technologies, ensuring that they align with societal values and promote the well-being of individuals and communities. -Moreover, ongoing discussions and collaborations among researchers, practitioners, policymakers, and society will be crucial in navigating the ethical landscape of AI. These conversations should be inclusive and diverse, bringing together different perspectives and expertise to develop comprehensive and equitable solutions. As we move forward, it is the collective responsibility of all stakeholders to prioritize ethical considerations in the development and deployment of AI systems. +Moreover, ongoing discussions and collaborations among researchers, practitioners, policymakers, and society will be important in navigating the ethical landscape of AI. These conversations should be inclusive and diverse, bringing together different perspectives and expertise to develop comprehensive and equitable solutions. As we move forward, it is the collective responsibility of all stakeholders to prioritize ethical considerations in the development and deployment of AI systems. ## Promoting Sustainability -The increasing computational demands of machine learning, particularly for training large models, have raised concerns about their environmental impact due to high energy consumption and carbon emissions (@sec-sustainable_ai). As the scale and complexity of models continue to grow, addressing the sustainability challenges associated with AI development becomes imperative. To mitigate the environmental footprint of AI, the development of energy-efficient algorithms is crucial. This involves optimizing models and training procedures to minimize computational requirements while maintaining performance. Techniques such as model compression, quantization, and efficient neural architecture search can help reduce the energy consumption of AI systems. 
+The increasing computational demands of machine learning, particularly for training large models, have raised concerns about their environmental impact due to high energy consumption and carbon emissions (@sec-sustainable_ai). As the scale and complexity of models continue to grow, addressing the sustainability challenges associated with AI development becomes imperative. To mitigate the environmental footprint of AI, the development of energy-efficient algorithms is necessary. This involves optimizing models and training procedures to minimize computational requirements while maintaining performance. Techniques such as model compression, quantization, and efficient neural architecture search can help reduce the energy consumption of AI systems. Using renewable energy sources to power AI infrastructure is another important step towards sustainability. By transitioning to clean energy sources such as solar, wind, and hydropower, the carbon emissions associated with AI development can be significantly reduced. This requires a concerted effort from the AI community and support from policymakers and industry leaders to invest in and adopt renewable energy solutions. In addition, exploring alternative computing paradigms, such as neuromorphic and photonic computing, holds promise for developing more energy-efficient AI systems. By developing hardware and algorithms that emulate the brain's processing mechanisms, we can potentially create AI systems that are both powerful and sustainable. @@ -124,17 +124,17 @@ As we look to the future, the trajectory of ML systems points towards a paradigm We anticipate a growing emphasis on data curation, labeling, and augmentation techniques in the coming years. These practices aim to ensure that models are trained on high-quality, representative data that accurately reflects the complexities and nuances of real-world scenarios. 
By focusing on data quality and diversity, we can mitigate the risks of biased or skewed models that may perpetuate unfair or discriminatory outcomes. -This data-centric approach will be crucial in addressing the challenges of bias, fairness, and generalizability in ML systems. By actively seeking out and incorporating diverse and inclusive datasets, we can develop more robust, equitable, and applicable models for various contexts and populations. Moreover, the emphasis on data will drive advancements in techniques such as data augmentation, where existing datasets are expanded and diversified through data synthesis, translation, and generation. These techniques can help overcome the limitations of small or imbalanced datasets, enabling the development of more accurate and generalizable models. +This data-centric approach will be vital in addressing the challenges of bias, fairness, and generalizability in ML systems. By actively seeking out and incorporating diverse and inclusive datasets, we can develop more robust, equitable, and applicable models for various contexts and populations. Moreover, the emphasis on data will drive advancements in techniques such as data augmentation, where existing datasets are expanded and diversified through data synthesis, translation, and generation. These techniques can help overcome the limitations of small or imbalanced datasets, enabling the development of more accurate and generalizable models. -In recent years, generative AI has taken the field by storm, demonstrating remarkable capabilities in creating realistic images, videos, and text. However, the rise of generative AI also brings new challenges for ML systems (@sec-generative_ai). Unlike traditional ML systems, generative models often demand more computational resources and pose challenges in terms of scalability and efficiency. 
Furthermore, evaluating and benchmarking generative models presents difficulties, as traditional metrics used for classification tasks may not be directly applicable. Developing robust evaluation frameworks for generative models is an active area of research. +In recent years, generative AI has taken the field by storm, demonstrating remarkable capabilities in creating realistic images, videos, and text. However, the rise of generative AI also brings new challenges for ML systems. Unlike traditional ML systems, generative models often demand more computational resources and pose challenges in terms of scalability and efficiency. Furthermore, evaluating and benchmarking generative models presents difficulties, as traditional metrics used for classification tasks may not be directly applicable. Developing robust evaluation frameworks for generative models is an active area of research, and something we hope to write about soon! -Understanding and addressing these system challenges and ethical considerations will be crucial in shaping the future of generative AI and its impact on society. As ML practitioners and researchers, we are responsible for advancing the technical capabilities of generative models and developing robust systems and frameworks that can mitigate potential risks and ensure the beneficial application of this powerful technology. +Understanding and addressing these system challenges and ethical considerations will be important in shaping the future of generative AI and its impact on society. As ML practitioners and researchers, we are responsible for advancing the technical capabilities of generative models and developing robust systems and frameworks that can mitigate potential risks and ensure the beneficial application of this powerful technology. ## Applying AI for Good The potential for AI to be used for social good is vast, provided that responsible ML systems are developed and deployed at scale across various use cases (@sec-ai_for_good). 
To realize this potential, it is essential for researchers and practitioners to actively engage in the process of learning, experimentation, and pushing the boundaries of what is possible. -Throughout the development of ML systems, it is crucial to remember the key themes and lessons explored in this book. These include the importance of data quality and diversity, the pursuit of efficiency and robustness, the potential of TinyML and neuromorphic computing, and the imperative of security and privacy. These insights inform the work and guide the decisions of those involved in developing AI systems. +Throughout the development of ML systems, it is important to remember the key themes and lessons explored in this book. These include the importance of data quality and diversity, the pursuit of efficiency and robustness, the potential of TinyML and neuromorphic computing, and the imperative of security and privacy. These insights inform the work and guide the decisions of those involved in developing AI systems. It is important to recognize that the development of AI is not solely a technical endeavor but also a deeply human one. It requires collaboration, empathy, and a commitment to understanding the societal implications of the systems being created. Engaging with experts from diverse fields, such as ethics, social sciences, and policy, is essential to ensure that the AI systems developed are technically sound, socially responsible, and beneficial. Embracing the opportunity to be part of this transformative field and shaping its future is a privilege and a responsibility. By working together, we can create a world where ML systems serve as tools for positive change and improving the human condition. 
diff --git a/contents/core/dnn_architectures/dnn_architectures.qmd b/contents/core/dnn_architectures/dnn_architectures.qmd index 00b849fb3..8ac35727b 100644 --- a/contents/core/dnn_architectures/dnn_architectures.qmd +++ b/contents/core/dnn_architectures/dnn_architectures.qmd @@ -38,7 +38,7 @@ Deep learning architecture stands for specific representation or organizations o Neural network architectures have evolved to address specific pattern processing challenges. Whether processing arbitrary feature relationships, exploiting spatial patterns, managing temporal dependencies, or handling dynamic information flow, each architectural pattern emerged from particular computational needs. These architectures, from a computer systems perspective, require an examination of how their computational patterns map to system resources. -Most often the architectures are discussed in terms of their algorithmic structures (MLPs, CNNs, RNNs, Transformers). However, in this chapter we take a more fundamental approach by examining how their computational patterns map to hardware resources. Each section analyzes how specific pattern processing needs influence algorithmic structure and how these structures map to computer system resources. The implications for computer system design require examining how their computational patterns map to hardware resources. The mapping from algorithmic requirements to computer system design involves several key considerations: +Most often the architectures are discussed in terms of their algorithmic structures (MLPs, CNNs, RNNs, Transformers). However, in this chapter we take a more fundamental approach by examining how their computational patterns map to hardware resources. Each section analyzes how specific Pattern Processing Needs influence algorithmic structure and how these structures map to computer system resources. The implications for computer system design require examining how their computational patterns map to hardware resources. 
The mapping from algorithmic requirements to computer system design involves several key considerations: 1. Memory access patterns: How data moves through the memory hierarchy 2. Computation characteristics: The nature and organization of arithmetic operations @@ -53,7 +53,7 @@ Multi-Layer Perceptrons (MLPs) represent the most direct extension of neural net When applied to the MNIST handwritten digit recognition challenge, an MLP reveals its computational power by transforming a complex 28×28 pixel image into a precise digit classification. By treating each of the 784 pixels as an equally weighted input, the network learns to decompose visual information through a systematic progression of layers, converting raw pixel intensities into increasingly abstract representations that capture the essential characteristics of handwritten digits. -### Pattern Processing Need +### Pattern Processing Needs Deep learning systems frequently encounter problems where any input feature could potentially influence any output---there are no inherent constraints on these relationships. Consider analyzing financial market data: any economic indicator might affect any market outcome or in natural language processing, where the meaning of a word could depend on any other word in the sentence. These scenarios demand an architectural pattern capable of learning arbitrary relationships across all input features. @@ -185,7 +185,7 @@ We've just scratched the surface of neural networks. Now, you'll get to try and While MLPs treat each input element independently, many real-world data types exhibit strong spatial relationships. Images, for example, derive their meaning from the spatial arrangement of pixels—a pattern of edges and textures that form recognizable objects. Audio signals show temporal patterns of frequency components, and sensor data often contains spatial or temporal correlations. 
These spatial relationships suggest that treating every input-output connection with equal importance, as MLPs do, might not be the most effective approach. -### Pattern Processing Need +### Pattern Processing Needs Spatial pattern processing addresses scenarios where the relationship between data points depends on their relative positions or proximity. Consider processing a natural image: a pixel's relationship with its neighbors is important for detecting edges, textures, and shapes. These local patterns then combine hierarchically to form more complex features—edges form shapes, shapes form objects, and objects form scenes. @@ -309,7 +309,7 @@ The predictable spatial access pattern enables strategic data movement optimizat While MLPs handle arbitrary relationships and CNNs process spatial patterns, many real-world problems involve sequential data where the order and relationship between elements over time matters. Text processing requires understanding how words relate to previous context, speech recognition needs to track how sounds form coherent patterns, and time-series analysis must capture how values evolve over time. These sequential relationships suggest that treating each time step independently misses crucial temporal patterns. -### Pattern Processing Need +### Pattern Processing Needs Sequential pattern processing addresses scenarios where the meaning of current input depends on what came before it. Consider natural language processing: the meaning of a word often depends heavily on previous words in the sentence. The word "bank" means something different in "river bank" versus "bank account." Similarly, in speech recognition, a phoneme's interpretation often depends on surrounding sounds, and in financial forecasting, future predictions require understanding patterns in historical data. 
@@ -417,7 +417,7 @@ These characteristics illustrate why different optimization strategies have evol While previous architectures process patterns in fixed ways—MLPs with dense connectivity, CNNs with spatial operations, and RNNs with sequential updates—many tasks require dynamic relationships between elements that change based on content. Language understanding, for instance, needs to capture relationships between words that depend on meaning rather than just position. Graph analysis requires understanding connections that vary by node. These dynamic relationships suggest we need an architecture that can learn and adapt its processing patterns based on the data itself. -### Pattern Processing Need +### Pattern Processing Needs Dynamic pattern processing addresses scenarios where relationships between elements aren't fixed by architecture but instead emerge from content. Consider language translation: when translating "the bank by the river," understanding "bank" requires attending to "river," but in "the bank approved the loan," the important relationship is with "approved" and "loan." Unlike RNNs that process information sequentially or CNNs that use fixed spatial patterns, we need an architecture that can dynamically determine which relationships matter. @@ -679,22 +679,6 @@ Dynamic computation, where the operation itself depends on the input data, emerg These primitives combine in sophisticated ways in modern architectures. A Transformer layer processing a sequence of 512 tokens demonstrates this clearly: it uses matrix multiplications for feature projections (512×512 operations implemented through tensor cores), may employ sliding windows for efficient attention over long sequences (using specialized memory access patterns for local regions), and requires dynamic computation for attention weights (computing 512×512 attention patterns at runtime). 
The way these primitives interact creates specific demands on system design---from memory hierarchy organization to computation scheduling.
 
-Different neural network architectures leverage these core computational primitives in varying ways, as illustrated in @tbl-nn-arch-primitives:
-
-+----------------+-----------------------+----------------------------+-----------------------------+--------------------------+
-| Primitive Type | MLP                   | CNN                        | RNN                         | Transformer              |
-+:===============+:======================+:===========================+:============================+:=========================+
-| Computational  | Matrix Multiplication | Convolution (Matrix Mult.) | Matrix Mult. + State Update | Matrix Mult. + Attention |
-+----------------+-----------------------+----------------------------+-----------------------------+--------------------------+
-| Memory Access  | Sequential            | Strided                    | Sequential + Random         | Random (Attention)       |
-+----------------+-----------------------+----------------------------+-----------------------------+--------------------------+
-| Data Movement  | Broadcast             | Sliding Window             | Sequential                  | Broadcast + Gather       |
-+----------------+-----------------------+----------------------------+-----------------------------+--------------------------+
-
-: Utilization of primitives across neural network architectures. {#tbl-nn-arch-primitives .hover .striped}
-
-@tbl-nn-arch-primitives highlights how the fundamental operations we've discussed manifest in different architectures, showcasing both the commonalities and differences in their computational needs. For instance, while all architectures rely on matrix multiplication as a core computational primitive, they differ significantly in their memory access and data movement patterns.
-
 The building blocks we've discussed help explain why certain hardware features exist (like tensor cores for matrix multiplication) and why software frameworks organize computations in particular ways (like batching similar operations together). As we move from computational primitives to consider memory access and data movement patterns, it's important to recognize how these fundamental operations shape the demands placed on memory systems and data transfer mechanisms. The way computational primitives are implemented and combined has direct implications for how data needs to be stored, accessed, and moved within the system.
 
 ### Memory Access Primitives