data science the mit press essential knowledge series: Data Science John D. Kelleher, Brendan Tierney, 2018-04-13 A concise introduction to the emerging field of data science, explaining its evolution, relation to machine learning, current uses, data infrastructure issues, and ethical challenges. The goal of data science is to improve decision making through the analysis of data. Today data science determines the ads we see online, the books and movies that are recommended to us online, which emails are filtered into our spam folders, and even how much we pay for health insurance. This volume in the MIT Press Essential Knowledge series offers a concise introduction to the emerging field of data science, explaining its evolution, current uses, data infrastructure issues, and ethical challenges. It has never been easier for organizations to gather, store, and process data. Use of data science is driven by the rise of big data and social media, the development of high-performance computing, and the emergence of such powerful methods for data analysis and modeling as deep learning. Data science encompasses a set of principles, problem definitions, algorithms, and processes for extracting non-obvious and useful patterns from large datasets. It is closely related to the fields of data mining and machine learning, but broader in scope. This book offers a brief history of the field, introduces fundamental data concepts, and describes the stages in a data science project. It considers data infrastructure and the challenges posed by integrating data from multiple sources, introduces the basics of machine learning, and discusses how to link machine learning expertise with real-world problems. The book also reviews ethical and legal issues, developments in data regulation, and computational approaches to preserving privacy. Finally, it considers the future impact of data science and offers principles for success in data science projects. |
data science the mit press essential knowledge series: Deep Learning John D. Kelleher, 2019-09-10 An accessible introduction to the artificial intelligence technology that enables computer vision, speech recognition, machine translation, and driverless cars. Deep learning is an artificial intelligence technology that enables computer vision, speech recognition in mobile phones, machine translation, AI games, driverless cars, and other applications. When we use consumer products from Google, Microsoft, Facebook, Apple, or Baidu, we are often interacting with a deep learning system. In this volume in the MIT Press Essential Knowledge series, computer scientist John Kelleher offers an accessible and concise but comprehensive introduction to the fundamental technology at the heart of the artificial intelligence revolution. Kelleher explains that deep learning enables data-driven decisions by identifying and extracting patterns from large datasets; its ability to learn from complex data makes deep learning ideally suited to take advantage of the rapid growth in big data and computational power. Kelleher also explains some of the basic concepts in deep learning, presents a history of advances in the field, and discusses the current state of the art. He describes the most important deep learning architectures, including autoencoders, recurrent neural networks, and long short-term memory networks, as well as such recent developments as Generative Adversarial Networks and capsule networks. He also provides a comprehensive (and comprehensible) introduction to the two fundamental algorithms in deep learning: gradient descent and backpropagation. Finally, Kelleher considers the future of deep learning—major trends, possible developments, and significant challenges. |
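The two fundamental algorithms named above, gradient descent and backpropagation, are easy to illustrate in miniature. The sketch below is not taken from Kelleher's book; it is a minimal, self-contained Python example (with invented data and arbitrary hyperparameters) of gradient descent fitting a one-parameter linear model by repeatedly stepping against the gradient of a squared-error loss.

    # Minimal gradient descent: fit y = w * x by minimizing mean squared error.
    # Data and hyperparameters are invented for the example.
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.1, 3.9, 6.2, 8.1]      # roughly y = 2 * x

    w = 0.0                        # initial parameter guess
    learning_rate = 0.01

    for step in range(500):
        # d/dw of (1/n) * sum((w*x - y)^2) is (2/n) * sum((w*x - y) * x).
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad  # step downhill along the negative gradient

    print("learned w:", round(w, 3))   # approaches about 2.03, the least-squares fit

Backpropagation, the second algorithm, applies the chain rule to compute these same gradients efficiently through many layers of a network.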
data science the mit press essential knowledge series: Introduction to Machine Learning Ethem Alpaydin, 2014-08-22 Introduction -- Supervised learning -- Bayesian decision theory -- Parametric methods -- Multivariate methods -- Dimensionality reduction -- Clustering -- Nonparametric methods -- Decision trees -- Linear discrimination -- Multilayer perceptrons -- Local models -- Kernel machines -- Graphical models -- Hidden Markov models -- Bayesian estimation -- Combining multiple learners -- Reinforcement learning -- Design and analysis of machine learning experiments. |
data science the mit press essential knowledge series: Data Feminism Catherine D'Ignazio, Lauren F. Klein, 2020-03-31 A new way of thinking about data science and data ethics that is informed by the ideas of intersectional feminism. Today, data science is a form of power. It has been used to expose injustice, improve health outcomes, and topple governments. But it has also been used to discriminate, police, and surveil. This potential for good, on the one hand, and harm, on the other, makes it essential to ask: Data science by whom? Data science for whom? Data science with whose interests in mind? The narratives around big data and data science are overwhelmingly white, male, and techno-heroic. In Data Feminism, Catherine D'Ignazio and Lauren Klein present a new way of thinking about data science and data ethics—one that is informed by intersectional feminist thought. Illustrating data feminism in action, D'Ignazio and Klein show how challenges to the male/female binary can help challenge other hierarchical (and empirically wrong) classification systems. They explain how, for example, an understanding of emotion can expand our ideas about effective data visualization, and how the concept of invisible labor can expose the significant human efforts required by our automated systems. And they show why the data never, ever “speak for themselves.” Data Feminism offers strategies for data scientists seeking to learn how feminism can help them work toward justice, and for feminists who want to focus their efforts on the growing field of data science. But Data Feminism is about much more than gender. It is about power, about who has it and who doesn't, and about how those differentials of power can be challenged and changed. |
data science the mit press essential knowledge series: AI Ethics Mark Coeckelbergh, 2020-04-07 This overview of the ethical issues raised by artificial intelligence moves beyond hype and nightmare scenarios to address concrete questions—offering a compelling, necessary read for our ChatGPT era. Artificial intelligence powers Google’s search engine, enables Facebook to target advertising, and allows Alexa and Siri to do their jobs. AI is also behind self-driving cars, predictive policing, and autonomous weapons that can kill without human intervention. These and other AI applications raise complex ethical issues that are the subject of ongoing debate. This volume in the MIT Press Essential Knowledge series offers an accessible synthesis of these issues. Written by a philosopher of technology, AI Ethics goes beyond the usual hype and nightmare scenarios to address concrete questions. Mark Coeckelbergh describes influential AI narratives, ranging from Frankenstein’s monster to transhumanism and the technological singularity. He surveys relevant philosophical discussions: questions about the fundamental differences between humans and machines and debates over the moral status of AI. He explains the technology of AI, describing different approaches and focusing on machine learning and data science. He offers an overview of important ethical issues, including privacy concerns, responsibility and the delegation of decision making, transparency, and bias as it arises at all stages of data science processes. He also considers the future of work in an AI economy. Finally, he analyzes a range of policy proposals and discusses challenges for policymakers. He argues for ethical practices that embed values in design, translate democratic values into practices and include a vision of the good life and the good society. |
data science the mit press essential knowledge series: Machine Learning Ethem Alpaydin, 2016-10-07 A concise overview of machine learning—computer programs that learn from data—which underlies applications that include recommendation systems, face recognition, and driverless cars. Today, machine learning underlies a range of applications we use every day, from product recommendations to voice recognition—as well as some we don't yet use every day, including driverless cars. It is the basis of the new approach in computing where we do not write programs but collect data; the idea is to learn the algorithms for the tasks automatically from data. As computing devices grow more ubiquitous, a larger part of our lives and work is recorded digitally, and as “Big Data” has gotten bigger, the theory of machine learning—the foundation of efforts to process that data into knowledge—has also advanced. In this book, machine learning expert Ethem Alpaydin offers a concise overview of the subject for the general reader, describing its evolution, explaining important learning algorithms, and presenting example applications. Alpaydin offers an account of how digital technology advanced from number-crunching mainframes to mobile devices, putting today's machine learning boom in context. He describes the basics of machine learning and some applications; the use of machine learning algorithms for pattern recognition; artificial neural networks inspired by the human brain; algorithms that learn associations between instances, with such applications as customer segmentation and learning recommendations; and reinforcement learning, when an autonomous agent learns to act so as to maximize reward and minimize penalty. Alpaydin then considers some future directions for machine learning and the new field of “data science,” and discusses the ethical and legal implications for data privacy and security. |
data science the mit press essential knowledge series: Fundamentals of Machine Learning for Predictive Data Analytics, second edition John D. Kelleher, Brian Mac Namee, Aoife D'Arcy, 2020-10-20 The second edition of a comprehensive introduction to machine learning approaches used in predictive data analytics, covering both theory and practice. Machine learning is often used to build predictive models by extracting patterns from large datasets. These models are used in predictive data analytics applications including price prediction, risk assessment, predicting customer behavior, and document classification. This introductory textbook offers a detailed and focused treatment of the most important machine learning approaches used in predictive data analytics, covering both theoretical concepts and practical applications. Technical and mathematical material is augmented with explanatory worked examples, and case studies illustrate the application of these models in the broader business context. This second edition covers recent developments in machine learning, especially in a new chapter on deep learning, and two new chapters that go beyond predictive analytics to cover unsupervised learning and reinforcement learning. |
data science the mit press essential knowledge series: Recommendation Engines Michael Schrage, 2020-09-01 How companies like Amazon, Netflix, and Spotify know what you might also like: the history, technology, business, and societal impact of online recommendation engines. Increasingly, our technologies are giving us better, faster, smarter, and more personal advice than our own families and best friends. Amazon already knows what kind of books and household goods you like and is more than eager to recommend more; YouTube and TikTok always have another video lined up to show you; Netflix has crunched the numbers of your viewing habits to suggest whole genres that you would enjoy. In this volume in the MIT Press's Essential Knowledge series, innovation expert Michael Schrage explains the origins, technologies, business applications, and increasing societal impact of recommendation engines, the systems that allow companies worldwide to know what products, services, and experiences you might also like. |
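Schrage's treatment is conceptual, but one common building block of recommendation engines, item-to-item similarity, fits in a few lines. The sketch below is purely illustrative and is not from the book: it computes cosine similarity between items in a tiny invented user-item rating matrix with NumPy and picks an unrated item for one user.

    import numpy as np

    # Invented ratings: rows are users, columns are items; 0 means "not rated".
    ratings = np.array([
        [5, 4, 0, 1],
        [4, 5, 1, 0],
        [1, 0, 5, 4],
        [0, 1, 4, 5],
    ], dtype=float)

    # Item-to-item cosine similarity computed on the rating columns.
    norms = np.linalg.norm(ratings, axis=0)
    similarity = (ratings.T @ ratings) / np.outer(norms, norms)

    # Score items for user 0 by similarity to the items that user already rated.
    user = ratings[0]
    scores = similarity @ user
    scores[user > 0] = -np.inf     # do not re-recommend items already rated
    print("recommend item index:", int(np.argmax(scores)))

Production systems layer much more on top (implicit feedback, matrix factorization, ranking models), but the "people who liked these items also liked..." intuition is the same.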
data science the mit press essential knowledge series: Annotation Remi H. Kalir, Antero Garcia, 2021-04-06 An introduction to annotation as a genre--a synthesis of reading, thinking, writing, and communication--and its significance in scholarship and everyday life. Annotation--the addition of a note to a text--is an everyday and social activity that provides information, shares commentary, sparks conversation, expresses power, and aids learning. It helps mediate the relationship between reading and writing. This volume in the MIT Press Essential Knowledge series offers an introduction to annotation and its literary, scholarly, civic, and everyday significance across historical and contemporary contexts. It approaches annotation as a genre--a synthesis of reading, thinking, writing, and communication--and offers examples of annotation that range from medieval rubrication and early book culture to data labeling and online reviews. |
data science the mit press essential knowledge series: Algorithms Panos Louridas, 2020-08-18 In the tradition of Real World Algorithms: A Beginner's Guide, Panos Louridas is back to introduce algorithms in an accessible manner, utilizing various examples to explain not just what algorithms are but how they work. Digital technology runs on algorithms, sets of instructions that describe how to do something efficiently. Application areas range from search engines to tournament scheduling, DNA sequencing, and machine learning. Arguing that every educated person today needs to have some understanding of algorithms and what they do, in this volume in the MIT Press Essential Knowledge series, Panos Louridas offers an introduction to algorithms that is accessible to the nonspecialist reader. Louridas explains not just what algorithms are but also how they work, offering a wide range of examples and keeping mathematics to a minimum. |
data science the mit press essential knowledge series: Critical Thinking Jonathan Haber, 2020-04-07 An insightful guide to the practice, teaching, and history of critical thinking—from Aristotle and Plato to John Dewey—for teachers, students, and anyone looking to hone their critical thinking skills. Critical thinking is regularly cited as an essential 21st century skill, the key to success in school and work. Given the propensity to believe fake news, draw incorrect conclusions, and make decisions based on emotion rather than reason, it might even be said that critical thinking is vital to the survival of a democratic society. But what, exactly, is critical thinking? Jonathan Haber explains how the concept of critical thinking emerged, how it has been defined, and how critical thinking skills can be taught and assessed. Haber describes the term's origins in such disciplines as philosophy, psychology, and science. He examines the components of critical thinking, including • structured thinking • language skills • background knowledge • information literacy • intellectual humility • empathy and open-mindedness Haber argues that the most important critical thinking issue today is that not enough people are doing enough of it. Fortunately, critical thinking can be taught, practiced, and evaluated. This book offers a guide for teachers, students, and aspiring critical thinkers everywhere, including advice for educational leaders and policy makers on how to make the teaching and learning of critical thinking an educational priority and practical reality. |
data science the mit press essential knowledge series: Behavioral Insights Michael Hallsworth, Elspeth Kirkman, 2020-09-01 The definitive introduction to the behavioral insights approach, which applies evidence about human behavior to practical problems. Our behavior is strongly influenced by factors that lie outside our conscious awareness, although we tend to underestimate the power of this “automatic” side of our behavior. As a result, governments make ineffective policies, businesses create bad products, and individuals make unrealistic plans. In contrast, the behavioral insights approach applies evidence about actual human behavior—rather than assumptions about it—to practical problems. This volume in the MIT Press Essential Knowledge series, written by two leading experts in the field, offers an accessible introduction to behavioral insights, describing core features, origins, and practical examples. These insights have opened up new ways of addressing some of the biggest challenges faced by societies, changing the way that governments, businesses, and nonprofits work in the process. This book shows how the approach is grounded in a concern with practical problems, the use of evidence about human behavior to address those problems, and experimentation to evaluate the impact of the solutions. It gives an overview of the approach's origins in psychology and behavioral economics, its early adoption by the UK's pioneering “nudge unit,” and its recent expansion into new areas. The book also provides examples from across different policy areas and guidance on how to run a behavioral insights project. Finally, the book outlines the limitations and ethical implications of the approach, and what the future holds for this fast-moving area. |
data science the mit press essential knowledge series: Deep Learning Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016-11-10 An introduction to a broad range of topics in deep learning, covering mathematical and conceptual background, deep learning techniques used in industry, and research perspectives. “Written by three experts in the field, Deep Learning is the only comprehensive book on the subject.” —Elon Musk, cochair of OpenAI; cofounder and CEO of Tesla and SpaceX Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors. |
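The "hierarchy of concepts" idea can be glimpsed in a toy forward pass through a deep feedforward network. This is a generic sketch, not code from Goodfellow, Bengio, and Courville: two layers of random, untrained weights in NumPy, with a ReLU nonlinearity and a softmax output.

    import numpy as np

    rng = np.random.default_rng(42)

    def relu(z):
        """Rectified linear unit, a common hidden-layer nonlinearity."""
        return np.maximum(0.0, z)

    x = rng.normal(size=4)                         # one 4-dimensional input
    W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)  # layer 1: 4 -> 5 units
    W2, b2 = rng.normal(size=(3, 5)), np.zeros(3)  # layer 2: 5 -> 3 outputs

    h = relu(W1 @ x + b1)          # simple features (here built from random weights)
    logits = W2 @ h + b2           # higher-level outputs composed from those features

    # Softmax converts the outputs into a probability distribution over 3 classes.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    print("class probabilities:", np.round(probs, 3))

Training would adjust W1, b1, W2, and b2 by gradient descent; each added layer composes the previous layer's features into more abstract ones.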
data science the mit press essential knowledge series: Metadata Jeffrey Pomerantz, 2015-11-06 Everything we need to know about metadata, the usually invisible infrastructure for information with which we interact every day. When “metadata” became breaking news, appearing in stories about surveillance by the National Security Agency, many members of the public encountered this once-obscure term from information science for the first time. Should people be reassured that the NSA was “only” collecting metadata about phone calls—information about the caller, the recipient, the time, the duration, the location—and not recordings of the conversations themselves? Or does phone call metadata reveal more than it seems? In this book, Jeffrey Pomerantz offers an accessible and concise introduction to metadata. In the era of ubiquitous computing, metadata has become infrastructural, like the electrical grid or the highway system. We interact with it or generate it every day. It is not, Pomerantz tells us, just “data about data.” It is a means by which the complexity of an object is represented in a simpler form. For example, the title, the author, and the cover art are metadata about a book. When metadata does its job well, it fades into the background; everyone (except perhaps the NSA) takes it for granted. Pomerantz explains what metadata is, and why it exists. He distinguishes among different types of metadata—descriptive, administrative, structural, preservation, and use—and examines different users and uses of each type. He discusses the technologies that make modern metadata possible, and he speculates about metadata's future. By the end of the book, readers will see metadata everywhere. Because, Pomerantz warns us, it's metadata's world, and we are just living in it. |
data science the mit press essential knowledge series: Smart Cities Germaine Halegoua, 2020-02-18 Key concepts, definitions, examples, and historical contexts for understanding smart cities, along with discussions of both drawbacks and benefits of this approach to urban problems. Over the past ten years, urban planners, technology companies, and governments have promoted smart cities with a somewhat utopian vision of urban life made knowable and manageable through data collection and analysis. Emerging smart cities have become both crucibles and showrooms for the practical application of the Internet of Things, cloud computing, and the integration of big data into everyday life. Are smart cities optimized, sustainable, digitally networked solutions to urban problems? Or are they neoliberal, corporate-controlled, undemocratic non-places? This volume in the MIT Press Essential Knowledge series offers a concise introduction to smart cities, presenting key concepts, definitions, examples, and historical contexts, along with discussions of both the drawbacks and the benefits of this approach to urban life. After reviewing current terminology and justifications employed by technology designers, journalists, and researchers, the book describes three models for smart city development—smart-from-the-start cities, retrofitted cities, and social cities—and offers examples of each. It covers technologies and methods, including sensors, public wi-fi, big data, and smartphone apps, and discusses how developers conceive of interactions among the built environment, technological and urban infrastructures, citizens, and citizen engagement. Throughout, the author—who has studied smart cities around the world—argues that smart city developers should work more closely with local communities, recognizing their preexisting relationship to urban place and realizing the limits of technological fixes. Smartness is a means to an end: improving the quality of urban life. |
data science the mit press essential knowledge series: Self-Tracking Gina Neff, Dawn Nafus, 2016-06-24 What happens when people turn their everyday experience into data: an introduction to the essential ideas and key challenges of self-tracking. People keep track. In the eighteenth century, Benjamin Franklin kept charts of time spent and virtues lived up to. Today, people use technology to self-track: hours slept, steps taken, calories consumed, medications administered. Ninety million wearable sensors were shipped in 2014 to help us gather data about our lives. This book examines how people record, analyze, and reflect on this data, looking at the tools they use and the communities they become part of. Gina Neff and Dawn Nafus describe what happens when people turn their everyday experience—in particular, health and wellness-related experience—into data, and offer an introduction to the essential ideas and key challenges of using these technologies. They consider self-tracking as a social and cultural phenomenon, describing not only the use of data as a kind of mirror of the self but also how this enables people to connect to, and learn from, others. Neff and Nafus consider what's at stake: who wants our data and why; the practices of serious self-tracking enthusiasts; the design of commercial self-tracking technology; and how self-tracking can fill gaps in the healthcare system. Today, no one can lead an entirely untracked life. Neff and Nafus show us how to use data in a way that empowers and educates. |
data science the mit press essential knowledge series: Computational Thinking Peter J. Denning, Matti Tedre, 2019-05-14 This pocket-sized introduction to computational thinking and problem-solving traces its genealogy centuries before the digital computer. A few decades into the digital era, scientists discovered that thinking in terms of computation made possible an entirely new way of organizing scientific investigation. Eventually, every field had a computational branch: computational physics, computational biology, computational sociology. More recently, “computational thinking” has become part of the K–12 curriculum. But what is computational thinking? This volume in the MIT Press Essential Knowledge series offers an accessible overview—tracing a genealogy that begins centuries before digital computers and portraying computational thinking as the pioneers of computing have described it. The authors explain that computational thinking (CT) is not a set of concepts for programming; it is a way of thinking that is honed through practice: the mental skills for designing computations to do jobs for us, and for explaining and interpreting the world as a complex of information processes. Mathematically trained experts (known as “computers”) who performed complex calculations as teams engaged in CT long before electronic computers. In each chapter, the authors identify different dimensions of today's highly developed CT: • Computational Methods • Computing Machines • Computing Education • Software Engineering • Computational Science • Design Along the way, they debunk inflated claims for CT and computation while making clear the power of CT in all its complexity and multiplicity. |
data science the mit press essential knowledge series: Cloud Computing Nayan B. Ruparelia, 2016-05-13 Why cloud computing represents a paradigm shift for business, and how business users can best take advantage of cloud services. Most of the information available on cloud computing is either highly technical, with details that are irrelevant to non-technologists, or pure marketing hype, in which the cloud is simply a selling point. This book, however, explains the cloud from the user's viewpoint—the business user's in particular. Nayan Ruparelia explains what the cloud is, when to use it (and when not to), how to select a cloud service, how to integrate it with other technologies, and what the best practices are for using cloud computing. Cutting through the hype, Ruparelia cites the simple and basic definition of cloud computing from the National Institute of Standards and Technology: a model enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources. Thus with cloud computing, businesses can harness information technology resources usually available only to large enterprises. And this, Ruparelia demonstrates, represents a paradigm shift for business. It will ease funding for startups, alter business plans, and allow big businesses greater agility. Ruparelia discusses the key issues for any organization considering cloud computing: service level agreements, business service delivery and consumption, finance, legal jurisdiction, security, and social responsibility. He introduces novel concepts made possible by cloud computing: cloud cells, or specialist clouds for specific uses; the personal cloud; the cloud of things; and cloud service exchanges. He examines use case patterns in terms of infrastructure and platform, software information, and business process; and he explains how to transition to a cloud service. Current and future users will find this book an indispensable guide to the cloud. |
data science the mit press essential knowledge series: Science Fiction Sherryl Vint, 2021-02-16 How science fiction has been a tool for understanding and living through rapid technological change. The world today seems to be slipping into a science fiction future. We have phones that speak to us, cars that drive themselves, and connected devices that communicate with each other in languages we don't understand. Depending on the news of the day, we inhabit either a technological utopia or a Brave New World nightmare. This volume in the MIT Press Essential Knowledge series surveys the uses of science fiction. It focuses on what is at the core of all definitions of science fiction: a vision of the world made otherwise and what possibilities might flow from such otherness. |
data science the mit press essential knowledge series: fMRI Peter A. Bandettini, 2020-02-25 An accessible introduction to the history, fundamental concepts, challenges, and controversies of the fMRI by one of the pioneers in the field. The discovery of functional MRI (fMRI) methodology in 1991 was a breakthrough in neuroscience research. This non-invasive, relatively high-speed, and high sensitivity method of mapping human brain activity enabled observation of subtle localized changes in blood flow associated with brain activity. Thousands of scientists around the world have not only embraced fMRI as a new and powerful method that complemented their ongoing studies but have also gone on to redirect their research around this revolutionary technique. This volume in the MIT Press Essential Knowledge series offers an accessible introduction to the history, fundamental concepts, challenges, and controversies of fMRI, written by one of the pioneers in the field. Peter Bandettini covers the essentials of fMRI, providing insight and perspective from his nearly three decades of research. He describes other brain imaging and assessment methods; the sources of fMRI contrasts; the basic methodology, from hardware to pulse sequences; brain activation experiment design strategies; and data and image processing. A unique, standalone chapter addresses major controversies in the field, outlining twenty-six challenges that have helped shape fMRI research. Finally, Bandettini lays out the four essential pillars of fMRI: technology, methodology, interpretation, and applications. The book can serve as a guide for the curious nonexpert and a reference for both veteran and novice fMRI scientists. |
data science the mit press essential knowledge series: Principles of Data Mining David J. Hand, Heikki Mannila, Padhraic Smyth, 2001-08-17 The first truly interdisciplinary text on data mining, blending the contributions of information science, computer science, and statistics. The growing interest in data mining is motivated by a common problem across disciplines: how does one store, access, model, and ultimately describe and understand very large data sets? Historically, different aspects of data mining have been addressed independently by different disciplines. This is the first truly interdisciplinary text on data mining, blending the contributions of information science, computer science, and statistics. The book consists of three sections. The first, foundations, provides a tutorial overview of the principles underlying data mining algorithms and their application. The presentation emphasizes intuition rather than rigor. The second section, data mining algorithms, shows how algorithms are constructed to solve specific problems in a principled manner. The algorithms covered include trees and rules for classification and regression, association rules, belief networks, classical statistical models, nonlinear models such as neural networks, and local memory-based models. The third section shows how all of the preceding analysis fits together when applied to real-world data mining problems. Topics include the role of metadata, how to handle missing data, and data preprocessing. |
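Of the algorithm families listed above, association rules make for the most compact worked example. The snippet below is illustrative only (not from the book): it computes the support and confidence of one candidate rule over a tiny invented set of transactions in plain Python.

    # Toy transactions; each set is one shopping basket (invented data).
    transactions = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"milk", "butter", "bread"},
        {"milk", "eggs"},
        {"bread", "milk", "eggs"},
    ]

    def support(itemset):
        """Fraction of transactions that contain every item in the itemset."""
        return sum(itemset <= t for t in transactions) / len(transactions)

    # Candidate rule: {bread, milk} -> {butter}
    antecedent = {"bread", "milk"}
    consequent = {"butter"}

    rule_support = support(antecedent | consequent)
    confidence = rule_support / support(antecedent)
    print(f"support = {rule_support:.2f}, confidence = {confidence:.2f}")

Real association-rule miners such as Apriori search over many candidate itemsets and prune by a minimum support threshold; the two quantities above are the ones they report.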
data science the mit press essential knowledge series: Democratizing Innovation Eric Von Hippel, 2006-02-17 The process of user-centered innovation: how it can benefit both users and manufacturers and how its emergence will bring changes in business models and in public policy. Innovation is rapidly becoming democratized. Users, aided by improvements in computer and communications technology, increasingly can develop their own new products and services. These innovating users—both individuals and firms—often freely share their innovations with others, creating user-innovation communities and a rich intellectual commons. In Democratizing Innovation, Eric von Hippel looks closely at this emerging system of user-centered innovation. He explains why and when users find it profitable to develop new products and services for themselves, and why it often pays users to reveal their innovations freely for the use of all. The trend toward democratized innovation can be seen in software and information products—most notably in the free and open-source software movement—but also in physical products. Von Hippel's many examples of user innovation in action range from surgical equipment to surfboards to software security features. He shows that product and service development is concentrated among lead users, who are ahead on marketplace trends and whose innovations are often commercially attractive. Von Hippel argues that manufacturers should redesign their innovation processes and that they should systematically seek out innovations developed by users. He points to businesses—the custom semiconductor industry is one example—that have learned to assist user-innovators by providing them with toolkits for developing new products. User innovation has a positive impact on social welfare, and von Hippel proposes that government policies, including R&D subsidies and tax credits, should be realigned to eliminate biases against it. The goal of a democratized user-centered innovation system, says von Hippel, is well worth striving for. An electronic version of this book is available under a Creative Commons license. |
data science the mit press essential knowledge series: Cybersecurity Duane C. Wilson, 2021-09-14 An accessible guide to cybersecurity for the everyday user, covering cryptography and public key infrastructure, malware, blockchain, and other topics. It seems that everything we touch is connected to the internet, from mobile phones and wearable technology to home appliances and cyber assistants. The more connected our computer systems, the more exposed they are to cyber attacks--attempts to steal data, corrupt software, disrupt operations, and even physically damage hardware and network infrastructures. In this volume of the MIT Press Essential Knowledge series, cybersecurity expert Duane Wilson offers an accessible guide to cybersecurity issues for everyday users, describing risks associated with internet use, modern methods of defense against cyber attacks, and general principles for safer internet use. Wilson describes the principles that underlie all cybersecurity defense: confidentiality, integrity, availability, authentication, authorization, and non-repudiation (validating the source of information). He explains that confidentiality is accomplished by cryptography; examines the different layers of defense; analyzes cyber risks, threats, and vulnerabilities; and breaks down the cyber kill chain and the many forms of malware. He reviews some online applications of cybersecurity, including end-to-end security protection, secure ecommerce transactions, smart devices with built-in protections, and blockchain technology. Finally, Wilson considers the future of cybersecurity, discussing the continuing evolution of cyber defenses as well as research that may alter the overall threat landscape. |
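Two of the principles Wilson lists, integrity and authentication, have a standard small-scale realization in code. The example below is a hedged illustration, not an excerpt from the book: it uses Python's standard-library hashlib and hmac modules to fingerprint a message and verify it against a shared secret.

    import hashlib
    import hmac

    message = b"transfer 100 to account 42"
    shared_key = b"example-secret-key"    # in practice: a long random key, stored securely

    # Integrity: the SHA-256 digest changes if even one bit of the message changes.
    digest = hashlib.sha256(message).hexdigest()

    # Authentication: an HMAC ties the message to a secret that only sender and
    # receiver share, so a third party cannot forge a valid tag.
    tag = hmac.new(shared_key, message, hashlib.sha256).hexdigest()

    # Receiver side: recompute the tag and compare in constant time.
    expected = hmac.new(shared_key, message, hashlib.sha256).hexdigest()
    print("digest:", digest[:16], "...")
    print("authenticated:", hmac.compare_digest(tag, expected))

Confidentiality would be handled separately by encryption, and non-repudiation by digital signatures; this sketch covers only the two properties named in the lead-in.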
data science the mit press essential knowledge series: Data Science and Machine Learning Dirk P. Kroese, Zdravko Botev, Thomas Taimre, Radislav Vaisman, 2019-11-20 Focuses on mathematical understanding; presentation is self-contained, accessible, and comprehensive; full color throughout; extensive list of exercises and worked-out examples; many concrete algorithms with actual code. |
data science the mit press essential knowledge series: Foundations of Data Science Avrim Blum, John Hopcroft, Ravindran Kannan, 2020-01-23 This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data. |
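Since singular value decomposition is singled out above as a key linear-algebraic technique, here is a brief NumPy sketch (not drawn from the book) that decomposes a small matrix and forms its best rank-1 approximation from the leading singular value.

    import numpy as np

    A = np.array([[3.0, 1.0],
                  [1.0, 3.0],
                  [1.0, 1.0]])

    # Thin SVD: A = U @ diag(s) @ Vt, singular values sorted in decreasing order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Best rank-1 approximation keeps only the leading singular value and vectors.
    A1 = s[0] * np.outer(U[:, 0], Vt[0])

    print("singular values:", np.round(s, 3))
    print("rank-1 approximation error:", round(float(np.linalg.norm(A - A1)), 3))

The same low-rank idea underlies dimensionality reduction and topic modelling on much larger data matrices.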
data science the mit press essential knowledge series: Probabilistic Machine Learning Kevin P. Murphy, 2022-03-01 A detailed and up-to-date introduction to machine learning, presented through the unifying lens of probabilistic modeling and Bayesian decision theory. This book offers a detailed and up-to-date introduction to machine learning (including deep learning) through the unifying lens of probabilistic modeling and Bayesian decision theory. The book covers mathematical background (including linear algebra and optimization), basic supervised learning (including linear and logistic regression and deep neural networks), as well as more advanced topics (including transfer learning and unsupervised learning). End-of-chapter exercises allow students to apply what they have learned, and an appendix covers notation. Probabilistic Machine Learning grew out of the author’s 2012 book, Machine Learning: A Probabilistic Perspective. More than just a simple update, this is a completely new book that reflects the dramatic developments in the field since 2012, most notably deep learning. In addition, the new book is accompanied by online Python code, using libraries such as scikit-learn, JAX, PyTorch, and Tensorflow, which can be used to reproduce nearly all the figures; this code can be run inside a web browser using cloud-based notebooks, and provides a practical complement to the theoretical topics discussed in the book. This introductory text will be followed by a sequel that covers more advanced topics, taking the same probabilistic approach. |
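The companion code is described as using libraries such as scikit-learn; in that spirit, the snippet below is a generic, hedged illustration (not the book's own code) that fits a logistic regression classifier on scikit-learn's bundled iris dataset and reports held-out accuracy and predicted class probabilities.

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    # A basic probabilistic classifier: logistic regression on a three-class problem.
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    print("test accuracy:", round(clf.score(X_test, y_test), 3))
    print("probabilities for first test point:", clf.predict_proba(X_test[:1]).round(3))

The probabilistic framing in the book is what justifies reading predict_proba's output as a posterior over classes rather than an arbitrary score.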
data science the mit press essential knowledge series: Free Will Mark Balaguer, 2014-02-14 A philosopher considers whether the scientific and philosophical arguments against free will are reason enough to give up our belief in it. In our daily life, it really seems as though we have free will, that what we do from moment to moment is determined by conscious decisions that we freely make. You get up from the couch, you go for a walk, you eat chocolate ice cream. It seems that we're in control of actions like these; if we are, then we have free will. But in recent years, some have argued that free will is an illusion. The neuroscientist (and best-selling author) Sam Harris and the late Harvard psychologist Daniel Wegner, for example, claim that certain scientific findings disprove free will. In this engaging and accessible volume in the Essential Knowledge series, the philosopher Mark Balaguer examines the various arguments and experiments that have been cited to support the claim that human beings don't have free will. He finds them to be overstated and misguided. Balaguer discusses determinism, the view that every physical event is predetermined, or completely caused by prior events. He describes several philosophical and scientific arguments against free will, including one based on Benjamin Libet's famous neuroscientific experiments, which allegedly show that our conscious decisions are caused by neural events that occur before we choose. He considers various religious and philosophical views, including the philosophical pro-free-will view known as compatibilism. Balaguer concludes that the anti-free-will arguments put forward by philosophers, psychologists, and neuroscientists simply don't work. They don't provide any good reason to doubt the existence of free will. But, he cautions, this doesn't necessarily mean that we have free will. The question of whether we have free will remains an open one; we simply don't know enough about the brain to answer it definitively. |
data science the mit press essential knowledge series: Learning for Adaptive and Reactive Robot Control Aude Billard, Sina Mirrazavi, Nadia Figueroa, 2022-02-08 Methods by which robots can learn control laws that enable real-time reactivity using dynamical systems; with applications and exercises. This book presents a wealth of machine learning techniques to make the control of robots more flexible and safe when interacting with humans. It introduces a set of control laws that enable reactivity using dynamical systems, a widely used method for solving motion-planning problems in robotics. These control approaches can replan in milliseconds to adapt to new environmental constraints and offer safe and compliant control of forces in contact. The techniques offer theoretical advantages, including convergence to a goal, non-penetration of obstacles, and passivity. The coverage of learning begins with low-level control parameters and progresses to higher-level competencies composed of combinations of skills. Learning for Adaptive and Reactive Robot Control is designed for graduate-level courses in robotics, with chapters that proceed from fundamentals to more advanced content. Techniques covered include learning from demonstration, optimization, and reinforcement learning, and using dynamical systems in learning control laws, trajectory planning, and methods for compliant and force control. Features for teaching in each chapter: applications, which range from arm manipulators to whole-body control of humanoid robots; pencil-and-paper and programming exercises; lecture videos, slides, and MATLAB code examples available on the author's website; and an eTextbook platform website offering protected material for instructors, including solutions. |
data science the mit press essential knowledge series: Synesthesia Richard E. Cytowic, 2012-12-06 Synesthesia comes from the Greek syn (meaning union) and aisthesis (sensation), literally interpreted as a joining of the senses. Synesthesia is an involuntary joining in which the real information from one sense is joined or accompanies a perception in another. Dr. Cytowic reports extensive research into the physical, psychological, neural, and familial background of a group of synesthetes. His findings form the first complete picture of the brain mechanisms that underlie this remarkable perceptual experience. His research demonstrates that this rare condition is brain-based and perceptual and not mind-based, as is the case with memory or imagery. Synesthesia offers a unique and detailed study of a condition which has confounded scientists for more than 200 years. |
data science the mit press essential knowledge series: Machine Learning Kevin P. Murphy, 2012-08-24 A comprehensive introduction to machine learning that uses probabilistic models and inference as a unifying approach. Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package—PMTK (probabilistic modeling toolkit)—that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students. |
data science the mit press essential knowledge series: Building the New Economy Alex Pentland, Alexander Lipton, Thomas Hardjono, 2021-10-12 How to empower people and communities with user-centric data ownership, transparent and accountable algorithms, and secure digital transaction systems. Data is now central to the economy, government, and health systems—so why are data and the AI systems that interpret the data in the hands of so few people? Building the New Economy calls for us to reinvent the ways that data and artificial intelligence are used in civic and government systems. Arguing that we need to think about data as a new type of capital, the authors show that the use of data trusts and distributed ledgers can empower people and communities with user-centric data ownership, transparent and accountable algorithms, machine learning fairness principles and methodologies, and secure digital transaction systems. It’s well known that social media generate disinformation and that mobile phone tracking apps threaten privacy. But these same technologies may also enable the creation of more agile systems in which power and decision-making are distributed among stakeholders rather than concentrated in a few hands. Offering both big ideas and detailed blueprints, the authors describe such key building blocks as data cooperatives, tokenized funding mechanisms, and tradecoin architecture. They also discuss technical issues, including how to build an ecosystem of trusted data, the implementation of digital currencies, and interoperability, and consider the evolution of computational law systems. |
data science the mit press essential knowledge series: Understanding Beliefs Nils J. Nilsson, 2014-08-01 What beliefs are, what they do for us, how we come to hold them, and how to evaluate them. Our beliefs constitute a large part of our knowledge of the world. We have beliefs about objects, about culture, about the past, and about the future. We have beliefs about other people, and we believe that they have beliefs as well. We use beliefs to predict, to explain, to create, to console, to entertain. Some of our beliefs we call theories, and we are extraordinarily creative at constructing them. Theories of quantum mechanics, evolution, and relativity are examples. But so are theories about astrology, alien abduction, guardian angels, and reincarnation. All are products (with varying degrees of credibility) of fertile minds trying to find explanations for observed phenomena. In this book, Nils Nilsson examines beliefs: what they do for us, how we come to hold them, and how to evaluate them. We should evaluate our beliefs carefully, Nilsson points out, because they influence so many of our actions and decisions. Some of our beliefs are more strongly held than others, but all should be considered tentative and changeable. Nilsson shows that beliefs can be quantified by probability, and he describes networks of beliefs in which the probabilities of some beliefs affect the probabilities of others. He argues that we can evaluate our beliefs by adapting some of the practices of the scientific method and by consulting expert opinion. And he warns us about “belief traps”—holding onto beliefs that wouldn't survive critical evaluation. The best way to escape belief traps, he writes, is to expose our beliefs to the reasoned criticism of others. |
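Nilsson's claim that beliefs can be quantified by probability, with some probabilities affecting others, can be made concrete with a one-step Bayes update. The numbers below are invented purely for illustration and do not come from the book.

    # Belief: "it rained last night." Evidence: "the streets are wet."
    # All probabilities are invented for illustration.
    p_rain = 0.3                  # prior belief that it rained
    p_wet_given_rain = 0.9        # wet streets are likely if it rained
    p_wet_given_no_rain = 0.1     # sprinklers and street cleaning can also wet streets

    # Total probability of observing wet streets.
    p_wet = p_wet_given_rain * p_rain + p_wet_given_no_rain * (1 - p_rain)

    # Bayes' rule: updated (posterior) belief in rain after seeing wet streets.
    p_rain_given_wet = p_wet_given_rain * p_rain / p_wet

    print("prior belief in rain:", round(p_rain, 2))
    print("updated belief after seeing wet streets:", round(p_rain_given_wet, 2))

The update raises the belief from 0.30 to about 0.79, which is the quantitative version of Nilsson's point that evidence should move tentative beliefs rather than fix them.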
data science the mit press essential knowledge series: Mathematics for Machine Learning Marc Peter Deisenroth, A. Aldo Faisal, Cheng Soon Ong, 2020-04-23 The fundamental mathematical tools needed to understand machine learning include linear algebra, analytic geometry, matrix decompositions, vector calculus, optimization, probability and statistics. These topics are traditionally taught in disparate courses, making it hard for data science or computer science students, or professionals, to efficiently learn the mathematics. This self-contained textbook bridges the gap between mathematical and machine learning texts, introducing the mathematical concepts with a minimum of prerequisites. It uses these concepts to derive four central machine learning methods: linear regression, principal component analysis, Gaussian mixture models and support vector machines. For students and others with a mathematical background, these derivations provide a starting point to machine learning texts. For those learning the mathematics for the first time, the methods help build intuition and practical experience with applying mathematical concepts. Every chapter includes worked examples and exercises to test understanding. Programming tutorials are offered on the book's web site. |
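Of the four methods the book derives, linear regression has the most compact closed form. The following NumPy sketch (illustrative, not the book's own code) solves the normal equations on synthetic data whose true coefficients are chosen for the example.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic data: y = 2*x0 - 1*x1 + 0.5 + noise (coefficients chosen for the example).
    X = rng.normal(size=(100, 2))
    y = 2 * X[:, 0] - 1 * X[:, 1] + 0.5 + 0.1 * rng.normal(size=100)

    # Append a bias column and solve the normal equations (X^T X) w = X^T y.
    Xb = np.hstack([X, np.ones((100, 1))])
    w = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

    print("estimated [w0, w1, bias]:", np.round(w, 2))   # expect roughly [2.0, -1.0, 0.5]

Deriving why this solves the least-squares problem is exactly the kind of vector-calculus and linear-algebra exercise the book works through.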
data science the mit press essential knowledge series: Elements of Causal Inference Jonas Peters, Dominik Janzing, Bernhard Scholkopf, 2017-11-29 A concise and self-contained introduction to causal inference, increasingly important in data science and machine learning. The mathematization of causality is a relatively recent development, and has become increasingly important in data science and machine learning. This book offers a self-contained and concise introduction to causal models and how to learn them from data. After explaining the need for causal models and discussing some of the principles underlying causal inference, the book teaches readers how to use causal models: how to compute intervention distributions, how to infer causal models from observational and interventional data, and how causal ideas could be exploited for classical machine learning problems. All of these topics are discussed first in terms of two variables and then in the more general multivariate case. The bivariate case turns out to be a particularly hard problem for causal learning because there are no conditional independences as used by classical methods for solving multivariate cases. The authors consider analyzing statistical asymmetries between cause and effect to be highly instructive, and they report on their decade of intensive research into this problem. The book is accessible to readers with a background in machine learning or statistics, and can be used in graduate courses or as a reference for researchers. The text includes code snippets that can be copied and pasted, exercises, and an appendix with a summary of the most important technical concepts. |
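The distinction between observing a variable and intervening on it, which is what computing an intervention distribution formalizes, can be simulated directly. The sketch below is a hedged illustration rather than code from the book: a two-variable linear structural causal model with a confounder, all coefficients invented for the example, showing that E[Y | X = 1] and E[Y | do(X = 1)] differ.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200_000

    # Structural causal model with a confounder (all coefficients invented):
    #   Z := N_z,   X := Z + N_x,   Y := 2*X + 3*Z + N_y
    Z = rng.normal(size=n)
    X = Z + rng.normal(size=n)
    Y = 2 * X + 3 * Z + rng.normal(size=n)

    # Observational quantity: E[Y | X close to 1], estimated with a narrow window.
    window = np.abs(X - 1.0) < 0.05
    obs_mean = Y[window].mean()

    # Interventional quantity: do(X = 1) replaces X's mechanism but leaves Z alone,
    # so Y := 2*1 + 3*Z + N_y and the expectation equals the causal effect, 2.
    Y_do = 2 * 1.0 + 3 * Z + rng.normal(size=n)
    int_mean = Y_do.mean()

    print("E[Y | X near 1]  :", round(float(obs_mean), 2))   # about 3.5 (confounded)
    print("E[Y | do(X = 1)] :", round(float(int_mean), 2))   # about 2.0 (causal)

The gap between the two estimates is the confounding that conditioning alone cannot remove, which is why the book treats intervention distributions as a separate object from conditional distributions.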
data science the mit press essential knowledge series: Introduction to Data Science for Social and Policy Research Jose Manuel Magallanes Reyes, 2017-09-21 This comprehensive guide provides a step-by-step approach to data collection, cleaning, formatting, and storage, using Python and R. |
data science the mit press essential knowledge series: The Internet of Things Samuel Greengard, 2015-03-20 A guided tour through the Internet of Things, a networked world of connected devices, objects, and people that is changing the way we live and work. We turn on the lights in our house from a desk in an office miles away. Our refrigerator alerts us to buy milk on the way home. A package of cookies on the supermarket shelf suggests that we buy it, based on past purchases. The cookies themselves are on the shelf because of a “smart” supply chain. When we get home, the thermostat has already adjusted the temperature so that it's toasty or bracing, whichever we prefer. This is the Internet of Things—a networked world of connected devices, objects, and people. In this book, Samuel Greengard offers a guided tour through this emerging world and how it will change the way we live and work. Greengard explains that the Internet of Things (IoT) is still in its early stages. Smart phones, cloud computing, RFID (radio-frequency identification) technology, sensors, and miniaturization are converging to make possible a new generation of embedded and immersive technology. Greengard traces the origins of the IoT from the early days of personal computers and the Internet and examines how it creates the conceptual and practical framework for a connected world. He explores the industrial Internet and machine-to-machine communication, the basis for smart manufacturing and end-to-end supply chain visibility; the growing array of smart consumer devices and services—from Fitbit fitness wristbands to mobile apps for banking; the practical and technical challenges of building the IoT; and the risks of a connected world, including a widening digital divide and threats to privacy and security. Finally, he considers the long-term impact of the IoT on society, narrating an eye-opening “Day in the Life” of IoT connections circa 2025. |
data science the mit press essential knowledge series: The Future Nick Montfort, 2017-12-08 How the future has been imagined and made, through the work of writers, artists, inventors, and designers. The future is like an unwritten book. It is not something we see in a crystal ball, or can only hope to predict, like the weather. In this volume of the MIT Press's Essential Knowledge series, Nick Montfort argues that the future is something to be made, not predicted. Montfort offers what he considers essential knowledge about the future, as seen in the work of writers, artists, inventors, and designers (mainly in Western culture) who developed and described the core components of the futures they envisioned. Montfort's approach is not that of futurology or scenario planning; instead, he reports on the work of making the future—the thinkers who devoted themselves to writing pages in the unwritten book. Douglas Engelbart, Alan Kay, and Ted Nelson didn't predict the future of computing, for instance. They were three of the people who made it. Montfort focuses on how the development of technologies—with an emphasis on digital technologies—has been bound up with ideas about the future. Readers learn about kitchens of the future and the vision behind them; literary utopias, from Plato's Republic to Edward Bellamy's Looking Backward and Charlotte Perkins Gilman's Herland; the Futurama exhibit at the 1939 New York World's Fair; and what led up to Tim Berners-Lee's invention of the World Wide Web. Montfort describes the notebook computer as a human-centered alternative to the idea of the computer as a room-sized “giant brain”; speculative practice in design and science fiction; and, throughout, the best ways to imagine and build the future. |
data science the mit press essential knowledge series: Open Access Peter Suber, 2012-07-20 A concise introduction to the basics of open access, describing what it is (and isn't) and showing that it is easy, fast, inexpensive, legal, and beneficial. The Internet lets us share perfect copies of our work with a worldwide audience at virtually no cost. We take advantage of this revolutionary opportunity when we make our work “open access”: digital, online, free of charge, and free of most copyright and licensing restrictions. Open access is made possible by the Internet and copyright-holder consent, and many authors, musicians, filmmakers, and other creators who depend on royalties are understandably unwilling to give their consent. But for 350 years, scholars have written peer-reviewed journal articles for impact, not for money, and are free to consent to open access without losing revenue. In this concise introduction, Peter Suber tells us what open access is and isn't, how it benefits authors and readers of research, how we pay for it, how it avoids copyright problems, how it has moved from the periphery to the mainstream, and what its future may hold. Distilling a decade of Suber's influential writing and thinking about open access, this is the indispensable book on the subject for researchers, librarians, administrators, funders, publishers, and policy makers. |
data science the mit press essential knowledge series: Introduction to Data Science Rafael A. Irizarry, 2019-11-20 Introduction to Data Science: Data Analysis and Prediction Algorithms with R introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning. It also helps you develop skills such as R programming, data wrangling, data visualization, predictive algorithm building, file organization with UNIX/Linux shell, version control with Git and GitHub, and reproducible document preparation. This book is a textbook for a first course in data science. No previous knowledge of R is necessary, although some experience with programming may be helpful. The book is divided into six parts: R, data visualization, statistics with R, data wrangling, machine learning, and productivity tools. Each part has several chapters meant to be presented as one lecture. The author uses motivating case studies that realistically mimic a data scientist’s experience. He starts by asking specific questions and answers these through data analysis so concepts are learned as a means to answering the questions. Examples of the case studies included are: US murder rates by state, self-reported student heights, trends in world health and economics, the impact of vaccines on infectious disease rates, the financial crisis of 2007-2008, election forecasting, building a baseball team, image processing of hand-written digits, and movie recommendation systems. The statistical concepts used to answer the case study questions are only briefly introduced, so complementing with a probability and statistics textbook is highly recommended for in-depth understanding of these concepts. If you read and understand the chapters and complete the exercises, you will be prepared to learn the more advanced concepts and skills needed to become an expert. |
data science the mit press essential knowledge series: High-Dimensional Probability Roman Vershynin, 2018-09-27 An integrated package of powerful probabilistic tools and key applications in modern mathematical data science. |