Can a system be considered truly reliable if it isn't fundamentally secure? Or can it be considered secure if it's unreliable? Security is crucial to the design and operation of scalable systems in production, as it plays an important part in product quality, performance, and availability. In this book, experts from Google share best practices to help your organization design scalable and reliable systems that are fundamentally secure.
Two previous O'Reilly books from Google--Site Reliability Engineering and The Site Reliability Workbook--demonstrated how and why a commitment to the entire service lifecycle enables organizations to successfully build, deploy, monitor, and maintain software systems. In this latest guide, the authors offer insights into system design, implementation, and maintenance from practitioners who specialize in security and reliability. They also discuss how building and adopting their recommended best practices requires a culture that's supportive of such change.
You'll learn about secure and reliable systems through:
- Design strategies
- Recommendations for coding, testing, and debugging practices
- Strategies to prepare for, respond to, and recover from incidents
- Cultural best practices that help teams across your organization collaborate effectively
About the Authors
Heather Adkins is a 17-year Google veteran and founding member of the Google Security Team. As Sr Director of Information Security, she has built a global team responsible for maintaining the safety and security of Google's networks, systems and applications. She has an extensive background in systems and network administration with an emphasis on practical security, and has worked to build and secure some of the world's largest infrastructure. She now focuses her time primarily on the defense of Google's computing infrastructure and working with industry to tackle some of the greatest security challenges.
Betsy Beyer is a Technical Writer for Google Site Reliability Engineering in NYC, and the editor of Site Reliability Engineering: How Google Runs Production Systems and The Site Reliability Workbook. She has previously written documentation for Google's Data Center and Hardware Operations Teams in Mountain View and across its globally-distributed data centers. Before moving to New York, Betsy was a lecturer on technical writing at Stanford University. En route to her current career, Betsy studied International Relations and English Literature, and holds degrees from Stanford and Tulane.
Paul Blankinship manages the Technical Writing team for Google's Security and Privacy Engineering group. He's previously written documentation for Google Web Designer, and helped develop Google's internal security and privacy policies.
Piotr Lewandowski is a Senior Staff Site Reliability Engineer, and has spent the past nine years improving the security posture of Google's infrastructure. As the Production Tech Lead for Security, he is responsible for harmonious collaboration between the SRE and security organizations. In his previous role, he led a team responsible for the reliability of Google's critical security infrastructure. Before joining Google, he built a startup, worked at CERT Polska, and got a degree in computer science from Warsaw University of Technology.
Ana Oprea specializes in Site Reliability Engineering, Security, and planning and strategy for Google's Technical Infrastructure - a role that follows naturally from her previous experience as a Software Developer, Technical Consultant, and Network Admin. After working and studying in Germany, France, and Romania, she accounts for different cultural approaches when facing any challenge.
Adam Stubblefield is a Distinguished Engineer and the Horizontal Lead for Security at Google. Over the past 8 years, he's led teams that have built much of Google's core security infrastructure. Adam has a PhD in Computer Science from Johns Hopkins.