
On Finding Reasonable Measures To Bridge the Gap Between Privacy Engineers and Lawyers

Getting privacy lawyers and software engineers to work together to implement privacy is a perennial problem. Peter Swire, CIPP/US, and Annie Antón explored this in their article "Engineers and Lawyers in Privacy Protection: Can We All Just Get Along?", as has The Privacy Engineer's Manifesto, of which their article is part.

But here's the problem: We cannot keep addressing privacy from a top-down, legally driven perspective. No amount of additional processes and compliance checks is going to change the fact that software itself is deeply complex. Software engineering is often assumed to be the final stage of a project, and by some a mere consequence of requirements drawn from a large number of often conflicting sources.

Often privacy issues are “solved” by hashing an identifier, encrypting a communications link or anonymizing a data set, for some definition of anonymization. Many of these are piecemeal, ineffective, Band-Aid type solutions—the proverbial rearranging of deck chairs on the Titanic. Privacy Theater at its worst … or best.
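The hashing Band-Aid is easy to demonstrate. Here is a minimal Python sketch (the phone number and its format are invented for illustration) of why hashing an identifier is not anonymization when the input space is small:

```python
import hashlib

# "Anonymize" a phone number by hashing it -- a common band-aid fix.
phone = "555-0173"
digest = hashlib.sha256(phone.encode()).hexdigest()

# But the input space is tiny, so the mapping can be brute-forced back.
def recover(target, prefix="555-"):
    """Enumerate every possible number until one matches the hash."""
    for n in range(10000):
        candidate = f"{prefix}{n:04d}"
        if hashlib.sha256(candidate.encode()).hexdigest() == target:
            return candidate
    return None

assert recover(digest) == phone  # the "anonymized" value is trivially reversed
```

The sketch is simplified, but the underlying point holds for any identifier drawn from a guessable space: a bare hash is a lookup table waiting to be built.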

So how do we address this?

First, maybe we should realize that we don't really understand each other's disciplines. Very few lawyers are trained software engineers, and vice versa; constructing a lingua franca between these groups is therefore part of that first step.

Often, ideas that are simple in one domain do not translate to the other. For example, the term “reasonable” means nothing in the software engineering domain. Conversely, the simple act of “encryption” hides from the lawyer an enormous complexity of network protocols, key management, performance, encryption libraries and so on. Similarly, the now-ubiquitous use of “app” by lawyers to mean something that runs on your phone means a great deal more to the software engineer.
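To see how much a single word like “encryption” conceals, consider this hedged sketch using Python's standard library. Even keyed pseudonymization, which is far simpler than real encryption, immediately raises the key-management questions the word glosses over (the identifier and key handling here are illustrative only):

```python
import hmac, hashlib, secrets

# "Just encrypt it" hides a stack of engineering decisions. Even keyed
# pseudonymization raises them immediately:
key = secrets.token_bytes(32)   # Where is this generated? Stored? Rotated?

def pseudonymize(user_id: str) -> str:
    # HMAC rather than a bare hash, so the mapping can't be brute-forced
    # without the key -- but now the key itself IS the privacy property.
    return hmac.new(key, user_id.encode(), hashlib.sha256).hexdigest()

print(pseudonymize("user-1234"))
# Who may call this? Which logs capture the raw input? What happens on key loss?
```

Every comment above is a real design decision that some engineer must make, and none of them is answered by the word “reasonable.”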

What does “app” really mean to software engineers?

That little piece of code that downloads your news feed each morning and presents it in a friendly way—allowing you to scroll through using your touch screen device and maybe now and again presenting an advert because you didn't want to buy the paid version—is, in fact, not a tiny bit of code at all.

Effectively, what runs on your device is a multi-layered, complex interaction of hundreds of components, each sharing the device's memory, long-term storage, network components, display, keyboard, microphone, camera, speaker, GPS and so on. Each of those individual components interacts with layers underneath the interface, passing messages between components, scheduling when pieces of code must run and which piece of code gets the notification that you've just touched the screen as well as how and where data is stored.

When the app needs to get the next news item “from the web,” a huge stack of network protocols marshals the contents of the news feed, checks its consistency, performs error correction, reassembles the individual segments of the message from the network, and manages addressing and internal formats. There are protocols underneath this that decide on the routing between networks and others that control the electrical pulses over wires or antennae. This is repeated many, many times as messages are passed between routing points all over the Internet, including cell towers, home wireless routers, undersea cables and satellite connections. Even the use of privacy-preserving technologies—encryption, for example—can be rendered virtually meaningless because underneath lies a veritable treasure trove of metadata. And that is just the networking component!
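As a rough illustration, the metadata that remains observable even when the payload itself is encrypted can be sketched layer by layer. The field lists below are indicative, not an exhaustive protocol analysis:

```python
# What an eavesdropper can still see when a news app fetches an article
# over an encrypted connection. Values are illustrative, not a real capture.
observable_metadata = {
    "link layer": ["device MAC address", "access-point association"],
    "IP layer": ["source IP", "destination IP (the news server)", "packet sizes"],
    "TCP layer": ["destination port 443", "connection timing and duration"],
    "TLS handshake": ["server name (SNI, often cleartext)", "cipher suites"],
    "traffic analysis": ["request frequency", "response sizes hinting at which article"],
}

encrypted = {"HTTP request and response bodies"}  # the only part TLS hides

for layer, fields in observable_metadata.items():
    print(f"{layer}: {', '.join(fields)}")
```

The encrypted payload is one line; the observable metadata is everything else.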

Let's for the moment look at the application's development itself, probably coded in a language such as Java, a clever beast that theoretically runs anywhere because it sits on top of a virtual machine that can be ported to almost any computing platform. The language contains many features to make a programmer's life easier by abstracting away complex details of how computers actually work. These are brought together through a vast ecosystem of publicly available libraries for a plethora of tasks—from generating random numbers to big data analytics—in only a “few” lines of well-chosen code.

Even without diving deep into the code and components of which it’s made, similar complexity awaits in understanding where our content flows. Maybe your smartphone’s news app gets data from a news server and gives you the opportunity to share it through your favorite social network. Where does your data flow after these points? To advertisers, marketers, system administrators and so on? Through what components? And what kinds of processing? How about cross-referencing? Where is that data logged and who has access?

Constructing a full map or data flow of a typical software system—even a simple mobile app—becomes a spider web of interacting information flows, further complicated by layering over the logical and physical architectures, the geographical distribution and concepts such as controller and processor. Not to mention the actual content of the data, its provenance, the details of its collection, the risks, the links with policies and so on. And yet we still have not considered the security, performance and other functional and nonfunctional aspects!
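A toy version of such a data-flow map makes the spider web concrete; the nodes and edges below are invented for illustration, and a real system would have hundreds of each:

```python
# A toy data-flow map for a news app. Edges: "data held here flows to...".
flows = {
    "phone app": ["news server", "ad network", "crash reporter"],
    "news server": ["analytics pipeline", "web logs"],
    "ad network": ["data broker"],
    "web logs": ["sysadmins", "backup (another region)"],
}

def reachable(source):
    """Everywhere data can end up, transitively, from a starting point."""
    seen, stack = set(), [source]
    while stack:
        node = stack.pop()
        for nxt in flows.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

print(sorted(reachable("phone app")))
```

Even this ten-line toy shows data from the phone reaching a data broker and a backup in another region, neither of which appears anywhere in the app's own code.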

Software engineers have enough of a problem managing this complexity without having to learn and comprehend legal language, too. Indeed, privacy won't get a foothold in software engineering until a path from “reasonable” can be traced through the myriad requirements to those millions of lines of code that actually implement “reasonable.”

Engineers love a challenge, but often when solutions are laid out before them in terms of policy and vague concepts—concepts which to us might be perfectly reasonable (there's that word again!)—then those engineers are just going to ignore or, worse, mis-implement those requirements. Not out of any malicious intent but because those requirements are almost meaningless in their domain. Even concepts such as data minimization can lose much if not all of their meaning as we move through the layers of code and interaction.

Life as a software engineer is hard enough: juggling a complex framework of requirements, ideas, systems, components and so on, without also having to interpret what a privacy policy actually means across and inside the Internet-wide environment in which we work.

Software isn't a black box protected by a privacy policy and a suite of magic privacy-enhancing technologies but a veritable Pandora's Box of who-knows-what of “things.”

As privacy professionals, we should open that box often to fully comprehend what is really going on when we say “reasonable.” As software engineers, we'll more than make you welcome in our world, and we'd probably relish the chance to explore yours. But until we both get visibility in our respective domains and make the effort to understand each other's languages and how these relate to each other, this isn't going to happen—at least not in any “reasonable” way.

Written By

Ian Oliver

3 Comments

  • James Jul 29, 2014

    This is a good article, and well overdue. One of its contributions is to give a hint of the complexity inherent in software development. The lawyers that run the IAPP are fond of talking about 'apps', as if a tiny component running on a smartphone were the pinnacle of complexity. In reality, most enterprise systems running at large scale are highly complex socio-technical systems operating on a tremendously complex stack of technology. They have a myriad of interactions, dependencies and the like. To top it off, they are often used in ways that their designers did not intend, since the human actors can be creative and find their own ways to alter workflows. Those of us with competence in both law and software engineering are often at a loss to explain concepts from one domain to the other. Software engineers have an unfortunate tendency to regard the law as a system of rules, to take but one instance. Terms like 'reasonable' are in statutes for a reason - namely, to provide a space for interpretation and discretion. Engineers often talk of 'making it more precise', which shows a complete lack of understanding of why such terms exist in the first place. You covered some of the difficulties of getting lawyers to think about technology. (Namely, sheer ignorance). Encryption, anonymization and like terms cover a vast array of options, and in many cases there are major trade-offs to be weighed. The sheer scale of some systems makes manual data flow analysis very difficult, as any Amazon or Microsoft cloud engineer could attest. You have identified one of the key problems, which is translating legal requirements into artifacts for use by systems designers and engineers. This is a current research problem. The best introductory paper is probably "Towards the development of privacy-aware systems" by Guarda and Zannone. I would love to see the IAPP cease and desist from having lawyers write missives on 'privacy engineering'. If you haven't worked as an engineer (in any discipline), you have no business talking about engineering.

  • Richard Jul 30, 2014

    Speaking as someone whose job it is to try and 'bridge the gap' - I couldn't agree more. It also seems the House of Lords agrees, as they have said recently that asking search engines to make right to be forgotten decisions based on 'vague, ambiguous and unhelpful criteria' is not the right approach. I agree with the sentiment - but not their solution. They suggest absolving the search engines of responsibility, whereas clarifying and codifying the criteria would be the more positive step forward. In almost all cases the law is too vague for engineering but to give up on that basis is defeatist. Bridging that gap is the key to improved privacy and data protection.

  • Stephen Aug 5, 2014

    It's great to see so much attention now on the gap between engineering and privacy. In my view, one particular problem is that the words we use to "market" privacy are disengaging for engineers. Too often we hear well-meaning slogans like "Privacy is good for business" and "Privacy is not a technology issue". These sentiments are simplistic and, as such, often repel the proud engineer. We've heard it all before: "Quality is good for business" and "Security is good for business". These are weasel words. In truth, privacy is not innately good for business at all! It's actually costly, it slows design, it conflicts with orthodox security and operations management (which would prefer to log much more information than privacy allows) and it creates opportunity costs. Privacy creates conflicting requirements. But this is the bread and butter of serious engineering. The job of the engineer is to resolve competing requirements. Usually it's cost vs performance, or speed to market vs quality, or manufacturability vs functionality. The best way to engage engineers in privacy is to surface the competing requirements, and write requirements specifications in ways that engineers can deal with. One practical way to do this is to hybridise security and privacy analysis in a unified Threat & Risk Assessment.
